## Ultra-Low Power Inductively-Coupled Wireless Transcranial Links



Wen Li

Electrical Engineering and Computer Sciences University of California at Berkeley

Technical Report No. UCB/EECS-2019-147 http://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-147.html

December 1, 2019

Copyright © 2019, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

By

Wen Li

A dissertation submitted in partial satisfaction of the requirements for the degree of

Doctor of Philosophy

in

Engineering – Electrical Engineering and Computer Sciences

in the

**Graduate Division** 

of the

University of California, Berkeley

Committee in charge:

Professor Jan M. Rabaey, Chair

Professor Michel M. Maharbiz

Professor Liwei Lin

Fall 2017

### Ultra-Low Power Inductively-Coupled Wireless Transcranial Links

Copyright 2017

By

Wen Li

#### **Abstract**

Ultra-Low Power Inductively-Coupled Wireless Transcranial Links

by

Wen Li

Doctor of Philosophy in Engineering – Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Jan M. Rabaey, Chair

Recent advances in brain-machine-interface (BMI) technology have had a tremendous impact on both clinical treatment and neuroscience research. The technology enables neuroscientists to record and study signals fired by individual neurons. To decode simple actions such as opening and closing a hand or rudimentary movements of an arm, signals from close to a hundred neurons must be examined [Hochberg06]. The large amount of collected neural data must be transmitted out of the skull to an external processor to be translated into action or analyzed. To minimize the risk of infection to the patients, wireless data transmission is the preferred option. A next-generation 1024-channel implanted neural recorder that uses 20KS/s 8b analog-to-digital converters (ADCs) to capture neural signals can generate up to 164Mb/s of data, which imposes a stringent requirement on the communication throughput of the implanted wireless transmitter (TX). In addition, TX power consumption must be kept as low as possible for longer battery life or safe wireless power delivery, while the power consumption of the external receiver (RX) chip outside the skull can be relaxed. The current state-of-the-art implantable transmitters that use backscattering, pulse harmonic modulation (PHM), or ultra-wide-band (UWB) communication suffer from either limited throughput or low TX energy efficiency. Although other techniques such as ultrasound have the potential to achieve ultra-low power, they suffer from low data-rate (tens to hundreds of Kb/s) and severe loss through the skull bone. To overcome these issues, an inductive coupling technique is adopted in this work. The proposed transmitter consists only of an inductor driver, which significantly simplifies the TX architecture and lowers power. Inter-symbol-interference (ISI) caused by the ringing of the coupled inductors is alleviated by series de-Q resistors. To further reduce power, the entire TX uses a single 0.5V supply.
To generate low jitter TX clock under such low supply and stringent power requirement, injection locked phase-locked loop (PLL) with fully-digital background frequency tracking and spur suppression is utilized. To demonstrate the proposed architecture, a 200Mb/s transceiver is implemented in 65nm CMOS process. The 10mmX10mm coupled inductors are fabricated on standard 2-layer FR-4 PCBs, on which the chips are directly bonded. The prototype achieves 5e-11 bit-error-rate (BER) over 11mm-thick scalp and skull bone of an 8-week primordial piglet and less than 1e-12 BER over 11mm air gap. The TX PLL rms jitter is 59ps. Including PLL, the entire TX chip consumes 300uW, achieving 1.5pJ/b energy efficiency. The power consumption of the external RX chip is 37.2mW. To the author's best knowledge, the prototype achieves the highest data-rate and lowest BER among all cm-range wireless transceivers for biomedical implant applications.

To my family

## **Contents**

| Section |
|---------|
| Contents |
| List of Figures |
| List of Tables |
| Acknowledgements |
| Chapter 1 Introduction |
| 1.1 Thesis Organization |
| Chapter 2 Inductive-coupling for wireless transcranial communication |
| 2.1 Channel model for inductively-coupled transcranial links |
| 2.2 Data transmission |
| 2.2.1 Basics of inductive-coupled data transmission |
| 2.2.2 Mitigating inter-symbol-interference (ISI) |
| 2.2.3 Modeling for specific absorption rate (SAR) |
| 2.3 Link budget and bit-error-rate (BER) analysis |
| Chapter 3 Ultra-low power clock generation for implantable transmitter |
| 3.1 Design challenges |
| 3.2 Injection locking |
| 3.2.1 Reducing oscillator noise by injection locking |
| 3.2.2 Frequency tracking |
| 3.2.3 Suppressing injection spur |
| Chapter 4 Implementation of a 200Mb/s inductively-coupled wireless transcranial transceiver |
| 4.1 System overview |
| 4.2 Transmitter design |
| 4.2.1 Inductor driver circuit |
| 4.2.2 Injection-locked phase-lock-loop (PLL) with background digital spur suppression |
| 4.2.2.1. Injection locked VCO |
| 4.2.2.2. Pulse generator |
| 4.2.2.3. Pulse width comparator |
| 4.3 Receiver design |


## **List of Figures**

| Figure |
|--------|
| Figure 1.1 Applications for BMI (from left to right: pre-surgical mapping [Doerner10], motor prosthetics [Ajiboye17], and mapping brain activity [Viventi11]) |
| Figure 2.1 Overview of BMI integrated system. |
| Figure 2.2 Various tissue model (a) Conductivity; (b) Relative permittivity |
| Figure 2.3 (a) HFSS coils and tissue model setup; (b) coils dimension. |
| Figure 2.4 Self (a) and mutual inductance (b) of the 10mmX10mm coupled inductor with 2-turn TX coil and 1-turn RX coil |
| Figure 2.5 (a) Data transmission of ideal coupled inductors; (b) waveforms |
| Figure 2.6 Data transmission with real coupled inductors |
| Figure 2.7 Waveforms for non-ideal coupled inductors. (a) TX current pulse; (b) ISI in received voltage; (c) reduced ISI in received voltage with help of de-Q resistor |
| Figure 2.8 (a) TX; (b) RX coils on PCB |
| Figure 2.9 Specific absorption rate (SAR) averaged over 1 gram of tissue at the brain surface for 165uW of TX power at 200MHz. |
| Figure 2.10 Inductively-coupled transceiver model to calculate BER. |
| Figure 2.11 PDF of received voltage sampled at $t_0$ |
| Figure 2.12 Statistical eye-diagram with ISI from pulse response. |
| Figure 2.13 Statistical eye-diagram with ISI from pulse response and RX amplitude noise |
| Figure 2.14 Statistical eye-diagram with ISI from pulse response, RX amplitude noise, and TX jitter |
| Figure 2.15 Simulated BER vs. sampling phase. |
| Figure 3.1 Schematic of a 5-stage injection locked ring oscillator and its clock waveforms |
| Figure 3.2 (a) The waveform of phase error that highlights the difference between free running oscillator and injection locked oscillator; (b) the phase error rejection model of injection locking. |
| Figure 3.3 Phase noise transfer function of injection locked oscillator and PLL |
| Figure 3.4 Waveform of injection locked oscillator when the oscillator output is locked at ×8, ×10, and ×6 of reference clock frequency. |
| Figure 3.5 Injection locked VCO with frequency tracking loop. |
| Figure 3.6 Clock waveform, instantaneous period, and phase error of injection locked oscillator when (a) fnat is the same as Nfref, (b) fnat is slightly less than Nfref, and (c) fnat is slightly more than Nfref |
| Figure 3.7 Implementation of spur suppression |
| Figure 4.1 Architecture of 200Mb/s inductively-coupled transcranial transceiver |
| Figure 4.2 Schematic of inductor driver circuit. |
| Figure 4.3 Block diagram of injection locked PLL with fully digital frequency tracking and spur suppression |
| Figure 4.4 Timing diagram of clocks and shift register counter outputs. |
| Figure 4.5 Schematic of injection locked VCO. |
| Figure 4.6 Schematic of injection pulse generator |
| Figure 4.7 Schematic of pulse width comparator. |
| Figure 4.8 Block diagram of low noise amplifier chain. |
| Figure 4.9 Schematic of a single amplifier stage. |
| Figure 4.10 Amplifier chain (a) frequency response; (b) output referred noise PSD |
| Figure 4.11 Schematic of (a) CML frequency divider and (b) phase interpolator and CML to CMOS converter. |
| Figure 4.12 Schematic of phase alignment sampler and its timing diagram. |
| Figure 4.13 Input and output waveforms from phase alignment sampler at different PI settings during sampling phase calibration. |
| Figure 4.14 Measured output waveforms of phase alignment sampler at different PI settings during sampling phase calibration. |
| Figure 5.1 Die photo. |
| Figure 5.2 Boards setup (a) side view with channel; (b) measured channel thickness; (c) TX side with channel; (d) RX side with channel. |
| Figure 5.3 Measurement setup for biological channel |
| Figure 5.4 PLL phase noise measurement result with (a) SSA; (b) DSO |
| Figure 5.5 RX eye diagram for (a) over-the-skull measurements and (b) over-the-air measurements |
| Figure 5.6 Measured bathtub curves for both over-the-skull and over-the-air channels |
| Figure 5.7 Measured bathtub curves for (a) various TX/RX alignment offsets; (b) various spacing between TX/RX. |
| Figure 5.8 Power breakdown. |

## **List of Tables**

| Table |
|-------|
| Table 1.1 mm to cm range state-of-the-art bio-telemetries. |
| Table 5.1 Comparison with the selected state-of-the-arts |

## Acknowledgements

I would first like to thank my advisor, Prof. Jan M. Rabaey, for his support and guidance throughout my PhD time at UC Berkeley. He has been a wonderful advisor and an extremely nice person. He was very patient with me when I was struggling to find my research direction.

In addition, I would like to thank Prof. Michel Maharbiz for his helpful inputs to my research and being so kind to be on both my qualification exam and dissertation committee. I would like to thank Prof. Liwei Lin for being so kind to be on my dissertation committee and for his valuable feedback to my dissertation. I would like to thank Prof. Vladimir Stojanovic and Prof. Paul Wright for being on my qualification exam committee and for the useful technical discussions.

Last but not least, I would like to thank all the faculty, students, and staff of Berkeley Wireless Research Center and Electrical Engineering and Computer Sciences department for their help, support, and friendship.

### **Chapter 1 Introduction**

A brain-machine interface (BMI) system serves as the bridge between neurons and computers. It provides tools for neuroscientists to understand and decode the human brain. Many applications are enabled by BMI technology (Figure 1.1). For example, to restore sensory-motor functions to paralyzed patients, electrodes are placed on the brain to record signals fired by specific neurons, and a robotic arm can be directly controlled through the patients' thoughts [Hochberg06, Collinger13, Ajiboye17]. Other applications include pre-surgical mapping to find seizure locations prior to surgical treatment for drug-resistant epilepsy [Kuruvilla03], restoring the sense of touch to a person with long-term spinal cord injury [Flesher16], discovering the cortical organization of multiple speech articulators [Bouchard13], and performing deep brain stimulation (DBS) for brain disorders such as Parkinson's disease [Rodriguez-Oroz05]. In most BMI applications, the large amount of data generated by the neural readout circuits needs to be transmitted outside the skull to an external processor to be decoded. Currently, one of the major obstacles for implanted neural recorders is the wires that connect the neural readout channels to external devices outside the skull. To reduce the risk of infection caused by these wires, wireless transmission is needed. However, the large amount of information generated by hundreds to thousands of neurons demands high throughput, imposing a stringent requirement on the transcranial link bandwidth. In addition, the implanted transmitter (TX) is not accessible from the outside world, so its power consumption must be kept as low as possible to prolong the recorder battery life or to allow safe wireless power delivery. For an implantable neural recorder to be a viable clinical solution, the challenges of throughput and energy efficiency of the implanted TX must be addressed.



Figure 1.1 Applications for BMI (from left to right: pre-surgical mapping [Doerner10], motor prosthetics [Ajiboye17], and mapping brain activity [Viventi11]).

Many past neural recorders are capable of over-the-air wireless telemetry [Gao12, Borton13, Yin14, Fernandez-Leon15], but efficient wireless transmission of neural data through the skull remains a challenge. Current state-of-the-art implantable TXs for biomedical applications have either limited data-rate or unattractive power efficiency (Table 1.1). The approaches include backscattering, impulse-radio ultra-wide-band (IR-UWB) communication, pulse harmonic modulation (PHM), and even ultrasound. In the backscattering approach [Muller14, Biederman13], an external interrogator sends electromagnetic waves to the implanted TX, and the TX transmits binary encoded data by reflecting the incoming energy back to the interrogator. Since the reflected energy is a small fraction of the energy sent out by the interrogator, the received signal strength is very small under the safe radiation limit, which limits the achievable bit-error-rate (BER). In addition, backscattering is a narrowband wireless transmission technique in which the data is modulated onto a carrier, so its bandwidth is only a small fraction of the carrier frequency. The carrier frequency in wireless transcranial links is usually on the order of hundreds of megahertz for minimum channel loss, which usually limits the data-rate of backscatter systems to a couple of megabits per second. The IR-UWB approach [Chae08, Abdelhalim13], on the other hand, is a broadband wireless communication technique. Many IR-UWB transmitters send unmodulated short pulses as data to the antenna, so the transmitter design can be quite simple. An IR-UWB transmitter usually consists of only a Manchester encoder and a pulse-shaping block that satisfies the FCC-required spectral mask. However, the power consumption of an IR-UWB transmitter might not be sufficiently low for BMI applications because its output impedance needs to match the antenna, which is usually designed to be around $50\Omega$. This requires a large output stage, and extra power is spent charging and discharging the parasitic capacitance of the large driver. The recently proposed PHM approach [Inanlou11] takes advantage of inductive coupling, which does not require impedance matching. However, it relies on the self-resonance of a high-Q inductor and transmits data through on-off keying (OOK). The received OOK waveform is modulated by the inductor ringing, so its data-rate is limited to a small fraction of the inductor self-resonance frequency, which ranges from hundreds of megahertz to a gigahertz depending on the inductor size. In addition, envelope detection is usually used in PHM due to the uncertainty in the inductor resonance, which can limit the BER of the system at high data-rates. Other techniques such as ultrasound [Chang17, Seo16] have great potential for ultra-low power communication inside biological tissue because sound propagates with lower loss than electromagnetic waves in media with high water content. However, their data-rate is usually limited to tens to hundreds of kilobits per second because of the high Q of common piezoelectric materials, not to mention the significant loss ultrasound suffers when penetrating the skull bone.

| Reference      | Modulation                | Data-rate (Mbps) | TX coil size | Channel distance | Channel media  | BER         | TX power | TX FOM   | CMOS Tech. |
|----------------|---------------------------|------------------|--------------|------------------|----------------|-------------|----------|----------|------------|
| [Muller14]     | Backscatter @300MHz       | 1                | 6.5X6.5mm    | 10mm             | in-vivo        | <1e-7       | 13uW     | 13pJ/b   | 65nm       |
| [Biederman13]  | Backscatter @800MHz       | 1                | 500X250um    | 1mm              | in-vivo        | NA          | <1uW     | <1pJ/b   | 65nm       |
| [Chae08]       | UWB                       | 90               | 5X10mm       | NA               | NA             | NA          | 1.6mW    | 17.8pJ/b | 0.35um     |
| [Abdelhalim13] | UWB                       | 10               | NA           | 5cm              | NA             | 5e-3        | 100uW    | 10pJ/b   | 130nm      |
| [Inanlou11]    | PHM                       | 10.2             | 10X10mm      | 10mm             | NA             | 6.3e-8      | 3.52mW   | 345pJ/b  | 0.5um      |
| [Kiani15]      | Pulse Delay mod @13.56MHz | 13.56            | 30X30mm      | 10mm             | NA             | 4.3e-7      | 13mW     | 960pJ/b  | 0.35um     |
| [Mandal08]     | Impedance mod @25MHz      | 2.8; 4           | 3.5X3.5cm    | 20mm             | NA             | <1e-6; 1e-3 | 0.1mW    | 35pJ/b   | 0.5um      |
| [Jung10]       | OOK @2.4GHz               | 136              | 8X9mm        | 20cm             | rat skin-mimic | <1.7e-3     | 3mW      | 22pJ/b   | 180nm      |
| [Liu14]        | QPSK, QAM @900MHz         | 100              | NA           | NA               | NA             | EVM < 6%    | 1.3mW    | 13pJ/b   | 65nm       |
| [Chen11]       | FSK @570MHz/690MHz        | 9.7              | mm size      | 5mm + 10cm       | saline + air   | <1e-6       | 4.57mW   | 4.7nJ/b  | 180nm      |
| [Harrison07]   | FSK @433MHz               | 0.33             | 470X470um    | 130mm            | air            | 3e-3        | 1.81mW   | 5.4nJ/b  | 0.5um      |
| [Chang17]      | Ultrasound                | 0.095            | 550X550um    | 8.5cm            | animal tissue  | <1e-4       | 157uW    | 1.65nJ/b | 65nm       |

Table 1.1 mm to cm range state-of-the-art bio-telemetries.
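The TX FOM column of Table 1.1 appears to be the TX power divided by the data-rate, i.e. the energy spent per transmitted bit. The following is a minimal sketch under that assumption; the function name `energy_per_bit_pj` and the dictionary layout are illustrative and not part of the original work. The last entry uses the 300uW and 200Mb/s figures quoted in the abstract rather than a row of the table.

```python
def energy_per_bit_pj(tx_power_w: float, data_rate_bps: float) -> float:
    """Return TX energy per bit in picojoules (power divided by data-rate)."""
    return tx_power_w / data_rate_bps * 1e12


# (TX power in watts, data-rate in bits per second)
links = {
    "Muller14": (13e-6, 1e6),
    "Chae08": (1.6e-3, 90e6),
    "Inanlou11": (3.52e-3, 10.2e6),
    "Chang17": (157e-6, 0.095e6),
    "this work (abstract)": (300e-6, 200e6),
}

fom = {name: energy_per_bit_pj(p, r) for name, (p, r) in links.items()}

for name, value in fom.items():
    print(f"{name}: {value:.1f} pJ/b")
```

Under this definition the selected rows reproduce the tabulated FOM values to within rounding, and the prototype figures from the abstract work out to 1.5pJ/b.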

As the data generated by neural recorders increases, there is a demand for high-speed implantable TXs. For example, a next-generation 1024-channel neural recorder can generate an uncompressed data stream of around 200Mb/s [Ballini13, Ha13]. A new architecture is needed to satisfy the stringent requirements on both speed and power. In this work, current pulses modulated by binary phase shift keying (BPSK) are directly transmitted to the external RX by inductive coupling to achieve high-speed transcranial communication. The inter-symbol-interference (ISI) caused by the inductor ringing is mitigated by adding de-Q resistors. To alleviate BER degradation caused by TX jitter, an injection-locked PLL is used to generate the TX clock. The limited frequency locking range and the large deterministic jitter caused by the injection spur are mitigated by a fully digital frequency tracking and spur suppression loop. The prototype transceiver is fabricated in silicon, and the proposed architecture is verified through measurements on piglet carcass samples that mimic the human head.
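The throughput target can be sanity-checked from the recorder parameters quoted in the abstract (1024 channels, 20KS/s, 8b per sample). The sketch below assumes every channel is sampled at the full rate with no compression or framing overhead; the variable names are illustrative.

```python
# Recorder parameters quoted in the abstract
channels = 1024
sample_rate_hz = 20_000
bits_per_sample = 8

# Raw, uncompressed output rate of the recorder in bits per second
raw_rate_bps = channels * sample_rate_hz * bits_per_sample

# Link rate of the prototype transceiver
link_rate_bps = 200_000_000

# Fraction of extra capacity the link offers over the raw stream
margin = link_rate_bps / raw_rate_bps - 1

print(f"raw rate: {raw_rate_bps / 1e6:.2f} Mb/s")
print(f"margin at 200 Mb/s: {margin:.1%}")
```

The raw stream comes to about 164Mb/s, so a 200Mb/s link leaves roughly 22% of its capacity unused by payload under these assumptions.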

### 1.1 Thesis Organization

Chapter 2 of this thesis begins with inductive coupling in the context of wireless transcranial links. The cranium channel models are investigated, and inductor designs are discussed. Then basic inductively-coupled signaling is introduced, and design equations are presented. After an in-depth investigation of inductor non-idealities such as ringing and ISI, the de-Q'ing technique that mitigates these non-idealities is discussed. At the end of the chapter, a system analysis method to predict the performance of inductively-coupled transcranial links is developed. Chapter 3 focuses on clock generation for the implanted TX. It starts from the design challenges of the conventional PLL approach, then introduces injection-locked clock generation to overcome these challenges. The issues of injection locking are discussed in detail, and solutions to these issues are investigated. Chapter 4 discusses the circuit implementation of the 200Mb/s prototype transceiver. After a description of the transceiver architecture, the chapter continues with a detailed discussion of the TX inductor driver and the injection-locked PLL. On the receiver (RX) side, it focuses on the front-end amplifiers, clock distribution circuits, and phase alignment circuits. Chapter 5 presents the measurement results and discusses their implications. Finally, Chapter 6 concludes the thesis.

### **Chapter 2 Inductive-coupling for wireless transcranial communication**

This chapter focuses on the theory and system architecture of inductively-coupled transcranial links. Section 2.1 develops electromagnetic models of the human cranium channel for inductively-coupled data communication. Section 2.2 discusses the architecture and signaling technique for the inductively-coupled transcranial transceiver. Inductor non-idealities and their mitigation techniques are investigated. A study of the specific absorption rate (SAR) of the link is also included in this section. Lastly, a system BER analysis method for inductively-coupled transcranial links is developed in Section 2.3.

### 2.1 Channel model for inductively-coupled transcranial links



Figure 2.1 Overview of BMI integrated system.

The conceptual system diagram for a wireless transcranial link is shown in Figure 2.1. Inside the implanted neural recorder chip, the analog signals emitted by neurons are amplified and quantized by analog-to-digital converters (ADCs). The quantized bits are transmitted over the cranium channel by a transmitter (TX) on the same chip to an external receiver (RX). The cranium channel is about 11mm thick on average for adult humans. From top to bottom, it consists of a 2mm-thick layer of skin, a 2mm-thick layer of fat underneath, and a 7mm-thick layer of skull bone [Mark10a, Mark10b, Mark11a, Mark11b]. In addition, the TX is completely covered by the dura and brain tissue underneath. Although the communication distance is extremely short, the channel exhibits severe loss. This is because most biological tissues, including skin, fat, and even bone, have high water content and therefore conduct electrical current. The non-zero conductivity of the scalp and skull bone (Figure 2.2a) causes severe attenuation of the received signal power. In addition, the relative permittivity of these tissues is much larger than that of air (Figure 2.2b), which can significantly increase the parasitic capacitance associated with the inductors and lower their resonant frequencies. The electrical properties of the tissues are taken from [Andreuccetti97, Gabriel96a, Gabriel96b].



Figure 2.2 Various tissue model (a) Conductivity; (b) Relative permittivity.



Figure 2.3 (a) HFSS coils and tissue model setup; (b) coils dimension.

To obtain properties of coupled-coils through the cranium channel model for electromagnetic transmission, Maxwell's equations are solved numerically. A 10mm×10mm two-turn coil shown in Figure 2.3 is used as the TX inductor, and a 10mm×10mm single-turn coil is used as the RX inductor. 200um wide and 18um thick traces are used for both TX and RX coils.

The turn-to-turn spacing in the TX coil is 200um. For unmodulated binary random signals at the target data-rate of 200Mb/s, most of the signal energy is confined within the band below $f_h$ = 200MHz. The minimum wavelength of the frequency components that contain most of the electromagnetic energy is $\lambda_{min} = \frac{c}{\sqrt{\varepsilon_r}f_h}$, where c is the speed of light in free space and $\varepsilon_r$ is the relative permittivity of the surrounding brain tissue, which can be as large as 100 from Figure 2.2b. With this information, $\lambda_{min}$ is estimated to be 15cm, which is still much greater than the communication distance of 11mm. Therefore, the coupled inductors operate in the near-field region [Balanis16], and both the TX and RX inductors as well as the surrounding environment need to be carefully included in the analysis. As shown in Figure 2.3a, the TX copper coil is placed on brain tissue. A 7mm-thick layer of bone tissue is added on top of the TX. A 2mm-thick layer of fat tissue and a 2mm-thick layer of skin tissue are stacked on top of the bone, in that order. Finally, the RX copper coil is placed on top of the skin. This structure is simulated in the Ansoft HFSS electromagnetic field simulator, and the 2-port scattering parameters (S-parameters) of the structure are obtained. From the S-parameters, both the TX and RX self-inductances, as well as the mutual inductance, can be calculated. The result is shown in Figure 2.4. The RX self-inductance is roughly 30nH. The TX self-inductance (91nH), on the other hand, is much larger than that of the RX because of the extra turn in the TX coil. The mutual inductance at low frequency is around 1nH, indicating weak coupling. The TX resonance frequency is about 312MHz, which is low because of the larger TX inductance and the high electrical permittivity of the surrounding tissues. The RX inductor resonance (2.4GHz), on the other hand, is much higher. With the electromagnetic model of the coupled inductors in place, the next section discusses the signaling strategy of inductively-coupled transcranial links.
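The near-field argument and the "weak coupling" remark can be checked numerically. The sketch below uses the values quoted in this section; the coupling coefficient $k = M/\sqrt{L_{TX}L_{RX}}$ is the standard textbook definition and is added here for illustration, since the text itself only quotes M. Function and variable names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in free space, m/s


def wavelength_m(freq_hz: float, eps_r: float) -> float:
    """Wavelength in a medium with relative permittivity eps_r."""
    return C0 / (math.sqrt(eps_r) * freq_hz)


# Values quoted in Section 2.1
f_h = 200e6          # highest significant signal frequency, Hz
eps_r = 100.0        # upper bound on tissue relative permittivity
distance_m = 11e-3   # TX to RX spacing through the cranium channel

lambda_min = wavelength_m(f_h, eps_r)
ratio = lambda_min / distance_m

# Simulated inductances reported from the field-solver results
l_tx = 91e-9   # H
l_rx = 30e-9   # H
m = 1e-9       # H, low-frequency mutual inductance

# Standard coupling coefficient (an illustrative addition)
k = m / math.sqrt(l_tx * l_rx)

print(f"lambda_min = {lambda_min * 100:.1f} cm, {ratio:.1f}x the link distance")
print(f"coupling coefficient k = {k:.4f}")
```

With these inputs the shortest wavelength is more than ten times the link distance, and k is on the order of 0.02, which is consistent with describing the coils as weakly coupled.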



Figure 2.4 Self (a) and mutual inductance (b) of the 10mmX10mm coupled inductor with 2-turn TX coil and 1-turn RX coil.

### 2.2 Data transmission

This section discusses data transmission in inductively-coupled transcranial links in detail. It starts with a brief review of the basic equations for coupled inductors and their design trade-offs. Then, inductively-coupled data transmission in the context of transcranial links is described. The section later introduces the issues caused by coupled-inductor parasitics and provides a method to alleviate them.

#### 2.2.1 Basics of inductive-coupled data transmission



Figure 2.5 (a) Data transmission of ideal coupled inductors; (b) waveforms.

For an ideal coupled-inductor shown in Figure 2.5a, it can be shown that in s-domain, voltages at the TX/RX port are related to the currents flowing into the 2 ports as [Niknejad07]:

$$\begin{bmatrix} V_{RX} \\ V_{TX} \end{bmatrix} = \begin{bmatrix} sL_{RX} & sM \\ sM & sL_{TX} \end{bmatrix} \begin{bmatrix} I_{RX} \\ I_{TX} \end{bmatrix} \qquad (2.1)$$

where  $L_{RX}$  is the inductance seen from the RX port,  $L_{TX}$  is the inductance seen from the TX port, and M is the mutual inductance between  $L_{TX}$  and  $L_{RX}$ . Since the load impedance of the RX inductor is infinite,  $I_{RX} = 0$ . Eq. 2.1 can be simplified as:

$$V_{RX} = sMI_{TX} \tag{2.2}$$

In time-domain, Eq. 2.2 becomes:

$$v_{RX}(t) = M \frac{d}{dt} i_{TX}(t) \qquad (2.3)$$

Eq. 2.3 suggests that the received voltage waveform, $v_{RX}$, is simply the derivative of the transmitted current waveform, $i_{TX}$, scaled by M. If $i_{TX}$ is a binary-coded return-to-zero signal as shown in Figure 2.5, then $v_{RX}$ is a sequence of positive and negative pulses. Since every $i_{TX}$ pulse has 2 transitions, the derivative of $i_{TX}$ has 2 pulses in the same bit period. For example, when a bit-1 is sent to the coupled inductor, the $v_{RX}$ waveform is a positive pulse that corresponds to the rising edge of the $i_{TX}$ pulse, followed by a negative pulse that corresponds to the falling edge of $i_{TX}$ (Figure 2.5b). The pulse width of $v_{RX}$ is roughly equal to the rise/fall time of $i_{TX}$, and its peak amplitude is proportional to the maximum slope of $i_{TX}$. Conversely, if a bit-0 is transmitted, $v_{RX}$ is a negative pulse followed by a positive pulse. To make a functional link from a coupled inductor pair, one can simply use a current driver to send return-to-zero current pulses to the TX coil. The received bit can be detected by strobing a voltage comparator at the peak of either of the 2 $v_{RX}$ pulses. Since the first $v_{RX}$ pulse carries the same sign as $i_{TX}$, it alone can determine the transmitted bit. The $2^{nd}$ $v_{RX}$ pulse carries redundant information. Although the $2^{nd}$ $v_{RX}$ pulses can be used to increase the signal-to-noise ratio (SNR) of the transceiver, they are discarded in this work to simplify the RX design. As shown in Figure 2.5b, the comparator is triggered only at the peak of the first $v_{RX}$ pulse.
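The relationship in Eq. 2.3 can be illustrated numerically. The following is a minimal sketch, not the simulation used in this work: it differentiates a trapezoidal return-to-zero current pulse with a finite difference. The mutual inductance (about 1nH) and the 5ns bit time come from the text; the peak current `I_PEAK`, the edge time `T_EDGE`, and the time step `DT` are hypothetical values chosen only for illustration.

```python
# Sketch of Eq. 2.3: v_RX(t) = M * d/dt i_TX(t) for a return-to-zero pulse.
M = 1e-9          # mutual inductance (H), roughly the value read from Figure 2.4
T_BIT = 5e-9      # bit time at 200 Mb/s (s)
I_PEAK = 1e-3     # assumed peak TX current (A), hypothetical
T_EDGE = 0.5e-9   # assumed rise/fall time (s), hypothetical
DT = 0.05e-9      # simulation time step (s), hypothetical


def rz_current(t, bit):
    """Trapezoidal return-to-zero pulse: positive for bit 1, negative for bit 0."""
    sign = 1.0 if bit == 1 else -1.0
    width = T_BIT / 2  # the pulse occupies roughly half of the bit time
    if t < 0 or t > width + T_EDGE:
        return 0.0
    if t < T_EDGE:                                        # rising edge
        return sign * I_PEAK * t / T_EDGE
    if t <= width:                                        # flat top
        return sign * I_PEAK
    return sign * I_PEAK * (1 - (t - width) / T_EDGE)     # falling edge


def received_voltage(bit):
    """Finite-difference derivative of i_TX over one bit time, scaled by M."""
    n = int(round(T_BIT / DT))
    i_tx = [rz_current(k * DT, bit) for k in range(n + 1)]
    return [M * (i_tx[k + 1] - i_tx[k]) / DT for k in range(n)]


v1 = received_voltage(1)
v0 = received_voltage(0)
mid_edge = int(round(T_EDGE / DT)) // 2   # sample index in the middle of the first edge
first_peak_1 = v1[mid_edge]
first_peak_0 = v0[mid_edge]
print(f"bit-1: first pulse {first_peak_1 * 1e3:+.2f} mV, second pulse {min(v1) * 1e3:+.2f} mV")
print(f"bit-0: first pulse {first_peak_0 * 1e3:+.2f} mV, second pulse {max(v0) * 1e3:+.2f} mV")
```

With these assumed values, the sign of the first pulse follows the transmitted bit, and the second pulse has the opposite sign, which is why sampling the first pulse alone is sufficient for detection.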

#### 2.2.2 Mitigating inter-symbol-interference (ISI)



Figure 2.6 Data transmission with real coupled inductors.

The ideal coupled-inductor model described in the previous section is an oversimplification. In reality, several non-idealities affect the signal response of coupled inductors. First, the inductor coil has an associated self-capacitance, which can be modeled as a lumped capacitor in parallel with the inductor. This parasitic self-capacitance (C) together with the inductor (L) forms a resonant tank, and its resonance frequency is $f_0 = \frac{1}{2\pi\sqrt{LC}}$. Usually, the inductor can only be used below its self-resonance frequency because above it, the inductor behaves like a capacitor due to the parasitic capacitance (Figure 2.4a). In addition, the parasitic resistance of the metal coil acts as a resistor in series with the inductor. It is characterized by the quality factor of the inductor, which is defined as $Q = \frac{2\pi f_0 L}{R}$, where R is the series resistance, L is the inductance, and $f_0$ is the self-resonance frequency. For an off-chip inductor, Q can easily be greater than 10. In this work, 10mmX10mm coupled inductors are used. As described in section 2.1, the resonant frequency of the RX inductor used in this design is 2.4GHz. On the other hand, the resonant frequency of the TX inductor is as low as 312MHz because a 2-turn TX coil is used.

As shown in Figure 2.6, the practical coupled-inductor pair model consists of ideal coupled inductors with added parasitic resistors and capacitors. Using Kirchhoff's current law, voltages of TX and RX port can be related to their current as:

$$\begin{bmatrix} V_{RX} - (I_{RX} - sC_{RX}V_{RX})R_{RX} \\ V_{TX} - (I_{TX} - sC_{TX}V_{TX})R_{TX} \end{bmatrix} = \begin{bmatrix} sL_{RX} & sM \\ sM & sL_{TX} \end{bmatrix} \begin{bmatrix} I_{RX} - sC_{RX}V_{RX} \\ I_{TX} - sC_{TX}V_{TX} \end{bmatrix} \tag{2.4}$$

Assuming  $I_{RX}$  is 0, Eq. 2.4 can be solved, and  $V_{RX}$  can be rewritten in terms of  $I_{TX}$ :

$$V_{RX} = \frac{sM}{(1 + sR_{TX}C_{TX} + L_{TX}C_{TX}s^2)(1 + sR_{RX}C_{RX} + L_{RX}C_{RX}s^2) - s^4M^2C_{TX}C_{RX}} \cdot I_{TX} \quad (2.5)$$

In application of transcranial links, the coupling factor of the coupled inductors, k, is very small because of the relatively large gap between them. Therefore, it can be assumed that:

$$k = \frac{M}{\sqrt{L_{TX}L_{RX}}} \ll 1 \quad \Rightarrow \quad M^2 \ll L_{TX}L_{RX} \tag{2.6}$$

With this assumption, the $2^{nd}$ term in the denominator of Eq. 2.5 can be neglected compared with the $1^{st}$ term, and Eq. 2.5 can be written as:

$$V_{RX} = \frac{sM}{(1 + sR_{TX}C_{TX} + L_{TX}C_{TX}s^2)(1 + sR_{RX}C_{RX} + L_{RX}C_{RX}s^2)} \cdot I_{TX} \tag{2.7}$$

Using the definition of resonance frequency and quality factor, Eq. 2.7 can be rewritten as:

$$V_{RX} = \frac{sM}{\left[1 + \frac{s}{2\pi Q_{TX} f_{TX}} + \left(\frac{s}{2\pi f_{TX}}\right)^2\right] \left[1 + \frac{s}{2\pi Q_{RX} f_{RX}} + \left(\frac{s}{2\pi f_{RX}}\right)^2\right]} \cdot I_{TX} \tag{2.8}$$

where $f_{TX} = \frac{1}{2\pi\sqrt{L_{TX}C_{TX}}}$, $f_{RX} = \frac{1}{2\pi\sqrt{L_{RX}C_{RX}}}$, $Q_{TX} = \frac{2\pi f_{TX}L_{TX}}{R_{TX}}$, and $Q_{RX} = \frac{2\pi f_{RX}L_{RX}}{R_{RX}}$. The numerator of Eq. 2.8 represents the derivative of the TX current, and the denominator is the product of two classic $2^{nd}$-order systems. The overall response of the actual coupled inductors can be thought of as a differentiator followed by a cascade of two $2^{nd}$-order systems. As mentioned before, both $Q_{TX}$ and $Q_{RX}$ are quite large, so both TX and RX coils are under-damped, resulting in 2 pairs of complex conjugate poles in the overall system. This leads to ringing in the pulse response (the received voltage waveform when a single bit-1 is transmitted). As shown in Figure 2.7, the 2 $v_{RX}$ pulses of otherwise ideal coupled inductors are heavily distorted by this ringing. The large-amplitude, low-frequency ringing is caused by the TX coil, and the higher-frequency, smaller-amplitude ringing is caused by the RX coil. For a data-rate of 200Mb/s, the bit-time is only 5ns, and the $i_{TX}$ pulse width is roughly 2.5ns to allow sufficient time for return-to-zero operation. The low-frequency TX inductor ringing can cause the received voltage pulse $(v_{RX})$ to span much longer than one bit-time. As shown in Figure 2.7, the amplitude of the ringing after 5ns is as large as the signal amplitude, and the ringing lasts until ~20ns. Since the inductively coupled channel is a linear time-invariant (LTI) system, for an actual data stream this ringing adds to the subsequent $v_{RX}$ pulses and thus causes severe BER degradation. This issue is well known as ISI. In the presence of this ringing-induced ISI, one must wait until the ringing dies down before transmitting the next bit, which significantly lowers the data-rate.
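To get a feel for how long an under-damped section rings, the following sketch uses the standard pole locations of one $2^{nd}$-order factor of Eq. 2.8, $s = 2\pi f_0\left[-\frac{1}{2Q} \pm j\sqrt{1 - \frac{1}{4Q^2}}\right]$, whose ringing envelope decays as $e^{-2\pi f_0 t/(2Q)}$. The value Q = 10 is an assumption taken from the remark that off-chip inductors can easily exceed 10; it is not a measured value for this coil, and the 1% settling threshold is an arbitrary choice.

```python
import math


def ringing_settle_time(f0_hz, q, fraction=0.01):
    """Time for the envelope exp(-w0*t/(2Q)) to decay to `fraction` of its start value."""
    if q <= 0.5:
        raise ValueError("Q <= 0.5 is not under-damped, so there is no ringing envelope")
    w0 = 2 * math.pi * f0_hz
    return 2 * q * math.log(1 / fraction) / w0


def ringing_frequency(f0_hz, q):
    """Damped oscillation frequency f0 * sqrt(1 - 1/(4 Q^2))."""
    return f0_hz * math.sqrt(1 - 1 / (4 * q * q))


T_BIT = 5e-9          # bit time at 200 Mb/s
F_TX = 312e6          # TX self-resonance from the text
Q_ASSUMED = 10        # hypothetical quality factor

t_settle = ringing_settle_time(F_TX, Q_ASSUMED)
f_ring = ringing_frequency(F_TX, Q_ASSUMED)
print(f"ringing at {f_ring / 1e6:.1f} MHz, 1% settling in {t_settle * 1e9:.1f} ns "
      f"({t_settle / T_BIT:.1f} bit times)")
```

Under these assumptions the envelope needs several bit times to settle, which is consistent with the qualitative observation that the ringing extends well beyond one 5ns bit.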



Figure 2.7 Waveforms for non-ideal coupled inductors. (a) TX current pulse; (b) ISI in received voltage; (c) reduced ISI in received voltage with help of de-Q resistor.

The issue of ISI has been well studied in the field of wireline data communication, and there are many proven methods to mitigate its effects. The common approaches include linear equalization and decision-feedback equalization (DFE). Linear equalization commonly uses a filter in the RX to invert the frequency response of the channel, making the overall frequency response of the link an all-pass filter that is free of ISI. This linear equalization filter can be easily implemented if the channel has a low-pass characteristic, as in most backplanes, because a simple high-pass filter is sufficient. In the context of inductive coupling, however, the equalization filter is not practical. To filter out ringing, the equalization filter needs to have a high-Q notch at the exact ringing frequency. This ringing frequency depends on the inductor resonance, and it is very sensitive to the electrical properties of the surrounding biological tissues, PCB trace variations, bond wire length, and even on-chip parasitic capacitance such as electrostatic discharge (ESD) diode capacitance. The uncertainty in the value of the resonance makes linear equalization extremely difficult to implement. On the other hand, DFE cancels ISI in the time domain. It calculates the amplitude of the ISI at the detection instants (post-cursors) based on the current received bit and subtracts it from subsequent bits. The DFE approach is effective in the ideal situation. However, the implanted TX can introduce a large amount of random jitter, causing the received $v_{RX}$ pulses to shift randomly in time. Since the TX ringing frequency is comparable to the data-rate, the post-cursor amplitude is very sensitive to timing error of the sampling instant. Because of TX random jitter, the post-cursor amplitude varies wildly from bit to bit, which makes DFE impractical in this context.

To mitigate the issue of ringing-induced ISI, this work adds series de-Q resistors to lower the quality factor of the TX inductor and to make its response critically damped. From Eq. 2.8, the low-frequency ringing is caused by a pair of complex conjugate poles from the $2^{nd}$-order RLC response of the TX inductor. If $Q_{TX} = \frac{1}{2}$, Eq. 2.8 becomes:

$$V_{RX} = \frac{sM}{\left(1 + \frac{s}{2\pi f_{TX}}\right)^2 \left[1 + \frac{s}{2\pi Q_{RX} f_{RX}} + \left(\frac{s}{2\pi f_{RX}}\right)^2\right]} \cdot I_{TX} \tag{2.9}$$

In this way, the pair of complex conjugate poles from Eq. 2.8 is removed, and the two real poles in Eq. 2.9 at $2\pi f_{TX}$ give a low-pass response. Since the TX inductor resonant frequency, $f_{TX}=312MHz$, is higher than the data-rate (200Mb/s), the newly formed low-pass filter does not disperse the pulse enough to cause additional ISI. This is verified by the pulse response of the coupled inductors with the de-Q'ed TX coil shown in Figure 2.7c. In the new pulse response, the positive and negative $v_{RX}$ pulses that correspond to the rising and falling edges of $i_{TX}$ are clearly visible, and the pulse response remains flat around 0 after 5ns. The small-amplitude ringing around 2.4GHz caused by the RX inductor is still present, but it does not have a significant negative effect on BER. An additional low-pass filter in the RX front-end circuit can completely remove this high-frequency ringing. To find the required de-Q resistor value, the definition of quality factor can be used:

$$Q_{TX} = \frac{2\pi f_{TX}' L_{TX}}{R_{TX}} = \frac{1}{2} \tag{2.10}$$

where  $f_{TX}'$  is the effective resonance frequency, which is not necessarily equal to inductor self-resonance,  $f_{TX}$ . To find  $f_{TX}'$ , the effect of ESD diode capacitance and TX driver output capacitance on the inductor resonance must be considered. The actual resonance with these additional parasitic capacitors is:

$$2\pi f_{TX}' = \sqrt{\frac{1}{L_{TX}(C_{TX} + C_p)}} \tag{2.11}$$

where $C_p$ is the lumped sum of the ESD diode capacitance and the TX driver output capacitance. In a 65nm CMOS process, 1.2pF is a good estimate for $C_p$. From Figure 2.4, $L_{TX} = 91nH$ and $f_{TX} = \frac{1}{2\pi\sqrt{L_{TX}C_{TX}}} = 312MHz$, so $C_{TX}$ is 2.86pF. With these parameters, the actual TX resonance, $f_{TX}'$, becomes 261MHz. Using Eq. 2.10, the TX resistance, $R_{TX}$, must be 300$\Omega$. Since the TX coil resistance is negligibly small, a 300$\Omega$ de-Q resistance must be added to the coil.
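The arithmetic behind these numbers can be reproduced with a few lines. This is only a check of Eq. 2.10 and Eq. 2.11 using the values quoted in the text; the function names are illustrative and are not part of any tool used in this work.

```python
import math

L_TX = 91e-9     # TX self-inductance from Figure 2.4 (H)
F_TX = 312e6     # TX self-resonance (Hz)
C_P = 1.2e-12    # estimated ESD diode + driver output capacitance (F)


def coil_capacitance(inductance, f0):
    """Self-capacitance implied by f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1 / ((2 * math.pi * f0) ** 2 * inductance)


def effective_resonance(inductance, c_total):
    """Eq. 2.11: resonance with the extra parasitic capacitance included."""
    return 1 / (2 * math.pi * math.sqrt(inductance * c_total))


def deq_resistance(inductance, f_eff, q_target=0.5):
    """Eq. 2.10 solved for R_TX at the target quality factor."""
    return 2 * math.pi * f_eff * inductance / q_target


c_tx = coil_capacitance(L_TX, F_TX)
f_eff = effective_resonance(L_TX, c_tx + C_P)
r_tx = deq_resistance(L_TX, f_eff)
print(f"C_TX = {c_tx * 1e12:.2f} pF, f_TX' = {f_eff / 1e6:.1f} MHz, R_TX = {r_tx:.1f} ohm")
```

The computed values agree with the 2.86pF, roughly 261MHz, and roughly 300$\Omega$ figures quoted above to within rounding.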



Figure 2.8 (a) TX; (b) RX coils on PCB.

As shown in Figure 2.8, the coupled inductors are fabricated on standard 2-layer FR-4 PCBs. As described in Section 2.1, the dimensions of both TX and RX coils are $10\text{mm} \times 10\text{mm}$, and 200um-wide copper traces are used. The TX inductor is implemented with a 2-turn coil to increase the signal strength coupled to the RX coil. Unfortunately, the 2-turn TX structure together with the surrounding brain tissue lowers its self-resonance to 312MHz. Thanks to the de-Q technique, the resulting ISI can be mitigated. Since the self-resonance of the inductor is actually caused by a distributed LC network, the series de-Q resistance should be distributed along the coil to maximize its effect. To accomplish this, the $300\Omega$ series de-Q resistance is split into seven identical surface-mount (SMT) resistors evenly distributed along the TX coil. Instead of SMT resistors with a nominal $42\Omega$ resistance, a slightly smaller value $(36.5\Omega)$ is used because of the availability of SMT resistors of the right size and value.

### 2.2.3 Modeling for specific absorption rate (SAR)



Figure 2.9 Specific absorption rate (SAR) averaged over 1 gram of tissue at the brain surface for 165uW of TX power at 200MHz.

The strength of electromagnetic waves inside human tissue must stay under the safety limit for health reasons. This limit is commonly expressed as the specific absorption rate (SAR), which is defined as the electromagnetic power absorbed per unit mass, averaged over 1 gram of tissue, within the frequency range of 100kHz to 10GHz. The Federal Communications Commission (FCC) specifies the SAR limit to be 1.6W/kg [FCC13]. In the proposed return-to-zero inductive coupling scheme, the power spectral density of the TX current is roughly a sinc function with a main lobe width of 400MHz. Since the magnetic field generated in the tissue is the derivative of the TX current, its frequency content is 0 at DC and gradually increases with frequency. It peaks around 200MHz before returning to 0 at 400MHz. This frequency range is well within the coverage of the FCC SAR regulation. Therefore, careful investigation is needed to verify that the proposed inductive coupling technique satisfies the SAR limit. Since calculating SAR within the tissue requires numerically solving Maxwell's equations for the proposed inductor structure, the electromagnetic simulation tool, Ansoft HFSS, is used. In the electromagnetic simulation, a sinusoidal TX signal is used instead of the real data waveform for simplicity: a 200MHz TX signal is injected into the de-Q'ed TX inductor to approximate the power spectral density of a real binary data stream. The same tissue dielectric models described in section 2.1 are used. To guarantee that the SAR of the design is well below the safety limit, an extremely pessimistic TX power level of 165uW is used in the simulation. Chapter 5 will show that 165uW is equal to the measured power consumption of the TX inductor driver, which is much higher than the actual transmitted power. Figure 2.9 shows the simulated average SAR over 1 gram of tissue at the surface of the brain, where the radiation is the strongest. The maximum SAR, slightly below the center of the TX coil, is around 0.02W/kg, which is around 80 times lower than the FCC limit. This result shows that the SAR limit can be easily satisfied using inductively coupled transcranial communication techniques.

### 2.3 Link budget and bit-error-rate (BER) analysis

In any communication link, the signal-to-noise ratio (SNR) needs to be above a certain limit to achieve the required BER. If the BER is too high, the link cannot operate reliably. In most wireless systems, the calculation of the SNR required to satisfy the BER limit can be quite complicated. It depends on the modulation scheme, channel variation, as well as interference and blocker signal levels. Thanks to the simplicity of the inductively coupled transceiver architecture, its link budget analysis is straightforward. In this section, a simple and effective method to find the statistical eye diagram and to directly calculate BER from the pulse response is developed. The method takes advantage of numerical analysis tools like Matlab to calculate the BER of the overall link with the effect of all the error sources present.



Figure 2.10 Inductively-coupled transceiver model to calculate BER.

We start from the simple transceiver model shown in Figure 2.10. The transmitter is modeled by a simple current driver that sends return-to-zero pulses to the coupled inductor. Since the amplitude of these TX current pulses is quite large compared to any current noise, amplitude noise of $I_{TX}$ is ignored. However, the TX clock can be noisy due to the stringent TX power requirement. As a result, the TX clock jitter is modeled as a random shift in time of the transmitted $I_{TX}$ pulses with a variance of $t_{j,rms}^2$. The RX front-end is modeled as a gain block with a gain of $A_v$ followed by a low-pass filter that models the frequency response of the amplifier as well as any added filter to reject RX inductor ringing. Any thermal and flicker noise in the RX circuit can be lumped into one random variable with standard deviation $V_{n,rms}$, which is added to the output amplitude of the RX.



Figure 2.11 PDF of received voltage sampled at $t_0$.

Since the overall system from $i_{TX}$ to $v_{RX}$ is linear time-invariant (LTI), the $v_{RX}$ waveform can be calculated by convolving the pulse response of the overall system with the transmitted bit sequence. In fact, one can directly plot the statistical eye diagram from the pulse response alone with the assumption that the data bits are stationary with equal probability of being 0 and 1 [Stojanović04]. This method is best explained through an example. Suppose the pulse response, p(t), of the link is the same as the one shown in Figure 2.11. Assume the bit time is T (T=1 in the example of Figure 2.11), and ignore all the noise and jitter. If the sampling instant is $t_0$, then there are 3 cursors, with $h_0 = p(t_0)$ as the main cursor that carries the bit information. Post-cursors $h_1 = p(T + t_0)$ and $h_2 = p(2T + t_0)$ cause ISI and corrupt the main cursor. Let $v_{RX}(t)$ be the received signal waveform at the output of the RX amplifier chain when a binary random sequence of equally probable values is transmitted. Let $x(n, \tau)$ denote the received signal amplitude at bit instant n and sampling phase $\tau$, such that $\tau = mod(t, T)$. Then, $x(n, \tau) = v_{RX}(nT + \tau)$ with $\tau \in [0, T]$. In this example, $\tau = t_0$, and only 3 non-zero cursors are present, so the value of each $x(n, \tau)$ is affected by the 2 previous bits:

$$v_{RX}(nT + t_0) = x(n, t_0) = h_0 d(n) + h_1 d(n-1) + h_2 d(n-2) \tag{2.12}$$

where $d(n) \in \{-1, +1\}$ is the transmitted binary data sequence. Assuming the samples d(n) are uncorrelated, the probability density functions (PDFs) of the 3 terms in Eq. 2.12 can be written as:

$$\begin{cases} PDF_{h0}(x) = 0.5\delta(x - h_0) + 0.5\delta(x + h_0) \\ PDF_{h1}(x) = 0.5\delta(x - h_1) + 0.5\delta(x + h_1) \\ PDF_{h2}(x) = 0.5\delta(x - h_2) + 0.5\delta(x + h_2) \end{cases} \tag{2.13}$$

Then the PDF of the received amplitude at sampling phase  $t_0$  is simply the convolution of the PDF of the 3 cursors:

$$PDF(x,t_0) = PDF_{h0} \otimes PDF_{h1} \otimes PDF_{h2}$$

$$PDF(x,t_0) = \sum_{s_0, s_1, s_2 \in \{-1,+1\}} 0.125 \cdot \delta(x - s_0 h_0 - s_1 h_1 - s_2 h_2) \tag{2.14}$$

where the sum runs over all 8 sign combinations of the 3 cursors.
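The convolution of two-point PDFs in Eq. 2.13 and Eq. 2.14 amounts to enumerating every sign combination of the cursors. The sketch below does exactly that for an arbitrary list of cursors. The cursor values are hypothetical and are not read from Figure 2.11; the function name `isi_pmf` is illustrative only.

```python
from itertools import product


def isi_pmf(cursors):
    """Return {amplitude: probability} for sum_k d_k * h_k with d_k = +/-1 equiprobable."""
    pmf = {}
    weight = 0.5 ** len(cursors)              # each sign pattern is equally likely
    for signs in product((+1, -1), repeat=len(cursors)):
        amplitude = round(sum(s * h for s, h in zip(signs, cursors)), 12)
        pmf[amplitude] = pmf.get(amplitude, 0.0) + weight
    return pmf


# Hypothetical main cursor h0 and post-cursors h1, h2 (arbitrary units)
cursors = [1.0, 0.3, -0.1]
pmf = isi_pmf(cursors)
for amplitude in sorted(pmf):
    print(f"x = {amplitude:+.2f}  probability = {pmf[amplitude]:.3f}")

# Worst-case received level when d(n) = +1: main cursor minus all ISI magnitudes
worst_case_high = cursors[0] - sum(abs(h) for h in cursors[1:])
print(f"worst-case level for a transmitted +1: {worst_case_high:.2f}")
```

Repeating this enumeration for every sampling phase, with the cursors re-sampled from the pulse response at that phase, gives one column of the statistical eye diagram per phase.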

The resulting PDF of Eq. 2.14 is plotted in Figure 2.11. The statistical eye diagram is a visualization of the PDF as a function of amplitude x and sampling phase $\tau$. To plot the eye diagram, we need to calculate the PDF of the received amplitude for every sampling phase $\tau \in [0, T]$:

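$$PDF(x,\tau) = \left[0.5\delta\{x - p(-NT + \tau)\} + 0.5\delta\{x + p(-NT + \tau)\}\right] \otimes \left[0.5\delta\{x - p((-N+1)T + \tau)\} + 0.5\delta\{x + p((-N+1)T + \tau)\}\right] \otimes \dots \otimes \left[0.5\delta\{x - p(NT + \tau)\} + 0.5\delta\{x + p(NT + \tau)\}\right] \tag{2.15}$$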

With Eq. 2.15, the statistical eye diagram for the inductively coupled transcranial link can be plotted using the simulated pulse response shown in Figure 2.7c. The RX front-end has a gain of 59dB, the single-pole low-pass filter has a bandwidth of 330MHz, and amplitude noise and jitter are ignored. The eye diagram is shown in Figure 2.12. The color of each pixel in the plot represents the probability of receiving a signal at amplitude x and sampling phase $\tau$. The 2 eye openings correspond to the 2 $v_{RX}$ pulses in the pulse response.



Figure 2.12 Statistical eye-diagram with ISI from pulse response.

Amplitude noise in the system can be modeled as a random variable with a zero-mean Gaussian PDF and a standard deviation of $V_{n,rms}$. It appears as one more additive term in the received amplitude. Therefore, $PDF_{Vn}$, the PDF with the effect of both pulse response and amplitude noise, is the convolution of the noiseless PDF of Eq. 2.15 and the PDF of the amplitude noise in the x dimension:

$$PDF_{V_n}(x,\tau) = PDF(x,\tau) \otimes^x \frac{1}{\sqrt{2\pi}V_{n,rms}} e^{-\frac{x^2}{2V_{n,rms}^2}} \tag{2.16}$$

where the symbol $\otimes^x$ denotes convolution in x. With the simulated RX noise of $V_{n,rms} = 20mV$ from section 4.3.1, the resulting eye diagram is plotted in Figure 2.13. Note that the outlines of the eye are spread wider by the amplitude noise, and the minimum eye height and width are significantly reduced.
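At a fixed sampling phase, convolving the discrete ISI PDF with a Gaussian and integrating the tail that crosses the decision threshold reduces to averaging a Gaussian tail probability over all ISI patterns. The following sketch assumes a decision threshold at 0 and a positive main cursor. Only the 20mV noise level comes from the text; the cursor values are hypothetical, and the function names are illustrative.

```python
import math
from itertools import product


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ber_with_noise(cursors, v_n_rms):
    """BER at one sampling phase; cursors[0] is the main cursor, the rest are ISI.

    By symmetry of the +/-1 data, it is enough to average the error probability
    over all ISI sign patterns for a transmitted +1 with the threshold at 0.
    """
    h0, isi = cursors[0], cursors[1:]
    patterns = list(product((+1, -1), repeat=len(isi)))
    total = 0.0
    for signs in patterns:
        level = h0 + sum(s * h for s, h in zip(signs, isi))
        total += q_function(level / v_n_rms)
    return total / len(patterns)


V_N_RMS = 0.020                      # 20 mV RX noise quoted in the text
example_cursors = [0.20, 0.03, -0.01]  # hypothetical cursor amplitudes in volts

ber_no_isi = ber_with_noise(example_cursors[:1], V_N_RMS)
ber_isi = ber_with_noise(example_cursors, V_N_RMS)
print(f"BER without ISI: {ber_no_isi:.2e}")
print(f"BER with ISI:    {ber_isi:.2e}")
```

Evaluating this expression across sampling phases, with cursors taken from the pulse response at each phase, yields a bathtub curve before jitter is included.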



Figure 2.13 Statistical eye-diagram with ISI from pulse response and RX amplitude noise.

Random jitter affects the eye diagram in a similar way to amplitude noise, but the convolution is in the phase domain ($\tau$) instead of the amplitude domain (x). Assuming the TX random jitter also has a zero-mean Gaussian PDF with a standard deviation of $t_{j,rms}$, the PDF of the received signal with the effect of pulse response, amplitude noise, and jitter is:

$$PDF_{V_n,t_j}(x,\tau) = PDF_{V_n}(x,\tau) \otimes^{\tau} \frac{1}{\sqrt{2\pi}t_{j,rms}} e^{-\frac{\tau^2}{2t_{j,rms}^2}} \tag{2.17}$$

where the symbol $\otimes^{\tau}$ denotes convolution in $\tau$.



Figure 2.14 Statistical eye-diagram with ISI from pulse response, RX amplitude noise, and TX jitter.

The resulting eye diagram with the effect of pulse response, amplitude noise, and jitter is shown in Figure 2.14, with the pessimistic assumption that $t_{j,rms} = 60ps$. The simulated eye width and height are further reduced. From the eye diagram, the BER versus sampling phase plot (bathtub curve) can be obtained. As shown in Figure 2.15, a BER of less than 1e-15 is achievable within a sampling window of 1.5ns.



Figure 2.15 Simulated BER vs. sampling phase.

In summary, this chapter presents the electromagnetic model of the human cranium channel. The design and layout of 10mm × 10mm coupled inductors for transcranial wireless data communication are discussed in detail. To obtain a concrete analytical model, the coupled inductor structure inside the environment of the human skull is simulated in the electromagnetic simulator Ansoft HFSS. With the help of the coupled inductor model, the channel effects and pulse response are investigated for the proposed 200Mb/s return-to-zero modulation scheme. Inter-symbol interference caused by inductor ringing severely limits the achievable communication throughput. A simple and straightforward inductor de-Q technique is proposed to address this issue. Using the simulated pulse response of the coupled inductors along with TX clock jitter, RX amplifier noise, and gain, the transceiver eye diagram and BER can be obtained through the simple statistical analysis method developed in this chapter.

# Chapter 3 Ultra-low power clock generation for implantable transmitter

### 3.1 Design challenges

As shown in section 2.3, TX jitter has the single biggest impact on the BER of the link. Thus, TX clock generation is the most critical component of the transceiver circuit design. The 200MHz TX clock needs to be generated on chip from a low-frequency external reference. For most bio-implant applications, this low-frequency reference is either provided by a crystal oscillator on the implant or inductively coupled in from an external clock source. In either case, the reference source has much lower random jitter than the low-power oscillator on the TX chip. A wide range of reference frequencies (10MHz~50MHz) can be used. The selection of the reference frequency is usually constrained by the availability, cost, and size of the parts. In this work, a 10MHz reference is used because it is widely available in testing; most test equipment has a 10MHz reference output port readily available. The clock generation approach proposed here is not limited to a 10MHz reference.

The most important criteria for TX clock generation are total jitter and power consumption. Since the TX chip is to be implanted, the inductor current driver as well as the clock generator must consume as little power as possible to prolong battery life or to allow safe wireless power delivery. In addition, the neural recorder circuits require the TX chip supply to be 0.5V to satisfy a stringent power constraint, so the clock generator must operate under the same low supply voltage. Conventional clock generation techniques like the phase-locked loop (PLL) have many issues in this context. First, constraints on power consumption come at the expense of oscillator phase noise. Ring oscillators can be extremely noisy when operating in the micro-watt range. Although a low-power LC oscillator potentially alleviates this problem, the size of the inductor is usually too large to be attractive at the required operating frequency of 200MHz. Oscillator phase noise has a low-pass shape, and the PLL acts as a high-pass filter to reject oscillator noise. For a conventional PLL, the bandwidth of this high-pass noise rejection filter is usually limited to $\frac{1}{10}f_{ref}$ for stability reasons [Cowles10]. As a result, a significant amount of oscillator noise passes through to the clock output because of the low noise rejection bandwidth. To make matters worse, the charge pump circuit, which is an essential building block of a conventional PLL, cannot operate under a 0.5V supply. This is because under a 0.5V supply, the operational amplifiers that are needed to reduce mismatch between charging and discharging currents do not have sufficient headroom. Due to these challenges, a conventional PLL cannot meet the stringent requirements for clock generation in an implanted TX. Fortunately, injection locking comes to the rescue. The next section discusses this technique in detail.

### 3.2 Injection locking



Figure 3.1 Schematic of a 5-stage injection locked ring oscillator and its clock waveforms.

Injection locking, also known as phase realignment, is a well-known method to reduce oscillator phase noise [Ye02, Helal08, Lee09, Razavi04]. Its key advantage over a conventional PLL is its large noise rejection bandwidth with respect to the oscillator. This is best understood through an example. The injection-locked ring oscillator illustrated in Figure 3.1 consists of a 5-stage ring oscillator and a pull-down switch. The incoming injection clock ($CK_{inj}$), which runs at the reference frequency, consists of short pulses. The injection pulses turn on the injection switch for a short amount of time every reference cycle and pull the output clock ($CK_{out}$) to ground. This phase alignment by injection resets any phase error between the oscillator and the injection clock. In between injection pulses, the oscillator runs at its own natural frequency undisturbed, and the oscillator errors accumulate during this period. This phase error correction mechanism is illustrated in Figure 3.2. The free-running oscillator acts as a phase error integrator, and its output phase error, $\phi_e$, grows with time. This is also the reason why a free-running oscillator cannot be used directly as a clock source. For an injection-locked oscillator, the injection clock simply pulls $\phi_e$ back to 0 periodically. In between injection pulses, the time-domain waveform of $\phi_e$ is the same as that of the free-running oscillator.

#### 3.2.1 Reducing oscillator noise by injection locking



Figure 3.2 (a) The waveform of phase error that highlights the difference between free running oscillator and injection locked oscillator; (b) the phase error rejection model of injection locking.

The effect of injection on $\phi_e$ can be modeled as the feedforward error cancellation circuit shown in Figure 3.2b. Every reference cycle, the oscillator phase error, $\phi_{e,osc}$, is sampled, and the held value is subtracted from $\phi_{e,osc}$ to produce the output phase error after injection, $\phi_{e,inj}$. In the time domain, the waveform of the output phase error can be written as:

$$\phi_{e,inj}(t) = \phi_{e,osc}(t) - \left\{ \phi_{e,osc}(t) \sum_{n=-\infty}^{\infty} \delta(t - nT_{ref}) \right\} \otimes p(t) \tag{3.1}$$

where

$$p(t) = \begin{cases} 1 & \text{if } 0 \le t \le T_{ref} \\ 0 & \text{otherwise} \end{cases}$$

To find the phase error transfer function from the oscillator phase error to the injection-locked output, we take the Fourier transform of Eq. 3.1. The result is:

$$\Phi_{e,inj}(f) = \Phi_{e,osc}(f) - \frac{1}{2\pi} \left\{ \Phi_{e,osc}(f) \otimes \frac{2\pi}{T_{ref}} \sum_{n=-\infty}^{\infty} \delta\left(f - \frac{n}{T_{ref}}\right) \right\} \cdot \frac{\sin(\pi f T_{ref}) e^{-j\pi f T_{ref}}}{\pi f} \tag{3.2}$$

Note that the sampled term inside the braces of Eq. 3.2 is periodic in frequency with a period of $\frac{1}{T_{ref}}$. The random phase noise of the oscillator is usually band-limited to very low frequencies, so if Eq. 3.2 is applied to random phase noise, any non-zero terms outside the band $|f| < \frac{1}{2T_{ref}}$ (that is, $\pm \frac{\pi}{T_{ref}}$ in angular frequency) can be ignored. Let $\Phi_{n,osc}$ be the oscillator phase noise and $\Phi_{n,inj}$ be the output phase noise after injection locking is applied; then:

$$\Phi_{n,inj}(f) = \left| 1 - \frac{\sin(\pi f T_{ref}) e^{-j\pi f T_{ref}}}{\pi f T_{ref}} \right| \Phi_{n,osc}(f) \tag{3.3}$$

And the phase noise transfer function is:

$$H(f) = 1 - \frac{\sin(\pi f T_{ref}) e^{-j\pi f T_{ref}}}{\pi f T_{ref}} \tag{3.4}$$

Similar to a conventional PLL, the phase noise transfer function of the injection-locked oscillator in Eq. 3.4 is also a high-pass filter. The low-frequency oscillator phase noise is rejected by injection locking. However, the noise rejection bandwidth of the injection-locked oscillator is much larger than that of a conventional PLL. To highlight this difference, the oscillator noise transfer function of a conventional critically damped PLL with a bandwidth of $\frac{1}{10}f_{ref}$, as well as that of an injection-locked oscillator, are plotted in Figure 3.3. As shown, the noise rejection ratio (|H(f)|) of the injection-locked oscillator is significantly better than that of the conventional PLL below 3.5MHz, where the oscillator phase noise is most pronounced.
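The high-pass shape of Eq. 3.4 is easy to evaluate numerically. The sketch below computes |H(f)| for the 10MHz reference used in this work; the choice of evaluation frequencies is arbitrary, and the PLL curve of Figure 3.3 is not reproduced here. For $f \ll f_{ref}$, expanding Eq. 3.4 gives $|H(f)| \approx \pi f T_{ref}$, that is, a first-order high-pass response.

```python
import cmath
import math

F_REF = 10e6  # reference frequency used in this work (Hz)


def injection_ntf(f_hz, f_ref=F_REF):
    """Eq. 3.4: H(f) = 1 - sin(pi f T) exp(-j pi f T) / (pi f T), with T = 1/f_ref."""
    x = math.pi * f_hz / f_ref
    if x == 0.0:
        return 0j  # sin(x)/x tends to 1 as x -> 0, so H(0) = 0
    return 1 - (math.sin(x) / x) * cmath.exp(-1j * x)


for f in (1e4, 1e5, 1e6, 3.5e6):
    magnitude_db = 20 * math.log10(abs(injection_ntf(f)))
    print(f"{f / 1e6:5.2f} MHz: |H| = {magnitude_db:6.1f} dB")
```

The rejection improves by about 20dB per decade toward low frequencies, which is where the ring oscillator phase noise is concentrated.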



Figure 3.3 Phase noise transfer function of injection locked oscillator and PLL.

#### 3.2.2 Frequency tracking



Figure 3.4 Waveform of injection locked oscillator when the oscillator output is locked at  $\times$  8,  $\times$  10, and  $\times$  6 of reference clock frequency.

Despite its large noise rejection bandwidth, the injection-locked oscillator has several drawbacks. Some of its issues are so severe that it cannot be directly used as a clock generator without modification. One of these issues is the limited locking range. As illustrated in Figure 3.1, the only function of injection locking is to align the falling edge of the oscillator output to the rising edge of the injection clock every reference period, $T_{ref}$. It does not keep track of how many oscillator cycles have passed from one injection pulse to the next. As a result, the oscillator can be locked to any frequency that satisfies $T_{ref} = N \cdot T_{osc}$, where N is an integer. The 3 example waveforms in Figure 3.4 ($T_{ref} = 8T_{osc}$, $T_{ref} = 10T_{osc}$, $T_{ref} = 6T_{osc}$) are all valid after locking. In reality, the locked frequency depends on the natural frequency of the oscillator. The locking ratio between the oscillator output frequency and the reference frequency, N, also known as the divider ratio in conventional PLLs, is the integer that is closest to the ratio between the oscillator natural frequency ($f_{nat}$) and the reference frequency ($f_{ref}$). Thus, the locking range can be calculated as:

$$\frac{1}{(N+1)f_{ref}} < \frac{1}{f_{nat}} < \frac{1}{(N-1)f_{ref}} \tag{3.5}$$

Eq. 3.5 imposes a stringent requirement on the oscillator natural frequency when *N* is large. In modern CMOS processes, process, supply voltage, or temperature (PVT) variation can easily push the oscillator out of this locking range, thus causing locking failure.

To alleviate the issue of limited locking range, the oscillator first needs to be modified to enable adjustment of its natural frequency. One common method is to use the voltage-controlled oscillator (VCO) shown in Figure 3.5. Voltage-controlled current sources are added to the inverters of the 5-stage ring oscillator. The control voltage of the current sources, $V_c$, changes the inverters' delay by modulating their pull-down strength. This in turn adjusts the natural frequency of the VCO. In addition, the oscillation frequency is monitored and continuously calibrated by a frequency tracking loop. This calibration loop counts the number of oscillation cycles between injection pulses and adjusts $V_c$ through a digital-to-analog converter (DAC). If the count is greater than 20 (the value of N in this system), the control code to the DAC is decremented by 1 from its previous value to decrease $V_c$. Otherwise, the control code remains unchanged.
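The behavior of the tracking loop as described can be sketched in a few lines. This is a behavioral illustration only, not the circuit implementation: the linear code-to-frequency model in `vco_frequency`, its parameters, and the starting code are all hypothetical. The sketch implements only the rule stated in the text (decrement when the count exceeds N, otherwise hold), so it assumes the VCO starts above the target frequency.

```python
F_REF = 10e6    # reference frequency (Hz)
N_TARGET = 20   # 200 MHz output / 10 MHz reference


def vco_frequency(code, f_min=150e6, hz_per_code=1e6):
    """Hypothetical linear VCO + DAC model: natural frequency versus DAC code."""
    return f_min + hz_per_code * code


def count_cycles(f_nat, f_ref=F_REF):
    """Whole oscillator cycles that fit between two injection pulses."""
    return int(f_nat / f_ref)


def track(code, max_steps=1000):
    """Decrement the DAC code once per reference cycle while the count exceeds N."""
    for _ in range(max_steps):
        if count_cycles(vco_frequency(code)) > N_TARGET and code > 0:
            code -= 1      # too many cycles counted: lower V_c to slow the VCO
        else:
            break          # count is N or fewer: hold the code
    return code


settled = track(100)
print(f"settled code {settled}: f_nat = {vco_frequency(settled) / 1e6:.0f} MHz, "
      f"count = {count_cycles(vco_frequency(settled))}")
```

In this toy model the loop stops as soon as the counter no longer exceeds N, which brings the natural frequency close enough to $Nf_{ref}$ for injection locking to take over.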



Figure 3.5 Injection locked VCO with frequency tracking loop.

#### 3.2.3 Suppressing injection spur



Figure 3.6 Clock waveform, instantaneous period, and phase error of injection locked oscillator when (a)  $f_{nat}$  is the same as  $Nf_{ref}$ , (b)  $f_{nat}$  is slightly less than  $Nf_{ref}$ , and (c)  $f_{nat}$  is slightly more than  $Nf_{ref}$ .

In addition to limited locking range, another issue with the injection locked oscillator is its large injection spur [Ye02, Helal08, Lee09]. To understand injection spur, one needs to examine the injection locking mechanism more closely. When the natural frequency of the oscillator, $f_{nat}$, is not exactly an integer multiple of $f_{ref}$, the rising edges of the injection pulses cannot be aligned exactly to the falling edges of the oscillator without disturbing it. In this case, locking is achieved by disrupting the instantaneous oscillation cycle at the instant when injection occurs. Figure 3.6b illustrates this point. It shows that if the oscillator is too slow, $Nf_{ref} > f_{nat}$, the injection pulse reduces the instantaneous oscillation period ($T_N$) of the last cycle right at the injection by shortening its pulse width. The amount of reduction in the instantaneous period compensates for the excess phase accumulated between the injection pulses. Although under locked condition the oscillation period on average is equal to $T_{ref}/N$, its instantaneous period ($T_{inst}$) is not constant. It remains $\frac{1}{f_{nat}}$ for N-1 cycles in between injection pulses and drops to a lower value ($T_N$) at the cycle when the injection occurs. Thus, $T_{inst}$ is a periodic function with period $T_{ref}$. Since the phase error, $\phi_e$, is proportional to the integral of $T_{inst}$, it is also a periodic function with period $T_{ref}$. Conversely, if the oscillator is too fast as Figure 3.6c shows, $Nf_{ref} < f_{nat}$, the injection pulse forces the average oscillation frequency to be equal to $Nf_{ref}$ by increasing the instantaneous period every N cycles. The resulting phase error is also a periodic function but in the opposite direction. This periodic variation of phase error manifests as spurs at integer multiples of $f_{ref}$ in the phase noise. In the time domain, these phase errors translate into deterministic jitter (DJ) that reduces the eye width and degrades BER. If the offset between $f_{nat}$ and $Nf_{ref}$ is large, the DJ can easily be hundreds of picoseconds for a 200MHz injection locked oscillator using a 10MHz reference, causing severe BER degradation. Therefore, spurs must be mitigated for the injection locked oscillator to be a viable solution.
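
To give a feel for the magnitude of this deterministic jitter, the following sketch uses a first-order model: the oscillator is assumed to run at exactly $f_{nat}$ for N-1 cycles, and the injection is assumed to realign the edge instantaneously, so the last cycle absorbs the whole accumulated error. The 199MHz natural frequency (a 0.5% offset) is an assumed example value, and the function names are illustrative.

```python
def injection_periods(f_nat_hz, f_ref_hz, n):
    """Return (T_free, T_N): free-running period and the disturbed period at injection."""
    t_ref = 1.0 / f_ref_hz
    t_free = 1.0 / f_nat_hz
    # The last cycle in each reference period absorbs the accumulated timing error.
    t_n = t_ref - (n - 1) * t_free
    return t_free, t_n


def peak_deterministic_jitter(f_nat_hz, f_ref_hz, n):
    """Timing error accumulated over N-1 free-running cycles, just before realignment."""
    t_ideal = 1.0 / (n * f_ref_hz)
    t_free, _ = injection_periods(f_nat_hz, f_ref_hz, n)
    return (n - 1) * abs(t_free - t_ideal)


t_free, t_n = injection_periods(199e6, 10e6, 20)
dj = peak_deterministic_jitter(199e6, 10e6, 20)
print(f"T_free = {t_free * 1e12:.1f} ps, T_N = {t_n * 1e12:.1f} ps, peak DJ = {dj * 1e12:.0f} ps")
```

Under these assumptions a 0.5% frequency offset already yields a peak timing error of several hundred picoseconds, in line with the estimate in the text.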

To suppress injection spur, the oscillator natural frequency must be tuned to an integer multiple of the reference frequency. The frequency adjustment can be accomplished by the same VCO proposed in section 3.2.2, so a foreground frequency calibration is sufficient for process and supply voltage variation. However, the natural frequency of the oscillator can be very sensitive to temperature, and may drift significantly over time. Continuous background adjustment of the natural frequency is therefore required to suppress the spur in an injection locked clock generator. Many works have studied this subject. One common way to suppress the spur in the background is to use a PLL loop that works in tandem with injection locking [Ye02, Lee09]. However, this approach is not feasible in bio-implantable applications because of the high supply voltage (~1V) required for the PLL to operate, not to mention the potential conflict in phase error convergence between the PLL loop and injection locking. A much more elegant method is to measure the difference between the instantaneous period prior to the injection pulse ($T_{N-1}$ in Figure 3.6) and the instantaneous period right at the injection instant ($T_N$ in Figure 3.6) [Helal08]. The VCO is adjusted according to this difference, and the optimal frequency is reached when $T_{N-1} = T_N$. Figure 3.7 shows an implementation of this approach. The spur suppression loop uses a counter to generate pulses whose widths are equal to $T_{N-1}$ and $T_N$, and the pulse widths are then extracted by 2 integrators. The comparison result of these 2 pulse widths ($T_{N-1} - T_N$) is used to adjust the $V_c$ of the VCO. If the VCO is too slow, meaning $T_{N-1} > T_N$, $V_c$ is increased to speed it up; if the VCO is too fast, meaning $T_{N-1} < T_N$, $V_c$ is decreased to slow it down. The first counter output pulse $T_1$, instead of the injection clock ($CK_{inj}$), is used as the reset signal of the integrator and as the strobe clock for the comparator ("sgn" block) because the $CK_{inj}$ pulse occurs in the middle of the integration phase and would interrupt pulse width extraction. Note that there are minimal analog components in the spur suppression loop shown. Section 4.2.2.3 will show that the only mixed-signal components in the spur suppression loop besides the VCO, namely the integrator and comparison circuits, can be easily implemented under a 0.5V supply. This feature makes the proposed architecture very amenable to bio-implantable applications, in which a lower than nominal supply is usually preferred to reduce power.
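
The update rule described above can be illustrated with a small behavioral simulation. This is only a sketch under stated assumptions: the VCO is modeled as a linear code-to-frequency map, the starting frequency (195MHz) and step size (100kHz per code) are made-up values, and the polarity follows this chapter's convention that a larger $V_c$ speeds up the oscillator.

```python
def spur_suppression_step(code, t_prev, t_inj):
    """One bang-bang update: compare T_{N-1} (t_prev) with T_N (t_inj)."""
    if t_prev > t_inj:      # VCO too slow: injection shortened the last cycle
        return code + 1     # raise V_c to speed the VCO up
    if t_prev < t_inj:      # VCO too fast: injection stretched the last cycle
        return code - 1     # lower V_c to slow the VCO down
    return code


def simulate(f_start_hz, hz_per_code, f_ref_hz=10e6, n=20, steps=200):
    """Run the loop for a number of reference periods and return the final f_nat."""
    code = 0
    t_ref = 1.0 / f_ref_hz
    for _ in range(steps):
        f_nat = f_start_hz + hz_per_code * code   # assumed linear VCO model
        t_prev = 1.0 / f_nat                      # undisturbed period T_{N-1}
        t_inj = t_ref - (n - 1) * t_prev          # period at injection, T_N
        code = spur_suppression_step(code, t_prev, t_inj)
    return f_start_hz + hz_per_code * code


f_final = simulate(f_start_hz=195e6, hz_per_code=100e3)
print(f"final natural frequency: {f_final / 1e6:.2f} MHz")
```

In this model the code ramps until $f_{nat}$ reaches $Nf_{ref}$ and then dithers by at most one step around it, so the residual spur is bounded by the frequency resolution of the control path.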



Figure 3.7 Implementation of spur suppression.

In summary, this chapter studies clock generation strategies for the implanted TX chip. Since the conventional PLL has a severe limitation on its noise suppression bandwidth, the ultra-low power VCO used in the TX chip can give rise to large random jitter that severely degrades the BER of the link. Injection locking addresses this issue by resetting the accumulated oscillator phase noise every reference period. However, it has a limited frequency locking range and introduces a large injection spur, which causes large deterministic jitter. The limited locking range can be easily addressed by a counter-based frequency tracking loop. The injection spurs, on the other hand, occur because the oscillator natural frequency is not an integer multiple of the reference frequency. Injection pulses force the average oscillation frequency to be an integer multiple of the reference frequency by changing the instantaneous period of the oscillator at the instant of injection. Therefore, monitoring the instantaneous oscillation period at the injection instant and one cycle prior provides a way to continuously adjust the VCO and suppress the injection spur. A background spur suppression loop is built based on this principle.

# Chapter 4 Implementation of a 200Mb/s inductively-coupled wireless transcranial transceiver

In this chapter, the circuit implementation of the 200Mb/s inductively-coupled transcranial transceiver is discussed in detail. It starts with an overview of the transceiver architecture, and then the detailed implementations of the TX and RX circuits are discussed. The TX discussion focuses on the inductor driver and clock generation circuit. On the RX side, the low noise amplifiers, comparator circuits, clock distribution, and sampling phase adjustment circuits are discussed in sequence. A discussion of supporting circuitry for phase alignment and bit-synchronization is also included.

## 4.1 System overview



Figure 4.1 Architecture of 200Mb/s inductively-coupled transcranial transceiver.

Figure 4.1 shows the architecture of the 200Mb/s transceiver. The main functional blocks in the TX chip are an injection locked PLL, an inductor driver, and a pseudo random binary sequence (PRBS-7) generator to provide testing data. The injection locked PLL takes a 10MHz reference clock and multiplies its frequency by 20X to generate a 200MHz clock. The 200MHz clock is used to time the PRBS-7 generator, which provides a differential random bit stream (data and $\overline{data}$) at 200Mb/s. The inductor driver takes both data and $\overline{data}$ as input and converts the bits into the TX output current signal, $I_{TX}$. The 200MHz clock is also used by the inductor driver to produce a return-to-zero $I_{TX}$ waveform. The details of the driver circuit are discussed in section 4.2.1.
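
For readers unfamiliar with PRBS-7, the sketch below generates such a sequence in software. The text does not state which generator polynomial the chip uses; the code assumes the commonly used $x^7 + x^6 + 1$, and the function name and seed are illustrative.

```python
from itertools import islice


def prbs7(seed=0x7F):
    """Yield PRBS-7 bits from a 7-bit LFSR (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR stays at 0")
    while True:
        # XOR of the two most significant bits (stages 7 and 6) is fed back.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


bits = list(islice(prbs7(), 254))          # two full periods of 127 bits
data = bits
data_b = [b ^ 1 for b in bits]             # complementary stream, like data-bar
print("repeats after 127 bits:", bits[:127] == bits[127:])
```

A maximal-length 7-bit sequence repeats every 127 bits, which keeps the test pattern short while still exercising a wide range of run lengths.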

The binary encoded TX current, $I_{TX}$, is coupled to the input of the RX by off-chip coupled inductors. The input signal to the RX chip, $V_{RX}$, consists of voltage pulses whose polarity carries the transmitted bits. As mentioned in Chapter 2, two $V_{RX}$ pulses are produced for every bit transmitted. The first pulse is used to determine the bit, and the second pulse is ignored for simplicity. Since the received voltage amplitude is very small, a low noise amplifier chain is used to amplify this signal to reduce the effect of comparator noise and offset. The outputs of the amplifiers ($AMP_p$ and $AMP_n$) are fed into a clocked comparator, and the transmitted bit is decided. To minimize BER, the comparator must be triggered at the time when the vertical eye opening is the largest. To accomplish this, a phase alignment sampler is enabled during phase calibration at chip start-up to measure the eye height, and a phase interpolator (PI) is used to adjust the clock phase until the maximum eye height is reached. The PI requires quadrature 200MHz clock inputs to enable phase adjustment over the entire unit interval. Thus, differential 400MHz inputs are provided to the RX chip, and a current-mode-logic (CML) divider is used to generate the quadrature 200MHz input clocks to the PI. In addition, a broadband buffer (eye buffer) is added to send the amplifier outputs directly off chip to plot the RX eye diagram. The following sections discuss these circuit components in detail.

## 4.2 Transmitter design

#### 4.2.1 Inductor driver circuit



Figure 4.2 Schematic of inductor driver circuit.

As shown in Figure 4.2a, the output stage of the inductor driver consists of a pair of NMOS transistors, $M_{1,2}$, and a pair of PMOS transistors, $M_{3,4}$ [Yoshida07, Yoshida09, Miura07, Miura08, Kawai10]. Only one of the pairs $M_{1,3}$ or $M_{2,4}$ can be turned on at a time. Turning on $M_{1,3}$ steers the inductor current, $I_{TX}$, to the left. Conversely, turning on $M_{2,4}$ steers $I_{TX}$ toward the right. The transmitted bit is encoded in the direction of $I_{TX}$. If the positive polarity of $I_{TX}$ is defined as the direction from right to left, a bit 1 is represented by positive $I_{TX}$, and a bit 0 is represented by negative $I_{TX}$. Transistors $M_{5,6,7,8}$ and $M_{9,10,11,12}$ form current starved inverters that slow down the transition edges at the gates of PMOS transistors $M_{3,4}$. This in turn controls the rise and fall time of $I_{TX}$. To understand this, $M_{3,4}$ can be assumed to operate in the velocity saturation region when turned on. During a 0 to 1 transition of the incoming data (X transitions from high to low), $I_{TX}$ is roughly equal to $I_{M3}$ because the output impedance of the triode NMOS transistor $M_1$ is low.

$$I_{TX} \approx K \frac{W}{L} \left[ (V_{dd} - V_X - V_{TH}) \cdot V_{sat} - V_{sat}^2 \right] \tag{4.1}$$

where K is a process dependent parameter, W and L are the width and length of  $M_3$  respectively,  $V_{TH}$  is the threshold voltage,  $V_{sat}$  is the velocity saturation voltage, and  $V_{dd}$  is the supply voltage. The slope of  $I_{TX}$  over time is:

$$\frac{dI_{TX}}{dt} \approx -KV_{sat} \frac{W}{L} \cdot \frac{dV_X}{dt} \tag{4.2}$$
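
The step from Eq. 4.1 to Eq. 4.2 can be spelled out as follows, under the assumption that $K$, $W/L$, $V_{dd}$, $V_{TH}$, and $V_{sat}$ stay constant during the transition, so that $V_X$ is the only time-varying term:

$$\frac{dI_{TX}}{dt} \approx K \frac{W}{L} \cdot V_{sat} \cdot \frac{d}{dt}\left(V_{dd} - V_X - V_{TH}\right) = -K V_{sat} \frac{W}{L} \cdot \frac{dV_X}{dt}$$

The negative sign reflects the PMOS device: a falling gate voltage ($\frac{dV_X}{dt} < 0$) produces a rising $I_{TX}$.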

It is clear from Eq. 4.2 that the slope of the inductor current $I_{TX}$ is proportional to the slope of the gate voltage of $M_3$ ($V_X$). Therefore, by controlling the rise/fall time of X and $\overline{X}$, the slope and rise/fall time of $I_{TX}$ can be adjusted. Since the voltage pulse coupled to the RX chip ($V_{RX}$) is also proportional to $\frac{dI_{TX}}{dt}$, the current starved inverters essentially set the amplitude and pulse width of $V_{RX}$. Note that the smaller the rise/fall time of X and $\overline{X}$ is, the larger $\frac{dI_{TX}}{dt}$ and $V_{RX}$ are. This suggests that the rise/fall time of X and $\overline{X}$ should be minimized for larger $V_{RX}$ and higher SNR. However, this is not the case in practice. Making the rise/fall time of X and $\overline{X}$ shorter decreases the $V_{RX}$ pulse width and makes BER more sensitive to TX jitter. In addition, a sharp transition of $I_{TX}$ excites larger ringing in the pulse response, resulting in worse ISI. As mentioned earlier, although the TX inductor is de-Q'ed to eliminate its ringing, the RX inductor is still underdamped because adding resistors on the RX coil increases noise and significantly reduces SNR. Through simulation, the optimal rise/fall time in this design is found to be $\sim 1.2$ns.

The 200MHz clock signal (ck) is also used in the inductor driver circuit to generate the return-to-zero $I_{TX}$ waveform. As shown in Figure 4.2b, the differential pseudo-random bit sequences coming out of the PRBS-7 block, data and $\overline{data}$, are synchronized to the rising edge of ck. The gate voltages of the PMOS transistors $M_{3,4}$, X and $\overline{X}$, are gated by the inverted clock, $\overline{ck}$, through an OR gate formed by the CMOS NOR gates and the current starved inverters. Therefore, X and $\overline{X}$ can only be 0, allowing $I_{TX}$ to flow through the inductor, during the high phase of ck (when $\overline{ck}$ is low). The low phase of ck blocks $I_{TX}$ by shutting down $M_{3,4}$. As a result, $I_{TX}$ pulses last about 50% of the bit time; the other 50%, the return-to-zero phase, allows the inductor current to completely discharge.

#### 4.2.2 Injection-locked phase-locked loop (PLL) with background digital spur suppression



Figure 4.3 Block diagram of injection locked PLL with fully digital frequency tracking and spur suppression.

Figure 4.3 shows the block diagram of the entire injection locked PLL with frequency tracking and spur suppression loops. It consists of an injection locked VCO, a combined thermometer and R-2R resistor DAC to adjust its control voltage, a pulse generator, a shift register counter, a pulse width comparator, a digital accumulator, and arithmetic blocks to perform loop updates. The pulse generator takes the 10MHz reference and produces periodic narrow pulses as the injection clock, $CK_{inj}$. The injection locked VCO is a 5-stage ring oscillator. Clock injection is accomplished by pulling node X to ground through an NMOS transistor. The natural frequency of the ring oscillator is inversely proportional to the control voltage $V_c$. The detailed implementation of the injection locked VCO is discussed in section 4.2.2.1.

The output of the oscillator ($CK_{out}$) is directly connected to a shift register counter that consists of 21 standard CMOS flip-flops to implement a division ratio of 20. The injection pulse resets all the flip-flops to 0 except $FF_{20}$ and $FF_{21}$. As shown in Figure 4.3, a 1 is propagated down the flip-flop chain at the rising edges of $CK_{out}$. Before the next injection pulse arrives, $FF_{20}$ is set to 1 if at least 20 rising edges of $CK_{out}$ have passed; otherwise, it remains 0. Similarly, if at least 21 rising edges of $CK_{out}$ have passed during this time, $FF_{21}$ is set to 1. The values of $FF_{20}$ and $FF_{21}$ are sampled and held by 2 extra flip-flops ($FF_{o,20}$ and $FF_{o,21}$) at the rising edge of the next injection pulse so that their values are not destroyed by the shift register reset. Their combined output, a 2-bit bus S(1:0), indicates whether the correct average oscillation frequency has been reached. If $FF_{o,21} = 0$ and $FF_{o,20} = 0$, then S(1:0) = 0, which means the VCO is too slow. The accumulator is decremented in this case to decrease $V_c$ and speed up the oscillation. If $FF_{o,21} = 1$ and $FF_{o,20} = 1$, then S(1:0) = 3. In this case, the VCO is too fast, and the accumulator is incremented to slow down the oscillation. If $FF_{o,21} = 0$ and $FF_{o,20} = 1$, then S(1:0) = 1. This indicates that the VCO frequency is correct and stays within the range given by Eq. 3.5, so the accumulator does not need to be updated by the frequency tracking loop, and control of the accumulator input is handed to the spur suppression loop. The scenario S(1:0) = 2 ($FF_{o,21} = 1$ and $FF_{o,20} = 0$) never happens in this design because $FF_{o,20}$ must rise to 1 before $FF_{o,21}$ can be set. Therefore, the multiplexer that is controlled by S(1:0) does not need an S(1:0) = 2 selection. Note that $FF_{20}$ is reset by the rising edge of $P_1$ (or equivalently, the rising edge of the $FF_1$ output), instead of $CK_{inj}$. This allows sufficient hold time for $FF_{o,20}$. For the same reason, $FF_{21}$ is not reset by $CK_{inj}$. In fact, an explicit reset of $FF_{21}$ is not required because the subsequent $CK_{out}$ edge forces its value to 0.
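
The decision logic of the frequency tracking loop can be summarized in a few lines of code. This is a behavioral sketch only: the function names and returned labels are illustrative, and the edge-count thresholds reflect the reading that $FF_{20}$ is set once a 20th rising edge arrives and $FF_{21}$ once a 21st arrives.

```python
def count_to_flags(rising_edges: int) -> tuple[int, int]:
    """Map the number of CK_out rising edges per reference period to (FF_o21, FF_o20)."""
    return int(rising_edges >= 21), int(rising_edges >= 20)


def frequency_tracking_update(ff_o21: int, ff_o20: int) -> str:
    """Decode S(1:0) into the action taken on the accumulator."""
    s = (ff_o21 << 1) | ff_o20
    if s == 0:
        return "decrement"          # too slow: lower V_c to speed up the VCO
    if s == 3:
        return "increment"          # too fast: raise V_c to slow down the VCO
    if s == 1:
        return "spur_suppression"   # frequency locked: hand over to the spur loop
    raise ValueError("S(1:0) = 2 cannot occur: FF_o20 is always set before FF_o21")


for edges in (19, 20, 21):
    print(edges, frequency_tracking_update(*count_to_flags(edges)))
```

Note that the polarity here follows this chapter, where decreasing $V_c$ raises the oscillation frequency.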

In addition to the frequency tracking loop, spur suppression is needed to reduce this PLL's deterministic jitter. As mentioned earlier, monitoring the instantaneous periods $T_{19}$ and $T_{20}$ provides a way to adjust the VCO natural frequency, and the injection spur is minimized when $T_{19} = T_{20}$. To accomplish this, 2 pulse waveforms as shown in Figure 4.4, $P_{19}$ and $P_{20}$, whose pulse widths are equal to $T_{19}$ and $T_{20}$, are first generated by combining the outputs of $FF_{19}/FF_{20}$ and $FF_{20}/FF_{1}$ respectively. The pulses are then fed into a pulse width comparator to decide which one is longer. If $T_{19} > T_{20}$, $V_c$ is decreased to raise the oscillation frequency; if $T_{20} > T_{19}$, $V_c$ is increased to lower the oscillation frequency. The pulse width comparator is triggered by the first rising edge of $CK_{out}$ after $P_{20}$ (equivalent to the rising edge of $P_{1}$). The implementation of the pulse width comparator is discussed in section 4.2.2.3. Note that the spur suppression loop is only enabled after the average frequency is locked in this design. Therefore, frequency tracking and spur suppression can work in tandem to converge the PLL frequency. Thanks to these techniques, the entire PLL can operate under 0.5V while consuming only 130uW.



Figure 4.4 Timing diagram of clocks and shift register counter outputs.

#### 4.2.2.1. Injection locked VCO



Figure 4.5 Schematic of injection locked VCO.

The schematic of the injection locked VCO is shown in Figure 4.5. As mentioned before, the injection locked VCO consists of 5 stages of CMOS current starved inverters. The top PMOS and NMOS transistors of the current starved inverters, $M_2$ and $M_3$, are small-width, minimum-channel-length devices (65nm) to reduce power. Since the bottom current limiting transistor of the current starved inverters, $M_1$, is biased to a constant voltage, $V_b$, by a current mirror, a long channel device (400nm) is used to reduce the current mismatch. A side benefit of using a long channel device for $M_1$ is its smaller contribution to phase noise. Note that the body voltage of $M_1$ is chosen as the VCO control voltage. This is because the quantization error of the DAC that generates the control voltage, $V_c$, causes deterministic jitter in steady state. The oscillator natural frequency is extremely sensitive to the gate voltage of $M_1$ at this low supply (0.5V), so using the gate voltage as the control knob imposes a stringent requirement on the DAC resolution. On the other hand, $g_{mb}$ of the transistor is orders of magnitude lower than $g_m$, so using the body terminal instead of the gate significantly lowers $K_{VCO}$ (the sensitivity of the VCO frequency to $V_c$). A simple rail-to-rail 10b R-2R DAC suffices in this design. One potential issue with this approach is that the body-to-source voltage of $M_1$, $V_{bs}$, is greater than 0, so its body-source p-n junction is forward biased. However, thanks to the low supply voltage used, the forward bias body-source current is negligible even when $V_c = 0.5V$.

#### 4.2.2.2. Pulse generator



Figure 4.6 Schematic of injection pulse generator.

The narrow pulse of the injection clock is generated using the simple circuit shown in Figure 4.6. It uses an AND gate to combine the reference clock ($CK_{ref}$) with a delayed and inverted version of the same clock. The inserted delay determines the pulse width. A 5-bit delay control is added so that the pulse width of the injection clock can be adjusted.

#### 4.2.2.3. Pulse width comparator



Figure 4.7 Schematic of pulse width comparator.

The schematic of the pulse width comparator used in the spur suppression loop is shown in Figure 4.7. The incoming pulses ($P_{19}$ and $P_{20}$ in Figure 4.4) are connected to $V_{ip}$ and $V_{in}$ respectively. The pulse widths ($T_{19}$ and $T_{20}$) are first converted into voltage signals ($V_{intp}$ and $V_{intn}$) by the front-end integrator ($M_{1-6}$). The current sources $M_{1,2}$ set the integration current. The integrator outputs are held on a pair of digitally controlled capacitors. The 7-bit control bus ($D_1\langle 6:0\rangle$) can be adjusted to tune out the offset of the integrator. PMOS transistors $M_{5,6}$ reset $V_{intp}/V_{intn}$ to 0.5V before the pulses arrive. The integrated voltage difference ($V_{intp}-V_{intn}$) is resolved by a classic StrongArm comparator ($M_{7-14}$). Transistors $M_{7,8}$ are the input differential pair, and the auxiliary differential pair $M_{9,10}$ is added to provide coarse offset tuning capability. The gate voltages of $M_{9,10}$ ($V_{refp}$ and $V_{refn}$) are driven by a 5-bit differential resistor DAC. Cross-coupled pairs $M_{11-14}$ regenerate the integrated voltage difference, and the digital output ($D_{out}$) is held in an S-R latch. Note that the integrator is reset a short time ($t_d$) after the StrongArm comparator is triggered to provide enough hold-time margin for the comparator. The offset of this pulse width comparator is calibrated in the foreground.

## 4.3 Receiver design

In this section, the detailed implementations of the RX circuits are discussed. Section 4.3.1 starts with the front-end low noise amplifiers. Then, the clock path, including the quadrature frequency divider and phase interpolator, is discussed in section 4.3.2. The implementation of the StrongArm comparator circuit used in the RX is the same as in the pulse width comparator of the TX PLL, so it is not investigated here. Finally, the RX clock phase alignment and RX/TX bit synchronization strategy used in testing is presented in section 4.3.3.

#### 4.3.1 Low noise amplifier



Figure 4.8 Block diagram of low noise amplifier chain.

As shown in Figure 4.8, the RX front-end consists of a cascade of 4 low noise amplifier stages (S<sub>1-4</sub>). The input to the amplifier chain is DC coupled to the RX inductor. Two large resistors, $R_b$, bias the amplifier inputs to a voltage $V_b$ generated on chip. Each amplifier stage is resistor loaded. As mentioned earlier, ringing caused by the RX inductor is present at the RX inputs, so a bandwidth limiting capacitor C is added to the output of the first stage (S<sub>1</sub>) to mitigate its effect. The AC coupling capacitor between S<sub>3</sub> and S<sub>4</sub>, $C_C$, blocks DC offsets from the first 3 stages (S<sub>1-3</sub>). The DC wander caused by the $C_C$'s is not an issue because the RX input pulses are DC balanced in the inductive coupling scheme. The offset of S<sub>4</sub> has a relatively small effect on the SNR because its input signal is amplified by S<sub>1-3</sub>. Nevertheless, an offset adjustment capability is still added to the subsequent comparator circuit to cancel the offset of S<sub>4</sub>. The same architecture is used for all the amplifiers (S<sub>1-4</sub>). Since the earlier stages contribute most of the input referred noise and require larger transistor width and bias current, the later stages (S<sub>2-4</sub>) are scaled down progressively to save power. Additional capacitance $C_{BW}$ with 4-bit control (D(3:0)) is added at the S<sub>4</sub> output to limit the noise bandwidth. The overall bandwidth of the amplifier chain is 330MHz.



Figure 4.9 Schematic of a single amplifier stage.

As shown in Figure 4.9, the amplifier stage used in the RX chain is implemented as a cascode common-source amplifier with resistor load. The differential pair ($M_{1,2}$) uses long channel transistors to reduce flicker noise. The cascode devices ($M_{3,4}$) are added to increase the output impedance of the differential pair and to reduce the Miller effect on the input capacitance. Since the amplifier stages are DC coupled together, the output common mode voltage determines the input common mode bias of the subsequent stage. Therefore, common mode feedback (CMFB) must be added to ensure all transistors are in saturation. This is accomplished by a folded-cascode common mode amplifier ($M_{6-14}$). The main amplifier common mode voltage ($V_{cmo}$) is sensed by a pair of large resistors, $R_b$, and the CMFB amplifier feeds back the signal $V_{fb}$ to the gate of the tail transistor $M_5$. To increase the phase margin of the CMFB loop, the tail transistor of the main amplifier is split into $M_0$ and $M_5$, and only $M_5$ is used to inject feedback current, which lowers the loop gain. To increase the headroom of the amplifier, a 1.5V supply is used for the entire amplifier chain. The transistors are carefully biased so that none of their gate-to-source and drain-to-source voltages exceeds 1V. Thanks to the relaxed power requirement for the external RX chip, the power penalty of using a higher than nominal supply is not an issue.



Figure 4.10 Amplifier chain (a) frequency response; (b) output referred noise PSD.

Each amplifier stage has roughly 15dB of gain. The overall amplifier chain achieves 59dB DC gain, 330MHz bandwidth, and ~20mV output referred RMS noise. The simulated frequency response and noise power spectral density (PSD) of the entire amplifier chain are plotted in Figure 4.10.
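
The numbers above can be cross-checked with simple decibel arithmetic. The sketch below is a rough estimate only: it refers the output noise to the input by dividing by the DC gain, which ignores the frequency dependence of the gain, and the variable names are illustrative.

```python
def db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio."""
    return 10 ** (gain_db / 20)


STAGE_GAIN_DB = 15        # approximate gain per stage (from the text)
NUM_STAGES = 4
CHAIN_GAIN_DB = 59        # simulated DC gain of the whole chain
OUTPUT_NOISE_V = 20e-3    # approximate output referred RMS noise

cascade_estimate_db = STAGE_GAIN_DB * NUM_STAGES
input_noise_v = OUTPUT_NOISE_V / db_to_voltage_ratio(CHAIN_GAIN_DB)
print(f"cascade estimate: {cascade_estimate_db} dB, "
      f"input referred noise: {input_noise_v * 1e6:.1f} uV RMS")
```

Four stages of roughly 15dB give about 60dB, consistent with the simulated 59dB, and the ~20mV output noise corresponds to an input referred noise on the order of 20uV RMS under this approximation.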

#### 4.3.2 Frequency dividers and phase-interpolator





Figure 4.11 Schematic of (a) CML frequency divider and (b) phase interpolator and CML to CMOS converter.

As mentioned earlier, a phase interpolator with full unit interval range requires quadrature phase clock inputs. In this design, a current mode logic (CML) frequency divider converts the 400MHz differential clock inputs ($CK_i$ and $\overline{CK_i}$) from an off-chip source to 200MHz quadrature phase clocks ($CK_{0^{\circ}}$, $CK_{90^{\circ}}$, $CK_{180^{\circ}}$, $CK_{270^{\circ}}$). The CML frequency divider shown in Figure 4.11a consists of 2 CML latches with resistor load ($R_L$). The first latch ($M_{0-6}$) is transparent in the positive clock phase, and the following latch ($M_{7-13}$) is transparent in the negative clock phase. A 1.1V nominal supply is used for the divider.

The 10-bit phase interpolator (PI) (Figure 4.11b) uses the current mixing approach to interpolate the phase between adjacent quadrature clock edges [Sidiropoulos98]. The PI core consists of 4 differential pairs ($M_{5-12}$) that are controlled by the 4 quadrature clock phases. The output phase is moved by mixing the output currents from 2 of the differential pairs that are controlled by adjacent clock phases. Two 8-bit current DACs are used to adjust the relative current strength between the 2 differential pairs. The lower 8 bits of the PI control bus ($D_1\langle 7:0\rangle$) control the DAC current. The upper 2 bits, $D_1\langle 9:8\rangle$, select the quadrant by controlling the 4 switches $M_{1-4}$. To ensure monotonicity of the PI, an edge rate control capacitor ($C_{EC}$) is added to smooth out the PI output ($V_{op}$, $V_{on}$) waveform, and long channel transistors are used to implement the 2 current DACs. To convert the low swing PI output into rail-to-rail CMOS voltage levels, a simple CML-to-CMOS converter circuit is used. An AC coupling capacitor ($C_C$) and a large feedback resistor bias the inverter at its trip point. One low swing PI output is amplified by the inverter gain. A dummy load is added on the other PI output to balance the loads between $V_{op}$ and $V_{on}$.
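
The split of the 10-bit control word into a quadrant select and a fine interpolation weight can be sketched as follows. The linear code-to-phase mapping is an idealization assumed here for illustration (a real current-mixing interpolator is only approximately linear), and the function name is hypothetical. Code 270 is used as the example input.

```python
def decode_pi_code(code: int, f_clk_hz: float = 200e6):
    """Split a 10-bit PI code into quadrant and fine weight; return an ideal phase."""
    if not 0 <= code < 1024:
        raise ValueError("PI code must fit in 10 bits")
    quadrant = code >> 8            # upper 2 bits select the quadrant
    fine = code & 0xFF              # lower 8 bits set the DAC current split
    # Idealized assumption: phase advances linearly with the code.
    phase_deg = 90.0 * (quadrant + fine / 256)
    delay_s = phase_deg / 360.0 / f_clk_hz
    return quadrant, fine, phase_deg, delay_s


quadrant, fine, phase_deg, delay_s = decode_pi_code(270)
print(f"quadrant {quadrant}, fine code {fine}, "
      f"{phase_deg:.2f} deg, {delay_s * 1e9:.3f} ns")
```

With 1024 codes spanning one 5ns unit interval, the ideal phase step is about 4.9ps per code under this linear assumption.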

#### 4.3.3 Phase alignment sampler circuit



Figure 4.12 Schematic of phase alignment sampler and its timing diagram.

The schematic of the phase alignment sampler circuit is shown in Figure 4.12. It is only enabled during start-up, when the sampling phase is under calibration. It monitors the amplifier output eye height while the PI codes update. The sampler consists of a pair of PMOS switches ($M_{1,2}$) followed by 2 NMOS broadband source follower buffers ($M_{3-6}$). The output of the NMOS source followers goes through a pair of NMOS switches ($M_{7,8}$), followed by 2 PMOS source follower buffers. The outputs ($V_{op}$ and $V_{on}$) are brought off chip for measurement. When enabled, the front-end switch is driven by a 50% duty-cycle clock. $V_{sp}$ and $V_{sn}$ track the inputs when CK is low, and hold their voltages on the $C_1$'s when CK is high. If the bandwidth of the front-end track-and-hold circuits ($M_{1,2}$ and the $C_1$'s) is sufficient, the held values of $V_{sp}$ and $V_{sn}$ are the same as $V_{ip}$ and $V_{in}$ at the rising edge of CK. The subsequent track-and-hold circuits ($M_{7,8}$ and the $C_2$'s) are driven by $CK_1$, which consists of short pulses that occur during the hold time of the front-end track-and-hold. They propagate the signals held on $C_1$ onto $C_2$. If the input signal is repetitive with the CK period, this circuit implements an ideal impulse sampler that triggers at the rising edges of CK. How this circuit helps sampling phase calibration is examined next.



Figure 4.13 Input and output waveforms from phase alignment sampler at different PI settings during sampling phase calibration.

During calibration, the TX is set to transmit a sequence consisting of only bit-1's. The received voltage waveform is a repetitive pattern that consists of a positive pulse followed by a negative pulse as shown in Figure 4.13. Therefore, the outputs of the amplifier chain (equivalently, the input to the phase alignment sampler, $V_{in}$ in Figure 4.12) are periodic functions with the same period as CK. If the relative phase between CK and $V_{in}$ is fixed, the sampler output voltage, $V_{out}$, is a constant DC voltage that can be easily measured by an off-chip DC meter. Since the relative phase can be adjusted by the PI, the received voltage amplitude at different sampling phases can be measured by sweeping the PI codes. The optimal sampling phase, or PI setting, is the point where the sampler output voltage is the largest.
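
The calibration procedure amounts to a sweep-and-pick-maximum search, sketched below. The measurement function is a hypothetical stand-in for the off-chip DC meter reading: it models the sampler output as a sinusoid over one unit interval, which is an assumption made purely for illustration and not the measured waveform shape.

```python
import math


def calibrate_sampling_phase(measure_dc, codes=range(1024)):
    """Sweep the PI codes and return the one with the largest sampler output."""
    return max(codes, key=measure_dc)


def fake_sampler_output(code, peak_code=270):
    """Hypothetical meter reading: a sinusoid over the unit interval peaking at peak_code."""
    return math.cos(2 * math.pi * (code - peak_code) / 1024)


best_code = calibrate_sampling_phase(fake_sampler_output)
print("optimal PI code:", best_code)
```

In practice the sweep could use a coarser step first and then refine around the maximum, since each point only requires a DC measurement.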



Figure 4.14 Measured output waveforms of phase alignment sampler at different PI settings during sampling phase calibration.

Figure 4.14 shows the measured output waveforms of the phase alignment sampler at different PI settings during sampling phase calibration. In this case, the optimal PI setting is code 270 of the 10-bit PI control.

In summary, this chapter discusses the detailed circuit implementation of the 200Mb/s inductively-coupled wireless transcranial transceiver. The TX chip consists of a simple inductor driver with rise/fall time control, a PRBS generator, and an injection locked PLL with fully digital frequency tracking and spur suppression loops. An injection locked ring oscillator with current starved inverters is used to implement the VCO. A pulse width comparator is used to measure the difference in the instantaneous oscillation periods at the injection instant and one cycle before. On the RX side, the received signal is amplified by a low noise amplifier chain with 59dB of gain and 330MHz bandwidth, and is then detected by a StrongArm comparator. A phase interpolator is used to adjust the strobe signal of the comparator. A phase alignment sampler circuit is used in conjunction to move the triggering time of the comparator to the peak of the eye.

# **Chapter 5 Measurement results**



Figure 5.1 Die photo.

To verify the proposed architecture, a 200Mb/s inductively-coupled wireless transceiver was designed and fabricated using TSMC 65nm CMOS technology. Figure 5.1 shows the die photo. Both TX and RX are located on the same die that measures 1.5mm×1.5mm. The TX portion occupies the upper one-third of the die, and the RX portion occupies the bottom two-thirds.

The same die is directly attached and wire-bonded on both the TX and RX printed circuit boards (PCBs). On the TX board, the RX chip portion is disabled by shorting its supply to ground via bond wires. On the RX board, the TX chip supply is disabled. The entire TX portion uses a single 0.5V supply, while the RX uses a 1.5V supply for the front-end amplifiers and 1.1V for the rest of the circuits. As shown in Figure 5.2, the 10mm×10mm coupled inductor is fabricated on a standard FR-4 2-layer PCB. The width of both the TX and RX inductor coil traces is 200um. 0.5oz copper is used for the PCB top layer, which makes the trace thickness about 18um (0.7mil). The TX inductor uses a 2-turn coil with 200um spacing between turns. The 200um trace width creates just enough landing pad to mount 400um×200um (0402 metric package size) SMT de-Q resistors.

## 5.1 Measurement setup



Figure 5.2 Boards setup (a) side view with channel; (b) measured channel thickness; (c) TX side with channel; (d) RX side with channel.

The wireless link is characterized over an 11mm air gap and over a biological channel. For the biological testing channel, 8-week primordial piglet carcasses are used to mimic the human head. As shown in Figure 5.2, the skull channel consists of roughly 7mm of bone, 2mm of fat, and 2mm of skin. Although the measured overall thickness is 11.8mm (Figure 5.2b), it is listed as 11mm in Table 5.1 for simplicity. The TX PCB is mounted inside the skull bone (Figure 5.2c), and the RX PCB is mounted outside the scalp (Figure 5.2d). No precision alignment between the TX and RX coils is applied.

The measurement setup for the biological channel is shown in Figure 5.3. The head of the piglet carcass is placed underneath the TX board to mimic the real environment inside the human skull. It is important to note that the surrounding brain tissue changes the electrical properties of the TX coil inductor considerably, so adding biological tissue directly underneath the TX board is essential for accurate modeling of the actual cranium channel. The RX board is visible on top of the scalp, and the TX board is underneath.

Chip configuration commands are sent to the TX and RX from a PC through 2 FPGA boards, one for the TX and one for the RX. A Keysight E4438C signal generator and a balun provide the 400MHz differential input clock for the RX chip. The 10MHz reference output of the signal generator provides the reference clock for the TX chip. A balun and an attenuator convert this clock to differential and reduce its amplitude to ~500mV peak-to-peak. To measure the BER of the link, a Keysight N4903B BER tester is used. A Keysight 81150A pulse generator generates the sampling clock of the BER tester, and this sampling clock is synchronized to the TX reference clock through the 10MHz reference input port of the pulse generator. A Keysight Infiniium 54855A digital oscilloscope (DSO) is used to plot the RX eye diagram from the eye monitor output. The phase noise of the TX PLL is plotted using a Keysight E5052B signal source analyzer (SSA).

To shield the RX inductor coil from over-the-air radio interference, a small paper cap covered with aluminum foil is placed on top of the RX board. This shielding method is found to be sufficient in the indoor lab environment. The BER of the link degrades considerably when a strong interference source is present. However, electromagnetic shielding in the form of over-the-head metal caps or helmets is readily available and is not overly inconvenient for patients.



Figure 5.3 Measurement setup for biological channel.

## 5.2 Results and discussion





Figure 5.4 PLL phase noise measurement result with (a) SSA; (b) DSO.

The measured 200MHz PLL phase noise plot is shown in Figure 5.4. Thanks to the spur suppression technique, the injection spur is reduced to roughly -43dBc. The remaining residual spur is caused by quantization errors of the 10-bit DAC used to generate the VCO control voltage. The total integrated RMS jitter, which includes both random and deterministic jitter, is only 58.7ps as measured with the Keysight E5052B signal source analyzer (Figure 5.4a). The jitter measurement is repeated in the time domain using the Keysight 54855A digital sampling oscilloscope (Figure 5.4b), and the result agrees with the frequency-domain measurement. The entire PLL consumes a total of 130uW, of which only 20uW is consumed by the VCO.
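For reference, the standard relation between a single-sideband phase noise profile $L(f)$ and RMS jitter, $\sigma_t = \sqrt{2\int L(f)\,df}\,/\,(2\pi f_0)$, can be sketched as below. This is an illustration only: the function name is hypothetical, the flat -100dBc/Hz profile is made-up test data rather than the measured curve of Figure 5.4, and discrete spurs (which contribute to the measured 58.7ps) are not modeled. The last line simply expresses the reported 58.7ps as a fraction of the 5ns unit interval at 200Mb/s.

```python
import math
from typing import Sequence


def rms_jitter_from_phase_noise(
    offsets_hz: Sequence[float], l_dbc_hz: Sequence[float], f_carrier_hz: float
) -> float:
    """Integrate SSB phase noise (trapezoid rule) and return RMS jitter in seconds."""
    linear = [10 ** (l / 10) for l in l_dbc_hz]  # dBc/Hz -> 1/Hz
    area = 0.0
    for i in range(1, len(offsets_hz)):
        width = offsets_hz[i] - offsets_hz[i - 1]
        area += 0.5 * (linear[i] + linear[i - 1]) * width
    phase_rms_rad = math.sqrt(2.0 * area)  # factor 2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * f_carrier_hz)


# Hypothetical flat profile: -100 dBc/Hz from 1 kHz to 10 MHz on a 200 MHz carrier.
offsets = [1e3, 1e4, 1e5, 1e6, 1e7]
profile = [-100.0] * len(offsets)
jitter_s = rms_jitter_from_phase_noise(offsets, profile, 200e6)
print(f"{jitter_s * 1e12:.2f} ps RMS")

# Reported jitter as a fraction of the unit interval at 200 Mb/s.
jitter_ui = 58.7e-12 * 200e6
print(f"{jitter_ui:.4f} UI")
```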



Figure 5.5 RX eye diagram for (a) over-the-skull measurements and (b) over-the-air measurements.

The eye diagram at the output of the RX eye buffer is shown in Figure 5.5. Two eyes are present because, in this design, the 2 transitions in every TX current pulse generate 2 RX voltage pulses through inductive coupling. The eye diagrams for both the over-the-skull and over-the-air measurements are not symmetrical about the center horizontal line. This is mainly caused by nonlinearity of the RX amplifiers: the signal amplitude at the last amplifier stage is quite large, causing significant second-order distortion. In addition, the second eye is considerably worse than the first eye in both cases. This is caused by the different rise and fall times of the current-starved inverters that control the PMOS devices of the inductor driver in Figure 4.2a. Since the PMOS current ($I_{TX}$) is proportional to the outputs of these inverters, the $I_{TX}$ slope when its amplitude rises differs from the slope when it returns to zero. The first eye of the over-the-skull measurement is smaller than that of the over-the-air measurement because of the larger channel loss. However, the over-the-skull measurement shows much less ISI than the over-the-air measurement. This is because the self-resonance frequency of the TX inductor depends on the electrical permittivity of its surroundings and can differ between the two cases. The de-Q resistor values are optimized for the over-the-skull channel, so the TX coil might still have residual ringing in the over-the-air measurement.
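The link between edge slopes and the two received pulses can be illustrated with the ideal mutual-inductance relation $v_{RX} = M\,dI_{TX}/dt$. The sketch below assumes a trapezoidal current pulse with linear edges and ignores ringing, channel loss, and amplifier nonlinearity. All numerical values (mutual inductance, peak current, edge times) are hypothetical, and the choice of a slower falling edge is arbitrary, made only to show the asymmetry.

```python
from typing import Tuple


def rx_pulse_amplitudes(
    m_henry: float, i_peak_amp: float, t_rise_s: float, t_fall_s: float
) -> Tuple[float, float]:
    """Return (v_rise, v_fall) from v = M * dI/dt for a trapezoidal current pulse."""
    v_rise = m_henry * i_peak_amp / t_rise_s   # positive pulse while I_TX ramps up
    v_fall = -m_henry * i_peak_amp / t_fall_s  # negative pulse while I_TX returns to zero
    return v_rise, v_fall


# Hypothetical values: 1 nH mutual inductance, 1 mA peak, 1 ns rise, 2 ns fall.
v_rise, v_fall = rx_pulse_amplitudes(1e-9, 1e-3, 1e-9, 2e-9)
print(f"first pulse {v_rise * 1e3:.2f} mV, second pulse {v_fall * 1e3:.2f} mV")
```

With these assumed numbers, the slower edge produces a pulse of half the amplitude, which is the same mechanism by which unequal edge slopes make one eye smaller than the other.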



Figure 5.6 Measured bathtub curves for both over-the-skull and over-the-air channels.

To generate the BER versus sampling phase plot (bathtub curve), the PI control codes are swept and the BER is measured at each PI setting. As shown in Figure 5.6, error-free operation for over 1e13 bits is achieved across roughly 10% of the UI in over-the-air measurements. Over-the-skull measurements show considerable BER degradation. Nevertheless, a minimum BER of 5e-11 is achieved. The measured BER is worse than the analytical result in section 2.3, mainly because the TX chip used for BER testing has larger-than-expected random jitter. Further investigation shows that this issue is mainly caused by a marginal injection pulse width. The designed injection pulse width is too narrow and is prone to chip-to-chip variation, causing weak injection strength for many chips. The weak injection strength in turn reduces the VCO noise suppression bandwidth and increases the PLL random jitter. This issue can be fixed by adding more delay stages to the injection pulse generator shown in Figure 4.6.
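The width of the error-free window can be extracted from a bathtub sweep by finding the longest run of consecutive phase points whose BER is at or below a limit. The following sketch uses made-up sweep data (20 evenly spaced phases assumed to span one UI), not the measured curves of Figure 5.6. The function name and the 1e-13 limit, chosen to represent zero observed errors in 1e13 bits, are assumptions.

```python
from typing import Sequence, Tuple


def widest_opening(ber: Sequence[float], ber_limit: float) -> Tuple[int, int]:
    """Return (start_index, length) of the longest run with BER <= ber_limit."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, value in enumerate(ber):
        if value <= ber_limit:
            if run_len == 0:
                run_start = i  # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


# Hypothetical sweep: 20 evenly spaced sampling phases across one UI.
sweep = (
    [1e-1] * 4
    + [1e-3, 1e-5, 1e-7, 1e-9, 1e-11, 0.0, 0.0, 1e-11, 1e-9, 1e-7, 1e-5, 1e-3]
    + [1e-1] * 4
)
start, length = widest_opening(sweep, ber_limit=1e-13)
window_fraction = length / len(sweep)
print(start, length, window_fraction)
```

The center of the returned run is a natural choice for the operating PI code, since it maximizes the margin to both edges of the bathtub.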



Figure 5.7 Measured bathtub curves for (a) various TX/RX alignment offsets; (b) various spacing between TX/RX.

To investigate the sensitivity of the inductively coupled link, over-the-air BER versus sampling phase measurements are repeated for various offsets and spacings between the TX and RX coils. As shown in Figure 5.7a, the BER gradually degrades as the offset ($\Delta X$) between the coils increases. Even for offsets as large as 6.4mm, roughly 64% of the coil diameter, the link still achieves a BER of almost 1e-9. As the distance increases from 11mm to 17.5mm (Figure 5.7b), an increase of nearly 60%, the BER degrades from 1e-12 to 1e-7. At these BER levels, reliable communication can be achieved with forward error correction codes. These experiments demonstrate the robustness of the inductive coupling approach.
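The relative figures quoted above follow directly from the coil and spacing dimensions. The short check below takes the 10mm coil side length from the board description as the "coil diameter", which is an interpretation rather than a stated definition.

```python
import math

coil_size_mm = 10.0        # side length of the 10mm x 10mm coil
offset_mm = 6.4            # largest lateral offset tested
spacing_nominal_mm = 11.0  # nominal TX/RX spacing
spacing_max_mm = 17.5      # largest TX/RX spacing tested

offset_fraction = offset_mm / coil_size_mm
spacing_increase = (spacing_max_mm - spacing_nominal_mm) / spacing_nominal_mm
ber_decades = math.log10(1e-7 / 1e-12)  # BER change between the two spacings

print(f"offset: {offset_fraction:.0%} of coil size")
print(f"spacing increase: {spacing_increase:.1%}")
print(f"BER degradation: {ber_decades:.0f} orders of magnitude")
```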



Figure 5.8 Power breakdown.

|                  | This work [Li18]   |        | [Muller14]     | [Inanlou11] | [Chae08]    | [Abdelhalim13] | [Chang17]     |
|------------------|--------------------|--------|----------------|-------------|-------------|----------------|---------------|
| Technology       | 65nm CMOS          |        | 65nm CMOS      | 0.5um CMOS  | 0.35um CMOS | 130nm CMOS     | 65nm CMOS     |
| Data-rate        | 200Mb/s            |        | 1Mb/s          | 10.2Mb/s    | 90Mb/s      | 10Mb/s         | 95Kb/s        |
| Modulation       | Inductive-coupling |        | Backscattering | PHM         | UWB         | UWB            | Ultrasound    |
| TX antenna size  | 10X10mm            |        | 6.5X6.5mm      | 10X10mm     | 10mmX5mm*   | N/A            | 0.55X0.55mm   |
| Channel distance | 11mm               |        | 10mm           | 10mm        | N/A         | 5cm            | 8.5cm         |
| Channel media    | scalp & skull      | air    | in-vivo        | N/A         | N/A         | N/A            | animal tissue |
| BER              | 5e-11              | <1e-12 | <1e-7          | 6.3e-8      | N/A         | 5e-3           | <1e-4         |
| TX power         | 300uW              |        | 13uW           | 3.52mW**    | 1.6mW**     | 100uW**        | 157uW***      |
| TX FOM           | 1.5pJ/b            |        | 13pJ/b         | 345pJ/b**   | 17.8pJ/b**  | 10pJ/b**       | 1.65nJ/b***   |

<sup>\*</sup> Antenna size estimated from PCB photo

<sup>\*\*</sup> Does not include clock generation power

<sup>\*\*\*</sup> Power estimated by adding peak PA power and clock generator power

Table 5.1 Comparison with selected state-of-the-art designs.

The power breakdown of the transceiver is shown in Figure 5.8. The TX chip consumes a total of 300uW. The power consumption of the injection-locked PLL is 130uW, and the inductor driver consumes 165uW. The remaining 5uW is used by the PRBS generator and the inverter buffers on the clock and data paths. At 200Mb/s, the TX energy efficiency is 1.5pJ per bit. The RX chip consumes a total of 37.2mW. The most power-hungry component is the low noise amplifier chain, which consumes 25.5mW. The clock path, which includes the frequency divider and phase interpolator, consumes 9.9mW. The slicer (comparator) and inverter buffers consume 1.8mW. As mentioned earlier, the large power consumption of the external RX chip outside the skull is not an issue. Compared with state-of-the-art implantable wireless TX designs (Table 5.1), this work achieves the highest data-rate and the lowest BER. At the same time, it improves the TX energy efficiency by about 7X.
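The TX FOM row of Table 5.1 is the TX power divided by the data-rate, and the power totals are sums of the block-level figures quoted in the text. The sketch below recomputes these from the numbers in the table and text. The dictionary layout and function name are illustrative choices, not part of the original work.

```python
# (TX power in W, data-rate in b/s), taken from Table 5.1.
designs = {
    "This work [Li18]": (300e-6, 200e6),
    "[Muller14]": (13e-6, 1e6),
    "[Inanlou11]": (3.52e-3, 10.2e6),
    "[Chae08]": (1.6e-3, 90e6),
    "[Abdelhalim13]": (100e-6, 10e6),
    "[Chang17]": (157e-6, 95e3),
}


def energy_per_bit_pj(power_w: float, rate_bps: float) -> float:
    """TX figure of merit: energy per transmitted bit, in pJ/b."""
    return power_w / rate_bps * 1e12


fom = {name: energy_per_bit_pj(p, r) for name, (p, r) in designs.items()}
best_prior = min(v for name, v in fom.items() if name != "This work [Li18]")
improvement = best_prior / fom["This work [Li18]"]

tx_total_uw = 130 + 165 + 5      # PLL + inductor driver + PRBS/buffers
rx_total_mw = 25.5 + 9.9 + 1.8   # LNA chain + clock path + slicer/buffers

for name, value in fom.items():
    print(f"{name}: {value:.1f} pJ/b")
print(f"improvement over best prior FOM: {improvement:.1f}X")
print(f"TX total {tx_total_uw} uW, RX total {rx_total_mw:.1f} mW")
```

The best prior FOM in the table is 10pJ/b, so the ratio to 1.5pJ/b is about 6.7, consistent with the "about 7X" improvement stated in the text.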

# **Chapter 6 Conclusion**

Recent advancement in BMI technology has enabled neural scientists to examine neural networks in detail. The potential of recording individual neural signals from thousands of neurons simultaneously creates demand for implantable ultra-low power, high-speed wireless TX. This work demonstrates that the inductive coupling approach can satisfy the stringent requirements of an implanted transcranial wireless TX. It achieves reliable communication at data-rates of hundreds of megabits per second with ultra-low power consumption. The key circuit and system techniques that enable the inductively coupled transcranial link are the following:

1. De-Q'ing the TX inductor alleviates the inter-symbol-interference caused by ringing in the pulse response and improves the data-rate.
2. To achieve ultra-low power consumption, the entire TX chip operates from a low supply voltage of 0.5V. All the TX circuits operate in the sub-threshold region.
3. An injection-locked PLL is used to reduce RMS jitter and improve the link BER. A fully-digital frequency tracking and spur suppression loop increases the PLL locking range and greatly reduces the injection spur.

In this thesis, the theory behind inductive coupling in the context of wireless transcranial links is investigated in detail. Its speed limitations and the mitigation technique of inductor de-Q are presented. To study the impact of channel response, amplitude noise, and jitter on the BER of the inductively-coupled link, a system-level analysis method is developed. The thesis also includes an in-depth analysis of TX jitter reduction by injection locking and a discussion of its spur mitigation technique. To verify the proposed techniques, a 200Mb/s wireless transcranial transceiver with an implantable TX is fabricated in 65nm CMOS technology, and the transceiver is measured over the 11mm thick skull channel of an 8-week primordial piglet carcass. The results demonstrate 3 orders of magnitude improvement in BER over state-of-the-art implantable TX designs (5e-11) and a 7X improvement in TX energy efficiency (1.5pJ/b).

## Forward projection

In the near future, the proliferation of BMI technology will require even higher throughput and more reliable wireless communication over the cranium channel. There is still room to improve communication throughput using the de-Q technique proposed in this work. The simulated and measured eye diagrams show comfortable margin at 200Mb/s. With careful design of the injection-locked PLL and an optimized inductor driver, data-rates greater than 500Mb/s, or even up to 1Gb/s, can be achieved with this technique. Beyond 1Gb/s, active channel equalization will be needed in conjunction with inductor de-Q. Techniques such as feed-forward equalization or decision-feedback equalization can be employed in the external RX to avoid any impact on the TX energy efficiency. In addition, RX interference mitigation will be needed to improve the link reliability. Over-the-air radio interference can be sensed and subsequently cancelled in a more sophisticated RX design, so that the bulky metal helmet or shields for the inductively-coupled transceiver can be removed. RX interference cancellation can be especially useful if TX power is delivered wirelessly from the RX module. In this case, the strong interferer imposed by the wireless power delivery circuits can be several orders of magnitude higher than the received signal strength, and mitigating this blocker signal is critical for the link reliability. In conclusion, inductive coupling has tremendous potential in wireless transcranial links, and much more research remains to be done to advance this subject.

# **Bibliography**

[Abdelhalim13] Karim Abdelhalim, Hamed Mazhab Jafari, Larysa Kokarovtseva, Jose Luis Perez Velazquez, Roman Genov, "64-Channel UWB Wireless Neural Vector Analyzer SOC With a Closed-Loop Phase Synchrony-Triggered Neurostimulator", *IEEE Journal of Solid-State Circuits*, vol. 48, no. 6, pp. 2494-2510, Oct. 2013.

[Ajiboye17] A Bolu Ajiboye, Francis R Willett, Daniel R Young, William D Memberg, Brian A Murphy, Jonathan P Miller, Benjamin L Walter, Jennifer A Sweet, Harry A Hoyen, Michael W Keith, P Hunter Peckham, John D Simeral, John P Donoghue, Leigh R Hochberg, Robert F Kirsch, "Restoration of reaching and grasping movements through brain-controlled muscle stimulation in a person with tetraplegia: a proof-of-concept demonstration", *Lancet*, vol. 389, pp.1821-1830, May 6, 2017.

[Andreuccetti97] D.Andreuccetti, R.Fossi, C.Petrucci, "An Internet resource for the calculation of the dielectric properties of body tissues in the frequency range 10 Hz - 100 GHz", Website at http://niremf.ifac.cnr.it/tissprop/. IFAC-CNR, Florence (Italy), 1997. Based on data published by C.Gabriel et al. in 1996.

[Balanis16] Constantine A. Balanis, *Antenna Theory: Analysis and Design*, 4th Edition, Wiley, 2016.

[Ballini13] M. Ballini, J. Müller, P. Livi, Y. Chen, U. Frey, A. Shadmani, I. L. Jones, W. Gong, M. Fiscella, M. Radivojevic, D. Bakkum, A. Stettler, F. Heer, A. Hierlemann, "A 1024-channel CMOS microelectrode-array system with 26400 electrodes for recording and stimulation of electroactive cells in-vitro," *Proc. Symposium on VLSI Circuits*, Kyoto, Japan, 2013, pp. C54–C55.

[Biederman13] William Biederman, Daniel J.Yeager, Nathan Narevsky, Aaron C.Koralek, Jose M. Carmena, Elad Alon, Jan M. Rabaey, "A Fully-Integrated, Miniaturized (0.125 mm²) 10.5 μW Wireless Neural Sensor", *IEEE Journal of Solid-State Circuits*, vol. 48, no. 4, pp. 960-970, Apr. 2013.

[Borton13] David A Borton, Ming Yin, Juan Aceros, Arto Nurmikko, "An implantable wireless neural interface for recording cortical circuit dynamics in moving primates", *Journal of Neural Engineering*, 10, 2013.

[Bouchard13] Kristofer E. Bouchard, Nima Mesgarani, Keith Johnson, Edward F. Chang, "Functional organization of human sensorimotor cortex for speech articulation", *Nature*, 495(7441):327-332, 2013.

[Chae08] Moosung Chae, Wentai Liu, Zhi Yang, Tungchien Chen, Jungsuk Kim, Mohanasankar Sivaprakasam, Mehmet Yuce, "A 128-Channel 6mW Wireless Neural Recording IC with On-the-Fly Spike Sorting and UWB Transmitter", 2008 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 146-147, Feb. 2008.

[Chang17] Ting Chia Chang, Max L. Wang, Jayant Charthad, Marcus J. Weber, Amin Arbabian, "A 30.5mm³ Fully Packaged Implantable Device with Duplex Ultrasonic Data and Power Links Achieving 95kb/s with <10⁻⁴ BER at 8.5cm Depth", 2017 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 460-461, Feb. 2017.

[Chen11] Gregory Chen, Hassan Ghaed, Razi-ul Haque, Michael Wieckowski, Yejoong Kim, Gyouho Kim, David Fick, Daeyeon Kim, Mingoo Seok, Kensall Wise, David Blaauw, Dennis Sylvester, "A Cubic-Millimeter Energy-Autonomous Wireless Intraocular Pressure Monitor", 2011 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 310-311, Feb. 2011.

[Collinger13] Jennifer L Collinger, Brian Wodlinger, John E Downey, Wei Wang, Elizabeth C Tyler-Kabara, Douglas J Weber, Angus J C McMorland, Meel Velliste, Michael L Boninger, Andrew B Schwartz, "High-performance neuroprosthetic control by an individual with tetraplegia", *Lancet*, vol. 381, pp. 557-564, Feb. 16, 2013.

[Cowles10] John Cowles, "Introduction to PLLs for Frequency Synthesis", 2010 IEEE International Solid-State Circuits Conference - short course on CMOS Phase-Locked Loops for Frequency Synthesis, Feb. 2010.

[Doerner10] Steffen Doerner, Sören Hirsch, Bertram Schmidt, "Toward high electrode density grids for electrocorticographic recordings", *1st Workshop on Brain Machine Interfacing*, Magdeburg, Feb. 27, 2010.

[FCC13] "Radio Frequency Safety", Retrieved from https://www.fcc.gov/general/radio-frequency-safety-0

[Fernandez-Leon15] Jose A Fernandez-Leon, Arun Parajuli, Robert Franklin, Michael Sorenson, Daniel J. Felleman, Bryan J Hansen, Ming Hu, Valentin Dragoi, "A wireless transmission neural interface system for unconstrained non-human primates", *Journal of Neural Engineering*, 12, 2015.

[Flesher16] Sharlene N. Flesher, Jennifer L. Collinger, Stephen T. Foldes, Jeffrey M. Weiss, John E. Downey, Elizabeth C. Tyler-Kabara, Sliman J. Bensmaia, Andrew B. Schwartz, Michael L. Boninger, Robert A. Gaunt, "Intracortical microstimulation of human somatosensory cortex", *Sci. Transl. Med.* 8, Oct.19 2016.

[Gabriel96a] C Gabriel, S Gabriel, and E Corthout, "The dielectric properties of biological tissues: I. Literature survey," *Phys. Med. Biol.* 41, pp.2231–2249, 1996.

[Gabriel96b] S Gabriel, R W Lau, and C Gabriel, "The dielectric properties of biological tissues: III. Parametric models for the dielectric spectrum of tissues," *Phys. Med. Biol.* 41, pp.2271–2293, 1996.

[Gao12] Hua Gao, Ross M. Walker, Paul Nuyujukian, Kofi A. A. Makinwa, Krishna V. Shenoy, Boris Murmann, Teresa H. Meng, "HermesE: A 96-Channel Full Data Rate Direct Neural Interface in 0.13 μm CMOS", *IEEE Journal of Solid-State Circuits*, vol. 47, no. 4, pp. 1043-1055, Apr. 2012.

[Ha13] Sohmyung Ha, Jongkil Park, Yu M. Chi, Jonathan Viventi, John Rogers, Gert Cauwenberghs, "85 dB dynamic range 1.2 mW 156 kS/s biopotential recording IC for high-density ECoG flexible active electrode array," *Proc. Eur. Solid-State Circuits Conf. (ESSCIRC)*, Sep. 2013, pp. 141–144.

[Harrison07] Reid R. Harrison, Paul T. Watkins, Ryan J. Kier, Robert O. Lovejoy, Daniel J. Black, Bradley Greger, Florian Solzbacher, "A Low-Power Integrated Circuit for a Wireless 100-Electrode Neural Recording System", *IEEE Journal of Solid-State Circuits*, vol. 42, no. 1, pp. 123-133, Jan. 2007.

[Helal08] Belal M. Helal, Matthew Z. Straayer, Gu-Yeon Wei, Michael H. Perrott, "A Highly Digital MDLL-Based Clock Multiplier That Leverages a Self-Scrambling Time-to-Digital Converter to Achieve Subpicosecond Jitter Performance", *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 855-863, Apr. 2008.

[Hochberg06] Leigh R. Hochberg, Mijail D. Serruya, Gerhard M. Friehs, Jon A. Mukand, Maryam Saleh, Abraham H. Caplan, Almut Branner, David Chen, Richard D. Penn, John P. Donoghue, "Neuronal ensemble control of prosthetic devices by a human with tetraplegia", *Nature*, vol. 442, pp. 164-171, Jul. 13 2006.

[Inanlou11] Farzad Inanlou, Mehdi Kiani, Maysam Ghovanloo, "A 10.2 Mbps Pulse Harmonic Modulation Based Transceiver for Implantable Medical Devices", *IEEE Journal of Solid-State Circuits*, vol. 46, no. 6, pp. 1296-1306, Jun. 2011.

[Jung10] Jaeyoung Jung, Siqi Zhu, Peng Liu, Yi-Jan Emery Chen, Deukhyoun Heo, "22-pJ/bit Energy-Efficient 2.4-GHz Implantable OOK Transmitter for Wireless Biotelemetry Systems: In Vitro Experiments Using Rat Skin-Mimic", *IEEE Transactions on Microwave Theory and Techniques*, vol. 58, no. 12, Dec. 2010.

[Kawai10] Shusuke Kawai, Hiroki Ishikuro, Tadahiro Kuroda, "A 2.5Gb/s/ch 4PAM inductive-coupling transceiver for non-contact memory card", 2010 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 264-265, Feb. 2010.

[Kiani15] Mehdi Kiani, Maysam Ghovanloo, "A 13.56-Mbps Pulse Delay Modulation Based Transceiver for Simultaneous Near-Field Data and Power Transmission", *IEEE Transactions on Biomedical Circuits and Systems*, vol. 9, no. 1, pp. 1-11, Feb. 2015.

[Kuruvilla03] Abraham Kuruvilla, Roland Flink, "Intraoperative electrocorticography in epilepsy surgery: Useful or not?", *Seizure*, vol. 12, issue 8, pp. 577–584, Dec. 2003.

[Lee09] Jri Lee, Huaide Wang, "Study of Subharmonically Injection-Locked PLLs", *IEEE Journal of Solid-State Circuits*, vol. 44, no. 5, pp. 1539-1553, May 2009.

[Li18] Wen Li, Yida Duan, Jan Rabaey, "A 200Mb/s Inductively Coupled Wireless Transcranial Transceiver Achieving 5e-11 BER and 1.5pJ/b Transmit Energy Efficiency", accepted to 2018 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, Feb. 2018.

[Liu14] Xiayun Liu, Mehran M. Izad, Libin Yao, Chun-Huat Heng, "A 13 pJ/bit 900 MHz QPSK/16-QAM Band Shaped Transmitter Based on Injection Locking and Digital PA for Biomedical Applications", *IEEE Journal of Solid-State Circuits*, vol. 49, no. 11, pp. 2408-2421, Nov. 2014.

[Mandal08] Soumyajit Mandal, Rahul Sarpeshkar, "Power-Efficient Impedance-Modulation Wireless Data Links for Biomedical Implants", *IEEE Transactions on Biomedical Circuits and Systems*, vol. 2, no. 4, pp. 301-315, Dec. 2008.

[Mark10a] Michael Mark, *Wireless Channel Characterization for mm-Size Neural Implants*, M.Sc. thesis, University of California at Berkeley, 2010.

[Mark10b] Michael Mark, Toni Bjorninen, Yuhui David Chen, Subramaniam Venkatraman, Leena Ukkonen, Lauri Sydanheimo, Jose M Carmena, Jan M Rabaey, "Wireless channel characterization for mm-size neural implants." 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference, vol. 1, pp. 1565–8, Jan 2010.

[Mark11a] Michael Mark, *Powering mm-Size Wireless Implants for Brain-Machine Interfaces*, Dissertation, University of California at Berkeley, 2011.

[Mark11b] M. Mark, Y. Chen, C. Sutardja, C. Tang, S. Gowda, M. Wagner, D. Werthimer, J. Rabaey, "A 1mm³ 2Mbps 330fJ/b transponder for implanted neural sensors", *Symposium on VLSI Circuits*, pp. 168-169, Jun. 2011.

[Miura07] Noriyuki Miura, Hiroki Ishikuro, Takayasu Sakurai, Tadahiro Kuroda, "A 0.14pJ/b Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse Shaping", 2007 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 358-359, Feb. 2007.

[Miura08] Noriyuki Miura, Hiroki Ishikuro, Kiichi Niitsu, Takayasu Sakurai, and Tadahiro Kuroda, "A 0.14 pJ/b Inductive-Coupling Transceiver With Digitally-Controlled Precise Pulse Shaping", *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 285-291, Jan. 2008.

[Muller14] Rikky Muller, Hanh-Phuc Le, Wen Li, Peter Ledochowitsch, Simone Gambini, Toni Bjorninen, Aaron Koralek, Jose M. Carmena, Michel M. Maharbiz, Elad Alon, Jan M. Rabaey, "A Miniaturized 64-Channel 225µW Wireless Electrocorticographic Neural Sensor", 2014 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, pp. 412-413, Feb. 2014.

[Niknejad07] Ali M. Niknejad, *Electromagnetics for High-Speed Analog and Digital Communication Circuits*, Cambridge, 2007.

[Rodriguez-Oroz05] M. C. Rodriguez-Oroz, J. A. Obeso, A. E. Lang, J.-L. Houeto, P. Pollak, S. Rehncrona, J. Kulisevsky, A. Albanese, J. Volkmann, M. I. Hariz, N. P. Quinn, J. D. Speelman, J. Guridi, I. Zamarbide, A. Gironell, J. Molet, B. Pascual-Sedano, B. Pidoux, A. M. Bonnet, Y. Agid, J. Xie, A.-L. Benabid, A. M. Lozano, J. Saint-Cyr, L. Romito, M. F. Contarino, M. Scerrati, V. Fraix, N. Van Blercom, "Bilateral deep brain stimulation in Parkinson's disease: a multicentre study with 4 years follow-up", *Brain*, vol. 128, issue 10, 1 Oct. 2005, pp. 2240–2249.

[Razavi04] Behzad Razavi, "A Study of Injection Locking and Pulling in Oscillators", *IEEE Journal of Solid-State Circuits*, vol. 39, no. 9, pp. 1415-1424, Sept. 2004.

[Seo16] Dongjin Seo, Ryan M. Neely, Konlin Shen, Utkarsh Singhal, Elad Alon, Jan M. Rabaey, Jose M. Carmena, Michel M. Maharbiz, "Wireless Recording in the Peripheral Nervous System with Ultrasonic Neural Dust", *Neuron*, vol. 91, issue 3, pp. 529–539, 2016.

[Sidiropoulos98] Stefanos Sidiropoulos, *High Performance Inter-Chip Signaling*, Dissertation, Stanford University, 1998.

[Stojanović04] Vladimir Stojanović, *Channel-Limited High-Speed Links; Modeling, Analysis and Design*, Dissertation, Stanford University, 2004.

[Viventi11] Jonathan Viventi, Dae-Hyeong Kim, Leif Vigeland, Eric S Frechette, Justin A Blanco, Yun-Soung Kim, Andrew E Avrin, Vineet R Tiruvadi, Suk-Won Hwang, Ann C Vanleer, Drausin F Wulsin, Kathryn Davis, Casey E Gelber, Larry Palmer, Jan Van der Spiegel, Jian Wu, Jianliang Xiao, Yonggang Huang, Diego Contreras, John A Rogers, Brian Litt, "Flexible, foldable, actively multiplexed, high-density electrode array for mapping brain activity in vivo", *Nature Neuroscience*, 14, pp. 1599-1605, Nov. 2011.

[Ye02] Sheng Ye, Lars Jansson, and Ian Galton, "A multiple-crystal interface PLL with VCO realignment to reduce phase noise", *IEEE Journal of Solid-State Circuits*, vol. 37, no. 12, pp. 1795–1803, Dec. 2002.

[Yin14] Ming Yin, David A. Borton, Jacob Komar, Naubahar Agha, Yao Lu, Hao Li, Jean Laurens, Yiran Lang, Qin Li, Christopher Bull, Lawrence Larson, David Rosler, Erwan Bezard, Grégoire Courtine, Arto V. Nurmikko, "Wireless Neurosensor for Full-Spectrum Electrophysiology Recordings during Free Behavior", *Neuron*, 84, pp. 1170–1182, Dec. 17, 2014.

[Yoshida07] Yoichi Yoshida, Noriyuki Miura, Tadahiro Kuroda, "A 2 Gb/s bi-directional interchip data transceiver with differential inductors for high density inductive channel array", *IEEE Asian Solid-State Circuits Conference - Digest of Technical Papers*, pp.127–130, Nov. 2007.

[Yoshida09] Yoichi Yoshida, Koichi Nose, Yoshihiro Nakagawa, Koichiro Noguchi, Yasuhiro Morita, Masamoto Tago, Tadahiro Kuroda, Masayuki Mizuno, "Wireless DC Voltage Transmission Using Inductive Coupling Channel for Highly-Parallel Wafer-Level Testing", 2009 *IEEE International Solid-State Circuits Conference - Digest of Technical Papers*, pp. 470-471, Feb. 2009.