## Towards Wideband Linear RF Transmitters for Millimeter-Wave Arrays



Yikuan Chen

### Electrical Engineering and Computer Sciences University of California, Berkeley

Technical Report No. UCB/EECS-2025-53 http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-53.html

May 14, 2025

Copyright © 2025, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

#### Towards Wideband Linear RF Transmitters for Millimeter-Wave Arrays

by

Yikuan Chen

A report submitted in partial satisfaction of the

requirements for the degree of

Master of Science, plan II

in

Engineering - Electrical Engineering and Computer Sciences

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Ali M. Niknejad, Chair

Professor Kristofer Pister

Spring 2025

The report of Yikuan Chen, titled Towards Wideband Linear RF Transmitters for Millimeter-Wave Arrays, is approved:

| Chair |        | Date | 5/4/2025  |
|-------|--------|------|-----------|
|       |        | Date |           |
|       |        | Date | 5/12/2025 |

University of California, Berkeley

#### Towards Wideband Linear RF Transmitters for Millimeter-Wave Arrays

Copyright 2025 by Yikuan Chen

#### Abstract

#### Towards Wideband Linear RF Transmitters for Millimeter-Wave Arrays

by

#### Yikuan Chen

#### Master of Science in Engineering - Electrical Engineering and Computer Sciences

University of California, Berkeley

Professor Ali M. Niknejad, Chair

This report focuses on designing a high-linearity transmitter (TX) for millimeter-wave (mm-Wave) wireless communication. To realize a common module transceiver that interfaces with different front end modules for different functionalities, the TX is required to cover a wide frequency band with high instantaneous radio-frequency (RF) bandwidth, low noise, and high linearity at RF and baseband ports. Different architectures to realize these goals for a mm-Wave TX are investigated and discussed. A high-linearity active mixer is proposed to achieve a flat input impedance curve versus varying baseband input from the digital-to-analog converter (DAC). This design was fabricated in 28nm bulk complementary metal-oxide-semiconductor (CMOS) technology. Next, a complete TX with 10-bit baseband DAC, filter, and distributed active mixer with transmission line (T-Line) power combiner and local oscillator (LO) chain, fabricated in the same process, is discussed. This design operates from 13 GHz to 50 GHz and demonstrates 2.5 dBm compression point and power consumption of 71 mW on a 1.2 V supply in simulations.

To my family.

# Contents

| Section | Title                                          |
|---------|------------------------------------------------|
|         | Contents                                       |
|         | List of Figures                                |
|         | List of Tables                                 |
| 1       | Introduction                                   |
| 1.1     | Traditional Transmitter vs. Distributed RF-DAC |
| 2       | High Linearity Active Mixer                    |
| 2.1     | High-Linearity Mixer                           |
| 2.2     | Passive Mixer                                  |
| 2.3     | Active Mixer                                   |
| 3       | High-Linearity DAC with Distributed Mixer      |
| 3.1     | Overview                                       |
| 3.2     | High-Linearity DAC                             |
| 3.3     | High-Speed FPGA-to-Chip CDR                    |
| 3.4     | Current Mirror Filter                          |
| 3.5     | Distributed Active Mixer                       |
| 3.6     | Dual Mode Wideband LO Chain                    |
| 3.7     | Transmitter Overview                           |
| 3.8     | Test Setup                                     |
| 3.9     | Measurement                                    |
| 4       | Conclusion                                     |
|         | Bibliography                                   |

# List of Figures

| Figure | Caption |
|--------|---------|
| 1.1 | Proposed wideband "common module" covers 28-50 GHz RF bandwidth at the input with 200 MHz baseband bandwidth and can interface to many different front-end modules to realize different functionality. |
| 1.2 | Traditional transmitter block diagram. |
| 1.3 | RF-DAC block diagram. |
| 2.1 | Passive mixer schematic. Note the output capacitors represent the ESD diodes. The output is matched via an asymmetric T-coil. The resistors bootstrap the IF signal to the mixer gates, to remove IF-dependent linearity effects. |
| 2.2 | Passive mixer layout, viewed in Cadence Virtuoso. The T-coil matching is at the top, the driver choke is below. |
| 2.3 | Simplified model for single-balanced active mixer. |
| 2.4 | $G_m$ curve shifting due to asymmetric differential MOSFET pair, cited from [6]. |
| 2.5 | Linearity-improved active mixer. |
| 2.6 | Schematic and floorplan of the linearity-improved active mixer. |
| 2.7 | Layout of linearity-improved active mixer. |
| 2.8 | Comparison of gain compression of traditional mixer and the proposed mixer. |
| 2.9 | Comparison of Signal-to-Noise-and-Distortion Ratio of traditional mixer and the proposed mixer. |
| 2.10 | Fundamental and $IM_3$ output power measurements. |
| 2.11 | Chip photo for the passive and active mixers. |
| 3.1 | System block diagram of the transmitter. |
| 3.2 | Conventional current-steering DAC. |
| 3.3 | Conventional and Cascoded current-steering DAC. |
| 3.4 | Folded-Cascode current-steering DAC. |
| 3.5 | Six-bit binary-weighted input branches are connected to the same folded branch. |
| 3.6 | Layout of the folded-cascode current-steering DAC. |
| 3.7 | Block Diagram of Clock-Data Recovery (CDR) circuit to align the bits. |
| 3.8 | Clock-Data Recovery (CDR) circuit to align the bits. |
| 3.9 | Current mirror filter schematic. The input is on the left side, and the output is on the right. |
| 3.10 | Current mirror filter Bode plot. Note the zero from the *m*-derived sections. |
| 3.11 | Layout of the current mirror filter. |
| 3.12 | Simulation of the DAC with the filter. |
| 3.13 | Distributed mixer and output matching network. |
| 3.14 | Double-Balanced active mixer quad. |
| 3.15 | Distributed mixer layout. |
| 3.16 | Distributed mixer layout on the chip. |
| 3.17 | Simulated distributed mixer gain. |
| 3.18 | Distributed mixer with I/Q pulling at $OP_{1dB}$. |
| 3.19 | Distributed mixer without I/Q Pulling at $OP_{1dB}$. |
| 3.20 | Distributed mixer with I/Q pulling at $OP_{0.5dB}$. |
| 3.21 | Distributed mixer without I/Q Pulling at $OP_{0.5dB}$. |
| 3.22 | LO chain concept. Low and high frequency amplifier paths, followed by a switch network to select the desired frequency band. The LO signal is then fed into the distributed mixer (the transmission line). |
| 3.23 | The low frequency LO chain, prior to the final driver stage. Note that the amplifiers are simply inverters. An artificial transmission line LC hybrid is based on [2], with 4-bit tuned capacitors. |
| 3.24 | The four finger Lange coupler. The dummy metal (blue) fills the surrounding regions to pass the difficult density DRC rules. |
| 3.25 | Complete layout of the LO chains. The lumped and Lange hybrids are in the center of the chip, with the I/Q amplifier chains laid out symmetrically around the mixer. |
| 3.26 | Architecture of the transmitter. |
| 3.27 | Test setup block diagram. |
| 3.28 | Probe station setup of the chip. |
| 3.29 | Layout of the wideband linear transmitter. |
| 3.30 | Die Photo of wideband linear transmitter. |

# List of Tables

| Table | Caption                                                                       |
|-------|-------------------------------------------------------------------------------|
| 2.1   | Measured $P_{1dB}$ and OIP3, and simulated OIP3, of the passive mixer.        |
| 2.2   | Simulated OIP3 of the proposed active mixer.                                  |
| 2.3   | Measured $P_{1dB}$ of the proposed active mixer.                              |
| 3.1   | Summary of the performance of the distributed mixer.                          |
| 3.2   | Summary of the performance of the overall system.                             |

#### Acknowledgments

I would like to begin by expressing my deepest gratitude to my advisor, Professor Ali Niknejad. His unwavering support throughout my degree has been instrumental in shaping me into a well-rounded engineer, and his profound expertise has been a constant source of inspiration for my research. I am also immensely thankful to Professor Kristofer Pister for his insightful guidance and advice.

A special thanks goes to my colleague, Rohit Braganza, with whom I collaborated closely on the MIDAS project, successfully completing two tapeouts together. I am also grateful to Sashank Krishnamurthy and Nima Baniasadi for their invaluable suggestions and assistance throughout this research, as well as to Hesham Beshary and Ali Ameri for their help with testing.

I extend my appreciation to Averal Kandala for his thoughtful feedback, which greatly improved the clarity and professionalism of my writeup. Additionally, I thank DARPA for funding this project and acknowledge the members of DARPA, Intel, and Texas Instruments for their constructive feedback during our discussions.

# Chapter 1 Introduction

This project is part of the Millimeter-wave Digital Arrays (MIDAS) research program, which aimed to advance the state of the art in complementary metal-oxide-semiconductor (CMOS) wireless transceiver design to address emerging applications in digital beamforming. 5G communication utilizes a rich spectrum in the millimeter-wave (mm-Wave) bands. In particular, the FR2 spectrum extends from 26.5 GHz to 71 GHz, with allocations that differ by country and region. Supporting these disparate bands conventionally requires a separate transceiver designed for each frequency range. This approach is not only costly but also consumes valuable printed circuit board (PCB) area, especially on mobile devices. Hence, it is highly beneficial to design a single transceiver that covers a wide RF frequency range, allowing the system to support multiple applications such as multiple-input multiple-output (MIMO) communication and mm-Wave radar.

The aim of the project is to provide a "common module", realized in low-cost CMOS technology, that can interface with a high performance TX/RX front-end module designed for a specific band or application, as shown in Fig. 1.1. For TX applications, the "common module" should be as flexible as possible, accommodating different PA output power, modulation schemes, and other transmitter specifications.

To achieve such versatility, the transceiver is required to have wide operational bandwidth, high instantaneous RF bandwidth, low noise figure, as well as high linearity at both RF and baseband ports.

#### 1.1 Traditional Transmitter vs. Distributed RF-DAC

A traditional transmitter typically consists of the following stages: a Digital-to-Analog Converter (DAC), a baseband Filter, an up-converting Mixer, an RF Filter, and a Power Amplifier, as shown in Fig. 1.2. This approach benefits from simple LO distribution to the upconverting mixer. However, to achieve high linearity in the overall transmitter, each stage must be extremely linear by itself. Often, the linearity degrades due to signal-dependent current through the mixer and finite output impedance in the DAC.
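The requirement that every stage be highly linear can be illustrated with the standard cascaded intercept-point formula. The sketch below is not taken from the report: it assumes the textbook worst case in which the third-order products of all stages add in phase, and the stage gains and OIP3 values are hypothetical placeholders.

```python
import math


def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw: float) -> float:
    """Convert a power in milliwatts to dBm."""
    return 10 * math.log10(p_mw)


def cascaded_oip3_dbm(stages):
    """Output-referred OIP3 of a chain of stages.

    `stages` is a list of (gain_db, oip3_dbm) tuples ordered from input
    to output. Each stage's OIP3 is referred to the chain output by the
    gain of the stages that follow it, and the reciprocals are summed
    (worst case: in-phase addition of the IM3 products).
    """
    inverse_total = 0.0
    for idx, (_, oip3_dbm) in enumerate(stages):
        gain_after_db = sum(gain for gain, _ in stages[idx + 1:])
        referred_mw = dbm_to_mw(oip3_dbm) * 10 ** (gain_after_db / 10)
        inverse_total += 1.0 / referred_mw
    return mw_to_dbm(1.0 / inverse_total)


# Hypothetical chain: DAC + filter, mixer, PA (values are illustrative only).
example_chain = [(0.0, 20.0), (-3.0, 10.0), (15.0, 25.0)]
total_oip3 = cascaded_oip3_dbm(example_chain)
```

In this illustrative chain the total OIP3 falls below that of the final stage alone, because the mixer's intercept point, referred to the output, is just as limiting as the PA's.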



Figure 1.1: Proposed wideband "common module" covers 28-50 GHz RF bandwidth at the input with 200 MHz baseband bandwidth and can interface to many different front-end modules to realize different functionality.



Figure 1.2: Traditional transmitter block diagram.

In contrast, an RF-DAC transmitter uses a distributed design, as shown in Fig. 1.3, where each element consists of a DAC and a mixer with less output power than the traditional counterpart. N such elements are then combined together at the RF output. There is no explicit PA in the transmitter, as a power combiner, such as that formed of transmission lines (T-lines), sums up the power delivered by each cell for delivery to the output. To obtain high linearity at the output, considerable design optimization is required. LO distribution to each mixer is needed, and the matching between each element limits the overall linearity of the system. The key advantage of this approach is eliminating the PA, which avoids the nonlinearity and efficiency trade-offs inherent in single-stage PAs. By distributing power generation across many low-power, linear elements, the RF-DAC achieves high output power with improved linearity and efficiency.
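As a rough sketch of why combining relaxes the requirement on each cell, assume $N$ identical cells, each delivering $P_{cell}$, whose outputs add in phase through a lossless combiner. These idealizations are assumptions made here for illustration and are not stated in the report:

$$P_{out} = N \, P_{cell} \quad\Longrightarrow\quad P_{out}\,[\mathrm{dBm}] = P_{cell}\,[\mathrm{dBm}] + 10\log_{10} N$$

For a hypothetical $N = 8$, the combining term is $10\log_{10} 8 \approx 9$ dB, so each cell would operate about 9 dB below the total output power. In practice, combiner loss and mismatch between cells reduce this benefit.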



Figure 1.3: RF-DAC block diagram.

This project explores the potential and limits of different designs on this spectrum. High-linearity global mixers (passive and active), described in Ch. 2, are designed and fabricated, along with a complete RF-DAC topology utilizing distributed combining, with a baseband DAC, analog filters, and distributed up-conversion mixers that drive different points on a transmission line to combine the I and Q signals while boosting the output power compression point. The RF-DAC prototype is described in Ch. 3. The transmitter was jointly designed and fabricated by the author (high-linearity DAC, high-speed digital link with clock-data-recovery circuit), Rohit Braganza (baseband filter, LO chains), and Sashank Krishnamurthy (distributed mixer).

## Chapter 2

## High Linearity Active Mixer

### 2.1 High-Linearity Mixer

In a transmitter, an ideal mixer translates the baseband signal to a higher frequency without distorting the information contained in the signal. A mixer is itself a non-linear system with respect to both its baseband (or intermediate-frequency, IF) input and its local oscillator (LO) input, because new frequency content is created at the output port. However, a good mixer should still behave linearly in the sense that its RF output amplitude is proportional to the IF input amplitude. In other words, the conversion gain, defined for a transmitter mixer as the ratio of the desired RF output signal amplitude to the IF input signal amplitude, should be constant and independent of the input signal level. In real mixers, there are limits beyond which the RF output has a sub-linear dependence on the IF input [9]. The output compression point of a transmitter mixer is the output power at which the conversion gain has dropped by a specified amount below its small-signal value. Usually the 1-dB compression point (known as $OP_{1dB}$) is specified. The output two-tone third-order intercept point (OIP3) is also often used to characterize the linearity of transmitter mixers. It is the extrapolated output power at which the third-order intermodulation components would equal the desired RF output signal (called the fundamental) in power.
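The two figures of merit defined above can be computed from measured data along the following lines. This is a minimal sketch, not the report's measurement procedure: the OIP3 helper assumes the ideal 1 dB/dB fundamental slope and 3 dB/dB IM3 slope, the compression helper takes the lowest-power point as the small-signal gain, and the sweep values are made up.

```python
def oip3_from_two_tone(p_fund_dbm: float, p_im3_dbm: float) -> float:
    """Extrapolate OIP3 from a single two-tone measurement point.

    Assumes the fundamental rises 1 dB/dB and the IM3 product 3 dB/dB,
    so the two lines meet (p_fund - p_im3) / 2 above the fundamental.
    """
    return p_fund_dbm + (p_fund_dbm - p_im3_dbm) / 2.0


def op1db_from_sweep(p_in_dbm, p_out_dbm):
    """Output power at 1 dB gain compression, by linear interpolation.

    The small-signal gain is taken from the first (lowest-power) point.
    Returns None if the sweep never reaches 1 dB of compression.
    """
    gains = [p_out - p_in for p_in, p_out in zip(p_in_dbm, p_out_dbm)]
    small_signal_gain = gains[0]
    for k in range(1, len(gains)):
        drop_prev = small_signal_gain - gains[k - 1]
        drop_curr = small_signal_gain - gains[k]
        if drop_prev < 1.0 <= drop_curr:
            t = (1.0 - drop_prev) / (drop_curr - drop_prev)
            return p_out_dbm[k - 1] + t * (p_out_dbm[k] - p_out_dbm[k - 1])
    return None


# Hypothetical power sweep (dBm), for illustration only.
sweep_in = [-30.0, -20.0, -10.0, 0.0]
sweep_out = [-30.0, -20.0, -10.5, -2.0]
op1db = op1db_from_sweep(sweep_in, sweep_out)
oip3 = oip3_from_two_tone(p_fund_dbm=-10.0, p_im3_dbm=-50.0)
```

The single-point OIP3 extrapolation is only meaningful where the IM3 product actually follows the 3 dB/dB slope.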

One of the key specifications of a good transmitter mixer is its linearity: the higher the $P_{1dB}$ or OIP3, the better the linearity. Several techniques for improving mixer linearity have been reported. [4] uses dynamic current injection in a double-balanced active current mixer in a sub-6 GHz receiver. [1] introduces fully differential Darlington cells in the RF transconductance stage to reduce the third-order non-linearity. [7] improves the linearity of an up-converting mixer using the Improved Derivative Super-Position (I-DS) technique, cascaded between the mixer's transconductance and switching stages. This technique enhances mixer linearity by canceling third-order intermodulation distortion (IM3) using opposing nonlinear currents from carefully biased transconductance devices. It maintains gain and efficiency but requires precise device matching and biasing, making it sensitive to process variations and temperature changes. Imperfections can degrade performance, complicating design and manufacturing. Most of the aforementioned methods come with increased current consumption. In fact, improving the $P_{1dB}$ by 1 dB usually requires significantly increased current consumption and degrades flicker noise performance.

The method proposed here relies on flattening the large-signal transconductance $G_m$ of the "switching" stage across $V_{GS}$ as the IF current changes. (The devices always operate in the saturation region, so strictly speaking they are not switching devices; the term "switching" is used here to denote the stage that has a common-source LO input and a common-gate IF input.) Simulation shows that the resulting mixer with extracted devices has a better signal-to-noise-and-distortion ratio (SNDR) than a conventional double-balanced active mixer, with a trade-off in signal-to-noise ratio at low IF input amplitude.

The mixer chip discussed in this section was designed and submitted for fabrication in a TSMC 28 nm bulk CMOS process in October 2019. The mixers are intended to serve as the global mixer in a traditional transmitter architecture, so the key design goal was to achieve high linearity with a single mixer. Two mixer designs were investigated: a passive mixer, designed by Rohit Braganza, and an active mixer. Both aim to achieve high linearity by reducing the signal-dependent quantities in the circuit. The principle of the proposed active mixer is discussed in detail.

### 2.2 Passive Mixer

The schematic of the designed passive mixer is shown in Fig. 2.1. It consists of the standard double-balanced mixer topology, with the addition of bootstrapping resistors from the IF input to the gate of the mixer transistors to further improve the linearity. These resistors cause the gate to track the (low frequency) IF port, keeping the transistors'  $V_{gs}$  independent of the IF signal, which reduces intermodulation (IM) products due to the IF signal [8].

The mixer is driven by a simple common-source amplifier with a choke inductor load. To keep the structure broadband, asymmetric T-coils were used as output matching networks to the probe pads and their ESD diodes. The overall layout is shown in Fig. 2.2. The measured results are given in Table 2.1 and were within 1-2 dB of simulation; the mixer showed a reasonable OIP3 for its low power consumption (8.4 mW at 24 GHz, 15.6 mW at 40 GHz) and was capable of operating across a wide bandwidth.

| Frequency (GHz) | Measured $P_{1dB}$ (dBm) | Measured OIP3 (dBm) | Simulated OIP3 (dBm) |
|----------------|-------------------------|--------------------|---------------------|
| 24             | -4.3                    | 4.16               | 5.25                |
| 32             | -9.1                    | 1.89               | 3.1                 |
| 40             | -8.9                    | 0.41               | 1.5                 |

Table 2.1: Measured $P_{1dB}$ and OIP3, and simulated OIP3, of the passive mixer.
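The statement that measurement and simulation agree to within 1-2 dB can be checked directly from the OIP3 columns of Table 2.1. The short sketch below copies the table values; the function name is illustrative.

```python
# (frequency in GHz, measured OIP3 in dBm, simulated OIP3 in dBm), from Table 2.1
TABLE_2_1 = [
    (24, 4.16, 5.25),
    (32, 1.89, 3.1),
    (40, 0.41, 1.5),
]


def oip3_gaps_db(rows):
    """Simulated minus measured OIP3 at each frequency, rounded to 0.01 dB."""
    return {freq: round(sim - meas, 2) for freq, meas, sim in rows}


gaps = oip3_gaps_db(TABLE_2_1)
```

At all three frequencies the simulated OIP3 exceeds the measured value by between 1 dB and 2 dB.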



Figure 2.1: Passive mixer schematic. Note the output capacitors represent the ESD diodes. The output is matched via an asymmetric T-coil. The resistors bootstrap the IF signal to the mixer gates, to remove IF-dependent linearity effects.

### 2.3 Active Mixer

An active mixer is so named because it provides power gain using active devices. Fig. 2.3 shows the current flow in a single-balanced active mixer. For analysis purposes, we model the IF input as a current source with a finite output conductance $G_o$. The differential pair for the common-source (CS) LO input is biased in saturation at all times, and a constant DC bleeding current is sunk by the tail device. At any given moment, the following equations describe the relationship between the currents in the different branches:

$$
\begin{aligned}
i_{1} + i_{2} &= i_{in} + G_{o} V_{S} \\
i_{1} &\approx f\left(V_{G_{0}} + \frac{v_{LO}}{2},\, V_{S}\right) \\
i_{2} &\approx f\left(V_{G_{0}} - \frac{v_{LO}}{2},\, V_{S}\right)
\end{aligned}
\tag{2.1}
$$

Figure 2.2: Passive mixer layout, viewed in Cadence Virtuoso. The T-coil matching is at the top, the driver choke is below.

If we write $i_1$ in terms of its DC term and the derivatives of $i_1$ with respect to $V_G$ and $V_S$ times the differential quantities, we get:

$$
\begin{aligned}
i_{1} &\approx f(V_{G_{0}}, V_{S_{0}}) + \frac{\partial f}{\partial V_{G}} \frac{v_{LO}}{2} + \frac{\partial f}{\partial V_{S}} v_{S} + \frac{\partial^{2} f}{\partial V_{G}\, \partial V_{S}} \frac{v_{LO}}{2} \times v_{S} + \dots \\
&\approx I_{Q} + G_{m1}(t) \frac{v_{LO}}{2} + \frac{\partial f}{\partial V_{S}} v_{S} + \frac{\partial^{2} f}{\partial V_{G}\, \partial V_{S}} \frac{v_{LO}}{2} \times v_{S} + \dots
\end{aligned}
\tag{2.2}
$$

The first term is the DC current with no signal. The second is the LO feedthrough term, which is cancelled in the double-balanced differential output. The third term, whose full derivative is more involved, is non-linear and time-varying (it changes with the IF signal) and generates distortion and mixing. The fourth term is likewise non-linear and time-varying, and it also changes with the IF signal. This simple Taylor series expansion shows that the output current depends non-linearly on both the source (input) voltage and the gate (LO) voltage. The LO drives the gates periodically in time, while the source voltage varies due to the DAC and the second harmonic of the LO. At any given moment, the value of $v_S$, and consequently the split between the current that flows into the differential pair and the current that flows into the load $G_o$, depends on the ratio between $G_o(t)$ and $G_m(t)$. Even with an ideal DAC whose output impedance is independent of the signal, the design would still suffer from non-linearity because the magnitude of $G_m(t)$ varies with the source voltage. Therefore, flattening $G_m$ with respect to the varying source voltage keeps the fraction of the source current that flows into the differential pair more nearly constant. With $G_m$ flattened, the input impedance seen by the DAC stays constant across input power, and the output amplitude of the mixer tracks the baseband DAC current linearly.
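The dependence on the ratio of $G_o$ to $G_m$ can be illustrated with a small-signal current divider. The sketch below is a simplification introduced here, not the report's analysis: it assumes the pair presents a conductance $G_m$ at its source node in parallel with $G_o$, so the fraction of the source current entering the pair is $G_m / (G_m + G_o)$. All numerical values are hypothetical.

```python
def pair_current_fraction(gm_s: float, go_s: float) -> float:
    """Fraction of the source current that enters the differential pair,
    modelled as a current divider between Gm and Go (both in siemens)."""
    return gm_s / (gm_s + go_s)


def fraction_spread(gm_nominal_s: float, gm_variation: float, go_s: float) -> float:
    """Change in the current fraction as Gm swings by +/- gm_variation
    (a relative amount, e.g. 0.3 for 30 %) around its nominal value."""
    low = pair_current_fraction(gm_nominal_s * (1.0 - gm_variation), go_s)
    high = pair_current_fraction(gm_nominal_s * (1.0 + gm_variation), go_s)
    return high - low


# Hypothetical values: 20 mS nominal Gm, 2 mS source output conductance.
GM_NOMINAL_S = 20e-3
GO_S = 2e-3

spread_unflattened = fraction_spread(GM_NOMINAL_S, 0.30, GO_S)  # Gm varies 30 %
spread_flattened = fraction_spread(GM_NOMINAL_S, 0.05, GO_S)    # Gm varies 5 %
```

With a smaller signal-dependent swing in $G_m$, the fraction of the source current reaching the pair changes far less over the signal range, which is the sense in which a flatter $G_m$ linearizes the mixer.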

In a differential amplifier with tail current $I_{SS}$, if the two input MOSFETs differ in size, the large-signal transconductance $G_m$ versus $V_{GS}$ curve shifts to the left or to the right, as shown in Fig. 2.4 [6]. If the drains of two such amplifiers with opposite asymmetry are combined, the overall $G_m$ curve can be flattened, which is exactly what is needed here.
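The flattening effect of summing two oppositely shifted $G_m$ curves can be illustrated numerically. The sketch below is only a stand-in: it uses a generic bell-shaped curve, $\mathrm{sech}^2(v)$ (the derivative of a $\tanh$ transfer characteristic), rather than the MOSFET pair behavior in the report, and it models the shift as an explicit input offset instead of a device size ratio. The normalized voltage axis and the chosen offset are assumptions.

```python
import math


def gm_bell(v: float) -> float:
    """Stand-in normalized Gm curve: sech^2(v), the slope of tanh(v)."""
    return 1.0 / math.cosh(v) ** 2


def gm_combined(v: float, offset: float) -> float:
    """Two copies of the stand-in curve, shifted left and right by `offset`,
    each carrying half the current (hence the factor 0.5)."""
    return 0.5 * (gm_bell(v - offset) + gm_bell(v + offset))


def flatness(gm_fn, v_max: float = 0.5, points: int = 101) -> float:
    """Ratio of minimum to maximum Gm over [-v_max, v_max]; 1.0 is perfectly flat."""
    samples = [gm_fn(-v_max + 2.0 * v_max * k / (points - 1)) for k in range(points)]
    return min(samples) / max(samples)


# Offset at which the curvature of the combined sech^2 curve vanishes at v = 0.
OFFSET = math.atanh(1.0 / math.sqrt(3.0))

flatness_single = flatness(gm_bell)
flatness_combined = flatness(lambda v: gm_combined(v, OFFSET))
```

For this stand-in model, the combined curve has a flatness ratio much closer to 1 than the single curve over the same input range, at the cost of a lower peak $G_m$.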

The architecture of the proposed active mixer is shown in Fig. 2.5. It is a modified current-commutating (Gilbert-type) mixer. Instead of one current-steering DAC (with + and – outputs), we use two, each drawing half of the current. The topology is adapted from the $G_m$-linearized amplifier by connecting the drains of two such amplifiers together. By purposely introducing asymmetry between the LO+ and LO- transistors, we can reduce the non-linear mixing term at the output.



Figure 2.3: Simplified model for single-balanced active mixer.



Figure 2.4:  $G_m$  curve shifting due to asymmetric differential MOSFET pair, cited from [6]

Figure 2.5: Linearity-improved active mixer.

For testing purposes, we replaced each of the two DACs with a common-gate input stage to allow enough headroom for swing at the output. The IF input is AC-coupled to the sources of these common-gate transistors, with the DC level set to ground through baluns. The schematic with floorplan and the layout are shown in Fig. 2.6 and Fig. 2.7, respectively. The DC current consumption of the mixer is 13.5 mA from a 1.2 V supply, but this current will be shared with the DAC. We simulated its performance at relatively high output power (where non-linearity dominates the signal-to-noise-and-distortion ratio, SINAD) in comparison to a traditional double-balanced mixer at the same output power and total DC current consumption; the results are shown in Fig. 2.8 and Fig. 2.9. Note that because more active MOSFETs are used in this design, the SINAD will be lower than that of a traditional double-balanced mixer when the signal power is very low (such that the noise power dominates the SINAD). The OIP3 simulated with extracted transistors, a 200 MHz bandwidth IF input, and ideal 50 $\Omega$ source and load impedances is 10.96 dBm at 25.6 GHz and 10.38 dBm at 50 GHz, as shown in Table 2.2. The output 50 $\Omega$ loads are connected directly to $V_{DD}$.

| Frequency | 25.6 GHz  | 50 GHz    |
|-----------|-----------|-----------|
| OIP3      | 10.96 dBm | 10.38 dBm |

Table 2.2: Simulated OIP3 of the proposed active mixer.

Figure 2.6: Schematic and floorplan of the linearity-improved active mixer.

Figure 2.7: Layout of linearity-improved active mixer.

The mixer chip was fabricated in TSMC 28nm bulk CMOS technology. The chip photo is shown in Fig. 2.11. The output power at 24, 28, 32, and 36 GHz with 200 MHz IF bandwidth was measured and is shown in Fig. 2.10. The conversion gains are less than unity due to the loss of the probes, cables, and matching networks. For the OIP3 test, one IF tone at 150 MHz and one at 200 MHz are power-combined before connecting to the power divider for the IF input. Note that the OIP3 extrapolated at different points varies, because the slope of the third-order intermodulation (IM3) products (in dB scale) is not constant, as opposed to the commonly seen 3 dB/dB slope, and hence the intersections of the trend lines do not represent the IP3 points. This is likely caused by higher-order nonlinear terms that fall at the IM3 frequencies, causing the amplitude to change. Table 2.3 shows the $P_{1dB}$ and extrapolated OIP3 measured at different frequencies. The measured OIP3 values are lower than the simulated OIP3 values (loaded with a 50 $\Omega$ load directly) at each frequency due to the output matching on the actual fabricated chip.



Figure 2.8: Comparison of gain compression of traditional mixer and the proposed mixer.



Figure 2.9: Comparison of Signal-to-Noise-and-Distortion Ratio of traditional mixer and the proposed mixer.

| Frequency           | 24 GHz    | 28 GHz    | 32 GHz    | 36 GHz    |
|---------------------|-----------|-----------|-----------|-----------|
| $P_{1dB}$           | -4.71 dBm | -9.29 dBm | -7.74 dBm | -8.82 dBm |
| OIP3 (extrapolated) | 4.89 dBm  | 0.31 dBm  | 1.86 dBm  | 0.78 dBm  |

Table 2.3: Measured $P_{1dB}$ and extrapolated OIP3 of the proposed active mixer.
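As a cross-check on Table 2.3, the sketch below computes the gap between the tabulated OIP3 and $P_{1dB}$ at each frequency. The dictionaries simply transcribe the table. The reference value is the textbook separation between the input-referred IP3 and 1 dB compression point of a memoryless third-order nonlinearity $y = a_1 x + a_3 x^3$; it is brought in here only for comparison and is an assumption about the device model.

```python
import math

# Measured values transcribed from Table 2.3 (dBm), keyed by frequency in GHz.
P1DB_DBM = {24: -4.71, 28: -9.29, 32: -7.74, 36: -8.82}
OIP3_DBM = {24: 4.89, 28: 0.31, 32: 1.86, 36: 0.78}

def ip3_to_p1db_gap_db():
    """IP3-to-P1dB gap (input-referred) for y = a1*x + a3*x^3.

    Compression: 1 - (3/4)|a3/a1|A^2 = 10^(-1/20); intercept: A^2 = (4/3)|a1/a3|.
    """
    return -10 * math.log10(1 - 10 ** (-1 / 20))

gaps = {f: round(OIP3_DBM[f] - P1DB_DBM[f], 2) for f in P1DB_DBM}
for f, gap in gaps.items():
    print(f"{f} GHz: OIP3 - P1dB = {gap:.2f} dB")
print(f"third-order model gap: {ip3_to_p1db_gap_db():.2f} dB")
```

The tabulated gap is the same at all four frequencies and close to the third-order model value, which suggests that the extrapolated OIP3 tracks $P_{1dB}$ here.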



Figure 2.10: Fundamental and  $IM_3$  output power measurements.



Figure 2.11: Chip photo for the passive and active mixers.

## Chapter 3

# High-Linearity DAC with Distributed Mixer

### 3.1 Overview

As discussed in Ch. 1, the traditional transmitter architecture with global DAC, mixer, and PA and the distributed RF-DAC architecture with N DAC-mixer elements are on opposite ends of the design spectrum. There is, however, still a large design space between the two extremes. One possible choice is to eliminate the explicit PA as prescribed by the RF-DAC architecture, but instead of using the inherently linear one-bit DAC per mixer approach, the baseband can be generated using a high-linearity global DAC and its output can be fed to distributed mixers along a transmission line.

Building on the high-linearity mixer, a complete transmitter with 10-bit I/Q baseband and RF at 40-60 GHz was designed and fabricated in TSMC 28nm bulk CMOS. Fig. 3.1 shows the system block diagram of the transmitter, which consists of both I and Q paths.

### 3.2 High-Linearity DAC

Traditional current-steering DACs, such as the one shown in Fig. 3.2, suffer from a number of problems. One of the main problems is limited headroom. The current-source MOSFETs are usually long-channel devices, chosen for high output impedance and hence linearity. Therefore, to generate a given current, a larger $V_{GS}$ bias is required than for short-channel devices. Consequently, a high drain-source DC voltage is needed to keep them in saturation. This reduces the headroom at the output node, which makes current sharing between the DAC and the active mixer extremely difficult. In addition, the swing at the output node directly affects the bias of the current source. Cascoding the current source to boost its output impedance leads to the same headroom problem, which rules out mixer cascoding or low-supply-voltage operation.



Figure 3.1: System block diagram of the transmitter.



Figure 3.2: Conventional current-steering DAC.

Another issue is that, since there is no shielding between the input differential pair and the output node, charge feedthrough from the input to the output through parasitic gate-drain coupling can severely degrade the performance of the DAC, creating a series of spurious tones in the output spectrum. One way to decouple the output node from the input node is to add a cascode transistor before the output node, as shown in Fig. 3.3. However, this is only feasible if the supply voltage is high enough; otherwise, headroom will still be an issue.



Figure 3.3: Conventional and Cascoded current-steering DAC.

The designed DAC adopts a folded-cascode structure, shown in Fig. 3.4. The topology is adapted from a folded-cascode differential amplifier and was first proposed with an NMOS-input topology in [5], but was not verified in silicon. $M_{5,6}$ provide a constant bleeding current. To make the DAC stackable with an active mixer so that the two share the DC bleeding current, a PMOS-input topology is selected. By putting the input pair and current source on a separate branch, the $V_{DS}$ of $M_1$ and $M_2$ can be set high independently of the output voltage bias. Because headroom is no longer an issue (due to reduced stacking of transistors), $M_{7,8}$ can be kept as shielding devices to remove direct input/output coupling and considerably mitigate the impact of charge feedthrough.

The DAC is segmented into 4 thermometer-coded MSBs and 6 binary-weighted LSBs, for 10 bits in total. The binary cells share one folded branch (right side of the structure), as shown in Fig. 3.5. Each thermometer cell is $2^6 = 64$ times the LSB size. The final layout of the DAC is shown in Fig. 3.6. The DC power consumption of the 10-bit DAC is 1.4 mW, and the AC power (with the bit latches included) is 1.47 mW.
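The segmentation can be sketched as a simple encoder. The split used below (4 thermometer-coded MSBs driving 15 unit cells of 64 LSB each, plus 6 binary-weighted LSBs) is an assumption inferred from the 10-bit total and the $2^6$ thermometer cell size; the function names are hypothetical and the code says nothing about the actual decoder logic on the chip.

```python
N_BITS = 10                             # total resolution stated in the text
N_BINARY = 6                            # binary-weighted LSBs (cell = 2**6 LSB)
N_THERMO_BITS = N_BITS - N_BINARY       # assumed: remaining MSBs are thermometer coded
N_THERMO_CELLS = 2 ** N_THERMO_BITS - 1

def segment_code(code):
    """Split a DAC code into thermometer cell enables and binary LSB bits."""
    if not 0 <= code < 2 ** N_BITS:
        raise ValueError("code out of range")
    msb = code >> N_BINARY
    lsb = code & (2 ** N_BINARY - 1)
    thermo = [1 if k < msb else 0 for k in range(N_THERMO_CELLS)]
    binary = [(lsb >> b) & 1 for b in range(N_BINARY)]  # index 0 = LSB
    return thermo, binary

def output_lsb_units(thermo, binary):
    """Sum the switched currents in units of one LSB."""
    return sum(thermo) * 2 ** N_BINARY + sum(bit << b for b, bit in enumerate(binary))

thermo, binary = segment_code(0b1010110011)
print(sum(thermo), binary, output_lsb_units(thermo, binary))
```

Only one additional thermometer cell switches when the MSB field increments, which is the usual motivation for thermometer-coding the large cells.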



Figure 3.4: Folded-Cascode current-steering DAC.



Figure 3.5: Six-bit binary-weighted input branches are connected to the same folded branch.



Figure 3.6: Layout of the folded-cascode current-steering DAC.

### 3.3 High-Speed FPGA-to-Chip CDR

One significant challenge in testing the described DAC is delivering all the bits (10 differential pairs each for the I and Q channels) at 10 GS/s with aligned phases. The bits are not fully synchronized, owing to the limited bank size of the FPGA and to skew from differing transmission delays. For high-speed bit generation and transmission to the DAC input on the chip, the Xilinx UltraScale+ VCU118 FPGA platform was used with its built-in 28 Gbps serial LVDS transmitters. Each bank can drive four differential channels, so driving the 20 differential I/Q bits requires a total of five banks. Because each bank has a separate clock domain, the relative delay between the banks and the delay due to routing need to be corrected on-chip.

Fig. 3.7 shows the phase calibration loop that was designed for bit alignment. The loop calibrates the delay of each bit by comparing the phase of the bit against the master clock that drives the latches of the DAC. Each bit passes through a digital delay line initialized to its minimum delay. The loop detects whether the rising edge of the master clock arrives earlier than that of the bit, and delay is added to that bit (in steps of 9 ps) until the rising edge of that bit lags behind the clock. The calibration is performed for all bits of the I and Q channels, and then additional delay is added to the DAC (via scan bits) to meet setup and hold time requirements. The layout is shown in Fig. 3.8. This system consumes 110 mA of AC current and 2.9 mA of DC current from a 1.2 V supply when operating with an activity factor $\alpha = 1$.
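The calibration loop can be sketched behaviorally as follows. This is a simplified model under stated assumptions, not the on-chip logic: edge times are expressed relative to the clock edge, the phase detector is ideal, and `max_steps` and the example skews are hypothetical values. Only the 9 ps step comes from the text.

```python
STEP_PS = 9.0  # delay-line step from the text

def calibrate_bit(bit_edge_ps, clk_edge_ps=0.0, max_steps=64):
    """Add delay steps until the bit's rising edge lags the clock edge.

    Returns the number of steps applied. max_steps is a hypothetical
    delay-line length, not a value from the design.
    """
    steps = 0
    while bit_edge_ps + steps * STEP_PS <= clk_edge_ps:
        if steps == max_steps:
            raise RuntimeError("delay line range exhausted")
        steps += 1
    return steps

def calibrate_all(bit_edges_ps):
    """Calibrate every bit lane independently against the same clock."""
    return [calibrate_bit(t) for t in bit_edges_ps]

# Hypothetical arrival times (ps, relative to the clock edge) for four lanes.
skews = [-30.0, -12.5, -1.0, 4.0]
steps = calibrate_all(skews)
aligned = [t + n * STEP_PS for t, n in zip(skews, steps)]
print(steps)
print(aligned)
```

In this model, every lane that started ahead of the clock ends up within one 9 ps step after the clock edge, so the residual lane-to-lane skew is bounded by the step size.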

Figure 3.7: Block Diagram of Clock-Data Recovery (CDR) circuit to align the bits.



Figure 3.8: Clock-Data Recovery (CDR) circuit to align the bits.



Figure 3.9: Current mirror filter schematic. The input is on the left side, and the output is on the right.

### 3.4 Current Mirror Filter

The anti-alias filter schematic is shown in Fig. 3.9. This filter is a modified version of that shown in [3]. The input to the filter has a low impedance thanks to negative feedback (akin to a super source follower or gain-boosted stages), even though the loop gain is not extraordinarily large (note that the common-source device drives the source of the transistor above it). Even with moderate gain, this allows for a lower input impedance at no additional power cost. The small-signal input impedance was designed and simulated at around 15 $\Omega$, with approximately 2 mA of current consumption in the input branch (4 mA in total for the filter). The filtering action is created by two *m*-derived sections and an output *RC* network. The *m*-derived sections provide both a sharp roll-off and a notch, and were created using T-coils and variable capacitors (3-bit cap-DAC). The capacitor DAC adjusts the corner frequency (as well as the notch frequency, though the ratio of the corner and notch frequencies remains constant). The Bode plot is shown in Fig. 3.10, and the overall layout in Fig. 3.11. Note that the signal is differential, so care was taken to keep the layout symmetric. Each filter consumes around 4 mW of power.
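The constant ratio between the corner and notch frequencies follows from the standard relations for an ideal lumped $m$-derived low-pass section, reproduced here as a reminder. The symbols $L$, $C$, and $m$ refer to the constant-$k$ prototype and are not design values from this work:

$$
f_c = \frac{1}{\pi\sqrt{LC}}, \qquad
f_\infty = \frac{f_c}{\sqrt{1-m^2}}, \qquad
\frac{f_\infty}{f_c} = \frac{1}{\sqrt{1-m^2}} .
$$

In this idealized model, $m$ is set by the ratio of the fixed inductances, so changing the capacitance scales $f_c$ and $f_\infty$ together as $1/\sqrt{C}$ and leaves their ratio unchanged.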

With the filter knocking down the clock tone at 10 GHz, the Effective Number of Bits (ENOB) is 9.05 bits when integrating the noise up to 50 GHz, as shown in Fig. 3.12.
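For reference, the ENOB figure can be translated to a SINAD using the standard relation for a full-scale sinusoid, $\mathrm{SINAD} = 6.02 \cdot \mathrm{ENOB} + 1.76$ dB. The helper names below are hypothetical; only the 9.05-bit value comes from the text.

```python
def sinad_db_from_enob(enob):
    """SINAD in dB for a full-scale sine, from the effective number of bits."""
    return 6.02 * enob + 1.76

def enob_from_sinad_db(sinad_db):
    """Inverse relation: effective number of bits from SINAD in dB."""
    return (sinad_db - 1.76) / 6.02

sinad = sinad_db_from_enob(9.05)
print(f"ENOB 9.05 bits -> SINAD {sinad:.2f} dB")
print(f"ideal 10-bit SINAD: {sinad_db_from_enob(10):.2f} dB")
```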



Figure 3.10: Current mirror filter Bode plot. Note the zero from the m-derived sections.



Figure 3.11: Layout of the current mirror filter.



Figure 3.12: Simulation of the DAC with the filter.

### 3.5 Distributed Active Mixer

To achieve high OIP3, a distributed mixer is implemented with the LO and RF signals delayed by the same amount along the transmission lines. Because transmission lines are used, the delay is a true time delay rather than a phase delay, and the design is inherently broadband. To connect the gates of the sub-mixers as well as the drains, we designed a 125 $\Omega$ differential high-impedance transmission line that absorbs the output capacitance of the mixers. The IF signal is injected into the sub-mixers using similar techniques, and the phase mismatch of the IF signal costs little gain since the IF bandwidth is significantly less than the LO frequency. The simulated loss of gain with a 500 MHz filtered IF signal and a 30 GHz LO frequency is less than 0.1 dB.
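A small phasor sum illustrates why IF phase mismatch is benign while the LO and RF paths need matched delays. The number of sub-mixers and the per-section delay error below are hypothetical, and the model (equal-amplitude taps with a progressive phase error) is a simplification of the actual line.

```python
import cmath
import math

def combining_loss_db(n_taps, tau_s, f_hz):
    """Gain loss when n_taps equal-amplitude contributions add with a
    progressive phase error of 2*pi*f*tau per tap."""
    phase_step = 2 * math.pi * f_hz * tau_s
    total = sum(cmath.exp(1j * k * phase_step) for k in range(n_taps))
    return 20 * math.log10(abs(total) / n_taps)

# Hypothetical values: 4 sub-mixers, 5 ps of uncompensated delay per section.
loss_if = combining_loss_db(4, 5e-12, 500e6)  # error seen at the IF frequency
loss_lo = combining_loss_db(4, 5e-12, 30e9)   # same error if left at the LO frequency
print(f"loss at 500 MHz: {loss_if:.4f} dB")
print(f"loss at 30 GHz:  {loss_lo:.2f} dB")
```

With these assumed numbers the same delay error that is negligible at the IF frequency would cost several dB at the LO frequency, which is why the LO and RF delays are equalized along the lines.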

Fig. 3.13 shows the distributed mixer with output matching network to absorb the ESD and pad capacitances. The combiner for I/Q signals is also built-in by alternating the I-mixers and the Q-mixers along the output transmission line.

Because the mixers are distributed, the linearity requirement on each sub-element is not as high as in the traditional single-mixer design. As shown in Fig. 3.14, the common-source IF input is located at the bottom, and the common-gate LO input is located at the top. The IF port uses class A bias, so the mixing gain is higher than with class B, but the efficiency is lower. For better integration with the DAC-filter, we chose to bias the IF at 400 mV $V_{GS}$ with class A operation.

Figure 3.13: Distributed mixer and output matching network.



Figure 3.14: Double-Balanced active mixer quad.

The RF output pad capacitance is simulated to be 40 fF, and the ESD diodes contribute 80 fF of capacitance. To absorb these capacitances, we designed an output matching network with an artificial T-line that matches to the 50 $\Omega$ output load. The other side of the transmission line is terminated with an RF choke with 150 pH inductance connected to $V_{DD}$, so the output swing of the mixer is enhanced.
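The scale of the artificial T-line elements can be estimated from the constant-$k$ relations $Z_0 = \sqrt{L/C}$ and $f_c = 1/(\pi\sqrt{LC})$. The sketch below assumes, purely for illustration, that the full pad plus ESD capacitance is absorbed as the shunt capacitor of a single section; the actual network topology is not specified at this level of detail.

```python
import math

def section_inductance(z0_ohm, c_farad):
    """Inductance of one constant-k LC section with impedance z0."""
    return z0_ohm ** 2 * c_farad

def cutoff_frequency(l_henry, c_farad):
    """Cutoff frequency of a constant-k low-pass section."""
    return 1.0 / (math.pi * math.sqrt(l_henry * c_farad))

C_PAD = 40e-15   # pad capacitance from the text
C_ESD = 80e-15   # ESD diode capacitance from the text
c_total = C_PAD + C_ESD

# Assumption: the full 120 fF is the shunt capacitor of a single section.
l_sec = section_inductance(50.0, c_total)
f_c = cutoff_frequency(l_sec, c_total)
print(f"L = {l_sec * 1e12:.0f} pH, fc = {f_c / 1e9:.1f} GHz")
```

Under this single-section assumption the cutoff lands just above 50 GHz; splitting the capacitance across more sections would raise it.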

Fig. 3.15 shows the high-impedance transmission line with mixer loads distributed across the line. The unloaded line has $Z_0 = 125$ $\Omega$, and the loaded line has a slightly lower impedance of 115 $\Omega$. The bandwidth of the line reduces with the addition of taps for the LO gate and RF drain nodes.
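The drop from 125 $\Omega$ to 115 $\Omega$ gives a rough estimate of how much capacitance the taps add. Treating the taps as a uniformly distributed shunt capacitance $C_{tap}$ per unit length on top of the line's own $C_{line}$ (an idealization, since the real taps are discrete), the loaded impedance and the implied capacitance ratio are:

$$
Z_{0,\mathrm{loaded}} = \sqrt{\frac{L_{line}}{C_{line} + C_{tap}}} = \frac{Z_0}{\sqrt{1 + C_{tap}/C_{line}}}
\quad\Rightarrow\quad
\frac{C_{tap}}{C_{line}} = \left(\frac{125}{115}\right)^2 - 1 \approx 0.18 .
$$

In the same model the phase velocity falls by the factor $115/125 \approx 0.92$, so the electrical length of each section grows accordingly.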

The gain and linearity ($P_{1dB}$) of the mixer, including the I/Q combining loss, are shown in Fig. 3.17. A small-signal 3-dB bandwidth of 13-50 GHz is achieved, with a peak $P_{1dB}$ of -0.5 dBm at 30 GHz. The distributed mixer consumes a total of 33 mA of DC current when operated at $P_{1dB}$ ($V_{DD} = 1.2$ V). The entire transmission line with the taps, RF choke, and inductors is EM-simulated, and all active devices are extracted from the layout, including the metal stacks from M10.
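The efficiency entry in Table 3.1 can be cross-checked from the numbers above. The sketch below defines efficiency as RF output power at $P_{1dB}$ divided by DC power, which is an assumption about how the table value was computed; the function names are hypothetical.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

def dc_to_rf_efficiency(p_out_dbm, i_dc_a, v_dd):
    """RF output power divided by DC power drawn from the supply."""
    p_dc_mw = i_dc_a * v_dd * 1e3
    return dbm_to_mw(p_out_dbm) / p_dc_mw

eff = dc_to_rf_efficiency(-0.5, 33e-3, 1.2)
print(f"P_out = {dbm_to_mw(-0.5):.3f} mW, efficiency = {100 * eff:.2f} %")
```

The result is close to the 2.3% listed in Table 3.1 when the compression point with I/Q combining is used.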



Figure 3.15: Distributed mixer layout.

Static EVM simulation was conducted with 16-QAM modulation. Fig. 3.18 to Fig. 3.21 show one quadrant of the 16-QAM constellation. With the peak power at $OP_{1dB}$, the EVM is -19.1 dB with I/Q pulling and -21.5 dB without, so I/Q pulling degrades the EVM by 2.4 dB. When the output power is backed off to $OP_{0.5dB}$, the EVM is -22.5 dB with I/Q pulling and -25.7 dB without, a 3.2 dB degradation.
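For readers less familiar with the metric, the following sketch shows one common way to compute a static RMS EVM from ideal and distorted constellation points, and converts the quoted dB values to percent. The normalization (error power over ideal constellation power) and the function names are assumptions; the simulation setup used in this work may differ in detail.

```python
import math

def evm_db(ideal, measured):
    """RMS EVM in dB, normalized to the power of the ideal points."""
    err = sum(abs(m - i) ** 2 for i, m in zip(ideal, measured))
    ref = sum(abs(i) ** 2 for i in ideal)
    return 10 * math.log10(err / ref)

def evm_db_to_percent(evm):
    """Convert an EVM in dB to an RMS percentage."""
    return 100 * 10 ** (evm / 20)

# EVM values quoted in the text, expressed in percent.
for label, value in [("OP1dB, with pulling", -19.1), ("OP1dB, no pulling", -21.5),
                     ("OP0.5dB, with pulling", -22.5), ("OP0.5dB, no pulling", -25.7)]:
    print(f"{label}: {value} dB = {evm_db_to_percent(value):.1f} %")
```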

Table 3.1 summarizes the performance of the distributed mixer.

Figure 3.16: Distributed mixer layout on the chip.

| Metric                                           | Performance |
|--------------------------------------------------|-------------|
| Bandwidth                                        | 13-50 GHz   |
| $P_{1dB}$ at 30 GHz (w/o I/Q combining)          | 2.5 dBm     |
| $P_{1dB}$ at 30 GHz (w/ I/Q combining)           | -0.5 dBm    |
| DC power (at $OP_{1dB}$)                         | 39 mW       |
| DC supply voltage                                | 1.2 V       |
| Efficiency                                       | 2.3%        |
| EVM (at $OP_{1dB}$ at 30 GHz with I/Q pulling)   | -19.5 dB    |
| EVM (at $OP_{0.5dB}$ at 30 GHz with I/Q pulling) | -22.5 dB    |

Table 3.1: Summary of the performance of the distributed mixer.

### 3.6 Dual Mode Wideband LO Chain

A major challenge in the transmitter is providing LO power across the entire frequency range, from around 18 GHz to potentially 60 GHz. As this is far more than an octave, and thus out of range for a typical broadband matching network, a dual frequency drive chain was designed. The general idea is to have separate high and low frequency amplifier chains, with a switch to select between them, as shown in Fig. 3.22. Both LO paths end in a two-stage (with inter-stage low-k transformer matching) class A/B driver that drives the LO through the switch network and into the transmission line mixer input. The gate bias voltages in the drivers were made fully reconfigurable. This tunability allows for increased LO drive power if necessary during testing (class A bias). The nominal setting draws 15 mA of DC current from a 1.2 V supply for both stages combined, with higher power needed to overcome the lossy switch network.

Figure 3.17: Simulated distributed mixer gain.

Figure 3.18: Distributed mixer with I/Q pulling at $OP_{1dB}$.

**Low Frequency Chain** The low frequency chain operates from around 18 to 35 GHz. The schematic is shown in Fig. 3.23. At these frequencies, an inverter-based chain is still viable and is hence used for simplicity. An artificial transmission line LC hybrid, based on [2], provides I/Q signaling, with tuning capacitors used to set the balance and center frequency. A balun provides on-chip differential signaling as well as impedance matching. The entire low frequency chain consumes around 40 mA from a 1.2 V supply, primarily due to the power-hungry inverter chains.



Figure 3.19: Distributed mixer without I/Q Pulling at  $OP_{1dB}$ .



Figure 3.20: Distributed mixer with I/Q pulling at  $OP_{0.5dB}$ .

Figure 3.21: Distributed mixer without I/Q Pulling at $OP_{0.5dB}$.

Figure 3.22: LO chain concept. Low and high frequency amplifier paths, followed by a switch network to select the desired frequency band. The LO signal is then fed into the distributed mixer (the transmission line).

**High Frequency Chain** The higher frequency chain operates from around 35 to 60 GHz. The input probe pad and its ESD diodes are matched via a T-coil. The key feature in this part of the chain is the broadband 50 $\Omega$ hybrid for I/Q signaling. This was accomplished using a four-finger Lange coupler, shown in Fig. 3.24. The coupler achieves at worst 1.25 dB of amplitude mismatch and 3 degrees of phase imbalance across the 35-60 GHz band, with about 1 dB of insertion loss. The structure was designed and simulated in HFSS. From the hybrid, the signal is fed directly into a balun and matching network to provide the required differential signaling and 50 $\Omega$ match between the driver amplifiers and the hybrid. To provide a broadband match, a two-stage network was used; the balun adds 10 degrees of phase imbalance across the entire band, which is not ideal but still provides acceptable performance. The total power consumption here is simply that of the driver amplifiers, 15 mA at 1.2 V (18 mW), though this can be adjusted higher if a larger LO drive is required.
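To put the hybrid's worst-case imbalance in context, the sketch below evaluates the standard sideband (image) suppression expression for a quadrature modulator with gain imbalance $g$ and phase error $\varphi$. It attributes all imbalance to the LO hybrid and ignores the baseband paths, the balun, and any limiting in the LO drivers, so it is a rough bound under those assumptions and not a simulated result from this work.

```python
import math

def sideband_suppression_db(amp_imbalance_db, phase_error_deg):
    """Wanted-to-image power ratio of an I/Q modulator with LO imbalance."""
    g = 10 ** (amp_imbalance_db / 20)
    phi = math.radians(phase_error_deg)
    wanted = 1 + 2 * g * math.cos(phi) + g ** 2
    image = 1 - 2 * g * math.cos(phi) + g ** 2
    return 10 * math.log10(wanted / image)

# Worst-case coupler numbers quoted in the text.
worst_case = sideband_suppression_db(1.25, 3.0)
phase_only = sideband_suppression_db(0.0, 3.0)
print(f"1.25 dB and 3 deg: {worst_case:.1f} dB")
print(f"3 deg only:        {phase_only:.1f} dB")
```

In this simplified picture the amplitude mismatch, not the phase error, dominates the worst case, which is one reason amplitude limiting in the subsequent driver stages can be helpful.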

Figure 3.23: The low frequency LO chain, prior to the final driver stage. Note that the amplifiers are simply inverters. An artificial transmission line LC hybrid is based on [2], with 4-bit tuned capacitors.

Figure 3.24: The four finger Lange coupler. The dummy metal (blue) fills the surrounding regions to pass the difficult density DRC rules.

The layout of the dual LO chain paths can be viewed in Fig. 3.25. The lumped and Lange hybrids are in the center of the chip, with the I/Q amplifier chains laid out symmetrically around the mixer. The signals are routed around the chip with grounded co-planar waveguide transmission lines.

Figure 3.25: Complete layout of the LO chains. The lumped and Lange hybrids are in the center of the chip, with the I/Q amplifier chains laid out symmetrically around the mixer.

### 3.7 Transmitter Overview

The architecture of each path is shown in Fig. 3.26. The output compression point ($OP_{1dB}$) for each element in the distributed architecture with I/Q combining is around -2 dBm when the DAC operates at full scale with a 500 MHz sinusoidal waveform sampled at 10 GS/s and the DAC flip-flops are driven by an ideal 10-bit ADC Verilog-A model that outputs ideal, jitter-free waveforms. The system consumes 71 mW of power under the above operating conditions (including I/Q but excluding the power consumed by the high-speed FPGA-to-chip link, which is not part of the core transmitter architecture). Table 3.2 summarizes the simulated performance.

| Metric                                  | Simulation |
|-----------------------------------------|------------|
| Tunable RF Bandwidth                    | 18-60 GHz  |
| IF Bandwidth                            | 500 MHz    |
| $P_{1dB}$ at 30 GHz (w/o I/Q combining) | 1.0 dBm    |
| $P_{1dB}$ at 30 GHz (w/ I/Q combining)  | -2.1 dBm   |
| DC power (I and Q)                      | 71 mW      |
| DC supply voltage                       | 1.2 V      |

Table 3.2: Summary of the performance of the overall system.

### 3.8 Test Setup

The test setup consists of several components: digital I/Q bit generation and feeding, external LO generation, RF output probing, and scan-chain control for programming of the chip configurations.

Figure 3.26: Architecture of the transmitter.

In order to generate 20 differential bits at a 10 GS/s rate for the I/Q DACs, the Xilinx VCU118 evaluation board with the XCVU9P FPGA was used. It has 24 GTY serial transceivers which can run up to 28 Gbps. The bits for all 20 channels are pre-generated using Python code and stored in a COE memory file, which is loaded into the block RAM on the FPGA. The FPGA is programmed to fetch the bits for 32 samples from the memory in parallel at a clock rate of 1/32 of the line rate (10 GS/s), and the GTY transceiver IP (used in transmitter mode) serializes the user input and sends these bits via a high-speed FMC+ (VITA 57.4 standard) board-to-board cable. The on-chip clock-data-recovery (CDR) circuitry, which ensures that the DAC flip-flops register all the bits at the correct time, is described in Section 3.3.
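The bit pre-generation step can be sketched as follows. This is a minimal illustration, not the script used for testing: the offset-binary coding, the lane ordering (I bits 0-9, then Q bits 0-9), the packing of the first sample into the LSB of each 32-bit word, and all function names are assumptions. The only parts taken from the setup are the 10 GS/s rate, the 500 MHz tone, the 10-bit resolution, the 32-sample fetch width, and the use of a COE file with its `memory_initialization_radix` and `memory_initialization_vector` keywords.

```python
import math

FS = 10e9        # DAC sample rate
F_TONE = 500e6   # baseband tone
N_BITS = 10
WORD = 32        # samples serialized per FPGA clock cycle

def tone_codes(n_samples, phase=0.0):
    """Full-scale sinusoid quantized to unsigned N_BITS codes (assumed coding)."""
    full = 2 ** N_BITS - 1
    return [round(full * (0.5 + 0.5 * math.sin(2 * math.pi * F_TONE * n / FS + phase)))
            for n in range(n_samples)]

def lane_words(codes, bit):
    """Pack one DAC bit across time into WORD-bit integers, first sample in the LSB."""
    words = []
    for start in range(0, len(codes), WORD):
        w = 0
        for k, c in enumerate(codes[start:start + WORD]):
            w |= ((c >> bit) & 1) << k
        words.append(w)
    return words

def make_coe(i_codes, q_codes):
    """Return COE text; each row holds 20 lanes (I bits 0-9, then Q bits 0-9)."""
    lanes = [lane_words(i_codes, b) for b in range(N_BITS)]
    lanes += [lane_words(q_codes, b) for b in range(N_BITS)]
    rows = ["".join(f"{lane[r]:08x}" for lane in lanes) for r in range(len(lanes[0]))]
    return ("memory_initialization_radix=16;\n"
            "memory_initialization_vector=\n" + ",\n".join(rows) + ";\n")

n = 160  # 8 tone periods of 20 samples = 5 rows of 32 samples
coe_text = make_coe(tone_codes(n), tone_codes(n, phase=math.pi / 2))
print(coe_text.splitlines()[0])
print(f"user clock = {FS / WORD / 1e6:.1f} MHz")
```

Choosing a record length that contains a whole number of tone periods and a whole number of 32-sample words lets the memory loop without a phase jump.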

The GTY transmitters require external reference clocks from the FMC+ pins in order to generate the clocks for the desired line rate at each bank, and the FMC+ board-to-board cable has a Skyworks Si5380A-RevD low-jitter clock chip built into it. A 54 MHz crystal is used as the reference for this chip. The chip itself must be programmed via an I2C interface before it can generate the required reference clock frequency.

In case the DAC section fails, the chip can be configured to bypass the digital input and directly take analog differential I/Q baseband signals using switches controlled by the scan chain. When the analog IF I/Q input is used, the signals are generated by a signal generator or an arbitrary waveform generator (AWG). The input is AC-coupled to the IF bypass input port through a balun followed by two bias tees on each of the I and Q sides, which provide the DC bias required by the IF input transistors of the distributed mixers.

The LO is fed by a signal generator through a Cascade Infinity G-S-G probe, and the RF output is measured using a vector signal analyzer through a Cascade Infinity G-S-S-G probe. The probes will be landed from the top and the bottom side of the chip, respectively. Fig. 3.27 shows the block diagram of the test setup. The probe station setup is shown in Fig. 3.28.



Figure 3.27: Test setup block diagram.

### 3.9 Measurement

Fig. 3.29 shows the top level layout of the transmitter chip, which was taped out in a TSMC 28nm bulk CMOS process (chip photo shown in Fig. 3.30). The first attempt at measurement failed due to PCB errors and lack of space for probing. The second attempt, after fixing the PCB issues, did not yield observable output at the RF output port, and the DAC output alone does not have a test port. We attempted to identify whether some signal had coupled from the baseband filter output to the bypass input of the distributed mixer, which does not have a directly connected path to the output of the baseband filter. We observed some signs of signal feedthrough, indicating that the DAC is producing some output power. However, the signal was too weak to show meaningful SINAD.



Figure 3.28: Probe station setup of the chip.



Figure 3.29: Layout of the wideband linear transmitter.



Figure 3.30: Die Photo of wideband linear transmitter.

# Chapter 4 Conclusion

In this project, we explored the design of wideband mm-wave CMOS transmitters with enhanced linearity, applicable to modern highly linear front-end modules.

We proposed and fabricated an active and a passive transmitter mixer with improved linearity. The measurements demonstrate the validity of these approaches. We also designed and fabricated a complete transmitter chain to realize I/Q combining and broadband performance. The chip includes a high-speed digital interface to feed 20 bits of I/Q data into the transmitter as well as LO generation and distribution. The chip has a bandwidth of 13-50 GHz and a compression point of 1.0/-2.1 dBm without/with I/Q combining, while consuming 71 mW on a 1.2 V supply in simulation.

# Bibliography

- [1] Bijari, A., Z. S. Linearity improvement in a CMOS down-conversion active mixer for WLAN applications. *Analog Integrated Circuits and Signal Processing 100* (2019), 483–493.
- [2] Iotti, L., LaCaille, G., and Niknejad, A. M. A 12mW 70-to-100GHz mixer-first receiver front-end for mm-wave massive-MIMO arrays in 28nm CMOS. In 2018 IEEE International Solid-State Circuits Conference (ISSCC) (2018), pp. 414–416.
- [3] Jann, B., Chance, G., Roy, A. G., Balakrishnan, A., Karandikar, N., Brown, T., Li, X., Davis, B., Ceballos, J. L., Tanzi, N., Hausmann, K., Yoon, H., Huang, Y.-l., Freiman, A., Geren, B., Pawliuk, P., and Ballantyne, W. 21.5 A 5G sub-6GHz zero-IF and mm-wave IF transceiver with MIMO and carrier aggregation. In 2019 IEEE International Solid-State Circuits Conference (ISSCC) (2019), pp. 352–354.
- [4] Mohsenpour, M.-M., and Saavedra, C. E. Method to improve the linearity of active commutating mixers using dynamic current injection. In 2016 IEEE MTT-S International Microwave Symposium (IMS) (2016), pp. 1–4.
- [5] Radiom, S., Sheikholeslami, B., Aminzadeh, H., and Lotfi, R. Folded-current-steering DAC: an approach to low-voltage high-speed high-resolution D/A converters. In 2006 IEEE International Symposium on Circuits and Systems (2006), pp. 4 pp.-4786.
- [6] Razavi, B. Design of Analog CMOS Integrated Circuits, second ed. 2015.
- [7] Siddique, A., Delwar, T. S., Behera, P., Biswal, M. R., Haider, A., and Ryu, J.-Y. Design and analysis of a novel 24 GHz up-conversion mixer with improved derivative super-position linearizer technique for 5G applications. *Sensors* 21, 18 (2021).
- [8] Tang, C.-C., Lee, Y.-B., Sun, C.-H. E., Lin, C.-C., Syu, J.-S., Wu, M.-H., Chen, Y., Chueh, T.-C., Bryant, C., Collados, M., Hassan, M., Ramos, J., Hsieh, Y.-L., Chen, H.-H., Guo, X., Chen, H., Cao, C., Li, D., Strange, J., Wang, C., and Dehng, G.-K. 21.4 An LTE-A multimode multiband RF transceiver with 4RX/2TX inter-band carrier aggregation, 2-carrier 4×4 MIMO with 256QAM and HPUE capability in 28nm CMOS. In 2019 IEEE International Solid-State Circuits Conference (ISSCC) (2019), pp. 350–352.
- [9] Lee, T. H. The Design of CMOS Radio-Frequency Integrated Circuits, second ed. 2004.