# Fanout Optimization for an Inductor-less Broadband Variable Gain Cherry-Hooper Amplifier



Sashank Krishnamurthy Ali Niknejad

## Electrical Engineering and Computer Sciences University of California, Berkeley

Technical Report No. UCB/EECS-2021-23 http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-23.html

May 1, 2021

Copyright © 2021, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

# Fanout Optimization for an Inductor-less Broadband Variable Gain Cherry-Hooper Amplifier

by Sashank Krishnamurthy

## **Research Project**

Submitted to the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, in partial satisfaction of the requirements for the degree of **Master of Science**, **Plan II**.

Approval for the Report and Comprehensive Examination:

**Committee:** 

Professor Ali M. Niknejad Research Advisor

(Date)


Professor Elad Alon Second Reader

5-28-2020

(Date)


#### Abstract

This report describes the design and analysis of a broadband inductor-less cascode-inverter based Cherry-Hooper amplifier. The methodology to maximize the bandwidth for a desired fan-out and gain is described. Simulation results of the extracted layout, in 28nm bulk CMOS, are presented. A bandwidth of 19.2GHz and a gain of 28.3dB were obtained, while consuming a power of 10.3mW, from a 1V supply. Additionally, the cascode bias can be used to tune the gain from 13.6dB to 29dB without significant impact on the bandwidth.

#### **1** Introduction

This report addresses the design of the baseband amplifier for a high-speed, energy-efficient mm-wave wireless receiver [1]. The amplifier is driven by the output of a mixer, which is capable of driving a fixed amplifier input capacitance $C_{in}$, and the amplifier in turn drives a fixed load $C_L$. The gain of the amplifier A is specified based on the link-budget analysis of the entire receiver.

For this gain A and fan-out  $F = C_L/C_{in}$ , the devices are sized for optimal bandwidth and DC power, and the bandwidth is expressed as a function of the fan-out F and technological parameters like intrinsic gain  $a_{v0}$  and transit frequency  $\omega_T$  of transistors.

Specifically, this amplifier is designed for a maximum $C_{in}$ of 35fF, a fan-out F of 1, a bandwidth of at least 15GHz and a nominal gain of 30 (~ 30dB), which is approximately equal to $a_{v0}^2$, where $a_{v0} = 5.4$ is the intrinsic gain of an inverter of minimum channel length in the process. Gain programmability from 10 to 30dB is also desired. Because an LNA and an active mixer precede the amplifier, the noise figure of the baseband amplifier is not very critical. This report therefore addresses only the optimization of bandwidth and DC power consumption for the required gain. While the design choices in this report are driven by the aforementioned specifications, the optimization procedure is general and may be used for other specifications.

State-of-the-art broadband baseband amplifiers [2–4] use inductive peaking techniques to achieve high bandwidth. In this report, we show that an optimally designed inductor-less Cherry-Hooper amplifier suffices to meet the high bandwidth requirement.

As derived in [5], if N amplifier stages with equal GBW product are used, the optimal gain per stage $G_1$ and number of stages $N_{opt}$ that maximize bandwidth for a fixed gain A are given by $G_1 = \sqrt{e}$ and $N_{opt} = 2 \ln(A)$. However, to minimize the DC power consumption of the amplifier, it is desirable to minimize the number of stages while still meeting the bandwidth requirement.
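As a quick numerical check of these two expressions, the sketch below evaluates them for the gain specification A = 30. The function name `optimal_cascade` is only illustrative and does not come from [5].

```python
import math

def optimal_cascade(total_gain):
    """Return (N_opt, G_1) for identical stages with equal gain-bandwidth product."""
    n_opt = 2.0 * math.log(total_gain)   # N_opt = 2 ln(A)
    gain_per_stage = math.sqrt(math.e)   # G_1 = sqrt(e), about 1.65
    return n_opt, gain_per_stage

n_opt, g1 = optimal_cascade(30.0)
n_stages = round(n_opt)  # nearest integer number of stages
print(f"N_opt = {n_opt:.2f} (about {n_stages} stages), G_1 = {g1:.3f}")
```

Note that $G_1^{N_{opt}} = e^{\ln A} = A$, so the two expressions are mutually consistent. Rounding $N_{opt}$ to an integer gives the seven stages quoted in the next section.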

#### **2** Limits of Differential Pairs

To achieve a gain A, the simplest topology is a cascade of differential pairs (see Fig. 1). The gain of a single differential pair cannot be higher than the intrinsic gain $a_{v0} = g_m r_o$ of the transistor, and is actually $A_d = g_m (r_o || R_L)$, where $R_L$ is the resistive load. To get an overall gain of A with N stages ($N \ge 2$), each stage must have a gain $A_d = A^{\frac{1}{N}}$. Additionally, for a given $F = C_L/C_{in}$, the transistors of each stage must be sized a factor of $F^{\frac{1}{N}}$ times those of the previous stage. Hence, the bandwidth of a single stage can be derived as

$$\begin{aligned}
BW &= \frac{1}{(r_o||R_L) \left[ C_{ds} + C_{gd} + F^{\frac{1}{N}} \left( C_{gs} + (1 + A^{\frac{1}{N}}) C_{gd} \right) \right]} \\
&= \frac{\omega_T (1 + \gamma)}{A^{\frac{1}{N}} \left[ 1 + \gamma + F^{\frac{1}{N}} \left( 1 + (1 + A^{\frac{1}{N}}) \gamma \right) \right]}
\end{aligned} \tag{1}$$

For (1), we assume that  $C_{gd} = \gamma C_{gs}$  ( $\gamma \approx 0.25$ ) and  $C_{ds} \approx C_{gs}$ . This assumption was validated by parasitic extraction of the devices.

The bandwidth of a cascade of N single-pole stages [5] is given by

$$BW_N \approx \frac{0.88\omega_T (1+\gamma)}{\sqrt{N}A^{\frac{1}{N}} \left[1+\gamma + F^{\frac{1}{N}} \left(1+(1+A^{\frac{1}{N}})\gamma\right)\right]} \tag{2}$$

From (2) and Fig. 2, it is observed that even for a fan-out F as low as 1, the maximum achievable bandwidth for the gain spec of A = 30 is only $\sim 0.085\omega_T$, and reaching it requires as many as 7 stages. With an extrinsic $f_T$ of $\sim 180 - 200$GHz in the 28nm process used in this work, it is extremely challenging to meet the stringent bandwidth specification of more than 15GHz with a cascade of differential pairs.
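Equation (2) is easy to sweep numerically. The following is a minimal sketch that evaluates it exactly as written, for F = 1, A = 30 and $\gamma = 0.25$; the function name `bw_diff_pair_cascade` and the sweep range are choices made for this illustration.

```python
import math

GAMMA = 0.25  # C_gd / C_gs, the value quoted in the text

def bw_diff_pair_cascade(n_stages, total_gain, fanout, gamma=GAMMA):
    """Equation (2): N-stage differential-pair bandwidth as a fraction of omega_T."""
    g = total_gain ** (1.0 / n_stages)  # gain per stage, A^(1/N)
    s = fanout ** (1.0 / n_stages)      # stage-to-stage size ratio, F^(1/N)
    load = 1.0 + gamma + s * (1.0 + (1.0 + g) * gamma)
    return 0.88 * (1.0 + gamma) / (math.sqrt(n_stages) * g * load)

# Sweep the number of stages for F = 1 and A = 30.
sweep = {n: bw_diff_pair_cascade(n, 30.0, 1.0) for n in range(2, 13)}
for n, bw in sweep.items():
    print(f"N = {n:2d}: BW_N / omega_T = {bw:.4f}")
```

The resulting curve is quite flat around its maximum, so the exact stage count near 7 matters less than the fact that the fraction of $\omega_T$ stays below 0.1 for every N.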

To mitigate the Miller effect, a differential pair using cascoded transistors may be considered. A cascade of N such stages has bandwidth

$$BW_N \approx \frac{0.88\omega_T}{\sqrt{N}A^{\frac{1}{N}} \left(1 + F^{\frac{1}{N}}\right)} \tag{3}$$



Figure 1: Cascade of Differential Pairs.

It is observed that cascoding improves the bandwidth by mitigating the Miller effect when the number of stages N is small (higher gain per stage $G_1$). However, the improvement in maximum achievable bandwidth (~ $0.1\omega_T$ with 7 stages) is not very significant, because the Miller effect is already weak at the small gain per stage $G_1$ used when N is large, even without cascoding. Additionally, stacking devices using the same supply also reduces $f_T$ to ~ 90 – 100GHz. The gain-bandwidth trade-off (due to the high impedance node at the output of each stage) significantly limits the maximum achievable bandwidth for a fixed gain, even with cascoding.
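To make the comparison concrete, the sketch below evaluates (2) and (3) side by side for F = 1 and A = 30. Both functions are direct transcriptions of the equations; their names are illustrative only.

```python
import math

def bw_plain(n_stages, total_gain, fanout, gamma=0.25):
    """Equation (2): cascade of plain differential pairs, fraction of omega_T."""
    g = total_gain ** (1.0 / n_stages)
    s = fanout ** (1.0 / n_stages)
    load = 1.0 + gamma + s * (1.0 + (1.0 + g) * gamma)
    return 0.88 * (1.0 + gamma) / (math.sqrt(n_stages) * g * load)

def bw_cascoded(n_stages, total_gain, fanout):
    """Equation (3): cascade of cascoded differential pairs, fraction of omega_T."""
    g = total_gain ** (1.0 / n_stages)
    s = fanout ** (1.0 / n_stages)
    return 0.88 / (math.sqrt(n_stages) * g * (1.0 + s))

# Relative benefit of cascoding for few stages versus many stages.
for n in (2, 4, 7):
    plain = bw_plain(n, 30.0, 1.0)
    casc = bw_cascoded(n, 30.0, 1.0)
    print(f"N = {n}: plain {plain:.4f}, cascoded {casc:.4f}, ratio {casc / plain:.2f}")
```

The ratio shrinks as N grows, which matches the observation that cascoding helps most when the gain per stage is high. Note that the two fractions refer to different $f_T$ values, since stacking reduces $f_T$.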



Figure 2: Bandwidth as a fraction of  $f_T$  v/s N for N-stage cascade of differential pairs.



Figure 3: Single stage cascode inverter based Cherry-Hooper amplifier, and its small-signal equivalent.

#### **3** Cherry-Hooper Amplifier

The Cherry-Hooper amplifier [6] (see Fig. 3) consists of a transconductance driving a trans-impedance amplifier (TIA). It broadens amplifier bandwidth by incorporating drain-gate shunt feedback through a resistance $R_F$ and creating a low impedance at nodes X and Y. The gain of the Cherry-Hooper amplifier [7] is given by

$$\frac{v_{out}}{v_{in}} = g_{m1}R_F - \frac{g_{m1}}{g_{m2}} \approx g_{m1}R_F \tag{4}$$

The low frequency resistance looking into nodes X and Y is approximately $1/g_{m2}$. Using a real-pole approximation [7], $\omega_X \approx \omega_T / (1 + \frac{C_{dd1}}{C_{gg2}})$ and $\omega_Y \approx \omega_T / (1 + \frac{C_L}{C_{dd2}})$. Equation (4) and these approximate expressions for $\omega_X$ and $\omega_Y$ show that the gain may be tuned by changing $R_F$ without affecting the bandwidth. However, a resistor bank adds significant parasitic capacitance. A parasitic-free way to tune the gain is to change $g_{m1}$ by adjusting the cascode bias. Neither the real-pole approximation nor the complete analysis of the transfer function in [7] presents a method to size the TIA transistors ($g_{m2}$) with respect to the $g_{m1}$ transistors for a given fan-out. That analysis also does not take into account the effects of finite $g_m r_o$.

In this report, we provide a design-oriented analysis of the cascode-inverter based Cherry-Hooper amplifier. The self-biased cascode-inverter (see Fig. 3) provides a simple method of biasing the amplifiers, without the penalty of parasitics from any biasing circuitry. Additionally, the increased open-loop gain of  $A_{casc}a_{v0}$  due to the cascode enhances the benefit of shunt feedback, which lowers the low-frequency impedance looking into nodes X and Y (see Fig. 3).

Consider a single stage Cherry-Hooper amplifier. We define n as the ratio of $g_{m2}$ to $g_{m1}$, and $\alpha = R_F/A_{casc}r_{o1}$, where $A_{casc} = g_{m1,casc}r_{o1,casc}$. The intrinsic gain of a single stage of cascode-inverter is $A_{casc}g_{m1}r_{o1}$.<sup>1</sup> Shunt feedback in this topology lowers the gain to $\sim g_{m1}R_F$, so the ratio of the closed-loop gain to the intrinsic gain is approximately $\alpha$. We provide a method to find the optimal values $n_{opt}$ and $\alpha_{opt}$ that maximize the bandwidth of a single-stage Cherry-Hooper amplifier for a fixed value of gain A.

The gain given by equation (4) assumes that  $g_m r_o \gg 1$  and also assumes that the feedback resistance  $R_F \ll r_{o1}, r_{o2}$ . These are not true for short-channel devices. Without these approximations, the DC gain A of the single-stage Cherry-Hooper amplifier is

$$A = \frac{\left(\frac{A_{casc}a_{v0}}{1+A_{casc}a_{v0}}\right)\left(A_{casc}a_{v0}\alpha - \frac{1}{n}\right)}{1 + \frac{\alpha + \frac{1}{n}}{1+A_{casc}a_{v0}}} \tag{5}$$

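When $a_{v0}$ tends to $\infty$, the first factor in the numerator and the denominator both tend to 1, and the second factor in the numerator, $A_{casc}a_{v0}\alpha - \frac{1}{n} = g_{m1}R_F - \frac{g_{m1}}{g_{m2}}$, is exactly the same as (4). Equation (5) may be used to find $\alpha_{opt}$ once $n_{opt}$ is computed.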
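The step of recovering $\alpha$ from a target gain can be sketched in a few lines. The code below transcribes (5) and inverts it in closed form; the inversion was worked out algebraically for this illustration and is not taken from the report, and `g_open` is shorthand introduced here for the product $A_{casc}a_{v0}$.

```python
def dc_gain(alpha, n, g_open):
    """Equation (5); g_open stands for the product A_casc * a_v0."""
    num = (g_open / (1.0 + g_open)) * (g_open * alpha - 1.0 / n)
    den = 1.0 + (alpha + 1.0 / n) / (1.0 + g_open)
    return num / den

def alpha_for_gain(gain, n, g_open):
    """Closed-form inversion of equation (5) for alpha (derived for this sketch)."""
    return (gain * (1.0 + g_open) + (gain + g_open) / n) / (g_open ** 2 - gain)

# Example: per-stage gain sqrt(30), g_open = 20 as assumed for the plots, n = 1.
target = 30.0 ** 0.5
alpha = alpha_for_gain(target, n=1.0, g_open=20.0)
print(f"alpha = {alpha:.4f}, gain check = {dc_gain(alpha, 1.0, 20.0):.4f}")
```

Multiplying $\alpha$ by $A_{casc}r_{o1}$ then gives the feedback resistance $R_F$ for the chosen n.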

<sup>1</sup> $A_{casc}g_{m}r_{o}$ of 20 is assumed for the plots in this section.

Now, the bandwidth is approximated using a real-pole approximation to obtain some insights on sizing the TIA with respect to the transconductance stage. The poles looking into the input and output of the TIA are given by

$$\begin{aligned}
\omega_{X} &= \frac{1}{\left[\frac{R_{F} + A_{casc}r_{o2}}{1 + A_{casc}g_{m2}r_{o2}} \,||\, A_{casc}r_{o1}\right]\left(C_{gg2} + C_{dd1}\right)} \\
\omega_{Y} &= \frac{1}{\left[\frac{R_{F} + A_{casc}r_{o1}}{1 + A_{casc}g_{m2}r_{o1}} \,||\, A_{casc}r_{o2}\right]\left(C_{dd2} + C_{L}\right)}
\end{aligned} \tag{6}$$

Using (5) and the definitions for n and  $\alpha$ , (6) can be re-written in terms of design specs fan-out f, gain A and technology parameters  $A_{casc}a_{v0}$  and  $\omega_T$  as

$$\begin{aligned}
\omega_X &\approx \frac{\omega_T}{n+1} \left( \frac{1}{A_{casc} a_{v0}} + \frac{n A_{casc} a_{v0}}{A_{casc} a_{v0} + nA} \right) \\
\omega_Y &\approx \frac{\omega_T}{n+f} \left( \frac{n}{A_{casc} a_{v0}} + \frac{n A_{casc} a_{v0}}{A_{casc} a_{v0} + A} \right)
\end{aligned} \tag{7}$$



Figure 4:  $\omega_X$ ,  $\omega_Y$  and bandwidth v/s n, the ratio of  $g_{m2}$  to  $g_{m1}$ , for f = 1,  $A = \sqrt{30}$ .

The two poles  $\omega_X$ ,  $\omega_Y$  and the bandwidth, as a fraction of  $\omega_T$ , are plotted against *n*, the relative sizing of the TIA  $g_{m2}$  with respect to the transconductance  $g_{m1}$ , for a fixed value of fan-out f = 1 and fixed gain  $A = \sqrt{30}$ , in Fig. 4. The values of f and A are chosen to match the final implementation.

Consider the pole $\omega_X$ associated with node X. The resistance looking into node X is $1/g_{m2}$ for $R_F \ll r_{o2}$, and the capacitance is $C_{gg2} + C_{dd1}$. As *n* increases, initially both $g_{m2}$ and $C_{gg2}$ increase, but $C_{dd1}$ is constant, thereby increasing the value of $\omega_X$. However, for larger *n*, the resistance looking into node X approaches a constant $R_F/A_{casc}a_{v0}$. On the other hand, $C_{gg2}$ continues to increase with n, thereby decreasing $\omega_X$. $\omega_X$ is a function of n/(n+1) for small n and 1/(n+1) for large n, thus explaining the plot of $\omega_X$ versus n in Fig. 4. It is apparent from the expression for $\omega_Y$ and the plot in Fig. 4 that $\omega_Y$ is a monotonically increasing function of n. Clearly, from the plots of $\omega_X$ and $\omega_Y$, there exists an optimum value $n_{opt}$ for maximizing the bandwidth.
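The behaviour described above can be reproduced by sweeping the real-pole expressions in (7). This is a minimal sketch using the values stated for Fig. 4 (f = 1, $A = \sqrt{30}$, $A_{casc}a_{v0} = 20$); the function names, the grid, and the normalization $\omega_T = 1$ are choices made here.

```python
def omega_x(n, gain, g_open):
    """Equation (7), pole at node X, as a fraction of omega_T."""
    return (1.0 / (n + 1.0)) * (1.0 / g_open + n * g_open / (g_open + n * gain))

def omega_y(n, gain, g_open, f):
    """Equation (7), pole at node Y, as a fraction of omega_T."""
    return (1.0 / (n + f)) * (n / g_open + n * g_open / (g_open + gain))

GAIN = 30.0 ** 0.5   # per-stage gain used for Fig. 4
G_OPEN = 20.0        # A_casc * a_v0, as assumed for the plots
F = 1.0              # fan-out f

grid = [0.1 * k for k in range(1, 101)]          # n from 0.1 to 10
wx = [omega_x(n, GAIN, G_OPEN) for n in grid]
wy = [omega_y(n, GAIN, G_OPEN, F) for n in grid]

n_peak = grid[wx.index(max(wx))]                 # n that maximizes omega_X on the grid
print(f"omega_X peaks near n = {n_peak:.1f} at {max(wx):.3f} omega_T")
print(f"omega_Y rises from {wy[0]:.3f} to {wy[-1]:.3f} omega_T")
```

On this grid $\omega_X$ has an interior maximum while $\omega_Y$ increases monotonically, which is the trade-off that produces an optimum n.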

While the real-pole approximation gives insights about the existence of  $n_{opt}$  for maximizing bandwidth, it is necessary to do the less intuitive but complete analysis to get an accurate estimate of sizing and bandwidth (see solid plot in Fig. 4). We have the following transfer function  $v_o(s)/v_{in}(s)$ , for the Cherry-Hooper amplifier of Fig. 3.

$$\frac{v_o(s)}{v_{in}(s)} = \frac{A}{1+bs+as^2} \tag{8}$$

where a and b are given by

$$\begin{aligned}
a &= \frac{A_{casc}a_{v0}}{\omega_T^2} \times \frac{(A(1+n+nA_{casc}a_{v0}) + A_{casc}a_{v0})(1+n)(n+f)}{n(1+nA_{casc}a_{v0})(1+A_{casc}a_{v0})} \\
b &= \frac{(A_{casc}a_{v0} + nA)(n+1)}{\omega_T(1+nA_{casc}a_{v0})} + \frac{(A_{casc}a_{v0} + A)(n+f)}{n\omega_T(A_{casc}a_{v0} + 1)}
\end{aligned} \tag{9}$$

While optimizing for bandwidth using the exact complex pole analysis, it is seen that the maximum bandwidth yields a high Q transfer function with peaking (see Fig. 5). The peaking may be reduced by choosing a value of  $n < n_{opt}$ , where  $n_{opt}$  is the size of  $g_{m2}$  with respect to  $g_{m1}$  for maximum bandwidth. However, this reduction in peaking results in marginal bandwidth degradation.
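The bandwidth and peaking of the second-order response in (8) follow directly from the coefficients in (9). The sketch below is one way to compute them, assuming $\omega_T$ is normalized to 1 and using the standard results for a two-pole low-pass response, $Q = \sqrt{a}/b$ and a peak of $Q/\sqrt{1 - 1/(4Q^2)}$ when $Q > 1/\sqrt{2}$. The function names and the list of n values are illustrative, not taken from the report.

```python
import math

def second_order_coeffs(n, gain, g_open, f, omega_t=1.0):
    """Coefficients a and b of equation (9); g_open stands for A_casc * a_v0."""
    a = (g_open / omega_t ** 2) * (
        (gain * (1 + n + n * g_open) + g_open) * (1 + n) * (n + f)
    ) / (n * (1 + n * g_open) * (1 + g_open))
    b = ((g_open + n * gain) * (n + 1)) / (omega_t * (1 + n * g_open)) + (
        (g_open + gain) * (n + f)
    ) / (n * omega_t * (g_open + 1))
    return a, b

def bandwidth_3db(a, b):
    """Solve |1 + j*b*w - a*w^2|^2 = 2 for w, the -3 dB frequency of (8)."""
    p = b * b - 2.0 * a
    x = (-p + math.sqrt(p * p + 4.0 * a * a)) / (2.0 * a * a)  # x = w^2
    return math.sqrt(x)

def peaking_db(a, b):
    """In-band peaking in dB relative to the DC gain (0 if there is no peak)."""
    q = math.sqrt(a) / b
    if q <= 1.0 / math.sqrt(2.0):
        return 0.0
    return 20.0 * math.log10(q / math.sqrt(1.0 - 1.0 / (4.0 * q * q)))

GAIN, G_OPEN, F = 30.0 ** 0.5, 20.0, 1.0
for n in (0.5, 1.0, 1.5, 2.0, 3.0, 4.0):
    a, b = second_order_coeffs(n, GAIN, G_OPEN, F)
    print(f"n = {n:3.1f}: BW = {bandwidth_3db(a, b):.3f} omega_T, "
          f"peaking = {peaking_db(a, b):.2f} dB")
```

Sweeping n this way exposes the trade-off described above: the designer can pick the largest n whose peaking stays within an acceptable limit.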



Figure 5: Bandwidth v/s n, the ratio of  $g_{m2}$  to  $g_{m1}$ , for  $f = 1, A = \sqrt{30}$ .



Figure 6: Sizing of N-stage Cherry-Hooper amplifier.

Now, to optimize bandwidth for an N-stage Cherry-Hooper amplifier, we do the following. Consider each stage of  $g_m$  and TIA together as one unit amplifier. Size these unit amplifiers using the standard exponential sizing used to minimize propagation delay of an inverter chain for a given fanout F. Now, for stage i of the amplifier, the size of the  $g_m$  cell is  $F^{(i-1)/N}$  and the size of the TIA cell is  $nF^{(i-1)/N}$ , where n is evaluated using the aforementioned optimization procedure for a single-stage Cherry-Hooper amplifier.
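The sizing rule can be written out as a short helper. This is a sketch only: `stage_sizes` is a hypothetical name, sizes are relative to the first $g_m$ cell, and the value of n passed in is a placeholder rather than the optimized value from this report.

```python
def stage_sizes(n_stages, fanout, n_ratio):
    """Relative (g_m cell, TIA cell) sizes for stages i = 1..N."""
    sizes = []
    for i in range(1, n_stages + 1):
        gm_size = fanout ** ((i - 1) / n_stages)   # F^((i-1)/N)
        tia_size = n_ratio * gm_size               # n * F^((i-1)/N)
        sizes.append((gm_size, tia_size))
    return sizes

# Example with placeholder values: two stages, overall fan-out 16, n = 2.
for i, (gm, tia) in enumerate(stage_sizes(2, 16.0, 2.0), start=1):
    print(f"stage {i}: g_m cell x{gm:.2f}, TIA cell x{tia:.2f}")
```

For F = 1, as in the final design, every stage has the same size and only n matters.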

For our application, we limit ourselves to two stages to minimize power consumption. Fig. 7 plots the maximum achievable bandwidth (for overall gain A = 30) using two stages of a Cherry-Hooper amplifier for different values of fanout F. For F = 1, we are able to achieve a bandwidth as high as $0.25\omega_T$, which is significantly higher than the maximum achievable bandwidth using 7 stages for a simple differential pair, with or without cascoding. Clearly, we can achieve the desired bandwidth specification of greater than 15GHz using a two-stage cascode-inverter based Cherry-Hooper amplifier. Additionally, as seen from Fig. 7, the maximum achievable bandwidth for F = 100 using a Cherry-Hooper amplifier is higher than the maximum achievable bandwidth for F = 1 using a simple cascade of differential pairs.
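To relate these fractions to the 15GHz specification, the sketch below multiplies each fraction quoted in the text by the corresponding $f_T$ range. Pairing the stacked-device range of 90 to 100GHz with the cascode-inverter Cherry-Hooper stage is an assumption of this illustration, since the report does not state the $f_T$ of that stage explicitly.

```python
# Bandwidth as a fraction of f_T, as quoted in the text for F = 1 and A = 30.
FRACTIONS = {
    "diff pair, 7 stages": 0.085,
    "cascoded diff pair, 7 stages": 0.10,
    "Cherry-Hooper, 2 stages": 0.25,
}

# f_T ranges in GHz quoted in the text. Using the stacked-device range for the
# Cherry-Hooper stage is an assumption made for this sketch.
FT_RANGE_GHZ = {
    "diff pair, 7 stages": (180.0, 200.0),
    "cascoded diff pair, 7 stages": (90.0, 100.0),
    "Cherry-Hooper, 2 stages": (90.0, 100.0),
}

SPEC_GHZ = 15.0

def bandwidth_range_ghz(name):
    """Return the (low, high) bandwidth estimate in GHz for one topology."""
    ft_low, ft_high = FT_RANGE_GHZ[name]
    return FRACTIONS[name] * ft_low, FRACTIONS[name] * ft_high

for name in FRACTIONS:
    low, high = bandwidth_range_ghz(name)
    print(f"{name}: {low:.1f} to {high:.1f} GHz (spec: > {SPEC_GHZ:.0f} GHz)")
```

Under these assumptions the two-stage Cherry-Hooper estimate clears the specification with margin, while the seven-stage cascoded cascade falls short of it.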



Figure 7: Bandwidth v/s fanout F for a gain of A = 30. To normalize with respect to power consumption, a 2-stage Cherry-Hooper amplifier (with 4 inverters) and a 4-stage cascoded differential pair are compared.

#### **4** Post-Layout Simulation Results

Fig. 8 shows the schematic of one half of a two-stage pseudo-differential cascode-inverter based Cherry-Hooper amplifier, in 28nm bulk CMOS, designed using the methodology described in this report. An AC coupling capacitance of 180fF was used between the two Cherry-Hooper amplifier stages. The large resistance $R_{bias}$ used to self-bias the inverter (see Fig. 8) was implemented using a MOS resistor to minimize parasitics. The complete layout, shown in Fig. 9, occupies an area of only $40\mu m \times 20\mu m$, significantly smaller than any inductor based implementation. Unity gain differential pairs may be cascaded with this design to enhance the common mode rejection, with minimum bandwidth degradation. Simulation results are presented using the complete extracted layout. The circuit is designed to operate at $V_{DD} = 1$V.

Figure 8: Complete schematic of 2-stage cascode-inverter based Cherry-Hooper amplifier. Transistor sizes are in $\mu$m.



Figure 9: Complete layout of 2-stage Cherry-Hooper amplifier.

The gain of this amplifier is controlled by varying  $g_{m1}$  of each Cherry-Hooper amplifier using the cascode bias voltages (see Fig. 8). Fig. 10 shows representative transfer functions for two different gain settings. The size of the TIA with respect to the  $g_m$ -stage was chosen to keep the in-band peaking under 3dB. AC coupling between the two Cherry-Hooper amplifier stages introduces a high-pass corner at less than 5MHz.



Figure 10: Simulated transfer function of the two-stage Cherry-Hooper amplifier, for two different settings of PMOS cascode bias voltage  $V_{bias,P}$  (see Fig. 3). NMOS cascode bias voltage  $V_{bias,N}$  was chosen to be  $V_{DD} - V_{bias,P}$ .

Fig. 11 plots the simulated gain and bandwidth as the cascode bias voltage is varied. By tuning $V_{bias,P}$ from 0.1 to 0.5V ($V_{bias,N}$ correspondingly from 0.9 to 0.5V), the gain can be tuned from 13.6 to 29dB. Over this entire tuning range, the bandwidth remains approximately constant at 19GHz, illustrating the gain-bandwidth independence of this topology.



Figure 11: Simulated gain and bandwidth for different settings of PMOS cascode bias voltage  $V_{bias,P}$ .  $V_{bias,N}$  was chosen to be  $V_{DD} - V_{bias,P}$ .

|                         | [3]   | [8]    | [4]  | This work |
|-------------------------|-------|--------|------|-----------|
| CMOS Tech.              | 90nm  | 65nm   | 65nm | 28nm      |
| Bandwidth (GHz)         | 44    | 11     | 22   | 19.2      |
| Gain (dB)               | 19    | 15.6   | 31.1 | 28.3      |
| DC power (mW)           | 57    | 2.6    | 23   | 10.3      |
| Area (mm<sup>2</sup>)   | 0.018 | 0.0029 | 0.12 | 0.0008    |
| Inductor used           | Yes   | No     | Yes  | No        |
| GBW/Power (GHz/mW)      | 6.9   | 25.5   | 34.3 | 48.5      |

Table 1: Comparison with state-of-the-art broadband baseband amplifiers
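The GBW/Power row of Table 1 can be reproduced from the bandwidth, gain and power rows, taking GBW as the linear voltage gain times the bandwidth in GHz. The sketch below does this; the dictionary layout and function name are illustrative.

```python
# (bandwidth in GHz, gain in dB, DC power in mW), copied from Table 1.
TABLE = {
    "[3]": (44.0, 19.0, 57.0),
    "[8]": (11.0, 15.6, 2.6),
    "[4]": (22.0, 31.1, 23.0),
    "This work": (19.2, 28.3, 10.3),
}

def fom_ghz_per_mw(bw_ghz, gain_db, power_mw):
    """Gain-bandwidth product divided by DC power, in GHz/mW."""
    gain_linear = 10.0 ** (gain_db / 20.0)  # dB to linear voltage gain
    return gain_linear * bw_ghz / power_mw

fom = {name: fom_ghz_per_mw(*row) for name, row in TABLE.items()}
for name, value in fom.items():
    print(f"{name}: GBW/Power = {value:.1f} GHz/mW")
```

The computed values agree with the last row of the table to the quoted precision.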

#### **5** Conclusion

This report provides a methodology for the design of a broadband inductor-less baseband amplifier for high-speed, energy-efficient wireless links. In particular, a framework to optimize the design of a Cherry-Hooper amplifier to achieve the desired bandwidth at minimum power consumption is provided. A parasitic-free method for bandwidth-independent gain tuning is also described. Table 1 compares this work against state-of-the-art broadband baseband amplifiers. This work achieves a higher FoM (GBW/Power) than the compared designs.

#### Acknowledgment

This work was funded by the National Science Foundation. The author would like to thank the BWRC staff, students and sponsors.

#### References

- [1] A. Townley, N. Baniasadi, S. Krishnamurthy, C. Sideris, A. Hajimiri, E. Alon, and A. Niknejad, "A fully integrated, dual channel, flip chip packaged 113 GHz transceiver in 28nm CMOS supporting an 80 Gb/s wireless link," in 2020 IEEE Custom Integrated Circuits Conference (CICC). IEEE, 2020.
- [2] S. Shekhar, J. S. Walling, and D. J. Allstot, "Bandwidth extension techniques for CMOS amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 11, pp. 2424–2439, 2006.
- [3] J. R. Weiss, M. A. Kossel, C. Menolfi, T. Morf, M. L. Schmatz, T. Toifl, and H. Jaeckel, "A DC-to-44-GHz 19dB gain amplifier in 90nm CMOS using capacitive bandwidth enhancement," in 2006 IEEE International Solid State Circuits Conference-Digest of Technical Papers. IEEE, 2006, pp. 2082–2091.
- [4] Z. Hou, Q. Pan, Y. Wang, L. Wu, and C. P. Yue, "A 23-mW 30-Gb/s digitally programmable limiting amplifier for 100GbE optical receivers," in 2014 IEEE Radio Frequency Integrated Circuits Symposium. IEEE, 2014, pp. 279–282.
- [5] T. H. Lee, *The Design of CMOS Radio-Frequency Integrated Circuits*. Cambridge University Press, 2003.
- [6] E. Cherry and D. Hooper, "The design of wide-band transistor feedback amplifiers," in *Proceedings of the Institution of Electrical Engineers*, vol. 110, no. 2. IET, 1963, pp. 375–389.
- [7] B. Razavi, *Design of Integrated Circuits for Optical Communications*. John Wiley & Sons, 2012.
- [8] J. Baylon, X. Yu, S. Gopal, R. Molavi, S. Mirabbasi, P. P. Pande, and D. Heo, "A 16-Gb/s low-power inductorless wideband gain-boosted baseband amplifier with skewed differential topology for wireless network-on-chip," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, pp. 1–13, 2018.