Copyright © 1984, by the author(s). All rights reserved.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

# LOW-VOLTAGE LOW-POWER MOS SWITCHED-CAPACITOR SIGNAL-PROCESSING TECHNIQUES

by R. Castello

Memorandum No. UCB/ERL M84/67

20 August 1984

**ELECTRONICS RESEARCH LABORATORY**

College of Engineering, University of California, Berkeley 94720

Ph. D. Rinaldo Castello, Department of EECS

Paul R. Gray, Chairman of Committee

### **Abstract**

To date, Switched-Capacitor (S.C.) filters satisfying the PCM channel filter specifications consume approximately 1 mW per pole and use a $\pm 5$ V supply voltage. From a fundamental standpoint, the absolute minimum achievable power dissipation in a voiceband filter with a dynamic range of 90 dB in a 3-micron technology operated from a $\pm 5$ V supply is less than one microwatt per pole. A very large margin for improvement is therefore available. Reduced power consumption is important in battery-operated analog/digital interfaces, and it will be even more so as larger and more complex systems are integrated on the same chip. At the same time, as a consequence of the scaling of MOS technology, supply voltages will have to be reduced.
This fact, and the desire for an analog/digital compatible technology, create a strong motivation for developing analog circuits more suitable for low-voltage operation. Although many low-power and/or low-voltage S.C. MOS circuits have been presented, no low-voltage, low-power filter meeting the PCM channel filter requirement has been reported to date. This dissertation describes a new 5<sup>th</sup>-order CMOS PCM channel filter operating from a single 5 V supply and dissipating 70 $\mu$W per pole. The experimental prototype shows that a level of performance comparable to or better than that of commercially available 10 V realizations is feasible. Together with the filter, a low-power buffer amplifier, also operating from a 5 V supply and able to drive off-chip loads, was realized.

# **ACKNOWLEDGEMENTS**

I wish to express my thanks and appreciation to my advisor Prof. Paul Gray for the continuous support that he has given me during the course of my Ph. D. studies. During these years I have grown to appreciate not only his deep insight in the field of I.C. design and his rigorous scientific ethic, but also his personal concern and caring toward other people, which lie below his apparently reserved manner. I feel privileged to have been able to work with him.

I would also like to express my sincere thanks to both Professors Robert Meyer and Dave Hodges for having always been so available to discuss problems and offer suggestions during these years. The help of Professors Alberto Sangiovanni-Vincentelli and Rainer Sachs is also gratefully acknowledged.

The help and friendship of many fellow graduate students were very important in making my research effort successful. In particular, a note of thanks goes to my two office mates of these years, Ron Kaneshiro and Lee-Chung Yiu, for their patience and understanding. Hae-Seung Lee and Ping Li also deserve special mention for helping me so much with the processing of my chip.
Each and every one of the students making up the I.C. group was helpful to me in ways ranging from technical cooperation to personal support and encouragement. Having been able to establish with them a rapport of friendship and trust is for me as important as the successful completion of my research. Here is a probably incomplete list of them: Paul Hurst, John Fattaruso, Devid Soo, Haideh Khorramabadi, Tat Choi, Kuang-Lu Lee, Chorng-Kuang Wang, Cheng-Chung Shih, Max Hauser, Bang Song, Jesus Guinea, Reza Kazerounian, Nanni De Micheli, Dan Senderowicz.

The cooperation of both Micheal Wong and W. E. Wallace with the testing and layout of the chip helped to keep the duration of my graduate study within reasonable limits. Special thanks go also to the microelectronics lab staff for their help and patience: Dot McDaniel, Don Rogers, Bob Hamilton, Dick Chan, Kim Chan, and Christy Atases. The cooperation provided by INTEL Corporation in successfully fabricating one of the two experimental chip prototypes is also gratefully acknowledged.

Last but not least, Prof. Paolo Antognetti bears most of the merit and the responsibility for my decision to come to Berkeley to pursue a graduate degree.

Research sponsored by the National Science Foundation Grants ECS-8023872/ECS-8100012/ECS-8L00L2.

### CHAPTER 1

# INTRODUCTION

Switched-Capacitor (S.C.) filter performance has been steadily improving in the last several years, and many prototypes satisfying the stringent PCM channel filter requirement have been reported [1-6]. However, in the most recent commercial implementation the required power per pole is in the neighborhood of 1 mW and a relatively high double-polarity ($\pm 5$ V) supply voltage is needed. It will be shown in this dissertation that, from a fundamental standpoint, the absolute minimum achievable power dissipation in a voiceband filter with a dynamic range of 90 dB in a 3-micron technology operated from a $\pm 5$ V supply is less than one microwatt per pole [23].
A very large margin for improvement is therefore available, and new structures that more closely approach the theoretical minima seem to be feasible.

A reduction in power consumption is important in the realization of battery-operated analog/digital interfaces and could prove even more important in the future as larger and more complex systems are integrated on the same chip [7]. At the same time, as a consequence of the continuous scaling of MOS technology, supply voltages will have to be reduced if analog interfaces are to take advantage of this scaling [8]. This fact, and the desire for an analog/digital compatible technology, create a strong motivation for developing new approaches to the design of analog circuits that make them more suitable for low-voltage operation.

Recently many low-power MOS circuits suitable for S.C. applications have been presented [9-15]. Of these, some are also intended to operate from a low-voltage supply [10-13,15]. All of them, however, are for special-purpose applications, use a low-frequency clock (with the exception of [12]), and have relatively low performance. In fact, no low-voltage, low-power filter meeting the PCM channel filter requirement has been reported to date.

This dissertation describes a new $5^{th}$-order CMOS PCM channel filter prototype operating from a single 5 V supply and dissipating 70 $\mu$W per pole. By utilizing a combination of circuit techniques including input-to-output class A/B amplifier design, fully differential topology, dynamic biasing, and switched-capacitor common-mode feedback, a level of performance comparable to or better than that of commercially available 10 V realizations is shown to be feasible. Together with the filter, a low-power buffer amplifier, also operating from a 5 V supply and able to drive off-chip loads, was realized.

# CHAPTER 2

# PERFORMANCE LIMITATIONS IN SWITCHED-CAPACITOR FILTER

# 2.1. INTRODUCTION

Despite their relatively short history, switched-capacitor (S.C.) circuits are already fairly mature. Most of their specifications have improved substantially since the first monolithic S.C. filters using S.C. integrators were designed and fabricated in 1977 [26][27]. In particular, the power dissipation per pole has been reduced from about 10 to 20 mW in the first NMOS prototypes to less than 1 mW in the CMOS filters in production today [1]. These figures refer to general-purpose systems working from a $\pm 5$ V supply and with clock rates of 128 kHz or more. For special-purpose applications, on the other hand, much smaller values have been achieved [9]-[14].

Another aspect that has been extensively investigated is the improvement of the dynamic range of the filter. To this end, techniques like fully differential circuit design and noise frequency translation via chopper stabilization have been proposed. This has produced a filter with a dynamic range of 102 dB [28]. To achieve such a result, however, a large increase in the chip area occupied by the filter was necessary.

Finally, the total die area occupied by the filter has been substantially reduced. This has allowed the integration on a single chip of many S.C. filters together with other components [2][3][7][29].

Almost all of these results have been achieved by improving the performance of the operational amplifiers (op amps) in the filter [1], [30]-[32]. It is likely that better and better op amps will be designed in the future, allowing this trend to continue. Eventually, however, some fundamental limitations other than those coming from the op amps will come into play. Such limitations cannot be overcome by circuit or process improvements; therefore they determine the ultimate performance limit of the filter.

This paper analyzes these fundamental limitations with reference to low-pass filters. Section 2 focuses on the S.C. integrator, which is the building block of most S.C. filters.
Under certain assumptions, the minimum area and power requirements and the maximum achievable dynamic range are obtained as functions of relatively few parameters that depend on both the technology and the circuit used. It is shown that both the minimum power and the minimum area requirement vary proportionally to the square of the achievable dynamic range.

Section 3 analyzes the performance limitations of a low-pass S.C. filter. The theory of section 2 is extended to any low-pass ladder structure without introducing further approximations. The results obtained, while intuitively interesting, are functions of the particular filter under consideration and cannot be related to each other in a general way. By introducing additional approximations, which in most practical cases cause only a small error, and normalizing the results to the order of the filter, several simple relationships are obtained. Logarithmic plots showing the dependence of the minimum area and power requirements on the achievable dynamic range are also provided. On the basis of such plots, state-of-the-art filters can be compared with the theoretical minima. As an example, for a $5^{th}$-order voiceband filter with 95 dB of dynamic range, assuming a $\pm 5$ V supply, the minimum area required is approximately $7300\ \mu m^2$ and the minimum power $8.5\ \mu W$. This is about two to three orders of magnitude smaller than the typical actual values for both power and area.

Finally, in Section 4 the effect of the op amp non-idealities, which were ignored in the derivation of the previous sections, is considered. The op amp fundamental limitations are very difficult to quantify exactly, and this is part of the reason why they were first ignored. Nonetheless, some upper bounds for the absolute minimum power, area, and noise can be obtained with reference to a particularly simple but realistic op amp configuration.
This shows that the op amp limitations should not affect the ultimate filter performance in most practical cases.

# 2.2. PERFORMANCE LIMIT FOR THE IDEAL INTEGRATOR

In this section the S.C. integrator is analyzed to obtain limits for the minimum power consumption and chip area requirement and for the maximum achievable dynamic range, together with their interrelations. All of the following calculations refer to the so-called differential bottom-plate integrator shown in Fig. 2.1. Such a circuit was chosen for the sake of concreteness, since it is insensitive to parasitic capacitance and has been used extensively in the literature [1][4][28]. However, the extension of the theory to other S.C. integrator configurations is very straightforward and yields similar results.

Figure 2.1 Bottom Plate S.C. Integrator.

The following basic assumptions will be used throughout the paper.

1. The op amp in the integrator is assumed to be ideal in the sense that it does not contribute any noise to the filter, it does not use any D.C. power, and it occupies no chip area. The reason for such drastic assumptions is that there are no fundamental limits, identifiable a priori, for the minimum value that can be achieved, via process and/or circuit design improvements, for any of the op amp non-idealities mentioned above. The only potential exception to this comes from the op amp white noise. It has however been shown [33] that its contribution, when it is not negligible, can be added to that of the $\frac{kT}{C}$ noise since both can be represented in the same way. In this paper the op amp white noise is neglected for the sake of simplicity; however, because of the above considerations, the following analysis can be easily extended to include it if a specific op amp configuration is given. In section 4 the validity of these assumptions will be discussed in more detail.

2. The integrator capacitor is assumed to be much larger than the sampling capacitor, i.e.
$$\frac{C_i}{C_s} \gg 1 \tag{2.1}$$

where $C_s$ and $C_i$ are the sampling and integration capacitors as shown in Fig. 2.1. Making use of the following basic equation for the S.C. integrator [35]:

$$\frac{C_s}{C_i} = \frac{2 \pi f_{unity}}{f_{clock}} \tag{2.2}$$

where $f_{unity}$ is the unity-gain frequency of the integrator and $f_{clock}$ is the clock frequency, condition (2.1) becomes:

$$\frac{f_{clock}}{f_{unity}} \gg 2\pi \tag{2.3}$$

Assumption (2.3) is almost always valid if the integrator is part of a low-pass voiceband S.C. filter. In such a case, in fact, each integrator has a unity-gain frequency which is comparable in value with the band edge of the filter, while the clock frequency is typically many times larger than the filter band edge to avoid warping effects in the transformation from the z to the s domain [34] and to ease anti-aliasing requirements.

On the basis of the above assumptions, the absolute minimum integrator area is approximately equal to the area of $C_i$. Assuming a symmetrical power supply of $\pm V_s$ volts and a capacitor dielectric with a maximum electric field before breakdown equal to $E_{max}$ and a dielectric constant equal to $\epsilon_{diel}$, the minimum thickness of the capacitor is:

$$t_{\min} = 2 \frac{V_s}{E_{\max}} \tag{2.4}$$

The minimum area required to realize a capacitor of value $C_i$ is therefore:

$$AREA_{\min} = \frac{t_{\min}C_i}{\epsilon_{diel}} = \frac{2 V_s C_i}{E_{\max} \epsilon_{diel}} \tag{2.5}$$

The maximum amount of energy that can be stored in the integrator, $\epsilon_{max}$, is given by

$$\epsilon_{\max} = \frac{1}{2} (2 V_s)^2 C_i = 2 V_s^2 C_i \tag{2.6}$$

Substituting Eq. 2.6 into Eq. 2.5 gives the minimum area as a function of the maximum stored energy:

$$AREA_{\min} = \frac{\epsilon_{\max}}{V_s E_{\max} \epsilon_{diel}} \tag{2.7}$$

Next, the absolute minimum power consumption is computed. To this end the integrator of Fig. 2.1 can be represented as in Fig. 2.2.
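Before the power calculation continues, the area and energy limits of Eqs. (2.4)-(2.7) can be checked numerically. The following Python sketch is only an illustration: the function name and every numerical value (a silicon dioxide dielectric, a breakdown field of 7 MV/cm, a 10 pF integrating capacitor) are assumptions chosen for the example and are not taken from the text.

```python
# Sketch of Eqs. (2.4)-(2.7). All numerical values below are illustrative
# assumptions, not figures taken from the text.

EPS_0 = 8.854e-12  # vacuum permittivity in F/m


def integrator_limits(v_s, c_i, e_max, eps_diel):
    """Return (t_min, area_min, energy_max) for a +/- v_s supply.

    v_s      : supply voltage magnitude in volts
    c_i      : integrating capacitance in farads
    e_max    : dielectric breakdown field in V/m
    eps_diel : absolute permittivity of the dielectric in F/m
    """
    t_min = 2.0 * v_s / e_max                    # Eq. (2.4)
    area_min = t_min * c_i / eps_diel            # Eq. (2.5)
    energy_max = 0.5 * (2.0 * v_s) ** 2 * c_i    # Eq. (2.6)
    return t_min, area_min, energy_max


V_S = 5.0                 # assumed +/-5 V supply
C_I = 10e-12              # assumed 10 pF integrating capacitor
E_MAX = 7e8               # assumed breakdown field (7 MV/cm) in V/m
EPS_DIEL = 3.9 * EPS_0    # assumed silicon dioxide dielectric

t_min, area_min, energy_max = integrator_limits(V_S, C_I, E_MAX, EPS_DIEL)

# Eq. (2.7): the same area expressed through the maximum stored energy.
area_from_energy = energy_max / (V_S * E_MAX * EPS_DIEL)

print(f"t_min      = {t_min * 1e9:.1f} nm")
print(f"area_min   = {area_min * 1e12:.0f} um^2")
print(f"energy_max = {energy_max * 1e12:.0f} pJ")
```

Under these assumed values, Eq. (2.5) and Eq. (2.7) return the same area, which is the consistency the substitution in the text relies on.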
Furthermore the left-hand side of the circuit of Fig. 2.2 can be modified as shown in Fig. 2.3. The only potential source of error in such a substitution is the phase difference existing between the two inputs of the integrator. Such a difference, however, does not affect power dissipation. The two current sources $I_1$ and $I_2$ are used to model an ideal class B op amp. To guarantee zero quiescent power dissipation, as was assumed, $I_1$ must be equal to zero when $I_2 \neq 0$ and vice versa. The same is valid for $I_1$ and $I_2$.

The total power dissipation is given by the amount of energy per unit time drawn from the supplies by the two portions of the circuit, i.e.

1. The amount of energy that $C_s$ draws from one supply and then dumps into the amplifier virtual ground.
2. The amount of energy that $C_i$ draws from the other supply, through the action of the op amp, to be dumped again into the virtual ground.

Figure 2.2 Circuit Used to Compute the Power Drawn From the Supplies.

Assuming that the input signal $v_i$ is a pure sinusoid with frequency $f$ and peak amplitude $V_i$, the energy dissipated during one period of the signal can be computed as in Appendix A and is equal to:

$$\epsilon_{cycle} = \frac{4}{\pi} V_i V_s C_s \frac{f_{clock}}{f} \tag{2.8}$$

The average power dissipated is obtained by multiplying the energy per cycle by the frequency of the signal, i.e.

$$P = \frac{4}{\pi} V_i V_s C_s f_{clock} \tag{2.9}$$

The input-to-output transfer function of the integrator is given by the following equation [35]

$$V_i = \frac{f}{f_{unity}} V_o = \frac{2 \pi f C_i}{f_{clock} C_s} V_o \tag{2.10}$$

From Eq. 2.10 it can be seen that for an in-band signal, i.e. $f \leq f_{unity}$, the maximum amplitude of the input signal that does not cause any clipping at the output is a function of the input frequency. For this reason it is convenient to express the power consumption as a function of the output signal. This can be done by substituting Eq. 2.10 into Eq.
2.9 obtaining

$$P = 8 V_s V_o f C_i \tag{2.11}$$

For a maximum-amplitude sinusoid at the output, i.e. $V_o = V_s$, the power dissipation becomes

$$P = 8 f C_i V_s^2 \tag{2.12}$$

Substituting Eq. 2.6 into Eq. 2.12 gives

$$P = 4 \epsilon_{\max} f \tag{2.13}$$

The minimum power consumption for a full-swing sinusoidal output is therefore proportional to the maximum energy stored in the integrator times the frequency of the signal.

Figure 2.3 Equivalent Circuit of Fig. 2.2.

Last, the dynamic range is considered. While the power consumption and the area requirement can be uniquely defined for a stand-alone S.C. integrator, the dynamic range is a function of the particular circuitry that surrounds it. In any practical case the integrator must be part of a feedback loop in order to guarantee a stable DC operating point. This is shown schematically in Fig. 2.4 together with the input-to-output transfer function with and without feedback. Both the total output noise and the dynamic range must be expressed as a function of the particular feedback configuration.

Having assumed an ideal op amp, the only sources of noise are the MOS switches. The noise contributed by the left-hand side switch is sampled by $C_s$ every clock cycle. The signal appearing across capacitor $C_s$ is therefore a sampled first-order low-pass filtered white noise. It has been shown that for a properly operating S.C. circuit, i.e. one in which the time constants associated with the switches and the capacitors are much smaller than the clock period, such a discrete random process has a white spectral distribution and a total noise power (variance) equal to $\frac{kT}{C_s}$ [38]. Discrete-time linear system theory can therefore be used to determine the output noise variance $n_i^2$, obtaining the following result [36]

$$n_i^2 = \frac{kT}{C_s} \sum_{m=0}^{\infty} h^2(m) \tag{2.14}$$

where $h(m)$ is the impulse response from the noise source to the output. Using Parseval's theorem [36] in Eq.
(2.14) gives the final result

$$n_i^2 = \frac{kT}{C_s} \frac{1}{2\pi} \int_{-\pi}^{\pi} H(e^{j\omega}) H(e^{-j\omega}) d\omega \tag{2.15}$$

where $H(e^{j\omega})$ is the z transform of $h(m)$ evaluated on the unit circle. By introducing the following definition

$$B_o = \frac{f_{clock}}{2\pi} \int_{-\pi}^{\pi} H(e^{j\omega}) H(e^{-j\omega}) d\omega \tag{2.16}$$

and making use of Eq. (2.2), Eq. (2.15) can be written as

$$n_i^2 = \frac{1}{2\pi} \frac{kT}{C_i} \frac{B_o}{f_{unity}} \tag{2.16b}$$

Figure 2.4 Closed Loop S.C. Integrator.

The quantity $B_o$ is the effective noise bandwidth from the input of the switched-capacitor integrator to the integrator output for the particular feedback configuration considered. It is the integral of the magnitude squared of the frequency response of the sampled-data feedback circuit from the integrator input to the output, taken around the unit circle. For low-pass filters where the clock rate is far above the passband, this is equivalent to the integral over the passband of the transfer function from the integrator input to the output for the continuous equivalent circuit. In the following, $B_o$ will be called the noise bandwidth to the output.

The noise contributed by the right-hand side switch is also sampled by $C_s$. However, in this case, the resulting signal cannot rigorously be considered as a first-order low-pass filtered noise. The reason is that the circuit through which the white noise of the switch is sampled does not have a single-pole roll-off, since it also contains the op amp. The amount of noise transferred to the output is, to first order, proportional to the ratio between the op amp unity-gain bandwidth and the bandwidth of the circuit formed by the switch resistance and the sampling capacitor [36]. For simplicity, the op amp is assumed to be ideal (infinite bandwidth).
In such a case the two switches behave in the same way and the total output noise, $n^2$, becomes

$$n^{2} = \frac{1}{\pi} \frac{kT}{C_{i}} \frac{B_{o}}{f_{unity}} \tag{2.17}$$

Assuming that the maximum undistorted output signal has a peak amplitude approximately equal to the supply voltage $V_s$, i.e. $V_s/\sqrt{2}$ rms, the dynamic range of the integrator, $(DR)$, becomes:

$$(DR)^2 = \frac{s^2}{n^2} = \frac{\pi}{2} \frac{V_s^2 C_i}{kT} \frac{f_{unity}}{B_o} \tag{2.18}$$

Eq. (2.18) can be rewritten as follows by making use of Eq. (2.6)

$$(DR)^2 = \frac{\pi}{4} \frac{\epsilon_{\max}}{kT} \frac{f_{unity}}{B_o} \tag{2.19}$$

Eq. (2.19) suggests that the square of the dynamic range is given by the ratio between the maximum energy stored in the integrator and the unit of thermal energy $kT$, modified by the ratio between the unity-gain bandwidth of the integrator and the noise bandwidth to the output.

As an example, consider the unity-gain feedback circuit shown in Fig. 2.5. This is the simplest configuration in which the S.C. integrator can be operated. It corresponds to a first-order low-pass filter whose z-domain transfer function from $C_s$ to the output is given by

$$H(z) = \frac{\frac{C_s}{C_i}}{1 - z^{-1} + \frac{C_s}{C_i}} \tag{2.20}$$

$H(e^{j\omega T})$ is approximately shown in Fig. 2.5b. In this simple case $B_o$ can be easily computed by making use of Cauchy's residue theorem, with the following result

$$B_o = f_{clock} \frac{C_s}{C_s + 2C_i} \tag{2.21}$$

Assuming condition (2.1) to be valid and using Eq. (2.2) in Eq. (2.21) gives the following result

$$B_o = \frac{f_{clock}}{2} \frac{C_s}{C_i} = \pi f_{unity} \tag{2.22}$$

As expected, for this simple case $B_o$ is just the effective noise bandwidth of the single-time-constant low-pass filter whose transfer function is shown in Fig. 2.5. Using the above result, the circuit dynamic range becomes

$$(DR)^2 = \frac{V_s^2 C_i}{2 kT} \tag{2.23}$$

which is a particularly simple result.
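The one-pole example can be reproduced numerically. The sketch below integrates $|H|^2$ of Eq. (2.20) around the unit circle as in the definition of Eq. (2.16), compares the result with the closed form of Eq. (2.21), and evaluates Eq. (2.23). The function names and the component values (a 10 pF integrating capacitor, a 128 kHz clock, a 3.4 kHz unity-gain frequency, a temperature of 300 K) are assumptions made for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def noise_bandwidth_numeric(c_s, c_i, f_clock, n_points=20_000):
    """Evaluate Eq. (2.16) for H(z) of Eq. (2.20) with a midpoint rule."""
    a = c_s / c_i
    d_omega = 2.0 * math.pi / n_points
    total = 0.0
    for k in range(n_points):
        omega = -math.pi + (k + 0.5) * d_omega
        z_inv = complex(math.cos(omega), -math.sin(omega))  # z^-1 on the unit circle
        h = a / (1.0 - z_inv + a)                           # Eq. (2.20)
        total += abs(h) ** 2 * d_omega
    return f_clock / (2.0 * math.pi) * total


def noise_bandwidth_closed_form(c_s, c_i, f_clock):
    """Closed-form noise bandwidth of Eq. (2.21)."""
    return f_clock * c_s / (c_s + 2.0 * c_i)


def dynamic_range_db(v_s, c_i, temperature=300.0):
    """Dynamic range of the one-pole circuit, Eq. (2.23), in dB."""
    dr_squared = v_s ** 2 * c_i / (2.0 * K_BOLTZMANN * temperature)
    return 10.0 * math.log10(dr_squared)


# Assumed example values (not taken from the text).
F_CLOCK = 128e3
F_UNITY = 3.4e3
C_I = 10e-12
C_S = C_I * 2.0 * math.pi * F_UNITY / F_CLOCK  # Eq. (2.2)

b_numeric = noise_bandwidth_numeric(C_S, C_I, F_CLOCK)
b_closed = noise_bandwidth_closed_form(C_S, C_I, F_CLOCK)
b_approx = math.pi * F_UNITY                   # Eq. (2.22)
dr_db = dynamic_range_db(5.0, C_I)

print(f"B_o numeric  = {b_numeric:.1f} Hz")
print(f"B_o Eq. 2.21 = {b_closed:.1f} Hz")
print(f"B_o Eq. 2.22 = {b_approx:.1f} Hz")
print(f"DR Eq. 2.23  = {dr_db:.1f} dB")
```

With these assumed values the ratio $C_s/C_i$ is about 0.17, so condition (2.1) is only moderately well satisfied and Eq. (2.22) overestimates the closed form of Eq. (2.21) by somewhat less than ten percent.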
Note that the ratio in Eq. (2.23) is simply the maximum energy stored on the integrating capacitor divided by $kT$. This result has strong implications for the ultimate limit on the ability to scale switched-capacitor filters with technological feature size. In effect, silicon dioxide can only store a certain amount of energy per unit volume, as dictated by its maximum field strength. For a given oxide thickness and power supply voltage this dictates a maximum energy storage per unit area, which in turn dictates a minimum area for a given dynamic range and power supply voltage. Such a minimum value can be computed in the more general case by combining Eq. (2.19) and Eq. (2.7), with the following result

$$(DR)^2 = \frac{\pi}{4} \frac{V_s \epsilon_{diel} E_{\max} AREA}{kT} \frac{f_{unity}}{B_o} \tag{2.24}$$

Figure 2.5 One Pole S.C. Filter.

This indicates that the ultimately achievable dynamic range is proportional to the square root of the product of the power supply voltage and the area.

Since the absolute minimum achievable level of power dissipation is proportional to $\epsilon_{\max}$, as was shown in Eq. (2.13), a relationship similar to Eq. (2.24) between dynamic range and power consumption must exist. Mathematically, such a relationship can be obtained by combining Eq. (2.13) and Eq. (2.19) to obtain the following result

$$(DR)^2 = \frac{\pi}{16} \frac{P}{kT B_o} \frac{f_{unity}}{f} \tag{2.25}$$

Thus the dynamic range is proportional to the square root of the minimum power dissipation necessary to charge and discharge the sampling and integrating capacitors from the power supply. Notice that Eq. (2.25) is only valid for $f \leq f_{unity}$, since outside this range the gain of the integrator is less than 1 and therefore it is not possible to have $V_o = V_s$ for an input signal $v_i$ smaller than the supply voltage.
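Eqs. (2.24) and (2.25) can also be inverted to estimate the minimum power and area needed for a target dynamic range. The following Python sketch does this under stated assumptions: the function names are hypothetical, the noise bandwidth is taken from the one-pole case of Eq. (2.22), the signal is placed at $f = f_{unity}$, and the numerical values (300 K, 3.4 kHz, a $\pm 5$ V supply, a silicon dioxide dielectric with a 7 MV/cm breakdown field) are illustrative rather than taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K
EPS_0 = 8.854e-12           # vacuum permittivity in F/m


def min_power(dr_db, f_unity, b_o, f_signal, temperature=300.0):
    """Invert Eq. (2.25): P = (16/pi) DR^2 kT B_o f / f_unity."""
    dr_squared = 10.0 ** (dr_db / 10.0)
    kt = K_BOLTZMANN * temperature
    return 16.0 / math.pi * dr_squared * kt * b_o * f_signal / f_unity


def min_area(dr_db, v_s, e_max, eps_diel, f_unity, b_o, temperature=300.0):
    """Invert Eq. (2.24): AREA = (4/pi) DR^2 kT B_o / (V_s eps E_max f_unity)."""
    dr_squared = 10.0 ** (dr_db / 10.0)
    kt = K_BOLTZMANN * temperature
    return 4.0 / math.pi * dr_squared * kt * b_o / (v_s * eps_diel * e_max * f_unity)


# Assumed example values (not taken from the text).
F_UNITY = 3.4e3
B_O = math.pi * F_UNITY     # one-pole noise bandwidth, Eq. (2.22)

p_90 = min_power(90.0, F_UNITY, B_O, f_signal=F_UNITY)
p_100 = min_power(100.0, F_UNITY, B_O, f_signal=F_UNITY)
a_90 = min_area(90.0, 5.0, 7e8, 3.9 * EPS_0, F_UNITY, B_O)

print(f"P_min at 90 dB  = {p_90 * 1e6:.3f} uW")
print(f"P_min at 100 dB = {p_100 * 1e6:.3f} uW")
print(f"A_min at 90 dB  = {a_90 * 1e12:.0f} um^2")
```

For the assumed one-pole case the 90 dB result is a fraction of a microwatt, and raising the target by 10 dB multiplies the required power by ten, which reflects the square-law dependence stated in the text.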
It is easy to see that the absolute maximum for $P$ ($P_{max}$), when both $V_i$ and $V_o$ are not allowed to exceed the supply voltage, corresponds to $f = f_{unity}$. In this case Eq. (2.25) becomes

$$(DR)^2 = \frac{\pi}{16} \frac{P_{\max}}{kT B_o} \tag{2.26}$$

It can be shown that Eqs. (2.24)-(2.26) are valid for both single-ended and fully differential integrators [33].

#### 2.3. APPLICATION OF THE THEORY TO A LOW PASS S.C. FILTER

In this Section the previous analysis is extended to the case of low-pass ladder S.C. filters. It is known that for a ladder active filter the basic building block is the integrator which, if the filter is implemented via S.C. techniques, can be realized by the circuit of Fig. 2.1 or by some other similar structure. To apply the results of the previous section to the entire filter, it is first shown that there is a one-to-one correspondence between the order of the filter (number of poles) and the number of integrators required to realize it. This is easily done with the help of a simple example.

Fig. 2.6 shows the passive ladder prototype for a 3<sup>rd</sup>-order low-pass filter. This circuit can be represented in terms of integrators, summers, and multipliers as in Fig. 2.7 [16]. The flow diagram of Fig. 2.7 shows that each integrator output corresponds to one of the state variables of the filter, i.e. a voltage across a capacitor or a current through an inductor. Therefore the number of integrators will be equal to the number of state variables, which also coincides with the order of the filter. The above situation can be immediately generalized to an $n^{th}$-order structure as long as the number of state variables coincides with the number of reactive elements.
Even when this is not the case (due to the presence of loops of capacitors or cut-sets of inductors), however, it is still possible to modify the passive prototype so that the number of integrators will coincide with the order of the filter by introducing some voltage-controlled voltage sources in the circuit [35]. In a S.C. implementation such controlled generators can be realized by simply substituting the basic integrator of Fig. 2.1 with the integrator-summer of Fig. 2.8 [35]. All the results of Section 2 can be extended with no changes to the structure of Fig. 2.8 if the extra area due to capacitor $C_3$ is neglected.

Figure 2.6

From the above considerations and from the results of Section 2 it follows immediately that the minimum amount of area required for an $n^{th}$-order S.C. filter is:

$$AREA_{tot} = 2 \frac{V_s}{E_{\max} \epsilon_{diel}} \sum_{i=1}^{n} C_i = \frac{\sum_{i=1}^{n} \epsilon_{\max_i}}{V_s E_{\max} \epsilon_{diel}} \tag{2.27}$$

where $\epsilon_{\max_i}$ is the maximum amount of energy that can be stored in the $i^{th}$ integrator.

To compute the total power dissipated in the filter for a sinusoidal input of frequency $f$ and peak amplitude equal to the supply voltage $V_s$, use can be made of Eq. (2.11), provided that the gain from the input of the filter to the output of the $i^{th}$ integrator, $G_i(f)$, is known for each integrator. This gives:

$$P_{tot} = 8 f V_s^2 \sum_{i=1}^n G_i(f) C_i = 4 f \sum_{i=1}^n G_i(f) \epsilon_{\max_i} \tag{2.28}$$

In the following, $f_i^{unity}$ denotes the unity-gain frequency of the $i^{th}$ integrator. Notice that in Eq. (2.28) it is implicitly assumed that $G_i(f) \leq 1$ for every $i$. This condition must be satisfied to be able to process a full-swing (supply to supply) input signal without unacceptably large distortion. Such an assumption will be discussed further later in the paper.

Finally the total output noise contribution can be computed from Eq.
(2.17) provided that the value of the noise bandwidth from the input of integrator $i$ to the output of the filter, $B_i$, is known for all the $n$ integrators.

$$n_{tot}^{2} = \frac{1}{\pi} kT \sum_{i=1}^{n} \frac{B_{i}}{f_{i}^{unity} C_{i}} \tag{2.29}$$

Figure 2.7 Block Diagram for the Circuit of Fig. 2.6.

This gives for the filter dynamic range

$$(DR_{tot})^{2} = \frac{\pi V_{s}^{2}}{2 kT \sum_{i=1}^{n} \frac{B_{i}}{f_{i}^{unity} C_{i}}} = \frac{\frac{\pi}{4}}{\sum_{i=1}^{n} \frac{kT}{\epsilon_{\max_{i}}} \frac{B_{i}}{f_{i}^{unity}}} \tag{2.30}$$

where the in-band input-to-output gain of the filter has been assumed to be equal to one and the maximum output swing to have a peak amplitude equal to the supply voltage ($V_s/\sqrt{2}$ rms). This is generally true in a S.C. low-pass filter, since the 6 dB in-band loss of the passive prototype can be easily eliminated by making the capacitor that samples the input voltage twice as big as the other sampling capacitor in the first integrator.

An alternative way to express the dynamic range in terms of the sampling capacitors, which will be useful later, is shown in Eq. (2.31)

$$(DR_{tot})^{2} = \frac{V_{s}^{2} f_{clock}}{4 kT \sum_{i=1}^{n} \frac{B_{i}}{C_{s_{i}}}} \tag{2.31}$$

where $C_{s_i}$ is the sampling capacitor in the $i^{th}$ integrator.

The above equations involve almost no approximations and can be used if all of the required parameters are known. The results, however, are a function of the particular filter design adopted and are in a form that does not show any particular relationship between the various performance measures. More insight into the problem can be gained by introducing some approximations.

Figure 2.8 Bottom Plate Integrator/Summer.

First, it is assumed that the sampling capacitors are identical for all the integrators, with the exception of the one that samples the input voltage, which was assumed to be twice as big as the others. Next the following approximation is introduced:
$$\sum_{i=1}^{n} \frac{1}{f_i^{unity}} \approx \frac{n}{f_{\max}} \tag{2.32}$$

where $f_{\max}$ is the band edge of the filter. Physically this means that the average value of the time constants of all the integrators in the filter coincides with the time constant associated with the band edge of the filter. In a typical low-pass ladder filter the error introduced by Eq. (2.32) rarely exceeds $\pm$ 30%. Table 1, for example, shows the values of the integrators' unity-gain frequencies together with $f_{\max}$ for a commercial PCM low-pass filter (INTEL 2912). In this case the approximation introduces only about 2% error.

From Eqs. (2.27), (2.2), and (2.32) it follows that:

$$AREA_{tot} = \frac{2 V_s n}{E_{\max} \epsilon_{diel}} \frac{f_{clock} C_s}{2 \pi f_{\max}} = \frac{2 V_s n}{E_{\max} \epsilon_{diel}} C_I = \frac{n \epsilon_{\max}^I}{E_{\max} \epsilon_{diel} V_s} \tag{2.33}$$

where the following two definitions have been introduced

$$C_I = \frac{f_{clock} C_s}{2 \pi f_{\max}} \tag{2.34}$$

$$\epsilon_{\max}^{I} = 2V_s^2 C_I \tag{2.35}$$

From Eq. (2.2), $C_I$ can be interpreted as the integration capacitance necessary to obtain an integrator whose unity-gain frequency is $f_{\max}$. $\epsilon_{\max}^I$ is the maximum amount of energy that can be stored in $C_I$.

In order to obtain a single numerical value for the power dissipated in the filter, Eq. (2.28) is evaluated for $f = f_{\max}$, thereby obtaining an upper bound for the minimum power requirement. At that frequency, for a properly designed filter, the gain from the input to each intermediate node can be assumed, with good approximation, to be equal to 1, i.e. $G_i(f_{\max}) = 1$ for $i = 1 \cdots n$ [16].
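The accuracy of Eq. (2.32) and the definitions of Eqs. (2.34) and (2.35) can be illustrated with the unity-gain frequencies listed in Table I. In the Python sketch below the band edge of 3.4 kHz is an assumption (a common voiceband value that is not stated explicitly in this passage), and the function names, the 1 pF sampling capacitor, the 128 kHz clock, and the $\pm 5$ V supply are hypothetical example choices.

```python
import math

# Unity-gain frequencies of the five integrators in Hz, as listed in Table I.
F_UNITY_HZ = [4715.0, 2227.0, 5186.0, 4041.5, 2965.0]

# Assumed band edge of the filter in Hz (not stated explicitly in the text).
F_MAX_HZ = 3400.0


def approximation_error(f_unity, f_max):
    """Relative error of Eq. (2.32): sum(1/f_i) versus n/f_max."""
    exact = sum(1.0 / f for f in f_unity)
    approx = len(f_unity) / f_max
    return (exact - approx) / approx


def equivalent_integrator(c_s, f_clock, f_max, v_s):
    """Return (C_I, eps_max_I) from Eqs. (2.34) and (2.35)."""
    c_eq = f_clock * c_s / (2.0 * math.pi * f_max)   # Eq. (2.34)
    energy = 2.0 * v_s ** 2 * c_eq                   # Eq. (2.35)
    return c_eq, energy


err = approximation_error(F_UNITY_HZ, F_MAX_HZ)

# Assumed example values: 1 pF sampling capacitor, 128 kHz clock, +/-5 V supply.
c_eq, energy = equivalent_integrator(1e-12, 128e3, F_MAX_HZ, 5.0)

print(f"relative error of Eq. (2.32) = {err * 100:.1f} %")
print(f"C_I       = {c_eq * 1e12:.2f} pF")
print(f"eps_max_I = {energy * 1e12:.0f} pJ")
```

With the assumed 3.4 kHz band edge the error comes out at roughly two percent in magnitude, in line with the figure quoted for this filter.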
To understand why this is in most cases a good approximation notice that to avoid saturation, which will reduce the maximum usable amplitude of the input signal, the gain from the input to each internal node must be less than or equal to one for all frequencies. On the other hand the value of the gain from all the intermediate nodes to the output should be minimum, to minimize the total output noise contribution. A good compromise between these two requirements is to set the peak value of each intermediate gain to one. Since peaking typically occurs in the proximity of the band edge the above assumption is justified.

TABLE I

| | $f_i$ (kHz) | $1/f_i$ (μs) | $\Delta C_i/C_I$ | $(\Delta C_i/C_I)^2$ |
|-------|--------|-------|-------|-------|
| i = 1 | 4.715 | 212.1 | .2789 | .077 |
| i = 2 | 2.227 | 449 | .527 | .277 |
| i = 3 | 5.186 | 192.8 | .344 | .118 |
| i = 4 | 4.0415 | 247.4 | .159 | .025 |
| i = 5 | 2.965 | 337.3 | .147 | .0216 |
| Σ/5 | 3.826 | 287.5 | .021 | .1037 |

Eq. (2.32) can be substituted in Eq. (2.28) to give:

$$P_{tot}(f_{max}) = 8 n V_s^2 f_{max} C_I = 4 n \epsilon_{max}^{I} f_{max} \tag{2.36}$$

The total output noise is obtained by using Eq. (2.31), with the condition that all the sampling capacitors are equal:

$$n_{tot}^{2} = \frac{4 kT}{C_{s} f_{clock}} \sum_{i=1}^{n} B_{i} = \frac{2}{\pi} \frac{kT}{C_{I} f_{max}} \sum_{i=1}^{n} B_{i} \tag{2.37}$$

This implies a dynamic range for the filter ($DR_{tot}$) of:

$$(DR_{tot})^{2} = \frac{\pi V_{s}^{2} C_{I} f_{max}}{2 k T \sum_{i=1}^{n} B_{i}} = \frac{\pi}{4} \frac{\epsilon_{max}^{I}}{k T} \frac{f_{max}}{\sum_{i=1}^{n} B_{i}} \tag{2.38}$$

Eqs.
(2.33), (2.36), and (2.38) can be normalized to obtain the equivalent area, power, and dynamic range per pole as follows:

$$AREA_{pole} = \frac{AREA_{tot}}{n} = \frac{\epsilon_{max}^{I}}{E_{max} \epsilon_{diel} V_{s}} \tag{2.39}$$

$$P_{pole} = \frac{P_{tot}}{n} = 4 \epsilon_{max}^{I} f_{max} \tag{2.40}$$

$$(DR_{pole})^2 = n (DR_{tot})^2 = \frac{\pi}{4} \frac{\epsilon_{max}^I}{kT} \frac{f_{max}}{\frac{1}{n} \sum_{i=1}^n B_i} \tag{2.41}$$

Eqs. (2.39), (2.40), and (2.41) can be related to each other in the same way as was done for Eqs. (2.7), (2.13), and (2.19) to obtain:

$$(DR_{pole})^2 = \frac{\pi}{16} \frac{P_{pole}}{kT \frac{1}{n} \sum_{i=1}^{n} B_i} \tag{2.42}$$

$$(DR_{pole})^2 = \frac{\pi}{4} \frac{V_s \epsilon_{diel} E_{max} AREA_{pole}}{kT} \frac{f_{max}}{\frac{1}{n} \sum_{i=1}^{n} B_i} \tag{2.43}$$

Comparing Eqs. (2.42) and (2.43) with Eqs. (2.24) and (2.25) it can be seen that they have the same physical interpretation, with $f_{max}$ and $\frac{1}{n} \sum_{i=1}^{n} B_i$ playing the role of $f_{unity}$ and $B_o$ respectively. In Eqs. (2.42) and (2.43) $\frac{1}{n} \sum_{i=1}^{n} B_i$ is the only term that depends on the particular circuit architecture used. It turns out, however, that in practical cases its value is relatively constant. In fact the following approximation can be introduced

$$\frac{1}{n} \sum_{i=1}^{n} B_i = \delta \, 2 f_{max} \tag{2.44}$$

where $\delta$ is a parameter that depends on the particular filter implementation, whose average value can be assumed to be equal to .75 with a worst case inaccuracy of about $\pm 40\%$. For the filter of Table 1, for instance, $\delta$ is equal to .9. Using (2.44) with $\delta = .75$ in (2.42) and (2.43) gives:

$$(DR_{pole})^2 = \frac{\pi}{24} \frac{P_{pole}}{kT f_{max}} \tag{2.45}$$

$$(DR_{pole})^2 = \frac{\pi}{6} \frac{V_s \epsilon_{diel} E_{max} AREA_{pole}}{kT} \tag{2.46}$$

From Eqs.
(2.45) and (2.46) the logarithm of $P_{pole}$ and $AREA_{pole}$ can be plotted versus the achievable dynamic range, $DR_{pole}$, expressed in db, with $f_{max}$ or $V_s$ used as a parameter respectively. This is shown in Figs. 2.9 and 2.10 for the case in which the capacitor dielectric is silicon dioxide with $E_{max} = 5 \times 10^6 \frac{V}{cm}$.

Figure 2.9 Minimum Power Dissipation vs. Dynamic Range for Different Values of $f_{max}$.

The plots of Figs. 2.9 and 2.10 can be used for both single ended and fully differential filter configurations. On the basis of the above results the power consumed and the area occupied by any low pass S.C. filter can easily be compared with the theoretical minima. As an example consider a PCM $5^{th}$ order low-pass elliptic filter with a cut-off frequency of 3.4 kHz, a supply voltage of $\pm 5$ Volts and a signal-to-noise ratio of 95 db. Typical values for such a filter are a power per pole of about 1 mW, and an area per pole of about $6.25 \times 10^5\,\mu m^2$ ($1000\,mils^2$). Using Eq. (2.41) the dynamic range per pole can be determined from the overall dynamic range of the filter as follows:

$$DR_{pole} = 95\ db + 10 \log 5 = 102\ db \tag{2.47}$$

The plots of Figs. 2.9 and 2.10 for a $\pm 5$ Volt supply and a dynamic range of 102 db give a minimum area requirement of approximately $1400\,\mu m^2$ and a minimum power requirement of approximately $1.7\,\mu W$. The actual values are approximately 2 to 3 orders of magnitude larger than the theoretical minima, showing that there is a strong motivation to further reduce the area occupied and the power consumed by the core amplifier. Finally, from the above results it immediately follows that to achieve a dynamic range of 95 db in a $5^{th}$ order voiceband filter operating from a $\pm 5$ Volt supply the minimum area required is approximately $7300\,\mu m^2$ and the minimum power $8.5\,\mu W$.
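The figures quoted above can be cross-checked numerically. The following is a minimal sketch, not part of the original derivation: it inverts Eqs. (2.45) and (2.46) and checks Eq. (2.32) against the unity gain frequencies of Table 1. The temperature of 300 K, the silicon dioxide relative permittivity of 3.9, the use of $f_{max} = 3.4$ kHz for the Table 1 filter, and all function names are assumptions introduced for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
T_KELVIN = 300.0             # assumed temperature (not stated in the text)
EPS_SIO2 = 3.9 * 8.854e-14   # F/cm, assumed permittivity of silicon dioxide
E_MAX = 5e6                  # V/cm, maximum field quoted in the text


def dr_pole_db(dr_tot_db, n_poles):
    """Eq. (2.47): DR_pole^2 = n * DR_tot^2, expressed in db."""
    return dr_tot_db + 10 * math.log10(n_poles)


def min_power_per_pole(dr_db, f_max_hz):
    """Eq. (2.45) solved for P_pole, in watts."""
    dr_squared = 10 ** (dr_db / 10)
    return (24 / math.pi) * K_BOLTZMANN * T_KELVIN * f_max_hz * dr_squared


def min_area_per_pole_um2(dr_db, v_s):
    """Eq. (2.46) solved for AREA_pole, converted from cm^2 to um^2."""
    dr_squared = 10 ** (dr_db / 10)
    kt = K_BOLTZMANN * T_KELVIN
    area_cm2 = (6 / math.pi) * kt * dr_squared / (v_s * EPS_SIO2 * E_MAX)
    return area_cm2 * 1e8


def eq_2_32_relative_error(f_unity_khz, f_max_khz):
    """Relative error of the approximation sum(1/f_i) ~ n/f_max, Eq. (2.32)."""
    exact = sum(1 / f for f in f_unity_khz)
    approx = len(f_unity_khz) / f_max_khz
    return (exact - approx) / approx


dr = dr_pole_db(95, 5)
power_w = min_power_per_pole(dr, 3.4e3)
area_um2 = min_area_per_pole_um2(dr, 5.0)
table_1_f_khz = [4.715, 2.227, 5.186, 4.0415, 2.965]
err = eq_2_32_relative_error(table_1_f_khz, 3.4)

print(f"DR per pole        : {dr:.1f} db")
print(f"min power per pole : {power_w * 1e6:.2f} uW")
print(f"min area per pole  : {area_um2:.0f} um^2")
print(f"Eq. (2.32) error   : {err * 100:.1f} %")
```

Under these assumptions the sketch gives roughly 1.7 μW and 1400 to 1500 μm² per pole, and an error of about 2% for Eq. (2.32), in line with the values read from the plots. Multiplying the per-pole results by five gives the totals quoted for the complete filter.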
On the other hand the absolute maximum dynamic range that can be achieved for the same filter as above, assuming a total area of $5000\,mil^2$ and a total power dissipation of 5 mW, is approximately 121 db.

## 2.4. EFFECT OF AMPLIFIER NONIDEALITIES

As stated in Section 2 all of the above results were based on the assumption of having an op amp with ideal characteristics, i.e. zero power consumption, zero area, zero noise contribution. Such an ideal situation was to be achieved by continuously scaling the feature size, provided that the $\frac{1}{f}$ noise could be eliminated by some technique like chopper stabilization. In actuality practical constraints will result in other limitations on the level of op amp performance achievable. The ultimate minimum value for the above op amp characteristics is very difficult to define. It is however possible, based on a simple model, to obtain upper bounds for the limit values of the above quantities. This is done in this section. The obtained results show that the op amp fundamental limitations should not substantially affect the ultimate performance of the filter in all practical cases.

### 2.4.1. Power Dissipation

In the following the minimum amount of power required by the op amp for a given clock frequency is compared with the result of Eq. (2.12). The minimum op amp power consumption is obtained under the following assumptions:

1) The limiting factor in the op amp settling time ($T_{set}$) is given by the linear portion of the step response as opposed to the slewing portion. As a consequence the following equation is valid

$$T_{set} = \delta \tau \tag{2.48}$$

where $\delta$ is a number (typically between 5 and 10) that depends on the accuracy required in the step response, and $\tau$ is the time constant of the closed loop step response of the op amp (a single pole step response is assumed).
If $C_I \gg C_s$ and no large capacitance is attached at the integrator summing node it follows that

$$\tau \approx \frac{1}{\omega_u} \tag{2.49}$$

where $\omega_u$ is the unity gain frequency of the amplifier. The above assumption is quite reasonable since class A/B amplifiers that do not exhibit any slewing behavior and have a power dissipation which is only a few percent higher than their stand-by values can be employed.

2) The devices are operated in the subthreshold region. This corresponds to the maximum possible transconductance for a certain current level I, i.e.

$$\frac{gm}{I} = \frac{q}{n \, kT} \tag{2.50}$$

where n is the subthreshold slope factor whose value is typically between 1 and 2.

3) The time allowed for the op amp to settle is assumed to be $\frac{1}{2 f_{clock}}$, i.e. a 50% duty cycle is assumed.

4) The simplest inverter-like structure of Fig. 2.11 is assumed for the op amp, with the possibility of using cascode devices to enhance the voltage gain.

5) The load capacitance of the integrator is assumed to be equal to $2 C_s$, i.e. the sampling capacitor of the next stage plus the effective capacitive load at the output due to the feedback circuit, which is the series of $C_I$ and $C_s$.

Figure 2.11 Simple Inverting Amplifier.

From assumptions 4) and 5) it follows that

$$\omega_u = \frac{gm_1}{2C_s} \tag{2.51}$$

where $gm_1$ is the transconductance of the driver device M1. Using assumption 3) in Eq. (2.51) gives

$$\frac{gm_1}{2C_s} \geq 2 \delta f_{clock} \tag{2.52}$$

The absolute minimum value of $gm_1$ ($gm_{min}$) is

$$gm_{min} = 4 \delta C_s f_{clock} \tag{2.53}$$

Using assumption 2) the absolute minimum stand-by current level $I_{min}$ becomes

$$I_{min} = 4 n \delta C_s f_{clock} \frac{kT}{q} \tag{2.54}$$

which gives a minimum power consumption $P_{min}$ of

$$P_{min} = 8 n \delta V_s C_s f_{clock} \frac{kT}{q} \tag{2.55}$$

Using Eq. (2.2) in Eq.
(2.55) gives

$$P_{min} = 16 \pi n \delta V_s C_I f_{unity} \frac{kT}{q} \tag{2.56}$$

Comparing the above result with the result of Eq. (2.12), in which the signal frequency is assumed to be $f_{unity}$, gives the following result

$$\frac{P}{P_{min}} = \frac{8 f_{unity} C_I V_s^2}{16 \pi n \delta V_s C_I f_{unity} \frac{kT}{q}} = \frac{V_s}{2 \pi \delta n \frac{kT}{q}} \tag{2.57}$$

For n = 1.5 and $\delta = 7$ Eq. (2.57) gives

$$\frac{P}{P_{min}} = \frac{V_s}{21 \pi \frac{kT}{q}} = \frac{V_s}{1.7\ V} \tag{2.58}$$

From Eq. (2.58) it follows that, for a $\pm 5$ Volt supply, the error introduced in the calculation of the absolute minimum power required by an S.C. integrator (Eq. (2.12)) by assuming that the op amp does not consume any power is no larger than about 35%. Ideally, at least, such an error should be much less than the above value since from a fundamental standpoint the absolute minimum power required by the op amp is considerably less than the value given by Eq. (2.56). The reason is that the unity gain bandwidth of the simple structure of Fig. 2.11 does not approach the fundamental speed limit of MOS transistor M1, which is given by the inherent $f_T$ of the device for the particular bias condition used. This is because the parasitic capacitance of M1 is typically much smaller than the load capacitance $C_s$, as will be shown in the next section. It is, at least conceptually, possible to increase the unity gain bandwidth of an op amp to a value closer to the $f_T$ of the devices used. One possible way to reach such a goal for the simple structure of Fig. 2.11 is to use positive feedback around M1 in order to obtain a larger transconductance for the same value of the current level and device size. All of the above considerations suggest that the ultimate limit in the power dissipation of an S.C. integrator does not come from the op amp, consistent with the assumption of Section 2.
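As a quick numerical check of Eqs. (2.57) and (2.58), the ratio between the ideal integrator power and the minimum op amp power can be evaluated directly. This is a minimal sketch; the temperature of 300 K and the function names are assumptions made here for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C
T_KELVIN = 300.0              # assumed temperature (not stated in the text)


def thermal_voltage(t_kelvin=T_KELVIN):
    """kT/q in volts."""
    return K_BOLTZMANN * t_kelvin / Q_ELECTRON


def reference_voltage(n=1.5, delta=7.0):
    """Denominator of Eq. (2.57): 2*pi*delta*n*kT/q, in volts."""
    return 2 * math.pi * delta * n * thermal_voltage()


def power_ratio(v_s, n=1.5, delta=7.0):
    """P / P_min of Eq. (2.57) for a supply of +/- v_s volts."""
    return v_s / reference_voltage(n, delta)


v_ref = reference_voltage()
ratio = power_ratio(5.0)
print(f"2*pi*delta*n*kT/q = {v_ref:.2f} V")
print(f"P / P_min         = {ratio:.2f}")
print(f"P_min / P         = {100 / ratio:.0f} %")
```

With n = 1.5 and $\delta = 7$ the reference voltage comes out close to 1.7 V, so for $V_s = 5$ V the op amp minimum power is roughly one third of the ideal integrator power, which is the "about 35%" figure quoted above.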
### 2.4.2. Amplifier Noise and Finite Bandwidth

The only fundamental noise associated with the op amp is the white portion. As was said in Section 2 this noise component can be expressed in the same form as the $\frac{kT}{C}$ one. The relative importance of the amplifier noise with respect to the noise of the MOS switches is considered in this section. The total noise of an S.C. integrator (both MOS switches and op amp contribution) depends on the relative value of the op amp unity gain bandwidth, $\omega_u$, and the cut-off frequency of the low pass filter formed by the switch resistance and the sampling capacitor, $\omega_{on}$, whose value is given by $\omega_{on} = \frac{1}{R_{on_i} C_s}$ where $R_{on_i}$ is the on resistance of the $i^{th}$ MOS switch. With reference to Fig. 2.12 two extreme cases exist. The first case is the one considered in Section 2 where an infinite op amp bandwidth has been assumed. This gives the same noise contribution for both the left and right hand side switches and a negligible contribution from the op amp, assuming a finite total noise energy in the amplifier. In the other extreme case the op amp bandwidth is assumed to be much smaller than $\omega_{on}$. By performing a simplified analysis, as was done by Gobet and Knob [36], both the noise contributed by the right hand side switches ($n_R^2$) and by the op amp ($n_{OP}^2$) can be expressed as a fraction of the noise contributed by the left hand side switches ($n_L^2$), which was calculated in Section 2, as follows.

$$\frac{n_R^2}{n_L^2} \approx \frac{\omega_u}{\omega_{on}} = \frac{gm_1}{2C_s} R_{on_1} C_s = \frac{gm_1 R_{on_1}}{2} \tag{2.59}$$

Figure 2.12 S.C. Integrator with Amplifier White Noise.

$$\frac{n_{OP}^2}{n_L^2} \approx \frac{\frac{R_{eq}}{R_{on_2}}}{2 \frac{\omega_{on}}{\omega_{u}}} = \frac{gm_1 R_{eq}}{2} \tag{2.60}$$

In the derivation of both Eqs. (2.59) and (2.60) Eq. (2.51) was used, and $R_{eq}$ is the equivalent input noise resistance of the op amp. From Eq.
(2.59), making use of the assumption that $\omega_{on} \gg \omega_u$, it can be concluded that the contribution of the right hand switches is negligible. On the other hand, assuming that the op amp noise is contributed primarily by the input device M1 and that no high frequency second stage noise contribution occurs, it follows that $R_{eq} = \frac{2}{3} \frac{1}{gm_1}$ and Eq. (2.60) gives $\frac{n_{OP}^2}{n_L^2} \approx \frac{1}{3}$. Between the two extreme cases there is only about a 30% change in the total output noise contribution; furthermore, if a source coupled pair is assumed at the op amp input such a change is reduced to only about 15%. From the above results it seems reasonable to conclude that in any practical situation Eq. (2.17) will be reasonably accurate.

In reality one more potential source of noise degradation exists when the output of the S.C. integrator is sampled by another circuit of the same kind, which is the case in any S.C. filter configuration. This is due to the continuous time noise that is transmitted to the output by the amplifier independently of which phase of the clock is high. Such a wide band component can be aliased into the baseband by the next stage sampling operation. Fortunately, for the case of a single stage transconductance amplifier the above noise contribution combines with the thermal noise of the MOS switches of the following stage in such a way that the variance of the total noise sampled is unchanged. The reason for such a behavior can be explained as follows. Consider the circuit shown in Fig. 2.13. Here a sampling capacitor and switch are sampling the output of a previous integrator. The integrator circuit, consisting of the operational amplifier together with the feedback capacitor, can be represented as a Thevenin source having some effective output impedance versus frequency and some equivalent noise resistance which is also a function of frequency.
Figure 2.13 Sampling of an Integrator Output by the Following Stage.

One can identify one limiting case in which the output impedance is a pure resistance, and the noise equivalent resistance is equal to the output resistance. In this case the bandwidth of the sampling circuit is reduced by the same factor that the noise power spectral density is increased, and one obtains the result that the op amp noise adds no noise over and above the fundamental kT/C noise contributed by the switch on resistance, to be discussed in the next section. This limiting case is approximated by a wideband transconductance operational amplifier with no output stage.

### 2.4.3. Amplifier Area

In this section the minimum amount of area required for the op amp is compared with the result of Section 2. The simple structure of Fig. 2.11 is again assumed. The area of the amplifier is assumed to be approximately equal to the area of M1, i.e. the load device is assumed to be much smaller than M1 and is therefore neglected in the area calculation. The total area of the transistor is assumed to be equal to $\beta$ times the area of its gate. Typical values for $\beta$ can be taken to be between 2 and 5. In the following, to obtain numerical results, $\beta = 3$ is used. Such a value can be achieved in practice by folding the transistor many times in order to have sources and drains sharing the same diffusion area. M1 is assumed to be operating in subthreshold. This is done to be consistent with the assumption used in the section dealing with the op amp power and also to ensure a reasonable amount of gain in the amplifier. The maximum current level in weak inversion for a given aspect ratio is roughly given by [25]

$$I = \mu C_{ox} \frac{Z}{L} \left(\frac{kT}{q}\right)^2 \tag{2.61}$$

Eq. (2.61) defines the minimum value of the aspect ratio $\frac{Z}{L}$ of an MOS transistor for which the device is still operating in weak inversion for any given current level.
Minimum aspect ratio corresponds to minimum gate area for a given technology; therefore the above condition is used in the following calculation. Combining Eq. (2.50) with Eq. (2.61) an expression for the device transconductance is obtained.

$$gm = \mu C_{ox} \frac{Z}{L} \frac{kT}{q \, n} \tag{2.62}$$

Substituting in Eq. (2.53) for $gm_{min}$ the expression of Eq. (2.62) and making use of Eq. (2.2) gives

$$\mu C_{ox} \frac{Z}{L} \frac{kT}{q \, n} = 4 \delta C_s f_{clock} = 8 \pi \delta C_I f_{unity} \tag{2.63}$$

Multiplying both sides of Eq. (2.63) by $L^2$ and solving for the gate area of M1, i.e. $Z \times L$, gives

$$Z \times L = \frac{8 \pi \delta \, q \, n \, f_{unity} L^2}{kT \mu} \frac{C_I}{C_{ox}} \tag{2.64}$$

Noticing that $\frac{C_I}{C_{ox}}$ is nothing but the area of $C_I$ it follows that

$$\frac{Z \times L}{Area\ of\ C_I} = \frac{8 \pi \delta \, q \, n \, f_{unity} L^2}{kT \mu} \tag{2.65}$$

The op amp area was assumed to be equal to $\beta$ times the area of the gate of M1, therefore

$$\frac{Area\ of\ Op\ Amp}{Area\ of\ C_I} = \frac{\beta \, 8 \pi \delta \, q \, n \, f_{unity} L^2}{kT \mu} \tag{2.66}$$

Using $\mu = 800 \frac{cm^2}{V\ sec}$, $n = 1.5$, $\delta = 7$, $\beta = 3$ in Eq. (2.66) gives

$$\frac{Area\ of\ Op\ Amp}{Area\ of\ C_I} = 38 \, L^2 f_{unity} \tag{2.67}$$

with $L$ expressed in cm and $f_{unity}$ in Hz. Assuming a $1\,\mu m$ minimum channel length technology it follows from Eq. (2.67) that $\frac{Area\ of\ Op\ Amp}{Area\ of\ C_I} = 1$ for $f_{unity} \approx 2.6\ MHz$. These results show that for future scaled technologies, i.e. $1\,\mu m$ or less minimum channel length, the dominant factor in determining the ultimate limits in the minimum achievable area of an S.C. integrator is given by the size of the integration capacitor and not by the op amp area, up to filter bandwidths well into the MHz range. This again is consistent with the assumptions of Section 2.
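The numerical coefficient of Eq. (2.67) and the break-even frequency quoted above can be reproduced with a few lines of code. This is a minimal sketch: the temperature of 300 K and the function names are assumptions introduced here, and lengths are kept in cm to match the units of the mobility.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C
T_KELVIN = 300.0              # assumed temperature (not stated in the text)


def area_ratio_coefficient(mu_cm2_per_vs=800.0, n=1.5, delta=7.0, beta=3.0):
    """Coefficient of L^2 * f_unity in Eq. (2.66), in s/cm^2."""
    thermal_voltage = K_BOLTZMANN * T_KELVIN / Q_ELECTRON
    return beta * 8 * math.pi * delta * n / (thermal_voltage * mu_cm2_per_vs)


def area_ratio(f_unity_hz, channel_length_cm):
    """(Area of op amp) / (Area of C_I) from Eq. (2.66)."""
    return area_ratio_coefficient() * channel_length_cm ** 2 * f_unity_hz


def break_even_frequency(channel_length_cm):
    """f_unity at which the op amp occupies the same area as C_I."""
    return 1.0 / (area_ratio_coefficient() * channel_length_cm ** 2)


coefficient = area_ratio_coefficient()
f_equal = break_even_frequency(1e-4)  # 1 um channel length expressed in cm
print(f"coefficient of Eq. (2.67): {coefficient:.1f} s/cm^2")
print(f"break-even f_unity at 1 um: {f_equal / 1e6:.2f} MHz")
```

Because the ratio scales as $L^2$, the same functions show that, under these assumptions, a 3 μm channel length lowers the break-even frequency by a factor of nine, to roughly 0.3 MHz.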
# CHAPTER 3. DESIGN ALTERNATIVES FOR MICROPOWER MOS AMPLIFIERS

In this chapter we will examine some of the alternatives available in the design of core amplifiers and buffer amplifiers in MOS technology. This will give us some general design criteria which will be used in the following chapters where the actual realized circuits will be described. In all the following we exclusively refer to circuits to be used in S.C. systems; however, most of our discussion will be quite general and therefore could be applied to other situations as well.

Since there is no generally accepted terminology we first define the class of circuits to which we refer in our study. We use the name core amplifier to indicate an amplifying circuit which is designed to drive only modest capacitive loads (a few picofarads) and no resistive loads. The output of a core amplifier will never be connected directly off-chip, and this fact motivates the name which has been given to it. The performance of an S.C. integrator, which is the basic building block of any S.C. circuit, is heavily dependent on the characteristics and limitations of the core amplifier used in it. This explains why we are devoting so much attention to this circuit. A buffer amplifier, on the other hand, is supposed to interface the on-chip circuitry to the off-chip world, as the name suggests. Because of its application, a buffer amplifier has to be able to drive a relatively large capacitive load or a relatively small resistive load or both.

## 3.1. DESIGN ALTERNATIVES FOR MICROPOWER CORE AMPLIFIERS

Table 3.1 shows a typical set of performance requirements for a core amplifier. The reported values are typical for an S.C. integrator [1]; however, they are also representative of the performance required in other applications, e.g. A/D and D/A converters [40][41].
Some comments are necessary. The minimum amount of gain required is a function of the acceptable DC error in the integrator; the quoted value of 1000 corresponds to an error of about .1%, which is typical in S.C. lowpass filters [16]. For high Q bandpass applications, however, the required gain may be considerably larger. Among the specifications relating to the amplifier speed (bandwidth, phase margin, and settling time) the most critical one is the settling time since it dictates the maximum clock frequency that can be used [42][43]. The required value of $2.5\,\mu sec$ or less to a precision of .1% is quite standard and allows for a clock of 128 kHz or more. All the other requirements vary from case to case depending on the particular application; therefore, for the sake of generality, we only indicate some typical range of values [39]. We are, however, referring to micropower applications, as can be seen from the value chosen for the power consumption.

### 3.1.1. One Stage Versus Two Stages Topology

Since core amplifiers are expected to drive only relatively small capacitive loads there are no requirements on the value of their output resistance, which in turn implies that no output stage is necessary. On the other hand there is a choice between a multiple stage and a single stage design. Table 3.2 compares one stage and two stage circuits, listing their respective advantages. For simplicity we chose not to consider circuits with 3 or more stages because of their stability problems and the corresponding difficulty in compensating them. Such a choice seems to be well justified for CMOS circuits because of their sufficiently large gain per stage but may not always be so in NMOS [44].
| CORE AMPLIFIER TYPICAL DESIGN SPECIFICATIONS (Capacitive load only 3-5 pF) | |
|----------------------|------------------------------------------|
| GAIN | > 1000 |
| SETTLING TIME TO .1% | < 2.5 μsec |
| BANDWIDTH | NON CRITICAL (1 - 5 MHz) |
| PHASE MARGIN | NON CRITICAL |
| C.M.R.R. | 70 - 90 db |
| P.S.R.R. | 80 - 100 db @ DC<br>30 - 60 db @ 100 kHz |
| OUTPUT SWING | .5 to 1 VOLTS FROM SUPPLIES |
| SUPPLY VOLTAGE | 5 VOLTS ONLY or ± 5 VOLTS |
| POWER CONSUMPTION | < 200 μW |
| INPUT NOISE | $100 \frac{nV}{\sqrt{Hz}}$ @ 1 kHz<br>$50 - 100 \frac{nV}{\sqrt{Hz}}$ WHITE |
| RANDOM OFFSET | 2 - 10 mV |
| AREA | 80 - 300 mils<sup>2</sup> |

Table 3.1 Typical Core Amplifier Design Specifications

| ONE STAGE | TWO STAGES |
|------------------------------------------------------|----------------------------------------|
| Simpler | |
| Better Swing | |
| No Compensation | Better Swing<br>(If a Cascode is Used) |
| Smaller Area | |
| Better P.S.R.R.<br>at High Frequency | |
| No Second Stage High<br>Frequency Noise Contribution | Larger Gain |
| Better Slewing Behavior | |
| Easier to Make Class A/B | |

Table 3.2 One Stage Versus Two Stages Comparison

The single stage configuration is discussed first. The terminology used is clarified at the outset to avoid confusion. By the general name of single stage amplifier we mean a circuit in which there is only one high impedance node (the output one), so that the load capacitance always present at the output determines the dominant pole. Both the classical common emitter and the common emitter common base (cascode) configurations belong to such a category. The main problem with the former is the limited amount of gain that it can achieve.
In particular, a simple common emitter circuit which satisfies the gain requirement of Table 3.1 will require operating the driver device in the subthreshold region and using extremely long channel devices in order to achieve enough output resistance. The latter condition, however, will drastically lower the frequency of the second pole, therefore reducing the achievable bandwidth of the circuit [39]. Because of the above considerations we believe that the cascode configuration has a decisive advantage with respect to the common emitter one, particularly for micropower applications, and therefore we will exclusively refer to it in the following. In a single stage amplifier, due to the presence of only one high impedance node, the load capacitance stabilizes the circuit so that in many cases no extra compensation is needed. This makes the amplifier simpler and tends to reduce the chip area requirement.

Furthermore, and probably more important, due to the absence of the capacitive coupling between the supply and the output via the compensation capacitor, the high frequency power supply rejection ratio (PSRR) is improved with respect to the multistage case. This point is illustrated in Fig. 3.1, where a plot of PSRR versus frequency is shown for a single stage and a two stage amplifier assuming the same DC value. For the two stage case, Fig. 3.1 refers to the supply that gives the worse behavior of the two.

Figure 3.1 PSRR Versus Frequency for One Stage and Two Stage Amplifiers

The 20 db per decade roll-off for frequencies above the dominant pole in the case of a two stage amplifier is due to the fact that, in this frequency range, while the input to output gain is falling, one of the two supplies is essentially shorted to the output by the compensation capacitance. This situation does not occur for a single stage amplifier.
The reason is that the dominant pole, being associated with the output node, appears in the input-to-output transfer function as well as in the transfer function from either supply to the output. As shown in Fig. 3.1b, eventually the PSRR starts to decrease even for the single stage case due to parasitic capacitive coupling from the supplies to the output. This, however, occurs at a frequency which is much higher than that of the dominant pole.

The noise behavior for the two stage and one stage circuits is illustrated with the help of Fig. 3.2, where the input referred noise for the two possible configurations is shown assuming the same low frequency value. The extra high frequency component in the two stage case is caused by the second stage and starts to become important when the input stage gain becomes sufficiently small. Such a contribution obviously does not appear for a single stage configuration. The presence of extra high frequency noise is particularly undesirable in sampled data systems since it can be aliased into the baseband by the sampling operation.

Also, due to the fact that no compensation is required, the slew rate problem is potentially alleviated, which is particularly important for micropower applications. The final point in Table 3.2, i.e. the advantage of single stage configurations when a class A/B topology is used, is particularly true if class A/B behavior is achieved by splitting the signal in two paths as is done in the design described in Chapter 4. We will discuss this point in greater detail in Section 3.2.1 with reference to buffer amplifiers.

The right hand side of Table 3.2 shows the advantages of a two stage configuration: a potentially large gain even for relatively short channel devices, and a larger output swing for the same supply voltage when compared with the single stage case using a cascode configuration.
This last point is of fundamental importance when a large dynamic range is required and becomes more and more important as the supply voltage is reduced. In fact, unless some special design techniques are adopted, the use of a cascode configuration becomes impractical for a total supply voltage smaller than 5 Volts or so and a two stage solution becomes the only viable alternative. This point will be discussed in greater detail in Chapter 4.

Figure 3.2 Input Referred Noise for One Stage and Two Stage Amplifiers

### 3.1.2. Class A Versus Class A/B Topology

Table 3.3 compares two other possible architectural alternatives in the design of micropower core amplifiers, i.e. class A versus class A/B. Some clarification of the terminology used is again in order. We call class A a circuit whose power consumption and output current availability are fixed independently of the value of the signal applied to it. On the other hand, we call input to output class A/B, or simply class A/B, a circuit whose power dissipation and current availability are a function of the applied signal, with peak values that can be many times larger than the stand-by ones. The main reason to use a class A/B circuit is to reduce, and if possible eliminate, the highly nonlinear slewing portion of the amplifier step response. This is particularly important in micropower circuits due to the fact that the relative importance of the slewing portion of the settling time with respect to the linear portion increases as the current is reduced. The above point will be discussed in detail in the following. While the main advantage of class A/B circuits in micropower applications is their low power consumption for a given speed requirement, or alternatively their fast response for a given power level, they also have a large gain and the possibility of trading some of the gain for a larger bandwidth, a good output swing and a relatively low offset.
This favorable behavior is a consequence of the very low stand-by current of class A/B circuits and of the fact that, in S.C. applications, all of the above characteristics are relevant only at the instant in which the output signal is sampled by the next stage; therefore they have to be evaluated at the stand-by current level.

| CLASS A/B | CLASS A |
|------------------------------------------|-------------------------------------------------|
| Very Low Power | Simpler Structure<br>(Both Design and Analysis) |
| No Slewing Problems | Easier to Use From<br>Low Voltage Supplies |
| Large Gain<br>(Can Trade Gain for Speed) | Easier to Make Fully<br>Differential (CMFB) |
| Better Output Swing | |
| Low Offset | |

Table 3.3 Class A/B Versus Class A Comparison

The large gain can be understood immediately from Fig. 3.3, where the maximum gain per stage is plotted as a function of the current level. The decreasing portion of the curve, which varies, at least to first order, like $\frac{1}{\sqrt{I}}$, is due to the fact that while the transconductance of an MOS device increases with increasing current as $\sqrt{I}$ its output resistance decreases as $\frac{1}{I}$ [19]. The above relations are valid if the devices are operating well above threshold. For a sufficiently low value of the current, however, the larger device (the driver) enters the subthreshold region of operation and its transconductance becomes directly proportional to I [25]. From this point on the gain becomes independent of the current, as shown in the figure.

Figure 3.3 Gain Per-Stage Versus Current Level

The potentially lower input offset can be explained in the same way. For a differential pair configuration the input referred offset due to the mismatch in all the devices but the input ones can be shown to be inversely proportional to the value of $\frac{gm}{I}$ for the input devices [19].
Since this quantity varies with I in exactly the same way as the gain, it follows that by reducing the current level the reflected back offset will also be reduced until the subthreshold region is reached. In practice this effect is not very important since the main contribution to the offset comes from the input devices. The good output swing is also a consequence of the low current level, which in turn means low $V_{GS} - V_T$ in the output devices.

The basic advantage of class A circuits is their greater simplicity from the point of view of both design and analysis. This is because they do not experience the large variations in the current level (30 to 1 and more depending on the supply voltage) which are the essence of class A/B behavior; therefore small signal considerations are guaranteed to be accurate except during the slewing mode, which is, however, fairly simple to model [45][46]. Furthermore, for a class A amplifier it is much easier to predict the region of operation of all the active devices during the transient and to prevent them from being cut off or entering the triode mode when this is undesirable. An easier design problem typically gives topologically simpler solutions, which in turn means a smaller number of devices and the likelihood of not having to stack many of them on top of each other, which is very important if the circuit is powered by a low supply voltage. Finally, if a fully differential configuration is used the design of the common mode feedback circuit is generally simpler for a class A solution.

Figure 3.4 Circuit Used to Show the Variation of Slew Delay with Current

The reasons why class A/B configurations are particularly suited for micropower applications are now explained in detail.
Furthermore we will develop some very practical criteria for comparing in a quantitative way class A and class A/B amplifiers in terms of their achievable speed, given the particular application for which they are intended and the level of power dissipation to be achieved.

The behavior of the circuit of Fig. 3.4 when it is driven by a large voltage step at the input is analyzed for different values of the total supply current I to show the characteristic behavior of MOS operational amplifiers. The circuit under test is assumed to be using a standard class A topology. Curve a) in Fig. 3.5 is a plot of the output voltage as a function of time in response to an input step of height $\Delta V_i$ for a total supply current I. Curve b) in Fig. 3.5 is a plot of the voltage at the same node and for the same input but for a total supply current equal to $\frac{I}{100}$. Notice that the horizontal axis has been normalized to the closed loop time constant of the amplifier $\tau$, which is defined as follows:

$$\tau = \frac{1}{\omega_{unity}} = \frac{1}{2\pi f_{unity}} \tag{3.1}$$

with $f_{unity}$ being the amplifier unity gain frequency. The basic message of Fig. 3.5 is that the amount of time spent by the output node slewing toward its final value, relative to the total settling time, becomes more and more important as the supply current (and therefore the power) is reduced. Qualitatively this can be explained as follows. The linear portion of the settling time, $\Delta t_1$, is proportional to $\tau$, which in turn is proportional to $\frac{1}{gm}$, where gm is the transconductance of the input devices. Assuming that gm is proportional to $\sqrt{I}$, $\Delta t_1$ becomes proportional to $\frac{1}{\sqrt{I}}$. On the other hand the slewing portion $\Delta t_2$ is proportional to $\frac{1}{I}$. The ratio between $\Delta t_2$ and $\Delta t_1$ is therefore proportional to $\frac{1}{\sqrt{I}}$, i.e. it grows as the current is reduced.
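
The scaling argument above can be checked with a short numerical sketch. It assumes square-law devices well above threshold, so that the linear settling time scales as $\frac{1}{\sqrt{I}}$ and the slewing time as $\frac{1}{I}$. The function names and the normalized reference values are illustrative only and are not taken from the original analysis.

```python
import math


def settling_components(current_scale, dt1_ref=1.0, dt2_ref=1.0):
    """Scale the linear (dt1) and slewing (dt2) settling times with current.

    current_scale is the supply current relative to a reference level.
    dt1_ref and dt2_ref are hypothetical, normalized times at that level.
    """
    dt1 = dt1_ref / math.sqrt(current_scale)  # gm ~ sqrt(I), tau ~ 1/gm
    dt2 = dt2_ref / current_scale             # slew rate ~ I
    return dt1, dt2


def slew_to_linear_ratio(current_scale, dt1_ref=1.0, dt2_ref=1.0):
    """Ratio dt2/dt1 at the given relative current level."""
    dt1, dt2 = settling_components(current_scale, dt1_ref, dt2_ref)
    return dt2 / dt1


if __name__ == "__main__":
    for scale in (1.0, 0.1, 0.01):
        dt1, dt2 = settling_components(scale)
        print(f"I x {scale:<5}: dt1 = {dt1:7.2f}, dt2 = {dt2:7.2f}, "
              f"dt2/dt1 = {dt2 / dt1:6.2f}")
```

Under these assumptions, reducing the current by a factor of 100 (curve b versus curve a in Fig. 3.5) raises the slewing-to-linear ratio by a factor of 10.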
Figure 3.5 Output Voltage for the Circuit of Fig. 3.4

A more quantitative analysis is carried out in appendix A under the following assumptions, which are consistent with the model used by Chuang [45]:

- 1) The slew rate of the amplifier is limited by the input stage current available to charge the compensation capacitor.
- 2) The input stage is modeled as shown in Fig. 3.6 with a maximum available current $I_{xm}$ and a transfer characteristic with slope $gm_I$ for values of the input signal smaller than $\frac{I_{xm}}{gm_I}$.
- 3) The amplifier ac transfer function is well represented by a two pole system, i.e. the singularities beyond the second pole contribute a negligible phase shift at the unity gain frequency, $\omega_{unity}$.
- 4) $\frac{\omega_2}{4} < \omega_{unity} < \omega_2$, i.e. $1 > \xi > \frac{1}{2}$, where $\omega_2$ is the frequency of the second pole and $\xi$ is the damping factor of the closed loop step response.

The basic result of appendix A is shown in Eq. (3.1)

$$\left(\frac{\Delta t_{2}}{\Delta t_{1}}\right) = \frac{\dfrac{2 \, \xi^{2} \, \Delta V_{o}}{(V_{GS} - V_{T})_{inp}}}{\ln \dfrac{1000 \, (V_{GS} - V_{T})_{inp}}{\Delta V_{o}}} \tag{3.1}$$

where $(V_{GS} - V_T)_{inp}$ is the voltage overdrive for the input devices at equilibrium and a settling accuracy equal to .1% of the voltage step amplitude has been assumed. Although Eq. (3.1) was derived for a particular amplifier topology, it is much more general. In fact it is valid for all class A amplifiers using a source coupled pair as input stage, provided that assumptions 1) through 4) are valid (which is the case for most of the class A amplifiers reported in the literature) [1][17][26][28][30][31][44]. Furthermore, in some cases when a more complicated differential stage is used, Eq. (3.1) needs only some minor modification to maintain its validity.

Figure 3.6 Op Amp Model for the Analysis of the Step Response of the Circuit of Fig. 3.4.

As an example, for a structure like the $\mu A$ 741 op
amp [39] an extra factor of $\frac{1}{2}$ should be added to the right hand side of Eq. (3.1).

From Eq. (3.1) it may seem that the simplest way to reduce $(\frac{\Delta t_2}{\Delta t_1})$ is to reduce $\xi$. In practice, however, due to the presence of higher order singularities, in doing so the linear portion of the settling time is degraded. It turns out that there is an optimum value of the ratio $\frac{\omega_2}{\omega_{unity}}$ (and therefore of $\xi$) for which the total settling time is minimum. Such an optimum value changes from case to case and can typically be determined only by time domain simulation of the step response. In the majority of the practical cases, however, $\frac{\omega_2}{\omega_{unity}}$ is within the limit of validity of Eq. (3.1). As an example of the information provided by Eq. (3.1), assuming $(V_{GS}-V_T)_{inp}=200\,mV$ and $\xi=\frac{\sqrt{3}}{2}$, i.e. $\omega_{unity}=\frac{\omega_2}{3}$, it follows that $\Delta t_2>\Delta t_1$ for $\Delta V_o \geqslant 760\,mV$.

Assuming that the above analysis is valid for any current level, it can be concluded that continuing to reduce the power consumption will eventually lead to a situation in which the slewing delay is dominant. In reality, when the current becomes sufficiently low the op amp input devices enter the subthreshold region of operation. From this point on the transconductance becomes proportional to the current level and the ratio between $\Delta t_2$ and $\Delta t_1$ reaches its maximum and remains constant, independent of any further reduction of the current [25].
The value of this maximum as a function of the output step amplitude is easily shown (Appendix A) to be equal to:

$$\left(\frac{\Delta t_2}{\Delta t_1}\right)_{\max} = \frac{\dfrac{\xi^2 \, q \, \Delta V_o}{n \, kT}}{\ln \dfrac{1000 \cdot 2 \, n \, kT}{q \, \Delta V_o}} \tag{3.2}$$

where n is the subthreshold coefficient defined as follows:

$$n = 1 + \frac{C_d}{C_{ox}}$$

where $C_d$ is the surface depletion capacitance per unit area and $C_{ox}$ is the oxide capacitance, also per unit area. For a state of the art process 1 < n < 2 [25]. Using the same values as in the previous example in Eq. (3.2) and assuming n=1.5 gives the following result: $\Delta t_2 > \Delta t_1$ for $\Delta V_o > 280 \, mV$. This shows that in micropower circuits the slew delay starts to be important from a relatively small value of the voltage step.

For a class A/B structure the above theory does not apply, since the amount of current available to charge either the load or the compensation capacitance is not limited to the quiescent value but can be many times larger (up to 30 times and more) for a large input signal. In fact for class A/B circuits the slewing delay is greatly reduced if not totally eliminated, i.e. $\Delta t_2 \rightarrow 0$. Since class A/B circuits generally require more complicated topologies, they become attractive only when using a classical approach would yield a response in which $\Delta t_2$ is a substantial fraction of the total settling time. The above theory gives a quantitative basis to decide between the two possible alternatives (class A versus class A/B).

We will now try to extend the above results to the case of an S.C. integrator as shown in Fig. 3.7. Two situations should be considered depending on the structure of the amplifier. In the first case the op amp used in the S.C. integrator is assumed to have a single stage topology. In this case Eq. (3.1) can be applied unchanged provided that

- 1. $C_I \gg C_s$ so that a feedback loop gain of 1 can be assumed for the circuit of Fig. 3.7
- 2. The proper value of $\Delta V_o$ is used as explained below.

Figure 3.7 Switched-Capacitor Integrator.

Figure 3.8 Output voltage for the circuit of Fig. 3.7 when the input switches are closed.

Fig. 3.8 shows the SPICE simulated output waveform for the structure of Fig. 3.7 when MOS switch M1 is closed and the charge on $C_s$ is integrated onto $C_I$. As can be seen, immediately after the switch is closed, i.e. at $t=0^+$, a voltage spike of height equal to $\Delta V_1$ appears at the output with a polarity opposite to that of the final DC voltage step $\Delta V_2$. Such a behavior is due to the presence of a feedforward path from the input to the output via capacitor $C_I$. At the instant $t=0^+$ the op amp is not yet active due to its finite internal delay; therefore the feedforward path dominates, resulting in a positive input-to-output gain. When the op amp becomes active, however, the circuit of Fig. 3.7 behaves like an inverting S.C. integrator with a negative input-to-output gain. The presence of the opposite polarity spike increases the slew delay, $\Delta t_2$. Such an effect can be easily accounted for, however, by using the sum of $\Delta V_1$ and $\Delta V_2$ for the value of the output step $\Delta V_o$ in Eq. (3.1). The value of $\Delta V_1$ is fairly complicated to compute; however, both very simplified calculations and computer simulation show that it is often larger than $\Delta V_2$, especially if $C_I \gg C_s$.

The second case to be considered occurs when a two stage amplifier is used in the S.C. integrator. As before, $C_I$ is assumed to be much larger than $C_s$, which corresponds to a feedback loop gain of approximately one. In the derivation of Eq.
(3.1) for the case of a two stage topology it was assumed that the slewing speed of the op amp was limited by the total amount of current available from the first stage to charge the compensation capacitor. As can be seen from Fig. 3.9, where a typical two stage amplifier structure is shown, it is however possible that for a positive output step the slewing speed be limited by the amount of current available from current source I2 to charge both load and compensation capacitance. Practical considerations dictate that the value of the load and the compensation capacitance be of the same order and that the ratio between the bias current in the output stage (I2) and that in the input stage (I1) be approximately 2 to 1. As a consequence, for the structure of Fig. 3.4 input and output stage give the same slewing speed and Eq. (3.1) is valid. For an S.C. integrator, however, the presence of the voltage spike at the output plus the added capacitive load due to the presence of $C_I$ and $C_s$ causes the positive slew rate (see Fig. 3.9) to be limited by the output stage current.

Figure 3.9 Circuit Schematic for a Classical Two stage Amplifier

This is because, with respect to the case of Fig. 3.4, while the input stage must still drive the same capacitor $(C_c)$ and still sees the same voltage step $(\Delta V_2)$, the output stage must drive a much larger capacitor and sees a larger voltage step $(\Delta V_1 + \Delta V_2)$. As a consequence the slew delay is larger than what is predicted by Eq. (3.1), in analogy to the case of a single stage amplifier. In this case, however, it is not possible to correct Eq. (3.1) in a simple and general way so as to extend its validity to the S.C. integrator. A correction factor can still be found but it depends on the specific case. In practice the deterioration associated with the above effect seems to be at least as severe as in the single stage case, as will be shown via a practical example below.
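
Eq. (3.1) and Eq. (3.2) can be evaluated numerically to locate the output step for which the slewing and linear portions of the settling time are equal. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the thermal voltage $\frac{kT}{q}$ is assumed to be 25 mV, and the bisection search assumes that the ratio increases monotonically with the step size over the bracketed range.

```python
import math

KT_OVER_Q = 0.025  # assumed thermal voltage kT/q in volts (roughly room temperature)


def slew_ratio_above_threshold(dv_o, v_ov, xi):
    """Eq. (3.1): dt2/dt1 for 0.1% settling, input pair well above threshold.

    dv_o : output step in volts
    v_ov : (VGS - VT) of the input devices at equilibrium, in volts
    xi   : damping factor of the closed loop step response
    """
    return (2.0 * xi ** 2 * dv_o / v_ov) / math.log(1000.0 * v_ov / dv_o)


def slew_ratio_subthreshold(dv_o, n, xi, kt_over_q=KT_OVER_Q):
    """Eq. (3.2): maximum dt2/dt1, input pair in the subthreshold region."""
    v_t = n * kt_over_q
    return (xi ** 2 * dv_o / v_t) / math.log(1000.0 * 2.0 * v_t / dv_o)


def crossover_step(ratio_fn, lo=0.05, hi=2.0, iterations=60):
    """Bisection for the step (in volts) at which ratio_fn equals 1.

    Assumes ratio_fn(lo) < 1 < ratio_fn(hi) and that ratio_fn is increasing.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ratio_fn(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


XI = math.sqrt(3.0) / 2.0  # corresponds to omega_unity = omega_2 / 3

# Values from the text: (VGS - VT) = 200 mV above threshold, n = 1.5 in subthreshold.
step_above = crossover_step(lambda dv: slew_ratio_above_threshold(dv, 0.2, XI))
step_sub = crossover_step(lambda dv: slew_ratio_subthreshold(dv, 1.5, XI))

if __name__ == "__main__":
    print(f"above threshold: dt2 > dt1 beyond about {step_above * 1e3:.0f} mV")
    print(f"subthreshold:    dt2 > dt1 beyond about {step_sub * 1e3:.0f} mV")
```

With these assumptions the two crossover steps come out close to the 760 mV and 280 mV figures quoted earlier for the two regions of operation.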
An example is now developed to show the practical implication of the above considerations. Let us consider an S.C. integrator as in Fig. 3.7 with a 3 kHz unity gain frequency and a clock rate of 128 kHz. The capacitor ratio $\frac{C_I}{C_s}$ can be found from Eq. (2.2) to be

$$\frac{C_I}{C_s} = \frac{f_{clock}}{2 \pi f_{unity}} = 6.8 \tag{3.3}$$

Assuming for simplicity that $C_s=1~pF$ and $C_L=2~pF$, we will have $C_I=6.8~pF$. Let us further assume that the signal present at the output of the integrator is a pure sinusoid of peak amplitude $(V_o)$ equal to 2.3 Volts (full swing amplitude for a single 5 Volt supply) and frequency (f) equal to 2.5 kHz. The maximum output DC step $\Delta V_2$ can be computed to be approximately:

$$\Delta V_2 \approx V_o \sin\left(2\pi \frac{f}{f_{clock}}\right) \approx 285 \, mV \tag{3.4}$$

The size of the input signal can be computed from Eq. (2.10) as follows:

$$V_i = \frac{2 \pi f C_I}{f_{clock} C_s} V_o \approx 1.9 \, V \tag{3.5}$$

For a single stage topology, the size of the output feedforward voltage spike ($\Delta V_1$) can be estimated to first order by neglecting the effect of the amplifier at $t=0^+$ and also neglecting the switch resistance. In this case the circuit of Fig. 3.7 can be substituted with the one of Fig. 3.10. Ignoring the small input parasitic capacitance of the amplifier, the total equivalent capacitance on the right hand side of the switch is

$$C_{eq} = \frac{C_I C_L}{C_I + C_L} \approx 1.6 \ pF \tag{3.6}$$

The height of the step at the input $\Delta V_{in}$ is then obtained by using charge conservation as follows:

$$\Delta V_{in} = \frac{V_i C_s}{C_s + C_{eq}} = 0.85 \text{ Volts} \tag{3.7}$$

while the height of the step at the output can be computed from $\Delta V_{in}$ by capacitor division as follows:

$$\Delta V_1 = \Delta V_{in} \frac{C_I}{C_I + C_L} \approx 0.65 \text{ Volts} \tag{3.8}$$

This shows that the size of the feedforward spike is more than twice that of the DC voltage step.
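
The chain of first-order estimates in Eqs. (3.3) through (3.8) can be sketched in a few lines of code. This is only an illustration: the function name and dictionary keys are hypothetical, and the series combination in Eq. (3.6) and the divider in Eq. (3.8) are read here as involving $C_I$ and $C_L$, which is an interpretation of the circuit of Fig. 3.10. Because no rounding is applied at intermediate steps, the computed values are first-order estimates and need not coincide digit for digit with the rounded figures quoted in the text.

```python
import math


def sc_integrator_spike(f_unity=3e3, f_clock=128e3, f_sig=2.5e3,
                        v_peak=2.3, c_s=1e-12, c_l=2e-12):
    """First-order estimate of the DC step and feedforward spike."""
    cap_ratio = f_clock / (2.0 * math.pi * f_unity)            # Eq. (3.3)
    c_i = cap_ratio * c_s
    dv2 = v_peak * math.sin(2.0 * math.pi * f_sig / f_clock)   # Eq. (3.4)
    v_i = 2.0 * math.pi * f_sig * c_i / (f_clock * c_s) * v_peak  # Eq. (3.5)
    c_eq = c_i * c_l / (c_i + c_l)                             # Eq. (3.6), C_I in series with C_L
    dv_in = v_i * c_s / (c_s + c_eq)                           # Eq. (3.7), charge conservation
    dv1 = dv_in * c_i / (c_i + c_l)                            # Eq. (3.8), capacitive divider
    return {
        "cap_ratio": cap_ratio,
        "dv2": dv2,
        "v_i": v_i,
        "c_eq": c_eq,
        "dv_in": dv_in,
        "dv1": dv1,
    }


result = sc_integrator_spike()

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name:>9} = {value:.4g}")
    print(f"dv1 / dv2 = {result['dv1'] / result['dv2']:.2f}")
```

Even with unrounded intermediate values, the estimated spike remains larger than twice the DC step, which is the qualitative point of the example.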
In actuality the finite resistance of the switches will limit the spike height; however, it is very possible to have $\Delta V_1 > \Delta V_2$. In our following calculations we will assume $\Delta V_1 = 1.5 \Delta V_2$ for the sake of concreteness. For the single stage case the above result can be directly used in Eq. (3.2) to give $(\frac{\Delta t_2}{\Delta t_1})_{\rm max} \approx 2.85$.

For the two stage case (Fig. 3.9) a two to one current ratio between the output and input stages is assumed. Eq. (3.2) can still be used if its right hand side is multiplied by the ratio between the slew delay associated with the output stage and that associated with the input stage. Assuming that the compensating capacitor $C_c$ is equal to the load capacitor $C_L$ and that the parasitic capacitor associated with node A in Fig. 3.9 is much smaller than $C_L$, it follows that the output voltage spike will have the same amplitude as in the single stage case.

Figure 3.10 Circuit Used to Compute the Output Voltage Spike

Calling I the total supply current for the entire amplifier, the slew delay associated with the first stage is:

$$\Delta t_{first} = \frac{3}{I} C_c \Delta V_2 \tag{3.9}$$

The slew delay associated with the second stage is

$$\Delta t_{second} = \frac{3}{2} \frac{(C_L + C_{L1})}{I} (\Delta V_1 + \Delta V_2) + \frac{3}{2} \frac{C_c}{I} \Delta V_2 \tag{3.10}$$

where $C_{L1}=\frac{C_s C_I}{C_s + C_I}$, i.e. $C_{L1}$ represents the effective load at the output due to the feedback circuit. For $\Delta V_1=1.5~\Delta V_2$ and using the same numerical values as in the single stage case we obtain $\frac{\Delta t_{second}}{\Delta t_{first}}\approx 2.4$, i.e. $(\frac{\Delta t_2}{\Delta t_1})_{\rm max}\approx 2.78$, which is almost the same result as for the case of a single stage topology.

## 3.1.3. Single Ended Versus Fully-Differential Topology

The last two alternatives considered in this overview are shown in table 3.4 and they are single ended versus fully differential (differential-in-differential-out) structures. On the left side of the table the advantages of a classical single ended circuit are listed. The most important one is its simplicity and in particular the fact that it does not need a common mode feedback circuit. In fact it is extremely difficult to design a good common mode feedback circuit without drastically increasing the overall power consumption (for a micropower amplifier) unless a switched capacitor scheme is used. This last solution, however, complicates the clocking scheme and the structure of the filter, thus increasing its size. Furthermore the forward amplifier itself requires more power and area for the fully-differential case. The aspect of power consumption will be expanded further at the end of this section.

| SINGLE ENDED | FULLY DIFFERENTIAL |
|---------------------------------|---------------------------------------------------|
| Simpler | Good P.S.R.R.<br>(Up to High Frequency) |
| Does Not Need<br>A CMFB Circuit | Good C.M.R.R.<br>(Up to High Frequency) |
| Smaller Area | No Need for a<br>Regulated Clock in S.C. |
| Less Power<br>Consumption | Easier Filter<br>Design in S.C. |
| | Better Dynamic Range<br>(Same Noise Double Swing) |

Table 3.4 Single Ended Versus Fully Differential Comparison

On the other hand, almost all the amplifier specifications are improved by using a differential scheme. The main advantages are a better power supply rejection and common mode rejection. In order to understand the reasons for such an improvement, notice that conceptually a fully differential amplifier can be obtained from a single ended one by eliminating the differential to single ended converter stage and adding another identical second stage, when this is present. This is shown schematically in Fig.
3.11, where the input differential stage is shown as a source coupled pair for the sake of concreteness. If two signals with the same amplitude and opposite phase, i.e. a purely differential signal, are applied to the input nodes of the circuit of Fig. 3.11, the two outputs will also have identical amplitude and opposite phase. Defining the differential input as the difference between the signals applied to the input nodes and using the same definition for the output, it follows that the fully-differential amplifier has twice as much differential gain as the original single ended one. On the other hand any signal which is simultaneously applied to the two inputs does not produce any differential output, i.e. the circuit is able to completely reject any common mode signal. Furthermore, provided that all disturbances, i.e. supply noise, clock induced noise etc., are symmetrically coupled into corresponding points in the two signal paths, their contribution to the differential output will be totally rejected. For this to occur a perfectly symmetric circuit and layout must be realized; moreover an exact component matching must be assumed. In reality, even if an exactly symmetric layout could be obtained, some mismatch is always present due to bias dependent parasitic elements. Nonetheless, assuming a 10% mismatch, about 20 db improvement is obtained with respect to the single ended version. In chapter 5 we will show experimentally that such a result can be achieved for a $5\mu m$ technology.

The ability to reject unwanted disturbances from the signal path is particularly important when both digital and analog functions are integrated on the same chip, due to the large amount of switching noise typically associated with any digital circuit [47]. Another kind of switching noise which is always present in an S.C. circuit, and can be reduced by using a fully-differential scheme, is the so called clock feed-through [41][48].
It is caused by the injection into the summing node of the S.C. integrator of the charge stored on the parasitic capacitance of the MOS switches and in their channels, as shown in Fig. 3.12. Reducing the clock feedthrough is particularly important in S.C. applications since it could eliminate the need for a clock regulator circuit, which is otherwise necessary to smooth out the noise superimposed on top of the clock signals [5]. Such a simplification of the design could partially compensate for the increased complexity associated with the implementation of the fully-differential topology.

Figure 3.11 Conceptual fully-differential integrator.

Figure 3.12

Further simplification in the amplifier input stage topology is often possible when a fully-differential structure is used, due to the drastic reduction in the effect of the so called supply capacitance [1][39], i.e. the injection of charge into the integrator capacitor due to the parasitic capacitive coupling between the supplies and the integrator summing node. If a large PSRR is necessary and a single ended configuration is used, the supplies must be decoupled from the summing node, typically by using buffer devices biased with respect to ground and placing the input devices in floating wells (CMOS only) to eliminate body effect. These solutions are costly in terms of chip area and for the case of a low supply voltage can severely degrade the performance of the op amp. For a fully differential solution the improvement in the PSRR is often enough that no particular care in eliminating the supply capacitance coupling is necessary; this can represent a fundamental advantage in a low voltage environment. Finally, a drastic reduction of the clock feedthrough can make it possible to directly change the clock rate between successive stages of the same S.C. circuit without the need for any continuous time filtering, therefore increasing flexibility at the system level.

Fully-differential circuits also simplify the design of the S.C.
filter architecture because of the availability of both the positive and the negative version of the signal at each intermediate node; this is particularly advantageous in elliptic filters where zeros are to be implemented [16].

The amplifier dynamic range is also improved. In fact, since the effective signal is given by the difference of the two waveforms with opposite phases, its maximum amplitude is doubled with respect to the single ended case for the same supply voltage. On the other hand, the total input referred noise is approximately the same in the two cases due to the fact that in a well designed amplifier the noise is mostly due to the input stage and a fully differential circuit uses the same input stage for both signal paths. From the above it follows that the amplifier dynamic range is almost doubled for the fully differential configuration. This, however, is not the case for the S.C. integrator due to the $\frac{kT}{C}$ noise contribution associated with the MOS switches. In fact, assuming the same total capacitor size $(C_S + C_I)$ in the two cases, the value of all capacitors must be reduced by a factor of two in the fully differential implementation since there are twice as many of them. This doubles the output noise power (see chapter 2) contributed by each switch. Furthermore, the number of switches contributing to the noise is also doubled, as can be seen from Fig. 3.13, so that the total output noise power is 4 times larger than in the single ended case, which implies twice as much rms output noise. Since, as we noticed, the output swing is also doubled, we conclude that the dynamic range is unaffected if the $\frac{kT}{C}$ noise is dominant, while it is doubled if the amplifier noise is dominant. In practice any situation between these two extremes can occur.

As a final point we compare the fully differential and the single ended topologies from the point of view of power consumption when they are used in an S.C. integrator.
For simplicity's sake we neglect the power required by the common mode feedback portion of the circuit since such a contribution is very difficult to quantify. Such an assumption is not valid if a continuous time CMFB circuit is used [17]; however, it has been shown [3] that, although at the cost of a greater complexity, it is possible to realize such a circuit in a dynamic way with essentially zero dissipation by using S.C. techniques. The above assumption should, however, always be kept in mind when evaluating the results of this analysis. The two structures are compared from the point of view of their power dissipation under the condition that they achieve the same speed and dynamic range. Assuming the same supply voltage and that the $\frac{kT}{C}$ noise is dominant, to obtain the same dynamic range the value of the capacitors used in the fully differential configuration must be $\frac{1}{2}$ of those used in the single ended one.

Figure 3.13 Fully-Differential and Single Ended S.C. Integrator

This will have in general different effects depending on the topology of the amplifier. For a two stage structure the compensation capacitance can be reduced by a factor of two, therefore obtaining the same slewing delay with $\frac{1}{2}$ of the original current in each branch of the circuit and the same linear response with either $\frac{1}{2}$ or $\frac{1}{4}$ of it depending on the region of operation of the devices. The exact amount of possible reduction will depend on the relative weight of the two portions of the response and on the region of operation. For very low power operation (which is what we are interested in) the global value of the current saving will be very close to $\frac{1}{2}$. For a single stage amplifier, on the other hand, cutting the value of the capacitance by 2 correspondingly reduces the margin of stability, which may deteriorate the shape of the time response and increase the linear portion of the settling time. However, in an S.C.
integrator, due to the loading effect of the integration capacitor, there is in general an ample margin of stability, so that often no degradation occurs and the same result as in the two stage case is obtained. A fully differential structure, however, because of its greater complexity requires more current than the single ended one; the exact amount of such an increase is very difficult to quantify since it depends on the topology, but it will always be less than a factor of 2. Combining these two results we can conclude that the fully differential configuration compares favorably with the single ended one in terms of power for the same speed and dynamic range, given that the power contributed by the common mode feedback circuit can be neglected. The above result is always true for a two stage amplifier, while for a single stage one there may be cases in which this is not so.

#### 3.2. DESIGN TRADE-OFFS FOR BUFFER AMPLIFIERS

Many of the previous considerations referring to core amplifiers can be extended to buffer amplifiers; however, this is true to a different degree depending upon the kind of load that the circuit is expected to drive. For this reason, in what follows we will distinguish between buffer amplifiers whose load is purely capacitive and those which are expected to drive both capacitors and resistors. Table 3.5 gives some general design specifications for the two cases. The main difference between the two sets of specifications is that, for the resistively loaded case, a sufficiently low value of the output resistance is required and, therefore, a power consumption increase is expected. As we did for the core amplifiers, we will focus our attention on micropower circuits, as can be seen from the value of the power in table 3.5.
Finally we notice that, since the above circuit is mainly intended as an interface between the on-chip circuitry and the outside world, we assume that the output signal will have to be referenced to ground and we therefore exclude a priori a fully differential configuration. This last assumption limits to a certain degree the generality of our considerations but drastically simplifies our task. A very challenging and interesting problem still open to new solutions is the design of a differential-to-single-ended converter capable of driving off-chip loads while preserving the level of performance (PSRR, Clock Rejection, etc.) achieved by the fully-differential structure which precedes it [3]. Such a problem, however, will not be addressed in the following.

| BUFFER AMPLIFIER TYPICAL DESIGN SPECIFICATIONS | | |
|------------------------------------------------|----------------------------------------------|-----------------------------------------------------------------|
| AMPL. SPECIFICATIONS | CAPACITIVE-ONLY<br>LOAD<br>(100 - 200 pF) | CAPACITIVE and/or RESISTIVE LOAD<br>(100-200 pF, 1-5 $k\Omega$) |
| GAIN | > 2000 | > 2000 (Fully loaded) |
| SETTLING TIME .2% ± 2.5 V Step | .5 - 2.5 μsec | 1 - 3 μsec |
| BANDWIDTH | 1 - 3 MHz | .5 - 2 MHz |
| PHASE MARGIN | > 50° | > 50° |
| C.M.R.R. | 70 - 90 db | 70 - 90 db |
| P.S.R.R. | 80-100 db @ DC<br>30-60 db @ 100 kHz | 80-100 db @ DC<br>30-60 db @ 100 kHz |
| OUTPUT SWING | 5 V From Rails | 1 V From Rails |
| OUTPUT RESISTANCE Closed Loop | Non Critical | <.5-4 Ω |
| SUPPLY VOLTAGE | 5 V Only or $\pm 5 V$ | 5 V Only or $\pm 5 V$ |
| POWER CONSUMPTION | 100 - 300 μW | .3 - 1 mW |
| RANDOM OFFSET | 2 - 10 mV | 2 - 10 mV |
| AREA | 80-300 mils<sup>2</sup> | 400-1000 mils<sup>2</sup> |
| NOISE | $100 \frac{nV}{\sqrt{Hz}}$ @ 1 kHz | $100 \frac{nV}{\sqrt{Hz}}$ @ 1 kHz |

Table 3.5 Typical Buffer Amplifiers Specifications

## 3.2.1. Buffer Amplifier with a Capacitive-Only Load

If no resistive load is expected we have a situation which is very similar to the core amplifier case. In fact, it is still possible to meet all the specifications of table 3.5 using a circuit configuration which is essentially identical to that used for the core amplifier. As before we consider the pros and cons of the possible different alternatives.

We begin by comparing a class A versus a class A/B topology. For a buffer amplifier connected in a unity gain configuration, unlike the case of an S.C. integrator, no feedforward effect is present; therefore Eq. (3.1) can be applied as is. The absence of the feedforward term will reduce the slew delay, all else being equal. Furthermore the larger capacitive load will require a higher current level and therefore a larger ($V_{GS} - V_T$) for the input devices, which again will reduce the relative importance of the slewing delay with respect to the overall settling time (see Eq. (3.1)). On the other hand, the size of the output step that can be expected is generally quite large, so that a class A/B configuration should typically be used. A class A/B solution also gives a larger DC gain, which in turn means that a single stage configuration can be used up to a very large value of the load capacitor, as we will explain below.
All the other considerations made when comparing class A and class A/B circuits for the case of core amplifiers apply unchanged to the case of buffers with a capacitive-only load.

Last we compare single stage and multistage topologies. We first notice that there is a limit to the size of the capacitor that can be driven from a single stage buffer amplifier satisfying the specifications of table 3.5, even if no maximum power dissipation is specified. This is because, to achieve the necessary speed, the level of current has to be increased as the size of the load capacitor increases, which in turn lowers the gain until eventually its value becomes less than the specified one even if a cascode is used. At this point a multistage solution (most of the time the equivalent of three stages is necessary) becomes necessary. A double cascode solution [41] is also possible but only if a large supply is used (10 V or more).

Given that the choice exists between single stage and multistage configurations, their advantages and disadvantages are essentially the same as in the case of core amplifiers. Some specific observations should, however, be made. First, the saving of area for the single stage case due to the absence of a compensation capacitance is more substantial for a buffer than for a core amplifier, since the size of such a capacitance is closely related to that of the load one, which is much larger for a buffer. Next, in a single stage configuration, to guarantee stability a minimum amount of capacitance must be present at the output at all times. This is because the load capacitance sets the location of the dominant pole while the non dominant ones are dependent on the value of internal parasitic capacitances.
In a two stage configuration, on the other hand, since the load capacitance determines the position of the first non dominant pole while the dominant one is fixed by the value of the compensation, to guarantee stability the size of the load capacitance must be smaller than a certain maximum value. Since the second condition is typically easier to satisfy than the first one we can conclude that, from this point of view, a two stage configuration is more flexible than a single stage one. Notice that the above is generally not true for a core amplifier due to the fact that, in this case, the amount of load capacitance needed to ensure stability for a single stage circuit is almost always guaranteed to be present. In fact for an S.C. integrator, even if no load is attached to it, the parasitic capacitance associated with the bottom plate of the integration capacitor is generally enough to guarantee stability.

Finally, if a class A/B configuration is adopted a single stage solution will generally produce a simpler topology. The reason for this is explained below. The most common way to realize a class A/B circuit, as is done in most of the output stages of general purpose operational amplifiers, is to split the signal into two paths which are alternately active depending on the polarity of the signal, and to sum them up at the output. If this technique is adopted together with a two stage topology it is necessary to frequency compensate both paths, as shown schematically in Fig. 3.14. Such a solution, besides requiring two compensation capacitances, produces a doublet (a pole and a zero close to each other) at a frequency near the dominant pole in the open loop transfer function of the circuit. It has been shown [46] that this gives rise to a slow settling component in the closed loop step response which can potentially degrade the settling time.
The amplitude of such a component is very difficult to predict because of the heavy non linearities inherent in the class A/B behavior, and is likely to be larger than what can be expected from the linear analysis performed by Kamath et al. [46], so that in most cases such an effect cannot be tolerated. The above problem does not appear in a single stage configuration, since the dominant pole is located at the output node and is therefore common to the two paths. Class A/B behavior can also be achieved in other ways, for instance by using positive feedback to raise the current level in the circuit [10]. In general, however, these techniques are less efficient, since they raise the current level in the entire circuit, even for those devices whose current should be reduced. Furthermore, they have been reported only for single stage topologies [10].

Figure 3.14 Conceptual Schematic of a Class A/B Two Stages Amplifier

## 3.2.2. Buffer Amplifiers with Resistive and/or Capacitive Loads

As a last topic of this overview we consider the design of micropower buffer amplifiers with resistive and/or capacitive load. Table 3.5 contains a set of specifications for such circuits. First we notice that, even for a moderate resistive load such as the one we have assumed, i.e. $5 k \Omega$, a multistage topology must be used if less than 1% error is expected in the DC transfer characteristic. To show why this is so, let us consider a buffer amplifier connected in a unity gain configuration with a load equal to $R_L$. This is represented schematically in Fig. 3.15 with $A = \frac{a_0}{1 + a_0}$ and $R_0 = \frac{r_0}{1 + a_0}$, where $a_0$ and $r_0$ are the open loop gain and output resistance respectively. The input to output DC transfer function is therefore

$$\frac{V_0}{V_i} = A \frac{R_L}{R_0 + R_L} \tag{3.11}$$

The difference between the actual value of $\frac{V_0}{V_i}$ and the ideal one, i.e. 1, is the DC error, whose value should be less than 1% according to our specifications.
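As a minimal numerical sketch of Eq. (3.11), the short Python fragment below evaluates the DC transfer function and the resulting DC error. The function names and the component values used in the example are hypothetical and chosen only for illustration; they are not taken from Table 3.5 or from any measured circuit.

```python
def dc_transfer(a0, r0, rl):
    """V0/Vi of Eq. (3.11) for a unity gain buffer with a resistive load rl.

    a0: open loop DC gain, r0: open loop output resistance (ohm),
    rl: load resistance (ohm).
    """
    a_closed = a0 / (1.0 + a0)    # closed loop gain A
    r_closed = r0 / (1.0 + a0)    # closed loop output resistance R0
    return a_closed * rl / (r_closed + rl)


def dc_error_percent(a0, r0, rl):
    """Deviation of V0/Vi from the ideal value of 1, in percent."""
    return 100.0 * (1.0 - dc_transfer(a0, r0, rl))


if __name__ == "__main__":
    # Hypothetical values: gain of 1000, 1 Mohm open loop output
    # resistance, 5 kohm load.
    print(dc_error_percent(1000.0, 1.0e6, 5.0e3))
```

With these assumed values the finite output resistance, rather than the finite gain, accounts for almost all of the error, which is the situation analyzed in the text that follows.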
From Eq. (3.11) it follows that there are two terms contributing to the overall DC error. The first one, i.e. A, depends entirely on the value of the open loop DC gain of the buffer. Since we have seen that micropower circuits can produce a very large DC gain, we neglect such a contribution. Setting A=1 in Eq. (3.11) we therefore obtain:

$$\frac{V_0}{V_i} \approx \frac{R_L}{R_0 + R_L} = \frac{R_L}{\frac{r_0}{1 + a_0} + R_L} \tag{3.12}$$

Approximating $1 + a_0$ with $a_0$ we finally get

$$\frac{V_0}{V_i} = \frac{R_L}{\frac{r_0}{a_0} + R_L} = \frac{a_0 R_L}{r_0 + a_0 R_L} \tag{3.13}$$

The percentage error is therefore:

$$\%\,error = \left(1 - \frac{a_0 R_L}{r_0 + a_0 R_L}\right) 100 = \frac{r_0}{r_0 + a_0 R_L} 100 \tag{3.14}$$

Assuming that 1% precision is required we then have:

$$\frac{r_0}{r_0 + a_0 R_L} \le \frac{1}{1000} \tag{3.15}$$

Figure 3.15 Buffer Amplifier in Unity Gain Configuration

For a single stage circuit the DC gain is given by

$$a_0 = gm_{inp} \beta r_0 \tag{3.16}$$

where $gm_{inp}$ is the equivalent transconductance of the input devices and $\beta$ represents the maximum value of the current gain that can be achieved by using ratioed current mirrors. For stability considerations, and to keep the power dissipation within specs, $\beta$ cannot typically exceed a value of 10. Inserting Eq. (3.16) into Eq. (3.15) we obtain:

$$\frac{r_0}{r_0 + g m_{inp} \ \beta \, r_0 \, R_L} = \frac{1}{1 + g m_{inp} \ \beta \, R_L} \le \frac{1}{1000}$$

This implies $gm_{inp} \, \beta \, R_L \geqslant 999$, which for $R_L = 5k \ \Omega$ and $\beta = 10$ gives $gm_{inp} \geqslant \frac{1}{50 \ \Omega}$. This condition is practically impossible to achieve for reasonable values of the current level and the device size, and is certainly never achieved by micropower circuits.

## 3.2.3. Low Versus High Open Loop Output Impedance Design

Even after having established the need for a multistage design, some degree of freedom is still left in the choice of the function to assign to each stage.
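Before turning to that choice, the single stage bound of Section 3.2.2 that establishes the need for a multistage design can be checked numerically. The sketch below simply solves the inequality $\frac{1}{1 + gm_{inp} \beta R_L} \le \frac{1}{1000}$ for $gm_{inp}$; the function name is hypothetical, and the bound of 1/1000 is the one appearing in that inequality.

```python
def min_input_gm(max_error_fraction, beta, rl):
    """Smallest gm_inp (in siemens) satisfying
    1 / (1 + gm_inp * beta * rl) <= max_error_fraction."""
    return (1.0 / max_error_fraction - 1.0) / (beta * rl)


if __name__ == "__main__":
    gm = min_input_gm(1.0e-3, 10.0, 5.0e3)   # gm * beta * rl = 999
    print(gm)          # about 0.02 S
    print(1.0 / gm)    # about 50 ohm
```

The result, a transconductance of roughly 20 mS, is several orders of magnitude above what a device carrying a few microamperes can provide, in agreement with the conclusion drawn in the text.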
In particular the alternative exists between the two structures shown schematically in Fig. 3.16. The essential difference between the two configurations is that in case a), by using a local DC feedback around the output stage, a low DC output impedance is achieved even when the overall feedback loop is open. In case b), on the other hand, there is a relatively high open loop output impedance which must be lowered by the use of overall feedback. Some local feedback must also be present in case b) to stabilize the structure, but it is only active for ac signals and therefore does not affect the DC characteristic of the circuit.

Figure 3.16 Two Alternative Ways to Achieve Low Output Impedance

Structure a) allows the designer to subdivide the circuit into two blocks, each one performing part of the overall task of the amplifier almost totally independently from the other, i.e. the preamplifier providing the overall gain with a very small driving requirement and the output stage providing the load driving with a very small gain requirement. In fact the open loop gain of the output stage $A_{unity}$ can be as low as 10 to 20, depending on the value of the desired output voltage swing, while its closed loop output resistance must only be equal to $\frac{R_L}{\alpha}$, where the minimum value for $\alpha$ is between 10 and 20. On the other hand the preamplifier needs only to drive the parasitic capacitance present at the input of the second stage. The main disadvantage of structure a) is a potentially lower speed. In fact, in order to guarantee overall closed loop stability, the unity gain output stage must contribute only negligible excess phase shift to the overall open loop transfer function, which implies that its poles must be located well beyond the unity gain frequency of the overall structure.
However, the unity gain stage itself is working in a closed loop configuration; therefore, to guarantee its own stability, the pole contributed by the load capacitance has to be well beyond its unity gain frequency. The cumulative effect of the above two conditions is that the pole contributed by the load capacitance has to be located at a frequency several times (4 or more from simulation) higher than the value of the overall closed loop bandwidth, as opposed to about two times for the scheme of Fig. 3.16 b). The above described behavior may result in a substantial power saving for the second solution when the load capacitance is relatively large. Another advantage of structure b) is the fact that at DC the second stage contributes to the overall gain of the circuit, so that the input stage needs only to have a gain equal to $\frac{A_A}{A_{unity}}$, where $A_{unity}$ is the gain of the second stage of structure b) and $A_A$ is the minimum gain required for the preamplifier of structure a). It should, however, be noticed that for a very strong driving requirement (100 $\Omega$ or less) combined with an extremely high accuracy requirement (better than 0.1%) structure a) may be the only viable alternative.

Last we consider the alternative between class A and class A/B topologies. We can immediately see that to have a reasonably efficient circuit the output stage must be class A/B. The question is whether the first stage (or stages) should be class A/B also. In both the case of Fig. 3.16 a) and b) the choice depends on the size of the voltage step the amplifier must follow. This is generally high enough to suggest the use of a fully class A/B solution, as was stated before. Furthermore, if high output accuracy is required, the larger gain per stage associated with class A/B circuits may become an important factor. In some cases, however, it may be advantageous to use a class A scheme for the preamplifier, particularly when the supply voltage, and consequently the output voltage step, is low.
This should give a simpler structure (less area), while the increased power dissipation is of no concern since the power consumed by the output stage is totally dominant.

# 3.3. HIGH VERSUS LOW VOLTAGE SUPPLIES

Scaling of MOS technology must be accompanied by a corresponding reduction of the total supply voltage. Analog circuits capable of operating from a lower supply voltage must therefore be designed. In this last section we consider both the design problems and the advantages that are associated with lower supply voltages in analog applications. Pros and cons of high versus low voltage supply are shown in Table 3.6.

A general definition of what is intended for low and high supply is quite difficult to formulate, since it depends on the technology used and in particular on the value of the threshold voltages. For a CMOS process the important parameter is the sum of the absolute values of the p-type and n-type threshold voltages. For the sake of concreteness we consider low any supply below 5 Volts. This, however, is not intended to be of general validity.

| HIGH VOLTAGE SUPPLY | LOW VOLTAGE SUPPLY |
|-----------------------------------------|------------------------------------|
| Better Performance | Less Power |
| More Circuit Configurations Possible | Allows Battery Operation |
| Easier to Design | Compatible with Digital Process |
| Better Dynamic Range (Larger Output Swing) | Compatible with Digital Circuits |

Table 3.6 High Versus Low Voltage Supply

On the left side of the table are shown the advantages of a classic design from a larger supply. The performance advantage is mainly due to the larger current driving capability for a given circuit configuration, which derives from the higher gate voltage overdrive available for each device. This is particularly important in class A/B circuits. Such an advantage is partially offset by the larger voltage swing associated with the larger supply.
The availability of a larger number of design configurations is due to the fact that, for a low value of the supply with respect to the threshold voltages, only a limited number of devices can be stacked on top of each other while still guaranteeing proper functioning over process variation. Furthermore, for a low supply voltage the use of cascode configurations becomes very costly in terms of voltage swing, to the point that often they cannot be used. Double cascode, on the other hand, cannot be used at all. For the same reason voltage followers cannot generally be used to lower the output impedance in buffer amplifiers. The next point made in the table, i.e. the design ease, is a direct consequence of the two points we have just discussed, i.e. better performance and a larger number of possible configurations. Finally, the larger output swing is immediately evident and also justifies the improvement in dynamic range, assuming the same value for the noise.

On the right hand side of the table are listed some of the benefits that derive from the availability of analog circuits operating from low supplies. Power reduction is evident since, for the same amount of current and therefore the same performance, reducing the power supply voltage proportionally reduces the power consumption. The other points are also self explanatory. It is important to notice, however, that compatibility with the digital process not only implies the possibility of integrating both digital and analog functions on the same chip, but also eliminates the need for separate fabrication lines for digital and analog products, which implies improved production flexibility.

# CHAPTER 4

# EXPERIMENTAL S.C. FILTER PROTOTYPE

#### 4.1. INTRODUCTION

This chapter describes the design of an experimental S.C. filter prototype together with a power amplifier capable of driving off-chip loads. Both circuits should be able to operate from a single 5 Volt supply while dissipating a very low stand-by power.
The filter uses a fairly standard clock rate of 128 kHz in order to keep the design as general purpose as possible, and should achieve a performance level which is compatible with or better than the PCM channel filter requirement. Good PSRR up to high frequency, low distortion, and large dynamic range are the primary targets of the design. In section 4.2 the general filter architecture is described and the main system choices adopted are discussed. In section 4.3 the core amplifier is described. The amplifier structure is analyzed in detail and the design adopted is explained based on the results of Chapter 2. In section 4.4 a power amplifier capable of driving off-chip loads, which was implemented on the same chip with the filter, is described.

# 4.2. FILTER ARCHITECTURES FOR LOW-POWER, LOW VOLTAGE, HIGH DYNAMIC RANGE OPERATION

The filter reported here uses the standard active ladder architecture [16], chosen for its low sensitivity to parameter variations, and utilizes parasitic-free bottom plate S.C. integrators. Some of the main design choices and their motivations are outlined below.

The filter architecture is shown schematically in Fig. 4.1. It realizes a five pole, four zero transfer function which has been shown to be able to satisfy the PCM channel requirement for the transmit filter. The transmission zeros are realized via feed-forward capacitors C1 to C4. As a consequence a total of only 5 core amplifiers are needed. Furthermore, since a fully differential configuration is used, both positive and negative signals are available at the output of each integrator. The sign inversion which is necessary when using the integrator-summer of Fig. 4.2 to implement the transmission zeros is therefore easily achieved by simply crossing the signal paths. The resulting structure is very symmetrical and each integrator realizes an exact LDI transfer function, with the exception of the two integrators implementing the resistive terminations, i.e.
the first and the last one. Due to the relatively high ratio between the clock rate and the integrators' unity gain frequency, the error associated with the non exact terminations was found to be totally negligible by making use of the simulation program DIANA. As a consequence complex conjugated terminations were not implemented, which resulted in a simpler filter layout.

Table 4.1 gives the value of the capacitor ratios for the structure of Fig. 4.1. The sampling capacitors $C_s$ are assumed to have a value of unity. The 6 dB loss inherent in the ladder structure is compensated by introducing an extra sampling capacitor $C_{S\,2}$ at the input of the filter. A gain close to 0 dB in the pass-band is therefore achieved. Such a solution causes a certain amount of peaking (less than 6 dB) at some internal nodes for frequencies close to the band-edge, which degrades the filter linearity for large input signals. On the other hand, by doing this no extra amplification block is needed at the output and the noise level is reduced.

In order to obtain the highest possible performance from a low voltage supply, a fully differential architecture was chosen. The advantages of such a topology, i.e. increased dynamic range, reduced clock feed-through, improved power supply rejection and linearity, have been discussed before [3],[17]. These considerations assume a particular significance in low voltage applications.

Figure 4.1 $5^{th}$ order S.C. filter architecture.

$C_{S} = 1 \ pF$, $C_{J 1} = 12.82 \ pF$, $C_{S 1} = 1.429 \ pF$, $C_{J 2} = 20.8i \ pF$, $C_{J 3} = 19.74 \ pF$, $C_{J 3} = 19.74 \ pF$, $C_{J 4} = 14.04 \ pF$, $C_{J 2} = 8.231 \ pF$, $C_{J 3} = 15.32 \ pF$, $C_{J 4} = 2.513 \ pF$, $C_{I,i} = 12.82 \ pF$

Table 4.1 Capacitor ratios for the $5^{th}$ order S.C. filter.

Figure 4.2 Switched-Capacitor Integrator-Summer.
Furthermore the capability, intrinsic in a fully differential approach, to independently choose the value of the input and output common-mode voltage becomes very important in the design reported here, as will be explained in detail in section 4.3. Since the fully differential structure is expected to reduce the effect of the so called supply capacitance [18] by at least 20 dB, no particular care was taken to reduce the capacitive coupling from the supplies to the summing node of the integrators. As a consequence a simpler structure can be implemented, which both reduces the amplifier size and improves its performance. In particular the value of the input stage peak current is substantially increased in the simpler realization.

The desired common-mode voltage at the input of each integrator, $V_{cmi}$, is established as shown in Fig. 4.3, while the details of how the common-mode output voltage $V_{cmo}$ is defined are given in the next section. Both $V_{cmi}$ and $V_{cmo}$ can be easily generated on chip. For optimum performance (maximum swing) $V_{cmo}$ should be equal to half of the total supply voltage (a single supply is assumed); this is however not the case for $V_{cmi}$. In the present design $V_{cmi}$ is higher than $V_{cmo}$ by an amount approximately equal to the threshold voltage of the n-channel transistors.

A final advantage of a fully differential structure is that it does not require on-chip clock regulation, which results in a substantial saving in area and power [5]. The main disadvantage of such an approach is an increase in the complexity of both the filter and the amplifier structure with respect to a corresponding single-ended realization. In this design a very efficient common-mode feedback circuit limits the power consumption increase to approximately 40%, while the total area required increases by as much as 60 to 70%.
As will be shown in section 4.3, the voltage at the output of each amplifier can swing to within 0.5 Volts from either supply before any performance degradation starts to occur. Since the values of both p and n thresholds are typically larger than 0.5 Volts (even in the absence of body effects), it is necessary to use complementary CMOS switches (transmission gates) at all the amplifier output nodes. As a consequence four different clock signals must be simultaneously present on the chip. The two main non overlapping clocks are externally supplied, while the other two, which are the complement of the main ones, are generated on chip via a pair of CMOS inverters.

Figure 4.3 S.C. Differential Integrator.

The amplifier layout is almost perfectly symmetrical in order to guarantee exact cancellation of the spurious signals coupled into the system. Some small asymmetries were impossible to avoid (cross-coupled devices), but they were all limited to the metal layer. The power level in both the filter and the amplifier can be externally controlled. A simple bias circuit sets the value of all the current sources necessary for the proper biasing of both the forward amplifier and the common-mode feedback structure. Only one bias circuit is present on chip and is shared by all the amplifiers in the filter.

To achieve the maximum possible speed, a class A/B amplifier design has been adopted. As was shown in Chapter 3, the use of a class A/B structure should be considered if, by using a class A solution, the portion of the settling time that the amplifier spends in a slewing mode, $\Delta t_1$, is comparable to or larger than the portion spent in the linear (small signal) mode, $\Delta t_2$. The maximum value of $\frac{\Delta t_1}{\Delta t_2}$, given that a pure class A solution is adopted, is computed from Eq. (3.2). In order to do this the maximum value of the output step $\Delta V_o$ is first determined assuming that $\Delta V_1$ and $\Delta V_2$ have comparable amplitude (see Fig.
4.4). Under these assumptions, and also assuming that $f_{clock} = 128 \ kHz$, $\xi = \frac{\sqrt{3}}{2}$, and n = 1.5, Eq. (3.2) gives $\frac{\Delta t_1}{\Delta t_2} > 2$ for a full swing input signal (4.6 Volts peak-to-peak) of 2 kHz frequency. The total settling time $\Delta t_1 + \Delta t_2$ is then larger than $3 \Delta t_2$. This suggests that if the slewing portion of the response can be eliminated without degrading the linear response, the total settling time can be reduced by more than three times. Such a result motivates the choice of using a class A/B structure. Other advantages of the class A/B topology will be discussed in the next section.

Figure 4.4 Typical S.C. Output Waveform.

Since the core amplifier used in the filter cannot drive large capacitive loads with sufficient speed, the filter outputs must be buffered before they can be taken off-chip. In the prototype reported here two different output buffers were simultaneously implemented to give the maximum possible flexibility. In the first possible configuration the filter outputs are fed to two buffer amplifiers capable of driving up to 5k $\Omega$ and/or 200 pF. Such a circuit will be described in section 4.4. The second possible configuration is shown in Fig. 4.5. Each filter output drives the gate of a large MOS device whose source terminal is brought off-chip. The two output buffer devices are operated in a source follower configuration with their current level established by a pair of high-impedance matched current sources implemented off-chip. In one of the two realized prototypes, to eliminate any distortion related to body effects, each output source follower is located in its individual well, which is tied to the source terminal. Differential-to-single-ended and single-ended-to-differential conversion are realized off-chip by the circuits of Fig. 4.6 and 4.7 respectively [49]. To guarantee that these circuits do not degrade the filter PSRR and linearity they are powered by their own $\pm 15 \, Volts$ supply.

Figure 4.5 Open source output buffer devices.

Figure 4.6

Figure 4.7 Single-ended to differential circuit.

### 4.3. OPERATIONAL AMPLIFIER DESIGN

The filter performance depends for the most part on the characteristics of the operational amplifier (op. amp.) used to realize the S.C. integrator. For this reason particular care has been taken in the design of such a circuit. This has resulted in a fully differential class A/B amplifier using a single-stage topology.

The reasons for choosing a fully differential topology have already been discussed. The only extra advantage not yet mentioned is the absence of any systematic offset at the op. amp. input.

As shown before, the use of a class A/B structure has the potential for achieving the required speed with far less current than is required in a class A solution. A lower current level not only means less power consumption but, in MOS technology, also implies larger gain. Furthermore, some of the extra gain obtained can be traded off for bandwidth by reducing the channel length of some of the output devices; this could allow further reduction of the current level while still achieving the desired speed. Low current in the output stage also implies low $V_{DSAT}$ on the output devices and therefore large swing, while low current in the input stage corresponds to a large value of the ratio $\frac{gm}{I}$, which reduces the input random offset voltage caused by the random mismatch in all the devices other than the input ones [19]. On the other hand, class A/B structures are generally more complicated than class A.

A single stage configuration was chosen primarily because it is particularly suitable for class A/B operation; however, some of its other positive characteristics are also important. In particular, the power supply rejection at high frequency (beyond the dominant pole) is far better than in the multi-stage case.
Furthermore, no high frequency, second stage noise contribution is present, which can be severely damaging in a sampled data system due to aliasing effects. Finally, in the present design the amount of capacitance present at the output of each amplifier is enough to guarantee proper closed-loop response, so that no extra compensation is required. The main drawback of the single stage topology is the need for cascode devices at the output to ensure a sufficient gain, with a consequent reduction of the allowable swing. This can be particularly damaging in low voltage circuits. However, by careful design the problem can be greatly reduced, as is shown below. The details of the particular amplifier used in this project are discussed next.

## 4.3.1. Forward Amplifier

A simplified schematic of the main amplifier without common-mode circuitry is shown in Fig. 4.8. Such a circuit is a modified version of a previously proposed but never realized structure [20]. The common-mode feedback circuit will be considered later. The entire structure is perfectly symmetric about the axis A-A; therefore all of the considerations that can be made about one of the two halves apply totally unchanged to the other one. In this section, for the sake of simplicity, we will always refer to the right hand side of the circuit, unless otherwise specified.

Transistors M1-M4 are the input cross-coupled devices that split the signal into two paths and provide class A/B operation. Devices M5-M8 perform a level-shift operation and provide the proper voltage at the gates of M3 and M4. Since they are forced to carry a constant current I, independently of the input signal, they essentially behave as batteries. With zero differential voltage applied, assuming that $(\frac{Z}{L})_2 = (\frac{Z}{L})_6$ $((\frac{Z}{L})_1 = (\frac{Z}{L})_5)$ and $(\frac{Z}{L})_4 = (\frac{Z}{L})_8$ $((\frac{Z}{L})_3 = (\frac{Z}{L})_7)$, it follows that $I_1 = I_2 = I$ (neglecting output resistance effects).
Since both M11,M14 and M9,M13 are 1:1 current mirrors, the output current is also equal to I. The amplifier power consumption is therefore controlled by the two matched current sources in the input stage.

Figure 4.8 Simplified Schematic of the Main Amplifier.

## 4.3.1.1. Maximum current driving

Next the maximum current driving of the amplifier for a large input signal, $\Delta V_i$, is computed. This is a very important parameter in determining the speed of a class A/B circuit. The limiting factors on the amount of current that the amplifier can deliver to the load are the value of the supply voltage and the size of the largest possible input signal. For a low-voltage, micropower circuit the former is the dominant one, as will be shown below. In all the following, unless otherwise stated, it is assumed that $\Delta V_i$ has a positive polarity, i.e. the non inverting input $(Inp^+)$ is positive with respect to the inverting one $(Inp^-)$ (the results are dual for a negative polarity). As a consequence of the applied input the combined gate-to-source voltage drop across M1 and M4 increases by $\Delta V_i$, while the drop across M2 and M3 decreases by the same amount, which forces $I_2$ to increase by $\Delta I_2$ and $I_1$ to decrease by $\Delta I_1$. Such a variation is mirrored to the output with a gain of 1 via M11,M14 and M9,M13, giving a total output current $(I_L)$ available to charge the load capacitance equal to:

$$I_{L} = \Delta I_{1} + \Delta I_{2} = I_{2}(\Delta V_{i}) - I_{1}(\Delta V_{i}) \tag{4.1}$$

The maximum differential signal that can be expected at the input of the amplifier is close to 1 Volt.
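The behavior described by Eq. (4.1) can be illustrated with a very rough sketch. The fragment below assumes ideal square-law devices in strong inversion, so that the current in each series-connected pair of input devices varies as the square of the total gate overdrive across the pair. This device model, the function names, and the numerical values (a 2 $\mu A$ quiescent current and a 150 mV total quiescent overdrive) are assumptions made only for illustration, and the sketch ignores the limitation imposed by the supply voltage.

```python
def branch_current(i_q, v_ov_q, delta_v):
    """Current in one cross-coupled branch (square-law assumption).

    i_q: quiescent branch current, v_ov_q: total quiescent overdrive of the
    two series devices, delta_v: change of the combined gate drive.
    """
    v_ov = v_ov_q + delta_v
    if v_ov <= 0.0:
        return 0.0                      # branch is cut off
    return i_q * (v_ov / v_ov_q) ** 2


def load_current(i_q, v_ov_q, delta_v_in):
    """Return (I_L, I_1, I_2) following Eq. (4.1): I_L = I_2 - I_1."""
    i_2 = branch_current(i_q, v_ov_q, +delta_v_in)
    i_1 = branch_current(i_q, v_ov_q, -delta_v_in)
    return i_2 - i_1, i_1, i_2


if __name__ == "__main__":
    for dv in (0.0, 0.05, 0.15, 0.3):   # hypothetical input steps in volts
        print(dv, load_current(2.0e-6, 0.15, dv))
```

In this idealized model one branch is completely cut off once the input exceeds the quiescent overdrive, after which the load current coincides with the current of the remaining branch.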
Since any input signal in excess of approximately 150 mV gives $I_1(\Delta V_i) \approx 0$, it follows that the maximum load current $I_L^{MAX}$ is

$$I_L^{MAX} \approx I_2 (\Delta V_i)_{MAX} \tag{4.2}$$

The value of $I_2(\Delta V_i)_{MAX}$ is a strong function of the common-mode input voltage $V_{cmi}$ and is very difficult to obtain analytically; therefore its exact evaluation should be left to computer simulation. $I_2(\Delta V_i)_{MAX}$ can, however, be approximated by the value of $I_2(\Delta V_i)$ at which the first of M4 and M1 (depending on the value of $V_{cmi}$) enters the linear region of operation. From this point on, in fact, $I_2$ can only increase by a relatively small amount. It is intuitively obvious that the largest possible value of $I_2(\Delta V_i)_{MAX}$ for a certain supply voltage $V_{SS}$ is obtained when M4 and M1 enter the linear region of operation simultaneously. It can be shown that the value of $V_{cmi}$ that corresponds to such an optimum situation ($V_{cmi_0}$) is approximately equal to $\frac{V_{SS}}{2} + V_{T1}$, where $V_{T1}$ is the threshold voltage of transistor M1 (M2), and gives rise to a maximum driving current of approximately

$$I_{2}(\Delta V_{i})_{MAX} \approx \frac{1}{2} k' \left(\frac{Z}{L}\right)_{9} (V_{S} - V_{TN})^{2} \tag{4.3}$$

where k' is the n-type transconductance parameter and $V_{TN}$ is the n-type threshold voltage. In the design reported here the above optimum bias condition was chosen. The corresponding peak current can be calculated from Eq. (4.3) for $V_{SS}=5V$, using the actual values of the device sizes together with the process parameters. This gives $I_2(\Delta V_i)_{MAX}=90\mu A$, which is approximately 45 times larger than the nominal stand-by current value.
Such a current level corresponds to an input differential signal of approximately 800 mV, which is smaller than the actual maximum value; this confirms that the limiting factor in achieving the largest possible driving is the value of the supply voltage. Notice that by choosing $V_{cmi}$ to be equal to $V_{cmo}$, i.e. $\frac{V_{SS}}{2}$, as would necessarily be the case if a single ended configuration were used, the maximum current becomes only about $25\mu A$, which is almost 4 times smaller than in the previous case. These results agree fairly well with the simulation.

#### 4.3.1.2. Dynamic Biasing

The calculation above has given the value of the peak input current. In order to have a fast step response, however, the amplifier must be able to deliver all of the above current to the output load. For this to occur the bottom current mirror, M9,M13 (or the top one for an input signal of the opposite sign), should maintain a gain of 1 up to $I_2(\Delta V_i)_{MAX}$ or, equivalently, M13 should stay in the saturation region up to such a current level. If this is not the case the output current limits before it reaches its peak value and the full advantage of class A/B operation is not exploited. The voltage level at the gate of M15 $(V_{BIAS})$ necessary for the above condition to be verified is much higher than the level required for proper operation at the quiescent current level. This can be seen clearly from the plot of Fig. 4.9, where the output characteristics of transistor M13 are shown together with the load lines for two different values of $V_{BIAS}$; $(V_{GS}-V_T)_Q$ and $(V_{GS}-V_T)_{MAX}$ represent the quiescent and the maximum transient value of $(V_{GS}-V_T)$ of transistor M13. For a fixed voltage bias there is a trade-off between swing and current driving capability. In fact, as shown by curve a), to achieve maximum swing M13 has to be biased at the edge of the linear region, i.e.
$V_{DS}=V_{DS}^a$. In such a case, however, the maximum value of the output current, $I_{MAX}^a$, is only a few times the quiescent value, $I_Q$, even for a very large value of $(V_{GS}-V_T)_{MAX}$. On the other hand, to obtain a very large value of the output current the bias condition shown by curve b) has to be chosen. In this case, however, the output swing is greatly reduced since $V_{DS}^b \gg V_{DS}^a$. In a low voltage environment a large output swing is a primary goal in order to keep the dynamic range as high as possible. On the other hand, for micropower applications a large current driving capability is equally important to keep the speed as high as possible. An optimum load line for transistor M13 during transient conditions is shown in Fig. 4.9 as curve c). In this case the peak driving current of curve b) $(I_{MAX}^b)$ is achieved together with the drain-to-source voltage of curve a) $(V_{DS}^a)$. The bias voltage at the gate of device M15 that corresponds to curve c) is given in Eq. (4.4).

$$V_{BIAS} = V_{T 15} + (V_{GS} - V_T)_{13} + (V_{GS} - V_T)_{15} = V_{T 15} + \sqrt{\frac{8 I (\Delta V_i)}{\mu C_o} \left(\frac{L}{Z}\right)_{13}} \tag{4.4}$$

where for simplicity it is assumed that M13 and M15 have the same aspect ratio and that the output voltage is such that M13 is in saturation.

Figure 4.9 Output Characteristics of Transistor M13.

From Eq. (4.4) it is evident that, if body effects are neglected, $V_{BIAS}$ can be generated by the simple circuit of Fig. 4.10, where M30 has an aspect ratio $(\frac{Z}{L})_{30} = \frac{1}{4} (\frac{Z}{L})_{13}$, provided that the current in M30 tracks the output current $I(V_i)$ during all transients. Since $V_{BIAS}$ changes in response to variations of the current level, the above scheme is called "dynamic biasing". Fig. 4.11 is a detailed schematic of the main amplifier. It shows how the tracking current is generated.
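The tracking property of the bias circuit of Fig. 4.10 can be verified with a few lines of Python. The sketch assumes ideal square-law devices with identical thresholds (body effect neglected, as in the text), so that the overdrive of a saturated device is $\sqrt{2I / (\mu C_o \frac{Z}{L})}$. The function names and the numerical values of $\mu C_o$, $\frac{Z}{L}$ and $V_T$ are hypothetical.

```python
import math


def overdrive(i, mu_cox, z_over_l):
    """V_GS - V_T of a saturated square-law device carrying current i."""
    return math.sqrt(2.0 * i / (mu_cox * z_over_l))


def v_bias_required(i, mu_cox, z_over_l_13, v_t):
    """Threshold plus the two equal overdrives of M13 and M15, as in Eq. (4.4)."""
    return v_t + 2.0 * overdrive(i, mu_cox, z_over_l_13)


def v_bias_generated(i, mu_cox, z_over_l_13, v_t):
    """Gate voltage of a diode-connected M30 with one quarter of the
    aspect ratio of M13, carrying the same current i."""
    return v_t + overdrive(i, mu_cox, z_over_l_13 / 4.0)


if __name__ == "__main__":
    mu_cox, z_over_l, v_t = 25.0e-6, 10.0, 0.8   # hypothetical values
    for i in (2.0e-6, 20.0e-6, 90.0e-6):
        print(i,
              v_bias_required(i, mu_cox, z_over_l, v_t),
              v_bias_generated(i, mu_cox, z_over_l, v_t))
```

Under these assumptions the two voltages coincide at every current level, which is why a device with one quarter of the aspect ratio carrying a replica of the output current produces the bias of curve c).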
The current in each one of the two branches of the input structure is mirrored both from the top and from the bottom, therefore giving two perfectly matching currents, one of which is sent directly to the output while the other is used to generate $V_{BIAS}$. In the actual circuit the aspect ratio of the bias devices (M30, M40) is chosen to be smaller than the theoretical minimum given by Eq. 7. The output devices (M15, M16) are therefore biased deeper into the saturation region than the theoretical minimum represented by curve c). This is done to guarantee a sufficiently large voltage gain even for relatively short channel devices and to compensate for the fact that, due to body effect, $(V_T)_{15} > (V_T)_{30}$. As a consequence, the output swing is slightly reduced; nevertheless the amplifier can swing to within 0.5 V of the supply rails while still maintaining a gain of over 83 dB according to SPICE simulation. As a last point, notice that the frequency behavior of a dynamically biased cascode structure is practically identical to that of a fixed-bias one. The same kind of load line as curve c) could be obtained by using the high-swing, high-impedance current mirror shown in Fig. 4.12 [21]. Such an approach requires less power and silicon area but is not feasible in a low-voltage environment.

Figure 4.10 Circuit to generate the bias for the cascode devices.

Figure 4.11 Detailed schematic of the main amplifier.

Figure 4.12 High-swing high-impedance current mirror.

#### 4.3.1.3. Dynamic Behavior

The circuit considered here, like all class A/B circuits, behaves in a very non-linear fashion for large input signals due to the large excursion of the current level during the transient. A closed-form analysis is therefore very difficult, if not impossible, and will not be attempted here. Computer simulation can be used if a very accurate estimation of the time domain response of the circuit is required.
An approximate analysis is performed instead, where the non-linear circuit is modeled by a succession of linear ones whose operating conditions correspond to those of the amplifier during the transient. Such an approach provides some interesting conclusions which are in qualitative agreement with the computer simulation. During the transient the amplifier can be in one of two different regions of operation. In region 1 the input voltage is very small so that both signal paths are active, while in region 2 the input is large enough that one of the two paths is completely shut off and the current level in the other one is equal to $I_1 \gg I_Q$. The open-loop voltage gain in both regions of operation is given by:

$$A_{v} = gm_{eff} R_{o} \tag{4.5}$$

where $gm_{eff}$ is the effective input-to-output transconductance of the amplifier and $R_o$ is the output resistance. The exact value of $gm_{eff}$ is quite complicated; however, it can be easily shown that, to a very good approximation:

$$gm_{eff} \approx \frac{gm_1 gm_4}{gm_1 + gm_4} + \frac{gm_2 gm_3}{gm_2 + gm_3} \quad \text{in region 1} \tag{4.6}$$

$$gm_{eff} \approx \frac{gm_1 gm_4}{gm_1 + gm_4} \quad \text{in region 2} \tag{4.7}$$

The output resistance $R_o$ can be computed with the help of Fig. 4.13, where the voltages $V_{BIAS1}$ to $V_{BIAS4}$ are constant since in computing $R_o$ the input is kept at a fixed voltage. $R_o = R_1 \parallel R_2$ in region 1 and $R_o = R_1$ in region 2, with

$$R_1 \approx gm_{16} r_{o14} r_{o16} \tag{4.8}$$

$$R_2 \approx gm_{15} r_{o13} r_{o15} \tag{4.9}$$

where $r_{oi}$ is the output impedance of transistor i.

Figure 4.13 Circuit used to compute the amplifier output resistance.

The operational amplifier dominant pole is

$$p_1 = \frac{1}{R_o C_L} \tag{4.10}$$

where $C_L$ is the total capacitance at the output node.
For a single-pole roll-off, this gives a unity gain frequency $\omega_{unity}$

$$\omega_{unity} = \frac{gm_{eff}}{C_L} \tag{4.11}$$

For a properly designed circuit, the second pole is associated with the current mirror (either the top or the bottom one) while the third one is associated with the cascode devices (M15 or M16). Both the second and the third pole can be expressed as follows:

$$\omega_{nd} \approx \frac{gm_i}{C_{paras}} \tag{4.12}$$

where $gm_i$ is the transconductance of either the diode-connected device in the active current mirror (M9) or the cascode device (M15), and $C_{paras}$ is the total parasitic capacitance at node B or C respectively, which can, to a good approximation, be considered independent of the current level since its main contribution comes from gate capacitances. The variation of the above quantities (gain, unity gain bandwidth, non-dominant pole frequency) during the transient can be represented by plotting their values as a function of the larger of the two input currents. This is done in Fig. 4.14 under the following basic assumptions. All transistors are assumed to be in the strong inversion region, with the only possible exception of the input devices (M1-M8). The output impedance and the transconductance of an MOS device are assumed to vary with the current as follows [15][19]:

$$gm \propto \sqrt{I} \quad \text{in strong inversion}$$

$$gm \propto I \quad \text{in weak inversion}$$

$$R_o \propto \frac{1}{I} \quad \text{in strong inversion}$$

Figure 4.14 Variation of gain, unity gain bandwidth, and non-dominant pole frequency with the amplifier current level for different input device sizes.

Assuming a fixed value of the quiescent current $I_Q$, there are two possible situations that can occur depending on the size of the input devices. They are shown in Figs. 4.14 a and b. In the following $I_{st}$ represents the value of the current level above which the input devices are operating in strong inversion. Fig.
4.14a corresponds to the case of using very large input devices so that $I_{st} > 4 I_Q$. In Fig. 4.14b the input devices are assumed to be small enough to guarantee that they are biased in strong inversion for $I = I_Q$. In both cases during the transient the current level goes from its quiescent value to its peak value $I_{MAX}$ and back again to $I_Q$ due to the action of the feedback loop. As a consequence the value of $\omega_{unity}$ during the transient is always larger than in stand-by while $A_V$ is always smaller. Such a behavior is particularly favorable for S.C. applications; in fact during the transient the speed is enhanced, while the temporary loss of gain is of no consequence since a large gain is only important at the end of the clock cycle. On the other hand, the relative position of $\omega_{unity}$ and $\omega_{nd}$ during the transient behaves differently in the two cases considered. For the case of Fig. 4.14a, $\omega_{unity}$ and $\omega_{nd}$ change in such a way that the margin of stability present at the stand-by current level is not preserved during the transient. As a consequence, to avoid instability or excessive ringing, the amplifier may need to be overcompensated with a consequent overall speed loss. For the case of Fig. 4.14b the relative position of $\omega_{unity}$ and $\omega_{nd}$ changes in the opposite direction; therefore no stability problems occur if stability is guaranteed at the quiescent current level. The maximum input device size for which the stability margin does not degrade during the transient corresponds to the condition $I_Q = \frac{1}{4} I_{st}$. As a consequence of the behavior outlined above, the size of the input devices cannot exceed some maximum value or the amplifier speed will be compromised and eventually instability can occur. On the other hand, most of the ac characteristics of the op. amp. improve as the size of the input devices is increased. This can be seen with the help of Fig.
4.15, which shows the qualitative behavior of the gain, the unity gain frequency, and of both the $\frac{1}{f}$ and white noise as a function of the input device size for a fixed value of $I_Q$. From this plot, it is clear that optimum performance is achieved when the input devices are in the subthreshold region. Furthermore, to reduce the $\frac{1}{f}$ noise the input device size should be increased as much as possible [22]. The size of the input devices (M1-M8) also affects the frequency behavior of the input cross-coupled structure. It can be shown in fact that such a structure contributes a pole-zero doublet with about 50% separation located in the proximity of the $f_T$ of the input devices. In the subthreshold region, for an MOS transistor of length L and width W with constant current level, $f_T$ is proportional to $\frac{1}{WL}$. Therefore by increasing W the frequency of the doublet is rapidly reduced until it becomes smaller than $\omega_{unity}$. From this point on the settling time of the amplifier begins to be degraded. If speed is a primary concern the value of W should be constrained in order to guarantee that the doublet is outside of the passband. Such a limit can be more or less stringent than the one related to the transient behavior of the op. amp., depending on the size of the load capacitor. The combination of all the above constraints gives rise to an optimum size for the input devices that depends on both the size of the load capacitance and the value of $I_Q$; its exact value can, however, be found only by computer simulation. In practice, when the current level is fairly high and/or $C_L$ is fairly small, it is impossible to increase W sufficiently to bring the input devices into subthreshold while still guaranteeing stability; therefore the optimum condition cannot be reached.
In the design reported here the current level necessary to fulfill the speed requirement is low enough that the optimum condition can be closely approached; in fact the input devices are biased just below threshold, i.e. $I_Q \approx \frac{1}{2} I_{st}$.

Figure 4.15 Variation of the amplifier gain, unity gain frequency, white noise, and $\frac{1}{f}$ noise as a function of the input device size.

#### 4.3.1.4. Noise

The noise performance of the amplifier is of particular concern for two reasons. First, a low supply voltage implies a small voltage swing and therefore a lower dynamic range for a given noise level. Second, low power consumption implies larger noise since the white noise is inversely proportional to the transconductance of the input devices; this again degrades the dynamic range of the amplifier, all else being constant. As was pointed out before, increasing the size of the input devices reduces both the $\frac{1}{f}$ and the wide-band noise. However, two more factors are to be considered if optimum noise performance is sought. First, the input structure has to be as simple as possible; second, the input-referred noise due to all the devices other than the input ones should be made as small as possible (ideally negligible). In this design the input structure (M1-M8) is much more complicated than the classical source-coupled pair (8 devices instead of 2). However, it can be easily shown that the noise power associated with each of the 8 input devices should be divided by 4 when referred back to the input node. The intuitive reason for this is that the noise generated by each one of M1-M8 propagates only through one of the two signal paths while the input signal propagates through both. The overall input-referred noise produced by M1-M8 is therefore equivalent to that of one n plus one p device, which is comparable to that of a source-coupled pair.
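The bookkeeping behind this statement can be sketched in a few lines. This is only an illustrative calculation, assuming that the eight input devices split into four n-type and four p-type transistors (an assumption inferred from the "one n plus one p" equivalence) and that noise powers of uncorrelated devices simply add; the function name and the numeric noise powers are hypothetical.

```python
def input_referred_noise_power(s_n, s_p, n_count=4, p_count=4, path_factor=4.0):
    """Total input-referred noise power of the cross-coupled input structure.

    s_n, s_p    : noise power of a single n-type / p-type input device
    n_count     : number of n-type input devices (assumed 4)
    p_count     : number of p-type input devices (assumed 4)
    path_factor : each device drives one of two signal paths, so its noise
                  power is divided by 4 when referred to the input
    """
    return (n_count * s_n + p_count * s_p) / path_factor


if __name__ == "__main__":
    # Hypothetical per-device noise powers in arbitrary units.
    s_n, s_p = 1.0, 0.5
    total = input_referred_noise_power(s_n, s_p)
    print("eight-device input structure:", total)
    print("one n plus one p device     :", s_n + s_p)
```

Under these assumptions the eight-device structure carries the same input-referred noise power as a single n-type device plus a single p-type device.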
To analyze the noise contributed by the rest of the circuit the entire fully differential structure must be considered (Fig. 4.13). The only devices that contribute appreciable noise, besides M1-M8, are M19, M24, M20, M23, M9, M11, M10, M12, M60a, M60b and those associated with the common-mode feedback structure, which will, however, be considered later. It can easily be shown that the contribution of M60a,b can be made negligibly small by sufficiently increasing their channel length with respect to that of M1-M8. The situation is not quite as simple for the other 8 transistors and will be analyzed in more detail. First notice that, similarly to the input devices, they only affect one signal path and therefore their noise power contribution should be divided by 4 when referred back to the input. This is not the case if the amplifier is operated single-ended. Furthermore, the ratio of the noise power contributed by one of these devices to that of an input transistor is inversely proportional to the ratio of their transconductances for the white component and inversely proportional to the square of the ratio of their channel lengths for the $\frac{1}{f}$ component [22]. In order to simultaneously reduce both kinds of noise the channel length of the current mirror devices should be made as long as possible. This also has a beneficial effect on the voltage gain. On the other hand, the frequency of the first non-dominant pole is proportional to $L^{-\frac{3}{2}}$, where L is the channel length of the current mirror transistors. As a consequence there is a trade-off between noise and gain on one side, and speed on the other. Fortunately for the designer, while p-type transistors are slower than their n-type counterparts, given the same size, they also contribute less noise to the system both because they are intrinsically less noisy [15] and because they have a smaller transconductance. As a consequence their channel lengths can be chosen to be much shorter than those of the n-type ones.
Taking advantage of such a favorable situation, the noise contributed by all current mirror devices can be reduced to a small fraction of the total (about 15% for both $\frac{1}{f}$ and white noise), while at the same time keeping the frequency of the second pole within 40% of the maximum achievable value (all minimum-length devices) with the maximum output swing kept constant. It is interesting to note that, in order to achieve such a result, the device lengths have to be chosen in such a way that the n-type current mirrors are slower than the p-type ones. As a final point notice that, since the cascode devices M15, M16 give a totally negligible noise contribution to the system, their channel length can be made very short (consistent with the gain requirement), thereby improving the frequency response.

#### 4.3.2. Common Mode Feedback Circuit

In a fully differential configuration no common-mode feedback path capable of stabilizing the common-mode value of the internal nodes exists at the system level. As a consequence each op. amp. has to be surrounded by a specialized circuit which performs the above function. The need for a common-mode feedback circuit (CMFBC) is by far the most important drawback inherent in a fully differential approach. Besides requiring extra area and power, the CMFBC typically limits the output swing, increases the noise, and slows down the op. amp. All of the above negative effects become particularly undesirable in a low-voltage, low-power system. For all of the different continuous-time CMFBC configurations proposed to date, a large power consumption is an intrinsic necessity due to the need for devices that behave linearly over large voltage excursions. An alternative approach suitable for sampled-data systems was proposed by Senderowicz et al. [3] and is adopted in this design. The conceptual schematic representation of the circuit is shown in Fig. 4.16.
M4, M5, and M6 are identical devices and therefore they carry the same current I; M1, M2, and M3 are also identical.

Figure 4.16 Simplified dynamic common mode feedback circuit.

To begin with, let us suppose that switches Ma, Mb, Mc, and Md are open. To analyze the behavior of the circuit the feedback loop is broken by assuming that the drains of M1 and M2 are disconnected from the output nodes, and the loop gain is computed. The ac voltage at node A $(V_A)$ as a function of the two output voltages $V_{o1}$ and $V_{o2}$ is given by

$$V_A = V_{o1} \frac{C_1}{C_1 + C_2 + C_p} + V_{o2} \frac{C_2}{C_1 + C_2 + C_p} \tag{4.13}$$

where $C_1$ and $C_2$ are the two common-mode feedback capacitances and $C_p$ is the total parasitic capacitance at node A. From Eq. 4.13 it follows that if $C_1$ and $C_2$ are perfectly matched and $C_p \ll C_1 + C_2$, the common-mode portion of the output signal is transmitted to node A unchanged while the differential portion has no effect on $V_A$. From node A to the common-mode output there is a negative gain whose amplitude is comparable to the forward gain of the amplifier. On the other hand, the gain from node A to the differential output is ideally zero. The loop gain is therefore very large and negative for any common-mode signal while it is extremely small (zero for perfectly matched devices) for any differential signal. This implies that the common-mode output voltage is kept at an almost constant value even in the presence of some common-mode output signal and at the same time the op. amp. differential gain is totally unaffected. The DC value of the common-mode output voltage is, however, not well defined, depending only (if no leakage on the capacitors is assumed) on the initial voltage across $C_1$ and $C_2$. The purpose of $C_{1a}$ and $C_{2a}$ is to establish the voltage drop across $C_1$ and $C_2$ that gives the desired common-mode output and to periodically restore it to compensate for leakages.
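The sensing action of the two capacitors can be illustrated numerically. The sketch below assumes the usual charge-sharing expression for a floating node, in which each output couples through its own capacitor against the total capacitance $C_1 + C_2 + C_p$ at node A; that form, the function names, and all numeric values are assumptions made for illustration only.

```python
def node_a_voltage(v_o1, v_o2, c1, c2, c_p):
    """AC voltage at node A produced by the two outputs through C1 and C2."""
    return (c1 * v_o1 + c2 * v_o2) / (c1 + c2 + c_p)


def split_outputs(v_cm, v_diff):
    """Build the two output voltages from common-mode and differential parts."""
    return v_cm + v_diff / 2.0, v_cm - v_diff / 2.0


if __name__ == "__main__":
    # Hypothetical capacitor values in farads.
    C1 = C2 = 1e-12
    for c_p in (0.0, 0.1e-12):
        cm_only = node_a_voltage(*split_outputs(0.3, 0.0), C1, C2, c_p)
        diff_only = node_a_voltage(*split_outputs(0.0, 1.0), C1, C2, c_p)
        print(f"Cp = {c_p:.1e} F: V_A(cm) = {cm_only:.4f} V, V_A(diff) = {diff_only:.4f} V")
```

With matched capacitors the differential part cancels at node A regardless of $C_p$, while a nonzero parasitic only attenuates the common-mode part slightly; a mismatch between the two capacitors would let some differential signal leak into the common-mode loop.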
In an S.C. application $C_{1a}$ and $C_{2a}$ are switched in phase opposition with respect to the input signal, therefore not interfering with the normal operation of the filter. This CMFBC is particularly suited for low-voltage, low-power applications for two main reasons. First, it does not require any extra power consumption, with the exception of the replica circuit that defines the proper value of $V_A$ (M3, I, and M6), which can be shared among all of the op. amps. in the system. Second, it does not degrade the differential output swing since the level shift operation performed by the capacitors $C_1$ and $C_2$ is not limited by the voltage supplies. There is, however, a trade-off between the minimum noise and the maximum speed achievable. In fact, in order to increase the unity gain frequency of the CMFBC the transconductance of M1 and M2 should be increased which, in turn, introduces more noise (particularly white noise). As a consequence a compromise between noise and speed must be reached. As shown in Fig. 4.16 the top current sources (M4 and M5) are realized with p-type transistors, while the feedback devices M1 and M2 are n-type. This gives a slightly higher noise contribution than the dual configuration, but has another advantage which is very important in micropower applications, as explained below. Due to the very small value of the current supplied by M4 and M5, if no precaution is taken, the output common-mode voltage may enter a slewing mode during the transient and the speed of the CMFBC can be severely degraded. The reason for this is that, due to the different delays associated with the p and n sections of the circuit, some common-mode signal appears at the output during the transient even for a purely differential input. The CMFBC must be able to restore the proper common-mode output value within one clock phase.
While the maximum current that the CMFBC can supply is quite large for a positive signal, it is limited to 2I (about $2\,\mu A$ in this design) in the opposite direction. This implies that the speed of response may be inadequate for the case of a negative common-mode output transient. One solution to this problem is to guarantee that the polarity of the transient common-mode output signal is always positive, which can be done by ensuring that the p-type current mirror is faster than its n-type counterpart. It turns out that, as explained before, due to noise considerations the sizes of the devices are chosen in such a way that the above condition is satisfied. A second solution, probably more efficient but also more complicated, is to operate the CMFBC in class A/B [18].

#### 4.4. OUTPUT BUFFER AMPLIFIER

In the last few years there has been a considerable effort in the design of efficient power amplifiers in both NMOS [3] and CMOS [1],[51]-[54] technologies. Such circuits must be able to drive a relatively large capacitive load and/or a relatively small resistive load and are intended as off-chip drivers, particularly for telecommunications circuits. Typical speed requirements are compatible with audio applications. Common objectives in such designs are power efficiency, large dynamic range, process- and supply-independent power dissipation and, when class A/B structures are used, small crossover distortion. Almost all the above goals become particularly difficult to achieve for a low supply voltage. To date no MOS power amplifier capable of working from a 5 V-only supply has been reported. The present circuit was designed as an off-chip driver in conjunction with the low-pass S.C. filter described in the previous sections. Large output swing and low distortion together with very low quiescent power dissipation are desired for a moderately high driving requirement of 100 pF and/or $5\,k\Omega$.

#### 4.4.1. General design strategy

The two main requirements of this design are to be able to operate with good performance from a low-voltage supply and to achieve a moderately high drive capability (100 pF and $5\,k\Omega$) with low quiescent power dissipation (less than 0.5 mW). Some of the challenging problems posed to the designer by the above two constraints (low power consumption and low-voltage operation) are considered in this section. Different possible solutions are presented and critically compared in order to motivate the basic choices adopted in this design. Some of the considerations of Chapter 3 are repeated here to show their application to a concrete case. The fundamental problem associated with the low-voltage supply is the degradation of the dynamic range due to the reduced swing. Furthermore, design flexibility is reduced since commonly used circuit configurations cannot be employed or become impractical in a low-voltage environment. As a practical example, source followers, which are commonly used in buffer amplifiers to reduce the open-loop output impedance, cannot be utilized since they limit the voltage swing to an unacceptably low value. For the kind of load considered in this paper (a few $k\Omega$) a closed-loop output impedance of a few ohms is necessary. To achieve such a value without using followers, a circuit with a gain equivalent to that of a three-stage topology (100 dB or more) is necessary. Alternatively, a more conventional circuit with a gain equivalent to that of a two-stage structure followed by a unity gain buffer can be used. The two possibilities are shown in Fig. 4.17 a and b. Since speed is of paramount importance in micropower applications, the configuration of Fig. 4.17b has been adopted in the circuit reported in this paper. Furthermore, since the circuit may be expected to have to follow large voltage steps, a class A/B topology has been chosen for both the input and output stages.
In fact, assuming a 2 V step and a value of $(V_{GS} - V_T)$ for the input devices equal to 100 mV, Eq. (3.1) gives $\frac{\Delta t_1}{\Delta t_2} \approx 6$. This implies that if a class A structure were used, more than 85% of the total settling time would be spent in the slewing mode, so that by eliminating the slew delay the total settling time could be reduced by a factor of almost 7. The above result was derived for an amplifier connected in a unity gain feedback configuration. If the amplifier is used in an inverting configuration, on the other hand, the potential improvement associated with a class A/B solution is not quite as large. Nonetheless, even for the latter case, a class A structure will show a dominant slew delay up to a gain of 5.

Figure 4.17 Two possible configurations for the buffer amplifier.

#### 4.4.2. Circuit Implementation

The entire buffer amplifier structure is conceptually shown in Fig. 4.18. The first stage is a class A/B cascode amplifier with a gain of more than 80 dB. The second stage also operates in class A/B and provides an additional 40 dB of gain when no resistive load is present. Although this circuit provides the same gain as a three-stage structure, it only has 2 high-impedance nodes. Closed-loop stability is therefore easily achieved by introducing a pole-splitting compensation capacitance as shown in Fig. 4.18. Furthermore, since both stages are operated in class A/B the circuit will be virtually slew free even under a very high load (up to 1000 pF). In fact, during transient conditions, if a large signal is applied at the input the first and second stage can provide a very large amount of current to the compensation and load capacitance respectively. As shown schematically in Fig. 4.18, at the amplifier input the signal is split into two paths which are then summed at the output of the first stage (node A). This is done to achieve class A/B behavior in the first stage.
The same operation is repeated from node A to the output, thereby achieving class A/B behavior for the second stage. It is important to understand the reason for such an apparently cumbersome topology. Notice that, due to the Miller effect on capacitance $C_c$, the delay associated with node A gives rise to the dominant pole of the amplifier. Since both signal paths are forced to traverse node A, their dominant poles come from the same physical point and therefore are located at exactly the same frequency. Such a perfect matching between the dominant poles of the two paths is crucial to obtain a fast settling response in a class A/B amplifier. In fact, if any mismatch is present, as is always the case when the two paths are compensated separately, a pole-zero doublet is generated in the proximity of the first pole. As has been shown [46], the presence of a doublet within the amplifier passband causes a slow settling component in the closed-loop step response that can enormously degrade the settling time. Computer simulation, using the program SPICE, has shown that such an effect is far more severe for a class A/B circuit, due to its highly non-linear behavior, than what can be expected for a class A structure [46], so that it can almost never be tolerated.

Figure 4.18 Conceptual representation of the buffer amplifier.

The input stage is a single-ended version of the core amplifier described in section 4.2 with a nominal power consumption of 50 $\mu W$, i.e. 10 $\mu A$ total supply current. Such a circuit will not be further discussed here since its behavior was described before. The second stage is shown in detail in Fig. 4.19. The output signal of the first stage (node A) is level shifted down by the voltage follower M20, with a gain slightly less than one, to the gate of the large n-type common-source device M21.
At the same time the same signal is level shifted up via devices M22-M24, with a gain larger than one, to the gate of the large p-type common-source device M25. This gives rise to the class A/B behavior of the circuit and guarantees a large output swing independently of the value of the threshold voltages and body effect (at least for moderate output loads). At equilibrium, i.e. at the end of the transient for a capacitive-only load, or for a resistive load when the output is equal to zero, M25 and M21 carry the same quiescent current $I_Q$, whose value is defined as explained below. During transients, or when the output voltage is different from zero in the presence of a resistive load, the two output devices (M25, M21) are heavily unbalanced, with one of them carrying a very large current (up to many times $I_Q$) while the other one carries a very small one or is totally shut off. The difference between these two currents is delivered to the load and can therefore be many times larger than $I_Q$.

#### 4.4.3. D.C. Behavior

One of the problems in the design of class A/B circuits is the control of the DC quiescent level of the current at the output independently of process and supply variations [52][54]. In the circuit presented here the current in the entire output stage is referenced to the value of current source IA; therefore it can easily be defined and controlled.

Figure 4.19 Circuit schematic for the buffer amplifier output stage.

The four stacked diode-connected devices M20A-M23A provide the proper biasing level at node B so that the current level in M22-M24 (I1) can be kept in a fixed ratio with respect to the reference current source IA. The quiescent output current $I_Q$ is related to I1 (and therefore to IA) by the relative size of transistors M24 and M25.
The nominal values are $IA = 6\,\mu A$, $I1 = 10\,\mu A$, $I_Q = 70\,\mu A$, $IB = 5\,\mu A$, which combined with the first stage requirement gives a total nominal stand-by supply current of about $100\,\mu A$, or a power dissipation of 0.5 mW. As was pointed out before, the ratio between $I_Q$ and I1 (approximately 7 to 1) is easily established by suitably scaling the size of M24 with respect to that of M25. On the other hand, the way the ratio between I1 and IA is defined and maintained is not quite as simple and will be discussed next. With reference to Fig. 4.19, a one-to-one correspondence can be established between devices M20-M23 and M20A-M23A respectively (as their names suggest). In order to maintain the value of the current I1 in a fixed ratio with IA independently of process variations and voltage variations (body bias modulation), the gate and source voltages of corresponding devices (M20 and M20A, M21 and M21A, etc.) should be the same while their sizes $(\frac{Z}{L})$ should be in the same ratio as their respective nominal current levels. This implies that M21A should be about 11 times smaller than M21 while M20A should be slightly bigger than M20 (5 to 6 ratio). High-frequency considerations, however, impose some extra constraints on the device sizes, as will be explained in more detail below. A compromise between the different requirements must be reached. As a consequence a perfect match between node voltages of corresponding devices in the output stage cannot be achieved. Nonetheless, due to the low current level and to the relatively large device sizes used, a mismatch of only a few millivolts between corresponding nodes can easily be achieved while satisfying the other constraints. This gives a very tight control on the absolute value of the output current level and at the same time gives low sensitivity to threshold and supply voltage variations.
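The stand-by budget and the device scaling quoted above can be reproduced with a short calculation. The sketch assumes that IA, I1, $I_Q$ and IB are four separate supply branches of the output stage, that the first stage draws the 10 $\mu A$ quoted earlier from the same 5 V supply, and that M20 carries IB while M21 carries $I_Q$; the dictionary keys and function names are hypothetical.

```python
SUPPLY_VOLTS = 5.0

# Nominal branch currents in microamperes (values quoted in the text;
# treating them as independent supply branches is an assumption).
BRANCH_CURRENTS_UA = {
    "IA": 6.0,
    "I1": 10.0,
    "IQ": 70.0,
    "IB": 5.0,
    "first_stage": 10.0,
}


def standby_budget(currents_ua, supply_volts):
    """Return (total supply current in uA, power dissipation in mW)."""
    total_ua = sum(currents_ua.values())
    power_mw = total_ua * supply_volts * 1e-3  # uA * V = uW, then uW -> mW
    return total_ua, power_mw


def size_ratio(currents_ua, device_branch, replica_branch):
    """Z/L ratio needed for equal gate and source voltages: the ratio of currents."""
    return currents_ua[device_branch] / currents_ua[replica_branch]


if __name__ == "__main__":
    total_ua, power_mw = standby_budget(BRANCH_CURRENTS_UA, SUPPLY_VOLTS)
    print(f"total stand-by current: {total_ua:.0f} uA, power: {power_mw:.3f} mW")
    print(f"IQ / I1      : {size_ratio(BRANCH_CURRENTS_UA, 'IQ', 'I1'):.2f}")
    print(f"M21 / M21A   : {size_ratio(BRANCH_CURRENTS_UA, 'IQ', 'IA'):.2f}")
    print(f"M20 / M20A   : {size_ratio(BRANCH_CURRENTS_UA, 'IB', 'IA'):.2f}")
```

Under these assumptions the branch currents add up to roughly the 100 $\mu A$ and 0.5 mW figures, and the current ratios land close to the 7 to 1, roughly 11 to 1, and 5 to 6 scalings mentioned for the corresponding devices.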
The value of the gain from node A to node C (neglecting body effects) is given by

$$A_{V1} = \frac{gm_{23} gm_{22}}{gm_{23} + gm_{22}} \frac{1}{gm_{24}} \tag{4.14}$$

In order to achieve a large numerical value for $A_{V1}$, M24 should be made very small while M23 and M22 should be made very large. The impedance level at node C, however, should remain low enough to guarantee closed-loop stability and proper frequency behavior. This imposes a lower limit on the size of M24. On the other hand, the current level used in the circuit imposes an upper limit on the size of M23 and M22 to guarantee that they operate in strong inversion. In the present design a value of $A_{V1}$ slightly larger than two was found to be a good compromise between all of the above constraints (a larger value could be achieved if a larger power consumption were acceptable). Such a value, although fairly small, is more than sufficient to compensate for the lower mobility of the p-type output driver (M25) with respect to the n-type one (M21). In fact experimental results demonstrate that the p-portion of the circuit is faster than the n-portion. This is also partially due to the fact that a gain smaller than one exists between node A and node D.

#### 4.4.4. Systematic Input Offset

Fig. 4.20 shows a detailed schematic of the entire amplifier. From this figure it can be seen that the DC voltage at node A is two $V_{GS}$ higher than the negative supply. For a nominal threshold voltage of 0.8 V and neglecting body effects, the quiescent voltage of node A is approximately 1.8 V above the negative supply. For a 5 V total supply this implies that the drain-to-source voltage of M18 is approximately 1.4 V larger than that of M19. Since M19 still operates in the saturation region, the input-referred offset can be approximately obtained by dividing the offset at node A (0.7 V) by the small signal gain of the input stage, which is over 80 dB.
This gives a systematic input offset of only 0.07 mV, which is totally negligible.

Figure 4.20 Detailed schematic of the entire buffer amplifier.

In fact the systematic input offset voltage would still remain within acceptable values, even for worst case parameter variations, if device M20 were removed (together with M20A) and the gate of M21 were connected directly to node A. This would simplify the circuit structure and reduce the impedance level at node B, with beneficial high frequency effects. The negative voltage swing of node A would, however, be severely reduced, which in turn would reduce the maximum current driving capability of transistor M25. For the kind of load to be driven by the circuit reported here the presence of devices M20 and M20A was found to be necessary.

# 4.4.5. Frequency Behavior

The frequency behavior of the entire amplifier can be studied by making use of the simplified model of Fig. 4.21. For the moment let us assume that the output voltage is near ground (or $\frac{V_{SS}}{2}$ for a single supply) so that both signal paths in the output stage are simultaneously active. The frequency of the dominant pole $\omega_1$ is approximately given by

$$\omega_1 \approx -\frac{1}{R_1 C_C \left[A_{V1} gm_{25} + A_{V2} gm_{21}\right] \left[R_L \parallel r_{out}\right]} \tag{4.15}$$

where $r_{out}$ is the small signal output resistance of the parallel combination of M25 and M21. The low frequency gain $A_V$ is

$$A_{V} \approx gm_{I} R_{1} \left[A_{V1} gm_{25} + A_{V2} gm_{21}\right] \left[R_{L} \parallel r_{out}\right] \tag{4.16}$$

where $gm_I$ is the equivalent transconductance of the input stage.
The unity gain frequency $\omega_{unity}$ (assuming a single pole roll-off) is given by

$$\omega_{unity} \approx A_V \ \omega_1 = -\frac{gm_I}{C_C} \tag{4.17}$$

The second pole $\omega_2$ is associated with the output node. Since the load capacitance $C_L$ is much larger than any of the parasitic capacitances in the circuit, its frequency location is given by

$$\omega_2 \approx -\frac{1}{\left[R_L \parallel (A_{V1} gm_{25} + A_{V2} gm_{21})^{-1}\right] C_L} \tag{4.18}$$

Finally the right half plane zero $Z_R$ is given by

$$Z_R \approx -\frac{1}{\left[R_C - (A_{V1} gm_{25} + A_{V2} gm_{21})^{-1}\right] C_C} \tag{4.19}$$

where $R_C$ is the equivalent resistance of devices M30 and M31 in Fig. 4.20, in series with the compensation capacitance.

Figure 4.21 Model used to compute the frequency response of the buffer.

As a first point we notice that the stability of the circuit is not degraded by reducing the value of the load resistance (increasing its conductance). In fact it is slightly improved: while the unity gain frequency is independent of the value of $R_L$, the position of the second pole is inversely proportional to the parallel combination of $R_L$ and $(A_{V1} gm_{25} + A_{V2} gm_{21})^{-1}$. The only negative effect associated with low resistive loads, from the point of view of small signal behavior, is a reduction of the low frequency voltage gain. For the present design the gain from node A to the output varies from over 40 dB for a purely capacitive load to less than 20 dB for the maximum resistive load of $5 \ k\Omega$.

The value of $R_C$ was chosen so as to move the right half plane zero into the left half of the s-plane. No attempt was made to use this zero to cancel the second pole in the transfer function. This is because the position of the second pole experiences large variations due to the wide range of possible values for the load capacitor.
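The load dependence described above can be illustrated numerically. The sketch below simply evaluates Eqs. (4.15) through (4.19) for a purely capacitive load and for a $5 \ k\Omega$ load. The 8 pF compensation capacitor, the 100 pF load and the 1 MHz nominal unity gain frequency are the values quoted for this design; the values of $R_1$, $r_{out}$, $R_C$ and of the combined term $A_{V1} gm_{25} + A_{V2} gm_{21}$ (called `g_out` here) are hypothetical placeholders chosen only for illustration.

```python
import math

def parallel(a, b):
    """Parallel combination of two resistances; math.inf stands for an open circuit."""
    if math.isinf(a):
        return b
    if math.isinf(b):
        return a
    return a * b / (a + b)

def buffer_singularities(gm_i, r1, c_c, g_out, r_out, r_l, c_l, r_c):
    """Evaluate Eqs. (4.15)-(4.19); frequencies are in rad/s.

    g_out stands for the combined term A_V1*gm25 + A_V2*gm21.
    """
    r_load = parallel(r_l, r_out)
    w1 = -1.0 / (r1 * c_c * g_out * r_load)            # Eq. (4.15)
    a_v = gm_i * r1 * g_out * r_load                   # Eq. (4.16)
    w_unity = a_v * w1                                 # Eq. (4.17)
    w2 = -1.0 / (parallel(r_l, 1.0 / g_out) * c_l)     # Eq. (4.18)
    z_r = -1.0 / ((r_c - 1.0 / g_out) * c_c)           # Eq. (4.19)
    return {"w1": w1, "a_v": a_v, "w_unity": w_unity, "w2": w2, "z_r": z_r}

C_C = 8e-12                        # compensation capacitor quoted in the text
C_L = 100e-12                      # nominal load capacitance quoted in the text
GM_I = 2 * math.pi * 1e6 * C_C     # gm_I implied by a 1 MHz unity gain frequency

# Hypothetical values, not taken from the text.
HYPOTHETICAL = dict(r1=2e8, g_out=1e-3, r_out=1e5, r_c=2e3)

results = {}
for r_l in (math.inf, 5e3):
    s = buffer_singularities(gm_i=GM_I, c_c=C_C, c_l=C_L, r_l=r_l, **HYPOTHETICAL)
    results[r_l] = s
    print(f"R_L = {r_l:g} ohm: "
          f"A_V = {20 * math.log10(s['a_v']):.1f} dB, "
          f"f_unity = {abs(s['w_unity']) / (2 * math.pi) / 1e6:.2f} MHz, "
          f"f_2 = {abs(s['w2']) / (2 * math.pi) / 1e6:.2f} MHz")
```

With these placeholder values the unity gain frequency is the same for both loads, the second pole moves to a higher frequency for the resistive load, and the low frequency gain drops, which is the qualitative behavior stated in the text. Choosing `r_c` larger than `1 / g_out` makes `z_r` negative, i.e. it places the zero in the left half plane.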
Furthermore the absolute value of $R_C$ cannot be very easily controlled and has a fairly large voltage dependence, even when it is implemented by a combination of a p-type and an n-type device as shown in Fig. 4.20.

All of the above considerations have implicitly assumed that no excess phase shift is contributed by any of the gain stages shown in Fig. 4.21, i.e. $gm_I$, $A_{V1}$, and $A_{V2}$. The extra pole associated with $gm_I$ can easily be pushed to a frequency much higher than the nominal unity gain frequency of the entire amplifier (1 MHz), as will be seen below, and therefore does not cause any problem. On the other hand, the poles associated with $A_{V1}$ and $A_{V2}$ can cause ringing in the step response or even instability unless care is taken to guarantee that they lie at a sufficiently high frequency. The poles associated with nodes A, C, and D in particular are the most critical and must be carefully considered. This problem, as we noticed before, imposes some additional constraints on the size of the output devices and on their current level which, amongst other things, limit the maximum achievable value of $A_{V1}$ for a given power consumption.

We now consider the behavior of the circuit for an output voltage different from zero, so that only one of the two signal paths in the output stage is active. The exact behavior will depend on the polarity of the output signal, due to the asymmetry that exists between the two signal paths. The qualitative behavior, however, is the same for the two cases and is discussed next. For the sake of simplicity, the following discussion mostly refers to the case of a positive output voltage. As the voltage at node A starts to decrease, the gate overdrive of transistor M21 is reduced and so is its drain current. At the same time the current in M22 through M24 starts to increase and so does the drain current of M25.
If a small signal analysis is performed at different values of the voltage at node A, i.e. at different values of the difference between the instantaneous voltage and the stand-by value, Eq. (4.14) through (4.18) remain formally valid until one of the two paths is cut off. The only difference from the equilibrium condition is that the values of the small signal parameters appearing in these equations change continuously, being functions of the bias conditions of the various devices. When M21 turns off its small signal transconductance becomes zero and the expression $(A_{V1} gm_{25} + A_{V2} gm_{21})$ should be replaced by $(A_{V1} gm'_{25})$, where $gm'_{25}$ is the transconductance of device M25 for the instantaneous value of its drain current. Since the gate overdrive of M21 is only slightly lower (less than 20 mV) than the sum of the overdrives of M22 and M23, when M21 turns off the current in M25 is slightly less than 4 times the quiescent value and its transconductance is almost doubled. If an accurate calculation is made it can be shown that when M21 turns off $A_{V1} gm'_{25}$ is about 20% larger than $(A_{V1} gm_{25} + A_{V2} gm_{21})$, and when M25 turns off $A_{V2} gm'_{21}$ is about 20% smaller than $(A_{V1} gm_{25} + A_{V2} gm_{21})$. From the above considerations it can be inferred that, from equilibrium to the turning off of either M25 or M21, the position of the poles and zeros associated with the output stage remains practically unchanged. From this point on, i.e. when one of the output devices is off and the other is heavily on, $A_{V1} gm'_{25}$ (or $A_{V2} gm'_{21}$) becomes larger than $(A_{V1} gm_{25} + A_{V2} gm_{21})$.
As a consequence the voltage gain increases (as long as $R_L \ll r_{out}$), the unity gain bandwidth remains unchanged (to first order), while both the second pole and the zero move to higher frequencies, therefore improving the stability of the overall amplifier.

# 4.4.6. Maximum output $\frac{dV}{dt}$

The last aspect to be considered in order to define the speed of response of the circuit is the maximum output $\frac{dV}{dt}$ that can be achieved. As we have said, both the input and the output stage are class A/B circuits. As a consequence they are able to deliver to their respective loads a current which is many times larger than its stand-by value. When the total supply voltage is reduced, however, the peak value of the current in each stage is also reduced, due to the limited voltage drop allowed before some device enters the linear region of operation. To increase the value of this peak current the only alternative is to use a very simple circuit structure (few devices stacked between the supplies) and very large device sizes. This is what is done in the output stage of the amplifier, which can deliver more than 2.5 mA to the load. On the other hand, in order to achieve a large gain the input stage structure must be fairly complicated (cascode). Furthermore, due to the very low stand-by current level of the input stage compared with the output stage, very large devices cannot be used in the current mirrors and cascodes if the poles associated with them are to be kept at a sufficiently high frequency. The reason for this is that for an MOS transistor at a given current level the $f_T$ varies with the device width W as $\frac{1}{\sqrt{W}}$. As a consequence of the above situation the peak current in the input stage is much smaller (between 15 and 20 times smaller) than that of the output stage.
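A rough feel for what these peak currents mean in terms of slewing speed can be obtained with the sketch below. It assumes that each stage slews at its peak current divided by the capacitance it has to charge, and it uses the 100 pF nominal load and the 8 pF compensation capacitor quoted for this design. Taking the peak output current as exactly 2.5 mA and sweeping the current ratio over 15, 18 and 20 are illustrative choices within the ranges given in the text.

```python
def slew_rate_v_per_us(i_peak_a, cap_f):
    """Slew rate in V/us for a current i_peak_a (A) charging a capacitance cap_f (F)."""
    return i_peak_a / cap_f * 1e-6

I_OUT_PEAK = 2.5e-3   # output stage peak current (text: "more than 2.5 mA")
C_LOAD = 100e-12      # nominal capacitive load
C_COMP = 8e-12        # compensation capacitor

output_sr = slew_rate_v_per_us(I_OUT_PEAK, C_LOAD)
print(f"output stage: {output_sr:.1f} V/us")

shortfall = {}
for ratio in (15, 18, 20):
    # The input stage peak current is 15 to 20 times smaller and charges C_COMP.
    input_sr = slew_rate_v_per_us(I_OUT_PEAK / ratio, C_COMP)
    shortfall[ratio] = 1.0 - input_sr / output_sr
    print(f"ratio {ratio}: input stage {input_sr:.1f} V/us, "
          f"{100 * shortfall[ratio]:.0f}% slower than the output stage")
```

Because the compensation capacitor is so much smaller than the load, the two slewing speeds come out within a few tens of percent of each other in this simple model, despite the large difference in peak current.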
In a properly designed circuit the slewing speeds $(\frac{dv}{dt})$ of the first and the second stage should be made as close as possible, in order to use both of them to their full potential. The large difference between the peak current driving capability of the input and output stages seems to suggest that the first stage should be much slower in charging its load than the second one. In reality, for the circuit reported here, unlike in a classical two stage pole-split compensated structure, the compensation capacitance is much smaller than the load capacitance. The reason for this is the large difference between the input and the output device current (about 50 to 1) and size (about 5 to 1). In fact, for a nominal capacitive load of 100 pF a compensation capacitor of only 8 pF was used. As a consequence, for a purely capacitive load the input stage is only about 30% slower than the output one in charging its load. Furthermore, the two stages have a very similar speed when both a capacitor and a resistor are present at the output.

#### 4.4.7. Input Common Mode Range

The amplifier input stage is shown in Fig. 4.22. This structure is exactly the same as the one used for the core amplifier (single ended). In the case of a core amplifier the input common mode voltage can be set at an optimum value and it experiences only small variations (a few millivolts). For a buffer amplifier the same is true only if an inverting configuration as in Fig. 4.23a is used. For the non inverting configuration of Fig. 4.23b, on the other hand, the input nodes experience the same voltage excursions as the output. As a consequence the smaller of the input and output common mode ranges determines the maximum acceptable swing.

Figure 4.22 Amplifier input stage.

As can be seen from Fig. 4.20, for the circuit reported here the input common mode swing in the positive direction is very good, while the one in the negative direction is very poor.
In fact such a circuit cannot be used as is in a unity gain buffer configuration (Fig. 4.23a) with satisfactory results. It can, however, be used in the inverting configuration of Fig. 4.23b. Furthermore, some modifications to restore the symmetry of its input common mode range can easily be implemented, as explained below.

### 4.4.8. Possible Modified Input Stages

There are at least two possible modified input stages with symmetrical common mode swing. The first one is very straightforward and is shown in Fig. 4.24. One extra pair of p-type source followers is added in front of the input stage in order to shift the voltage at nodes 1 and 2 up by a p-type threshold. As a consequence the input common mode voltage can swing to within one p-type threshold plus 2 $(v_{GS} - V_T)$ of the positive supply and one n-type threshold plus 2 $(v_{GS} - V_T)$ of the negative supply. Notice that in both the positive and the negative direction the common mode range is almost totally independent of body effects, since the devices that limit the swing have a very small body bias.

Figure 4.23 Inverting and non inverting amplifier configuration.

Figure 4.24 First alternative input stage configuration.

There are, however, some drawbacks associated with this solution. First, the power consumption is slightly increased. This is, however, of no concern since the input stage consumes only a small fraction of the overall power dissipated by the circuit. Second, the noise, both white and $\frac{1}{f}$, is increased. This occurs both because there are two extra devices contributing to the noise (M9 and M10) and because there is a slight signal loss in going from node 1 or 2 to the input nodes. As a consequence the noise associated with the old structure of Fig. 4.22 is slightly amplified when referred back to the new inputs.
However, since the p-type devices are inherently much less noisy [15] and a gain very close to one can be obtained from the voltage followers, no more than 10% noise degradation can be expected in going from the structure of Fig. 4.22 to that of Fig. 4.24. Finally, the frequency response of the circuit of Fig. 4.24 is also degraded with respect to the previous solution, and some extra phase shift from the poles associated with nodes 1 and 2 could occur. Such a degradation, however, like the other two mentioned above, should not be very significant and can be made negligible by slightly increasing the current level in the input stage.

The second possible alternative for a symmetrical input stage is shown in Fig. 4.25. This solution gives the same noise performance as the original circuit (Fig. 4.22) and has even better frequency behavior. As in the case of Fig. 4.24, it requires slightly more power, which is, however, of no concern as was explained before. The main drawback in this case is that there is not an exact one-to-one correspondence between the four biasing devices (M1 through M4) and the four input devices (M5 through M8), as was the case for the circuits of both Fig. 4.22 and Fig. 4.24. This is because corresponding devices, i.e. M1 and M5, M3 and M6, etc., have a different body bias and therefore slightly different thresholds. As a consequence the ratio between the input stage current $I_1$ and the bias current $I$ is a function of both process and voltage variations. In particular, the input current level is a function of the total supply voltage, which deteriorates the PSRR, and is also a function of the common mode input voltage. This last effect can be very troublesome when the circuit is used in the configuration of Fig. 4.23a, since in such a case a variable offset voltage appears at the input for different values of the output voltage.
This looks like a nonlinear gain error and degrades the maximum achievable settling accuracy, which can be a problem for high precision applications. All of these effects are more severe for very low power applications, where the overdrive of the input devices is very small and the relative importance of the threshold variation due to body bias modulation is large. This is particularly true if the input devices are operated in subthreshold [15]. For these reasons the structure of Fig. 4.25 was not used in this design. It is, however, possible that such a circuit could be successfully used for higher power applications.

Figure 4.25 Second alternative input stage configuration.

# CHAPTER 5

# EXPERIMENTAL RESULTS

The feasibility of the techniques discussed in the previous chapter for low-voltage S.C. applications was tested via a classical PCM transmit filter. Two experimental prototype chips were fabricated using two different CMOS technologies. The first one was fabricated by INTEL in a $5\mu m$ CMOS n-well process and we refer to it as the "INTEL chip". The second one was fabricated in our semiconductor lab at the University of California, Berkeley, and it also uses a $5\mu m$ n-well CMOS process. We will refer to this second chip as the "Berkeley chip". A complete set of tests was carried out only for the INTEL chip, mostly because the threshold voltages of the Berkeley chip were out of spec due to a processing problem so that, when operating from a total supply of 5 Volts, the INTEL chip has better performance than the Berkeley one. Very similar performance is, however, obtained for the two realizations when the Berkeley chip is operated from a larger supply. In the following the complete set of results is given only for the INTEL chip. The experimental data for both the INTEL and the Berkeley version of the low-pass S.C. filter will be presented side by side.
In order to guarantee a fair comparison, the results for the Berkeley chip are for a 6 Volt total supply while those for the INTEL chip are for the nominal 5 Volt supply. Only data from the INTEL prototype will be presented for both the core and the buffer amplifier. More data on the Berkeley version of these circuits should become available in the near future and will be presented in a following report, together with more detailed results for the filter in both technologies [55].

Figure 5.1 Microphotograph of the INTEL chip.

#### 5.1. LOW PASS S.C. FILTER

A microphotograph of the entire INTEL chip is shown in Fig. 5.1, while a computer generated plot of the Berkeley chip layout is shown in Fig. 5.2. In a $5\mu m$ technology the active area occupied by the filter is approximately 54 x 72 mils. The INTEL version is actually somewhat larger, since it was derived as a modified version of the original Berkeley layout and is therefore not optimal from the point of view of compaction. Since the INTEL design rules are tighter than the Berkeley ones, an optimum INTEL layout should be smaller than the dimensions given above.

Fig. 5.3 shows a detailed plot of the filter passband of the INTEL filter for different values of the total power dissipation, which ranges from $250\mu W$ to about 4 mW. Fig. 5.4 shows the same set of data for the Berkeley chip; in this case, however, the power ranges from 300 µW to about 5 mW. For very low current levels the amplifier speed is reduced and peaking occurs at the band edge; on the other hand, when the current becomes too high the gain is reduced and drooping occurs. Nevertheless the filter meets the channel filter requirements over a wide range of current levels (25 to 1 change for the INTEL case). The absolute minimum power required to stay within specs is about $350\mu W$, which corresponds to about $70\mu W$ per op amp, for both realizations.
At the nominal value of 5mW total power dissipation the pass-band ripple is approximately 0.12 dB and 0.15 dB for the INTEL and Berkeley realizations respectively. The coarse filter responses for the INTEL and Berkeley chips are shown in Fig. 5.5 a and b respectively. For the first one the transmission zeros are at 4.5 kHz and 6.7 kHz, while for the second they are between 50 and 80 Hz higher. The stop band attenuation in both cases is always more than 34 dB. All of the above data agree fairly well with the simulated results obtained from the program DIANA.

The behavior of both the passband and the overall response for a variation of $\pm 10\%$ in the supply voltage is shown in Fig. 5.6 and 5.7 for the two realizations. As can be seen, the largest variation occurs at the band edge peak, which is typically the most sensitive point, and is about $\pm 0.01$ dB for the INTEL version and $\pm 0.3$ dB, $\pm 0.15$ dB for the Berkeley one. Notice that on the coarse plot of Fig. 5.6b no appreciable variation can be detected, even at the transmission zeros. A much larger variation is visible in Fig. 5.7b for the Berkeley prototype.

Figure 5.2 Computer generated layout plot of the Berkeley chip.

Figure 5.3 Detailed pass-band filter response for the INTEL chip at different values of the total power dissipation: a) 250 $\mu W$, b) 400 $\mu W$, c) 500 $\mu W$, d) 4 mW.

Figure 5.4 Detailed pass-band filter response for the Berkeley chip at different values of the total power dissipation: a) 300 $\mu W$, b) 450 $\mu W$, c) 1 mW, d) 5 mW.

Figure 5.5 Coarse filter response for a) the INTEL chip and b) the Berkeley chip.

Figure 5.6 Change in the detailed and coarse filter response for a $\pm 10\%$ variation in the value of the supply voltage, INTEL chip.

Figure 5.7 Change in the detailed and coarse filter response for a $\pm 10\%$ variation in the value of the supply voltage, Berkeley chip.
The reason for such a behavior is probably that, even for a 6 Volt supply, the larger threshold value in the Berkeley realization causes the current source devices to be biased not very deep in the saturation region. This reduces their effective output impedance, which makes the chip current level more sensitive to power supply variations.

The power supply rejection of the INTEL chip as a function of frequency for both the positive and the negative supply is illustrated in Fig. 5.8 a and b. As can be seen, better than 50 dB rejection at 1 kHz is achieved in both cases. Furthermore, the rejection is always more than 40 dB up to very high frequency. Fig. 5.9 a and b show the positive and negative power supply rejection in the range 0 to 5 kHz for the Berkeley chip. There seems to be some improvement with respect to the INTEL version. It should be noted, however, that for the Berkeley chip these measurements were taken with a total supply voltage of more than 7 Volts. As a consequence all current source devices are operating very deep in saturation and they show a larger output impedance. No higher frequency PSRR data were taken for the Berkeley chip.

In Fig. 5.10 the power supply rejection ratios for both single ended and fully differential outputs are shown together for purposes of comparison. This measurement was made only for the INTEL chip. Fig. 5.10 shows that, for both the positive and the negative supply, an improvement of 20 to 35 dB is obtained by using a fully differential configuration.

The input referred noise spectrum is shown in Fig. 5.11 a and b for the INTEL and Berkeley chips respectively. The total C-message weighted integrated noise is approximately $70\mu V$ in both cases. Such a moderately low value, particularly considering the low power consumption, was achieved because of the careful choice of the device sizes, as was explained in the previous section.

Figure 5.8 Positive and negative PSRR for the INTEL chip.

Figure 5.9 Positive and negative PSRR for the Berkeley chip.

Figure 5.10 Comparison between the single-ended and the fully-differential PSRR for the INTEL version of the filter.

Figure 5.11 Input referred noise spectrum for both realizations.

The total harmonic distortion for a 2 Volt rms differential output at 1 kHz is about -73 dB for the INTEL chip, while for the Berkeley one it is about 10 dB worse. The good linearity of the filter is further shown in Fig. 5.12, where the total harmonic distortion (THD) of the INTEL prototype for the nominal supply voltage of 5 Volts and a 1 kHz input signal is plotted versus the output signal amplitude. Notice that in Fig. 5.12 no data is reported for a differential output voltage smaller than about 4 V p-p. This is because below such a value the distortion level was comparable to the noise of the measuring device. From Fig. 5.12 it can be seen that the THD stays below -40 dB up to a differential output of approximately 4.6 Volts peak (3.3 V rms), i.e. 200 mV from both supply rails. The above result combined with the value of the C-message weighted noise gives a dynamic range of approximately 93 dB, which is comparable with the value achieved by typical commercially manufactured filters operated from ±5 Volt supplies and consuming 10 to 15 times more power. A value just slightly smaller than the above was measured for the dynamic range of the Berkeley version.

The large output swing is primarily due to the use of dynamic biasing for the cascode devices and to the fact that the CMFB circuit behaves linearly even for signals which are larger than the supplies. The very linear CMFB circuit also partially explains the low distortion achieved in the filter. Other factors are, however, also important in improving the filter linearity. In particular, for relatively small signals (more than 1 V from the supplies), the fully differential structure has a primary effect in reducing the THD. This is shown quantitatively in Fig.
5.13 for the INTEL case, where the harmonic content present at the output for a relatively small signal (4.4 V p-p differential output) at 1 kHz is shown for both the single ended and the fully differential configuration. Notice that for ease of comparison the signals in the two cases have been scaled to give the same peak value for the 1 kHz component. As expected, in going from single ended to fully differential the even harmonics cancel out while the odd ones are slightly increased (ideally by 6 dB). However, if for the single ended output the second harmonic is dominant, as is the case for a relatively small signal, then the fully differential configuration will give a lower THD. For the case of Fig. 5.13 the THD is reduced by approximately 12 dB (from -68 dB to -80 dB) by using the fully differential topology.

Figure 5.12 THD versus differential output voltage amplitude (p-p) for the INTEL chip.

Figure 5.13 Distortion for both single-ended and fully differential output.

The above effect, however, is not as substantial when the amplitude of the signal approaches the supply voltage. This is shown in Fig. 5.14, where the same situation as in Fig. 5.13 is shown with the difference that now an 8.4 V p-p differential output voltage is used. In this case, for the single ended topology the second and third harmonics are comparable in amplitude, therefore the improvement associated with the fully differential topology is less than 6 dB. Nonetheless it is interesting to notice that the second harmonic is reduced by more than 20 dB, which demonstrates the good matching of the two signal paths.

A second factor that contributes to reducing the filter distortion is the fact that the nonlinearity associated with the slewing response of the last amplifier in the filter is greatly reduced by using a class A/B configuration [56].

All of the above results with reference to the INTEL chip are summarized in Table 5.1.
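The dynamic range quoted above follows directly from the maximum output swing and the weighted noise. The short sketch below repeats that arithmetic, once for the 4.6 Volt peak differential swing read from Fig. 5.12 and once for the 3.1 V rms entry of Table 5.1; treating the dynamic range as the plain ratio of rms swing to the $70\mu V$ C-message weighted noise is the only assumption.

```python
import math

def to_db(voltage_ratio):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(voltage_ratio)

NOISE_V_RMS = 70e-6                         # C-message weighted noise
SWING_FROM_FIG = 4.6 / math.sqrt(2.0)       # 4.6 V peak differential, as rms
SWING_FROM_TABLE = 3.1                      # V rms for <1% THD (Table 5.1)

dr_fig = to_db(SWING_FROM_FIG / NOISE_V_RMS)
dr_table = to_db(SWING_FROM_TABLE / NOISE_V_RMS)

print(f"rms swing from Fig. 5.12  : {SWING_FROM_FIG:.2f} V")
print(f"dynamic range (Fig. 5.12) : {dr_fig:.1f} dB")
print(f"dynamic range (Table 5.1) : {dr_table:.1f} dB")
```

Both values round to the approximately 93 dB reported for the INTEL chip.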
Since lower and lower supply voltages are expected to be used in future scaled technologies, it is important to reduce the minimum value of the supply required for proper operation. For the INTEL filter, which was fabricated in a conventional (not scaled) process featuring approximately $\pm 0.8 \ V$ thresholds, this minimum value was approximately 3 Volts. Notice that a smaller value could be obtained if a low threshold process had been used, since the limiting factor in this case is the voltage drop across the two diode connected devices present in each of the input bias branches (transistors M5,M7 and M2,M4). The above data is not significant for the Berkeley chip, due to the larger value of the threshold voltage that resulted from some processing problems.

Figure 5.14 Harmonic distortion for both single-ended and fully-differential output for a differential output voltage of 8.4 V p-p.

| PARAMETER | CONDITION | VALUE |
|---------------------------|-----------------------------------|------------|
| MINIMUM POWER DISSIPATION | 5 VOLTS ONLY | 350 µW |
| P.S.R.R. | 1 kHz, + SUPPLY | 56 dB |
| P.S.R.R. | 1 kHz, - SUPPLY | 52 dB |
| TOTAL HARMONIC DISTORTION | 2 V rms differential output, 1 kHz | -73 dB |
| IDLE NOISE | C-MESSAGE WEIGHTED | 70 µV |
| OUTPUT SWING DIFFERENTIAL | <1% THD | 3.1 V rms |
| DYNAMIC RANGE | | 93 dB |

Table 5.1 Summary of the filter performance.

Figure 5.15 Minimum current requirement versus clock rate.

Finally, to test the op amp speed at different current levels, the clock rate was increased from its nominal value (128 kHz) and the current required to achieve a proper filter response was recorded for the INTEL version of the chip. The results of this test are depicted in Fig. 5.15. Notice that, as expected, for low values of the current level the input devices are operated in subthreshold and the op amp unity gain bandwidth (and therefore its speed for small input signals) is proportional to the current level.
On the other hand, for larger currents the input devices are in strong inversion and the speed becomes proportional to the square root of the current. Notice also that, although the op amp was not intended for high speed applications, by simply increasing its current level it can function properly up to clock rates in the 2 MHz range. At such a speed the available time for settling is as low as 200 nsec and the required power consumption becomes approximately 17 mW per op amp.

#### 5.2. CORE AMPLIFIER

The core amplifier as a stand alone circuit was not tested in a detailed way. The only available set of data refers to the INTEL prototype; these data were partially measured and partially inferred from the filter results, and are given in Table 5.2. There are several reasons why, in general, not many direct experimental data are available for the core amplifiers used in most of the S.C. circuits reported in the literature. The first one is that such circuits are unable to drive any off chip load without compromising their stability or speed. For this reason a buffer circuit must be used if any meaningful data are to be taken. The second reason is that probably the best way to test a core amplifier is to use it as part of a larger system, e.g. a S.C. filter, and infer its performance from the performance of the system. This is because it is very difficult, if not impossible, to recreate externally the exact conditions in which the amplifier will be operating. A typical example relating to S.C. filters is the effect of the so called supply capacitance on the filter PSRR. Such a phenomenon can make the filter PSRR much worse than what can be expected from amplifier measurements. A microphotograph of the core amplifier in the INTEL version is shown in Fig. 5.16.
CORE AMPLIFIER SPECIFICATIONS (0-5 Volts Supply)

| PARAMETER | VALUE |
|----------------------|--------------------------------------------------------------------|
| DIFFERENTIAL GAIN | > 10,000 |
| POWER DISSIPATION | 90 µW |
| UNITY GAIN FREQUENCY | 2 MHz |
| NOISE | 140 nV/$\sqrt{\text{Hz}}$ at 1 kHz, 50 nV/$\sqrt{\text{Hz}}$ white |
| OUTPUT SWING | 0.5 Volts from supply |
| AREA | 300 mils<sup>2</sup> |

Table 5.2 Summary of the core amplifier performance.

Figure 5.16 Microphotograph of the INTEL core amplifier.

#### 5.3. BUFFER AMPLIFIER

The only version of the buffer amplifier that was thoroughly tested is the INTEL one, whose microphotograph is shown in Fig. 5.17. For the INTEL realization the circuit is connected as a unity gain buffer (Fig. 5.18a) directly on chip, and only one input and the output node are available off-chip. As a consequence the inverting configuration of Fig. 5.18b could not be implemented and tested. On the Berkeley version, on the other hand, the circuit can be externally connected in any desired configuration. More results on the INTEL chip and a complete characterization of the Berkeley one will be given elsewhere [55].

The nominal testing conditions are a 5 Volt supply, $100 \,\mu A$ supply current, and an output load of $100 \, \mathrm{pF}$ in parallel with $10 \, k\Omega$. All the reported results correspond to these conditions unless explicitly stated otherwise. The D.C. transfer function of the amplifier is shown in Fig. 5.19 together with the values of the two supplies. As expected, the curve is skewed toward the positive supply due to the asymmetry of the input stage. The amplifier can swing to within 0.3 Volts of the positive supply and 2.1 Volts of the negative one (ground). The open-loop voltage gain could not be measured because of the closed loop connection already present on the chip. The simulated result was more than 115 dB for a capacitive-only load and more than 100 dB for the nominal loading condition. The measured unity gain bandwidth was about 900 kHz.
A plot of the ac closed-loop transfer function for different values of the capacitive load, ranging from 100 pF to 700 pF, is shown in Fig. 5.20. Notice that for the nominal capacitive load of 100 pF no peaking occurs in the transfer function. A positive systematic input offset of about 10 mV was measured. The reason for such a behavior is not totally clear.

The power supply rejection ratio (PSRR) for the positive and negative supply is shown in Fig. 5.21 in the range 0 to 100 kHz. The positive PSRR is about 100 dB at D.C. and more than 50 dB at 100 kHz. The negative PSRR, on the other hand, is more than 20 dB worse. The difference between the two supplies in their ability to reject high-frequency noise signals can be explained by the capacitive coupling that exists between the negative supply and the output via the compensation capacitance. The difference in the D.C. value, instead, is believed to be caused by the fact that the current level in the circuit is externally controlled by applying a D.C. voltage at the gates of the n-type current-source devices. As a consequence any noise appearing at the negative rail changes the gate-to-source voltage of the current-source devices and therefore the current level in the circuit.

Figure 5.17 Microphotograph of the INTEL buffer amplifier.

Figure 5.18 Unity-gain non-inverting buffer amplifier configuration.

Figure 5.19 DC closed-loop transfer function for the buffer amplifier together with the supply voltages.

Figure 5.20 AC closed-loop transfer function for different loads: a) 100 pF, b) 200 pF, c) 300 pF, d) 400 pF, e) 500 pF, f) 700 pF.

The buffer amplifier input-referred noise density in the frequency range 0 to 50 kHz is shown in Fig. 5.22. The two traces in the figure correspond to different values of the total supply current. As expected, increasing the current level reduces the white portion of the noise.
In fact at 50 kHz the noise is about $50 \frac{nV}{\sqrt{Hz}}$ for $100\mu A$ of supply current and $90 \frac{nV}{\sqrt{Hz}}$ for $40\mu A$. Such a large change indicates that the input transistors in the buffer are operating in the subthreshold region while the other devices are still in strong inversion. In such a case, in fact, the transconductance of the input devices is reduced in proportion to the reduction of the current level. Furthermore, the contribution of the devices other than the input ones to the overall input-referred noise increases when the current level is reduced, since their transconductance is only reduced in proportion to the square root of the current reduction. At 1 kHz the $\frac{1}{f}$ component of the noise is dominant and the input-referred noise density is about 170 $\frac{nV}{\sqrt{Hz}}$ in both cases.

Figure 5.21 Positive and negative PSRR for the buffer amplifier.

Figure 5.22 Buffer amplifier input referred noise.

The step response of the circuit was evaluated for different loading conditions. In all cases a 2.5 Volts step in both the positive and the negative direction is applied to the amplifier (connected as a unity-gain buffer) while the total supply voltage is 5 Volts. For the nominal condition of operation (100 $\mu A$ supply current, and 100 pF in parallel with a 10 k$\Omega$ load) the response is shown in Fig. 5.23a. The settling time to 0.5 % is about 1 $\mu$sec for the positive step and 1.5 $\mu$sec for the negative one. Fig. 5.23b shows the step response for the same load condition but a total supply current of only $35\mu A$. In this case the settling time is less than 2 $\mu$sec in the positive direction and about 3 $\mu$sec in the negative one. Neither Fig. 5.23a nor Fig. 5.23b shows any overshoot in the step response, indicating a very stable situation.

Figure 5.23 Buffer amplifier step response with a 100 pF load capacitance for two different values of the total current level.
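The noise and settling figures quoted above can be cross-checked with two back-of-the-envelope estimates. The sketch below is not part of the measurements. It assumes that the white noise is set by the input devices alone, with a density proportional to $1/\sqrt{gm}$, and that the closed-loop buffer settles like a single-pole system whose bandwidth equals the measured 900 kHz unity-gain frequency. The function names are illustrative only.

```python
import math


def scaled_white_noise(noise_ref, i_ref, i_new, gm_exponent):
    """Scale a white-noise density from current i_ref to current i_new.

    Assumption: the density varies as 1/sqrt(gm), with gm proportional to
    I**gm_exponent (1.0 for subthreshold, 0.5 for strong inversion).
    """
    gm_ratio = (i_new / i_ref) ** gm_exponent
    return noise_ref / math.sqrt(gm_ratio)


def single_pole_settling_time(bandwidth_hz, accuracy):
    """Time for a single-pole step response to settle within `accuracy`."""
    tau = 1.0 / (2.0 * math.pi * bandwidth_hz)
    return tau * math.log(1.0 / accuracy)


# Measured values: 50 nV/sqrt(Hz) at 100 uA and 90 nV/sqrt(Hz) at 40 uA.
sub = scaled_white_noise(50.0, 100e-6, 40e-6, gm_exponent=1.0)
strong = scaled_white_noise(50.0, 100e-6, 40e-6, gm_exponent=0.5)
print(f"input devices in subthreshold:     {sub:.0f} nV/sqrt(Hz)")
print(f"input devices in strong inversion: {strong:.0f} nV/sqrt(Hz)")

# Measured values: 900 kHz unity-gain bandwidth, settling to 0.5 %.
t_settle = single_pole_settling_time(900e3, 0.005)
print(f"single-pole settling to 0.5 %: {t_settle * 1e6:.2f} usec")
```

Under these assumptions the subthreshold scaling gives about 79 $\frac{nV}{\sqrt{Hz}}$ at $40\mu A$ and the strong-inversion scaling about 63 $\frac{nV}{\sqrt{Hz}}$. The subthreshold value is closer to the measured 90 $\frac{nV}{\sqrt{Hz}}$, and the remaining gap is in the direction expected from the growing contribution of the other devices. The single-pole estimate of roughly 0.94 $\mu$sec is of the same order as the measured settling time for the positive step.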
Fig. 5.24a and b show the same comparison as Fig. 5.23a and b (step response for 100 and 35 $\mu A$ supply current, respectively), but with a capacitive load of 200 pF. Notice that in both Fig. 5.24a and b there is a slight overshoot, which indicates that the margin of stability is reduced. Nonetheless, even for the lower current level and for the step polarity that gives the slower response (negative), the settling time is less than 4.5 $\mu$sec. Fig. 5.25a and b show the step response for a capacitive load of 500 and 1000 pF respectively. As can be seen, although strong ringing occurs, the circuit still remains stable up to such a value of the load.

From Fig. 5.23 to Fig. 5.25 it can be seen that no slewing occurs even for very large capacitive loads. In fact a slope of 2 Volts per $\mu$sec in the rising edge of the step (and just slightly less in the falling edge) was measured for a 1000 pF load. This implies a peak current of about 2 mA at the output, which is more than 30 times the stand-by value. For the nominal load a slope of about 10 Volts per $\mu$sec was achieved.

Finally, the amplifier was able to drive a 2.7 k$\Omega$ resistor to within 1 Volt of the positive supply with 0.1 % accuracy. The same test for the negative supply could not be made because of the swing limitation of the input stage. The main specifications achieved by the buffer amplifier are summarized in Table 5.3.

Figure 5.24 Buffer amplifier step response with a 200 pF load capacitance for two different values of the total current level.

Figure 5.25 Step response for 500 pF and 1000 pF load.
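The peak output current quoted in the slewing discussion follows from $I = C \frac{dV}{dt}$. The short sketch below repeats that arithmetic for the two measured slopes. It considers the capacitive part of the load only and neglects the current in the load resistor, which is a simplifying assumption. The function name is illustrative.

```python
def peak_output_current(c_load_farads, slope_volts_per_sec):
    """Current needed to move a purely capacitive load at a given slope."""
    return c_load_farads * slope_volts_per_sec


# 1000 pF load, measured slope of 2 V/usec.
i_large_load = peak_output_current(1000e-12, 2e6)
# Nominal 100 pF load, measured slope of 10 V/usec.
i_nominal_load = peak_output_current(100e-12, 10e6)

print(f"1000 pF at  2 V/usec: {i_large_load * 1e3:.1f} mA")
print(f" 100 pF at 10 V/usec: {i_nominal_load * 1e3:.1f} mA")
```

The first case reproduces the 2 mA peak current stated in the text. The second shows that, under the same assumption, the nominal capacitive load requires about 1 mA at the measured slope.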
**BUFFER AMPLIFIER SPECIFICATIONS (0-5 Volts Supply) (100 pF, 10 k$\Omega$ Load)**

| Parameter | Value |
|---|---|
| DIFFERENTIAL GAIN | > 110 dB (simulated) |
| POWER DISSIPATION | 500 µW |
| UNITY GAIN FREQUENCY | 0.9 MHz |
| NOISE | 170 nV/√Hz at 1 kHz; 50 nV/√Hz white |
| OUTPUT SWING | 0.2 Volts from +Supply; 2.1 Volts from -Supply |
| OFFSET | +10 mV (systematic) |
| PSRR+ | > 100 dB at DC; > 50 dB at 100 kHz |
| PSRR- | 75 dB at DC; 30 dB at 100 kHz |
| SETTLING TIME (0.5 %, 2.5 V Step) | < 1 µsec +Step; < 1.5 µsec -Step |
| MAX CURRENT DRIVING | > 2 mA |
| MAX LOAD | 1000 pF, 2.7 kΩ |
| MAX OUT dV/dt | 10 V/µsec |
| AREA | 500 mils<sup>2</sup> |

Table 5.3 Summary of the buffer amplifier performance.

#### CHAPTER 6

#### SUMMARY AND CONCLUSIONS

In this dissertation we have explored new design techniques for low-power, low-voltage analog circuits intended to be used in S.C. systems.

In chapter 2 the performance limitations of an S.C. integrator were investigated. The results obtained were applied to the case of a low-pass S.C. filter. This analysis showed that the absolute minimum values of both power consumption and silicon area for a given dynamic range are orders of magnitude smaller than the actual values of commercially manufactured filters.

In chapter 3 a general overview of the design of low-power MOS analog circuits was developed. Different architectural alternatives were critically compared.

Chapter 4 describes the design of an experimental S.C. filter prototype that operates from a single 5 Volts supply and consumes a fraction of the power used by similar commercial circuits. The main objective of the design was to explore the level of performance achievable in an analog system when the supply voltage is reduced. For ease of comparison, a standard 5th order filter to be used as a transmit PCM channel filter was chosen for the actual chip to be fabricated.

The experimental results from the prototype are given in chapter 5.
The device operates from a single 5 Volts supply and achieves a performance level comparable to or better than that of similar commercial systems operating from a ±5 Volts supply. For the achieved dynamic range, both power consumption and chip area approach the theoretical minimum values obtained in chapter 2 more closely than previous implementations.

This research has shown the feasibility of realizing high-performance S.C. circuits in a low-voltage environment. Such a result represents a first step toward the realization of an analog/digital interface on the periphery of a large digital chip, e.g. a microprocessor, fabricated in a scaled VLSI technology and operated from a low supply voltage. More work needs to be done in order to achieve the above result. In particular, sufficiently high-performance A/D and D/A converters working from a low supply have to be designed. Furthermore, the amount of area consumed by these circuits, i.e. filters, A/D, D/A, etc., must be reduced, since only a small fraction of the overall chip area can be allocated to them.

For the present design the area of the filter is still far greater than what is acceptable to make the above goal feasible. The circuit was, however, fabricated in a not very advanced 5 µm technology. Furthermore, a fairly loose layout was used for ease of debugging and checking (no layout rule checker or extractor was available at the time the first layout was made). It is believed that in a scaled technology (2-3 $\mu m$ minimum feature) and using a more compact layout the overall chip area could be reduced by a factor of 3 to 5. To achieve further reduction a simpler circuit architecture for the core amplifier must be found; in particular, the CMFB circuit may have to be reconsidered.

It can also be expected that some aspects of the performance, in particular the dynamic range, will deteriorate in a scaled technology. Keeping the performance at a sufficiently high level will therefore require further work.
#### APPENDIX A

#### ENERGY DRAWN FROM THE SUPPLY IN ONE CYCLE

With reference to Fig. A.1, the energy drawn from the supplies during one clock period is first computed. Assume a positive input signal $v_i(t)$ (the result is dual for a negative one) and call $i_i(t)$ the current in $C_S$ and $i_o(t)$ the current in $C_i$, as shown in Fig. A.1; then $i_i(t) = I_1(t)$ and $i_o(t) = -I_2(t)$. The amount of energy drawn from the supplies during one clock period, $\epsilon_{clock}$, is given by

$$\begin{aligned}
\epsilon_{clock} &= V_{CC} \int_{nT}^{(n+1)T} I_{1}(t)\, dt - V_{EE} \int_{nT}^{(n+1)T} I_{2}(t)\, dt \\
&= V_{CC} \left\{ Q_{S} \left[ (n+1)T \right] - Q_{S} \left[ nT \right] \right\} - V_{EE} \left\{ Q_{i} \left[ (n+1)T \right] - Q_{i} \left[ nT \right] \right\}
\end{aligned} \tag{A1.1}$$

where $Q_S(nT)$ ($Q_i(nT)$) is the charge on $C_S$ ($C_i$) at $t = nT$. Assuming that $\phi_2$ is on for $nT \le t < (n + \frac{1}{2})T$ and that $\phi_1$ is on for $(n + \frac{1}{2})T \le t < (n+1)T$, then

$$Q_S (nT) = 0, \qquad Q_S \left[ (n+1)T \right] = C_S\, v_i \left[ (n+1)T \right] \tag{A1.2}$$

From charge conservation at the amplifier summing node it follows that

$$Q_{i} \left[ (n+1) T \right] - Q_{i} \left[ nT \right] = -C_{S}\, v_{i} \left[ nT \right] \tag{A1.3}$$

The total energy drawn from the two supplies during one clock cycle is therefore

$$\epsilon_{clock} = V_{CC} C_S v_i \left[ (n+1)T \right] + V_{EE} C_S v_i \left[ nT \right] \tag{A1.4}$$

Figure A.1 Model to compute the energy drawn from the supplies in one cycle.

For a sinusoidal input signal of peak amplitude $V_i$ and frequency $f$, the total amount of energy drawn during a full cycle of the input signal, $\epsilon_{cycle}$, is

$$\epsilon_{cycle} = 2 C_S V_i (V_{CC} + V_{EE}) \sum_{n=0}^{M} \sin\left(n \frac{\pi}{M}\right) \tag{A1.5}$$

where $M = \frac{f_{clock}}{2 f}$. For simplicity, $M$ is assumed in the following to be an integer. The summation appearing in Eq. (A1.5) is evaluated below.
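Before the closed form is derived, the sum in Eq. (A1.5) can be checked numerically. The sketch below compares the direct summation with $\cot\left(\frac{\pi}{2M}\right)$ and with the large-$M$ approximation $\frac{2M}{\pi} = \frac{f_{clock}}{\pi f}$ obtained in the derivation that follows. The values of $M$ are arbitrary test values, not taken from the filter design.

```python
import math


def direct_sum(m):
    """Sum of sin(n*pi/m) for n = 0..m, as it appears in Eq. (A1.5)."""
    return sum(math.sin(n * math.pi / m) for n in range(m + 1))


def closed_form(m):
    """Closed form cot(pi / (2*m))."""
    return 1.0 / math.tan(math.pi / (2 * m))


def large_m_approx(m):
    """Approximation 2*m/pi, i.e. f_clock / (pi*f) with m = f_clock / (2*f)."""
    return 2 * m / math.pi


for m in (4, 16, 64):
    print(f"M = {m:3d}: direct = {direct_sum(m):.4f}, "
          f"cot = {closed_form(m):.4f}, 2M/pi = {large_m_approx(m):.4f}")
```

The direct sum and the cotangent agree to rounding error for every $M$, while the $\frac{2M}{\pi}$ approximation improves as the clock frequency becomes large compared with the signal frequency.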
Noticing that $\sin x = \operatorname{Im} \left[ e^{jx} \right]$, it follows that

$$\begin{aligned}
\sum_{n=0}^{M} \sin\left(n \frac{\pi}{M}\right) &= \operatorname{Im}\left[\sum_{n=0}^{M} e^{j n \frac{\pi}{M}}\right] \\
&= \operatorname{Im} \left[ \frac{1 - e^{j\frac{\pi}{M}(M+1)}}{1 - e^{j\frac{\pi}{M}}} \right] = \operatorname{Im} \left[ \frac{1 + e^{j\frac{\pi}{M}}}{1 - e^{j\frac{\pi}{M}}} \right] \\
&= \frac{\sin\left(\frac{\pi}{M}\right)}{1 - \cos\left(\frac{\pi}{M}\right)} = \cot\left(\frac{\pi}{2M}\right)
\end{aligned} \tag{A1.6}$$

Since $\cot(x) \approx \frac{1}{x}$ if $x \ll 1$, by making use of Eq. (2.2) in the above result it follows that

$$\sum_{n=0}^{M} \sin\left(n \frac{\pi}{M}\right) \approx \frac{f_{clock}}{\pi f} \tag{A1.7}$$

Substituting Eq. (A1.7) in Eq. (A1.5) gives

$$\epsilon_{cycle} = \frac{2}{\pi} C_S V_i \left( V_{CC} + V_{EE} \right) \frac{f_{clock}}{f} \tag{A1.8}$$

Assuming symmetrical supplies, i.e. $V_{CC} = V_{EE} = V_s$, the final result is obtained:

$$\epsilon_{cycle} = \frac{4}{\pi} C_S V_i V_s \frac{f_{clock}}{f} \tag{A1.9}$$

#### APPENDIX B

#### CALCULATION OF $\frac{\Delta t_2}{\Delta t_1}$

In the following analysis the structure of Fig. B.1 is used to represent a classical two-stage operational amplifier, in the same way as is done by Gray and Meyer [19]. The following assumptions are also used:

- 1) The slew rate of the amplifier is limited by the input stage current available to charge the compensation capacitor.
- 2) The input stage is modeled as shown in Fig. B.1, with a maximum available current $I_{xm}$ and a transfer characteristic with slope $gm_I$ for values of the input signal smaller than $\frac{I_{xm}}{gm_I}$.

It can be shown [19] that the following very basic relationship exists between the slew rate (SR) and the unity-gain frequency of the amplifier:

$$SR = \frac{dV}{dt} = \frac{I_{xm}}{gm_I} \omega_{unity} \tag{A2.1}$$

where $I_{xm}$ is the maximum current that the input stage can deliver and $gm_I$ is the transconductance of the input stage.
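As a numerical illustration of Eq. (A2.1), the sketch below evaluates the slew rate for one set of values. The current, transconductance and unity-gain frequency used here are hypothetical and chosen only to make the arithmetic concrete; they are not measurements from this work.

```python
import math


def slew_rate(i_xm, gm_i, f_unity_hz):
    """Eq. (A2.1): SR = (I_xm / gm_I) * omega_unity, returned in V/s."""
    omega_unity = 2.0 * math.pi * f_unity_hz
    return (i_xm / gm_i) * omega_unity


# Hypothetical values, for illustration only.
I_XM = 10e-6     # maximum input stage current, A
GM_I = 50e-6     # input stage transconductance, A/V
F_UNITY = 2e6    # unity-gain frequency, Hz

sr = slew_rate(I_XM, GM_I, F_UNITY)
print(f"I_xm/gm_I = {I_XM / GM_I:.2f} V, SR = {sr / 1e6:.2f} V/usec")
```

Only the ratio $\frac{I_{xm}}{gm_I}$, which has the dimension of a voltage, enters the result, so for a given unity-gain frequency the slew rate changes with the bias current only to the extent that this ratio changes.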
For an MOS amplifier with a classical differential pair as the input stage it follows that:

$$I_{xm} = I \tag{A2.2}$$

$$gm_I = \frac{2\left(\frac{I}{2}\right)}{(V_{GS} - V_T)_{inp}} \tag{A2.3}$$

where $I$ is the tail current of the differential pair and $(V_{GS} - V_T)_{inp}$ is the voltage overdrive of the input devices at equilibrium.

Figure B.1 Model for a class A amplifier.

From Eq. (A2.1)-(A2.3) it follows that:

$$\Delta t_2 = \frac{\Delta V_o}{SR} = \frac{\Delta V_o}{(V_{GS} - V_T)_{inp} \ \omega_{unity}} \tag{A2.4}$$

where $\Delta V_o$ is the height of the output voltage step. Even though Eq. (A2.4) has been derived for a classical two-stage structure, it can be shown to be much more general: it applies to most class A amplifiers reported in the literature (both one and two stage).

Let us now introduce two more assumptions, which are consistent with the model used by Chuang [45]:

- 3) The amplifier ac transfer function is well represented by a two-pole system, i.e. the singularities beyond the second pole contribute a negligible phase shift at the unity-gain frequency $\omega_{unity}$.
- 4) $\frac{\omega_2}{4} < \omega_{unity} < \omega_2$, i.e. $1 > \xi > \frac{1}{2}$, where $\omega_2$ is the frequency of the second pole and $\xi$ is the damping factor of the closed-loop step response.

It has been shown that [45]:

$$\Delta t_1 \approx \frac{2}{\omega_2} \ln \frac{1000\, I_{xm}}{gm_I\, \Delta V_o} \tag{A2.5}$$

where the factor 1000 comes from the fact that an accuracy equal to 0.1 % of the voltage step has been assumed.

For a second-order system the time-domain response is characterized by a damping factor $\xi$ whose value is given by:

$$\xi \approx \frac{\sqrt{\omega_2}}{2\sqrt{\omega_{unity}}} \tag{A2.6}$$

From Eq.
(A2.5) and (A2.6) it follows that

$$\Delta t_1 \approx \frac{1}{2\, \omega_{unity}\, \xi^2} \ln \frac{1000\, (V_{GS} - V_T)_{inp}}{\Delta V_o} \tag{A2.7}$$

therefore

$$\left(\frac{\Delta t_2}{\Delta t_1}\right) = \frac{\frac{\Delta V_o \ 2 \ \xi^2}{(V_{GS} - V_T)_{inp}}}{\ln \frac{1000\, (V_{GS} - V_T)_{inp}}{\Delta V_o}} \tag{A2.8}$$

The maximum value of $\left(\frac{\Delta t_2}{\Delta t_1}\right)$ is reached when the MOS amplifier operates in the subthreshold region. In such a case we have that:

$$gm_I = \frac{\left(\frac{I}{2}\right)}{\frac{n \ kT}{q}} \tag{A2.9}$$

and

$$\frac{I_{xm}}{gm_I} = \frac{2 n kT}{q} \tag{A2.10}$$

where $n$ is the subthreshold coefficient, defined as follows:

$$n = 1 + \frac{C_d}{C_{ox}} \tag{A2.11}$$

where $C_d$ is the surface depletion capacitance per unit area and $C_{ox}$ is the oxide capacitance, also per unit area.

From Eq. (A2.5), (A2.6), (A2.9), (A2.10) and (A2.11) it follows that:

$$\left(\frac{\Delta t_{2}}{\Delta t_{1}}\right)_{\text{max}} = \frac{\frac{\Delta V_{o} \ q \ \xi^{2}}{n \ kT}}{\ln \frac{1000 \cdot 2 \ n \ kT}{q \ \Delta V_{o}}} \tag{A2.12}$$

#### REFERENCES

- [1] W.C. Black, Jr., D.J. Allstot, R.A. Reed, "A High Performance Low Power CMOS Channel Filter," IEEE J. Solid-State Circuits, vol. SC-15, pp. 929-938, Dec. 1980.
- [2] Y.A. Haque, R. Gregorian, D. Blasco, R. Mao, and W. Nicholson, "A Two-Chip PCM Codec with Filters," IEEE J. Solid-State Circuits, vol. SC-14, pp. 961-969, Dec. 1979.
- [3] D. Senderowicz, S.F. Dreyer, J.H. Huggins, C.F. Rahim, and C.A. Laber, "A Family of Differential NMOS Analog Circuits for a PCM Codec Filter Chip," IEEE J. Solid-State Circuits, vol. SC-17, pp. 1014-1023, Dec. 1982.
- [4] R. Gregorian, and W.A. Nicholson, Jr., "CMOS Switched-Capacitor Filters for a PCM Voice CODEC," IEEE J. Solid-State Circuits, vol. SC-14, pp. 970-980, Dec. 1979.
- [5] H. Ohara, P.R. Gray, W.M. Baxter, C.F. Rahim, and J.L. McCreary, "A Precision Low-Power PCM Channel Filter with on Chip Power Supply Regulation," IEEE J.
Solid-State Circuits, vol. SC-15, pp. 1005-1013, Dec. 1980.
- [6] D.G. Marsh, B.K. Ahuja, M.R. Dwarakanath, P.E. Fleisher, and V.R. Saari, "A Single-Chip CMOS PCM CODEC with Filters," IEEE J. Solid-State Circuits, vol. SC-16, pp. 308-315, Aug. 1981.
- [7] L.T. Lin, H.F. Tseung, D.B. Cox, S.S. Viglione, D.P. Conrad, and R.G. Runge, "A Monolithic Audio Spectrum Analyzer," IEEE J. Solid-State Circuits, vol. SC-18, pp. 40-45, Feb. 1983.
- [8] S. Wong, and C.A.T. Salama, "Impact of Scaling on MOS Analog Performance," IEEE J. Solid-State Circuits, vol. SC-18, pp. 106-114, Feb. 1983.
- [9] B.J. Hosticka, "Dynamic CMOS Amplifiers," IEEE J. Solid-State Circuits, vol. SC-15, pp. 887-894, Oct. 1980.
- [10] M.G. Degrauwe, J. Rijmenants, E.A. Vittoz, and H.J. De Man, "Adaptive Biasing CMOS Amplifiers," IEEE J. Solid-State Circuits, vol. SC-17, pp. 522-528, June 1982.
- [11] F. Krummenacher, "Micropower SC Biquadratic Cell," IEEE J. Solid-State Circuits, vol. SC-17, pp. 507-512, June 1982.
- [12] M.G. Degrauwe, and W.C. Sansen, "A Multipurpose Micropower SC Filter," IEEE J. Solid-State Circuits, vol. SC-19, pp. 343-348, June 1984.
- [13] H. Pinier, F. Krummenacher, and V. Valencic, "A μP Sixth-Order SC Leapfrog Low-Pass Filter," in Proc. ESSCIRC '82, pp. 223-225, Sept. 1982.
- [14] B.J. Hosticka, D. Herbst, B. Hoefflinger, U. Kleine, J. Pandel, R. Schweer, "Real-Time Programmable SC Bandpass Filter," IEEE J. Solid-State Circuits, vol. SC-17, pp. 499-506, June 1982.
- [15] E. Vittoz, and F. Krummenacher, "Micropower SC Filters in Si-Gate CMOS Technology," in Proceedings ECCTD '80, Warsaw, vol. 1, pp. 61-72, Sept. 1980.
- [16] D.J. Allstot, "MOS Switched Capacitor Ladder Filter," Ph. D. Dissertation, University of California, Berkeley; May 1979.
- [17] T.C. Choi, R.T. Kaneshiro, R.W. Brodersen, P.R. Gray, W.B. Jett, and M. Wilcox, "High-Frequency CMOS Switched-Capacitor Filters for Communications Applications," IEEE J. Solid-State Circuits, vol. SC-18, pp. 652-663, Dec. 1983.
- [18] D.J. Allstot, and W.C. Black, Jr., "Technological Design Considerations for Monolithic MOS Switched-Capacitor Filtering Systems," Proceedings of the IEEE, vol. 71, pp. 967-986, Aug. 1983.
- [19] P.R. Gray, and R.G. Meyer, Analysis and Design of Analog Integrated Circuits, New York: Wiley, 1977.
- [20] W.C. Black, "High Speed CMOS A/D Conversion Technique," Ph. D. Dissertation, University of California, Berkeley; Nov. 1980.
- [21] R. Castello, "Micropower CMOS Amplifier for Switched Capacitor Applications," Master's Report, University of California, Berkeley; Dec. 1981.
- [22] J.C. Bertails, "Low-Frequency Noise Considerations for MOS Amplifier Design," IEEE J. Solid-State Circuits, vol. SC-14, pp. 774-776, Aug. 1979.
- [23] R. Castello, and P.R. Gray, to be published.
- [24] H. DeMan, J. Rabaey, L. Claesen, and J. Vandewalle, "DIANA-SC: A Complete CAD System for Switched-Capacitor Filters," in ESSCIRC Dig. Tech. Papers, Sept. 1981, pp. 130-133.
- [25] E. Vittoz, and J. Fellrath, "CMOS Analog Integrated Circuits Based on Weak Inversion Operation," IEEE J. Solid-State Circuits, vol. SC-12, pp. 224-231, June 1977.
- [26] B.J. Hosticka, R.W. Brodersen, and P.R. Gray, "MOS Sampled Data Recursive Filter Using Switched Capacitor Integrator," IEEE J. Solid-State Circuits, vol. SC-12, pp. 600-608, Dec. 1977.
- [27] J.T. Caves, M.A. Copland, C.F. Rahim, S.D. Rosenbaum, "Sampled Analog Filtering Using Switched Capacitors as Resistor Equivalent," IEEE J. Solid-State Circuits, vol. SC-12, pp. 592-599, Dec. 1977.
- [28] K.C. Hsieh, P.R. Gray, D. Senderowicz, and D. Messerschmitt, "A Low-Noise Chopper-Stabilized Differential Switched Capacitor Filtering Technique," IEEE J. Solid-State Circuits, vol. SC-16, pp. 708-715, Dec. 1981.
- [29] Y. Kuraishi, T. Makabe, and K. Nakayama, "A Single-Chip Analog Front-End LSI for Modems," IEEE J. Solid-State Circuits, vol. SC-17, pp. 1039-1044, Dec. 1982.
- [30] D. Senderowicz, D.A. Hodges, and P.R.
Gray, "A High-Performance NMOS Operational Amplifier," IEEE J. Solid-State Circuits, vol. SC-13, pp. 760-768, Dec. 1978.
- [31] Y.P. Tsividis, and P.R. Gray, "An Integrated NMOS Operational Amplifier with Internal Compensation," IEEE J. Solid-State Circuits, vol. SC-11, pp. 748-753, Dec. 1976.
- [32] D. Senderowicz, J.H. Huggins, "A Low-Noise NMOS Operational Amplifier," IEEE J. Solid-State Circuits, vol. SC-17, pp. 999-1008, Dec. 1982.
- [33] K.C. Hsieh, "Noise Limitations in Switched-Capacitor Filters," Ph. D. Dissertation, University of California, Berkeley;
- [34] T.C. Choi, and R.W. Brodersen, "Considerations for High-Frequency Switched-Capacitor Ladder Filters," IEEE Trans. Circuits and Syst., vol. CAS-27, pp. 545-552, June 1980.
- [35] R.W. Brodersen, P.R. Gray, D.A. Hodges, "MOS Switched-Capacitor Filters," Proc. IEEE, pp. 61-71, Jan. 1979.
- [36] L.R. Rabiner, and B. Gold, Theory and Application of Digital Signal Processing, Englewood Cliffs: Prentice-Hall, 1975.
- [37] C.A. Gobet, and A. Knob, "Noise Analysis of Switched Capacitor Networks," IEEE Trans. Circuits and Syst., vol. CAS-30, pp. 37-43, Jan. 1983.
- [38] C.A. Gobet, "Spectral Distribution of a Sampled 1st-Order Lowpass Filtered White Noise," Electronics Letters, vol. 17, no. 19, pp. 720-721, Sept. 1981.
- [39] P.R. Gray, and R.G. Meyer, "MOS Operational Amplifier Design - A Tutorial Overview," IEEE J. Solid-State Circuits, vol. SC-17, pp. 969-982, Dec. 1982.
- [40] R.H. McCharles, and D.A. Hodges, "Charge Circuits for Analog LSI," IEEE Trans. Circuits and Syst., vol. CAS-25, Jul. 1978.
- [41] P.W. Li, M. Chin, P.R. Gray, and R. Castello, "A Ratio-Independent Algorithmic Analog-Digital Conversion Technique," in Dig. of Tech. Papers, 1984 Int. Solid-State Circuits Conference, San Francisco, CA, Feb. 1984.
- [42] K. Martin, and A.S. Sedra, "Effects of the op-amp Finite Gain and Bandwidth on the Performance of Switched-Capacitor Filters," IEEE Trans. Circuits and Syst., vol. CAS-28, pp.
822-829, Aug. 1981.
- [43] G.C. Temes, "Finite Amplifier Gain and Bandwidth Effects in Switched Capacitor Filters," IEEE J. Solid-State Circuits, vol. SC-15, pp. 358-361, Jun. 1980.
- [44] T. Ishihara, T. Enomoto, M. Yasumoto, and T. Aizawa, "High-Speed NMOS Operational Amplifier Fabricated Using VLSI Technology," Electronics Letters, vol. 18, pp. 159-161, Feb. 1982.
- [45] C.T. Chuang, "Analysis of the Settling Behavior of an Operational Amplifier," IEEE J. Solid-State Circuits, vol. SC-17, pp. 74-80, Feb. 1982.
- [46] B.Y. Kamath, R.G. Meyer, and P.R. Gray, "Relationship Between Frequency Response and Settling Time of Operational Amplifiers," IEEE J. Solid-State Circuits, vol. SC-9, pp. 347-352, Dec. 1974.
- [47] H-S Lee, D.A. Hodges, and P.R. Gray, "A Self-Calibrating 12b, 12 µs CMOS ADC," in Dig. of Tech. Papers, 1984 Int. Solid-State Circuits Conference, San Francisco, CA, Feb. 1984.
- [48] B.J. Shu, "Switch Induced Error Voltage on a Switched Capacitor," Master's Report, University of California, Berkeley; Jun. 1983.
- [49] C.L. Hoang, "Evaluation of a Fully Integrated High Frequency Switched-Capacitor Bandpass Filter," Master's Report, University of California, Berkeley; Sept. 1982.
- [50] W.E. Wallace, "Design and Layout of a Micropower Fifth-Order Switched Capacitor Filter," Master's Report, University of California, Berkeley; Oct. 1982.
- [51] V.R. Saari, "Low-Power High-Drive CMOS Operational Amplifier," IEEE J. Solid-State Circuits, vol. SC-18, pp. 121-127, Feb. 1983.
- [52] D.G. Maeding, "A CMOS Operational Amplifier with Low Impedance Drive Capability," IEEE J. Solid-State Circuits, vol. SC-18, pp. 227-229, Apr. 1983.
- [53] K.E. Brehmer, and J.B. Wieser, "Large Swing CMOS Operational Amplifier," IEEE J. Solid-State Circuits, vol. SC-18, pp. 624-629, Dec. 1983.
- [54] B.K. Ahuja, W.M. Baxter, and P.R. Gray, "A Programmable CMOS Dual-Channel Interface Processor," in Dig. of Tech. Papers, 1984 Int.
Solid-State Circuits Conference, San Francisco, CA, Feb. 1984.
- [55] M. Wong, Master's Report, to be published.
- [56] K.L. Lee, private communication.