This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

All Digital Energy Sensing for Minimum Energy Tracking

Sagar Venkatesh Gubbi and Bharadwaj Amrutur

Abstract: Minimizing energy consumption is of utmost importance in an energy-starved system with relaxed performance requirements. This brief presents a digital energy sensing method that requires neither a constant voltage reference nor a time reference. An energy minimizing loop uses this method to find the minimum energy point and sets the supply voltage between 0.2 and 0.5 V. Energy savings of up to 1 275% over existing minimum energy tracking techniques in the literature are achieved.

Index Terms: Droop detector, low power, minimum energy point (MEP), minimum energy tracking.

I. INTRODUCTION

A whole class of systems, such as wireless sensor systems for remote monitoring and implantable medical electronic devices, has been made possible by ultralow-power very large scale integration circuits. These systems are often severely constrained in size, and the battery supplying energy is therefore of limited capacity [1]. Since it is often inconvenient or infeasible to replace the battery, it is paramount to minimize the net energy consumed by the system to maximize its lifetime. The energy consumption of the system can be reduced by lowering the supply voltage. However, at very low supply voltages, the leakage energy dominates [2], and the net energy consumed per operation starts to increase (Fig. 1). The minimum energy point (MEP) depends on the activity factor and is also sensitive to process and temperature variations [3]. Therefore, the location of the MEP changes during circuit operation.

Fig. 1. System energy–VDD curve for a 32-tap finite-impulse response (FIR) filter obtained from schematic-level SPICE simulations.
To track the MEP, it is necessary to sense the energy consumed per operation at different supply voltages. In this brief, we present a digital energy sensing technique that does not require a voltage or time reference, is robust to process variations, and performs well over a wide range of system current consumption. The existing energy sensing method in the literature needs both a time reference and a voltage reference [4]. The approach in [4] lets the supply capacitor discharge for a fixed number of clock cycles and then measures the voltage droop with a time-based ADC, which needs a time reference and a voltage reference. This scheme also performs poorly when there is a large variance in the system current consumption. Our work addresses these issues. Abdallah et al. [5] have jointly optimized the dc–dc converter and the load circuit. This, however, does not diminish the importance of minimum energy tracking. An all-digital voltage sensing method is proposed in [6], where voltage is digitized by measuring the charge on a small capacitance, whereas this brief measures small voltage differences by converting voltage to time.

We first describe how the energy minimizing loop works. Then, we present the proposed circuit and estimate the error it makes in measuring energy per operation. Finally, we present the performance of a system using the proposed circuit and compare it with prior art.

Manuscript received October 8, 2013; accepted April 16, 2014.
The authors are with the Department of Electrical and Communication Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: sagar@ece.iisc.ernet.in; amrutur@ece.iisc.ernet.in).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2014.2320304
1063-8210 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

II. EXISTING MINIMUM ENERGY TRACKING SCHEME

A. Description of the System

A low-power system such as a biomedical sensing platform comprises the minimum energy tracker, the dc–dc converter, and the digital load circuit (processor, filter, etc.) that we wish to operate at minimum energy (Fig. 2). The minimum energy tracking loop locates the MEP dynamically.

Fig. 2. System incorporating the minimum energy tracking loop.

B. Finding the MEP

The minimum energy tracking loop works by sensing energy at each supply voltage. Because the energy–VDD curve is convex, an algorithm similar to gradient descent can hunt the MEP. Once the minimum energy tracking loop is initiated, it perturbs the supply and measures the energy per operation (Eop) at the new voltage. If Eop increases, the direction is reversed. Once the direction in which to proceed is known, the process continues until the measured energy no longer decreases, and the algorithm halts, declaring the last chosen supply voltage to be the MEP.

C. Issues With the Existing Scheme

The primary difficulty in the existing scheme is the way in which energy is sensed. Ramadass and Chandrakasan [4] propose shutting off the power supply for Nop cycles and monitoring the supply voltage droop to estimate the energy consumed. The energy consumed per operation is given by

$$E = \frac{C\left(V_1^2 - V_2^2\right)}{2 N_{op}}. \tag{1}$$

Here, C is the decoupling capacitance and V1 is the supply voltage just before the power supply is shut off. V2 is the voltage to which the supply has drooped Nop cycles after the power supply is disconnected. By choosing a sufficiently large decoupling capacitance, the droop V1 − V2 is kept small.
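The search procedure described in Section II-B can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' controller. The function `toy_energy_per_op` is a hypothetical energy curve in arbitrary units, with assumed values for `n_vt` and `leak_ratio`, that stands in for the on-chip energy measurement. The names `track_mep`, `step`, `v_min`, and `v_max` are likewise assumptions made for the example, and the sketch reads "the last chosen supply voltage" as the last setting at which the measured energy fell.

```python
import math


def toy_energy_per_op(vdd, n_vt=0.039, leak_ratio=5000.0):
    """Hypothetical energy-per-operation curve in arbitrary units.

    Assumption, not taken from the paper: dynamic energy scales as VDD^2 and
    leakage energy as VDD^2 * exp(-VDD / (n * VT)), which gives a single
    minimum inside the 0.2 V to 0.5 V range.
    """
    dynamic = vdd ** 2
    leakage = leak_ratio * vdd ** 2 * math.exp(-vdd / n_vt)
    return dynamic + leakage


def track_mep(measure, v_start, step, v_min, v_max):
    """Perturb the supply and follow decreasing energy until it stops falling."""
    v, e = v_start, measure(v_start)
    direction = -1        # the first perturbation lowers the supply
    may_reverse = True    # the direction may be flipped once, at the start
    while True:
        v_next = round(v + direction * step, 6)
        in_range = v_min <= v_next <= v_max
        # A step outside the allowed range is treated as an energy increase.
        e_next = measure(v_next) if in_range else float("inf")
        if e_next < e:
            v, e = v_next, e_next    # energy fell: accept the step
            may_reverse = False
        elif may_reverse:
            direction = -direction   # energy rose on the first try: reverse
            may_reverse = False
        else:
            return v, e              # no further decrease: declare the MEP


v_mep, e_mep = track_mep(toy_energy_per_op, v_start=0.5, step=0.05,
                         v_min=0.2, v_max=0.5)
print(f"MEP estimate: {v_mep:.2f} V")
```

Because the loop only compares successive measurements, any error that scales all measurements by the same factor leaves the result unchanged, which is why only the voltage-dependent part of the sensing error matters for the search.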
With a small droop, the approximation V1 + V2 ≈ 2V1 holds, and (1) reduces to

$$E \approx \frac{C\,V_1\,V_{droop}}{N_{op}} \tag{2}$$

where

$$V_1 - V_2 = V_{droop}. \tag{3}$$

The supply voltage V1 is already known. The droop Vdroop is digitized using an ADC. A measure of the energy consumed per operation is obtained by digitally multiplying V1 and Vdroop. To digitize Vdroop, Ramadass and Chandrakasan [4] employ a time-based ADC. The problems with this approach are as follows.

1) The ADC needs both a fixed voltage reference and a reference clock.

2) The droop has to be much larger than the offset (1 mV) of the comparator in the ADC to limit the error in estimating the MEP. This means that if the current consumption is overestimated and the decoupling capacitor chosen is much larger than necessary, the error in estimating the MEP balloons. On the other hand, choosing too small a decoupling capacitor could cause the droop to be too large, resulting in functional failure.

3) Even if an accurate estimate of the maximum current consumption is made, the variance in current consumption poses an issue. For instance, if the circuit consumes only 20% of the maximum current under typical operating conditions, the decoupling capacitor still has to be sized for the maximum possible current consumption, and the error in estimating the MEP under typical conditions will be larger than desirable. We will see in a later section that this happens in a 32-tap FIR filter when the number of taps is reduced by gating.

III. PROPOSED MINIMUM ENERGY TRACKING SCHEME

A. Proposed Method for Energy Sensing

To circumvent the issues mentioned in the previous section, we propose measuring energy per operation by keeping Vdroop fixed and computing V1/Nop as a measure of energy [see (2)]. That is to say, the power supply to the load circuit is shut off and a counter is enabled (Fig. 3). The counter keeps incrementing until Vdroop reaches a certain fixed value.
When this happens, the power supply is reconnected and the value of the counter is captured, which gives Nop. The digital controller in Fig. 3 computes V1/Nop as a measure of energy per operation. V1 is the digital code word of the supply voltage that the controller chose and is proportional to the fraction of VBAT that the dc–dc converter is producing. Note that it is not necessary to know the absolute value of V1 as long as VBAT remains fixed while the MEP is being located. Thus, voltage references are avoided.

Fig. 3. Proposed circuitry in the minimum energy tracking loop.

Fig. 4 shows a critical path replica ring oscillator providing the clock to the entire system, including the energy minimizing loop. It also shows a delay line, longer than the ring oscillator chain, that is powered directly from the supply, whereas the power supply to the ring oscillator and the digital system can be gated (Fig. 2). To measure energy, the power supply is shut off. As the voltage VDD droops, the ring oscillator period increases exponentially, but the delay produced by the delay chain remains as before because its power supply is not gated. After some number of clock cycles, the delay of the longer delay chain and the delay of the ring oscillator running at a slightly lower voltage become equal (Fig. 5). The number of clock cycles for this to happen is counted.

Fig. 4. Droop detection circuit.

In this scheme, the signal CLKd is initially (when VDD ≈ Vsup) captured when it is high. Eventually (as VDD droops), the falling edge of CLKd comes closer to the rising edge of CLK, and we should expect the flop to go metastable at this point. This is easily handled by adding a LO-skew inverter to the output of the flop: the flop is known to capture a 1 initially, then (possibly) go metastable, and finally capture a 0. A LO-skew inverter therefore ensures that even if the flop goes metastable, the output remains low.
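The fixed-droop measurement described above can be mimicked numerically. The following sketch is an idealized illustration under stated assumptions rather than a model of the actual circuit: it assumes every clock cycle removes exactly the same energy from the decoupling capacitor, and it ignores the droop detector's own error. The names `cycles_until_droop` and `energy_metric`, and the values chosen for `V1`, `V_DROOP`, and `E_OP`, are hypothetical.

```python
import math


def cycles_until_droop(v1, e_op, c_decap, v_droop):
    """Count clock cycles until the gated supply has drooped by v_droop.

    Idealized assumption: every cycle removes exactly e_op joules from the
    decoupling capacitor, so 0.5 * C * (V_before^2 - V_after^2) = e_op.
    """
    v, n_op = v1, 0
    while v1 - v < v_droop:
        v = math.sqrt(max(v * v - 2.0 * e_op / c_decap, 0.0))
        n_op += 1
    return n_op


def energy_metric(v1_code, n_op):
    """Quantity the controller compares across supply settings: V1 / Nop."""
    return v1_code / n_op


C_DECAP = 100e-9   # 100 nF, the capacitance reported in Section V for the proposed circuit
V_DROOP = 0.020    # assumed fixed droop target of 20 mV
V1 = 0.300         # assumed supply setting in volts
E_OP = 4.7e-12     # hypothetical load energy per operation in joules

n_op = cycles_until_droop(V1, E_OP, C_DECAP, V_DROOP)
# Scaling V1 / Nop by C * Vdroop recovers an energy estimate, as in (2).
estimate = C_DECAP * V_DROOP * energy_metric(V1, n_op)
print(n_op, estimate)
```

Since C and Vdroop are the same at every supply setting, the controller never needs the scaled estimate; comparing V1/Nop between settings is enough to decide which direction lowers the energy.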
The number of clock cycles elapsed between gating the power supply and Y going to 1 gives Nop.

The delay of the chain of inverters constituting the ring oscillator (tring) and the delay of the delay chain above the ring oscillator (tchain) can be shown [3] to be

$$t_{ring} = V_{DD}\, e^{-\frac{(1+\eta)V_{DD}}{n V_T}} \sum_{i=1}^{N_{ring}} K_i \tag{4}$$

$$t_{chain} = V_{sup}\, e^{-\frac{(1+\eta)V_{sup}}{n V_T}} \sum_{i=1}^{N_{delay}} K'_i \tag{5}$$

where Ki and K'i depend on transistor parameters, n is the ideality factor, η is the drain-induced barrier lowering factor, and VT is the thermal voltage. The circuit in Fig. 4 detects a 1 when tring becomes just equal to tchain. The droop at which this happens is given by

$$V_{droop} = \frac{n V_T}{1+\eta}\,\ln\!\left(\frac{V_{sup}\sum_{i=1}^{N_{delay}} K'_i}{\left(V_{sup}-V_{droop}\right)\sum_{i=1}^{N_{ring}} K_i}\right). \tag{6}$$

Equation (6) shows that Vdroop has a weak dependence on the supply voltage. If Vdroop is small,

$$V_{droop} \approx \frac{n V_T}{1+\eta}\,\ln\!\left(\frac{\sum_{i=1}^{N_{delay}} K'_i}{\sum_{i=1}^{N_{ring}} K_i}\right). \tag{7}$$

We shall later examine the impact of supply voltage on the droop detected by the circuit. Nring is dictated by the critical path of the digital system. From (7), Ndelay is chosen to give a reasonable droop such as 15 mV. When making the calculation, all Ki's and K'i's are taken to be equal.

Fig. 5. Illustration of how the droop detection circuit works.

Fig. 6. Layout of the droop detector.

Fig. 7. Thousand-point Monte Carlo simulation of the droop detector with 10% Vth variation across process corners and 1.5%–4% within-die mismatch.

Although the droop detected is sensitive to temperature, the energy minimizing loop works without problems because it does not need Vdroop to be well specified, so long as it is a small constant for all supply voltages chosen by the loop. The time taken for temperature to change is much larger than the time taken by the loop to find the MEP.
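The sizing of Ndelay from (7), and the weak supply dependence predicted by (6), can be illustrated with a short numerical sketch. It assumes, as in the text, that all Ki and K'i are equal, so that the ratio of the sums reduces to Ndelay/Nring. The values used for the ideality factor `n`, the DIBL factor `eta`, the thermal voltage `v_t`, and the ring length `n_ring = 20` are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
import math


def droop_small_signal(n_delay, n_ring, n=1.5, eta=0.1, v_t=0.0259):
    """Equation (7) with all K equal: Vdroop ~ n*VT/(1+eta) * ln(Ndelay/Nring)."""
    return n * v_t / (1 + eta) * math.log(n_delay / n_ring)


def droop_exact(v_sup, n_delay, n_ring, n=1.5, eta=0.1, v_t=0.0259, iters=50):
    """Equation (6) with all K equal, solved by fixed-point iteration.

    Vdroop appears on both sides of (6), so the estimate from (7) is used as
    the starting point and substituted back repeatedly.
    """
    a = n * v_t / (1 + eta)
    v_d = a * math.log(n_delay / n_ring)
    for _ in range(iters):
        v_d = a * math.log(v_sup * n_delay / ((v_sup - v_d) * n_ring))
    return v_d


def size_delay_line(target_droop, n_ring, n=1.5, eta=0.1, v_t=0.0259):
    """Invert (7) to pick the delay-line length for a target droop."""
    return round(n_ring * math.exp((1 + eta) * target_droop / (n * v_t)))


n_delay = size_delay_line(0.015, n_ring=20)   # 15 mV target, assumed Nring = 20
print(n_delay, droop_small_signal(n_delay, 20))
for v_sup in (0.3, 0.4, 0.5):
    print(v_sup, droop_exact(v_sup, n_delay, 20))
```

Under these assumed parameters the droop given by (6) changes by less than 1 mV as the supply moves from 0.3 to 0.5 V, which is the kind of weak dependence the argument relies on.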
The droop therefore remains fixed while the loop is hunting the MEP.

The proposed droop detector circuit works without a voltage reference or a time reference. In constructing the minimum energy tracking loop, the only requirement is that the supply voltage produced by the dc–dc converter be proportional to the value requested by the digital controller. The precise value of the supply voltage is not relevant. Thus, the energy minimizer avoids references altogether.

B. Impact of Process Variations

Equation (7) shows that global process variations have no impact on the droop detected by the circuit. All the Ks are process dependent, but global variations affect the numerator and the denominator identically, and thus Vdroop is left unaffected. Local variations cause a small variation in the droop detected. However, the proposed energy sensing scheme does not need a precise droop target. It is sufficient if the droop is kept constant and within tolerable limits across all supply voltages of interest. Equation (7) shows that the summation averages out the local variations, so the droop can be well controlled.

C. Limitations

This circuit works only at subthreshold voltages, and the droop detected by the circuit is sensitive to temperature. When the current consumption of the load circuit is at the low end of its range, the droop detector takes many more cycles to detect the droop because the voltage droops more slowly. Therefore, energy sensing is slower when the current consumption is low or the temperature is high. Energy sensing can be sped up by using a programmable delay line in Fig. 4 that changes Ndelay to control the droop, depending on the number of clock cycles that droop detection is taking. Future work will include the performance of this circuit in the presence of power supply noise and clock jitter.

IV. ERROR ESTIMATION

There are two sources of error in the measurement of energy: one due to the approximation made in arriving at (2), and the other due to variation in the droop detected at different supply voltages.

A. Error Due to Approximation in Computation

Ramadass and Chandrakasan [4] have shown that the error in computing E because of the approximation V1 + V2 ≈ 2V1 is

$$\frac{\delta E}{E} = \frac{V_1 - V_2}{V_1 + V_2}. \tag{8}$$

A constant relative error in energy estimation does not affect the energy minimizing loop. However, a relative error that changes with the supply voltage limits the energy resolution. For typical values such as Vdroop = 20 mV and V1 + V2 = 250 + 230 = 480 mV at the lowest operating point, this error is 4.166% of the energy per operation at 250 mV. To obtain an estimate of the energy resolution, assume that the energy per operation Eop at VDD = 300 mV is the same as at VDD = 250 mV. With V1 + V2 = 300 + 280 = 580 mV, the error in estimating Eop at VDD = 300 mV is 3.448% of Eop. Thus, the limit on energy resolution due to this approximation is 4.166% − 3.448% = 0.718% of Eop. This is because an increase of up to 0.718% in energy per operation at VDD = 300 mV will not be detected by the circuit, and the system continues to operate at VDD = 300 mV, which is no longer the MEP.

B. Error Due to Variation in Droop

For a number of reasons, including the finite width of the flop's metastability window, variation in the position of the metastability window as the supply voltage changes, and the nonzero droop during one clock cycle, the droop (Vdroop) fixed by the circuit in Fig. 4 is not constant across the range of possible supply voltages; rather, it is a weak function of the supply voltage. Let δV be the maximum difference in the droop between two successive supply voltages set by the minimum energy tracking loop. The error in computing energy due to this uncertainty in the detected droop is

$$\delta E = E_{computed} - E_{actual} = \frac{C\,V_{DD}\,\delta V}{N_{op}}. \tag{9}$$

The relative error is

$$\frac{\delta E}{E} = \frac{\delta V}{\delta V + V_{droop}}. \tag{10}$$

For typical values of Vdroop = 20 mV and δV = 1 mV, the error comes to 4.7%, which limits the energy resolution of this energy sensing scheme. The total error in estimating energy is thus bounded by 4.7% + 0.718% = 5.418% of the energy per operation (Emin) at the MEP.

V. RESULTS

A 32-tap FIR filter and the proposed minimum energy tracking loop were built on the UMC 65-nm 1-poly, 10-metal-layer low-leakage process (Fig. 6). The simulation results of the droop detector following post-layout extraction are reported. The power data for the FIR filter are from transistor-level SPICE simulations of a single multiplier.

A. Droop Detector

The performance of the droop detector was analyzed by modeling the voltage on the decoupling capacitor, with the power supply shut off, as a decreasing ramp. The performance metric is the maximum difference between the droop detected at successive supply voltages. The distribution of the maximum variation in the droop detected by the circuit, with the supply voltage swept in steps of 50 mV, is shown in Fig. 7. The low-leakage transistors used to construct the ring oscillator and delay chain have a threshold voltage of about 450 mV. Since the exponential dependence of delay on supply voltage holds only in the subthreshold region, the performance of the droop detector declines rapidly once the supply voltage crosses the threshold voltage, as can be observed in Fig. 7. If the supply voltage is kept below 0.45 V, the maximum droop difference is expected to be below 1.414 mV in 99.9% of the chips at the 95% confidence level.

B. Energy Minimizing Loop

Fig. 8 shows a 32-tap FIR filter. It consists of 32 8-bit multipliers, 31 adders ranging from 16 to 19 bits, and 31 flops, for a total of 21 020 gates. The power consumed by the FIR filter was modeled by testing the multiplier under different input combinations. In the first input combination, the multiplier is fixed at 0xFF and the multiplicand swings between 0xFF and 0x00 on every clock cycle. This is referred to as the swinging input case in Table I and is used to estimate the maximum possible current consumption of each multiplier. In the second input combination, the multiplier is kept fixed and the multiplicand is a digital ramp, which reflects the typical power consumption of the multiplier.

Fig. 8. Thirty-two-tap FIR filter.

The maximum current drawn by the circuit at 0.5 V (Vdd,max) is used to arrive at the (off-chip) decoupling capacitance by finding the minimum capacitance needed to prevent the droop from exceeding 20 mV when the power supply is shut off for 100 clock cycles in the proposed circuit and for 32 clock cycles in the method in [4]. The required capacitor is 100 nF for the proposed circuit and 32 nF for the method in [4].

It has been suggested that a realistic estimate of energy requires consideration of the efficiency of the dc–dc converter. The dc–dc converter efficiency that we have assumed is based on the results in [7]. Fig. 1 shows system energy per operation versus VDD, which includes losses in the dc–dc converter.

TABLE I. PERFORMANCE OF THE ENERGY MINIMIZING LOOP

Table I shows the relative loss of energy incurred by operating at the MEP found by the technique presented in [4] and by ours, compared with operating at the actual MEP. Vdroop for the proposed method was chosen randomly between 20 and 21.4 mV at every voltage step, and the maximum error over 10 trials is reported. The proposed method performs well irrespective of the load, whereas the method in [4] performs poorly when the current consumed is much lower than the estimated maximum, because the droop over 32 clock cycles is small and consequently there is a larger relative error in digitizing the droop.

The energy overhead of the proposed scheme in locating the MEP is equal to the energy of 11 477 operations at the MEP of the FIR filter operating with only one tap enabled and the rest power gated, whereas the overhead in [4] is only 463 operations. The large disparity is in part due to the fact that the loop in [4] halts prematurely before finding the actual MEP. The proposed scheme takes a maximum of 3 s to locate the MEP in the worst-case scenario. This time is much smaller than the time taken for ambient temperature to change substantially. Thus, the operation of the proposed circuit is effectively independent of ambient temperature.

VI. CONCLUSION

We have presented a method of energy sensing that is completely digital and that does not rely on any fixed references. This makes the system robust even at very low voltages. We have demonstrated that sensing energy by shutting off the power supply works better when the droop is fixed than when the number of clock cycles is fixed. This also eases the choice of the decoupling capacitance, and an overestimate of the current consumption will not hurt the performance of the energy sensing circuit.

REFERENCES

[1] M. Seok et al., “The Phoenix processor: A 30 pW platform for sensor applications,” in Proc. IEEE Symp. VLSI Circuits, Jun. 2008, pp. 188–189.
[2] M. Alioto, “Ultra-low power VLSI circuit design demystified and explained: A tutorial,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59, no. 1, pp. 3–29, Jan. 2012.
[3] B. H. Calhoun, A. Wang, and A. Chandrakasan, “Modeling and sizing for minimum energy operation in subthreshold circuits,” IEEE J. Solid-State Circuits, vol. 40, no. 9, pp. 1778–1786, Sep. 2005.
[4] Y. K. Ramadass and A. P. Chandrakasan, “Minimum energy tracking loop with embedded DC–DC converter enabling ultra-low-voltage operation down to 250 mV in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 256–265, Jan. 2008.
[5] R. A. Abdallah, P. S. Shenoy, N. R. Shanbhag, and P. T. Krein, “System energy minimization via joint optimization of the DC–DC converter and the core,” in Proc. Int. Symp. Low Power Electron. Design, Aug. 2011, pp. 97–102.
[6] R. Ramezani, A. Yakovlev, F. Xia, J. Murphy, and D. Shang, “Voltage sensing using an asynchronous charge-to-digital converter for energy-autonomous environments,” IEEE Trans. Emerg. Sel. Topics Circuits Syst., vol. 3, no. 1, pp. 35–44, Mar. 2013.
[7] Y. Pu et al., “Misleading energy and performance claims in sub/near threshold digital systems,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design, Nov. 2010, pp. 625–631.