ABSTRACT— With technology scaling, the “Power Wall” and the “Reliability Wall” have emerged as the major challenges. In recent years, many approaches such as clock gating, dynamic voltage scaling, and drowsy caches have been adopted to mitigate the issues relating to the power wall. Techniques like dynamic voltage scaling and drowsy caches operate at lower supply voltages, and with technology scaling the node capacitances also keep reducing. This has enabled reduced energy consumption, but at the cost of increased soft error rates. In this project, we analyze the soft error rate impact, for cache memory based on the standard 6T SRAM cell, of technology scaling and of operating caches in low-power drowsy mode. First, we study the effect of technology scaling on the soft error rate in terms of the FIT rate, and then we address the impact of operating the caches in drowsy mode (at lower operating voltages) on the soft error rate. We then apply the PARMA model to classify the FIT rate into SDC (Silent Data Corruption) and DUE (Detected Unrecoverable Errors). We simulate SPEC2000 benchmarks on a drowsy-cache-enabled SimpleScalar tool to quantify the results. The FIT rates obtained for technologies scaling from 65nm to 18nm, using the ITRS roadmap, indicate a peak increase of around 12.5%, while the variation of the FIT rate within each technology node, for drowsy voltage levels scaling from the nominal supply voltage down to a lower operating voltage as low as 1.1 times the threshold voltage, indicates a maximum increase of 2.6%. More importantly, we demonstrate that the low-voltage operating mode of drowsy caches can be employed at different technology scales (65nm-22nm): it does not have much impact on the soft error rate, and there is no need to build any additional expensive corrective mechanisms to improve the soft error immunity of caches operating in drowsy mode.
Index Terms—Drowsy-Cache, Reliability, Soft-Error, Technology-Scaling.
INTRODUCTION
Traditionally, technology scaling aims to improve performance, increase transistor density, and reduce the energy/power consumption per transistor. In this respect, CMOS technology has been promising in effectively meeting low-power demands. However, CMOS technology scaling beyond 90nm has raised significant concerns about meeting the power demands of microprocessors [15, 17] and also about system reliability [18]. Meeting the power demand has become a major issue and has led to the term “Power Wall”, which signifies the much-aggravated problem of satisfying the power demands of power-hungry microprocessors.
Power consumption of transistors can be attributed to two chief components, namely dynamic power and static/leakage power. Dynamic power is due to the switching activity in CMOS circuits, while static power consumption is due to leakage current; unlike dynamic power, static leakage is not activity-based and contributes to power dissipation even when the transistors are not switching. This static leakage (Ileak) is primarily due to sub-threshold and gate-oxide leakage currents, as shown in Eq. (1).
πΌπ‘™π‘’π‘Žπ‘˜ = πΌπ‘ π‘’π‘π‘‘β„Žπ‘Ÿπ‘’π‘ β„Žπ‘œπ‘™π‘‘ + πΌπ‘”π‘Žπ‘‘π‘’π‘œπ‘₯𝑖𝑑𝑒
- (1)
Isubthreshold 𝛼 π‘Šπ‘’ −π‘£π‘‘β„Ž/𝑣
- (2)
πΌπ‘”π‘Žπ‘‘π‘’π‘œπ‘₯𝑖𝑑𝑒 ∝ (
V
Tox
2
) e−
Tox
V
- (3)
From Eqs. (2) and (3), sub-threshold leakage depends on factors such as the supply voltage (v), the threshold voltage (Vth), and the gate width (W), while gate-oxide leakage depends on the supply voltage and the gate-oxide thickness (Tox). These parameters vary with technology, and with technology scaling their effect is seen to increase the contribution of static power dissipation. According to the 2007 International Technology Roadmap for Semiconductors (ITRS) [1], power dissipation due to static leakage is predicted to constitute more than 50% of the total power dissipation of processors.
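As a rough illustration of how these expressions behave under scaling, the proportionalities in Eqs. (1)-(3) can be evaluated numerically. The sketch below is only a qualitative model: the fitting constants k_sub and k_ox, and all parameter values, are hypothetical stand-ins for process-specific data.

import math

def leakage_estimate(w, v, v_th, t_ox, k_sub=1.0, k_ox=1.0):
    # Illustrative evaluation of Eqs. (1)-(3). k_sub and k_ox are
    # hypothetical fitting constants; units are arbitrary.
    i_subthreshold = k_sub * w * math.exp(-v_th / v)              # Eq. (2)
    i_gateoxide = k_ox * (v / t_ox) ** 2 * math.exp(-t_ox / v)    # Eq. (3)
    return i_subthreshold + i_gateoxide                           # Eq. (1)

# Lowering the threshold voltage, as scaling does, inflates leakage:
print(leakage_estimate(w=1.0, v=1.1, v_th=0.40, t_ox=1.2))
print(leakage_estimate(w=1.0, v=1.1, v_th=0.25, t_ox=1.2))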
Several efforts have been made to address the power-wall issue, paying special attention to reducing power dissipation by incorporating changes in the process technology as well as by adding intelligence in the architecture and circuit design [18].
Also, over the past few years, the trend has been to incorporate large caches in microprocessors, as they provide a tremendous benefit in terms of improving processor performance. But large caches also account for a significant fraction of the total power consumption, especially due to static leakage. Further, as feature sizes shrink, the dominant component of this power loss will be static leakage. To mitigate the impact of static leakage in microprocessors, techniques such as gated-VDD, dynamic voltage scaling (DVS), increased threshold voltages, and drowsy caches have been proposed [18, 19].
State-preserving techniques like the drowsy cache reduce the operational supply voltage of cache lines to a level just sufficient to retain the data. This greatly mitigates the impact of static leakage, with little impact on processor performance. In this respect, however, it is important to note that state retention at lower voltages implies a significant reduction in the node charges, which means that the cache data is rendered more vulnerable to transient faults caused by alpha-particle or neutron strikes. This escalation of vulnerability adds up to another issue, referred to as the “soft error wall”.
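The underlying relationship is simply the charge stored on the cell node, Q_{node} = C_{node} \cdot V. As a back-of-the-envelope example with a purely illustrative node capacitance of 1 fF (a hypothetical figure, not a MASTAR-derived value): at a nominal 1.1 V the node holds Q = 1 fF x 1.1 V = 1.1 fC, while at an aggressive drowsy level of 1.1 V_{th} (about 0.33 V for V_{th} = 0.3 V) it holds only Q ≈ 0.33 fC, roughly a 3x reduction in the charge that a particle strike must disturb.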
Radiation-induced transient faults arise from
energetic particles, such as alpha particles from packaging
material and neutrons from the atmosphere, generating
electron–hole pairs (directly or indirectly) as they pass through
a semiconductor device. Transistor source and diffusion nodes
can collect these charges. A sufficient amount of accumulated
charge may invert the state of a logic device, such as a latch,
static random access memory (SRAM) cell, or gate, thereby
introducing a logical fault into the circuit’s operation. Because
this type of fault does not reflect a permanent malfunction of
the device, it is termed soft or transient [19].
Several architectural techniques
have been implemented to tackle the soft error problem: for example, error correction codes (ECC) are commonly employed in memory systems, while high-end systems employ redundant copies of hardware to detect faults and recover from errors. However, many of these solutions have been prohibitively expensive and difficult to justify in the mainstream commodity computing market [19].
In this paper, we focus on analyzing the soft error rate impact of technology scaling on low-power drowsy-mode caches. The key motivation for this project is to study the effectiveness of employing drowsy-mode caches in next-generation technologies and to quantify their impact on reliability, by means of which we can judge whether there is a need to employ any additional expensive mechanisms to address the reliability concerns that may arise from drowsy caches.
RELATED WORK
This paper models the effect of technology scaling on the soft error rate (SER) of the SRAM cell for current and future technology nodes in low-power cache designs, whereas most of the previous experimental work related to SER has estimated the soft error rate of the SRAM cell for current technologies at nominal VDD [11][12].
[13] studied the SER of a low-power 70nm SRAM cell. It presents different circuit design techniques used to reduce the power consumed by the SRAM cell, and it analyzes the impact of two commonly used architectural-level leakage reduction approaches, namely cache decay and the drowsy cache, on the reliability of the cache system. It concludes that implementing these techniques is a tradeoff between optimizing the SRAM cell for leakage power and improving its immunity to soft errors. In addition, they ran their experiments on a commercial chip with neutron-induced soft errors at the Breazeale Nuclear Reactor Facility. In contrast, in our paper we study the reliability of the drowsy cache for current and future technologies on a 6T SRAM cell and compare it with the reliability of a cache operating at nominal VDD. We use the empirical model given by Hazucha and Svensson [6] to calculate the SER of the 6T SRAM cell.
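For reference, the Hazucha-Svensson model has the exponential form SER = F \cdot A \cdot e^{-Q_{crit}/Q_{S}}, where F is the neutron flux, A the sensitive area of the cell, and Q_S the charge-collection slope of the technology. A minimal sketch of the calculation, with all numeric inputs as hypothetical placeholders rather than calibrated values:

import math

def ser_neutron(flux, area, q_crit, q_s):
    # Empirical neutron SER model of Hazucha and Svensson [6]:
    #   flux   - neutron flux in the environment of interest
    #   area   - sensitive (drain) area of the cell
    #   q_crit - critical charge of the storage node
    #   q_s    - charge-collection slope of the technology
    return flux * area * math.exp(-q_crit / q_s)

# Hypothetical inputs, only to show the exponential sensitivity to Qcrit:
nominal = ser_neutron(flux=1.0, area=1.0, q_crit=2.0, q_s=1.0)
drowsy = ser_neutron(flux=1.0, area=1.0, q_crit=1.2, q_s=1.0)
print(f"drowsy/nominal SER ratio: {drowsy / nominal:.2f}")  # about 2.23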
[14] measures the SER of SRAM and combinational logic for different device and pipeline scalings from 600nm to 50nm. Their study shows that the SER of SRAM at deeper technologies, for a constant SRAM chip area, will increase slowly. In our paper we measure the SER of the SRAM cell for technologies from 65nm to 18nm for a 1MB L2 cache. In addition, we calculate the SER in the presence of the 1-bit ECC that is supported by the PARMA model. On the other hand, our study does not handle pipeline scaling, and hence we fix the pipeline depth by using the SimpleScalar simulator running in sim-cache mode for all simulations.
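Since single-bit upsets are independent events, the per-bit FIT rate (failures per 10^9 device-hours) aggregates linearly over the array, so the cache-level figure is a straightforward multiplication. A sketch, where the per-bit FIT value is a placeholder and not a measured number:

# FIT is additive across bits for independent single-bit upsets.
CACHE_BITS = 1 * 1024 * 1024 * 8  # 1MB L2 data array, ignoring tag/ECC bits
fit_per_bit = 1e-4                # hypothetical per-bit FIT from the SER model
fit_cache = fit_per_bit * CACHE_BITS
print(f"cache FIT: {fit_cache:.1f}")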
[21] studied the multi-bit upset (MBU) probability for 65nm technology and showed that this effect will become more important with deeper technology scaling. In this paper we restrict our study to single-bit upsets across the different technology nodes, due to time constraints.
[22] ran a 3D simulation to obtain the value of the critical charge (Qcrit), then compared this value with the different mathematical models used to represent the current pulse generated by a particle strike, and showed a wide variation in Qcrit between the different current models. In our paper we adopt the approximation model used by [7] for the calculation of the current pulse generated by a particle strike.
[23] studied the effect of different process variation parameters (gate length, Vth, and Tox) on Qcrit and concluded that gate length is the main parameter affecting Qcrit and the SER. In our study we find that Vth also plays a major role in determining the SER in drowsy mode, by controlling the lowest workable drowsy level (approximately 1.5 Vth).
PROJECT EXPERIMENTAL METHODOLOGY
In this section, we describe the experimental work, the aspects of our simulation framework, and how they are used to analyze the soft error rate and classify the errors into TRUE errors, SDC, and DUE for the SPEC2000 benchmarks.
First, we feed the ITRS2007 roadmap High Performance profile into the MASTAR tool to derive the technology-related parameters for feature sizes ranging from 65nm to 18nm. We restrict our study to the 65nm-18nm CMOS bulk technology nodes, as the impact of the power and reliability walls is more pronounced at technology scales beyond 90nm, and the CMOS bulk model is the one most vulnerable to static leakage issues, where employing the drowsy cache is more significant than in Hi-K or SOI technologies. Also, beyond the 18nm node we could not obtain CMOS bulk technology scaling parameters from the ITRS2007 profiles in the MASTAR tool.
Also, for the estimation of the node capacitance, the PMOS leakage current, and the critical charge factors, we employ the High Performance profile in the MASTAR tool. We choose the HP profile because the variation in the threshold voltage is more significant there (the margin between the nominal VDD and the threshold voltage is large enough) than in the LSTP and LP profiles. The High Performance profile also provides relatively larger leakage currents.
Further, to evaluate the SER for drowsy cache modes, apart from the nominal VDD we employ three lower operational voltages chosen as multiples of the threshold voltage, namely two, one-and-a-half, and 1.1 times the threshold voltage, even though the literature indicates the best practical drowsy operational voltage to be 1.5 times the threshold voltage [10]. This way, we ensure that we consider the lowest possible operational drowsy voltage level, so as to account for drowsy voltage operation more aggressively.
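The swept drowsy levels can be expressed directly as multiples of the per-node threshold voltage. The small helper below illustrates the sweep; the Vth values are placeholders, the real ones being taken from MASTAR under the ITRS2007 High Performance profile.

# Drowsy supply candidates per technology node, as multiples of Vth.
vth_by_node = {65: 0.29, 45: 0.27, 32: 0.25, 22: 0.24, 18: 0.23}  # volts, hypothetical
drowsy_multiples = (2.0, 1.5, 1.1)

for node, vth in sorted(vth_by_node.items(), reverse=True):
    levels = [round(m * vth, 3) for m in drowsy_multiples]
    print(f"{node}nm: drowsy Vdd candidates {levels} V")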
For simulation in the SimpleScalar tool, we plan to use the sim-cache mode of SimpleScalar, which accounts for the in-order execution of the load/store instructions that access the cache without adding the complexity of executing all other instruction types in out-of-order mode. This lets us execute the benchmarks much faster without compromising the execution of the cache-access-related instructions. Further, we plan to set a drowsy-window size of 4000 cycles, as this window size approximately corresponds to the sweet spot of the energy-delay tradeoff for in-order processor cores [20].
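Conceptually, the window size parameterizes the “simple” drowsy policy of [20]: once per window every line drops to the drowsy voltage, and an access to a drowsy line first pays a wake-up penalty. A schematic sketch of that policy follows (the class name and the one-cycle penalty are our assumptions, not SimpleScalar code):

DROWSY_WINDOW = 4000  # cycles between global drowsy sweeps, per [20]
WAKEUP_PENALTY = 1    # assumed wake-up latency in cycles

class DrowsyCacheModel:
    def __init__(self, num_lines):
        self.drowsy = [False] * num_lines  # all lines start awake

    def tick(self, cycle):
        # Periodic global sweep: every line goes drowsy each window.
        if cycle % DROWSY_WINDOW == 0:
            self.drowsy = [True] * len(self.drowsy)

    def access(self, line):
        # A drowsy line must be woken (restored to full Vdd) first.
        penalty = WAKEUP_PENALTY if self.drowsy[line] else 0
        self.drowsy[line] = False
        return penalty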
Finally, we choose the workloads from the SPEC2000 benchmarks for the PISA architecture. We choose benchmarks from both the SPECINT and SPECFP suites, so that we can characterize both integer-based and floating-point-oriented behavior. Further, based on the results shown in the drowsy-cache implementation paper [20], we choose the ‘gzip’ benchmark from the integer workloads and the ‘ammp’ benchmark from the floating-point workloads, as these two have the highest run-time overheads, which suggests a larger number of accesses to the different drowsy-mode cache lines in every window cycle. This gives us a good mix of benchmarks in terms of cache accesses.
The PARMA model, which accurately classifies cache faults into TRUE vs. RAW errors and SDC vs. DUE errors, is employed to classify the soft errors into SDC and DUE, and is used without any modification to the core model. We only change the execution pattern from cycle-by-cycle mode to instruction-by-instruction mode; the notion of a cycle that the PARMA model employs for the estimation and classification of soft errors is not changed.
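For single-bit faults, the classification reduces to whether the faulty value is consumed by the program and whether the protection in place can detect or correct it. The decision sketch below is our paraphrase of that logic, not the PARMA implementation:

def classify_single_bit_fault(consumed, corrected_by_ecc, detected):
    # consumed         - the faulty word is read before being overwritten
    # corrected_by_ecc - protection repairs the flip (the 1-bit ECC case)
    # detected         - protection sees the flip but cannot repair it
    if not consumed:
        return "masked"     # overwritten before use: no visible error
    if corrected_by_ecc:
        return "corrected"  # fixed on read: no failure contribution
    if detected:
        return "DUE"        # detected unrecoverable error
    return "SDC"            # silent data corruption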
REFERENCES
[1] M. Powell et al. Gated-Vdd: A circuit technique to reduce leakage in deep-submicron cache memories. Proc. of Int. Symp. on Low Power Electronics and Design, 2000.
[2] K. Flautner et al. Drowsy caches: Simple techniques for reducing leakage power. Proc. of the 29th Annual Int. Symp. on Computer Architecture (ISCA 2002).
[3] S.S. Mukherjee, C. Weaver, J. Emer, S.K. Reinhardt, and T. Austin. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor. Proceedings of the 36th Annual International Symposium on Microarchitecture, pages 29-40, Dec. 2003.
[4] S. Mukherjee. Architecture Design for Soft Errors.
[5] H. Mostafa, M. Anis, and M. Elmasry. Comparative Analysis of Process Variation Impact on Flip-Flops Soft Error Rate.
[6] P. Hazucha and C. Svensson. Impact of CMOS Technology Scaling on the Atmospheric Neutron Soft Error Rate.
[7] Accurate Reliability Benchmarking of Caches with PARMA.
[8] A. J. Johnston. Scaling and Technology Issues for Soft Error Rates.
[9] V. Degalahal, L. Li, V. Narayanan, M. Kandemir, and M. J. Irwin. Soft Errors Issues in Low-Power Caches.
[10] T. Heijmen, D. Giot, and P. Roche. Factors that impact the critical charge of memory elements.
[11] T. Juhnke and H. Klar. Calculation of the soft error rate of submicron CMOS logic circuits. IEEE Journal of Solid-State Circuits, 30:830-834, July 1995.
[12] Y. Tosaka, S. Satoh, K. Suzuki, T. Sugii, H. Ehara, G. Woffinden, and S. Wender. Impact of cosmic ray neutron induced soft errors on advanced submicron CMOS circuits. Symposium on VLSI Technology Digest of Technical Papers, 1996.
[13] V. Degalahal, L. Li, V. Narayanan, M. Kandemir, and M.J. Irwin. Soft Errors Issues in Low-Power Caches. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 10, pages 1157-1166, Oct. 2005.
[14] D. Burger et al. Modeling the Impact of Device and Pipeline Scaling on the Soft Error Rate of Processor Elements. Technical Report, 2002.
[15] N.S. Kim, T. Austin, D. Blaauw, T. Mudge, K. Flautner, J.S. Hu, M.J. Irwin, M. Kandemir, and V. Narayanan. Leakage current: Moore's law meets static power. Computer, 36(12):68-75, Dec. 2003.
[16] S. Borkar. Design challenges of technology scaling. IEEE Micro, 19(4):23-29, Jul.-Aug. 1999.
[17] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De. Parameter variations and impact on circuits and microarchitecture. Proceedings of the 40th Annual Conference on Design Automation, pages 338-342, 2003.
[18] BOOK
[19] S.S. Mukherjee, C. Weaver, J. Emer, S.K. Reinhardt, and T. Austin. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor. Proceedings of the 36th Annual International Symposium on Microarchitecture, pages 29-40, Dec. 2003.
[20] K. Flautner, N. Kim, S. Martin, D. Blaauw, and T. Mudge. Drowsy caches: Simple techniques for reducing leakage power. Proceedings of the 29th Annual International Symposium on Computer Architecture, pages 148-157, July 2002.
[21] F. Ruckerbauer. Soft Error Rates in 65nm SRAMs – Analysis of New Phenomena. 13th IEEE International On-Line Testing Symposium (IOLTS 2007).
[22] R. Naseer et al. Critical Charge Characterization for Soft Error Rate Modeling in 90nm SRAM. IEEE International Symposium on Circuits and Systems (ISCAS 2007).
[23] Q. Ding et al. Impact of process variation on soft error vulnerability for nanometer VLSI circuits. 6th International Conference on ASIC (ASICON 2005).
[24] S. Thoziyoor, N. Muralimanohar, J. H. Ahn, and N. P. Jouppi. CACTI 5.1.