2012 39th Annual International Symposium on Computer Architecture (ISCA 2012)

Portland, Oregon, USA
9 – 13 June 2012

IEEE Catalog Number: CFP12030-PRT
ISBN: 978-1-4673-0475-7

Table of Contents

ISCA 2012 Organization ..... ix
ISCA'12 Sponsors & Supporters ..... xiii

Session 1: Memory Systems I

RAIDR: Retention-Aware Intelligent DRAM Refresh ..... 1
  Jamie Liu, Ben Jaiyen, Richard M. Veras, Onur Mutlu (Carnegie Mellon University)

PARDIS: A Programmable Memory Controller for the DDRx Interfacing Standards ..... 13
  Mahdi Nazm Bojnordi, Engin Ipek (University of Rochester)

BOOM: Enabling Mobile Memory Based Low-Power Server DIMMs ..... 25
  Doe Hyun Yoon, Jichuan Chang, Naveen Muralimanohar, Parthasarathy Ranganathan (HP Labs)

Towards Energy-Proportional Datacenter Memory with Mobile DRAM ..... 37
  Krishna T. Malladi, Frank Austin Nothaft, Karthika Periyathambi (Stanford University), Benjamin C Lee (Duke University), Christos Kozyrakis, Mark Horowitz (Stanford University)

Session 2A: GPU Architectures

Simultaneous Branch and Warp Interweaving for Sustained GPU Performance ..... 49
  Nicolas Brunie (Kalray and ENS de Lyon), Sylvain Collange (Universidade Federal de Minas Gerais), Gregory Diamos (NVIDIA Research)

CAPRI: Prediction of Compaction-Adequacy for Handling Control-Divergence in GPGPU Architectures ..... 61
  Minsoo Rhu, Mattan Erez (The University of Texas at Austin)

iGPU: Exception Support and Speculative Execution on GPUs ..... 72
  Jaikrishnan Menon, Marc de Kruijf, Karthikeyan Sankaralingam (University of Wisconsin, Madison)

Boosting Mobile GPU Performance With A Decoupled Access/Execute Fragment Processor ..... 84
  Jose Maria Arnau, Joan Manuel Parcerisa (Technical University of Catalonia), Polychronis Xekalakis (Intel Labs Barcelona)

Session 2B: Architectures for Security

Branch Regulation: Low-Overhead Protection from Code Reuse Attacks ..... 94
  Mehmet Kayaalp, Meltem Ozsoy, Nael Abu-Ghazaleh, Dmitry Ponomarev (SUNY at Binghamton)

Side-Channel Vulnerability Factor: A Metric for Measuring Information Leakage ..... 106
  John Demme, Robert Martin, Adam Waksman, Simha Sethumadhavan (Columbia University)

Time Warp: Rethinking Timekeeping and Performance Measurement Mechanisms to Mitigate Side Channels ..... 118
  Robert Martin, John Demme, Simha Sethumadhavan (Columbia University)

Inspection Resistant Memory: Architectural Support for Security from Physical Examination ..... 130
  Jonathan Valamehr, Timothy Sherwood (UC Santa Barbara), Andrew Putnam, Daniel Shumow, Melissa Chase, Seny Kamara (Microsoft Research), Vinod Vaikuntanathan (University of Toronto)

Session 3A: Interconnection Networks

Tolerating Process Variations in Nanophotonic On-chip Networks ..... 142
  Yi Xu, Jun Yang, and Rami Melhem (University of Pittsburgh)

A Micro-architectural Analysis of Switched Photonic Multi-chip Interconnects ..... 153
  Pranay Koka, Michael O. McCracken, Herb Schwetman (Oracle Labs), Chia-Hsin Chen (MIT), Xuezhe Zheng, Ron Ho, Kannan Raj, Ashok V. Krishnamoorthy (Oracle Labs)

Enhancing Effective Throughput for Transmission Line-Based Bus ..... 165
  Aaron Carpenter, Jianyun Hu, Ovunc Kocabas, Michael Huang, Hui Wu (University of Rochester)

A Case for Random Shortcut Topologies for HPC Interconnects ..... 177
  Michihiro Koibuchi (National Institute of Informatics), Hiroki Matsutani, Hideharu Amano (Keio University), D. Frank Hsu (Fordham University), Henri Casanova (University of Hawaii at Manoa)

Session 3B-1: Architectures for Software Productivity

Watchdog: Hardware for Safe and Secure Manual Memory Management and Full Memory Safety ..... 189
  Santosh Nagarakatte, Milo M K Martin, Steve Zdancewic (University of Pennsylvania)

RADISH: Always-On Sound and Complete Race Detection in Software and Hardware ..... 201
  Joseph Devietti, Benjamin Wood (University of Washington), Karin Strauss (Microsoft Research and University of Washington), Luis Ceze, Dan Grossman (University of Washington), Shaz Qadeer (Microsoft Research)

Session 3B-2: Heterogeneity

Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE) ..... 213
  Kenzo Van Craeynest (Ghent University), Aamer Jaleel (Intel), Lieven Eeckhout (Ghent University), Paolo Narvaez, Joel Emer (Intel/MIT)

The Yin and Yang of Power and Performance for Asymmetric Hardware and Managed Software ..... 225
  Ting Cao, Stephen M Blackburn, Tiejun Gao (The Australian National University), Kathryn S McKinley (Microsoft Research, The University of Texas at Austin)

Session 4: Circuits and Technology

Lane Decoupling for Improving the Timing-Error Resiliency of Wide-SIMD Architectures ..... 237
  Evgeni Krimer (The University of Texas at Austin), Patrick Chiang (Oregon State University), Mattan Erez (The University of Texas at Austin)

VRSync: Characterizing and Eliminating Synchronization-Induced Voltage Emergencies in Many-core Processors ..... 249
  Timothy N. Miller, Renji Thomas, Xiang Pan, Radu Teodorescu (The Ohio State University)

Session 5: Reliability

Euripus: A Flexible Unified Hardware Memory Checkpointing Accelerator for Bidirectional Debugging and Reliability ..... 261
  Ioannis Doudalis (Intel, Georgia Institute of Technology), Milos Prvulovic (Georgia Institute of Technology)

A First-Order Mechanistic Model for Architectural Vulnerability Factor ..... 273
  Arun Arvind Nair (University of Texas at Austin), Stijn Eyerman, Lieven Eeckhout (Ghent University, Belgium), Lizy Kurian John (University of Texas at Austin)

LOT-ECC: LOcalized and Tiered Reliability Mechanisms for Commodity Memory Systems ..... 285
  Aniruddha N. Udipi (University of Utah), Naveen Muralimanohar (HP Labs), Rajeev Balasubramonian, Al Davis (University of Utah), Norman P. Jouppi (HP Labs)

Session 6A: Cache Systems

Reducing memory reference energy with Opportunistic Virtual Caching ..... 297
  Arkaprava Basu, Michael M. Swift, Mark D. Hill (University of Wisconsin-Madison)

Improving Writeback Efficiency with Decoupled Last-Write Prediction ..... 309
  Zhe Wang, Samira M. Khan, Daniel A. Jiménez (The University of Texas at San Antonio)

FLEXclusion: Balancing Cache Capacity and On-Chip Traffic via Flexible Exclusion ..... 321
  Jaewoong Sim, Jaekyu Lee, Moinuddin K. Qureshi, Hyesoon Kim (Georgia Institute of Technology)

Session 6B: Dependable Architectures

Setting an Error Detection Infrastructure with Low Cost Acoustic Wave Detectors ..... 333
  Gaurang Upasani (UPC), Xavier Vera (Intel), Antonio Gonzalez (Intel & UPC)

Viper: Virtual Pipelines for Enhanced Reliability ..... 344
  Andrea Pellegrini, Joseph Greathouse, Valeria Bertacco (University of Michigan)

A Defect-Tolerant Accelerator for Emerging High-Performance Applications ..... 356
  Olivier Temam (INRIA)

Session 7A: Memory Systems II

A Case for Exploiting Subarray-Level Parallelism (SALP) in DRAM ..... 368
  Yoongu Kim, Vivek Seshadri, Donghyuk Lee, Jamie Liu, Onur Mutlu (Carnegie Mellon University)

PreSET: Improving Read-Write Performance of Phase Change Memories by Exploiting Asymmetry in Write Times ..... 380
  Moinuddin Qureshi (Georgia Institute of Technology), Michele Franceschini, Ashish Jagmohan, Luis Lastras (IBM)

Buffer-On-Board Memory System ..... 392
  Elliott Cooper-Balis, Paul Rosenfeld, Bruce Jacob (University Of Maryland)

Session 7B: Scheduling and Resource Management

PAQ: Physically Addressed Queuing for Resource Conflict Avoidance in Solid State Disk ..... 404
  Myoungsoo Jung, Ellis H. Wilson III, Mahmut Kandemir (The Pennsylvania State University)

Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems ..... 416
  Rachata Ausavarungnirun (Carnegie Mellon University), Gabriel Loh (Advanced Micro Devices), Kevin Chang, Lavanya Subramanian, Onur Mutlu (Carnegie Mellon University)

Probabilistic Shared Cache Management (PriSM) ..... 428
  R Manikantan (Indian Institute of Science), Kaushik Rajan (Microsoft Research), R Govindarajan (Indian Institute of Science)

Session 8: Application Analysis

Can Traditional Programming Bridge the Ninja Performance Gap for Parallel Computing Applications? ..... 440
  Nadathur Satish, Changkyu Kim, Jatin Chhugani, Hideki Saito, Rakesh Krishnaiyer, Mikhail Smelyanskiy, Milind Girkar, Pradeep Dubey (Intel Corporation)

Harmony: Collection and Analysis of Parallel Block Vectors ..... 452
  Melanie Kambadur, Kui Tang, Martha Kim (Columbia University)

Session 9: Virtualized Systems

Configurable Fine-Grain Protection for Multicore Processor Virtualization ..... 464
  David Wentzlaff (Princeton University), Christopher J. Jackson (Tilera), Patrick Griffin (Google), Anant Agarwal (Tilera)

Revisiting Hardware-Assisted Page Walks for Virtualized Systems ..... 476
  Jeongseob Ahn, Seongwook Jin, Jaehyuk Huh (KAIST)

Session 10A: Data Centers

Managing Distributed UPS Energy for Effective Power Capping in Data Centers ..... 488
  Vasileios Kontorinis, Liuyi Zhang, Baris Aksanli, Jack Sampson, Houman Homayoun (UCSD), Eddie Pettis (Google), Tajana Rosing, Dean Tullsen (UCSD)

Scale-Out Processors ..... 500
  Pejman Lotfi-Kamran, Boris Grot (EPFL), Michael Ferdman (CMU/EPFL), Stavros Volos, Onur Kocberber, Javier Picorel, Almutaz Adileh, Djordje Jevdjic (EPFL), Sachin Idgunji, Emre Ozer (ARM), Babak Falsafi (EPFL)

iSwitch: Coordinating and Optimizing Renewable Energy Powered Server Clusters ..... 512
  Chao Li, Amer Qouneh, Tao Li (University of Florida)

Session 10B: HW/SW Interface and Flexibility

End-To-End Sequential Consistency ..... 524
  Abhayendra Singh, Satish Narayanasamy (University of Michigan, Ann Arbor), Daniel Marino (Symantec), Todd Millstein (University of California, Los Angeles), Madanlal Musuvathi (Microsoft Research, Redmond)

BlockChop: Dynamic Squash Elimination for Hybrid Processor Architecture ..... 536
  Jason Mars (University of Virginia), Naveen Kumar (Intel Labs)

The Dynamic Granularity Memory System ..... 548
  Doe Hyun Yoon (HP Labs), Michael Sullivan, Min Kyu Jeong, Mattan Erez (The University of Texas at Austin)

Author Index ..... 561