Chapter 5A: Exploiting the Memory Hierarchy, Part 1
Read Section 5.1: Introduction
Adapted from slides by Prof. Mary Jane Irwin, Penn State University, and slides supplied by the textbook publisher
CPE432 Chapter 5A.1 Dr. W. Abu-Sufah, UJ

Review: Major Components of a Computer
[Figure: block diagram. Labels: Processor (Control, Datapath); Memory (Cache, Main Memory, Secondary Memory (Disk)); Devices (Input, Output)]

The “Memory Wall”
- Processor vs DRAM speed disparity continues to grow
[Figure: log-scale plot, 0.01 to 1000, of clocks per instruction (Core) and clocks per DRAM access (Memory) for VAX/1980, PPro/1996, and 2010+]
- Good memory hierarchy (cache) design is increasingly important to overall performance

The Memory Hierarchy Goal
- Fact: Large memories are slow and fast memories are small
- How do we create a memory that gives the illusion of being large and fast (most of the time), and also cheap?
  - With hierarchy
  - With parallelism

A Typical Memory Hierarchy
The memory system of a modern computer consists of a series of black boxes ranging from the fastest to the slowest. Besides speed, these boxes also vary in size (smallest to biggest) and cost.
[Figure: levels from fastest to slowest. On-Chip Components (Control, Datapath, RegFile, ITLB, DTLB, Instr Cache, Data Cache); Second Level Cache (SRAM); Main Memory (DRAM); Secondary Memory (Disk)]
- Speed (%cycles): ½’s, 1’s, 10’s, 100’s, 10,000’s
- Size (bytes): 100’s, 10K’s, M’s, G’s, T’s
- Cost: highest to lowest

Characteristics of the Memory Hierarchy
[Figure: Processor, L1$, L2$, Main Memory (MM), and Secondary Memory (SM), in order of increasing distance from the processor in access time; the drawn width of each level shows the (relative) size of the memory at that level]
- Unit of transfer between Processor and L1$: 4-8 bytes (word)
- Unit of transfer between L1$ and L2$: 8-32 bytes (block)
- Unit of transfer between L2$ and MM: 1 to 4 blocks
- Unit of transfer between MM and SM: 1,024+ bytes (disk sector = page)
- Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in MM, which is a subset of what is in SM

Why Does the Concept of a Memory Hierarchy Work?
- What makes this kind of hierarchical memory organization work is the principle of locality of memory references generated by programs.
- The principle of locality states that programs access a relatively small portion of the address space at any instant of time.
- A memory hierarchy takes advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology, at the speed offered by the fastest technology.

Memory Hierarchy Technologies: the Cache
- Caches use SRAM for speed and technology compatibility
  - Fast (typical access times of 0.5 to 2.5 nsec)
  - Low density (6-transistor cells), higher power, expensive ($2000 to $5000 per GB in 2008)
  - Static: content will last “forever” (as long as power is left on)

Memory Hierarchy Technologies: Main Memory
- Main memory uses DRAM for size (density)
  - Slower (typical access times of 50 to 70 nsec)
  - High density (1-transistor cells), lower power, cheaper ($20 to $75 per GB in 2008)
  - Dynamic: needs to be “refreshed” regularly (~ every 8 ms)
    - Refresh consumes 1% to 2% of the active cycles of the DRAM
  - Addresses are divided into 2 halves (row and column)
    - RAS, or Row Access Strobe, triggers the row decoder
    - CAS, or Column Access Strobe, triggers the column selector

Principle of Locality
- Temporal Locality (locality in time)
  - If a memory location is referenced, then it will tend to be referenced again soon
  - Keep the most recently accessed data items closer to the processor
- Spatial Locality (locality in space)
  - If a memory location is referenced, the locations with nearby addresses will tend to be referenced soon
  - Move blocks consisting of contiguous words closer to the processor

The Memory Hierarchy: Terminology
- Block (or line): the minimum unit of information that is present (or not present) in a cache
- Hit Rate: the fraction of memory accesses found in a level of the memory hierarchy
- Hit Time: the time to access that level, which consists of
  - Time to determine hit/miss + Time to access the block

The Memory Hierarchy: Terminology II
- Miss Rate: the fraction of memory accesses not found in a level of the memory hierarchy
  - Miss Rate = 1 - (Hit Rate)
- Miss Penalty: the time to replace a block in that level with the corresponding block from a lower level, which consists of
  - Time to access the block in the lower level + Time to transmit that block to the level that experienced the miss + Time to insert the block in that level + Time to pass the block to the requestor
- Hit Time <<< Miss Penalty

How is the Hierarchy Managed?
- registers <-> memory
  - by the compiler (programmer?)
- cache <-> main memory
  - by the cache controller hardware
- main memory <-> disks
  - by the operating system (virtual memory)
  - virtual to physical address mapping assisted by the hardware (TLB)
  - by the programmer (files)
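
The locality and hit-rate ideas above can be tried out with a very small cache model. The sketch below is only an illustration, not something taken from the slides: the direct-mapped organization, the 64-block cache, the 32-byte block size, and the three address traces are all assumptions chosen to make the effect visible.

```python
def simulate(addresses, num_blocks=64, block_bytes=32):
    """Return (hits, misses) for a byte-address trace.

    Models a hypothetical direct-mapped cache: each memory block can live
    in exactly one cache slot, chosen by (block number mod num_blocks).
    """
    tags = [None] * num_blocks          # tag stored in each cache slot
    hits = misses = 0
    for addr in addresses:
        block_number = addr // block_bytes
        index = block_number % num_blocks
        tag = block_number // num_blocks
        if tags[index] == tag:
            hits += 1                   # block already present at this level
        else:
            misses += 1                 # fetch the block from the lower level
            tags[index] = tag
    return hits, misses


def hit_rate(addresses, **cache_params):
    """Fraction of accesses found in the cache; miss rate is 1 - hit rate."""
    hits, misses = simulate(addresses, **cache_params)
    return hits / (hits + misses)


# Spatial locality: 4096 contiguous 4-byte words (8 words per 32-byte block).
sequential = [4 * i for i in range(4096)]

# No locality: every access touches a different 32-byte block exactly once.
one_per_block = [32 * i for i in range(4096)]

# Temporal locality: a 1 KB working set (256 words) scanned 16 times.
small_loop = [4 * i for i in range(256)] * 16

if __name__ == "__main__":
    for name, trace in [("sequential", sequential),
                        ("one word per block", one_per_block),
                        ("small loop", small_loop)]:
        rate = hit_rate(trace)
        print(f"{name:20s} hit rate = {rate:.4f}  miss rate = {1 - rate:.4f}")
```

With these assumed parameters the sequential trace misses once per block and then hits on the remaining seven words, the one-word-per-block trace never hits, and the small loop misses only on its first pass because its working set fits in the cache.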
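
Hit time, miss rate, and miss penalty are commonly combined into an average memory access time: hit time + miss rate x miss penalty. That formula is the standard textbook one and is not stated on these slides, so treat the sketch below as a supplement. The function name and the example numbers are assumptions; the 1 ns hit time and 60 ns miss penalty are simply picked from inside the SRAM and DRAM access-time ranges quoted earlier.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time in ns: hit time + miss rate * miss penalty.

    Every access pays the hit time (the time to determine hit/miss and
    access the block); only the fraction that misses also pays the penalty.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


if __name__ == "__main__":
    HIT_TIME_NS = 1.0        # assumed: within the 0.5 to 2.5 ns SRAM range
    MISS_PENALTY_NS = 60.0   # assumed: within the 50 to 70 ns DRAM range
    for miss_rate in (0.01, 0.05, 0.20):
        amat = average_access_time(HIT_TIME_NS, miss_rate, MISS_PENALTY_NS)
        print(f"miss rate {miss_rate:.2f} -> average access time {amat:.2f} ns")
```

Because Hit Time <<< Miss Penalty, even a small miss rate dominates the result, which is why a hierarchy only delivers near-SRAM speed when locality keeps the miss rate low.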