RAIDR: Retention-Aware Intelligent DRAM Refresh
Jamie Liu, Ben Jaiyen, Richard Veras, Onur Mutlu

Executive Summary
- DRAM requires periodic refresh to avoid data loss due to capacitor charge leakage
- Refresh operations interfere with memory accesses and waste energy
- Refresh overhead limits DRAM scaling
- Observation: High refresh rate caused by few weak DRAM cells
- Problem: All cells refreshed at the same high rate
- Idea: RAIDR decreases refresh rate for most DRAM cells while refreshing weak cells at a higher rate
  - Group parts of DRAM into different bins depending on their required refresh rate
  - Use Bloom filters for scalable and efficient binning
  - Refresh each bin at the minimum rate needed
- RAIDR reduces refreshes significantly with low overhead in the memory controller

Outline
- Executive Summary
- Background & Motivation
- Key Observation & Our Mechanism: RAIDR
- Evaluation
- Conclusion

DRAM Refresh
[Figure: a DRAM array (rows of cells) and its row buffer. A refresh is an Activate, which reads a row into the row buffer, followed by a Precharge.]

Refresh Overhead: Performance
[Figure: % time spent refreshing (0-100) vs. device capacity (2 Gb to 64 Gb), split into present and future devices. Annotated values: 8% (present) and 46% (future).]

Refresh Overhead: Energy
[Figure: % DRAM energy spent refreshing (0-100) vs. device capacity (2 Gb to 64 Gb), split into present and future devices. Annotated values: 15% (present) and 47% (future).]

Key Observation and Idea
[Figure: retention failures (10^0 to 10^12, log scale) vs. refresh interval (100 ms to 1000 s) for a 32 GB DRAM, with the cutoff at 64 ms. Annotated: < 50 retention failures @ 128 ms; < 1000 retention failures @ 256 ms.]
- Key observation: Most cells can be refreshed infrequently without losing data [Kim+, EDL '09]
- Problem: All cells are refreshed at the same worst-case rate
- Key idea: Refresh rows containing weak cells more frequently; refresh other rows less frequently

Retention-Aware Intelligent DRAM Refresh
1. Profiling
   - Determine each row's retention time (how frequently each row needs to be refreshed to avoid losing data)
2. Binning
   - Group rows into different retention time bins based on their retention time
3. Refreshing
   - Refresh rows in different bins at different rates

1. Retention Time Profiling
- To profile a row:
  1. Write data to the row
  2. Prevent it from being refreshed
  3. Measure time before data corruption

                Row 1                     Row 2                      Row 3
Initially       11111111...               11111111...                11111111...
After 64 ms     11111111...               11111111...                11111111...
After 128 ms    11011111... (64-128 ms)   11111111...                11111111...
After 256 ms                              11111011... (128-256 ms)   11111111... (>256 ms)
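
The profiling steps above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' implementation: `profile_row`, `CHECK_POINTS_MS`, and the per-row retention values are hypothetical stand-ins for writing a pattern to real DRAM, withholding refresh, and reading the row back at each check point.

```python
# Check points (ms) at which an unrefreshed row is read back, as in the example above.
CHECK_POINTS_MS = (64, 128, 256)


def profile_row(true_retention_ms):
    """Return the (low, high) retention bin of one row, in milliseconds.

    true_retention_ms is a hypothetical stand-in for the hardware: the row is
    treated as corrupted at the first check point that exceeds this value.
    high is None for rows that survive every check point.
    """
    low = 0
    for check in CHECK_POINTS_MS:
        if true_retention_ms < check:  # data found corrupted at this check
            return (low, check)
        low = check
    return (low, None)


# Hypothetical retention times chosen to mirror the three rows in the example.
simulated_rows = {"Row 1": 100, "Row 2": 200, "Row 3": 1000}
bins = {name: profile_row(t) for name, t in simulated_rows.items()}

for name, (low, high) in bins.items():
    label = f">{low} ms" if high is None else f"{low}-{high} ms"
    print(name, label)
```

With these assumed retention times, Row 1 lands in the 64-128 ms bin, Row 2 in the 128-256 ms bin, and Row 3 in the >256 ms bin, matching the table.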

2. Grouping Rows Into Retention Time Bins
- Rows are grouped into different bins based on their profiled retention time
[Figure: DRAM rows assigned to bins: Row 1 in the 64-128 ms bin, Row 2 in the 128-256 ms bin, Row 3 in the >256 ms bin.]
- Store bins using Bloom filters [Bloom, CACM '70]

Storing Retention Time Bins Using Bloom Filters
Example with the 64-128 ms bin (one bit array, three hash functions):
- Insert Row 1: the bits selected by hash functions 1, 2, and 3 are set to 1
- Row 1 present? All three selected bits are 1 (AND = 1): Yes
- Row 2 present? Not all selected bits are 1 (AND = 0): No
- Insert Row 4: its three selected bits are set to 1
- Row 5 present? All three selected bits are 1 (AND = 1), although Row 5 was never inserted: Yes (false positive)
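
The insert and lookup operations in the example above can be sketched in software. This is a minimal sketch, not the hardware design: the `BloomFilter` class and its method names are hypothetical, and salted SHA-256 is an assumed stand-in for the hash functions, which the slides do not specify. The 256 B size and 10 hash functions follow the deck's default configuration for the 64-128 ms bin.

```python
import hashlib


class BloomFilter:
    """Bit-array Bloom filter keyed by row number (illustrative sketch)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, row):
        # Assumption: derive each hash function by salting SHA-256 with its index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{row}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, row):
        # Set every bit selected by the hash functions.
        for pos in self._positions(row):
            self.bits[pos] = 1

    def __contains__(self, row):
        # Present only if all selected bits are 1 (the AND in the example).
        return all(self.bits[pos] for pos in self._positions(row))


bin_64_128 = BloomFilter(num_bits=256 * 8, num_hashes=10)
bin_64_128.insert(1)
bin_64_128.insert(4)

print(1 in bin_64_128)  # True: an inserted row is always reported present
print(5 in bin_64_128)  # normally False; True here would be a false positive
```

Bits are only ever set, never cleared, which is why a lookup can produce a false positive but never a false negative.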

Bloom Filters: Key Characteristics
- False positives: a row may be declared present in the Bloom filter even if it was never inserted
  - Not a problem: Rows may be refreshed more frequently than needed
- No false negatives: a row will always be declared present in the Bloom filter if it was inserted
  - No correctness problems: Rows are never refreshed less frequently than needed
- No overflow: any number of rows may be inserted into a Bloom filter
  - Scalable: contrast with straightforward table implementation
- Bloom filters allow implementation of retention time bins with low hardware overhead
  - 1.25 KB storage overhead (2 Bloom filters) for 32 GB DRAM system

3. Refreshing Rows at Different Rates
- Memory controller chooses each row as a refresh candidate every 64 ms
- Row in 64-128 ms bin? (First Bloom filter: 256 B)
  - Yes: refresh the row
  - No: Row in 128-256 ms bin? (Second Bloom filter: 1 KB)
    - Yes: every other 64 ms window, refresh the row
    - No: every 4th 64 ms window, refresh the row

Tolerating Temperature Variation: Refresh Rate Scaling
- Change in temperature causes retention time of all cells to change by a uniform and predictable factor
- Refresh rate scaling: increase the refresh rate for all rows uniformly, depending on the temperature
- Implementation: counter with programmable period
  - Lower temperature ⇒ longer period ⇒ less frequent refreshes
  - Higher temperature ⇒ shorter period ⇒ more frequent refreshes
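
The refresh-candidate flow and the refresh rate scaling described above can be sketched as follows. The function names, the `window_index` counter, and the `rate_scale` parameter are hypothetical; the slides say only that the period is programmable, not how the scaling factor is derived from temperature. Plain Python sets stand in for the two Bloom filters.

```python
def should_refresh(row, window_index, bin_64_128, bin_128_256):
    """Decide whether `row` is refreshed in the given refresh window.

    bin_64_128 and bin_128_256 are any containers supporting `in`
    (sets here; Bloom filters in the scheme described above).
    """
    if row in bin_64_128:
        return True                   # weakest rows: every window
    if row in bin_128_256:
        return window_index % 2 == 0  # every other window
    return window_index % 4 == 0      # all remaining rows: every 4th window


def window_length_ms(base_ms=64.0, rate_scale=1.0):
    """Length of one refresh window.

    rate_scale is an assumed knob: values above 1 shorten the period
    (higher temperature), values below 1 lengthen it (lower temperature).
    """
    return base_ms / rate_scale


weak, medium = {1}, {2}
for row in (1, 2, 3):
    count = sum(should_refresh(row, w, weak, medium) for w in range(4))
    print(f"row {row}: {count} refreshes in 4 windows")
```

Over four windows this gives 4, 2, and 1 refreshes for rows 1, 2, and 3. Because a false positive only moves a row to an earlier branch, it can increase but never decrease that row's refresh rate.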

Methodology
- 8-core, 4 GHz, 512 KB 16-way private cache per core
- 32 GB DDR3 DRAM system (2 channels, 4 ranks/channel)
- 1.25 KB storage overhead for 2 Bloom filters
- Extended temperature range (85-95 °C) characteristic of server environments
- SPEC CPU2006, TPC-C, TPC-H benchmarks in 8-core multiprogrammed workloads
  - Benchmarks categorized by memory intensity (LLC misses per 1000 instructions)
  - Workloads categorized by fraction of memory-intensive benchmarks
  - 32 workloads per category, 5 workload categories

Comparison Points
- Auto-refresh [DDR3, LPDDR2, ...]:
  - Memory controller periodically sends auto-refresh commands
  - DRAM devices refresh many rows on each command
  - Baseline typical in modern systems
  - All rows refreshed at same rate
- Distributed refresh:
  - Memory controller refreshes each row individually by sending activate and precharge commands to DRAM
  - All rows refreshed at same rate
- Smart Refresh [Ghosh+, MICRO '07]:
  - Memory controller refreshes each row individually
  - Refreshes to recently activated rows are skipped
  - Requires programs to activate many rows to be effective
  - Very high storage overhead (1.5 MB for 32 GB DRAM)
- No refresh (ideal)

Refresh Operations Performed (32 GB DRAM)
[Figure: % of refreshes performed (0-100) for Distributed, Smart, RAIDR, and No-refresh. Annotated value: 25.4% (RAIDR).]
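
As a back-of-the-envelope check on the 25.4% figure, the sketch below applies the standard Bloom filter false-positive estimate, (1 - e^(-kn/m))^k, to the default configuration listed in the backup slides (256 B filter, 10 hash functions, 28 rows; 1 KB filter, 6 hash functions, 978 rows), then estimates the fraction of refreshes that remain. It is an approximation, not the authors' evaluation: the function names are hypothetical, and the 8 KB row size (hence 2^22 rows in 32 GB) is an assumption that does not appear in the slides.

```python
import math


def false_positive_probability(num_bits, num_hashes, num_inserted):
    """Standard Bloom filter estimate: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_inserted / num_bits)) ** num_hashes


# Default configuration: filter size in bits, hash functions, rows in the bin.
p_64_128 = false_positive_probability(256 * 8, 10, 28)
p_128_256 = false_positive_probability(1024 * 8, 6, 978)


def refresh_fraction(total_rows, rows_64_128=28, rows_128_256=978):
    """Expected refreshes relative to refreshing every row every 64 ms."""
    # Rows that hit the first filter (true members plus false positives).
    in_first = rows_64_128 + p_64_128 * (total_rows - rows_64_128)
    rest = total_rows - in_first
    # Rows that miss the first filter but hit the second one.
    in_second = rows_128_256 + p_128_256 * (rest - rows_128_256)
    in_neither = rest - in_second
    return (in_first + in_second / 2 + in_neither / 4) / total_rows


# Assumption (not stated in the slides): 8 KB rows, so 32 GB holds 2**22 rows.
TOTAL_ROWS = (32 * 2**30) // (8 * 2**10)
fraction = refresh_fraction(TOTAL_ROWS)

print(f"{p_64_128:.3g} {p_128_256:.3g} {fraction:.3f}")
```

Under these assumptions the two probabilities agree with the configured values (about 1.16 · 10^-9 and 0.0179), and the estimated fraction is roughly a quarter of the baseline refreshes, close to the reported 25.4%. Nearly all of the excess over the ideal 25% comes from false positives in the second filter.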

Performance
[Figure: weighted speedup for Auto, Distributed, Smart, RAIDR, and No Refresh vs. fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg). Annotated values: 6.1%, 8.4%, 9.3%, 9.6%, 9.8%; average 8.6%.]

DRAM Energy Efficiency
[Figure: energy per access (nJ) for Auto, Distributed, Smart, RAIDR, and No Refresh vs. fraction of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg). Annotated values: 18.9%, 17.3%, 15.4%, 13.7%, 12.6%; average 16.1%.]

DRAM Device Capacity Scaling: Performance
[Figure: weighted speedup for Auto and RAIDR vs. device capacity (4 Gb to 64 Gb). Annotated value: 108%.]

DRAM Device Capacity Scaling: Energy
[Figure: energy per access (nJ) for Auto and RAIDR vs. device capacity (4 Gb to 64 Gb). Annotated value: 50%.]

Conclusion
- Refresh is an energy and performance problem that is becoming increasingly significant in DRAM systems
  - Refresh limits DRAM scaling
- High refresh rate is caused by a small number of problematic cells
- RAIDR groups rows into bins and refreshes rows in different bins at different rates
  - Uses Bloom filters for scalable and efficient binning
  - 74.6% refresh reduction, 8.6% performance improvement, 16.1% DRAM energy reduction at 1.25 KB overhead
- RAIDR's benefits improve with increasing DRAM density
  - Enables better DRAM scaling

DRAM Hierarchy
[Figure: processor (core and memory controller) connected over channels to ranks, each containing banks.]

DRAM Array Organization
[Figure: DRAM array with cells at the intersections of word lines (rows) and bit lines; the sense amps form the row buffer.]

DRAM Activation
[Figure: a cell and its sense amp during activation. The word line rises from 0 V to VPP, the bit line moves from VDD/2 to VDD/2+δ, and the sense amp then drives it to VDD.]
DRAM Precharge
[Figure: animated circuit diagram with a sense amp. Voltage labels across the builds: V_PP, 0 V, V_DD, V_DD/2.]

RAIDR Components
[Figure: block diagram. Labels: Period Counter, Row Counter, Refresh Rate Scaler Period, Refresh Rate Scaler Counter, 64–128 ms Bloom Filter, 128–256 ms Bloom Filter.]

Idle Power Consumption
[Figure: idle DRAM power consumption in W (y-axis, 0 to 5) at normal and extended temperature for Auto, Self Refresh, RAIDR, and No Refresh. Annotated percentages: 19.6%, 12.2%.]

Performance: 85 °C
[Figure: weighted speedup (y-axis, 4.0 to 8.5) versus percentage of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh. Annotated percentages: 2.9%, 3.8%, 4.4%, 4.1%, 4.5%, 4.8%.]

Energy: 85 °C
[Figure: energy per access in nJ (y-axis, 0 to 100) versus percentage of memory-intensive benchmarks in workload (0%, 25%, 50%, 75%, 100%, Avg) for Auto, Distributed, Smart, RAIDR, and No Refresh. Annotated percentages: 10.1%, 9.0%, 7.9%, 8.3%, 6.9%, 6.4%.]

RAIDR Default Configuration
- 64–128 ms bin: 256 B Bloom filter, 10 hash functions; 28 rows in bin, false positive probability 1.16 × 10^-9
- 128–256 ms bin: 1 KB Bloom filter, 6 hash functions; 978 rows in bin, false positive probability 0.0179

Refresh Reduction vs. RAIDR Configuration
[Figure: number of refreshes performed (y-axis, 0.0 to 4.0 ×10^7) for Auto, RAIDR, the 1-bin configurations (1, 2), the 2-bin configurations (1 to 4), and the 3-bin configurations (1 to 5).]

RAIDR Configurations

Key         Description                                                                 Storage Overhead
Auto        Auto-refresh                                                                N/A
RAIDR       Default RAIDR: 2 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192)           1.25 KB
1 bin (1)   1 bin (64–128 ms, m = 512)                                                  64 B
1 bin (2)   1 bin (64–128 ms, m = 1024)                                                 128 B
2 bins (1)  2 bins (64–128 ms, m = 2048; 128–256 ms, m = 2048)                          512 B
2 bins (2)  2 bins (64–128 ms, m = 2048; 128–256 ms, m = 4096)                          768 B
2 bins (3)  2 bins (64–128 ms, m = 2048; 128–256 ms, m = 16384)                         2.25 KB
2 bins (4)  2 bins (64–128 ms, m = 2048; 128–256 ms, m = 32768)                         4.25 KB
3 bins (1)  3 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192; 256–512 ms, m = 32768)   5.25 KB
3 bins (2)  3 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192; 256–512 ms, m = 65536)   9.25 KB
3 bins (3)  3 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192; 256–512 ms, m = 131072)  17.25 KB
3 bins (4)  3 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192; 256–512 ms, m = 262144)  33.25 KB
3 bins (5)  3 bins (64–128 ms, m = 2048; 128–256 ms, m = 8192; 256–512 ms, m = 524288)  65.25 KB
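
The default-configuration numbers and the storage overheads in the table above can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions, not the authors' code. It assumes m is the Bloom filter size in bits (consistent with 256 B = 2048 bits), that "rows in bin" is the number of inserted elements n, and that the quoted probabilities follow the standard approximation (1 - e^(-kn/m))^k, which the slides do not state explicitly. The function names are hypothetical.

```python
import math


def bloom_false_positive(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Standard Bloom filter approximation: (1 - e^(-k*n/m))^k.

    Assumption: this is the formula behind the slide's probabilities.
    """
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def storage_bytes(filter_bits) -> int:
    """Total storage for a set of Bloom filters, given their sizes in bits."""
    return sum(filter_bits) // 8


# Default RAIDR configuration as listed on the slide
# (m assumed to be in bits: 256 B -> 2048, 1 KB -> 8192).
DEFAULT_BINS = {
    "64-128 ms": {"m_bits": 2048, "k_hashes": 10, "n_items": 28},
    "128-256 ms": {"m_bits": 8192, "k_hashes": 6, "n_items": 978},
}

for name, cfg in DEFAULT_BINS.items():
    p = bloom_false_positive(**cfg)
    print(f"{name} bin: false positive probability ~ {p:.3g}")

total = storage_bytes(cfg["m_bits"] for cfg in DEFAULT_BINS.values())
print(f"default storage overhead: {total} B = {total / 1024} KB")
```

Under these assumptions the two probabilities round to 1.16 × 10^-9 and 0.0179, and the default overhead comes to 1280 B = 1.25 KB, matching the slides. The same `storage_bytes` helper reproduces the other table rows, for example 2048 + 8192 + 524288 bits gives 65.25 KB for "3 bins (5)".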