DATE Conference Template

System and Circuit Level Power Modeling of
Energy-Efficient 3D-Stacked Wide I/O DRAMs
Karthik Chandrasekar
TU Delft
Christian Weis$, Benny Akesson*, Norbert Wehn$ & Kees Goossens#
Overview
• Motivation for 3D-stacking of DRAMs
• Problem Statement - Power Modeling
• Circuit-level DRAM architecture & power model
• System-level DRAM power model (DRAMPower)
• Comparison: Results and Analysis
• Summary
19-Mar-13
Karthik Chandrasekar / TU Delft
Motivation: Why 3D-Stacked DRAMs?
State of the art: Mobile LPDDR/2/3
• Off-Chip Interconnects (on PCB)
• PoP (Package-on-Package) (μ bumps)
• Capacitance: 8 to 20pF (-50% Power)
• 1 or 2 Channels (x32) (Low Bandwidth)
3D-Stacked Wide IO
• TSV (Through Silicon Via) - Many Dies
• Capacitance: ~2pF (-85% Power)
• 4 Channels (x128) (High Bandwidth)
[I/O power per bit: 0.7mW in TSV vs 2.3mW in PoP vs 4.6mW in Off-Chip – Samsung]
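The -50% and -85% power annotations above are consistent with the quoted per-bit I/O figures if the off-chip interconnect is taken as the baseline. The following is a minimal sketch of that arithmetic; treating Off-Chip as the baseline is an assumption, and the dictionary and function names are illustrative only.

```python
# I/O power per bit in mW, as quoted on the slide (attributed to Samsung).
io_power_mw_per_bit = {"Off-Chip": 4.6, "PoP": 2.3, "TSV": 0.7}


def saving_vs_baseline(power_mw, baseline_mw):
    """Return the fractional power saving relative to the baseline."""
    return 1.0 - power_mw / baseline_mw


# Assumption: the off-chip (PCB) interconnect is the reference point.
baseline = io_power_mw_per_bit["Off-Chip"]
savings = {
    name: saving_vs_baseline(power, baseline)
    for name, power in io_power_mw_per_bit.items()
}

for name, saving in savings.items():
    print(f"{name}: {saving:.0%} lower I/O power per bit than Off-Chip")
```

Under this assumption, PoP halves the per-bit I/O power and TSV removes roughly 85% of it.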
The Performance Vs. Power Factor
[Figure: Peak Bandwidth (GBps) and Power (mW/GBps) for SC LPDDR2 x32 (400), DC LPDDR2 x32 (533), DC LPDDR3 x32 (800) and QC Wide IO x128 (200)]
Images & Data Courtesy: HMC, JEDEC 42.6, FineTech, Nvidia, Samsung
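The peak-bandwidth side of this comparison can be approximated from the configuration labels alone. The sketch below assumes that SC/DC/QC mean single/dual/quad channel, that the xN width applies per channel, that the number in parentheses is the clock frequency in MHz, and that LPDDR2/LPDDR3 transfer data twice per clock while Wide IO transfers once per clock. The class and field names are illustrative and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    name: str
    channels: int
    width_bits: int            # data width per channel
    clock_mhz: float
    transfers_per_clock: int   # 2 for double data rate, 1 for single data rate

    def peak_bandwidth_gbps(self) -> float:
        """Peak bandwidth in gigabytes per second."""
        bits_per_second = (
            self.channels
            * self.width_bits
            * self.clock_mhz * 1e6
            * self.transfers_per_clock
        )
        return bits_per_second / 8 / 1e9


# Assumed interpretation of the four labels used in the figure.
configs = [
    MemoryConfig("SC LPDDR2 x32 (400)", 1, 32, 400, 2),
    MemoryConfig("DC LPDDR2 x32 (533)", 2, 32, 533, 2),
    MemoryConfig("DC LPDDR3 x32 (800)", 2, 32, 800, 2),
    MemoryConfig("QC Wide IO x128 (200)", 4, 128, 200, 1),
]

bandwidths = {c.name: c.peak_bandwidth_gbps() for c in configs}
for name, gbps in bandwidths.items():
    print(f"{name}: {gbps:.1f} GBps")
```

With these assumptions, the wide but slow Wide IO interface reaches the same peak bandwidth as dual-channel LPDDR3 at a quarter of the clock frequency.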
What’s missing? [Problem Statement]
An accurate 3D-DRAM Power Model
to design DRAM-stacked SoCs
Approaches to power modeling
• Circuit-level Power Model
– Modeling the DRAM architecture at the circuit-level in SPICE
– Pros: Accurate and detailed
– Cons: Slow; requires circuit-level understanding of the DRAM architecture;
technology specifications for DRAMs are not publicly available
• System-level Power Model (like Micron’s)
– Based on vendor provided datasheet measures and JEDEC specifications
– Pros: Fast, easy to integrate & employs simple models for memory operations
– Cons: Accuracy is unclear. Not directly applicable to 3D-DRAMs and not
verified against circuit-level models or hardware measurements.
Need: Fast, Simple & Accurate Model
What’s the solution?
Develop
A System-Level 3D-DRAM Power Model
that is as accurate as
A Circuit-Level 3D-DRAM Power Model
Circuit-Level DRAM Modeling
Baseline DRAM Model
• (Weis) DATE '11 and DAC '13
• NGSPICE - PTM/BSIM
• 1T1C Cell to Banks
2D to 3D (New)
• Based on DATE '11 & JEDEC Wide IO – x512
• 4 Banks/Channel
• 4 Channels
• TSV Routing
– Data, Cmd & Addr
– Control, Clock & Power
• No ODT (On Die Termination)
– Low Freq. & IO Capacitance
• No DLL (Delay Locked Loop)
• TSV model from IMEC/GaTech
System-Level Power Model (DRAMPower)
Comparison to Micron model
• Problem with Micron’s model:
– Not directly applicable for 3D-DRAMs (Multiple voltage domains and IO)
– Accuracy is unclear (State transitions not addressed & Approx. workload used)
– Not verified against circuit-level models or hardware power measurements.
• Adapting to 3D-DRAMs:
– Considers multiple voltage domains: (a) Core (b) Derived (Wordline)
– Includes IO power consumption (Incl. I/O Pads, Buffers, Bumps, Drivers & Pins)
– RD operation Energy (Generic equation):
• Modeling for Accuracy:
– Models memory state transitions – from active to power-down
– Models self-refresh accurately (functional correctness & timing difference)
– Most importantly: Is almost as accurate as the circuit-level model
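The generic RD energy equation is not reproduced in this text, so the sketch below is not the authors' equation. It only illustrates one common datasheet-style formulation under stated assumptions: for each voltage domain, the read-burst current minus the active-standby background current is multiplied by the domain voltage and the burst duration, and a separate I/O energy term is added. The names `VoltageDomain` and `read_energy_pj` and all numeric values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoltageDomain:
    name: str
    vdd_v: float      # supply voltage of this domain (V)
    idd4r_ma: float   # current while reading (mA)
    idd3n_ma: float   # active-standby background current (mA)


def read_energy_pj(domains, burst_cycles, clock_mhz, io_energy_pj=0.0):
    """Energy of one read burst above background, in picojoules.

    mA * V * ns gives pJ, so no further unit conversion is needed.
    """
    t_burst_ns = burst_cycles * 1e3 / clock_mhz
    core_pj = sum(
        (d.idd4r_ma - d.idd3n_ma) * d.vdd_v * t_burst_ns for d in domains
    )
    return core_pj + io_energy_pj


# Hypothetical placeholder values, not datasheet or paper figures.
example_domains = [
    VoltageDomain("core", vdd_v=1.2, idd4r_ma=100.0, idd3n_ma=10.0),
    VoltageDomain("wordline", vdd_v=1.8, idd4r_ma=5.0, idd3n_ma=1.0),
]
example_energy_pj = read_energy_pj(
    example_domains, burst_cycles=4, clock_mhz=200.0, io_energy_pj=50.0
)
print(f"Read burst energy: {example_energy_pj:.1f} pJ")
```

Summing per domain keeps the core and wordline supplies separate, which is the point of the multiple-voltage-domain adaptation listed above.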
Self-Refresh Operation - Accuracy
[Figure: self-refresh timing diagrams, Micron model vs. actual behaviour]
Micron
Commands: SREF, NOP ..., SREX, NOP ...
Timings   Active Current   Bckgnd Current
SREF      -                IDD6
XSDLL     -                IDD2N
Actual
• Internal Refresh
• No DLL
Commands: SREF, NOP ..., SREX, NOP ...
Timings   Active Current   Bckgnd Current
RFC-RP    IDD5-IDD3N       IDD3P0
RP        IDD5-IDD2N       IDD2P0
SREF      -                IDD6
XS        -                IDD2N
We furnish new equations in the system-level power model to address such accuracy issues
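The difference between the two self-refresh views can be sketched numerically. The code below is an illustration under assumptions, not the equations from the paper: the simple model charges IDD6 for the whole self-refresh period and IDD2N during exit, while the detailed model adds the internal refresh at entry (active current on top of power-down background current for RFC-RP and RP) before IDD6 applies. Treating the self-refresh time as including the initial refresh, the function names and all current and timing values are hypothetical.

```python
def energy_pj(current_ma, vdd_v, time_ns):
    """mA * V * ns gives picojoules."""
    return current_ma * vdd_v * time_ns


def sref_energy_simple(idd, vdd_v, t_sref_ns, t_exit_ns):
    """IDD6 for the whole self-refresh period, IDD2N during exit."""
    return (
        energy_pj(idd["IDD6"], vdd_v, t_sref_ns)
        + energy_pj(idd["IDD2N"], vdd_v, t_exit_ns)
    )


def sref_energy_detailed(idd, vdd_v, t_sref_ns, t_rfc_ns, t_rp_ns, t_xs_ns):
    """Accounts for the internal refresh performed on self-refresh entry."""
    t_ref_active_ns = t_rfc_ns - t_rp_ns
    # RFC-RP: refresh active current on top of active power-down background.
    total = energy_pj(
        (idd["IDD5"] - idd["IDD3N"]) + idd["IDD3P0"], vdd_v, t_ref_active_ns
    )
    # RP: refresh completes while precharging; precharged power-down background.
    total += energy_pj(
        (idd["IDD5"] - idd["IDD2N"]) + idd["IDD2P0"], vdd_v, t_rp_ns
    )
    # Remaining self-refresh time at IDD6 (assumes t_sref includes the refresh).
    total += energy_pj(idd["IDD6"], vdd_v, max(t_sref_ns - t_rfc_ns, 0.0))
    # Exit period at precharged-standby current.
    total += energy_pj(idd["IDD2N"], vdd_v, t_xs_ns)
    return total


# Hypothetical placeholder currents (mA) and timings (ns).
idd = {"IDD5": 100.0, "IDD3N": 20.0, "IDD2N": 10.0,
       "IDD3P0": 5.0, "IDD2P0": 2.0, "IDD6": 1.0}
simple_pj = sref_energy_simple(idd, 1.2, t_sref_ns=1000.0, t_exit_ns=140.0)
detailed_pj = sref_energy_detailed(
    idd, 1.2, t_sref_ns=1000.0, t_rfc_ns=130.0, t_rp_ns=20.0, t_xs_ns=140.0
)
print(f"simple: {simple_pj:.0f} pJ, detailed: {detailed_pj:.0f} pJ")
```

For short self-refresh periods the refresh at entry dominates, so a model that ignores it can underestimate energy considerably; the gap shrinks as the time spent at IDD6 grows.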
Comparison: Results & Analysis
• Experiment I:
– Different Operations
– Different Granularity
• Results:
– Less than 2% difference
– Adapted Micron SR (200): 72% diff.
• Experiment II:
– H.263 Encoder & EPIC Encoder
– JPEG Encoder & MPEG2 Decoder
– Different Loads and Power Modes
• Results:
– Less than 2% difference
– Adapted Micron: 12% diff. (SR 500MHz)
• The 2% difference is due to the use of JEDEC-specified averaged IDD currents.
Shows the accuracy of the system-level power model
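The slides do not state how the percentage difference is computed. A minimal sketch, assuming it is the relative deviation of the system-level estimate from the circuit-level estimate taken as the reference, could look as follows; the function name and the sample numbers are hypothetical.

```python
def percent_difference(system_level, circuit_level):
    """Relative deviation of the system-level estimate, in percent.

    Assumption: the circuit-level value is the reference.
    """
    if circuit_level == 0:
        raise ValueError("reference value must be non-zero")
    return abs(system_level - circuit_level) / abs(circuit_level) * 100.0


# Hypothetical energy estimates in arbitrary units.
example_diff = percent_difference(system_level=101.5, circuit_level=100.0)
print(f"difference: {example_diff:.1f}%")
```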
Summary
Key Highlights:
• Presented an accurate datasheet-based system-level power model for Wide I/O
3D-stacked DRAMs.
• Verified the system-level model for accuracy against a detailed SPICE-based
circuit-level 3D-DRAM architecture and power model.
• Observed < 2% difference in power and energy estimates for different memory
operations and for any variations in memory load.
Other Important Contributions:
• Provided estimates for IDD current measures for different JEDEC 3D-DRAM
configurations, in place of the as yet unavailable datasheets (in the paper).
• The system-level power model (DRAMPower) has been released online as an
open-source 3D-DRAM power estimation tool. Download link:
www.drampower.info