Serial link DRAM system

Serial Network SDRAM
ENEE 759H, Spring 2003
Introduction

SDRAM system drawbacks
- No parallelism for memory accesses
- Multitude of pins for address/command/data

Overall Goals
- Increase parallelism, reduce latency
- Reduce pin count
- Attempt to increase bandwidth
Motivation

Poulton’s idea
- Bi-directional serial links.
- Theoretically high bandwidth!
- Fewer pins required for same functionality!
- Looks perfect!
*Graphic from Poulton’s Signaling Tutorial
Evolution I

Initial design
- Split topology.
- Effectively halve latency.
- Complicated protocol and connection details.
[Figure labels: Memory Controller; Address, etc.]
Evolution II

Initial design
- Individual DRAM chips directly connected.
- High overall bandwidth.
- Inflexible, lower capacity for system.

We need a better design!
[Figure labels: Memory Controller; 8 SDRAM Chips]
The Next Step
Want simple system interconnects
- Keep basic SDRAM chip structure intact
- Utilize the strengths of both parallel and serial connections
- Create a system that facilitates parallelism

System Overview



Take a “step back”…
Consider memory module interface.
Consider inter-chip interface on module.
[Figure labels: Memory Controller; Memory Modules; Serial Lines @ fast clock]
System Overview
1 logical channel, 4 physical channels
- 3.2 GHz point-to-point connections
- Each channel called “module”
- 5 pins/module on memory controller
- Intra-module connections: parallel
- External connections: high speed serial

Module Topology
[Figure: Memory Controller connected to a Memory Module over Serial Lines @ 3.2 GHz]
- Serial lines: CLK1, DIN, CLK2, DOUT, CMD
- On the module: Translator Circuits, DIN/DOUT Buffers, 256 Mbit x8 SDRAM Parts
- Module buses: 8 bit Data-in buses, 8 bit Data-out buses, 18 bit Addr/Cmd buses
System Details I
[Figure: translator circuits]
- DIN Translator: serial DIN in; 8-bit lines out, to DIN Buffer
- DOUT Translator: 8-bit lines in, from DOUT Buffer; serial DOUT out
- COMMAND: 18 bits, to SDRAM Chips
System Details II
[Figure: DOUT Buffer and DIN Buffer; 4, 8-bit registers]
- DOUT Buffer: 8-bit inputs from SDRAM Chips 0, 1, 2, 3; 8-bit outputs to DOUT Translator
- DIN Buffer: 8-bit inputs from DIN Translator; 8-bit outputs to SDRAM Chips 0, 1, 2, 3
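The buffer and translator path described above can be sketched in Python. This is only a rough model under stated assumptions: chip 0 is taken to occupy the most significant byte and bits are shifted out MSB first, neither of which is specified on the slides, and the function names are made up for illustration.

```python
def pack_dout(chip_bytes):
    """Combine four 8-bit register values (chips 0..3) into one 32-bit word.

    Assumption: chip 0 maps to the most significant byte.
    """
    if len(chip_bytes) != 4:
        raise ValueError("expected one byte from each of 4 chips")
    word = 0
    for b in chip_bytes:
        if not 0 <= b <= 0xFF:
            raise ValueError("each register holds 8 bits")
        word = (word << 8) | b
    return word


def serialize(word, width=32):
    """Shift a word out as a list of bits, MSB first (assumed order)."""
    return [(word >> i) & 1 for i in range(width - 1, -1, -1)]


def deserialize(bits):
    """Rebuild a word from a list of bits received MSB first."""
    word = 0
    for bit in bits:
        word = (word << 1) | bit
    return word


def unpack_din(word):
    """Split a 32-bit word back into four 8-bit values for chips 0..3."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


packet = pack_dout([0x12, 0x34, 0x56, 0x78])
bits = serialize(packet)
print(len(bits), unpack_din(deserialize(bits)))
```

The round trip (pack, serialize, deserialize, unpack) mirrors a read on one side and a write on the other; a real translator would do the same shifting in hardware registers.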
System Details – Protocol I

The Command Set
CMD     OP    ADDR?   USE
NOP     000   N       No operation.
ACT     001   Y       Activate a row; uses bank and row address.
READ    010   Y       Selects bank/column, initiates read burst.
WRITE   011   Y       Select bank and column, initiate write burst.
PREC    100   *       Precharge; deactivate row in bank.
AUTOR   101   N       Auto-refresh; enter refresh mode.
XXX     110           Reserved
XXX     111           Reserved
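As an illustration of the command set, the opcodes can be written as an enumeration and packed into an 18-bit command/address packet. The field layout here is an assumption for the sketch (opcode in the top 3 bits, a 15-bit address field below it); the slides give the opcodes and the packet width but not the exact bit positions, and `encode`/`decode` are hypothetical helper names.

```python
from enum import IntEnum


class Cmd(IntEnum):
    """Opcodes from the command set; 110 and 111 are reserved."""
    NOP = 0b000
    ACT = 0b001
    READ = 0b010
    WRITE = 0b011
    PREC = 0b100
    AUTOR = 0b101


PACKET_BITS = 18
OPCODE_BITS = 3
ADDR_BITS = PACKET_BITS - OPCODE_BITS  # 15 bits left for bank/row/column


def encode(cmd, addr=0):
    """Pack an opcode and address field into an 18-bit integer.

    Assumed layout: opcode in the 3 most significant bits.
    """
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address field does not fit in 15 bits")
    return (int(cmd) << ADDR_BITS) | addr


def decode(packet):
    """Split an 18-bit packet into (Cmd, address field).

    Raises ValueError for the reserved opcodes 110 and 111.
    """
    opcode = (packet >> ADDR_BITS) & ((1 << OPCODE_BITS) - 1)
    addr = packet & ((1 << ADDR_BITS) - 1)
    return Cmd(opcode), addr


pkt = encode(Cmd.ACT, 0b010000000000101)
print(format(pkt, "018b"), decode(pkt))
```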
System Details – Protocol II

Packets
- 18 bit command/address
- 32 bit data packets

Activate this row and bank…
COMMAND: 0 0 1 0 1 1 1 1 0 1 0 0 1 0 0 1 1 1 1 1 1 0 0

Start a READ burst at this column…
COMMAND: 0 1 0 1 1 0 1 1 0 0 1 0 0
*Operating at 3.2 GHz, an 18-bit command packet takes 5.625 ns and a 32-bit data packet takes 10 ns (the same as one clock cycle of SDRAM operating at 100 MHz).
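The packet times follow directly from the packet widths and the link rate. A short check, assuming one bit is transferred per 3.2 GHz clock (the constant and function names are illustrative only):

```python
SERIAL_CLOCK_MHZ = 3200   # 3.2 GHz serial link, assumed 1 bit per clock
SDRAM_CLOCK_MHZ = 100     # conventional SDRAM clock used for comparison


def packet_time_ns(bits, clock_mhz=SERIAL_CLOCK_MHZ):
    """Time to shift `bits` over one serial line, in nanoseconds."""
    return bits * 1000 / clock_mhz


command_ns = packet_time_ns(18)        # 18-bit command/address packet
data_ns = packet_time_ns(32)           # 32-bit data packet
sdram_cycle_ns = 1000 / SDRAM_CLOCK_MHZ

print(f"command packet: {command_ns:.3f} ns")
print(f"data packet:    {data_ns:.3f} ns")
print(f"SDRAM cycle:    {sdram_cycle_ns:.3f} ns")
```

Under this assumption a data packet occupies the link for exactly one 100 MHz SDRAM cycle, and a command packet for a little over half of one.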
Cubing I





“Chip stacking”
Developed by Irvine Sensors Corp.
Currently can stack two 256 Mbit chips.
Smaller footprint/area!
Much shorter connection wires!
*Graphics from Irvine-Sensors Data Sheet
Cubing II – Serial Network



Point-to-point star topology.
Dedicated circuits for high speed serial lines.
Departure from “traditional” bus concept.
[Figure labels: 4-stack Cubes; Memory Controller; Address/Command line; DOUT line; DIN line; Clock line @ 3.2 GHz]
System Access Protocol

Consecutive access to same module
- Similar timing as SDRAM.

Bandwidth matched between parallel and serial.
DIN/DOUT buffers - no additional timing constraints.
*Graphic from Dr. Jacob and Dave Wang
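The bandwidth match between the parallel and serial sides can be checked with integer arithmetic. The figures come from the slides (four x8 chips per module, 100 MHz SDRAM timing, 3.2 GHz serial lines, four modules); the assumption is that each serial line carries one bit per clock, and the aggregate number is a peak that only holds if all modules are busy at once.

```python
CHIPS_PER_MODULE = 4      # four x8 SDRAM parts feed the buffers
BITS_PER_CHIP = 8
SDRAM_CLOCK_MHZ = 100
SERIAL_CLOCK_MHZ = 3200   # assumed 1 bit per clock on each serial line
MODULES = 4               # 4 physical channels in the system overview


def parallel_mbps():
    """Peak rate on the intra-module parallel side, in Mbit/s."""
    return CHIPS_PER_MODULE * BITS_PER_CHIP * SDRAM_CLOCK_MHZ


def serial_mbps():
    """Peak rate on one DIN or DOUT serial line, in Mbit/s."""
    return SERIAL_CLOCK_MHZ


print("parallel side:", parallel_mbps(), "Mbit/s")
print("serial line:  ", serial_mbps(), "Mbit/s")
print("peak aggregate over all modules:", MODULES * serial_mbps(), "Mbit/s")
```

Because the two rates are equal, the DIN/DOUT buffers only need to hold one 32-bit packet and add no extra timing constraint.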
System Access Protocol

Independent, simultaneous access to separate modules.
Conventional SDRAM:
- No inter-module timing issues.
*Graphic from Dr. Jacob and Dave Wang
Serial Network Advantages I

Path length matching
- No more heroic routing!
- Star topology is symmetric.

No clock mismatch issues…
- Everyone is on time!
*Graphic from Dr. Jacob and Dave Wang
Serial Network Advantages IIa

No need for bus termination.
- Point-to-point communication, terminated in module.
*Graphic from Dr. Jacob and Dave Wang
Serial Network Advantages IIb

Serial/P2P vs. RAMBUS multi-drop.
- Faster signaling!
- No ringing!
- Clean timing.
- Serial wins…

RAMBUSted!
*Graphic from Dr. Jacob and Dave Wang
System Simulation

SimpleScalar
- Single CPU, Single Thread
- SNSDRAM (32 Meg x 8)
- 1 rank in every memory module
- Channel width: 32 bits
- One extra cycle of Transaction Queue Delay to model the parallel to serial conversion
Simulation Run I - Parallel Bus
Channel            1        1        1        1
Rank Per Channel   1        2        4        8
Sim_Cycles         884521   881421   880361   880361
Simulation Run I - Serial Network
Channel            1        2        4        8
Rank Per Channel   1        1        1        1
Sim_Cycles         885291   805721   766711   766711
Simulation I Cycles Chart
[Chart: test-printf. Total cycles in thousands (axis 700 to 900) for Serial Link and Parallel Bus, plotted against number of channels (Serial Link) or number of ranks (Parallel Bus): 1, 2, 4, 8.]
Simulation Run II – Parallel Bus
Channel            1          2          4          8
Rank Per Channel   1          1          1          1
Sim_Cycles         13206613   13169500   13144737   13144737
Simulation Run II – Serial Network
Channel            1          2          4          8
Rank Per Channel   1          1          1          1
Sim_Cycles         13264603   12633349   12510912   12510912
Simulation II Cycles Chart
[Chart: test-printf. Total cycles in thousands (axis 12000 to 13400) for Serial Link and Parallel Bus, plotted against number of channels (Serial Link) or number of ranks (Parallel Bus): 1, 2, 4, 8.]
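To compare the two simulation runs, the cycle counts from the tables can be turned into a percentage reduction relative to the first configuration in each row. The data below is copied from the tables; the dictionary layout and the `reduction_pct` helper are just one convenient way to organize it.

```python
# Sim_Cycles for the 1/2/4/8 configurations, taken from the result tables.
RUNS = {
    "Run I": {
        "parallel": [884521, 881421, 880361, 880361],
        "serial": [885291, 805721, 766711, 766711],
    },
    "Run II": {
        "parallel": [13206613, 13169500, 13144737, 13144737],
        "serial": [13264603, 12633349, 12510912, 12510912],
    },
}


def reduction_pct(baseline, value):
    """Percentage drop in cycle count relative to the baseline."""
    return (baseline - value) / baseline * 100


for run, systems in RUNS.items():
    for system, cycles in systems.items():
        best = reduction_pct(cycles[0], cycles[-1])
        print(f"{run:6s} {system:8s} reduction at 8: {best:.2f}%")
```

In both runs the serial network's cycle count drops by several percent as channels are added, while the parallel bus improves by less than one percent, and neither changes between the 4 and 8 configurations.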
Memory Mapping

Basic SDRAM
Rank | Row ID | Bank | Hi Col ID | Channel ID | Lo Col ID | Col Size

High Performance SDRAM
Row ID | Rank | Bank | Hi Col ID | Channel ID | Lo Col ID | Col Size
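The two mappings differ only in the order of the upper fields, which a small address splitter makes concrete. The field widths below are hypothetical (the slides give the field order, listed here from most to least significant, but not the bit counts), and `split_address` is an illustrative helper.

```python
# Field order from most significant to least significant.
BASIC = ["rank", "row", "bank", "hi_col", "channel", "lo_col", "col_size"]
HIGH_PERF = ["row", "rank", "bank", "hi_col", "channel", "lo_col", "col_size"]

# Hypothetical widths in bits, chosen only to make the example concrete.
WIDTHS = {
    "rank": 1,
    "row": 13,
    "bank": 2,
    "hi_col": 7,
    "channel": 2,
    "lo_col": 3,
    "col_size": 2,
}


def split_address(addr, order, widths=WIDTHS):
    """Slice a physical address into named fields for a given mapping."""
    fields = {}
    for name in reversed(order):          # peel fields off the low end
        width = widths[name]
        fields[name] = addr & ((1 << width) - 1)
        addr >>= width
    return fields


# Bit 5 sits just above lo_col + col_size, so it selects the channel.
print(split_address(1 << 5, BASIC)["channel"])
# Bit 16 lands in the row field for one mapping and the rank field for the other.
print(split_address(1 << 16, BASIC))
print(split_address(1 << 16, HIGH_PERF))
```

With these assumed widths, consecutive 32-byte blocks rotate across channels in both mappings, which is what lets independent modules serve neighboring accesses in parallel.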
Analysis





Cache line = 64 byte
channel width
Read after Read
Multi-CPU
Single CPU Multi-Thread
Summary I

Recall…
- SDRAM has complex interface, simple chips.
- RDRAM has a simple interface, but very complex chips.

SNSDRAM…
- Blends these seemingly split philosophies!
Summary II

Advantages
- Smaller pin count on memory controller.
- Independent memory modules facilitate parallelism.
- Simulated performance improvement over similar SDRAM configurations.
- Smaller system footprint with cubing technology.
- Theoretically scalable.