ECE2030 Introduction to Computer Engineering Lecture 17: Memory

ECE2030
Introduction to Computer Engineering
Lecture 17: Memory and Programmable Logic
Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Tech
Memory
• Random Access Memory (RAM)
– In contrast to Serial Access Memory (e.g. tape)
– Static Random Access Memory (SRAM)
• Data stored so long as Vdd is applied
• 6 transistors per cell
• Faster
• Differential
– Dynamic Random Access Memory (DRAM)
• Requires periodic refresh
• Smaller (can be implemented with 1 or 3 transistors)
• Slower
• Single-Ended
– Can be read and written
– Typically, addressable at byte granularity
• Read-Only Memory (ROM)
2
Block Diagram of Memory
[Figure: Memory Unit holding 2^K words, N bits per word. Inputs: K-bit address lines, N-bit Data Input (for Write), Read/Write, Chip Enable. Output: N-bit Data Output (for Read).]
• Example: 2MB memory, byte-addressable
– N = 8 (because of byte-addressability)
– K = 21 (1 word = 8-bit)
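The K = 21 figure follows from 2 MB = 2^21 one-byte words. A minimal sketch of that arithmetic, where the helper name `address_lines` is made up for illustration:

```python
def address_lines(num_words: int) -> int:
    """Return K such that 2**K == num_words (num_words must be a power of two)."""
    if num_words <= 0 or num_words & (num_words - 1):
        raise ValueError("number of words must be a positive power of two")
    return num_words.bit_length() - 1


MB = 2**20
n = 8                        # byte-addressable: one word is 8 bits
k = address_lines(2 * MB)    # 2 MB of one-byte words
print(f"N = {n}, K = {k}")   # N = 8, K = 21
```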
3
Static Random Access Memory (SRAM)
[Figure: 6T SRAM cell connected to the Wordline (WL), Bitline, and Bitline_bar]
• Typically each bit is implemented with 6 transistors (6T SRAM Cell)
• During a read, the bitline and its inverse are precharged to Vdd (1) before WL is set to 1
• During a write, the value is placed on Bitline and its inverse on Bitline_bar before WL is set to 1
4
Dynamic Random Access Memory (DRAM)
[Figure: DRAM cell connected to the Wordline (WL) and Bitline]
• 1-transistor DRAM cell
• During a write, put the value on the bitline and then set WL=1
• During a read, precharge the bitline to Vdd (1) before asserting WL to 1
• Storage decays, and thus requires periodic refreshing (read-sense-write)
5
Memory Description
• Capacity of a memory is described as
– # addresses x Word size
– Examples:
Memory      # of addr    # of data lines   # of addr lines   # of total bytes
1M x 8      1,048,576    8                 20                1 MB
2M x 4      2,097,152    4                 21                1 MB
1K x 4      1,024        4                 10                512 B
4M x 32     4,194,304    32                22                16 MB
16K x 64    16,384       64                14                128 KB
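Each row of the table can be recomputed from the "# addresses x word size" description alone. The sketch below uses a made-up helper, `describe`, and assumes the number of addresses is a power of two:

```python
def describe(num_addrs: int, word_bits: int) -> tuple[int, int]:
    """Return (address lines, total bytes) for a num_addrs x word_bits memory."""
    addr_lines = num_addrs.bit_length() - 1   # log2 of a power of two
    total_bytes = num_addrs * word_bits // 8
    return addr_lines, total_bytes


K, M = 2**10, 2**20
for name, addrs, bits in [("1M x 8", M, 8), ("2M x 4", 2 * M, 4),
                          ("1K x 4", K, 4), ("4M x 32", 4 * M, 32),
                          ("16K x 64", 16 * K, 64)]:
    lines, size = describe(addrs, bits)
    print(f"{name:9} addr lines = {lines:2}  data lines = {bits:2}  bytes = {size}")
```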
6
How to Address Memory
[Figure: 4x8 Memory built from 32 1-bit cells (4 rows of 8). A 2-to-4 Decoder with inputs A1, A0 and Chip Select (CS) drives row lines 0 to 3; the selected row appears on data lines D7 to D0.]
7
How to Address Memory
[Figure: the same 4x8 Memory with A1=0, A0=1, and Chip Select=1. The 2-to-4 Decoder asserts row line 1, so the 8 cells of that row drive D7 to D0.]
Access address = 0x1
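The row selection in this example can be modeled as a one-hot decoder gated by Chip Select. This is a behavioral sketch only; the function name `decoder_2to4` is an assumption for illustration:

```python
def decoder_2to4(a1: int, a0: int, cs: int) -> list[int]:
    """Return the four row-line values; at most one line is 1."""
    if not cs:
        return [0, 0, 0, 0]          # chip not selected: no row is enabled
    index = (a1 << 1) | a0           # A1 is the more significant address bit
    return [1 if row == index else 0 for row in range(4)]


row_lines = decoder_2to4(a1=0, a0=1, cs=1)
print(row_lines)  # [0, 1, 0, 0]: row 1 is selected, i.e. address 0x1
```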
8
Use 2 Decoders
[Figure: 8x4 Memory built from 32 1-bit cells arranged as 4 rows of 8. A 2-to-4 Row Decoder (inputs A2, A1, enabled by CS) selects a row; a 1-to-2 Column Decoder (input A0, enabled by CS) selects which 4 bits of that row drive D3 to D0 through Tristate Buffers (read).]
9
Tristate Buffer
[Figure: tristate buffer symbol (Input, Output, En) and its CMOS circuit (Vdd, Input, En, En_bar, Output)]
• Similar to a Transmission Gate (TG)
• Can amplify the signal (in contrast to a TG)
• Typically used where signals travel over shared wires, e.g. a bus
10
Bi-directional Bus using Tri-state Buffer
[Figure: tristate buffers between A and B (Input/Output); the Direction signal controls data flow for read/write]
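A behavioral sketch of a tri-state bus may help here. It assumes two buffers, one driving from A and one from B, enabled by opposite values of Direction. The names `HI_Z`, `tristate`, `bus`, and `bidirectional` are made up for illustration, and the polarity of Direction is an assumption:

```python
HI_Z = None  # models the high-impedance (disconnected) output state


def tristate(value, enable):
    """Pass the input through when enabled, otherwise float the output."""
    return value if enable else HI_Z


def bus(*drivers):
    """Resolve a shared wire: at most one driver may be active at a time."""
    active = [d for d in drivers if d is not HI_Z]
    if len(active) > 1:
        raise RuntimeError("bus contention: more than one active driver")
    return active[0] if active else HI_Z


def bidirectional(a, b, direction):
    """Assumed polarity: direction=1 lets A drive the wire, direction=0 lets B."""
    return bus(tristate(a, direction == 1), tristate(b, direction == 0))


print(bidirectional(a=1, b=0, direction=1))  # 1 (A drives)
print(bidirectional(a=1, b=0, direction=0))  # 0 (B drives)
```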
11
Read/Write Memory
[Figure: 8x4 read/write Memory built from 32 1-bit cells. A 2-to-4 Row Decoder (inputs A2, A1, enabled by CS) and a 1-to-2 Column Decoder (input A0, enabled by CS) select one 4-bit word on D3 to D0. Shown with Rd/Wr = 0 and Chip Select = 0.]
12
Read/Write Memory
[Figure: the same 8x4 read/write Memory with a 2-to-4 Row Decoder (inputs A2, A1, enabled by CS) and a 1-to-2 Column Decoder (input A0, enabled by CS). Shown with Rd/Wr = 1 and Chip Select = 1.]
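The row/column organization can be sketched behaviorally as follows. The class `Memory8x4` is hypothetical, and the polarity of Rd/Wr (1 = read, 0 = write) is an assumption, since the slides only show the signal values:

```python
class Memory8x4:
    """Behavioral sketch of an 8x4 memory stored as 4 rows of two 4-bit words."""

    def __init__(self):
        self.rows = [[0, 0] for _ in range(4)]   # 32 one-bit cells in total

    def access(self, addr, cs, rd_wr, data_in=0):
        if not cs:
            return None                  # not selected: outputs are high-impedance
        row = (addr >> 1) & 0b11         # A2, A1 feed the 2-to-4 row decoder
        col = addr & 0b1                 # A0 feeds the 1-to-2 column decoder
        if rd_wr:                        # assumed polarity: 1 = read
            return self.rows[row][col]
        self.rows[row][col] = data_in & 0xF   # write the 4-bit word D3..D0
        return None


mem = Memory8x4()
mem.access(addr=5, cs=1, rd_wr=0, data_in=0b1010)   # write
print(mem.access(addr=5, cs=1, rd_wr=1))            # 10
print(mem.access(addr=5, cs=0, rd_wr=1))            # None (chip not selected)
```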
13
Building Memory in Hierarchy
• Design a 1Mx8 using 1Mx4 memory chips
[Figure: two 1Mx4 chips share the address lines A19 to A0, CS, and R/W. One chip supplies D7 to D4 and the other supplies D3 to D0, giving an 8-bit word.]
14
Building Memory in Hierarchy
• Design a 2Mx4 using 1Mx4 memory chips
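Note that the 1-to-2 decoder is the wire itself (or use an inverter)
[Figure: two 1Mx4 chips share A19 to A0, R/W, and the data lines D3 to D0. A20 feeds a 1-to-2 Decoder (enabled by CS) whose outputs 0 and 1 drive the CS inputs of the two chips, so one chip is selected at a time.]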
15
Building Memory in Hierarchy
• Design a 2Mx8 using 1Mx4 memory chips
[Figure: four 1Mx4 chips arranged as two banks of two. All chips share A19 to A0 and R/W. A20 feeds a 1-to-2 Decoder (enabled by CS) that selects one bank; within a bank, one chip supplies D7 to D4 and the other supplies D3 to D0.]
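The 2Mx8 composition combines both techniques: widening the word with parallel chips and deepening the address space with a decoder on A20. The sketch below is a behavioral model under those assumptions; the `Chip` and `Memory2Mx8` classes are hypothetical, and sparse dictionaries stand in for the actual storage:

```python
class Chip:
    """Hypothetical model of one 1Mx4 chip (sparse storage, cells default to 0)."""
    WORDS, BITS = 2**20, 4

    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value & ((1 << self.BITS) - 1)

    def read(self, addr):
        return self.cells.get(addr, 0)


class Memory2Mx8:
    """Four 1Mx4 chips: A20 picks a bank, two chips per bank hold one byte."""

    def __init__(self):
        # banks[b][0] holds D7..D4 and banks[b][1] holds D3..D0
        self.banks = [[Chip(), Chip()] for _ in range(2)]

    def _select(self, addr):
        bank = (addr >> 20) & 1          # 1-to-2 decoder on A20
        offset = addr & (2**20 - 1)      # A19..A0 go to every chip
        return self.banks[bank], offset

    def write(self, addr, byte):
        (high, low), offset = self._select(addr)
        high.write(offset, byte >> 4)
        low.write(offset, byte & 0xF)

    def read(self, addr):
        (high, low), offset = self._select(addr)
        return (high.read(offset) << 4) | low.read(offset)


mem = Memory2Mx8()
mem.write(0x1ABCDE, 0xC5)    # A20 = 1: lands in bank 1
mem.write(0x0ABCDE, 0x3A)    # same A19..A0, but A20 = 0: bank 0
print(hex(mem.read(0x1ABCDE)), hex(mem.read(0x0ABCDE)))  # 0xc5 0x3a
```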
16
Memory Model
• A 32-bit address space can address up to 4G (2^32) different memory locations (4 GB when each location holds one byte)
Flat Memory Model
Address       Contents
0x00000000    0x0A      (Lower Memory Address)
0x00000001    0xB6
0x00000002    0x41
0x00000003    0xFC
...
0xFFFFFFFF    0x0D      (Higher Memory Address)
17
Endianness [Danny Cohen 81]
• Byte ordering: how a multi-byte data word is stored in memory
• Endianness (from Gulliver’s Travels)
– Big Endian
• Most significant byte of a multi-byte word is stored at the lowest
memory address
• e.g. Sun SPARC, PowerPC
– Little Endian
• Least significant byte of a multi-byte word is stored at the lowest
memory address
• e.g. Intel x86
• Some embedded and DSP processors support both for interoperability
18
Endianness Examples
• Store 0x87654321 at address 0x0000, byte-addressable
BIG ENDIAN
Address   Contents
0x0000    0x87      (Lower Memory Address)
0x0001    0x65
0x0002    0x43
0x0003    0x21      (Higher Memory Address)
LITTLE ENDIAN
Address   Contents
0x0000    0x21      (Lower Memory Address)
0x0001    0x43
0x0002    0x65
0x0003    0x87      (Higher Memory Address)
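The two byte orders for 0x87654321 can be reproduced with Python's built-in `int.to_bytes`. This is only an illustration of the byte ordering, not a model of any particular processor:

```python
value = 0x87654321

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

for offset in range(4):
    print(f"0x{offset:04X}   big: 0x{big[offset]:02X}   little: 0x{little[offset]:02X}")

# Reading the bytes back with the matching order recovers the word.
assert int.from_bytes(big, "big") == int.from_bytes(little, "little") == value
```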
19
Memory Allocation (Little Endian)
declare:
.data
.globl declare
.align 0
.word 511
.byte 14
.align 2
.byte 14
.word 0x0B1E8143
.align 2
.ascii "GAece"
.half 10
.word 0x2B1E8145
.space 1
.byte 52
.align 1
.byte 16
.space 2
.byte 67
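Addr  Byte      Addr  Byte      Addr  Byte
0     0xFF      e     ------    1c    0x34
1     0x01      f     ------    1d    ------
2     0x00      10    0x47      1e    0x10
3     0x00      11    0x41      1f
4     0x0E      12    0x65      20
5     ------    13    0x63      21    0x43
6     ------    14    0x65
7     ------    15    0x0A
8     0x0E      16    0x00
9     0x43      17    0x45
a     0x81      18    0x81
b     0x1E      19    0x1E
c     0x0B      1a    0x2B
d     ------    1b
(Addresses are in hexadecimal. "------" marks bytes skipped by .align; blank entries are bytes reserved by .space.)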
.align N: Align the next datum on a 2^N-byte boundary
.align 0: turn off automatic alignment for .half, .word, .float, and .double until the next .data directive
.word: 4 bytes
.half: 2 bytes
.byte: 1 byte
.space N: reserve N bytes of space
.ascii: ASCII code (American Standard Code for Information Interchange)
20
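The memory map for this slide can be rechecked by replaying the directives in order. The sketch below is a simplified model of the directive rules listed on the slide, not a real assembler; the function `layout` and the tuple encoding of the program are assumptions for illustration:

```python
def layout(directives):
    """Return {address: byte} for (directive, operand) pairs, little endian."""
    memory, addr, auto_align = {}, 0, True
    sizes = {".byte": 1, ".half": 2, ".word": 4}

    def align(power):
        nonlocal addr
        step = 1 << power                       # 2**power-byte boundary
        addr = (addr + step - 1) // step * step

    for name, operand in directives:
        if name == ".align":
            if operand == 0:
                auto_align = False              # .align 0 turns automatic alignment off
            else:
                align(operand)
        elif name == ".space":
            addr += operand                     # reserve bytes, leave them unset
        elif name == ".ascii":
            for code in operand.encode("ascii"):
                memory[addr] = code
                addr += 1
        else:
            size = sizes[name]
            if auto_align:
                align(size.bit_length() - 1)    # natural alignment of the datum
            for byte in operand.to_bytes(size, "little"):
                memory[addr] = byte
                addr += 1
    return memory


program = [(".align", 0), (".word", 511), (".byte", 14), (".align", 2),
           (".byte", 14), (".word", 0x0B1E8143), (".align", 2),
           (".ascii", "GAece"), (".half", 10), (".word", 0x2B1E8145),
           (".space", 1), (".byte", 52), (".align", 1), (".byte", 16),
           (".space", 2), (".byte", 67)]

image = layout(program)
for address in sorted(image):
    print(f"{address:2x}: 0x{image[address]:02X}")
```

Addresses that never appear in the output are the ones skipped by .align or reserved by .space.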
Read Only Memory (ROM)
• “Permanent” binary information is stored
• Non-volatile memory
– Power off does not erase information stored
[Figure: ROM holding 2^K words, N bits per word. Input: K-bit address lines. Output: N-bit Data Output.]
21
32x8 ROM
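[Figure: 32x8 ROM. A 5-to-32 Decoder with inputs A4 to A0 drives word lines 0 to 31. Each input line drawn on the eight output gates (D7 to D0) represents 32 wires, one per word line. Each programmable connection is a fuse, which can be implemented as a diode or a pass transistor.]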
22
Programming the 32x8 ROM
A4 A3 A2 A1 A0 | D7 D6 D5 D4 D3 D2 D1 D0
0  0  0  0  0  | 1  1  0  0  0  1  0  1
0  0  0  0  1  | 1  0  0  0  1  0  1  1
0  0  0  1  0  | 1  0  1  1  0  0  0  0
…  …  …  …  …  | …  …  …  …  …  …  …  …
1  1  1  0  1  | 0  0  0  1  0  0  0  0
1  1  1  1  0  | 0  1  0  1  0  1  1  0
1  1  1  1  1  | 1  1  1  0  0  0  0  1
[Figure: 5-to-32 Decoder (inputs A4 to A0) with word lines 0 to 31; the connections to D7 D6 D5 D4 D3 D2 D1 D0 are programmed according to the table.]
23
Example: Lookup Table
• Design a square lookup table for F(X) = X^2 using ROM
X    F(X)=X^2    X (binary)    F(X)=X^2 (binary)
0    0           000           000000
1    1           001           000001
2    4           010           000100
3    9           011           001001
4    16          100           010000
5    25          101           011001
6    36          110           100100
7    49          111           110001
24
Square Lookup Table using ROM
[Figure: 3-to-8 Decoder (inputs X2, X1, X0) whose outputs 0 to 7 select the ROM word that drives F5 to F0]
X      F(X)=X^2
000    000000
001    000001
010    000100
011    001001
100    010000
101    011001
110    100100
111    110001
25
Square Lookup Table using ROM
[Figure: 3-to-8 Decoder (inputs X2, X1, X0) whose outputs 0 to 7 select the ROM word that drives F5 to F0]
X      F(X)=X^2
000    000000
001    000001
010    000100
011    001001
100    010000
101    011001
110    100100
111    110001
F1: Not Used (it is 0 for every X)
F0 = X0
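The two annotations (one output column is not used, and F0 = X0) can be checked directly against the table contents. A minimal sketch, where the list `ROM` and the function `lookup` are illustrative names:

```python
ROM = [x * x for x in range(8)]   # the 8 words of 6 bits programmed into the ROM


def lookup(x2: int, x1: int, x0: int) -> int:
    """Model the 3-to-8 decoder: the address bits select exactly one ROM word."""
    address = (x2 << 2) | (x1 << 1) | x0
    return ROM[address]


for x in range(8):
    word = ROM[x]
    assert (word >> 1) & 1 == 0    # F1 is 0 for every X, so that column is unused
    assert word & 1 == x & 1       # F0 always equals X0
    print(f"{x:03b} -> {word:06b}")
```

Since F1 is constant and F0 is just a copy of X0, only F5 to F2 actually need to be stored in the ROM.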
26
Square Lookup Table using ROM
[Figure: 3-to-8 Decoder (inputs X2, X1, X0) whose outputs 0 to 7 select the ROM word that drives F5 to F0]
X      F(X)=X^2
000    000000
001    000001
010    000100
011    001001
100    010000
101    011001
110    100100
111    110001
27
Classifying Three Basic PLDs
(Programmable) Read-Only Memory (ROM):
INPUT -> Fixed AND plane (decoder) -> Programmable Connections -> Programmable OR plane -> OUTPUT
Programmable Logic Array (PLA):
INPUT -> Programmable AND plane -> Programmable Connections -> Programmable OR plane -> OUTPUT
Programmable Array Logic (PAL) Devices:
INPUT -> Programmable AND plane -> Fixed OR plane -> F/F -> OUTPUT
PAL is a trademark of AMD; use PAL as an adjective or expect to receive a letter from AMD’s lawyers
28
Programmable Logic Array (PLA)
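[Figure: PLA. Inputs A, B, C are available in true and complemented form to a Programmable AND Plane; its product terms feed a Programmable OR Plane that produces the outputs (F2 is labeled).]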
29
Example using PLA
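$F_1(A, B, C) = \sum m(0,1,2,4)$
$F_2(A, B, C) = \sum m(0,5,6,7)$
$F_1 = \overline{A}\,\overline{B} + \overline{A}\,\overline{C} + \overline{B}\,\overline{C}$
$\overline{F_1} = AB + AC + BC$
$F_2 = AB + AC + \overline{A}\,\overline{B}\,\overline{C}$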
30
Example using PLA
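$F_1 = \overline{AB + AC + BC}$
$F_2 = AB + AC + \overline{A}\,\overline{B}\,\overline{C}$
[Figure: PLA with inputs A, B, C in true and complemented form. The AND plane generates the product terms $AB$, $AC$, $BC$, and $\overline{A}\,\overline{B}\,\overline{C}$; the OR plane combines them into F1 and F2.]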
31
PAL Device
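[Figure: PAL device. Inputs A and B and the fed-back signals IO1 and IO2 are available in true and complemented form to a Programmable AND Plane; its product terms feed a Fixed OR Plane that produces IO1 and IO2.]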
32
PAL Device Design Example
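[Figure: PAL device with inputs A, B, C, D and the fed-back IO1, each in true and complemented form, producing IO1 and IO2; unused product terms are marked "Not programmed".]
IO1 = ABC + ABCD
IO2 = ABC + ABCD + ACD + ABCD
[The overbars marking complemented literals in these two equations are missing.]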
33
CPLD and FPGA [Brown&Rose 96]
• Complex Programmable Logic Device (CPLD)
– Multiple PLDs (e.g. PALs, PLAs) with programmable
interconnection structure
– Pioneered by Altera
• Field-Programmable Gate Array (FPGA)
– High logic capacity with large distributed interconnection
structure
• Logic capacity: measured as the number of 2-input NAND gates
– Offers narrower logic resources
• A CPLD offers logic resources with a wide number of inputs (AND planes)
– Offers a higher ratio of flip-flops to logic resources than a CPLD
• HCPLD (High Capacity PLD) is often used to refer to
both CPLD and FPGA
34
CPLD structure
[Figure: CPLD made of eight PLD logic blocks joined by Interconnects, with I/O blocks at the edges]
35
FPGA Structure
[Figure: FPGA made of Logic blocks, I/O blocks, and Interconnects]
36
FPGA Programmability
• Floating gate transistor
– Used in EPROM and EEPROM
• SRAM-controlled switch, which controls:
– Pass transistors
– Multiplexers (to determine how to route inputs)
• Antifuse
– Similar to a fuse, but programmed in the opposite direction
– Originally an open circuit (becomes a connection when programmed)
– One-Time Programmable (OTP)
37