ECE2030 Introduction to Computer Engineering
Lecture 17: Memory and Programmable Logic
Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering, Georgia Tech

Memory
• Random Access Memory (RAM)
  – In contrast to serial access memory (e.g. tape)
  – Static Random Access Memory (SRAM)
    • Data is retained as long as Vdd is applied
    • 6 transistors per cell
    • Faster
    • Differential
  – Dynamic Random Access Memory (DRAM)
    • Requires periodic refresh
    • Smaller (can be implemented with 1 or 3 transistors per cell)
    • Slower
    • Single-ended
  – Can be read and written
  – Typically addressable at byte granularity
• Read-Only Memory (ROM)

Block Diagram of Memory
[Figure: memory unit holding 2^K words of N bits each. Inputs: K address lines, N-bit data input (for write), Read/Write, Chip Enable. Output: N-bit data output (for read).]
• Example: 2 MB memory, byte-addressable
  – N = 8 (because of byte-addressability)
  – K = 21 (1 word = 8 bits, so 2 MB = 2^21 words)

Static Random Access Memory (SRAM)
[Figure: SRAM cell connected to a wordline (WL) and a complementary pair of bitlines (Bitline, Bitline_bar)]
• Typically each bit is implemented with 6 transistors (6T SRAM cell)
• During a read, the bitline and its inverse are precharged to Vdd (1) before setting WL = 1
• During a write, the value is put on Bitline and its inverse on Bitline_bar before setting WL = 1

Dynamic Random Access Memory (DRAM)
[Figure: DRAM cell connected to a wordline (WL) and a single bitline]
• 1-transistor DRAM cell
• During a write, put the value on the bitline and then set WL = 1
• During a read, precharge the bitline to Vdd (1) before asserting WL to 1
• Storage decays, and thus requires periodic refreshing (read-sense-write)

Memory Description
• Capacity of a memory is described as
  – # addresses x word size
  – Examples:

  Memory     # of addr    # of data lines   # of addr lines   # of total bytes
  1M x 8     1,048,576    8                 20                1 MB
  2M x 4     2,097,152    4                 21                1 MB
  1K x 4     1,024        4                 10                512 B
  4M x 32    4,194,304    32                22                16 MB
  16K x 64   16,384       64                14                128 KB

How to Address Memory
[Figure: 4x8 memory built from rows of 1-bit cells; a 2-to-4 decoder with address input A0 drives the row selects, starting at output 0]
[Figure, continued: decoder outputs 1 to 3, address input A1, Chip Select (CS), and data lines D7 to D0]

How to Address Memory
[Figure: the same 4x8 memory with A1 = 0, A0 = 1, and Chip Select = 1; decoder output 1 is asserted, so the row at address 0x1 is connected to D7 to D0]
• Access address = 0x1

Use 2 Decoders
[Figure: 8x4 memory organized as 4 rows of eight 1-bit cells. A 2-to-4 row decoder (inputs A2 and A1, enabled by CS) selects a row; a 1-to-2 column decoder (input A0, enabled by CS) selects which four cells of that row drive D0 to D3 through tristate buffers (read).]

Tristate Buffer
[Figure: tristate buffer symbol with Input, Output, and enable (En), and its CMOS circuit]
• Similar to a transmission gate (TG)
• Can amplify the signal (in contrast to a TG)
• Typically used where several sources share a signal path, e.g. a bus

Bi-directional Bus using Tri-state Buffer
[Figure: tristate buffers between A and B (Input/Output); a Direction signal controls the data flow for read/write]

Read/Write Memory
[Figure: the 8x4 memory with 2-to-4 row decoder (A2, A1), 1-to-2 column decoder (A0), and data lines D0 to D3, shown with Rd/Wr = 0 and Chip Select = 0]

Read/Write Memory
[Figure: the same 8x4 memory, shown with Rd/Wr = 1 and Chip Select = 1]

Building Memory in Hierarchy
• Design a 1Mx8 using 1Mx4 memory chips
[Figure: two 1Mx4 chips share address lines A19 to A0, CS, and R/W; one chip supplies D7 to D4 and the other supplies D3 to D0]

Building Memory in Hierarchy
• Design a 2Mx4 using 1Mx4 memory chips
• Note that the 1-to-2 decoder is the wire
itself (or use an inverter)
[Figure: two 1Mx4 chips share address lines A19 to A0, R/W, and data lines D3 to D0; A20 feeds the 1-to-2 decoder, whose outputs 0 and 1 drive the CS inputs of the two chips]

Building Memory in Hierarchy
• Design a 2Mx8 using 1Mx4 memory chips
[Figure: four 1Mx4 chips share address lines A19 to A0 and R/W; A20 feeds a 1-to-2 decoder whose outputs 0 and 1 each enable (CS) one pair of chips; within a pair, one chip supplies D7 to D4 and the other supplies D3 to D0]

Memory Model
• A 32-bit address space can address up to 2^32 different memory locations (4 GB when byte-addressable)

  Address       Content
  0x00000000    0x0A      (lower memory address)
  0x00000001    0xB6
  0x00000002    0x41
  0x00000003    0xFC
  ...
  0xFFFFFFFF    0x0D      (higher memory address)

  Flat Memory Model

Endianness [Danny Cohen 91]
• Byte ordering: how a multi-byte data word is stored in memory
• Endianness (from Gulliver's Travels)
  – Big Endian
    • Most significant byte of a multi-byte word is stored at the lowest memory address
    • e.g. Sun Sparc, PowerPC
  – Little Endian
    • Least significant byte of a multi-byte word is stored at the lowest memory address
    • e.g. Intel x86
• Some embedded and DSP processors support both for interoperability

Endianness Examples
• Store 0x87654321 at address 0x0000, byte-addressable

  Address   BIG ENDIAN   LITTLE ENDIAN
  0x0000    0x87         0x21            (lower memory address)
  0x0001    0x65         0x43
  0x0002    0x43         0x65
  0x0003    0x21         0x87            (higher memory address)

Memory Allocation (Little Endian)

  declare:
  .data
  .globl declare
  .align 0
  .word 511
  .byte 14
  .align 2
  .byte 14
  .word 0x0B1E8143
  .align 2
  .ascii "GAece"
  .half 10
  .word 0x2B1E8145
  .space 1
  .byte 52
  .align 1
  .byte 16
  .space 2
  .byte 67

Resulting memory contents (addresses in hex; ------ marks alignment padding, (space) marks bytes reserved by .space):

  Addr  Content    Addr  Content    Addr  Content
  0     0xFF       e     ------     1c    0x34
  1     0x01       f     ------     1d    ------
  2     0x00       10    0x47       1e    0x10
  3     0x00       11    0x41       1f    (space)
  4     0x0E       12    0x65       20    (space)
  5     ------     13    0x63       21    0x43
  6     ------     14    0x65
  7     ------     15    0x0A
  8     0x0E       16    0x00
  9     0x43       17    0x45
  a     0x81       18    0x81
  b     0x1E       19    0x1E
  c     0x0B       1a    0x2B
  d     ------     1b    (space)

.align N: align the next datum on a 2^N-byte boundary
.align 0: turn off automatic alignment for .half, .word, .float, and .double until the next .data directive
Data sizes: .word: 4 bytes; .half:
2 bytes; .byte: 1 byte; .space N: reserves N bytes; .ascii: ASCII code (American Standard Code for Information Interchange)

Read Only Memory (ROM)
• "Permanent" binary information is stored
• Non-volatile memory
  – Powering off does not erase the stored information
[Figure: ROM holding 2^K words of N bits each. Input: K address lines. Output: N-bit data output.]

32x8 ROM
[Figure: address bits A4 to A0 feed a 5-to-32 decoder with outputs 0 to 31; the decoder outputs connect through fuses to the eight output lines D7 to D0. Each line drawn into an output represents 32 wires.]
• A fuse can be implemented as a diode or a pass transistor

Programming the 32x8 ROM

  A4 A3 A2 A1 A0    D7 D6 D5 D4 D3 D2 D1 D0
  0  0  0  0  0     1  1  0  0  0  1  0  1
  0  0  0  0  1     1  0  0  0  1  0  1  1
  0  0  0  1  0     1  0  1  1  0  0  0  0
  …  …  …  …  …     …  …  …  …  …  …  …  …
  1  1  1  0  1     0  0  0  1  0  0  0  0
  1  1  1  1  0     0  1  0  1  0  1  1  0
  1  1  1  1  1     1  1  1  0  0  0  0  1

[Figure: 5-to-32 decoder (inputs A4 to A0, outputs 0 to 31) with the fuses to D7 to D0 programmed according to the table]

Example: Lookup Table
• Design a square lookup table for F(X) = X^2 using ROM

  X   F(X) = X^2    X (binary)   F(X) (binary)
  0   0             000          000000
  1   1             001          000001
  2   4             010          000100
  3   9             011          001001
  4   16            100          010000
  5   25            101          011001
  6   36            110          100100
  7   49            111          110001

Square Lookup Table using ROM
[Figure: X2, X1, X0 feed a 3-to-8 decoder with outputs 0 to 7; the decoder outputs are connected to the output lines F5 to F0 according to the table above]
• F1 is not used (it is 0 for every X)
• F0 = X0

Classifying Three Basic PLDs

  Device                             AND plane                   OR plane
  (Programmable) Read-Only Memory    Fixed (decoder)             Programmable
  Programmable Logic Array (PLA)     Programmable                Programmable
  Programmable Array Logic (PAL)     Programmable                Fixed, followed by F/F

In each device, the INPUT drives the AND plane, programmable connections link the planes, and the OR plane drives the OUTPUT.
PAL:
trademark of AMD; use PAL as an adjective or expect to receive a letter from AMD's lawyers

Programmable Logic Array (PLA)
[Figure: inputs A, B, C and their complements feed a programmable AND plane; the product terms feed a programmable OR plane that produces the outputs (F2 shown)]

Example using PLA
(X' denotes the complement of X)
• F1(A, B, C) = Σm(0,1,2,4)
• F2(A, B, C) = Σm(0,5,6,7)
• F1 = A'B' + A'C' + B'C'
• F1' = AB + AC + BC
• F2 = AB + AC + A'B'C'

Example using PLA
• F1 = (AB + AC + BC)'
• F2 = AB + AC + A'B'C'
[Figure: PLA with inputs A, B, C and their complements; the AND plane generates the product terms AB, AC, BC, and A'B'C'; the OR plane produces F1 and F2]

PAL Device
[Figure: inputs A and B, true and complemented, together with feedback from IO1, feed a programmable AND plane; a fixed OR plane produces the outputs IO1 and IO2]

PAL Device Design Example
[Figure: PAL with inputs A, B, C, D (true and complemented) and feedback from IO1; one product term is not programmed. Product terms as extracted, overbars not preserved: IO1: ABC, ABCD; IO2: ABC, ABCD, ACD, ABCD]

CPLD and FPGA [Brown&Rose 96]
• Complex Programmable Logic Device (CPLD)
  – Multiple PLDs (e.g. PALs, PLAs) with a programmable interconnection structure
  – Pioneered by Altera
• Field-Programmable Gate Array (FPGA)
  – High logic capacity with a large distributed interconnection structure
    • Logic capacity is measured as the equivalent number of 2-input NAND gates
  – Offers narrower logic resources
    • A CPLD offers logic resources with a wide number of inputs (AND planes)
  – Offers a higher ratio of flip-flops to logic resources than a CPLD
• HCPLD (High-Capacity PLD) is often used to refer to both CPLDs and FPGAs

CPLD Structure
[Figure: several PLD logic blocks arranged around programmable interconnects, surrounded by I/O blocks]

FPGA Structure
[Figure: array of logic blocks with interconnects between them, surrounded by I/O blocks]

FPGA Programmability
• Floating gate transistor
  – Used in EPROM and EEPROM
• SRAM-controlled switch
  – Pass transistors
  – Multiplexers (to determine how to route inputs)
• Antifuse
  – Similar to a fuse
  – Originally an open circuit
  – One-Time Programmable (OTP)
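
Worked example: memory description. The rows of the Memory Description table earlier in this lecture can be recomputed with a few lines of Python. This is a minimal sketch, not part of the original slides; the helper name `describe_memory` and the constants `KI` and `MI` are hypothetical, and the word count is assumed to be at least 2.

```python
def describe_memory(num_words, word_bits):
    """Summarize a (num_words x word_bits) memory."""
    # K address lines select 2**K words, so K is the smallest integer
    # with 2**K >= num_words (assumes num_words >= 2).
    addr_lines = (num_words - 1).bit_length()
    total_bits = num_words * word_bits
    return {
        "addr_lines": addr_lines,
        "data_lines": word_bits,       # one data line per bit of a word
        "total_bytes": total_bits // 8,
    }

KI = 1 << 10  # 1K = 1024 (hypothetical constant name)
MI = 1 << 20  # 1M = 1,048,576 (hypothetical constant name)

for name, words, bits in [
    ("1M x 8", MI, 8),
    ("2M x 4", 2 * MI, 4),
    ("1K x 4", KI, 4),
    ("4M x 32", 4 * MI, 32),
    ("16K x 64", 16 * KI, 64),
]:
    print(name, describe_memory(words, bits))
```

Note that the number of address lines depends only on the number of words, while the total capacity also depends on the word size; this is why 1M x 8 and 2M x 4 both hold 1 MB.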
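
Worked example: endianness. The byte layouts from the Endianness Examples slide (storing 0x87654321 at address 0x0000) can be reproduced with Python's built-in `int.to_bytes`. This is an illustrative sketch only; `store_word` is a hypothetical helper, and a 32-bit word is assumed.

```python
def store_word(value, byteorder, base_address=0x0000):
    """Return (address, byte) pairs for a 32-bit word stored at base_address.

    byteorder is "big" or "little", as accepted by int.to_bytes.
    """
    data = value.to_bytes(4, byteorder)
    return [(base_address + i, b) for i, b in enumerate(data)]

for order in ("big", "little"):
    layout = store_word(0x87654321, order)
    print(order, [(hex(addr), hex(byte)) for addr, byte in layout])
```

The same helper applied to the value 511 in little-endian order gives the bytes 0xFF, 0x01, 0x00, 0x00, which matches the first four entries of the memory allocation example.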
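
Worked example: ROM as decoder plus OR plane. The square lookup table can be modeled in software as a fixed decoder (one-hot row select) followed by a programmed OR plane. The sketch below is a behavioral model under that assumption, not a description of any particular device; the names `decode`, `rom_read`, and `SQUARE_ROM` are hypothetical.

```python
def decode(address, num_inputs):
    """n-to-2^n decoder: one-hot list with a 1 at position `address`."""
    return [1 if i == address else 0 for i in range(1 << num_inputs)]

def rom_read(contents, address, num_inputs, word_bits):
    """Read one word from a ROM modeled as decoder + OR plane."""
    select = decode(address, num_inputs)
    word = 0
    for bit in range(word_bits):
        # Output line `bit` is the OR of all decoder outputs whose
        # connection to this line is programmed (stored bit = 1).
        line = any(
            select[row] and (contents[row] >> bit) & 1
            for row in range(len(contents))
        )
        word |= int(line) << bit
    return word

# Programmed contents: word X holds X^2 (3 address bits, 6 data bits).
SQUARE_ROM = [x * x for x in range(8)]

for x in range(8):
    f = rom_read(SQUARE_ROM, x, 3, 6)
    print(format(x, "03b"), format(f, "06b"))
```

Running the loop reproduces the binary columns of the lookup table, and it also shows the two simplifications noted on the slide: bit F1 is 0 for every X, and bit F0 equals X0.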
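
Worked example: checking the PLA equations. The PLA example specifies F1 and F2 by their minterm lists, (0,1,2,4) and (0,5,6,7). Assuming the sum-of-products forms F1 = (AB + AC + BC)' and F2 = AB + AC + A'B'C', a brute-force truth table confirms that they match those lists. The function names below are hypothetical.

```python
from itertools import product

def f1(a, b, c):
    # F1 is taken as the complement of AB + AC + BC.
    return int(not ((a and b) or (a and c) or (b and c)))

def f2(a, b, c):
    # F2 reuses the product terms AB and AC and adds A'B'C'.
    return int((a and b) or (a and c) or (not a and not b and not c))

def minterms(fn):
    """List the minterm indices (A is the most significant bit) where fn is 1."""
    return [
        4 * a + 2 * b + c
        for a, b, c in product((0, 1), repeat=3)
        if fn(a, b, c)
    ]

print("F1 minterms:", minterms(f1))
print("F2 minterms:", minterms(f2))
```

Because F1 is implemented in complemented form, it shares the product terms AB and AC with F2, so only four product terms (AB, AC, BC, A'B'C') are needed in the AND plane.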