Accelerators to Applications
Computer Organization Lesson
Redesigning Memory By Increasing Access Width

Brian Shelburne
Associate Professor of Mathematics and Computer Science
Wittenberg University
2012

Summary: Accessing 16-bit words from a 16K memory can be sped up if the memory is reorganized to access two adjacent 16-bit words per fetch.

Prerequisites: This exercise is based on the material in the first five chapters of The Elements of Computing Systems: Building a Modern Computer from First Principles by Noam Nisan and Shimon Schocken, MIT Press (2005). Using the text and its accompanying software (freely downloadable from their web site), the reader “builds” a virtual computer from the basic gate level up, using a VHDL-like language to design components that are submitted to a Java-based hardware simulator for testing and execution. By the 5th chapter the user has designed, tested and assembled the components (memory plus CPU) of a fully functioning stored-program computer. The software and accompanying materials are easy to use and flexible, so it is easy to develop one’s own applications with them.

An understanding of Boolean algebra, digital circuits and sequential logic is required. This module assumes the student has completed Project 03 (building and testing a 16K RAM memory) from Noam Nisan and Shimon Schocken’s “The Elements of Computing Systems: Building a Modern Computer from First Principles,” MIT Press 2005 and 2009. Information about Nisan & Schocken’s text and free downloadable software can be found at their companion website http://www1.idc.ac.il/tecs/ Familiarity with Nisan & Schocken’s software and with how test scripts are used to check out circuits is assumed.
Outcomes: The student will design and implement an alternate design for the 16K RAM of 16-bit words (completed in Project 03) that accesses two adjacent 16-bit words (32 bits) at a time.

Required Resources: A knowledge of Nisan & Schocken’s HDL and access to the software tools used in Project 3. Additional resources are found in Chapter 7.3: Chip Organization from Computer Architecture and Organization: An Integrated Approach, 2007 by M. Murdocca and V. Heuring.

Lesson Outline: In Project 3 a 16K RAM of 16-bit words was designed as follows

1. A 16-bit register consisting of 16 flip-flops was designed (Register.hdl)
2. An 8 register chip (RAM8.hdl) combined 8 registers
   Compare this chip’s organization to Fig 7.6 on page 254 of Murdocca & Heuring
3. A 64 register chip (RAM64.hdl) combined 8 RAM8 chips
4. A 512 register chip (RAM512.hdl) combined 8 RAM64 chips
5. A 4K register chip (RAM4K.hdl) combined 8 RAM512 chips
6. A 16K register chip (RAM16K.hdl) combined 4 RAM4K chips

In Project 3 the interface of the RAM4K chip (part 5 above) is

[Figure: RAM4K chip interface. Inputs in[16], load, address[12]; output out[16].]

Internally there are eight RAM512 chips, where the most significant three bits of the 12-bit address are used to select the particular RAM512 chip.

[Figure: RAM4K internal organization. Labels: load and address[12] inputs; DMUX 8 WAY; in[16]; eight RAM 512 chips addressed by address[0..8]; MUX-8-WAY-16 selected by address[9..11]; out[16].]

We can redesign the RAM4K chip in part 5 so that the in and out data buses are 32 bits wide. Label them in1 and in0, and out1 and out0, respectively (in1 and out1 are to the left of in0 and out0; i.e. little endian). Double words are hard-aligned in that addresses are all even (least significant bit is 0).

[Figure: RAM4Kw chip interface. Inputs in1[16], in0[16], load, address[12]; outputs out1[16], out0[16].]

Internally the eight RAM512 chips are paired off, forming 1K blocks, where the left and right RAM512 of each pair hold the odd and even words of the block, respectively.
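The pairing described above can be sketched as a small behavioral model, which may help when working out expected values for a test script. This is only a sketch in Python, not HDL and not a solution to the exercise. The names decode_ram4kw and RAM4KwModel are hypothetical. The bit fields follow the labels on the lesson's RAM4Kw block diagram (address[10..11] selects the pair, address[1..9] goes to every RAM512, address[0] is dropped). The model ignores clocking, so a write takes effect immediately rather than on the next clock tick as it would in the simulator.

```python
from typing import List, Tuple


def decode_ram4kw(address: int) -> Tuple[int, int]:
    """Split a 12-bit RAM4Kw address into (pair, offset).

    Assumed layout, taken from the block diagram labels:
      address[10..11] -> which of the four RAM512 pairs
      address[1..9]   -> register inside each RAM512 of that pair
      address[0]      -> ignored (hard alignment)
    """
    if not 0 <= address < 4096:
        raise ValueError("address must fit in 12 bits")
    pair = (address >> 10) & 0b11
    offset = (address >> 1) & 0x1FF
    return pair, offset


class RAM4KwModel:
    """Unclocked behavioral model: four pairs of 512-word banks."""

    def __init__(self) -> None:
        # odd[p] holds the odd-addressed words of pair p, even[p] the even ones.
        self.odd: List[List[int]] = [[0] * 512 for _ in range(4)]
        self.even: List[List[int]] = [[0] * 512 for _ in range(4)]

    def write(self, address: int, in1: int, in0: int) -> None:
        """Store in1 (high half, odd word) and in0 (low half, even word)."""
        pair, offset = decode_ram4kw(address)
        self.odd[pair][offset] = in1 & 0xFFFF
        self.even[pair][offset] = in0 & 0xFFFF

    def read(self, address: int) -> Tuple[int, int]:
        """Return (out1, out0) for the aligned double word at address."""
        pair, offset = decode_ram4kw(address)
        return self.odd[pair][offset], self.even[pair][offset]
```

Because address[0] is dropped, addresses 6 and 7 name the same double word in this model: a value written at one is read back at the other.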
Double words are hard-aligned in that addresses are all even (least significant bit is 0), so addresses are technically 11 bits (instead of 12). Note the parallelism.

[Figure: RAM4Kw internal organization. Labels: load and address[12] inputs; DMUX 4 WAY; in1[16] and in0[16]; eight RAM 512 chips addressed by address[1..9]; two MUX-4-WAY-16 selected by address[10..11]; out1[16] and out0[16].]

Do the Following

5’. Modify step 5 above. From eight 512 register chips (RAM512.hdl) design a 4K register chip (RAM4Kw.hdl) that accesses two adjacent 16-bit words. Use the following skeletal .hdl file. Note: Treat all addresses as hard-aligned by ignoring the least significant bit (i.e. address[0]).

    /***
     *
     * RAM4Kw.hdl
     *
     ***/

    CHIP RAM4Kw {
        IN in1[16], in0[16], load, address[12];
        OUT out1[16], out0[16];

        PARTS:
        // Implementation missing
    }

6’. Using four RAM4Kw chips, design and implement a 32-bit wide 16K memory of 16-bit registers. A simple modification of Step 6 above is needed to work with the 32-bit wide input/output ports. Note that the 32-bit I/O width is implemented in the RAM4Kw chips.

    /***
     *
     * RAM16Kw.hdl
     *
     ***/

    CHIP RAM16Kw {
        IN in1[16], in0[16], load, address[14];
        OUT out1[16], out0[16];

        PARTS:
        // Implementation missing
    }

Resources
The only tool you’ll need is the hardware simulator. All chips should be implemented in HDL. Skeletal .hdl files are provided.

Contract

Chip (HDL)   Function                                      Test Script    Compare File
RAM4Kw       32-bit wide 16-bit / 4096 register memory     RAM4Kw.tst     RAM4Kw.cmp
RAM16Kw      32-bit wide 16-bit / 16384 register memory    RAM16Kw.tst    RAM16Kw.cmp

Steps
0. RAM4Kw.hdl will be constructed out of eight RAM512.hdl chips.
1. in1[16] is the high order 16 bits and in0[16] is the low order 16 bits of a 32-bit double word. A similar arrangement holds for out1[16] and out0[16].
2. Use one DMux4Way chip as a front-end decoder to select adjacent RAM512 banks; use two Mux4Way16 chips to gate adjacent RAM512 banks to output.
3. Addresses for the RAM4Kw chips are 12 bits, but the least significant bit is ignored (all addresses are assumed to be even). The arrangement is hard-aligned: an odd (misaligned) address is treated as the even address below it, because the least significant bit of every address is ignored.
4. Addresses for the RAM16Kw chip are 14 bits, again with the least significant bit ignored (all addresses are assumed to be even).

Afterword
Unlike the Carry Look-Ahead Adder from Project 02A, the redesigned 16K RAM cannot be used as the memory for the Computer chip that is designed and tested in Chapter 05 of Nisan and Schocken, because its interface differs from that of the 16K RAM from Project 03.

A2AProject02.doc 5/30/2012