CSE 431 Computer Architecture
Fall 2008
Read Me

Mary Jane Irwin (www.cse.psu.edu/~mji)
www.cse.psu.edu/~cg431

[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2008, MK]

CSE431 ReadMe.1 Irwin, PSU, 2008

Permissions to Use Conditions

The slides have evolved over a period of ten+ years, originating with slides developed by Dave Patterson for the first edition of Computer Organization and Design. Since then, they have gone through extensive revisions (both by me and by various UCB faculty). With the publication of the 4th edition, I again revised and updated the entire slide set to coincide with the new edition. They are all now in Microsoft PowerPoint 2007 format; if you are still using 2003, you’re out of luck.

Permission is granted to copy, distribute, and/or alter this slide set for educational purposes only, provided that the complete bibliographic citation and the following credit line are included: “Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2008.” This material may not be copied or distributed for commercial purposes without express written permission of the copyright holders.

I also ask that you acknowledge my (considerable) efforts in some way. One way is to acknowledge that your slides are adapted from mine on the first slide of each lecture that you use, or to retain my copyright if the slides are simply copied and distributed.

CSE 431 Course Details

The slide set is for a 15-week (full semester) senior and first-year graduate level course in computer science and engineering (CSE). It is a required course for both undergraduate computer engineering majors and undergraduate computer science majors. It is also taken by CSE graduate students (usually early in their graduate career and often in preparation for PSU’s PhD candidacy exams).
My goals for CSE 431 are that students understand the organizational paradigms that determine the capabilities and performance of computer systems, and the interactions between the computer’s architecture and its software, so that future software designers (compiler writers, operating system designers, database programmers, …) can achieve the best cost-performance trade-offs and future architects understand the effects of their design choices on software applications.

Course Structure

CSE 431’s prerequisite is a sophomore-level course in computer organization, which also uses Computer Organization and Design, Patterson & Hennessy, © 2008: Chapters 1 (parts of), 2, 3, and 4 (parts of), along with one lecture on pipelining, one on caches, and two on I/O. Slides for this organization course, based on the 3rd edition (I won’t get around to updating those until Fall 2009), are also available.

Since the students were supposed to already know this material, it was only reviewed (and quickly!) in CSE 431. CSE 331 is the course where they learn MIPS assembler and design a simple MIPS processor in VHDL or Verilog. So I assume they already know MIPS assembler and how the basic, single-cycle MIPS datapath works.

This made room in CSE 431 for three lectures on dynamic (superscalar) processors. I developed lectures on that material, keeping them as consistent with the textbook as I could, based on the architecture defined in Guri Sohi’s paper in IEEE Trans. on Computers, Mar. 1990.

The (rough) outline for CSE 431 is included (see the next slide).
Course Outline

Wk  Topic                                                        COD 4 Reading
 1  Introduction and performance metrics                         1
 2  MIPS ISA review                                              2
 3  MIPS arithmetic review; floating point                       3
 4  MIPS datapath and control review                             4.1-4.4
 5  MIPS pipelined datapath, data and control hazards            4.5-4.9
 6  A MIPS SS execution model                                    4.10-4.14, Sohi
 7  SS fetch, decode, and register dataflow issues               Sohi
 8  Catch-up, review and midterm examination                     Midterm exam week
 9  Memory hierarchies; cache basics review                      5.1-5.2
10  Improving cache performance, cache coherence                 5.3, 5.7-5.13
11  Architecture support for virtual memory                      5.4-5.6
12  Disk systems, RAIDs; I/O systems                             6
13  Multiprocessor intro; SMPs and MMPs and SMT                  7.1-7.5
14  GPUs; Network connected multi’s, network topologies          7.6-7.8
15  Performance models; technology trends and future directions  7.9-7.14

Course Assignments/Grading

In addition to homework problems selected from the Exercises included in the book, the students also do a series of simulation experiments using SimpleScalar. We selected five of the benchmarks (that we knew ran fairly quickly) and precompiled them for the students.

In the first SimpleScalar assignment, they experimented with branch prediction for a single-issue, in-order machine.
In the second assignment, they experimented with a multiple-issue, out-of-order machine and compared it to the results from the first set of experiments.
In the third assignment, they experimented with different cache sizes, line sizes, associativities, latencies, cache levels, etc.
In the fourth assignment, they were given a baseline single-issue, in-order machine with a baseline memory system and asked to come up with their “best” alternative design (with some pre-imposed constraints to limit the search space).

There was a single midterm (I don’t have the stamina any more to give and grade two exams during the semester) and a final exam.
A Bit About the Slides Themselves

These slides were being developed at the same time that the final draft of the book was being polished (as I prepare these ReadMe notes at the end of the semester, I have just received my printed copy of the 4th Edition). I did not have access to the figures, pictures, tables, etc. in the book. Some of them have been recreated (by hand); others are from the web (with appropriate credits). Adding and/or replacing figures when the book figures are available might be advisable.

The graphs included in the slides from the book have been converted into PowerPoint “comic” graphs so that they could be animated effectively. The data used to construct the graphs is only approximate; it was constructed to make the graphs as similar to the graphs in the book as possible (once again, I didn’t have the actual data to work from).

The notes section contains backup for each slide from the book and, in places, from the original UCB slides.

Each slide set ends with a reminder slide, which you will want to replace with your own set of reminders to the students!

Keeping Your Students Awake and Involved

One thing that I have started doing so I can call on students by name (and maybe learn their names) is to make name tents for each student, which they pick up at the beginning of class and return at the end of class (those that aren’t picked up I collect during the lecture so I can “take roll” off-line; yes, ugh, I now do that and give a small amount of credit for class attendance). I use the medium tent cards from Office Depot and print the name on both sides so the students can also learn each other’s names!

Throughout the set of slides, you will often see two slides that are almost identical. One is for the class handout and is missing some key points (it is usually marked in the notes section as “for class handout”).
The other is for lecture (marked “for lecture”), where the key points are included and are animated to appear as students respond to questions posed to them in class.

Put the “for lecture” slide in hide mode when preparing class handouts and the “for class handout” slide in hide mode when preparing lectures.

A sample pair of slides follows.

Datapath with Forwarding Hardware

[Figure: pipelined MIPS datapath with forwarding hardware. Labels recoverable from the slide: pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB; PC, Instruction Memory, Register File, Sign Extend, Shift left 2, Add, ALU, ALU cntrl, Data Memory, Control, Branch, PCSrc; Forward Unit, with signals IF/ID.RegisterRs, IF/ID.RegisterRt, EX/MEM.RegisterRd, MEM/WB.RegisterRd.]
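For readers who want a reminder of what the Forward Unit on the sample slide decides, the following is a minimal sketch of the usual textbook forwarding conditions for one ALU operand. It is not taken from the slides: the function name `forward_select`, its parameter names, and the 2-bit select encoding are assumptions chosen for illustration, and the sketch assumes the source register number (Rs or Rt) has already been carried down the pipeline to the EX stage.

```python
def forward_select(src_reg, ex_mem_reg_write, ex_mem_rd,
                   mem_wb_reg_write, mem_wb_rd):
    """Pick the forwarding mux select for one ALU operand (hypothetical helper).

    Assumed encoding:
      0b00  use the value read from the register file
      0b10  forward the ALU result held in the EX/MEM pipeline register
      0b01  forward the value held in the MEM/WB pipeline register
    """
    # MIPS register $0 always reads as zero, so it is never forwarded.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        # The instruction one stage ahead holds the most recent value.
        return 0b10
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        # Otherwise fall back to the instruction two stages ahead.
        return 0b01
    return 0b00


if __name__ == "__main__":
    # add $8,...  followed immediately by an instruction reading $8
    print(forward_select(8, True, 8, False, 0))
```

Checking the EX/MEM condition first is what makes the most recent result win when both older instructions write the same register.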