Introducing Computer Systems from a Programmer’s Perspective Randal E. Bryant, David R. O’Hallaron Computer Science and Electrical Engineering Carnegie Mellon University 1200 1000 800 600 400 s16 16m s14 s15 s12 Size s13 s10 s11 s8 s9 s5 s6 Stride s7 s3 s4 0 8m 4m 2m 1024k 512k 256k 128k 64k 32k 16k 8k 4k 2k 1k .5k 200 s1 s2 MB/s Outline Introduction to Computer Systems Course taught at CMU since Fall, 1998 Some ideas on labs, motivations, … Computer Systems: A Programmer’s Perspective Our textbook Ways to use the book in different courses The Role of Systems Design in CS/Engineering Curricula –2– ICS Background 1995-1997: REB/DROH teaching computer architecture course at CMU. Good material, dedicated teachers, but students hate it Don’t see how it will affect there lives as programmers Course Evaluations 5 4.5 CS Average 4 3.5 REB: Computer Architecture 3 2.5 2 1995 –3– 1996 1997 1998 1999 2000 2001 2002 ICS Computer Arithmetic Builder’s Perspective 32-bit Multiplier –4– How to design high performance arithmetic circuits ICS Computer Arithmetic Programmer’s Perspective void show_squares() { int x; for (x = 5; x <= 5000000; x*=10) printf("x = %d x^2 = %d\n", x, x*x); } x x x x x x x = 5 = 50 = 500 = 5000 = 50000 = 500000 = 5000000 x2 x2 x2 x2 x2 x2 x2 = = = = = = = 25 2500 250000 25000000 -1794967296 891896832 -1004630016 Numbers are represented using a finite word size Operations can overflow when values too large But behavior still has clear, mathematical properties –5– ICS Memory System Builder’s Perspective Builder’s Perspective Direct mapped or set indexed? Write through or write back? Regs CPU L1 d-cache L1 i-cache L2 unified cache How many lines? –6– Synchronous or asynchronous? Main memory Disk Virtual or physical indexing? Must make many difficult design decisions Complex tradeoffs and interactions between components ICS Memory System Programmer’s Perspective void copyij(int int { int i,j; for (i = 0; i for (j = 0; dst[i][j] } src[2048][2048], dst[2048][2048]) < 2048; i++) j < 2048; j++) = src[i][j]; 59,393,288 clock cycles void copyji(int int { int i,j; for (j = 0; j for (i = 0; dst[i][j] } < 2048; j++) i < 2048; i++) = src[i][j]; 1,277,877,876 clock cycles 21.5 times slower! src[2048][2048], dst[2048][2048]) (Measured on 2GHz Intel Pentium 4) Hierarchical memory organization Performance depends on access patterns Including how step through multi-dimensional array –7– ICS The Memory Mountain Read throughput (MB/s) 1200 Pentium III Xeon 550 MHz 16 KB on-chip L1 d-cache 16 KB on-chip L1 i-cache 512 KB off-chip unified L2 cache copyij 1000 L1 800 copyji 600 400 xe L2 200 –8– 2k 8k 32k 128k 512k 2m 8m s15 s13 s9 s11 Stride (words) Mem s7 s5 s3 s1 0 Working set size (bytes) ICS Background (Cont.) 1997: OS instructors complain about lack of preparation Students don’t know machine-level programming well enough What does it mean to store the processor state on the run- time stack? –9– Our architecture course was not part of prerequisite stream ICS Birth of ICS 1997: REB/DROH pursue new idea: Introduce them to computer systems from a programmer's perspective rather than a system designer's perspective. Topic Filter: What parts of a computer system affect the correctness, performance, and utility of my C programs? 1998: Replace architecture course with new course: 15-213: Introduction to Computer Systems Curriculum Changes – 10 – Sophomore level course Eliminated digital design & architecture as required courses for CS majors ICS 15-213: Intro to Computer Systems Goals Teach students to be sophisticated application programmers Prepare students for upper-level systems courses Taught every semester to 150 students 50% CS, 40% ECE, 10% other. Part of the 4-course CMU CS core: – 11 – Data structures and algorithms (Java) Programming Languages (ML) Systems (C/IA32/Linux) Intro. to theoretical CS ICS ICS Feedback Students Course Evaluations 5 REB: Intro. Comp. Systems 4.5 CS Average 4 3.5 REB: Computer Architecture 3 2.5 2 1995 1996 1997 1998 1999 2000 2001 2002 Faculty – 12 – Prerequisite for most upper level CS systems courses Also required for ECE embedded systems, architecture, and network courses ICS Lecture Coverage Data representations [3] It’s all just bits. int’s are not integers and float’s are not reals. IA32 machine language [5] Analyzing and understanding compiler-generated machine code. Program optimization [2] Understanding compilers and modern processors. Memory Hierarchy [3] Caches matter! Linking [1] – 13 – With DLL’s, linking is cool again! ICS Lecture Coverage (cont) Exceptional Control Flow [2] The system includes an operating system that you must interact with. Measuring performance [1] Accounting for time on a computer is tricky! Virtual memory [4] How it works, how to use it, and how to manage it. I/O and network programming [4] Programs often need to talk to other programs. Application level concurrency [2] Processes, I/O multiplexing, and threads. Total: 27 lectures, 14 week semester. – 14 – ICS Labs Key teaching insight: Cool Labs Great Course A set of 1 and 2 week labs define the course. Guiding principles: – 15 – Be hands on, practical, and fun. Be interactive, with continuous feedback from automatic graders Find ways to challenge the best while providing worthwhile experience for the rest Use healthy competition to maintain high energy. ICS Lab Exercises Data Lab (2 weeks) Manipulating bits. Bomb Lab (2 weeks) Defusing a binary bomb. Buffer Lab (1 week) Exploiting a buffer overflow bug. Performance Lab (2 weeks) Optimizing kernel functions. Shell Lab (1 week) Writing your own shell with job control. Malloc Lab (2-3 weeks) Writing your own malloc package. Proxy Lab (2 weeks) – 16 – Writing your own concurrent Web proxy. ICS Bomb Lab Idea due to Chris Colohan, TA during inaugural offering Bomb: C program with six phases. Each phase expects student to type a specific string. Wrong string: bomb explodes by printing BOOM! (- 1/4 pt) Correct string: phase defused (+10 pts) In either case, bomb sends mail to a spool file Bomb daemon posts current scores anonymously and in real time on Web page Goal: Defuse the bomb by defusing all six phases. For fun, we include an unadvertised seventh secret phase The kicker: – 19 – Students get only the binary executable of a unique bomb To defuse their bomb, students must disassemble and ICS reverse engineer this binary Properties of Bomb Phases Phases test understanding of different C constructs and how they are compiled to machine code Phase 1: string comparison Phase 2: loop Phase 3: switch statement/jump table Phase 4: recursive call Phase 5: pointers Phase 6: linked list/pointers/structs Secret phase: binary search (biggest challenge is figuring out how to reach phase) Phases start out easy and get progressively harder – 20 – ICS Let’s defuse a bomb phase! 08048b48 <phase_2>: ... # function prologue not shown 8048b50: mov 0x8(%ebp),%edx 8048b53: add $0xfffffff8,%esp 8048b56: lea 0xffffffe8(%ebp),%eax 8048b59: push %eax 8048b5a: push %edx 8048b5b: call 8048f48 <read_six_nums> 8048b60: 8048b68: mov lea 8048b70: 8048b74: 8048b77: 8048b7a: 8048b7c: 8048b81: 8048b82: 8048b85: mov 0xfffffffc(%esi,%ebx,4),%eax add $0x5,%eax cmp %eax,(%esi,%ebx,4) je 8048b81 <phase_2+0x39> call 804946c <explode_bomb> inc %ebx cmp $0x5,%ebx jle 8048b70 <phase_2+0x28> ... # function epilogue not shown ret 8048b8f: – 21 – $0x1,%ebx 0xffffffe8(%ebp),%esi # edx = &str # eax = &num[] on stack # push function args # rd 6 ints from str 2 num # i = 1 # esi = &num[] on stack # # # # # # # # LOOP: eax = num[i-1] eax = num[i-1] + 5 if num[i-1] + 5 == num[i] then goto OK: else explode! OK: i++ if (i <= 5) then goto LOOP: # YIPPEE! ICS Source Code for Bomb Phase /* * phase2b.c - To defeat this stage the user must enter arithmetic * sequence of length 6 and delta = 5. */ void phase_2(char *input) { int ii; int numbers[6]; read_six_numbers(input, numbers); for (ii = 1; ii < 6; ii++) { if (numbers[ii] != numbers[ii-1] + 5) explode_bomb(); } } – 22 – ICS The Beauty of the Bomb For the Student Get a deep understanding of machine code in the context of a fun game Learn about machine code in the context they will encounter in their professional lives Working with compiler-generated code Learn concepts and tools of debugging Forward vs backward debugging Students must learn to use a debugger to defuse a bomb For the Instructor – 23 – Self-grading Scales to different ability levels Easy to generate variants and to port to other machines ICS ICS Summary Proposal Introduce students to computer systems from the programmer's perspective rather than the system builder's perspective Themes What parts of the system affect the correctness, efficiency, and utility of my C programs? Makes systems fun and relevant for students Prepare students for builder-oriented courses Architecture, compilers, operating systems, networks, distributed systems, databases, … Since our course provides complementary view of systems, does not just seem like a watered-down version of a more advanced course Gives them better appreciation for what to build – 35 – ICS Fostering “Friendly Competition” Desire Challenge the best without blowing away everyone else Method Web-based submission of solutions Server checks for correctness and computes performance score How many stages passed, program throughput, … Keep updated results on web page Students choose own nom de guerre Relationship to Grading – 36 – Students get full credit once they reach set threshold Push beyond this just for own glory/excitement ICS Shameless Plug http://csapp.cs.cmu.edu Published August, 2002 – 37 – ICS CS:APP Vital stats: 13 chapters 154 practice problems (solutions in book), 132 homework problems (solutions in IM) 410 figures, 249 line drawings 368 C code example, 88 machine code examples Turn-key course provided with book: – 38 – Electronic versions of all code examples. Powerpoint, EPS, and PDF versions of each line drawing Password-protected Instructors Page, with Instructor’s Manual, Lab Infrastructure, Powerpoint lecture notes, and Exam problems. ICS Adoptions Adoptions May, 2006 – 39 – Research universities: Prepare students for advanced courses Small colleges: Only systems course ICS Translations – 40 – ICS Coverage Material Used by ICS at CMU Pulls together material previously covered by multiple textbooks, system programming references, and man pages Greater Depth on Some Topics IA32 floating point Dynamic linking Thread programming Additional Topic – 41 – Computer Architecture Added to cover all topics in “Computer Organization” course ICS The Evolving CS & Engineering Curriculum Programming Lies at the Heart of Most Modern Systems Computer systems Embedded devices: Cell phones, automobile controls, … Electronics: DSPs, programmable controllers Programmers Have to Understand Their Machines and Their Limitations Correctness: computer arithmetic, storage allocation Efficiency: memory & CPU performance Knowing How to Build Systems Is Not the Way to Learn How to Program Them – 45 – It’s wasteful to teach every computer scientist how to design a microprocessor ICSof Knowledge of how to build does not transfer to knowledge