20-755: The Internet
Lecture 2: Computer Systems I
David O’Hallaron
School of Computer Science and Department of Electrical and Computer Engineering
Carnegie Mellon University
Institute for eCommerce, Summer 1999
Lecture 2, 20-755: The Internet, Summer 1999

Today’s lecture
• Administrative issues (10 min)
• Data (50 min)
• Break (10 min)
• Programs (40 min)

Systems
• A system is a collection of interworking parts.
• Examples:
  – the human body
  – the economy
  – a car
  – a stereo
  – a computer
• Systems are often extremely complicated.
• How do we understand complex systems?
  – Abstraction is the key.

Abstraction
• Example: a lighting system
[Figure: a light bulb and a switch, with terminals labeled A and B.]

Abstraction
• Here’s one way to understand how our lighting system works:
  “Closing the switch induces a voltage drop between A and B, which causes current to flow through the light bulb, which heats up the filament, which causes the filament to emit light. Opening the switch eliminates the voltage differential, which stops the current, which causes the filament to cool and stop emitting light.”

Abstraction
• Here’s another way to understand our lighting system:
  “Close the switch and the light turns on. Open the switch and the light turns off.”
• This is an example of an abstraction, where we describe the behavior of a system in terms of its inputs (the position of the switch) and its outputs (whether it is emitting light or not).

Abstraction
• Abstraction is one of the most powerful weapons in the arsenal of computer science.
• Useful because it hides complexity.
  – High-level languages like C, Java, and Perl provide abstractions for low-level machine instructions.
  – Operating systems provide abstractions for resources such as the CPU, memory, and I/O devices.
  – TCP/IP provides an abstraction for collections of interconnected heterogeneous networks.
• However, abstractions are most useful if we understand something about how things work.

Typical computer system
[Figure: block diagram of a typical computer system. Components shown: processor, interrupt controller, memory, local/IO bus, keyboard controller (keyboard, mouse), serial port controller (modem), parallel port controller (printer), IDE disk controller, SCSI controller with a SCSI bus (disks, cdrom), video adapter (display), network adapter (network).]

Bits
• All computer data (input, output, memory, and even programs) are collections of 1’s and 0’s called bits (binary digits).
• “0” and “1” are abstractions for voltage levels.
  – Easy to store with bistable elements, and can be reliably transmitted on noisy and inaccurate wires.
[Figure: a signal carrying the bits 0, 1, 0, with voltage levels marked at 0.0V, 0.5V, 2.8V, and 3.3V.]

Powers of 10
• Def: 10^p = 10 x 10 x ... x 10 (there are p 10’s)
  – “ten to the power p” or “the pth power of 10”
• Examples of powers of 10:
  10^0 = 1
  10^1 = 10
  10^2 = 10 x 10 = 100
  10^3 = 10 x 10 x 10 = 1,000

Decimal (base 10) numbers
• 10 digits to choose from: 0, 1, ... , 8, 9.
• Each digit corresponds to a different power of 10.
• Examples:
  345 = (3 x 10^2) + (4 x 10^1) + (5 x 10^0) = 300 + 40 + 5
  132 = (1 x 10^2) + (3 x 10^1) + (2 x 10^0) = 100 + 30 + 2

Powers of 2
• Def: 2^p = 2 x 2 x ... x 2 (there are p 2’s)
  – “two to the power p” or “the pth power of two”
• Examples of powers of 2:
  2^0 = 1
  2^1 = 2
  2^2 = 2 x 2 = 4
  2^3 = 2 x 2 x 2 = 8
  2^4 = 2 x 2 x 2 x 2 = 16

Binary (base 2) numbers
• 2 digits to choose from: 0, 1.
• Each digit corresponds to a different power of 2.
• Examples (converting from binary to decimal):
  101_2 = (1 x 2^2) + (0 x 2^1) + (1 x 2^0) = 4 + 0 + 1 = 5
  011_2 = (0 x 2^2) + (1 x 2^1) + (1 x 2^0) = 0 + 2 + 1 = 3

Converting from decimal to binary
• Converting decimal 5 to binary:
  5 / 2 = 2 rem 1    first (rightmost) binary digit is 1
  2 / 2 = 1 rem 0    second binary digit is 0
  1 / 2 = 0 rem 1    third binary digit is 1
  So 5 = 101_2
• Converting decimal 3 to binary:
  3 / 2 = 1 rem 1    first (rightmost) binary digit is 1
  1 / 2 = 0 rem 1    second binary digit is 1
  So 3 = 11_2 = 011_2

Counting in binary
• With n bits, you can represent 2^n numbers: 0, 1, ... , 2^n - 1.
• Example: 1 bit (2^1 = 2 numbers)
  0_2 (0)
  1_2 (1)
• Example: 2 bits (2^2 = 4 numbers)
  00_2 (0)
  01_2 (1)
  10_2 (2)
  11_2 (3)

Counting in binary (cont.)
• Example: 3 bits (2^3 = 8 numbers)
  000_2 (0)
  001_2 (1)
  010_2 (2)
  011_2 (3)
  100_2 (4)
  101_2 (5)
  110_2 (6)
  111_2 (7)

Practice problems
• Powers of two:
  (a) 2^0 =
  (b) 2^1 =
  (c) 2^5 =
• Binary to decimal:
  (a) 0010_2 =
  (b) 0111_2 =
  (c) 1001_2 =
  (d) 1111_2 =
• Decimal to binary:
  (a) 8 =
  (b) 9 =
  (c) 12 =
  (d) 14 =

Practice problems: Counting in binary
(a) How many numbers can you represent using 4 bits?
(b) What is the largest number?
(c) Write them out in binary, starting from 0000_2:

Two’s complement representation of signed integers
• sign bit (the leftmost bit): 0: positive, 1: negative
• positive integers:
  0000 (0)
  0001 (1)
  0010 (2)
  0011 (3)
  0100 (4)
  0101 (5)
  0110 (6)
  0111 (7)
• negative integers:
  1000 (-8)
  1001 (-7)
  1010 (-6)
  1011 (-5)
  1100 (-4)
  1101 (-3)
  1110 (-2)
  1111 (-1)

Interpreting two’s complement numbers
• To negate a two’s complement number:
  – invert the bits (i.e., change each 0 to 1 and each 1 to 0).
  – add 1 to the result.
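The repeated-division conversion and the invert-then-add-1 negation rule above can be sketched in C. This is a minimal illustration, not part of the original slides: the function names to_binary and negate_twos are hypothetical, and the code assumes an unsigned int of at least 32 bits.

```c
/* Write value as exactly nbits binary digits using repeated division by 2.
 * Each remainder is the next digit, starting from the rightmost.
 * out must have room for nbits + 1 characters. */
void to_binary(unsigned value, int nbits, char *out)
{
    for (int i = nbits - 1; i >= 0; i--) {
        out[i] = (char)('0' + value % 2);  /* remainder is the next digit */
        value /= 2;                        /* quotient feeds the next step */
    }
    out[nbits] = '\0';
}

/* Negate an nbits-wide two's complement bit pattern:
 * invert the bits, add 1, and keep only the low nbits bits.
 * Assumes 0 < nbits < 32. */
unsigned negate_twos(unsigned pattern, int nbits)
{
    unsigned mask = (1u << nbits) - 1u;
    return (~pattern + 1u) & mask;
}
```

Masking to nbits is what discards the carry out of the top bit, which is why negating the all-zero pattern gives all zeros again.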
• Examples:
  – To negate 0 = 0000_2: 1111_2 + 1 = 0000_2 (the carry out of the fourth bit is discarded)
  – To negate -2 = 1110_2: 0001_2 + 1 = 0010_2
• You can use this property to determine the two’s complement representation of a negative number.
  – Example: -5 = inv(0101) + 1 = 1010 + 1 = 1011, where inv(x) inverts the bits of x.
  – Example: -12 = ?

Hex (base-16) representation of binary numbers
Binary  Hex  Decimal
0000    0    0
0001    1    1
0010    2    2
0011    3    3
0100    4    4
0101    5    5
0110    6    6
0111    7    7
1000    8    8
1001    9    9
1010    a    10
1011    b    11
1100    c    12
1101    d    13
1110    e    14
1111    f    15
• Hex is a compact way to represent binary numbers. Each hex digit represents 4 bits, so two hex digits represent a byte (8 bits) of data:
  – 01101010_2 = 6a_16
  – 1011111011101111_2 = beef_16
• In Perl, use the “0x” prefix to denote a hex number:
  $x = 0x6a;
  $y = 0xbeef;

Characters
• ASCII characters are stored in 8-bit chunks called bytes.
• Two types of characters:
  – printable characters: characters that can be typed on a keyboard (e.g., ‘d’, ‘%’)
  – unprintable control characters (e.g., BEL, BS)

ASCII character set
(hex code, character, and escape sequence where one exists)
00 NUL '\0'   01 SOH        02 STX        03 ETX
04 EOT        05 ENQ        06 ACK        07 BEL '\a'
08 BS '\b'    09 HT '\t'    0A LF '\n'    0B VT '\v'
0C FF '\f'    0D CR '\r'    0E SO         0F SI
10 DLE        11 DC1        12 DC2        13 DC3
14 DC4        15 NAK        16 SYN        17 ETB
18 CAN        19 EM         1A SUB        1B ESC
1C FS         1D GS         1E RS         1F US
20 SPACE      21 !          22 "          23 #
24 $          25 %          26 &          27 ’
28 (          29 )          2A *          2B +
2C ,          2D -          2E .          2F /
30 0          31 1          32 2          33 3
34 4          35 5          36 6          37 7
38 8          39 9          3A :          3B ;
3C <          3D =          3E >          3F ?
40 @          41 A          42 B          43 C
44 D          45 E          46 F          47 G
48 H          49 I          4A J          4B K
4C L          4D M          4E N          4F O
50 P          51 Q          52 R          53 S
54 T          55 U          56 V          57 W
58 X          59 Y          5A Z          5B [
5C \ '\\'     5D ]          5E ^          5F _
60 `          61 a          62 b          63 c
64 d          65 e          66 f          67 g
68 h          69 i          6A j          6B k
6C l          6D m          6E n          6F o
70 p          71 q          72 r          73 s
74 t          75 u          76 v          77 w
78 x          79 y          7A z          7B {
7C |          7D }          7E ~          7F DEL

Computer memory
[Figure: the typical computer system block diagram, repeated.]

Metrics for space and time
• Space
  – Kilobyte (KB) = 2^10 bytes
    » approx. 10^3, “a thousand”
  – Megabyte (MB) = 2^20 bytes
    » approx. 10^6, “a million”
  – Gigabyte (GB) = 2^30 bytes
    » approx. 10^9, “a billion”
  – Terabyte (TB) = 2^40 bytes
    » approx. 10^12, “a trillion”
• Time
  – millisecond (ms) = 10^-3 s
    » .001 s
    » “a thousandth of a second”
  – microsecond (us) = 10^-6 s
    » .000001 s
    » “a millionth of a second”
  – nanosecond (ns) = 10^-9 s
    » .000000001 s
    » “a billionth of a second”
    » 1 ns is the time it takes light to travel about 12 in!

Computer memory
• Organized as a sequential array of bytes.
• Each byte has an integer address (location).
• Addresses start at 0.
[Figure: a column of bytes with addresses 0000 through 0015.]

Words
• Machine has “word size”
  – nominal size of numbers, including addresses.
  – Numbers, instructions, and addresses are typically fractions or multiples of the word size.
  – Words are addressed by first byte.
• PC-class machines are 32 bits
  – Limits addresses to 4 GB.
  – Becoming too small!
• Newer server-class machines are 64 bits (e.g. DEC Alpha)
  – A 64-bit address can name up to 2^64 bytes (about 1.8 x 10^19); machines of the time implemented fewer address bits, giving practical limits in the terabyte range.
  – In 20 years, we’ll be complaining about this too!
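The relationship between address width and addressable memory, and the rule that a word is named by its first byte, can be sketched in C. This is an illustration under stated assumptions: the function names address_count and word_address are hypothetical, and words are assumed to be aligned on multiples of the word size, which the slides do not state.

```c
#include <stdint.h>

/* Number of distinct byte addresses an address of the given width can name.
 * Valid for 0 < bits < 64; the count for a full 64-bit address (2^64)
 * does not fit in a uint64_t. */
uint64_t address_count(int bits)
{
    return (uint64_t)1 << bits;
}

/* Address of the word containing byte address addr, assuming words are
 * word_bytes long and aligned. A word is named by its first byte. */
uint64_t word_address(uint64_t addr, uint64_t word_bytes)
{
    return addr - (addr % word_bytes);
}
```

With 32 bits, address_count returns 4 x 2^30, which is the 4 GB limit quoted for PC-class machines.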
[Figure: bytes at addresses 0000 through 0015 grouped into words, with word addresses such as Addr = 0008 and Addr = 0012 marked.]

Strings
• A string is represented in memory as a sequence of bytes terminated by 0x00 (‘\0’).
• Known as a “null-terminated string”
• Example (Perl):
  – $x = "Hi Dave!\n";
[Figure: the string stored one byte per address, starting at address 0000: ‘H’ ‘i’ SPACE ‘D’ ‘a’ ‘v’ ‘e’ ‘!’ ‘\n’ ‘\0’.]

Data Representation Summary
• Key concepts:
  – It’s all just bits!
    » everything in a computer is represented as a collection of bits that are interpreted in different ways.
  – memory is organized as a sequence of bytes
    » each byte in memory has its own address.
  – each machine has a nominal word size
    » numbers are fractions or multiples of words
    » the address of a word is the address of its first byte
• Floating point representation is used for nonintegral numbers:
  – e.g., 3.14159
  – too complex for us to study in this course.

Break time! (10 mins)

Today’s lecture
• Administrative issues (10 min)
• Data (50 min)
• Break (10 min)
• Programs (40 min)

Programs
[Figure: the typical computer system block diagram, repeated.]

Processor components
[Figure: a processor (PC, ALU, register file) connected to memory (object code, program data) by address, data, and instruction paths.]
• PC (Program Counter)
  » Contains address of the next instruction
• ALU (Arithmetic/Logic Unit)
  » addition, subtraction, etc.
• Register File
  » Small fast internal memory for heavily used program data, typically 16, 32, or 64 locations (called registers), each of which holds 1 word.
  [Figure: register file with register names R0, R1, R2, ... , R30, R31.]
• Memory
  » Contains both object code and data

Object code
• Processor simply executes one machine language instruction after another until you unplug it.
• A program (or code) is a related set of these instructions.
• Key idea: Programs are stored in memory as simply another kind of data
  – variously called object code, machine code, object program.
• Example:
  – on the DEC Alpha, the word 0x42110401 is the bit pattern for an instruction that adds the contents of two registers and stores the result in a third register.

Assembly Language
• Machine language instructions are represented in ASCII text form as assembly language instructions.
• Assembly
  – Add the integers in registers 16 and 17 and store the result in register 1.
    addq r16,r17,r1
• Object Code
  – 32-bit pattern
  – Stored at address 0x1200012d0
    0x1200012d0: 0x42110401

Basic processor operation
• Fetch
  – load an instruction from memory, using the address contained in the PC.
• Execute
  – load word from memory to register file
  – store word from register file to memory
  – perform an arithmetic operation (e.g., addition) on the contents of two registers and store the result in a register.
• Update PC
  – if Branch instruction
    » set PC to some new address
  – if not Branch instruction
    » increment PC to point to next sequential instruction
    » ex: PC <-- PC + 4

Basic instructions: load
• Move a word from memory to a register.
  – example: ld r1, addr
  – move word at address addr (Mem[addr]) to register r1.
  – increment PC to the address of the next instruction.
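The fetch, execute, update-PC cycle from “Basic processor operation” can be sketched as a toy simulator in C. Everything here is an assumption made for illustration: the instruction format is a hypothetical struct rather than a real DEC Alpha bit pattern, OP_HALT is added so the loop can stop (a real processor runs until you unplug it), and data memory is indexed by word for simplicity.

```c
#include <stdint.h>

/* Hypothetical toy instruction set; not the DEC Alpha encoding. */
enum opcode { OP_LD, OP_ST, OP_ADD, OP_BRANCH, OP_HALT };

struct instr {
    enum opcode op;
    int a, b, c;   /* register numbers, a word index, or a branch target */
};

struct machine {
    uint32_t pc;        /* byte address of the next instruction */
    int64_t  reg[32];   /* register file */
    int64_t  mem[16];   /* data memory, indexed by word */
};

/* Run until OP_HALT. Each instruction occupies 4 bytes, so code[pc / 4]
 * is the instruction the PC points at. */
void run(struct machine *m, const struct instr *code)
{
    for (;;) {
        struct instr in = code[m->pc / 4];                 /* fetch */
        switch (in.op) {                                   /* execute */
        case OP_LD:                                        /* ld ra, word b */
            m->reg[in.a] = m->mem[in.b];
            break;
        case OP_ST:                                        /* st ra, word b */
            m->mem[in.b] = m->reg[in.a];
            break;
        case OP_ADD:                                       /* add ra, rb, rc */
            m->reg[in.c] = m->reg[in.a] + m->reg[in.b];
            break;
        case OP_BRANCH:                                    /* PC <-- addr */
            m->pc = (uint32_t)in.a;
            continue;
        case OP_HALT:
            return;
        }
        m->pc += 4;                                        /* PC <-- PC + 4 */
    }
}
```

The branch case skips the PC + 4 update, which is the only difference between a branch and every other instruction in this sketch.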
[Figure: processor and memory during “ld r1, addr”: the instruction comes from Mem[PC], addr goes to memory, Mem[addr] returns to the register file, and PC <-- PC + 4.]

Basic instructions: store
• Move a word from a register to memory
  – example: st r1, addr
  – move contents of register r1 (Reg[r1]) to memory address addr.
  – increment PC to the address of the next instruction.
[Figure: processor and memory during “st r1, addr”: the instruction comes from Mem[PC], Reg[r1] is written to memory at addr, and PC <-- PC + 4.]

Basic instructions: arithmetic operations
• Add the contents of two registers and store the result in a third register
  – example: add r16,r17,r1
  – add contents of r16 and r17 and store the result in r1.
  – increment PC to the address of the next instruction.
[Figure: processor and memory during “add r16, r17, r1”: the instruction comes from Mem[PC], and PC <-- PC + 4.]

Basic instructions: branch
• Branch to a new location in the program
  – example: branch addr
  – set the PC to addr
[Figure: processor and memory during “branch addr”: the instruction comes from Mem[PC], and PC <-- addr.]

Altering the control flow
• Changing the PC from its default value (i.e., the address of the next instruction in memory) is called altering the control flow.
• There are two mechanisms for altering the control flow:
  – executing branch instructions
  – exceptions
    » crucial mechanism for modern multitasking operating systems.

Exceptions and interrupts
[Figure: the typical computer system block diagram, repeated.]

Exceptions
• An exception is a transfer of control to the OS in response to some event (i.e.
change in processor state)
[Figure: a user process runs until an event occurs; the exception transfers control to the operating system for exception processing by an exception handler (also called a device driver or an interrupt handler), followed by an optional exception return to the user process.]

Internal (CPU) exceptions
• Internal exceptions occur as a result of events generated by executing instructions.
  – Execution of a SYSCALL instruction
    » allows a program to ask for OS services (e.g., timer updates)
  – Execution of a BREAK instruction
    » used by debuggers
  – Errors during instruction execution
    » arithmetic overflow, address error, parity error, undefined instruction
  – Events that require OS intervention
    » virtual memory page fault

External (I/O) exceptions (or I/O interrupts)
• External exceptions occur as a result of events generated by devices external to the processor (managed by interrupt controller).
  – I/O interrupts
    » hitting ^C at the keyboard
    » arrival of a packet from a network
    » arrival of data from a disk
  – Hard reset interrupt
    » hitting the reset button
  – Soft reset interrupt
    » hitting ctl-alt-delete on a PC

High-level languages
• High-level languages like C/C++, Java, and Perl provide an abstraction for the low-level details of machine-language programs.

C code:
  long foo(long a, long b, long c)
  {
      long sum = (a+a+b)*c;
      return(sum);
  }

Corresponding machine/assembly code:
  foo:
    0x42100400   addq a0,a0,v0
    0x40110400   addq v0,a1,v0
    0x4c120400   mulq v0,a2,v0
    0x6bfa8001   ret  zero,(ra),1

Procedures
• Procedures (or functions) are named collections of commonly executed instruction sequences.
• Crucial abstraction mechanism in every programming language at every level.
  – take some input, produce some output

Perl procedure example
  # definition of function say
  # prints first parameter followed by second
  sub say {
      print "$_[0], $_[1]!\n";
  }

  #
  # main body of Perl program
  # From “Learning Perl”, page 95.
  #

  # first invocation of say
  say("hello", "world");   # prints "hello, world!"

  # second invocation of say
  $x = "hi there";
  $y = "Dave";
  say($x, $y);             # prints "hi there, Dave!"

Compiled vs interpreted programs
• Compiled programs:
  – translated in a series of steps by a compiler, assembler, and linker from an ASCII source program written in a high-level language to a binary executable...
  – and then loaded into memory by a loader and executed.
  – Examples: C/C++ compilers, Java Just-in-Time (JIT) compilers
• Interpreted programs:
  – an executing interpreter reads the ASCII source program and executes its statements.
  – Examples: Java virtual machine (JVM), Perl interpreter.
• Compiled object code is to processor as interpreted source program is to interpreter.

Compiled programs
  C program (p1.c p2.c), the “source program”: ASCII text files
    | Compiler
  Asm program (p1.s p2.s): ASCII text files
    | Assembler
  Object program (p1.o p2.o): binary files, combined with libraries (.a)
    | Linker
  Executable object program (p), the “executable program”: binary file
    | Loader
  Executing process (p): memory

Interpreted programs
  Perl script (foo.pl)
    | read by
  Executing Perl interpreter

Pros and cons of compiled and interpreted programs
• Efficiency
  – compiled programs are more efficient
    » instructions are executed directly by hardware
    » interpreted programs can be orders of magnitude slower than compiled programs.
    » one of the current problems with Java.
• Ease of Use
  – interpreted programs are easier to write
    » compiled code: edit, compile, link, execute cycle
    » interpreted code: edit, execute cycle

Processes
• A process is an instance of a running program
  – runs concurrently with other processes (multitasking)
  – managed by a shared piece of OS code called the kernel
    » kernel is not a separate process, but rather runs as part of some user process
  – each process has its own data space and process id (pid)
  – data for each process protected from other processes
[Figure: timeline for Process A and Process B. Over time, execution alternates between user code and kernel code. Just a stream of instructions!]

Unix process hierarchy
[Figure: process tree rooted at process [0], then init [1], with a daemon (e.g. httpd) and a shell below it, followed by child and grandchild processes.]

Unix Startup: Step 1
1. Pushing reset button loads the PC with the address of a small bootstrap program.
2. Bootstrap program loads the boot block (disk block 0).
3. Boot block program loads kernel (e.g., /vmunix)
4. Boot block program passes control to kernel.
5. Kernel handcrafts the data structures for process 0.
[Figure: process [0] and init [1].]
• process 0: handcrafted kernel process
• process 1: user mode process, created by fork() and exec(/sbin/init)

Unix Startup: Step 2
[Figure: process [0], init [1] reading /etc/inittab, daemons (e.g. snmp), and getty.]
• init forks new processes as per the /etc/inittab file
• init forks a getty (get tty or get terminal) for the console

Unix Startup: Step 3
[Figure: process [0], init [1], and login.]
• getty execs a login program

Unix Startup: Step 4
[Figure: process [0], init [1], and tcsh.]
• login gets user’s login and password
• if OK, it execs a shell
• if not OK, it execs another getty

Running programs from the Unix shell
• The shell displays a prompt (% in this example) and waits for the user to type a command such as ls, cd, or the name of a program to execute.
[Figure: process [0], init [1], and the shell (tcsh), which is connected to the keyboard and writes “%” to the screen.]
  screen:
    %

Running programs from the Unix shell
• When the user types in the name of a program to run (e.g. foo), the shell creates a new child process and runs the program within that process, transferring control of the keyboard and display to the child.
[Figure: process [0], init [1], and the shell (tcsh); “foo” arrives from the keyboard and the shell creates child (foo).]
  screen:
    % foo

Running programs from the Unix shell
• While the child is running, it reads input from the keyboard and writes output to the screen.
[Figure: process [0], init [1], shell (tcsh), and child (foo), which is now connected to the keyboard and screen.]
  screen:
    % foo
    <output from foo>

Running programs from the Unix shell
• The shell waits for the child process (foo) to finish and then prints another prompt, indicating that it is ready to read another command from the keyboard.
[Figure: process [0], init [1], and the shell (tcsh), again connected to the keyboard and writing “%” to the screen.]
  screen:
    % foo
    <output from foo>
    %

Programs summary
• Key concepts:
  – programs exist at different levels of abstraction
    » machine code, assembly code, high-level source
  – programs are just data in memory
  – program instructions are just bits!
  – programs can be executed by a processor or by an interpreter.
  – processes are instances of running programs.
  – modern OS’s allow multiple processes to run independently at the same time.
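The shell behavior described in “Running programs from the Unix shell” (create a child process, run the program in it, wait for it to finish) can be sketched in C with the POSIX calls fork, execvp, and waitpid. This is a minimal sketch, not how tcsh is actually implemented: the function name run_program is hypothetical, and prompt handling and terminal control are left out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] in a new child process and wait for it, as a shell does.
 * Returns the child's exit status, or -1 on failure. */
int run_program(char *const argv[])
{
    pid_t pid = fork();                /* create the child process */
    if (pid < 0)
        return -1;                     /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);         /* child: become the requested program */
        _exit(127);                    /* reached only if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)  /* parent: wait for the child to finish */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell would call something like this once per command line and print its next prompt after the call returns.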