CSCE 121:509-512
Simple Computer Model
Spring 2015
Based on slides created by Bjarne Stroustrup and Jennifer Welch
CSCE 121:509-512 -- Set 2: Architecture

1 HOW DATA IS REPRESENTED

• The fundamental unit of computer storage is a bit, a data entity that can take on one of two values, 0 or 1
• Given a sequence of bits (01010001), what does it represent? It depends on the code:
  – Base 10 integer: 1,010,001
  – Base 2 integer: 1*2^0 + 1*2^4 + 1*2^6 = 81 (in base 10)
  – ASCII: the letter ‘Q’
• You must know the code to decode a sequence
• There are many other codes…

2 HOW DATA IS STORED

Bytes Make Memory
• Byte: a group of 8 bits
• How many different values can a byte hold?
• 00000000, 00000001, 00000010, …, 11111110, 11111111
• 2^8 = 256 possibilities
• Memory: a long sequence of bytes, numbered 0, 1, 2, 3, …

  11001100   11001100   00110011   01001100   …..
   byte 0     byte 1     byte 2     byte 3

Addresses and Words
• Address: the number of a byte in memory
• The contents of a byte can change
• Consecutive locations are used to store longer sequences of information
• 4 bytes = 1 word (in this model; word size varies by machine)

  |------------------ a word ------------------|
  11001100   11001100   00110011   01001100   …..
   byte 0     byte 1     byte 2     byte 3
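
The three readings of 01010001 from section 1, and the count of values a byte can hold, can be checked with a few lines of code. The following Python sketch is not part of the original slides; the helper name `place_values` is made up for illustration.

```python
# Decode the same bit string under three different codes.
bits = "01010001"

as_base10 = int(bits, 10)   # read the digits as a base 10 integer
as_base2 = int(bits, 2)     # read the digits as a base 2 integer
as_ascii = chr(as_base2)    # look up that integer in the ASCII table


def place_values(bit_string):
    """Return the powers of 2 contributed by each 1 bit, left to right.

    Hypothetical helper, introduced only for this illustration.
    """
    n = len(bit_string)
    return [2 ** (n - 1 - i) for i, b in enumerate(bit_string) if b == "1"]


# A byte has 8 bits, each 0 or 1, so it can hold 2^8 different patterns.
values_per_byte = 2 ** 8

print(as_base10)            # 1010001
print(place_values(bits))   # [64, 16, 1]
print(as_base2)             # 81
print(as_ascii)             # Q
print(values_per_byte)      # 256
```

The bits never change between the three readings; only the code used to interpret them does.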
Limitations of Finite Data Encodings
• Overflow: the number is too large
  – suppose 1 byte stores integers in base 2, from 0 (00000000) to 255 (11111111)
  – if the byte holds 255, then adding 1 to it results in 256, which needs 9 bits (100000000) and so is too large to be stored in the byte
• Roundoff error:
  – insufficient precision (size of word): try to store 1/8, which is .001 in base 2, with only two bits
  – nonterminating expansions in the current base: try to store 1/3 in base 10, which is .333…
  – nonterminating expansions in every base: irrational numbers such as π

Kinds of Storage
• Cache: super-fast, faster than main memory
• Main memory: random access, equally fast to access any address
• Disk: random access, significantly slower than main memory
• Tape: sequential access, significantly slower than disk

3 HOW DATA IS OPERATED ON

Main Players
• Data in main memory cannot be operated on directly: it must first be copied (loaded) to special locations called registers
• Once data is in registers, it can be operated on by circuitry called the arithmetic logic unit (ALU)
  – e.g., add, multiply
• The result of the operation can then be copied (stored) back to main memory
• The procedure is organized by the control unit circuitry, which
  – figures out what the ALU should do next
  – transfers data between main memory and registers
• Registers, ALU and control unit are in the CPU

[Figure: the CPU, containing the ALU, the registers and the control unit, connected to main memory]

Machine Instructions
Goal: add the number stored in address 3 and the number stored in address 6; put the result in address 10.
The control unit does the following:
1. copies data from main memory address 3 into some register, say 1: LOAD 3,1
2. copies data from main memory address 6 into some register, say 4: LOAD 6,4
3. tells the ALU to add the contents of registers 1 and 4, and put the result in some register, say 3: ADD 1,4,3
4. copies data from register 3 into main memory address 10: STORE 3,10
LOAD, ADD and STORE are machine instructions.
How does the control unit know which instruction is next? The program!

4 HOW A PROGRAM IS STORED

Program: a list of machine instructions using some agreed-upon coding convention.
Example: ADD 1,4,3

  field         value   bits
  opcode        ADD     0010
  1st operand   1       0001
  2nd operand   4       0100
  3rd operand   3       0011

The program is stored in memory the same way data is stored!

5 HOW A PROGRAM IS EXECUTED ON THE DATA

How a Program is Executed
• The control unit has
  – an instruction register: holds the current instruction to be executed
  – a program counter: holds the address of the next instruction in the program to be fetched from memory
• The program counter tells where the computer is in the program. Usually the next instruction to execute is the next instruction in memory
• Sometimes we want to JUMP to another instruction (e.g., to implement if or while)
  – unconditional JUMP: always jump to the address given
  – conditional JUMP: only jump if a certain condition is true (e.g., some register contains 0)

[Figure: the CPU and main memory as before, with the instruction register and program counter shown inside the control unit]

Machine Cycle
• fetch the next instruction into the instruction register (IR), as indicated by the program counter (PC), and increment the PC
• decode the bit pattern in the IR to figure out which circuitry needs to be activated to perform the instruction
• execute the instruction by copying data into registers and activating the ALU to do the right thing
  – a JUMP may cause the PC to be altered

Diagram of Architecture
[Figure: the CPU holds the PC, the IR, registers R1 to R4, the ALU and the control unit. A bus connects the CPU to main memory, whose cells are numbered 0, 1, 2, 3, 4, 5, …, 95, 96, 97, 98, 99, 100, …; some cells hold data and others hold the program (first instr., second instr., third instr., …)]
6 HOW A PROGRAM IS TRANSLATED INTO MACHINE LANGUAGE

Evolution of Programming Languages
• Machine languages:
  – all in binary
  – machine dependent
  – painful for people
  – no translation needed
• Assembly languages:
  – allow symbols for operators and addresses
  – still machine dependent
  – slightly less painful
  – an assembler translates assembly language programs into machine language
• High-level languages:
  – such as Fortran, C, Java, C++
  – machine independent
  – easier for people
  – a compiler translates high-level language programs into machine language

Why Compiling is Hard
• one high-level instruction can correspond to several machine instructions
  – x = (y+3)/z;
• fancier control structures than JUMP
  – while
  – repeat
  – if-then-else
  – case/switch
• fancier data structures
  – arrays: must calculate addresses for references to array elements
  – structs: similar to arrays, but less regular
• functions/subroutines/methods: generate machine language to
  – copy input parameter values
  – save the return address
  – start executing at the beginning of the function code
  – copy output parameter values at the end
  – set the PC to the return address

Compilation Process
• Lexical Analysis
  – break up strings of characters into logical components, called tokens, and discard comments and spaces
  – Ex: total = sum + 55.32 has 5 tokens (total, =, sum, +, 55.32)
• Parsing
  – decide how the tokens are related
  – Ex: sum + 55.32 is an arithmetic expression
  – Ex: total = sum + 55.32 is an assignment statement
• Code Generation
  – generate machine instructions for each high-level instruction
  – the resulting machine language program, called object code, is written to disk

Linking
• The linker combines the results of compiling different pieces of the program separately.
• If pieces refer to each other, these references cannot be resolved during independent compilation; the linker resolves them.
• The combined code is written to disk.

[Figure: two separately compiled pieces. "function p … refers to x …" and "main … declares x … invokes p …"]

Loading
• To run the program, the loader copies the object code from disk into main memory (the location in main memory is determined by the operating system, not the programmer)
• The loader initializes the PC to the starting location of the program and adjusts JUMP addresses
• The result is an executable

[Figure: Source code for your program → Compiler → Object code for your program; Object code for your program + Object code for libraries → Linker/Loader → Executable for your program]

Credits
• Slide 1: “NEC APC” by Niv Singer, licensed under CC BY-SA 2.0
• Slide 3: “La tecnología...” by Infocux Technologies, licensed under CC BY-NC 2.0
• Slide 4: http://commons.wikimedia.org/wiki/File:ASCII_Code_Chart.svg
• Slide 8: “Overflow” by Braveheart, licensed under CC BY-NC-ND 2.0
• Slide 9: “pumpkin pi” by Craig Damlo, licensed under CC BY-NC-ND 2.0
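
Worked example: the machine cycle

The fetch, decode, execute cycle from section 5 can be tried out on the four-instruction program from section 3 (LOAD 3,1; LOAD 6,4; ADD 1,4,3; STORE 3,10). The Python sketch below is a toy model, not a real instruction set, and it makes several assumptions that are not in the slides: instructions are stored as tuples rather than bit patterns, a HALT instruction marks the end of the program, the conditional jump is called JUMPZ, and the values 20 and 22 at addresses 3 and 6 are invented sample data.

```python
def run(memory, start):
    """Run the program stored in `memory`, beginning at address `start`.

    Toy model: each memory cell holds either a number (data) or a tuple
    (an instruction). HALT and JUMPZ are hypothetical names chosen for
    this sketch.
    """
    registers = {}
    pc = start                      # program counter
    while True:
        ir = memory[pc]             # fetch into the instruction register
        pc += 1                     # increment the PC
        op, *operands = ir          # decode
        if op == "HALT":            # assumed stop instruction
            break
        elif op == "LOAD":          # LOAD address,register
            address, reg = operands
            registers[reg] = memory[address]
        elif op == "ADD":           # ADD register,register,destination
            a, b, dest = operands
            registers[dest] = registers[a] + registers[b]
        elif op == "STORE":         # STORE register,address
            reg, address = operands
            memory[address] = registers[reg]
        elif op == "JUMP":          # unconditional jump
            (pc,) = operands
        elif op == "JUMPZ":         # jump only if the register contains 0
            reg, target = operands
            if registers[reg] == 0:
                pc = target
        else:
            raise ValueError(f"unknown instruction: {ir!r}")
    return registers


# Data and program share one memory, as in the Diagram of Architecture.
memory = [0] * 101
memory[3] = 20                      # sample data (assumed values)
memory[6] = 22
memory[95] = ("LOAD", 3, 1)
memory[96] = ("LOAD", 6, 4)
memory[97] = ("ADD", 1, 4, 3)
memory[98] = ("STORE", 3, 10)
memory[99] = ("HALT",)

run(memory, start=95)
print(memory[10])                   # 42
```

A JUMP or JUMPZ simply overwrites `pc`, which is all that is needed to build loops and branches out of this cycle.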