Lecture Overview – September 28, 2015
• Housekeeping
  • Questions about first assignment
  • Questions about first lab
  • Second assignment available today (due next Monday)
• Review of last Wednesday’s lecture
  • Representing information
  • Converting binary to decimal and vice versa
  • Hexadecimal
  • Representing information – ASCII/Unicode for text; RGB for graphics
• Inside the Computer
  • Architecture
  • The CPU and memory
  • Toy machines to show how things work for simple programs
• Historical background

Review of Wednesday’s lecture
• Analog to Digital conversion
• Bits and bytes
  • A bit is on or off (nothing more)
  • Information can be stored as bits
  • 1 bit has 2 possibilities (0 or 1)
  • 2 bits have 4 possibilities (00 or 01 or 10 or 11)
  • …
  • n bits have 2^n possibilities
  • Bits are grouped in 8’s to make bytes
  • Powers of 2, powers of 10
    • 2^10 is approximately 10^3 (1024 vs. 1000)

Converting binary to decimal
• from right to left: if a bit is 1, add the corresponding power of 2, i.e. 2^0, 2^1, 2^2, 2^3 (the rightmost power is zero)

  1101 = 1 x 2^0 + 0 x 2^1 + 1 x 2^2 + 1 x 2^3
       = 1 x 1 + 0 x 2 + 1 x 4 + 1 x 8
       = 13

Converting decimal to binary
  repeat while the number is > 0:
    divide the number by 2
    write the remainder (0 or 1)
    use the quotient as the number and repeat
  the answer is the resulting sequence in reverse (right to left) order

  divide 13 by 2, write "1", number is 6
  divide 6 by 2, write "0", number is 3
  divide 3 by 2, write "1", number is 1
  divide 1 by 2, write "1", number is 0
  answer is 1101

Hexadecimal notation
• binary numbers are bulky
• hexadecimal notation is a shorthand
• it combines 4 bits into a single digit, written in base 16
• a more compact representation of the same information
• hex uses the symbols A B C D E F for the digits 10 .. 15

  0 1 2 3 4 5 6 7 8 9 A B C D E F

  0  0000    4  0100    8  1000    C  1100
  1  0001    5  0101    9  1001    D  1101
  2  0010    6  0110    A  1010    E  1110
  3  0011    7  0111    B  1011    F  1111

Representing letters as numbers
• what letters and other symbols are included?
• how many digits/letter?
  • determined by how many symbols there are
  • how do we disambiguate if symbols have different lengths?
• how do we decide whose encoding to use?
• the representation is arbitrary
  • but everyone has to agree on it
  • if they want to work together

ASCII: American Standard Code for Information Interchange
• an arbitrary but agreed-upon representation for USA
• widely used everywhere

Color
• TV & computer screens use Red-Green-Blue (RGB) model
• each color is a combination of red, green, blue components
  • R+G = yellow, R+B = magenta, B+G = cyan, R+G+B = white
• for computers, color of a pixel is usually specified by three numbers giving amount of each color, on a scale of 0 to 255
• this is often expressed in hexadecimal so the three components can be specified separately (in effect, as bit patterns)
  • 000000 is black, FFFFFF is white
• printers, etc., use cyan-magenta-yellow[-black] (CMY[K])

Things to remember about storage (RAM, disk, …)
• digital devices represent everything as numbers
  • discrete values, not continuous or infinitely precise
• all modern digital devices use binary numbers (base 2)
• it's all bits at the bottom
  • a bit is a "binary digit", that is, a number that is either 0 or 1
  • computers ultimately represent and process everything as bits
• groups of bits represent larger things
  • numbers, letters, words, names, pictures, sounds, instructions, ...
  • the interpretation of a group of bits depends on their context
  • the representation is arbitrary; standards (often) define what it is
• the number of digits used in the representation determines how many different things can be represented
  • number of values = base^(number of digits)
  • e.g., 10^2, 2^10

Pause for a quantitative question
• Background
  • Many governments (city, state, federal, national) have begun the process of making data public
  • The goal is to create open government
  • Some data releases have been in response to FOIL (Freedom of Information Law) requests
  • New York City has been a leader in this process
    • Subway data
    • Taxi data
    • Property purchases
    • City budgets
    • …
  • Data scientists have done much analysis on the data sets released

A simple example – Citi bikes
Transit between boroughs (July 2013 to May 2014)
• Slightly more people used the bikes to go from Manhattan to Brooklyn (199,491 trips) than from Brooklyn to Manhattan (183,634 trips).

Back to computer architecture
[Diagram labels: mouse, keyboard, display, CPU (processor), bus, memory (RAM), hard disk, CD/DVD, network/wireless (and many others)]

Inside the Computer – the CPU
• how does the CPU work?
• what operations can it perform?
  • Instruction set of the CPU
• how does it perform them? on what kind of data?
  • Arithmetic/Logical/Control Unit performs operations
  • Computation is done in the accumulator
• where are instructions and data stored?
  • Key concept – von Neumann computer (instructions and data are indistinguishable)

[Diagram labels: CPU (arithmetic, logic, control; accumulator); communication with the outside world]

Inside the CPU
• CPU can perform a small set of basic operations
  • arithmetic: add, subtract, multiply, divide, …
  • memory access: fetch data from memory, store results back in memory
  • read input from a peripheral
  • write output to a peripheral
  • decision making: compare numbers, letters, …, and decide what to do next according to result
  • control the rest of the machine
• operations performed near the CPU (in the accumulator)
  • add the contents of a memory location to the accumulator
  • store the contents of the accumulator in a memory location
  • load the contents of a memory location into the accumulator
  • read to accumulator from a peripheral
  • write from accumulator to a peripheral
• operates by performing sequences of very simple operations very fast

Inside the computer – Memory
• Random Access Memory (RAM)
  • a place to store information while the computer is running
    • the programs that are running
    • their data
    • the operating system (Windows, Mac OS X, Unix/Linux, ...)
  • volatile: forgets everything when power is turned off
  • limited (though large) capacity
  • logically, a set of numbered boxes ("pigeonholes"? mailboxes?)
    • each capable of storing one byte = 8 bits of information
    • a small number or a single character like A or part of a larger value
    • can be organized into words each of 4 bytes in length
  • random access
    • CPU can access any location as quickly as any other location
• Disk
  • a place to store information that will be needed later
  • non-volatile
  • larger capacity
  • historically not random access

Very simplistic view of how a computer operates
• RAM
  • Organized in words consisting of 4 bytes

  address   Byte1            Byte2              Byte3              Byte4
            Operation code   Operand1 address   Operand2 address   Result address
  00
  04
  …..
  0c
  10
  14

• CPU
  – When the computer starts it is given an address to start reading instructions from
  – At every clock cycle, CPU fetches instruction, looks up operands, executes and then stores result

Fetch/execute cycle
• At every clock cycle
  • Fetch the next instruction from memory
    • Instruction is OpCode || Address1 || Address2 || Address3
  • Then execute the instruction
    • Decode the OpCode
    • Get data from Address1
    • Get data from Address2
    • Perform the operation
    • Store the result in Address3
  • Go on to the next instruction unless directed to do otherwise

Starting configuration
• Imagine that operation code 1 means add

  address   Byte1            Byte2              Byte3              Byte4
            Operation code   Operand1 address   Operand2 address   Result address
  00        1                10                 14                 0c
  04
  …..
  0c
  10        12               34                 56                 78
  14        21               39                 46                 52

• When the operation is completed, what changes?
  • Location 0c holds the sum of 12345678 and 21394652
  • Or 336d9cca

Before

  address   Byte1   Byte2   Byte3   Byte4
  00        1       10      14      0c
  04
  …..
  0c
  10        12      34      56      78
  14        21      39      46      52

After

  address   Byte1   Byte2   Byte3   Byte4
  00        1       10      14      0c
  04
  …..
  0c        33      6d      9c      ca
  10        12      34      56      78
  14        21      39      46      52

  12345678 + 21394652 = 336d9cca (hexadecimal)

How does the work actually happen?
• Operation:
  • Add contents of word starting at location 10 to contents of word starting at location 14 and store the result in word starting at location 0c
• What really happens
  • Computer has an accumulator where the work happens
  • Operation is translated into
    • Fetch contents of word starting at location 10 and store it in the accumulator
    • Fetch the contents of word starting at location 14 and add it to the number in the accumulator
    • Store the contents of the accumulator in word starting at location 0c
• How does the computer differentiate between address 00 (an instruction) and addresses 0c, 10, 14 (data)?
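The one-instruction example above can be sketched in a few lines of Python. This is only an illustration under the slide's stated assumptions (operation code 1 means add, words are 4 bytes, Byte1 is the most significant byte). The names memory, read_word, write_word and step are hypothetical and not part of any course software.

```python
# A minimal sketch of the single "add" instruction from the slides.
# Assumptions: opcode 1 = add, 4-byte words, Byte1 is the high-order byte.
# All names here (memory, read_word, write_word, step) are hypothetical.

memory = bytearray(0x18)  # byte addresses 00 .. 17 (hex), all zero at start


def read_word(addr):
    """Return the 4-byte word starting at addr as an integer."""
    return int.from_bytes(memory[addr:addr + 4], "big")


def write_word(addr, value):
    """Store value as a 4-byte word starting at addr (keeps only 32 bits)."""
    memory[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")


# Starting configuration: instruction at 00, data words at 10 and 14.
memory[0x00:0x04] = bytes([0x01, 0x10, 0x14, 0x0C])
write_word(0x10, 0x12345678)
write_word(0x14, 0x21394652)


def step(pc):
    """Fetch and execute the instruction at pc; return the next pc."""
    opcode, addr1, addr2, result_addr = memory[pc:pc + 4]  # fetch and decode
    if opcode == 1:  # add
        accumulator = read_word(addr1)        # load first operand
        accumulator += read_word(addr2)       # add second operand
        write_word(result_addr, accumulator)  # store the sum
    else:
        raise ValueError(f"unknown operation code {opcode}")
    return pc + 4


next_pc = step(0x00)
result_hex = format(read_word(0x0C), "08x")
print(result_hex)  # 336d9cca
```

Working the sum digit by digit in hex gives the same word: the third pair is 34 + 39 = 6d, since 4 + 9 = 13, which is written d. Nothing in memory marks location 00 as an instruction and location 10 as data; the sketch treats a word as an instruction only because the program counter points at it.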
A slightly more realistic "toy" computer
• repertoire ("instruction set"): a handful of instructions
• RAM with each location holding instructions or data
• CPU has one "accumulator" for arithmetic and input & output
• execution: CPU operates by a simple cycle
• programming: writing instructions to put into RAM and execute

A very simple computer (instruction set)

  Operation                                           Operation code
  Stop                                                0
  Load contents of memory location to accumulator     1
  Store contents of accumulator in memory location    2
  Add contents of memory location to accumulator      3
  Print contents of accumulator                       4
  Get input from keyboard to accumulator              5

A very simple computer (architecture)
[Diagram labels: CPU (arithmetic, logic, control; accumulator), Memory (RAM), keyboard, display; arrows labeled GET, PRINT, LOAD, STORE, ADD]

A program to print a number

  GET      get a number from keyboard into accumulator
  PRINT    print the number that's in the accumulator
  STOP

• convert these instructions into numbers
• put them into RAM starting at first location
• tell CPU to start processing instructions at first location
• CPU fetches GET, decodes it, executes it
• CPU fetches PRINT, decodes it, executes it
• CPU fetches STOP, decodes it, executes it
• See the code work

A program to add any two numbers

  GET          get first number from keyboard into accumulator
  STORE NUM    save value in RAM location labeled "NUM"
  GET          get second number from keyboard into accumulator
  ADD NUM      add value from NUM (1st number) to accumulator
  PRINT        print the result (from accumulator)
  STOP
  NUM --       a place to save the first number

A program to add any two numbers
• Need an instruction set

  Instruction   Operation Code
  STOP          00
  LOAD          01
  STORE         02
  ADD           03
  PRINT         04
  GET           05

• Need to manage things
  • Accumulator is Location 00
  • NUM stored in Location 10
  • Keyboard is device 01
  • Printer is device 02

  GET          05 01 00   # get keyboard input to accumulator
  STORE NUM    02 00 10   # store accumulator in location 10
  GET          05 01 00   # get keyboard input to accumulator
  ADD NUM      03 10 00   # add location 10 to accumulator
  PRINT        04 02 00   # print accumulator to printer
  STOP         00         # stop

What gets loaded into RAM

  05 01 00   becomes   00000101000000010000000000000000
  02 00 10             00000010000000000001000000000000
  05 01 00             00000101000000010000000000000000
  03 10 00             00000011000100000000000000000000
  04 02 00             00000100000000100000000000000000
  00                   00000000000000000000000000000000

This is machine language, which is unintelligible to people and prone to errors.

About programming languages
• We will be writing our programs in a version of assembly language
  • Simple instruction format
    • Operator followed by operand(s)
    • Operator names seem sensible
  • Statements are precise and of limited power
• There is a program called an assembler which converts to machine language
  • Collections of words to be loaded into RAM
    • Operation code Operand1 Operand2 Operand3
  • All information expressed as bits/bytes
• each CPU architecture has its own instruction format and one (or more) assemblers

A flow chart for adding 2 numbers
  Read number → Store at X → Read number → Add X → Print result → Stop
question: how would you extend this to adding three numbers?

Adding 3 numbers
  Read number → Store at A → Read number → Store at B → Read number → Add A → Add B → Print result → Stop

A program to add any three numbers

  GET           get first number from keyboard into accumulator
  STORE NUM1    save value in RAM location labeled "NUM1"
  GET           get second number from keyboard into accumulator
  ADD NUM1      add value from NUM1 (1st number) to accumulator
  STORE NUM2    save value (sum of the first two numbers) in RAM location labeled "NUM2"
  GET           get third number from keyboard into accumulator
  ADD NUM2      add value from NUM2 (sum of the first two numbers) to accumulator
  PRINT         print the result (from accumulator)
  STOP
  NUM1 --       a place to save the first number
  NUM2 --       a place to save the sum of the first two numbers

questions: how would you extend this to adding 1000 numbers? or an unknown number of numbers?
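To see the toy programs above run, here is a rough Python sketch of an assembler and a fetch/decode/execute loop. The operation codes are the ones on the instruction-set slide; everything else is an assumption made for illustration: each RAM word is encoded as opcode x 256 + address rather than in the three-byte layout shown on the slides, a line holding only a name reserves one data word, and the names OPCODES, assemble and run are hypothetical.

```python
# Toy accumulator machine: a sketch, not the course's actual simulator.
# Opcodes follow the slides. The word encoding (opcode * 256 + address)
# and the function names are assumptions made for this example.

OPCODES = {"STOP": 0, "LOAD": 1, "STORE": 2, "ADD": 3, "PRINT": 4, "GET": 5}


def assemble(source):
    """Translate assembly text into a list of integer words for RAM."""
    lines = [ln.split() for ln in source.strip().splitlines() if ln.strip()]
    # Pass 1: a line that is not a mnemonic is a data label; note its location.
    labels = {parts[0]: loc for loc, parts in enumerate(lines)
              if parts[0] not in OPCODES}
    # Pass 2: emit one word per line.
    ram = []
    for parts in lines:
        if parts[0] in OPCODES:
            address = labels[parts[1]] if len(parts) > 1 else 0
            ram.append(OPCODES[parts[0]] * 256 + address)
        else:
            ram.append(0)  # reserved data word, initially zero
    return ram


def run(ram, inputs):
    """Execute from location 0 until STOP; return the printed values."""
    pending = list(inputs)   # stands in for the keyboard
    printed = []             # stands in for the display
    accumulator = 0
    pc = 0                   # location of the next instruction
    while True:
        opcode, address = divmod(ram[pc], 256)  # fetch and decode
        pc += 1
        if opcode == 0:      # STOP
            return printed
        elif opcode == 1:    # LOAD
            accumulator = ram[address]
        elif opcode == 2:    # STORE
            ram[address] = accumulator
        elif opcode == 3:    # ADD
            accumulator += ram[address]
        elif opcode == 4:    # PRINT
            printed.append(accumulator)
        elif opcode == 5:    # GET
            accumulator = pending.pop(0)
        else:
            raise ValueError(f"unknown operation code {opcode}")


ADD_TWO = """
GET
STORE NUM
GET
ADD NUM
PRINT
STOP
NUM
"""

ADD_THREE = """
GET
STORE NUM1
GET
ADD NUM1
STORE NUM2
GET
ADD NUM2
PRINT
STOP
NUM1
NUM2
"""

print(run(assemble(ADD_TWO), [3, 4]))       # [7]
print(run(assemble(ADD_THREE), [1, 2, 3]))  # [6]
```

Because instructions and data share one list of integers, the only thing that keeps NUM from being executed is that STOP comes before it. Extending the idea to 1000 numbers, or to an unknown count, would need an instruction that changes which location is fetched next; this sketch has none.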