Arithmetic and Logic Unit
CS 365 Lecture 5
Prof. Yih Huang

Inside a Processor
[Figure: processor block diagram with Branch, Registers, Control Logic, Floating Point Arithmetic Circuits, Integer Arithmetic Circuits, (Internal) Bus, Data Cache, and Instruction Cache]

Arithmetic and Logic Unit (ALU)
The part of a processor circuit that actually gets the computations done.
[Figure: ALU with 32-bit inputs a and b, an operation input, and a 32-bit result]

Numbers
Bits are just bits (no inherent meaning)
Binary numbers (base 2) ⇒ decimal: 0 ... 2^n - 1
ASCII codes
Of course it gets more complicated:
– numbers are finite (overflow)
– fractions and real numbers
– negative numbers
How do we represent negative numbers?

Possible Representations

Bits   Sign Magnitude   One's Complement   Two's Complement
000    +0               +0                 +0
001    +1               +1                 +1
010    +2               +2                 +2
011    +3               +3                 +3
100    -0               -3                 -4
101    -1               -2                 -3
110    -2               -1                 -2
111    -3               -0                 -1

Most of the modern architectures use two's complement.

Two's Complement Numbers

Bit:     X3     X2    X1    X0
Weight:  -2^3   2^2   2^1   2^0

0010 =
1010 =
-10 in 8-bit two's complement =

32-bit Signed Numbers

Two's Complement                            Decimal
0000 0000 0000 0000 0000 0000 0000 0000  =  0
0000 0000 0000 0000 0000 0000 0000 0001  =  +1
0000 0000 0000 0000 0000 0000 0000 0010  =  +2
...
0111 1111 1111 1111 1111 1111 1111 1110  =  +2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111  =  +2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0000  =  –2,147,483,648
1000 0000 0000 0000 0000 0000 0000 0001  =  –2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0010  =  –2,147,483,646
...
1111 1111 1111 1111 1111 1111 1111 1101  =  –3
1111 1111 1111 1111 1111 1111 1111 1110  =  –2
1111 1111 1111 1111 1111 1111 1111 1111  =  –1

Two's Complement Operations
Negating a two's complement number: invert all bits and add 1
– remember: “negate” and “invert” are different!
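The invert-and-add-1 rule can be tried out in a few lines of Python. This is only a sketch: the helper names negate and to_signed and the 4-bit width are assumptions made for illustration, not part of the lecture.

```python
def negate(x: int, bits: int) -> int:
    """Negate an n-bit two's complement pattern: invert all bits, then add 1."""
    mask = (1 << bits) - 1          # keeps only the low `bits` bits
    inverted = ~x & mask            # "invert": flip every bit
    return (inverted + 1) & mask    # "negate": invert, then add 1


def to_signed(x: int, bits: int) -> int:
    """Interpret an n-bit pattern as a signed two's complement value."""
    sign_bit = 1 << (bits - 1)
    return x - (1 << bits) if x & sign_bit else x


pattern = 0b0011                    # +3 in 4 bits
neg = negate(pattern, 4)
print(format(neg, "04b"), to_signed(neg, 4))   # prints: 1101 -3
```

Negating twice returns the original pattern, which is a quick way to check an answer worked by hand.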
Exercises (in 6 bits)
– Negate 12
– Negate -5

Sign Extensions
MIPS 16-bit immediate gets converted to 32 bits for arithmetic
Copy the most significant bit (the sign bit) into the other bits

4-bit number   ⇒   8-bit equivalent
0010           ⇒   0000 0010
1010           ⇒   1111 1010

Additions & Subtractions
Just like regular binary numbers

0010 + 0110        1111 + 0001        1111 + 1111
0010 - 0110        1111 - 0001        1111 - 1111

Overflows
Result too large to store in finite-size computer words
– e.g., adding two n-bit numbers does not always yield an n-bit number
Depends on the kind of numbers you have in mind: signed or unsigned

0010 + 0110        1000 - 0001

Detecting Overflow
No overflow when adding a positive and a negative number
No overflow when signs are the same for subtraction
Overflow occurs when the value affects the sign:

Operation   A     B     Result indicating overflow
A + B       ≥0    ≥0    <0
A + B       <0    <0    ≥0
A − B       ≥0    <0    <0
A − B       <0    ≥0    ≥0

Effects of Overflow
Architecture and case dependent
Solution 1: just remember it and leave the handling to software.
– The condition/flag register of IA32
Solution 2: exception/interrupt
– Control jumps to a predefined address for the exception
– Interrupted address is saved for possible resumption
– Used by MIPS

Discussion
IA32 provides an adc (add with carry) instruction. What is its use?

Review: Boolean Algebra & Gates
Problem: Consider a logic function with three inputs: A, B, and C.
Output D is true if at least one input is true
Output E is true if exactly two inputs are true
Output F is true only if all three inputs are true
Show the truth table for these three functions.
Show the Boolean equations for the three functions.
Show an implementation consisting of inverters, AND, and OR gates

Design An Overflow Detector
Inputs: SA (sign of A), SB (sign of B), OP (operation, 0 for add, 1 for sub).
Output: OF = 0 no overflow, 1 overflow
Truth table:
Boolean equation for OF.
A circuit design of OF according to the equation above.
OP   SA   SB   OF
0    0    0
0    0    1
0    1    0
0    1    1
1    0    0
1    0    1
1    1    0
1    1    1

Review: The Multiplexer
[Figure: multiplexor with data inputs A and B, a Select input, and one Output]
Selects one of the inputs to be the output, based on a control input
Note: we call this a 2-input multiplexer even though it actually has three inputs

More Inputs
[Figure: multiplexor with data inputs A, B, C, and D, a 2-bit Select input, and one Output]
The general case: an N-input multiplexer needs log2(N) select lines.
You should be able to design its logic circuit.

Second Exercise
Let us build a one-bit ALU to support addition and logic or.
– Operation: 0 for add, 1 for or
[Figure: ALU with inputs a and b, an operation input, and a result output]

Solution
Truth table
Sum of products

Supporting MIPS Logic Instructions
MIPS provides bit-wise and, or, xor, and nor instructions.
The operation input (3 bits) determines the output.
[Figure: ALU with inputs a and b, a 3-bit operation input, and a result output]

32-bit ALU
Both inputs A and B are 32 bits wide.
– Size of the truth table?
Rather, we will just cascade 32 1-bit ALUs.
– How about carries?
– We need to refine the spec of the 1-bit ALU

Two Solutions
Truth table and sum of products
Use a multiplexer

1-bit Adder
[Figure: 1-bit adder with inputs A, B, and Cin]
Cout = AB + ACin + BCin
Sum = A xor B xor Cin
How could we build a 1-bit ALU for add, and, or?
How could we build a 32-bit ALU?

Building a 32-bit ALU
[Figure: a 1-bit ALU with inputs a, b, CarryIn, and Operation producing Result and CarryOut; 32 copies (ALU0 ... ALU31) are chained so that each CarryOut feeds the next CarryIn, taking a0/b0 ... a31/b31 and producing Result0 ... Result31]

What about subtraction (a – b)?
Two's complement approach: negate b and add.
How do we negate?
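The "negate b and add" approach can be sketched in Python using the identity a − b = a + (inverted b) + 1. The function name add_sub, the binvert flag, and the 4-bit default width are assumptions for illustration; the sketch further assumes that the carry-in of the least significant bit is set to 1 whenever b is inverted.

```python
def add_sub(a: int, b: int, binvert: int, bits: int = 4) -> int:
    """Compute a + b (binvert=0) or a - b (binvert=1) on n-bit patterns.

    Subtraction reuses the adder: b is inverted and the carry-in of the
    least significant bit is set to 1, which together negate b.
    """
    mask = (1 << bits) - 1
    if binvert:
        b = ~b & mask               # invert every bit of b
    carry_in = binvert              # supplies the "+1" of the negation
    return (a + b + carry_in) & mask


print(format(add_sub(0b0111, 0b0010, 1), "04b"))   # 7 - 2, prints: 0101
print(format(add_sub(0b0011, 0b0101, 1), "04b"))   # 3 - 5, prints: 1110
```

The second result, 1110, is −2 in 4-bit two's complement, so no separate subtractor is needed in this model.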
A clever solution:
[Figure: 1-bit ALU with a Binvert control line that selects b or its inverse through a multiplexer, plus Operation, CarryIn, Result, and CarryOut]

Tailoring the ALU to the MIPS
Need to support the set-on-less-than instruction (slt)
– remember: slt is an arithmetic instruction
– produces a 1 if rs < rt and 0 otherwise
– use subtraction: (a-b) < 0 implies a < b
Need to support test for equality (beq $t5, $t6, offset)
– use subtraction: (a-b) = 0 implies a = b

Supporting slt
[Figure: (a) 1-bit ALU with an added Less input as Result multiplexer input 3, alongside Binvert, Operation, CarryIn, and CarryOut; (b) 1-bit ALU for the most significant bit, which adds a Set output and overflow detection with an Overflow output]

[Figure: 32-bit ALU built from ALU0 ... ALU31 with shared Binvert, CarryIn, and Operation lines; the Less inputs of ALU1 ... ALU31 are 0, the Set output of ALU31 feeds the Less input of ALU0, and ALU31 also produces Overflow; outputs Result0 ... Result31]

Test for equality
[Figure: the same 32-bit ALU with a Bnegate control line and a Zero output computed from Result0 ... Result31]
Notice control lines:

000 = and
001 = or
010 = add
110 = subtract
111 = slt

Output Zero = 1 when the result is 0.

Important points about hardware
– all of the gates are always working
– the speed of a gate is affected by the number of inputs to the gate
– the speed of a circuit is affected by the number of gates in series (on the “critical path” or the “deepest level of logic”)
– What is the critical path in our 32-bit ALU?

Ripple Carry Adder Is Slow
Logic circuit speed is determined by the number of gates a signal has to pass through in the worst case.
Assuming each 1-bit ALU adds x gate delays, what is the delay of a 32-bit ALU?
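To make the carry chain concrete, the ripple-carry adder can be modelled in Python from the 1-bit adder equations given earlier in the lecture (Cout = AB + ACin + BCin, Sum = A xor B xor Cin). The names full_adder and ripple_add are assumptions for illustration, and the loop is a software model of the wiring, not a timing simulation.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """1-bit adder from the lecture's equations; returns (sum, cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add(a: int, b: int, bits: int = 32, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit patterns one bit at a time; returns (result, carry_out)."""
    result = 0
    carry = cin
    for i in range(bits):           # bit i must wait for the carry out of bit i-1
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_add(0xFFFFFFFF, 1))    # prints: (0, 1)
```

In the last call the carry generated at bit 0 has to travel through every one of the 32 positions before the final carry out is known, which is the worst case the slide asks about.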
Carry Look Ahead
[Figure: four 1-bit adder cells with inputs A0 B0 ... A3 B3 and Cin, each producing S, G, and P; the carries C1 ... C4 are computed from the G and P signals]

A   B   Cout
0   0   0     “kill”
0   1   Cin   “propagate”
1   0   Cin   “propagate”
1   1   1     “generate”

G = A and B
P = A xor B

C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .

16-Bit Adder
[Figure: four 4-bit ALUs (ALU0 ... ALU3), each with a CarryIn, taking a0-a3/b0-b3 through a12-a15/b12-b15 and producing Result0--3 through Result12--15; each sends its P and G signals (P0 G0 ... P3 G3, labelled pi gi ... pi+3 gi+3) to a carry-lookahead unit, which supplies C1 ... C4 (ci+1 ... ci+4) and CarryOut]

Exercise
[Figure: four 1-bit adder cells with inputs A0 ... A3 and Cin, each producing S, G, and P]

A   B   Cout
0   0   0     “kill”
0   1   Cin   “propagate”
1   0   Cin   “propagate”
1   1   1     “generate”

G = A and B
P = A xor B

C1 =
C2 =
C3 =
C4 = . . .
G =
P =

Multiplications
N bits × N bits → 2N bits result
Paper and pencil example (unsigned):
Multiplicand: 0011
Multiplier: 0101

Unsigned Combinational Multiplier
[Figure: four adder stages combining A3 A2 A1 A0 with B0, B1, B2, and B3 in turn, with 0 fed into the first stage, producing product bits P7 ... P0]
Stage i accumulates A * 2^i if Bi == 1

Discussions
Multiplication is expensive
A combinational multiplier uses a great deal of silicon
– 32 32-bit adders needed
We will discuss designs that are slower but less silicon demanding.
– Due to its complexity, we will first present a basic but suboptimal design, and refine it twice.
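The rule "stage i accumulates A * 2^i if Bi == 1" from the combinational multiplier can be checked with a short Python model. The name array_multiply and the 4-bit default are assumptions for illustration. The loop visits the stages one after another, whereas the hardware described above lays all of them out side by side.

```python
def array_multiply(a: int, b: int, bits: int = 4) -> int:
    """Unsigned n-bit x n-bit multiply as a sum of shifted partial products."""
    product = 0
    for i in range(bits):
        if (b >> i) & 1:            # stage i contributes only when B_i == 1
            product += a << i       # accumulate A * 2**i
    return product                  # always fits in 2 * bits bits


# The paper-and-pencil example: multiplicand 0011, multiplier 0101.
print(format(array_multiply(0b0011, 0b0101), "08b"))   # prints: 00001111
```

Each loop iteration corresponds to one adder stage, which is why the silicon cost grows with the operand width.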
Shift and Add
[Figure: the adder stage is reused over time, one step per clock tick, with B0, B1, B2, and B3 examined in turn and product bits P7 ... P0 accumulated]
One step per clock tick; n clock cycles needed for n-bit multiplications

Example: 0101 × 0011
[Table to fill in: columns Multiplicand, Product, Multiplier; at each step, to add or not to add]

Unsigned Shift-Add Multiplier
64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
[Figure: Multiplicand register (64 bits, Shift Left), Multiplier register (32 bits, Shift Right), 64-bit ALU, Product register (64 bits, Write), and Control]
Multiplier = datapath + control

Observations
Half of the bits in the multiplicand are always 0
– a 64-bit adder is a waste
Improvement:
– Use a 32-bit multiplicand
– Don't shift the multiplicand left; shift the product right instead

Example: 0101 × 0011
[Table to fill in: columns Multiplicand, Product, Multiplier; at each step, to add or not to add]

Shift-Add Multiplier Version 2
32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
[Figure: Multiplicand register (32 bits), Multiplier register (32 bits, Shift Right), 32-bit ALU, Product register (64 bits, Shift Right, Write), and Control]

A Second Example

Product      Multiplier   Multiplicand
0000 0000    0011         0010
0010 0000
0001 0000    0001         0010
0011 0000    0001         0010
0001 1000    0000         0010
0000 1100    0000         0010
0000 0110    0000         0010

Observations
Product register wastes space that exactly matches the size of the multiplier
Improvement: combine the Multiplier register and the Product register

Example: 0101 × 0011
[Table to fill in: columns Multiplicand, Product, Multiplier; at each step, to add or not to add]

A Second Example
Multiplicand:              00101
Initial Product:           00000 01011
Product after 1st shift:
Product after 2nd shift:
Product after 3rd shift:
Product after 4th shift:
Product after 5th shift:

Multiplier Hardware Version 3
32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, (0-bit Multiplier reg)
[Figure: Multiplicand register (32 bits), 32-bit ALU, combined Product (Multiplier) register (64 bits, Shift Right, Write), and Control]

Discussions
Can you see where the MIPS Hi and Lo registers come from?
Can you see the special hardware associated with IA32 EAX and EDX?
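The Version 3 design, where the multiplier starts out in the low half of the product register, can be sketched in Python as follows. The function name shift_add_multiply and the returned pair (hi, lo) are assumptions for illustration. Python integers are unbounded, so the carry that can come out of the upper-half addition is kept automatically; real hardware would need to preserve that bit before shifting.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> tuple[int, int]:
    """Unsigned shift-add multiply with a combined product/multiplier register.

    Returns the upper and lower halves of the 2n-bit product as (hi, lo).
    """
    mask = (1 << bits) - 1
    product = multiplier & mask               # low half starts as the multiplier
    for _ in range(bits):                     # one step per clock tick
        if product & 1:                       # to add or not to add
            product += (multiplicand & mask) << bits   # add into the upper half
        product >>= 1                         # shift the whole register right
    return product >> bits, product & mask


# Multiplicand 0010 and multiplier 0011 in 4 bits give 0000 0110.
hi, lo = shift_add_multiply(0b0010, 0b0011, bits=4)
print(format(hi, "04b"), format(lo, "04b"))   # prints: 0000 0110
```

Returning the product as two n-bit halves mirrors how a 64-bit result has to be split across two 32-bit registers, which is the point the closing discussion questions hint at.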