240-208 Fundamental of Computer Architecture
November 01, 2003
By Panyayot Chaikan
panyayot@coe.psu.ac.th

Chapter 3: The Processing Unit (the processor and its operation)

Contents
  Definitions and terminology for microprocessors and microcomputers
  History of the microprocessor
  Advantages and disadvantages of microprocessors
  Considerations in selecting a microprocessor

Computer bus
  A group of wires that connects several devices.
  Three types of bus:
    Address bus
    Data bus
    Control bus

Address bus
  Specifies the memory location that the CPU wants to access (read or write).
  An n-bit address bus provides $2^n$ addresses.
  Examples:
    MCS-51    16-bit address bus    $2^{16}$ = 64 Kbyte of memory
    8086      20-bit address bus    $2^{20}$ = 1 Mbyte of memory
    Pentium   32-bit address bus    $2^{32}$ = 4 Gbyte of memory

Data bus
  Carries data between the CPU and peripherals (memory, I/O).
  A wider data bus moves more bits per transfer, so data transfer is faster.
  Examples:
    MCS-51    8-bit data bus
    8086      16-bit data bus
    Pentium   64-bit data bus

Bus connections
  [Figure: the microprocessor, ROM, RAM and I/O are all attached to the shared data bus, address bus and control bus.]

CPU: basic operations
  Fetch: read the instruction and data from memory.
  Execute: perform the desired operation and write the result to memory or registers.
  Instruction cycle: Fetch, Execute, Fetch, Execute, Fetch, Execute, ...

Fetch phase example
  [Figure: the CPU (registers R0, R1, R2, MAR, IR and the control unit) connected to memory. The data A = 20, B = 12, C = 5 are at addresses 10000 to 10002 and D is at 10003; the program occupies addresses 15001 to 15006. MAR holds address 15001, the control unit issues a read, and the instruction LOAD R0,[10000] comes back on the data bus into IR.]
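The fetch/execute cycle can be sketched as a small simulator. This is only an illustrative sketch, not code from the course: the tuple encoding of instructions, the function name `run`, and the initial value 0 for D are assumptions. The data values and the six-instruction program are the ones shown in the figure.

```python
# Data memory and program from the figure; the tuple encoding is hypothetical.
MEMORY = {10000: 20, 10001: 12, 10002: 5, 10003: 0}  # A, B, C, D (D assumed 0)
PROGRAM = {
    15001: ("LOAD", "R0", 10000),
    15002: ("LOAD", "R1", 10001),
    15003: ("ADD", "R2", "R0", "R1"),
    15004: ("LOAD", "R0", 10002),
    15005: ("ADD", "R1", "R0", "R2"),
    15006: ("STORE", 10003, "R1"),
}


def run(program, memory, start):
    """Repeat fetch and execute until the PC leaves the program."""
    regs = {"R0": 0, "R1": 0, "R2": 0}
    memory = dict(memory)  # work on a copy of the data memory
    pc = start
    while pc in program:
        ir = program[pc]  # fetch: the instruction at address PC goes to IR
        pc += 1           # PC now points at the next instruction
        op = ir[0]        # execute: decode the opcode and act on it
        if op == "LOAD":
            regs[ir[1]] = memory[ir[2]]
        elif op == "ADD":
            regs[ir[1]] = regs[ir[2]] + regs[ir[3]]
        elif op == "STORE":
            memory[ir[1]] = regs[ir[2]]
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs, memory


regs, memory = run(PROGRAM, MEMORY, start=15001)
print(regs, memory[10003])
```

After the run, location 10003 (D) holds A + B + C = 37.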
Example program and data
  Memory (data)
    10000    20    A
    10001    12    B
    10002    5     C
    10003          D
  Memory (program)
    15001    LOAD  R0,[10000]
    15002    LOAD  R1,[10001]
    15003    ADD   R2,R0,R1
    15004    LOAD  R0,[10002]
    15005    ADD   R1,R0,R2
    15006    STORE [10003],R1
  CPU registers: R0, R1, R2

Terminology
  IR: Instruction Register
  MAR: Memory Address Register

Instructions of CPU
  There are 4 types of instructions:
  1. Data transfer between memory and CPU registers
  2. Arithmetic and logic operations on data
  3. Program sequencing and control
  4. I/O transfer

Basic instruction types: three-address instruction
  D = A + B + C
    LOAD  R0,[10000]
    LOAD  R1,[10001]
    ADD   R2,R0,R1
    LOAD  R0,[10002]
    ADD   R1,R0,R2
    STORE [10003],R1
  Memory after execution: 10000 = 20 (A), 10001 = 12 (B), 10002 = 5 (C), 10003 = 37 (D)
  Registers after execution: R0 = 5, R1 = 37, R2 = 32
  Note: ADD R2,R0,R1 means R2 = R0 + R1

Basic instruction types: two-address instruction
  D = A + B + C
    LOAD  R0,[10000]
    LOAD  R1,[10001]
    ADD   R0,R1
    LOAD  R2,[10002]
    ADD   R0,R2
    STORE [10003],R0
  Memory after execution: 10000 = 20 (A), 10001 = 12 (B), 10002 = 5 (C), 10003 = 37 (D)
  Registers after execution: R0 = 37, R1 = 12, R2 = 5
  Note: ADD R0,R1 means R0 = R0 + R1

Basic instruction types: one-address instruction
  D = A + B + C
    LOAD  [10000]
    ADD   [10001]
    ADD   [10002]
    STORE [10003]
  Memory after execution: 10000 = 20 (A), 10001 = 12 (B), 10002 = 5 (C), 10003 = 37 (D)
  Accumulator after execution: ACC = 37
  Note: ADD [10001] means ACC = ACC + [10001]

CPU registers
  General purpose registers: R0, R1, ..., Rn or A, B, C, ...
  Special purpose registers: PC, SP, Accumulator, Flag (condition code) register

PC: Program Counter register
  Holds the address of the next memory location that the CPU will fetch from.
  The PC and the address bus have the same size.

PC: Program Counter (continued)
    Address    Contents
    0000H      Instruction 1
    0001H      Instruction 2
    0002H      Instruction 3
    0003H      Instruction 3 (continued)
    0004H      Instruction 4
    0005H      Instruction 5
    0006H      Instruction 5 (continued)
    0007H      Instruction 5 (continued)
    0008H      Instruction 6
  [Figure, six animation frames: the PC arrow moves down this memory map as the instructions are fetched in turn. Instructions that occupy more than one address, such as Instruction 3 and Instruction 5, move the PC by more than one location.]
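How the PC advances over instructions of different lengths can be sketched in a few lines. This is an illustrative sketch that assumes straight-line execution with no branches. The instruction lengths are read off the memory map, except for Instruction 6, whose length is not shown and is assumed to be 1. The name `pc_trace` is hypothetical.

```python
# Bytes per instruction, read off the memory map.
# The length of Instruction 6 is not shown there; 1 is an assumption.
LENGTHS = [1, 1, 2, 1, 3, 1]


def pc_trace(lengths, start=0x0000):
    """Return the PC value at the start of each instruction fetch."""
    trace = []
    pc = start
    for length in lengths:
        trace.append(pc)
        pc += length  # skip over every byte of the current instruction
    return trace


for number, pc in enumerate(pc_trace(LENGTHS), start=1):
    print(f"Instruction {number}: PC = {pc:04X}H")
```

The trace visits 0000H, 0001H, 0002H, 0004H, 0005H and 0008H, skipping the continuation bytes of Instructions 3 and 5.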
Branching
  Task: [25000] = [10000] + [10001] + [10002] + ... + [24999]
  Memory
    10000    20        input data
    10001    12
    10002    5
    10003    27
    ...
    24999
    25000    result
  Program
    LOC35000:  LOAD  R0,#0          ; running sum
               LOAD  R1,#15000      ; number of values to add (10000 to 24999)
               LOAD  R3,#10000      ; address of the first value
    LOC35003:  LOAD  R2,[R3]
               ADD   R0,R2
               INC   R3
               DEC   R1
               Branch_NZ LOC35003   ; repeat until the counter reaches zero
               STORE [R3],R0        ; R3 = 25000 here, so the sum goes to [25000]

Flag or condition code register
  Holds the status bits produced by arithmetic and logic operations.
  Example: flags of the Z80 CPU
    Bit     7    6    5    4    3    2      1    0
    Flag    S    Z    X    H    X    P/V    N    C
    S      Sign flag
    Z      Zero flag
    X      Not used
    H      Half-carry flag
    P/V    Parity/Overflow flag
    N      Add/subtract flag (set when the last operation was a subtraction)
    C      Carry flag

Addressing modes of CPU
    Mode                 Operand form    Example
    Immediate            #value          load R0,#00001
    Register             Ri              load R0,R1
    Direct (absolute)    [mem_loc]       load R0,[100000]
    Register indirect    [Ri]            load R0,[R1]
    Relative             X[PC]
    Index

Immediate addressing
  load R1,#00001
    Before: R1 = 2500
    After:  R1 = 00001
  The operand is part of the instruction, so memory is not read and its contents (for example 3900H at 00001H) are unchanged.

Direct addressing
  LOAD R1,[1200H]
    Before: R1 = 57H, [1200H] = 39H
    After:  R1 = 39H, [1200H] = 39H
  [Figure: memory locations 11FDH to 1203H before and after the load; only R1 changes.]

Register indirect
  load R0,[R1]
    Before: R1 = 00006, R0 = 2500
    After:  R1 = 00006, R0 = 2400
  Memory (unchanged)
    00006H    2400
    00005H    1457
    00004H    638
    00003H
    00002H    6713
    00001H    3900H
    00000H

Index addressing
  Uses an index register.
  Effective address = X + [Ri]
  where X = offset (or displacement)
        Ri = index register or base register
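The addressing modes can be compared side by side in a short sketch. This is an illustrative model only, not the instruction set of any real CPU: the function `operand`, its parameter names, register R5 and the array element at 0x2004 are hypothetical. The other values are taken from the immediate, direct and register indirect examples. Relative mode is left out because it needs a PC value.

```python
def operand(mode, regs, memory, value=None, reg=None, offset=0):
    """Return the operand selected by one addressing mode."""
    if mode == "immediate":          # load R0,#value
        return value
    if mode == "register":           # load R0,Ri
        return regs[reg]
    if mode == "direct":             # load R0,[mem_loc]
        return memory[value]
    if mode == "register_indirect":  # load R0,[Ri]
        return memory[regs[reg]]
    if mode == "index":              # effective address = X + [Ri]
        return memory[offset + regs[reg]]
    raise ValueError(f"unknown addressing mode: {mode}")


# R1 and the first two memory cells come from the slides' examples.
# R5 and the cell at 0x2004 are hypothetical (an array based at 0x2000).
regs = {"R1": 0x0006, "R5": 0x2000}
memory = {0x0006: 2400, 0x1200: 0x39, 0x2004: 77}

print(operand("immediate", regs, memory, value=1))
print(operand("register", regs, memory, reg="R1"))
print(operand("direct", regs, memory, value=0x1200))
print(operand("register_indirect", regs, memory, reg="R1"))
print(operand("index", regs, memory, reg="R5", offset=4))
```

Only the last three modes read data memory; immediate and register operands are available without a memory access.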
Index addressing: offset given as a constant
  ADD R2,04(R1)
  [Figure: R1 holds the address Array. The constant displacement 04 is added to it, so the operand is the element at Array+4 in the block Array, Array+1, ..., Array+4.]

Index addressing: offset held in the index register
  ADD R2,Array(R1)
  [Figure: R1 holds the displacement 04. It is added to the constant base address Array, so the operand is again the element at Array+4.]

Example 1: Transferring bytes of data
  Copy 1024 bytes starting at memory location 1000H to the block starting at 2000H.
    Source            Destination
    1000H    07H      2000H
    1001H    05H      2001H
    1002H    00H      2002H
    ...
    13FEH    61H      23FEH
    13FFH    72H      23FFH
    1400H    74H      2400H
  A count of 1024 covers 1000H to 13FFH; copying the byte at 1400H as well would need a count of 1025.

    STRT:   LD     R0,#1000H     ; source pointer
            LD     R1,#2000H     ; destination pointer
            LD     R3,#1024      ; byte counter
    LOC_A:  LD     R4,[R0]
            STORE  [R1],R4
            INC    R0
            INC    R1
            DEC    R3
            BRANCH>0 LOC_A
            CALL   PRINTF

Example 2: Unsigned multiplication by repeated addition
  Multiply two 8-bit unsigned numbers: C = A*B
  [Figure: A is added B times to form the result.]

    STRT:   LOAD   R1,#0
            LOAD   R3,[mem_loc_A]
            LOAD   R2,[mem_loc_B]
    LOOP:   ADD    R1,R3
            DEC    R2
            BRANCH>0 LOOP
            STORE  [mem_loc_C],R1
            CALL   PRINTF

  Problem with this program: the loop body always runs at least once, so if B = 0 the result is A, not 0.
  Remedy: test B before entering the loop.

    STRT:   LOAD     R1,#0
            LOAD     R3,[mem_loc_A]
            LOAD     R2,[mem_loc_B]
            Compare  R2,#0
            Branch_Z STR              ; B = 0: skip the loop, the result stays 0
    LOOP:   ADD      R1,R3
            DEC      R2
            Branch>0 LOOP
    STR:    STORE    [mem_loc_C],R1

Example 3: if-then-else
    if (mem_loc_a == 5)
        mem_loc_b++;
    else
        mem_loc_b = mem_loc_a + mem_loc_b;

            Load      R1,[mem_loc_a]
            Load      R2,[mem_loc_b]
            Compare   R1,#5
            Branch_NZ b_p_a           ; a != 5: take the else part
            inc       R2
            branch    stre
    b_p_a:  Add       R2,R1
    stre:   Store     [mem_loc_a],R1
            Store     [mem_loc_b],R2

Example 4: checking greater-than
    if (mem_loc_a > 5)
        mem_loc_b++;
    else
        mem_loc_b = mem_loc_a + mem_loc_b;

  Using compare:
              Load     R1,[mem_loc_a]
              Load     R2,[mem_loc_b]
              compare  R1,#5
              branch_z equ_g_5        ; a equals 5
              branch_M equ_g_5        ; M = minus, a is less than 5
              inc      R2
              branch   stre
    equ_g_5:  Add      R2,R1
    stre:     Store    [mem_loc_a],R1
              Store    [mem_loc_b],R2

  Using subtract (the subtraction is done on a copy in R3, because sub R1,#5 would overwrite a with a - 5 before it is added and stored):
              Load     R1,[mem_loc_a]
              Load     R2,[mem_loc_b]
              load     R3,R1          ; copy a
              sub      R3,#5          ; R3 = a - 5
              branch>0 gt_5           ; a > 5
              Add      R2,R1
              branch   stre
    gt_5:     inc      R2
    stre:     Store    [mem_loc_a],R1
              Store    [mem_loc_b],R2

Basic processing unit
  The structure of a simple CPU
  How the internal parts of the CPU work
  How to design a simple processor
    Datapath
    Control unit

Inside a simple CPU with a single-bus datapath
  [Figure: PC, IR, the registers R0, R1, ..., R(n-1), Temp, MAR, MDR, Y and Z all attached to one internal bus. A MUX selects either the constant 4 or register Y as one ALU input; the ALU result goes to Z.]

Perform instruction ADD R1,R2
  Fetch phase
    MAR          <= PC
    ADDRESS_BUS  <= MAR, read
    MDR          <= MEMORY[MAR]
    IR           <= MDR
    Z            <= PC + 4
    PC           <= Z
  Execution phase
    Y            <= R1
    Z            <= Y + R2
    R2           <= Z
  Note: in this datapath example the sum is written to the second operand, R2.
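The register transfers for ADD R1,R2 can be traced with a small sketch that applies them one after another to a dictionary of register values. This is an illustrative model under simplifying assumptions: memory responds immediately, the instruction word is represented by the hypothetical string "ADD R1,R2", and the function name and starting values are made up for the example.

```python
def execute_add_r1_r2(state, memory):
    """Apply the fetch and execution transfers for ADD R1,R2 in order."""
    s = dict(state)  # leave the caller's state untouched

    # Fetch phase
    s["MAR"] = s["PC"]           # MAR <= PC
    s["MDR"] = memory[s["MAR"]]  # ADDRESS_BUS <= MAR, read; MDR <= MEMORY[MAR]
    s["IR"] = s["MDR"]           # IR <= MDR
    s["Z"] = s["PC"] + 4         # Z <= PC + 4
    s["PC"] = s["Z"]             # PC <= Z

    # Execution phase
    s["Y"] = s["R1"]             # Y <= R1
    s["Z"] = s["Y"] + s["R2"]    # Z <= Y + R2
    s["R2"] = s["Z"]             # R2 <= Z
    return s


# Hypothetical starting values and instruction address.
state = {"PC": 0x100, "R1": 7, "R2": 5}
memory = {0x100: "ADD R1,R2"}

result = execute_add_r1_r2(state, memory)
print(result["PC"], result["IR"], result["R2"])
```

Every transfer passes through the single bus, which is why the operand R1 must first be parked in Y before the ALU can add R2 to it.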
Perform instruction ADD R1,R2: active control signals
    Operation                      Active signals
    MAR          <= PC             PCout, MARin
    ADDRESS_BUS  <= MAR, read      read
    MDR          <= MEMORY[MAR]    MDRinE, WMFC
    IR           <= MDR            MDRout, IRin
    Z            <= PC + 4         PCout, MUX_sel4, Add, Zin
    PC           <= Z              Zout, PCin
    Y            <= R1             Yin, R1out
    Z            <= Y + R2         R2out, MUX_selY, Add, Zin
    R2           <= Z              Zout, R2in

How to modify the FETCH operation to be faster
  Original fetch (six steps)
    MAR          <= PC
    ADDRESS_BUS  <= MAR, Read
    MDR          <= MEMORY[MAR], WMFC
    IR           <= MDR
    Z            <= PC + 4
    PC           <= Z
  Modified fetch (three steps)
    1.  MAR <= PC, Read, Z <= PC+4, MDR <= MEMORY[MAR]
    2.  PC  <= Z, WMFC
    3.  IR  <= MDR
  Incrementing the PC does not depend on the memory read, so it is done while the CPU waits for memory.

Modified FETCH operation: active signals
    Operation                                               Active signals
    1.  MAR <= PC, Read, Z <= PC+4, MDR <= MEMORY[MAR]      PCout, MARin, Read, Mux_sel4, Add, Zin
    2.  PC  <= Z, WMFC                                      Zout, PCin, Yin, WMFC
    3.  IR  <= MDR                                          MDRout, IRin

Perform instruction load R1,[mem_locA]
  Fetch phase
    MAR <= PC, Read, Z <= PC+4, MDR <= MEMORY[MAR]
    PC  <= Z, WMFC
    IR  <= MDR
  Execution phase
    MAR          <= 00000000 & IR24..0
    ADDRESS_BUS  <= MAR
    MDR          <= MEMORY[MAR]
    R1           <= MDR

Three-bus organization
  [Figure: the register file (R0, R1, ..., R(n-1)) and the PC with its incrementer drive bus A and bus B, which feed ALU inputs A and B; the ALU output R returns to the registers on bus C. A MUX can select the constant 4. IR, MDR and MAR connect the datapath to the data bus and address bus, and the control unit and instruction decoder generate the control signals.]
Perform instruction ADD R6,R5,R4 (three-bus organization)
    Step    Action
    1       PCout, R=B, MARin, Read, incPC
    2       WMFC, MDRin_from_databus
    3       MDRout_busB, R=B, IRin
    4       R4out_busB, R5out_busA, Add, R6in, End

Control units
  2 types of control units:
    Hardwired
    Microprogrammed

Control sequence for instruction ADD R1,(R3)
    Step    Action
    1       PCout, MARin, Read, Select4, Add, Zin
    2       Zout, PCin, Yin, WMFC
    3       MDRout, IRin
    4       R3out, MARin, Read
    5       R1out, Yin, WMFC
    6       MDRout, SelectY, Add, Zin
    7       Zout, R1in, End

Control sequence for instruction Branch
    Step    Action
    1       PCout, MARin, Read, Select4, Add, Zin
    2       Zout, PCin, Yin, WMFC
    3       MDRout, IRin
    4       Offset-field-of-IRout, Add, Zin
    5       Zout, PCin, End

Control sequence for instruction Branch<0
    Step    Action
    1       PCout, MARin, Read, Select4, Add, Zin
    2       Zout, PCin, Yin, WMFC
    3       MDRout, IRin
    4       Offset-field-of-IRout, Add, Zin, if N=0 then End
    5       Zout, PCin, End

Hardwired control unit
  [Figure: the clock drives the control step counter. The step counter, IR, external inputs and condition codes feed a decoder/encoder block that produces the control signals.]

Control unit organization
  [Figure: the clock drives the control step counter, whose value is decoded by the step decoder into T1, T2, ..., Tn. The instruction decoder turns IR into INS1, INS2, ..., INSm. The encoder combines the step signals, instruction signals, external inputs and condition codes into the control signals; Run and End control the step counter.]

Zin and End control signals
  $Z_{in} = T_1 + T_6 \cdot \text{ADD} + T_4 \cdot \text{BR} + \cdots$
  $\text{End} = T_7 \cdot \text{ADD} + T_5 \cdot \text{BR} + (T_5 \cdot N + T_4 \cdot \overline{N}) \cdot \text{BRN} + \cdots$
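The two equations can be checked against the control sequences with a short sketch. It is illustrative only: the function names are hypothetical, only the terms written out in the equations are modelled (the "..." terms for other instructions are omitted), and the instruction is passed as one of the strings "ADD", "BR" (Branch) or "BRN" (Branch<0). The N term at step 4 is taken as complemented, consistent with "if N=0 then End" in the Branch<0 sequence.

```python
def zin(step, instr):
    """Zin = T1 + T6.ADD + T4.BR (listed terms only)."""
    return (
        step == 1
        or (step == 6 and instr == "ADD")
        or (step == 4 and instr == "BR")
    )


def end(step, instr, n_flag):
    """End = T7.ADD + T5.BR + (T5.N + T4.not N).BRN (listed terms only)."""
    return (
        (step == 7 and instr == "ADD")
        or (step == 5 and instr == "BR")
        or (instr == "BRN" and ((step == 5 and n_flag) or (step == 4 and not n_flag)))
    )


# Branch<0 with N = 0 ends early at step 4; with N = 1 it runs to step 5.
for n_flag in (False, True):
    last_step = next(t for t in range(1, 8) if end(t, "BRN", n_flag))
    print(f"BRN, N={int(n_flag)}: End asserted at step {last_step}")
```

Each `or` corresponds to an OR gate input and each `and` to an AND gate, which is how a hardwired control unit forms these signals.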
  Note:
    BR  = Branch instruction
    BRN = Branch<0 instruction
    N   = Negative flag

Generation of the Zin control signal
  [Figure: an AND gate combines T6 with Add, a second AND gate combines T4 with BR, and an OR gate combines their outputs with T1 to produce Zin.]

Generation of the End control signal
  [Figure: AND gates combine T7 with Add, T5 with BR, and the N-dependent T5 and T4 terms with BRN; an OR gate combines them to produce End.]

Microprogrammed control unit
  [Figure: IR, external inputs and condition codes feed the starting and branch address generator, which loads the clocked microprogram counter (uPC). The uPC addresses the control store, which outputs the control word.]

"Control words" stored in the "control store"
  From Figure 7.15, page 430 of "Computer Organization", 5th edition, Carl Hamacher, McGraw Hill.

End of Chapter 3