Homework
• Reading – None (finish all previous reading assignments)
• Machine Projects – Continue with MP5
• Labs – Finish lab reports by the deadline posted in lab

Hierarchy for 80286 Memory and I/O
• IBM PC-AT (“Advanced Technology”, 1984)
• DOS 3.0 operating system
• PC-AT bus evolved into the Industry Standard Architecture (ISA) bus
• Many manufacturers built ISA-based PCs and cards
• ISA Bus
  – Slow: 6 MHz, later 8 MHz (125 nsecs/cycle)
  – Address Bus: 20 bits
  – Data Bus: 16 bits

IBM PC-AT
Reference: http://www.vintage-computer.com/ibmpcat.shtml

Big Picture (80286)
[Figure: block diagram with labels: 80286; ROM Memory; RAM Memory; Hard Disk; ISA Bus: 20/16 bits, 8 MHz (125 nsecs/cycle); RTC; Keyboard; Serial Port; Parallel Port; Floppy Disk]

Hierarchy for 80486 Memory and I/O
• CPU Clock: 66 MHz
• Local Bus or CPU Bus
  – “Fast”: 33 MHz, 32 bits wide
• Expansion Bus Controller (CPU-ISA Bridge)
• ISA Bus (Legacy)
  – “Slow”: 8 MHz or 125 nsecs/cycle
  – Address Bus: 20 bits
  – Data Bus: 16 bits

Big Picture (80486)
[Figure: block diagram with labels: CPU; Local bus or CPU bus: fast (33 MHz, 32 bits) [30 nsec/cycle]; Memory; Video Adapter; Cache; Expansion Bus Controller; Disk System ROM; ISA bus: slow (8 MHz, 8/16 bits) [125 nsec/cycle]; RTC; Keyboard; Serial Port; Parallel Port; Floppy Disk]

Competition for ISA replacement
• Many vendors proposed buses to replace ISA as the technology improved
  – IBM: Micro Channel Architecture (MCA)
  – Extended Industry Standard Architecture (EISA)
  – VESA Local Bus
  – Intel: Peripheral Component Interconnect (PCI)
• PCI had won the commercial battle by the mid-1990s
  – “Medium Speed”: 33 or 66 MHz, 32 or 64 bits wide
• For a while, PCs had a mix of ISA and PCI slots

Subsequent Evolution
• More and more of the “random logic” and VLSI chips surrounding the processor were integrated into Ultra-VLSI chips (“Motherboard chips”)
• The “North Bridge” provides the highest-speed access to program/data memory and high-speed graphics processors (faster access than the PCI bus!)
• The “South Bridge” interfaces to the PCI bus and incorporates devices for Ethernet, USB, etc.
• The Super I/O chip incorporates legacy interfaces
  – Floppy Disk, Keyboard, Parallel Port, Serial Ports, etc.

Hierarchy for Pentium Memory and I/O
[Figure: motherboard chip or chip set]

Motherboard Chipsets
• The motherboard chip set provides the core logic and manages the motherboard's functions
  – CPU: Central Processing Unit
  – NB: Northbridge
  – GPU: Graphics
  – SB: Southbridge
Source: Wikipedia

Multi-core CPU Chips
• To increase processing power, multiple CPUs are included in high-performance CPU chips
• Example: dual core
[Figure: dual-core CPU chip]

Enhancing Performance
• “Pipelining is an implementation technique in which multiple instructions are overlapped in execution” (Patterson and Hennessy, “Computer Organization and Design”, p. 436)
[Figure: Sequential Execution: three Load Word instructions run one after another, each passing through Instruction Fetch, Reg Read, ALU Operation, Data Memory, and Reg Write; the figure marks 8 ns between successive instructions]
[Figure: Pipelined Execution: the same three Load Word instructions overlap in the same five stages; the figure marks 2 ns between successive instructions]

Enhancing Performance
• Pipelining improves overall performance by increasing instruction throughput per unit time, not by decreasing the execution time of an individual instruction
• Ideal speedup is the number of stages in the pipeline
• Do we achieve this?
Sometimes / Not always
  – Notice the idle time in the pipe at certain times
  – Flushing the pipeline during conditional jumps
  – Data being calculated by the previous instruction may be needed too early during the next instruction cycle

Superscalar Processors
• More than one execution pipeline executing in parallel
[Figure: instructions passing through the IF, OF, EX, and OS stages in parallel pipelines; horizontal axis: Time in Base Cycles, 0 to 9]
• Note: Possible coordination problems must be resolved

Custom Digital Signal Processor
• Early example of a superscalar processor with pipelining of arithmetic operations
• Embedded application is a hard real-time system
  – Processing analog modem waveforms
• Computations are based on complex arithmetic
• Rotation of a vector (e.g. carrier frequency)
  – Complex multiply
• Filtering of a sequence of signal samples
  – Loop doing complex multiplication and addition

Digital Signal Processor Architecture
[Figure: block diagram with labels: Controlling Host Processor (68000); Signal Processing Controller (SPC): Fetch and Execute Instructions, Address/Modulo Registers R0, R1, N0, N1, Real MAR Chip Select, Imaginary MAR Chip Select; SPC Program Memory; Address Bus; Data Bus; Multiplier-Accumulator-RAM (Real): ALU, Memory; Multiplier-Accumulator-RAM (Imaginary): ALU, Memory; Analog/Digital Converter (from phone line); Digital/Analog Converter (to phone line)]

Addresses and Complex Data
• Addresses stored in SPC Registers
  – Pointers to complex data in MAR memories
  – Register post-increment modes:
      Increment by one
      Decrement by one
      Increment by one modulo specified N register
      Decrement by one modulo specified N register
• Complex numbers stored in MAR memories:
  – Real part in one MAR (Real MAR)
  – Imaginary part in other MAR (Imaginary MAR)

Multiplier-Accumulator-RAM
[Figure: block diagram with labels: X-Register; Y-Register; Multiplier; MAC; MPY; Adder; Accumulator; Selector; Chip Select from SPC; 256 Words of RAM Memory; Selector; Address from SPC; From Other MAR; To Other MAR]

SPC/MAR Assembly Programming
Complex Multiply: (RR0 * RR1 - IR0 * IR1) + i (RR0 * IR1 + RR1 * IR0)
  YPP.P  MP.R1    Load both Y from own memory at R1 address
  MPY.P  MR.R0    Multiply both with real memory at R0 address
  YMM.R  MI.R1    Load minus real Y from imag memory at R1 address
  YPP.I  MR.R1    Load imag Y from real memory at R1 address
  MAC.P  MI.R0    Multiply/add both with imag memory at R0 address
  NOP             Result not ready yet: NOP or housekeeping
  NOP             Result not ready yet: NOP or housekeeping
  STA.P  MP.R0    Store each acc. in own memory at R0 address

Moore’s Law
• Gordon Moore made a famous observation in 1965, just four years after the first planar integrated circuit was discovered
• Moore observed an exponential growth in the number of transistors per chip and predicted that this trend would continue
• Moore's Law, the doubling of transistors every couple of years, has been maintained, and still holds true today
[Figures: Moore’s Law charts]
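
Worked example for the SPC/MAR complex multiply: the instruction sequence on the SPC/MAR Assembly Programming slide can be traced step by step in software. The Python sketch below is only a rough model under stated assumptions: the class name Mar, the helper functions, and the use of floating-point values are all hypothetical, the two NOP slots are modeled as comments rather than real pipeline latency, and the actual hardware word format is not described in these slides. It illustrates how two multiply steps per MAR produce the real and imaginary parts in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Mar:
    """Hypothetical model of one Multiplier-Accumulator-RAM (MAR) chip."""
    ram: list = field(default_factory=lambda: [0.0] * 256)  # 256 words of RAM
    y: float = 0.0    # Y-Register
    acc: float = 0.0  # Accumulator


def store_complex(real, imag, addr, re, im):
    """Place a complex value: real part in the Real MAR, imaginary part in the Imaginary MAR."""
    real.ram[addr] = re
    imag.ram[addr] = im


def read_complex(real, imag, addr):
    """Read back the (real, imaginary) pair stored at addr."""
    return (real.ram[addr], imag.ram[addr])


def complex_multiply(real, imag, r0, r1):
    """Trace the slide's sequence; the product overwrites the operand at address r0."""
    # YPP.P MP.R1: load both Y registers from own memory at R1
    real.y = real.ram[r1]
    imag.y = imag.ram[r1]
    # MPY.P MR.R0: multiply both Y registers by the real memory word at R0
    x = real.ram[r0]
    real.acc = x * real.y          # RR0 * RR1
    imag.acc = x * imag.y          # RR0 * IR1
    # YMM.R MI.R1: load minus the imaginary memory word at R1 into the real Y
    real.y = -imag.ram[r1]
    # YPP.I MR.R1: load the real memory word at R1 into the imaginary Y
    imag.y = real.ram[r1]
    # MAC.P MI.R0: multiply both Y registers by the imaginary word at R0 and accumulate
    x = imag.ram[r0]
    real.acc += x * real.y         # ... - IR0 * IR1
    imag.acc += x * imag.y         # ... + RR1 * IR0
    # NOP, NOP: result not ready yet on the hardware (latency not modeled here)
    # STA.P MP.R0: store each accumulator in its own memory at R0
    real.ram[r0] = real.acc
    imag.ram[r0] = imag.acc


if __name__ == "__main__":
    real_mar, imag_mar = Mar(), Mar()
    store_complex(real_mar, imag_mar, 0, 3.0, 4.0)  # operand at R0: 3 + 4i
    store_complex(real_mar, imag_mar, 1, 1.0, 2.0)  # operand at R1: 1 + 2i
    complex_multiply(real_mar, imag_mar, 0, 1)
    print(read_complex(real_mar, imag_mar, 0))      # (-5.0, 10.0)
```

Both MARs execute the same step at the same time in this model, which is why four real multiplications take only two multiply instructions (MPY.P and MAC.P).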