TDTS 08 Advanced Computer Architecture [Datorarkitektur]
2015-10-30
www.ida.liu.se/~TDTS08
Zebo Peng
Embedded Systems Laboratory (ESLAB)
Dept. of Computer and Information Science (IDA)
Linköping University

Contact Information
Zebo Peng, course leader and examiner. Email: zebo.peng@liu.se
Ke Jiang, course and lab assistant. Email: ke.jiang@liu.se
Arian Maghazeh, lab assistant. Email: arian.maghazeh@liu.se
Åsa Kärrman, course secretary. Email: asa.karrman@liu.se

Zebo Peng, IDA, LiTH, TDTS 08 – Lecture 1

Course Information
Web page
• http://www.ida.liu.se/~TDTS08
Lectures
• 12 lectures.
• Lecture notes will usually be available on the web page one day before each lecture.
• The whole set of last year’s lecture notes is on the web.
• Lecture notes following the course book’s structure are also available on the web.
Examination
• Written exam, closed book.
• Previous exam examples are on the course web site.

Course Information (cont’d)
Literature
• William Stallings: Computer Organization and Architecture, 10th edition, Pearson, 2015.
• Additional articles available on the website.
• http://williamstallings.com/ComputerOrganization/ (student resources, incl. homework problems & solutions).
You can also use:
• Older editions of Stallings’ book.
• Books covering the same subjects. Ex. Hennessy and Patterson: “Computer Architecture: A Quantitative Approach.”

Course Information (cont’d)
Labs
• Hands-on exercises with concepts taught in the course.
• Use tools for architecture evaluation via simulation.
• Give insight into the various trade-offs involved in the design of computers.
• Enhance the understanding of parallel computer systems.
Lab assignments
• Lab 1: Cache Memories.
• Lab 2: Instruction Pipelining.
• Lab 3: Superscalar Processors.
• Lab 4: VLIW Processors.
• Lab 5: Multiprocessor and Multi-Computer Systems.
Please sign up for the labs on the website before November 11.
Additional information will be given in the seminar (lesson) on Wednesday, November 11, 15-17.

Lecture 1
• Computer architecture: concepts and definitions
• CPU and instruction execution
• What is expected from this course

Computer Architecture
Computer architecture refers to those attributes of a computer system that are visible to programmers, or have a direct impact on the logical execution of programs.
[Figure: Control Logic, Registers, Instruction Set, ALU, Memory System, I/O Devices]

Many Definitions for CA
• “The science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.”
• “The theory behind the design of a computer.”
• “The conceptual design and fundamental operational structure of a computer system.”
• “The arrangement of computer components and their relationships.”
• …

Typical Architecture Attributes
• The instruction set.
• Data representation methods.
• The basic hardware units in the CPU.
• Functions of the main components.
• Instruction execution.
• Memory organization.
• I/O mechanisms.
• The ways in which the main components are interconnected.
• …

What is a Computer?

Definition of a Computer
A computer is a data processing machine which is operated automatically under the control of a list of instructions stored in its main memory.
[Figure: Computer, containing the Central Processing Unit (CPU) and Main Memory, linked by data transfer and control]

A Computer System
A computer system consists of a computer and its peripherals.
Computer peripherals include input devices, output devices, and secondary memories.
[Figure: Computer System, consisting of Input device, Computer, Output device, and Secondary memory]

Microprocessor Market Share
[Figure: market share chart; recoverable labels: General-purpose computers, 98%, 2%]

Basic Principles of Computers
Virtually all modern computer designs are based on the von Neumann architecture principles:
• Data and instructions are stored in a single read/write memory.
• The contents of this memory are addressable by location, without regard to what is stored there.
• Instructions are executed sequentially (from one instruction to the next) unless the order is explicitly modified.
[Figure: CPU connected to Memory]

Why von Neumann Architecture?
• General-purpose, programmable. A computer can solve very different problems by executing different programs.
• Instruction execution is done automatically.
• It can be built with very simple electronic components:
  • The data processing function is performed by electronic gates.
  • The data storage function is provided by memory cells.
  • Data communication is achieved by electrical wires.

Technology Development
[Figure: Eniac, 1946]

Year   Technology                    Relative perf./cost
1951   Vacuum tube                   1
1965   Transistor                    35
1975   Integrated circuit (IC)       900
1995   Very large scale IC (VLSI)    2,400,000
2005   Ultra large scale IC          6,200,000,000

Moore’s Law
[Figure: number of transistors per chip versus year]
Number of transistors per chip would double every 1.5 years.
Similar improvement in:
• Clock frequency (every 2 years)
• Performance
• Memory capacity

Intel Microprocessor Evolution
• Intel 4004: 2.3 thousand transistors, 10,000 nm.
• Intel 8-core Xeon: 2.3 billion transistors, 45 nm.
Images courtesy of Intel Corporation.

Lecture 1
• Computer architecture: concepts and definitions
• CPU and instruction execution
• What is expected from this course

Central Processing Unit (CPU)
The Central Processing Unit (CPU), also called the processor, includes two main units:
• A program control unit, and
• An Arithmetic and Logic Unit (ALU).
[Figure: CPU containing the Arithmetic and Logic Unit, the Control Unit, and Registers]

CPU (Cont’d)
• The primary function of a CPU is to execute the instructions stored in the main memory.
• An instruction tells the CPU to perform one of its basic operations.
• The CPU also includes a set of registers, which are temporary storage devices used to hold control information, key data, and intermediate results.
• It also includes an internal bus infrastructure, which provides data movement paths among the control unit, ALU, and registers.
• The control unit (CU) interprets (decodes) the instruction to be executed and "tells" the other components what to do.

CPU Internal Structure
[Figure: CPU internal structure]

Registers
• The CPU must have some working space (temporary storage). These storage units are called registers.
• They are the top-level component in the memory hierarchy.
• The number and function of the registers vary between different computers.
• Register organization is one of the major design decisions.

Register Organization
The registers serve two main functions:
User-Visible Registers: used by machine or assembly language programmers to minimize memory access.
• General-purpose registers
• Data registers
• Address registers
• Condition code registers
Control and Status Registers: used by the control unit to control the operation of the CPU, and by the operating system to control the execution of programs.

Machine Instructions
The CPU can only execute machine code in binary format, called machine instructions.
A machine instruction specifies the following information:
• What has to be done (the operation code).
• To whom the operation applies (source operands).
• Where the result goes (destination operand).
• How to continue after the operation is finished (next instruction address).
Machine instructions are of four types:
• Arithmetic and logic operations.
• Data transfer between memory and CPU registers.
• Program control (conditional branches, etc.).
• I/O transfer.

Instruction Set Design
The design of an instruction set is critical to the operation of a computer system.
The most important issues are:
• Operation repertoire: how many and which operations to provide, and how complex these operations should be.
• Data types: which data types are to be supported.
• Instruction format: length, number of addresses, size of various fields, etc.
• Registers: number of CPU registers and their use.
• Addressing: which modes are to be provided.
The issues are highly interrelated and must be considered together.
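
To make the notion of an instruction format more concrete, the sketch below packs the operation code, the destination operand, and two source operands into one binary word and unpacks them again. The 16-bit layout, the four opcodes, and the names `OPCODES`, `encode`, and `decode` are assumptions made for illustration only; they are not taken from the lecture or from any real instruction set.

```python
# Hypothetical 16-bit instruction format (illustrative assumption):
#   bits 15-12: operation code
#   bits 11-8 : destination register
#   bits 7-4  : first source register
#   bits 3-0  : second source register
OPCODES = {"ADD": 0x1, "SUB": 0x2, "AND": 0x3, "OR": 0x4}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def encode(op, dest, src1, src2):
    """Pack one instruction into a 16-bit machine word."""
    for reg in (dest, src1, src2):
        if not 0 <= reg <= 15:
            raise ValueError(f"register number out of range: {reg}")
    return (OPCODES[op] << 12) | (dest << 8) | (src1 << 4) | src2


def decode(word):
    """Split a 16-bit machine word back into its four fields."""
    opcode = (word >> 12) & 0xF
    dest = (word >> 8) & 0xF
    src1 = (word >> 4) & 0xF
    src2 = word & 0xF
    return MNEMONICS[opcode], dest, src1, src2


word = encode("ADD", 3, 1, 2)   # R3 <- R1 + R2
print(f"{word:016b}")           # 0001001100010010
print(decode(word))             # ('ADD', 3, 1, 2)
```

With 4-bit fields this format allows at most 16 operations and 16 registers, which illustrates why the operation repertoire, the number of registers, and the instruction length have to be decided together.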

Instruction Execution Mechanism
[Figure: CPU (PC, AR, IR, Control Unit, ALU, MAR, MBR) connected to Main Memory by Address and D/I lines]

Instruction Execution
Fetch cycle
• fetch:
  PC -> MAR
  M[MAR] -> MBR
  MBR -> IR
  PC + 1 -> PC
Execute cycle
• decode:
  Decode(IR)
• execute:
  Perform the specified operation (memory access may be needed; PC may be changed)

Machine Cycles
• The execution of an instruction is carried out in a machine cycle (instruction cycle).
• The CPU executes one instruction after the other, cycle by cycle, repeatedly.
• The machine cycle time (or instruction execution time) of a computer gives an indication of its performance (speed).
• Ex. a computer can have a performance of 733 MIPS (Millions of Instructions Per Second), which corresponds to an average instruction execution time of about 1.36 ns.
• Since different instructions need different amounts of time to execute, the average instruction execution time is often used.
• Nowadays, FLOPS (Floating-point Operations Per Second) is also very commonly used.

Summary
A computer repeatedly executes series of instructions (called programs) stored in its main memory:
• It performs data processing operations specified by the programs.
• It runs the programs automatically, with no need for human intervention.
• It can perform the operations at extremely high speed.
• It can store and manipulate a large amount of data.
• Computers can communicate with each other and with users in an efficient way.
• It represents programs and data in the same way, which leads to flexibility.
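
The fetch and execute cycles on the Instruction Execution slide can be sketched as a small simulator. This is a minimal sketch under stated assumptions: the five-instruction set (LOAD, ADD, STORE, JUMP, HALT), the single accumulator `acc`, and the function name `run` are hypothetical and chosen only for illustration. The register transfers in the fetch cycle follow the slide, and instructions and data share one memory, as in the von Neumann principles.

```python
def run(memory, max_steps=1000):
    """Execute the program stored in `memory`, starting at address 0.

    Instructions are (opcode, operand) tuples and data are plain integers,
    all held in the same list (a single read/write memory).
    Returns the accumulator value when HALT is executed.
    """
    pc = 0    # program counter
    acc = 0   # accumulator (hypothetical working register)
    for _ in range(max_steps):
        # Fetch cycle
        mar = pc            # PC -> MAR
        mbr = memory[mar]   # M[MAR] -> MBR
        ir = mbr            # MBR -> IR
        pc = pc + 1         # PC + 1 -> PC

        # Execute cycle: decode, then perform the specified operation
        opcode, operand = ir            # Decode(IR)
        if opcode == "HALT":
            return acc
        elif opcode == "LOAD":
            acc = memory[operand]       # memory access needed
        elif opcode == "ADD":
            acc = acc + memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "JUMP":
            pc = operand                # PC is changed explicitly
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    raise RuntimeError("step limit reached without HALT")


program = [
    ("LOAD", 4),    # address 0: acc <- M[4]
    ("ADD", 5),     # address 1: acc <- acc + M[5]
    ("STORE", 6),   # address 2: M[6] <- acc
    ("HALT", 0),    # address 3: stop
    20,             # address 4: data
    22,             # address 5: data
    0,              # address 6: result
]
result = run(program)
print(result, program[6])   # 42 42
```

Instructions execute sequentially because the fetch cycle always increments the PC; only an instruction such as JUMP modifies that order.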

Lecture 1
• Computer architecture: concepts and definitions
• CPU and instruction execution
• What is expected from this course

Computer Architecture
Computer architecture refers to those attributes of a computer system that are visible to programmers, or have a direct impact on the logical execution of programs.
[Figure: Control Logic, Registers, Instruction Set, ALU, Memory System, I/O Devices]

Computer Organization
Computer organization refers to the operational units and their interconnections that realize the architectural specifications.
[Figure: Registers, ALU, Memory System, I/O Devices, Hidden Reg., Microprog. controller]
Ex. Multiplication function:
• Architectural issue: having a multiply instruction or not.
• Organization issue: a special multiply unit, or repeated use of the add unit to perform multiplication.

Lecture Contents
1. Introduction: Basic concepts and definitions of computer architecture and organization.
2. Memory System: Memory hierarchy, cache memories, virtual memories, memory management.
3. Instruction Pipelining: Organization, pipeline hazards, reducing branch penalties, branch prediction strategies.
4. RISC Architectures: Analysis of instruction execution, compiling for RISC computers, RISC-CISC trade-offs.
5. Superscalar Architectures: Instruction level parallelism and machine parallelism, HW techniques for performance enhancement, limitations.
6. VLIW Architectures: VLIW advantages and limitations, compiling for VLIW architectures, the Merced (Itanium) architecture.

Lecture Contents (Cont’d)
7. Parallel Computation: Parallel programs, performance of parallel computers, classification of architectures.
8. SIMD Architectures: Vector processors, array processors, multimedia extensions.
9. MIMD Architectures: Symmetric multiprocessors, NUMA architecture, and clusters.
10. Cache Coherence in Parallel Architectures: Cache design for parallel systems, cache coherence protocols.
11. Multi-Core Processor and GPU: Technology, hardware issues, software impact, commercial examples.
12. Low-Power Architectures: Power consumption in CMOS circuits, designing for low power, Crusoe processor.