Lecture 3: Computer Architectures

Basic Computer Architecture
[Figure: Von Neumann architecture. Labels: memory, input unit, output unit, and a processor containing the control unit (CU), ALU, and registers (Reg.); instruction and data paths run between memory and processor.]

Levels of Parallelism
Bit level parallelism
• Within arithmetic logic circuits
Instruction level parallelism (ILP)
• Multiple instructions execute per clock cycle
• Pipelining (instruction - data)
• Multiple issue (VLIW)
Memory system parallelism
• Overlap of memory operations with computation
Operating system parallelism
• More than one processor
• Multiple jobs run in parallel on SMP
• Loop level
• Procedure level

Flynn’s Taxonomy
• Single Instruction stream - Single Data stream (SISD)
• Single Instruction stream - Multiple Data stream (SIMD)
• Multiple Instruction stream - Single Data stream (MISD)
• Multiple Instruction stream - Multiple Data stream (MIMD)

Single Instruction stream - Single Data stream (SISD)
[Figure: Von Neumann architecture. Memory supplies instructions to the CU and data to the ALU inside a single processor.]

Single Instruction stream - Multiple Data stream (SIMD)
• Instructions of the program are broadcast to more than one processor
• Each processor executes the same instruction synchronously, but using different data
• Used for applications that operate upon arrays of data
[Figure: one CU sends the same instruction to several processing elements (PE); each PE exchanges its own data with memory.]
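The SIMD idea (one instruction stream broadcast to several processing elements, each holding different data) can be modelled in a few lines. This is only an illustrative sketch, not real SIMD hardware: the names `ProcessingElement`, `broadcast`, and `run_simd` are hypothetical, and the Python loop visits the elements one after another where real hardware would run them in lockstep.

```python
from typing import Callable, List

Instruction = Callable[[float], float]


class ProcessingElement:
    """Hypothetical PE: holds one private data item."""

    def __init__(self, value: float) -> None:
        self.value = value

    def execute(self, instruction: Instruction) -> None:
        # Apply the broadcast instruction to this PE's own data.
        self.value = instruction(self.value)


def broadcast(pes: List[ProcessingElement], instruction: Instruction) -> None:
    """Control unit step: every PE receives the same instruction."""
    for pe in pes:  # sequential here; simultaneous on real SIMD hardware
        pe.execute(instruction)


def run_simd(data: List[float], program: List[Instruction]) -> List[float]:
    """Run a single instruction stream over multiple data items."""
    pes = [ProcessingElement(x) for x in data]
    for instruction in program:
        broadcast(pes, instruction)
    return [pe.value for pe in pes]


# One instruction stream: double, then add one.
program = [lambda x: x * 2, lambda x: x + 1]
result = run_simd([1, 2, 3, 4], program)
print(result)  # [3, 5, 7, 9]
```

Each element ends up with a different result because the data differ, even though every element executed exactly the same two instructions.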
Multiple Instruction stream - Multiple Data stream (MIMD)
• Each processor has a separate program
• An instruction stream is generated for each program on each processor
• Each instruction operates upon different data
Two memory organizations:
• Shared memory
• Distributed memory

Shared vs Distributed Memory
[Figure: distributed memory shows processors (P), each with its own memory (M), connected by a network; shared memory shows processors (P) connected by a bus to one memory.]
Distributed memory
• Each processor has its own local memory
• Message-passing is used to exchange data between processors
Shared memory
• Single address space
• All processes have access to the pool of shared memory

Distributed Memory
[Figure: four nodes, each with a memory (M), a processor (P), and a network interface (NI), connected by a network.]
• Processors cannot directly access another processor’s memory
• Each node has a network interface (NI) for communication and synchronization
[Figure: each node has its own memory (M) feeding instructions to its CU and data to its PE; the PEs exchange data over the network.]
• Each processor executes different instructions asynchronously, using different data

Shared Memory
[Figure: several CU/PE pairs, all fetching instructions and data from one memory.]
• Each processor executes different instructions asynchronously, using different data
Uniform memory access (UMA)
[Figure: processors (P) connected by a single bus to one memory.]
• Each processor has uniform access to memory (symmetric multiprocessor - SMP)
Non-uniform memory access (NUMA)
[Figure: groups of processors (P), each group on its own bus with its own memory; the groups are joined by a network.]
• Time for memory access depends on the location of data
• Local access is faster than nonlocal access
• Easier to scale than SMPs

Distributed Shared Memory
• Makes the main memory of a cluster of computers look as if it is a single memory with a single address space
• Shared memory programming techniques can be used

Multicore Systems
• Many general purpose processors
• GPU (Graphics Processing Unit)
• GPGPU (General Purpose GPU)
• Hybrid
The trend is:
• Board composed of multiple manycore chips sharing memory
• Rack composed of multiple boards
• A room full of these racks

Distributed Systems
Clusters
• Individual computers in a local environment that are tightly coupled by software to work together on single problems or on related problems
Grid
• Many individual, geographically distributed systems that are tightly coupled by software to work together on single problems or on related problems
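To contrast the two programming models described in this lecture (a single shared address space versus private memories that exchange messages), here is a minimal sketch. It is a simulation under stated assumptions, not a cluster program: both versions use threads on one machine, a `queue.Queue` stands in for the network, and the function names `shared_memory_sum` and `message_passing_sum` are hypothetical.

```python
import queue
import threading
from typing import List


def shared_memory_sum(chunks: List[List[int]]) -> int:
    """Shared memory: every worker updates one variable in a common address space."""
    total = [0]              # shared by all threads
    lock = threading.Lock()  # synchronizes access to the shared variable

    def worker(chunk: List[int]) -> None:
        partial = sum(chunk)
        with lock:
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(chunks: List[List[int]]) -> int:
    """Distributed memory (simulated): each worker touches only its local
    data and sends its partial result as a message."""
    channel: "queue.Queue[int]" = queue.Queue()  # stands in for the network

    def worker(local_data: List[int]) -> None:
        channel.put(sum(local_data))  # send a message; no shared variable

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The receiver collects exactly one message per worker.
    return sum(channel.get() for _ in chunks)


chunks = [[1, 2, 3], [4, 5], [6]]
print(shared_memory_sum(chunks))    # 21
print(message_passing_sum(chunks))  # 21
```

In the shared-memory version the lock is what keeps concurrent updates correct; in the message-passing version no lock is needed because no worker can reach another worker's data, which mirrors the rule that processors cannot directly access another processor's memory.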