EE-321 Fall 2023
Computer Architecture and Organization
Lecture # 01: Introduction and Motivation
Muhammad Imran
muhammad.imran1@seecs.edu.pk

Welcome to Computer Architecture and Organization

Contents
▪ Introduction
▪ What is this course about?
▪ Why should you take this course?
▪ Course Outline, Textbook, Grading Policy etc.
▪ What is Computer Architecture?
▪ Classical Ideas
▪ Computer Architecture Today!

Introduction
▪ About me …
  ▪ Ph.D. in Electronic and Electrical Engineering, Sungkyunkwan University, South Korea
▪ Research Interests
  ▪ Processing in Memory / In-Memory Computing
  ▪ Efficient and reliable Computing Architectures for AI
  ▪ Vector processor design
▪ Contact
  ▪ Phone: 03344921069
  ▪ Email: muhammad.imran1@seecs.edu.pk
  ▪ Web: https://soc.seecs.edu.pk
▪ Office
  ▪ System on Chip (SoC) Lab, 3rd Floor, SINES, NUST
  ▪ Office Hours: any time by appointment

Besides teaching and research …

Introduction
▪ Students’ Introduction
▪ Let’s get to know each other …
  ▪ What’s your name?
  ▪ Which major do you like?
  ▪ What are you passionate about?
  ▪ Anything else …

What is this course about?
How does a microprocessor work at a very low level?
How do you design a microprocessor bottom up?
Why do certain programs run faster on a GPU than on a CPU?
How can we make computers better (faster, less costly, more secure etc.)?
How do different programming styles impact the performance of a computer?
What are the features of a good Instruction Set Architecture (ISA)?

Course contents in brief …
▪ Learning to design your own microprocessor based on the popular, open-standard RISC-V ISA!
▪ Techniques for enhancing computing performance
▪ Architecting memory for performance enhancement
▪ Understanding design trade-offs
▪ Introduction to advanced computing architectures
  ▪ Out-of-order execution
  ▪ Vector processors
  ▪ Multicores
  ▪ Domain-specific architectures

Prerequisite
▪ No prior knowledge as such
▪ But be willing to learn Verilog …
  ▪ To implement the ideas and build your own computer!
  ▪ Required in some labs, assignments and the project!
▪ Check out the following sources:
  ▪ https://safari.ethz.ch/digitaltechnik/spring2023/doku.php?id=schedule (Week 3)
  ▪ https://fpgacademy.org/courses.html
  ▪ https://www.asic-world.com/verilog/veritut.html
  ▪ https://www.youtube.com/playlist?list=PLZU5hLL_713x0_AV_rVbay0pWmED7992G

Why should you take this course?
▪ Computers are everywhere today!
  ▪ Personal computers!
  ▪ Cellphones!
  ▪ Embedded systems!
  ▪ Servers, supercomputers etc.
▪ Knowing architectural design principles can help you optimize many modern systems!
▪ Programmers can write efficient programs if they know the underlying hardware well!
▪ Computer Architecture is an exciting and rapidly advancing area of research!

Textbook, Course Outline, Grading etc.
Textbook(s)
Main Textbook
Advanced Concepts
More Detailed

Lectures Plan
▪ Week 1 to Week 8
  ▪ Introduction & Motivation
  ▪ Background and design example
  ▪ Instruction Set Architecture (ISA)
  ▪ Measuring Performance
  ▪ Design of an Arithmetic Logic Unit (ALU)
  ▪ Design of Single-Cycle Processor Datapath and Control
  ▪ Multicycle Pipelined Processor Implementation
  ▪ Dealing with Pipelining Hazards
  ▪ Handling Interrupts and Exceptions
▪ Mid-Semester Exam (MSE)
  ▪ 9th Week – 6 November 2023 ~ 10 November 2023

Lectures Plan
▪ Week 10 to Week 17
  ▪ Advanced Techniques for Instruction-Level Parallelism
    ▪ Out-of-Order Execution
    ▪ Superscalar Processors
    ▪ Hardware Multithreading
  ▪ Data-Level Parallelism for Performance
    ▪ Vector Architectures and GPUs
  ▪ Memories and Memory Hierarchy
    ▪ Caches
    ▪ Virtual Memory
  ▪ Thread-Level Parallelism for Performance
    ▪ Multicore systems
    ▪ Cache Coherence in Multicores
  ▪ Domain-Specific Architectures
    ▪ Processing-In Memory
▪ End-Semester Exam (ESE)
  ▪ 18th Week – 8 January 2024 ~ 12 January 2024

Grading Policy
▪ Assignments: 10%
▪ Quizzes: 10%
▪ MSE: 30%
▪ ESE: 50%
▪ Project: 40% of Lab. grade!
  ▪ Design, implement and test on an FPGA an application-specific RISC-V processor!
  ▪ Pick any application where conventional microcontrollers are used!

A few words on class ethics …
▪ Say no to Plagiarism!
  ▪ Never copy others’ assignments!
  ▪ Better to fail now than to fail later in life!
  ▪ Properly cite the resources used in your assignments / project!
▪ Work Hard!
▪ Respect for all!
▪ If you have any problem, feel free to contact me!

Presence in class
▪ Mandatory 75% attendance!
  ▪ Be on time to ensure you do not miss the attendance!
  ▪ If you are late for any reason, do not disturb the class while entering!
▪ Attentiveness during lecture
  ▪ More important than attendance!
  ▪ Key to effective learning in a semester-based system!
  ▪ Requires effort!
▪ Feel free to interrupt, ask questions and clarify different points!
Any course-policy-related queries?

What is Computer Architecture?
It’s a story of an exciting journey covering major milestones in more than 70 years of computer design …

From this …
Electronic Delay Storage Automatic Calculator (EDSAC) (An early British Computer, 1948)
Source: Wikipedia

To this …
Laptops, Smartphones, Digital Cameras, Robots, Automobiles, Home Appliances, Servers, datacenters, supercomputers and many more …
Source: images.google.com, multiple sites

With insights to undertake the design of modern computers … and getting your work done efficiently by a computer …

Key ideas that shaped Computer Architecture over time

Abstraction Layers to simplify design!
Application
Algorithm
Programming Language
Operating System / Virtual Machines
Instruction Set Architecture (ISA)
Microarchitecture
Classical view!
Gates / Register-Transfer Level (RTL)
Circuits
Devices
Physics
Source: phys.org
Modern view!
Computer Architecture is the science and art of designing computing platforms (hardware, interface, system SW, and programming model)

From CISC …
▪ Complex Instruction Set Computers (CISC) ISAs
  ▪ Example?
    ▪ x86 architecture!
▪ Support complex instructions
  ▪ Many addressing modes!
  ▪ Allow memory operands in multiple instructions!
▪ Redundant Instructions
  ▪ When the same functionality could be achieved by one instruction!
  ▪ Probably good for assembly language programmers!
▪ CISC Implementation
  ▪ Complicated!
  ▪ Harder to pipeline!
  ▪ Cannot easily enhance performance because of variations in instruction execution time!

To RISC!
▪ Reduced Instruction Set Computers (RISC) ISAs
  ▪ Examples?
    ▪ ARM, MIPS, RISC-V
▪ Support a minimal set of instructions to achieve the most commonly required functionality!
▪ Do not allow memory operands except in load / store instructions!
▪ Motivation?
  ▪ Instructions are more uniform in terms of execution!
  ▪ Easier to pipeline!
  ▪ Easier to make them execute more efficiently!
  ▪ Easier to handle exceptions and interrupts!
▪ CISC architectures like x86 convert instructions into micro-operations (micro-ops), which are RISC-like and efficient to implement!
  ▪ One of the reasons for their survival!

RISC-V ISA
▪ Based on long-term experience of computer design!
  ▪ A well-thought-out ISA, one can say!
▪ And, it is an open standard!
  ▪ Anyone can use it freely to make a chip!
▪ Expected to accelerate open-standard computer hardware development, just as open-source software did for software!
▪ People have already done a lot of work around RISC-V!

Technology Trends

Moore’s Law
▪ Moore’s observation
  ▪ The number of transistors on a dense integrated circuit doubles about every two years …
▪ Technology Scaling
  ▪ Make transistors smaller and smaller …
  ▪ Allows integrating more circuitry in the same area!
  ▪ More functionality!
▪ No longer progressing at the pace of Moore’s law
  ▪ End of scaling
  ▪ Power!

Dennard Scaling
▪ Observation by Robert Dennard in 1974
  ▪ Power density remained constant for a given silicon area even as the number of transistors increased due to scaling down the transistor dimensions
  ▪ Operating voltage kept dropping with decreasing transistor size!
▪ Dennard scaling ended around the mid-2000s!
  ▪ Voltage couldn’t be reduced further while still operating the transistor reliably!
  ▪ That means adding more transistors now can increase the power budget and heat up the chip!

Performance of a Single Processor
▪ Rapid growth in uniprocessor performance ended in the early 2000s

Power and Energy
▪ Dynamic energy
  ▪ A transistor switches from 0 -> 1 or 1 -> 0
  ▪ ½ × Capacitive load × Voltage²
▪ Dynamic power
  ▪ ½ × Capacitive load × Voltage² × Frequency switched
▪ Static energy is the energy consumed due to leakage current!
  ▪ Leakage increases with smaller transistors!
  ▪ Proportional to the number of transistors!
  ▪ One solution is to cut off power to a module when it is not in use!

Clock Rate
▪ Intel 80386 consumed ~ 2 W
▪ 3.3 GHz Intel Core i7 consumes 130 W
▪ Heat must be dissipated from a 1.5 x 1.5 cm chip
▪ This is the limit of what can be cooled by air!
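The dynamic energy and dynamic power relations from the Power and Energy slide can be tried out numerically. The following is a minimal Python sketch; the function names and the capacitance, voltage and frequency values are assumptions chosen only for illustration and do not describe any real chip.

```python
def dynamic_energy(cap_load_farads, voltage_volts):
    """Energy (joules) of one 0->1 or 1->0 transition: 1/2 * C * V^2."""
    return 0.5 * cap_load_farads * voltage_volts ** 2


def dynamic_power(cap_load_farads, voltage_volts, freq_hz):
    """Dynamic power (watts): 1/2 * C * V^2 * frequency switched."""
    return dynamic_energy(cap_load_farads, voltage_volts) * freq_hz


# Hypothetical values, not taken from the lecture.
C_LOAD = 1e-9   # total switched capacitance in farads
FREQ = 3.0e9    # switching frequency in hertz
V_OLD = 1.0     # original supply voltage in volts
V_NEW = 0.85    # supply voltage lowered by 15%

p_old = dynamic_power(C_LOAD, V_OLD, FREQ)
p_new = dynamic_power(C_LOAD, V_NEW, FREQ)

# Power scales with the square of the voltage at a fixed frequency.
ratio = p_new / p_old
print(f"old: {p_old:.3f} W, new: {p_new:.3f} W, ratio: {ratio:.4f}")
```

Because voltage enters as a square, lowering it was the main lever for keeping power in check, which is why the end of voltage scaling matters so much for the power budget.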
What does architecting a modern computer require?

It’s the Science and the Art …
Source: Wikipedia
Understanding the conventional wisdom and fundamental design principles with attention to current challenges to inspire new out-of-the-box designs …

What’s the state-of-the-art research in Computer Architecture?

Challenges with Memory System
“Across the industry, today’s chips are largely able to execute code faster than we can feed them with instructions and data. There are no longer performance bottlenecks in the floating-point multiplier unit. The real design action is in the memory subsystems – caches, buses, bandwidth, and latency” - Richard Sites, 1996

Architecting Memory System
▪ Need for main memory capacity and bandwidth is increasing
  ▪ Core count doubling ~ every 2 years
  ▪ DRAM capacity doubling ~ every 3 years
  ▪ Memory capacity per core expected to drop by 30% every two years
▪ Scaling limitations will eventually stop memory capacity from increasing!
▪ Memory-intensive workloads!
  ▪ Machine learning and neural networks
  ▪ Memory bandwidth is a performance bottleneck!

Architecting Memory System
▪ Architecting for improved memory performance!
▪ 3-D Integration
  ▪ Improves density and bandwidth
▪ Emerging Non-Volatile Memory Technologies
  ▪ Better scaling potential
  ▪ Better storage density!
  ▪ Non-volatile (unlike volatile DRAM and SRAM)
  ▪ Performance much better than Flash!
  ▪ Examples
    ▪ Phase Change Memory (PCM)
    ▪ Resistive RAM (RRAM)
    ▪ Magneto-resistive RAM (STT-MRAM)
▪ Emerging memories do have reliability challenges!

Architecting Memory System
▪ Architecting for improved memory performance!
▪ Hybrid Main Memory
  ▪ Characteristics of an emerging technology like PCM
    ▪ Better scalability than DRAM
    ▪ More storage!
    ▪ Non-volatile!
    ▪ Read / write time is longer than that of DRAM!
    ▪ Write energy is higher than that of DRAM!
    ▪ Cell lifetime is shorter than that of DRAM!
  ▪ In hybrid main memory, DRAM can act as a cache for PCM!
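The hybrid main memory idea, with DRAM acting as a cache in front of PCM, can be illustrated with a toy model. The Python sketch below is only an assumption-laden illustration: the class name `HybridMemory`, the LRU replacement policy and the latency figures are all hypothetical and are not taken from the lecture or from any real device.

```python
from collections import OrderedDict


class HybridMemory:
    """Toy model: a small DRAM used as an LRU cache in front of a larger PCM."""

    def __init__(self, dram_lines, dram_ns=50, pcm_ns=150):
        self.dram = OrderedDict()      # line address -> present, kept in LRU order
        self.dram_lines = dram_lines   # DRAM capacity in cache lines
        self.dram_ns = dram_ns         # hypothetical DRAM access latency
        self.pcm_ns = pcm_ns           # hypothetical PCM read latency
        self.hits = 0
        self.misses = 0
        self.total_ns = 0

    def read(self, line_addr):
        if line_addr in self.dram:
            # Hit: served from DRAM, mark line as most recently used.
            self.dram.move_to_end(line_addr)
            self.hits += 1
            self.total_ns += self.dram_ns
        else:
            # Miss: DRAM lookup, then a slower PCM read, then fill into DRAM.
            self.misses += 1
            self.total_ns += self.dram_ns + self.pcm_ns
            self.dram[line_addr] = True
            if len(self.dram) > self.dram_lines:
                self.dram.popitem(last=False)  # evict least recently used line


mem = HybridMemory(dram_lines=2)
for addr in [0, 1, 0, 2, 0, 1]:
    mem.read(addr)

print(mem.hits, mem.misses, mem.total_ns)
```

In this toy trace the frequently reused line stays in DRAM, so its accesses avoid the slower PCM read; a real design would also have to account for PCM write energy and cell lifetime.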
Architecting Memory System
▪ Architecting for improved memory performance!
▪ Introduce new levels in the memory hierarchy
  ▪ Intel Optane Memory (based on emerging memory technologies)
  ▪ Intelligently stores frequently accessed data to accelerate program execution!
https://www.intel.com/content/www/us/en/products/details/memory-storage/optane-memory.html

Post Von Neumann Architectures
▪ Von Neumann Architecture has
  ▪ A compute unit
  ▪ A memory unit
  ▪ An interconnect
▪ In the early days, memory access was faster and used less energy compared to computation!
  ▪ So, data could reside away from the compute unit!
▪ Today, processors have been highly optimized
  ▪ Memory access is a bottleneck to performance!
  ▪ Memory access energy is much higher than the energy spent in computation!
    ▪ 100~1000X more costly!

Post Von Neumann Architectures
▪ Processing-In Memory
  ▪ Implement logic within memory!
  ▪ Reduce costly data movement!
  ▪ Save energy and improve performance!

Reliable and Secure Computing

RowHammer in DRAM
▪ A reliability as well as security vulnerability in DRAM!
▪ Repeatedly accessing a DRAM row can flip bits in adjacent rows!
▪ Can corrupt data, i.e., a reliability challenge!
▪ Also a security challenge
  ▪ By precisely flipping the “access rights” bits, an attacker can get access to privileged user and system data!

Meltdown and Spectre
▪ Security vulnerabilities based on speculative execution!
▪ CPU speculatively fetches data that it shouldn’t!
▪ Attacker deduces important information from the fetched data!
▪ Can use that to take over the system!
Image Source: https://blog.malwarebytes.com/

Meltdown and Spectre
▪ For more on Meltdown and Spectre, check:
  ▪ https://meltdownattack.com/
  ▪ A broad overview
    ▪ https://www.youtube.com/watch?v=syAdX44pokE
  ▪ Somewhat more detailed explanation
    ▪ https://www.youtube.com/watch?v=mgAN4w7LH2o

Resistance Drift in PCM
▪ PCM cell material resistance spontaneously increases over time
▪ Data may be changed as resistance increases
[Figure: cell resistance versus time, showing drift leading to an error]

Other Reliability Challenges
▪ Different error mechanisms in different memory technologies!
  ▪ Radiation/thermal-induced bit flips in DRAM
  ▪ Write Disturbance (WD) in PCM (similar to RowHammer in DRAM)!
  ▪ Read Disturbance in STT-MRAM
  ▪ Hard errors (stuck-at faults) in emerging memories etc.
▪ Need to develop robust reliability enhancement or error correction techniques!

Domain-Specific Architectures

Domain-Specific Architecture
▪ Designing custom solutions to best suit a particular application domain
▪ A simple example …
  ▪ Accelerating a task, ‘a+b+c+d+e+f+g+h’
Software
▪ Sequential execution on a processor!
Accelerator
▪ A fast parallel datapath!
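The ‘a+b+c+d+e+f+g+h’ example can be made concrete by counting dependent addition steps. The Python sketch below is a software model only, under the assumption that the accelerator is an adder tree whose levels each complete in one step; the function names and operand values are hypothetical.

```python
def sequential_sum(values):
    """One adder reused in sequence: n-1 dependent addition steps."""
    total, steps = values[0], 0
    for v in values[1:]:
        total += v
        steps += 1
    return total, steps


def tree_sum(values):
    """Adder tree: each level adds pairs in parallel, so steps = number of levels."""
    level, steps = list(values), 0
    while len(level) > 1:
        # All pairwise additions in one level are independent of each other.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through to the next level
        level = nxt
        steps += 1
    return level[0], steps


operands = [1, 2, 3, 4, 5, 6, 7, 8]  # stand-ins for a..h
seq_total, seq_steps = sequential_sum(operands)
par_total, par_steps = tree_sum(operands)
print(seq_total, seq_steps)  # 36 7
print(par_total, par_steps)  # 36 3
```

Both versions perform seven additions in total; the parallel datapath spends more adders to shorten the chain of dependent steps from seven to three.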

Google Tensor Processing Unit (TPU)
▪ Neural Network Accelerator
▪ A TPU contains massive arrays of processing elements for efficient matrix operations
https://storage.googleapis.com/nexttpu/index.html

Tesla self-driving computer (2019)
▪ Machine Learning accelerator
https://youtu.be/Ucp0TTmvqOE?t=4236

Relevant Resources
▪ Digital Design and Computer Architecture by Onur Mutlu
  ▪ https://safari.ethz.ch/digitaltechnik/spring2023/doku.php?id=schedule
▪ Computer Architecture by Onur Mutlu
  ▪ https://safari.ethz.ch/architecture/fall2023/doku.php?id=schedule
▪ Introduction to Computer Architecture by David Black-Schaffer
  ▪ https://www.youtube.com/channel/UC0j4jTCkhMLmGwriVbbBtSw/videos
  ▪ https://www.youtube.com/channel/UCzf_XjIoKSf4Ve2fH7xn-3A/playlists
▪ RVFPGA Program
  ▪ Learning RISC-V architecture using FPGA!
  ▪ https://university.imgtec.com/rvfpga/
▪ Intel FPGA Academic Program
  ▪ https://www.intel.com/content/www/us/en/developer/topic-technology/fpgaacademic/materials.html

Relevant Reading
▪ Computer Organization and Design, RISC-V Edition, 2nd Edition, by Patterson and Hennessy
  ▪ Chapter 1