ADMIN • Reading – Chapter 6 – Including RAID (6.9) but don’t stress memorizing the names of the levels – Can skip 6.10 and 6.11 IC220 Set #18: Storage and I/O (Chapter 6) 1 Big Picture 2 I/O Interrupts • Processor Important but neglected “The difficulties in assessing and designing I/O systems have often relegated I/O to second class status” Cache “courses in every aspect of computing, from programming to computer architecture often ignore I/O or give it scanty coverage” “textbooks leave the subject to near the end, making it easier for students and instructors to skip it!” Memory- I/O bus • Main memory I/O controller I/O controller I/O controller Graphics output Network GUILTY! — we won’t be looking at I/O in much detail — be sure and read Chapter 6 carefully Disk Disk — Later – IC322: Computer Networks 3 4 Outline (A) I/O Overview • A. B. C. D. E. Overview Physically connecting I/O devices to Processors and Memory (8.4) Interfacing I/O devices to Processors and Memory (8.5) Performance Measures (8.6) Disk details/RAID (8.2) Can characterize devices based on: 1. behavior 2. partner (who is at the other end?) 3. data rate • • Performance factors: — access latency — throughput — connection between devices and the system — the memory hierarchy — the operating system Other issues: – Expandability, dependability 5 (B) Connecting the Processor, Memory, and other Devices Two general strategies: 1. Bus: ____________ communication link Advantages: CPU Mem 6 Typical x86 PC I/O System Disk Disadvantages: 2. Point to Point Network: ____________ links Use switches to enable multiple connections Advantages: CPU Mem Disk Disadvantages: 7 8 (B) Bus Basics – Part 1 • • I/O Bus Examples Types of buses: – Processor-memory • Short, high speed, fixed device types • custom design – I/O • lengthy, different devices • Standards-based e.g., USB, Firewire • Connect to proc-memory bus rather than directly to processor Only one pair of devices (sender & receiver) may use bus at a time Firewire USB 2.0 PCI Express Serial ATA Serial Attached SCSI Intended use External External Internal Internal External Devices per channel 63 127 1 1 4 Data width 4 2 2/lane 4 4 Peak bandwidth 50MB/s or 100MB/s 0.2MB/s, 1.5MB/s, or 60MB/s 250MB/s/lane 1×, 2×, 4×, 8×, 16×, 32× 300MB/s 300MB/s Hot pluggable Yes Yes Depends Yes Yes Max length 4.5m 5m 0.5m 1m 8m Standard IEEE 1394 USB Implementers Forum PCI-SIG SATA-IO INCITS TC T10 – Bus _______________ decides who gets the bus next based on some ______________ strategy – May incorporate priority, round-robin aspects • Have two types of signals: – “Data” – data or address – Control 9 (B) Bus Basics – Part 2 • 10 Handshaking example – CPU read from memory Clocking scheme: 1. ____________________ Use a clock, signals change only on clock edge + Fast and small - All devices must operate at same rate - Requires bus to be short (due to clock skew) ReadReq 1 3 Data 2 2 4 6 4 Ack 2. ____________________ No clock, instead use “handshaking” + Longer buses possible + Accommodate wide range of device - more complex control 5 DataRdy 11 1. 2. 3. 4. 5. 6. 7. CPU requests read Memory acknowledges, CPU deasserts request Memory sees change, deasserts Ack Memory provides data, asserts DataRdy CPU grabs data, asserts Ack Memory sees Ack, deasserts DataRdy CPU sees change, deasserts Ack 7 12 (C) Processor-to-device Communication (C) Device-to-processor communication How does device get data to the processor? 1. CPU periodically checks to see if device is ready: _________________ • CPU sends request, keep checking if done • Or just checks for new info (mouse, network) How does CPU send information to a device? 1. Special I/O instructions x86: inb / outb How to control access to I/O device? 2. Device forces action by the processor when needed: _________________ • Like an unscheduled procedure call • Same as “exception” mechanism that handles TLB misses, divide by zero, etc. 2. Use normal load/instructions to special addresses Called ______________________ Load/store put onto bus Memory ignores them (outside its range) Address may encode both device ID and a command 3. DMA: • Device sends data directly to memory w/o CPU’s involvement • Interrupts CPU when transfer is complete How to control access to I/O device? 13 DMA Issues What could go wrong? 14 (D) I/O’s impact on performance Interrupts Processor Cache • Total time = CPU time + I/O time • Suppose our program is 90% CPU time, 10% I/O. If we improve CPU performance by 10x, but leave I/O unchanged, what will the new performance be? • • Old time = 100 seconds New time = Memory- I/O bus Main memory I/O controller Disk Disk I/O controller I/O controller Graphics output Network 15 16 (D) Measuring I/O Performance • • • Latency? Throughput? Throughput with maximum latency? • Transaction processing benchmarks – TPC-C – TPC-H – TPC-W (E) Disk Drives Platters Tracks • Platter Sectors Track File system / Web benchmarks – “Make” benchmark – SPECSFS – SPECWeb – SPECPower • To access data: — seek: position head over the proper track (3 to 14 ms. avg.) — rotational latency: wait for desired sector (.5 / RPM) — transfer: grab the data (one or more sectors) 30 to 80 MB/sec 17 (E) RAID • _______________ ________________ _______________ ___________ • • • Idea: lots of cheap, smaller disks Small size and cost makes easier to add redundancy Multiple disks increases read/write bandwidth 18 RAID RAID 0 – “striping”, no redundancy RAID 1 – mirrored 19 20 RAID RAID RAID 4 – Block-interleaved parity RAID 10 – Striped mirrors RAID 5 – Distributed Block-interleaved Parity Key point – still need to do other backups (e.g. to tape) – Provides protection from limited number of disk failures – No protection from human failures! • 21 Flash Storage • • 22 Fallacies and Pitfalls Nonvolatile semiconductor storage – 100× – 1000× faster than disk – Smaller, lower power, more robust – But more $/GB (between disk and DRAM) Flash bits wears out after 1000’s of writes – Not suitable for direct RAM or disk replacement – Wear leveling: remap data to less used blocks – Result: “solid-state hard drive” 23 • Fallacy: the rated mean time to failure of disks is 1,200,000 hours, so disks practically never fail. • Fallacy: magnetic disk storage is on its last legs, will be replaced. • Fallacy: A GB/sec bus can transfer 1 GB of data in 1 second. • Pitfall: Moving functions from the CPU to the I/O processor, expecting to improve performance without analysis. 24