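Chapter 4: I/O Organization

Agenda
  - Single bus architecture
  - I/O bus
    - Address lines
    - Data lines
    - Control lines
  - I/O addressing
    - Memory-mapped I/O
    - Isolated I/O

[Figure 4.1: A single-bus structure. Processor, memory, and I/O devices 1 to n share one bus.]

System resources
  - Every I/O device uses some system resources
  - IRQ: hardware interrupt
  - DMA channels: for direct memory access
  - I/O ports (I/O addresses)
    - Low memory area
    - For communication between devices and the CPU
  - I/O memory
    - Upper memory area (384K)
    - For device drivers mapped from the BIOS
    - For temporary data storage

I/O bus
  - Address lines: carry the address
  - Data lines: carry the data that the device or the CPU puts on the bus
  - Control lines: read/write, and signaling when data is ready

I/O addressing
  - Memory-mapped I/O
    - I/O devices share the same address space as memory
  - Isolated I/O
    - Special I/O instructions: the 8086 uses IN and OUT to read from and write to I/O devices
    - An output pin of the CPU indicates whether an address refers to memory or to a device

[Figure 4.2: I/O interface for an input device. The bus (address, data, and control lines) connects to an I/O interface made up of an address decoder, control circuits, and data and status registers, which in turn connects to the input device.]

[Figure 4.3: Registers in the keyboard and display interfaces. DATAIN; DATAOUT; STATUS with flags DIRQ, KIRQ, SOUT, SIN (bits 3 to 0); CONTROL with flags DEN, KEN.]

Input/output techniques
  - Program-controlled I/O
  - Interrupt-driven I/O
  - DMA (Direct Memory Access)

Program-controlled I/O

          Move      #LINE,R0       Initialize memory pointer.
  WAITK   TestBit   #0,STATUS      Test SIN flag.
          Branch=0  WAITK          Wait for a character to be entered.
          Move      DATAIN,R1      Read character.
  WAITD   TestBit   #1,STATUS      Test SOUT flag.
          Branch=0  WAITD          Wait for the display to become ready.
          Move      R1,DATAOUT     Send character to the display.
          Move      R1,(R0)+       Store character and advance pointer.
          Compare   #$0D,R1        Check whether it is Carriage Return.
          Branch≠0  WAITK          If not, read another character.
          Move      #$0A,DATAOUT   Otherwise, send Line Feed.
          Call      Process        Call a subroutine to process the input line.

  Figure 4.4: A program that reads one line from the keyboard, stores it in a memory buffer, and echoes it back to the display.

Interrupt-driven I/O
  - Refer to Page 4-22, Figure 4.9

[Figure 4.5: Transfer of control when an interrupt occurs. Program 1 is the COMPUTE routine and program 2 is the PRINT routine; the interrupt occurs at instruction i, and COMPUTE resumes at instruction i+1.]

Avoid infinite interrupt
  - Mask interrupts
  - Ignore further interrupts until the first instruction of the interrupt service routine has been entered
  - Then disable interrupts in the first instruction of the interrupt service routine
  - Enable interrupts in the last instruction of the interrupt service routine

Multiple device interrupts
  - Prioritize the devices
  - E.g. IRQ0 > IRQ1 > IRQ2 > ...

Interrupt vectors

[Figure 4.7: Implementation of interrupt priority using individual interrupt-request and acknowledge lines. Devices 1 to p each have their own INTR/INTA pair (INTR1/INTA1 to INTRp/INTAp) connected to a priority arbitration circuit in the processor.]

[Figure 4.8: Interrupt priority schemes. (a) Daisy chain: devices 1 to n share INTR, and INTA is passed from device to device. (b) Arrangement of priority groups: several daisy chains, each with its own INTR/INTA pair (INTR1/INTA1 to INTRp/INTAp), feed a priority arbitration circuit.]

Software interrupt
  - System call from applications
  - A is interrupted by the scheduler (via the clock)
    - Context switching
    - Scheduler selects B
  - If A requests I/O (e.g. keyboard input) before its time slice is due, A issues a system call to the OS, and the OS finally calls the I/O routine

Figure 4.10: A few operating system routines

  (a) OS initialization, services, and scheduler

    OSINIT       Set interrupt vectors:
                   Time-slice clock: SCHEDULER
                   Software interrupt: OSSERVICES
                   Keyboard interrupt: IODATA
                   …
    OSSERVICES   Examine the stack to determine the requested operation.
                 Call the appropriate routine.
    SCHEDULER    Save program state.
                 Select a runnable process.
                 Restore the saved context of the new process.
                 Push new values for PS and PC onto the stack.
                 Return from interrupt.

  (b) I/O routines

    IOINIT       Set process status to Blocked.
                 Initialize memory buffer address pointer and counter.
                 Call the device driver to initialize the device and enable interrupts in the device interface.
                 Return from subroutine.
    IODATA       Poll devices to determine the source of the interrupt.
                 Call the appropriate driver.
                 If END = 1, then set process status to Runnable.
                 Return from interrupt.

  (c) Keyboard driver

    KBDINIT      Enable interrupts.
                 Return from subroutine.
    KBDDATA      Check device status.
                 If ready, then transfer character.
                 If character = CR, then { set END = 1; disable interrupts } else set END = 0.
                 Return from subroutine.

Direct Memory Access (DMA)
  - The CPU is not involved in the data transfer except at the beginning and the end
  - The DMAC (DMA controller) is in charge of the data transfer
  - The CPU tells the DMAC
    - The starting address of the words to be transferred
    - The word count
    - The direction (read or write)
  - While the DMA transfer is in progress, the requesting application (AP) is blocked by the OS, and the CPU can execute other programs
  - When the DMAC finishes, it interrupts the CPU; the OS unblocks the AP so that the scheduler can schedule it again

[Figure 4.18: Registers in a DMA interface. A 32-bit status and control register (flags IRQ, IE, Done, R/W at bit positions 31, 30, 1, 0), a starting address register, and a word count register.]

[Figure 4.19: Use of DMA controllers in a computer system. Processor and main memory on the system bus; a disk/DMA controller with two disks; a second DMA controller serving a printer and a keyboard; a network interface.]

CPU involved

DMA mode
  - Burst mode DMA: the DMA machine simply takes over control of the bus, makes the data transfer at top speed, and hands control back to the CPU. This is fast, but the CPU is stopped dead for the duration of the transfer.
  - SUMMARY:
    - Take control of the bus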
    - Stop the CPU
    - Send all the data to memory
    - Restore bus control to the CPU

DMA mode
  - Cycle stealing
    - In the course of a normal program the CPU spends a lot of time executing internal CPU instructions (e.g. add ax,bx; inc bx; jnz fred)
    - During the execute phase of these instructions the CPU does not need control of the bus or access to memory
    - In cycle stealing mode the DMA machine can "steal" bus cycles for its own data transfer, stopping the CPU. It can also use the bus cycles that the CPU does not need.
    - Cycle stealing slows down both the CPU and the DMA transfer, but is more efficient overall
    - It is not an interrupt: the CPU does not switch context
    - The CPU is suspended just before it accesses the bus, i.e. before an operand or data fetch or a data write
  - SUMMARY:
    - Take every second cycle off the CPU to use the bus
    - Also take any cycle where the CPU is not using the bus (also known as hidden DMA)

DMA channels

  DMA | Bus Line?   | Typical Default Use        | Other Common Uses
  ----+-------------+----------------------------+------------------
  0   | no          | Memory refresh             | None
  1   | 8/16-bit    | Sound card (low DMA)       | SCSI host adapters, ECP parallel ports, tape accelerator cards, network cards, voice modems
  2   | 8/16-bit    | Floppy disk controller     | Tape accelerator cards
  3   | 8/16-bit    | None                       | ECP parallel ports, SCSI host adapters, tape accelerator cards, sound card (low DMA), network cards, voice modems, hard disk controller on old PC/XT
  4   | no          | None; cascade for DMAs 0-3 | None
  5   | 16-bit only | Sound card (high DMA)      | SCSI host adapters, network cards
  6   | 16-bit only | None                       | Sound cards (high DMA), network cards
  7   | 16-bit only | None                       | Sound cards (high DMA), network cards

Bus arbiter
  - Who becomes bus master?
  - Centralized
  - Distributed

[Figure 4.20: A simple arrangement for bus arbitration using a daisy chain. The processor, DMA controller 1, and DMA controller 2 share the BBSY and BR lines; the grant is passed along as BG1 and BG2.]

[Figure 4.21: Sequence of signals (BR, BG1, BG2, BBSY) during transfer of bus mastership for the devices in Figure 4.20. Bus master over time: processor, DMA controller 2, processor. This example shows the timing as DMAC2 becomes bus master.]

[Figure 4.22 labels: Vcc, ARB3, ARB2, ARB1, ARB0, Start-Arbitration, O.C. (open-collector) drivers]
[Figure 4.22 labels, continued: bit values 0 1 0 1 and 0 1 1 1; interface circuit for device A]
Figure 4.22: A distributed arbitration scheme

Distributed arbitration: SCSI as an example

[Figure 4.42: Arbitration and selection on the SCSI bus. Signals DB2, DB5, DB6, BSY, SEL; phases Free, Arbitration, Selection; the targets examine the ID. Device 6 wins arbitration and selects device 5.]

Bus timing
  - Synchronous bus
    - Includes a clock in the control lines
    - Uses a fixed communication protocol that is relative to the clock
    - Advantage: involves very little logic and can run very fast
    - Disadvantages:
      - Every device on the bus must run at the same clock rate
      - To avoid clock skew, the bus cannot be long if it is fast
  - Asynchronous bus
    - It is not clocked (it uses master-ready and slave-ready signals instead)
    - It can accommodate a wide range of devices
    - It can be lengthened without worrying about clock skew
    - It requires a handshaking protocol

Synchronous bus (ideal)

[Figure 4.23: Timing of an input transfer on a synchronous bus. Signals: bus clock, address and command, data. One bus cycle spans t0, t1, t2.]

Synchronous bus (consider delay)

[Figure 4.24: Detailed timing diagram for the input transfer of Figure 4.23. Address/command and data are shown as seen by the master (tAM, tDM) and as seen by the slave (tAS, tDS), against the bus clock at t0, t1, t2.]

Asynchronous bus

[Figure 4.26: Handshake control of data transfer during an input operation. Signals: address and command, master-ready, slave-ready, data. One bus cycle spans t0 to t5.]

Asynchronous bus

[Figure 4.27: Handshake control of data transfer during an output operation. Signals: address and command, data, master-ready, slave-ready. One bus cycle spans t0 to t5.]

Parallel port vs. serial port
  - Serial port
    - Information is transferred in or out one bit at a time
    - RS232
    - USB
    - FireWire (IEEE 1394)
    - Serial ATA (disk)
  - Parallel port
    - Data is transferred in or out in parallel
    - Printer port
    - Used by the microprocessor to communicate with peripherals
      - PCI
      - SCSI
      - Parallel ATA (disk)

Types of buses
  - Processor-memory bus (design specific)
    - Short and high speed
    - Only needs to match the memory system
      - Maximizes memory-to-processor bandwidth
    - Connects directly to the processor
    - Optimized for cache block transfers
  - I/O bus (industry standard)
    - Usually longer and slower
    - Needs to match a wide range of I/O devices
    - Connects to the processor-memory bus or the backplane bus
  - Backplane bus (standard or proprietary)
    - Backplane: an interconnection structure within the chassis
    - Allows processors, memory, and I/O devices to coexist
    - Cost advantage: one bus for all components

Standard I/O interfaces

[Figure 4.38 labels: processor, main memory]
[Figure 4.38 labels, continued: processor bus, bridge, PCI bus, additional memory, SCSI controller, USB controller, Ethernet interface, ISA interface, SCSI bus, IDE disk, video, disk controller with disk 1 and disk 2, CD-ROM controller with CD-ROM drive, keyboard, game]
Figure 4.38: An example of a computer system using different interface standards

A computer system with one bus: the backplane bus

[Diagram: processor, memory, and I/O devices all attached to the backplane bus]

  - A single bus (the backplane bus) is used for:
    - Processor-to-memory communication
    - Communication between I/O devices and memory
  - Advantages: simple and low cost
  - Disadvantages: slow, and the bus can become a major bottleneck
  - Example: IBM PC-AT

Buses in PC-XT and PC-AT
  - ISA (Industry Standard Architecture)
    - IBM PC and PC-XT: 8 bits at 4.77 MHz, directly connected to the 8088, 2-stage bus cycle (2.38 MB/s bus bandwidth)
    - AT bus: extension slot + 8-bit ISA; 16 bits at 8.33 MHz for the 80286

[Diagram labels: CPU, bus buffer, ISA bus, timer, BIOS, interrupt controller, DRAM controller, DMA controller, DRAM, expansion slots]

A two-bus system

[Diagram: processor and memory on the processor-memory bus; three bus adaptors, each connecting the processor-memory bus to an I/O bus]

  - I/O buses tap into the processor-memory bus via bus adaptors:
    - Processor-memory bus: mainly for processor-memory traffic
    - I/O buses: provide expansion slots for I/O devices
  - Apple Macintosh II
    - NuBus: processor, memory, and a few selected I/O devices
    - SCSI bus: the rest of the I/O devices

Buses in PC (486)
  - 16-bit ISA cannot keep up with Windows applications (video data)
  - VESA LB (local bus): linked to the 486 local bus, 33 MHz, DRAM 32 bits

[Diagram labels: 486 CPU, local bus, L2 cache, ISA bridge, ISA bus, bus buffer, video card, LAN adapter, HDD controller]
[Diagram labels, continued: expansion slots]

A three-bus system

[Diagram: processor and memory on the processor-memory bus; a bus adaptor to the backplane bus; two further bus adaptors connecting the backplane bus to I/O buses]

  - A small number of backplane buses tap into the processor-memory bus
    - The processor-memory bus is used only for processor-memory traffic
    - I/O buses are connected to the backplane bus
  - Advantage: loading on the processor bus is greatly reduced

Buses in PC (Pentium)
  - A three-bus system

[Diagram labels: Pentium CPU, host bus, PCI chipset, PCI bridge, L2 cache, data bus, memory, PCI bus, video card, monitor, LAN adapter, SCSI adapter, IDE controller, HDD, CD-ROM, PCI/ISA bridge, ISA bus, SuperIO chip, modem, sound card]

[Diagram labels: processor/memory bus, PCI bus, I/O buses]

PCI bus
  - Introduced by Intel in 1992
  - Supports PnP (plug and play), but did not become popular until the release of Windows 95
  - Supports memory, I/O, and configuration addressing; the configuration address space is used for PnP
  - PCI is a 64-bit bus, though it is usually implemented as a 32-bit bus. It can run at clock speeds of 33 or 66 MHz. At 32 bits and 33 MHz, it yields a throughput of 133 MB/s.
  - PCI-Express
    - 3GIO, 2.5 Gb/s
    - 1394b, USB 2.0, InfiniBand, and Gigabit Ethernet
  - PCI-X (PCI extended)
    - 133 MB/s to as much as 1 GB/s

Requirements for plug and play
  - System hardware
    - The system chipset and system bus controllers must be capable of handling PnP devices.
  - Peripheral hardware
    - These devices must be PnP-aware so that they can identify themselves when requested and accept resource assignments from the system when they are made.
  - The system BIOS
    - Routines built into the BIOS perform the actual work of collecting information about the different devices and determining what should use which resources. The BIOS also communicates this information to the operating system, which uses it to configure its drivers and other software to make the devices work correctly.
  - The operating system
    - Finally, the operating system must be designed to work with the BIOS (and thus, indirectly, with the hardware as well). An operating system with this support is Windows 95.
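The bandwidth figures quoted on the bus slides (2.38 MB/s for the PC-XT bus, 133 MB/s for 32-bit PCI) can be checked with a back-of-envelope calculation. The sketch below is only an illustration: the function name is hypothetical, the "2-stage bus cycle" is read as two clock cycles per transfer, and the PCI clock is taken as the nominal 33.33 MHz rather than exactly 33 MHz. Both readings are assumptions, not statements from the slides.

```python
def peak_bandwidth_mb_s(width_bits, clock_mhz, cycles_per_transfer=1):
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes).

    Assumes one transfer of `width_bits` every `cycles_per_transfer`
    clock cycles, with no arbitration or wait-state overhead.
    """
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz / cycles_per_transfer


if __name__ == "__main__":
    # PC-XT bus: 8 bits at 4.77 MHz, assumed two clock cycles per transfer
    print(f"PC-XT bus:     {peak_bandwidth_mb_s(8, 4.77, 2):.3f} MB/s")
    # 32-bit PCI; 33.33 MHz is the assumed nominal clock behind "33 MHz"
    print(f"PCI 32-bit/33: {peak_bandwidth_mb_s(32, 33.33):.1f} MB/s")
    # 64-bit PCI at the assumed nominal 66.66 MHz clock
    print(f"PCI 64-bit/66: {peak_bandwidth_mb_s(64, 66.66):.1f} MB/s")
```

These are peak rates only; sustained throughput on a shared bus is lower once arbitration and device wait states are taken into account.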
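SCSI transfer modes

  Signaling column: method: maximum devices per bus / maximum cable length (m)

  Transfer Mode           | Defining Standard | Bus Width (bits) | Bus Speed (MHz) | Throughput (MB/s) | Cabling | Signaling
  ------------------------+-------------------+------------------+-----------------+-------------------+---------+----------
  "Regular" SCSI (SCSI-1) | SCSI-1            | 8                | 5               | 5                 | 50-pin  | SE: 8 / 6; HVD: 8 / 25
  Wide SCSI               | SCSI-2            | 16               | 5               | 10                | 68-pin  | SE: 16 / 6; HVD: 16 / 25
  Fast SCSI               | SCSI-2            | 8                | 10              | 10                | 50-pin  | SE: 8 / 3; HVD: 8 / 25
  Fast Wide SCSI          | SCSI-2            | 16               | 10              | 20                | 68-pin  | SE: 16 / 3; HVD: 16 / 25
  Ultra SCSI              | SCSI-3 / SPI      | 8                | 20              | 20                | 50-pin  | SE: 8 / 1.5 or 4 / 3; HVD: 8 / 25
  Wide Ultra SCSI         | SCSI-3 / SPI      | 16               | 20              | 40                | 68-pin  | SE: 8 / 1.5 or 4 / 3; HVD: 16 / 25

SCSI transfer modes (continued)

  Transfer Mode      | Defining Standard | Bus Width (bits) | Bus Speed (MHz) | Throughput (MB/s) | Special Features                                   | Cabling | Signaling
  -------------------+-------------------+------------------+-----------------+-------------------+----------------------------------------------------+---------+----------
  Ultra2 SCSI        | SCSI-3 / SPI-2    | 8                | 40              | 40                |                                                    | 50-pin  | LVD: 8 / 12 or 2 / 25; HVD: 8 / 25
  Wide Ultra2 SCSI   | SCSI-3 / SPI-2    | 16               | 40              | 80                |                                                    | 68-pin  | LVD: 16 / 12 or 2 / 25; HVD: 16 / 25
  Ultra3 SCSI        | SCSI-3 / SPI-3    | 16               | 40 (DT)         | 160               | At least one of Fast-80, CRC, DV, QAS, Packet      | 68-pin  | LVD: 16 / 12 or 2 / 25
  Ultra160(/m) SCSI  | SCSI-3 / SPI-3    | 16               | 40 (DT)         | 160               | Fast-80, CRC, DV                                   | 68-pin  | LVD: 16 / 12 or 2 / 25
  Ultra160+ SCSI     | SCSI-3 / SPI-3    | 16               | 40 (DT)         | 160               | Fast-80, CRC, DV, QAS, Packet                      | 68-pin  | LVD: 16 / 12 or 2 / 25
  Ultra320 SCSI      | SCSI-3 / SPI-4    | 16               | 80 (DT)         | 320               | Fast-160, ?                                        | 68-pin  | LVD: 16 / 12 or 2 / 25

USB
  - Data rate
    - USB 1.1: low speed 1.5 Mb/s, full speed 12 Mb/s
    - USB 2.0: high speed 480 Mb/s
  - 127 devices (a 7-bit field identifies the attached device)
  - PnP
  - It is expected to completely replace serial and parallel ports

USB tree architecture

[Figure 4.43: Universal Serial Bus tree structure. The host computer contains the root hub; hubs connect to further hubs and to I/O devices at the leaves.]

Supporting links of different speeds

[Figure 4.44: Split bus operation. The host computer's root hub connects at high speed (HS) to hub A and hub B; the hubs connect to device C and device D over high-speed (HS) and full/low-speed (F/LS) links.]

[Figure 4.45: USB packet formats.
  (a) Packet identifier field: PID0 PID1 PID2 PID3, followed by the same four bits complemented.
  (b) Token packet, IN or OUT: PID (8 bits), ADDR (7 bits), ENDP (4 bits), CRC16 (5 bits).
  (c) Data packet: PID (8 bits), DATA (0 to 8192 bits), CRC16 (16 bits).]

[Figure 4.46: An output transfer. Over time, between host and hub: Token, Data0, ACK; between hub and I/O device: Token, Data0, ACK; then the same pair of exchanges again with Data1.]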
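The packet identifier layout in Figure 4.45(a), four PID bits followed by their complement, gives the receiver a cheap consistency check. The sketch below is a minimal illustration of that idea and of the field widths in Figure 4.45(b). The function names, the choice to place the PID bits in the low nibble of a byte, and the example PID value are illustrative assumptions, not taken from the slides.

```python
# Field widths of the token packet as listed in Figure 4.45(b).
TOKEN_FIELD_BITS = {"PID": 8, "ADDR": 7, "ENDP": 4, "CRC": 5}

# A 7-bit address field gives 2**7 values; the slides quote 127 devices,
# i.e. one address value is not available for an attached device.
MAX_DEVICES = 2 ** TOKEN_FIELD_BITS["ADDR"] - 1


def encode_pid(pid4):
    """Build the 8-bit PID field from a 4-bit packet identifier.

    Assumed layout for this sketch: PID bits in the low nibble,
    their bitwise complement in the high nibble.
    """
    if not 0 <= pid4 <= 0xF:
        raise ValueError("PID must fit in 4 bits")
    return ((~pid4 & 0xF) << 4) | pid4


def pid_is_valid(pid_byte):
    """Check that the high nibble is the complement of the low nibble."""
    low = pid_byte & 0xF
    high = (pid_byte >> 4) & 0xF
    return low == (~high & 0xF)


if __name__ == "__main__":
    field = encode_pid(0b0001)  # hypothetical PID value for illustration
    print(f"PID field: {field:#04x}, valid: {pid_is_valid(field)}")
    # Flip one bit to mimic a transmission error; the check should fail.
    print(f"corrupted valid: {pid_is_valid(field ^ 0x01)}")
    print(f"token packet length: {sum(TOKEN_FIELD_BITS.values())} bits")
    print(f"addressable devices: {MAX_DEVICES}")
```

Any single-bit error in the PID field breaks the complement relationship, so the receiver can reject the packet before it interprets the rest of the fields.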