10 Local Bus 10.1 Introduction The main problem with the PC, ISA and EISA busses is that the transfer rate is normally much slower than the system clock. This is wasteful in processor time and generally reduces system performance. For example, if the system clock is running at 50 MHz and the EISA interface operates at 8 MHz then for 84% of the data transfer time the processor is doing nothing. This chapter discusses the main local busses and the next discusses motherboard design. 10.2 VESA VL-local bus CL340 XYZ11 CL340 CS220 MS1432 An improvement to the ISA interface is to transfer data at the speed of the system clock. For this reason the Video Electronics Standards Association (VESA) created the VLlocal bus to create fast processor-to-video card transfers. It uses a standard ISA connector with an extra connection to tap into the system bus (Figure 10.1). Memory, graphics and disk transfers are the heaviest for data transfer rates, whereas applications such as modems, Ethernet and sound cards do not require fast transfer rates. The VL-local bus addresses this by allowing the processor, memory, graphics and disk controller access to a 33 MHz/32-bit local bus. Other applications still use the normal ISA bus, as shown in Figure 10.2. The graphics adapter and disk controller connect to the local bus whereas other slower peripherals connect to the slower 8 MHz/16-bit ISA bus. A maximum of three devices can connect to the local bus (normally graphics and disk controllers). Note that the speed of the data transfer is dependent on the clock rate of the system and that the maximum clock speed for the VL-local bus is 33 MHz. CL340 XYZ11 Local bus extension Figure 10.1 VL-local bus interface card. 114 Local bus Processor Memory Graphics 115 Disk controller VL-local bus (33MHz/32-bit) System I/O ISA bus (8MHz/16-bit) FAX/ modem Ethernet Sound card Parallel port Figure 10.2 VESA VL-local bus architecture. Table 10.1 lists the pin connections for the 32-bit VL-local bus and it shows that, in addition to the standard ISA connector, there are two sides of connections, the A and the B side. Each side has 58 connections giving 116 connections. It has a full 32-bit data and address bus. The 32 data lines are labeled DAT00–DAT31 and 32 address lines are labeled from ADR00 to ADR31. Note that while the data and address lines are contained within the extra VL-local bus extension, some of the standard ISA lines are used, such as the IRQ lines. 10.3 PCI bus Intel have developed a new standard interface, named the PCI (Peripheral Component Interconnection) local bus, for the Pentium processor. This technology allows fast memory, disk and video access. A standard set of interface ICs known as the 82430 PCI chipset is available to interface to the bus. As with the VL-local bus, the PCI bus transfers data using the system clock, but has the advantage over the VL-local Bus in that it can operate over a 32-bit or 64-bit data path. The high transfer rates used in PCI architecture machines limit the number of PCI bus interfaces to two or three (normally the graphics adapter and hard disk controller). If data is transferred at 64 bits (8 bytes) at a rate of 33 MHz then the maximum transfer rate is 264 MB/sec. Figure 10.3 shows PCI architecture. Notice that an I/O bridge gives access to ISA, EISA or MCA cards. Unfortunately, to accommodate for the high data rates and for a reduction in the size of the interface card, the PCI connector is not compatible with PC, ISA or EISA. The maximum data rate of the PCI bus is 264 MB/sec, which can only be achievable using 64-bit software on a Pentium-based system. On a system based on the 80486 processor this maximum data will only be 132 M B/sec (that is, using a 32-bit data bus). The PCI local bus is a radical re-design of the PC bus technology and is logically different from the ISA and VL-local bus. Table 10.2 lists the pin connections for the 32bit PCI local bus and it shows that there are two lines of connections, the A and the B side. Each side has 64 connections giving 128 connections. A 64-bit, 2×94-pin connector version is also available. The PCI bus runs at the speed of the motherboard which for the Pentium processor is typically 33 MHz or 50 MHz (as compared to the VL-local bus which gives a maximum transfer rate of 33 MHz). 116 PC Interfacing, Communications and Windows Programming Table 10.1 32-bit VESA VL-local bus connections. Pin 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 Side A D0 D2 D4 D6 D8 GND D10 D12 VCC D14 D16 D18 D20 GND D22 D24 D26 D28 D30 VCC A31 GND A29 A27 A25 A23 A21 A19 GND Side B D1 D3 GND D5 D7 D9 D11 D13 D15 GND D17 VCC D19 D21 D23 D25 GND D27 D29 D31 A30 A28 A26 GND A24 A22 VCC A20 A18 Processor Pin 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 Side A A17 A15 VCC A13 A11 A9 A7 A5 GND A3 A2 NC Side B A16 A14 A12 A10 A8 GND A6 A4 WBACK BE0 VCC BE1 BE2 RESET D/C GND M / IO BE3 W/R ADS KEY KEY KEY KEY RDYRTN LRDY GND IRQ9 LDEV LREQ BRDY BLAST GND IDO ID1 GND LCLK VCC VCC ID2 ID3 ID4 NC LBS16 LEADS LGNT Memory PCI bridge PCI bus (33MHz/32/64-bit) Graphics Disk controller ISA bridge Other ISA bus (8MHz/16-bit) Figure 10.3 PCI bus architecture. Local bus 117 Table 10.2 32-bit PCI local bus connections. Pin 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 Side A –12V TCK GND TDO +5V +5V INTB INTD PRSNT1 Reserved PRSNT2 GND GND Reserved GND CLK GND REQ +5V(I/O) AD31 AD29 GND AD27 AD25 +3.3V C / BE3 AD23 GND AD21 AD19 +3.3V Side B TRST +12V TMS TDI +5V INTA INTC +5V Reserved +5V(I/O) Reserved GND GND Reserved RST +5V(I/O) GNT GND Reserved AD30 +3.3V AD28 AD26 GND AD24 IDSEL +3.3V FRAME AD20 GND TRDY Pin 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 Side A AD17 C / BE2 Side B AD16 +3.3V GND FRAME IRDY GND +3.3V TRDY DEVSEL GND GND STOP LOCK PERR +3.3V SDONE +3.3V SBO SERR GND PAR AD15 +3.3V AD13 AD11 GND AD09 KEY KEY +3.3V C / BE1 AD14 GND AD12 AD10 GND KEY KEY AD08 AD07 +3.3V AD05 AD03 GND AD01 +5V(I/O) C / BE0 +3.3V AD06 AD04 GND AD02 AD00 +5V(I/O) ACK64 REQ64 +5V +5V +5V +5V 10.3.1 PCI operation The PCI bus cleverly saves lines by multiplexing the address and data lines. It two modes (Figure 10.4): • Multiplexed mode. The address and data lines are used alternately. First the address is sent, followed by a data read or write. Unfortunately, this requires two or three clock cycles for a single transfer (either an address followed by a read or write cycle, or an address followed by read and write cycle). This causes a maximum data write transfer rate of 66 MB/s (address then write) and a read transfer rate of 44 MB/s (address, write then read), for a 32-bit data bus width. • Burst mode. The multiplexed mode obviously slows down the maximum transfer rate. Additionally, it can be operated in burst mode, where a single address can be initially sent, followed by implicitly addressed data. Thus, if a large amount of sequentially addressed memory is transferred then the data rate approach the maximum transfer of 133 MB/s for a 32-bit data bus and 266 MB/s for a 64-bit data bus. 118 PC Interfacing, Communications and Windows Programming Data3 Address3 Data1 Address1 Data2 Address2 Data1 Address Address1 Data1 Address2 Data2 PCI bus (normal mode) Address3 Data Address Data1 Data2 Data3 Data4 PCI bus (burst mode) Data2 Data3 Data4 Data5 Data5 Figure 10.4 PCI bus transfer modes. If the data from the processor is sequentially address data then PCI bridge buffers the incoming data and then releases it to the PCI bus in burst mode. The PCI bridge may also use burst mode when there are gaps in the addressed data and use a handshaking line to identify that no data is transferred for the implied address. For example in Figure 10.4 the burst mode could involve Address+1, Address+2 and Address+3 and Address+5, then the byte enable signal can be made inactive for the fourth data transfer cycle. To accommodate the burst mode, the PCI bridge has a prefetch and posting buffer on both the host bus and the PCI bus sides. This allows the bridge to build the data access up into burst accesses. For example, the processor typically transfers data to the graphics card with sequential accessing. The bridge can detect this and buffer the transfer. It will then transfer the data in burst mode when it has enough data. Figure 10.5 shows an example where the PCI bridge buffers the incoming data and transfers it using burst mode. The transfers between the processor and the PCI bridge, and between the PCI bridge and the PCI bus can be independent where the processor can be transferring to its local memory while the PCI bus is transferring data. This helps to decouple the PCI bus from the processor. The primary bus in the PCI bridge connects to the processor bus and the secondary bus connects to the PCI bus. The prefetch buffer stores incoming data from the connected bus and the posting buffer holds the data ready to be sent to the connected bus. The PCI bus also provides for a configuration memory address (along with direct memory access and isolated I/O memory access). This memory is used to access the configuration register and 256-byte configuration memory of each PCI unit. Local bus Processor bus Address1 Data1 119 PCI bus Prefetch buffer Address2 Data2 Address3 Processor Posting buffer Address1 Data1 Data3 Prefetch buffer Address4 Data4 Data2 Data3 Data4 Posting buffer PCI bridge transfers with burst mode Primary bus Secondary bus Figure 10.5 PCI bridge using buffering for burst transfer. 10.3.2 PCI bus cycles The PCI has built-in intelligence where the command/byte enable signals ( C/BE3 – C/BE0 ) are used to identify the command. They are given by: C/BE3 0 0 0 0 0 0 1 1 1 1 1 1 C/BE2 0 0 0 0 1 1 0 0 1 1 1 1 C/BE1 0 0 1 1 1 1 1 1 0 0 1 1 C/BE0 0 1 0 1 0 1 0 1 0 1 0 1 Description INTA sequence Special cycle I/O read access I/O write access Memory read access Memory write access Configuration read access Configuration write access Memory multiple read access Dual addressing cycle Line memory read access Memory write access with invalidations The PCI bus allows any device to talk to any other device, thus one device can talk to another without the processor being involved. The device that starts the conversion is known as the initiator and the addressed PCI device is known as the target. The sequence of operation for write cycles, in burst mode, is: • Address phase. The transfer data is started by the initiator activating the FRAME signal. The command is set on the command lines ( C/BE3 – C/BE0 ) and the address/data pins (AD31–AD0) are used to transfer the address. The bus then uses the byte enable lines ( C/BE3 – C/BE0 ) to transfer a number of bytes. • The target sets the T R D Y signal (target ready) active to indicate that the data has on the AD31–AD0 (or AD62–AD0 for a 64-bit transfer) lines is valid. In addition, the initiator indicates its readiness to the PCI bridge by setting the IRDY signal (indicator ready) active. Figure 10.6 illustrates this. 120 PC Interfacing, Communications and Windows Programming • The transfer continues using the byte enable lines. The initiator can block transfers if it sets IRDY and the target with TRDY . • Transfer is ended by deactivating the FRAME signal. The read cycle is similar but the on the bus is valid. TRDY line is used by the target to indicate that the data 10.3.3 PCI commands The first phase of the bus access is the command/addressing phase. Its main commands are: • INTA sequence. Addresses an interrupt controller where interrupt vectors are transferred after the command phase. • Special cycle. Used to transfer information to the PCI device about the processor’s status. The lower 16 bits contain the information codes, such as 0000h for a processor shutdown, 0001h for a processor halt, 0002h for x86 specific code and 0003h to FFFFh for reserved codes. The upper 16 bits (AD31–AD16) indicate x86 specific codes when the information code is set to 0002h. • I/O read access. Indicates a read operation for I/O address memory, where the AD lines indicate the I/O address. The address lines AD0 and AD1 are decoded to define whether an 8-bit or 16-bit access is being conducted. • I/O write access. Indicates a write operation to an I/O address memory, where the AD lines indicate the I/O address. • Memory read access. Indicates a direct memory read operation. The byte-enable lines ( C/BE3 – C/BE0 ) identify the size of the data access. • Memory write access. Indicates a direct memory write operation. The byte-enable lines ( C/BE3 – C/BE0 ) identify the size of the data access. Write access Read access Data Data FRAME PCI device Initiator (busmaster) TRDY FRAME PCI device Target (addressed PCI device) PCI device Initiator (busmaster) IRDY IDRY PCI device Target (addressed PCI device) TDRY Indicates that target can accept data from the bus Indicates that initiator has placed valid data on the bus Indicates that initiator can accept data from the bus Indicates that there is valid data on the bus Figure 10.6 PCI handshaking. Local bus 121 • Configuration read access. Used when accessing the configuration address area of a PCI unit. The initiator sets the IDSEL line activated to select it. It then uses address bits AD7–AD2 to indicate the addresses of the double words to be read (AD1 and AD0 are set to 0). The address lines AD10–AD18 can be used for selecting the addressed unit in a multi-function unit. • Configuration write access. As the configuration read access, but data is written from the initiator to the target. • Memory multiple read access. Used to perform multiple data read transfers (after the initial addressing phase). Data is transferred until the initiator sets the FRAME signal inactive. • Dual addressing cycle. Used to transfer a 64-bit address to the PCI device (normally only 32-bit addresses are used) in either a single or a double clock cycle. In a single clock cycle the address lines AD63–AD0 contain the 64-bit address (note that the Pentium processor only has a 32-bit address bus, but this mode has been included to support other systems). With a 32-bit address transfer the lower 32 bits are placed on the AD31–AD0 lines, followed by the upper 32 bits on the AD31–AD0 lines. • Line memory read access. Used to perform multiple data read transfers (after the initial addressing phase). Data is transferred until the initiator sets the FRAME signal inactive. • Memory write access with invalidations. Used to perform multiple data write transfers (after the initial addressing phase). 10.3.4 PCI interrupts The PCI bus support four interrupts ( INTA – INTD ). The INTA signal can be used by any of the PCI units, but only a multi-function unit can use the other three interrupt lines ( INTB – INTD ). These interrupts can be steered, using system BIOS, to one of the IRQx interrupts by the PCI bridge. For example, a 100 Mbps Ethernet PCI card can be set to interrupt with INTA and this could be steered to IRQ10. 10.3.5 Bus arbitration Busmasters are devices on a bus which are allowed to take control of the bus. For this purpose, PCI uses the REQ (request) and GNT (grant) signals. There is no real standard for this arbitration, but normally the PCI busmaster activates the REQ signal to indicate a request to the PCI bus, and the arbitration logic must then activate the GNT signal so that the requesting master gains control of the bus. To prevent a bus lock-up, the busmaster is given 16 CLK cycles before a time-overrun error occurs. 10.3.6 Other PCI pins The other PCI pins are: • • • (Pin A15). Resets all PCI devices. and PRSNT2 (Pins B9 and B11). These, individually, or jointly, show that there is an installed device and what the power consumption is. A setting of 11 (that is, PRSNT1 is a 1 and PRSNT2 is a 1) indicates no adapter installed, 01 indicates maximum power dissipation of 25 W, 10 indicates a maximum dissipation of 15 W and 00 indicate a maximum power dissipation of 7.5 W. DEVSEL (Pin B37). Indicates that addressed device is the target for a bus operation. RST PRSNT1 122 PC Interfacing, Communications and Windows Programming • TMS (test mode select), TDI (test data input), TDO (test data output), TRST (test reset), TCK (test clock). Used to interface to the JTAG boundary scan test. • IDSEL (Pin A26). Used for device initialization select signal during the accessing of the configuration area. • LOCK (Pin A15). Indicates that an addressed device is to be locked-out of bus transfers. All other unlocked device can still communicate. • PAR, PERR (Pins A43 and B40). The parity pin (PAR) is used for even parity for AD31–AD0 and C/BE3–C/BE0, and PERR indicates that a parity error has occurred. • SDONE, SBO (Pins A40 and A41). Used in snoop cycles. SDONE (snoop done) and SBO (snoop back off signal). • SERR (Pin B42). Used to indicate a system error. • STOP (Pin A38). Used by a device to stop the current operation. • ACK64 , REQ64 (Pins B60 and A60). The REQ64 signal is an active request for a 64bit transfer and ACK64 is the acknowledge for a 64-bit transfer. 10.3.7 Configuration address space Each PCI device has 256 bytes of configuration data, which is arranged as 64 registers of 32 bits. It contains a 64-byte predefined header followed by an extra 192 bytes which contain extra configuration data. Figure 10.7 shows the arrangement of the header. The definitions of the fields are: • Unit ID and Man. ID. A Unit ID of FFFFh defines that there is no unit installed, while any other address defines its ID. The PCI SIG, which is the governing body for the PCI specification, allocates a Man. ID. This ID is normally shown at BIOS start-up. Section 10.5 gives some example Man. IDs (and Plug-and-play IDs). • Status and Command. • Class code and Revision. The class code defines PCI device type. It splits into two 8bit values with a further 8-bit value that defines the programming interface for the unit. The first defines the unit classification (00h for no class code, 01h for mass storage, 02h for network controllers, 03h for video controllers, 04h for multimedia units, 05h for memory controller and 06h for a bridge), followed by a subcode which defines the actual type. Typical codes are: • • • • • • • • • • • • 0100h. SCSI controller. 0101h. IDE controller. 0102h. Floppy controller. 0200h. Ethernet network adapter. 0201h. Token ring network adapter. 0202h. FDDI network adapter. 0280h. Other network adapter. 0300h. VGA video adapter. 0301h. XGA video adapter. 0380h. Other video adapter. 0400h. Video multimedia device. 0401h. Audio multimedia device. • • • • • • • • • • 0480h. Other multimedia device. 0500h. RAM memory controller. 0501h. Flash memory controller. 0580h. Other memory controller. 0600h. Host. 0601h. ISA Bridge. 0602h. EISA Bridge. 0603h. MAC Bridge. 0604h. PCI-PCI Bridge. 0680h. Other Bridge. Local bus 31 123 0 Unit ID Man. ID Status Command Class code BIST Header Latency Rev. CLS Base Address Register Reserved 64-byte header in PCI configuration space Reserved Expansion ROM Base Address Reserved Reserved MaxLat MinGNT INT-Pin INT-Line Figure 10.7 PCI configuration space. • BIST, Header, Latency, CLS. The BIST (Built-in Self Test) is an 8-bit field, where the most significant bit defines if the device can carry out a BIST, the next bit defines if a BIST is to be performed (a 1 in this position indicates that it should be performed) and bits 3–0 define the status code after the BIST has been performed (a value of zero indicates no error). The Header field defines the layout of the 48 bytes after the standard 16-byte header. The most significant bit of the Header field defines whether the device is a multi-function device or not. A 1 defines a multi-function unit. The CLS (Cache Line Size) field defines the size of the cache in units of 32 bytes. Latency indicates the length of time for a PCI bus operation, where the amount of time is the latency+8 PCI clock cycles. • Base Address Register. This area of memory allows the device to be programmed with an I/O or memory address area. It can contain a number of 32- or 64-bit addresses. The format of a memory address is: Bit 64–4 Bit 3 Bit 2, 1 Bit 0 Base address. PRF. Prefetching, 0 identifies not possible, 1 identifies possible. Type. 00 – any 32-bit address, 01 – less than 1MB, 10 – any 64-bit address and 11 – reserved. 0. Always set to a 0 for a memory address. For an I/O address space it is defined as: Bit 31–2 Bit 1, 0 Base address. 01. Always set to a 01 for an I/O address. • Expansion ROM Base Address. Allows a ROM expansion to be placed at any position in the 32-bit memory address area. • MaxLat, MinGNT, INT-Pin, INT-Line. The MinGNT and MaxLat registers are readonly registers that define the minimum and maximum latency values. The INT-Line 124 PC Interfacing, Communications and Windows Programming field is a 4-bit field that defines the interrupt line used (IRQ0–IRQ15). A value of 0 corresponds to IRQ0 and a value of 15 corresponds to IRQ15. The PCI bridge can then redirect this interrupt to the correct IRQ line. The 4-bit INT-pin defines the interrupt line that the device is using. A value of 0 defines no interrupt line, 1 defines INTA , 2 defines INTB , and so on. 10.3.8 I/O addressing The standard PC I/O addressing ranges from 0000h to FFFFh, which gives an addressable space of 64 KB, whereas the PCI bus can support a 32-bit or 64-bit addressable memory. The PCI device can be configured using one of two mechanisms. Configuration mechanism 1 Passing two 32-bit values to two standard addresses configures the PCI bus: Address 0CF8h 0CFCh Name Description Configuration Address Configuration Data Used to access the configuration address area. Used to read or write a 32-bit (double word) value to the configuration memory of the PCI device. The format of the Configuration Address register is: Bit 31 Bit 30–24 Bit 23–16 Bit 15–11 Bit 10–8 Bit 7–2 Bit 1, 0 ECD (Enable CONFIG_DATA) bit. A 1 activates the CONFIG_DATA register, while a 0 disables it. Reserved. PCI bus number. Defines the number of the number of the PCI bus (to a maximum of 256). PCI unit. Selects a PCI device (to a maximum of 32). PCI thus supports a maximum of 256 attached buses with a maximum of 32 devices on each bus. PCI function. Selects a function within a PCI multi-function device (one of eight functions). Register. Selects a Dword entry in a specified configuration address area (one of 64 Dwords). Type. 00 – decode unit, 01 – CONFIG_ADDRESS value copy to ADx. Configuration mechanism 2 In this mode, each PCI device is mapped to a 4 KB I/O address range between C000h and CFFFh. This is achieved by used in the activation register CSE (Configuration Space Enable) for the configuration area at the port address 0CF8h. The format of the CSE register is located at 0CF8h and is defined as: Bit 7–4 Key. 0000 – normal mode, 0001…1111 – configuration area activated. A value other than zero for the key activates the configuration area mapping, that is, all I/O addresses to the 4 KB range between C000h and CFFFh would be performed as normal I/O cycles. Local bus Bit 3–1 125 Function. Defines the function number within the PCI device (if it represents a multi-function device). SCE. 0 defines a configuration cycle, 1 defines a special cycle. Bit 0 The forward register is stored at address 0CFAh and contains: Bit 7–0 PCI bus. The I/O address is defined by: Bit 31–12 Bit 11–8 Bit 7–2 Bit 1, 0 Contains the bit value of 0000Ch. PCI unit. Register index. Contains the bit value of 00b. 10.4 Exercises 10.4.1 Explain how PCI architecture uses bridges. 10.4.2 Explain how the 32-bit PCI bus transfers data. Prove that the maximum data rate for a 32-bit PCI in its normal mode is only 66 MB/s. Explain the mechanism that the PCI bus uses to increase the maximum data rate to 132 MB/s. 10.4.3 How does buffering in the PCI bridge aid the transfer of data to and from the processor. 10.4.4 Explain how the PCI bus uses the command phase to set up a peripheral. 10.4.5 How are interrupt line used in the PCI bus. Explain how these interrupts can be steered to the ISA bus interrupt lines. 10.4.6 Outline the concept of bus mastering and how it occurs on the PCI bus. What signal lines are used? 10.4.7 Explain how the PCI bus uses configuration addresses. 10.5 Example manufacturer and plug-and-play IDs Manufacturer NCR Motorola Mitsubishi EPSON Yamaha Cyrix Tseng Labs Man. ID 1000 1057 1067 1008 1073 1078 100c PNP ID 4096 4183 4199 4104 4211 4216 4108 Manufacturer Toshiba Compaq HP Intel Adaptec Matsushita Creative Man. ID 102f 1032 103c 8086 9004 10f7 10f6 PnP ID 4143 4146 4156 32902 36868 4343 4342