Chapter 2 Microprocessor Bus Transfers

Big- and Little-Endian Ordering
• Big-endian processor architecture
  – High-order-byte-first (H-O-B-F)
    • Maps the highest-order byte of an internal register to the lowest memory byte address
    • The address of a data item is the address of its most significant byte
    • EX: Motorola, SPARC RISC CPU
• Little-endian processor architecture
  – Low-order-byte-first (L-O-B-F)
    • Maps the lowest-order byte of an internal register to the lowest memory byte address
    • The address of a data item is the address of its least significant byte
    • EX: Intel 80x86 CPU

16- and 32-bit Microprocessors
• In general, an n-bit microprocessor has
  – An n-bit internal register
  – An n-bit ALU
  – An n-bit external data bus
  – Exception: Intel 80386SX, with a 32-bit internal architecture and a 16-bit data bus
  – An operand is addressed by the smallest byte address of the addressed field

16-bit big-endian microprocessor
• A 16-bit memory is connected to a 16-bit data bus
• The memory is made up of two byte-wide sections: section 0 (even section, addresses 2N) and section 1 (odd section, addresses 2N+1)
• For a byte-write operation, the processor may issue byte-enable signals (EX: BHE#; section 0 can be selected by A0 = 0)

16-bit little-endian microprocessor
• The least significant byte (B0) is stored in the lowest memory location
• An aligned 16-bit word access requires only one bus cycle

32-bit big-endian microprocessor
• Reading a 16-bit word from address 4N requires multiplexing inside the CPU
• Reading a 16-bit word from address 4N+2 does not require multiplexing inside the CPU

32-bit little-endian microprocessor
• The design shown in the figure can operate in both big- and little-endian modes

2.2.5 Operand and instruction alignment
• Operand (data) alignment:
  – For maximum performance, data stored in memory should be aligned
  – 2-byte operand: address bit A0 = 0 (address 2N)
  – 4-byte operand: address bits A1, A0 = 0 (address 4N)
  – 8-byte operand: address bits A2, A1, A0 = 0 (address 8N)
• Instruction alignment:
  – It is assumed that code is always loaded in memory in an aligned form

2.3.1 Synchronous/asynchronous buses
• Synchronous bus operation
  – All events take place within a specified time period, in synchronism with a system-wide clock
  – Both the processor master (which initiates the bus cycle) and the memory or I/O slave (which responds to the request) are clocked by the same system clock
    • Address, data, and control signals are synchronized to the system clock
  – Clock skew:
    • As distances increase, the clock arrives at different times at different points in the system
    • A synchronous design may yield a higher speed (no wake-up is needed; the CPU waits for the device) and a lower number of control lines (no handshake control signals are needed)
  – Imposes speed constraints on the external devices: since a bus cycle is made up of a number of clock cycles, its duration is constrained by the speed of the slowest device connected to the bus
  – A synchronous bus implementation must be based on worst-case analysis
    • Synchronous buses use a “wait protocol” to overcome this
    • A device uses the “ready” input pin to signal that the addressed device did not have enough time to respond, so the bus master must insert a wait state

Asynchronous bus operation
• The bus master and bus slave have their own individual input clocks (they operate at their own, different speeds)
• Based on a handshake process
  – EX: the bus master issues a “strobe” signal to indicate that the information on the bus lines is “valid”
  – The addressed slave device returns a “data acknowledge” signal informing the master that it has responded to the request; i.e., it has either placed data on the data bus (read request) or latched the data off the data bus (write request)

2.3.2 Intel 80x86 bus transfers
• 30-bit address: A31 – A2
• Byte enables: BE0#, BE1#, BE2#, BE3#
• Little-endian
• The 80386 uses 4 bus clock cycles
• I/O-mapped I/O
• Uses M/IO# to indicate an I/O or a memory operation
• A bus cycle begins with ADS# active and ends with “ready” active
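
The alignment rule of 2.2.5 and the A31 – A2 plus byte-enable addressing of 2.3.2 can be tied together in a short sketch. This is only an illustrative model, not a description of the actual 80386 bus unit: the function names `is_aligned` and `bus_cycles` are hypothetical, the byte enables are represented as a logical 4-bit mask (bit i set means BEi# is asserted) rather than as active-low pin levels, and the order in which a real processor issues the two cycles of a misaligned transfer is not modeled.

```python
def is_aligned(address: int, size: int) -> bool:
    """True if a size-byte operand (size = 2, 4, 8) has its low address bits = 0."""
    return address & (size - 1) == 0


def bus_cycles(address: int, size: int) -> list[tuple[int, int]]:
    """Split an access into cycles on a 32-bit data bus.

    Returns a list of (doubleword_address, enable_mask) pairs, where
    doubleword_address is what A31-A2 select (low two bits zero) and
    bit i of enable_mask set means byte lane i (BEi#) is asserted.
    """
    cycles = []
    remaining = size
    addr = address
    while remaining > 0:
        offset = addr & 0x3                   # byte lane of the first byte
        count = min(remaining, 4 - offset)    # bytes that fit in this doubleword
        mask = ((1 << count) - 1) << offset   # contiguous lanes from offset upward
        cycles.append((addr & ~0x3, mask))
        addr += count
        remaining -= count
    return cycles


if __name__ == "__main__":
    for address, size in [(0x1000, 4), (0x1002, 2), (0x1003, 2)]:
        tag = "aligned" if is_aligned(address, size) else "misaligned"
        for dword, mask in bus_cycles(address, size):
            print(f"{size}-byte access at {address:#06x} ({tag}): "
                  f"A31-A2 -> {dword:#06x}, lanes {mask:04b}")
```

In this model an aligned operand always fits in one bus cycle, while an operand that crosses a doubleword boundary (such as the 2-byte access at address 4N+3) needs two, which is the performance cost that the alignment rule avoids.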
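
The two byte orderings defined under Big- and Little-Endian Ordering can also be checked with a few lines of code. The sketch below is a minimal illustration that uses Python's built-in `int.to_bytes`; the helper name `memory_image` and the sample register value are assumptions chosen for the example.

```python
def memory_image(register_value: int, size: int, byteorder: str) -> bytes:
    """Return the bytes of a register as stored in memory.

    Index 0 of the result is the lowest memory byte address.
    byteorder is "big" (H-O-B-F) or "little" (L-O-B-F).
    """
    return register_value.to_bytes(size, byteorder)


if __name__ == "__main__":
    reg = 0x12345678  # hypothetical 32-bit register contents
    big = memory_image(reg, 4, "big")
    little = memory_image(reg, 4, "little")
    # Big-endian: the most significant byte sits at the lowest address.
    print("big-endian:   ", [f"{b:02x}" for b in big])
    # Little-endian: the least significant byte sits at the lowest address.
    print("little-endian:", [f"{b:02x}" for b in little])
```

In both cases the data item is referred to by its lowest byte address; only the byte found at that address differs (0x12 for big-endian, 0x78 for little-endian).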