Bus-Based Computer Systems Designing with microprocessors. Development and debugging. System-level performance analysis. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. System architectures Architectures and components: software; hardware. Some software is very hardwaredependent. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Hardware platform architecture Contains several elements: CPU; bus; memory; I/O devices: networking, sensors, actuators, etc. How big/fast much each one be? © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Software architecture Functional description must be broken into pieces: division among people; conceptual organization; performance; testability; maintenance. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Hardware and software architectures Hardware and software are intimately related: software doesn’t run without hardware; how much hardware you need is determined by the software requirements: speed; memory. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Evaluation boards Designed by CPU manufacturer or others. Includes CPU, memory, some I/O devices. May include prototyping section. CPU manufacturer often gives out evaluation board netlist---can be used as starting point for your custom board design. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Adding logic to a board Programmable logic devices (PLDs) provide low/medium density logic. Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic. Application-specific integrated circuits (ASICs) are manufactured for a single purpose. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. The PC as a platform Advantages: cheap and easy to get; rich and familiar software environment. Disadvantages: requires a lot of hardware resources; not well-adapted to real-time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Typical PC hardware platform CPU memory CPU bus intr ctrl DMA controller bus interface device high-speed bus timers bus interface low-speed bus device © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Typical busses PCI: standard for high-speed interfacing 33 or 66 MHz. PCI Express. USB (Universal Serial Bus), Firewire (IEEE 1394): relatively low-cost serial interface with high speed. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Software elements IBM PC uses BIOS (Basic I/O System) to implement low-level functions: boot-up; minimal device drivers. BIOS has become a generic term for the lowest-level system software. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Example: StrongARM StrongARM system includes: CPU chip (3.686 MHz clock) system control module (32.768 kHz clock). • • • • • • © 2008 Wayne Wolf Real-time clock; operating system timer general-purpose I/O; interrupt controller; power manager controller; reset controller. Overheads for Computers as Components 2nd ed. Debugging embedded systems Challenges: target system may be hard to observe; target may be hard to control; may be hard to generate realistic inputs; setup sequence may be complex. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Host/target design Use a host system to prepare software for target system: target system host system © 2008 Wayne Wolf serial line Overheads for Computers as Components 2nd ed. Host-based tools Cross compiler: compiles code on host for target system. Cross debugger: displays target state, allows target system to be controlled. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Software debuggers A monitor program residing on the target provides basic debugger functions. Debugger should have a minimal footprint in memory. User program must be careful not to destroy debugger program, but , should be able to recover from some damage caused by user code. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Breakpoints A breakpoint allows the user to stop execution, examine system state, and change state. Replace the breakpointed instruction with a subroutine call to the monitor program. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. ARM breakpoints 0x400 0x404 0x408 0x40c MUL r4,r6,r6 ADD r2,r2,r4 ADD r0,r0,#1 B loop uninstrumented code © 2008 Wayne Wolf 0x400 0x404 0x408 0x40c MUL r4,r6,r6 ADD r2,r2,r4 ADD r0,r0,#1 BL bkpoint code with breakpoint Overheads for Computers as Components 2nd ed. Breakpoint handler actions Save registers. Allow user to examine machine. Before returning, restore system state. Safest way to execute the instruction is to replace it and execute in place. Put another breakpoint after the replaced breakpoint to allow restoring the original breakpoint. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. In-circuit emulators A microprocessor in-circuit emulator is a specially-instrumented microprocessor. Allows you to stop execution, examine CPU state, modify registers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Logic analyzers A logic analyzer is an array of low-grade oscilloscopes: © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Logic analyzer architecture sample memory UUT system clock microprocessor vector address controller clock gen state or timing mode © 2008 Wayne Wolf keypad Overheads for Computers as Components 2nd ed. display Boundary scan Simplifies testing of multiple chips on a board. Registers on pins can be configured as a scan chain. Used for debuggers, in-circuit emulators. © 2008 Wayne Wolf Overheads for Computers as Components How to exercise code Run on host system. Run on target system. Run in instruction-level simulator. Run on cycle-accurate simulator. Run in hardware/software co-simulation environment. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Debugging real-time code Bugs in drivers can cause nondeterministic behavior in the foreground problem. Bugs may be timing-dependent. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. System-level performance analysis Performance depends on all the elements of the system: CPU. Cache. Bus. Main memory. I/O device. © 2008 Wayne Wolf memory CPU cache Overheads for Computers as Components 2nd ed. Bandwidth as performance Bandwidth applies to several components: Memory. Bus. CPU fetches. Different parts of the system run at different clock rates. Different components may have different widths (bus, memory). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Bandwidth and data transfers Video frame: 320 x 240 x 3 = 230,400 bytes. Transfer in 1/30 sec. Transfer 1 byte/msec, 0.23 sec per frame. Too slow. Increase bandwidth: Increase bus width. Increase bus clock rate. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Bus bandwidth T: # bus cycles. P: time/bus cycle. Total time for transfer: O1 D O2 W t = TP. D: data payload length. O1 + O2 = overhead O. © 2008 Wayne Wolf Tbasic(N) = (D+O)N/W Overheads for Computers as Components 2nd ed. Bus burst transfer bandwidth T: # bus cycles. P: time/bus cycle. Total time for transfer: 1 2 B … O W t = TP. D: data payload length. O1 + O2 = overhead O. © 2008 Wayne Wolf Tburst(N) = (BD+O)N/(BW) Overheads for Computers as Components 2nd ed. Memory aspect ratios 16 M 64 M 8M 1 © 2008 Wayne Wolf 4 Overheads for Computers as Components 2nd ed. 8 Memory access times Memory component access times comes from chip data sheet. Page modes allow faster access for successive transfers on same page. If data doesn’t fit naturally into physical words: A = [(E/w)mod W]+1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Bus performance bottlenecks Transfer 320 x 240 video frame @ 30 frames/sec = 612,000 bytes/sec. Is performance bottleneck bus or memory? © 2008 Wayne Wolf memory Overheads for Computers as Components 2nd ed. CPU Bus performance bottlenecks, cont’d. Bus: assume 1 MHz bus, D=1, O=3: Tbasic = (1+3)612,000/2 = 1,224,000 cycles = 1.224 sec. Memory: try burst mode B=4, width w=0.5. Tmem = (4*1+4)612,000/(4*0.5) = 2,448,000 cycles = 0.2448 sec. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed. Performance spreadsheet bus clock period W D O N T_basic t © 2000 Morgan Kaufman 1.00E-06 2 1 3 612000 1224000 1.22E+00 memory clock period W D O B N 1.00E-08 0.5 1 4 4 612000 T_mem t 2448000 2.45E-02 Overheads for Computers as Components Parallelism Speed things up by running several units at once. DMA provides parallelism if CPU doesn’t need the bus: DMA + bus. CPU. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.