Architectural Complexity: Opening the Black Box
Methods for Exposing Internal Functionality of Complex Single and Multiple Processor Systems
EECC-756

Modern Design Trends

Larger on-chip caches
Extended levels of cache
System-on-a-chip integration
Overall increasing design complexity
All lead to more complex debugging of designs

The Good News

Automated design tools are minimizing design errors
IP reuse minimizes bugs
Simulation tools discover most logic errors before fabrication
Massive test suites allow comprehensive testing
So what happened to Intel with the Pentium FPU flaw?

Past Methods for Debugging

Signal probing
Bus monitoring
Software debugging

Past Methods for Debugging (cont’d)

Signal probing
– More internal logic per pin = less information available at each pin
– Pin inaccessibility due to modern packages (e.g. sockets, BGAs)
Bus monitoring
– Caches hide data accesses
Software debugging
– Impractical for real-time applications
– Little or no hardware support in the past

Solutions

Test Access Port (TAP)
– Uses the JTAG (IEEE 1149.1) specification for boundary scan
Probe Mode
– Allows step-by-step analysis of code impact on internal registers
In-Circuit Emulation (ICE)
– Allows execution tracing
– Real-time applicability

Test Access Port (TAP)

Implementation of the JTAG (IEEE 1149.1) boundary-scan specification
Allows access to all internal flip-flops in a boundary scan chain
Numerous chains serve different functions (e.g. I/O flip-flops)
Allows a non-destructive snapshot of internal state at any point in time

Test Access Port (cont’d)

Single instruction register
Multiple data registers (scan chains)

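The register organization above can be sketched in software. What follows is a minimal Python model under stated assumptions, not the IEEE 1149.1 state machine: a single instruction register selects one of several data registers (scan chains), and the selected chain is read out one bit at a time while each outgoing bit is fed back in, so the chain contents are unchanged afterwards. The class name `ToyTap`, the chain names, and the bit patterns are hypothetical and chosen only for illustration; real devices use separate capture and update stages rather than recirculation.

```python
class ToyTap:
    """Toy model of a TAP: one instruction register, several scan chains."""

    def __init__(self, chains):
        # chains maps an instruction name to a list of flip-flop bits (0/1).
        self.chains = {name: list(bits) for name, bits in chains.items()}
        self.instruction = None  # the single instruction register

    def load_instruction(self, name):
        """Select which data register (scan chain) sits between TDI and TDO."""
        if name not in self.chains:
            raise KeyError(f"unknown instruction: {name}")
        self.instruction = name

    def shift(self, tdi_bit):
        """Shift the selected chain by one position and return the TDO bit."""
        chain = self.chains[self.instruction]
        tdo_bit = chain.pop()       # bit nearest TDO leaves the chain
        chain.insert(0, tdi_bit)    # new bit enters at the TDI end
        return tdo_bit

    def snapshot(self):
        """Read the whole selected chain by recirculating TDO into TDI."""
        chain = self.chains[self.instruction]
        out = []
        for _ in range(len(chain)):
            # Feed the outgoing bit back in so the chain ends up unchanged.
            out.append(self.shift(chain[-1]))
        return out[::-1]  # bits came out TDO end first; restore chain order
```

Selecting a different instruction switches which chain `shift` and `snapshot` operate on, which mirrors the one-instruction-register, many-data-registers split on the slide.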
Probe Mode

Special processor mode halts program execution
Uses the TAP interface to receive instructions and output internal data
Allows read/write access to any internal register
Allows memory accesses to test cache functionality

Probe Mode (cont’d)

In-Circuit Emulation (ICE) Support

Special pins provide branching information
Example: Pentium Dual Pipeline
– 3 dedicated pins
  IU – Asserted when an instruction completes in the U instruction pipeline
  IV – Asserted when an instruction completes in the V instruction pipeline
  IBT – (Instruction Branch Taken) Asserted when a branch is taken

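As a rough illustration of how an emulator might use these three pins, the sketch below counts completed instructions and notes the clocks on which a taken branch was signaled. The sample format (one `(IU, IV, IBT)` tuple per clock) and the function name `summarize_pins` are assumptions made for this example; the actual pin timing is defined by the processor documentation.

```python
def summarize_pins(samples):
    """Summarize per-clock (IU, IV, IBT) samples, each value 0 or 1."""
    completed = 0             # instructions completed in either pipeline
    taken_branch_clocks = []  # clock indices at which IBT was asserted
    for clock, (iu, iv, ibt) in enumerate(samples):
        completed += iu + iv  # each asserted pin marks one completion
        if ibt:
            taken_branch_clocks.append(clock)
    return completed, taken_branch_clocks
```

Combining the completion count with the taken-branch clocks lets a tool follow straight-line code between branches without seeing every fetch on the bus.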
In-Circuit Emulation (cont’d)

Branch signal information provides real-time code tracing
Branch trace message buffers provide further information
Branch trace message buffers in conjunction with Probe Mode allow detailed real-time code tracing

Branch Trace Message Buffers

FIFO queue
Can be read through TAP during program execution
Circular mode (trace-back from breakpoint) vs. Jump-to-Probe Mode (maintain instruction stream)
Incident counter expands buffer size
Intel processors automatically generate a special BTM cycle on the local bus to export BTM info

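The circular mode described above can be sketched with a bounded FIFO: once the buffer is full, each new branch record overwrites the oldest one, so at a breakpoint the buffer holds the most recent branches for trace-back. This is a software analogy only; the class name `BranchTraceBuffer`, its depth, and the `(source, target)` record format are hypothetical and are not taken from the Intel implementation.

```python
from collections import deque


class BranchTraceBuffer:
    """Bounded FIFO of (source, target) branch records (circular mode)."""

    def __init__(self, depth):
        # A deque with maxlen discards the oldest entry when a new one
        # is appended to a full buffer.
        self.entries = deque(maxlen=depth)
        self.dropped = 0  # number of old records overwritten so far

    def record(self, source, target):
        """Store one taken branch, overwriting the oldest if full."""
        if len(self.entries) == self.entries.maxlen:
            self.dropped += 1
        self.entries.append((source, target))

    def trace_back(self):
        """Return the retained branches, oldest first."""
        return list(self.entries)
```

The Jump-to-Probe Mode alternative is not modeled here.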
Branch Trace Buffer Logic Implementation

Multiprocessor Issues

Three methods for opening the “black box” on a single processor system
– TAP (boundary scan)
– Probe Mode
– Branch Tracing Methods for ICE
Multiple processor system design also has challenges

Multiprocessor Challenges

Race conditions due to parallel data accesses
Inconsistent and unpredictable network paths
Differing processor behaviors on heterogeneous networks
Communication patterns that restrict performance or scalability

Multiprocessor Solutions: Debugging Code

Create a sequential version of the code
Execute parallel tasks on a single computer as separate processes
Visualization tools that create space-time diagrams or animations to show two-dimensional changes of state
Unified Trace Environment (IBM)

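A space-time diagram places processes on one axis and time on the other. The sketch below renders a plain-text version from a list of trace events. The event format `(time, process, label)` and the function name `space_time_diagram` are assumptions for illustration and do not reflect any particular tool, including the Unified Trace Environment.

```python
def space_time_diagram(events, num_processes):
    """Return one text row per process, with event labels at their time step."""
    last_time = max(time for time, _, _ in events)
    # Start with an empty timeline ('.') for every process.
    rows = [["."] * (last_time + 1) for _ in range(num_processes)]
    for time, process, label in events:
        rows[process][time] = label  # expects a single-character label
    return ["P%d %s" % (p, "".join(row)) for p, row in enumerate(rows)]
```

For example, with `S` marking a send and `R` a receive, the events `[(0, 0, "S"), (2, 1, "R"), (3, 1, "S"), (5, 0, "R")]` on two processes produce the rows `P0 S....R` and `P1 ..RS..`, which makes the message round trip visible at a glance.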
Multiprocessor Solutions: Debugging Designs

Ability to monitor communication packets circumvents most visibility problems
– Debug messages can be included in the packet
Network protocol simulations
– Protocol verification programs (e.g. Petri nets)
– Network communication pattern simulators
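As one illustration of protocol verification with a Petri net, the sketch below explores every reachable marking of a small net and reports markings in which no transition can fire (potential deadlocks). The net encoding, the function names, and the request/acknowledge example in the note that follows are hypothetical. The search only terminates for nets with a finite set of reachable markings, and a real verification tool would handle far larger state spaces.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    inputs, _ = transition
    return all(marking.get(place, 0) >= n for place, n in inputs.items())


def fire(marking, transition):
    """Return the marking after consuming input and producing output tokens."""
    inputs, outputs = transition
    new = dict(marking)
    for place, n in inputs.items():
        new[place] -= n
    for place, n in outputs.items():
        new[place] = new.get(place, 0) + n
    return new


def find_deadlocks(initial, transitions):
    """Explore reachable markings; return those with no enabled transition."""
    seen, frontier, dead = set(), [initial], []
    while frontier:
        marking = frontier.pop()
        key = tuple(sorted(marking.items()))
        if key in seen:
            continue  # this marking was already explored
        seen.add(key)
        successors = [fire(marking, t) for t in transitions if enabled(marking, t)]
        if not successors:
            dead.append(marking)
        frontier.extend(successors)
    return dead
```

With places `idle` and `waiting`, a request transition `({"idle": 1}, {"waiting": 1})` and an acknowledge transition `({"waiting": 1}, {"idle": 1})` cycle forever without deadlock. Removing the acknowledge transition leaves the token stuck in `waiting`, which the search reports.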

However ...

Multiprocessor Design Trends

Currently, uniprocessor designs are hitting roadblocks
– Large dies lead to impractical signal transit times
– Routing increases exponentially with die size
One possible solution: multiple processors on a single die, which leads to a re-emergence of visibility problems

Conclusion

Several methods available for internal execution tracing of uniprocessors
– Test Access Port (JTAG IEEE 1149.1)
– Probe Mode extension
– Branch Tracing
Don’t count out TAP, Probe Mode, and ICE for multiprocessors