Lec_07_Oct - Suraj @ LUMS

Dynamic Scheduling Using Tomasulo’s Approach
Salient Characteristics:
• Track instruction dependences and
availability of operands
• Allow execution as soon as operands are
available to avoid RAW hazards
• Use register renaming to avoid WAW and
WAR hazards
• A dynamic scheduling scheme, in which
hardware reschedules instruction execution to
reduce stalls
The Structure of a DLX FP Unit
(see Figure 4.8)
• Instructions are issued in FIFO order from
Instruction Queue
• Reservation stations include the operation and the
actual operands
• Load buffers hold the results of outstanding loads
• All results from FP units or load units are put on
the common data bus (CDB)
Lifecycle of an Instruction
1. Issue
– Get an instruction from the Instruction Queue
– Issue it if there is an empty reservation station
– Send operands to the reservation station if they are in
the registers
– A load/store operation can issue if there’s an available
buffer
– If a buffer or reservation station is not available, the
instruction stalls due to a structural hazard
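The structural-hazard part of the Issue step can be sketched in Python as below. This is only an illustration: the pool sizes in FREE_SLOTS, the opcode-to-pool mapping in POOL_OF, and the helper issue_in_order are assumptions made for the example, not values from the slides, and the sketch ignores cycle timing and the release of stations.

```python
from collections import deque

# Hypothetical pool sizes; the slides do not give the real counts.
FREE_SLOTS = {"add": 3, "mult": 2, "load": 2, "store": 2}

# Illustrative mapping from opcode to the pool (stations or buffers) it needs.
POOL_OF = {"ADDD": "add", "SUBD": "add", "MULTD": "mult", "DIVD": "mult",
           "LD": "load", "SD": "store"}


def issue_in_order(queue, free_slots):
    """Issue from the head of the FIFO queue until a structural hazard stalls it."""
    issued = []
    while queue:
        opcode = queue[0]            # only the head of the queue may issue
        pool = POOL_OF[opcode]
        if free_slots[pool] == 0:    # no free station or buffer: stall here
            break
        free_slots[pool] -= 1        # occupy one station or buffer
        issued.append(queue.popleft())
    return issued


queue = deque(["LD", "LD", "LD", "ADDD"])
issued = issue_in_order(queue, dict(FREE_SLOTS))
print(issued, list(queue))
```

With only two load buffers assumed, the third LD stalls, and the ADDD behind it cannot issue either because issue is in FIFO order.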
Lifecycle of an Instruction (Cont’d)
2. Execute
– Execute when both operands are available
– Monitor the CDB while waiting for operands
3. Write Result
– When the result is available, write it on the CDB
– From CDB, the result is written into the registers and
any reservation station waiting for this result
Reservation Station Fields
• Every reservation station has six fields:
Op - the operation to perform
Qj, Qk - the reservation stations that will produce the
source operands
Vj, Vk - the values of the source operands
Busy - indicates that this reservation station is busy
• The register file has a field, Qi
Qi - the reservation station or buffer that contains the
operation whose result is to be stored into the register
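The fields above are enough to sketch how a result on the CDB reaches its consumers. The following Python is a minimal sketch under assumptions: the class ReservationStation, the helper broadcast, the station names Mult1 and Load2, and the operand values are all hypothetical, and None stands in for "no producer pending".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservationStation:
    name: str
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[float] = None   # value of the first source operand, once known
    vk: Optional[float] = None   # value of the second source operand, once known
    qj: Optional[str] = None     # station that will produce the first operand
    qk: Optional[str] = None     # station that will produce the second operand

    def ready(self):
        # Execution may start once no source operand is still pending.
        return self.busy and self.qj is None and self.qk is None


def broadcast(tag, value, stations, reg_qi, regs):
    """Write Result: every station or register waiting on `tag` captures `value`."""
    for rs in stations:
        if rs.qj == tag:
            rs.vj, rs.qj = value, None
        if rs.qk == tag:
            rs.vk, rs.qk = value, None
    for reg, producer in list(reg_qi.items()):
        if producer == tag:
            regs[reg] = value
            reg_qi[reg] = None


# MULTD F0, F2, F4 sits in Mult1, waiting for Load2 to deliver F2.
mult1 = ReservationStation("Mult1", busy=True, op="MULTD", vk=2.5, qj="Load2")
regs = {"F4": 2.5}
reg_qi = {"F0": "Mult1", "F2": "Load2"}   # the Qi field of each register

waiting_before = mult1.ready()
broadcast("Load2", 4.0, [mult1], reg_qi, regs)
print(waiting_before, mult1.ready(), regs["F2"])
```

After the broadcast, Mult1 holds both operand values and F2 no longer names a producer, while F0 still waits on Mult1.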
Tomasulo’s Algorithm - Example
LD    F6, 34(R2)
LD    F2, 45(R3)
MULTD F0, F2, F4
SUBD  F8, F6, F2
DIVD  F10, F0, F6
ADDD  F6, F8, F2
See Figures 4.9 and 4.10
See Figure 4.11 for steps in the algorithm
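To see which hazards this sequence contains, the register dependences can be enumerated with a short Python sketch. The encoding of each instruction as (opcode, destination, sources) and the helper find_hazards are assumptions made for illustration; instructions are numbered from 0, and the pairwise comparison ignores intervening writes, which does not matter for this particular sequence.

```python
# (opcode, destination register, source registers) for the example sequence.
PROGRAM = [
    ("LD",    "F6",  ["R2"]),
    ("LD",    "F2",  ["R3"]),
    ("MULTD", "F0",  ["F2", "F4"]),
    ("SUBD",  "F8",  ["F6", "F2"]),
    ("DIVD",  "F10", ["F0", "F6"]),
    ("ADDD",  "F6",  ["F8", "F2"]),
]


def find_hazards(program):
    """Return (kind, earlier index, later index, register) for each register hazard."""
    hazards = []
    for j, (_, dest_j, srcs_j) in enumerate(program):
        for i in range(j):
            _, dest_i, srcs_i = program[i]
            if dest_i in srcs_j:      # later instruction reads an earlier result
                hazards.append(("RAW", i, j, dest_i))
            if dest_j in srcs_i:      # later instruction overwrites an earlier source
                hazards.append(("WAR", i, j, dest_j))
            if dest_i == dest_j:      # both instructions write the same register
                hazards.append(("WAW", i, j, dest_j))
    return hazards


hazards = find_hazards(PROGRAM)
for kind, i, j, reg in hazards:
    print(f"{kind}: {PROGRAM[i][0]} (#{i}) -> {PROGRAM[j][0]} (#{j}) on {reg}")
```

The WAR and WAW hazards all involve F6, the destination of the final ADDD, which is the register that renaming has to take care of in this example.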
Tomasulo’s Algorithm: A Loop-Based Example
Loop: LD    F0, 0(R1)
      MULTD F4, F0, F2
      SD    0(R1), F4
      SUBI  R1, R1, #8
      BNEZ  R1, Loop ; branches if R1 ≠ 0
• If we predict taken branches, the loop is unrolled
dynamically by the hardware
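One way to picture the dynamic unrolling is to rename the destinations of two back-to-back iterations, as in the Python sketch below. It rests on assumptions: the branch is taken to be predicted taken, so the body is simply repeated; the tags T1, T2, ... stand in for reservation station or buffer names; and integer registers are renamed the same way as FP registers, which simplifies the actual FP unit.

```python
from itertools import count

# (opcode, destination register or None, source registers) for one iteration.
LOOP_BODY = [
    ("LD",    "F0", ["R1"]),
    ("MULTD", "F4", ["F0", "F2"]),
    ("SD",    None, ["R1", "F4"]),
    ("SUBI",  "R1", ["R1"]),
    ("BNEZ",  None, ["R1"]),
]


def rename(instrs):
    """Give every destination a fresh tag and point sources at the newest producer."""
    tags = count(1)
    latest = {}                  # architectural register -> tag of newest producer
    renamed = []
    for op, dest, srcs in instrs:
        new_srcs = [latest.get(s, s) for s in srcs]
        new_dest = None
        if dest is not None:
            new_dest = f"T{next(tags)}"
            latest[dest] = new_dest
        renamed.append((op, new_dest, new_srcs))
    return renamed


# Two iterations back to back, as if the branch were predicted taken.
two_iterations = rename(LOOP_BODY * 2)
for op, dest, srcs in two_iterations:
    print(op, dest, srcs)
```

The two LD instructions receive different tags, so the second iteration's load and multiply do not have to wait for the first iteration's uses of F0 and F4; only the true dependence through R1 links the iterations.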
Scoreboard - Steps in Execution
1. Issue: The scoreboard issues an instruction if
a. A functional unit for the instruction is free
b. No other active instruction has the same destination
register
If a structural or WAW hazard exists, the instruction issue
stalls.
2. Read Operands:
• The scoreboard monitors the availability of operands
• Once all operands are available, they are read and
execution can begin
• RAW hazards are dynamically resolved here
Scoreboard - Steps in Execution (Cont’d)
3. Execution
• The functional unit begins execution
• When the result is ready, it notifies the scoreboard
4. Write a result
• The scoreboard checks for WAR hazards and stalls
the write of the result if one exists
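The Issue and Write Result checks of the scoreboard can be sketched in Python as follows. This is a simplified illustration, not the scoreboard's actual bookkeeping: the class Active, the helpers can_issue and can_write, and the functional unit names are hypothetical, and the list of active instructions is assumed to be kept in issue order.

```python
from dataclasses import dataclass


@dataclass
class Active:
    unit: str                    # functional unit the instruction occupies
    dest: str                    # destination register
    srcs: tuple                  # source registers
    operands_read: bool = False  # set once Read Operands has completed


def can_issue(unit, dest, active):
    """Issue check: the unit must be free and no active instruction writes dest."""
    unit_free = all(a.unit != unit for a in active)   # structural hazard check
    no_waw = all(a.dest != dest for a in active)      # WAW hazard check
    return unit_free and no_waw


def can_write(instr, active):
    """WAR check: stall while an earlier instruction has yet to read instr.dest."""
    for earlier in active:       # `active` is assumed to be in issue order
        if earlier is instr:
            break
        if not earlier.operands_read and instr.dest in earlier.srcs:
            return False
    return True


# DIVD F10, F0, F6 is still waiting for F0; ADDD F6, F8, F2 has finished executing.
divd = Active("divide", "F10", ("F0", "F6"))
addd = Active("add", "F6", ("F8", "F2"), operands_read=True)
active = [divd, addd]

blocked = can_write(addd, active)     # DIVD has not read F6 yet
divd.operands_read = True
released = can_write(addd, active)    # DIVD has its operands, so ADDD may write
print(blocked, released)
```

Here ADDD cannot write F6 until DIVD has read its operands, which is the WAR stall described above; can_issue covers the structural and WAW checks from step 1.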