CS 201
Compiler Construction
Lecture 13
Instruction Scheduling:
Trace Scheduler
Instruction Scheduling
Modern processors can exploit Instruction-Level
Parallelism (ILP) by simultaneously executing multiple
instructions. Instruction scheduling influences the
effectiveness with which ILP is exploited.
Pipelined processors (e.g., ARM): reordering of
instructions avoids delays due to hazards.
EPIC/VLIW processors (e.g., Itanium): a single long
instruction is packed with multiple operations
(conventional instructions) that can be
simultaneously executed.
EPIC/VLIW processors (e.g. Itanium): a single long
instruction is packed with multiple operations
(conventional instructions) that can be
simultaneously executed.
Compiler Support
Analyze dependences and rearrange the
order of instructions, i.e., perform instruction
scheduling.
Pipelined: a limited amount of ILP is required; it
can be uncovered by reordering instructions
within each basic block.
EPIC/VLIW: much more ILP is required; it can be
uncovered by examining code from multiple
basic blocks.
Compiler Support
Two techniques that go beyond basic block
boundaries to uncover ILP:
(Acyclic Schedulers) Trace Scheduling: examines
a trace – a sequence of basic blocks along an
acyclic program path; instruction scheduling can
result in movement of instructions across basic
block boundaries.
(Cyclic Schedulers) Software Pipelining: examines
basic blocks corresponding to consecutive loop
iterations; instruction scheduling can result in
movement of instructions across loop iterations.
Trace Scheduling
A trace is a sequence of basic blocks that does
not extend across loop boundaries.
• Select a trace
• Determine the instruction
schedule for the trace
• Introduce compensation
code to preserve program
semantics
• Repeat the above steps while some
part of the program remains
unscheduled
Trace Selection
Selection of traces is extremely important for
overall performance – traces should represent
paths that are executed frequently.
A fast instruction schedule for one path is
obtained at the expense of a slower schedule for
the other path due to speculative code motion.
Picking Traces
o – an operation/instruction
Count(o) – number of times o is expected to be
executed during an entire program run.
Prob(e) – probability that edge e will be
executed; important for conditional branches.
Count(e) = Count(branch) x Prob(e)
• Counts are estimated using profiling –
measure counts by running the program on a
representative input.
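The edge-count formula above can be made concrete with a small numeric example (the numbers here are illustrative, not from the lecture):

```python
# Estimating edge counts from profile data.
# Suppose a conditional branch executes 1000 times, and profiling shows a
# 90% / 10% split between its taken and fall-through edges.
count_branch = 1000
prob_taken, prob_fallthrough = 0.9, 0.1

# Count(e) = Count(branch) x Prob(e)
count_taken = count_branch * prob_taken
count_fallthrough = count_branch * prob_fallthrough
```

The taken edge thus contributes a count of 900 and the fall-through edge 100, which is what trace selection compares when choosing which successor to follow.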
Algorithm for Trace Construction
1. Pick an operation with the largest execution count as
the seed of the trace.
2. Grow the trace backward from the seed.
3. Grow the trace forward from the seed.
Given that p is in the trace, include its
successor s in the trace, where e is the edge
p-s, iff:
1. Of all edges leaving p, e has the
largest execution count.
2. Of all edges entering s, e has the
largest execution count.
The same approach is taken to grow the
trace backward.
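The forward-growth rule can be sketched in code. This is a minimal sketch, not the lecture's implementation; the CFG representation and all names (`grow_forward`, `out_edges`, `in_edges`, `counts`) are assumptions made here:

```python
def grow_forward(seed, out_edges, in_edges, counts):
    """Grow a trace forward from `seed` using the mutual-best-edge rule.

    out_edges[b] / in_edges[b]: lists of edges (p, s) leaving / entering block b.
    counts[(p, s)]: profiled execution count of edge p -> s.
    """
    trace = [seed]
    p = seed
    while out_edges.get(p):
        # e = p -> s with the largest count among edges leaving p
        e = max(out_edges[p], key=lambda edge: counts[edge])
        s = e[1]
        # include s only if e also has the largest count among edges entering s
        best_into_s = max(in_edges[s], key=lambda edge: counts[edge])
        if best_into_s != e or s in trace:  # stop on mismatch or a cycle
            break
        trace.append(s)
        p = s
    return trace
```

For example, with edges A-B (count 90), A-C (10), B-D (90), and X-D (100), growth from A includes B but stops before D, because the best edge entering D is X-D rather than B-D.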
Algorithm Contd..
Trace stops growing
forward when:
Count(e1) < Count(e2)
where e1 is the best edge leaving the last
block of the trace and e2 is a competing
edge entering the same target block.
Premature termination of the trace can
occur in the above algorithm. To prevent
this, a slight modification is required.
Algorithm Contd..
Let's say A-B-C-D has been included in
the current trace.
Count(D-E) > Count(D-F) => add E
Count(C-E) > Count(D-E) => do not add E
Premature termination occurs
because the trace that could include
C-E can no longer be formed, since C
is already in the current trace.
Modification: consider only edges P-E such
that P is not already part of the current trace.
Algorithm Contd..
Trace cannot cross loop boundaries:
• if the edge encountered is a loop back edge; or
• if the edge enters a loop,
then stop growing the trace.
Blocks 1 and 2 cannot be placed in the
same trace because the edge
directly connecting them is a
loop back edge, and edges
indirectly connecting them cross
loop boundaries.
Instruction Scheduling
Construct a DAG for the selected trace.
Generate an instruction schedule using a
scheduling heuristic: list scheduling with
critical path first.
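The "list scheduling with critical path first" heuristic can be sketched as follows. This is a simplified sketch assuming a unit-issue machine; the DAG representation and names (`list_schedule`, `succs`, `latency`) are my own, not from the lecture:

```python
def list_schedule(succs, latency):
    """List scheduling with critical-path-first priority (a sketch).

    succs[i]: dependence successors of instruction i in the DAG.
    latency[i]: latency of instruction i.
    Returns the instructions in scheduled order.
    """
    nodes = list(latency)

    # priority = length of the longest latency path from the node to a DAG sink
    prio = {}
    def cp(n):
        if n not in prio:
            prio[n] = latency[n] + max((cp(s) for s in succs.get(n, [])), default=0)
        return prio[n]
    for n in nodes:
        cp(n)

    # count unscheduled predecessors of each node
    preds_left = {n: 0 for n in nodes}
    for n in nodes:
        for s in succs.get(n, []):
            preds_left[s] += 1

    order = []
    ready = [n for n in nodes if preds_left[n] == 0]
    while ready:
        n = max(ready, key=prio.get)  # critical path first
        ready.remove(n)
        order.append(n)
        for s in succs.get(n, []):    # newly freed successors become ready
            preds_left[s] -= 1
            if preds_left[s] == 0:
                ready.append(s)
    return order
```

With two independent instructions `a` (latency 1) and `b` (latency 3) both feeding `c`, the heuristic schedules `b` first because it lies on the longer latency path.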
Following generation of the instruction
schedule, introduction of compensation code
may be required to preserve program
semantics.
Compensation Code
Consider movement of instructions across basic
block boundaries, i.e. past splits and merges in
the control flow graph.
1. Movement of a statement past/below a Split:
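The original slide shows this case as a figure. As a rough runnable illustration of the transformation (the block labels, statement strings, and helper name are hypothetical), moving a statement below a split requires a compensation copy on the off-trace successor:

```python
def move_below_split(blocks, stmt, split_block, on_trace, off_trace):
    """Move `stmt` from the block ending in a split down past the split.

    blocks: dict mapping block label -> list of statement strings.
    The statement is placed at the head of the on-trace successor, and a
    compensation copy goes at the head of the off-trace successor so the
    off-trace path still executes it.
    """
    blocks[split_block].remove(stmt)
    blocks[on_trace].insert(0, stmt)
    blocks[off_trace].insert(0, stmt)  # compensation code
    return blocks
```

After the move, the statement no longer appears in the split block but appears on both outgoing paths, preserving the original semantics.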
Compensation Code Contd..
2. Movement of a statement above a Join:
Compensation Code Contd..
3. Movement of a statement above a Split:
No compensation code introduced – speculation.
Note that i<-i+2 can be moved above the split if i is
dead along the off-trace path.
Compensation Code Contd..
4. Movement of a statement below a Join:
This case will not arise assuming dead code has
been removed.
Compensation Code Contd..
5. Movement of a branch across a split.
Compensation Code Contd..
6. Movement of a branch above a join.
Compensation Code Contd..
7. Packing multiple branches in a long instruction.
Code Explosion
Repeated introduction of compensation code
along off-trace paths can substantially increase
code size; this growth is the main cost of trace
scheduling.
Building a DAG for Scheduling
The DAG contains the following edges:
1. Write-After-Read (WAR) data dependence
2. Write-After-Write (WAW) data dependence
3. Read-After-Write (RAW) data dependence
4. Conditional jumps: introduce a
write-after-conditional-read edge
between IF e and x=c+d to
prevent movement of
x=c+d above IF e.
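The first three kinds of dependence edge can be computed from per-instruction def/use sets. A minimal sketch for a straight-line trace (the representation and the name `build_dag` are my own):

```python
def build_dag(instrs):
    """Build data-dependence edges for a straight-line trace (a sketch).

    instrs[i] = (defs, uses): sets of locations written / read by instruction i.
    Returns a set of edges (i, j, kind) with i < j.
    """
    edges = set()
    for j, (defs_j, uses_j) in enumerate(instrs):
        for i in range(j):
            defs_i, uses_i = instrs[i]
            if defs_i & uses_j:
                edges.add((i, j, 'RAW'))  # j reads what i wrote
            if uses_i & defs_j:
                edges.add((i, j, 'WAR'))  # j overwrites what i read
            if defs_i & defs_j:
                edges.add((i, j, 'WAW'))  # j overwrites what i wrote
    return edges
```

For instance, for the sequence `x = a + b; a = c; x = x + 1`, the builder finds a WAR edge from the first to the second instruction (on `a`) and RAW and WAW edges from the first to the third (on `x`).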
Building a DAG Contd..
5. Conditional jumps:
– Introduce an off-live edge
between x=a+b and IF e.
– This edge does not constrain
movement past IF e; it
indicates that if x=a+b is
moved past IF e then it can be
eliminated from the trace but
a copy must be placed along
the off-trace path.