CS 7810 Lecture 11 Virtual-Physical Registers

Delaying Physical Register Allocation Through
Virtual-Physical Registers
T. Monreal, A. Gonzalez, M. Valero, J. Gonzalez, V. Vinals
Proceedings of MICRO-32
November 1999
Register File Design Considerations
• Number of ports = 3 x issue width
• Number of entries = window size + logical-regs
• Multiple threads → more registers (more power)
• Wire delays, clock speeds → multiple cycle access
• Pipelining a RAM structure is hard
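The two sizing rules above can be tried out with a few numbers. This is a minimal sketch: the function name and the example machine (4-wide issue, 64-entry window, 32 logical registers) are illustrative assumptions, not figures from the lecture.

```python
def regfile_size(issue_width, window_size, logical_regs):
    """Return (ports, entries) from the two rules of thumb on the slide."""
    # Each issued instruction may read two sources and write one result.
    ports = 3 * issue_width
    # One register per in-flight instruction plus the architected state.
    entries = window_size + logical_regs
    return ports, entries


# Hypothetical machine, chosen only for illustration.
ports, entries = regfile_size(issue_width=4, window_size=64, logical_regs=32)
print(ports, entries)  # 12 96
```

Doubling the issue width doubles the port count, which is one reason wide machines end up with multi-cycle register files.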
Register Allocation
• cycle 4: assign pr7
• cycle 15: Complete Wake-up
• cycle 30: write pr7
• cycle 50: read pr7
• cycle 80: release pr7
• no result: 26 cyc (cycles 4 to 30)
• useful time: 20 cyc (cycles 30 to 50)
• no activity: 30 cyc (cycles 50 to 80)
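The pr7 timeline can be checked with a little arithmetic. The sketch below uses only the cycle numbers from the slide; the variable names are illustrative, and it assumes the read at cycle 50 is the last read of pr7.

```python
# Cycle numbers taken from the pr7 timeline.
events = {"assign": 4, "write": 30, "read": 50, "release": 80}

held = events["release"] - events["assign"]        # total cycles pr7 is allocated
no_result = events["write"] - events["assign"]     # allocated, but holds no value yet
useful = events["read"] - events["write"]          # value present and still needed
no_activity = events["release"] - events["read"]   # value dead, register not yet freed

useful_fraction = useful / held
print(held, no_result, useful, no_activity, round(useful_fraction, 2))
# 76 26 20 30 0.26
```

Under these numbers the register does useful work for roughly a quarter of the time it is held, which is the waste that late allocation targets.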
Two-Level Register File
[Figure: base regfile vs. two-level regfile]
Virtual-Physical Registers
• Rename: the register map table records lr3 → vr7; the producer writes vr7 (vr7 ←) and its consumer reads vr7 (← vr7); the virtual map table has no entry for vr7 yet
• Instruction issues: both tables are unchanged, vr7 still has no physical register
• Instruction completes: is assigned pr9; the virtual map table records vr7 → pr9, the register map table holds lr3 → vr7, pr9, and the waiting consumer becomes ← vr7 (pr9)
• Instructions renamed after that read the physical register directly (← pr9)
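The two-table scheme can be modelled in a few lines. This is a toy sketch under stated assumptions, not the paper's hardware: the class name `VPRenamer`, its methods, and the choice to number tags from vr7 and pr9 (to match the slide) are all hypothetical.

```python
class VPRenamer:
    """Toy model: logical -> VP tag at rename, VP tag -> physical reg at completion."""

    def __init__(self, free_physical, first_vp=0):
        self.reg_map = {}                  # register map table: logical -> VP tag
        self.virtual_map = {}              # virtual map table: VP tag -> physical reg
        self.free_physical = list(free_physical)
        self.next_vp = first_vp

    def rename_dest(self, logical):
        """Rename a destination: only a VP tag is handed out, no storage."""
        vp = f"vr{self.next_vp}"
        self.next_vp += 1
        self.reg_map[logical] = vp
        return vp

    def rename_src(self, logical):
        """Rename a source: returns (VP tag, physical reg or None if not yet bound)."""
        vp = self.reg_map[logical]
        return vp, self.virtual_map.get(vp)

    def complete(self, vp):
        """At completion, bind the VP tag to a free physical register, if any."""
        if not self.free_physical:
            return None                    # no register: the instruction must re-execute
        pr = f"pr{self.free_physical.pop(0)}"
        self.virtual_map[vp] = pr
        return pr


r = VPRenamer(free_physical=[9, 10], first_vp=7)
vp = r.rename_dest("lr3")
print(vp, r.rename_src("lr3"))    # vr7 ('vr7', None)
print(r.complete(vp))             # pr9
print(r.rename_src("lr3"))        # ('vr7', 'pr9')
```

The point of the split is visible in `rename_dest`: it consumes no physical register, so storage is tied up only from completion onward.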
Lack of Registers
[Figure: in-flight window; legend: has physical register / has no physical register]
• Finishes, has no register, keeps re-executing
• Cycle t: still no register; cycle t+1: gets reg
• Who will generate a register for this instr?
• Solution: reserve a register for the oldest instruction
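The reservation rule can be written as a one-line allocation check. This is a simplified sketch: the function `may_allocate` and its `reserved` parameter are hypothetical, and it conservatively holds the register back even when the oldest instruction already has one.

```python
def may_allocate(is_oldest, free_regs, reserved=1):
    """Decide whether a finishing instruction may take a free physical register.

    The oldest instruction can always use the last free register; every other
    instruction must leave `reserved` registers untouched.
    """
    if is_oldest:
        return free_regs > 0
    return free_regs > reserved


print(may_allocate(is_oldest=False, free_regs=1))  # False: last reg is held back
print(may_allocate(is_oldest=True, free_regs=1))   # True: oldest can always finish
```

Because the oldest instruction can always complete and commit, it eventually releases a register, so the window cannot deadlock.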
Sequential Execution
• Oldest instr has reserved register
• Instr commits, releases another reg, that is then reserved for the new oldest instr
• Behaves like an in-order processor
Reserving All Registers
• Allows quick progress, but almost behaves like a conventional processor
Register Stealing
• Instr finishes; steals register from the youngest finished instr
• No reservation of regs
• The younger instrs may have to execute twice
• Note the pre-execution effect
• Finished instructions have to remain in issueq in case they have to re-execute
• Issued dependents of the victim instruction need not re-execute
• The VP tag of the victim has to be broadcast so that unissued dependents can reset the ready bit
• Can benefit from an instruction reuse buffer?
• Pre-execution without explicitly attempting it
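The stealing policy can be sketched as a small function. This is a minimal model under stated assumptions: the `Instr` record and `allocate_on_finish` are hypothetical names, and restricting victims to instructions younger than the stealer is an assumption made here so that older work is never undone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    age: int                     # smaller = older in program order
    finished: bool = False
    phys: Optional[str] = None   # physical register held, if any


def allocate_on_finish(instr, window, free_regs):
    """Give `instr` a register, stealing from the youngest finished holder if needed.

    Returns the victim that must re-execute, or None if nothing was stolen.
    """
    instr.finished = True
    if free_regs:
        instr.phys = free_regs.pop()
        return None
    # Candidates: finished instructions that hold a register and are younger.
    holders = [i for i in window
               if i.finished and i.phys is not None and i.age > instr.age]
    if not holders:
        instr.finished = False   # nothing to steal: instr itself re-executes
        return None
    victim = max(holders, key=lambda i: i.age)
    instr.phys, victim.phys = victim.phys, None
    victim.finished = False      # victim is still in the issue queue and re-executes
    return victim


window = [Instr(0), Instr(1, True, "pr1"), Instr(2, True, "pr2")]
victim = allocate_on_finish(window[0], window, free_regs=[])
print(victim.age, window[0].phys)  # 2 pr2
```

A fuller model would also broadcast the victim's VP tag so that its unissued dependents clear their ready bits; that step is left out here.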
• Improves the base case by 5% (int programs) and 24% (FP programs)
• FP programs have more ILP, better branch prediction, and are more limited by cache misses
• Re-executions: 10% (int), 58% (FP)
• Steals: 5% (int), 12% (FP)
• For the same IPC, VP registers employ 25% fewer