Computer Architecture
Processing of control transfer instructions, part II

Ola Flygt
Växjö University
http://w3.msi.vxu.se/users/ofl/
Ola.Flygt@msi.vxu.se
+46 470 70 86 49

Outline
  8.4.4 Extent of speculativeness
        Recovery from misprediction
  8.4.5 Branch penalty
  8.5 Multiway branching
  8.6 Guarded Execution

Extent of speculative processing

Recovery from a misprediction: basic tasks

Necessary activities to allow or to shorten recovery from a misprediction

Frequently employed schemes for shortening recovery from a misprediction

Shortening recovery from a misprediction: needs

Using two instruction buffers in the SuperSPARC to shorten recovery from a misprediction

Using three instruction buffers in the Nx586 to shorten recovery from a misprediction

8.4.5 Branch penalty for taken guesses
  Depends on the branch target accessing scheme

Compute/fetch scheme for accessing branch targets {IFAR vs. PC}

BTAC scheme for accessing branch targets {associative search for BA; if found, get BTA} {0-cycle branch: BA=BA-4}

BTIC scheme: store next BTA

BTIC scheme: calculate next BTA

Successor index in the I-cache scheme to access the branch target path {index: next I, or target I}

Successor index in the I-cache scheme: e.g. the microarchitecture of the UltraSPARC
  Predecode unit: detects branches, computes the BTA, makes predictions (based on the compiler's hint bit), and sets up the I-cache Next address field

Branch target accessing trends

8.5 Multiway branching {two IFAs or PCs}

Threefold multiway branching: only one correct path!

8.6 Guarded Execution
  A means to eliminate branches by using conditional operate instructions:
    IF the condition associated with the instruction is met,
    THEN perform the specified operation,
    ELSE do not perform the operation.

  e.g. original
      beq r1, label        // if (r1) = 0 branch to label
      move r2, r3          // move (r2) into r3
    label: …

  e.g. guarded
      cmovne r1, r2, r3    // if (r1) != 0, move (r2) into r3
      …

Convert control dependencies into data dependencies

Eliminated branches by full and restricted guarding {full: all instructions guarded; restricted: only ALU instructions guarded}

Guarded Execution: Disadvantages
  Guarding transforms instructions from both the taken and the not-taken paths into guarded instructions
    Increases the number of instructions
      by 33% for full guarding
      by 8% for restricted guarding
    {more instructions: more time and space}
  Guarding requires additional hardware resources if an increase in processing time is to be avoided
    VLIW
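
The conversion from the branching sequence to the guarded sequence in 8.6 can be sketched in a few lines of Python. This is only an illustrative model, not a description of any real processor: registers are modeled as plain integers, and the function names run_branching and run_guarded are hypothetical. The guarded version computes its result with an arithmetic select, so the outcome depends on r1 only as data, which is what "convert control dependencies into data dependencies" refers to.

```python
def run_branching(r1, r2, r3):
    """Model of the original sequence: beq r1, label / move r2, r3 / label:"""
    if r1 == 0:
        # beq r1, label: branch taken, so the move is skipped
        return r3
    # move r2, r3: copy the contents of r2 into r3
    r3 = r2
    return r3


def run_guarded(r1, r2, r3):
    """Model of the guarded sequence: cmovne r1, r2, r3"""
    # The guard is an ordinary data value (1 if r1 != 0, else 0).
    guard = int(r1 != 0)
    # Select between r2 and the old r3 without any control transfer.
    return guard * r2 + (1 - guard) * r3


def sequences_agree(values):
    """Check that both models leave the same value in r3 for all inputs."""
    return all(
        run_branching(r1, r2, r3) == run_guarded(r1, r2, r3)
        for r1 in values
        for r2 in values
        for r3 in values
    )
```

Calling sequences_agree(range(-2, 3)) compares the two models over a small set of register values. Note that the guarded model always evaluates the select expression, whichever way the condition falls, which mirrors the slide's point that guarded code executes instructions from both paths.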