The Effect of Speculatively Updating Branch History on Branch Prediction Accuracy, Revisited

Eric Hao, Po-Yung Chang, Yale N. Patt
Department of Electrical Engineering and Computer Science
The University of Michigan
Ann Arbor, MI 48109-2122

Abstract

Recent research [6] has suggested that the branch history register need not contain the outcomes of the most recent branches in order for the Two-Level Adaptive Branch Predictor to work well. From this result, it is tempting to conclude that the branch history register need not be speculatively updated. This paper revisits that work. It explains why the Two-Level Adaptive Branch Predictor can perform well without the outcomes of the most recent branches when a fixed number of them is omitted. It also shows that this result does not imply that speculative update can be omitted: because the number of unresolved branches varies during program execution, predictors without speculative update perform significantly worse than predictors with speculative update.

Keywords: dynamic branch prediction, Two-Level Adaptive Branch Prediction, speculative execution, out-of-order execution, superscalar processors

1 Introduction

Very accurate branch prediction is a critical requirement for high performance on wide-issue, deeply pipelined, out-of-order superscalar processors. To address this need, many branch prediction algorithms have been developed [2, 3], including the Two-Level Adaptive Branch Predictor [7, 8, 4]. The Two-Level Adaptive Branch Predictor bases its predictions on the outcomes of previously issued branches, which are stored in its branch history registers. Past branch outcomes are crucial to accurate prediction. Because the processor may be speculatively executing far ahead in the dynamic instruction stream, it may have issued branches whose outcomes are yet to be resolved at the point where a prediction is made. The question is whether the branch history registers should be speculatively updated with the predicted outcomes of these unresolved branches.

Recent research [6] has questioned the usefulness of speculatively updating the branch history registers. They examined a version of the Two-Level Adaptive Branch Predictor which excluded a fixed number of the most recent branches from the branch history register (the skipped model). They showed that prediction accuracy was not significantly affected as the number of excluded branches was increased (i.e. the accuracy remained fairly constant). Based on this result, they concluded that the presence of unresolved branches does not significantly degrade the performance of branch predictors.

In this paper, we revisit this result. In section 2, we present an explanation for why excluding a fixed number of the most recent branches from the branch history register does not significantly affect prediction accuracy. In section 3, we show that predictors that do not speculatively update their branch history registers perform significantly worse than predictors that do, because the number of unresolved branches varies, and that the accuracy of speculatively updated predictors is not affected by the number of unresolved branches. Section 4 provides some concluding remarks.

2 Prediction Based on Older Histories

In this section, we revisit the experiment of [6] on the skipped predictor model, in which a fixed number of the most recently issued branches is excluded from the branch history register.
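The two history models can be made concrete with a small trace-driven sketch. This is not the simulator used in this paper or in [6]. It assumes two-bit saturating counters, unbounded per-address pattern history tables keyed by (branch address, history pattern), and tables that are updated right after each prediction. The names TwoLevelPredictor, skipped, and accuracy are hypothetical.

```python
from collections import defaultdict


class TwoLevelPredictor:
    """Global history register with per-address pattern history tables.

    skipped = 0 gives the standard model. skipped = n gives the skipped
    model: the n most recent outcomes are left out of the history and
    n older outcomes are used in their place.
    """

    def __init__(self, history_bits, skipped=0):
        self.m = history_bits
        self.n = skipped
        self.recent = 0  # last m + n outcomes, newest outcome in bit 0
        # Assumption: two-bit counters that start in the weakly-taken state.
        self.counters = defaultdict(lambda: 2)

    def _history(self):
        # Drop the n newest outcomes, keep the next m older ones.
        return (self.recent >> self.n) & ((1 << self.m) - 1)

    def predict(self, pc):
        return self.counters[(pc, self._history())] >= 2

    def update(self, pc, taken):
        key = (pc, self._history())
        count = self.counters[key]
        self.counters[key] = min(3, count + 1) if taken else max(0, count - 1)
        window = (1 << (self.m + self.n)) - 1
        self.recent = ((self.recent << 1) | int(taken)) & window


def accuracy(trace, predictor):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


if __name__ == "__main__":
    # A single branch that strictly alternates taken / not taken.
    alternating = [(0x40, i % 2 == 0) for i in range(1000)]
    print(accuracy(alternating, TwoLevelPredictor(2, skipped=0)))
    print(accuracy(alternating, TwoLevelPredictor(2, skipped=1)))
```

On this alternating trace both models predict almost every branch correctly once the counters have warmed up, since the older history still determines the next outcome.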
The skipped predictor model bases its predictions on branch histories that are older than the ones used by the standard predictor model (see figure 1). It exchanges the outcomes of the n most recently issued branches for the outcomes of n older branches. The original experiment [6] found that varying the number of skipped branches from one to four has little effect on prediction accuracy.

Figure 1: Standard and skipped branch history models for an m-bit branch history register. The standard model uses the outcomes of the m most recently issued branches, b1 ... bm. The skipped model omits the n most recent branches and uses b(n+1) ... b(m+n).

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association of Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. MICRO 27 - 11/94 San Jose CA USA © 1994 ACM 0-89791-707-3/94/0011..$3.50

We repeated this experiment using trace-driven simulation of the Two-Level Adaptive Branch Predictor. The predictor was modeled with a global branch history register and per-address pattern history tables (GAp). The tables were unbounded, so that no mispredictions would be due to pattern history table conflicts (footnote 1). History register lengths of 2, 4, 8, 12, and 16 bits were simulated, giving the configurations GAp(2,inf), GAp(4,inf), GAp(8,inf), GAp(12,inf), and GAp(16,inf). The experiment used five SPEC92 integer benchmarks: compress, eqntott, espresso, gcc, and xlisp (footnote 2). Each benchmark was simulated for 10 million conditional branches.

Our results confirmed the results of [6]: prediction accuracy remains fairly constant as the number of skipped branches is increased.

This result is surprising. If the outcomes of the most recently issued branches provide useful information for predicting a branch, excluding them should lower prediction accuracy. To explain this apparent contradiction, we reran the experiment and recorded, for each prediction, the outcomes of the skipped branches.

For a given static branch and a given branch history register pattern, there are 2^n possible sequences of omitted branch outcomes when n branches are skipped. If, for a given (static branch, history pattern) pair, a single sequence of omitted outcomes occurs an overwhelming majority of the time, we call this sequence the dominant skipped sequence of that pair. We measured the fraction of branch predictions in which a dominant skipped sequence occurred. Figures 2-6 plot this fraction for each benchmark. For predictor configurations with history register lengths of at least eight, the fraction never dropped below 93%.

Figures 2-6: The fraction of branch predictions in which a dominant skipped sequence occurred, one plot per benchmark (espresso, xlisp, eqntott, compress, gcc). x-axis: Number of Skipped Branches (0 to 5). Curves: GAp(16,inf), GAp(12,inf), GAp(8,inf), GAp(4,inf), GAp(2,inf).

Because the dominant skipped sequence occurs the large majority of the time, the values of the skipped branch outcomes can be thought of as being implicitly represented by the (static branch, history pattern) pair associated with the prediction. This implicit representation allows the skipped model to achieve prediction accuracy comparable to that of the standard model. In the instances in which the omitted outcomes match the dominant skipped sequence, the skipped predictor does not lose the information about the most recently issued branches, and in exchange for them it receives the information about n older branch outcomes for free. Such exchanges are performed for every prediction the predictor makes. Only in the rare instances in which the omitted outcomes do not match the dominant skipped sequence does the skipped model lack information that the standard model has.

Footnote 1: A study [9] has shown that updating the tables immediately provides no appreciable gain in prediction accuracy.

Footnote 2: Sc was omitted due to problems with the simulator.
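The dominant skipped sequence measurement described above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' tooling: it takes the dominant sequence of a (static branch, history pattern) pair to be the omitted-outcome sequence seen most often for that pair, and the function name dominant_fraction and the (pc, taken) trace format are hypothetical.

```python
from collections import Counter, defaultdict


def dominant_fraction(trace, history_bits, skipped):
    """Fraction of predictions whose omitted outcomes match the dominant
    skipped sequence of their (static branch, history pattern) pair.

    trace is a list of (pc, taken) pairs in issue order.
    """
    m, n = history_bits, skipped
    recent = 0  # last m + n outcomes, newest outcome in bit 0
    seen = defaultdict(Counter)  # (pc, older history) -> omitted sequence counts
    for pc, taken in trace:
        older = (recent >> n) & ((1 << m) - 1)  # what the skipped model sees
        omitted = recent & ((1 << n) - 1)       # the n outcomes it leaves out
        seen[(pc, older)][omitted] += 1
        recent = ((recent << 1) | int(taken)) & ((1 << (m + n)) - 1)
    total = sum(sum(counts.values()) for counts in seen.values())
    # Assumption: the dominant sequence is the most frequent one per pair.
    dominant = sum(max(counts.values()) for counts in seen.values())
    return dominant / total


if __name__ == "__main__":
    # A loop branch: taken three times, then not taken, repeated.
    loop = [(0x40, i % 4 != 3) for i in range(400)]
    print(dominant_fraction(loop, history_bits=4, skipped=2))
```

For a regular loop branch like the one in the usage example, the older history pins down the position in the loop, so the omitted outcomes are almost always the same for a given pair and the fraction is close to one.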
3 Speculative Update

In this section, we show the positive impact of speculatively updating the branch history register on prediction accuracy. We show that the prediction accuracies of predictors with speculative update are significantly higher than those of predictors without speculative update. Furthermore, we show that the accuracies of speculatively updated predictors are independent of the number of unresolved branches present in the machine.

Recent research [6] erroneously reports that the prediction accuracy of speculatively updated predictors decreased dramatically as the number of unresolved branches in the machine increased. This result was due to an error in that study's simulator [5]. The accuracy of a speculatively updated predictor is determined only by the predictions made on the correct path of the program. For such predictions, all the outcomes contained in the branch history register are guaranteed to be correct, regardless of whether the corresponding branches are resolved or not. The number of unresolved branches therefore does not affect the prediction accuracy of a speculatively updated predictor.

The skipped model experiment showed that omitting a fixed number of branches does not significantly affect prediction accuracy. This result does not apply to a predictor that omits all unresolved branches, because the number of unresolved branches can vary during program execution. To determine the effect of omitting unresolved branches, we simulated a dynamically scheduled machine and compared its performance with and without speculative update.

The simulated machine could issue up to eight instructions per cycle, of which at most one could be a branch. The branch predictor had a 16 bit global branch history register. Upon resolving a mispredicted branch, the machine recovered via checkpointing [1]. The same set of benchmarks used in section 2 was used to measure performance. Each benchmark was simulated for 100 million instructions.

Four variations of branch history update were modeled: one with speculative update and three without (resolved, resolved+, and retired). With speculative update, the branch history register is updated with the predicted outcome of a branch as soon as the branch is predicted. In the resolved variation, the branch history register is updated with the outcome of a branch as soon as that branch is resolved, so outcomes do not necessarily enter the register in issue order. In the resolved+ variation, the register is updated with the outcome of a branch as soon as that branch and all the branches issued before it are resolved. This variation is identical to the skipped model with the exception that the number of skipped branches is allowed to vary during the execution of a program. In the retired variation, the register is updated with the outcome of a branch as soon as the branch is retired (i.e. the branch and all the instructions issued before it have been executed). Figure 7 illustrates the differences between the variations.

Figure 7: The branch history used to predict a branch with speculative update and with the three variations without speculative update, for an m-bit branch history register. Branches are shown in issue order and marked as retired, resolved, or unresolved.

We measured the prediction accuracy and the average number of instructions retired per cycle (IPC) for each benchmark. The results are shown in figures 8 and 9. All three variations without speculative update suffered significantly lower prediction accuracy and IPC than the variation with speculative update; the reported average decreases range from 19% to 41%.

Figure 8: Branch prediction accuracies for the branch history update variations of the out-of-order model, by benchmark.

Figure 9: IPCs for the branch history update variations of the out-of-order model, by benchmark.

This decrease is not due to the exclusion of recently issued branches in itself. It is due to the variation in the number of unresolved branches present in the machine. By allowing the number of unresolved branches to vary, the branch history register is no longer guaranteed to contain the same sequence of branches for every dynamic occurrence of a particular program state. Without the ability to identify distinct program states, the predictor can no longer make accurate predictions.
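The effect of a varying number of omitted outcomes can be illustrated with a small experiment. The sketch below is a simplification and not the machine simulator used for figures 8 and 9: it assumes a single static branch, two-bit saturating counters, and a number of omitted outcomes that simply cycles through a given list, which stands in for the varying number of unresolved branches in a real out-of-order machine. The name accuracy_with_omissions and its parameters are hypothetical.

```python
from collections import defaultdict


def accuracy_with_omissions(outcomes, history_bits, omitted_counts, pc=0x40):
    """Predict one static branch while leaving out a varying number of the
    most recent outcomes from the history.

    omitted_counts is cycled through, one entry per prediction. A list with
    a single entry reproduces the skipped model with a fixed skip count.
    """
    mask = (1 << history_bits) - 1
    window = (1 << (history_bits + max(omitted_counts))) - 1
    counters = defaultdict(lambda: 2)  # assumption: weakly-taken start state
    recent = 0  # newest outcome in bit 0
    correct = 0
    for i, taken in enumerate(outcomes):
        k = omitted_counts[i % len(omitted_counts)]
        visible = (recent >> k) & mask  # history with k newest outcomes missing
        key = (pc, visible)
        correct += (counters[key] >= 2) == taken
        count = counters[key]
        counters[key] = min(3, count + 1) if taken else max(0, count - 1)
        recent = ((recent << 1) | int(taken)) & window
    return correct / len(outcomes)


if __name__ == "__main__":
    # A loop branch: taken three times, then not taken, repeated.
    loop = [i % 4 != 3 for i in range(1200)]
    print(accuracy_with_omissions(loop, 4, [0]))        # standard model
    print(accuracy_with_omissions(loop, 4, [2]))        # fixed skip of two
    print(accuracy_with_omissions(loop, 4, [0, 1, 2]))  # varying omission
```

With a fixed number of omitted outcomes, each visible history still corresponds to one position in the loop, so the predictor learns the loop exit. When the number varies, the same visible history appears at several loop positions, and in this example the predictor falls back to roughly always-taken accuracy.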
Consider the example of predicting two dynamic instances of the same branch b0 (see figure 10). Suppose that the machine states at the time of the two predictions are such that the actual branch histories of the two occurrences are the same, so the predictor should be making the same prediction for both. Suppose further that the q most recently issued branches are unresolved and must be omitted at the time of the first prediction, and that the r most recently issued branches must be omitted at the time of the second prediction, where q and r are different. Because the number of unresolved branches varied, the contents of the branch history register used for the two predictions are different. This causes the predictor to use a different pattern history table entry for each prediction, potentially making different predictions for the two branches and lowering prediction accuracy.

Figure 10: The branch history register contents used by the resolved variation for predicting different dynamic instances of branch b0.

4 Conclusion

In this paper, we reexamined the experiment of [6] on Two-Level Adaptive Branch Prediction. We showed that the outcomes of the most recently issued branches can be omitted from the branch history register without significantly affecting prediction accuracy when the number of skipped branches is fixed, because the predictor can often infer those outcomes from the static instruction to be predicted and the older branch outcomes provided by the skipped model. We also showed that predictors with speculative update far outperform predictors without speculative update, because predictors that are not speculatively updated lose the ability to identify distinct program states and hence the ability to make accurate predictions. More importantly, the accuracies of speculatively updated predictors are independent of the number of unresolved branches present in the machine, and so are not adversely affected by them.

Acknowledgments

This paper is one result of our ongoing research in high performance computer implementation at the University of Michigan. The support of our industrial partners: AT&T/GIS, Hewlett-Packard, Intel, Motorola, and Scientific and Engineering Software is greatly appreciated. We would also like to thank Adam Talcott for the clear explanations he provided about their work and the reviewers of this paper for their helpful suggestions.

References

[1] W.-M. W. Hwu and Y. N. Patt, "Checkpoint repair for out-of-order execution machines," in Proceedings of the 14th Annual International Symposium on Computer Architecture, pp. 18-26, 1987.

[2] J. K. F. Lee and A. J. Smith, "Branch prediction strategies and branch target buffer design," IEEE Computer, pp. 6-22, January 1984.

[3] S. McFarling and J. Hennessy, "Reducing the cost of branches," in Proceedings of the 13th Annual International Symposium on Computer Architecture, pp. 396-403, 1986.

[4] S.-T. Pan, K. So, and J. T. Rahmeh, "Improving the accuracy of dynamic branch prediction using branch correlation," in Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 76-84, 1992.

[5] A. R. Talcott, Personal communication, June 1994.

[6] A. R. Talcott, W. Yamamoto, M. J. Serrano, R. C. Wood, and M. Nemirovsky, "The impact of unresolved branches on branch prediction scheme performance," in Proceedings of the 21st Annual International Symposium on Computer Architecture, pp. 12-21, 1994.

[7] T.-Y. Yeh and Y. N. Patt, "Two-level adaptive branch prediction," in Proceedings of the 24th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 51-61, 1991.

[8] T.-Y. Yeh and Y. N. Patt, "Alternative implementations of two-level adaptive branch prediction," in Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 124-134, 1992.

[9] T.-Y. Yeh and Y. N. Patt, "A comprehensive instruction fetch mechanism for a processor supporting speculative execution," in Proceedings of the 25th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 129-139, 1992.