Compiler-Based Register Name Adjustment for Low-Power Embedded Processors
Peter Petrov; Alex Orailoglu; ICCAD'03

Agenda
  Introduction
  Mathematical Formulation
  Heuristic Solutions for RNA
    Register PermuTation (RPT)
    Register PerturBation (RPB)
  Experimental Results
  Conclusions

Introduction
  Objective: low power.
  Key point: reduce bit-transition activity on the register index streams.
  Concept: Register Name Adjustment (RNA).

Example
  Original code:
    add r3, r2, r4    011 010 100
    sub r6, r3, r5    110 011 101
    sub r3, r2, r6    011 010 110
    mul r4, r4, r5    100 100 101
    Total bit transitions (summed per operand column): 7 + 4 + 5 = 16
  After register name adjustment:
    add r6, r2, r4    110 010 100
    sub r7, r6, r5    111 110 101
    sub r6, r2, r7    110 010 111
    mul r4, r4, r5    100 100 101
    Total bit transitions (summed per operand column): 3 + 4 + 3 = 10

Cost Function
  \[ C_{\text{cost}} = \sum_{l=1}^{3} \sum_{i=1}^{n-1} f_c\big(M(P_{i,l}),\, M(P_{i+1,l})\big) \]
  f_c(reg_a, reg_b): the Hamming distance between reg_a and reg_b.
  l: the l-th column in an instruction.
  M(P_{i,j}): a bijective mapping function from the original register P_{i,j} to a new register index.

Literals
  Literals: unchangeable fields in an instruction, such as an opcode or an immediate operand.
  L(i, j): records the literal positions.
  \[ M'(P_{i,j}) = \begin{cases} M(P_{i,j}) & \text{if } L_{i,j} = 0 \\ P_{i,j} & \text{if } L_{i,j} = 1 \end{cases} \]

Example
  Code:
    ld  r5, (r1) 0
    add r3, r2, r5
    add r4, r3, r2
    mul r3, r4, r3
    st  r3, r7 (10)
  P =
    v5  v1  0
    v3  v2  v5
    v4  v3  v2
    v3  v4  v3
    v3  v7  10
  L =
    0  0  1
    0  0  0
    0  0  0
    0  0  0
    0  0  1
  Occurrences of vertically adjacent pairs:
    (v3, v4): 3
    (v2, v3): 2
    (v5, v3): 1
    (v3, v3): 1
    (v4, v7): 1
    (v5, v2): 1
    (0, v5): 1
    (10, v3): 1
    (v1, v2): 1

Flow
  1. RPB: maximize the distribution skew of register pair occurrences.
  2. Select V_i and V_j that maximize f(e_ij) + f(e_ji).
  3. Pick names for V_i and V_j and compute the cost (brute force, time-consuming).
  4. All unassigned indices tried? No: repeat step 3. Yes: continue.
  5. Name V_i and V_j with the minimum cost.
  6. All registers named? No: return to step 2. Yes: finish.

Cost Function of RPT
  \[ C_{ij} = \sum_{k \in L_i} H(c_k, c_i) + \sum_{k \in R_i,\, k \in I} H(c_i, c_k) + \sum_{k \in L_j} H(c_k, c_j) + \sum_{k \in R_j,\, k \in I} H(c_j, c_k) + \big(f(e_{ij}) + f(e_{ji})\big)\, H(c_i, c_j) \]
  [Figure: nodes V_i and V_j connected by edges e_ij and e_ji; V_i is also adjacent to Literal_i and Reg_i nodes, V_j to Literal_j and Reg_j nodes.]

Register PerturBation
  Number of higher utilization frequencies ↓ : Performance ↑
  Number of self-transitions ↑ : Performance ↑

Cost Function of RPB
  \[ \sigma^2 = \frac{\sum (x - \mu)^2}{N}, \qquad \hat{D} = \frac{D}{N_P} \]
  D: the number of self-transitions.
  Maximize σ to maximize the distribution skew of register pair occurrences.
  Question: does a larger σ imply larger skewness?
  \[ C_0 = \lambda \hat{D} + (1 - \lambda)\,\hat{\sigma} \]

Register PerturBation
  Commutativity Transformation
    Before:
      r1 ← r2, r3
      r4 ← r1, r2
    After:
      r1 ← r2, r3
      r4 ← r2, r1
    Question: would the data dependency increase?
    Note: these instructions must be commutable.
  Dead Register Reassignment
    Before:
      r1 ← r2, r3
      r4 ← r1, r2
      r2 ← r3, r4
    After:
      r1 ← r2, r3
      r2 ← r1, r2
      r2 ← r3, r2
    Note: r4 must be dead after the third instruction.

Dead Register Reassignment
  [Figure: two transition graphs over r1, r2 and r3 with edges numbered 1 to 8, one per version of the code; a self-transition is marked.]

Experimental Results
  RPB columns use C_0 = λ D̂ + (1 − λ) σ̂ with the λ value shown.

  Circuit   Total    RPT Total  RPT Impr%  λ(0.0)  λ(0.25)  λ(0.5)  λ(0.75)  λ(1.0)  RPB Impr%
  ej        70       58         18.09      47      46       46      46       46      34.55
  fdct      73,837   63,169     14.45      49,203  48,933   48,934  48,934   45,224  38.75
  mmul      7,613    6,463      15.11      4,710   4,460    4,460   4,460    4,593   41.41
  tri       5,929    5,400      8.92       3,490   3,489    3,489   3,489    3,335   43.76
  sor       1,440    1,142      20.69      1,004   1,003    1,043   1,043    1,004   30.30
  adpcm_e   20,513   15,338     25.23      15,897  15,144   15,144  15,144   14,750  28.10
  adpcm_d   17,212   13,689     20.46      13,393  12,655   12,655  12,655   11,404  33.74

Conclusions
  Minimizing the bit transitions reduces the power consumption.
  RPT improves by up to 25%.
  RPB improves by up to 44%.
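
Appendix: checking the worked examples
The bit-transition counts in the four-instruction example, and the pair counts in the ld/add/add/mul/st example, can be reproduced with a short script. This is a minimal Python sketch rather than the authors' implementation. The function names (hamming, transition_cost, pair_occurrences) are hypothetical. The r7 -> r3 entry in the renaming is an assumption, added only to keep the mapping bijective on the premise that r7 is unused in the original snippet.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two register indices (f_c on the slides)."""
    return bin(a ^ b).count("1")

def transition_cost(program, mapping=None, literal_mask=None):
    """Sum Hamming distances, column by column, between consecutive instructions.

    program:      rows of register indices, one row per instruction
    mapping:      dict old index -> new index (identity for missing keys)
    literal_mask: same shape as program; 1 marks a literal that is never renamed
    """
    mapping = mapping or {}

    def renamed(i, l):
        value = program[i][l]
        if literal_mask is not None and literal_mask[i][l] == 1:
            return value  # literals keep their original encoding
        return mapping.get(value, value)

    total = 0
    for l in range(len(program[0])):        # operand columns
        for i in range(len(program) - 1):   # consecutive instruction pairs
            total += hamming(renamed(i, l), renamed(i + 1, l))
    return total

def pair_occurrences(program):
    """Count unordered pairs of vertically adjacent operands in each column."""
    counts = Counter()
    for l in range(len(program[0])):
        for i in range(len(program) - 1):
            pair = tuple(sorted((program[i][l], program[i + 1][l]), key=str))
            counts[pair] += 1
    return counts

# Four-instruction example: add, sub, sub, mul (register indices only).
example = [(3, 2, 4), (6, 3, 5), (3, 2, 6), (4, 4, 5)]
# r3 -> r6 and r6 -> r7 come from the slide; r7 -> r3 is an assumption.
renaming = {3: 6, 6: 7, 7: 3}

print(transition_cost(example))            # 16
print(transition_cost(example, renaming))  # 10

# Operand matrix P of the ld/add/add/mul/st example, written as strings.
P = [("v5", "v1", "0"), ("v3", "v2", "v5"), ("v4", "v3", "v2"),
     ("v3", "v4", "v3"), ("v3", "v7", "10")]
print(pair_occurrences(P)[("v3", "v4")])   # 3
```

Pairs are stored in sorted order, so the slide's (v5, v3) appears under the key ("v3", "v5"). A search heuristic such as RPT or RPB would call transition_cost repeatedly with candidate mappings; the sketch only evaluates a given mapping and does not search.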