Fully Redundant Decimal Arithmetic

Saeid Gorgin and Ghassem Jaberipur
Dept. of Electrical & Computer Engr., Shahid Beheshti University, Tehran, Iran
Gorgin@sbu.ac.ir, Jaberipur@sbu.ac.ir

Kindly presented by Professor Behrooz Parhami
Dept. of Electrical & Computer Engr., Univ. of California, Santa Barbara, USA
parhami@ece.ucsb.edu

19th IEEE Symposium on Computer Arithmetic. Portland, Oregon, USA. June 8-10, 2009.

Motivations
• 4-bit decimal digit representations (e.g., BCD) have an inherent capacity for representing some redundant decimal digit sets. For instance:
  – The decimal digit set [0, 15] via unsigned 4-bit encoding
  – The decimal digit set [–7, 7] via 4-bit 2’s complement
• In other words, no extra storage is needed to represent redundant decimal results. They can be saved permanently in memory, as they are, for later manipulation with redundant inputs and outputs. That is, there is no need to convert intermediate results to nonredundant format to fit the memory words.

Outline
• Introduction
  – Nonredundant decimal addition
  – Redundant decimal addition
  – Previous redundant decimal adders
• Decimal Septa Signed Digit (DSSD [–7, 7]) adder
• DSSD adder/subtractor
• Fully redundant DSSD multiplier
  – PPG for DSSD operands
• Fully redundant DSSD divider
  – DSSD multiplicative division
  – DSSD subtractive division
• Comparison
• Conclusions

Nonredundant decimal addition
• Decimal full adder (DFA) as the basic cell
  – Functionality (inputs $x_i$, $y_i$, $c_{in}$; outputs $c_{out}$, $s_i$):
      $c_{out} = \lfloor (x_i + y_i + c_{in}) / 10 \rfloor$
      $s_i = |x_i + y_i + c_{in}|_{10}$ (the sum modulo 10)
  – Realization via standard 4-bit adders:
    [Figure: DFA block with inputs x_i, y_i, c_in and outputs c_out, s_i]

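The DFA functionality above can be sketched behaviourally. The first function follows the two formulas directly. The second is an assumption: the slide's figure is not recoverable, so it uses the textbook BCD scheme (a binary add followed by a +6 correction), which may differ from the realization the authors drew. All function names are illustrative.

```python
def decimal_full_adder(x, y, c_in):
    """Behavioural DFA: x, y in [0, 9], c_in in {0, 1}. Returns (c_out, s)."""
    total = x + y + c_in
    return total // 10, total % 10


def dfa_with_4bit_adders(x, y, c_in):
    """Assumed textbook realization: binary add, then +6 correction."""
    raw = x + y + c_in            # first 4-bit binary adder, raw in [0, 19]
    c_out = 1 if raw > 9 else 0   # decimal carry detection
    s = (raw + 6 * c_out) & 0xF   # second 4-bit adder, low 4 bits kept
    return c_out, s


def ripple_decimal_add(xs, ys):
    """Add two equal-length BCD digit lists (least significant digit first).

    The carry ripples through every position, which is the latency
    problem that redundant digit sets avoid.
    """
    carry, digits = 0, []
    for x, y in zip(xs, ys):
        carry, s = decimal_full_adder(x, y, carry)
        digits.append(s)
    return digits, carry
```

For example, `ripple_decimal_add([9, 9, 9], [1, 0, 0])` propagates a carry through all three positions before the final carry-out is known.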
The problem of carry propagation in nonredundant number systems
• Carry acceleration techniques:
  – Carry look-ahead adder
    – Ling adder
    – Parallel prefix adder
  – Carry skip adder
  – Carry select adder
  – Conditional sum adder
  – Hybrid adders
  – …
• Logarithmic latency at best! A 34-digit adder?!
[Figure: chain of digit adders with inputs x0 y0 … x3 y3 and outputs s0 … s4]

Redundant decimal addition
• An example (digit set is [–7, 7])

  i                  4    3    2    1    0
  x_i                     2    3   −5    7
  y_i                     5    6   −6   −1
  Step I    p_i           7    9  −11    6
  Step II   w_i          −3   −1   −1    6
            t_i      1    1   −1    0
  Step III  s_i      1   −2   −2   −1    6

Redundant decimal addition
  – Let the operand and sum digits ($x_i$, $y_i$ and $s_i$) all be in [−α, β]:
  – Decimal carry-free addition with 4-bit digits requires that:
    [equations not recoverable from the slide]

Redundant decimal addition …
• Straightforward implementation
  – Compute $p_i = x_i + y_i$
  – Extract the transfer $t_{i+1}$ via comparison of $p_i$ with α
  – Compute the interim sum digit $w_i = p_i − 10 t_{i+1}$
  – Form the final sum digit $s_i = w_i + t_i$

Previous redundant decimal adders
• In chronological order:
  – Svoboda 1969: digit set [−6, 6]
  – Shirazi 1989: digit set [−7, 7]
  – Nikmehr 2004: digit set [−9, 9]
  – Moskal 2007: digit set [−9, 9]; the fastest previous design, with a delay of 18 ∆G

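The four-step carry-free addition described on the earlier slides can be sketched as follows for the digit set [–7, 7]. The exact comparison rule is an assumption inferred from the worked example (p = 7 yields a transfer of 1, p = 6 yields no transfer): a transfer of +1 when $p_i \ge 7$, −1 when $p_i \le −7$, and 0 otherwise. The function names are illustrative, and this is a behavioural model, not the gate-level design.

```python
ALPHA = 7  # symmetric digit set [-7, 7]


def transfer_and_interim(p):
    """Steps II: split p = x + y into (transfer t_{i+1}, interim digit w_i).

    Assumed rule: compare p with +/-ALPHA so that w stays in [-6, 6],
    which leaves room to absorb an incoming transfer of -1, 0 or +1.
    """
    if p >= ALPHA:
        return 1, p - 10
    if p <= -ALPHA:
        return -1, p + 10
    return 0, p


def dssd_add(x, y):
    """Carry-free addition of equal-length digit lists, least significant first.

    Returns one more digit than the operands (the top transfer).
    """
    p = [a + b for a, b in zip(x, y)]               # Step I
    pairs = [transfer_and_interim(v) for v in p]    # Step II
    t = [0] + [tr for tr, _ in pairs]               # t[i] enters position i
    w = [wi for _, wi in pairs] + [0]
    return [wi + ti for wi, ti in zip(w, t)]        # Step III


def value(digits):
    """Integer value of a signed-digit list, least significant first."""
    return sum(d * 10 ** i for i, d in enumerate(digits))
```

Applied to the slide's example, `dssd_add([7, -5, 3, 2], [-1, -6, 6, 5])` yields the digits 1, −2, −2, −1, 6 (most significant first), and each sum digit depends only on positions i and i−1, so the latency does not grow with the operand length.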
Decimal Septa Signed Digit Adder
• Form the immediate 2CL carry-save sum ($p_i = x_i + y_i$)
• Partition $p_i$ into $u_i$ and $v_i$
• Extract the transfer $t_{i+1}$
[Figure: 4-level and 3-level logic blocks operating on the bits of x_i and y_i and producing the transfer bits and the bits of z_i and v_i]

DSSD Adder …
[Figure: DSSD adder digit slice with 4-level and 3-level logic followed by full adders (FA) and a half adder (HA), annotated with delays in ∆G]
• Overall latency is 9 ∆G

DSSD Adder/Subtractor
[Figure: DSSD adder and DSSD adder/subtractor digit slices. Time penalty: XOR gate]

Fully redundant DSSD multiplier
• The three phases of conventional multiplication:
  1. Partial product generation (PPG)
     • Based on precomputed multiples of the multiplicand X: {±X, ±3X, ±4X}
     • Precomputed sets in the previous works: {X, 2X, 4X, 5X}, {±X, ±2X, 5X, 10X}, {X, 2X, 5X, 8X, 9X}
  2. Partial product reduction (PPR)
     • Via DSSD adders
  3. Final product computation
     • Not needed due to the fully redundant nature

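One way to see why the multiples {±X, ±3X, ±4X} can suffice for PPG is that every multiplier digit in [–7, 7] is a sum of at most two values from {0, ±1, ±3, ±4}. The sketch below only checks that arithmetic fact. It is an illustration under that assumption and does not reproduce the authors' selection logic, which the slides do not spell out. The names `MULTIPLES`, `build_selection_table` and `partial_product` are hypothetical.

```python
from itertools import product

# Selectable factors: 0 plus the precomputed multiples named on the slide.
MULTIPLES = (0, 1, -1, 3, -3, 4, -4)


def build_selection_table():
    """Map each digit d in [-7, 7] to a pair (a, b) from MULTIPLES with a + b == d."""
    table = {}
    for a, b in product(MULTIPLES, repeat=2):
        digit = a + b
        if -7 <= digit <= 7 and digit not in table:
            table[digit] = (a, b)
    return table


def partial_product(multiplicand, digit, table):
    """Form digit * multiplicand as the sum of two selected multiples."""
    a, b = table[digit]
    return a * multiplicand + b * multiplicand


SELECTION = build_selection_table()
```

For instance, the digit 6 can be formed as 3X + 3X and the digit 7 as 4X + 3X, so no multiple outside the precomputed set is needed in this reading.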
DSSD partial product generation
[Figure: partial product generation datapath with multiplexers (Mux); signal labels include x_i, y_j, ±μ_i, ±3μ_i, ±4μ_i, h_{i,j}, l_{i,j} and p_{i,j}]

DSSD partial product generation …
[Figure: bit-level view of the digits; labels L3 l2 l1 l0, H2 h1 h0, P3 p2 p1 p0]

Fully redundant DSSD divider
• The impact of fully redundant operands and results on the state-of-the-art decimal hardware division algorithms:
  – DSSD multiplicative division
    • The reciprocal of the divisor 1/D is produced via converging multiplications or […]

Fully redundant DSSD divider
  – DSSD subtractive division
    • We adopt the techniques described in [Lang07] for the DSSD representation to obtain the comparison multiples and new quotient digits.

Evaluation and Comparison
• Adder
  – Gate level: 9 ∆G (proposed DSSD design) vs. 18 ∆G (the best previous one [Shir89])
  – Logical effort: 7.80 FO4 vs. 13.56 FO4
• Multiplier
  – 48.33 FO4 (proposed DSSD design) vs. 65 FO4 (the best previous nonredundant multiplier [Vazq07])

Evaluation and Comparison
• Comparison of adders based on synthesis
[Figure: synthesis comparison chart; axis label "Ratio"]

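The delay figures quoted above can be turned into speed ratios with a few lines of arithmetic. This is only a restatement of the slide's numbers; the helper name `speedup` is illustrative.

```python
def speedup(previous_delay, proposed_delay):
    """Ratio of the previous design's delay to the proposed design's delay."""
    return previous_delay / proposed_delay


adder_gate_level = speedup(18, 9)            # delays in gate levels (∆G)
adder_logical_effort = speedup(13.56, 7.80)  # delays in FO4
multiplier_fo4 = speedup(65, 48.33)          # delays in FO4

for name, ratio in [
    ("adder, gate level", adder_gate_level),
    ("adder, logical effort", adder_logical_effort),
    ("multiplier, FO4", multiplier_fo4),
]:
    print(f"{name}: {ratio:.2f}x")
```

The multiplier ratio of roughly 1.34 is in line with the 34% figure quoted in the conclusions.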
Conclusion and future work
• We have proposed
  – A framework for fully redundant decimal arithmetic based on the DSSD digit set [–7, 7]
    • 40% speed improvement over the fastest previous adder/subtractor
    • 34% speed improvement over the best nonredundant multiplier
    • The same delay as the nonredundant divider
• For future work
  – Synthesis of the proposed multiplier and divider
  – Using benchmarks to evaluate the speed advantage of the proposed framework in typical computations

Questions?
Esfahan, Iran
• Greetings from the authors, who eagerly wished to be here and meet the great scholars
• With sincere thanks to Professor Parhami