Vedic Mathematics for Faster Mental Calculations and High Speed VLSI Arithmetic

Himanshu Thapliyal, Ph.D. Student
Department of Computer Science and Engineering
University of South Florida
Web: http://www.csee.usf.edu/~hthapliy/
Email: hthapliy@cse.usf.edu

Introduction

Vedic Mathematics was introduced by Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja (1884-1960). "Vedic" means derived from the Vedas; another definition, by Sri Sankaracharya, is “the fountain head and illimitable store house of all knowledge”. It is based on sixteen simple mathematical formulae from the Vedas, collected by Swamiji in the book ‘Vedic Mathematics’. Prof. Kenneth Williams from the U.K. and many other researchers have done significant work on Vedic Maths and have written many books on it and its applications.

Design of Multiplier Architecture Using Vedic Mathematics

Implemented Algorithm

Urdhva Tiryakbhyam (Vertically & Crosswise): Urdhva means vertically (up-down); Tiryakbhyam means left to right or vice versa.

TABLE 1: 16 x 16 bit Vedic multiplier using Urdhva Tiryakbhyam
CP = Cross Product (Vertically and Crosswise)

A = A15 A14 A13 A12 | A11 A10 A9 A8 | A7 A6 A5 A4 | A3 A2 A1 A0
          X3        |      X2       |     X1      |     X0
B = B15 B14 B13 B12 | B11 B10 B9 B8 | B7 B6 B5 B4 | B3 B2 B1 B0
          Y3        |      Y2       |     Y1      |     Y0

            X3 X2 X1 X0    Multiplicand [16 bits]
            Y3 Y2 Y1 Y0    Multiplier [16 bits]
-----------------------
J  I  H  G  F  E  D  C
P7 P6 P5 P4 P3 P2 P1 P0    Product [32 bits]

where X3, X2, X1, X0, Y3, Y2, Y1 and Y0 are each 4 bits wide.

PARALLEL COMPUTATION & METHODOLOGY

1. CP of (X0) and (Y0):                    X0 * Y0 = A
2. CP of (X1 X0) and (Y1 Y0):              X1 * Y0 + X0 * Y1 = B
3. CP of (X2 X1 X0) and (Y2 Y1 Y0):        X2 * Y0 + X0 * Y2 + X1 * Y1 = C
4. CP of (X3 X2 X1 X0) and (Y3 Y2 Y1 Y0):  X3 * Y0 + X0 * Y3 + X2 * Y1 + X1 * Y2 = D
5. CP of (X3 X2 X1) and (Y3 Y2 Y1):        X3 * Y1 + X1 * Y3 + X2 * Y2 = E
6. CP of (X3 X2) and (Y3 Y2):              X3 * Y2 + X2 * Y3 = F
7. CP of (X3) and (Y3):                    X3 * Y3 = G

Note: Each multiplication operation is an embedded parallel 4x4 multiply module.

Array Multiplier

E. L. Braun, Digital Computer Design. New York: Academic Press, 1963.

Derivation of Array Multiplier from Vedic Mathematics

Partition Multiplier Derivation from Vedic Mathematics

1. H. Thapliyal and M. B. Srinivas, "High Speed Efficient N X N Bit Parallel Hierarchical Overlay Multiplier Architecture Based On Ancient Indian Vedic Mathematics", Enformatika (Transactions on Engineering, Computing and Technology), Volume 2, Dec 2004, pp. 225-228.

Square and Cube Architecture Using Vedic Mathematics

Duplex for Binary Number

• To calculate the square of a number, the “Duplex” (D) property of binary numbers is used. In the Duplex, we take twice the product of the outermost pair, then add twice the product of the next outermost pair, and so on until no pairs are left. When the original sequence has an odd number of bits, one bit is left by itself in the middle, and it enters as such.

[1] H. Thapliyal and H. R. Arabnia, "A Time-Area-Power Efficient Multiplier and Square Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 2004 International Conference on VLSI (VLSI'04: Las Vegas, USA), paper acceptance rate of 35%, pp. 434-439.
[2] H. Thapliyal and M. B. Srinivas, "Design and Analysis of A Novel Parallel Square and Cube Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 1462-1465.
[3] H. Thapliyal and M. B. Srinivas, "An Efficient Method of Elliptic Curve Encryption Using Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 826-829.

• Thus,
• For a 1 bit number, D is the number itself, i.e. D(X0) = X0.
• For a 2 bit number, D is twice the product of the two bits, i.e. D(X1X0) = 2 * X1 * X0.
• For a 3 bit number, D is twice the product of the outer pair plus the middle bit, i.e. D(X2X1X0) = 2 * X2 * X0 + X1.
• For a 4 bit number, D is twice the product of the outer pair plus twice the product of the inner pair, i.e. D(X3X2X1X0) = 2 * X3 * X0 + 2 * X2 * X1.
• For the number to be squared, the bits are paired 4 at a time.
• Thus D(1) = 1;
• D(11) = 2 * 1 * 1;
• D(101) = 2 * 1 * 1 + 0;
• D(1011) = 2 * 1 * 1 + 2 * 0 * 1;

Square Proposed in [1,2]

[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", Technical Report CSL-TR-00-808, Stanford University, August 2000.

Proposed Square

The proposed square architecture is an improvement over partition multipliers, in which an N x N bit multiplication is performed by decomposing the multiplicand and multiplier bits into M partitions, where M = N/K (N is the width of the multiplicand and multiplier, divisible by 4, and K is a multiple of 4 such as 4, 8, 12, 16, ..., 4 * n). The partition multipliers are the fastest multipliers implemented in commercial processors and are much faster than conventional multipliers.

[1] H. Thapliyal and M. B. Srinivas, "Design and Analysis of A Novel Parallel Square and Cube Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 1462-1465.
[2] H. Thapliyal and M. B. Srinivas, "An Efficient Method of Elliptic Curve Encryption Using Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 826-829.

Performance Improvement Comparison with [1,2]

[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", Technical Report CSL-TR-00-808, Stanford University, August 2000.

Cube

The Anurupya Sutra of Vedic Mathematics states: “If you start with the cube of the first digit and take the next three numbers (in the top row) in a Geometrical Proportion (in the ratio of the original digits themselves) you will find that the 4th figure (on the right end) is just the cube of the second digit”.

$$
\begin{array}{rllll}
(a+b)^3 = & a^3 & +\; a^2 b  & +\; a b^2 & +\; b^3 \\
          &     & +\; 2a^2 b & +\; 2ab^2 & \\
\hline
          & a^3 & +\; 3a^2 b & +\; 3ab^2 & +\; b^3
\end{array}
$$

This sutra is used in this work to find the cube of a number. The N bit number M whose cube is to be calculated is divided into two partitions of N/2 bits each, say a and b, and the Anurupya Sutra is then applied to find the cube of the number.

[1] H. Thapliyal and M. B. Srinivas, "Design and Analysis of A Novel Parallel Square and Cube Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 1462-1465.
[2] H. Thapliyal and M. B. Srinivas, "An Efficient Method of Elliptic Curve Encryption Using Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 826-829.

Cube Proposed in [1,2]

[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", Technical Report CSL-TR-00-808, Stanford University, August 2000.

A Comparison

[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", Technical Report CSL-TR-00-808, Stanford University, August 2000.

Design of Division Architecture Using Vedic Mathematics

Straight Division

The example shown is from the book “Vedic Mathematics or Sixteen Simple Sutras From The Vedas” by Jagadguru Swami Sri Bharati Krsna Tirthaji, Motilal Banarsidass, Varanasi (India), 1965.

TABLE 2: 3 digit by 2 digit Vedic Division Algorithm (X2 X1 X0 divided by Y0 Y1)

Y1 |  X2   X1 : X0
Y0 |     C1  : C0
   |______________
      Z1   Z0 : RD

For example, 35001/77 works as follows:

 7 |  35   0   0 : 1
 7 |     7   7  : 7
   |_________________
       4   5   4 : 43

1. Divide 35 by 7 to get 5 as the quotient and 0 as the remainder.
2. Call ADJUST(5, 0, 0, 7, 7) => modified quotient = 4 and remainder = 7. Next dividend K = (7 * 10 + 0) - (7 * 4) = 42.
3. Compute K/7 to get 6 as quotient and 0 as remainder.
4. Call ADJUST(6, 0, 0, 7, 7) => modified quotient = 5 and remainder = 7. Next dividend K = (7 * 10 + 0) - (7 * 5) = 35.
5. Compute K/7 to get 5 as quotient and 0 as remainder.
6. Call ADJUST(5, 0, 1, 7, 7) => modified quotient = 4 and remainder = 7. Remainder RD = (7 * 10 + 1) - (7 * 4) = 43.

Therefore Quotient = 454 and Remainder = 43.

Steps:
1. First compute X2/Y0 to get Z1 as quotient and C1 as remainder.
2. Call procedure ADJUST(Z1, C1, X1, Y1, Y0). Now take the next dividend as K = (C1 * 10 + X1) - (Y1 * Z1).
3. Compute K/Y0 to get Z0 as quotient and C0 as remainder.
4. Call procedure ADJUST(Z0, C0, X0, Y1, Y0). Now the required remainder is RD = (C0 * 10 + X0) - (Y1 * Z0).

Hence Quotient Qt = Z1Z0 and Remainder = RD.

Procedure ADJUST (H, I, E, A, B)
{
    While ((I * 10 + E) < A * H)
    {
        H = H - 1;
        I = I + B;
    }
}

1. H. Thapliyal and H. R. Arabnia, "High Speed Efficient N Bit by N Bit Division Algorithm And Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of VLSI04, Las Vegas, U.S.A, June 2004, pp. 413-419.
2. H. Thapliyal and M. B. Srinivas, “VLSI Implementation of RSA Encryption System using Ancient Indian Vedic Mathematics”, Proceedings of SPIE, Volume 5837, VLSI Circuits and Systems II, Jose F. Lopez, Francisco V. Fernandez, Jose Maria Lopez-Villegas, Jose M. de la Rosa, Editors, June 2005, pp. 888-892.

Verification and Synthesis

• The algorithms are implemented in Verilog HDL and simulated in a Verilog simulator.
• The code is synthesized in Synopsys FPGA Express. The design is optimized for speed, targeting the Xilinx Spartan family, device S30VQ100.
• The design is completely technology independent and can easily be converted from one technology to another.
• The Spartan family used for synthesis consists of FMAPs and HMAPs, which are basically 4 input and 3 input XOR functions respectively.

Application of Vedic Division and Multiplier Architecture in Design of RSA Encryption Hardware

Timing Simulation Results of RSA Circuitry Using Vedic Overlay Multiplier and Division Architectures

RSA Architecture             Vendor   Family    Device     Area     Area     Delay (µs)
(with Overlay Multiplier)                                  (FMAP)   (HMAP)
Restore Division             Xilinx   Spartan   S30VQ100   14077    164      2.838
Non-Restore Division         Xilinx   Spartan   S30VQ100   6616     73       2.828
Vedic Division               Xilinx   Spartan   S30VQ100   14942    293      1.507

Results and Discussion

• Using the Vedic hierarchical overlay multiplier and the novel Vedic division algorithm leads to a significant improvement in performance.
• The RSA circuitry has a lower timing delay (1.507 µs with Vedic division) than its implementation using traditional multipliers and division algorithms (2.838 µs with Restore Division and 2.828 µs with Non-Restore Division).

Conclusions

• Vedic Maths algorithms lead to faster mental calculation.
• High speed VLSI arithmetic architectures can be derived from Vedic Maths.
• Due to their parallel and regular structure, the Vedic algorithms can be easily laid out on a silicon chip.
• This presentation is a tribute to the great scholar and mathematician Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja.
• The Vedic Maths India forum, led by Gaurav Tekriwal, is doing a great job of promoting Vedic Maths among students.

To refer to (cite) this presentation, the following style should be used:
Himanshu Thapliyal, “Vedic Mathematics for Faster Mental Calculations and High Speed VLSI Arithmetic”, Invited talk at IEEE Computer Society Student Chapter, University of South Florida, Tampa, FL, Nov 14 2008.
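
Illustration: Urdhva Tiryakbhyam cross products

The seven cross products of Table 1 can be modelled in software as a minimal sketch. The function names `split_nibbles` and `urdhva_multiply` are hypothetical and do not come from the presentation. The sketch assumes each cross product is weighted by 16^k for column k, and it lets Python integers absorb the carries that the hardware would propagate with adders.

```python
def split_nibbles(value, count=4):
    """Split an unsigned integer into `count` 4-bit chunks, least significant first."""
    return [(value >> (4 * i)) & 0xF for i in range(count)]


def urdhva_multiply(a, b, count=4):
    """Multiply two (4 * count)-bit unsigned integers from 4x4 partial products.

    Hypothetical software model of the vertical-and-crosswise scheme:
    column k collects every product X[i] * Y[j] with i + j == k.
    """
    x = split_nibbles(a, count)  # x[0] = X0, ..., x[3] = X3
    y = split_nibbles(b, count)  # y[0] = Y0, ..., y[3] = Y3
    product = 0
    for k in range(2 * count - 1):  # k = 0..6: one column per cross product
        cross = sum(x[i] * y[k - i]
                    for i in range(count)
                    if 0 <= k - i < count)
        # Each column is one 4-bit position further left than the previous one.
        product += cross << (4 * k)
    return product


if __name__ == "__main__":
    print(hex(urdhva_multiply(0x1234, 0xABCD)))
```

In hardware the 4x4 products within a column are independent of each other, which is the parallelism the slides refer to. This loop evaluates them sequentially only for clarity.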
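
Illustration: Duplex and squaring

The Duplex rules from the Square section can be checked with a short sketch. The slides define D but do not spell out how the duplexes are combined, so `duplex_square` assumes the usual duplex squaring scheme: one duplex per output bit position, taken over a window of bits whose indices sum to that position. Both function names are hypothetical.

```python
def duplex(bits):
    """Duplex D of a bit sequence.

    Twice the product of each outer pair, working inwards; with an odd
    number of bits the lone middle bit enters as such.
    """
    total = 0
    lo, hi = 0, len(bits) - 1
    while lo < hi:
        total += 2 * bits[lo] * bits[hi]
        lo += 1
        hi -= 1
    if lo == hi:  # odd length: middle bit is added unchanged
        total += bits[lo]
    return total


def duplex_square(value, width):
    """Square a `width`-bit unsigned integer by summing shifted duplexes.

    Assumption: output position k uses the duplex of bits lo..hi with
    lo + hi == k, so every pair in the window contributes to weight 2**k.
    """
    bits = [(value >> i) & 1 for i in range(width)]  # bits[0] is the LSB
    square = 0
    for k in range(2 * width - 1):
        lo = max(0, k - width + 1)
        hi = min(k, width - 1)
        square += duplex(bits[lo:hi + 1]) << k
    return square


if __name__ == "__main__":
    print(duplex([1, 0, 1, 1]), duplex_square(0b1011, 4))
```

The middle bit can enter unchanged because a single bit equals its own square, which is what makes the binary Duplex cheaper than the decimal one.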
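
Illustration: Straight Division with ADJUST

The straight division steps can be traced with the sketch below. It rests on a stated assumption: ADJUST lowers the quotient digit while the next gross dividend is smaller than Y1 times that digit, and each decrement returns Y0 to the running remainder, which keeps K = (C * 10 + X) - (Y1 * Z) non-negative. The names `adjust` and `straight_divide` are hypothetical, and passing the dividend as a list of digit groups is a convenience of this sketch, not something taken from the slides.

```python
def adjust(h, i, e, a, b):
    """ADJUST(H, I, E, A, B): h = quotient digit, i = remainder,
    e = next dividend digit, a = Y1 (flag digit), b = Y0.

    Assumed rule: decrement h until i * 10 + e covers a * h.
    """
    while i * 10 + e < a * h:
        h -= 1
        i += b  # one less multiple of Y0 was taken, so Y0 returns to the remainder
    return h, i


def straight_divide(groups, y0, y1):
    """Divide by the two-digit divisor y0 y1 (value 10 * y0 + y1).

    `groups` lists the dividend groups, most significant first, with at
    least two entries; every entry after the first is a single digit.
    Returns (quotient, remainder).
    """
    quotient = 0
    q, c = divmod(groups[0], y0)
    k = groups[0]
    for e in groups[1:]:
        q, c = adjust(q, c, e, y1, y0)
        quotient = quotient * 10 + q   # append the settled quotient digit
        k = c * 10 + e - y1 * q        # next dividend, or RD on the last digit
        q, c = divmod(k, y0)
    return quotient, k


if __name__ == "__main__":
    # 35001 / 77, written as the groups 35, 0, 0 : 1
    print(straight_divide([35, 0, 0, 1], 7, 7))
```

For 35001/77 the sketch settles the quotient digits 4, 5 and 4 with a carried remainder of 7 each time, matching the worked example.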