Vedic Mathematics for Faster Mental
Calculations and High Speed VLSI Arithmetic
Himanshu Thapliyal, Ph.D. Student
Department of Computer Science and Engineering
University of South Florida
Web: http://www.csee.usf.edu/~hthapliy/
Email: hthapliy@cse.usf.edu
Introduction
• Vedic Mathematics was introduced by Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja (1884-1960).
• Vedic means derived from the Vedas; another definition, by Sri Sankaracharya, is “the fountainhead and illimitable storehouse of all knowledge”.
• It is based on sixteen simple mathematical formulae from the Vedas, collected by Swamiji in the book ‘Vedic Mathematics’.
• Prof. Kenneth Williams from the U.K. and many other researchers have done significant studies on Vedic Maths.
• Many books on Vedic Maths and its applications have been written by Prof. Kenneth Williams and other researchers.
Design of Multiplier Architecture
Using Vedic Mathematics
Implemented Algorithm
Urdhva Tiryakbhyam (Vertically and Crosswise):
Urdhva means vertically (up-down); Tiryakbhyam means crosswise (left to right or vice versa).
TABLE 1: 16 x 16 bit Vedic multiplier using Urdhva Tiryakbhyam
CP = Cross Product (Vertically and Crosswise)

A = A15 A14 A13 A12 | A11 A10 A9 A8 | A7 A6 A5 A4 | A3 A2 A1 A0
            X3      |       X2      |     X1      |     X0
B = B15 B14 B13 B12 | B11 B10 B9 B8 | B7 B6 B5 B4 | B3 B2 B1 B0
            Y3      |       Y2      |     Y1      |     Y0

        X3  X2  X1  X0      Multiplicand [16 bits]
        Y3  Y2  Y1  Y0      Multiplier [16 bits]
-----------------------------------------------------------------
 J   I   H   G   F   E   D   C
 P7  P6  P5  P4  P3  P2  P1  P0      Product [32 bits]

Where X3, X2, X1, X0, Y3, Y2, Y1 and Y0 are each of 4 bits.
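PARALLEL COMPUTATION & METHODOLOGY
Each cross product (CP) takes a group of multiplicand partitions over the matching group of multiplier partitions:
1. CP of (X0) over (Y0) = X0 * Y0 = A
2. CP of (X1 X0) over (Y1 Y0) = X1 * Y0 + X0 * Y1 = B
3. CP of (X2 X1 X0) over (Y2 Y1 Y0) = X2 * Y0 + X0 * Y2 + X1 * Y1 = C
4. CP of (X3 X2 X1 X0) over (Y3 Y2 Y1 Y0) = X3 * Y0 + X0 * Y3 + X2 * Y1 + X1 * Y2 = D
5. CP of (X3 X2 X1) over (Y3 Y2 Y1) = X3 * Y1 + X1 * Y3 + X2 * Y2 = E
6. CP of (X3 X2) over (Y3 Y2) = X3 * Y2 + X2 * Y3 = F
7. CP of (X3) over (Y3) = X3 * Y3 = G
Note: Each multiplication operation is an embedded parallel 4x4 multiply module.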
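The seven cross products above can be checked in software. The following is a minimal Python sketch, not the hardware design from the talk: the function names `cross_products` and `vedic_multiply` are made up for illustration, and ordinary integer addition stands in for the adders that would combine the partial results on chip.

```python
def cross_products(x, y):
    """Return the 2n-1 vertical-and-crosswise sums for two equal-length
    lists of partitions, least-significant partition first."""
    n = len(x)
    return [
        sum(x[i] * y[k - i] for i in range(n) if 0 <= k - i < n)
        for k in range(2 * n - 1)
    ]


def vedic_multiply(a, b, parts=4, width=4):
    """Multiply two (parts * width)-bit integers from width-bit partitions."""
    mask = (1 << width) - 1
    x = [(a >> (width * i)) & mask for i in range(parts)]  # X0, X1, X2, X3
    y = [(b >> (width * i)) & mask for i in range(parts)]  # Y0, Y1, Y2, Y3
    cps = cross_products(x, y)  # the sums A, B, C, D, E, F, G listed above
    # Each cross product is weighted by the position of its partition group.
    return sum(cp << (width * k) for k, cp in enumerate(cps))
```

Calling `vedic_multiply(a, b)` with two 16-bit values should agree with `a * b`; every product inside `cross_products` is a 4 x 4 bit multiplication, matching the embedded multiply modules mentioned in the note.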
Array Multiplier
E.L. Braun. Digital computer design. New York academic, 1963.
Derivation of Array Multiplier from
Vedic Mathematics
Partition Multiplier Derivation from
Vedic Mathematics
1. H. Thapliyal and M.B Srinivas, "High Speed Efficient N X N Bit Parallel Hierarchical Overlay
Multiplier Architecture Based On Ancient Indian Vedic Mathematics", Enformatika (Transactions on
Engineering, Computing and Technology),Volume 2,Dec 2004, pp.225-228.
Square and Cube Architecture
Using Vedic Mathematics
Duplex for Binary Number
• To calculate the square of a number, the “Duplex” (D) property of binary numbers is used. In the Duplex, we take twice the product of the outermost pair, then add twice the product of the next outermost pair, and so on until no pairs are left. When the original sequence has an odd number of bits, one bit is left by itself in the middle, and it enters as it is.
[1] H. Thapliyal and H.R. Arabnia, "A Time-Area-Power Efficient Multiplier and Square Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 2004 International Conference on VLSI (VLSI'04: Las Vegas, USA), Paper acceptance rate of 35%; pp. 434-439.
[2] H. Thapliyal and M.B. Srinivas, "Design and Analysis of A Novel Parallel Square and Cube Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 1462-1465.
[3] H. Thapliyal and M.B. Srinivas, "An Efficient Method of Elliptic Curve Encryption Using Ancient Indian Vedic Mathematics", Proceedings of the 48th IEEE MIDWEST Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, USA, August 7-10, 2005, pp. 826-829.
• Thus,
• For a 1 bit number, D is the number itself, i.e. D(X0) = X0.
• For a 2 bit number, D is twice their product, i.e. D(X1X0) = 2 * X1 * X0.
• For a 3 bit number, D is twice the product of the outer pair plus the middle bit, i.e. D(X2X1X0) = 2 * X2 * X0 + X1.
• For a 4 bit number, D is twice the product of the outer pair plus twice the product of the inner pair, i.e. D(X3X2X1X0) = 2 * X3 * X0 + 2 * X2 * X1.
• The bits of the number to be squared are grouped 4 at a time.
• Thus D(1) = 1;
  D(11) = 2 * 1 * 1;
  D(101) = 2 * 1 * 1 + 0;
  D(1011) = 2 * 1 * 1 + 2 * 0 * 1.
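The Duplex definitions above can be turned into a small squaring routine. This is a minimal Python sketch under stated assumptions, not the proposed hardware: `duplex` and `square_by_duplex` are illustrative names, bits are held least-significant first, and the square is assembled by weighting the Duplex of each sliding bit window by its position. For a single bit the square equals the bit, which is why the middle bit enters unchanged.

```python
def duplex(bits):
    """Duplex of a bit sequence: twice each outer-pair product, plus the
    middle bit when the length is odd."""
    n = len(bits)
    total = sum(2 * bits[i] * bits[n - 1 - i] for i in range(n // 2))
    if n % 2:
        total += bits[n // 2]  # a lone middle bit enters as it is
    return total


def square_by_duplex(x, n):
    """Square an n-bit integer x by summing weighted Duplex terms."""
    bits = [(x >> i) & 1 for i in range(n)]  # least-significant bit first
    result = 0
    for k in range(2 * n - 1):
        lo = max(0, k - n + 1)  # lowest bit index taking part at weight 2**k
        hi = min(k, n - 1)      # highest bit index taking part at weight 2**k
        result += duplex(bits[lo:hi + 1]) << k
    return result
```

For instance, `duplex([1, 1, 0, 1])` evaluates D(1011) with the bits listed least-significant first, and `square_by_duplex(11, 4)` squares the same 4-bit number.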
Square Proposed in [1,2]
[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th
Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn," Parallel Square and Cube Computations", Technical
report CSL-TR-00-808 , Stanford University, August 2000.
Proposed Square
The proposed square architecture is an improvement over partition multipliers, in which the N x N bit multiplication is performed by decomposing the multiplicand and multiplier bits into M partitions, where M = N/K (here N is the width of the multiplicand and multiplier, divisible by 4, and K is a multiple of 4 such as 4, 8, 12, 16, ..., 4 * n). The partition multipliers are the fastest multipliers implemented in commercial processors and are much faster than conventional multipliers.
Performance Improvement
Comparison with [1,2]
[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th
Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn," Parallel Square and Cube Computations", Technical
report CSL-TR-00-808 , Stanford University, August 2000.
Cube
The cube architecture uses the Anurupya Sutra of Vedic Mathematics, which states: “If you start with the cube of the first digit and take the next three numbers (in the top row) in a Geometrical Proportion (in the ratio of the original digits themselves) you will find that the 4th figure (on the right end) is just the cube of the second digit”.
              a^3 +  a^2 b +  a b^2 + b^3
                  + 2a^2 b + 2a b^2
              -----------------------------
(a + b)^3  =  a^3 + 3a^2 b + 3a b^2 + b^3
This sutra has been utilized in this work to find the cube of a number. The N-bit number M whose cube is to be calculated is divided into two partitions of N/2 bits each, say a and b, and then the Anurupya Sutra is applied to find the cube of the number.
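The partitioning described above can be sketched in Python as follows. This is an illustrative sketch only: the name `cube_by_partition` is an assumption, N is taken to be even, and the shifts express the fact that M = a * 2^(N/2) + b, so each column of the sutra carries a weight that is a power of 2^(N/2).

```python
def cube_by_partition(m, n_bits):
    """Cube an n_bits-wide integer m by splitting it into two halves a, b."""
    if n_bits % 2:
        raise ValueError("n_bits is assumed to be even")
    half = n_bits // 2
    a = m >> half               # upper N/2 bits
    b = m & ((1 << half) - 1)   # lower N/2 bits
    # Top row in geometric proportion a : b, starting from a^3.
    top = [a ** 3, a * a * b, a * b * b, b ** 3]
    # Second row: double the two middle terms.
    second = [0, 2 * a * a * b, 2 * a * b * b, 0]
    columns = [t + s for t, s in zip(top, second)]  # a^3, 3a^2b, 3ab^2, b^3
    # Weight the columns by 2^(3N/2), 2^N, 2^(N/2) and 1.
    return sum(col << (half * (3 - i)) for i, col in enumerate(columns))
```

For an 8-bit input, `cube_by_partition(m, 8)` should agree with `m ** 3`.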
Cube Proposed in [1,2]
[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th
Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn," Parallel Square and Cube Computations", Technical
report CSL-TR-00-808 , Stanford University, August 2000.
A Comparison
[1] Albert A. Liddicoat and Michael J. Flynn, "Parallel Square and Cube Computations", 34th
Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[2] Albert Liddicoat and Michael J. Flynn," Parallel Square and Cube Computations", Technical
report CSL-TR-00-808 , Stanford University, August 2000.
Design of Division Architecture
Using Vedic Mathematics
Straight Division
The example shown is from the book “Vedic Mathematics or Sixteen Simple Sutras From The Vedas” by Jagadguru Swami Sri Bharati Krishna Tirthaji, Motilal Banarsidass, Varanasi (India), 1965.
TABLE 2: 3 digit by 2 digit Vedic Division Algorithm
X2 X1 X0 divided by Y0 Y1

 Y0 | X2   X1 : X0
 Y1 |      C1 : C0
    ------------------
      Z1   Z0 : RD
For example, 35001/77 works as follows:

 7 | 3 5   0   0 : 1
 7 |       7   7 : 7
   --------------------
       4   5   4 : 43

1. Divide 35 by 7 to get 5 as the quotient and 0 as the remainder.
2. Call ADJUST(5, 0, 0, 7, 7).
   => modified quotient = 4 and remainder = 7
   Next dividend K = (7 * 10 + 0) - (7 * 4) = 42
3. Do K/7 to get 6 as quotient and 0 as remainder.
4. Call ADJUST(6, 0, 0, 7, 7).
   => modified quotient = 5 and remainder = 7
   Next dividend K = (7 * 10 + 0) - (7 * 5) = 35
5. Do K/7 to get 5 as quotient and 0 as remainder.
6. Call ADJUST(5, 0, 1, 7, 7).
   => modified quotient = 4 and remainder = 7
   Remainder RD = (7 * 10 + 1) - (7 * 4) = 43
Therefore Quotient = 454 and Remainder = 43.
Steps:
1. First do X2/Y0 (divide) to get Z1 as quotient and C1 as remainder.
2. Call procedure ADJUST(Z1, C1, X1, Y1, Y0).
   Now take the next dividend as K = (C1 * 10 + X1) - (Y1 * Z1).
3. Do K/Y0 (divide) to get Z0 as quotient and C0 as remainder.
4. Call procedure ADJUST(Z0, C0, X0, Y1, Y0).
   Now the required remainder is RD = (C0 * 10 + X0) - (Y1 * Z0).
Hence the Quotient Qt = Z1Z0 and Remainder = RD.
Procedure ADJUST (H, I, E, A, B)
{
    // H: quotient digit, I: remainder, E: next dividend digit,
    // A: second divisor digit (Y1), B: first divisor digit (Y0)
    While ( (I * 10 + E) < A * H )
    {
        H = H - 1;
        I = I + B;
    }
}
1. H. Thapliyal and H. R. Arabnia, "High Speed Efficient N Bit by N Bit Division Algorithm And Architecture Based On Ancient Indian Vedic Mathematics", Proceedings of VLSI04, Las Vegas, U.S.A, June 2004, pp. 413-419.
2. H. Thapliyal and M.B. Srinivas, "VLSI Implementation of RSA Encryption System using Ancient Indian Vedic Mathematics", Proceedings of SPIE -- Volume 5837 VLSI Circuits and Systems II, Jose F. Lopez, Francisco V. Fernandez, Jose Maria Lopez-Villegas, Jose M. de la Rosa, Editors, June 2005, pp. 888-892.
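The division steps and the ADJUST procedure can be exercised with a short Python sketch. It rests on assumptions that should be read as such: the names `adjust` and `straight_divide` are illustrative, the divisor is a two-digit number Y0 Y1, the dividend is processed one decimal digit at a time starting from its leading digit, and ADJUST compares against Y1 times the quotient digit while adding Y0 back to the remainder, which is what keeps the next dividend K non-negative.

```python
def adjust(h, i, e, flag, lead):
    """Lower quotient digit h until (i * 10 + e) - flag * h is non-negative.

    flag is the second divisor digit (Y1), lead is the first (Y0).
    """
    while i * 10 + e < flag * h:
        h -= 1      # take one unit back from the quotient digit
        i += lead   # which returns one divisor digit Y0 to the remainder
    return h, i


def straight_divide(dividend, divisor):
    """Divide a non-negative integer by a two-digit divisor (10..99)."""
    if dividend < 0 or not 10 <= divisor <= 99:
        raise ValueError("expects dividend >= 0 and a two-digit divisor")
    lead, flag = divmod(divisor, 10)                     # Y0, Y1
    digits = [int(ch) for ch in str(dividend).zfill(2)]  # at least 2 digits
    z, c = divmod(digits[0], lead)                       # first quotient digit
    quotient = 0
    for e in digits[1:-1]:
        z, c = adjust(z, c, e, flag, lead)
        quotient = quotient * 10 + z
        z, c = divmod(c * 10 + e - flag * z, lead)       # K / Y0
    z, c = adjust(z, c, digits[-1], flag, lead)
    quotient = quotient * 10 + z
    remainder = c * 10 + digits[-1] - flag * z           # RD
    return quotient, remainder
```

Running `straight_divide(35001, 77)` follows the same sequence of quotient digits as the worked example, with the only difference that it starts from the single leading digit 3 rather than from 35.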
Verification and Synthesis
• The algorithms are implemented in Verilog HDL and simulated in a Verilog simulator.
• The code is synthesized in Synopsys FPGA Express. The design is optimized for speed, targeting the Xilinx Spartan family, device S30VQ100.
• The design is completely technology independent and can be easily converted from one technology to another.
• The Spartan family used for synthesis consists of FMAP and HMAP, which are basically 4-input and 3-input XOR functions respectively.
Application of Vedic Division and
Multiplier Architecture in
Design of RSA Encryption
Hardware
Timing Simulation Results of RSA Circuitry Using Vedic Overlay Multiplier and Division Architectures

RSA Architecture            Vendor   Family    Device     Area     Area     Delay (µs)
(With Overlay Multiplier)                                 (FMAP)   (HMAP)
Restore Division            Xilinx   Spartan   S30VQ100   14077    164      2.838
Non-Restore Division        Xilinx   Spartan   S30VQ100   6616     73       2.828
Vedic Division              Xilinx   Spartan   S30VQ100   14942    293      1.507
Results and Discussion
• Using the Vedic hierarchical overlay multiplier and the novel Vedic division algorithm leads to a significant improvement in performance.
• The RSA circuitry has less timing delay than its implementation using traditional multipliers and division algorithms: 1.507 µs with Vedic division, against 2.838 µs with restore division and 2.828 µs with non-restore division.
Conclusions
• Vedic Maths algorithms lead to faster mental calculation.
• High speed VLSI arithmetic architectures can be derived from Vedic Maths.
• Due to their parallel and regular structure, the Vedic algorithms can be easily laid out on a silicon chip.
• This presentation is a tribute to a great scholar and mathematician, Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja.
• The Vedic Maths India forum, led by Gaurav Tekriwal, is doing a great job in promoting Vedic Maths among students.
To refer to (cite) this presentation, the following style should be used:
Himanshu Thapliyal, “Vedic Mathematics for Faster Mental
Calculations and High Speed VLSI Arithmetic”, Invited talk at IEEE
Computer Society Student Chapter, University of South Florida,
Tampa, FL, Nov 14 2008.