Design and Implementation of Binary Signed Vedic Multiplier for

advertisement
Proc. of Int. Conf. on Recent Trends in Signal Processing, Image Processing and VLSI, ICrtSIV
Design and Implementation of Binary Signed Vedic
Multiplier for Real Time Digital Signal Processing
Applications
Sunil Kumar K B1 and Ganesh V N2
1
MITE/M.Tech 3rd Semester, Moodabidare, India
Email: sunilkumarkb7@gmail.com
2
MITE /Assistant Professor, Department of ECE, Moodabidare, India
Email: ganesh.vnaik@gmail.com
Abstract— In any digital signal processors (DSP’s) the speed of the multipliers plays a very
important role. The precision of the multipliers also plays very important role along with its
speed. For many high performance systems such as Digital Signal Processors,
Microprocessors etc Multipliers are the key components. Since the Floating point
multipliers can provide required precision but they are relatively slow and tend to consume
more silicon area compared to fixed point (Q-format) multipliers. Since the Multipliers are
the slowest component in any system, the system’s performance is determined by
multiplier’s performance. In this paper we propose a fast fixed point signed multiplication
using Urdhava Tiryakbhyam method of Vedic Mathematics.
Index Terms— Fractional fixed point, Intellectual Property, Q-format, Vedic Mathematics,
Urdhava Tiryakbhyam.
I. INTRODUCTION
Vedic Mathematics derived from the “Vedas” the ancient Indian scriptures which means the source of
Knowledge. Vedas covers all the forms of mathematic computations, be it algebra, geometry or trigonometry.
The significant feature of Vedic Mathematics is, its algorithms are designed in the way our mind naturally
works. This makes the fastest and easiest way to perform any mathematical calculations mentally. Vedic
Mathematics is created around 1500 BC and was rediscovered by Sri Bharthi Krishna Tirthaji (1884-1960)
around 1911 to 1918. He was a well known philosopher, Sanskrit scholar and a mathematician [1]. He
classified and organized the entire Vedic Mathematics into 16 formulae. And these 16 formulae are also
called as sutras. These 16 formulae forms the backbone of Vedic mathematics. A lot of research is been
carried out these years to implement these algorithms on digital processors. Because of symmetry and
coherence in these algorithms it consumes less area and can have a regular silicon layout [2,3].
Basically signal processing algorithms are implemented using high level languages like C or Matlab using
floating point number representations. The algorithms used for mapping architecture using floating point
representation is more expensive as it consumes a more hardware. Hence Fixed point number representation
which is a good option to implement the architecture at silicon level. Therefore we focus our work for
multiplication operation using optimized hardware modules which the most used operation in signal
processing applications such as Image processing systems, Optical signal processing, Fourier transforms etc.
DOI: 03.AETS.2014.5.206
© Association of Computer Electronics and Electrical Engineers, 2014
Considering the 16 bit Q15 format and 32 bit Q31 format fixed point representations which are best suited for
implementation on processors for digital signal processing applications with the required precision. The
advantage of Q-format over floating point multipliers is in Q-format, fraction multiplication is carried out
using integer multipliers which consumes very less silicon area and very fast. In this paper we propose the
Urdhava Tiryakbhyam method of vedic mathematics for the implementation of Q-format [6] high speed
multiplier. And we also implement multipliers using normal booth algorithm [7] and Xilinx parallel
multiplier Intellectual Property and compare the speed or maximum frequency of these multipliers.
II. FIXED-POINT ARITHMETIC
A N-bit number fixed point number [6] can be interpreted as either an fractional or a integer number. Fixed
point integer is difficult to use in the processors due to its possible overflow problem. For e.g. In a 16-bit
processor the dynamic range for signed integers is from - 215 to 215-1 i.e. 32768 to 32767. If we multiply 600
with 800, the result is 48000 which is an overflow. To overcome this situation we use fractional fixed point
representation also called as Q-format representation.
A. Fixed Point Multiplication
In general any Q-format representation is denoted by Qm,n, where m is the number of bits to represent integer,
n is number of bits to represent fractional part and the total number of bits is given by N = m+n+1 for signed
numbers. For e.g. Q 4,11 format signifies a total of 16 bits are required to represent fractional number in which
4 bits are reserved for the integer part and 11 bits for the fractional part and 1 bit indicates the Sign. Special
cases of Q-format consist of zero bits to indicate the integer part. Q0,15 (Q15) and Q0,31 (Q31) are the two such
formats for 16 bit and 32 bit representations respectively. Since there is no integer part the fractional number
lie between the range 1 and -1.
An N-bit number in Qm,n format is represented as follows [6].
an+m an+m-1… … … an.an-1 … … a1 a0
Here the ‘.’ an.an-1 represents the fixed point and the value of [1] is given by,
(an+m 2N-1+ an+m-1 2 N-2… … …+ a2 22+a1 2+a0)2-n
If we want to convert a fractional number in the range of the desired Qm,n format we should multiply it with
2n. The resultant value is rounded off or truncated to the nearest integer. Therefore a small amount of
precision loss is involved which reduces as the number of bits representing the fractional part increases. We
prefer rounding technique, since its error bias in positive and negative direction is same [6]. Therefore
rounded value will be more precise.
In general a Qm,n format has a resolution of 2-n and its dynamic range lies between -2 m to 2m - 2-n. Therefore as
the number of fractional bits increases the resolution increases and as the number of integer bits increases the
dynamic range increases. The resolution of Q15 format is 2-15.
B. Q - Format Multiplication
When two Q15 numbers are multiplied their product will be 32 long as illustrated in the Fig. 1. The product
consists of a extended sign bit or redundant bit. Since the memory of product to be stored should be a Q15
number we shift the product by one bit to left and the MSB 16 bits (including redundant sign bit) is stored in
the memory.
Fig. 1. Illustrates the multiplication of the two Q15 format numbers. Therefore, with the Q-format
representation, multiplications of the two fractional numbers are carried out using the integer multiplication.
Integer multiplications are faster compared to floating point multipliers and also consume less area which is
the major advantage of the Q-format representation.
III. URDHAVA TIRYAKBHYAM METHOD
Urdhava Tiryakbhyam [2] is a Sanskrit word which means crosswire and vertical in English. This method is a
general multiplication formula and can be applied to all the cases of multiplication. In this method
concurrently all the partial products are generated. Fig. 2 demonstrates a 4 x 4 binary multiplication using
Urdhava Tiryakbhyam method.
8
This multiplier based is on the Urdhava Tiryakbhyam algorithm of the vedic mathematics is independent of
the speed of the processor because their sums and partial products are calculated in parallel. And also by
increasing the data bus widths of input and output since they are regular in structure we can increase the
Figure 1. Product of two Q15 format numbers in Q15 format itself
Figure 2. Multiplication of two 4-bit numbers in urdhava tiryakbhyam method
processing power. Due to its regular structure, it consumes less area [3] and can be layout into a silicon chip
easily. The area and gate delay increases very slowly with the increase in the number of input bits as
compared to other multipliers. Hence Urdhava Tiryakbhyam multiplier is space, power and time efficient.
The line diagram in fig. 2 illustrates multiplying of the two 4-bit binary numbers a3a2a1a0 and b3b2b1b0.
Initially in step1 of Fig. 2, LSB of the multiplier is multiplied with LSB of the multiplicand (vertically
multiplied). The result we get will be the LSB of the product. In step2 the next bit of the multiplier is
multiplied with the least significant bit of the multiplicand and the least significant bit of the multiplier is
multiplied with the next bit of the multiplicand (crosswire multiplied). The two partial products are added
and the least significant bit of the sum is the next bit of the product and remaining bits are carried forward to
the next step. For example, in one of the step, if we get the 1010 as the resultt, then 0 will act as the resultant
bit (referred as fn ) and 101 as the carry bits (referred as Cn ). Hence Cn may be a multi-bit number. The sums
and partial products for every step can be parallel calculated. The significant feature is, the switching
instances of a processor increases with the increase in the operating frequency of the processor. This results
in the higher device operating temperatures due to the more power dissipation and also consumption in the
form of heat. Scalability is the another advantage of this multiplier.
For each step in fig. 2 the corresponding expressions are as follows:
f0 = a0b0.
(1)
c1f1 = a1b0 + a0b1.
(2)
c2f2 = c1 + a2b0 + a1b1 + a0b2.
(3)
9
c3f3 = c2 + a3b0 + a2b1 + a1b2 + a0b3.
(4)
c4f4 = c3 + a3b1 + a2b2 + a1b3.
(5)
c5f5 = c4 + a3b2 + a2b3.
(6)
c6f6 = c5 + a3b3.
(7)
With c6f6f5f4f3f2f1f0 being the final result [5]. Hence this mathematical formula is generally applicable for all
cases of multiplication.
For the multiplication of two 8-bit numbers using 4-bit multiplier we proceed as follows. Consider XHXL
and YHYL as the two 8-bit numbers, where XH and YH corresponds to the MSB 4 bits, XL and YL are the
LSB 4 bits of an 8-bit number. When these two numbers are multiplied according to Vedic algorithm
(vertically and crosswire) method, we get,
XH
XL
YH
YL
______________________
(XH * XL) + (XH * YL + YH* XL) + (XL * BL)
Thus we need two adders to add the partial products, four 4-bit multipliers and an 4-bit intermediate carry is
generated. The product of 4 x 4 bit multiplier is 8 bits long, the 4 MSB bits in each step is carried to the next
step and 4 LSB bits corresponds to the final product..This process is continued for 3 steps for this case.
Similarly, a 16 bit multiplier has two 16 bit adders and four 8 x 8 multipliers with 8 bit carry. Hence the
modularity of this multiplier is very high and leads to the scalability and regularity of the multiplier layout.
IV. ARCHITECTURE
Our design of Q-format signed multiplier includes Urdhava Tiryakbhyam integer multiplier [4] with certain
modifications. Since concurrently all the partial products are computed the operating speed of this multiplier
is very fast compared to other multipliers. The product of the 16-bit Q15 multiplier is 16 bits long which is
also a Q15 number. Now, if the most significant bit of input is 1 then the number is a negative number.
Hence before proceeding with the multiplication the 2’s complement of the number is taken. Since the most
significant bit signifies the sign of the number it is neglected and a ‘0’ is placed in most significant bit during
multiplication. A 16-bit Q15 format multiplier is as shown in Fig. 3, the resulting product is 32 bits long,
which consists of four 8 x 8 Urdhava multipliers. But the product of a Q15 number must also be a Q15
number which should be of 16 bits long. Hence 1 bit is left shifted to remove the redundant sign bit in the 32
bit product. And only the MSB 16 bits of the product are considered which forms final product. If the output
is ‘1’ then the 16 bit final result is converted to its 2’s complement format indicating that the product is
negative. To determine the sign of the result an xor operation is performed on input sign bits.
Fig. 3. Q15 format multiplier’s Architecture
10
V. IMPLEMENTATION AND RESULTS
The proposed Urdhava Tiryakbhyam Q-format multiplier is designed using structural form of coding and
VHDL. The 4 x 4 Urdhava Tiryakbhyam integer multiplier which is made up of two 2x2 multiplier blocks
which forms the basic block of Q15 multiplier. The clock is used to completely synchronize the design.
Further, normal booth’s multiplier and Xilinx parallel multiplier Intellectual Property (IP) generated by
Xilinx Core Generator which is optimized for speed with no pipelining stage were also implemented and
compared with this multiplier.
A. Simulation Results
The design was simulated using Xilinx ISE 14.2 version.
For the multiplication of Q15 format shown in Fig.4,
Input1 = -0.75 = 1010 0000 0000 0000
Input2 = -0.25 = 1100 0000 0000 0000
Output = 0.1875 = 0001 1000 0000 0000
Fig. 4. Simulation results of Q15 multiplier
TABLE I: C OMPARISON OF 16 BIT Q15-FORMAT M ULTIPLIERS
Maximum
Frequency
(in MHz)
6 – input Slice
LUT Usage
Urdhava
Multiplier
236.18
Xilinx Parallel
Multiplier IP
168.77
434/19200
Normal Booth
Multiplier
128.35
718/19200
662/19200
Factor which
Urdhava
Multiplier is
faster
1.40 times
1.84 imes
The speed of Urdhava Qformat multiplier is faster than the booth multiplier by a factor of 1.84. For a Q15
format multiplier, the speed is improved by a factor of 1.40 compared to Xilinx parallel multiplier IP and
booth multiplier respectively.
VI. CONCLUSION
This paper proposed a architecture of fast multiplier for multiplications of signed Q-format numbers using
Urdhava Tiryakbhyam algorithm of Vedic mathematics. The proposed multiplier speeds up the multiplication
operation compared to other multipliers which is the basic hardware block as the Q-format representation,
which is widely used in Digital Signal Processors. They are faster than the booth multipliers. It is also shown
that the Urdhava Tiryakbhyam Q-format multiplier is faster than Xilinx parallel multiplier Intellectual
Property (IP) using slice LUT’s and also using DSP48E blocks which are meant specifically for digital signal
processing operations like multiply-accumulate, multiply-add etc. Therefore the Q-format multiplier based on
the Urdhava Tiryakbhyam algorithm is best suited for faster multiplications required in the digital signal
processing applications.
11
REFERENCES
[1] Jagadguru Swami Sri BharatiKrisnaTirthaji Maharaja, “VedicMathematics: Sixteen Simple Mathematical Formulae
fromthe Veda,” MotilalBanarasidas Publishers, Delhi, 2009, pp. 5-45.
[2] H. Thapliyal and M. B. Shrinivas and H. Arbania, “Design and Analysisof a VLSI Based High Performance Low
Power Parallel Square Architecture,” Int. Conf. Algo.Math.Comp. Sc., Las Vegas, June 2005, pp. 72-76.
[3] HimanshuThapliyal and M. B. Srinivas,“An efficient method of elliptic curve encryption using Ancient Indian
Vedic Mathematics,”48th IEEE International Midwest Symposium on Circuits and Systems, 2005, vol. 1, pp. 826828.
[4] M. Pradhan and R. Panda, “Design and Implementation of Vedic Multiplier,” A.M.S.E Journal, Computer Science
and Statistics, Francevol. 15, July 2010, pp. 1-19.
[5] Harpreet Singh Dhillon ,AbhijitMitra, “A Reduced-Bit Multiplication Algorithm for Digital Arithmetics”
International Journal of Computational and Mathematical Sciences, Spring 2008, pp.64-69.
[6] Sen-Maw Kuo and Woon-SengGan, “Digital Signal Processor, architectures, Implementations and applications,”
Pearson PrenticeHall, 2005, pp. 253-323
[7] A.D. Booth, “A Signed Binary Multiplication Technique”, Qrt. J. Mech. App. Math., vol. 4, 1951, pp. 236–240.
12
Download