Vol 05, Article 10465, October 2014
International Journal of VLSI and Embedded Systems-IJVES
http://ijves.com
ISSN: 2249 – 6556
HIGH-SPEED SUPERIOR BI-ROTATIONAL CORDIC USING
QUADRANT AMENDMENT WITH PRE-SCALED
STRUCTURAL DESIGN
K JAYARAM KUMAR (1), P SOUNDARYA MALA (2)
(1) PG student (VLSI & Embedded Systems), GIET, Rajahmundry, India, jramworld@gmail.com
(2) Associate Professor, ECE Dept, GIET, Rajahmundry, India, palivela.soundarya@gmail.com
ABSTRACT
Rotation of vectors through fixed and known angles has many applications in animation, robotics, games,
computer graphics and digital signal processing. In fixed-angle rotation, the rotation of the vector is not
controlled by the design until all elementary rotations are completed. In real-time applications this is a major
issue because of unpredictable angles and ineffective system gain. Therefore, in this paper, we propose an
optimized superior bi-rotational CORDIC design for high-speed fixed-angle vector rotation through specific
angles with angle correction and quadrant amendment. An efficient structural design is also developed with
conventional circuits to reduce the circuit complexity. The angle correction and quadrant amendment are
performed by a rotation mapping and a position sequencing algorithm, respectively.
Keywords: Bi-rotational CORDIC, vector rotation, position sequencer, CORDIC.
I. INTRODUCTION
CORDIC stands for COordinate Rotation DIgital Computer. The key concept of CORDIC arithmetic is
based on the simple and ancient principles of two-dimensional geometry. The iterative formulation of a
computational algorithm for its implementation, however, was first described in 1959 by Jack E. Volder [1], [2] for the
computation of trigonometric functions, multiplication and division. CORDIC-based computing received
increased attention in 1971, when John Walther [3], [4] showed that, by varying a few simple parameters, it
could be used as a single algorithm for unified implementation of a wide range of elementary transcendental
functions involving logarithms, exponentials, and square roots along with those suggested by Volder [1]. Around
the same time, Cochran [5] benchmarked various algorithms and showed that the CORDIC technique is a better
choice for scientific calculator applications.
The popularity of CORDIC was very much enhanced thereafter primarily due to its potential for efficient and
low cost implementation of a large class of applications which include: the generation of trigonometric,
logarithmic and transcendental elementary functions; complex number multiplication, eigenvalue computation,
matrix inversion, solution of linear systems and singular value decomposition (SVD) for signal processing,
image processing, and general scientific computation. Some other popular and upcoming applications are:
1) direct frequency synthesis, digital modulation and coding for speech synthesis and communication.
2) direct and inverse kinematics computation for robot manipulation.
3) planar and three-dimensional vector rotation for graphics and animation.
Rotation of vectors through a fixed and known angle has wide applications in robotics, graphics, games and
animation [6], [13], [14]. Locomotion of robots is very often performed by successive rotations through small
fixed angles and translations of the links. The translation operations are realized by simple additions of
coordinate values while the new coordinates of a rotational step could be accomplished by suitable successive
rotations through a small fixed angle which could be performed by a CORDIC circuit for fixed rotation [6].
Similarly, interpolation of orientations between key-frames in computer graphics and animation could be
performed by fixed CORDIC rotations [14]. Examples of uniform rotation abound, from
electrons inside an atom to planets and satellites. A simple example of uniform rotation is the hands of an
animated mechanical clock, which rotate by one degree at each step. High-speed constant rotation is required
in several cases in games, graphics and animation, and objects with constant rotation
are very often used in simulation, modelling, games and animation. Rotation through
a known small angle in these areas could be implemented efficiently by simple and dedicated
CORDIC circuits. Keeping the requirements and constraints of different application environments in view,
CORDIC algorithms and architectures have been developed to achieve a high throughput rate and to
reduce hardware complexity as well as the latency of implementation. Some of the typical approaches for
reduced-complexity implementation focus on minimizing the complexity of the scaling operation and
the complexity of the barrel-shifter in the CORDIC engine. Latency of implementation is an inherent drawback of
the conventional CORDIC algorithm. Angle recoding schemes, mixed-grain rotation and higher-radix CORDIC
have been developed for reduced-latency realization. Parallel and pipelined CORDIC have been suggested for
high-throughput computation.
2010-2014 – IJVES
Indexing in Process - EMBASE, EmCARE, Electronics & Communication Abstracts, SCIRUS, SPARC, GOOGLE Database, EBSCO, NewJour, Worldcat,
DOAJ, and other major databases etc.,
II. METHODOLOGY
A CORDIC can be operated in two different modes, the vectoring mode and the rotation mode. In vectoring mode,
the coordinates (x, y) are rotated until y converges to zero. In rotation mode, the initial vector (x, y) starts aligned with
the x-axis and is rotated by an angle θi in every cycle, so after n iterations, θn is the angle obtained.
All the trigonometric functions can be computed or derived from functions using vector rotations. The CORDIC
algorithm provides an iterative method of performing vector rotations by arbitrary angles using only shift and
add operations. The algorithm is derived from the general rotation transform. The CORDIC algorithm performs
a planar rotation. Graphically, planar rotation means transforming a vector (x, y) into a new vector (x', y'). The vector
V′ is the image of V after an anticlockwise rotation by an angle ϕ. From Fig.1 and Fig.2, it can be observed that
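$$x = r\cos\theta, \qquad y = r\sin\theta \tag{1}$$
$$V' = \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{2}$$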
Fig.1 Rotation of vector V by an angle ϕ
Fig.2 Vector v with magnitude r and phase θ
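It is well known that the rotation matrix is
$$R(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \tag{3}$$
Applying successive rotations to the initial vector $[1, 0]^T$, the modified rotation matrix is
$$R(\phi) = \cos\phi \begin{bmatrix} 1 & -\tan\phi \\ \tan\phi & 1 \end{bmatrix} \tag{4}$$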
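The multiplication by the tangent term can be avoided if $\tan\phi_i = 2^{-i}$. In digital hardware, this reduces to a simple
shift operation, and since $\cos(\phi_i) = \cos(-\phi_i)$, the cosine factor does not depend on the direction of rotation and is constant for a fixed number of iterations.
$$R(\phi_i) = \cos\phi_i \begin{bmatrix} 1 & -2^{-i} \\ 2^{-i} & 1 \end{bmatrix} \tag{5}$$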
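Equation (2) can be expressed as
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \cos\phi_i \begin{bmatrix} 1 & -2^{-i} \\ 2^{-i} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{6}$$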
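ϕi may be positive or negative depending upon whether the rotation is anticlockwise or clockwise. Equation (6)
can be expressed as
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \cos\phi_i \begin{bmatrix} 1 & -d_i\,2^{-i} \\ d_i\,2^{-i} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{7}$$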
III. ADVANCED CORDIC DESIGN FOR FIXED ANGLE VECTOR ROTATION
In the rotation-mode CORDIC algorithm, the basic CORDIC equations can now be expressed [7] as
$$x_{i+1} = k_i\,(x_i - y_i\,d_i\,2^{-i}) \tag{8}$$
$$y_{i+1} = k_i\,(y_i + x_i\,d_i\,2^{-i}) \tag{9}$$
The angle accumulator is
$$z_{i+1} = z_i - d_i \tan^{-1}(2^{-i}) \quad\text{or}\quad z_{i+1} = z_i - d_i\,\phi_i \tag{10}$$
where $i$ is the index of the micro-rotation used to reach the required angle, $k_i = \cos(\arctan(2^{-i}))$ and $d_i = \pm 1$ (the direction of rotation). The product of the $k_i$ gives the K-factor
$$K = \prod_{i=0}^{n-1} k_i \tag{11}$$
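The iteration in (8) to (10) can be sketched in software as follows. This is a minimal floating-point model, not the hardware described in this paper: the function name `cordic_rotate` is hypothetical, the multiplication by `2.0 ** -i` stands in for a right shift by i bits, and the per-step factors k_i are collected into one final multiplication by K, which is equivalent to applying k_i in every iteration.

```python
import math


def cordic_rotate(x, y, angle, n=16):
    """Rotate (x, y) by `angle` radians using n shift-add micro-rotations.

    Hypothetical helper: a floating-point model of equations (8)-(10).
    """
    z = angle          # angle accumulator, driven towards zero
    scale = 1.0        # accumulates K = k_0 * k_1 * ... * k_(n-1)
    for i in range(n):
        d = 1 if z >= 0 else -1          # direction of this micro-rotation
        shift = 2.0 ** -i                # models a right shift by i bits
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)        # z_(i+1) = z_i - d_i * phi_i
        scale *= math.cos(math.atan(shift))
    # Applying K once at the end equals applying k_i in every step.
    return x * scale, y * scale, z


xr, yr, residual = cordic_rotate(1.0, 0.0, math.pi / 6)
print(xr, yr, residual)
```

With n = 16 the residual angle is bounded by arctan(2^-15), so the result for a 30° rotation should agree with (cos 30°, sin 30°) to about four decimal places.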
The architecture of the new CORDIC rotator can be derived by a suitable hardware mapping of the algorithm
described above. For the sake of clarity, the implementation of a 16-bit CORDIC rotator is described here as an
example. All of the discussions presented in this section can be generalized for an n-bit CORDIC rotator as
well.
Where
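$$\prod_{i=0}^{n-1} k_i = \cos\phi_0 \cdot \cos\phi_1 \cdot \cos\phi_2 \cdots \cos\phi_{n-1} \tag{12}$$
(here ϕi is the angle of the ith of the n micro-rotations). These ϕi are stored in the ROM of the CORDIC hardware as
a look-up table. The product of the ki determines the CORDIC gain, and to four decimal places it is the same for all the CORDIC word lengths considered below.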
For the 8-bit hardware CORDIC approximation method, the value of the product of the ki is
$$\prod_{i=0}^{7} k_i = \cos\phi_0 \cdot \cos\phi_1 \cdot \cos\phi_2 \cdots \cos\phi_7 = \cos 45^\circ \cdot \cos 26.565^\circ \cdots \cos 0.4476^\circ = 0.6073 \tag{13}$$
For 16-bit CORDIC hardware, the value is
$$\prod_{i=0}^{15} k_i = \cos\phi_0 \cdot \cos\phi_1 \cdots \cos\phi_{15} = 0.6073 \tag{14}$$
For 24-bit CORDIC hardware, the value is
$$\prod_{i=0}^{23} k_i = \cos\phi_0 \cdot \cos\phi_1 \cdots \cos\phi_{23} = 0.6073 \tag{15}$$
For 32-bit CORDIC hardware, the value is
$$\prod_{i=0}^{31} k_i = \cos\phi_0 \cdot \cos\phi_1 \cdots \cos\phi_{31} = 0.6073 \tag{16}$$
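The products above can be checked numerically. The short sketch below evaluates the product of cos(arctan(2^-i)) for the four iteration counts and lists the first elementary angles in degrees; the helper names `scale_factor` and `elementary_angle_deg` are hypothetical and are introduced only for this illustration.

```python
import math


def elementary_angle_deg(i):
    """Elementary angle phi_i = arctan(2^-i), in degrees."""
    return math.degrees(math.atan(2.0 ** -i))


def scale_factor(n):
    """Product of k_i = cos(arctan(2^-i)) for i = 0 .. n-1."""
    k = 1.0
    for i in range(n):
        k *= math.cos(math.atan(2.0 ** -i))
    return k


for n in (8, 16, 24, 32):
    print(n, round(scale_factor(n), 4))

for i in range(8):
    print(i, round(elementary_angle_deg(i), 4))
```

The product settles after only a few terms because cos(arctan(2^-i)) approaches 1 very quickly as i grows, which is why all four word lengths give the same value when rounded to four decimals.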
In the case of fixed rotation, ϕi can be pre-computed and the sign bits corresponding to di can be stored in a
sign-bit register (SBR) in the CORDIC circuit. The CORDIC circuit therefore need not compute the remaining angle
ϕi during the CORDIC iterations [9]. A reference CORDIC circuit for fixed rotations according to (8) and (9) is
shown in Fig.3. x0 and y0 are fed as set/reset inputs to the pair of input registers, and the successive feedback
values xi and yi at the ith iteration are fed in parallel to the input registers. Note that conventionally the
pair of input registers is fed with the initial values x0 and y0 as well as the feedback values xi and yi through a pair of
multiplexers. Meanwhile, it is well known that a rotation through any angle 0 < θ ≤ 2π can be mapped into a
positive rotation through 0 < ϕ ≤ π/4 without any extra arithmetic operations. Hence, as a first step of
optimization, we perform the rotation mapping so that the rotation angle lies in the range 0 < ϕ ≤ π/4. The
scale factor K now depends on the set of elementary angles. The accuracy of the CORDIC algorithm depends
on how closely the resultant rotation ϕA due to all the micro-rotations in (8) and (9) approximates the desired
rotation angle ϕ; the angular deviation Δϕ = ϕ − ϕA in turn determines the deviation of the actual rotation
vector from the estimated value. We show here that only a few elementary angles are sufficient to have a
CORDIC rotation in the range [0, π/4], and different sets of elementary angles can be chosen according to the
accuracy requirement.
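One possible way to realize the rotation mapping is sketched below. The paper does not spell out the mapping, so the decomposition used here is an assumption: the angle is split into a whole number of quarter turns (coordinate swap and sign change only) plus a residual in [−π/4, π/4], and a negative residual is handled by reflecting about the x-axis before and after a positive rotation. The names `map_rotation`, `rotate_mapped` and `exact_core` are hypothetical, and `exact_core` merely stands in for the CORDIC core.

```python
import math


def map_rotation(theta):
    """Split theta into quarter turns plus a signed residual of size <= pi/4."""
    theta = math.fmod(theta, 2.0 * math.pi)
    if theta < 0:
        theta += 2.0 * math.pi
    quarter_turns = int(round(theta / (math.pi / 2.0)))
    residual = theta - quarter_turns * (math.pi / 2.0)
    sign = 1 if residual >= 0 else -1
    return quarter_turns % 4, sign, abs(residual)


def exact_core(x, y, phi):
    """Stand-in for the CORDIC core: exact rotation by 0 <= phi <= pi/4."""
    c, s = math.cos(phi), math.sin(phi)
    return c * x - s * y, s * x + c * y


def rotate_mapped(x, y, theta, core=exact_core):
    """Rotate by any theta using only a positive rotation within [0, pi/4]."""
    quarter_turns, sign, phi = map_rotation(theta)
    for _ in range(quarter_turns):
        x, y = -y, x              # 90 degree turn: swap and negate only
    if sign < 0:
        y = -y                    # reflect so the core sees a positive angle
    x, y = core(x, y, phi)
    if sign < 0:
        y = -y                    # undo the reflection
    return x, y


print(rotate_mapped(1.0, 2.0, 4.0))
```

The design choice is that the core only ever receives an angle in [0, π/4], so a small set of elementary angles has to cover just that range.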
Fig.3 The reference CORDIC circuit for fixed rotation
Fig.4 Hardwired-shifted bi-rotation CORDIC
circuit. SBR is sign-bit register of 2-bits size.
→ k(0) indicates right-shift by k(0) bit-locations.
1. Implementation of Micro-rotations: Since the elementary angles and directions of the micro-rotations are
predetermined for the given angle of rotation, the angle-estimation data-path is not required in the CORDIC
circuit for fixed and known rotations. Moreover, because only a few elementary angles are involved in this case,
the corresponding control bits can be stored in a ROM of a few words. In Fig.4, the ROM contains the control
bits for the number of shifts corresponding to the micro-rotations to be implemented by the barrel-shifter, and the
directions of the micro-rotations are stored in the sign-bit register (SBR).
For 8,16,24,32 Bit CORDIC Hardware
Table.1 Elementary angles
2. Minimization of Barrel-Shifter Complexity by Hardwired Pre-shifting: The hardware complexity of a barrel-shifter increases linearly with the word-length and the number of shifts. We can reduce the effective word-length in
the MUXes of the barrel-shifters, and also the number of stages of MUXes, by simple hardwired pre-shifting
as shown in Fig.4. If l is the minimum number of shifts in the set of selected micro-rotations, we can load only
the (L−l) most-significant bits (MSBs) of an L-bit input word from the registers to the barrel-shifters, since the l least
significant bits (LSBs) would get truncated during shifting. The barrel-shifter, therefore, needs to implement a
maximum of (s−l) shifts only, where s is the maximum number of shifts in the set of selected micro-rotations.
The outputs of the barrel-shifters are loaded as the (L−l) LSBs to the add/subtract units, and the l MSBs of the
corresponding operand of the add/subtract unit are hardwired to 0. Therefore, the hardware complexity of a barrel-shifter can be reduced by the hardwired pre-shifting approach. The time involved in a barrel-shifter can also
be reduced by hardwired pre-shifting, since the delay of the barrel-shifter is proportional to the number of stages
of MUXes, and pre-shifting reduces the number of stages.
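The equivalence that hardwired pre-shifting relies on can be illustrated with integers. In the sketch below, the function name `barrel_shift_with_preshift` and the example shift set {2, 3, 5} are assumptions made for illustration: the fixed part of the shift (l bits) is applied first, as wiring would do, and only the remaining 0 to (s−l) bits are left for the variable shifter.

```python
def barrel_shift_with_preshift(word, shift, min_shift):
    """Right-shift `word` by `shift` bits, split into a fixed and a variable part.

    Hypothetical model: `min_shift` plays the role of l (hardwired pre-shift),
    and the remaining `shift - min_shift` bits are what the barrel-shifter
    still has to implement.
    """
    if shift < min_shift:
        raise ValueError("shift must be at least the hardwired pre-shift")
    pre_shifted = word >> min_shift              # hardwired: wiring only
    return pre_shifted >> (shift - min_shift)    # variable part, 0 .. s - l


shifts = (2, 3, 5)             # illustrative set of micro-rotation shifts
l, s = min(shifts), max(shifts)
word = 0x5A3C                  # a 16-bit example operand

for k in shifts:
    print(k, barrel_shift_with_preshift(word, k, l), word >> k)
print("variable shift range:", 0, "to", s - l)
```

For this example set the variable shifter has to cover shifts of 0 to 3 bits instead of 0 to 5 bits, and it operates on 14-bit rather than 16-bit operands.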
3. Bi-rotation CORDIC Cell: We find that using only two micro-rotations, it is possible to get an accuracy of up to
0.033 radian. Although the accuracy achieved by two micro-rotations is inadequate in many situations, it can
be used for some applications where the outputs are quantized, e.g., speech and image compression
[11], [12]. Besides, rotations with four and six micro-rotations can also be implemented successively by
two and three pairs of micro-rotations, respectively. Therefore, we design an efficient CORDIC circuit to
implement a pair of micro-rotations, named the “bi-rotation CORDIC”. The circuit for the bi-rotation CORDIC
is shown in Fig.4. It consists of an adder module, two 2:1 multiplexers and a sign-bit register (SBR) of two-bit
size. The adder module consists of a pair of adders/subtractors, which perform additions or
subtractions according to the sign bit available from the SBR. The components of the input vector (the real and
imaginary parts of the input complex operand) are loaded to the input registers through the set/reset input. The
output of each register is sent on two lines: the content of the register is fed to one of the adders/subtractors
directly, while the other line is loaded to the barrel-shifter, pre-shifted by k(0) bit-locations to the right by the
hardwired pre-shifting technique. The outputs of the adders are loaded back to the input registers for the second
CORDIC iteration. The bi-rotation CORDIC involves only a pair of barrel-shifters, each consisting of only one stage
of 2:1 MUXes. The control bit for the barrel-shifters is 0 for the first micro-rotation (no shift) and 1 for the
second micro-rotation (shift through k(1)−k(0)). The control bits are generated by a T flip-flop, since they alternate
between 0 and 1 in successive cycles.
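A behavioural sketch of one bi-rotation cell is given below. It is not the circuit of Fig.4: the exhaustive search in `best_bi_rotation` is an assumed, illustrative way of choosing the two shifts k(0) ≤ k(1) and their sign bits for a given angle, and `bi_rotate` applies the two unscaled micro-rotations in floating point. Both function names are hypothetical.

```python
import math
from itertools import product


def best_bi_rotation(phi, max_shift=15):
    """Search shifts k0 <= k1 and signs d0, d1 that minimise |phi - phi_A|."""
    best = None
    for k0, k1 in product(range(max_shift + 1), repeat=2):
        if k0 > k1:
            continue                     # the second shift is k(1) - k(0) >= 0
        for d0, d1 in product((1, -1), repeat=2):
            approx = d0 * math.atan(2.0 ** -k0) + d1 * math.atan(2.0 ** -k1)
            err = abs(phi - approx)
            if best is None or err < best[0]:
                best = (err, (k0, k1), (d0, d1))
    return best


def bi_rotate(x, y, shifts, signs):
    """Apply the pair of micro-rotations without the scale factor."""
    for k, d in zip(shifts, signs):
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
    return x, y


err, shifts, signs = best_bi_rotation(math.radians(30.0))
x, y = bi_rotate(1.0, 0.0, shifts, signs)
print(shifts, signs, err, math.degrees(math.atan2(y, x)))
```

For a 30° target this search selects shifts (1, 4) with both sign bits positive, i.e. arctan(1/2) + arctan(1/16), which misses the target by roughly 0.0025 radian. Other target angles can give noticeably larger deviations with only two micro-rotations.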
IV. HIGH-THROUGHPUT DESIGN USING SUPERIOR BI-ROTATIONAL CORDIC
In the proposed model, a two-stage bi-rotational CORDIC with a position sequencer algorithm is used to increase the
efficiency of fixed-angle rotation.
A. Position Sequencer Algorithm:
Fig.5 position sequencer signal
To reduce errors across quadrants, the angle range should be −90° to +90°, i.e., all angles should lie in the
first or fourth quadrant in the rotational implementation. This block improves the accuracy of the CORDIC fixed
rotation. For compound rotations larger than 90°, an additional algorithm, known as the position sequencer, is
required to bring every iteration into the first or fourth quadrant.
The upper two bits of the angle word represent the quadrant in which the angle lies: 00 → I, 01 → II, 10 → III, 11 →
IV, where I, II, III and IV are the quadrants. For 00 and 11, angle correction is not required, since the angle lies in quadrant I or IV.
For 01, the correction algorithm is
x' = −di × y
y' = di × x
z' = z + di × π/2
where di = +1 if y < 0, and −1 otherwise.
For 10, the correction algorithm is
x' = di × y
y' = −di × x
z' = z − di × π/2
where di = −1 if x < 0, and +1 otherwise.
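The two correction rules can be written as a small routine. The sketch below transcribes the formulas exactly as stated, under the assumption that the two quadrant bits are supplied as an integer code and that x, y and z are ordinary floating-point values; the function name `position_sequencer` is hypothetical and does not come from the paper.

```python
import math


def position_sequencer(x, y, z, quadrant_bits):
    """Apply the quadrant correction for codes 01 and 10.

    Codes 00 and 11 (quadrants I and IV) pass through unchanged.
    """
    if quadrant_bits == 0b01:
        d = 1 if y < 0 else -1
        return -d * y, d * x, z + d * math.pi / 2.0
    if quadrant_bits == 0b10:
        d = -1 if x < 0 else 1
        return d * y, -d * x, z - d * math.pi / 2.0
    return x, y, z


# A vector in quadrant II with a 120 degree angle word.
print(position_sequencer(-3.0, 4.0, math.radians(120.0), 0b01))
# A vector in quadrant III with a 200 degree angle word.
print(position_sequencer(-3.0, -4.0, math.radians(200.0), 0b10))
```

In both examples the corrected vector ends up with a non-negative x component, i.e. in quadrant I or IV, and the angle word moves by 90° in the same sense as the vector.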
B. Superior Bi-rotational CORDIC
To reduce the adder complexity relative to the cascaded single-rotation CORDIC, the micro-rotations can be
implemented by a cascaded bi-rotation CORDIC circuit. A two-stage cascaded superior bi-rotation CORDIC is
shown in Fig. The first two of the four optimized micro-rotations can be implemented
by stage-1, while the remaining two are performed by stage-2. The structure and function of the bi-rotation CORDIC cell are
shown in Fig.4. For implementing six selected micro-rotations, we can use a three-stage cascade of bi-rotation
CORDIC cells. The cascade of superior bi-rotation cells can, however, be extended further when higher
accuracy is required.
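The cascade can be modelled as a list of bi-rotation stages applied one after another. The sketch below is a floating-point illustration only: the names `bi_rotation_stage` and `cascaded_rotate` are hypothetical, and the four micro-rotations chosen for a 30° rotation (shifts 1, 4, 9, 11) were hand-picked for this example and are not taken from the paper. The scale factor is accumulated from the shifts of the selected micro-rotations.

```python
import math


def bi_rotation_stage(x, y, shifts, signs):
    """One bi-rotation cell: two unscaled micro-rotations."""
    for k, d in zip(shifts, signs):
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
    return x, y


def cascaded_rotate(x, y, stages):
    """Apply each stage in turn, then scale by the product of (1 + 2^-2k)^-1/2."""
    scale = 1.0
    for shifts, signs in stages:
        x, y = bi_rotation_stage(x, y, shifts, signs)
        for k in shifts:
            scale *= (1.0 + 2.0 ** (-2 * k)) ** -0.5
    return x * scale, y * scale


# Illustrative two-stage set for 30 degrees:
# arctan(2^-1) + arctan(2^-4) - arctan(2^-9) - arctan(2^-11)
stages = [((1, 4), (1, 1)), ((9, 11), (-1, -1))]
print(cascaded_rotate(1.0, 0.0, stages))
```

Adding the second stage brings the angular deviation for this example down from about 0.0025 radian to a few times 1e-5 radian, which illustrates why further stages can be cascaded when higher accuracy is needed.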
V. SCALING OPTIMIZATION AND IMPLEMENTATION
The optimized sets of rotation angles and micro-rotations are discussed in this section.
Fixed Rotation Angle Scaling Approach
The normalized expression for the scale factor can be written explicitly for the selected set of m1
micro-rotations as
$$K = \prod_{i=0}^{m_1-1} \left[1 + 2^{-2k(i)}\right]^{-1/2} \tag{17}$$
where k(i), for 0 ≤ i ≤ m1 − 1, is the number of shifts in the ith micro-rotation. Except for k(i) = 0 (i.e., rotation by 45°), by
binomial expansion, any term in (17) can be written as
$$1 - \frac{x}{2} + \frac{3x^2}{8} - \frac{5x^3}{16} + \frac{35x^4}{128} - \frac{63x^5}{256} + \frac{231x^6}{1024} - \cdots \tag{18}$$
where $x = 2^{-2k(i)}$.
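The binomial expansion can be checked numerically against the closed form of each factor. The sketch below uses the standard series coefficients of (1 + x)^(-1/2) up to the sixth power; the function name `inv_sqrt_series` is hypothetical and the comparison is only a sanity check, not part of the proposed design.

```python
def inv_sqrt_series(x, terms=7):
    """Truncated binomial series of (1 + x) ** -0.5 (hypothetical helper)."""
    coeffs = [1.0, -1 / 2, 3 / 8, -5 / 16, 35 / 128, -63 / 256, 231 / 1024]
    return sum(c * x ** n for n, c in enumerate(coeffs[:terms]))


for k in (1, 2, 3, 4):
    x = 2.0 ** (-2 * k)                  # x = 2^(-2 k(i))
    exact = (1.0 + x) ** -0.5
    print(k, exact, inv_sqrt_series(x), inv_sqrt_series(x, terms=3))
```

Because x is a negative power of two, the series converges faster as the shift k(i) grows, so fewer terms are needed for micro-rotations with larger shifts.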
VI. PERFORMANCE ANALYSIS
The performance simulations of the superior bi-rotational CORDIC were carried out with the ModelSim Altera
6.5e simulator. Synthesis was carried out with Xilinx ISE. The test frequency is 132.258 MHz, the power
supply voltage is 1.1 V and the delay is 7.038 ns. The design requires one 4-bit shift register, 516 registers and slice
flip-flops, 577 4-input LUTs and 67 IOBs. The bi-rotational CORDIC was successfully tested
by simulation, and the gain factor is limited to 1.64.
Fig.7 Two-stage superior Bi-rotational CORDIC cell
VII. RESULTS AND EXAMINATION
Fig.8 Block Diagram
Fig.9 Technology Schematic of CORDIC algorithm
Fig.10 Simulation Results of CORDIC algorithm
VIII. CONCLUSION
The superior bi-rotational CORDIC is attractive for the calculation of fixed-angle elementary functions because
of its accuracy and parallel processing. The proposed CORDIC architecture requires more area than the
reference design, but offers high throughput. The simulation was done in ModelSim and the code was functionally
verified to be correct. CORDIC features such as the computation of division, multiplication and square root and the
evaluation of trigonometric functions have made it an attractive choice for a wide variety of applications. For
high-throughput applications, efficient cascaded bi-rotational CORDIC with more than two stages could be
developed to take advantage of CORDIC, because digital hardware is getting cheaper with
progressive device scaling. The area-delay-accuracy trade-off for different advanced algorithms may be
investigated in detail and compared in future work.
REFERENCES
[1] J. E. Volder, “The CORDIC trigonometric computing technique,” IRE Transactions on Electronic
Computers, vol. EC-8, pp. 330–334, Sep.1959.
[2] J. S. Walther, “A unified algorithm for elementary functions,” in Proceedings 38th Spring Joint Computer
Conference, Atlantic City, New Jersey, 1971, pp. 379–385.
[3] J. S. Walther, “A unified algorithm for elementary functions,” in Proc. 38th Spring Joint Computer Conf.,
Atlantic City, NJ, 1971, pp. 379–385.
[4] J. S. Walther, “The story of unified CORDIC,” J. VLSI Signal Process., vol. 25, no. 2, pp. 107–112, June
2000.
[5] D. S. Cochran, “Algorithms and accuracy in the HP-35,” Hewlett-Packard J., pp. 1–11, Jun. 1972.
[6] P. K. Meher, J. Valls, T.-B. Juang, K. Sridharan, and K. Maharatna, “50 years of CORDIC: Algorithms,
architectures and applications,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, no. 9,
pp. 1893–1907, Sep. 2009.
[7] C.-S. Wu, A.-Y. Wu, and C.-H. Lin, “A high performance/low-latency vector rotational CORDIC
architecture based on extended elementary angle set and trellis-based searching schemes,” IEEE Transactions
on Circuits and Systems II: Analog and Digital Signal Processing, vol. 50, no. 9, pp. 589–601, Sep. 2003.
[8] T.-B. Juang, S.-F. Hsiao, and M.-Y. Tsai, “Para-CORDIC: parallel CORDIC rotation algorithm,” IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 8, pp. 1515–1524, Aug. 2004.
[9] Y. H. Hu, “CORDIC-based VLSI architectures for digital signal processing,” IEEE Signal Processing
Magazine, vol. 9, no. 3, pp. 16–35, Jul. 1992.
[10] K. Maharatna, S. Banerjee, E. Grass, M. Krstic, and A. Troya, “Modified virtually scaling free adaptive
CORDIC rotator algorithm and architecture,” IEEE Transactions Circuits and Syst. for Video Technology, vol.
15, no. 11, pp. 1463–1474, Nov. 2005.
[11] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
[12] N. Jayant, Ed., Signal Compression : Coding of Speech, Audio, Text, Image And Video, ser. Selected Topics
in Electronics and Systems. NJ: World Scientific, 1997, vol. 9.
[13] S. Suchitra, S. Sukthankar, T. Srikanthan, and C. T. Clarke, “Elimination of sign precomputation in flat
CORDIC,” in IEEE Int. Symp. On Circuits Syst., ISCAS’05, May 2005, vol. 4, pp. 3319–3322.
[14] E. Deprettere, P. Dewilde, and R. Udo, “Pipelined CORDIC architectures for fast VLSI filtering and array
processing,” in IEEE Int. Conf. on Acoust., Speech, Signal Process., ICASSP’84, Mar. 1984, vol. 9, pp. 250–253.