International Journal of Engineering Trends and Technology (IJETT) – Volume 16 Number 2 – Oct 2014
Area Efficient Parallel FIR Digital Filter
Structures of Even Length Based on Fast FIR
Algorithm
G. Sreenivasulu1, M. Maddhu Sudhan Reddy2
M.Tech, Department of ECE, GPREC, KNL, A.P, India1
Asst. Professor, M.Tech, MIETE, MIEE, Department of ECE, GPREC, KNL, A.P, India2
Abstract- This paper introduces novel parallel FIR filter structures, based on fast finite-impulse response (FIR) algorithms (FFAs), that exploit symmetric coefficients to reduce hardware cost under the condition that the number of taps is a multiple of two or three. The proposed parallel FIR structures exploit the inherent nature of symmetric coefficients, which halves the number of multipliers required in the subfilter blocks that contain symmetric coefficients, at the cost of additional adders in the preprocessing and postprocessing blocks. Exchanging multipliers for adders is profitable because multipliers occupy more silicon area than adders. Moreover, the additional adders are confined to the preprocessing and postprocessing blocks and do not appear in the subfilter section, so their number stays fixed. For example, for a two-parallel 24-tap FIR filter the proposed structure saves 6 multipliers at the expense of 2 adders, for a three-parallel 24-tap FIR filter it saves 8 multipliers at the expense of 7 adders, and for a four-parallel 24-tap FIR filter it saves 9 multipliers at the expense of 11 adders. The advantage of the proposed parallel FIR filter structures is that more multipliers are saved as the length of the FIR filter increases.
Keywords- Digital signal processing (DSP), fast finite-impulse response (FIR) algorithms (FFAs), parallel FIR, symmetric convolution, very large scale integration (VLSI).
I. INTRODUCTION
Due to the explosive growth of multimedia applications, the demand for high-performance and low-power digital signal processing (DSP) keeps increasing. Finite-impulse response (FIR) digital filters are one of the most widely used fundamental building blocks in DSP systems, ranging from wireless communications to video and image processing. Some applications, such as video processing, need the FIR filter to operate at high frequencies, whereas others, such as multiple-input multiple-output (MIMO) systems used in cellular wireless communication, require high throughput with a low-power circuit. Furthermore, when narrow transition-band characteristics are required, a much higher filter order is unavoidable. For example, a 576-tap digital filter is used in a video ghost canceller for broadcast television, which reduces the effect of multipath signal echoes.

Parallel processing and pipelining are two techniques used in DSP applications, both of which can be exploited to reduce power consumption. Pipelining shortens the critical path by inserting pipelining latches along the data path, at the price of increasing the number of latches and the system latency, whereas parallel processing increases the sampling rate by replicating hardware so that multiple inputs can be processed in parallel and multiple outputs are generated at the same time, at the expense of increased area. Both techniques can reduce the power consumption by lowering the supply voltage when the sampling speed does not need to increase. In this paper, parallel processing in the digital FIR filter is discussed. Because its hardware implementation cost increases linearly with the block size L, the parallel processing technique loses its advantage in practical implementation.

In one category of existing methods, polyphase decomposition is mainly manipulated: the small-sized parallel FIR filter structures are derived first, and the larger block-sized ones are then constructed by cascading or iterating small-sized parallel FIR filtering blocks. Fast FIR algorithms (FFAs) can implement an L-parallel filter using approximately (2L-1) subfilter blocks, each of which is of length N/L. FFA structures successfully break the constraint that the hardware implementation cost of a parallel FIR filter increases linearly with the block size L, reducing the required number of multipliers from L×N to (2N - N/L). In the other category, fast linear convolution is utilized to develop the small-sized filtering structures, and a long convolution is then decomposed into several short convolutions, i.e., larger block-sized filtering structures are constructed through iterations of the small-sized filtering structures. However, in both categories of method, when it comes to symmetric convolutions, the symmetry of coefficients has not yet been taken into consideration in the design of the structures, although it can lead to a significant saving in hardware cost. In this paper, we provide new parallel FIR filter structures based on FFA consisting of advantageous polyphase decompositions, which reduce the number of multiplications in the subfilter section by exploiting the inherent nature of the symmetric coefficients, compared to the existing FFA fast parallel FIR filter structure.
This paper is organized as follows. A brief introduction of FFAs is given in Section II. In Section III, the proposed parallel FIR filter structures are presented. Section IV investigates the complexity and comparisons. In Section V, the description of hardware implementation and the experimental results are shown. Section VI gives the conclusion.

II. FAST FIR ALGORITHM (FFA)
Consider an N-tap FIR filter which can be expressed in the general form as

$$y(n) = \sum_{i=0}^{N-1} h(i)\, x(n-i), \quad n = 0, 1, 2, \ldots, \infty \qquad (1)$$

where x(n) is an infinite-length input sequence, h(i) are the FIR filter coefficients, and N is the length of the filter (number of taps).

The traditional L-parallel FIR filter can be derived using polyphase decomposition as

$$\sum_{p=0}^{L-1} Y_p(z^L)\, z^{-p} = \sum_{q=0}^{L-1} X_q(z^L)\, z^{-q} \sum_{r=0}^{L-1} H_r(z^L)\, z^{-r} \qquad (2)$$

where

$$X_q = \sum_{k=0}^{\infty} z^{-k} x(Lk+q), \quad H_r = \sum_{k=0}^{N/L-1} z^{-k} h(Lk+r), \quad Y_p = \sum_{k=0}^{\infty} z^{-k} y(Lk+p)$$

for p, q, r = 0, 1, 2, ..., L-1. This FIR filtering equation shows that the traditional L-parallel FIR filter requires L^2 FIR subfilter blocks of length N/L for implementation.

A. 2x2 FFA (L=2)
According to (2), a two-parallel FIR filter can be expressed as

$$Y_0 + z^{-1} Y_1 = (H_0 + z^{-1} H_1)(X_0 + z^{-1} X_1) = H_0 X_0 + z^{-1}(H_0 X_1 + H_1 X_0) + z^{-2} H_1 X_1 \qquad (3)$$

implying that

$$Y_0 = H_0 X_0 + z^{-2} H_1 X_1, \quad Y_1 = H_0 X_1 + H_1 X_0 \qquad (4)$$

Equation (4) shows the traditional two-parallel filter structure, which requires four length-N/2 FIR subfilter blocks, two postprocessing adders, and in total 2N multipliers and 2N-2 adders. However, (4) can be written as

$$Y_0 = H_0 X_0 + z^{-2} H_1 X_1$$
$$Y_1 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1 \qquad (5)$$

The implementation of (5) requires three FIR subfilter blocks of length N/2, one preprocessing and three postprocessing adders, and 3N/2 multipliers and 3(N/2-1)+4 adders, which reduces the hardware cost by approximately one fourth compared to the traditional two-parallel filter of (4). The two-parallel (L=2) FIR filter implementation using FFA obtained from (5) is shown in Fig. 1.

Fig. 1: Existing two-parallel FIR filter

B. 3x3 FFA (L=3)
By a similar approach, a three-parallel FIR filter using FFA can be expressed as

$$Y_0 = H_0 X_0 - z^{-3} H_2 X_2 + z^{-3} \left[(H_1 + H_2)(X_1 + X_2) - H_1 X_1\right]$$
$$Y_1 = \left[(H_0 + H_1)(X_0 + X_1) - H_1 X_1\right] - \left(H_0 X_0 - z^{-3} H_2 X_2\right)$$
$$Y_2 = \left[(H_0 + H_1 + H_2)(X_0 + X_1 + X_2)\right] - \left[(H_0 + H_1)(X_0 + X_1) - H_1 X_1\right] - \left[(H_1 + H_2)(X_1 + X_2) - H_1 X_1\right] \qquad (6)$$

The hardware implementation of (6) requires six length-N/3 FIR subfilter blocks, three preprocessing and seven postprocessing adders, and 2N multipliers and 2N+4 adders, which reduces the hardware cost by approximately one third compared to the traditional three-parallel filter. The implementation obtained from (6) is shown in Fig. 2.

Fig. 2: Existing three-parallel FIR filter

III. PROPOSED FFA STRUCTURES FOR SYMMETRIC CONVOLUTIONS
To utilize the symmetry of coefficients, the main idea behind the proposed structures is to manipulate the polyphase decomposition so that as many subfilter blocks as possible contain symmetric coefficients. In such a block, half the number of multiplications can be reused for the whole set of taps, just as a single FIR filter with symmetric coefficients needs only half as many multiplications as its length. Therefore, for an N-tap L-parallel FIR filter, the total number of saved multipliers is the number of subfilter blocks that contain symmetric coefficients times half the number of multiplications in a single subfilter block (N/2L).

A. 2 x 2 Proposed FFA (L = 2):
From (4), a two-parallel FIR filter can also be written as

$$Y_0 = \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)\right] - H_1 X_1 + z^{-2} H_1 X_1$$
$$Y_1 = \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)\right] \qquad (7)$$

When it comes to a set of even symmetric coefficients, (7) can earn one more subfilter block containing symmetric coefficients than (5), the existing FFA parallel FIR filter. Fig. 3 shows the implementation of the proposed two-parallel FIR filter based on (7).

Fig. 3: Proposed two-parallel FIR filter
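The equivalence of the existing and proposed two-parallel structures can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it treats finite, even-length integer sequences as polynomials in z^-1, and all function names (`conv`, `ffa2_existing`, `ffa2_proposed`, and so on) are hypothetical helpers, not part of the paper's Verilog design. It evaluates Y0 and Y1 with three subfilter products each way and compares the interleaved result with a direct convolution.

```python
def conv(a, b):
    """Polynomial product, i.e. linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out


def add(a, b):
    """Elementwise sum, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def sub(a, b):
    return add(a, [-v for v in b])


def delay(a):
    """One block delay (z^-2 in the original sample-rate domain)."""
    return [0] + a


def interleave(y0, y1):
    """Merge even-indexed outputs y0 and odd-indexed outputs y1."""
    out = [0] * (len(y0) + len(y1))
    out[0::2] = y0
    out[1::2] = y1
    return out


def ffa2_existing(h, x):
    """Y0 = H0X0 + z^-2 H1X1, Y1 = (H0+H1)(X0+X1) - H0X0 - H1X1."""
    h0, h1, x0, x1 = h[0::2], h[1::2], x[0::2], x[1::2]
    p0, p1 = conv(h0, x0), conv(h1, x1)
    ps = conv(add(h0, h1), add(x0, x1))
    return interleave(add(p0, delay(p1)), sub(sub(ps, p0), p1))


def ffa2_proposed(h, x):
    """Same outputs using subfilters H0+H1, H0-H1 and H1."""
    h0, h1, x0, x1 = h[0::2], h[1::2], x[0::2], x[1::2]
    ps = conv(add(h0, h1), add(x0, x1))
    pd = conv(sub(h0, h1), sub(x0, x1))
    p1 = conv(h1, x1)
    # ps + pd and ps - pd are always even for integer data, so // 2 is exact.
    half_sum = [v // 2 for v in add(ps, pd)]    # H0X0 + H1X1
    half_diff = [v // 2 for v in sub(ps, pd)]   # H0X1 + H1X0
    return interleave(add(sub(half_sum, p1), delay(p1)), half_diff)


# Arbitrary test data: a symmetric 8-tap filter and a 6-sample input.
h_demo = [1, 3, 5, 7, 7, 5, 3, 1]
x_demo = [2, -1, 4, 0, 3, 6]
reference = conv(h_demo, x_demo)
```

With the demo data, `ffa2_existing(h_demo, x_demo)` and `ffa2_proposed(h_demo, x_demo)` should both reproduce `reference`, which is the property the algebraic rewriting relies on.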
Example 1: Consider a 24-tap FIR filter with a set of symmetric coefficients applied to the proposed two-parallel FIR filter:

$$\{h(0), h(1), h(2), \ldots, h(23)\}$$
$$h(0) = h(23),\ h(1) = h(22),\ h(2) = h(21),\ h(3) = h(20),\ h(4) = h(19),\ \ldots,\ h(11) = h(12),$$

where

$$H_0 \pm H_1 = \{h(0) \pm h(1),\ h(2) \pm h(3),\ h(4) \pm h(5),\ h(6) \pm h(7),\ \ldots,\ h(18) \pm h(19),\ h(20) \pm h(21),\ h(22) \pm h(23)\} \qquad (8)$$

As can be seen from the example above, two of the three subfilter blocks of the proposed two-parallel FIR filter structure, H0-H1 and H0+H1, now have symmetric coefficients, as shown in (8) (H0+H1 is symmetric and H0-H1 is antisymmetric, so in both cases mirrored taps share one coefficient magnitude). Each of these subfilter blocks can therefore be realized as in Fig. 4, with only half the number of multipliers. Each multiplier output serves two taps. Note that the transposed direct-form FIR filter is employed. Compared to the existing FFA two-parallel FIR filter structure, the proposed FFA structure leads to one more subfilter block which contains symmetric coefficients. However, this comes at the price of more adders in the preprocessing and postprocessing blocks. In this case, two additional adders are required for L=2.
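To make the saving in a symmetric subfilter block concrete, the sketch below builds H0+H1 and H0-H1 for an arbitrary symmetric 24-tap filter and filters a short input with H0+H1 using one multiplication per mirrored tap pair. It is a minimal illustration, assuming integer data: the function names are hypothetical, and it uses a folded direct form (pre-adding the two input samples that share a coefficient) rather than the transposed direct form that the paper employs in Fig. 4.

```python
def fir_direct(coeffs, x):
    """Direct-form FIR: one multiplication per tap and output sample."""
    n_taps = len(coeffs)
    y = []
    for n in range(len(x) + n_taps - 1):
        acc = 0
        for k in range(n_taps):
            if 0 <= n - k < len(x):
                acc += coeffs[k] * x[n - k]
        y.append(acc)
    return y


def fir_folded(coeffs, x):
    """Even-length symmetric FIR: mirrored taps share one multiplication."""
    n_taps = len(coeffs)
    if n_taps % 2 or coeffs != coeffs[::-1]:
        raise ValueError("coefficients must be symmetric with even length")

    def sample(i):
        return x[i] if 0 <= i < len(x) else 0

    y = []
    for n in range(len(x) + n_taps - 1):
        acc = 0
        for k in range(n_taps // 2):  # only N/2 multiplications per output
            acc += coeffs[k] * (sample(n - k) + sample(n - (n_taps - 1 - k)))
        y.append(acc)
    return y


# Arbitrary symmetric 24-tap example: h(0) = h(23), ..., h(11) = h(12).
half = list(range(1, 13))
h = half + half[::-1]
h_sum = [h[i] + h[i + 1] for i in range(0, 24, 2)]    # H0 + H1
h_diff = [h[i] - h[i + 1] for i in range(0, 24, 2)]   # H0 - H1
x = list(range(1, 9))
```

Here `h_sum` reads the same forwards and backwards, `h_diff` is its own negated mirror image, and `fir_folded(h_sum, x)` should match `fir_direct(h_sum, x)` while using six instead of twelve multiplications per output sample.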
Fig. 4: Subfilter block implementation with symmetric coefficients

B. 3 x 3 Proposed FFA (L = 3):
With a similar approach, from (6), a three-parallel FIR filter can also be written as (9).

$$Y_0 = \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)\right] - H_1 X_1 + z^{-3}\left\{(H_0 + H_1 + H_2)(X_0 + X_1 + X_2) - (H_0 + H_2)(X_0 + X_2) - \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)\right] - H_1 X_1\right\}$$
$$Y_1 = \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) - (H_0 - H_1)(X_0 - X_1)\right] + z^{-3}\left\{\tfrac{1}{2}\left[(H_0 + H_2)(X_0 + X_2) + (H_0 - H_2)(X_0 - X_2)\right] - \tfrac{1}{2}\left[(H_0 + H_1)(X_0 + X_1) + (H_0 - H_1)(X_0 - X_1)\right] + H_1 X_1\right\}$$
$$Y_2 = \tfrac{1}{2}\left[(H_0 + H_2)(X_0 + X_2) - (H_0 - H_2)(X_0 - X_2)\right] + H_1 X_1 \qquad (9)$$

Fig. 5: Proposed three-parallel FIR filter

Fig. 5 shows the implementation of the proposed three-parallel FIR filter. When the number of symmetric coefficients N is a multiple of 3, the proposed three-parallel FIR filter structure presented in (9) enables four subfilter blocks with symmetric coefficients in total, whereas the existing FFA parallel FIR filter structure has only two out of six subfilter blocks. A comparison figure is shown in Fig. 6.
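The three-parallel rewriting can be checked in the same spirit. The sketch below is an illustration under stated assumptions, not the paper's implementation: sequences are finite integer lists whose lengths are multiples of 3, the helper names are hypothetical, and the six subfilter products are those of H0+H1, H0-H1, H0+H2, H0-H2, H1 and H0+H1+H2. The three polyphase outputs are interleaved and can be compared with a direct convolution.

```python
def conv(a, b):
    """Polynomial product, i.e. linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out


def add(*polys):
    """Elementwise sum of any number of sequences, zero-padded."""
    n = max(len(p) for p in polys)
    return [sum(p[i] if i < len(p) else 0 for p in polys) for i in range(n)]


def neg(a):
    return [-v for v in a]


def half(a):
    """Exact halving: the inputs used below are always even integers."""
    return [v // 2 for v in a]


def delay(a):
    """One block delay (z^-3 in the original sample-rate domain)."""
    return [0] + a


def ffa3_proposed(h, x):
    h0, h1, h2 = h[0::3], h[1::3], h[2::3]
    x0, x1, x2 = x[0::3], x[1::3], x[2::3]
    # Six subfilter products.
    s01 = conv(add(h0, h1), add(x0, x1))
    d01 = conv(add(h0, neg(h1)), add(x0, neg(x1)))
    s02 = conv(add(h0, h2), add(x0, x2))
    d02 = conv(add(h0, neg(h2)), add(x0, neg(x2)))
    p1 = conv(h1, x1)
    p_all = conv(add(h0, h1, h2), add(x0, x1, x2))
    # Postprocessing.
    sum01 = half(add(s01, d01))         # H0X0 + H1X1
    diff01 = half(add(s01, neg(d01)))   # H0X1 + H1X0
    sum02 = half(add(s02, d02))         # H0X0 + H2X2
    diff02 = half(add(s02, neg(d02)))   # H0X2 + H2X0
    y0 = add(sum01, neg(p1),
             delay(add(p_all, neg(s02), neg(diff01), neg(p1))))
    y1 = add(diff01, delay(add(sum02, neg(sum01), p1)))
    y2 = add(diff02, p1)
    out = [0] * (len(y0) + len(y1) + len(y2))
    out[0::3], out[1::3], out[2::3] = y0, y1, y2
    return out


# Arbitrary test data: a symmetric 6-tap filter and a 9-sample input.
h_demo = [1, 2, 3, 3, 2, 1]
x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

For the demo data, `ffa3_proposed(h_demo, x_demo)` should equal `conv(h_demo, x_demo)`.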
C. Proposed Cascading FFA
A small modification is adopted here for lower hardware consumption. As we can see, the proposed parallel FIR structure enables the reuse of multipliers in parts of the subfilter blocks, but it also brings more adder cost in the preprocessing and postprocessing blocks. When the proposed FFA parallel FIR structures are cascaded for a larger parallel block factor L, the increase in adders can become larger. Therefore, rather than applying the proposed FFA FIR filter structure to all the decomposed subfilter blocks, the existing FFA structures, which have more compact operations in the preprocessing and postprocessing blocks, are employed for those subfilter blocks that contain no symmetric coefficients, whereas the proposed FIR filter structures are still applied to the remaining subfilter blocks with symmetric coefficients. An illustration of the proposed cascading process for a four-parallel FIR filter (L=4) is shown in Fig. 7, and the realization is shown in Fig. 8. From Fig. 7, it is clear that the proposed four-parallel FIR structure earns three more subfilter blocks containing symmetric coefficients than the existing FFA one, which means 3N/8 multipliers can be saved for an N-tap FIR filter, at the price of 11 additional adders in the preprocessing and postprocessing blocks. By this cascading approach, parallel FIR filter structures with larger block factor L can be realized.

Fig. 8: Proposed four-parallel FIR filter implementation
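The cascading rule described above can be sketched as a small Python experiment that only tracks subfilter coefficient sets, not the data path. The following is an illustration under explicit assumptions: the function names are hypothetical, the 24 coefficients are arbitrary test values chosen to avoid accidental symmetry, and an antisymmetric block (such as H0-H1) is counted as a block with symmetric coefficients because its mirrored taps share one magnitude.

```python
def add(a, b):
    return [u + v for u, v in zip(a, b)]


def sub(a, b):
    return [u - v for u, v in zip(a, b)]


def is_mirrored(c):
    """True for symmetric or antisymmetric coefficient sets."""
    r = c[::-1]
    return c == r or c == [-v for v in r]


def existing_ffa2(c):
    """Subfilters of the existing two-parallel FFA: H0, H1, H0+H1."""
    c0, c1 = c[0::2], c[1::2]
    return [c0, c1, add(c0, c1)]


def proposed_ffa2(c):
    """Subfilters of the proposed two-parallel FFA: H0+H1, H0-H1, H1."""
    c0, c1 = c[0::2], c[1::2]
    return [add(c0, c1), sub(c0, c1), c1]


def existing_cascade(h):
    """Four-parallel filter: existing FFA applied at both stages."""
    return [leaf for block in existing_ffa2(h) for leaf in existing_ffa2(block)]


def proposed_cascade(h):
    """Proposed FFA at stage one, then the proposed FFA only for blocks
    that still have mirrored coefficients and the existing FFA otherwise."""
    leaves = []
    for block in proposed_ffa2(h):
        stage = proposed_ffa2 if is_mirrored(block) else existing_ffa2
        leaves.extend(stage(block))
    return leaves


def count_mirrored(blocks):
    return sum(is_mirrored(b) for b in blocks)


# Arbitrary symmetric 24-tap coefficient set (powers of two, mirrored).
half_taps = [2 ** k for k in range(12)]
h = half_taps + half_taps[::-1]
existing_count = count_mirrored(existing_cascade(h))
proposed_count = count_mirrored(proposed_cascade(h))
```

Both cascades yield nine length-6 subfilter blocks. For this coefficient set the existing cascade should contain one mirrored block and the proposed cascade four, i.e. three more, in line with the 3N/8 = 9 saved multipliers quoted for N = 24.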
IV. COMPLEXITY ANALYSIS AND COMPARISON
When an L-parallel FIR filter comes with a set of symmetric coefficients of length N, the number of required multipliers for the proposed parallel FIR filter structures is given by

$$M = \frac{N}{\prod_{i=1}^{r} L_i} \prod_{i=1}^{r} M_i - S\, \frac{N}{2 \prod_{i=1}^{r} L_i} \qquad (10)$$

where L_i is the block size of the i-th small parallel FFA, r is the number of FFAs used, M_i is the number of subfilter blocks resulting from the i-th FFA, and S is the number of subfilter blocks containing symmetric coefficients.
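The multiplier count can be evaluated with a few lines of Python. This sketch assumes the count is the total number of subfilter taps minus half a subfilter length for every subfilter block with symmetric coefficients, which is how Section III describes the saving. The function name and argument layout are illustrative only, and the values of S for the four-parallel case are inferred from the statement that the proposed structure has three more symmetric blocks than the existing one.

```python
from math import prod


def required_multipliers(n_taps, block_sizes, subfilter_counts, symmetric_blocks):
    """Multipliers of a cascaded FFA structure.

    n_taps            -- filter length N
    block_sizes       -- block size of each small FFA in the cascade
    subfilter_counts  -- subfilter blocks produced by each small FFA
    symmetric_blocks  -- S, blocks whose multipliers can be halved
    """
    total_block = prod(block_sizes)
    if n_taps % (2 * total_block):
        raise ValueError("N must be a multiple of twice the total block size")
    subfilter_length = n_taps // total_block
    all_taps = prod(subfilter_counts) * subfilter_length
    return all_taps - symmetric_blocks * (subfilter_length // 2)


# 24-tap examples: (block sizes, subfilter counts, S existing, S proposed).
cases = {
    2: ([2], [3], 1, 2),
    3: ([3], [6], 2, 4),
    4: ([2, 2], [3, 3], 1, 4),
}
results = {
    L: (required_multipliers(24, ls, ms, s_old),
        required_multipliers(24, ls, ms, s_new))
    for L, (ls, ms, s_old, s_new) in cases.items()
}
```

Under these assumptions `results` maps each block size to the (existing, proposed) multiplier counts, which can be compared with the figures reported for the 24-tap filters.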
V. IMPLEMENTATION AND EXPERIMENTAL RESULT
The proposed FFA structures and the existing FFA structures are implemented in Verilog HDL with a filter length of 24 and a word length of 16 bits.

TABLE 1
COMPARISON OF THE PROPOSED AND THE EXISTING FFA STRUCTURES: NUMBER OF REQUIRED MULTIPLIERS (M.), REDUCED MULTIPLIERS (R.M.), NUMBER OF REQUIRED ADDERS IN SUBFILTER SECTION (SUB.), NUMBER OF REQUIRED ADDERS IN PRE/POST PROCESSING BLOCKS (PRE/POST.), AND THE NUMBER OF THE INCREASED ADDERS (I.A.)

L    Length    Structure    M.    R.M.    I.A.
2    24-tap    Existing     30
2    24-tap    Proposed     24    6       2
3    24-tap    Existing     40
3    24-tap    Proposed     32    8       7
4    24-tap    Existing     51
4    24-tap    Proposed     42    9       11

TABLE 2
COMPARISON OF DEVICE LOGIC UTILIZATION FOR TWO PARALLEL EXISTING AND PROPOSED

Logic utilization          Available    Existing    Proposed
Number of slices           4656         989         609
Number of slices           9312         28          28
Number of 4 input LUTS     9312         1883        1174
Number of bonded IOBS      232          298         238
Number of MULT18X18SI      20           20          20
Number of GCLKS            4            1           1

TABLE 2
COMPARISON OF DEVICE LOGIC UTILIZATION FOR TWO PARALLEL EXISTING AND PROPOSED

Logic utilization          Available    Existing    Proposed
Number of slices           4656         1822        1474
Number of slices           9312         51          58
Number of 4 input LUTS     9312         3450        2807
Number of bonded IOBS      232          328         296
Number of MULT18X18SI      20           20          20
Number of GCLKS            4            1           1

VI. CONCLUSION
In this paper, we have presented new parallel FIR filter structures, which are beneficial to symmetric convolutions when the number of taps is a multiple of 2 or 3. Multipliers are the major portion of the hardware consumption in a parallel FIR filter implementation. The proposed new structure exploits the nature of even symmetric coefficients and saves a significant number of multipliers at the expense of additional adders. Since multipliers outweigh adders in hardware cost, it is profitable to exchange multipliers for adders. Moreover, the number of additional adders stays fixed when the length of the FIR filter becomes large, whereas the number of saved multipliers increases along with the length of the FIR filter. Consequently, the longer the FIR filter is, the more the proposed structures can save over the existing FFA structures with respect to hardware cost. Overall, in this paper, we have provided new parallel FIR structures consisting of advantageous polyphase decompositions dealing with symmetric convolutions, which are comparatively better than the existing FFA structures in terms of hardware consumption.
ISSN: 2231-5381
http://www.ijettjournal.org