An Efficient Architecture for 3-D Discrete Wavelet Transform

International Journal of Engineering Trends and Technology (IJETT) – Volume 6 Number 2- Dec 2013
Bala Tejavath1
N.Suresh Babu2
1 PG Student (M. Tech), Dept. of ECE, Chirala Engineering College, Chirala, A.P, India.
2 Professor, Vice Principal & HOD-ECE, Chirala Engineering College, Chirala, A.P, India.
Abstract: The 3-D discrete wavelet transform (DWT) has been widely used in many applications such as image compression, signal processing and speech compression because of its multi-resolution representation of signals with localization in both time and frequency. In the past, many architectures have been proposed that aim at high-speed 3-D DWT computation while using a reasonable amount of hardware resources. These architectures can be broadly classified into separable and non-separable architectures. The separable method is the most straightforward implementation method. In the separable method, the 3-D filtering operation is decomposed into 1-D filtering operations, one for processing the data row-wise and the other column-wise. In this method, the intermediate coefficients are first stored in a frame memory. A 1-D DWT is then performed in the other direction on these intermediate coefficients to complete a one-level 3-D DWT. Because of its size, this frame memory is usually assumed to be off-chip. The non-separable method, by contrast, performs the 1-D DWT in both directions simultaneously. In this paper, a separable pipeline architecture for fast computation of the 3-D DWT with less memory and low latency is proposed. The low latency and reduced memory are achieved by proper design of the three 1-D DWT filtering processes and by efficiently transferring the data between the three 1-D DWT filters.
Keywords: Discrete wavelet transforms, image compression, lifting, video, VLSI
architecture.
ISSN: 2231-5381    http://www.ijettjournal.org

1. Introduction
Nowadays, most applications require real-time DWT engines with large computing capability, for which a fast and dedicated very-large-scale integration (VLSI) architecture appears to be the best possible solution. While it ensures high resource utilization, even on cost-effective platforms such as field programmable gate arrays (FPGA), designing such an architecture also offers some flexibility, such as speeding up the computation by adopting more pipelined structures and parallel processing, and the possibility of reduced memory consumption through better task scheduling, or low-power and portability features. To overcome one of the toughest problems associated with 3-D-DWT architectures, viz., the memory requirement, block-based [7], [8] or scan-based architectures with independent group of pictures (GOP) transform have been reported. However, blocking degrades the PSNR quality, while the independent GOPs introduce annoying jerks in video playback due to the PSNR drop at transform boundaries [1]. Alternatively, some successful scan-based running transform architectures with convolution filtering have been reported that avoid these limitations.

After the advent of the lifting scheme in 1994, the computation of the DWT has experienced a sea change. While providing reduced computational complexity and facilities like in-place computation and ease in building non-linear and inverse wavelets [6], lifting also reduces the memory requirement. Thus, it has become a powerful tool for researchers for the computation of both 2-D and 3-D DWT in several applications. Some lifting-based temporal transform techniques solely with infinite GOPs have been reported in the literature with reduced memory requirements. Nevertheless, following their attempt to regularize the lifting computation and reduce the storage requirement thereafter, the computation of a lifting step is carried out in two stages and performed sequentially. In effect, this doubles the memory referencing and related power consumption while increasing the required processing speed twofold. Besides, those are merely temporal transform methods; clearly, there is a gap in the literature for a complete 3-D-DWT architecture which employs lifting and a running transform with infinite GOP in its working principle.

2. Pipeline for the 3-D DWT Computation
In a pipeline structure for the DWT computation, multiple stages are used to carry out the computations of the various decomposition levels of the transform. The computation corresponding to each decomposition level needs to be mapped to a stage or stages of the pipeline. In order to design a pipeline structure capable of performing a fast computation of the DWT with low expense on hardware resources and low design complexity, an optimal mapping of the overall task of the DWT computation to the various stages of the pipeline needs to be determined. Any distribution of the overall task of the DWT computation to stages must consider the inherent nature of the sequential computations of the decomposition levels that limit the computational parallelism of the pipeline stages, and consequently the latency of the pipeline. Further, in order to minimize the expense on the hardware resources of the pipeline, the number of filter units used by each stage ought to be minimum and proportional to the amount of the task assigned to the stage.

Figure 1 Pipeline structure with N stages

3. Synchronization of stages
The distribution of the computational load among the three stages, and the hardware resources made available to them, are in the ratio 8:2:1. The stages of the pipeline need to be synchronized in such a way that each stage starts its operation at the earliest possible time when the required data become available for its operation. Once the operation of a stage is started, it must continue until the task assigned to it is fully completed.

Consider the timing diagram given in Fig. 2 for the operation of the three stages, where t1, t2 and t3 are the times taken individually by stages 1, 2 and 3, respectively, to complete their assigned tasks, and ta and tb are the times elapsed between the starting points of the tasks by stages 1 and 2, and by stages 2 and 3, respectively.

Figure 2 Timing Diagram for the operations of three stages

Note that the lengths of the times t1, t2 and t3 to complete the tasks by the individual stages are approximately the same, since the ratios of the tasks assigned and the resources made available to the three stages are the same. The average times to compute one output sample by stages 1, 2 and 3 are in the ratio 1:4:8. In Fig. 2, the relative widths of the slots in the three stages are shown to reflect this ratio. Our objective is to minimise the total computation time ta + tb + t3 by minimizing ta, tb and t3 individually.

Design of stages
In the proposed three-stage architecture, stages 1 and 2 perform the computations of levels 1 and 2, respectively, and stage 3 that of all the remaining levels. Fig. 3 shows the block diagram of the three-stage architecture.

Figure 3 Block Diagram of the three-stage architecture

Figure 4 PROPOSED ARCHITECTURE

4. Different types of transforms
1. FT (Fourier Transform).
2. DCT (Discrete Cosine Transform).
3. DWT (Discrete Wavelet Transform).

Discrete Wavelet Transform (DWT)
The discrete wavelet transform (DWT) refers to wavelet transforms for which the wavelets are discretely sampled. It is a transform which localizes a function both in space and scaling and has some desirable properties compared to the Fourier transform. The transform is based on a wavelet matrix, which can be computed more quickly than the analogous Fourier matrix. Most notably, the discrete wavelet transform is used for signal coding, where the properties of the transform are exploited to represent a discrete signal in a more redundant form, often as a preconditioning for data compression. The discrete wavelet transform has a huge number of applications in Science, Engineering, Mathematics and Computer Science.

Wavelet compression is a form of data compression well suited for image compression (sometimes also video compression and audio compression). The goal is to store image data in as little space as possible in a file. A certain loss of quality is accepted (lossy compression).
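As a small illustration of a discretely sampled wavelet transform, the following Python sketch applies one level of the Haar wavelet, the simplest wavelet, to a short signal and then inverts it. The choice of the Haar filter, the function names `haar_forward` and `haar_inverse`, and the sample values are assumptions made purely for illustration; they are not taken from the proposed architecture.

```python
import math

def haar_forward(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns (approx, detail): the low-pass and high-pass coefficients.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed sequence."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

# Hypothetical piecewise-constant test signal.
samples = [4.0, 4.0, 6.0, 6.0, 9.0, 9.0, 2.0, 2.0]
low, high = haar_forward(samples)
restored = haar_inverse(low, high)
```

For this piecewise-constant input, every high-pass coefficient is zero, so the four low-pass coefficients describe the signal completely, while the total number of coefficients still equals the number of input samples.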
Using a wavelet transform, wavelet compression methods are better at representing transients, such as percussion sounds in audio, or high-frequency components in two-dimensional images, for example an image of stars on a night sky. A signal can thus be represented by a smaller amount of information than would be the case if some other transform, such as the more widespread discrete cosine transform, had been used. First, a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e., there is no compression yet since it is only a transform). These coefficients can then be compressed more easily because the information is statistically concentrated in just a few coefficients.

5. Results & Conclusions
In this paper, an architecture for fast computation of the 3-D DWT with less memory and low latency is proposed. The low latency and reduced memory are achieved by proper design of the three 1-D DWT filtering processes and by efficiently transferring the data between the three 1-D DWT architectures. The architecture is simulated, synthesized and implemented in the Verilog language using the Xilinx ISE tool. Figs. 5, 6 and 7 show the simulation results for the implemented 3-D, 2-D and 1-D wavelet transforms, respectively. Fig. 8 shows the RTL schematic of the proposed system. The device utilization summary is shown in Table 1.

Figure 5 TOP MODULE 3-d DWT

Figure 6 Simulation result for 2-d DWT
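The idea of building a 3-D transform from three 1-D filtering passes can be sketched in software as follows. This is a minimal Python model, not the Verilog design: it assumes a one-level Haar averaging filter, a frame/row/column index order, and the hypothetical names `haar_1d` and `dwt3d_separable`. The buffer `out` plays the role of the intermediate storage passed from one 1-D pass to the next.

```python
def haar_1d(line):
    """One-level Haar step: pairwise averages followed by pairwise differences."""
    half = len(line) // 2
    low = [(line[2 * i] + line[2 * i + 1]) / 2.0 for i in range(half)]
    high = [(line[2 * i] - line[2 * i + 1]) / 2.0 for i in range(half)]
    return low + high

def dwt3d_separable(volume):
    """Apply haar_1d along each of the three axes of volume[t][r][c] in turn."""
    T, R, C = len(volume), len(volume[0]), len(volume[0][0])
    # Pass 1: filter each row of each frame (along the column index c).
    out = [[haar_1d(volume[t][r]) for r in range(R)] for t in range(T)]
    # Pass 2: filter each column of each frame (along the row index r).
    for t in range(T):
        for c in range(C):
            col = haar_1d([out[t][r][c] for r in range(R)])
            for r in range(R):
                out[t][r][c] = col[r]
    # Pass 3: filter each pixel position across frames (along the time index t).
    for r in range(R):
        for c in range(C):
            tube = haar_1d([out[t][r][c] for t in range(T)])
            for t in range(T):
                out[t][r][c] = tube[t]
    return out

# Hypothetical input: two 2x2 frames with a constant value.
video = [[[5.0, 5.0], [5.0, 5.0]],
         [[5.0, 5.0], [5.0, 5.0]]]
coeffs = dwt3d_separable(video)
```

For the constant input above, only the coefficient `coeffs[0][0][0]` is non-zero; it holds the mean value 5.0, and all seven remaining sub-band coefficients are zero.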
Figure 7 Simulation result for 1-d DWT

Figure 8 RTL schematic of the proposed system

Table-1 Device Utilization Summary (estimated values)

Logic Utilization             Used   Available   Utilization
Number of Slices              202    4656        4%
Number of Slice Flip Flops    215    9312        2%
Number of 4 input LUTs        355    9312        3%
Number of bonded IOBs         82     232         35%
Number of GCLKs               1      24          4%

Table 1 Device Utilization summary for the device xc3s500e4fg320

Total 16.917ns (11.928ns logic, 4.990ns route) (70.5% logic, 29.5% route)
Total memory usage is 198724 kilobytes

Acknowledgements
The authors would like to thank the anonymous reviewers for their comments which were very helpful in improving the quality and presentation of this paper.

References:
[1] M. Vishwanath, R. Owens, and M. J. Irwin, "VLSI architectures for the discrete wavelet transform," IEEE Trans. Circuits Syst. II, Analog. Digit. Signal Process., vol. 42, no. 5, pp. 305–316, May 1995.
[2] C. Chakrabarti and M. Vishwanath, "Efficient realizations of the discrete and continuous wavelet transforms: From single chip implementations to mapping on SIMD array computers," IEEE Trans. Signal Process., vol. 43, no. 3, pp. 759–771, Mar. 1995.
[3] H. Y. Liao, M. K. Mandal, and B. F. Cockburn, "Efficient architectures for 1D and 2-D lifting-based wavelet transforms," IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1315–1326, May 2004.
[4] D. Guevorkian, P. Liuha, A. Launiainen, and V. Lappalainen, "Architectures for Discrete Wavelet Transforms," U.S. 6976046, Dec. 13, 2005.
[5] M. Alam, W. Badawy, V. Dimitrov, and G. Jullien, "An efficient architecture for a lifted 2-D biorthogonal DWT," J. VLSI Signal Process., vol. 40, pp. 333–342, 2005.
[6] C. Yu and S.-J. Chen, "VLSI implementation of 2-D discrete wavelet transform for real-time video signal processing," IEEE Trans. Consum. Electron., vol. 43, no. 4, pp. 1270–79, Nov. 1997.
[7] P.-C. Wu and L.-G. Chen, "An efficient architecture for two-dimensional discrete wavelet transform," IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 536–545, Apr. 2001.
[8] C.-T. Huang, P.-C. Tseng, and L.-G. Chen, "Memory Analysis and Architecture for Three-Dimensional Discrete Wavelet Transform," in Proceedings of the IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 2004, pp. V13–V16.
[9] P.-C. Tseng, C.-T. Huang, and L.-G. Chen, "Generic RAM-Based Architecture for Three-Dimensional Discrete Wavelet Transform with Line-Based Method," in Proceedings of the Asia-Pacific Conference on Circuits and Systems, 2002, pp. 363–366.
[10] M. Vishwanath, R. Owens, and M. J. Irwin, "VLSI architectures for the discrete wavelet transform," IEEE Trans. Circuits Syst. II, Analog. Digit. Signal Process., vol. 42, no. 5, pp. 305–316, May 1995.

Authors Profile:
Bala Tejavath is pursuing his M. Tech from Chirala Engineering College, Chirala, in the department of Electronics & Communications Engineering (ECE) with specialization in VLSI & Embedded Systems.

Prof. N. Suresh Babu is Vice-Principal & HOD of the ECE Dept. in CEC, Chirala. He got his M.Tech in Microwave Engineering from Birla Institute of Technology, Ranchi. He has 14 years of teaching experience and 2 years of industrial experience in various organisations.