A Pipeline VLSI Architecture for High-Speed Computation of the
1-D Discrete Wavelet Transform
ABSTRACT:
In this paper, a scheme for the design of a high-speed pipeline VLSI architecture for the
computation of the 1-D discrete wavelet transform (DWT) is proposed. The main focus of the
scheme is on reducing the number and period of clock cycles for the DWT computation with
little or no overhead on the hardware resources by maximizing the inter- and intrastage
parallelisms of the pipeline. The interstage parallelism is enhanced by optimally mapping the
computational load associated with the various DWT decomposition levels to the stages of the
pipeline and by synchronizing their operations. The intrastage parallelism is enhanced by
decomposing the filtering operation equally into two subtasks that can be performed
independently in parallel and by optimally organizing the bitwise operations for performing
each subtask so that the delay of the critical data path from a partial-product bit to a bit of the
output sample for the filtering operation is minimized. It is shown that an architecture designed
based on the proposed scheme requires fewer clock cycles than
architectures employing comparable hardware resources. In fact, the hardware resource
requirement of an architecture designed using the proposed scheme is also
reduced, since fewer registers need to be employed. Based on the
proposed scheme, a specific example of designing an architecture for the DWT computation is
considered. In order to assess the feasibility and the efficiency of the proposed scheme, the
architecture thus designed is simulated and implemented on a field-programmable gate-array
board. It is seen that the simulation and implementation results conform to the stated goals of
the proposed scheme, thus making the scheme a viable approach for designing a practical and
realizable architecture for real-time DWT computation.
Key-Words: Discrete wavelet transform (DWT), DWT computation, field-programmable gate-array (FPGA) implementation, inter- and intrastage parallelisms, multiresolution filtering,
parallel architecture, pipeline architecture, real-time processing, VLSI architecture.
INTRODUCTION:
Since the development of the theory for the computation of the discrete wavelet
transform (DWT) by Mallat in 1989, the DWT has been increasingly used in many different areas
of science and engineering, mainly because of the multiresolution decomposition property of
the transformed signals. The DWT is computationally intensive because its computation
involves multiple levels of decomposition. It is therefore a challenging problem to
design an efficient VLSI architecture to implement the DWT computation for real-time
applications, particularly those requiring processing of high-frequency or broadband signals.
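To make the cost of multiple decomposition levels concrete, the following worked count assumes the standard dyadic scheme, in which level j filters the N/2^(j-1) low-pass samples passed on from the level before it. The symbols N (input length), J (number of levels) and j (level index) are introduced here for illustration only.

```latex
% Assumes dyadic decimation at every level and N divisible by 2^J.
\[
  \sum_{j=1}^{J} \frac{N}{2^{\,j-1}}
  \;=\; N \cdot \frac{1 - 2^{-J}}{1 - \tfrac{1}{2}}
  \;=\; 2N\left(1 - 2^{-J}\right) \;<\; 2N .
\]
```

Under these assumptions the total number of samples filtered stays below twice the input length, but each level depends on the low-pass output of the previous one, which is what makes scheduling the levels on hardware nontrivial.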
VEDLABS, #112, Oxford Towers, Old airport Road, Kodihalli, Bangalore-08,
www.vedlabs.com , Email id: projects@vedlabs.com, Ph: 080-42040494.
Many architectures have been proposed in order to provide high-speed and area-efficient
implementations for the DWT computation. In one approach, the polyphase matrix of a wavelet filter is
decomposed into a sequence of alternating upper and lower triangular matrices and a diagonal
matrix to obtain the so-called lifting-based architectures, which have low hardware complexity.
However, such architectures have a long critical path, which reduces the processing
rate of input samples. On the other hand, the problem of low processing rate is not acute in the
architectures that use convolution low- and high-pass filtering operations to compute the DWT.
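The convolution-based computation mentioned above can be sketched in software as follows. This is a minimal illustration, not the architecture's implementation: the Haar filter taps and the function names `filter_and_decimate` and `dwt_1d` are assumptions chosen for the example, since the text does not fix a particular wavelet.

```python
import math

# Haar analysis filters, used here only for illustration (an assumption;
# the text does not specify which wavelet filters are employed).
LOW_PASS = [1 / math.sqrt(2), 1 / math.sqrt(2)]
HIGH_PASS = [1 / math.sqrt(2), -1 / math.sqrt(2)]


def filter_and_decimate(signal, taps):
    """Convolve `signal` with `taps` and keep every second output sample."""
    out = []
    # Start where the filter fully overlaps the signal, then step by 2
    # so that filtering and downsampling happen in one pass.
    for n in range(len(taps) - 1, len(signal), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            acc += h * signal[n - k]
        out.append(acc)
    return out


def dwt_1d(signal, levels):
    """Return (final approximation, list of detail bands per level)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        # The high-pass branch yields the detail band of this level.
        details.append(filter_and_decimate(approx, HIGH_PASS))
        # The low-pass branch feeds the next decomposition level.
        approx = filter_and_decimate(approx, LOW_PASS)
    return approx, details


if __name__ == "__main__":
    approx, details = dwt_1d([4, 6, 10, 12, 8, 6, 5, 5], levels=2)
    print(approx, details)
```

Each level applies the same pair of filters to the low-pass output of the previous level, so every level handles half as many samples as the one before it.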
BLOCK DIAGRAM:
Fig 1: Block diagram of the two-stage architecture
Since, in the stage-equalized architectures, the two stages together perform the DWT
computation, with the amount and type of computations of the individual stages being the
same, each of the two stages can use identical processing units. However, the control units to
be employed by the stages have to be different, since, as seen from Algorithm 1 of the previous
section, the operation of stage 1 is autonomous, whereas stage 2 must always synchronize its
operation with that of stage 1. Based on this algorithm, the design of the control unit used by
stage 2 would have to be a bit more involved than that of the control unit used by stage 1.
In order to synchronize the operation of stage 2 with that of stage 1, a buffer has to
be used to store the low-pass output samples from the two stages. Fig. 1 shows a block diagram
incorporating all these requirements for the design of the proposed architecture. The two
processing units are referred to as PU1 in stage 1 and PU2 in stage 2. Note that, in this
architecture, the high-pass samples from PU1 and PU2 are output directly.
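The claim that the two stages carry the same amount of computation can be checked with a short count. The sketch below assumes a mapping that this excerpt does not state explicitly: stage 1 computes decomposition level 1, and stage 2 computes all remaining levels from the buffered low-pass samples. The function names are hypothetical.

```python
def level_output_counts(num_samples, levels):
    """Low-pass (equally, high-pass) output samples produced at each level."""
    counts = []
    n = num_samples
    for _ in range(levels):
        n //= 2  # each level halves the number of samples
        counts.append(n)
    return counts


def stage_loads(num_samples, levels):
    """Output samples per stage under the assumed level-to-stage mapping."""
    counts = level_output_counts(num_samples, levels)
    stage1 = counts[0]       # assumed: stage 1 handles level 1 only
    stage2 = sum(counts[1:])  # assumed: stage 2 handles levels 2..J
    return stage1, stage2


if __name__ == "__main__":
    print(stage_loads(1024, 10))
```

For 1024 input samples and 10 levels this gives 512 outputs for stage 1 and 511 for stage 2, which is consistent with the two stages using identical processing units. Stage 2 can only start on a level once enough low-pass samples are available in the buffer, which is why its control unit must follow stage 1.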
HARDWARE AND SOFTWARE REQUIREMENTS:
Software Requirement Specification:
Operating System: Windows XP with SP2
Synthesis Tool: Xilinx 12.2
Simulation Tool: ModelSim 6.3c
Hardware Requirement Specification:
Minimum Intel Pentium IV processor
Primary memory: 2 GB RAM
Spartan III FPGA
Xilinx Spartan III FPGA development board
JTAG cable, power supply
REFERENCES:
[1] S. Mallat, “A theory for multiresolution signal decomposition: The wavelet representation,”
IEEE Trans. Pattern Anal. Mach. Intell., vol.11, no. 7, pp. 674–693, Jul. 1989.
[2] J. Chilo and T. Lindblad, “Hardware implementation of 1D wavelet transform on an FPGA for
infrasound signal classification,” IEEE Trans. Nucl. Sci., vol. 55, no. 1, pp. 9–13, Feb. 2008.
[3] S. Cheng, C. Tseng, and M. Cole, “Efficient and effective VLSI architecture for a wavelet-based broadband sonar signal detection system,” in Proc. IEEE 14th ICECS, Marrakech,
Morocco, Dec. 2007, pp. 593–596.
[4] K. G. Oweiss, A. Mason, Y. Suhail, A. M. Kamboh, and K. E. Thomson, “A scalable wavelet
transform VLSI architecture for real-time signal processing in high-density intra-cortical
implants,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 54, no. 6, pp. 1266–1278, Jun. 2007.