VENSOFT Technologies, www.ieeedeveloperslabs.in Email: info@ieeedeveloperslabs.in Contact: 9448847874
VLSI PROJECTS-2013-2014
1. An Image Steganography Technique using X-Box Mapping
Abstract: Image steganography is a method of concealing information inside a cover image. The
Least-Significant-Bit (LSB) approach is one of the most popular steganographic techniques in the spatial
domain due to its simplicity and hiding capacity. This paper presents a novel LSB-based technique for
image steganography using X-box mapping, in which several X-boxes holding unique data are used.
Embedding uses four unique X-boxes with sixteen different values (each represented by 4 bits), and each
value is mapped to the four LSBs of the cover image. This mapping secures the payload, because the
secret data cannot be extracted without knowing the mapping rules.
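A minimal Python sketch of the embedding idea, for illustration only. The four X-box tables (XBOXES), the
rule that picks a box by cover-byte position, and the one-nibble-per-cover-byte layout are all assumptions;
the abstract does not give the paper's actual tables or selection rule.

```python
# Hypothetical X-boxes: four secret permutations of the sixteen 4-bit values.
# These tables are made up for illustration; they are not taken from the paper.
BASE = [0x6, 0xB, 0x0, 0xD, 0x3, 0xE, 0x9, 0x4, 0xF, 0x1, 0xA, 0x7, 0xC, 0x2, 0x5, 0x8]
XBOXES = [BASE[r:] + BASE[:r] for r in (0, 4, 8, 12)]
INVERSES = [[box.index(v) for v in range(16)] for box in XBOXES]


def embed(cover, payload):
    """Hide each payload byte as two mapped nibbles in the four LSBs of two cover bytes."""
    if len(cover) < 2 * len(payload):
        raise ValueError("cover is too small for the payload")
    stego = bytearray(cover)
    for i, byte in enumerate(payload):
        for j, nibble in enumerate((byte >> 4, byte & 0x0F)):
            k = 2 * i + j
            # Keep the four MSBs of the cover byte, replace the four LSBs
            # with the X-box image of the payload nibble.
            stego[k] = (stego[k] & 0xF0) | XBOXES[k % 4][nibble]
    return bytes(stego)


def extract(stego, n_bytes):
    """Recover n_bytes of payload; only works with the matching inverse X-boxes."""
    out = bytearray()
    for i in range(n_bytes):
        hi = INVERSES[(2 * i) % 4][stego[2 * i] & 0x0F]
        lo = INVERSES[(2 * i + 1) % 4][stego[2 * i + 1] & 0x0F]
        out.append((hi << 4) | lo)
    return bytes(out)
```

Without the inverse tables, the four LSBs of each stego byte look like arbitrary 4-bit values, which is the
property the abstract relies on for payload security.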
2. FPGA Implementation of high speed 8-bit Vedic multiplier using barrel shifter
Abstract- This paper describes the implementation of an 8-bit Vedic multiplier with improved
propagation delay compared with conventional multipliers such as the array multiplier, Braun multiplier,
modified Booth multiplier and Wallace tree multiplier. Our design uses an 8-bit barrel shifter, which
requires only one clock cycle for ‘n’ shifts. The design is implemented and verified using an FPGA and
the ISE Simulator. The core was implemented on a Xilinx Spartan-6 family xc6slx75T-3-fgg676 FPGA. The
propagation delay comparison was extracted from the synthesis report and the static timing report. The
design achieves a propagation delay of 6.781 ns using the barrel shifter in the base selection module
and the multiplier.
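To illustrate how a shifter can replace part of a multiplication, the sketch below models a Nikhilam-style
Vedic product in Python. That the design uses this sutra is an assumption (the abstract only mentions a
base selection module); the function name and the way the base is chosen are hypothetical.

```python
def nikhilam_multiply(a, b):
    """Multiply two non-negative integers using a power-of-two base.

    Assumed base selection: the smallest power of two above the larger operand.
    """
    k = max(a, b).bit_length()      # base = 2**k
    base = 1 << k
    da, db = a - base, b - base     # deviations of the operands from the base
    # a*b = base*(a + db) + da*db; multiplying by the base is a k-bit left
    # shift, which is the step a barrel shifter can do in a single cycle.
    return ((a + db) << k) + da * db
```

The remaining product da*db involves only the deviations from the base; in hardware it would be handled by
a smaller multiplier block.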
3. Design of Low Power TPG Using LP-LFSR
Abstract- This paper presents a novel test pattern generator suited to the built-in self-test (BIST)
structures used for testing VLSI circuits. The objective is to reduce BIST power dissipation
without affecting the fault coverage. The proposed test pattern generator reduces the switching activity
among the test patterns as far as possible. In this approach, the single input change patterns generated
by a counter and a Gray code generator are Exclusive-ORed with the seed generated by the low power
linear feedback shift register (LP-LFSR). The proposed scheme is evaluated using synchronous pipelined
4x4 and 8x8 Braun array multipliers. The System-On-Chip (SOC) approach is adopted for implementation
on Altera Field Programmable Gate Array (FPGA) based SOC kits with a Nios II soft-core processor. The
implementation results verify that the testing power for the proposed method is reduced by a
significant percentage.
4. An Implementation of AES Algorithm Based on FPGA
Abstract—An implementation of a high-speed AES algorithm based on FPGA is presented in this paper in
order to improve the safety of data in transmission. The mathematical principles, encryption process and
logic structure of the AES algorithm are introduced. To improve the computing speed of the system,
pipelining and parallel processing methods are used. The simulation results
show that the high-speed AES encryption algorithm is implemented correctly. With AES encryption, the
data can be protected effectively.
5. Speed optimization of a FPGA based modified Viterbi Decoder
Abstract — In the modern era of electronics and communication, decoding and encoding data using VLSI
technology must meet low-power, small-area and high-speed constraints. The Viterbi decoder using a
survivor path with the necessary parameters for wireless communication is an attempt to reduce power
and cost while increasing speed compared to a normal decoder. This paper has three objectives.
Firstly, a conventional Viterbi decoder is designed and simulated. For faster processing, a Gate
Diffused Input Logic (GDIL) based Viterbi decoder is designed using Xilinx ISE, then simulated and
synthesized successfully. The proposed GDIL Viterbi decoder shows very low path delay and low power in
simulation. Secondly, the GDIL Viterbi decoder is compared with our proposed technique, in which a
Survivor Path Unit (SPU) implements a trace-back method with DRAM. Incorporating DRAM stores the path
information in a manner that allows fast read access without requiring physical partitioning of the
DRAM, which yields a gain in speed at low power. Thirdly, all the Viterbi decoders are simulated,
synthesized and compared, and the proposed approach shows the best simulation and synthesis results
for low-power, high-speed VLSI applications. The Add-Compare-Select (ACS) and Trace Back (TB) units of
the decoders and their sub-circuits operate in a deeply pipelined manner to achieve a high transmission
rate. Although a register-exchange based survivor unit has better throughput than a trace-back unit,
introducing the RAM cell between the ACS array and the output register bank in this paper yields a
significant reduction in path delay. All Viterbi designs are done using Xilinx ISE 12.4 and synthesized
successfully for a Spartan-3 FPGA target device operating at a 64.516 MHz clock frequency, reducing the
total path delay by almost 41%.
6. Traffic-aware Design of a High Speed FPGA Network Intrusion Detection System
Abstract—The security of today’s networks relies heavily on Network Intrusion Detection Systems (NIDSs).
The ability to promptly update the supported rule sets and detect newly emerging attacks makes Field
Programmable Gate Arrays (FPGAs) a very appealing technology. An important issue is how to scale
FPGA-based NIDS implementations to ever faster network links. Whereas a trivial approach is to balance
traffic over multiple, functionally equivalent hardware blocks, each implementing the whole rule set
(several thousand rules), the obvious drawback is the linear increase in resource occupation. In this
work, we promote a different, traffic-aware, modular approach in the design of FPGA-based NIDS.
Instead of purely splitting traffic across equivalent modules, we classify and group homogeneous traffic,
and dispatch it to differently capable hardware blocks, each supporting a (smaller) rule set tailored to
the specific traffic category. We implement and validate our approach using the rule set of the well
known Snort NIDS, and we experimentally investigate the emerging trade-offs and advantages, showing
resource savings up to 80% based on real world traffic statistics gathered from an operator’s backbone.
7. Design and Implementation of Automated Wave-Pipelined Circuit using ASIC
Abstract— Wave-pipelining enables a digital circuit to be operated at higher frequency. In the literature,
only trial and error and manual procedures are adopted for the choice of the optimum value of clock
and clock skew between the I/O registers of wave-pipelined circuits. The major contribution of this
paper is the proposal for automating the above procedure for the ASIC implementation of
wave-pipelined circuits using a built-in self-test approach. The approach is studied on a multiplier
using dedicated AND gates under three schemes: wave-pipelining, pipelining and non-pipelining. The
implementation results verify that the wave-pipelined multipliers are faster by a factor of 1.08
than the non-pipelined multipliers, and that the wave-pipelined multiplier dissipates less power, by a
factor of 1.43, than the pipelined multiplier.
8. FPGA-based adaptive noise cancellation for ultrasonic NDE application
Abstract— Adaptive filters have been widely used in different applications for interference
cancellation, prediction, inverse modeling and identification. In this paper, Field Programmable Gate
Array (FPGA)-based adaptive noise cancellation is studied for adaptive filtering in ultrasonic
non-destructive evaluation. Simulation and experimental results show that backscattered noise from
microstructures inside the material can be efficiently reduced by an adaptive filter. Additionally, four
different architectures for realizing the filter on FPGA are discussed and compared. This type of study
could have a broad range of applications such as target detection, object localization and pattern
recognition.
9. FPGA Based Implementation of a Double Precision IEEE Floating-Point Adder
Abstract-Floating-point addition poses a great challenge when implementing complex algorithms
in hard real time, due to the enormous computational burden of repeated calculations with
high-precision numbers. Moreover, at the hardware level, any basic addition or subtraction circuit has
to incorporate the alignment of the significands. This paper presents a novel technique to implement a
double precision IEEE floating-point adder that can complete the operation within two clock cycles. The
proposed technique improves both the latency and the operational chip area. The proposed double
precision IEEE floating-point adder has been implemented on XC2V6000 and XC3S1500 Xilinx© FPGA devices.
10. Least Complex S-Box and Its Fault Detection for Robust Advanced Encryption Standard Algorithm
Abstract- Advanced Encryption Standard (AES) is the symmetric key standard for encryption and
decryption. In this work, a 128-bit AES encryption and decryption using Rijndael Algorithm is designed
and synthesized using Verilog code. A fault detection scheme for the hardware implementation plays
an important role in making the AES robust to internal and malicious faults. In the proposed AES, a
composite field S-Box and inverse S-Box are implemented using logic gates and divided into five
blocks. Any natural or malicious faults that affect the logic gates are detected using a parity-based
fault detection scheme. To increase the fault exposure, the predicted parities of each block of the
S-Box and inverse S-Box are obtained. The multi-bit parity prediction approach has lower cost and
higher error coverage than approaches using single-bit parities. The Field Programmable Gate Array
(FPGA) implementation of the fault detection structure has better hardware and time complexities.
11. An Application Instance of FPGA in the Field of PHM
Abstract-With the extensive application of new types of sensors, the rates of data acquisition and
transmission are becoming a problem in Prognostic and Health Management (PHM). This paper presents
an FPGA-based method for high-speed data acquisition and transmission that takes full advantage of the
FPGA’s parallel processing capabilities and ease of modification and upgrade, to cope with the
difficulties of high-speed data acquisition and transmission. This article focuses on the components,
performance and characteristics of the FPGA-based fault prediction and diagnosis system. The system
uses an FPGA chip as the core operational component to achieve fast transfer of large amounts of data
through a FIFO. The use of programmable FPGA hardware makes the data acquisition system design flexible
while improving the reliability of the system. It is also important for achieving real-time online
monitoring of failures.
12. A Moving Window Architecture for a HW/SW Codesign Based Canny Edge Detection for FPGA
Abstract - This paper proposes an accelerator for Canny edge detection implemented on FPGA. The
proposed architecture relies on a moving window of 7x8 pixels, which performs the more
computationally complex operations of the algorithm: smoothing, gradient magnitude and direction
computation, non-maximum suppression and double thresholding. By employing the proposed window,
intermediate results are stored within the FPGA, without the need to buffer them in large memory
structures. Furthermore, the design has a high throughput rate, due to its large number of pipeline
stages, allowing considerable performance for the proposed algorithm.
13. An Electrocardiogram Diagnostic System Implemented in FPGA
Abstract—In this paper, we present a signal processing method capable of detecting angina in
electrocardiograms, and its implementation in Field Programmable Gate Array - FPGA. The adopted
procedure is based on fuzzy clustering to reduce the amount of data sampling and a comparison with
samples from a previously established database. By using the correlation method on the samples, it is
possible to establish an initial indication of angina. The reduced number of samples from the clustering
process simplifies the processing and allows its hardware (FPGA) implementation. According to the
tests conducted, the method achieves 85% correct diagnoses.
14. Enhanced Area Efficient Architecture for 128 bit Modified CSLA
Abstract- In the design of integrated circuits, area occupancy plays a vital role because of the
increasing need for portable systems. The Carry Select Adder (CSLA) is a fast adder used in
data-processing processors for performing fast arithmetic functions. The structure of the CSLA leaves
scope for reducing its area through efficient gate-level modification. In this paper, 128 bit Regular
Linear CSLA, Modified Linear CSLA, Regular Square-root CSLA (SQRT CSLA) and Modified SQRT CSLA
architectures have been developed and compared. The Regular CSLA is area-consuming due to its dual
Ripple Carry Adder (RCA) structure. To reduce area, the CSLA can be implemented using a single RCA and
an add-one circuit instead of dual RCAs. The Regular SQRT CSLA has a smaller area than the Regular
Linear CSLA, and the Modified SQRT CSLA has a smaller area than the Modified Linear CSLA. The results
and analysis show that the Modified Linear CSLA and Modified SQRT CSLA provide better outcomes than the
Regular Linear CSLA and Regular SQRT CSLA respectively. This project aimed at implementing a
high-performance optimized FPGA architecture. ModelSim 6.3c is used for simulating the CSLA, and the
design is synthesized using Xilinx PlanAhead 12.2. The implementation is then done on a Spartan-3 FPGA
kit.
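The single-RCA-plus-add-one idea can be modelled at bit level in Python. This is a behavioural sketch
only: the 16-bit width, the uniform 4-bit groups (a linear arrangement) and all function names are
assumptions, and the width is assumed to be a multiple of the group size.

```python
def ripple_carry_add(a, b, width):
    """Bit-serial ripple-carry addition with carry-in 0; returns (sum, carry_out)."""
    carry, total = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def add_one(total, carry, width):
    """Add-one circuit: increments the (width+1)-bit value {carry, total}."""
    value = ((carry << width) | total) + 1
    return value & ((1 << width) - 1), (value >> width) & 1


def csla_add(a, b, width=16, group=4):
    """Carry-select addition with one RCA and one add-one circuit per group."""
    mask = (1 << group) - 1
    result, carry = 0, 0
    for low in range(0, width, group):
        ga, gb = (a >> low) & mask, (b >> low) & mask
        s0, c0 = ripple_carry_add(ga, gb, group)   # result for carry-in = 0
        s1, c1 = add_one(s0, c0, group)            # result for carry-in = 1
        # The multiplexer: the incoming carry selects one of the two results.
        s, carry = (s1, c1) if carry else (s0, c0)
        result |= s << low
    return result, carry
```

In hardware the group results are computed in parallel and only the select signal ripples between groups;
the loop above models the selection, not the timing.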
15. PNOC: Implementation on Verilog for FPGA
Abstract—Network on Chip (NoC) architectures provide a very efficient means for performance
enhancement in digital circuits. The paper describes a NoC implementation that is specifically targeted
towards FPGA based designs. Our implementation is based on a lightweight circuit-switched
architecture called programmable NoC (PNoC). It is captured in the Verilog hardware description
language and is implemented using the Xilinx Virtex-II Pro FPGA (XC2VP30-7) device at 126 MHz. The
proposed architecture allows parameterization at compile time of the number of nodes and the amount
of data. Moreover, experimental results have confirmed that the proposed implementation is the most
efficient one in terms of performance.
16. Real-time FPGA-based Template Matching Module for Visual Inspection Application
Abstract-Template matching enables localization of objects under inspection, but suffers from long
computation times due to high computational complexity. In this work, a real-time FPGA-based template
matching module that accelerates time-consuming normalized cross-correlation (NCC) template
matching is presented. To improve NCC computation speed, we simplified the original NCC algorithm
and designed a pipelined parallel processing circuit architecture. Experimental results showed that our
FPGA module accelerates NCC by up to 80 times compared with PC performance. The real-time template
matching module has been integrated into an LED die inspection system to localize LED dies. This
real-time template matching module can be applied to LED, PV, and semiconductor inspection and
manufacturing applications. Furthermore, this module can also be applied to vision applications for
either service or industrial robots.
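For reference, the textbook (unsimplified) NCC that such a module accelerates can be written in a few
lines of Python. This is not the paper's simplified algorithm; the function names and the list-of-rows
image format are assumptions.

```python
from math import sqrt


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = sqrt(sum((p - mean_p) ** 2 for p in patch)
               * sum((t - mean_t) ** 2 for t in template))
    return num / den if den else 0.0


def best_match(image, template):
    """Slide the template over the image; return (best score, (row, col))."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best = (-2.0, (0, 0))
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = [image[y + r][x + c] for r in range(th) for c in range(tw)]
            score = ncc(patch, flat_t)
            if score > best[0]:
                best = (score, (y, x))
    return best
```

The nested loops show why the computation is expensive: every candidate position repeats the full sum
over the template, which is the work a pipelined parallel circuit can overlap.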
17. High-performance Implementation of a New Hash Function on FPGA
Abstract— Skein has the advantages of higher security (resistance to traditional attacks), faster
speed and selectable parameters. It is therefore a strong competitor for the next-generation secure
hash algorithm standard (SHA-3), which will be used widely in communication and security as a
substitute for SHA-2. Existing works implement only one structure and lack a detailed comparison of
different structures. Based on an analysis of the algorithm, we implemented three structures
(iterative, 4-unrolled and 8-unrolled) of Skein and ported each design to FPGA. Finally, a detailed
analysis and comparison of our different structures and other implementations are provided in terms of
hardware resources and performance. The results show that our implementation has better performance and
takes up fewer hardware resources than existing works under the same structure. Our implementation can
meet the requirements of real-time, high-performance applications.
18. Improved Floating-Point Matrix Multiplier
Abstract – Floating-point matrix multipliers are widely used in scientific computations. A great deal of
effort has been made to achieve higher performance. Matrix multiplication consists of many
multiplications and accumulations. Yang and Duh proposed a modular design of a floating-point matrix
multiplier that keeps the intermediate result as two vectors. It brings shorter delay but higher cost.
This work modifies Yang and Duh’s design with Booth encoding in multiplication to reduce the number of
partial products. As a result, the improved floating-point matrix multiplier has better performance,
with shorter delay and much lower hardware cost than Yang and Duh’s design.
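How Booth encoding cuts the number of partial products can be shown with a small Python model. Radix-4
recoding of an unsigned operand is assumed here; the abstract does not state the radix, and the function
names are hypothetical.

```python
# Radix-4 Booth digit for each 3-bit window (y[i+1], y[i], y[i-1]).
BOOTH_DIGIT = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
               0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}


def booth_radix4_digits(multiplier, width):
    """Recode an unsigned width-bit multiplier into digits in {-2, -1, 0, 1, 2}."""
    padded = multiplier << 1            # implicit 0 below the least significant bit
    digits = []
    for i in range(0, width + 1, 2):    # zero-extended above the most significant bit
        digits.append(BOOTH_DIGIT[(padded >> i) & 0b111])
    return digits


def booth_multiply(a, b, width):
    """Sum one shifted partial product per Booth digit of b."""
    return sum((d * a) << (2 * j)
               for j, d in enumerate(booth_radix4_digits(b, width)))
```

An 8-bit multiplier yields 5 digits, so 5 partial products instead of 8; each one is 0, ±a or ±2a, which
needs only a shift and an optional negation.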
19. An area efficient multiplexer based CORDIC
Abstract—In the literature, the use of multiplexers has been proposed for the ASIC implementation of
an unrolled CORDIC (COordinate Rotation DIgital Computer) processor. In this paper, the efficacy of this
approach is studied for implementation on FPGA. The study covers both non-pipelined and 2-level
pipelined CORDIC with 8 stages, each under two schemes: one using adders in all the stages and another
using multiplexers in the second and third stages. A 16-bit CORDIC for generating the sine/cosine
functions is implemented using all four schemes on both a Xilinx Virtex 6 FPGA (XC6VLX240) and an Altera
Cyclone II FPGA (EP2C20F484C7). The implementation results show that the non-pipelined and pipelined
CORDICs using multiplexers require 1.6 and 1.4 times less area on the Xilinx FPGA, and 1.8 and 1.6 times
less area on the Altera FPGA, than those using only adders. This is achieved without reduction in speed.
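As background, the adder-only scheme corresponds to the standard rotation-mode CORDIC, sketched below in
fixed-point Python. The 14 fractional bits and all names are assumptions, and the multiplexer-based stages
studied in the paper are not modelled.

```python
from math import atan, sqrt

STAGES = 8
FRAC = 14  # assumed fixed-point format: 14 fractional bits in a 16-bit word
ONE = 1 << FRAC

# Elementary angles atan(2^-i) and the CORDIC gain, both precomputed.
ANGLES = [round(atan(2.0 ** -i) * ONE) for i in range(STAGES)]
GAIN = 1.0
for i in range(STAGES):
    GAIN *= sqrt(1.0 + 2.0 ** (-2 * i))
X0 = round(ONE / GAIN)  # pre-scaled start value so no final multiply is needed


def cordic_sincos(theta):
    """Return (sin, cos) of theta in radians, for |theta| up to about 1.73."""
    x, y, z = X0, 0, round(theta * ONE)
    for i in range(STAGES):
        # Each stage needs only shifts and add/subtract operations.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return y / ONE, x / ONE
```

With 8 stages the residual angle is bounded by roughly atan(2^-7), so the results are accurate to about
two decimal places.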
20. Simulation and Implementation of LDPC Code in FPGA
Abstract—The paper deals with the implementation of Low-Density Parity-Check (LDPC) codes [1] in an
FPGA-based bridge for a Free-Space Optical link. The coder was designed with a regular parity matrix for
code rate 1/2. The 8x16 matrix for the experimental implementation was found using a random
search in MATLAB. The main advantage of this matrix is that the decoder can correct all single-bit
errors. The simulation for all possible values shows that the Bit Error Ratio (BER) is zero. This result
was not obtained with other matrices. An experimental communication channel was realized with the encoder
and decoder implemented on the Virtex 5 FPGA development board ML505. DIP switches are the sources of
information bits, and these values are shown on the LCD display. The bit-flipping method is used in the
decoder, and the resulting code word is shown on the second line of the LCD display.
21. FPGA based implementation of a fuzzy Neural network modular architecture For embedded
systems
This paper presents an FPGA-based approach to a modular architecture for Fuzzy Neural Networks (FNN)
that makes it easy to embed different topology setups. The project is based on the Takagi-Hayashi (T-H)
method for the construction and tuning of fuzzy rules, commonly referred to as neural network driven
fuzzy reasoning. The proposed architecture consists of two main configurable modules: the first is a
Multilayer Perceptron (MLP) with a sigmoidal activation function that determines a fuzzy membership
function; the second employs an MLP with a pure linear activation function to define the consequents.
The DSPBuilder® software, along with Simulink®, is used to connect, set and synthesize the desired Fuzzy
Neural Network. Other hardware components employed in the proposed architecture contribute to the
system modularity. The system was tested and validated through a control problem and an interpolation
problem. Several papers have proposed different hardware architectures to implement hybrid systems using
fuzzy logic and neural networks. However, there is no approach that uses this specific neural network
driven fuzzy reasoning by the T-H method and is intended to be embedded. The Self-Organizing Map (SOM)
and Levenberg-Marquardt back propagation were used to train the proposed FNN off-line.
22. Efficient Interleaver Design for MIMO-OFDM Based Communication Systems on FPGA
In this paper, we present a memory-efficient and faster interleaver implementation technique for
MIMO-OFDM communication systems on FPGA. The IEEE 802.16 standard is used as a reference for
simulation, implementation, and analysis. A method for the interleaver design on FPGA and its memory
utilization results are presented. Our design utilizes the minimum required on-chip memory for the
interleaver implementation. Using the proposed interleaver design method, the data rates for
MIMO-OFDM based communication systems are doubled for 2×2 MIMO systems without using transmit
diversity.
23. A FPGA-Based Deep Packet Inspection Engine for Network Intrusion Detection System
Abstract— Pattern matching has become a bottleneck of software-based Network Intrusion Detection
Systems (NIDS) as the number of signatures has recently increased dramatically. Many FPGA-based
architectures for detecting malicious patterns have been proposed recently. However, these approaches
consider only the matching of individual patterns, while increasingly complex combinations of several
patterns are used to describe intrusion activities. In this paper we present our work, which
concentrates on multi-pattern signatures, and propose an FPGA-based deep packet inspection engine for
NIDS. The system supports both static and dynamic patterns. We employ the Snort signature set and
realize our system on the NetFPGA platform. Evaluation in a real network environment shows that our
system can maintain gigabit line-rate throughput without dropping packets.
24. FPGA Implementation of 16-Point Radix-4 Complex FFT Core Using NEDA
Abstract--NEDA is one of the techniques to implement many digital signal processing systems that
require multiply and accumulate units. FFT is one of the most employed blocks in many communication
and signal processing systems. This paper proposes an FPGA implementation of a 16-point radix-4
complex FFT core using NEDA. The proposed design improves hardware utilization
compared to traditional methods. The design has been implemented on a range of FPGAs to compare
the performance. The proposed design has a power consumption of 728.89 mW on the XC2VP100-6FF1704
FPGA at 50 MHz. The maximum frequency achieved is 114.27 MHz on the XC5VLX330-2FF1760 FPGA at the
cost of higher power; the maximum throughput observed is 1828.32 Mbit/s and the minimum slice-delay
product observed is 9.18. The design is also implemented using Synopsys DC synthesis for both 65 nm
and 180 nm technology libraries.
25. A High-Performance FPGA-Based Implementation of the LZSS Compression Algorithm
The increasing growth of embedded networking applications has created a demand for
high-performance logging systems capable of storing huge amounts of high-bandwidth, typically redundant
data. An efficient way of maximizing the logger performance is real-time compression of the
logged stream. In this paper we present a flexible high-performance implementation of the LZSS
compression algorithm capable of processing up to 50 MB/s on a Virtex-5 FPGA chip. We exploit the
independently addressable dual-port block RAMs inside the FPGA chip to achieve an average
performance of 2 clock cycles per byte. To make the compressed stream compatible with the ZLib library
[1] we encode the LZSS algorithm output using a fixed Huffman table defined by the Deflate
specification [2]. We also demonstrate how changing the amount of memory allocated to various
internal tables impacts the performance and compression ratio. Finally, we provide a cycle-accurate
estimation tool that allows finding a trade-off between FPGA resource utilization, compression ratio and
performance for a specific data sample.
26. FPGA Implementation of heterogeneous multicore platform with SIMD/MIMD custom
accelerators
27. Novel High Speed Vedic Mathematics Multiplier using Compressors
Abstract—With the advent of new technology in the fields of VLSI and communication, there is an
ever-growing demand for high-speed processing and low-area design. The multiplier unit forms an
integral part of processor design, so high-speed multiplier architectures are needed. In this paper, we
introduce a novel architecture to perform high-speed multiplication using ancient Vedic maths
techniques. A new high-speed approach utilizing 4:2 compressors and novel 7:2 compressors for addition
has also been incorporated and explored. In comparison, the compressor-based multiplier introduced in
this paper is almost two times faster than the popular methods of multiplication. With regard to area, a
1% reduction is seen. The design and experiments were carried out on a Xilinx Spartan-3E series FPGA, on
which the timing and area of the design have been calculated.
28. FPGA Implementation of BASK-BFSK-BPSK Digital Modulators
Abstract: Field-programmable gate-array (FPGA) implementations of binary amplitude-shift keying
(BASK), binary frequency shift keying (BFSK), and binary phase-shift keying (BPSK) digital modulators are
presented. The proposed designs are aimed at educational purposes in a digital communication course.
They employ the minimum number of blocks necessary for achieving BASK, BFSK, and BPSK modulation,
and for full integration with the other functional parts of the Altera Development and Education (DE2)
FPGA board. The input carrier signal and the bit stream (modulating signal) are user controllable. These
digital modulators were developed and compiled to a Verilog Hardware Description Language (HDL)
netlist, and were later implemented into an Altera DE2 FPGA board. The functionality of these digital
modulators was demonstrated through simulations using the Quartus II simulation software, and
experimental measurements of the real-time modulated signal via an oscilloscope.
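The three modulation schemes can be modelled in a few lines of Python. This is a behavioural sketch and
not the Verilog design: the sample counts, the sine carrier and the choice of doubling the carrier
frequency for a BFSK "1" are assumptions.

```python
from math import pi, sin


def modulate(bits, scheme, samples_per_bit=8, cycles_per_bit=1):
    """Return the sampled modulated waveform for a list of 0/1 bits."""
    out = []
    for bit in bits:
        for n in range(samples_per_bit):
            phase = 2 * pi * cycles_per_bit * n / samples_per_bit
            if scheme == "BASK":      # carrier on for 1, off for 0
                out.append(sin(phase) if bit else 0.0)
            elif scheme == "BFSK":    # two carrier frequencies
                out.append(sin(2 * phase) if bit else sin(phase))
            elif scheme == "BPSK":    # 180 degree phase shift for 0
                out.append(sin(phase) if bit else -sin(phase))
            else:
                raise ValueError("unknown scheme: " + scheme)
    return out
```

For example, modulate([1, 0], "BPSK") returns one carrier cycle followed by the same cycle inverted.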
29. Design and Implementation of Adaptive filtering algorithm for Noise Cancellation in speech signal
on FPGA
Abstract- In recent years FPGA systems have been replacing dedicated Programmable Digital Signal
Processor (PDSP) systems due to their greater flexibility and higher bandwidth, resulting from their
parallel architecture. This paper presents the applicability of an FPGA system for speech processing.
Here, an adaptive filtering technique is used for noise cancellation in speech signals. Least Mean
Squares (LMS), one of the most widely used algorithms in signal processing, is implemented for
adaptation of the filter coefficients. The cancellation system is implemented in VHDL and tested for
noise cancellation in speech signals. The VHDL design of the adaptive filter is simulated and analyzed
on the basis of Signal to Noise Ratio (SNR) and Mean Square Error (MSE).
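A floating-point Python model of an LMS noise canceller is sketched below for orientation. The tap count,
step size mu and function name are assumptions, and the paper's VHDL design would use fixed-point
arithmetic instead.

```python
def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Adaptive noise cancellation with the LMS update.

    primary:   speech plus filtered noise, one sample per step
    reference: the noise source alone
    Returns (cleaned samples, final filter weights).
    """
    w = [0.0] * taps
    buf = [0.0] * taps
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]                          # newest reference sample first
        y = sum(wi * bi for wi, bi in zip(w, buf))    # current noise estimate
        e = d - y                                     # error doubles as cleaned output
        w = [wi + 2 * mu * e * bi for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w
```

The error signal both drives the weight update and serves as the output: once the filter matches the
noise path, what remains in e is the speech.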
30. A Programmable FPGA-based 8-Channel Arbitrary Waveform Generator for Medical Ultrasound
Research Activities
Abstract: In modern ultrasound imaging systems, the digital transmit beamformer module typically
provides accurate control of the amplitude of individual elements in a multielement array probe, as
well as of the time delays and phase between them, to enable the acoustic beam to be focused and/or
steered electronically. However, these systems do not give ultrasound researchers access to the
transmit front-end module. This paper presents the development of a digital transmit beamformer
system for generating simultaneous arbitrary waveforms, specifically designed for research purposes.
The proposed architecture has 8 independent excitation channels and uses an FPGA (Field
Programmable Gate Array) device for electronic steering and focusing of the ultrasound beam. The system
allows operation in pulse-echo mode, with a pulse repetition rate of excitation from 62.5 Hz to 8 kHz,
center frequency from 500 kHz to 20 MHz, excitation voltage over 100 Vpp, and individual control of
amplitude apodization, phase angle and time delay trigger. Experimental results show that this
technique is suitable for generating the excitation waveforms needed for medical ultrasound imaging
research.
31. Area-Time Efficient Scaling-Free CORDIC Using Generalized Micro-Rotation Selection
Abstract—This paper presents an area-time efficient CORDIC algorithm that completely eliminates the
scale-factor. By suitable selection of the order of approximation of the Taylor series, the proposed
CORDIC circuit meets the accuracy requirement and attains the desired range of convergence. In addition,
we propose an algorithm to redefine the elementary angles to reduce the number of CORDIC iterations.
A generalized micro-rotation selection technique based on high speed most-significant-1-detection
obviates the complex search algorithms for identifying the micro-rotations. The proposed CORDIC
processor provides the flexibility to manipulate the number of iterations depending on the accuracy,
area and latency requirements. Compared to the existing recursive architectures the proposed one has
17% lower slice-delay product on Xilinx Spartan XC2S200E device.
32. ADPLL Design and Implementation on FPGA
Abstract:-This paper presents an ADPLL design in Verilog HDL and its implementation on FPGA. The Xilinx
ISE 10.1 simulator is used for simulating the Verilog code. The paper describes the basic blocks of an
ADPLL and its implementation in detail, and discusses the simulation results obtained with Xilinx tools.
It also presents the FPGA implementation of the ADPLL design on a Xilinx Virtex-5 xc5vlx110t chip and its
results. The ADPLL is designed for a 200 kHz center frequency. The operational frequency range of the
ADPLL is 189 Hz to 215 kHz, which is the lock range of the design.
33. Implementation and Comparison of Effective Area Efficient Architectures for CSLA
Abstract- In the design of Integrated circuits, area occupancy plays a vital role because of increasing
necessity of portable systems. Carry Select Adder (CSLA) is a fast adder used in data processing
processors for performing fast arithmetic functions. From the structure of the CSLA, the scope is to
reduce the area of CSLA based on the efficient gate-level modification. In this paper 128 bit Regular
Linear CSLA, Modified Linear CSLA, Regular Square-root CSLA (SQRT CSLA) and Modified SQRT CSLA
architectures have been developed and compared. However, the Regular CSLA is still area-consuming
due to the dual Ripple Carry Adder (RCA) structure. To reduce area, the CSLA can be implemented
using a single RCA and an add-one circuit instead of dual RCAs. The Regular SQRT CSLA has a smaller
area than the Regular Linear CSLA, and likewise the Modified SQRT CSLA has a smaller area than the
Modified Linear CSLA. The results and
analysis show that the Modified Linear CSLA and Modified SQRT CSLA provide better outcomes than the
Regular Linear CSLA and Regular SQRT CSLA respectively. This project aimed to implement a
high-performance, optimized FPGA architecture. ModelSim 6.3c is used for simulating the CSLA, and the
design is synthesized using Xilinx PlanAhead 12.2. The implementation is then done on a Spartan-3 FPGA kit.
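The single-RCA-plus-add-one idea can be illustrated with a small bit-level Python model. This is only a behavioural sketch, not the gate-level design from the paper; the function names and the fixed group size of 4 bits (a linear arrangement) are assumptions made for illustration.

```python
def ripple_carry_add(a, b, width, carry_in=0):
    """Bit-by-bit ripple-carry adder; returns (sum, carry_out) for `width` bits."""
    total, carry = 0, carry_in
    for i in range(width):
        abit, bbit = (a >> i) & 1, (b >> i) & 1
        total |= (abit ^ bbit ^ carry) << i
        carry = (abit & bbit) | (carry & (abit ^ bbit))
    return total, carry

def add_one(value, carry, width):
    """Add-one circuit: increments the (carry, value) result of the single RCA."""
    full = ((carry << width) | value) + 1
    return full & ((1 << width) - 1), (full >> width) & 1

def csla_add(a, b, width=16, group=4):
    """Carry-select addition using one RCA and one add-one circuit per group."""
    result, carry = 0, 0
    for low in range(0, width, group):
        w = min(group, width - low)
        mask = (1 << w) - 1
        ga, gb = (a >> low) & mask, (b >> low) & mask
        s0, c0 = ripple_carry_add(ga, gb, w)        # result for carry-in = 0
        s1, c1 = add_one(s0, c0, w)                 # result for carry-in = 1
        s, carry = (s1, c1) if carry else (s0, c0)  # multiplexer on incoming carry
        result |= s << low
    return result, carry
```

In hardware both candidate results of every group are available in parallel, so only the multiplexer chain depends on the incoming carry; the loop above models the selection, not the timing.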
34. Automatic Gain Control on FPGA for Software-Defined Radios
Abstract— This paper introduces an efficient implementation of the Automatic Gain Control (AGC) for
an IEEE 802.15.3c compliant receiver developed using a Field-Programmable Gate Array (FPGA). Both
feed-forward and feedback AGCs are designed, implemented and evaluated.
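The difference between the two AGC styles can be sketched in a few lines of Python. The abstract gives no implementation details, so everything here (function names, the mean-absolute-value level estimate, the block size and the loop step) is an assumption chosen for illustration only.

```python
def feedforward_agc(samples, target=1.0, window=8):
    """Feed-forward AGC: measure the input level of each block, then scale that block."""
    out = []
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        level = sum(abs(s) for s in block) / len(block)
        gain = target / level if level > 0 else 1.0
        out.extend(gain * s for s in block)
    return out

def feedback_agc(samples, target=1.0, step=0.05, gain=1.0):
    """Feedback AGC: the output level error adjusts the gain for later samples."""
    out = []
    for x in samples:
        y = gain * x
        out.append(y)
        gain = max(gain + step * (target - abs(y)), 0.0)  # first-order control loop
    return out, gain
```

The feed-forward form reacts within one block but needs a division; the feedback form needs only a multiply-accumulate but settles gradually, at a rate set by the step size.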
35. Design of Plural-Multiplier Based on CORDIC Algorithm for FFT Application
Abstract—The CORDIC plural-multiplier is the key module affecting the speed and accuracy of an FFT
processor. Considering these demands, the problems of the CORDIC algorithm are discussed in detail and the
corresponding optimization methods are given in this paper. Then, the hardware pipelining structure of the
CORDIC multiplier is put forward. Comparison of the RTL simulation results with MATLAB
calculations indicates that the design is feasible and practical.
36. VLSI Friendly ECG QRS Complex Detector for Body Sensor Networks
Abstract—This paper aims to present a very-large-scale integration (VLSI) friendly electrocardiogram
(ECG) QRS detector for body sensor networks. Baseline wandering and background noise are removed
from original ECG signal by a mathematical morphological method. Then the multipixel modulus
accumulation is employed to act as a low-pass filter to enhance the QRS complex and improve the
signal-to-noise ratio. The performance of the algorithm is evaluated with standard MIT-BIH arrhythmia
database and wearable exercise ECG data. A corresponding power- and area-efficient VLSI architecture is
designed and implemented on a commercial nano-FPGA. High detection rate and high speed
demonstrate the effectiveness of the proposed detector.
37. FPGA Architecture for OFDM Software Defined Radio with an optimized Direct Digital Frequency
Synthesizer
Abstract— A Software Defined Radio (SDR) is a transmitter and receiver system that uses digital signal
processing (DSP) for coding, decoding, modulating, and demodulating data. This paper presents the
framework for hardware implementation of SDR using Orthogonal Frequency Division Multiplexing
(OFDM). The framework comprises the VLSI mapping of algorithms: Orthogonal Frequency Division
Multiplexing (OFDM), Quadrature Phase Shift Keying (QPSK), Fast Fourier Transform (FFT) algorithms
and, most importantly, the algorithm for Direct Digital Frequency Synthesis (DDFS). A digital frequency
synthesizer with optimized time and area resources has been proposed for the SDR. This VLSI
implementation of the DDFS computes the sine and cosine function on a single edge of clock, thus
proving to be optimized in terms of area and speed. Fixed-point implementation was accomplished with
the ModelSim simulator. Verilog HDL was used as the description language for mapping the algorithms to VLSI.
A Xilinx Spartan 3 XC3S200 Field Programmable Gate Array (FPGA) was chosen as the hardware platform for
the system implementation.
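For readers unfamiliar with direct digital frequency synthesis, the following Python sketch shows the generic phase-accumulator and look-up-table form, producing a sine and a cosine sample per clock. It is not the optimized DDFS of the paper; the bit widths, the table size and the names make_sine_lut and ddfs are illustrative assumptions.

```python
import math

def make_sine_lut(addr_bits=8, amp_bits=10):
    """Quantized one-period sine table with 2**addr_bits entries."""
    size = 1 << addr_bits
    scale = (1 << (amp_bits - 1)) - 1
    return [round(scale * math.sin(2 * math.pi * k / size)) for k in range(size)]

def ddfs(tuning_word, n_samples, acc_bits=24, addr_bits=8, amp_bits=10):
    """Return (sin, cos) sample lists; f_out = tuning_word * f_clk / 2**acc_bits."""
    lut = make_sine_lut(addr_bits, amp_bits)
    size = 1 << addr_bits
    phase, sin_out, cos_out = 0, [], []
    for _ in range(n_samples):
        addr = phase >> (acc_bits - addr_bits)          # truncate phase to a LUT address
        sin_out.append(lut[addr])
        cos_out.append(lut[(addr + size // 4) % size])  # cosine = sine advanced by 90 degrees
        phase = (phase + tuning_word) & ((1 << acc_bits) - 1)  # accumulator wraps around
    return sin_out, cos_out
```

With a 24-bit accumulator, a tuning word of 2**22 gives an output at one quarter of the clock frequency, i.e. a four-sample period.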
38. FPGA based Area optimized and efficient Architecture for NMS and Thresholding used for Canny
Edge Detector
Abstract— In this paper, we present an architecture for Non-Maximal Suppression (NMS) used in the Canny edge
detection algorithm that results in significantly reduced memory requirements, decreased latency and
increased throughput with no loss in edge detection. The new algorithm uses a low-complexity 8-bin
non-uniform gradient magnitude histogram to compute block-based hysteresis thresholds that are used
by the Canny edge detector. Furthermore, FPGA-based hardware architecture of our proposed
algorithm is presented in this paper and the architecture is synthesized on the Xilinx Virtex 5 FPGA. The
design is developed in VHDL and the results are simulated in ModelSim 6.3 using Xilinx 12.2.
39. Self-Programmable Multipurpose Digital Filter Design Based on FPGA
Abstract: A conventional digital filter can only provide fixed frequency-domain characteristics at a time. In
order to obtain variable characteristics, the digital filter's type, number of taps and coefficients should
be changed constantly such that the desired frequency-domain characteristics can be obtained. This
paper proposes a method for self-programmable Variable Digital Filter (VDF) design based on FPGA.
Taking the finite impulse response (FIR) digital filter as an example, we implement a digital filter system
using a custom embedded microprocessor, a programmable FIR macro module, a coefficient loader, a clock
manager, an A/D or D/A controller and other modules. The self-programmable VDF can provide the
best solution for realization of digital filter algorithms, which are the low-pass, high-pass, band-pass and
band-stop filter algorithms with variable frequency domain characteristics. Design examples of FIR
filters with 1 to 32 taps, based on ModelSim post-route simulation and on-board
operation on the XUPV5-LX110T, are provided to demonstrate the effectiveness of the proposed method and
the online adaptability of variable digital filters.
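The role of the coefficient loader can be pictured with a small software model: a direct-form FIR filter whose tap set is replaced at run time. This is a behavioural sketch under assumed names (ProgrammableFIR, load, step), not the FPGA design of the paper; only the 1 to 32 tap range is taken from the abstract.

```python
class ProgrammableFIR:
    """Direct-form FIR filter whose coefficients can be reloaded at run time."""

    def __init__(self, coeffs, max_taps=32):
        self.max_taps = max_taps
        self.load(coeffs)

    def load(self, coeffs):
        """Coefficient-loader step: swap in a new tap set and clear the delay line."""
        if not 1 <= len(coeffs) <= self.max_taps:
            raise ValueError("tap count must be between 1 and max_taps")
        self.coeffs = list(coeffs)
        self.delay = [0.0] * len(coeffs)

    def step(self, sample):
        """Shift in one sample and return the filter output."""
        self.delay = [sample] + self.delay[:-1]
        return sum(c * d for c, d in zip(self.coeffs, self.delay))
```

Loading [0.5, 0.5] gives a two-tap averaging (low-pass) response, while loading [1, -1] turns the same structure into a first-difference (high-pass) filter, which is the kind of characteristic change the abstract describes.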
40. FPGA-based reconfigurable control for fault tolerant Back-to-back converter without redundancy
Abstract— In this paper, an FPGA-based fault tolerant back-to-back converter without redundancy is
studied. Before fault occurrence, the fault tolerant converter operates like a conventional back-to-back
six-leg converter and after the fault it becomes a five-leg converter. Design, implementation and
experimental verification of an FPGA-based reconfigurable control strategy for this converter are
discussed. This reconfigurable control strategy allows the continuous operation of the converter with
minimal impact from a fault in one of the semiconductor switches. A very fast detection scheme is
used to detect and locate the fault. Implementation of the fault detection and of the fully digital control
schemes on a single FPGA is realized, based on a suitable methodology for rapid prototyping.
FPGA-in-the-loop and experimental tests are carried out and the results are presented. These results confirm the
capability of the proposed reconfigurable control and fault tolerant structure.
41. Reliable On-Chip Network Design Using an Agent-based Management Method
Abstract—As the complexity of evolving integrated circuits and the number of cores in each chip
increase, reliability aspects are becoming an important issue in complex chip designs. In this paper, we
present an on-chip network architecture that incorporates a novel agent-based management method to
enhance the reliability and performance of network-based Chip Multi-Processor (CMP) and
System-on-Chip (SoC) designs against faulty links and routers. In addition, to utilize the fault information required
for the routing process in a scalable manner, we classify the fault information to be exploited in the
proposed distributed and hierarchical management structure. The experimental results show that the
proposed architecture incurs only a small hardware overhead.
42. Advanced Cryptographic System for data Encryption and Decryption
Abstract- In this paper, we mainly focus on the implementation of an Advanced Cryptographic System using
two crypto algorithms in a single chip to provide high security and high performance. This paper
demonstrates the confidentiality of data over an insecure medium. The two algorithms, namely 128-bit AES
and RSA, implemented in a single chip make the system difficult for an attacker to crack. It combines two
transformations of the AES algorithm, which achieves high speed and less area on chip. This system
generates keys internally, which also achieves high security and high performance with faster execution
of the algorithm.
43. Mapping FPGA to Field Programmable neural Network Array (FPNNA)
Abstract: This paper presents the implementation of a generalized back-propagation multilayer
perceptron (MLP) architecture, on FPGA, described in VHSIC hardware description language (VHDL). The
development of hardware platforms is not very economical because of the high hardware cost and
quantity of the arithmetic operations required in online artificial neural networks (ANNs), i.e., general
purpose ANNs with learning capability. Besides, there remains a dearth of hardware platforms for design
space exploration, fast prototyping, and testing of these networks. Our general purpose architecture
seeks to fill that gap and at the same time serve as a tool to gain a better understanding of issues unique
to ANNs implemented in hardware, particularly using field programmable gate array (FPGA). This work
describes a platform that offers a high degree of parameterization, while maintaining generalized
network design with performance comparable to other hardware-based MLP implementations.
Application of the hardware implementation of ANN with back-propagation learning algorithm for a
realistic application is also presented.
44. Design and Implementation of Reed Solomon Decoder for 802.16 Network using FPGA
Abstract—This paper presents a design and FPGA implementation of a reconfigurable FEC Decoder
based on Reed Solomon Code for WiMax Network. The implementation, written in Very High Speed
Integrated Circuit Hardware Description Language (VHDL), is based on the Berlekamp-Massey, Forney and
Chien algorithms. The
802.16 network standards recommend the use of Reed-Solomon code RS (255,239), which is
implemented and discussed in this paper. It is targeted to be applied in a forward error correction
system based on 802.16 network standard to improve the overall performance of the system. The
objective of this work is to implement a Reed-Solomon decoder in VHDL and to measure the performance of the
RS decoder on Xilinx Virtex-II Pro (xc2vp50-5-ff1148) and Xilinx Spartan-3E (xc3s500e-4-fg320) FPGAs.
The performance of the implemented RS codec on both FPGAs will be compared. The performance
metrics to be used are the area occupied by the design and the speed at which the design can run.
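The RS (255,239) notation fixes the basic properties of the code, which can be worked out with a short calculation. The helper below is a hypothetical illustration (the name rs_parameters is not from the paper) and uses only the standard relations for a Reed-Solomon code of length n = 2^m - 1.

```python
def rs_parameters(n, k):
    """Basic properties of an RS(n, k) code over GF(2^m) with n = 2^m - 1."""
    parity = n - k
    return {
        "symbol_bits": (n + 1).bit_length() - 1,  # m, since n + 1 = 2^m
        "parity_symbols": parity,
        "correctable_errors": parity // 2,        # t = (n - k) / 2 symbol errors
        "correctable_erasures": parity,           # if error locations are known
        "code_rate": k / n,
    }

params = rs_parameters(255, 239)
```

For RS (255,239) this gives 8-bit symbols, 16 parity symbols and correction of up to 8 symbol errors per codeword, at a code rate of about 0.937.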
45. Wavelet-Based SC-FDMA System
ABSTRACT: Recent research has shown that the Single Carrier Frequency Division Multiple Access (SC-FDMA)
is an attractive technology for uplink broadband wireless communications, because it does not
have the problems of Orthogonal Frequency Division Multiple Access (OFDMA) such as the large
Peak-to-Average Power Ratio (PAPR). In this paper, an efficient transceiver scheme for SC-FDMA systems
using the wavelet transform is proposed. In the proposed scheme, the Fast Fourier Transform (FFT) and
its inverse (IFFT) are replaced by the Discrete Wavelet Transform (DWT) and its inverse (IDWT). Wavelet
filter banks at the transmitter and the receiver have the ability to reduce distortion in the reconstructed
signals, while retaining all the significant features present in the signals. The performance of the
proposed scheme is investigated with different wireless channels. Simulation results show that the
proposed scheme provides better performance, when compared to the conventional SC-FDMA, while
the complexity of the system is slightly increased.
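The property that makes a DWT/IDWT pair usable in place of FFT/IFFT is perfect reconstruction. A one-level Haar transform shows this in a few lines of Python. The abstract does not say which wavelet is used, so the Haar filter and the function names here are assumptions made purely for illustration.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT; returns (approximation, detail). Length must be even."""
    r = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse one-level Haar DWT; rebuilds the original samples."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / r, (a - d) / r])
    return out
```

Applying haar_idwt to the output of haar_dwt returns the input samples up to floating-point rounding, and the transform preserves signal energy because the filters are orthonormal.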
46. An Improved VLSI Architecture of S-box for AES Encryption
Abstract—This paper presents an improved VLSI architecture of S-box for AES encryption system.
Certain basic blocks in conventional architecture are replaced by efficient multiplexers and an optimized
combinational logic to facilitate speed improvement. Both the proposed and the conventional architectures
are implemented in a Xilinx FPGA and in 0.18 μm standard-cell ASIC technology. ASIC implementation
indicates speed enhancement while maintaining constant area compared to conventional architecture.
FPGA implementation also confirms speed improvement of about 0.6 ns along with low utilization of
FPGA fabrics. Furthermore, there is significant power improvement (155 %) compared to conventional
structure.
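For reference, the function that any S-box architecture must realize is a multiplicative inverse in GF(2^8) followed by a fixed affine transformation. The Python model below computes it directly; it is a software reference for checking results, not the multiplexer-based architecture of the paper, and the helper names are illustrative.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce by the AES polynomial
        b >>= 1
    return product

def gf_inverse(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0 by AES convention."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def rotl8(x, shift):
    """Rotate an 8-bit value left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF

def sbox(byte):
    """AES S-box: field inverse followed by the affine transformation."""
    inv = gf_inverse(byte)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63
```

A table generated from sbox(0) through sbox(255) can serve as the golden reference when simulating a combinational S-box design.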
47. FPGA Based Design and Implementation of Modified Viterbi Decoder for a Wi-Fi Receiver
Abstract— Viterbi decoders are employed in digital wireless communication systems to decode
convolutional codes, which are forward error correcting codes. Although widely used, the most
popular communications decoding algorithm, the Viterbi Algorithm (VA), requires an exponential
increase in hardware complexity to achieve greater decoding accuracy. Applications based on wireless
technology have grown tremendously worldwide. The constraint length associated with the input bits
is large, so the larger constraint length must be implemented with less hardware and fewer
computations to decode the original data. When the decoding process uses the Modified Viterbi
Algorithm (MVA), which follows the maximum-likelihood path, computations are reduced by 50% and
hardware utilization is reduced. The paper shows the PlanAhead results for the modified Viterbi
decoder implementation, designed in Verilog using the Xilinx tool. An implementation on Field
Programmable Gate Arrays (FPGAs) gives the user a flexible, programmable solution and lowers the cost.
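The baseline that a modified decoder is measured against is the standard hard-decision Viterbi algorithm, sketched below in Python. This is the conventional algorithm, not the MVA of the paper, whose details the abstract does not give. The small constraint length K = 3 with generators 7 and 5 (octal) is an assumption chosen for readability; the state count, and hence the work per step, doubles with each increase in K.

```python
def conv_encode(bits, gens=(0b111, 0b101), k=3):
    """Rate-1/n convolutional encoder; the newest bit sits in the register MSB."""
    state, out = 0, []
    for bit in bits:
        reg = (bit << (k - 1)) | state
        out.extend(bin(reg & g).count("1") & 1 for g in gens)
        state = reg >> 1
    return out

def viterbi_decode(received, gens=(0b111, 0b101), k=3):
    """Hard-decision Viterbi decoder with Hamming-distance branch metrics."""
    n_states, n_out = 1 << (k - 1), len(gens)
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)  # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), n_out):
        symbol = received[t:t + n_out]
        new_metrics = [inf] * n_states
        new_paths = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:
                continue  # state not reachable yet
            for bit in (0, 1):
                reg = (bit << (k - 1)) | state
                expected = [bin(reg & g).count("1") & 1 for g in gens]
                cost = metrics[state] + sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                if cost < new_metrics[nxt]:  # add-compare-select
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    best = min(range(n_states), key=lambda s: metrics[s])
    return paths[best]
```

Storing a full path per state keeps the sketch short; hardware decoders normally store per-step decisions and recover the data with a traceback unit instead.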
48. VLSI Architecture for a Reconfigurable Spectrally Efficient FDM Baseband Transmitter
Abstract—Spectrally efficient FDM (SEFDM) systems employ non-orthogonal overlapped carriers to
improve spectral efficiency for future communication systems. One of the key research challenges for
SEFDM systems is to demonstrate efficient hardware implementations for transmitters and receivers.
Focusing on transmitters, this paper explains the SEFDM concept and examines the complexity of
published modulation algorithms, with particular consideration to implementation issues. We then
present two new variants of a digital baseband transmitter architecture for SEFDM, based on a
modulation algorithm which employs the discrete Fourier transform (DFT) implemented efficiently using
the fast Fourier transform (FFT). The algorithm requires multiple FFTs, which can be configured either as
parallel transforms, which is optimal for throughput, or using a multi-stream FFT architecture, for
reduced circuit area. We propose a simplified approach to IFFT pruning for pipeline architectures, based
on a token-flow control style, specifically optimized for the SEFDM application. Reconfigurable
implementations for different bandwidth compression ratios, including conventional OFDM, are easily
derived from the proposed implementations. The SEFDM transmitters have been synthesized, placed
and routed in a commercial 32 nm CMOS process technology and also verified in FPGA. We report circuit
area and simulated power dissipation figures, which confirm the feasibility of SEFDM transmitters.
49. Research and Implementation of Speaker Recognition Algorithm Based on FPGA
Abstract: As one of the biometric identification technologies, speaker recognition shows good application
prospects in many fields. At present, hardware implementations of speaker recognition algorithms are
mostly based on a System on a Programmable Chip (SOPC) platform on a Field Programmable
Gate Array (FPGA) with a Nios II Intellectual Property (IP) core. The algorithm can be selected and
optimized effectively on SOPC; however, the high-speed and parallel operation of the FPGA cannot be fully
utilized. By researching and analyzing the speaker recognition algorithm, an FPGA-based speaker
recognition method is presented in this paper. This method includes Voice Activity Detection (VAD),
Mel Frequency Cepstrum Coefficient (MFCC) extraction and the Vector Quantization (VQ) recognition
algorithm. By using techniques such as ping-pong operation, pipelining and module reuse, it makes full
use of the FPGA's speed. After testing, this method can effectively meet the requirement of real-time data
processing.
50. Efficient Window-Architecture Design using Completely Scaling-Free CORDIC Pipeline
Abstract—Filtering being one of the most important modules in signal processing paradigm, this paper
presents an FPGA implementation of various window-functions using CORDIC algorithm to minimize
area-delay product. The existing window architecture uses a linear CORDIC processor in series with
circular CORDIC processor that results in a long pipeline. Firstly, we replace the linear CORDIC with
multiple optimized shift-add networks to reduce area and pipeline depth. Secondly, the conventional
circular CORDIC processor is replaced by a completely scaling-free CORDIC processor to further improve
the area-time efficiency of the existing design. As a result, the proposed window-architecture, on
average, requires approximately 64.34% fewer pipeline stages and saves up to 48% area. Both the existing
and the proposed window-architecture are capable of generating Hanning, Hamming and Blackman
window families.
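The three window families named above share one cosine-sum form, which is why a single cosine generator (CORDIC in the paper) can serve all of them. The floating-point Python sketch below uses the common textbook coefficients; the function name window and the symmetric N - 1 denominator are assumptions, and the paper's fixed-point pipeline is not modelled.

```python
import math

# Cosine-sum coefficients (a0, a1, a2) for w[n] = a0 - a1*cos(2*pi*n/(N-1)) + a2*cos(4*pi*n/(N-1))
WINDOW_COEFFS = {
    "hanning": (0.5, 0.5, 0.0),
    "hamming": (0.54, 0.46, 0.0),
    "blackman": (0.42, 0.5, 0.08),
}

def window(name, length):
    """Return `length` samples of the named window (length must be at least 2)."""
    a0, a1, a2 = WINDOW_COEFFS[name]
    samples = []
    for n in range(length):
        theta = 2.0 * math.pi * n / (length - 1)
        # The two cosine evaluations are the part a CORDIC unit would compute in hardware.
        samples.append(a0 - a1 * math.cos(theta) + a2 * math.cos(2.0 * theta))
    return samples
```

Switching between window families only changes the three constants, so a reconfigurable design needs to reload coefficients rather than change its datapath.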