International Journal of Engineering Trends and Technology (IJETT) – Volume 18 Number 2 - Dec 2014

Compression and Decompression of FPGA Bit Stream Using Bitmask Technique

K. Khuresh Gouse 1, N. Chitra 2, K. Maheshwari 3
1 PG Student (M.Tech), 2 Associate Professor, 3 Associate Professor,
Dept. of ECE, Gates Institute of Technology, Gooty
Abstract — This paper proposes an efficient decode-aware compression technique for compressing FPGA configuration bit streams that improves the compression ratio and reduces the decompression overhead. This is accomplished by efficiently choosing decode-aware parameters (word length, dictionary size, and the number and type of bitmasks) combined with run-length coding of repetitive word patterns. The decompression overhead is reduced by reorganizing the compressed bits into fixed-length words. The experimental results illustrate that our approach improves compression ratio by 10-15% over existing bit stream compression techniques, and the decompression hardware is capable of running at 300 MHz. The decompression time to configure the FPGA is decreased by 20-25% compared with existing decompression accelerators.
Keywords— Bit stream compression, Decompression hardware, Decode aware algorithm, FPGA.
I. INTRODUCTION
In a classic fixed dictionary-based compression algorithm, the input file of length N bits is split into n words, each of length w bits. A list of all distinct words is then sorted in descending order of occurrence frequency, and the words are encoded using indices into a stored dictionary. Field Programmable Gate Arrays (FPGAs) store configuration information in memories that are usually limited in capacity and bandwidth, and FPGAs are commonly used in reconfigurable systems and alongside application-specific integrated circuits (ASICs). Bit stream compression algorithms address the memory constraint by reducing the size of the bit streams, while decompression accelerators increase the decoding speed using simple decoding logic. However, very few algorithms offer both an efficient compression ratio and fast decompression. Figure 1 shows the typical flow of compressed FPGA bit stream reconfiguration. Bit streams generated by vendor-specific bit generation programs are compressed and stored in a persistent memory. The decompression hardware decodes the compressed bits and transfers them from memory to the configuration hardware, from which they are transferred to the configurable logic block (CLB) memory.
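The fixed dictionary scheme described above can be sketched in a few lines. The word width, index width, and input stream below are illustrative choices, not values from the paper; a dictionary hit is encoded as a flag bit plus an index, and a miss as a flag bit plus the raw word.

```python
from collections import Counter

def dictionary_compress(bits: str, w: int, index_bits: int):
    """Toy fixed-dictionary coder: split the stream into w-bit words,
    keep the most frequent words in a dictionary, and emit either
    '1' + index (dictionary hit) or '0' + raw word (miss)."""
    words = [bits[i:i + w] for i in range(0, len(bits), w)]
    # Dictionary = the 2**index_bits most frequent distinct words.
    ranked = [word for word, _ in Counter(words).most_common(2 ** index_bits)]
    dictionary = {word: i for i, word in enumerate(ranked)}
    out = []
    for word in words:
        if word in dictionary:
            out.append('1' + format(dictionary[word], f'0{index_bits}b'))
        else:
            out.append('0' + word)
    return ''.join(out), dictionary

# 28 input bits shrink to 14 output bits because '0101' dominates the stream.
compressed, d = dictionary_compress('0101' * 6 + '1111', w=4, index_bits=1)
print(compressed)  # 10101010101011
```

The real encoder must also store the dictionary itself, which is why the paper later argues that the dictionary size is negligible relative to the bit stream.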
ISSN: 2231-5381, http://www.ijcttjournal.org
Figure 1 Traditional FPGA reconfiguration with compression
Compression ratio is the metric commonly used to measure the effectiveness of a compression technique, defined as

Compression Ratio = (Compressed Bitstream Size) / (Original Bitstream Size)

A smaller ratio therefore indicates better compression.
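With this definition, the worked simulation example given later in the paper (a 56-bit uncompressed stream reduced to a 16-bit compressed stream) yields a ratio well below one; a quick check:

```python
def compression_ratio(compressed_bits: int, original_bits: int) -> float:
    # Compressed size divided by original size: smaller is better.
    return compressed_bits / original_bits

# 56-bit uncompressed stream reduced to a 16-bit compressed stream,
# the sizes used in the simulation example later in this paper.
ratio = compression_ratio(16, 56)
print(f"{ratio:.3f}")
```

This prints a ratio of roughly 0.286, i.e. about a 71% reduction in size for that particular input.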
We can classify the existing bit stream compression techniques into two categories: those with a good compression ratio but unacceptable decompression overhead and complexity, and those that accelerate decompression but compromise the compression ratio. The main idea of these algorithms is to store frequently occurring sequences of bits using a static or sliding dictionary, or to use FPGA-specific features (partial reconfiguration or read back) to obtain repetitive patterns.
One of the most promising compression techniques is bitmask-based code compression, owing to its good compression ratio and simple decompression logic. However, a direct application of this algorithm is not flexible in the choice of word length, the number, size, and type of bitmasks, or the dictionary size. Unrestricted use of these parameters yields better matches but produces many variable-length encodings, which are unprofitable because they lead to slower and more complex decompression hardware. Hence, it is a major challenge to develop an efficient compression technique that significantly reduces the bit stream size without sacrificing decompression performance.
There are numerous compression algorithms that can be used to compress configuration bit streams. These techniques can be classified into two categories based on how the redundancies are exploited: format-specific compression and generic bit stream compression.
The compression techniques in the first category exploit the native redundancies within a single bit stream or across multiple bit streams by reading back the configured data and storing the differences obtained by an exclusive-OR (XOR) operation. These algorithms require the FPGA to support partial reconfiguration and frame read-back functionality. Pan et al. use frame rearrangement in the absence of a read-back facility on the FPGA: frames are reordered such that the similarity between consecutive frames is maximized. The difference between consecutive frames (the difference vector) is then encoded using either Huffman-based run-length encoding or LZSS-based compression. Another technique proposed in the same article reorders and reads back the configured frames so that the compressed bit stream contains a minimal number of difference vectors and maximal read-back of configured frames, reducing the compressed size significantly. Such complex encoding schemes tend to produce excellent compression ratios. However, decompression is a major bottleneck and is not addressed by Pan et al.
The generic bit stream compression techniques use the complete bit stream to extract redundancies within a small window (usually 32 bytes) and encode the data. An advantage of these techniques is that no special FPGA support is needed for decompression. Parameterized LZSS chooses efficient parameters suitable for bit stream compression and decompression. The compression focuses on the most frequent repetition lengths within the matched strings, encoding a partial set of these lengths using fewer bits while the rest are encoded using a canonical representation. The decompression hardware is fairly simple and is able to decode at acceptable speed. The LZ77-based algorithm works in the same manner, matching redundant symbols within a small window.
In summary, the compression techniques in the first category achieve significant compression but incur drastic decompression overhead. On the other hand, the approaches in the second category keep the decompression overhead in an acceptable range but compromise compression efficiency. Our technique considers the decompression bottleneck and overhead during the compression of bit streams: the compression parameters are chosen such that the compressed bit streams are decode friendly while maintaining a good compression ratio.
II. DECODE AWARE BIT STREAM COMPRESSION
On the compression side, the FPGA configuration bit stream is analysed to select profitable dictionary entries and bitmask patterns. The compressed bit stream is then generated using bitmask-based compression and run-length encoding (RLE). Next, our decode-aware placement algorithm is used to place the compressed bit stream in memory for efficient decompression. At run time, the compressed bit stream is transmitted from the memory to the decompression engine, and the original configuration bit stream is reproduced by decompression.
Algorithm 1 outlines the four important steps in our decode-aware compression framework (shown in Fig. 2): 1) bitmask selection; 2) dictionary selection; 3) RLE compression; and 4) decode-aware placement. The input bit stream is first divided into a sequence of symbols of length w. Then the bitmask patterns and dictionary entries used for bitmask-based compression are selected. Next, the symbol sequence is compressed using bitmasks and RLE; we use the same algorithm as [5] to perform the bitmask-based compression. Finally, the compressed bit stream is placed into a decode-friendly layout within the memory using the placement algorithm.
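The RLE stage exploits the long runs of identical words that are common in FPGA bit streams (for example, stretches of all-zero configuration words). A minimal sketch of word-level run-length encoding follows; the 8-bit words and tuple output are illustrative simplifications, since a real encoder would emit a special repeat code instead of tuples:

```python
def rle_words(symbols):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([s, 1])   # start a new run
    return [(s, n) for s, n in runs]

# A bit stream region that is mostly zero words collapses to a few runs.
words = ['00000000'] * 5 + ['01000010'] + ['00000000'] * 2
print(rle_words(words))  # [('00000000', 5), ('01000010', 1), ('00000000', 2)]
```

In the actual scheme the run length is encoded alongside the dictionary index of the repeated word, so decoding a run costs one dictionary lookup regardless of the run's length.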
Algorithm 1: Decode-Aware Bitstream Compression
Input: Input bit stream
Output: Compressed bit stream placed in memory
Step 1: Divide the input bit stream into symbol sequence SL.
Step 2: Perform bitmask pattern selection.
Step 3: Perform dictionary selection.
Step 4: Compress symbols SL into code sequence CL using bitmasks and RLE (run-length encoding).
Step 5: Perform decode-aware placement of CL in memory.

Figure 2 Decode-aware bit stream compression framework.

Since memory and communication buses are designed in multiples of bytes (8 bits), storing dictionaries or transmitting data in other than multiples of the byte size is not efficient. Thus, we restrict the symbol length to a multiple of eight in our current implementation. Since the dictionary for bit stream compression is small compared to the size of the bit stream itself, we use d = 2^i dictionary entries to fully utilize the bits for dictionary indexing, where i is the number of indexing bits.

1. Bitmask Selection:
Our bitmask-based compression is similar to [5], where three types of encoding formats are used. Fig. 3 shows the formats in these cases: no compression, compression using the dictionary, and compression using a bitmask. The selection of bitmasks plays an important role in bitmask-based compression. Generally, there are two types of bitmask patterns. One is the "fixed" bitmask, which can only be applied at fixed positions in a symbol. The other is the "sliding" bitmask, which can be applied at any position. For example, a 2-bit fixed bitmask (a "2f" bitmask) is restricted to even locations, whereas a 2-bit sliding bitmask (a "2s" bitmask) can be used anywhere. Clearly, fixed bitmasks require fewer bits to encode their location, but they can only match bit changes at fixed positions. Sliding bitmasks, on the other hand, are more flexible but consume more bits to encode. In practice, only a small number of bitmask patterns or their combinations are profitable for compression. Similar to [5], in our study of bit stream compression we use only the profitable bitmask patterns (1s, 2s, 2f, 3s, 3f, 4s, 4f).

Figure 3 Decompression Mechanism

2. Dictionary Selection:
Algorithm 2 shows our dictionary selection algorithm. Compared to the dictionary selection approach proposed in [5] for instruction compression, we made an important optimization at Step 5. In the original algorithm [5], any node adjacent to the most profitable node is removed if its profit is less than a certain threshold. This mechanism is designed to reduce the dictionary size. However, if the threshold is not chosen properly, some high-frequency symbols may be incorrectly removed. Since the dictionary size in bit stream compression is usually negligible compared with the size of the bit stream, it is not beneficial to reduce the dictionary size by sacrificing the compression ratio. Therefore, our algorithm uses a new heuristic at Step 5 that carefully removes edges instead of nodes. The experimental results show that our approach is more suitable for bit stream compression, because it ensures better dictionary coverage.
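The bitmask match at the heart of the scheme can be made concrete: a symbol that differs from a dictionary entry only within a small cluster of bits is encoded as the entry's index plus the XOR difference and its position. The code below is a simplified illustration of a 2-bit sliding ("2s") bitmask check, not the authors' exact encoder; the function name and return convention are assumptions for illustration.

```python
def sliding_bitmask_match(symbol: str, entry: str, mask_bits: int = 2):
    """Return (offset, mask) if the XOR of symbol and entry fits inside a
    mask_bits-wide window at some position ('sliding'), else None."""
    assert len(symbol) == len(entry)
    xor = ''.join('1' if a != b else '0' for a, b in zip(symbol, entry))
    ones = [i for i, bit in enumerate(xor) if bit == '1']
    if not ones:
        return (0, '0' * mask_bits)  # exact dictionary match, no mask needed
    # Left-align the window on the first differing bit (clamped to the end).
    start = min(ones[0], len(symbol) - mask_bits)
    if ones[-1] < start + mask_bits:
        return (start, xor[start:start + mask_bits])
    return None  # differences too spread out for a single bitmask

# '00100000' vs dictionary entry '00010000': the two differing bits sit in
# one 2-bit window, so index + offset + mask encodes the whole symbol.
print(sliding_bitmask_match('00100000', '00010000'))  # (2, '11')
```

A fixed ("2f") variant would additionally require `start` to be even, saving one bit in the offset field, which is exactly the fixed-versus-sliding trade-off described above.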
[Figure: compression flow — input bit stream → divide into symbol sequence SL → bitmask pattern selection → dictionary selection → compress SL into CL using bitmasks → decode-aware placement of CL → compressed bit stream placed in memory]

Compression Efficiency:
We first compare our improved bitmask compression technique with the original approach proposed in [5]. To avoid any bias caused by parameter selection, we use the same bitmask parameters for both. Three compression techniques are compared for compression efficiency: 1) bitmask-based compression (BMC) [5]; 2) BMC with our dictionary selection technique (pBMC); and 3) BMC with our dictionary selection technique and run-length encoding (pBMC+RLE). Fig. 4 shows the compression results on the Pan et al. [1] and Koch et al. [4] benchmarks. The same results are found to apply to other device families and vendors as well. In our experiments, the Pan et al. [1] benchmarks are compressed with 32-bit symbols, a 512-entry dictionary, and two sliding 2- and 3-bit bitmasks for storing bitmask differences. The Koch et al. [4] benchmarks are compressed using 16-bit symbols, a 16-entry dictionary, and a 2-bit sliding bitmask.

Fig 5. RLE based Compression
It can be seen that our dictionary selection algorithm outperforms the original technique: the dictionary generated by our algorithm improves the compression ratio by 4% to 5%. Because our approach does not require manually finding a threshold value for each bit stream, it adaptively finds the most suitable dictionary entries for each bit stream, while otherwise performing the same as the original method. The experimental results also illustrate the improvement in compression ratio due to the run-length encoding used in our technique.
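The adaptive selection above can be sketched as a greedy pass over a match graph: nodes are distinct symbols weighted by occurrence count, and an edge joins two symbols that lie within one bitmask of each other. The sketch below illustrates the general idea (pick the highest-coverage node, then remove only its incident edges so high-frequency neighbours stay eligible); it is a simplified illustration with made-up data, not the exact Algorithm 2.

```python
def select_dictionary(freq, edges, dict_size):
    """Greedy dictionary selection on a match graph.
    freq: {symbol: occurrence count}; edges: set of frozensets {a, b}
    meaning a and b are within one bitmask of each other."""
    adj = {s: set() for s in freq}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    chosen, remaining = [], set(freq)
    for _ in range(dict_size):
        if not remaining:
            break
        # Profit of a candidate = its own frequency plus the frequencies of
        # the neighbours it can cover via a bitmask.
        best = max(remaining,
                   key=lambda s: freq[s] + sum(freq[n] for n in adj[s]
                                               if n in remaining))
        chosen.append(best)
        remaining.discard(best)
        # Remove edges incident to the chosen node, not the neighbour nodes
        # themselves: this is the edge-removal heuristic described above.
        for n in adj[best]:
            adj[n].discard(best)
        adj[best].clear()
    return chosen

freq = {'A': 9, 'B': 8, 'C': 2}
edges = {frozenset({'A', 'B'}), frozenset({'B', 'C'})}
print(select_dictionary(freq, edges, dict_size=2))  # ['B', 'A']
```

A threshold-based node-removal variant could discard 'A' outright after choosing 'B', losing a symbol that occurs nine times; removing only the edge keeps it available for the second slot.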
Decompression Efficiency:
We measured decompression efficiency using the time required to reconfigure a compressed bit stream, the resource usage, and the maximum operating frequency of the decompression engine. The reconfiguration time is calculated as the product of the number of cycles required to decode the compressed bit stream and the operating clock period. We synthesized decompression units for variable-length bitmask-based compression, difference-vector-based compression (DV RLE RB), LZSS (8-bit symbols), and our proposed approach on a Xilinx Virtex-II family XC2V40 device (FG356 package) using ISE 9.2.04i to measure decompression efficiency.
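The reconfiguration-time metric above can be reproduced with a small calculation. The cycle count and clock frequency below are illustrative values, not measurements from the paper:

```python
def reconfiguration_time_us(decode_cycles: int, clock_hz: float) -> float:
    # Time = cycles * clock period; the period is the reciprocal of the
    # frequency, and the result is scaled to microseconds.
    return decode_cycles * (1.0 / clock_hz) * 1e6

# e.g. 60,000 decode cycles on a 200 MHz decompression engine:
print(f"{reconfiguration_time_us(60_000, 200e6):.1f} us")  # 300.0 us
```

This makes explicit why a higher maximum operating frequency directly shortens reconfiguration: halving the period halves the time for the same cycle count.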
III. RESULTS AND CONCLUSION
This paper analysed two sets of compression algorithms: a set of algorithms that reduces bit stream size with a better compression ratio but does not consider the decompression overhead, and another set of compression techniques that achieve efficient decompression but with an unacceptable compression ratio. This paper proposed an efficient decode-aware compression technique that balances a better compression ratio against minimal decompression overhead. The proposed compression technique analyses the impact of the parameters on decompression overhead and selects compression parameters that are decode friendly. This, combined with run-length encoding of consecutive repetitive patterns, improves both compression and decompression efficiency. The paper proposed a strategic placement algorithm to reorganize variable-length compressed bits into fixed-length compressed bit streams. The fixed-length encoding of the compressed words enables the decompression engine to decode at the FPGA's high operating frequency. A novel dictionary selection algorithm is devised that produces a dictionary covering the most words using the least dictionary size and a minimum number of bitmasks. The proposed technique for compressing reconfiguration bit streams is found to improve compression ratio by around 10-15%, with the decompression engine capable of operating at around 200 MHz. The reconfiguration time is reduced by around 15-20% compared to the nearest decompression accelerator.

Memory and communication bandwidth have been a major bottleneck in most system designs, and the operating speeds of different components are diverging at an ever increasing pace. Decode-aware compression promises to bridge this gap by reducing the data size and by accelerating the decompression process. This paper explored only a few of the problems in reconfigurable systems where decode-aware compression can improve system performance.
The proposed techniques in this paper can be further explored in the following directions:
• The bitmask compression technique allows better compression and a faster decompression engine. Binary tries work on longest prefixes and bit differences, drastically reducing the bits required to encode. An interesting approach is to combine these two techniques to compress hard-to-compress audio and video data; such a combination would provide faster decoding and better lossless data compression.
• The proposed technique can be applied to compressing data sent over heterogeneous network elements. Decode-aware decompression can bridge the gap between the different bandwidths at which the existing network elements operate.
• Further studies can be conducted to eliminate the threshold parameter used to limit the exploration of word length. The input data pattern can be automatically analysed to choose the compression parameters, potentially bringing the compression ratio and decompression overhead closer to their optima.
• The current application of the optimal representation of n-bit differences can be further explored on systems that store bit differences. Systems that require a large number of bitmasks to encode data would benefit from the proposed optimal encoding scheme; examples we identified include efficient database storage and differential-data-backup based systems.
Figure 4 Simulation result for Compression
Input:
Uncompressed Bitstream: 00000000000000000000000000000000000000000000000001000010
Output:
Compressed Bitstream: 0100011100000101

Figure 5 Simulation result for Decompression
Input:
Compressed Bitstream: 0100011100000101
Output:
Uncompressed Bitstream: 00000000000000000000000000000000000000000000000001000010

Figure 6 Technology Schematic for Compression
Figure 7 Technology Schematic for Decompression

REFERENCES
1. D. E. Knuth, J. H. Morris Jr., and V. R. Pratt, "Fast pattern matching in strings," SIAM J. Comput., vol. 6, no. 2, pp. 323–350, June 1977.
2. R. S. Boyer and J. S. Moore, "A fast string matching algorithm," Commun. ACM, vol. 20, no. 10, pp. 762–772, October 1977.
3. J. H. Pan, T. Mitra, and W. F. Wong, "Configuration bitstream compression for dynamically reconfigurable FPGAs," in Proc. Int. Conf. Comput.-Aided Des., 2004, pp. 766–773.
4. L. Feinstein, D. Schnackenberg, R. Balupari, and D. Kindred, "Statistical approaches to DDoS attack detection and response," in DISCEX, 2003.
5. D. Koch, C. Beckhoff, and J. Teich, "Bitstream decompression for high speed FPGA configuration from slow memories," in Proc. Int. Conf. Field-Program. Technol., 2007, pp. 161–168.
6. L. Spitzner, Honeypots: Tracking Attackers. Addison-Wesley, 2002.
7. C. Morrow, BlackHole Route Server and Tracking Traffic on an IP Network, http://www.secsup.org/Tracking.
8. SNORT: Open-Source Network IDS/IPS, http://www.snort.org.
9. A. V. Aho and M. J. Corasick, "Efficient string matching: an aid to bibliographic search," Commun. ACM, vol. 18, no. 6, pp. 333–340, 1975.
Authors Profiles
K. Khuresh Gouse is pursuing his Master's degree (M.Tech) in VLSI & Embedded Systems Design at Gates Institute of Technology, Gooty.
N. Chitra is working as an Associate Professor at Gates Institute of Technology, Gooty. Her areas of interest include communication systems and VLSI.
K. Maheswari is working as an Associate Professor at Gates Institute of Technology, Gooty. Her areas of interest include mobile communication, wireless communication, and cryptography.