International Journal of Engineering Trends and Technology (IJETT) – Volume 4 Issue 9- September 2013
Efficient Field Programmable Gate Array
Implementation of Advanced Encryption Standard
Algorithm using VHDL
Archana Garg (1), Harmanjot Singh Dhaliwal (2)
(1) Student, M.Tech, ECE, Punjabi University, Patiala.
(2) Assistant Professor, ECE, Punjabi University, Patiala.
Abstract- For the security of data, various algorithms have been proposed. The AES, also known as the Rijndael algorithm, was selected as a standard by the National Institute of Standards and Technology (NIST). Encryption algorithms are used to ensure the security of transmission channels. This paper presents an efficient FPGA implementation approach of the Advanced Encryption Standard (AES) algorithm. Two different architectures of AES, named Basic AES and Fully Pipelined AES, have been designed in VHDL. The codes have been synthesized using Xilinx ISE 9.2i software for a Virtex 3 FPGA device. The Basic AES and Fully Pipelined AES designs are compared on the basis of power consumption, maximum pin delay, clock delay, and Slice Flip Flops.
Keywords: AES, Fully pipelined, FPGA, VHDL
I INTRODUCTION
In today's digital world, encryption is emerging as an integral part of all communication networks and information processing systems, for protecting both stored and in-transit data. Encryption is the transformation of plain data (known as plaintext) into unintelligible data (known as ciphertext) through an algorithm referred to as a cipher. In other words, the data is transformed into a form that is as close to impossible as possible to read without the appropriate knowledge (a key). Its purpose is to ensure privacy by keeping information hidden from anyone for whom it is not intended, even those who have access to the encrypted data. Decryption is the reverse of encryption; it is the transformation of encrypted data back into an intelligible form. Encryption and decryption generally require the use of some secret information, referred to as a key. For some encryption mechanisms, the same key is used for both encryption and decryption; for other mechanisms, the keys used for encryption and decryption are different. The general model of encryption and decryption is shown in Figure 1.
ISSN: 2231-5381
http://www.ijettjournal.org
Page 3956
Figure 1: The Encryption Model
There are innumerable encryption algorithms now commonly used in computation, but the U.S. government has adopted the Advanced Encryption Standard (AES) to be used by Federal departments and agencies for protecting sensitive information. The National Institute of Standards and Technology (NIST) has published the specifications of this encryption standard in the Federal Information Processing Standards (FIPS) Publication [1].
II DESCRIPTION OF AES ALGORITHM
The AES algorithm is a symmetric block cipher that can encrypt and decrypt information. Encryption converts data to an unintelligible form called cipher-text. Decryption of the cipher-text converts the data back into its original form, which is called plain-text.
A. AES ENCRYPTION
The AES algorithm operates on a 128-bit block of data, called the state, and applies a round function repeatedly: Nr - 1 identical rounds followed by a final round. The number of rounds, Nr, is 10, 12, or 14 for a key length of 128, 192, or 256 bits respectively. The first and last rounds differ from the other rounds: an additional AddRoundKey transformation is applied at the beginning of the first round, and no MixColumns transformation is performed in the last round [2].
An outline of AES encryption is given as:
SubBytes Transformation:
The SubBytes transformation is a non-linear byte substitution, operating on each of the state bytes independently. It is done using a precalculated substitution table called the S-box. The S-box table contains 256 entries, one for each input byte value from 0 to 255. The method of calculating the S-box table is detailed in [3].
ShiftRows Transformation:
In the ShiftRows transformation, the rows of the state are cyclically shifted left by different offsets. Row 0 is not shifted; row 1 is shifted one byte to the left; row 2 is shifted two bytes to the left and row 3 is shifted three bytes to the left.
MixColumns Transformation:
In the MixColumns transformation, the columns of the state are considered as polynomials over GF(2^8) and multiplied modulo x^4 + 1 by a fixed polynomial c(x), given by:
c(x) = {03}x^3 + {01}x^2 + {01}x + {02}.
AddRoundKey Transformation:
In the AddRoundKey transformation, a Round Key is added to the State (the result of the MixColumns transformation) by a simple bitwise XOR operation. The Round Key of each round is derived from the main key using the Key Expansion algorithm [4]. The encryption/decryption algorithm needs eleven 128-bit Round Keys, denoted Round Key[0] to Round Key[10] (the first, Round Key[0], is the main key).
B. AES DECRYPTION
Decryption is the reverse of encryption: it applies the inverse round transformations in reverse order to recover the original plaintext from an encrypted cipher-text. The round transformation of decryption uses the functions AddRoundKey, InvMixColumns, InvShiftRows, and InvSubBytes successively.
AddRoundKey:
AddRoundKey is its own inverse function because the XOR function is its own inverse. The round keys have to be selected in reverse order. The other transformations are described as follows.
InvShiftRows Transformation:
InvShiftRows functions exactly the same as ShiftRows, only in the opposite direction. The first row is not shifted, while the second, third and fourth rows are shifted right by one, two and three bytes respectively.
InvSubBytes Transformation:
The InvSubBytes transformation is done using a precalculated substitution table called the InvS-box. The InvS-box table contains 256 entries, one for each input byte value from 0 to 255.
InvMixColumns Transformation:
In the InvMixColumns transformation, the polynomials of degree less than 4 over GF(2^8), whose coefficients are the elements in the columns of the state, are multiplied modulo (x^4 + 1) by a fixed polynomial d(x) = {0B}x^3 + {0D}x^2 + {09}x + {0E}, where {0B}, {0D}, {09}, {0E} denote hexadecimal values.
III FULLY PIPELINED AES ALGORITHM
To obtain high performance with low resource consumption, a combination of several techniques is essential. On the one hand, pipelining increases performance and takes advantage of the parts of the FPGA architecture that support it. Pipelining is a technique that increases throughput, but it does so at the cost of additional resources.
Pipelining is an economical way to achieve temporal parallelism in computers. The idea is the same as the assembly line of an industrial plant. The task is divided into a sequence of subtasks, each of which is executed in a specialized hardware stage that works concurrently with the other stages of the pipeline. The stages are circuits that perform arithmetic or logic operations on the data that flows through the pipe. They are separated by very fast registers, called latches, that store the intermediate results between stages. This increases the throughput of the system considerably.
Pipelining does not reduce the execution time of an individual instruction; it increases the number of instructions that are executed simultaneously and the rate at which they are started and finished [5].
IV FPGA IMPLEMENTATION OF BASIC AES ALGORITHM AND FULLY PIPELINED AES
AES is implemented as defined in the FIPS-197 document [1]. The algorithm is implemented in ECB mode.
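As a software reference for the FIPS-197 algorithm and the ECB mode named above, the following is a minimal Python sketch of AES-128 encryption. It is not the VHDL design described in this paper. All function names are illustrative, the S-box is computed at start-up rather than stored as a table, and only the 128-bit key size (eleven round keys) is covered.

```python
# Illustrative software model of AES-128 encryption in ECB mode (FIPS-197).
# Assumption: this is a reference sketch, not the paper's hardware design.

def xtime(a):
    """Multiply a byte by {02} in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def gmul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

def build_sbox():
    """S-box entry = multiplicative inverse in GF(2^8), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

# The state is a list of 16 bytes; byte i sits in row i % 4, column i // 4.

def sub_bytes(s):
    return [SBOX[b] for b in s]

def shift_rows(s):
    # Row r is rotated left by r positions.
    return [s[(i + 4 * (i % 4)) % 16] for i in range(16)]

def mix_columns(s):
    # Each column is multiplied by c(x) = {03}x^3 + {01}x^2 + {01}x + {02}.
    out = []
    for c in range(0, 16, 4):
        a = s[c:c + 4]
        for r in range(4):
            out.append(gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3)
                       ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out

def add_round_key(s, k):
    return [a ^ b for a, b in zip(s, k)]

def expand_key(key):
    """Key Expansion for a 16-byte key: returns eleven 16-byte round keys."""
    w = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord, then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[i:i + 4], []) for i in range(0, 44, 4)]

def encrypt_block(block, round_keys):
    s = add_round_key(list(block), round_keys[0])        # initial AddRoundKey
    for rk in round_keys[1:10]:                          # rounds 1 to 9
        s = add_round_key(mix_columns(shift_rows(sub_bytes(s))), rk)
    s = add_round_key(shift_rows(sub_bytes(s)), round_keys[10])  # no MixColumns
    return bytes(s)

def encrypt_ecb(data, key):
    """ECB: each 16-byte block is encrypted independently with the same key."""
    assert len(data) % 16 == 0 and len(key) == 16
    round_keys = expand_key(key)
    return b"".join(encrypt_block(data[i:i + 16], round_keys)
                    for i in range(0, len(data), 16))
```

With the example vector from FIPS-197 Appendix C.1 (key 00 01 02 ... 0f, plaintext 00 11 22 ... ff), encrypt_ecb should return 69c4e0d8 6a7b0430 d8cdb780 70b4c55a. Because ECB treats blocks independently, two identical plaintext blocks produce two identical ciphertext blocks.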
[Figure 2 block labels: INPUTTEXT [128 BITS], KEY [128 BITS], ADDROUNDKEY, SUB BYTES, SHIFT ROWS, MIX COLS, CIPHERTEXT, SUB WORD, ROT WORD, RCON, W[i-NK]]
Figure 2: Architectural block diagram
The decryption process follows virtually the same order as encryption, except for another round of mix columns on the generated keys before giving them to the add round key step. This flow is clearly explained in the FIPS-197 document [1].
In the encryption/decryption sequence, input data and key are fed in two blocks of 64 bits in consecutive clock cycles using the load signal. 64 bits of input and key are read on the positive clock edge after the load signal goes high, and another block of 64 bits of input and key is read on the positive clock edge after the load signal goes low. Hence the complete data and key are loaded only when the load signal makes a low-high-low transition (basically a pulse). The process starts once the start signal is pulsed, and the output is validated by the 'done' signal 13 clock cycles after the 'start' signal goes low. 'done' remains high until the next start cycle.
Figure 4.2: Process sequence for encryption/decryption
A. RESULTS AND DISCUSSIONS
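The load, start and done signalling described for the core can be modelled cycle by cycle. The following is a minimal Python sketch of that interface only, with no cipher logic. The class and function names are hypothetical, and the exact cycle alignment (which clock edge counts as cycle zero, and which 64-bit half is sent first) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass, field

LATENCY = 13  # clock cycles from 'start' going low to 'done', as stated in the text


@dataclass
class CoreInterfaceModel:
    """Hypothetical model of the load/start/done signalling only."""
    data_halves: list = field(default_factory=list)  # captured 64-bit data halves
    key_halves: list = field(default_factory=list)   # captured 64-bit key halves
    prev_load: int = 0
    prev_start: int = 0
    countdown: int = 0
    done: int = 0

    def clock(self, load, start, data_in=0, key_in=0):
        """Advance one rising clock edge with the sampled inputs; return 'done'."""
        if load and not self.prev_load:
            # First edge after load went high: capture the first 64-bit halves.
            self.data_halves, self.key_halves = [data_in], [key_in]
        elif self.prev_load and not load:
            # First edge after load went low: capture the second 64-bit halves.
            self.data_halves.append(data_in)
            self.key_halves.append(key_in)

        if start and not self.prev_start:
            self.done, self.countdown = 0, 0   # a new start cycle clears 'done'
        elif self.prev_start and not start:
            self.countdown = LATENCY           # start went low: begin counting
        elif self.countdown > 0:
            self.countdown -= 1
            if self.countdown == 0:
                self.done = 1                  # output is valid from here on

        self.prev_load, self.prev_start = load, start
        return self.done


def run_once(data_first, data_second, key_first, key_second):
    """Drive one load pulse and one start pulse; return the model and 'done' trace."""
    core = CoreInterfaceModel()
    core.clock(load=1, start=0, data_in=data_first, key_in=key_first)    # load high
    core.clock(load=0, start=0, data_in=data_second, key_in=key_second)  # load low
    core.clock(load=0, start=1)                                          # start pulse
    trace = [core.clock(load=0, start=0) for _ in range(LATENCY + 1)]
    return core, trace
```

In this model, 'done' stays low for the edge on which 'start' is seen low and for the twelve edges after it, then goes high and stays high until 'start' is raised again.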
Table 1 shows the summary of resources utilized by the Basic AES core for a Virtex 3 device. Out of the available 64896 Slice Flip Flops, 64896 4 input LUTs, 804 bonded IOBs, 4 GCLKs and 4 GCLKIOBs, the designed core has utilized only 906 Slice Flip Flops, 15742 4 input LUTs, 261 bonded IOBs, 1 GCLK and 1 GCLKIOB. Thus the percentage utilization of resources is 1% Slice Flip Flops, 24% 4 input LUTs, 32% bonded IOBs, 25% GCLKs and 25% GCLKIOBs.
Table 2 shows the summary of resources utilized by the Fully Pipelined AES core for a Virtex 3 device. Out of the available 64896 Slice Flip Flops, 64896 4 input LUTs, 804 bonded IOBs, 4 GCLKs and 4 GCLKIOBs, the designed core has utilized only 14057 Slice Flip Flops, 34479 4 input LUTs, 385 bonded IOBs, 1 GCLK and 1 GCLKIOB. Thus the percentage utilization of resources is 21% Slice Flip Flops, 53% 4 input LUTs, 47% bonded IOBs, 25% GCLKs and 25% GCLKIOBs.
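The percentages quoted for Tables 1 and 2 follow directly from the used and available counts. As a short check, the Python sketch below recomputes them; the dictionary and function names are illustrative. It assumes the figures are truncated to whole percent rather than rounded, which is what reproduces the reported values (for example, 14057 / 64896 is about 21.66% and is reported as 21%).

```python
# Resource counts as quoted in the text for Tables 1 and 2.
AVAILABLE = {
    "Slice Flip Flops": 64896,
    "4 input LUTs": 64896,
    "bonded IOBs": 804,
    "GCLKs": 4,
    "GCLKIOBs": 4,
}

USED = {
    "Basic AES": {
        "Slice Flip Flops": 906,
        "4 input LUTs": 15742,
        "bonded IOBs": 261,
        "GCLKs": 1,
        "GCLKIOBs": 1,
    },
    "Fully Pipelined AES": {
        "Slice Flip Flops": 14057,
        "4 input LUTs": 34479,
        "bonded IOBs": 385,
        "GCLKs": 1,
        "GCLKIOBs": 1,
    },
}


def utilization(used, available):
    """Percentage of each resource used, truncated to a whole percent."""
    return {name: 100 * used[name] // available[name] for name in available}


for design, counts in USED.items():
    print(design, utilization(counts, AVAILABLE))
```

The Fully Pipelined core uses roughly fifteen times as many Slice Flip Flops as the Basic core (14057 versus 906) and about twice as many LUTs (34479 versus 15742), which is the resource cost of pipelining discussed in Section III.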
V CONCLUSION
In this paper two different architectures of AES
named Basic AES and Fully Pipelined AES have
been designed in VHDL. The codes have been
synthesized using Xilinx ISE 9.2i software for a
Virtex 3 FPGA device.
DEVICE UTILIZATION SUMMARY      BASIC AES     FULLY PIPELINED AES
Number of Slice Flip Flops      1%            21%
Number of 4 input LUTs          24%           53%
Number of bonded IOBs           32%           47%
Number of GCLKs                 25%           25%
Number of GCLKIOBs              25%           25%
Power consumption               7 mW          367 mW
Maximum pin delay               3.843 ns      5.037 ns
Clock delay                     1.427 ns      1.407 ns
Average connection delay        868 ns        1.631 ns
Table 3: Comparison between Basic AES and Fully Pipelined AES
The implementation of Basic AES and Fully Pipelined AES shows that the Basic AES core uses 1% Slice Flip Flops, 24% 4 input LUTs, 32% bonded IOBs, 25% GCLKs and 25% GCLKIOBs. The core consumes only 7 mW of power. The average connection delay for this design is 868 ns, the maximum pin delay is 3.843 ns, and the clock delay is 1.427 ns. The Fully Pipelined AES core, in contrast, uses 21% Slice Flip Flops, 53% 4 input LUTs, 47% bonded IOBs, 25% GCLKs and 25% GCLKIOBs. The designed core consumes 367 mW of power. The average connection delay for this design is 1.631 ns, the maximum pin delay is 5.037 ns, and the clock delay is 1.407 ns.
The Fully Pipelined architecture uses more resources and consumes more power, yet it has very high speed as compared to the Basic AES core. The average connection delay has decreased from 868 ns to only 1.631 ns. Results also show that the clock delay decreases from 1.427 ns to 1.407 ns.
REFERENCES
[1] National Institute of Standards and Technology, "Federal Information Processing Standards Publication 197", 2001.
[2] Hoang Trang, Nguyen Van Loi, "An efficient FPGA implementation of the Advanced Encryption Standard algorithm", IEEE, 2012.
[3] Ahmad, N., Hasan, R., Jubadi, W.M., "Design of AES S-Box using combinational logic optimization", IEEE Symposium on Industrial Electronics & Applications (ISIEA), pp. 696-699, 2010.
[4] Daemen, J., Rijmen, V., "The Design of Rijndael: AES - the Advanced Encryption Standard", Springer-Verlag, 2002.
[5] Oscar Perez, Yves Berviller, Camel Tanougast, Serge Weber, "Comparison of various strategies of implementation of the algorithm of encryption AES on FPGA", IEEE ISIE 2006, July 9-12, 2006.
[6] Yang Jun, Ding Jun, Li Na, Guo Yixiong, "FPGA-based design and implementation of reduced AES algorithm", IEEE, 2010.