Uploaded by International Research Journal of Engineering and Technology (IRJET)

IRJET- Low Power and Simple Implementation of Secure Hashing Algorithm (SHA-2) using VHDL Implemented on FPGA of SHA-224/256 Core

advertisement
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
Low Power and Simple Implementation of Secure Hashing Algorithm (SHA-2) Using VHDL
implemented on FPGA of SHA-224/256 core.
Prof. Rishi Jha
Deptt of Electronics and Communication
Ms.Shreshtha Mishrs (Garg)
M Tech (Embedded System and VLSI Design)
---------------------------------------------------------------------***---------------------------------------------------------------------The most critical step for implementation is compression that
consists of a 64-step loop (80 for SHA-384/512) that is
dependent on each other and therefore not parallelizable. Even
expansion in the official drafting of the algorithm is a loop,
however, it can be carried out without loosing clock cycles
while loading the block through a buffer and a pointer over it.
Abstract- Cryptography plays an important role in the
security of data. Even though the data is encrypted it can be
altered while transmitting on the network so data should be
verified using a digital signature. Hashing algorithms are used
to create these digital signatures for verification of the data
received. Hashing algorithm like Secure Hash Algorithm-2
(SHA-2(224/256)) is designed which has a fixed output length
of 512-bits. Then to improve on power a low-power technique
such as latch based clock gating technique is used.
We performed other modifications and optimizations
(rescheduling, pipelines between the stages of the different
blocks, redundancies) to reduce the total delay, some of the
many documents available on the net, the main ones in the
bibliography (2) (3), others introduced by Tailored to my
design. I have examined and discarded many other
possibilities (unrolling, quasi-pipeline, special additions)
because they are not suitable for our architecture.
Introduction- In this research paper several hardware
optimization techniques for the SHA-2 hashing functions were
explored. A new architecture i.e., Round Pipelined Technique
was proposed for the SHA-2 core, which eliminates the data
dependency between iteration using data forwarding to
improve the throughput per area. The fully iterative and
Round Pipelined Techniques were investigated and developed
using HDL. A comparison with other published results depicts
57% improvement in throughput per slice for SHA-256 and
17% improvement in throughput per slice for SHA-512.
Implementation results indicate that the Round Pipelined
technique can help to achieve good tradeoff between
throughput and area. As future work, implementations of the
SHA-2 core may be attempted by adopting various other
design techniques like a design optimized for area using better
resource sharing or loop unrolling techniques. Another
optimization effort could be to increase the depth of the
pipeline stages to increase the throughput.
Cryptographic Hash Function Designs (SHA-2 )
Three new revised versions of SHA were added into SHA
family by NIST in August 2002 as FIPS 180-2, known to be
SHA-256, SHA-384, and SHA-512 with the respective hash
value lengths of 256, 384, and 512 bits. Later in 2008, FIP
PUB 180-3 was issued as a revised document, which added
SHA-224 [RFC3874] into the family. Collectively, these
algorithms are recognized as SHA2.
The aim of the project is to implement VHDL of the Hash
Algorithm version 2 (SHA-2) hashing cryogenic algorithm,
which will only provide partial and useful information to
understand my work.
From an arbitrary length message (<264), the algorithm
generates a fixed length hash, equal to the number present in
the name of the four variants of the SHA-224, SHA-256,
SHA-384, SHA-512 algorithm.
Fig. One Iteration of Compression Function of SHA-2
Family
A preprocessing (or padding) phase is provided in which the
message is bit 1 and so many bits 0 to its length to a value
divided by 512 for the rest 448 (i.e., 512 to 64) and then
accodated The length of the original message (64 bits).
SHA256 Algorithm
The SHA256 hashing calculation task can be advantageously
separated into three unmistakable activities. They are as per
the following:
The message is divided into 512-bit blocks and the algorithm
is applied sequentially to cascade on the blocks, using the
result obtained from the previous block as the input data for
the next calculation. Each block passes through three phases:
data expansion, compression cycle, and hash processing (or
intermediate digestion).
© 2019, IRJET
|
Impact Factor value: 7.211
•
•
|
Pre-handling: Operation that performs cushioning
rationale and parses the info message
Message scheduler: Function that creates sixty-four
words from a 16 word input message square
ISO 9001:2008 Certified Journal
|
Page 1713
•
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
Wt = Mt
Compression work: Function that completes the
genuine hashing activity of the message-subordinate
word that leaves the message scheduler in each round
For 16 ≤ t ≤ 63,
Wt = 1(Wt-2) + Wt-7 + 0(Wt-5) + Wt-16
SHA256 Pre-preparing
Equation 6: SHA256 Message Scheduler. Source: [26]
The SHA256 pre-preparing is the underlying advance that
should be performed before the message planning and the
pressure capacity can be connected. The pre-preparing stage
plays out the accompanying three undertakings all together:
•
•
•
Here, σ 0 and σ 1 are two logical functions specific to the
SHA256 message scheduler that operate on a 32 bit word. The
details of these functions are provided below:
0(x) = ROTR7(x) ⊕ ROTR18(x) ⊕ SHR3(x)
Pad the message to make it a various of 512 bits,
Parse this message into 512 piece squares, and
Set the underlying hash esteem
1(x) = ROTR17(x) ⊕ ROTR19(x) ⊕ SHR10(x)
Equation 7: Logical Functions σ0 and σ1. Source: [26]
Padding the Message
The two consistent capacities 0 and 1 work on an expression
of the information message and apply the above bitwise tasks
to it. ROTRx remains for bitwise turn appropriate for x bits,
SHRx remains for bitwise move right and ⊕ remains for the
bitwise selective or. This message plan square is normally
executed in equipment by utilizing 16 phases of 32 bit move
registers and three 32 bit adders [33] for the 512 piece
information square handling.
The message to be hashed should be cushioned first.
Cushioning is done as such as to guarantee that the message to
be hashed is a numerous of the square size for SHA256 i.e.
512 bits. Presently, in the event that we consider that the
length of the message is l bits, the cushioning rationale is with
the end goal that it attaches a bit "1" toward the finish of the
real message which is then trailed by k number of zero bits.
Here, k is the littlest, non negative answer for the
accompanying condition [26]:
Each round, the 32 bit estimation of Wt is moved to one side
utilizing the move enrolls as past estimations of Wt are
required to ascertain future estimations of Wt. It is engaging
realize that two sensible capacities σ0, σ1 and the message
plan rationale clarified don't become an integral factor until
the point that the seventeenth round. The 512 piece input
message is nourished as it is to the message pressure work for
the initial 16 rounds.
l + 1 + k = 448 mod 512
Condition 5: SHA256 Padding Logic
The condition above is such in light of the fact that SHA256
permits an info message to have a length of up to 264 bits.
After the trail of 0 bits, a 64 bit square is attached toward the
end that is equivalent to l spoke to in a double portrayal.
SHA256 Message Compression Function
Parsing the Padded Message
After the message has been cushioned utilizing the rationale
clarified above, it is parsed into N 512-piece squares with the
goal that the message planning and hash calculation can be
initiated.
Setting the Initial Hash Value (H0)
Prior to the hash calculation begins, the underlying hash
esteem is set which comprises of the accompanying 32 bit
words:
H00
H01
H02
H03
H04
H05
H06
H07
0x6a09e660xbb67ae80x3c6ef37 0xa54ff53 0x510e52 0x9b0568 0x1f83d9a0x5be0cd1
7
5
2
a
7f
8c
b
9
The message pressure work plays out the genuine hashing
activity and is the principle task that authorizes the restricted
property of SHA256. Other than the eight expressions of
working factors A, B, C, D, E, F, G and H that are utilized and
refreshed in each cycle, two transitory words T1 and T2 are
additionally utilized by the message pressure work for
calculation of the factors An and E in each round. The initial
step that the message pressure work performs is that it
initialises these 8 working factors with the IV (in the event
that it is the principal square) or with the halfway hash of the
past square (on the off chance that it isn't the primary square
being hashed).
Table 1: SHA256 Initialisation Vector. Source: [26]
It is interesting to know the origin of this 8 word value. They
were obtained by taking the first 32 bits of the fractional parts
of the square roots of the first 8 prime numbers. This initial
hash value acts as the IV for the SHA256 algorithm as
explained in the earlier section.
SHA256 Message Scheduler
Figure 3: SHA256 Message Compression Function (Above)
and Message Scheduler (Below)
After the pre-processing stage is completed, the message
schedule block takes the first 512 bit message block and
outputs the message dependant words Wt. The 32 bit messagedependant words that that are output by the message scheduler
for every round are labelled as W0, W1,…, W63 (for t=0 to
63) and they are calculated as follows:
The figure above demonstrates the common execution of the
message pressure work and the message scheduler that work
couple. It can be obviously observed from the figure over that
at each cycle, 6 out of 8 estimations of A, B, C, and E, F, G
are moved by one position to B, C, D and F, G, H separately.
Wt is the 32 bit information figured by the message scheduler
For 0 ≤ t ≤ 15,
© 2019, IRJET
|
Impact Factor value: 7.211
|
ISO 9001:2008 Certified Journal
|
Page 1714
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
and bolstered to the pressure capacity and Kt is a round
particular 32 bit steady whose round particular qualities are
determined in addendum A. SHA256 utilizes 64 constants (1
for each round) that are 32 bit words and they have been
gotten by taking the initial 32 bits of the partial parts of the
block foundations of the initial 64 prime numbers. Factors An
and E are reliant on all information esteems and are figured in
each round utilizing conditions clarified straightaway. After
the 8 working factors are initialised as clarified before, 64
rounds of the pressure work are connected to them and middle
of the road round estimations of these factors are ascertained
as takes after:
T1 = H + Σ1(E) + Ch(E, F, G) + Kt + Wt
T2 = Σ0(A) + Maj(A, B, C)
H = G; G = F; F = E
Figure 4: SHA256 Compression Function Along with the
Final Additions.
E = D + T1 = D + H + Σ1(E) + Ch(E, F, G) + Kt + Wt
D = C; C = B; B = A
The figure above represents a different look at the
compression function but it conveys the same message. The
point to take away from this figure is that after the
compression function has been applied 64 times i.e. after the
64 rounds have been completed, the values contained in the
working variables A to H are finally added to the 8 word data
block that was fed to the compression function at the
beginning. This value could either be the constant IV for
SHA256 or an intermediate message digest. This is because of
the fact that the SHA256 algorithm follows the Davies-Meyer
construction where the input is added to the output at the end.
Now, the intermediate/final hash is given by the following
equation:
A = T1 + T2 = H + Σ1(E) + Ch(E, F, G) + Σ0(A) + Maj(A, B,
C) + Kt + Wt
Equation 8: Message Compression Function. Source: [26]
The four logical functions mentioned above perform the core
operation of introducing the confusion and diffusion in Wt that
enters in 32 bit words at each round. After applying the above
equations to the working variables for 64 rounds, an
appropriate level of the avalanche effect is observed. These 4
logical functions are now explained next.
Ch(X, Y, Z) = (X ∧ Y)⊕(¬X ∧ Z)
Maj(X, Y, Z)= (X ∧ Y)⊕(X ∧ Z)⊕(Y ∧ Z)
H0 i+1 = A + H01
Σ0(X) = ROTR2(X) ⊕ ROTR13(X) ⊕ ROTR22(X)
H1 i+1 = B + H11
Σ1(X) = ROTR6(X) ⊕ ROTR11(X) ⊕ ROTR25(X)
H2 i+1 = C + H21
Equation 9: Logical Functions Ch, Maj, Σ0 and Σ1.
H3 i+1 = D + H31
Here, logical functions Ch and Maj take 3 words as input and
produce a single word output. ∧ stands for a 32 bit Bitwise
AND operation while ¬ is the compliment operation. The Ch
function always takes the working variables E, F and G as
inputs while the Maj function always takes A, B and C as
inputs. Variables A and E are the ones that need to be
computed at each round. Functions Σ0 and Σ1 always take
variables A and E as their input. We canthus see some sort of
symmetry in the message compression function that divides it
into two parts. This symmetry is evident from figure 3.
H4 i+1 = E + H41
H5 i+1 = F + H51
H0 i+1 = G + H61
H0 i+1 = H + H71
Equation 10: Calculation of Intermediate/Final Hash Value
After all the message blocks including the final Nth message
block has been processed in this manner, the final hash i.e. the
256 bit message digest of the message is represented in the
following manner:
SHA256(M) = H0N || H0N || H0N || H0N || H0N || H0N || H0N H0N
Equation 11: Resulting SHA256 Message Digest. Source: [26]
This 8 word information square (H0 - H7i) is the default
steady SHA256 initialisation vector (IV) if the message was
not exactly or equivalent to 512 bits (counting the cushioning).
In the event that the length of the message (counting
cushioning) is more noteworthy than 512 bits, at that point this
8 word information square is the middle hash figured of the
past 512 piece square. This game plan where the halfway hash
estimation of the past square is nourished as IV to the hash
© 2019, IRJET
|
Impact Factor value: 7.211
|
ISO 9001:2008 Certified Journal
|
Page 1715
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
calculation of the following square is known as the MerkleDamgård development. SHA256 depends on this development
called the Merkle-Damgård Paradigm and is worked to be
impact safe as the hidden SHA256 pressure work is crash safe.
Sg_sha_kt_rom
This is the 64 words coefficients rom for the square hash
center.
•
•
Module Architecture of Proposed Design
Sg_sha256
This is the sg_sha256 motor best level.
•
•
•
•
•
•
•
Sg_sha_msg_sch
The sg_sha256 is a stream hash motor, i.e., the
information words are hashed as a surge of words read
from an info transport, with control contributions for
BEGIN/END of the message information stream. The
information transport is a 32bit word transport, with a
byte path selector to signalize what number of bytes are
legitimate in the last word.
The center is a basic coordination of the rationale
obstructs for the SHA256 motor, with the inner
information way and control-way wires.Written in synthesizable VHDL, the hash motor is a
low asset, zone productive execution of the SHA256
hash calculation.
The information input port is sorted out as a 32bit word
compose enroll, with stream control and start/end
signals.
The 256bit outcome enlist is sorted out as 8 x 32bit
registers that can be perused at the same time.
This execution is a traditionalist usage of the affirmed
FIPS-180-4 calculation, with a reasonable bargain of
assets, containing just 32 registers of 32bit words for
the hash motor, with a solitary cycle combinational
rationale for every calculation step.
The SG_SHA256 is a fundamental cryptographic
square, utilized by all encryption and advanced mark
plans.
This is the message scheduler information way for the sha256
processor.
Sg_sha_padding
This is the byte cushioning datapath rationale for the sha256
processor.
•
•
•
•
The regs square has the yield result registers for the SHA256
processor.
•
•
The sequencer state machine controls the hash data
way modules, making addresses for the coefficients
ROM, stack/enable signs for the
•
message design, hash focus and yield registers circuit
squares, and control signals for the data padding
reason.
The only practical way of a high speed SHA256 engine is to
implement it in hardware be it either FPFAs or the recent
technology of ASICs. Software implementations of Bitcoin
mining used in CPU or GPU mining have become obsolete as
they simply cannot compete with the hashing power of
hardware implementations. These hardware implementations
truly serve the meaning of a fast implementation and various
hardware optimisations have been proposed over the years in
order to increase their throughput and to reduce their power
consumption. These optimisations, however, are aimed at the
hardware implementation of the SHA256 hashing algorithm in
general rather than SHA256 employed for Bitcoin mining.
Most of these optimisations are aimed towards the longest data
path or the critical path in the SHA256 core which is the
calculation of working variable A in the message compression
function that involves mod 232 additions of 7 operands (see
equation 8 in section 3.2.3). We shall now have a look at the
various SHA256 hardware speedup proposals made.
Sg_sha_hash_core
This is the 256bit single-cycle hash center preparing rationale
for each of the 64 square advances.
•
The combinational profundity of this square is 8 layers
of rationale and adders.
SHA256 Hardware Optimisations
Many hardware implementations have been seen in the
literature that are either FPGA [15] [16] [25] [30] [39] or
ASIC designs [21] [22] [33] [37] [47]. These implementations
designs contain one or a combination of the following
optimisations so as to speed up the calculations i.e. the
throughput of the SHA256 core. The main design difference
for the hardware implementation of SHA256 lies in the trade-
Sg_sha_Ki_rom
Starting qualities for the hash result registers.
•
•
This module is demonstrated as a settled esteem work.
It can be actualized as a nearby steady settled esteem.
© 2019, IRJET
|
Impact Factor value: 7.211
It is a solitary cycle 256bit Accumulator for the square
hash comes about, and can be actualized as a 32bit
MUX and a 32bit convey chain for each enroll.
The Hardware Implementations and Optimisations of
SHA256
This is the control route method of reasoning for the
sg_sha256 brisk engine.
It is a totally synchronous diagram, with all signs
synchronous to the rising edge of the system clock.
The cushioning of the last square is controlled by the
byte path selectors and the last words selectors.
These control signals are produced at the Control Logic
square of the SHA256 processor.
A consistency check blunder flag is created, to hail
illicit control conditions.
This square is a completely combinational circuit, with
2 layers of rationale.
Sg_sha_regs
Sg_sha_control
•
It is displayed as an offbeat addressable ROM memory.
Depending on the manufacture procedure and
innovation, this memory can be executed as an OTP, a
MUX, a settled LUT or a combinational capacity.
|
ISO 9001:2008 Certified Journal
|
Page 1716
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
off between throughput and the area complexity which is
measured in Gate Equivalents (GE). But, in our cases of
Bitcoin mining, we typically have no area/space constrains
and thus we shall concentrate on the throughput optimisations
only. More the area, more is the throughput and lesser is the
number of required clock cycles to perform the SHA256
computation.
but this architecture requires additional control circuitry for
the additional register introduced in the architecture.
Addition of Kt and Wt
Looking at figure 3 and figure 4, we can see that the addition
of Kt and Wt can be performed independent of the message
compression function. The architecture proposed in [52] uses
this as an improvement by moving Kt + Wt to the message
scheduler stage. This can be done because both Wt and Kt are
available before and are independent of the other operands
(see equation 8 in section 3.2.3). However, it is seen that
quasi-pipelining architecture proposed by Dadda et. al. [21]
[22] [36] performs a similar separation of the operands and the
resulting critical path is even shorter than in [52].
Use of Carry-Save Adders (CSAs)
As mentioned before, the calculation of the working variable
A for each round of the compression function forms the
longest data path or the critical path in the SHA256 core. This
involves mod 232 additions on 7 operands (Kt, Wt, H, Σ1(E),
Ch(E, F, G), Σ0(A), Maj(A, B, C)). Architectures [21] [22]
[35] [36] [39] employing Carry Save Adders (CSAs) minimise
the delay caused by the carry propagation time by separating
the sum and the carry paths. CSAs accept 3 operands as inputs
and so, the working variable A can be computed using just 5
CSAs [22]. Having said that, CSAs require another 2-input
adder for the recombination of the sum and carry paths. This 2
operand addition can either be performed by using CLAs i.e.
Carry Look Ahead adders or by using CPAs i.e. Carry
Propagation Adders. The net result of using CSAs for the
critical path is that they reduce the carry propagation delay
caused as compared to traditional CPAs used on the critical
path.
Operation Rescheduling
Architectures that employ operational rescheduling allow an
efficient use of a pipelined structure without increasing the
area complexity. This in turn allows higher throughputs. [15]
[16] have claimed that they were able to reduce the critical
path in a similar manner as unrolling techniques and gain a
higher throughput without adding more area complexity.
Module Architecture of Proposed Design
SHA-2 hardware is architected by using iterative and round
pipeline basis where iterative SHA224/256 takes 64 clock
cycles and pipelined operation takes 68 clock cycles.
SHA384/512 takes 80 clock cycles for iterative operation and
84 clock cycles for pipelined operation to calculate the final
hash value.
Unrolling
Unrolled architectures [20] [35] [36] [39] reduce the number
of clock cycles required to perform the SHA256 hash
computation by implementing multiple rounds of the SHA256
compression function using combinational logic. These
architectures help improve the throughput by optimising the
data dependencies involved in the message compression
function. Say if the SHA256 core was unrolled once, then this
would effectively mean that the hash should be calculated in
half the number of clock cycles. As a trade-off, unrolling the
SHA256 core architecture comes at the cost of a decrease in
the clock frequency and an increase in the area complexity.
The aim is to implement the designed hash function core on
VHDL. The whole package and separate modules were
synthesized and analyzed using Xilinx ISE 12.1 tool for the
targeted Virtex-VI FPGA.
The VHDL implementation was divided into five modules:
•
•
(Quasi-) Pipelining
The goal of quasi-pipelining is to optimise the critical path and
therefore increase the clock frequency. Quasi-pipelined
SHA256 architectures [21] [22] [36] [39] use registers to
break the long path or the critical path of the computation of
the working variable A in the message compression function.
Thus, such quasi-pipelines architectures allow higher data
throughputs and higher frequencies of hash calculations by
achieving very short critical paths. Pipelining is not as easy to
achieve as it sounds due to the feedback associated due to the
way in which the SHA256 compression function is designed.
As a result, an external control circuitry is required such that
the registers are enabled correctly.
•
•
•
Delay Balancing
Dadda et al. [22] have been the pioneers in hardware
optimisations of SHA256 and they have also spent their
research efforts on delay balancing along with the use of
CSAs. Just as described earlier, a CLA is used to combine the
sum and the carry paths output by the CSA but these sum and
carry paths are first registered so that the CLA adder is
removed from the critical path. This increases the throughput
© 2019, IRJET
|
Impact Factor value: 7.211
|
Initial module: - It collects the serial input bits and
sends 512 bit blocks to the next module.
Round module: - It performs the hashing calculations
and operations on the input message block and
previous hash output to generate a new hash value.
Last Block module: - At the end of the message bit
stream the final message block of 512 bits has to be
prepared by adding 64 bits of message length at the end
of 448 bits of input message block, padded accordingly
to suffice the word size requirement. This final
message block does this function of preparing the last
message block.
Final module: - This module computes the hash value
by adding the previous hash value to the new hash
value achieved from the Round module. Then it sends
the 256 bit hash value, bit by bit (serially).
Top module: - This module is the control unit for
controlling the functioning of the rest of the modules
and to ensure that the SHA-2 algorithm flow is
followed and maintained
ISO 9001:2008 Certified Journal
|
Page 1717
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
control path logic
Complete Top Level Logic Design of SHA-256
Sg_sha_hash_core
Sg_sha256
This is the sg_sha256 engine top level.
•
•
•
•
•
•
•
This is the 256bit single-cycle hash core processing logic for
each of the 64 block steps.
The sg_sha256 is a stream hash engine, i.e., the data
words are hashed as a stream of words read from an
input bus, with control inputs for BEGIN/END of the
message data stream. The input bus is a 32bit word bus,
with a byte lane selector to signalize how many bytes
are valid in the last word.
The core is a structural integration of the logic blocks
for the SHA256 engine, with the internal data-path and
control-path wires.Written in synthesizable VHDL, the hash engine is a
low resource, area-efficient implementation of the
SHA256 hash algorithm.
The data input port is organized as a 32bit word write
register, with flow control and begin/end signals.
The 256bit result register is organized as 8 x 32bit
registers that can be read simultaneously.
This implementation is a conservative implementation
of the approved FIPS-180-4 algorithm, with a fair
compromise of resources, comprising of only 32
registers of 32bit words for the hash engine, with a
single-cycle combinational logic for each algorithm
step.
The SG_SHA256 is a basic cryptographic block, used
by almost all encryption and digital signature schemes.
•
256bit single-cycle hash core processing logic
Sg_sha_Ki_rom
Initial values for the hash result registers.
Sg_sha_control
•
•
This is the control path logic for the sg_sha256 fast engine.
•
•
•
It is a fully synchronous design, with all signals
synchronous to the rising edge of the system clock.
The sequencer state machine controls the hash data
path modules, generating addresses for the coefficients
ROM, load/enable signals for the
message schedule, hash core and output registers
circuit blocks, and control signals for the input padding
logic.
© 2019, IRJET
The combinational depth of this block is 8 layers of
logic and adders.
|
Impact Factor value: 7.211
This module is modelled as a fixed value function.
It can be implemented as a local constant fixed value.
Sg_sha_kt_rom
This is the 64 words coefficients rom for the block hash core.
•
|
It is modelled as an asynchronous addressable ROM
memory.
ISO 9001:2008 Certified Journal
|
Page 1718
•
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
Depending on the fabrication process and technology,
this memory can be implemented as a OTP, a MUX, a
fixed LUT or a combinational function.
Sg_sha_msg_sch
This is the message scheduler data path for the sha256
processor.
Sg_sha_padding
This is the byte padding datapath logic for the sha256
processor.
•
•
•
•
The padding of the last block is controlled by the byte lane
selectors and the last words selectors.
These control signals are generated at the Control Logic block
of the SHA256 processor.
A consistency check error signal is generated, to flag illegal
control conditions.
This block is a fully combinational circuit, with 2 layers of
logic.
Registers for the SHA256
Conclusion
The future cryptographic hash standard SHA-2 should
be sensible and versatile for a broad assortment of
usages, featuring meanwhile a perfect security quality.
In this work, we showed an aggregate hardware
depiction of the SHA-2. A round rescheduling
technique and an exceptional reason memory design
are moreover proposed. Post-mix comes to fruition , a
low-control littler utilization of SHA-2 has been
Implemented. I assume that a similar approach for
littler VLSI use of cryptographic traditions is a
productive choice to reduce the area and power use of
the organized circuit.
The foremost purpose of this proposition was to
propose improvements in the SHA256 hashing
estimation. This point was roused by the manner in
which that various hardware based progressions in
SHA256 gear executions have quite recently been
suggested anyway they have been away for the
SHA256 hashing estimation when all is said in done.
byte padding datapath logic
Sg_sha_regs
In perspective of that, the fundamental responsibility
made in this suggestion has been the SHA256 figuring
headway proposition that are specific to decreasing
hardware. The hypothesis in like manner made an
undertaking to deal with establishment information and
moreover the related information which would be
required with a particular true objective to totally
acknowledge what was being prescribed. A talk has
moreover been impacted concerning the precision of
the Savings To factor and the necessity for executing
and taking a gander at as performed by off-the-rack
SHA256 and the propelled version of SHA256 for a
more exact assessment of this Factor. The prerequisite
for an essential examination of the figuring
improvements' closeness with existing gear headways
has furthermore been discussed. It is assumed that the
prescribed improvements will accomplish radical
throughput overhauls in devices.
The regs block has the output result registers for the SHA256
processor.
•
It is a single-cycle 256bit Accumulator for the block
hash results, and can be implemented as a 32bit MUX
and a 32bit carry chain for each register.
© 2019, IRJET
|
Impact Factor value: 7.211
|
ISO 9001:2008 Certified Journal
|
Page 1719
International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 06 Issue: 04 | Apr 2019
p-ISSN: 2395-0072
www.irjet.net
The extensive variety of achieved displays makes prepared for
the utilization of the SHA-2 ability to various hardware use.
International Conference on Field-Programmable
Technology, 2002.
[12.]
Shay Gueron, Vlad Krasnov, “Parallelizing message
schedules to accelerate the computations of hash
functions”, Journal of Cryptographic Engineering,
Volume 2, Issue 4, pp 241-253,November 2012.
[13.]
Helion Technology Ltd. (http://www.heliontech.com)
[14.]
Chanjuan Li1, Qingguo Zhou2, Yuli Liu2, Qi,
“Costefficient Data Cryptographic Engine Based on
FPGA”,Fourth International Conference on Ubi-Media
2011.
Shutting remarks are,
•
'Lightweight' is the rising star of cryptography. The
term 'lightweight' alone covers a wide range, for
instance, lightweight to the extent zone, speed, control
usage, essentialness use, or a blend of these, dependent
upon the specific application.
•
This ask about solely centers around the lightweight for
zone, which in like manner realizes lightweight for
typical power usage in numerous applications.
•
The usage of square memories is avoided for similitude
on different stages.
•
We have been productive in accomplishing our target
of most lessened entryway count, and even made sense
of how to beat a segment of the starting late proposed
lightweight hash fills in to the extent littleness and
throughput.
REFERENCES
[1.] William Stallings, “Cryptography and Network
Security,Principles and Practices” Fourth Edition, 2005.
[2.] NIST “SECURE HASH STANDARD”, Federal
Information Processing Standards Publication 180-1,
August 1995.
[3.] NIST “SECURE HASH STANDARD”, Federal
Information Processing Standards Publication 180-2,
August 2002.
[4.] NIST “SECURE HASH STANDARD”, Federal
Information Processing Standards Publication 180-3,
August 2008.
[5.] Harris E. Michail, Athanasios S. Milidonis, “A TopDown Design Methodology for Ultrahigh-Performance
Hashing Cores” IEEE transaction on Dependable and
Secure computing, vol. 6, No. 4, October-December
2009.
[6.] N. Skluvos, G. Dimitroulakos, and O. Koufopavlou,“An
Ultra HighSpeed Architecture for VLSI Implementation
of HashFunctions”, Electronics, Circuits and Systems,
2003.
[7.] Marco Macchetti, Luigi Dadda, “Quasi-Pipelined Hash
Circuits”, Proceedings of the 17th IEEE Symposium on
Computer Arithmetic 2005.
[8.] Robert P. McEvoy, Francis M. Crowe, Colin C. Murphy
and William P. Marnane, “Optimisation of the SHA-2
Family of Hash Functions on FPGAs”, IEEE Computer
Society Annual Symposium onEmerging VLSI
Technologies and Architectures, 2006.
[9.] Hoang Anh Tuan, Katsuhiro Yamazaki, Shigeru
Oyanagi,“Three-stage Pipeline Implementation for SHA2 using data forwarding”, International Conference on
Field Programmable Logic and Applications, 2008.
[10.]
Ricardo
Chaves,Georgi
Kuzmanov,Leonel
Sousa,Stamatis
Vassiliadis,“Cost-Efficient
SHA
Hardware Accelerators”, IEEE Transaction onVery Large
Scale Integration (VLSI) systems, Vol. 16, No. 8, Aug
2008.
[11.]
M. McLoone, J. V. McCanny, “Efficient Single-Chip
Implementation of SHA-384 &SHA-512”, IEEE
© 2019, IRJET
|
Impact Factor value: 7.211
|
ISO 9001:2008 Certified Journal
|
Page 1720
Download