KWAME NKRUMAH UNIVERSITY OF SCIENCE AND
TECHNOLOGY, KUMASI
COLLEGE OF ENGINEERING
IMPLEMENTATION OF ENCRYPTION AND DECRYPTION ALGORITHMS AT
THE HARDWARE LEVEL - THE DESIGN AND DEVELOPMENT OF
MICROPROCESSORS AT THE TI MICROELECTRONICS LAB
COE 499: VACATION TRAINING
TECHNICAL REPORT
STUDENT NAME: ISAAC DANSO
INDEX NUMBER: 3029420
PROGRAMME: BSc COMPUTER ENGINEERING
DURATION: EIGHT WEEKS
STARTING DATE: 4TH SEPTEMBER, 2023
END DATE: 27TH OCTOBER, 2023
ACKNOWLEDGEMENTS
First and foremost, I am grateful to God Almighty for providing me with the strength
and ability to complete my eight-week internship successfully.
Second, I want to express my gratitude to Chris Nutsukpui and Prof. K. O. Boateng
for providing me with the chance to complete my curriculum internship at their
esteemed institution and for giving up their valuable time to teach me about and
assist me in using the technologies that are necessary in today's market. I learned a
lot from it, and it gave me exposure to a field that used a range of unique technical
design methods and approaches.
Finally, I would like to express my gratitude to my fellow interns for their help and
support during my internship. Their experience and skills have been very helpful to
my growth and learning.
ABSTRACT
During the eight-week internship, two significant projects were completed at the TI
MICROELECTRONICS LAB in the College of Engineering, Kwame Nkrumah University of
Science and Technology. These were the design and implementation of a dedicated
microprocessor to compute an algorithm, and the implementation of the ShangMi
Version-4 (SM4) cipher, both using the hardware description language VHDL. This
report gives a thorough account of the activities completed throughout the period.
The appendix also includes the simulations, tests, and outcomes.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
1 INTRODUCTION
1.1 THE INTERNSHIP
1.2 SELECTION OF INTERNSHIP ORGANIZATION
1.2.1 PERSONAL INTERNSHIP INSTITUTION SELECTION
1.3 BRIEF HISTORY OF THE TI MICROELECTRONICS LAB
1.4 WHAT DO FPGA AND VHDL MEAN?
1.5 WHAT IS CRYPTOGRAPHY?
2 THE INTERNSHIP
2.1 INTRODUCTION
2.2 INTERNSHIP PROJECT ONE
2.2.1 OVERVIEW
2.2.2 PROBLEM AND ALGORITHM
2.2.3 IMPLEMENTATION
2.2.4 SYNTHESIS, SIMULATION AND TESTING
2.3 INTERNSHIP PROJECT TWO
2.3.1 WHAT IS THE SM4 CIPHER?
2.3.2 OVERVIEW
2.3.3 IMPLEMENTATION
2.3.4 SYNTHESIS, SIMULATION AND TESTING
3 CONCLUSION
REFERENCE
APPENDIX A
1 IMPLEMENTATION OF COMPONENTS IN VHDL
APPENDIX B
2 TOP IMPLEMENTATION
1 INTRODUCTION
1.1 THE INTERNSHIP
While vacationing or taking a break from their regular academic or professional
duties, tertiary students can participate in training programmes that improve their
skills, knowledge, and competencies in a particular sector. These programmes are
sometimes referred to as vacation training.
The vacation training course offered by the College of Engineering at the Kwame
Nkrumah University of Science and Technology is a compulsory course that spans a
minimum of eight weeks. It aims to expose students to jobs in the engineering sector
and to the engineering standards and ethics that those jobs require. It helps
students apply the engineering knowledge gained in school to solve real-world
problems in a job setting. It also lets them meet and learn from experts in their
field of study, thereby increasing their employability after the completion of their
four-year programme.
1.2 SELECTION OF INTERNSHIP ORGANIZATION
The office of the College of Engineering provided a website to facilitate the
registration and placement of candidates at various internship centres. Candidates
are also at liberty to apply to an institution of their choice.
1.2.1 PERSONAL INTERNSHIP INSTITUTION SELECTION
The chosen internship venue was the TI MICROELECTRONICS LAB in the College of
Engineering's Faculty of Electrical and Computer Engineering. Prof. K. O. Boateng
and Chris Nutsukpui, an MPhil candidate, oversaw the internship. The internship
provider was chosen because of its compatibility with the candidate's academic
interests and career aspirations. The training was a first step towards the objective
of becoming a Hardware Architect, a foundation on which further knowledge can be
built. The following areas were open to applicants for the internship programme:
FPGAs and encryption algorithms; hardware implementation of algorithms; working
with datapaths and control units; and VHDL.
1.3 BRIEF HISTORY OF TI MICROELECTRONICS LAB
The goal of the TI MICROELECTRONICS LAB is to support teaching and learning about
microcircuitry. It is equipped with cutting-edge software to help students gain
knowledge of the chip industry. The Cadence software suite, the Xilinx software
suite, and the Vivado software suite are just a few of the tools available in the lab.
1.4 WHAT DO FPGA AND VHDL MEAN?
A field-programmable gate array (FPGA) is a kind of integrated circuit that can be
programmed or reprogrammed after manufacturing. It is made up of programmable
logic blocks and interconnect switches that can be set up to carry out different
digital tasks. An FPGA's logic blocks can be configured to act as simple logic gates
like AND and XOR or to carry out complex combinational functions.
A hardware description language (HDL) is typically used to specify an FPGA's design.
The VHSIC Hardware Description Language (VHDL) is an HDL standardised by the
Institute of Electrical and Electronics Engineers (IEEE). It can be used for design
entry, documentation, and verification, and it can model the behaviour and structure
of digital systems at multiple levels of abstraction, from the system level down to
that of logic gates.
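As a small illustration of gate-level description in VHDL, the sketch below describes a half adder built from one XOR gate and one AND gate. The entity name half_adder and its port names are hypothetical and are not part of the internship designs.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; not taken from the internship projects.
entity half_adder is
  port (
    a, b  : in  std_logic;
    sum   : out std_logic;
    carry : out std_logic
  );
end half_adder;

architecture dataflow of half_adder is
begin
  sum   <= a xor b;  -- XOR gate produces the sum bit
  carry <= a and b;  -- AND gate produces the carry bit
end dataflow;
```

A synthesis tool would map these two concurrent assignments onto the FPGA's programmable logic blocks.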
1.5 WHAT IS CRYPTOGRAPHY?
Cryptography hides or encodes information so that only the intended recipient can
read a message. The practice of cryptography dates back thousands of years, to the
time of the ancient Egyptians, who used it to encode communications. It offers
confidentiality, authentication, and privacy.
In order to produce intricate codes that conceal a message's actual meaning, it
blends several academic fields, including computer science, engineering, and
mathematics.
It is a component of computer passwords, bank cards, and online shopping.
Modern cryptographic techniques encrypt and decrypt information using ciphers
whose keys are, for example, 128 or 256 bits long.
2 THE INTERNSHIP
2.1 INTRODUCTION
The internship ran for eight weeks, beginning on Monday, September 4, 2023.
Officially, the internship concluded on October 27, 2023.
We were introduced to all of the participants and the internship supervisors on the
first day of the programme.
The next few days of the week were dedicated to acquainting us interns with our
two primary projects. We also learned about the tools and software we would need
for the internship, such as the Spartan 3 FPGA board and the Xilinx software.
Small projects and assignments were given to us during this time to help us review
important ideas that we needed for the internship, such as datapaths, control units,
state machines, sequential and combinational circuits, and Boolean algebra.
The weeks that followed were filled with continuous tasks, smaller projects, and
presentations meant to monitor our development as interns.
After that, we were given our first major project. I was asked to design a dedicated
microprocessor to compute a given algorithm, and I was required to make periodic
presentations to demonstrate my work on the project.
The last project involved designing, simulating, synthesizing, and testing the ShangMi
Version-4 (SM4) cipher, a Chinese block cipher. Regular presentations were given to
demonstrate my progress.
The internship focused on two key areas, from which the projects were drawn: the
design and implementation of algorithms in hardware using VHDL, and cryptography.
2.1.2 DEDICATED MICROPROCESSORS
The implementation of algorithms in hardware is the genesis of dedicated processing
units. The following were covered in the scope of the internship programme:
Design and Optimization of Datapath
Design and Optimization of Control Unit
Implementation of Algorithm
Simulation and Test of Design (VHDL)
Synthesis of Implementation on Field Programmable Gate Array (FPGA)
A dedicated microprocessor was designed by going from the algorithm to the datapath
and then to the control unit. The design was optimized and implemented in VHDL.
2.1.3 CRYPTOGRAPHY
For cryptography, the following were covered:
Cryptographic Techniques
Transposition Algorithms
Substitution Algorithms
Shannon’s Proposal for Diffusion and Confusion
Feistel Structure
2.1.4 SUMMARY
The internship covered a variety of academic subjects: the implementation of
components such as RAM, decoders, counters, and registers in VHDL, and the design
and implementation of algorithms in cryptography and hardware.
2.2 INTERNSHIP PROJECT ONE-DEDICATED MICROPROCESSOR
2.2.1 OVERVIEW
My first major project involved developing an algorithm based on a given question,
then building and simulating a dedicated microprocessor using the hardware
description language VHDL and an FPGA board. This involved using my
understanding of datapaths, control units, and how they interact to compute the
algorithm and produce meaningful results.
2.2.2 QUESTION AND ALGORITHM
I was given a question that stated: "Design a dedicated datapath for inputting three
8-bit unsigned numbers, and then output the smallest number."
From this, I derived the following algorithm:
input A;
input B;
input C;
if (A < B) {
    E = A;
} else {
    E = B;
}
if (E < C) {
    Y = E;
} else {
    Y = C;
}
output Y;
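The algorithm above can also be written as a purely behavioural VHDL model, which is useful as a reference when checking the datapath and control unit in simulation. This is a minimal sketch, not the design built during the internship; the entity name min3_reference and its port names are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical reference model of the "smallest of three" algorithm.
entity min3_reference is
  port (
    a, b, c : in  unsigned(7 downto 0);  -- the three 8-bit unsigned inputs
    y       : out unsigned(7 downto 0)   -- the smallest of the three
  );
end min3_reference;

architecture behavioral of min3_reference is
begin
  process (a, b, c)
    variable e : unsigned(7 downto 0);   -- intermediate result E
  begin
    -- First comparison: E = min(A, B)
    if a < b then
      e := a;
    else
      e := b;
    end if;
    -- Second comparison: Y = min(E, C)
    if e < c then
      y <= e;
    else
      y <= c;
    end if;
  end process;
end behavioral;
```

For example, with A = 9, B = 4 and C = 7, the first comparison gives E = 4 and the second gives Y = 4.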
2.2.3 IMPLEMENTATION
From the algorithm that was developed, I determined that I required a tri-state
buffer, two eight-bit multiplexers, three eight-bit registers, and two eight-bit
comparators. Creating a datapath with these elements was the first step in the
implementation phase. The datapath is displayed in Figure 2.0 below.
The next step was to create the datapath's corresponding state diagram, which
serves as the foundation for the control unit.
Figure 2.1 displays the eight-state state diagram that was created, with each state
addressing a different step of the process.
Following the state diagram, I drew up a control table that lists the control signals
and the states in which they are asserted.
Figure 2.0 Datapath for Dedicated microprocessor
Figure 2.1 State diagram for Dedicated microprocessor
Instructions    eload  load  muxsel1  muxsel2  Output_enable  Q2Q1Q0
Input A, B, C   0      1     x        x        0              000
No operation    0      0     x        x        0              001
E = A           1      0     1        x        0              010
E = B           1      0     0        x        0              011
E               0      0     x        x        0              100
Y = E           0      0     x        1        0              101
Y = C           0      0     x        0        0              110
Output Y        0      0     0        0        1              111
Figure 2.2 Control table for Dedicated microprocessor
Figure 2.3 Top level representation of Dedicated microprocessor
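The output logic of the control unit can be read directly from the control table of Figure 2.2. The sketch below decodes the state bits Q2Q1Q0 into the five control signals. It is only an illustration: the entity name is hypothetical, the don't-care entries (x) are driven to '0' as an assumption, and the next-state logic is omitted because the state transitions are given only in the state diagram.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical output decoder for the control table of Figure 2.2.
entity min3_control_word is
  port (
    q             : in  std_logic_vector(2 downto 0);  -- state Q2Q1Q0
    eload         : out std_logic;
    load          : out std_logic;
    muxsel1       : out std_logic;
    muxsel2       : out std_logic;
    output_enable : out std_logic
  );
end min3_control_word;

architecture behavioral of min3_control_word is
  -- Control word order: eload, load, muxsel1, muxsel2, output_enable
  signal cw : std_logic_vector(4 downto 0);
begin
  with q select cw <=
    "01000" when "000",   -- Input A, B, C
    "00000" when "001",   -- No operation
    "10100" when "010",   -- E = A
    "10000" when "011",   -- E = B
    "00000" when "100",   -- E
    "00010" when "101",   -- Y = E
    "00000" when "110",   -- Y = C
    "00001" when "111",   -- Output Y
    "00000" when others;  -- covers non-binary std_logic values

  eload         <= cw(4);
  load          <= cw(3);
  muxsel1       <= cw(2);
  muxsel2       <= cw(1);
  output_enable <= cw(0);
end behavioral;
```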
2.3 INTERNSHIP PROJECT TWO - SHANGMI VERSION-4 (SM4) CIPHER
2.3.1 WHAT IS THE SM4 CIPHER
First published in 2006 and designed principally by Lu Shuwang, SM4 is a Chinese
128-bit symmetric key block cipher used by the WLAN Authentication and Privacy
Infrastructure (WAPI) standard for ensuring data confidentiality in wireless networks.
It uses 128-bit keys and 128-bit blocks, and needs 32 rounds for both encryption and
decryption. The cipher uses S-box substitution, left circular shifts and bitwise XOR
operations.
A round key is generated for each of the 32 rounds used in the encryption and
decryption procedure.
The encryption/decryption algorithm and the key generation procedure make up the
cipher.
2.3.1.1 KEY EXPANSION ALGORITHM
The key expansion algorithm generates the round keys from the cipher key.
The cipher key is MK = (MK0, MK1, MK2, MK3), where each MKi is a 32-bit word. The
round keys are generated as follows:
(K0, K1, K2, K3) = (MK0 ⊕ FK0, MK1 ⊕ FK1, MK2 ⊕ FK2, MK3 ⊕ FK3)
rki = Ki+4 = Ki ⊕ T'(Ki+1 ⊕ Ki+2 ⊕ Ki+3 ⊕ CKi), i = 0, 1, ..., 31
T' is the same as T, except that after the nonlinear transformation ζ, the linear
transformation L of T is replaced by L':
L'(B) = B ⊕ (B <<< 13) ⊕ (B <<< 23)
The system parameter FK is:
FK0 = A3B1BAC6, FK1 = 56AA3350, FK2 = 677D9197, FK3 = B27022DC
Thirty-two fixed parameters CKi are used, one for each round of the key generation
step.
Below is the list of all the fixed parameters, CK0 to CK31:
00070E15, 1C232A31, 383F464D, 545B6269, 70777E85, 8C939AA1, A8AFB6BD,
C4CBD2D9, E0E7EEF5, FC030A11, 181F262D, 343B4249, 50575E65, 6C737A81,
888F969D, A4ABB2B9, C0C7CED5, DCE3EAF1, F8FF060D, 141B2229, 30373E45,
4C535A61, 686F767D, 848B9299, A0A7AEB5, BCC3CAD1, D8DFE6ED, F4FB0209,
10171E25, 2C333A41, 484F565D, 646B7279
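The building blocks of the key expansion that do not depend on the S-box can be sketched as VHDL functions. The package below is an illustration rather than the code used in the project, and all names in it are hypothetical. The function ck relies on the rule from the SM4 standard that byte j of CKi equals (4i + j) × 7 mod 256, which reproduces the list above (for example CK0 = 00070E15).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package; names are illustrative and not from the report.
package sm4_key_sketch is
  subtype word_t is std_logic_vector(31 downto 0);
  type word4_t is array (0 to 3) of word_t;

  -- System parameter FK, as listed in the text.
  constant FK : word4_t :=
    (x"A3B1BAC6", x"56AA3350", x"677D9197", x"B27022DC");

  function init_k(mk : word4_t) return word4_t;  -- (K0..K3) = MK xor FK
  function l_prime(b : word_t) return word_t;    -- L'(B)
  function ck(i : natural) return word_t;        -- fixed parameter CKi
end package sm4_key_sketch;

package body sm4_key_sketch is

  function init_k(mk : word4_t) return word4_t is
    variable k : word4_t;
  begin
    for j in 0 to 3 loop
      k(j) := mk(j) xor FK(j);
    end loop;
    return k;
  end function;

  -- L'(B) = B xor (B <<< 13) xor (B <<< 23)
  function l_prime(b : word_t) return word_t is
    variable u : unsigned(31 downto 0) := unsigned(b);
  begin
    return std_logic_vector(u xor rotate_left(u, 13) xor rotate_left(u, 23));
  end function;

  -- Byte j of CKi (j = 0 is the most significant byte) is (4i + j) * 7 mod 256.
  function ck(i : natural) return word_t is
    variable w : word_t;
  begin
    for j in 0 to 3 loop
      w(31 - 8*j downto 24 - 8*j) :=
        std_logic_vector(to_unsigned(((4*i + j) * 7) mod 256, 8));
    end loop;
    return w;
  end function;

end package body sm4_key_sketch;
```

In hardware, a ROM holding the 32 constants is the simpler choice; the function form is mainly convenient for filling such a ROM or for checking it in a testbench.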
2.3.1.2 THE ENCRYPTION PROCESS
In 32 rounds, a 128-bit plaintext is transformed into ciphertext; each round requires
a 32-bit round key, rki.
After 32 iterations of the round function F, the encryption algorithm applies the
reverse transformation R.
Figure 2.4 Diagram representing the Encryption / Decryption process of the SM4
cipher
2.3.4 ROUND FUNCTION, F
Suppose the input to the round function is (X0, X1, X2, X3) and the round key is rk0.
Then F can be expressed as
X4 = F(X0, X1, X2, X3, rk0) = X0 ⊕ T(X1 ⊕ X2 ⊕ X3 ⊕ rk0)
After the 32 rounds, we obtain X32, X33, X34, X35, in that order. These then go
through the final reverse transformation R, which changes the order to
X35, X34, X33, X32, and this becomes the ciphertext.
Each round function is made up of linear and nonlinear transformations.
Figure 2.5 Diagram representing a round function
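The word-level structure of a round can be sketched in VHDL without the S-box table by treating the output of T as a value supplied from outside. The package below is such a sketch under that assumption; the package, type and function names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package showing only the word shuffling of a round.
package sm4_round_sketch is
  subtype word_t is std_logic_vector(31 downto 0);
  type state_t is array (0 to 3) of word_t;

  -- Input to T for one round: X1 xor X2 xor X3 xor rk
  function t_input(x : state_t; rk : word_t) return word_t;
  -- New state after one round, given t_out = T(t_input(x, rk))
  function next_state(x : state_t; t_out : word_t) return state_t;
  -- Reverse transformation R applied after the last round
  function reverse_r(x : state_t) return state_t;
end package sm4_round_sketch;

package body sm4_round_sketch is

  function t_input(x : state_t; rk : word_t) return word_t is
  begin
    return x(1) xor x(2) xor x(3) xor rk;
  end function;

  -- (X0, X1, X2, X3) becomes (X1, X2, X3, X0 xor T(...))
  function next_state(x : state_t; t_out : word_t) return state_t is
  begin
    return state_t'(x(1), x(2), x(3), x(0) xor t_out);
  end function;

  -- (X32, X33, X34, X35) becomes (X35, X34, X33, X32)
  function reverse_r(x : state_t) return state_t is
  begin
    return state_t'(x(3), x(2), x(1), x(0));
  end function;

end package body sm4_round_sketch;
```

Passing T's output in as a parameter keeps the sketch short; in a full design the value would come from the S-box and linear transformation logic.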
2.3.5 PERMUTATION, T
T is an invertible transformation composed of a nonlinear transformation (ζ)
followed by a linear transformation (L). Mathematically,
T(B) = L(ζ(B))
2.3.6 LINEAR TRANSFORMATION, L
L is obtained by the bitwise XOR of left-circular-shifted versions of its input.
Its input is the output of the nonlinear transformation.
Output, C = L(S) = S ⊕ (S <<< 2) ⊕ (S <<< 10) ⊕ (S <<< 18) ⊕ (S <<< 24)
Figure 2.5 Diagram representing the linear transformation process
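Because L consists only of rotations and XORs, it maps directly onto rotate_left from ieee.numeric_std. The following is a minimal sketch; the package and function names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package holding the linear transformation L.
package sm4_linear_sketch is
  function l_transform(s : std_logic_vector(31 downto 0))
    return std_logic_vector;
end package sm4_linear_sketch;

package body sm4_linear_sketch is

  -- L(S) = S xor (S <<< 2) xor (S <<< 10) xor (S <<< 18) xor (S <<< 24)
  function l_transform(s : std_logic_vector(31 downto 0))
    return std_logic_vector is
    variable u : unsigned(31 downto 0) := unsigned(s);
  begin
    return std_logic_vector(
      u xor rotate_left(u, 2)  xor rotate_left(u, 10)
        xor rotate_left(u, 18) xor rotate_left(u, 24));
  end function;

end package body sm4_linear_sketch;
```

As a hand check, S = 00000001 sets bits 0, 2, 10, 18 and 24 of the result, so L(00000001) = 01040405. In hardware the rotations cost no logic, since they are only rewiring.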
NONLINEAR TRANSFORMATION, ζ
The nonlinear transformation is made up of four substitution boxes, sbox0, sbox1,
sbox2 and sbox3, in parallel. A 32-bit input B to ζ is divided into four bytes, i.e.
(b0, b1, b2, b3), with b0, b1, b2 and b3 fed as inputs to sbox0, sbox1, sbox2 and
sbox3 respectively.
After the required substitution has happened in the S-boxes, the output is fed as
input to the linear transformation (L).
SUBSTITUTION BOXES (S-boxes)
All four S-boxes contain the same values and are used to substitute the values of
their inputs. For an input byte xy (in hexadecimal format), the substitution value is
found in row x and column y of the table. For example, when the input is 'EF', the
output is the value in row E and column F, i.e.
Sbox(EF) = 84
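The row and column lookup and the four parallel S-boxes can be sketched as follows. The 256-entry table is not reproduced in this report, so the sketch takes it as a parameter; the package, type and function names are hypothetical, and b0 is assumed to be the most significant byte of the 32-bit word.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package; the S-box contents must be supplied by the caller.
package sm4_sbox_sketch is
  subtype byte_t is std_logic_vector(7 downto 0);
  type sbox_table_t is array (0 to 255) of byte_t;

  function sbox_lookup(table : sbox_table_t; x : byte_t) return byte_t;
  function zeta(table : sbox_table_t; b : std_logic_vector(31 downto 0))
    return std_logic_vector;
end package sm4_sbox_sketch;

package body sm4_sbox_sketch is

  -- The high nibble of x selects the row, the low nibble the column.
  function sbox_lookup(table : sbox_table_t; x : byte_t) return byte_t is
    constant row : natural := to_integer(unsigned(x(7 downto 4)));
    constant col : natural := to_integer(unsigned(x(3 downto 0)));
  begin
    return table(16 * row + col);
  end function;

  -- Four identical S-boxes applied in parallel to b0..b3
  -- (b0 assumed to be bits 31 downto 24).
  function zeta(table : sbox_table_t; b : std_logic_vector(31 downto 0))
    return std_logic_vector is
    variable r : std_logic_vector(31 downto 0);
  begin
    r(31 downto 24) := sbox_lookup(table, b(31 downto 24));
    r(23 downto 16) := sbox_lookup(table, b(23 downto 16));
    r(15 downto 8)  := sbox_lookup(table, b(15 downto 8));
    r(7 downto 0)   := sbox_lookup(table, b(7 downto 0));
    return r;
  end function;

end package body sm4_sbox_sketch;
```

With the table stored row by row, the input 'EF' addresses entry 16 × 14 + 15 = 239, which holds the value 84 quoted above.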
2.3.1.4 DECRYPTION
The structure of the decryption transformation is the same as that of the encryption
transformation.
The only difference is the order of the round keys. In decryption, the round keys are
used in the order (rk31, rk30, ..., rk0).
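Since the round keys are stored in a RAM, one way to obtain the reversed order is to read address 31 - i instead of i during decryption. The sketch below illustrates this idea only; the entity and port names are hypothetical, and it is not necessarily how the design in this report selects its keys.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical address generator for a 32-entry round key RAM.
entity rk_address is
  port (
    round   : in  unsigned(4 downto 0);          -- round counter i, 0 to 31
    decrypt : in  std_logic;                     -- '1' selects decryption
    address : out std_logic_vector(4 downto 0)   -- RAM address of the round key
  );
end rk_address;

architecture dataflow of rk_address is
begin
  -- Encryption reads rk0, rk1, ..., rk31; decryption reads rk31, ..., rk0.
  address <= std_logic_vector(31 - round) when decrypt = '1'
             else std_logic_vector(round);
end dataflow;
```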
IMPLEMENTATION
The initial step was to create the datapath. The datapath for this cipher was
divided into three subsections:
Key Generation Block (generates all 32 round keys)
    Contains a 128-bit register, a 32 by 32 ROM, four multiplexers and four
    substitution boxes
Counter with RAM Block (the counter iterates through the 32 rounds and the RAM
stores the generated keys)
    Contains a flip-flop, a 6-bit register and a subtractor used as an up-counter,
    two multiplexers and a RAM
Encryption / Decryption Block (encrypts plaintext or decrypts ciphertext)
The datapaths, the finite state machines, the control tables and the top-level model
derived for the implementation are shown below.
Figure 2.6 Diagram representing the Round Key Generation Datapath
Figure 2.7 Diagram representing the Counter with RAM Datapath
Figure 2.8 Diagram representing the Encryption / Decryption Datapath
Two finite state machines were derived, one for the Key Generation Datapath and the
other for the Encryption / Decryption Datapath.
Below are the state diagrams and the control tables.
Figure 2.9 Left: State diagram for the Key Generation process
Right: State diagram for the Encryption / Decryption process
State  Instructions            Rst  Write_en  MKload  ILoad  Muxsel1  Muxsel2
000    I = 0                   1    0         0       0      0        x
001    Input masterkey         0    0         1       0      0        x
010    T', I++, load register  0    1         1       1      1        1
011    T', I++, load register  0    1         1       1      1        0
100    Terminating state       0    0         0       0      x        x
Figure 3.0 Control table for Key Generation
State  Instructions            Rst  Write_en  Xload  ILoad  Muxsel3  outputEnable
000    I = 0                   1    0         0      0      0        0
001    Input masterkey         0    0         1      0      0        0
010    T', I++, load register  0    0         1      1      1        0
011    T', I++, load register  0    0         1      1      1        0
100    Terminating state       0    0         0      0      x        x
Figure 3.1 Control table for Encryption / Decryption process
SYNTHESIS, SIMULATION AND TESTING
Xilinx ISE was used for the synthesis, simulation, and testing. The results were
compared to standard test vectors to demonstrate that the SM4 design was
operational. Appendix B contains the results of the simulations and tests.
CONCLUSION
It was found that the construction of datapaths and control units underlies
day-to-day computing. A hardware description language was used to build the
implementations, as demonstrated in the implementation of the algorithm and the
dedicated processor, including their datapaths and control units. In conclusion,
this internship offered an overview of cryptography and of some aspects of the
hardware implementation of algorithms. It was therefore a means to awaken
curiosity and interest in this and related topics.
REFERENCE
[1] W. Stallings, Cryptography and Network Security: Principles and Practice,
8th ed. Hoboken, New Jersey: Pearson Education, Inc., 2019.
[2] B. Mealy and F. Tappero, Free Range VHDL.
[3] E. O. Hwang, Digital Logic and Microprocessor Design with VHDL. La Sierra
University, Riverside.
[4] Spartan-3 FPGA Starter Kit Board User Guide, UG130 (v1.2), June 20, 2008.
APPENDIX A
3.0 IMPLEMENTATION OF COMPONENTS IN VHDL
3.1 A 32 x 32 RAM
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity ram_RK is
    port(
        address:  in  std_logic_vector(4 downto 0);
        input:    in  std_logic_vector(31 downto 0);
        write_en: in  std_logic;
        clk:      in  std_logic;
        output:   out std_logic_vector(31 downto 0)
    );
end ram_RK;

architecture Behavioral of ram_RK is
    type ram is array(0 to 31) of std_logic_vector(31 downto 0);
    signal rk : ram;
begin
    -- Synchronous write: store the input word on the rising clock edge
    process(clk)
    begin
        if rising_edge(clk) then
            if write_en = '1' then
                rk(to_integer(unsigned(address))) <= input;
            end if;
        end if;
    end process;

    -- Asynchronous read of the addressed word
    output <= rk(to_integer(unsigned(address)));
end Behavioral;
3.2 A 3 x 8 DECODER
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity decoder is
    port(
        A, B, C: in  std_logic;
        y:       out std_logic_vector(7 downto 0)
    );
end decoder;

architecture Behavioral of decoder is
    signal abc : std_logic_vector(2 downto 0);
begin
    abc <= A & B & C;

    process(A, B, C, abc)
    begin
        case (abc) is
            when "000" => y <= "10000000";
            when "001" => y <= "01000000";
            when "010" => y <= "00100000";
            when "011" => y <= "00010000";
            when "100" => y <= "00001000";
            when "101" => y <= "00000100";
            when "110" => y <= "00000010";
            when "111" => y <= "00000001";
            when others => y <= "00000000";
        end case;
    end process;
end Behavioral;
3.3 A 32-BIT REGISTER
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity reg32 is
    Port (
        clk   : in  STD_LOGIC;
        load  : in  STD_LOGIC;
        input : in  std_logic_vector(31 downto 0);
        output: out std_logic_vector(31 downto 0)
    );
end reg32;

architecture Behavioral of reg32 is
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if load = '1' then
                output <= input;
            end if;
        end if;
    end process;
end Behavioral;
3.4 A COMPARATOR
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity comparator is
    port(
        A, B: in std_logic_vector(3 downto 0);
        AgtB, AltB, AeqB: out std_logic
    );
end comparator;

architecture Behavioral of comparator is
begin
    -- The inputs are compared as unsigned numbers; numeric_std defines
    -- <, > and = for the unsigned type.
    process(A, B)
    begin
        if unsigned(A) < unsigned(B) then
            AltB <= '1';
        else
            AltB <= '0';
        end if;
        if unsigned(A) > unsigned(B) then
            AgtB <= '1';
        else
            AgtB <= '0';
        end if;
        if unsigned(A) = unsigned(B) then
            AeqB <= '1';
        else
            AeqB <= '0';
        end if;
    end process;
end Behavioral;
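3.5 A COUNTER USING FSM
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity counterinfsm is
    port(
        clk, rst: in std_logic;
        hsync, rollover: out std_logic
    );
end counterinfsm;

architecture Behavioral of counterinfsm is
    type state_type is (h_a, h_b);
    signal ps, ns: state_type;
    signal hcntcurr, hcntnext: unsigned(7 downto 0);
begin
    -- State and counter registers with asynchronous reset
    sync: process(clk, rst)
    begin
        if(rst = '1') then
            ps <= h_a;
            hcntcurr <= "00000000";
        elsif(rising_edge(clk)) then
            ps <= ns;
            hcntcurr <= hcntnext;
        end if;
    end process;

    -- Next-state and output logic. The source listing breaks off after the
    -- h_a branch; the defaults below are assumptions that keep the state,
    -- hold the counter and drive the outputs low.
    comb: process(ps, hcntcurr)
    begin
        ns <= ps;
        hcntnext <= hcntcurr;
        hsync <= '0';
        rollover <= '0';
        case ps is
            when h_a => hcntnext <= hcntcurr;
            when h_b => null;  -- branch missing from the source listing
        end case;
    end process;
end Behavioral;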
3.6 A TRI-STATE BUFFER
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity tristatebuffer is
    Port (
        data:   in  unsigned(7 downto 0);
        enable: in  STD_LOGIC;
        output: out unsigned(7 downto 0)
    );
end tristatebuffer;

architecture Behavioral of tristatebuffer is
begin
    process(data, enable)
    begin
        if enable = '1' then
            output <= data;
        else
            -- Drive high impedance on every bit when disabled
            output <= (others => 'Z');
        end if;
    end process;
end Behavioral;
APPENDIX B
SIMULATION RESULTS AND SCHEMATICS FOR DEDICATED MICROPROCESSOR
Figure a: Top level representation of the dedicated microprocessor
Figure b: The expanded view of the dedicated microprocessor
Figure c: Control unit relating to the dedicated microprocessor
Figure d: Datapath for the dedicated microprocessor
Figure e: simulation result of the dedicated microprocessor using the Xilinx software
SIMULATION RESULTS AND SCHEMATICS FOR THE SM4 CIPHER
Figure f: Top level representation of the SM4 cipher
Figure g: Expanded view of the SM4 cipher
Figure h: Datapath for the SM4 cipher
Figure i: Encryption / decryption control unit
Figure j: Key generation control unit
Figure k: Encryption simulation using the Xilinx software
Figure l: Decryption simulation using the Xilinx software