International Journal of Engineering & Technology Sciences Volume 03, Issue 01, Pages 13-21, 2015 ISSN: 2289-4152 A Hardware Implementation of Twine Block Cipher High Level Synthesis Approach Ali Nemati a,b, Arash Ahmadi a,b,*, Vahab Al-din Makki a,b a Department of Electrical Engineering, College of Basic Science, Kermanshah Science and Research Branch, Islamic Azad University, Kermanshah, Iran). b Department of Electrical Engineering, College of Basic Science, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran). * Corresponding author. Tel.: +989373919597; E-mail address: aahmadi@razi.ac.ir Abstract Keywords: Lightweight Cryptography, Twine Algorithm, High Level Synthesis, Field Programmable Gate Array (FPGA). Accepted: 23 January2015 Radio frequency identification (RFID) are becoming a part of our everyday life with a wide range of applications such as labeling products and supply chain management and etc. At the same time, security and privacy issues remain as an important problem, thus, with the large deployment of low resource devices such as RFID systems and sensor nodes, increasing need to provide security and privacy among such devices, has arisen. Accordingly, concept of lightweight cryptography is introduced. Encryption algorithms are classified into various types such as block and stream ciphers. Lightweight block cipher plays a major role as a building block for security protocols. In 2011 Tomoyasu Suzaki et al in NEC Corporation proposed a new versatile block cipher named as Twine, which are suitable for extremely constrained environments and multiple platforms. In this paper, we simulate and synthesize the Twine block cipher family. Synthesis estimation results of the Twine cryptography algorithm on a FPGA are presented. © Academic Research Online Publisher. All rights reserved. 1. Introduction Radio Frequency Identification (RFID) technology universal technology, privacy and security are the important problems to be solved [6]. is one of the most important technologies in the field During last decade, lightweight cryptography has of computing that is present everywhere. Although provided security solutions for highly constrained this technology gives many advantages than other devices and embedded systems. A large number of identification systems, there are also related security research works have already been done in this risks that can’t be easily passed. RFID security is a domain, which have specifically achieved optimized significant research subject today, as proved by the block ciphers suitable for lightweight cryptographic great number of research papers published in this applications, for instance Katan, Piccolo and several domain in last decade. As RFID technology is a others [1, 3]. As RFID devices become broadly Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 utilized in different applications, it is vital to provide -128 to denote the corresponding version. The global various levels of security, associating to the structure of Twine is a variant of Type-2 GFS with application. 16 4-bit sub-blocks. Given a 64-bit plaintext, P(64), and a round key, RK(32×36), the cipher produces the Different applications have different security requirements, and in any case to achieve the desired level of security, protocols should be developed. Security has a direct relation with key size, meaning that with the increase of key-length, the level of security grows. Each block cipher is adaptively to initial cryptographic principles and we expect that any lightweight protocol can be based on a block cipher with the suitable data block and key size. Although some standard cryptographic algorithms such as Advance Encryption Standard (AES) have been proposed for lightweight applications, choosing AES is not a good option for highly constrained environments [12, 7]. Only few number cipher text C(64). Round key RK(32×36) is derived from the secret key, K(n) with n ∈ {80, 128}, using the key schedule. A round function of Twine consists of a nonlinear layer using 4-bit S-boxes and a diffusion layer, which permutes the 16 blocks. This round function is iterated for 36 times for both key lengths, where the diffusion layer of the last round is omitted. The encryption process can be written as Algorithm 2.1. The S-box, S, is a 4-bit permutation defined as Table 1. The permutation of block indexes, π : {0, ... , 15} → {0, ... , 15}, where j-th sub-block (for j = 0, ... , 15) is mapped to π[j]-th sub-block, is depicted at Table 2. Round function of Twine algorithm is shown in Figure 1 [2]. of the block ciphers can be implemented on a small circuit with minimal power and area requirements Algorithm 2.1: Twine .Enc (P(64), RK(32×36), C(64)) and We organize this paper as follows: In Section 2 we X1(64) P RK1(32) || … || RK35(36) RK(32×36) for i 1 to 35 Xi0(4) || Xi1(4) || … || Xi14(4) || Xi15(4) Xi(64) RKi0(4) || RKi1(4) || … || RKi6(4) || RKi7(4) RKi(32) for j 0 to 7 do do Xi2j+1 S(Xi2j ⊕ RKij) ⊕ Xi2j+1 for h 0 to 15 do Xi+1π[h] Xih Xi+1 Xi+10 || Xi+11 || … || Xi+114 || Xi+115 for j 0 to 7 do X362j+1 S(X362j ⊕ RK36j) ⊕ X362j+1 C X36 [2] describe the specification of Twine briefly. Section ⊕ denotes bitwise exclusive-OR. For binary strings, 3 presents the synthesis of the Twine block cipher x and y, x∥y denotes the concatenation of x and y. family. In Sections 4 we present the synthesis Let |x| denote the bit length of a binary string x. If |x| estimation results of Twine algorithm on FPGA, = m, x is also written as x(m) to emphasize its bit respectively. Section 5 concludes the paper. length. The key schedule produces RK(32×36) from some others can be embedded on microcontrollers with limited flash, SRAM, and/or power availability [7]. Obviously, any effort to optimize the performance of a parameter on one side will have a negative impact on the performance of other parameters on the other side meaning that the trade-off law is in place. 2. Algorithm Description the secret key, K(n), where n ∈ {80, 128}. The pseudo code of key schedule for 80-bit key is in Algorithm Twine is a 64-bit block cipher with two supported 2.2. Round constant, CONi(6) = CONiH(3) ∥ CONi L(3), key lengths, 80 and 128 bits. If the key length is is defined as 2i in GF(26) with primitive polynomial needed to be specified, we write Twine -80 or Twine Z6+Z+1. The exact values are listed at Table 3 [2]. 14 | P a g e Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 Fig.1: Round Function of Twine [2] Table 1: S-box (hexadecimal) [2] X S(X) 0 C 1 0 2 F 3 A 4 2 5 B 6 9 7 5 8 8 9 3 A D B 7 C 1 D E E 6 F 4 Table 2: Block Shuffle and -1[2] j [j] 1[j] 0 5 1 1 0 2 2 1 11 3 4 6 4 7 3 5 12 0 6 3 9 7 8 4 8 9 10 11 12 13 14 15 13 6 9 2 15 10 11 14 7 10 13 14 5 8 15 12 Table 3: Round Constant [2] i CONi i CONi 1 01 19 0F 2 02 20 1E 3 04 2 3C 4 08 22 3B 5 10 23 35 6 20 24 29 7 03 25 11 8 06 26 22 9 0C 27 07 10 18 28 0E 11 30 29 1C 12 23 30 38 13 05 31 33 14 0A 32 25 15 14 33 09 16 28 34 12 17 13 35 24 18 26 36 0B Table 4: Test Vector (hexadecimal) [2] key length key plaintext ciphertext 80-bit 00112233 44556677 8899 01234567 78ABCDEF 7C1F0F80 B1DF9C28 Algorithm 2.2: Twine key schedule – 80 (K (80), RK (32×36)) WK (80) K WK 0(4) || WK 1(4) || … || WK 18(4) || WK 19(4) WK RK10(4) WK1, RK11(4) WK3 , RK12(4) WK4 , RK13(4) WK6, RK14(4) WK13 , RK15(4) WK14 , RK16(4) WK15 , RK17(4) WK16 RK1(32) RK10 || RK11 || … || RK16 || RK17 for i 2 to 36 WK1 WK1 ⊕ S(WK0) , WK4 WK4 ⊕ S(WK16) WK7 WK7 ⊕ 0||CONi-1H , WK19 WK19 ⊕ 0||CONi-1L do Tmp0 WK0 , Tmp1 WK1 , Tmp2 WK2 , Tmp3 WK3 for j 0 to 3 WK j*4 WKj*4+4 , WK j*4+1 WKj*4+5 do WK j*4+2 WKj*4+6 , WK j*4+3 WKj*4+7 WK16 tmp1 , WK17 tmp2 , WK18 tmp3 , WK19 tmp0 RKi0 WK1 , RKi1 WK3 , RKi2 WK4 , RKi3 WK6 , RKi4 WK13, RKi5 WK14 , RKi6 WK15 , RKi7 WK16 RKi(32) RKi0(4) || RKi1(4) || … || RKi6(4) || RKi7(4) RK(32×36) RK1(32) || RK2(32) || … || RK35(36) || RK36(32) [2] 15 | P a g e 3. High Level Synthesis HLS produces register-transfer level designs from behavioural characteristics in an automatic procedure. A high-level synthesis system which realizes design automation for such a data path needs the following two steps: 1) A high-level synthesis system should prepare a draftsman with more than one possible design options for a given behavioural specification. Thus a designer can choose the optimum design between the options based on several criteria. 2) In order to attain the necessity(1), a high level synthesis systems gives more exact information on performance, area and power wastage for each design options. Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 Table 5: Truth Table I/O No. 0 Input (abcd) Output (wxyz) 0000 1 0001 2 0010 3 0011 4 0100 5 0101 6 0110 7 0111 8 1000 9 1001 A 1010 B 1011 C 1100 D 1101 E 1110 F 1111 1100 0000 1111 1010 0010 1011 1001 0101 1000 0011 1101 0111 0001 1110 0110 0100 Fig.2: Simulation Results High-level synthesis is similar to software compilation transposed to the hardware scope. HLS two sub-sections: 1. Synthesis of S-box layer, 2. Synthesis of data processing part. In view of the tools automate the process that generates RTL architecture from an algorithmic behavior of the original characteristic [8]. S-box table presented in Section 2, Truth table can be used for inputs and outputs are presented in Table 5. To start the simulation and synthesis steps, we have chosen Twine-80 algorithm. According to section 2, it is explicit that data block length = 64 bit, key length = 80 bit and rounds = 36. Also, an example of this type of algorithm is considered in Table 4. At first, we’ve written VHDL code of Twine-80 algorithm for above example and simulated it in As well as it is explicit that according to the truth table proposed in table 5, we can write some relations between inputs and outputs based on the sum of products (minterms) as follows: W = a'b'c'd' + a'b'cd' + a'b'cd + a'bc'd + a'bcd' + ab'c'd' + ab'cd' + abc'd = a'b'd' + ab'd' + bc'd + a'b'cd + a'bcd' = b'd' + bc'd + a'c (b⊕d) (1) Modelsim software. Simulation results are shown in Figure 2. According to this figure, it is clear that X = a'b'c'd' + a'b'cd' + a'bcd + ab'cd' + ab'cd + abc'd + abcd' + abcd = a'b'd' + ab'c + abc + abc'd + a'bcd = ac + a'b'd' clock, plaintext and key are considered as input + bd (a⊕c) (2) parameters and ciphertext is considered as an output parameter. The clock for this simulation is selected 100 KHz. Then due to the structure of the Twine encryption algorithm, synthesis can be divided into Y = a'b'cd' + a'b'cd + a'bc'd' + a'bc'd + ab'c'd + ab'cd + abc'd + abcd' = a'b'c + a'bc' + ab'd + ab (c⊕d) (3) Z = a'b'cd' + a'bc'd + a'bcd' + a'bcd + ab'c'd + ab'cd' + ab'cd + abc'd' = ab'd + b'cd' + a'bc + bc' (a⊕d) (4) 16 | P a g e Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 Regarding the logical relations obtained, data flow step, we are designed hardware required to run the graph can be drawn for S-box. Allocation of S-box that depicted in Figure 4. This hardware operational and storage units and interconnections includes 17Muxes, 3ALUs and 3 standard Registers are a subset of the allocation process. Choosing each and type of the library from operation units that used to sequencer for this hardware is shown in Figure 5. perform the desired operation is process of resource This sequencer includes two AND gates, one allocation. Memory units employed for storing data counter and a very simple controller. According to values. It should be considered that in allocation and the number of control steps, counter determines binding steps, a specific hardware can be used to which outputs of the controller are ‘0’and which are perform several operations, so we've used a concept ‘1’. Sequencer gives necessary instruction execution called resource sharing. Data Flow Graph (DFG) for time of the hardware. Sequencer unit parameters for the S-box is drawn and efficient state for scheduling, this hardware are summarized in Table 6. hardware design is allocation and binding is shown in Figure 3. In next a NOT b ALU 1 NOT c ALU 2 NOT ALU 3 CS 1 b d b a' XOR ALU 3 d c' b' NOT AND ALU 1 c' ALU 2 CS 2 a' REG c AND d ALU 3 b d' AND ALU 1 d AND ALU 2 CS 3 b' AND ALU 2 OR ALU 2 a' d' AND ALU 1 REG REG b' AND ALU 3 AND ALU 3 CS 4 a d' c XOR ALU 1 AND ALU 1 CS 5 a OR ALU 2 AND ALU 3 OR ALU 3 c REG CS 6 a b AND a W ALU 2 b' AND ALU 1 CS 7 c REG d d XOR ALU 2 AND ALU 1 OR ALU 3 CS 8 a' AND ALU 2 REG b AND ALU 1 AND ALU 1 b'd' X c AND ALU 3 CS 9 ab'd c' OR ALU 2 OR ALU 3 CS 10 a'b' REG a c AND ALU 1 REG d XOR ALU 3 AND ALU 3 a'b c AND ALU 2 REG CS 11 bc' OR ALU 1 OR CS 12 OR ALU 2 OR ALU 3 CS 13 Y Z Fig.3: Scheduling, Allocation and Binding for S-box 17 | P a g e ALU 2 efficient. Designed Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 a c' bc' b d b' S0 S1 0 1 2 3 MUX 1 a'b ab' a'b' S0 S1 0 1 2 3 MUX 3 S0 S1 b' S0 S1 0 1 MUX5 b c 0 1 2 3 MUX 2 a'b'c S0 d' a' a'c b'd' c a4 a'b' b' d a7 S0 S1 S0 0 1 2 3 S1 MUX 6 c d c (c d)a6 0 1 2 3 MUX 7 a6 a9 a S0 S1 0 1 2 3 MUX 12 a3 ac ab'd S0 S1 0 1 2 3 MUX 8 a'b'c+a5 a2 S0 S1 0 1 2 3 MUX 9 a'bc a'b a8 S0 S1 S0 0 1 2 3 S1 MUX 14 a d a a'bc+a2 S0 S1 0 1 2 3 MUX 10 S0 S1 0 1 2 3 MUX 11 S0 S1 0 1 2 3 MUX 16 a5 0 1 2 3 MUX 4 d a'ca1 a1 a'ca1+b'd' ALU 1 0 1 2 3 MUX 17 a d REG 3 a6 a7 a8 a9 MUX 1 MUX 2 MUX 3 MUX 4 MUX 5 MUX 6 MUX 7 MUX 8 MUX 9 Fig.4: Hardware Design Clock Ready 1 Ready 2 Ready 3 C0 C1 C2 C3 C0 Input C1 Counter Reset C2 Controller C3 MUX 10 MUX 11 MUX 12 MUX 13 MUX 14 MUX 15 MUX 16 MUX 17 Fig.5: Sequencer Unit 18 | P a g e Start 1 Start 2 Start 3 REG 1-IN REG 1-OUT REG 2-IN REG 2-OUT REG 3-IN REG 3-OUT Z (a d)bc' b'd'c+ab'd b'd'c X (a c)a7 (a c)a7+ac a'b'd' a'b' a5 a'c a4 0 1 2 3 MUX 15 bc' b'd'c (a d)bc' bd ab'd (c d)a6+a9 ab REG 2 a3 d' c' Y a'bc+a2 a2 a'bc (c d)a6+a9 a1 (c d)a6 W REG 1 a'bc' a c ALU 3 c d ab a'ca1 a'ca1+b'd' bd b d b' ab' a'b'c a'bc' a'b a'b'c+a5 ab'd ac a c b'd' bc'd a' bc' bc'd d' 0 1 2 3 MUX 13 (a c)a7 b'd' (a c)a7+ac S0 S1 ALU 2 a'b'd' b d ab'd+b'd'c S0 S1 a' Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 Table 6: Sequencer Unit Parameters MUX 4 (S1S0) 1 0 0 0 0 0 0 0011 1 1 1 1 0 0 0 0 0 0100 1 1 1 0 1 1 0 1 0 0101 1 1 1 0 0 0 0 0 0 0110 1 1 1 1 0 0 1 0 1 0111 1 1 1 0 0 0 0 0 0 1000 1 1 1 0 1 1 0 0 0 1001 1 1 1 0 0 0 1 1 0 1010 1 1 1 0 0 0 0 0 0 1011 1 1 1 1 0 1 0 1 0 1100 1 1 1 0 1 0 1 0 0 1101 0 1 1 0 0 0 0 0 1 × 0 × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 × × × RESET 1110 Xi2j RKij XOR MUX 17 (S1S0) MUX 3 (S1S0) 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 × × × × MUX 16 (S1S0) MUX 2 (S1S0) 1 × × 0 0 0 1 1 0 1 1 1 1 × × 0 1 × × 0 0 1 1 × × × × MUX 15 (S1S0) MUX 1 (S1S0) 0010 0 0 0 1 1 0 1 1 0 0 0 0 0 0 × × 0 1 × × × × × × × × MUX 14 (S1S0) REG 3 OUT 0 MUX 13 (S1S0) REG 3 IN 0 MUX 12 (S1S0) REG 2 OUT 0 MUX 11 (S1S0) REG 2 IN 0 MUX 10 (S1S0) REG 1 OUT 0 MUX 9 (S1S0) REG 1 IN 0 MUX 8 (S1S0) Start 3 1 MUX 7 (S1S0) Start 2 1 MUX 6 (S1S0) Start 1 1 MUX 5 (S0) Control Steps (C3C2C1C0) 0001 0 0 0 0 0 0 0 1 1 0 1 1 0 0 × × × × × × × × × × × × × × 0 0 0 0 0 1 1 0 1 1 × × 0 0 × × × × × × × × × × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 × × × × × × 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 × × × × × × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 × × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 × × 0 0 0 1 1 0 1 1 × × × × 0 0 × × × × × × × × × × 0 0 0 1 0 1 1 0 1 1 × × × × × × × × 0 0 × × × × 0 0 × × 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 × × × × × × × × 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 × × 0 0 × × × × 0 0 × × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 × × 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 i 1 + ALU 1 CS 1 ALU 2 ''100100'' S-box COM < CS 2-14 Xi2j+1 XOR CS 15 Xi2j+1 ALU 1 Yes No REG (i<36) REG (i=36) End of Encryption Fig.6: Scheduling, Allocation and Binding for Data Processing 19 | P a g e Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 Table 7: Implementation Result Resource Available Utilization (%) Register 122880 346 1% LUT 122880 419 1% Slice 30720 131 1% IO 960 64 6% BUFG 32 7 21% In next step, for synthesize data processing part, flexibility of the algorithm used in various according to the before step and the keys obtained in applications and environments, data and key length each round, the efficient state of scheduling for data can be selected with different sizes. In Twine-80, processing is depicted in Figure 6, that in each due to having too few resources as shown in our control step, we have needed just 3 ALUs and 1 implementation results, it is very suitable for Comparator. Encryption function is done in 15 lightweight application and embedded systems. control steps, so that whole encryption operation is performed in 36×15 = 540 CS. A noteworthy point References in Figure 6 is that comparator can be used in one of [1] CD Cannière, O Dunkelman, and M Kneževi´c. the steps 2 to 14 according to view of designer. KATAN and KTANTAN - A Family of Small and Efficient 4. Implementation Hardware-Oriented Block Ciphers. Springer-Verlag, [In CHES 2009, Lecture Notes in The Xilinx synthesis technology synthesizes VHDL Computer Science] 2009; 5747: 272–88. code to make Xilinx files with .ngr and .ngc [2] T Suzaki, K Minematsu, S Morioka, and E extensions after the functional simulation [18].We Kobayashi. TWINE: A Lightweight, Versatile implemented the Twine-80 algorithm using Xilinx Block ISE development tools on FPGA model Virtex-5 Shimonumabe, Nakahara-Ku, Kawasaki, Japan, XC5VFX200T. The results and resource utilization 2012. is hardware [3] K Shibutani, T Isobe, H Hiwatari, A Mitsuda, T implementation illustrates that the algorithm can be Akishita, and T Shirai. Piccolo: An Ultra- implemented with very small area requirements Lightweight Blockcipher. Sony Corporation 1-7-1 (practically 131 number of slices and 419 number of Konan, Minato-ku, Tokyo, Japan 2011; 108-0075. Look Up Tables). [4] M Zwolinski. Digital System Design with presented in Table 7. This Cipher. NEC Corporation, 1753 VHDL. 2ndEdition 2004. 5. Conclusion [5] GD Micheli. Synthesis and optimization of In this paper, we have described, synthesized and digital circuits 1994. implemented a lightweight block cipher Twine, [6] E Vahedi, VWS Wong, IF Blake. An Overview which has 64-bit block and 80 or 128- bit key. of Cryptography. Chapter 5, University of British Twine's algorithm is a lightweight flexible algorithm Columbia, Canada 2014. for using in RFID systems and wireless sensor [7] R Beaulieu, D Shors, J Smith, ST Clark, B networks (WSNs) and deployment capabilities in Weeks, L Wingers. The simon and speck of many other applications as well. Due to the lightweight block ciphers. National Security Agency 20 | P a g e Ali Nemati et al. / International Journal of Engineering and Technology Science (IJETS) 3 (1): 13-21, 2015 9800 Savage Road, Fort Meade, MD 20755, USA Cryptography Implementations. IEEE Design & 2013. Test of Computersm 2007; 24(6): 522-533. [8] P Coussy, A Morawiec. High Level Synthesis [20] W Zhang, Z Bao, D Lin, V Rijmen, B Yang. from Algorithm to Digital Circuit. Springer 2008. RECTANGLE: A Bit-slice Ultra-Lightweight Block [9] D Hong, J Sung, S Hong, J Lim, S Lee, B Koo, Cipher Suitable for Multiple Platforms. State Key C Lee, D Chang, J Lee, K Jeong, H Kim, J Kim, S Laboratory of Information Security, Institute of Chee. HIGHT: A new block cipher suitable for low- Information Engineering, Chinese Academy of resource device. CHES’06, LNCS 4249, Springer- Sciences, Beijing, P. R. China and KU Leuven, Verlag 2006; 46–59. ESAT/COSIC and iMinds; Kasteelpark Arenberg [10] G Leander, C Paar, A Poschmann, K Schramm. 10, Leuven, Belgium 2014. New lightweight DES variants. FSE’07 LNCS 4953, [21] M Kumar, SK Pal and A Panigrahi. FeW: A Springer-Verlag 2007; 196–210. Lightweight Block Cipher. Scientific Analysis [11] A Bogdanov, L Knudsen, G Leander, C Paar, A Group, DRDO, Delhi, INDIA, Department of Poschmann, M. J. B. Robshaw, Y Seurin, C Mathematics, University of Delhi, INDIA 2014. Vikkelsoe. PRESENT: An ultra-lightweight block cipher. CHES’07, LNCS 4727, Springer-Verlag 2007; 450–466. [12] FIPS. Advanced Encryption Standard (AES). Federal Information Processing Standards Publication 197. [13] FIPS. Data Encryption Standard. Federal Information Processing Standards Publication 46. [14] M Hell, T Johansson, A Maximov, W Meier. The Grain family of stream ciphers. New Stream Cipher Designs, The eSTREAM Finalists, LNCS 4986, Springer 2008; 179–190. [15] CD Canniere, B Preneel. Trivium. New Stream Cipher Designs, The eSTREAM Finalists, LNCS 4986, Springer 2008; 244–266. [16] T Isobe. A single-key attack on the full GOST block cipher. FSE’11, LNCS 6733, 2011; 290–305. [17] AJ Menezes, PC Van Oorschot and SA Vanstone. Handbook of Applied Cryptography. CRC Press 2001; 202-203. [18] Xilinx, Inc. Virtex-5 User Guide, available at www.xilinx.com. [19] T Eisenbarth, S Kumar, C Paar, A Poschmann and L Uhsadel. 21 | P a g e A Survey of Lightweight