Haskell and Cryptography

Jay-Evan J. Tevis, Ph.D.
Department of Computer Science
Western Illinois University
www.wiu.edu/users/jjt107

Overview
• Imperative and Functional Programming Paradigms
• Implementation of the CAST-128 Encryption Algorithm in C and Haskell
• Results from Student Projects

Imperative and Functional Programming Paradigms

Programming Languages
[Diagram. Language elements: Lexical Structure, Syntax, Semantics, Data Types, Expressions, Procedures. Paradigms: Imperative, Object-oriented, Functional, Logic. Languages: C/C++, Ada, Java, Scheme, Haskell, Prolog. Other label: Operational.]

Major Features of Imperative Programming
• Assignment
• Control loops
• Environment state
• Array indexing
• Memory addresses
• Functions and procedures
• Side effects

Major Features of Functional Programming
• Functions with parameters and results
• Binding of parameters
• Recursive calls
• Referential transparency
• Functions as first-class values
• Higher-order functions
• Pattern matching

Other Features of Functional Programming
• Strong typing (both static and dynamic)
• Numbers of arbitrary length
• Polymorphic data typing
• Normal-order evaluation

Brief Summary of Haskell
• Based on lambda calculus, which was invented by Alonzo Church
• Named after the mathematician Haskell Curry
• Purely functional programming language
• Started out in the 1980s as a research language
• Stable version of the language is Haskell 98
• Source code is usually translated by an interpreter but can also be compiled
• Main website: www.haskell.org

Implementations of the CAST-128 Encryption Algorithm in C and Haskell

Description of CAST-128
• Invented by Carlisle Adams and defined in RFC 2144 (May 1997)
• Belongs to the class of encryption algorithms known as Feistel ciphers
• Uses a 12- or 16-round approach with a block size of 64 bits and a key size of up to 128 bits
• Creates 32 subkeys from the initial 128-bit key
• Uses eight substitution boxes with 256 entries each
• Uses three different permutation functions, selected by the round number

Software Development Environment
• 1.3 GHz CPU, 256 MB RAM, Windows XP
• C programming
  – jGRASP IDE
  – Borland C compiler
  – GNU C compiler
• Haskell programming
  – HUGS interpreter
  – Glasgow Haskell compiler

Software Development Process
• Requirements analysis: based on RFC 2144
• Software architecture (high-level design)
  – Four modules arranged in a call-and-return architecture
• Incremental development for each module (done in tandem for both C and Haskell)
  – Low-level design of functions
  – High-level and low-level implementation
  – Black-box, white-box, and integration testing

Software Architecture
- Read text file
- Write text file
- Convert chars to block
- Convert block to chars
- Extract a byte from a word
- Create subkey schedule
- Encrypt a block
- Decrypt a block
- Permute a 32-bit word (three functions)
- Rotate a word to the left
- Define eight arrays for the substitution boxes

Software Testing Strategy
• Used the same input test values for the corresponding functions in C and Haskell, and compared the returned results
• Compared the 32 subkeys created by the C and Haskell implementations of the key schedule
• Used the test vectors supplied in RFC 2144
  – 128-bit key, 64-bit plaintext block, 64-bit ciphertext block
• Encrypted and decrypted documents of various byte lengths
  – Text files contained either C source code or HTML
  – Decrypted files were tested for byte errors by compiling them or viewing them in a browser

Building the output file (in C)

    void buildEncryptedOutputFile(FILE *inputFilePtr, FILE *outputFilePtr)
    {
        // Declarations were removed to fit the code on the slide
        createSubkeySchedule(key128Bits, subKeySchedule);
        while (!EOF_Found)
        {
            // Read the next 8-byte block; the union exposes it as two 32-bit words
            EOF_Found = readBlockOfCharacters(inputFilePtr, block.array);
            plainBlock[0] = block.pair.left;
            plainBlock[1] = block.pair.right;
            encryptBlock(subKeySchedule, plainBlock, cipherBlock);
            // Store the ciphertext words back in the union and write them out as bytes
            block.pair.left = cipherBlock[0];
            block.pair.right = cipherBlock[1];
            for (i = 0; i < MAX_BYTES; i++)
                fputc(block.array[i], outputFilePtr);
        } // End while
    } // buildEncryptedOutputFile

Building the output file (in Haskell)

    buildOutputFile :: Handle -> Handle -> [Char] -> IO ()
    buildOutputFile inFile outFile direction
      | (direction == "-e") = do
          buildEncryptedOutputFile inFile outFile
                                   (createSubKeySchedule test128BitKey)

    buildEncryptedOutputFile :: Handle -> Handle -> KeyScheduleType -> IO ()
    buildEncryptedOutputFile inFile outFile keySchedule = do
        (inString, endOfFile) <- readUpTo8Characters inFile
        block <- charsToBlock inString
        outString <- blockToChars (encryptBlock keySchedule block)
        hPutStr outFile outString
        -- A null in the last position means the input ran out during this block
        if (inString!!7 == '\0')
          then putStr "End of file detected\n"
          else buildEncryptedOutputFile inFile outFile keySchedule

Execution Space and Average Time

Average time in seconds for each test; the size of the data file in bytes is given in parentheses.

    Implementation    Executable size  Test A  Test B   Test C    Test D    Test E
                      (bytes)          (390)   (4,880)  (15,987)  (73,095)  (104,348)
    Borland C         67,584           0.03    0.04     0.04      0.10      0.12
    GNU C             37,794           0.05    0.05     0.06      0.12      0.14
    GHC (Optimized)   1,656,819        0.04    0.07     0.12      0.41      0.57
    GHC (Normal)      1,887,680        0.08    0.09     0.47      2.03      2.83
    HUGS Interpreter  N/A              0.51    2.30     7.30      33.10     46.90

Implementation Lessons Learned (1)
• Overall, the C implementation of the basic CAST-128 algorithm was straightforward because RFC 2144 contains C pseudocode
• For mathematical expressions, the ease or difficulty of implementation was the same in C and Haskell (except for the need to code the rotate-left function in C)
• The driver software in both C and Haskell is not tied to the CAST-128 algorithm; consequently, it can be reused when implementing other ciphers with a 128-bit key and a 64-bit block
• Use of the array data structure in Haskell greatly simplified the creation of the subkey schedule
• Pattern matching in Haskell removed the need for condition checking on many of the function input values and permitted a different algorithmic approach to subkey creation than the one used in C

Implementation Lessons Learned (2)
• Exception handling in Haskell simplified checking for end-of-file when reading the text file
• Strong typing in Haskell ensured that the function interfaces were correct
• Recursion in Haskell made the iterative algorithms much easier and quicker to code, debug, and understand
• The C implementation required the use of unsigned numeric types (unsigned long and unsigned char); otherwise, the key building and the encryption/decryption do not work properly
• Both C and Haskell automatically perform modulo 2^32 arithmetic on the types unsigned long (in C, 32 bits wide on the compilers used) and Word32 (in Haskell)
• Source code size for executable statements is nearly the same in C and Haskell; what makes the C code larger is the data declarations

Results from Student Projects

Comparison of Implementations in Haskell and Java/C++
• RSA encryption algorithm
• Quicksort using temporary files
• HTML to ASCII file converter
• Regular expression evaluation
• C/C++ source code formatter
• String tree-searching algorithm
• Solving a Sudoku puzzle

Advantages of using Haskell instead of Java/C++
• The algorithms coded in Haskell are much shorter than those in Java/C++
• Haskell functions are easier to test individually because of their inherent referential transparency
• Haskell syntax "forces" a programmer to write more modular code
• It is simpler to locate and correct errors in a Haskell program
• Haskell code was shorter, more elegant, and easier to test
• Haskell detects and helps prevent type errors
• Haskell lists can be used in lieu of arrays in Java/C++
• Recursive algorithms are straightforward to implement in Haskell

Disadvantages of using Haskell instead of Java/C++
• Haskell abstractions do not consider the limits of the computer's architecture
• Haskell I/O is more difficult to program with than that of Java/C++
• Haskell could not do exponentiation of larger numbers
• Java/C++ loops are easier to follow than Haskell's recursion
• Java/C++ code is easier to read and understand than Haskell code

Conclusion

Summary
• It is time for functional programming to prove its worth
• It is possible to build a complete encryption program in Haskell
• We need to move from the von Neumann paradigm to a mathematically based paradigm: a functional paradigm
• Functional programming may hold the key to building software that is more secure

Major References
• Adams, C. RFC 2144: The CAST-128 Encryption Algorithm. May 1997. www.ietf.org.
• Bird, R. Introduction to Functional Programming using Haskell, 2nd Edition. Prentice Hall, 1998.
• Graff, M. and van Wyk, K. Secure Coding. O'Reilly, 2003.
• Howard, M. and LeBlanc, D. Writing Secure Code. Microsoft Press, 2002.
• Hoyte, D. Haskell Implementation of Blowfish. www.hcsw.org. 2002.
• Hudak, P. The Haskell School of Expression. Cambridge University Press, 2000.
• Peyton Jones, S. and Hughes, J. Report on the Programming Language Haskell 98. Journal of Functional Programming, Jan 2003.
• Schildt, H. C: The Complete Reference. McGraw-Hill, 2000.
• Viega, J. and McGraw, G. Building Secure Software. Addison-Wesley, 2002.
• Viega, J. and Messier, M. Secure Programming Cookbook. O'Reilly, 2003.

Questions?
www.wiu.edu/users/jjt107

Backup Slides

Creation of key schedule (in C)

    void createSubkeySchedule(unsigned long key128Bits[], unsigned long subKeys[])
    {
        // 128-bit key separated into four 32-bit words
        unsigned long x0x1x2x3 = key128Bits[0];
        unsigned long x4x5x6x7 = key128Bits[1];
        unsigned long x8x9xAxB = key128Bits[2];
        unsigned long xCxDxExF = key128Bits[3];
        unsigned long z0z1z2z3, z4z5z6z7, z8z9zAzB, zCzDzEzF; // Temp 128-bit key
        unsigned long x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,xA,xB,xC,xD,xE,xF;
        unsigned long z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,zA,zB,zC,zD,zE,zF;

(Shows the function signature and the variable declarations)

Creation of key schedule (in C)

    z0z1z2z3 = x0x1x2x3 ^ S5[xD] ^ S6[xF] ^ S7[xC] ^ S8[xE] ^ S7[x8];
    extractBytes(z0z1z2z3, &z0,&z1,&z2,&z3);
    z4z5z6z7 = x8x9xAxB ^ S5[z0] ^ S6[z2] ^ S7[z1] ^ S8[z3] ^ S8[xA];
    extractBytes(z4z5z6z7, &z4,&z5,&z6,&z7);
    z8z9zAzB = xCxDxExF ^ S5[z7] ^ S6[z6] ^ S7[z5] ^ S8[z4] ^ S5[x9];
    extractBytes(z8z9zAzB, &z8,&z9,&zA,&zB);
    zCzDzEzF = x4x5x6x7 ^ S5[zA] ^ S6[z9] ^ S7[zB] ^ S8[z8] ^ S6[xB];
    extractBytes(zCzDzEzF, &zC,&zD,&zE,&zF);

    subKeys[1] = S5[z8] ^ S6[z9] ^ S7[z7] ^ S8[z6] ^ S5[z2];
    subKeys[2] = S5[zA] ^ S6[zB] ^ S7[z5] ^ S8[z4] ^ S6[z6];
    subKeys[3] = S5[zC] ^ S6[zD] ^ S7[z3] ^ S8[z2] ^ S7[z9];
    subKeys[4] = S5[zE] ^ S6[zF] ^ S7[z1] ^ S8[z0] ^ S8[zC];

(Shows a portion of the code to create four keys)

Creation of key schedule (in Haskell)

    createSubKeySchedule mainKey = array (1,32)
        ( k1k2k3k4 ++ k5k6k7k8 ++ k9k10k11k12 ++ k13k14k15k16 ++
          k17k18k19k20 ++ k21k22k23k24 ++ k25k26k27k28 ++ k29k30k31k32 )
      where
        (xzA, k1k2k3k4)     = createK1K2K3K4 (mainKey, [0x0,0x0,0x0,0x0])
        (xzB, k5k6k7k8)     = createK5K6K7K8 xzA
        (xzC, k9k10k11k12)  = createK9K10K11K12 xzB
        (xzD, k13k14k15k16) = createK13K14K15K16 xzC
        (xzE, k17k18k19k20) = createK17K18K19K20 xzD
        (xzF, k21k22k23k24) = createK21K22K23K24 xzE
        (xzG, k25k26k27k28) = createK25K26K27K28 xzF
        (xzH, k29k30k31k32) = createK29K30K31K32 xzG

(Shows how the complete key schedule is brought together)

Creation of key schedule (in Haskell)

    createK1K2K3K4 :: XZKeysPairType -> (XZKeysPairType, [(Word32,Word32)])
    createK1K2K3K4 ((xAlpha:xBeta:xGamma:xOmega:[]),(zAlpha:zBeta:zGamma:zOmega:[])) =
        ( ((xAlpha:xBeta:xGamma:xOmega:[]),(nzAlpha:nzBeta:nzGamma:nzOmega:[])),
          (1,k1):(2,k2):(3,k3):(4,k4):[])
      where
        nzAlpha = xAlpha `xor` (sBox5!(xOmega#2)) `xor` (sBox6!(xOmega#4)) `xor`
                  (sBox7!(xOmega#1)) `xor` (sBox8!(xOmega#3)) `xor` (sBox7!(xGamma#1))
        nzBeta  = xGamma `xor` (sBox5!(nzAlpha#1)) `xor` (sBox6!(nzAlpha#3)) `xor`
                  (sBox7!(nzAlpha#2)) `xor` (sBox8!(nzAlpha#4)) `xor` (sBox8!(xGamma#3))
        k1      = (sBox5!(nzGamma#1)) `xor` (sBox6!(nzGamma#2)) `xor`
                  (sBox7!(nzBeta#4)) `xor` (sBox8!(nzBeta#3)) `xor` (sBox5!(nzAlpha#3))
        -- nzGamma, nzOmega, k2, k3, and k4 are not shown on the slide

(Shows how each subkey is built)

Read up to 8 characters (in C)

    int readBlockOfCharacters(FILE *inFilePtr, unsigned char buffer[])
    {
        int i = 0, j, symbol, EOF_Detected = FALSE;
        while (i < MAX_BYTES)
        {
            symbol = fgetc(inFilePtr);
            if (symbol == EOF)
            {
                EOF_Detected = TRUE;
                break;
            }
            buffer[i] = symbol;
            i++;
        } // End while
        // Pad a short final block with zero bytes
        for (j = i; j < MAX_BYTES; j++)
            buffer[j] = 0;
        return EOF_Detected;
    } // End readBlockOfCharacters

(Some code was removed to save space)

Read up to 8 characters (in Haskell)

    readUpTo8Characters :: Handle -> IO ([Char], Bool)
    readUpTo8Characters inputFile = do
        (c1,b1) <- getCharOrNull inputFile; (c2,b2) <- getCharOrNull inputFile
        (c3,b3) <- getCharOrNull inputFile; (c4,b4) <- getCharOrNull inputFile
        (c5,b5) <- getCharOrNull inputFile; (c6,b6) <- getCharOrNull inputFile
        (c7,b7) <- getCharOrNull inputFile; (c8,b8) <- getCharOrNull inputFile
        return ( (c1:c2:c3:c4:c5:c6:c7:c8:[]), b8)
      where
        getCharOrNull :: Handle -> IO (Char,Bool)
        getCharOrNull inputFile = do
            catch (do symbol <- hGetChar inputFile
                      return (symbol, False) )
                  (\error -> do return ('\0', True) )

(Shows exception handling for end-of-file in Haskell)

8 chars to a 64-bit word (in C)

    typedef struct
    {
        unsigned long left;
        unsigned long right;
    } wordPairType;

    typedef unsigned char byteBlockType[MAX_BYTES];

    typedef union
    {
        wordPairType pair;
        byteBlockType array;
    } blockType;

Conversion is done implicitly in both directions in C by means of a union data structure

8 chars to a 64-bit word (in Haskell)

    charsToBlock :: [Char] -> IO [Word32]
    charsToBlock (b1:b2:b3:b4:b5:b6:b7:b8:[]) = return [wordLeft, wordRight]
      where
        wordLeft  = ((intToWord32 (fromEnum b1)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b2)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b3)) `shiftL` 8) `xor`
                    (intToWord32 (fromEnum b4))
        wordRight = ((intToWord32 (fromEnum b5)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b6)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b7)) `shiftL` 8) `xor`
                    (intToWord32 (fromEnum b8))

64-bit word to 8 chars (in Haskell)

    blockToChars :: [Word32] -> IO [Char]
    blockToChars [wordLeft, wordRight] = return [c1,c2,c3,c4,c5,c6,c7,c8]
      where
        c1 = toEnum (word32ToInt (wordLeft#1))
        c2 = toEnum (word32ToInt (wordLeft#2))
        c3 = toEnum (word32ToInt (wordLeft#3))
        c4 = toEnum (word32ToInt (wordLeft#4))
        c5 = toEnum (word32ToInt (wordRight#1))
        c6 = toEnum (word32ToInt (wordRight#2))
        c7 = toEnum (word32ToInt (wordRight#3))
        c8 = toEnum (word32ToInt (wordRight#4))

Encryption Algorithm (in C)

    newLeft = plainBlock[0];
    newRight = plainBlock[1];
    for (roundCount = 1; roundCount <= MAX_ROUNDS; roundCount++)
    {
        oldLeft = newLeft;
        oldRight = newRight;
        newLeft = oldRight;
        // The round number selects one of the three permutation functions
        if ( (roundCount % 3) == 0)
            newRight = oldLeft ^ type3Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
        else if ( (roundCount % 3) == 1)
            newRight = oldLeft ^ type1Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
        else if ( (roundCount % 3) == 2)
            newRight = oldLeft ^ type2Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
    } // End for
    // Swap the halves after the final round
    cipherBlock[0] = newRight;
    cipherBlock[1] = newLeft;

Encryption Algorithm (in Haskell)

    encryptBlock :: KeyScheduleType -> [Word32] -> [Word32]
    encryptBlock keySchedule plainList = auxEncryptBlock keySchedule plainList 1

    auxEncryptBlock :: KeyScheduleType -> [Word32] -> Word32 -> [Word32]
    -- swap left and right
    auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) 17 = (rightHalf : leftHalf : [])
    auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) counter =
        auxEncryptBlock keySchedule (rightHalf : newRightHalf : []) (counter + 1)
      where
        newRightHalf = leftHalf `xor` (fChoice rightHalf (keySchedule!counter)
                                               (keySchedule!(counter + 16)) counter)

    fChoice :: Word32 -> Word32 -> Word32 -> Word32 -> Word32
    fChoice halfBlock maskingKey rotatingKey roundNbr
      | (roundNbr `mod` 3) == 1 = type1Function halfBlock maskingKey rotatingKey
      | (roundNbr `mod` 3) == 2 = type2Function halfBlock maskingKey rotatingKey
      | otherwise               = type3Function halfBlock maskingKey rotatingKey

Permute Function (in C and Haskell)

    unsigned long type1Function(unsigned long halfBlock, unsigned long maskingKey,
                                unsigned long rotatingKey)
    {
        unsigned long Iword, Ia, Ib, Ic, Id, ls5bits, result;
        // Keep the least significant 5 bits (relies on a 32-bit unsigned long)
        ls5bits = (rotatingKey << 27) >> 27;
        Iword = rotateLeft( (maskingKey + halfBlock), ls5bits);
        extractBytes(Iword, &Ia,&Ib,&Ic,&Id);
        result = ((S1[Ia] ^ S2[Ib]) - S3[Ic]) + S4[Id];
        return result;
    }

    type1Function :: Word32 -> Word32 -> Word32 -> Word32
    type1Function halfBlock maskingKey rotatingKey =
        ((sBox1!(word#1) `xor` sBox2!(word#2)) - sBox3!(word#3)) + sBox4!(word#4)
      where
        word    = ( (maskingKey + halfBlock) `rotateL` (word32ToInt ls5bits))
        ls5bits = ((rotatingKey `shiftL` 27) `shiftR` 27)

Extract bytes (in C and Haskell)

    void extractBytes( unsigned long word, unsigned long *byte1, unsigned long *byte2,
                       unsigned long *byte3, unsigned long *byte4)
    {
        // Byte 1 is the most significant byte (relies on a 32-bit unsigned long)
        *byte1 = word >> 24;
        *byte2 = (word << 8) >> 24;
        *byte3 = (word << 16) >> 24;
        *byte4 = (word << 24) >> 24;
    }

    (#) :: Word32 -> Int -> Word32
    (#) word position
      | position == 1 = word `shiftR` 24
      | position == 2 = (word `shiftL` 8) `shiftR` 24
      | position == 3 = (word `shiftL` 16) `shiftR` 24
      | position == 4 = (word `shiftL` 24) `shiftR` 24
      | otherwise     = error "Error with extraction operator (#): position invalid"

Rotate bits to the left (in C)

    unsigned long rotateLeft(unsigned long word, unsigned long nbrBitPositions)
    {
        unsigned long result, i;
        result = word;
        for (i = 1; i <= nbrBitPositions; i++)
        {
            // Check if the most significant bit is a one
            if (result & MSB_SET_ONLY_NUMBER) // Bitwise AND the result with 2**31
                result = (result << 1) + 1;
            else
                result = (result << 1);
        }
        return result;
    } // End rotateLeft

(Note: rotateL is a library-supplied function in Haskell)

End of Backup Slides
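
The C versions of rotateLeft and extractBytes on the backup slides only behave as intended when unsigned long is exactly 32 bits wide, since shifts such as (word << 8) >> 24 depend on the high bits falling off the end. The following is a minimal sketch of how the same two operations might be written with the fixed-width uint32_t type, assuming a C99 compiler that provides <stdint.h>. The names rotl32 and byte_at are hypothetical and do not appear in the original code; this is an illustration, not the implementation used in the talk.

```c
#include <stdint.h>

/* Hypothetical helper (not from the slides): rotate a 32-bit word left by
   n bit positions without a loop. Masking both shift counts keeps them in
   the range 0..31, so a shift by 32 never occurs. */
static uint32_t rotl32(uint32_t word, unsigned n)
{
    n &= 31u;
    return (word << n) | (word >> ((32u - n) & 31u));
}

/* Hypothetical helper (not from the slides): return byte 1..4 of a word,
   where byte 1 is the most significant byte, following the numbering of
   extractBytes and the (#) operator. The mask replaces the left-then-right
   shift pair, so the result does not depend on the width of the type.
   The caller must pass a position between 1 and 4. */
static uint32_t byte_at(uint32_t word, int position)
{
    return (word >> (8 * (4 - position))) & 0xFFu;
}
```

For example, rotl32(0x80000001u, 1) evaluates to 0x00000003u, and byte_at(0x12345678u, 2) evaluates to 0x34u. Because uint32_t arithmetic wraps modulo 2^32 on every conforming platform, the additions and subtractions in the permute functions would keep the behaviour that the slides obtain from a 32-bit unsigned long.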