PowerPoint Presentation (ISAS 2006, April 21

Haskell and Cryptography
Jay-Evan J. Tevis, Ph.D.
Department of Computer Science
Western Illinois University
www.wiu.edu/users/jjt107
Overview
• Imperative and Functional Programming Paradigms
• Implementation of the CAST-128 Encryption Algorithm in C and Haskell
• Results from Student Projects
Imperative and Functional
Programming Paradigms
[Diagram: Programming Languages. Labels shown: data types, expressions, procedures, syntax, lexical structure, semantics, operational. Paradigms shown: imperative, object-oriented, functional, logic. Languages shown: C/C++, Ada, Java, Scheme, Haskell, Prolog.]
Major Features of Imperative Programming
• Assignment
• Control loops
• Environment state
• Array indexing
• Memory addresses
• Functions and procedures
• Side effects
Major Features of Functional Programming
• Functions with parameters and results
• Binding of parameters
• Recursive calls
• Referential transparency
• Functions as first-class values
• Higher-order functions
• Pattern matching
Other Features of Functional Programming
• Strong typing (both static and dynamic)
• Arbitrary length of numbers
• Polymorphic data typing
• Normal order evaluation
Brief Summary of Haskell
• Based on lambda calculus, which was invented by Alonzo Church
• Named after the mathematician Haskell Curry
• Purely functional programming language
• Started out in the 1980s as a research language
• Stable version of the language is Haskell 98
• Source code is usually translated by an interpreter but can also be compiled
• Main website: www.haskell.org
Implementations of the CAST-128
Encryption Algorithm
in C and Haskell
Description of CAST-128
• Invented by Carlisle Adams and defined in RFC 2144, May 1997
• Belongs to the class of encryption algorithms known as Feistel ciphers
• Uses a block size of 64 bits and a key size of 40 to 128 bits, with 12 rounds for keys of 80 bits or less and 16 rounds for longer keys
• Creates 32 subkeys from the initial 128-bit key
• Uses eight substitution boxes with 256 entries each
• Uses three different permutation functions based on the round number
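The Feistel structure named above can be sketched in C as follows. This is a minimal illustration and not CAST-128 itself: the names toy_round, feistel_encrypt, and feistel_decrypt are hypothetical, and the round function is a placeholder rather than one of the RFC 2144 functions. It shows why decryption is the same loop run with the subkeys in reverse order.

```c
#include <stdint.h>

/* Hypothetical placeholder round function; NOT the CAST-128 round function. */
static uint32_t toy_round(uint32_t half, uint32_t subkey)
{
    uint32_t x = half + subkey;      /* wraps modulo 2^32 */
    return (x << 7) | (x >> 25);     /* rotate left by 7 bits */
}

/* Apply `rounds` Feistel rounds to a 64-bit block held as two 32-bit halves. */
void feistel_encrypt(uint32_t block[2], const uint32_t subkeys[], int rounds)
{
    uint32_t left = block[0], right = block[1];
    for (int i = 0; i < rounds; i++) {
        uint32_t next = left ^ toy_round(right, subkeys[i]);
        left = right;
        right = next;
    }
    block[0] = right;                /* final swap of the two halves */
    block[1] = left;
}

/* Same loop, subkeys taken in reverse order. */
void feistel_decrypt(uint32_t block[2], const uint32_t subkeys[], int rounds)
{
    uint32_t left = block[0], right = block[1];
    for (int i = rounds - 1; i >= 0; i--) {
        uint32_t next = left ^ toy_round(right, subkeys[i]);
        left = right;
        right = next;
    }
    block[0] = right;
    block[1] = left;
}
```

Calling feistel_encrypt and then feistel_decrypt with the same subkey array restores the original block, whatever round function is used.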
Software Development Environment
• 1.3 GHz, 256 MB RAM, Windows XP
• C programming
– jGRASP IDE
– Borland C compiler
– GNU C compiler
• Haskell programming
– HUGS interpreter
– Glasgow Haskell compiler
Software Development Process
• Requirements analysis: Based on RFC 2144
• Software architecture (High-level design)
– Four modules arranged in a call-and-return architecture
• Incremental development for each module (done in tandem for both C and Haskell)
– Low-level design of functions
– High-level and low-level implementation
– Black-box, white-box, and integration testing
Software Architecture
- Read text file
- Write text file
- Convert chars to block
- Convert block to chars
- Extract a byte from a word
- Create subkey schedule
- Encrypt a block
- Decrypt a block
- Permute a 32-bit word (three functions)
- Rotate a word to the left
- Define eight arrays for the substitution boxes
Software Testing Strategy
• Used the same input test values for the similar functions in C and Haskell; compared returned results
• Compared the 32 subkeys created in both the C and Haskell implementations of the key schedule
• Used the test vectors supplied in RFC 2144
– 128-bit key, 64-bit plaintext block, 64-bit ciphertext block
• Encrypted/decrypted documents of various byte lengths
– Text files contained either C source code or HTML
– Decrypted files were tested for byte errors by compiling or browser viewing
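A test-vector check of the kind described above might look like the following sketch. The names encrypt_fn, passes_rfc2144_vector, and identity_cipher are hypothetical and do not come from the presentation. The key, plaintext, and ciphertext words are transcribed from the 128-bit test vector in Appendix B.1 of RFC 2144 and should be verified against the RFC text before use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical signature for a single-block encryption routine. */
typedef void (*encrypt_fn)(const uint32_t key[4],
                           const uint32_t plain[2],
                           uint32_t cipher[2]);

/* 128-bit test vector, transcribed from RFC 2144 Appendix B.1 (verify before use). */
static const uint32_t RFC_KEY[4]    = {0x01234567u, 0x12345678u,
                                       0x23456789u, 0x3456789Au};
static const uint32_t RFC_PLAIN[2]  = {0x01234567u, 0x89ABCDEFu};
static const uint32_t RFC_CIPHER[2] = {0x238B4FE5u, 0x847E44B2u};

/* Returns 1 if the routine reproduces the expected ciphertext, 0 otherwise. */
int passes_rfc2144_vector(encrypt_fn encrypt)
{
    uint32_t out[2] = {0, 0};
    encrypt(RFC_KEY, RFC_PLAIN, out);
    return memcmp(out, RFC_CIPHER, sizeof out) == 0;
}

/* Deliberately wrong stand-in that copies the plaintext; shows a failing check. */
void identity_cipher(const uint32_t key[4], const uint32_t plain[2],
                     uint32_t cipher[2])
{
    (void)key;
    cipher[0] = plain[0];
    cipher[1] = plain[1];
}
```

Passing identity_cipher makes the check return 0; a correct CAST-128 block encryption routine adapted to this signature would be expected to make it return 1.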
Building the output file (in C)
void buildEncryptedOutputFile(FILE *inputFilePtr, FILE *outputFilePtr)
{
    // Declarations were removed to fit the code on the slide
    createSubkeySchedule(key128Bits, subKeySchedule);
    while (!EOF_Found)
    {
        // Read up to 8 bytes into the union as a byte array
        EOF_Found = readBlockOfCharacters(inputFilePtr, block.array);
        // View the same 8 bytes as two 32-bit words
        plainBlock[0] = block.pair.left;
        plainBlock[1] = block.pair.right;
        encryptBlock(subKeySchedule, plainBlock, cipherBlock);
        block.pair.left = cipherBlock[0];
        block.pair.right = cipherBlock[1];
        // Write the encrypted block byte by byte
        for (i = 0; i < MAX_BYTES; i++)
            fputc(block.array[i], outputFilePtr);
    } // End while
} // End buildEncryptedOutputFile
Building the output file (in Haskell)
buildOutputFile :: Handle -> Handle -> [Char] -> IO ()
buildOutputFile inFile outFile direction
  | (direction == "-e") =
      do buildEncryptedOutputFile inFile outFile
                                  (createSubKeySchedule test128BitKey)

buildEncryptedOutputFile :: Handle -> Handle -> KeyScheduleType -> IO ()
buildEncryptedOutputFile inFile outFile keySchedule =
  do (inString, endOfFile) <- readUpTo8Characters inFile
     block <- charsToBlock inString
     outString <- blockToChars (encryptBlock keySchedule block)
     hPutStr outFile outString
     if (inString!!7 == '\0') then putStr "End of file detected\n"
       else buildEncryptedOutputFile inFile outFile keySchedule
Execution Space and Average Time
(Average times in seconds; the second header row gives the number of bytes in each data file)

Implementation     Executable file   Test A   Test B   Test C   Test D   Test E
                   size (bytes)      390      4,880    15,987   73,095   104,348
Borland C          67,584            0.03     0.04     0.04     0.10     0.12
GNU C              37,794            0.05     0.05     0.06     0.12     0.14
GHC (Optimized)    1,656,819         0.04     0.07     0.12     0.41     0.57
GHC (Normal)       1,887,680         0.08     0.09     0.47     2.03     2.83
HUGS Interpreter   N/A               0.51     2.30     7.30     33.10    46.90
Implementation Lessons Learned (1)
• Overall, the C implementation of the basic CAST-128 algorithm was straightforward because RFC 2144 contains C pseudocode
• For any mathematical expressions, the ease or difficulty of implementation in C or Haskell was the same (except for the need to code the rotate left function in C)
• The driver software in both C and Haskell is not tied to the CAST-128 algorithm; consequently, it can be used when implementing other ciphers with a 128-bit key and a 64-bit block
• Use of the array data structure in Haskell greatly simplified the creation of the subkey schedule
• Pattern matching in Haskell removed the need for condition checking on many of the function input values and permitted a different algorithmic approach for subkey creation than the one used in C
Implementation Lessons Learned (2)
• Exception handling in Haskell simplified the end-of-file check when reading the text file
• Strong typing in Haskell ensured that the function interfaces were correct
• Recursion in Haskell made the iterative algorithms much easier and quicker to code, debug, and understand
• The C implementation required the use of unsigned numeric types (unsigned long and unsigned char); otherwise, the key building and the encryption/decryption do not work properly
• Both C and Haskell automatically perform modulo 2^32 arithmetic on unsigned long (in C, where that type is 32 bits wide, as on the test platform) and Word32 (in Haskell)
• Source code size for executable statements is nearly the same between C and Haskell; what makes the C code larger are the data declarations
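The point about unsigned modulo arithmetic can be illustrated with a short C sketch. The function names here are hypothetical. The assumption worth stating is that unsigned long is only guaranteed to be at least 32 bits wide, so on platforms where it is 64 bits wide the wraparound at 2^32 has to be made explicit, either by using uint32_t or by masking.

```c
#include <stdint.h>

/* uint32_t is exactly 32 bits wide, so results wrap modulo 2^32. */
uint32_t add_mod32(uint32_t a, uint32_t b) { return (uint32_t)(a + b); }
uint32_t sub_mod32(uint32_t a, uint32_t b) { return (uint32_t)(a - b); }

/* unsigned long may be 64 bits wide; the mask keeps only the low 32 bits. */
unsigned long add_mod32_ulong(unsigned long a, unsigned long b)
{
    return (a + b) & 0xFFFFFFFFUL;
}
```

For example, add_mod32(0xFFFFFFFF, 1) yields 0 and sub_mod32(0, 1) yields 0xFFFFFFFF, which matches the behaviour of Word32 in Haskell.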
Results from Student Projects
Comparison of Implementations in Haskell and Java/C++
• RSA encryption algorithm
• Quicksort using temporary files
• HTML to ASCII file converter
• Regular expression evaluation
• C/C++ source code formatter
• String tree-searching algorithm
• Solving a Sudoku puzzle
Advantages of using Haskell instead of Java/C++
• The algorithms coded in Haskell are much shorter than those in Java/C++
• Haskell functions are easier to test individually because of their inherent referential transparency
• Haskell syntax “forces” a programmer to write more modular code
• It is simpler to locate and correct errors in a Haskell program
• Haskell code was shorter, more elegant, and easier to test
• Haskell detects and helps prevent type errors
• Haskell lists can be used in lieu of arrays in Java/C++
• Recursive algorithms are straightforward to implement in Haskell
Disadvantages of using Haskell instead of Java/C++
• Haskell abstractions do not consider the limits of the computer’s architecture
• Haskell I/O is more difficult to program with than that of Java/C++
• Haskell could not do exponentiation of larger numbers
• Java/C++ loops are easier to follow than Haskell’s recursion
• Java/C++ code is easier to read and understand than Haskell code
Conclusion
Summary
• It is time for functional programming to prove its worth
• It is possible to build a complete encryption program in Haskell
• Need to move from the von Neumann paradigm into a mathematically based paradigm…a functional paradigm
• Functional programming may hold the key to building software that is more secure
Major References
• Adams, C. RFC 2144: The CAST-128 Encryption Algorithm. (May 1997). www.ietf.org.
• Bird, R. Introduction to Functional Programming using Haskell, 2nd Edition. Prentice Hall, 1998.
• Graff, M. and van Wyk, K. Secure Coding. O'Reilly, 2003.
• Howard, M. and LeBlanc, D. Writing Secure Code. Microsoft Press, 2002.
• Hoyte, D. Haskell Implementation of Blowfish. www.hcsw.org. 2002.
• Hudak, P. The Haskell School of Expression. Cambridge University Press, 2000.
• Jones, P. and Hughes, J. Report on the Programming Language Haskell 98. Journal of Functional Programming, Jan 2003.
• Schildt, H. C: The Complete Reference. McGraw-Hill, 2000.
• Viega, J. and McGraw, G. Building Secure Software. Addison-Wesley, 2002.
• Viega, J. and Messier, M. Secure Programming Cookbook. O'Reilly, 2003.
Questions?
www.wiu.edu/users/jjt107
Backup Slides
Creation of key schedule (in C)
void createSubkeySchedule(unsigned long key128Bits[], unsigned long subKeys[])
{
    // 128-bit key separated into four 32-bit words
    unsigned long x0x1x2x3 = key128Bits[0];
    unsigned long x4x5x6x7 = key128Bits[1];
    unsigned long x8x9xAxB = key128Bits[2];
    unsigned long xCxDxExF = key128Bits[3];
    unsigned long z0z1z2z3, z4z5z6z7, z8z9zAzB, zCzDzEzF; // Temp 128-bit key
    unsigned long x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,xA,xB,xC,xD,xE,xF;
    unsigned long z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,zA,zB,zC,zD,zE,zF;
(Shows the function signature and the variable declarations)
Creation of key schedule (in C)
    z0z1z2z3 = x0x1x2x3 ^ S5[xD] ^ S6[xF] ^ S7[xC] ^ S8[xE] ^ S7[x8];
    extractBytes(z0z1z2z3, &z0,&z1,&z2,&z3);
    z4z5z6z7 = x8x9xAxB ^ S5[z0] ^ S6[z2] ^ S7[z1] ^ S8[z3] ^ S8[xA];
    extractBytes(z4z5z6z7, &z4,&z5,&z6,&z7);
    z8z9zAzB = xCxDxExF ^ S5[z7] ^ S6[z6] ^ S7[z5] ^ S8[z4] ^ S5[x9];
    extractBytes(z8z9zAzB, &z8,&z9,&zA,&zB);
    zCzDzEzF = x4x5x6x7 ^ S5[zA] ^ S6[z9] ^ S7[zB] ^ S8[z8] ^ S6[xB];
    extractBytes(zCzDzEzF, &zC,&zD,&zE,&zF);
    subKeys[1] = S5[z8] ^ S6[z9] ^ S7[z7] ^ S8[z6] ^ S5[z2];
    subKeys[2] = S5[zA] ^ S6[zB] ^ S7[z5] ^ S8[z4] ^ S6[z6];
    subKeys[3] = S5[zC] ^ S6[zD] ^ S7[z3] ^ S8[z2] ^ S7[z9];
    subKeys[4] = S5[zE] ^ S6[zF] ^ S7[z1] ^ S8[z0] ^ S8[zC];
(Shows a portion of the code to create four keys)
Creation of key schedule (in Haskell)
createSubKeySchedule mainKey = array (1,32) ( k1k2k3k4 ++ k5k6k7k8 ++
                                 k9k10k11k12 ++ k13k14k15k16 ++ k17k18k19k20 ++
                                 k21k22k23k24 ++ k25k26k27k28 ++ k29k30k31k32 )
  where (xzA, k1k2k3k4)     = createK1K2K3K4 (mainKey, [0x0,0x0,0x0,0x0])
        (xzB, k5k6k7k8)     = createK5K6K7K8 xzA
        (xzC, k9k10k11k12)  = createK9K10K11K12 xzB
        (xzD, k13k14k15k16) = createK13K14K15K16 xzC
        (xzE, k17k18k19k20) = createK17K18K19K20 xzD
        (xzF, k21k22k23k24) = createK21K22K23K24 xzE
        (xzG, k25k26k27k28) = createK25K26K27K28 xzF
        (xzH, k29k30k31k32) = createK29K30K31K32 xzG
(Shows how the complete key schedule is brought together)
Creation of key schedule (in Haskell)
createK1K2K3K4 :: XZKeysPairType -> (XZKeysPairType, [(Word32,Word32)])
createK1K2K3K4
  ((xAlpha:xBeta:xGamma:xOmega:[]),(zAlpha:zBeta:zGamma:zOmega:[])) =
    ( ((xAlpha:xBeta:xGamma:xOmega:[]),(nzAlpha:nzBeta:nzGamma:nzOmega:[])),
      (1,k1):(2,k2):(3,k3):(4,k4):[])
  where
    nzAlpha = xAlpha `xor` (sBox5!(xOmega#2)) `xor` (sBox6!(xOmega#4)) `xor`
              (sBox7!(xOmega#1)) `xor` (sBox8!(xOmega#3)) `xor` (sBox7!(xGamma#1))
    nzBeta  = xGamma `xor` (sBox5!(nzAlpha#1)) `xor` (sBox6!(nzAlpha#3)) `xor`
              (sBox7!(nzAlpha#2)) `xor` (sBox8!(nzAlpha#4)) `xor`
              (sBox8!(xGamma#3))
    k1      = (sBox5!(nzGamma#1)) `xor` (sBox6!(nzGamma#2)) `xor`
              (sBox7!(nzBeta#4)) `xor` (sBox8!(nzBeta#3)) `xor`
              (sBox5!(nzAlpha#3))
(Shows how each subkey is built)
Read up to 8 characters (in C)
int readBlockOfCharacters(FILE *inFilePtr, unsigned char buffer[])
{
    int i = 0, j, symbol, EOF_Detected = FALSE;
    while (i < MAX_BYTES)
    {
        symbol = fgetc(inFilePtr);
        if (symbol == EOF)
            { EOF_Detected = TRUE; break; }
        buffer[i] = symbol;
        i++;
    } // End while
    // Pad any unread positions with zero bytes
    for (j = i; j < MAX_BYTES; j++) buffer[j] = 0;
    return EOF_Detected; // The caller uses this flag to end its loop
} // End readBlockOfCharacters
(Some code was removed to save space)
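As an alternative to the character-by-character loop above, the same zero-padded read can be sketched with fread. This is an illustration under assumptions, not the presenter's code: read_block and BLOCK_BYTES are hypothetical names, and the function reports the number of bytes read instead of an end-of-file flag, which lets the caller tell a short final block from an empty one.

```c
#include <stdio.h>
#include <string.h>

#define BLOCK_BYTES 8   /* hypothetical name; 8 bytes = one 64-bit block */

/* Fills buffer with up to 8 bytes from the file, zero-padding the rest.
   Returns the number of bytes actually read (0 through 8). */
size_t read_block(FILE *in, unsigned char buffer[BLOCK_BYTES])
{
    memset(buffer, 0, BLOCK_BYTES);
    return fread(buffer, 1, BLOCK_BYTES, in);
}
```

A caller can stop when the return value is 0 and treat any value below 8 as the last, padded block.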
Read up to 8 characters (in Haskell)
readUpTo8Characters :: Handle -> IO ([Char], Bool)
readUpTo8Characters inputFile =
  do (c1,b1) <- getCharOrNull inputFile; (c2,b2) <- getCharOrNull inputFile
     (c3,b3) <- getCharOrNull inputFile; (c4,b4) <- getCharOrNull inputFile
     (c5,b5) <- getCharOrNull inputFile; (c6,b6) <- getCharOrNull inputFile
     (c7,b7) <- getCharOrNull inputFile; (c8,b8) <- getCharOrNull inputFile
     return ( (c1:c2:c3:c4:c5:c6:c7:c8:[]), b8)
  where getCharOrNull :: Handle -> IO (Char,Bool)
        getCharOrNull inputFile =
          do catch (do symbol <- hGetChar inputFile
                       return (symbol, False) )
                   (\error -> do return ('\0', True) )
(Shows exception handling for end-of-file in Haskell)
8 chars to a 64-bit word (in C)
typedef struct
{
    unsigned long left;
    unsigned long right;
} wordPairType;
typedef unsigned char byteBlockType[MAX_BYTES];
typedef union
{
    wordPairType  pair;
    byteBlockType array;
} blockType;
(Conversion is done implicitly in both directions in C by means of a union data structure)
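The union conversion depends on the byte order of the machine and on unsigned long being exactly 4 bytes wide. A portable alternative, sketched here with hypothetical function names, packs the bytes explicitly in big-endian order, the same order the Haskell charsToBlock function uses.

```c
#include <stdint.h>

/* Pack 8 bytes into two 32-bit words, most significant byte first. */
void bytes_to_words(const unsigned char in[8], uint32_t out[2])
{
    for (int w = 0; w < 2; w++) {
        out[w] = ((uint32_t)in[4 * w]     << 24) |
                 ((uint32_t)in[4 * w + 1] << 16) |
                 ((uint32_t)in[4 * w + 2] <<  8) |
                  (uint32_t)in[4 * w + 3];
    }
}

/* Unpack two 32-bit words back into 8 bytes, most significant byte first. */
void words_to_bytes(const uint32_t in[2], unsigned char out[8])
{
    for (int w = 0; w < 2; w++) {
        out[4 * w]     = (unsigned char)((in[w] >> 24) & 0xFFu);
        out[4 * w + 1] = (unsigned char)((in[w] >> 16) & 0xFFu);
        out[4 * w + 2] = (unsigned char)((in[w] >>  8) & 0xFFu);
        out[4 * w + 3] = (unsigned char)( in[w]        & 0xFFu);
    }
}
```

With this packing, the bytes 01 23 45 67 89 AB CD EF become the words 0x01234567 and 0x89ABCDEF on any platform.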
8 chars to 64-bit word (in Haskell)
charsToBlock :: [Char] -> IO [Word32]
charsToBlock (b1:b2:b3:b4:b5:b6:b7:b8:[]) = return [wordLeft, wordRight]
  where wordLeft  = ((intToWord32 (fromEnum b1)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b2)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b3)) `shiftL`  8) `xor`
                     (intToWord32 (fromEnum b4))
        wordRight = ((intToWord32 (fromEnum b5)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b6)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b7)) `shiftL`  8) `xor`
                     (intToWord32 (fromEnum b8))
64-bit word to 8 chars (in Haskell)
blockToChars :: [Word32] -> IO [Char]
blockToChars [wordLeft, wordRight] = return [c1,c2,c3,c4,c5,c6,c7,c8]
  where c1 = toEnum (word32ToInt (wordLeft#1))
        c2 = toEnum (word32ToInt (wordLeft#2))
        c3 = toEnum (word32ToInt (wordLeft#3))
        c4 = toEnum (word32ToInt (wordLeft#4))
        c5 = toEnum (word32ToInt (wordRight#1))
        c6 = toEnum (word32ToInt (wordRight#2))
        c7 = toEnum (word32ToInt (wordRight#3))
        c8 = toEnum (word32ToInt (wordRight#4))
Encryption Algorithm (in C)
newLeft = plainBlock[0];
newRight = plainBlock[1];
for (roundCount = 1; roundCount <= MAX_ROUNDS; roundCount++)
{
    oldLeft = newLeft;
    oldRight = newRight;
    newLeft = oldRight;
    // The round number selects one of the three function types
    if ( (roundCount % 3) == 0)
        newRight = oldLeft ^ type3Function(oldRight, subKeys[roundCount],
                                           subKeys[roundCount + 16]);
    else if ( (roundCount % 3) == 1)
        newRight = oldLeft ^ type1Function(oldRight, subKeys[roundCount],
                                           subKeys[roundCount + 16]);
    else if ( (roundCount % 3) == 2)
        newRight = oldLeft ^ type2Function(oldRight, subKeys[roundCount],
                                           subKeys[roundCount + 16]);
} // End for
// Swap the halves to form the ciphertext block
cipherBlock[0] = newRight;
cipherBlock[1] = newLeft;
Encryption Algorithm (in Haskell)
encryptBlock :: KeyScheduleType -> [Word32] -> [Word32]
encryptBlock keySchedule plainList = auxEncryptBlock keySchedule plainList 1

auxEncryptBlock :: KeyScheduleType -> [Word32] -> Word32 -> [Word32]
-- swap left and right
auxEncryptBlock keySchedule (leftHalf : rightHalf : []) 17 = (rightHalf : leftHalf : [])
auxEncryptBlock keySchedule (leftHalf : rightHalf : []) counter =
  auxEncryptBlock keySchedule (rightHalf : newRightHalf : []) (counter + 1)
  where newRightHalf = leftHalf `xor` (fChoice rightHalf (keySchedule!counter)
                                         (keySchedule!(counter + 16)) counter)

fChoice :: Word32 -> Word32 -> Word32 -> Word32 -> Word32
fChoice halfBlock maskingKey rotatingKey roundNbr
  | (roundNbr `mod` 3) == 1 = type1Function halfBlock maskingKey rotatingKey
  | (roundNbr `mod` 3) == 2 = type2Function halfBlock maskingKey rotatingKey
  | otherwise               = type3Function halfBlock maskingKey rotatingKey
Permute Function (in C and Haskell)
unsigned long type1Function(unsigned long halfBlock, unsigned long maskingKey,
                            unsigned long rotatingKey)
{
    unsigned long Iword, Ia, Ib, Ic, Id, ls5bits, result;
    ls5bits = (rotatingKey << 27) >> 27; // Keep the least significant 5 bits
    Iword = rotateLeft( (maskingKey + halfBlock), ls5bits);
    extractBytes(Iword, &Ia,&Ib,&Ic,&Id);
    result = ((S1[Ia] ^ S2[Ib]) - S3[Ic]) + S4[Id];
    return result;
}

type1Function :: Word32 -> Word32 -> Word32 -> Word32
type1Function halfBlock maskingKey rotatingKey =
  ((sBox1!(word#1) `xor` sBox2!(word#2)) - sBox3!(word#3)) + sBox4!(word#4)
  where word    = ( (maskingKey + halfBlock) `rotateL` (word32ToInt ls5bits))
        ls5bits = ((rotatingKey `shiftL` 27) `shiftR` 27)
Extract bytes (in C and Haskell)
void extractBytes( unsigned long word, unsigned long *byte1, unsigned long *byte2,
                   unsigned long *byte3, unsigned long *byte4)
{
    *byte1 = word >> 24;
    *byte2 = (word << 8)  >> 24;
    *byte3 = (word << 16) >> 24;
    *byte4 = (word << 24) >> 24;
}

(#) :: Word32 -> Int -> Word32
(#) word position
  | position == 1 = word `shiftR` 24
  | position == 2 = (word `shiftL` 8)  `shiftR` 24
  | position == 3 = (word `shiftL` 16) `shiftR` 24
  | position == 4 = (word `shiftL` 24) `shiftR` 24
  | otherwise     = error "Error with extraction operator (#): position invalid"
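The shift-left-then-shift-right idiom above isolates a byte only when the word type is exactly 32 bits wide. A mask-based variant, sketched here with the hypothetical name extract_byte, gives the same result for 32-bit values regardless of how wide unsigned long is. Position 1 is taken to be the most significant byte, matching the (#) operator on the slide.

```c
#include <stdint.h>

/* Returns byte `position` (1 = most significant, 4 = least significant)
   of a 32-bit word. The caller must pass a position from 1 to 4. */
uint32_t extract_byte(uint32_t word, int position)
{
    return (word >> (8 * (4 - position))) & 0xFFu;
}
```

For the word 0x01234567, positions 1 through 4 yield 0x01, 0x23, 0x45, and 0x67.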
Rotate bits to the left (in C)
unsigned long rotateLeft(unsigned long word, unsigned long nbrBitPositions)
{
    unsigned long result, i;
    result = word;
    for (i = 1; i <= nbrBitPositions; i++)
    {
        // Check if the most significant bit is a one
        if (result & MSB_SET_ONLY_NUMBER) // Bitwise AND the result with 2**31
            result = (result << 1) + 1;
        else
            result = (result << 1);
    }
    return result;
} // End rotateLeft
(Note: rotateL is a library-supplied function in Haskell)
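The bit-by-bit loop above can also be written as two shifts and an OR. The sketch below uses the hypothetical name rotl32 and assumes a fixed 32-bit word via uint32_t; it is offered as a common C idiom and not as the code used in the presentation.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by `count` bit positions (count taken modulo 32). */
uint32_t rotl32(uint32_t word, unsigned count)
{
    count &= 31u;                                   /* keep the shift in range */
    return (word << count) | (word >> ((32u - count) & 31u));
}
```

The second mask avoids a shift by 32 when count is 0, which would be undefined behaviour in C. For example, rotl32(0x12345678, 8) yields 0x34567812.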
End of Backup Slides