The Application of Probability in the Fluhrer-Mantin-Shamir Method of RC4 Key Recovery

Doug Madory
Term Project
ENGS 103
Winter Term 2005

Background

RC4 is the encryption used in most software applications today (including HTTPS and SSL), which makes it arguably the most commonly used encryption method in the world [5]. It is no wonder, then, that when the architects of wireless network security sat down to design their protocol in 1997, they incorporated RC4 into their design in the form of WEP (Wired Equivalent Privacy).

Wireless network traffic will always carry an inherent risk of interception, because it is very difficult to bound where wireless communication radiates. An office that has a wireless infrastructure most likely “spills” its network traffic out into the street, the parking lot, and neighboring buildings. This liability makes it necessary to secure wireless traffic while it is traveling through the air. WEP was the first major attempt to do this.

The design concept of WEP (and the origin of its name) was to provide security equivalent to having a wire connecting each wireless node. It does not promise to “secure” the network from all forms of attack; it only attempts to secure the traffic from node to node and mitigate the interception liability. Unfortunately, the attempt failed. Despite adopting the most widely used and trusted encryption package on the market, the implementation of RC4 in WEP is tragically flawed.

“Ron’s Code #4,” known as RC4, was invented in 1987 by Ron Rivest, a co-founder of RSA Security. Details of its implementation remained a trade secret until 1994, when an unauthorized detailed design of the RC4 Key-Scheduling Algorithm (KSA) and Pseudo-Random Generation Algorithm (PRGA) was anonymously posted on the internet [1].
In August 2001, Scott Fluhrer of Cisco Systems and Itsik Mantin and Adi Shamir of the Weizmann Institute in Israel wrote the seminal paper, “Weaknesses in the Key Scheduling Algorithm of RC4,” which spelled out how a secret key could be recovered from information leaked by the protocol, thus defeating the encryption [2]. This paper discusses some of the probabilistic tools used in the Fluhrer-Mantin-Shamir (FMS) attack.

How Does RC4 Work in WEP

Every node participating in WEP must be preprogrammed with a “secret key,” which is either a 40-bit code for “64-bit encryption” or a 104-bit code for “128-bit encryption.” The quotes are necessary because the encryption marketed as 64-bit or 128-bit includes a 24-bit initialization vector (described below) that is sent unencrypted. To be accurate, 24 must be subtracted from the marketing claims.

Figure 1 illustrates the process of encrypting a wireless packet in WEP [1]. When a packet of data is to be sent on a wireless network secured with WEP, an “integrity check value” (2), more commonly known as a CRC checksum, is calculated from the unencrypted plaintext (1) and then appended (3) to the plaintext message. Meanwhile, an initialization vector (4) is randomly generated and prepended to the “secret key” required for decryption. The resulting per-packet key is run through RC4's Key-Scheduling Algorithm and then its Pseudo-Random Generation Algorithm (PRGA) (5) to form a “keystream” (6) of equal length to the plaintext/CRC combination. The plaintext/CRC combination is XOR’ed (7) with the keystream to produce an encrypted message (8). Before this ciphertext is transmitted, the initialization vector (IV) is prepended (9) to it in the clear.

Figure 1. The process of encrypting a wireless packet in WEP [1].

When a node receives the encrypted packet, it extracts the unencrypted IV, prepends it to the preprogrammed secret key, regenerates the same keystream, and decrypts the message by XOR’ing this keystream with the encrypted portion of the packet (Figure 1 in reverse).
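The encryption and decryption steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a complete 802.11 implementation: the function names (`rc4_keystream`, `wep_encrypt`, `wep_decrypt`) are hypothetical, the integrity check value is assumed to be a CRC-32 stored least-significant byte first, the IV is drawn from `os.urandom`, and the key ID byte and frame headers that a real WEP frame carries are omitted.

```python
import os
import zlib


def rc4_keystream(key: bytes, length: int) -> bytes:
    """Return `length` RC4 keystream bytes for `key`."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling: scramble S using the key
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(length):  # generation: one swap and one output byte per step
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def wep_encrypt(secret_key: bytes, plaintext: bytes, iv: bytes = None) -> bytes:
    """Return IV || RC4(IV || key) XOR (plaintext || ICV)."""
    if iv is None:
        iv = os.urandom(3)  # assumption: a random 24-bit IV per frame
    icv = zlib.crc32(plaintext).to_bytes(4, "little")  # assumed ICV encoding
    body = plaintext + icv
    keystream = rc4_keystream(iv + secret_key, len(body))
    return iv + xor_bytes(body, keystream)  # the IV travels in the clear


def wep_decrypt(secret_key: bytes, frame: bytes) -> bytes:
    """Reverse wep_encrypt and verify the integrity check value."""
    iv, ciphertext = frame[:3], frame[3:]
    keystream = rc4_keystream(iv + secret_key, len(ciphertext))
    body = xor_bytes(ciphertext, keystream)
    plaintext, icv = body[:-4], body[-4:]
    if zlib.crc32(plaintext).to_bytes(4, "little") != icv:
        raise ValueError("integrity check value mismatch")
    return plaintext
```

Note that the only per-frame secret input to RC4 is the fixed secret key; everything that changes from frame to frame (the IV) is visible to an eavesdropper, which is the property the rest of this paper relies on.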
One of the central flaws in the implementation of RC4 in the 802.11 standard for WEP is the lack of specification for how to generate IVs. Because the IV is 24 bits long, there are only $2^{24}$ (16,777,216) possible values. Since a new IV is generated for each frame, a busy network may quickly exhaust every possible value, and the encryption would be forced to repeat IVs. The point of the IV is to provide a unique seed to RC4 so that the PRGA generates a fresh keystream for each message. Repeating IVs goes against a core tenet of RC4 encryption: never repeat the keys!

Problems with WEP

Among other problems, WEP suffers from the “Birthday Paradox” [5], which says that in a group as small as 23 people, there is a 50% chance that two people will have the same birthday. The general form says that if elements are drawn one at a time, with replacement, from a set of n members, then the probability of a duplicate after two draws is

$p_2 = \frac{1}{n}$

For k draws, the probability of at least one duplicate is

$p_k = p_{k-1} + (k-1) \cdot \frac{1}{n} \cdot (1 - p_{k-1})$

Figure 2. Probability of a repeated IV versus frames transmitted.

In the case of WEP, as previously stated, there are $n = 2^{24}$ possible IVs, or about 16 million “birthdays.” When this is plugged into the equation above, a 50% chance of a repeated IV occurs after k = 4,823 frames (roughly $2^{12}$) and reaches 99% after 12,430 frames.

Repetition of IVs is a problem because each unique IV should correspond to one keystream (see Figure 1). Since the IV is known, and the RC4 algorithm has been known since its public posting on the internet in 1994, an attacker could work backwards to statistically recover the secret key if all or part of the plaintext message were known. Accomplishing this feat is central to the FMS attack. Furthermore, because of the format used in 802.11, the first plaintext byte of every encrypted portion of a frame is always known.
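The recurrence for $p_k$ given above is easy to evaluate numerically. The sketch below is only an illustration; the function name `draws_needed` is hypothetical. It iterates the recurrence until the probability of a repeat reaches a target value.

```python
def draws_needed(n: int, target: float) -> int:
    """Smallest k for which p_k >= target, using
    p_k = p_{k-1} + (k - 1) * (1 / n) * (1 - p_{k-1}) with p_1 = 0."""
    p = 0.0  # p_1: a single draw cannot contain a duplicate
    k = 1
    while p < target:
        k += 1
        p = p + (k - 1) / n * (1.0 - p)
    return k


print(draws_needed(365, 0.50))      # classic birthday case: 23
print(draws_needed(2 ** 24, 0.50))  # 24-bit IV, 50% chance of a repeat: 4823
print(draws_needed(2 ** 24, 0.99))  # 24-bit IV, 99% chance of a repeat: 12430
```

With n = 365 this gives the familiar 23 people, and with n = $2^{24}$ it gives the 4,823 and 12,430 frame counts quoted in this section.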
In fact, it is always “AA” (hexadecimal), the 802.2 LLC DSAP value used for SNAP encapsulation. FMS makes use of certain IVs that “leak” information about the secret key and uses this known first byte to reveal the value of the first byte of the secret key. FMS then describes additional algorithms to leverage the one known secret key byte against the remaining 4 or 12 bytes of the secret key (depending on “64-bit” or “128-bit” encryption).

The FMS Attack

To understand the FMS attack, the Key-Scheduling Algorithm (KSA) and the Pseudo-Random Generation Algorithm (PRGA) must be explained in detail. Below is pseudocode paraphrasing the operation of these algorithms [4]. All additions are performed modulo N.

    KSA(K)
      Initialization:
        For i = 0 ... N - 1
          S[i] = i
        j = 0
      Scrambling:
        For i = 0 ... N - 1
          j = j + S[i] + K[i mod l]
          Swap(S[i], S[j])

    PRGA(K)
      Initialization:
        i = 0
        j = 0
      Generation Loop:
        i = i + 1
        j = j + S[i]
        Swap(S[i], S[j])
        Output z = S[S[i] + S[j]]

The ultimate product of the KSA and PRGA is the keystream that is XOR’ed with the plaintext data and CRC combination to produce the encrypted ciphertext ready for transmission. The process begins by creating an array of values, S[], of length N. RC4 as used in WEP operates on bytes, so N = 256 regardless of the length of the message. The array is initialized by setting each element equal to its own index (e.g., S[0]=0, S[1]=1, …, S[N-1]=N-1).

In its scrambling phase, for each value of i from zero to (N-1), the KSA calculates a value for j by adding (modulo N) the previous value of j, the ith element of S, and the element of K at index (i mod l), where K is the IV prepended to the secret key and l is the length of K. Finally, the values of S[i] and S[j] are swapped. This is repeated for every element in the S array. The final product is an array, S, that is a permutation of the values 0 through N-1, scrambled using the key as an index offset for each swap.
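The pseudocode above translates almost line for line into runnable Python. The following is a minimal sketch, assuming N = 256 and a key given as a byte string; the function names `ksa`, `prga`, and `keystream` are chosen here for illustration and do not come from [4].

```python
from itertools import islice

N = 256  # size of the S array; RC4 in WEP operates on bytes


def ksa(key: bytes) -> list:
    """Key-Scheduling Algorithm: return the scrambled S array for `key`."""
    s = list(range(N))  # initialization: S[i] = i
    j = 0
    for i in range(N):  # scrambling: one swap per index
        j = (j + s[i] + key[i % len(key)]) % N
        s[i], s[j] = s[j], s[i]
    return s


def prga(s: list):
    """Pseudo-Random Generation Algorithm: yield keystream bytes forever."""
    s = list(s)  # work on a copy so the caller's S array is left untouched
    i = j = 0
    while True:
        i = (i + 1) % N
        j = (j + s[i]) % N
        s[i], s[j] = s[j], s[i]
        yield s[(s[i] + s[j]) % N]  # output byte z


def keystream(key: bytes, length: int) -> bytes:
    """First `length` keystream bytes produced by `key`."""
    return bytes(islice(prga(ksa(key)), length))


# Check against a commonly published RC4 test vector (key "Key").
ks = keystream(b"Key", 9)
print(ks.hex())  # eb9f7781b734ca72a7
cipher = bytes(p ^ k for p, k in zip(b"Plaintext", ks))
print(cipher.hex())  # bbf316e8d940af0ad3
```

In WEP the `key` argument would be the 3-byte IV followed by the 5- or 13-byte secret key, so the first three steps of the scrambling loop depend only on values that an eavesdropper can see.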
The scrambled S array is then fed into the PRGA, where another succession of swaps occurs, one for each keystream byte required. After each swap, an output value is calculated. These output values (z) are the keystream bytes that will be used to encrypt the plaintext data.

What FMS discovered were several forms of the three-byte IV that, when run through these algorithms, yield a predictable result about 5% of the time and an almost perfectly random result otherwise. When enough frames with IVs of these special forms are collected, the most common outcome (the 5% case) reveals the first unknown byte of the secret key [2]. The process is then repeated for each additional unknown byte of the secret key until the entire secret key is revealed. Once the secret key is revealed, the attacker will not only be able to decrypt any encrypted frames, but will also be able to participate in all communication as a trusted host using any available network resources.

A step-by-step illustration of the mechanics of the KSA and PRGA can be seen at the following website (Figure 3), designed by the author for illustration purposes: http://www.dartmouth.edu/~madory/RC4/RC4_KSA.html

Figure 3.

Where does the 5% come from?

For the FMS attack to work, the first two bytes of the IV and the target byte of the secret key must survive the KSA swapping algorithm unchanged after the predicted swaps occur [2]. If we model the remaining swaps as random, then the chance that the three bytes in question are unchanged is about 5%. This number comes from aggregating, over the three bytes, the probability that a byte is unchanged at each step [3].

$P(\text{1 byte is unchanged after one random swap}) = 1 - \frac{1}{N}$

where N is the size of the S array (256 for RC4 in WEP).
$P(\text{1 byte is unchanged after } N \text{ random swaps}) = \left(1 - \frac{1}{N}\right)^{N}$

$P(\text{3 bytes are unchanged after } N \text{ random swaps}) = \left(\left(1 - \frac{1}{N}\right)^{N}\right)^{3} \approx \left(e^{-1}\right)^{3} = e^{-3}$

The expression $\left(\left(1 - \frac{1}{N}\right)^{N}\right)^{3}$ can be modeled as $e^{-3}$ because, as N grows, its value approaches $e^{-3} \approx 0.0498$ from below. For N = 256, the value is approximately 0.0495. In the end, the exact value of N matters little: for any N of practical size, the value is always just below 5%.

Figure 4. Value of ((1-1/N)^N)^3 as a function of N.

Why not use the FMS Attack against SSL?

All of this analysis of RC4 weaknesses raises the question: why not use this attack against RC4 in Secure Sockets Layer? Think of the boon to hackers if all secure online transactions could be intercepted and decrypted as easily as WEP. The answer is that the FMS attack will not work against SSL, because the way SSL implements RC4 is significantly different from (and superior to) the way it is implemented in WEP. Specifically, SSL uses a unique 128-bit key for each session rather than a new key for every packet as WEP does. This greatly extends the wait for repeated keys to use for cryptanalysis. Additionally, when SSL replaces a key, it replaces all 128 bits, not just the 24 bits of the IV. Therefore, in theory, an attacker would need to intercept $2^{64}$ SSL ciphertext streams instead of $2^{12}$ WEP frames to begin cryptanalysis. By crudely comparing these two figures, it would appear that RC4 in SSL is $2^{64}/2^{12} = 2^{52}$ times “more secure” than RC4 in WEP [5].

Present Time and Future WEP

In response to the advent of the FMS attack, the wireless security industry adopted several techniques to improve the security posture of wireless transmissions, including increasing the size of the encryption key (from 64-bit in WEP to 128-bit in WEP2) and avoiding weak keys (WEPplus).
The computational requirements of an FMS attack scale linearly with the size of the encryption key; therefore, doubling the key size just postpones the inevitable compromise of the secret key. In October 2001, Orinoco, the most popular manufacturer of wireless networking devices, announced the design of an FMS-proof WEP called WEPplus that avoids weak initialization vectors, a technique marketed as “Weak Key Avoidance” [6]. Despite momentarily breathing life back into WEP, Weak Key Avoidance versions of WEP have been slow to be fielded due to industrial inertia as well as ignorance and disagreement regarding the nature of WEP’s vulnerability.

WEPplus was rendered moot in August of 2004, when a hacker who goes only by “KoreK” designed a WEP attack that improves upon FMS by not requiring weak initialization vectors and by operating significantly faster than FMS [7]. KoreK’s attack cracks WEP after intercepting ~200,000 frames instead of the ~2,000,000 required for FMS. Following the popular adoption of the KoreK-style attack, efforts to revive WEP have all but stopped.

The FMS attack remains a relevant topic of study because it is the root of most WEP encryption cracks, including the latest KoreK-style attack. FMS is also relevant because of the prevalence of RC4 encryption in computer security.

References:

[1] L. Barken, How Secure Is Your Wireless Network?: Safeguarding Your Wi-Fi LAN, Prentice Hall, Upper Saddle River, NJ, 2004.
[2] S. Fluhrer, I. Mantin, and A. Shamir, “Weaknesses in the key scheduling algorithm of RC4,” Eighth Annual Workshop on Selected Areas in Cryptography, August 2001.
[3] S. Fluhrer, I. Mantin, and A. Shamir, “Attacks on RC4 and WEP,” Cryptobytes, 2002.
[4] D. Hulton, Practical Exploitation of RC4 Weaknesses in WEP Environments, Dachb0den Labs, February 2002.
[5] R. Jenkins, “ISAAC and RC4: Comparing Two Stream Ciphers,” 3rd Fast Software Encryption Workshop, 1996.
[6] Orinoco WEPplus White Paper, October 2001.
[7] M. Ossmann, “WEP: Dead Again, Part 1,” www.securityfocus.com, December 14, 2004.