Institute of Software, Chinese Academy of Sciences

A Byte-Based Guess and Determine Attack on SOSEMANUK

Xiutao Feng

Outline
1 Introduction
2 Description of SOSEMANUK
3 Basic properties of SOSEMANUK
4 Our attack
5 Further discussion on our attack
6 Conclusion

1 Introduction

1.1 On SOSEMANUK

SOSEMANUK is a software-oriented stream cipher proposed by C. Berbain et al. for the eSTREAM project. It was selected for the final portfolio together with six other algorithms.
Its design adopts ideas from both the stream cipher SNOW 2.0 and the block cipher SERPENT, and aims to improve on SNOW 2.0 in both security and efficiency.

1.2 Known cryptanalytic results on SOSEMANUK

- The designers of SOSEMANUK presented a guess and determine attack with a time complexity of 2^256 operations;
- In 2006 H. Ahmadi et al. revised this attack and reduced the time complexity to 2^226 operations;
- In 2006 Y. Tsunoo et al. improved the result of Ahmadi et al. and further reduced it to 2^224 operations;
- In 2008 Jung-Keun Lee et al. proposed a correlation attack, which needs about 2^147.88 time, 2^145.50 keystream bits and 2^147.10 bits of memory;
- In 2009 Lin and Jie gave a new guess and determine attack and claimed that it needs only 2^192 operations.

1.3 Our work

2 Description of SOSEMANUK

[Figure 1: The structure of SOSEMANUK (LFSR, FSM, Serpent1)]

2.1 The LFSR

2.2 The FSM

2.3 The Serpent1

[Figure 2: The round function Serpent1 in the bit-slice mode. For each bit position 31, ..., 0, one copy of the S-box S2 maps the input words f_{t+3}, f_{t+2}, f_{t+1}, f_t to the output words y_{t+3}, y_{t+2}, y_{t+1}, y_t]

2.4 Generation of Keystream

3 Basic properties of SOSEMANUK

Let x be a 32-bit word. Denote by x^(i) the i-th byte of x, where i = 0, 1, 2, 3.
For example, if s_1^(3), s_4^(0), s_4^(1) and s_10^(0) are known, then we can calculate s_11^(0).
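The byte-level dependency in this example can be checked with a minimal Python sketch. It is not taken from the slides: it assumes the LFSR recurrence s_{t+10} = s_{t+9} ⊕ α^{-1}·s_{t+3} ⊕ α·s_t and the field constants as given in the public SOSEMANUK specification, and it assumes that byte 0 is the least significant byte. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def gf_mul(a, b):
    """Multiply in GF(2^8) = F_2[x]/(x^8 + x^7 + x^5 + x^3 + 1) (assumed modulus)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x1A9  # reduce by x^8 + x^7 + x^5 + x^3 + 1
        b >>= 1
    return r

def beta_pow(e):
    """Return beta^e, where beta is the class of x (0x02)."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, 2)
    return r

# Assumption: alpha is a root of X^4 + b^23 X^3 + b^245 X^2 + b^48 X + b^239.
B23, B245, B48, B239 = beta_pow(23), beta_pow(245), beta_pow(48), beta_pow(239)
B16 = beta_pow(16)  # inverse of beta^239, because beta^255 = 1

def byte(x, i):
    """x^(i): the i-th byte of a 32-bit word, byte 0 least significant."""
    return (x >> (8 * i)) & 0xFF

def alpha_mul(x):
    """alpha * x: shift up one byte and fold the top byte back in."""
    top = byte(x, 3)
    fold = (gf_mul(B23, top) << 24 | gf_mul(B245, top) << 16
            | gf_mul(B48, top) << 8 | gf_mul(B239, top))
    return ((x << 8) & MASK32) ^ fold

def alpha_inv_mul(x):
    """alpha^{-1} * x: shift down one byte and fold the low byte back in."""
    y3 = gf_mul(B16, byte(x, 0))
    fold = gf_mul(B23, y3) << 16 | gf_mul(B245, y3) << 8 | gf_mul(B48, y3)
    return (y3 << 24) | ((x >> 8) ^ fold)

def lfsr_next(s1, s4, s10):
    """s_11 = s_10 xor alpha^{-1} s_4 xor alpha s_1 (assumed recurrence, t = 1)."""
    return s10 ^ alpha_inv_mul(s4) ^ alpha_mul(s1)

def s11_byte0(s1_b3, s4_b0, s4_b1, s10_b0):
    """Byte 0 of s_11 computed from the four bytes named in the text only."""
    y3 = gf_mul(B16, s4_b0)
    return s10_b0 ^ s4_b1 ^ gf_mul(B48, y3) ^ gf_mul(B239, s1_b3)

# Arbitrary illustrative register words, not real cipher state.
s1, s4, s10 = 0x01234567, 0x89ABCDEF, 0x0F1E2D3C
assert byte(lfsr_next(s1, s4, s10), 0) == s11_byte0(
    byte(s1, 3), byte(s4, 0), byte(s4, 1), byte(s10, 0))
```

The sketch computes the field constants from the generator instead of using lookup tables, which keeps it short at the cost of speed. The final assertion confirms that only s_1^(3), s_4^(0), s_4^(1) and s_10^(0) are needed for s_11^(0).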
[Figure 3: The feedback of the LFSR in the byte form (registers s_1, ..., s_16, bytes 3 to 0)]

4 Our attack

4.1 Basic idea of the guess and determine attack

The guess and determine attack is a common cryptanalytic method. Its basic idea is as follows:

1. Guess: first guess the values of a portion of the internal state of the target algorithm;
2. Deduce: then deduce the values of the rest of the internal state from the guessed portion and a few known keystream words;
3. Test: finally generate a segment of keystream from the recovered values and compare it with the known keystream. If they do not match, return to Step 1.

4.2 The execution of our attack

Our attack is based on the following assumption: lsb(R1_1) = 1.

The guessing and deducing procedure of the attack can be subdivided into five phases:

1. Guess the values of s_1, s_2, s_3, R2_1^(0), R2_1^(1), R2_1^(2) and the remaining 31 bits of R1_1, and deduce the values of s_10^(0), R1_2^(0), R2_2, s_11^(0), s_4^(1), s_10^(1), R1_2^(1), s_11^(1), s_4^(2), s_10^(2), R1_2^(2), s_11^(2) and s_4^(3).

[Figure 4: The illustration of the deduction in Phase 1 (legend: guessed byte, deduced byte)]

2. By the assumption lsb(R1_1) = 1, which implies R1_2 = R2_1 ⊞ (s_3 ⊕ s_10), we obtain equation (12) in the single unknown s_10^(3), in which a, b, c and d are known values. The unknown s_10^(3) occurs three times in this equation, and it is easy to check that equation (12) has exactly one solution in s_10^(3). So we can solve it and get s_10^(3). We then deduce s_11^(3), R2_1^(3) and R2_2^(3). At this point we have obtained s_1, s_2, s_3, s_4, s_10, s_11, R1_1, R2_1, R1_2 and R2_2.
3. Further deduce R1_3, R2_3, R1_4, R2_4, R1_5, R2_5, R2_6, s_5, s_6, s_12 and s_13.
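The byte-by-byte deductions of R1_2^(0), R1_2^(1) and R1_2^(2) rest on the fact that byte i of a sum modulo 2^32 depends only on bytes 0 to i of the two operands, because carries travel upward only. The following Python sketch illustrates this for the relation R1_2 = R2_1 ⊞ (s_3 ⊕ s_10). It is an illustration under assumptions, not the procedure from the slides: the function names and the sample values are hypothetical, and byte 0 is taken to be the least significant byte.

```python
MASK32 = 0xFFFFFFFF

def byte(x, i):
    """x^(i): the i-th byte of a 32-bit word, byte 0 least significant."""
    return (x >> (8 * i)) & 0xFF

def r1_2(r1_1, r2_1, s3, s10):
    """R1_2 = R2_1 + (s3 xor s10) if lsb(R1_1) = 1, else R2_1 + s3 (mod 2^32)."""
    operand = (s3 ^ s10) if r1_1 & 1 else s3
    return (r2_1 + operand) & MASK32

def add_low_bytes(x_bytes, y_bytes):
    """Add two words given only their lowest bytes (lowest byte first).

    Returns the matching low bytes of the sum and the carry out of the
    highest byte supplied.
    """
    carry, out = 0, []
    for xb, yb in zip(x_bytes, y_bytes):
        total = xb + yb + carry
        out.append(total & 0xFF)
        carry = total >> 8
    return out, carry

# Hypothetical sample state; r1_1 is odd so the assumption lsb(R1_1) = 1 holds.
r1_1, r2_1, s3, s10 = 0x13572469, 0xFEDCBA98, 0x0F0F0F0F, 0xA1B2C3D4
full = r1_2(r1_1, r2_1, s3, s10)

known = 3  # bytes 0, 1, 2 of R2_1 and of s3 xor s10 are available
low, _ = add_low_bytes([byte(r2_1, i) for i in range(known)],
                       [byte(s3 ^ s10, i) for i in range(known)])
assert low == [byte(full, i) for i in range(known)]
```

Because the three low bytes of R1_2 are fixed before the top bytes of the operands are known, each guessed or deduced byte can be used as soon as it becomes available.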

[Figure 5: The illustration of the deduction in Phases 2 and 3 (legend: known byte, byte deduced in Phase 2, byte deduced in Phase 3)]

4. Further guess s_7^(0) and s_8^(0), and deduce the remaining bytes of s_7 and s_8.
5. Finally deduce s_9.

[Figure 6: The illustration of the deduction in Phases 4 and 5 (legend: known byte, guessed byte, deduced byte)]

4.3 Time and data complexity

Time complexity: 2^176 operations
- In Phase 1 and Phase 4, we guess a total of 175 bits of the internal state, including s_1, s_2, s_3, R2_1^(0), R2_1^(1), R2_1^(2), s_7^(0), s_8^(0) and the remaining 31 bits of R1_1.
- The assumption holds with probability 2^-1, which contributes a further factor of 2: 2^175 × 2 = 2^176.

Data complexity: about 20 words
- In the guessing phase: 8 words;
- In the testing phase: about 8 words. (Given 16 words, which amount to 512 bits and exceed the 384 bits of the internal state, the internal state is determined by them, so we can use them to test the correctness of the recovered internal state.)
- For the assumption: another 4 words. (By shifting the keystream by 4 words we can test two cases.)

5 Further discussion on our attack

It should be pointed out that the assumption lsb(R1_1) = 1 is not necessary for our attack to work.
In fact, when lsb(R1_1) = 0, which implies R1_2 = R2_1 ⊞ s_3, we similarly obtain an equation in s_10^(3) with known values a', b', c' and d'. This equation has either no solution or 2^k solutions for some integer k. However, when a', b', c' and d' range over all possible values, the total number of solutions is exactly 2^32.
We directly guess a total of 160 bits of the internal state in Phase 1, and after Phase 2 we are left with a total of 2^160 possible values.
For each of them, we continue with Phases 3, 4 and 5. So the time complexity is still 2^176 operations, but the data complexity reduces to about 16 keystream words.

6 Conclusion

In this work we presented a byte-based guess and determine attack on SOSEMANUK, which needs only a few words of known keystream to recover the whole internal state of SOSEMANUK with a time complexity of 2^176 operations.
Since the key length of SOSEMANUK ranges from 128 to 256 bits, our attack is more efficient than an exhaustive key search whenever the chosen encryption key is longer than 176 bits.

Thank you!