Session 6: Introduction to cryptanalysis, part 1

Contents
• Problem definition
• Symmetric systems cryptanalysis
• Asymmetric systems cryptanalysis
• Particularities of block ciphers cryptanalysis

Problem definition

[Figure: A enciphers the plaintext with the key and sends the ciphertext to B, who deciphers it with the key; the cryptanalyst works on the ciphertext and tries to decrypt it.]

The problem of cryptanalysis:
• Given some information related to the cryptosystem (at least the ciphertext), determine the plaintext and/or the key.
• The goal of the designer is to make this problem as difficult as possible for the cryptanalyst.

General assumption: all the details of the cryptosystem are known to the cryptanalyst. The only unknown is the key.

Types of attack:
• Ciphertext-only attack
• Known-plaintext attack
• Chosen-plaintext attack
• Chosen-ciphertext attack
The ciphertext-only attack is, in general, the most difficult one for the cryptanalyst. The more information known to the cryptanalyst, the easier the attack.

The "brute force attack":
• Elementary attack: no knowledge of cryptanalysis is necessary.
• Assumptions:
  • The cryptosystem is known.
  • The ciphertext is known.
• The goal: determine the key/plaintext.
• The means: trying all the possible keys.

Complexity of the brute force attack:
• Extremely high if there are many possible keys, which makes the attack impractical.
• Key space: the total number of keys possible in a cryptosystem.

Examples of key space size (approximate values):

  Quantity                                                     Value
  Key space, 40 bits                                           1·10^12
  Key space, 56 bits (DES)                                     7·10^16
  Key space, 128 bits                                          3·10^38
  Key space, 256 bits                                          1·10^77
  Number of 256-bit primes                                     3·10^74
  Age of the Sun in seconds                                    1·10^17
  Clock pulses of a 3 GHz computer clock over the Sun's age    5.4·10^26

A cryptosystem's security is ultimately determined by the size of its key space. However, the key space size is only an upper limit on this security measure.
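To put the key space sizes in perspective, here is a minimal Python sketch that estimates the worst-case duration of an exhaustive key search. The search rate of 10^9 keys per second and the helper names `key_space` and `brute_force_years` are illustrative assumptions, not figures from this session.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def key_space(bits: int) -> int:
    """Number of distinct keys for a key of the given length in bits."""
    return 2 ** bits


def brute_force_years(bits: int, keys_per_second: float) -> float:
    """Worst-case time, in years, needed to try every key once."""
    return key_space(bits) / keys_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1e9  # assumed (hypothetical) search rate in keys per second
    for bits in (40, 56, 128, 256):
        years = brute_force_years(bits, rate)
        print(f"{bits:3d} bits: {key_space(bits):.1e} keys, {years:.1e} years")
```

Under this assumed rate, a 40-bit key space is exhausted in minutes, while 128 bits and above are far beyond any realistic search.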
There may be a flaw in the system design that significantly reduces the effective key space. The task of the cryptanalyst is to find this flaw and to use it to attack the system.

Symmetric systems

Basic attack methods against stream and block ciphers:
• Algebraic
• Statistical

Algebraic attack:
• The key symbols (e.g. bits) are the unknowns in the system of equations assigned to the PRNG.
• Given all the details of the PRNG to be cryptanalyzed (except the key bits), determine the system of equations that relates the bits of the output sequence to the bits of the key.

The designer's goal:
• To make this system as non-linear as possible.
• The reason: non-linear systems are difficult to solve. No efficient general method is known, and in the worst case all the possible values of the variables must be tried: $2^n$ possibilities for a system with $n$ variables.

The problem of solving a non-linear system over GF(2) can be stated as the satisfiability problem (SAT).
• Cook's theorem (1971): SAT is NP-complete.
• However, some instances of the SAT problem may be easier to solve. The designer should check the system assigned to the PRNG.

Example: consider the PRNG below:

[Figure: example PRNG with key bits $x_1, \dots, x_7$]

The system of equations (addition is modulo 2):

$$(1)\quad y_1 = (x_1 + x_4)(x_5 + x_7) = x_1x_5 + x_1x_7 + x_4x_5 + x_4x_7$$
$$(2)\quad y_2 = (x_1 + x_4 + x_3)(x_5 + x_7 + x_6) = x_1x_5 + x_1x_7 + x_1x_6 + x_4x_5 + x_4x_7 + x_4x_6 + x_3x_5 + x_3x_7 + x_3x_6$$
… (we need 7 independent equations)

Methods of solving the system:
• The brute force method: try all the $2^7 - 1$ possible solutions (the all-zero solution is not permitted).
• The linearization method:
  • Replace all the products by new variables.
  • Solve the resulting linear system (e.g. by Gaussian elimination).
  • Try to guess the variables that were included in the products, given the values of the new variables, in such a way that the overall system is consistent.

Example (cont.):
$$y_1 = z_1 + z_2 + z_3 + z_4$$
$$y_2 = z_1 + z_2 + z_5 + z_3 + z_4 + z_6 + z_7 + z_8 + z_9$$
…
where $z_1 = x_1x_5$, $z_2 = x_1x_7$, $z_3 = x_4x_5$, $z_4 = x_4x_7$, $z_5 = x_1x_6$, $z_6 = x_4x_6$, $z_7 = x_3x_5$, $z_8 = x_3x_7$, $z_9 = x_3x_6$.

There are many other methods of solving systems assigned to PRNGs:
• Linear consistency test (LCT)
• Methods of computational commutative algebra (Groebner bases etc.)
Cryptanalysis of a seriously designed system always includes search.

Statistical methods
• In the previous example, the majority of the output symbols will be zero, due to the AND combining function.
• The non-linearity of the assigned system of equations is the highest possible.
• However, the bad statistical properties of the output sequence can be used to determine the plaintext sequence.

Example:
• With the AND output combiner, the probability of zero in the output sequence will be 3/4.
• This means that, upon enciphering with this sequence as the keystream, the probability that the plaintext bit is equal to the ciphertext bit is 3/4.
• Consequence: easy reconstruction of the plaintext.

Correlation: the output sequence coincides too often with one or more internal sequences. This enables correlation attacks, a kind of statistical attack.

Correlation attacks:
• It is possible to divide the task of the cryptanalyst into several less difficult tasks: "divide and conquer".

Typical example: the Geffe generator

$$F(x_1, x_2, x_3) = x_1x_2 \oplus (1 \oplus x_2)x_3 = x_3 \oplus x_1x_2 \oplus x_2x_3$$

F is balanced: good statistical properties.

Problem: correlation! Let $s_n$ be the output bit and $s_{kn}$ the $n$-th bit of the input sequence feeding $x_k$. Then

$$\Pr(s_n = s_{1n} \mid s_{2n} = 1) = 1, \qquad \Pr(s_n = s_{1n} \mid s_{2n} = 0) = \frac{1}{2} \quad\Rightarrow\quad \Pr(s_n = s_{1n}) = \frac{3}{4}$$

and, by the same argument with the roles of $s_{2n} = 0$ and $s_{2n} = 1$ exchanged,

$$\Pr(s_n = s_{3n}) = \frac{3}{4}$$

Since the output sequence is correlated with both of these input sequences, the bits of each of them can be guessed independently, with high probability, if the output sequence is known.

Two most important attacks against block ciphers:
• Linear cryptanalysis
• Differential cryptanalysis
Modern block ciphers (Rijndael, Kasumi, etc.) are designed specifically to resist these attacks.
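Returning to the stream cipher combiners discussed earlier, the statistical claims can be checked by enumerating the truth tables. The following is a minimal Python sketch, assuming independent and uniformly distributed input bits; the function names are illustrative only.

```python
from itertools import product


def geffe(x1, x2, x3):
    """Geffe combiner: x2 selects x1 (when x2 = 1) or x3 (when x2 = 0)."""
    return (x1 & x2) ^ ((1 ^ x2) & x3)


def and_combiner(x1, x2):
    """AND output combiner of two input bits."""
    return x1 & x2


def prob_output_zero(f, n_inputs):
    """Probability that f outputs 0 over uniformly random input bits."""
    rows = list(product((0, 1), repeat=n_inputs))
    return sum(f(*row) == 0 for row in rows) / len(rows)


def prob_agrees_with_input(f, n_inputs, k):
    """Probability that the output of f equals its k-th input (0-based)."""
    rows = list(product((0, 1), repeat=n_inputs))
    return sum(f(*row) == row[k] for row in rows) / len(rows)


if __name__ == "__main__":
    print("AND: Pr(output = 0) =", prob_output_zero(and_combiner, 2))
    print("Geffe: Pr(output = 0) =", prob_output_zero(geffe, 3))
    for k in range(3):
        print(f"Geffe: Pr(output = x{k + 1}) =",
              prob_agrees_with_input(geffe, 3, k))
```

The enumeration shows that a balanced output alone does not rule out correlation with individual inputs, which is what the divide and conquer approach exploits.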
Linear cryptanalysis
• Known-plaintext attack: the cryptanalyst has a set of plaintexts and the corresponding ciphertexts.
• The cryptanalyst has no way of choosing which plaintexts (and corresponding ciphertexts) are available.

Linear cryptanalysis tries to take advantage of high-probability occurrences of linear expressions involving plaintext bits, ciphertext bits (or round output bits) and subkey bits.
• The basic idea is to approximate the operation of a portion of the cipher with a linear expression.
• The approach is to determine such expressions with a high or low probability of occurrence.

Example:

$$x_{i,1} \oplus x_{i,2} \oplus \cdots \oplus x_{i,u} \oplus y_{j,1} \oplus y_{j,2} \oplus \cdots \oplus y_{j,v} = 0$$

Here, $i$ and $j$ are the numbers of the rounds from which the bits of the input vector $X$ and the output vector $Y$ are taken, respectively. $u$ bits are taken from the vector $X$ and $v$ bits from the vector $Y$.

If a block cipher displays a tendency for such linear equations to hold with a probability much higher (or much lower) than 1/2, this is evidence of the cipher's poor randomization abilities. The deviation (bias) from the probability of 1/2 for such an expression to hold is exploited in linear cryptanalysis. This deviation is called the linear probability bias.

Denote the probability that the equation holds by $p_L$. The higher the magnitude of the probability bias $|p_L - 1/2|$, the better the applicability of linear cryptanalysis and the fewer known plaintexts are required in the attack.
• $p_L = 1$: catastrophic weakness. There is always a linear relation in the cipher.
• $p_L = 0$: catastrophic weakness. There is an affine relationship in the cipher (the complement of a linear relationship).

Consider two random binary variables, $X_1$ and $X_2$.
• $X_1 \oplus X_2 = 0$ is a linear expression, equivalent to $X_1 = X_2$.
• $X_1 \oplus X_2 = 1$ is an affine expression, equivalent to $X_1 \neq X_2$.
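Assume the following probability distributions:

$$\Pr(X_1 = i) = \begin{cases} p_1, & i = 0 \\ 1 - p_1, & i = 1 \end{cases} \qquad \Pr(X_2 = i) = \begin{cases} p_2, & i = 0 \\ 1 - p_2, & i = 1 \end{cases}$$

If $X_1$ and $X_2$ are independent, then

$$\Pr(X_1 = i, X_2 = j) = \begin{cases} p_1 p_2, & i = 0,\ j = 0 \\ p_1 (1 - p_2), & i = 0,\ j = 1 \\ (1 - p_1) p_2, & i = 1,\ j = 0 \\ (1 - p_1)(1 - p_2), & i = 1,\ j = 1 \end{cases}$$

It can be shown that

$$\Pr(X_1 \oplus X_2 = 0) = \Pr(X_1 = X_2) = \Pr(X_1 = 0, X_2 = 0) + \Pr(X_1 = 1, X_2 = 1) = p_1 p_2 + (1 - p_1)(1 - p_2).$$

With the probability bias introduced:
• $p_1 = 1/2 + \varepsilon_1$
• $p_2 = 1/2 + \varepsilon_2$
• $-1/2 \le \varepsilon_1, \varepsilon_2 \le 1/2$
we have

$$\Pr(X_1 \oplus X_2 = 0) = \left(\tfrac{1}{2} + \varepsilon_1\right)\left(\tfrac{1}{2} + \varepsilon_2\right) + \left(\tfrac{1}{2} - \varepsilon_1\right)\left(\tfrac{1}{2} - \varepsilon_2\right) = \frac{1}{2} + 2\varepsilon_1\varepsilon_2,$$

that is, the bias of $X_1 \oplus X_2 = 0$ is $\varepsilon_{1,2} = 2\varepsilon_1\varepsilon_2$.

Extension to $n$ random binary variables: the piling-up lemma (Matsui, 1993)
• For $n$ independent random binary variables $X_1, X_2, \dots, X_n$,

$$\Pr(X_1 \oplus \cdots \oplus X_n = 0) = \frac{1}{2} + 2^{n-1} \prod_{i=1}^{n} \varepsilon_i$$

• or, equivalently,

$$\varepsilon_{1,2,\dots,n} = 2^{n-1} \prod_{i=1}^{n} \varepsilon_i.$$

• If $p_i = 0$ or $1$ for all $i$, then $\Pr(X_1 \oplus \cdots \oplus X_n = 0) = 0$ or $1$.
• If at least one $p_i = 1/2$, then $\Pr(X_1 \oplus \cdots \oplus X_n = 0) = 1/2$.
• In developing the linear approximation of a cipher, the $X_i$ values actually represent linear approximations of the S-boxes.

Example:
• Four random binary variables, $X_1$, $X_2$, $X_3$ and $X_4$.
• Let $\Pr(X_1 \oplus X_2 = 0) = 1/2 + \varepsilon_{1,2}$ and $\Pr(X_2 \oplus X_3 = 0) = 1/2 + \varepsilon_{2,3}$.
• Let us derive the expression for the sum of $X_1$ and $X_3$ by adding the two expressions:

$$\Pr(X_1 \oplus X_3 = 0) = \Pr([X_1 \oplus X_2] \oplus [X_2 \oplus X_3] = 0).$$

Since we may consider $X_1 \oplus X_2$ and $X_2 \oplus X_3$ to be independent, we can use the piling-up lemma to determine

$$\Pr(X_1 \oplus X_3 = 0) = \frac{1}{2} + 2\varepsilon_{1,2}\varepsilon_{2,3}$$

and consequently

$$\varepsilon_{1,3} = 2\varepsilon_{1,2}\varepsilon_{2,3}.$$

• The expressions $X_1 \oplus X_2 = 0$ and $X_2 \oplus X_3 = 0$ are analogous to linear approximations of S-boxes.
• The expression $X_1 \oplus X_3 = 0$ is analogous to a cipher approximation in which the intermediate bit $X_2$ is eliminated.
• A real analysis is much more complex, involving many S-box approximations.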
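The piling-up lemma can be checked numerically. The following is a minimal Python sketch that compares the closed-form expression with an exact enumeration over all bit combinations, assuming independent variables with $\Pr(X_i = 0) = p_i$; the function names and the sample probabilities are illustrative only.

```python
from itertools import product
from math import isclose, prod


def xor_zero_probability(p):
    """Exact Pr(X1 ^ ... ^ Xn = 0) for independent bits with Pr(Xi = 0) = p[i]."""
    total = 0.0
    for bits in product((0, 1), repeat=len(p)):
        if sum(bits) % 2 == 0:  # XOR of the bits is 0 when their sum is even
            total += prod(pi if b == 0 else 1 - pi for pi, b in zip(p, bits))
    return total


def piling_up(p):
    """Piling-up lemma: 1/2 + 2^(n-1) * product of the biases (p[i] - 1/2)."""
    biases = [pi - 0.5 for pi in p]
    return 0.5 + 2 ** (len(p) - 1) * prod(biases)


if __name__ == "__main__":
    sample = [0.75, 0.6, 0.9]  # arbitrary example probabilities
    print("enumeration:", xor_zero_probability(sample))
    print("piling-up  :", piling_up(sample))
```

Setting any single probability to 1/2 drives both results to 1/2, which matches the special case stated above.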