Lecture 4

EE5552 Network Security and Encryption
block 4
Dr. T.J. Owens CEng MIET
Dr T. Itagaki MIET, MIEEE, MAES
Block 4
Basic concepts of cryptography
Objectives (1)
After studying this material you should
• Appreciate that the central issue in data encryption is the
design of data transformations that are easy, given a specific
piece of secret knowledge, but extremely difficult otherwise.
• Recognise that a modern cryptosystem achieves secrecy
through an algorithm which computes a code from a key.
• Understand that cryptographic techniques can protect against
eavesdropping and tampering.
Objectives (2)
After studying this material you should
• Be able to calculate the Unicity Distance of a cipher system
and comprehend its significance.
• Understand how the one time pad achieves perfect secrecy.
• Appreciate that linear feedback shift registers provide a
method for approximating the one time pad.
Boundaries
[Figure: Block diagram of a communications system, showing the coding steps in a communications system. Transmit side: Sample, Source Encoding, Encryption, Channel Coding, Modulation. Channel. Receive side: Demodulation, Channel Decoding, Decryption, Source Decoding, Convert.]
Cryptography - terminology
A cryptosystem, or cipher system, is a method of hiding the content
of messages.
Cryptography is the art (and science) of creating and using
cryptosystems.
Cryptanalysis is the art (and science) of breaking cryptosystems.
Cryptology is the study of both cryptography and cryptanalysis.
Key Phrase Cipher
Key phrase square (row labels down the side, column labels across the top):

    A B C D E
A   T H I S M
B   Y K E W O
C   R D A B C
D   F G L N P
E   Q U V X Z
c.f. Vigenère cipher
http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
Features of the Example Cipher
• Easy encoding and decoding.
• Easy to remember key.
• The use of different alphabets for the plain text and cipher text.
• Each input symbol mapped to two output symbols.
• Removal of redundancy in the plain text (“i” and “j” treated as the same letter and spaces omitted).
• Independent encoding of plaintext characters.
• Some letters from the key phrase are discarded.
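The construction of the square and the features listed above can be sketched in Python. This is a minimal illustration, not the lecturer's own code: the key phrase "THIS IS MY KEY WORD" is an assumption inferred from the letters in the square, the row-then-column ordering of each ciphertext pair is also an assumption, and the function names are hypothetical.

```python
import string

LABELS = "ABCDE"  # ciphertext alphabet used for row and column labels

def build_square(key_phrase):
    """Return the 25 letters of the square: key phrase first, then unused letters."""
    seen = []
    for ch in key_phrase.upper() + string.ascii_uppercase:
        if ch == "J":              # "i" and "j" are treated as the same letter
            ch = "I"
        if ch.isalpha() and ch not in seen:
            seen.append(ch)        # repeated key-phrase letters are discarded
    return seen

def encrypt(plaintext, square):
    """Map each plaintext letter to a (row label, column label) pair."""
    out = []
    for ch in plaintext.upper():
        if ch == "J":
            ch = "I"
        if ch in square:           # spaces and punctuation are omitted
            row, col = divmod(square.index(ch), 5)
            out.append(LABELS[row] + LABELS[col])
    return "".join(out)

def decrypt(ciphertext, square):
    """Read the ciphertext two symbols at a time and look each pair up in the square."""
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return "".join(square[LABELS.index(r) * 5 + LABELS.index(c)] for r, c in pairs)

square = build_square("THIS IS MY KEY WORD")
cipher = encrypt("hide it", square)
print(cipher)                    # ABACCBBCACAA
print(decrypt(cipher, square))   # HIDEIT
```

Because every plaintext letter is encoded independently, equal letters always give equal pairs, which is what makes a cipher of this kind open to frequency analysis.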
Data Security and Information Theory
Cryptosystems (1)
Aim to transform original data (plaintext) into an unintelligible
form (ciphertext) before transmitting it over a communication
system.
This involves computing an invertible transformation of a
message that is hard to invert without some secret knowledge
known as the key.
The encoding process is often called encryption and the decoding
process decryption.
Data Security and Information Theory
Cryptosystems (2)
An unauthorised person attempting to gain unauthorised access to a
communications system is a cryptanalyst or adversary.
The key must be transmitted from Alice to Bob by a “secure”
channel.
Cryptosystems may be used to assure Secrecy/Privacy,
Authenticity/Integrity and Anonymity/Invisibility.
Attacks on Cipher systems (1)
• Passive wiretapping (eavesdropping)
• Active wiretapping (tampering)
Attacks on Cipher systems (2)
• Eve (the cryptanalyst) knows
– The encryption algorithm.
– The plaintext statistics or structure.
– The probability distribution of keys.
– The ciphertext only attack. Eve knows the encryption algorithm and has some ciphertext and some knowledge of the statistical structure of the plaintext.
– The known plaintext attack. Eve knows the encryption algorithm and has some plaintext together with its corresponding ciphertext.
– The chosen plaintext attack. Eve knows the encryption algorithm and is able to choose some plaintext and arrange that it is encrypted.
Discrete Random Variables
X denotes the number of mouse clicks
x: CLICK CLICK CLICK CLICK CLICK CLICK
Y denotes the number of keystrokes
y: KEY KEY KEY KEY KEY
we can write:
(This denotes the probability that X and Y are equal to x)
we cannot write
(This would imply that the random variable X is the same as the
random variable Y)
Probability Distribution
The probability distribution of X is the set of pairs $\{(x, P(x))\}$, one pair for each value x that X can take.
Discrete Information Sources
A discrete information source emits an endless stream of symbols
drawn from an alphabet.
A discrete memoryless source (DMS) is a source that emits a
stream of statistically independent symbols from its alphabet.
A binary memoryless source has an alphabet of two symbols.
Rolling a die = DMS
Tossing a coin = binary DMS
Uncertainty and Information
The information conveyed by a message or symbol with probability p
is $I = -\log_2 p$ bits.
Entropy is the expected information: $H(X) = -\sum_{x} P(x) \log_2 P(x)$ bits per symbol.
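As a small worked example, the standard definitions of information, $-\log_2 p$, and entropy, $-\sum p \log_2 p$, can be evaluated for the two sources mentioned earlier (a fair coin and a fair die). The function names are illustrative only.

```python
from math import log2

def information(p):
    """Information in bits conveyed by an outcome of probability p."""
    return -log2(p)

def entropy(probabilities):
    """Expected information in bits per symbol; zero-probability symbols add nothing."""
    return sum(p * information(p) for p in probabilities if p > 0)

coin = [0.5, 0.5]    # binary memoryless source
die = [1 / 6] * 6    # discrete memoryless source with six symbols

print(entropy(coin))  # 1.0
print(entropy(die))   # about 2.585, i.e. log2(6)
```

A uniform distribution gives the largest entropy for a given alphabet size; any bias in the symbol probabilities lowers it.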
Ciphertext only Cryptanalysis (1)
Consider the above source and cipher system.
The cryptanalyst knows the plaintext symbol probabilities P(A), P(B), P(C), and P(D) and the probability distribution of the keys (k1 and k2 are equally likely, so P(k1) = P(k2) = 0.5).
Ciphertext only Cryptanalysis (2)
The cryptanalyst needs to identify the key.
The cryptanalyst can calculate the probabilities that any ciphertext character resulted from a particular plaintext character.
Ciphertext only Cryptanalysis (3)
For example, if ciphertext A is observed this results from
plaintext character B and k1 or plaintext character A and k2.
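So the probabilities of each of these may be calculated as
$$P(B, k_1 \mid A) = \frac{P(B)P(k_1)}{P(B)P(k_1) + P(A)P(k_2)} = 0.667, \qquad P(A, k_2 \mid A) = \frac{P(A)P(k_2)}{P(B)P(k_1) + P(A)P(k_2)} = 0.333$$
This process may be continued to build up a table of conditional
probabilities.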
Ciphertext only Cryptanalysis (4)
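Conditional probability of each (plaintext, key) pair given the observed ciphertext character:

Ciphertext | A, k1 | A, k2 | B, k1 | B, k2 | C, k1 | C, k2 | D, k1 | D, k2
A          | 0     | 0.333 | 0.667 | 0     | 0     | 0     | 0     | 0
B          | 0.25  | 0     | 0     | 0     | 0     | 0.75  | 0     | 0
C          | 0     | 0     | 0     | 0.333 | 0     | 0     | 0.667 | 0
D          | 0     | 0     | 0     | 0     | 0.429 | 0     | 0     | 0.571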
Ciphertext only Cryptanalysis (5)
Suppose the following plaintext has been enciphered using k2:
Plaintext: DCDBCDADCB
Ciphertext: DBDCBDADBC
On seeing the ciphertext the cryptanalyst calculates the
probability of the two possible corresponding plaintexts (s1
and s2) using the table as follows:
The ciphertext contains one A, three Bs, two Cs and four Ds.
Ciphertext only Cryptanalysis (6)
Calculating the product of the relevant conditional probabilities
for each key gives
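$\Pi_1 = 0.667 \times 0.25^3 \times 0.667^2 \times 0.429^4 = 1.57 \times 10^{-4}$
$\Pi_2 = 0.333 \times 0.75^3 \times 0.333^2 \times 0.571^4 = 1.66 \times 10^{-3}$
Since $\Pi_2$ is roughly ten times larger than $\Pi_1$, the cryptanalyst concludes that the
plaintext is s2 = DCDBCDADCB and the key was k2.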
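The conditional probabilities and the two products can be reproduced with a short Python sketch. The plaintext probabilities (0.1, 0.2, 0.3, 0.4) and the k1 mapping are not stated in the extracted text; they are assumptions inferred from the entries of the conditional probability table. The k2 mapping follows from the plaintext and ciphertext pair given above. Function names are illustrative.

```python
# Assumed source statistics, inferred from the table (not stated in the slides).
P_PLAIN = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}
P_KEY = {"k1": 0.5, "k2": 0.5}
KEYS = {
    "k1": {"A": "B", "B": "A", "C": "D", "D": "C"},  # assumed, inferred from the table
    "k2": {"A": "A", "B": "C", "C": "B", "D": "D"},  # from DCDBCDADCB -> DBDCBDADBC
}

def posterior(cipher_char):
    """P(plaintext, key | ciphertext character) for every pair that can produce it."""
    joint = {}
    for key, mapping in KEYS.items():
        for plain, cipher in mapping.items():
            if cipher == cipher_char:
                joint[(plain, key)] = P_PLAIN[plain] * P_KEY[key]
    total = sum(joint.values())
    return {pair: p / total for pair, p in joint.items()}

def key_score(ciphertext, key):
    """Product of the per-character conditional probabilities for one key."""
    score = 1.0
    for c in ciphertext:
        post = posterior(c)
        score *= next(p for (_, k), p in post.items() if k == key)
    return score

ciphertext = "DBDCBDADBC"
score_k1 = key_score(ciphertext, "k1")
score_k2 = key_score(ciphertext, "k2")
best_key = max(("k1", "k2"), key=lambda k: key_score(ciphertext, k))
print(score_k1, score_k2, best_key)
```

With unrounded probabilities the two scores come out at about 1.56 x 10^-4 and 1.67 x 10^-3; the small difference from the figures above arises because those multiply table entries already rounded to three places. Either way k2 is favoured by about a factor of ten.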
Shannon proposed two measures of
the security of a cipher system:
Cover Time: This is the time estimated to break the system with
unlimited access to plaintext and ciphertext, but using current
computing technology.
Unicity Distance: This is the amount of ciphertext required for
the key to be identified uniquely.
Unicity Distance (1)
For a source X with an alphabet of size $|X|$ and probability
distribution $\{P(x)\}$, the entropy is the expected information:
$$H(X) = -\sum_{x} P(x) \log_2 P(x)$$
Now let $M_L$ denote a random plaintext of length L giving
ciphertext $C_L$ of length L by application of key $k_x$ from key set
K.
Unicity Distance (2)
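For any ciphertext the minimum number, n, of ciphertext
symbols needed before only one key could have generated
that ciphertext satisfies:
$$n \geq \frac{H(K)}{\log_2 |X| - H(X)}$$
where $H(K)$ is the entropy of the key, $|X|$ is the alphabet size and $H(X)$ is the entropy of the source.
The unicity distance is the value of n for which this holds with equality.
For k equiprobable keys this is
$$n = \frac{\log_2 k}{\log_2 |X| - H(X)}$$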
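A minimal sketch of the calculation, assuming the usual form of the unicity distance: key entropy divided by the per-symbol redundancy $\log_2 |X| - H(X)$. The example reuses the four-symbol source with two equiprobable keys from the ciphertext only cryptanalysis slides; the symbol probabilities (0.1, 0.2, 0.3, 0.4) are an assumption inferred from that example, and the function names are illustrative.

```python
from math import log2

def entropy(probabilities):
    """Entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def unicity_distance(key_probabilities, symbol_probabilities):
    """Key entropy divided by the redundancy per plaintext symbol."""
    key_entropy = entropy(key_probabilities)
    redundancy = log2(len(symbol_probabilities)) - entropy(symbol_probabilities)
    if redundancy <= 1e-12:          # no redundancy: the key is never pinned down
        return float("inf")
    return key_entropy / redundancy

# Two equiprobable keys, assumed plaintext probabilities 0.1, 0.2, 0.3, 0.4.
toy = unicity_distance([0.5, 0.5], [0.1, 0.2, 0.3, 0.4])
print(round(toy, 2))   # 6.51

# A perfectly random four-symbol source has zero redundancy.
print(unicity_distance([0.5, 0.5], [0.25] * 4))   # inf
```

Under these assumptions about 6.5 ciphertext symbols are needed, which is consistent with the ten-symbol ciphertext in the earlier example being enough to identify the key.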
Infinite Unicity Distance
If the unicity distance is infinite then we would have a perfectly secure
system.
We have two choices:
1. Make the denominator zero. This is only true if the message is randomly generated or is perfectly compressed, neither of which is possible.
2. Make the numerator infinite. This would seem to require a key of infinite size.
However, for a message of n symbols we only need n randomly generated symbols of the key.
Then the unicity distance is greater than n and we need more ciphertext characters than the n available to break the cipher.
This is the basis of a provably unbreakable cipher.
Perfect Secrecy
This gives perfect secrecy if:
i.e. the number of keys equals the number of messages.
A HUGE amount of key data is required.
The One Time Pad (1)
Proposed by Gilbert Vernam during World War 1
The only cipher that provides perfect secrecy.
Each key is used only once
The One Time Pad (2)
The one time pad is so called because the sender originally
had a pad of paper, each page of which carried a truly
random sequence of symbols.
A page is destroyed after use so that each key is used only once.
The mixing function can be as simple as addition modulo 2.
Note: If M1 + K = C1 and M2 + K = C2 then an attacker can
compute C1 – C2 = M1 – M2 and if the messages have enough
redundancy they can be recovered.
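The note on key reuse can be checked directly with addition modulo 2 (XOR) as the mixing function. This is a minimal sketch: `secrets.token_bytes` stands in for the pad and is an operating-system random source rather than a truly random one, and the messages are made up for illustration.

```python
import secrets

def xor_bytes(a, b):
    """Addition modulo 2, applied bit by bit to two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
key = secrets.token_bytes(len(m1))   # one page of the pad (stand-in)

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)              # the same page used twice: the mistake

# Decryption is the same operation as encryption.
print(xor_bytes(c1, key) == m1)                    # True

# The key cancels out, leaving the attacker with the XOR of the two messages.
print(xor_bytes(c1, c2) == xor_bytes(m1, m2))      # True
```

Modulo 2, addition and subtraction are the same operation, so C1 – C2 and C1 + C2 are both computed by XOR.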
Approximating the One Time Pad
OTP is impractical because we cannot mathematically generate
truly random sequences.
Pseudorandom sequences, or pseudonoise, are used instead.
Implementation Using Shift Registers (1)
We can approximate a one-time pad by generating an extremely
long pseudorandom sequence (of length or more) and then
combining the elements of this sequence with plaintext
symbols in a very simple way.
The pseudorandom sequence generator in a stream cipher
consists of memory, which holds its current state, and a next
state function, which computes a new state at each step.
The output of the sequence generator is some function of its
state.
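The three parts described above (state, next state function, output function) can be sketched as a Python generator. The particular next state and output functions below are toy choices made only to show the structure; they are assumptions, they are not secure, and they are not taken from the lecture.

```python
def keystream(state, next_state, output):
    """Generic sequence generator: emit output(state), then advance the state."""
    while True:
        yield output(state)
        state = next_state(state)

def toy_next_state(state):
    """Toy next state function (a small linear congruential step). Not secure."""
    return (75 * state + 74) % 65537

def toy_output(state):
    """Toy output function: the low eight bits of the state."""
    return state & 0xFF

def stream_xor(data, seed):
    """Combine each data byte with one keystream byte by XOR."""
    ks = keystream(seed, toy_next_state, toy_output)
    return bytes(b ^ k for b, k in zip(data, ks))

message = b"stream cipher"
ciphertext = stream_xor(message, seed=1234)
print(stream_xor(ciphertext, seed=1234) == message)   # True
```

Encryption and decryption are the same operation here: both ends run the same generator from the same starting state and XOR its output with the data.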
Implementation Using Shift Registers (2)
In the following illustrations the arrows go both ways between
the State box and the Next State Function box because the
next state is a function of the current state.
Implementation Using Shift Registers (3)
A closely related cipher system is the cipher feedback (CFB)
configuration where the ciphertext is fed back into the
keystream sequence generator.
Thus the ciphertext in a message depends on all the preceding
ciphertext in the message.
This can provide message authentication – preventing an
adversary tampering with a message undetected.
Wireless technologies use stream ciphers because they
approximate the one-time pad and because they only require
an encryption card not an encryption and a decryption card.
Binary Linear Feedback Shift Registers (1)
Binary LFSRs are used to generate very long sequences of
pseudorandom numbers.
Binary Linear Feedback Shift Registers (2)
The shift register is a sequence of bits (if it is n-bits long, it is
called an n-bit shift register).
Each time a new bit is needed all bits in the shift register are
shifted one position to the right.
The new left-most bit is computed as a function of the other bits
in the register. The output of the shift register is 1 bit, often
the least significant bit.
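The description above translates into a few lines of Python. This is a minimal sketch in which the feedback function is the XOR of selected register positions (which is what makes the register linear); the 4-bit register, its tap positions and its starting state are arbitrary choices for illustration.

```python
from itertools import islice

def lfsr(state, taps):
    """Yield output bits forever.

    state: list of n bits, left-most first.
    taps:  indices of the register positions XORed to form the new left-most bit.
    """
    state = list(state)
    while True:
        out = state[-1]                   # output is the right-most bit
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]   # shift right, insert the new left-most bit
        yield out

bits = list(islice(lfsr([1, 0, 0, 0], taps=[2, 3]), 30))
print("".join(map(str, bits[:15])))   # 000100110101111
print(bits[:15] == bits[15:])         # True
```

With these taps the 4-bit register passes through all 15 non-zero states before repeating, so the output has period 2^4 - 1 = 15. In general an n-bit register can give a period of at most 2^n - 1.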
The Security of LFSRs
LFSRs are not secure because of their linearity.
Only 2n consecutive bits from the register are required to attack
an LFSR with n stages.
To obtain the state and feedback coefficients of the register
requires only one matrix inversion since we are solving 2n
linear equations.
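The attack can be sketched as follows, assuming the attacker has 2n consecutive output bits a[0..2n-1] of an n-stage register whose output obeys a[k+n] = c[0]a[k] xor ... xor c[n-1]a[k+n-1]. The first n bits give the state directly and the feedback coefficients c follow from one linear solve over GF(2). The test sequence is the output of a 4-stage register with a[k+4] = a[k] xor a[k+1]; function names are illustrative.

```python
def solve_gf2(matrix, rhs):
    """Solve matrix * x = rhs over GF(2) by Gauss-Jordan elimination.

    Returns the solution as a list of bits, or None if the matrix is singular.
    """
    n = len(rhs)
    rows = [row[:] + [b] for row, b in zip(matrix, rhs)]   # augmented matrix
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def recover_lfsr(bits, n):
    """Recover the feedback coefficients from 2n consecutive output bits."""
    matrix = [bits[k:k + n] for k in range(n)]   # row k: a[k] .. a[k+n-1]
    rhs = bits[n:2 * n]                          # row k must produce a[k+n]
    return solve_gf2(matrix, rhs)

def predict(bits, coeffs, count):
    """Extend the observed sequence by `count` bits using the recovered recurrence."""
    seq = list(bits)
    n = len(coeffs)
    for _ in range(count):
        nxt = 0
        for j, c in enumerate(coeffs):
            nxt ^= c & seq[-n + j]
        seq.append(nxt)
    return seq[len(bits):]

stream = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]
observed = stream[:8]                     # 2n bits for n = 4
coeffs = recover_lfsr(observed, 4)
print(coeffs)                             # [1, 1, 0, 0]
print(predict(observed, coeffs, 7) == stream[8:])   # True
```

Once the coefficients are known, every later keystream bit can be predicted, which is why a bare LFSR is unsuitable as a keystream generator.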
Nonlinear Methods
Combine the output of two or more registers non-linearly.
Many nonlinear combinations of LFSRs have been proposed but
all have some weaknesses making them insecure.
The idea of a nonlinear FSR has more merit, however, and the
OFB mode of the DES block cipher to be seen in block 4 is
essentially a nonlinear FSR.
Bluetooth deploys a stream cipher built using a nonlinear
combination of LFSRs.
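As an illustration of a nonlinear combination and of the kind of weakness mentioned above, the sketch below uses a Geffe-style selector, a well-known textbook construction in which one register chooses between the outputs of two others. It is chosen here only as an example and is not claimed to be the construction used in Bluetooth; the register sizes and taps are arbitrary.

```python
from itertools import islice, product

def lfsr(state, taps):
    """Yield output bits of a simple LFSR (state is a list of bits, left-most first)."""
    state = list(state)
    while True:
        out = state[-1]
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]
        yield out

def combine(x1, x2, x3):
    """Geffe-style selector: x1 chooses between x2 (x1 = 1) and x3 (x1 = 0)."""
    return (x1 & x2) ^ ((1 - x1) & x3)

def combined_keystream(r1, r2, r3):
    """Combine the outputs of three registers bit by bit."""
    for x1, x2, x3 in zip(r1, r2, r3):
        yield combine(x1, x2, x3)

ks = combined_keystream(
    lfsr([1, 0, 0], [1, 2]),
    lfsr([1, 0, 0, 0], [2, 3]),
    lfsr([1, 0, 0, 0, 0], [2, 4]),
)
first_bits = list(islice(ks, 20))

# Weakness: over all input combinations the output equals x2 three times out of four.
agree = sum(combine(a, b, c) == b for a, b, c in product((0, 1), repeat=3))
print(agree / 8)   # 0.75
```

Because the keystream agrees with one register's output 75% of the time, that register can be attacked on its own by correlation, without searching the combined state of all three.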
Homework
• http://en.wikipedia.org/wiki/Enigma_machine
• http://lifehacker.com/5978602/use-accented-characters-tomake-your-ios-password-even-stronger
• http://www.labnol.org/software/strong-passcode/27623/