Lecture 02- Introduction to Cryptography

COMMUNICATION SECURITY
LECTURE 2:
INTRODUCTION TO CRYPTOGRAPHY
Dr. Shahriar Bijani
Shahed University
Spring 2016
SLIDES REFERENCES
Matt Bishop, Computer Security: Art and Science, the author homepage, 2002-2004.
Addam Schroll, Cryptography, Purdue University.
Nikita Borisov, Cryptography, University of Illinois, CS461, 2007.

DEFINITIONS
Cryptography = the science of encryption
Cryptanalysis = the science of breaking encryption
Cryptology = cryptography + cryptanalysis
DEFINITIONS
Plaintext: A message in its natural format (readable by an attacker)
Ciphertext: Message changed to be unreadable by anyone except the intended recipients
Key: Sequence that controls the operation and behavior of the cryptographic algorithm
Keyspace: Total number of possible values of keys in a crypto algorithm

CRYPTOSYSTEM
Quintuple (E, D, M, K, C)
M set of plaintexts
K set of keys
C set of ciphertexts
E set of encryption functions e: M × K → C
D set of decryption functions d: C × K → M
CRYPTOSYSTEM SERVICES

Confidentiality

Integrity

Authenticity

Nonrepudiation

Access Control
ENCRYPTION SYSTEMS

Substitution Cipher

Replacing one letter with another

Monoalphabetic Cipher: substitutes one letter in the
ciphertext for one in the plaintext. It uses fixed substitution
over the entire message (e.g. Caesar)

Polyalphabetic Cipher: multiple substitution alphabets.
A string as a key (e.g. Vigenère)

Transposition Cipher

Reordering the letters within a message
TYPES OF CRYPTOGRAPHY


Stream Ciphers

Encrypts 1 bit (or byte) of plaintext at a time

Mixes plaintext with key stream

Good for real-time services
Block Ciphers

Encrypts a fixed size of a block (n-bits of data) at one time

Substitution and transposition
CRYPTOGRAPHIC METHODS


Symmetric

Same key for encryption and decryption

Key distribution problem
Asymmetric

Mathematically related key pairs for
encryption and decryption

Public and private keys
CRYPTOGRAPHIC METHODS

Hybrid

Combines strengths of both methods

Asymmetric distributes symmetric key

Also known as a session key

Symmetric provides bulk encryption

Example:

SSL negotiates a hybrid method
SYMMETRIC ALGORITHMS (BLOCK CIPHERS)

DES / 3DES

AES

IDEA

Blowfish

RC4 (stream cipher) / RC5

CAST

SAFER

Twofish

KASUMI

A5 (stream cipher)
ASYMMETRIC ALGORITHMS

RSA

Diffie-Hellman

El Gamal

Elliptic Curve Cryptography (ECC)
HASHING
A hash function maps any input length to a fixed-size output
A hash function h(x) must provide:
Compression: output length is small
Efficiency: h(x) easy to compute for any x
One-way: given a value y it is infeasible to find an x such that h(x) = y
Weak collision resistance: given x and h(x), infeasible to find y ≠ x such that h(y) = h(x)
Strong collision resistance: infeasible to find any x and y, with x ≠ y, such that h(x) = h(y)

HASHING ALGORITHMS


MD5

Computes 128-bit hash value

Widely used for file integrity checking
SHA-1

Computes 160-bit hash value

NIST approved message digest algorithm
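The fixed-size output property can be checked directly. The sketch below assumes Python's standard hashlib module (not something the slides reference), and the helper name digest_bits is hypothetical; it only illustrates the 128-bit and 160-bit digest lengths stated above.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for the named hashlib algorithm."""
    return len(hashlib.new(algorithm, data).digest()) * 8


# Inputs of very different lengths still give digests of one fixed size.
for message in (b"", b"a", b"a" * 10_000):
    print(len(message), digest_bits("md5", message), digest_bits("sha1", message))
```

Whatever the input length, the MD5 column stays at 128 and the SHA-1 column at 160.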
CRYPTOGRAPHIC ATTACKS
Assume adversary knows algorithm used, but not key
Three types of attacks:
Ciphertext only: adversary has only ciphertext; goal is to find plaintext, possibly key
Known plaintext: adversary has ciphertext and learns (or guesses) part of the corresponding plaintext; goal is to find key and decrypt the rest of the plaintext
Chosen plaintext: adversary may supply plaintexts and obtain corresponding ciphertext; goal is to find key (or other messages)
BASIS FOR ATTACKS

Mathematical attacks
Based on analysis of underlying mathematics
Statistical attacks
Make assumptions about the distribution of letters, pairs of letters (digrams), triplets of letters (trigrams), etc.
Called models of the language
Examine ciphertext, correlate properties with the assumptions.
TRANSPOSITION CIPHER

Rearrange letters in plaintext to produce ciphertext

Example: Rail-Fence Cipher

Plaintext is HELLO WORLD

Rearrange as
HLOOL
ELWRD

Ciphertext is HLOOL ELWRD
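The two-rail arrangement above can be sketched in a few lines. This is a minimal illustration that assumes exactly two rails and drops spaces, as in the slide; the function names are hypothetical.

```python
def rail_fence_2_encrypt(plaintext: str) -> str:
    """Two-rail fence: letters at even positions, then letters at odd positions."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    top = letters[0::2]      # first rail: H L O O L for HELLO WORLD
    bottom = letters[1::2]   # second rail: E L W R D
    return "".join(top + bottom)


def rail_fence_2_decrypt(ciphertext: str) -> str:
    """Interleave the two rails again to recover the plaintext letters."""
    half = (len(ciphertext) + 1) // 2
    top, bottom = ciphertext[:half], ciphertext[half:]
    return "".join(
        top[i // 2] if i % 2 == 0 else bottom[i // 2]
        for i in range(len(ciphertext))
    )


print(rail_fence_2_encrypt("HELLO WORLD"))  # HLOOLELWRD
print(rail_fence_2_decrypt("HLOOLELWRD"))   # HELLOWORLD
```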
ATTACKING THE CIPHER

Anagramming

If 1-gram frequencies match English frequencies, but
other n-gram frequencies do not, probably
transposition

Rearrange letters to form n-grams with highest
frequencies
EXAMPLE
Ciphertext: HLOOLELWRD
Frequencies of 2-grams beginning with H
HE 0.0305
HO 0.0043
HL, HW, HR, HD < 0.0010
Frequencies of 2-grams ending in H
WH 0.0026
EH, LH, OH, RH, DH ≤ 0.0002
Implies E follows H
EXAMPLE

Arrange so the H and E are adjacent
HE
LL
OW
OR
LD

Read off across, then down, to get original
plaintext!
SUBSTITUTION CIPHERS
Change characters in plaintext to produce ciphertext
Each letter gets mapped to another letter
E.g. A -> E, B -> R, C -> Q, ...
Example: Caesar cipher
Plaintext is HELLO WORLD
Change each letter to the third letter following it (X goes to A, Y to B, Z to C)
Key is 3, usually written as letter ‘D’
Ciphertext is KHOOR ZRUOG
CAESAR CIPHER
[Figure: Caesar cipher wheel, K = 3; outer ring is plaintext, inner ring is ciphertext]
CAESAR CIPHER

Formally


Encrypt(Letter, Key) = (Letter + Key) (mod 26)
Decrypt(Letter, Key) = (Letter - Key) (mod 26)

Encrypt(“NIKITA”, 3) = “QLNLWD”
Decrypt(“QLNLWD”, 3) = “NIKITA”

More Formally
M = { sequences of letters }
K = { i | i is an integer and 0 ≤ i ≤ 25 }
E = { Ek | k ∈ K and for all letters m, Ek(m) = (m + k) mod 26 }
D = { Dk | k ∈ K and for all letters c, Dk(c) = (26 + c – k) mod 26 }
C = M
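The definitions of Ek and Dk translate almost directly into code. The sketch below assumes uppercase letters numbered A = 0 through Z = 25 and passes any other character through unchanged; the function names are hypothetical.

```python
def shift_letter(ch: str, key: int) -> str:
    """Apply (letter + key) mod 26 to one uppercase letter; leave other characters alone."""
    if not ("A" <= ch <= "Z"):
        return ch
    return chr((ord(ch) - ord("A") + key) % 26 + ord("A"))


def encrypt(message: str, key: int) -> str:
    """Ek(m) = (m + k) mod 26, applied letter by letter."""
    return "".join(shift_letter(ch, key) for ch in message)


def decrypt(ciphertext: str, key: int) -> str:
    """Dk(c) = (c - k) mod 26, which is the same as encrypting with -k."""
    return encrypt(ciphertext, -key)


print(encrypt("NIKITA", 3))       # QLNLWD
print(decrypt("QLNLWD", 3))       # NIKITA
print(encrypt("HELLO WORLD", 3))  # KHOOR ZRUOG
```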
ATTACKS

Ciphertext only attack:


Recover plaintext knowing only the ciphertext
Ciphertext:

HSPAA SLRUV DSLKN LPZHK HUNLY VBZAO
PUN
FREQUENCY ANALYSIS
HSPAA SLRUV DSLKN LPZHK HUNLY VBZAO PUN
Find most frequent letters
4 times: L
3 times: A, H, N, P, S, U
Guess: Decrypt(L) = E
Key = L - E = 7
Decrypt(HSPAA SLRUV DSLKN LPZHK HUNLY VBZAO PUN, 7) = ALITT LEKNO WLEDG EISAD ANGER OUSTH ING
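The guess above (most frequent ciphertext letter stands for E) can be automated. This is a sketch under that single assumption, which can fail on short or unusual texts; the helper names are hypothetical.

```python
from collections import Counter


def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Shift each uppercase letter back by key positions; keep spaces as they are."""
    return "".join(
        chr((ord(ch) - ord("A") - key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )


def guess_key(ciphertext: str, assumed_plain: str = "E") -> int:
    """Assume the most frequent ciphertext letter is the encryption of assumed_plain."""
    letters = [ch for ch in ciphertext if "A" <= ch <= "Z"]
    most_frequent, _count = Counter(letters).most_common(1)[0]
    return (ord(most_frequent) - ord(assumed_plain)) % 26


ciphertext = "HSPAA SLRUV DSLKN LPZHK HUNLY VBZAO PUN"
key = guess_key(ciphertext)
print(key, caesar_decrypt(ciphertext, key))
```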

BRUTE FORCE
Ciphertext = IGKYGXOYOTYKIAXK
Decrypt(IGKYGXOYOTYKIAXK, 1) = HFJXFWNXNSXJHZWJ
Decrypt(IGKYGXOYOTYKIAXK, 2) = GEIWEVMWMRWIGYVI
Decrypt(IGKYGXOYOTYKIAXK, 3) = FDHVDULVLQVHFXUH
Decrypt(IGKYGXOYOTYKIAXK, 4) = ECGUCTKUKPUGEWTG
Decrypt(IGKYGXOYOTYKIAXK, 5) = DBFTBSJTJOTFDVSF
Decrypt(IGKYGXOYOTYKIAXK, 6) = CAESARISINSECURE
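The exhaustive search is small enough to write out in full. The sketch below simply tries all 26 keys and leaves it to the reader to spot the readable candidate; the function names are hypothetical.

```python
def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Shift each uppercase letter back by key positions (A = 0, ..., Z = 25)."""
    return "".join(
        chr((ord(ch) - ord("A") - key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )


def brute_force(ciphertext: str) -> dict[int, str]:
    """Return the candidate plaintext for every possible key 0..25."""
    return {key: caesar_decrypt(ciphertext, key) for key in range(26)}


for key, candidate in brute_force("IGKYGXOYOTYKIAXK").items():
    print(key, candidate)
```

Key 6 yields CAESARISINSECURE, the only candidate that reads as English.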
ATTACKING THE CIPHER

Exhaustive search



If the key space is small enough, try all possible keys
until you find the right one
Caesar cipher has 26 possible keys
Statistical analysis

Compare to 1-gram model of English
STATISTICAL ATTACK

Compute frequency of each letter in ciphertext (here KHOOR ZRUOG):
G 0.1    H 0.1    K 0.1    O 0.3
R 0.2    U 0.1    Z 0.1
Apply 1-gram model of English
Frequency of characters (1-grams) in English is on next slide
CHARACTER FREQUENCIES (1-GRAMS) IN ENGLISH
a  0.080    h  0.060    n  0.070    t  0.090
b  0.015    i  0.065    o  0.080    u  0.030
c  0.030    j  0.005    p  0.020    v  0.010
d  0.040    k  0.005    q  0.002    w  0.015
e  0.130    l  0.035    r  0.065    x  0.005
f  0.020    m  0.030    s  0.060    y  0.020
g  0.015                            z  0.002
STATISTICAL ANALYSIS

f(c) is frequency of character c in ciphertext
p(x) is frequency of character x in English
φ(i) is correlation of frequency of letters in ciphertext with corresponding letters in English, assuming key is i
φ(i) = Σ (0 ≤ c ≤ 25) f(c) p(c – i), with c – i taken mod 26, so here
φ(i) = 0.1p(6 – i) + 0.1p(7 – i) + 0.1p(10 – i) + 0.3p(14 – i) + 0.2p(17 – i) + 0.1p(20 – i) + 0.1p(25 – i)
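The correlation can be computed mechanically from the ciphertext and the 1-gram table. The sketch below assumes the ciphertext KHOOR ZRUOG and the English frequencies from the lecture's table; the names P and correlation are hypothetical.

```python
from collections import Counter

# English 1-gram frequencies p(x) for x = a..z, copied from the lecture's table.
P = [0.080, 0.015, 0.030, 0.040, 0.130, 0.020, 0.015, 0.060, 0.065,
     0.005, 0.005, 0.035, 0.030, 0.070, 0.080, 0.020, 0.002, 0.065,
     0.060, 0.090, 0.030, 0.010, 0.015, 0.005, 0.020, 0.002]


def correlation(ciphertext: str, key: int) -> float:
    """phi(key) = sum over ciphertext letters c of f(c) * p((c - key) mod 26)."""
    letters = [ord(ch) - ord("A") for ch in ciphertext.upper() if ch.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return sum((count / n) * P[(c - key) % 26] for c, count in counts.items())


# Rank all 26 keys by correlation, highest first, and show the top four.
ranking = sorted(range(26), key=lambda i: correlation("KHOOR ZRUOG", i), reverse=True)
for i in ranking[:4]:
    print(i, round(correlation("KHOOR ZRUOG", i), 4))
```

A high correlation only marks a key as a candidate; the candidates still have to be tried to see which one gives readable plaintext.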
CORRELATION: φ(i) FOR 0 ≤ i ≤ 25
i   φ(i)      i   φ(i)      i   φ(i)      i   φ(i)
0   0.0482    7   0.0442    13  0.0520    19  0.0315
1   0.0364    8   0.0202    14  0.0535    20  0.0302
2   0.0410    9   0.0267    15  0.0226    21  0.0517
3   0.0575    10  0.0635    16  0.0322    22  0.0380
4   0.0252    11  0.0262    17  0.0392    23  0.0370
5   0.0190    12  0.0325    18  0.0299    24  0.0316
6   0.0660                                25  0.0430
THE RESULT

Most probable keys, based on φ:
i = 6, φ(i) = 0.0660: plaintext EBIIL TLOLA
i = 10, φ(i) = 0.0635: plaintext AXEEH PHKEW
i = 3, φ(i) = 0.0575: plaintext HELLO WORLD
i = 14, φ(i) = 0.0535: plaintext WTAAD LDGAS
Only English phrase is for i = 3
That’s the key (3 or ‘D’)
CAESAR’S PROBLEM

Key is too short

Can be found by exhaustive search

Statistical frequencies not concealed well


They look too much like regular English letters
So make it longer

Multiple letters in key

Idea is to smooth the statistical frequencies to make
cryptanalysis harder
VIGENÈRE CIPHER

Like Caesar cipher, but use a phrase

Example

Message THE BOY HAS THE BALL

Key VIG

Encipher using Caesar cipher for each letter:
key    VIGVIGVIGVIGVIGV
plain  THEBOYHASTHEBALL
cipher OPKWWECIYOPKWIRG
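The example above can be reproduced with a short routine. The sketch assumes uppercase letters only, with no spaces, and the numbering A = 0 through Z = 25, so that key letter V shifts by 21; the function names are hypothetical.

```python
from itertools import cycle


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Add the repeating key to the plaintext letter by letter, mod 26."""
    return "".join(
        chr((ord(p) - ord("A") + ord(k) - ord("A")) % 26 + ord("A"))
        for p, k in zip(plaintext, cycle(key))
    )


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeating key from the ciphertext letter by letter, mod 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + ord("A"))
        for c, k in zip(ciphertext, cycle(key))
    )


print(vigenere_encrypt("THEBOYHASTHEBALL", "VIG"))  # OPKWWECIYOPKWIRG
print(vigenere_decrypt("OPKWWECIYOPKWIRG", "VIG"))  # THEBOYHASTHEBALL
```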
VIGENERE CIPHER

A different Caesar cipher per letter
  MORESECURETHANCAESAR (Plaintext)
+ SECRETSECRETSECRETSE (Key)
= FTUWXYVZUWYBTSFSJMTW (Ciphertext)
In this example letters are numbered A = 1, ..., Z = 26 (a sum of 26 gives Z)
M (13) + S (19) = F (6) mod 26
O (15) + E (5) = T (20) mod 26
...

...
VIGENERE ANALYSIS

Key space?
26^Length(Key)
Frequency analysis?
Doesn’t work directly, because successive letters are enciphered with different Caesar keys
For many years, the Vigenère cipher was considered unbreakable!
USEFUL TERMS

period: length of key
In earlier example, period is 3
tableau: table used to encipher and decipher
Vigenère cipher has key letters on top, plaintext letters on the left
polyalphabetic: the key has several different letters
Caesar cipher is monoalphabetic
VIGENERE ANALYSIS
Guess period of the cipher = p
Construct p frequency tables
Cryptanalyze each one
http://math.ucsd.edu/~crypto/java/EARLYCIPHERS/Vigenere.html
Better yet, recover period
Look for repeated n-grams
VIGENERE ANALYSIS

The index of coincidence (IC)
measures the differences in the frequencies of the letters in the ciphertext.
It is the probability that two randomly chosen letters from the ciphertext will be the same:
IC = [1 / (N(N – 1))] Σ (0 ≤ c ≤ 25) Fc(Fc – 1)
Fc = frequency (number of occurrences) of cipher character c, N = length of the ciphertext
Indices of coincidences for different periods:
[Table: expected index of coincidence for each period]
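The probability described above can be computed by counting pairs. The sketch assumes the standard definition IC = Σ Fc(Fc – 1) / (N(N – 1)), where Fc is the number of occurrences of letter c and N is the ciphertext length; the function name is hypothetical.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement from text are equal."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    n = len(letters)
    if n < 2:
        return 0.0  # fewer than two letters: no pair can be drawn
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


print(index_of_coincidence("AABB"))  # 4 equal ordered pairs out of 12
print(index_of_coincidence("ABCD"))  # no repeated letters
```

A text enciphered with a single alphabet keeps the uneven letter distribution of English and so has a higher IC than a text whose letters are spread across several alphabets.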
VIGENÈRE TABLEAU

The key letters on top, plaintext letters on the left
RELEVANT PARTS OF TABLEAU
     G  I  V
A    G  I  V
B    H  J  W
E    K  M  Z
H    N  P  C
L    R  T  G
O    U  W  J
S    Y  A  N
T    Z  B  O
Y    E  G  T
Tableau shown has relevant rows, columns only
Example encipherments:
key V, letter T: follow V column down to T row (giving “O”)
key I, letter H: follow I column down to H row (giving “P”)
VIGENÈRE ANALYSIS

Ciphertext:
ADQYS MIUSB OXKKT MIBHK IZOOO
EQOOG IFBAG KAUMF VVTAA CIDTW
MOCIO EQOOG BMBFV ZGGWP CIEKQ
HSNEW VECNE DLAAV RWKXS VNSVP
HCEUT QOIOF MEGJS WTPCH AJMOC
HIUIX
Could this be a Caesar cipher?
We find that the index of coincidence is 0.043, which indicates a key of length 5 or more.
So we assume that the key is of length greater than 1, and apply the Kasiski method

VIGENÈRE ANALYSIS

Repetitions of 2 letters or more (Start and End are the 0-based positions of the two occurrences):
Letters   Start   End   Gap length   Factors of gap length
MI        5       15    10           2, 5
OO        22      27    5            5
OEQOOG    24      54    30           2, 3, 5
FV        39      63    24           2, 2, 2, 3
AA        43      87    44           2, 2, 11
MOC       50      122   72           2, 2, 2, 3, 3
QO        56      105   49           7, 7
PC        69      117   48           2, 2, 2, 2, 3
NE        77      83    6            2, 3
SV        94      97    3            3
CH        118     124   6            2, 3
The factors that occur most often in the gaps are 2 (in eight gaps) and 3 (in seven gaps). As a first guess, let us try 6 (= 2 × 3).
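The search for repetitions and gap factors can be sketched as follows. The ciphertext is the one from this example, written without spaces; positions are 0-based, and the function names are hypothetical. This is only one simple way to carry out the Kasiski step.

```python
from collections import defaultdict

CIPHERTEXT = (
    "ADQYSMIUSBOXKKTMIBHKIZOOO"
    "EQOOGIFBAGKAUMFVVTAACIDTW"
    "MOCIOEQOOGBMBFVZGGWPCIEKQ"
    "HSNEWVECNEDLAAVRWKXSVNSVP"
    "HCEUTQOIOFMEGJSWTPCHAJMOC"
    "HIUIX"
)


def repeated_ngrams(text: str, n: int) -> dict[str, list[int]]:
    """Map each n-gram that occurs more than once to its 0-based start positions."""
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}


def prime_factors(n: int) -> list[int]:
    """Prime factorization by trial division, smallest factors first."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


# Print each repeated trigram with its positions and the factors of the gap.
for gram, pos in repeated_ngrams(CIPHERTEXT, 3).items():
    gap = pos[1] - pos[0]
    print(gram, pos, gap, prime_factors(gap))
```

Factors shared by many gaps are candidate periods, since a repetition usually arises when the same plaintext fragment lines up with the same part of the key.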
VIGENÈRE ANALYSIS


To verify this guess, we compute the index of coincidence for each alphabet.
We first arrange the message into 6 columns.
Each column represents one alphabet.
A D Q Y S M
I U S B O X
K K T M I B
H K I Z O O
O E Q O O G
I F B A G K
A U M F V V
T A A C I D
T W M O C I
O E Q O O G
B M B F V Z
G G W P C I
E K Q H S N
E W V E C N
E D L A A V
R W K X S V
N S V P H C
E U T Q O I
O F M E G J
S W T P C H
A J M O C H
I U I X
The indices of coincidence are:
Alphabet #1: IC = 0.069    Alphabet #4: IC = 0.056
Alphabet #2: IC = 0.078    Alphabet #5: IC = 0.124
Alphabet #3: IC = 0.078    Alphabet #6: IC = 0.043
All ICs indicate a single alphabet except for the ICs of alphabets #4 (period between 1 and 2) and #6 (period between 5 and 10).
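The per-alphabet indices of coincidence can be recomputed by splitting the ciphertext into every sixth letter. The sketch assumes the standard definition IC = Σ Fc(Fc – 1) / (N(N – 1)) and a guessed period of 6; the function names are hypothetical.

```python
from collections import Counter

CIPHERTEXT = (
    "ADQYSMIUSBOXKKTMIBHKIZOOO"
    "EQOOGIFBAGKAUMFVVTAACIDTW"
    "MOCIOEQOOGBMBFVZGGWPCIEKQ"
    "HSNEWVECNEDLAAVRWKXSVNSVP"
    "HCEUTQOIOFMEGJSWTPCHAJMOC"
    "HIUIX"
)


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement from text are equal."""
    n = len(text)
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def split_columns(text: str, period: int) -> list[str]:
    """Column i holds the letters enciphered with key letter i."""
    return [text[i::period] for i in range(period)]


for number, column in enumerate(split_columns(CIPHERTEXT, 6), start=1):
    print(f"Alphabet #{number}: IC = {index_of_coincidence(column):.3f}")
```

Run on this ciphertext, the sketch gives 0.069, 0.078, 0.078, 0.056, 0.124 and 0.043 for the six alphabets.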

VIGENÈRE ANALYSIS

Counting characters in each column (alphabet):
Column     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
#1         3 1 0 0 4 0 1 1 3 0 1 0 0 1 3 0 0 1 1 2 0 0 0 0 0 0
#2         1 0 0 2 2 2 1 0 0 1 3 0 1 0 0 0 0 0 1 0 4 0 4 0 0 0
#3         1 2 0 0 0 0 0 0 2 0 1 1 4 0 0 0 4 0 1 3 0 2 1 0 0 0
#4         2 1 1 0 2 2 0 1 0 0 0 0 1 0 4 3 1 0 0 0 0 0 0 2 1 1
#5         1 0 5 0 0 0 2 1 2 0 0 0 0 0 5 0 0 0 3 0 0 2 0 0 0 0
#6         0 1 1 1 0 0 2 2 3 1 1 0 1 2 1 0 0 0 0 0 0 3 0 1 0 1
unshifted  H M M M H M M H H M M M M H H M L H H H M L L L L L
An unshifted alphabet has the characteristics in the last row (L = low frequency, M = moderate frequency, H = high frequency).
Now compare the frequency counts in the six alphabets with the frequency count of the unshifted alphabet.
The first alphabet matches the characteristics of the unshifted alphabet (note the values for A, E, and I in particular).
VIGENÈRE ANALYSIS
The 3rd alphabet seems to be shifted with I mapping to A.
In the 6th alphabet, V maps to A.

VIGENÈRE ANALYSIS
With proper spacing and punctuation, we have
A LIMERICK PACKS LAUGHS ANATOMICAL
INTO SPACE THAT IS QUITE ECONOMICAL
BUT THE GOOD ONES I'VE SEEN SO SELDOM
ARE CLEAN, AND THE CLEAN ONES SO
SELDOM ARE COMICAL.

The key is ASIMOV.
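With the key recovered, the whole message can be deciphered in one pass. The sketch assumes the numbering A = 0 through Z = 25 (so key letter A leaves a letter unchanged) and the ciphertext written without spaces; the function name is hypothetical.

```python
from itertools import cycle

CIPHERTEXT = (
    "ADQYSMIUSBOXKKTMIBHKIZOOO"
    "EQOOGIFBAGKAUMFVVTAACIDTW"
    "MOCIOEQOOGBMBFVZGGWPCIEKQ"
    "HSNEWVECNEDLAAVRWKXSVNSVP"
    "HCEUTQOIOFMEGJSWTPCHAJMOC"
    "HIUIX"
)


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeating key from the ciphertext letter by letter, mod 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + ord("A"))
        for c, k in zip(ciphertext, cycle(key))
    )


print(vigenere_decrypt(CIPHERTEXT, "ASIMOV"))
```

The output is the limerick above as one unbroken string of capitals; spacing and punctuation have to be restored by hand.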
VIGENERE ANALYSIS

Here is a ciphertext message