Code Breaking - da Vinci Decathlon

Da Vinci Challenge 2014 - Code Breaking
Due to a teacher vocational exchange in
2010/11, nine teams can compete in 2014.
Code breaking (the interpretation of SECRET WRITING) is
one of the Da Vinci Challenges.
This presentation aims to describe simple ways to write
secretly (encrypt), and offer routes to interpret secret
writings (decrypt).
You have already read over hidden words in the subtitle of this slide.
Code Breaking
Codes & Ciphers
Sender: Key & plaintext -> Algorithm -> CIPHERTEXT -> Algorithm -> Key & plaintext: Agent
To encipher a secret message, the Sender uses a formula (Algorithm) to
convert Plaintext into CIPHERTEXT; the Agent reverses the process to
convert the cipher back to plaintext. Security can be increased by locking
the cipher with a Key which is known only to the Sender and the Agent.
To break a code, we need to recognise its algorithm and deduce the key.
Code Breaking
Coding Algorithms
SECRET WRITING
• STEGANOGRAPHY (Hidden writing)
• CRYPTOGRAPHY (Scrambled writing)
  • TRANSPOSITION (Shuffled)
  • SUBSTITUTION (Replaced)
  • ENCODE (Replace Words)
  • ENCIPHER (Replace, Transpose, Substitute Letters)
We will look at standard methods of constructing and deconstructing each of
these algorithms, and at frequency analysis, which is a useful tool for making the
initial break into a cipher.
But beware: encoders like to include twists and false directions in their ciphers.
They use keys to obscure the algorithm, and occasionally hide the whole text.
Code Breaking
STEGANOGRAPHY
(Hidden writing)
Messages can be hidden within pictures or within text
Remember a hidden message in the sub-title of slide 1?
Due to A teacher Vocational exchange In
2010/11, Nine teams Can compete In 2014.
This demonstrates the weaknesses of Steganography:
• The agent has to know where to look
• The agent has to know how to look (the algorithm for DA VINCI was the first
letter of every odd-numbered word)
• The delivery is complicated: embedding a message requires a huge amount of
text, and the ensuing cipher text is often awkward.
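The extraction rule for the subtitle can be sketched in Python. This is only an illustration: it assumes words are separated by spaces and that a token such as 2010/11 counts as a word, and the function name hidden_message is made up for this example.

```python
def hidden_message(text):
    """Return the first character of every odd-numbered word (1st, 3rd, 5th, ...)."""
    words = text.split()
    # words[0::2] picks the 1st, 3rd, 5th ... words
    return "".join(word[0] for word in words[0::2]).upper()

subtitle = "Due to a teacher vocational exchange in 2010/11, nine teams can compete in 2014."
print(hidden_message(subtitle))  # DAVINCI
```

The agent still has to know both where to look (the subtitle) and how to look (odd-numbered words); the code only automates the second part.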
Code Breaking
ENCODE
(Replace Words)
In true codes, whole words are replaced by symbols or entirely
different words. They are only viable if supported by code books
(dictionaries) possessed by both the sender and the agent.
assassinate = D   capture = J   general = S    king = q     immediately = 08   today = 75
blackmail = P     protect = Z   minister = W   prince = j   tonight = 28       tomorrow = 4
capture the prince tonight encodes as J j 28
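A code book is simply a lookup table shared by both parties. The Python sketch below uses the twelve entries listed above and assumes that words missing from the book (such as "the") are simply dropped, which matches the J j 28 example. The names CODE_BOOK, encode and decode are illustrative.

```python
# Code book shared by sender and agent (entries taken from the slide)
CODE_BOOK = {
    "assassinate": "D", "capture": "J", "blackmail": "P", "protect": "Z",
    "general": "S", "king": "q", "minister": "W", "prince": "j",
    "immediately": "08", "today": "75", "tonight": "28", "tomorrow": "4",
}
# Reverse book for the agent; symbols are case-sensitive (J is not j)
DECODE_BOOK = {symbol: word for word, symbol in CODE_BOOK.items()}

def encode(message):
    """Replace each known word by its symbol; unknown words are dropped (assumption)."""
    return " ".join(CODE_BOOK[w] for w in message.lower().split() if w in CODE_BOOK)

def decode(coded):
    """Look each symbol up in the reverse book."""
    return " ".join(DECODE_BOOK[s] for s in coded.split())

print(encode("capture the prince tonight"))  # J j 28
print(decode("J j 28"))                      # capture prince tonight
```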
Cockney rhyming slang uses coded words for nouns e.g.
APPLES
PLATES
MINCE
DOG
TROUBLE
WEASEL
Code Breaking
ENCIPHER
by TRANSPOSING letters
A simple encoding method involves transposing (scrambling) existing
letters, using an algorithm known to the agent.
The message IHTSIS CINNAE AEDJYS was enciphered by breaking the message
into groups of 3 letters, reversing each group, and putting it back together.
Reversing the process we get:
this is nice and easy j (note: the final j is a null character, added as padding)
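The group-reversal step is its own inverse, so one small function both enciphers and deciphers. A minimal Python sketch, assuming spaces are ignored and the group size is 3 as in the example; the name reverse_groups is illustrative.

```python
def reverse_groups(text, size=3):
    """Split the letters into groups of `size` and reverse each group."""
    letters = "".join(text.split())  # drop the spaces first
    groups = [letters[i:i + size] for i in range(0, len(letters), size)]
    return "".join(group[::-1] for group in groups)

print(reverse_groups("IHTSIS CINNAE AEDJYS"))  # THISISNICEANDEASYJ
```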
A more common transposition algorithm is to anagram the message,
e.g. ATHEIST IS NOSY – is an anagram of this is not easy
Transposition ciphers are very difficult to spot, so any in the Da Vinci
challenge are likely to be identified as such, unless they are the self-evident
Railfence or Scytale ciphers, which are described overleaf.
Code Breaking
ENCIPHER
by TRANSPOSING letters
Transposition continued
Railfence (simple) - imagine a spiked fence with letters arranged as shown
b . s . c . a . l . e . c . t . a . p . s . t . o .
. a . i . r . i . f . n . e . r . n . o . i . i . n
and written BSCALECTAPSTO.AIRIFNERNOIIN
To decode, split the ciphertext into 2 equal halves and take alternate letters from each half.
Railfence (multiple groups)
y o u r g r i d
c a n b e a n y
s i z e y o u l
i k e f o r t h
e m e s s a g e
To encode, just read down the columns to get
YCSIE OAIKM UNZEE RBEFS GEYOS RAORA INUTG DYLHE
To decode, take one letter from each block in turn (the first letter of every block, then the second, and so on).
Of course this would be difficult to break if the size &
number of blocks did not match the grid, unless the
agent knows the size already or the number of letters
is a perfect square, e.g. CEEOBADRK
C E E
O B A
D R K
Scytale - demonstration
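Both railfence variants above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the simple version assumes the two halves are equal and separated by a full stop, and the grid version assumes the message exactly fills the grid. All function names are made up for this example.

```python
def simple_railfence_decode(cipher):
    """Interleave the two halves: one letter from the top rail, one from the bottom."""
    top, bottom = cipher.split(".")
    return "".join(a + b for a, b in zip(top, bottom))

def grid_encode(plain, width):
    """Write the letters in rows of `width`, then read down the columns."""
    letters = "".join(plain.split()).upper()
    rows = [letters[i:i + width] for i in range(0, len(letters), width)]
    return " ".join("".join(column) for column in zip(*rows))

def grid_decode(cipher):
    """Take the 1st letter of every block, then the 2nd, and so on."""
    blocks = cipher.split()
    return "".join("".join(row) for row in zip(*blocks))

# The slide's ciphertext spells "tranposition" without its first s
print(simple_railfence_decode("BSCALECTAPSTO.AIRIFNERNOIIN"))
print(grid_encode("your grid can be any size you like for the message", 8))
print(grid_decode("CEE OBA DRK"))  # CODEBREAK
```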
Code Breaking
ENCIPHER
Substitution Ciphers – non alphabetic
by SUBSTITUTING symbols
Substitution ciphers where letters are replaced by symbols are the
easiest to crack – if the agent knows the symbols. Examples are:
Pigpen [Masonic] Cipher
[Figure: the words "a pigpen cipher example" written in pigpen symbols]
Baconian [Binary] Cipher
Letter   A/B form   £/$ form
A        AAAAA      £££££
B        AAAAB      ££££$
C        AAABA      £££$£
D        AAABB      £££$$
E        AABAA      ££$££
...
N        ABBAB      £$$£$
O        ABBBA      £$$$£
P        ABBBB      £$$$$
Q        BAAAA      $££££
R        BAAAB      $£££$
AAAAB AAAAA AAABA ABBBA ABBAB ABAAA AAAAA ABBAB
AAABA ABAAA ABBBB AABBB AABAA BAAAB
deciphers as: baconian cipher
To decode symbol substitution ciphers like these (and others such as
Morse, ASCII, Wingdings & hieroglyphs), simply do a back substitution.
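For the Baconian table, back substitution amounts to reading each group of five as a binary number. The Python sketch below assumes the 26-letter variant shown in the table (A = 00000, B = 00001, ... so N = ABBAB) and lets any pair of symbols stand in for A and B; the function name bacon_decode is illustrative.

```python
def bacon_decode(cipher, zero="A", one="B"):
    """Decode space-separated 5-symbol groups; `zero`/`one` are the two symbols used."""
    plain = []
    for group in cipher.split():
        bits = group.replace(zero, "0").replace(one, "1")
        plain.append(chr(ord("a") + int(bits, 2)))  # 00000 -> a, 00001 -> b, ...
    return "".join(plain)

message = ("AAAAB AAAAA AAABA ABBBA ABBAB ABAAA AAAAA ABBAB "
           "AAABA ABAAA ABBBB AABBB AABAA BAAAB")
print(bacon_decode(message))                          # baconiancipher
print(bacon_decode("£££$£ £$$$£", zero="£", one="$"))  # co
```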
Code Breaking
ENCIPHER
Substitution Ciphers – numeric
by SUBSTITUTING numbers
 a  b  c  d  e  f  g  h  i  j  k  l  m
 1  2  3  4  5  6  7  8  9 10 11 12 13
 n  o  p  q  r  s  t  u  v  w  x  y  z
14 15 16 17 18 19 20 21 22 23 24 25 26
Substitution ciphers where letters are replaced
by numbers are the easiest to crack – if the
agent knows the algorithm.
Decipher 4 1 12 2 5 1 20 20 9 5 8 9 7 8 19 3 8 15 15 12
Answer: dalbeattie high school
That was easy, but beware: a common trick is to use descending numbers, a=26 to z=1.
Now see how using a key number (e.g. 2468) complicates the cipher
plain text     d  a  l  b  e  a  t  t  i  e  h  i  g  h  s  c  h  o  o  l
code numbers   4  1 12  2  5  1 20 20  9  5  8  9  7  8 19  3  8 15 15 12
key number     2  4  6  8  2  4  6  8  2  4  6  8  2  4  6  8  2  4  6  8
CODE TEXT      6  5 18 10  7  5 26 28 11  9 14 17  9 12 25 11 10 19 21 20
This message would be virtually impossible to decode without the agent having the key number.
Consecutive letters are advanced by differing amounts, so even double letters have different
codes. However, reversing the process is simple: subtract the key digits in turn.
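The key-number scheme can be sketched in Python as follows. It assumes a=1 to z=26, ignores anything that is not a letter, and repeats the key digits for as long as needed; the function names are illustrative.

```python
from itertools import cycle

def keynumber_encipher(plain, key):
    """Add the repeating key digits to the letter numbers (a=1 ... z=26)."""
    letters = [c for c in plain.lower() if "a" <= c <= "z"]
    return [ord(c) - 96 + int(k) for c, k in zip(letters, cycle(key))]

def keynumber_decipher(code, key):
    """Subtract the repeating key digits and turn the numbers back into letters."""
    return "".join(chr(n - int(k) + 96) for n, k in zip(code, cycle(key)))

code = keynumber_encipher("dalbeattie high school", "2468")
print(code)
print(keynumber_decipher(code, "2468"))  # dalbeattiehighschool
```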
Code Breaking
ENCIPHER
Substitution Ciphers – numeric
by SUBSTITUTING numbers
Substitution ciphers where letters are replaced
by numbers are easy to crack – if the agent
knows the algorithm.
Decipher 4 1 22 9 14 3 9 13 9 12 1 14
Answer: da vinci milan
That was easy. Now see how using a single key letter (e.g. q) also complicates the cipher
plain text       q  d  a  v  i  n  c  i  m  i  l  a  n
code numbers    17  4  1 22  9 14  3  9 13  9 12  1 14
add code pairs     21  5 23 31 23 17 12 22 22 21 13 15
CODE TEXT          21  5 23 31 23 17 12 22 22 21 13 15
Reversing the process is also simple, so long as the agent knows the key letter.
CODE TEXT            21  5 23 31 23 17 12 22 22 21 13 15
Remove leading code  17  4  1 22  9 14  3  9 13  9 12  1
code                  4  1 22  9 14  3  9 13  9 12  1 14
plain text            d  a  v  i  n  c  i  m  i  l  a  n
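The key-letter scheme adds each code number to the one before it, with the key letter placed in front. A minimal Python sketch, assuming a=1 to z=26 and ignoring spaces; the function names are illustrative.

```python
def keyletter_encipher(plain, key_letter):
    """Prefix the key letter, number the letters a=1..z=26, add neighbouring pairs."""
    numbers = [ord(c) - 96 for c in (key_letter + plain).lower() if "a" <= c <= "z"]
    return [a + b for a, b in zip(numbers, numbers[1:])]

def keyletter_decipher(code, key_letter):
    """Subtract the previous code number each time, starting from the key letter."""
    previous = ord(key_letter.lower()) - 96
    plain = []
    for total in code:
        previous = total - previous  # recovers the next letter's number
        plain.append(chr(previous + 96))
    return "".join(plain)

code = keyletter_encipher("da vinci milan", "q")
print(code)                           # [21, 5, 23, 31, 23, 17, 12, 22, 22, 21, 13, 15]
print(keyletter_decipher(code, "q"))  # davincimilan
```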
Code Breaking
ENCIPHER
Substitution Ciphers – monoalphabetic
by SUBSTITUTING letters
Monoalphabetic substitution ciphers replace each letter with another letter
in the same alphabet. The simplest of these ciphers is the:
Atbash Cipher
In this cipher the first letter of the alphabet is replaced by the last [A = Z],
the second letter by the penultimate [B = Y] and so on, until we get:
A B C D E F G H I J K L M
Z Y X W V U T S R Q P O N
This cipher was first used in Hebrew, and its name comes from the Hebrew equivalent of
A=Z, B=Y, which is aleph=tav, beth=shin.
An interesting quirk of the Atbash cipher is that in English some words
encipher into other words, e.g. HOLD IRK ZOO TILT
- which deciphers as slow rip all grog
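Because Atbash simply mirrors the alphabet, the same operation enciphers and deciphers. A short Python sketch using the standard str.maketrans and str.translate; the name atbash is illustrative.

```python
import string

# Map A..Z onto Z..A; applying the table twice returns the original text
ATBASH_TABLE = str.maketrans(string.ascii_uppercase, string.ascii_uppercase[::-1])

def atbash(text):
    """Mirror every letter in the alphabet; other characters are left alone."""
    return text.upper().translate(ATBASH_TABLE)

print(atbash("HOLD IRK ZOO TILT"))  # SLOW RIP ALL GROG
```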
Code Breaking
ENCIPHER
by SUBSTITUTING letters
Substitution Ciphers – monoalphabetic
Substitution ciphers where letters are replaced by other letters can be the
most difficult to crack – unless the agent has the key and algorithm, e.g.
Caesar Cipher
Julius Caesar frequently wrote coded state messages. A frequent Caesar
code replaced each letter with the one 3 places further down the alphabet, thus
PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH is the cipher for
men freely believe that which they desire.
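The Caesar shift can be sketched in Python as below. The sketch assumes the 26-letter English alphabet with wrap-around from Z back to A, and leaves spaces untouched; the name caesar is illustrative. A negative shift undoes a positive one.

```python
def caesar(text, shift):
    """Move every letter `shift` places down the alphabet, wrapping after Z."""
    result = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            result.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        else:
            result.append(ch)  # keep spaces and punctuation
    return "".join(result)

cipher = caesar("men freely believe that which they desire", 3)
print(cipher)              # PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH
print(caesar(cipher, -3))  # MEN FREELY BELIEVE THAT WHICH THEY DESIRE
```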
Code Breaking
ENCIPHER
Substitution Ciphers – monoalphabetic with key
by SUBSTITUTING letters
Caesar Cipher with key
We saw that enciphering men freely believe that which they desire with the
basic Caesar cipher gives PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH
But the cipher is made more complex if a key word (e.g. Julius Caesar) is used
to displace the letters:
Plain    a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER   J U L I S C A E R T V W X Y Z B D F G H K M N O P Q
In this case we get XSY CFSSWP USWRSMS HEJH NERLE HESP ISGRFS
Decryption requires the cipher alphabet to be rebuilt using the algorithm:
• Start with the key word(s) without repeated letters
• Fill in the remaining letters, starting with the next unused letter after the last letter of the key
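The two steps for rebuilding the cipher alphabet can be sketched in Python. One assumption is made explicit here: "next sequential unused letter" is taken to mean the first unused letter after the last letter of the key, wrapping round from Z to A, which reproduces the cipher alphabet shown for Julius Caesar. The function names are illustrative.

```python
import string

def keyword_alphabet(keyword):
    """Key letters without repeats, then the unused letters following the last key letter."""
    alphabet = []
    for ch in keyword.upper():
        if ch in string.ascii_uppercase and ch not in alphabet:
            alphabet.append(ch)
    start = string.ascii_uppercase.index(alphabet[-1]) + 1
    for offset in range(26):  # walk once round the alphabet, wrapping after Z
        ch = string.ascii_uppercase[(start + offset) % 26]
        if ch not in alphabet:
            alphabet.append(ch)
    return "".join(alphabet)

def keyword_encipher(plain, keyword):
    table = str.maketrans(string.ascii_lowercase, keyword_alphabet(keyword))
    return plain.lower().translate(table)

def keyword_decipher(cipher, keyword):
    table = str.maketrans(keyword_alphabet(keyword), string.ascii_lowercase)
    return cipher.upper().translate(table)

print(keyword_alphabet("Julius Caesar"))  # JULISCAERTVWXYZBDFGHKMNOPQ
print(keyword_encipher("men freely believe that which they desire", "Julius Caesar"))
```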
Code Breaking
ENCIPHER
by SUBSTITUTING letters
Vigenère Cipher (Polyalphabetic)
Plain  a b c d e f g h i j k l m n o p q r s t u v w x y z
1      B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
2      C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
3      D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
4      E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
5      F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
6      G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
7      H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
8      I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
9      J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
10     K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
11     L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
12     M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
13     N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
14     O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
15     P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
16     Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
17     R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
18     S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
19     T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
20     U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
21     V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
22     W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
23     X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
24     Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
25     Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
26     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Blaise de Vigenère devised a way of
using a series of cipher alphabets and
key word(s). Thus using key words Da
Vinci for his quotation “art is never
finished only abandoned” we get
plain text  a r t i s n e v e r f i n i s
key word    D A V I N C I D A V I N C I D
CODE TEXT   D R O Q F P M Y E M N V P Q V
plain text  h e d o n l y a b a n d o n e d
key word    A V I N C I D A V I N C I D A V
CODE TEXT   H Z L B P T B A W I A F W Q E Y
DRO QF PMYEM NVPQVHZL
BPTB AWIAFWQEY
The strength of the Vigenère cipher is that
repeated letters only rarely share the same
code; here d is coded L, F & Y
Note: the Caesar cipher is row 3!
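Looking a letter up in the square is the same as shifting it by the position of the key letter (A = 0, B = 1, ... so the row starting with D shifts by 3). The Python sketch below relies on that equivalence; it assumes spaces in the key are ignored and spaces in the message do not use up a key letter. The function name vigenere is illustrative.

```python
from itertools import cycle

def vigenere(text, key, decrypt=False):
    """Shift each letter by the matching key letter (A=0 ... Z=25), repeating the key."""
    shifts = cycle([ord(k) - 65 for k in key.upper() if "A" <= k <= "Z"])
    result = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = next(shifts)
            if decrypt:
                shift = -shift
            result.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        else:
            result.append(ch)  # spaces pass through without consuming a key letter
    return "".join(result)

cipher = vigenere("art is never finished only abandoned", "Da Vinci")
print(cipher)
print(vigenere(cipher, "Da Vinci", decrypt=True))
```

Running the decryption on the output returns the original quotation, which is a quick way to check a hand-worked table.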
Code Breaking
Eagles, Tits and Ospreys
ENCIPHER
by SUBSTITUTING letters
If all else fails a text cipher can be deciphered using letter and
word frequency, so long as it is relatively long (100+ letters)
Letter frequency/1000
e 127   t 91   a 82   o 75   i 70   n 67   s 63   h 61   r 60
d 43    l 40   u 28   c 28   m 24   w 24   f 22   y 20   g 20
p 19    b 15   v 10   k 8    x 2    j 2    q 1    z 1
J Q K Z X all <1%
Eagles, Tits And Ospreys Inhabit North Scotland – gives top 60% in order.
One and 2 letter words
a i (only 2 one letter words)
of to in it is be as at so we he by or on do if me my up an go no us am
The = commonest 3 letter word
Double letters (7 common double repeats in order)
ss ee tt ff ll mm oo
Other clues
• Very few words are without a vowel (e.g. fly, wry)
• Q is always followed by U
• H frequently goes before E (e.g. the, then, they) but rarely after E
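Counting letters is easy to automate. The Python sketch below uses the standard collections.Counter and, as a sample, the Caesar ciphertext from an earlier slide; at 35 letters it is far shorter than the 100+ recommended, so treat the result as a hint only. The function names are illustrative.

```python
from collections import Counter

def letter_counts(text):
    """Count how often each letter A-Z appears, ignoring everything else."""
    return Counter(c for c in text.upper() if "A" <= c <= "Z")

def per_thousand(text):
    """Scale the counts to 'per 1000 letters' for comparison with the table above."""
    counts = letter_counts(text)
    total = sum(counts.values())
    return {letter: round(1000 * n / total) for letter, n in counts.most_common()}

sample = "PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH"
print(letter_counts(sample).most_common(1))  # [('H', 9)]
```

Here the commonest cipher letter, H, does stand for e, the commonest letter in the table.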
Code Breaking
ARE YOU AGENT
MATERIAL?
The preceding slides ran very quickly through the ciphers which I
believe you are most likely to meet in the Da Vinci challenge,
although I have yet to see a question involving a key-locked code
or a polyalphabetic cipher.
To prepare for the challenge it will be necessary to run through the slides a few at a time
to be able to recognise the different ways to encipher and decipher a message. As
well as an interest in Code Breaking, an agent needs to be:
Literate
Numerate
Accurate
and comfortable with
Etymology
Heuristics
Quotations
So Mr Bond - here’s your first message from M:
UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT
Code Breaking
Decoding M’s message
The previous slide contained the code:
UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT
How can we decipher it?
• It does not contain numbers, pigpen shapes, hieroglyphs etc., so it is probably an alphabetic code.
• The ? is strange, but it could be a null character or just punctuation; ignore it for now.
• Does the ? make it a railfence code? No: it is not in the middle, and taking alternate
letters from each group (UOZA etc.) does not make sense.
• Look at the letter distribution. There are 4 Qs and 5 Fs (so they are likely to be e, t, a or o).
• So is it a transposition code or a substitution code? A transposition code only uses letters in
the original message, and there are too many Qs for this to be a transposition message (remember Q
is one of the <1% letters).
• Does it use multiple alphabets (e.g. Vigenère)? Not likely, as no key word has been given.
• That makes it monoalphabetic (e.g. Caesar), but what is the advancement? You can find that by
using the Vigenère square and finding a row where Q & F are decoded as 2 of e, t, a or o. Or look at
the question: 'a message from M' hints that the advance is 12 (i.e. M=a, Q=e and F=t).
• We now have UZteDeEteP?OAZtaOtKAGDteaYOAaOT. You can now decode the rest of the
message using row 12.
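The search for a row where Q and F both decode to one of e, t, a or o can be sketched in Python. This is an illustration only: it assumes a plain Caesar-style shift, and the function names and the choice of "etao" as the likely plain letters are assumptions taken from the reasoning above.

```python
def shift_back(cipher, shift):
    """Undo a Caesar-style advance of `shift` places; non-letters are kept as they are."""
    result = []
    for ch in cipher.upper():
        if "A" <= ch <= "Z":
            result.append(chr((ord(ch) - 65 - shift) % 26 + 97))
        else:
            result.append(ch)
    return "".join(result)

def candidate_shifts(frequent_cipher_letters, likely_plain="etao"):
    """Shifts for which every frequent cipher letter decodes to a likely plain letter."""
    return [shift for shift in range(26)
            if all(shift_back(c, shift) in likely_plain for c in frequent_cipher_letters)]

message = "UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT"
print(candidate_shifts("QF"))  # [12]
print(shift_back(message, 12))
```

Only one advance survives the test, and it agrees with the hint in "a message from M".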
Code Breaking
Further
Information
Examples of codes and codebreaking online at
http://www.counton.org/explorer/codebreaking/
Books
Or have your school’s Da Vinci facilitator contact
Geoff Allison via Sue Bain, Piers Butler or Lesley Sloan