Da Vinci Challenge 2014 - Code Breaking

Due to a teacher vocational exchange in 2010/11, nine teams can compete in 2014.

Code breaking (the interpretation of SECRET WRITING) is one of the Da Vinci Challenges. This presentation aims to describe simple ways to write secretly (encrypt), and to offer routes to interpret secret writings (decrypt).

You have already read over hidden words in the subtitle of this slide.

Codes & Ciphers

Sender: key & plaintext -> Algorithm -> CIPHERTEXT -> Algorithm -> Agent: key & plaintext

To encipher a secret message the Sender uses a formula (Algorithm) to convert Plaintext into CIPHERTEXT; the Agent reverses the process to convert the cipher back to plaintext. Security can be increased by locking the cipher with a Key which is known only to the Sender and the Agent.

To break a code, we need to recognise its algorithm and deduce the key.

Coding Algorithms

SECRET WRITING
  STEGANOGRAPHY (hidden writing)
  CRYPTOGRAPHY (scrambled writing)
    ENCODE (replace words)
    ENCIPHER (replace letters)
      TRANSPOSITION (shuffled)
      SUBSTITUTION (replaced)

We will look at standard methods of constructing and deconstructing each of these algorithms, and at frequency analysis, which is a useful tool for making the initial break into a cipher. But beware: encoders like to include twists and false directions in their ciphers. They use keys to obscure the algorithm, and occasionally hide the whole text.

STEGANOGRAPHY (Hidden writing)

Messages can be hidden within pictures or within text.

Remember the hidden message in the subtitle of slide 1?

Due to A teacher Vocational exchange In 2010/11, Nine teams Can compete In 2014.
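The hidden words in the subtitle come from the first letter of every odd-numbered word. A minimal Python sketch of that extraction follows; the function name `hidden_message` is an illustrative assumption, not part of the original slides.

```python
def hidden_message(text: str) -> str:
    """Join the first letter of every odd-numbered word (1st, 3rd, 5th, ...)."""
    words = text.split()
    # words[0::2] picks the 1st, 3rd, 5th ... words of the sentence
    return "".join(word[0] for word in words[0::2]).upper()


subtitle = ("Due to a teacher vocational exchange in 2010/11, "
            "nine teams can compete in 2014.")
print(hidden_message(subtitle))  # DAVINCI
```

The agent still has to know which rule to apply; a different rule (say, every even-numbered word) would need a different slice.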
The hidden DA VINCI message demonstrates the weaknesses of steganography:
• The agent has to know where to look.
• The agent has to know how to look (the algorithm for DA VINCI was the first letter of every odd-numbered word).
• The delivery is complicated: embedding a message requires a huge amount of text, and the resulting text is often awkward.

ENCODE (Replace Words)

In true codes whole words are replaced by symbols or by entirely different words. They are only viable if supported by code books (dictionaries) possessed by both the sender and the agent.

assassinate = D    capture = J    general = S     king = q      immediately = 08    today = 75
blackmail = P      protect = Z    minister = W    prince = j    tonight = 28        tomorrow = 4

"capture the prince tonight" encodes as J j 28

Cockney rhyming slang uses coded words for nouns, e.g. APPLES, PLATES, MINCE, DOG, TROUBLE, WEASEL.

ENCIPHER by TRANSPOSING letters

A simple enciphering method involves transposing (scrambling) the existing letters, using an algorithm known to the agent.

The message IHTSIS CINNAE AEDJYS was enciphered by breaking the plaintext into groups of 3 letters, reversing each group, and putting the groups back together. Reversing the process we get THI SIS NIC EAN DEA SYJ, i.e. "this is nice and easy j". Note that j is a null character.

A more common transposition algorithm is to anagram the message, e.g. ATHEIST IS NOSY is an anagram of "this is not easy".

Transposition ciphers are very difficult to spot, so any in the Da Vinci challenge are likely to be identified as such, unless they are the self-evident Railfence or Scytale ciphers, which are described overleaf.
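The group-reversal transposition used for IHTSIS CINNAE AEDJYS can be sketched in a few lines of Python. This is only an illustration; the function name `reverse_groups` and the default group size of 3 are assumptions taken from the worked example.

```python
def reverse_groups(message: str, size: int = 3) -> str:
    """Split the letters into groups of `size` and reverse each group."""
    letters = "".join(message.split())  # ignore the spacing of the cipher
    groups = [letters[i:i + size] for i in range(0, len(letters), size)]
    return "".join(group[::-1] for group in groups)


cipher = "IHTSIS CINNAE AEDJYS"
print(reverse_groups(cipher))  # THISISNICEANDEASYJ
```

Because reversing a group twice restores it, the same function both enciphers and deciphers.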
ENCIPHER by TRANSPOSING letters (continued)

Railfence (simple): imagine a spiked fence with the letters arranged as shown

b   s   c   a   l   e   c   t   a   p   s   t   o
  a   i   r   i   f   n   e   r   n   o   i   i   n

and written BSCALECTAPSTO.AIRIFNERNOIIN

To decode, split the ciphertext into 2 equal halves and take alternate letters from each half.

Railfence (multiple groups): write the message into a grid

y o u r g r i d
c a n b e a n y
s i z e y o u l
i k e f o r t h
e m e s s a g e

To encode, just read down the columns to get YCSIE OAIKM UNZEE RBEFS GEYOS RAORA INUTG DYLHE

To decode, take letters from each block in turn (the first letter of every block, then the second, and so on).

Of course this would be difficult to break if the size and number of blocks did not match the grid, unless the agent knows the size already or the number of letters is a perfect square, e.g. CEEOBADRK

C E E
O B A
D R K

Scytale - demonstration

ENCIPHER by SUBSTITUTING symbols
Substitution Ciphers - non-alphabetic

Substitution ciphers where letters are replaced by symbols are the easiest to crack, if the agent knows the symbols. Examples are:

Pigpen [Masonic] Cipher
[Figure: a pigpen cipher example]

Baconian [Binary] Cipher

A      B      C      D      E
AAAAA  AAAAB  AAABA  AAABB  AABAA
£££££  ££££$  £££$£  £££$$  ££$££

N      O      P      Q      R
ABBAB  ABBBA  ABBBB  BAAAA  BAAAB
£$$£$  £$$$£  £$$$$  $££££  $£££$

AAAAB AAAAA AAABA ABBBA ABBAB ABAAA AAAAA ABBAB   AAABA ABAAA ABBBB AABBB AABAA BAAAB
deciphers as "baconian cipher"

To decode symbol substitution ciphers like these (and others such as Morse, ASCII, Wingdings and hieroglyphs) simply do a back substitution.

ENCIPHER by SUBSTITUTING numbers
Substitution Ciphers - numeric

 1  2  3  4  5  6  7  8  9 10 11 12 13
 a  b  c  d  e  f  g  h  i  j  k  l  m

14 15 16 17 18 19 20 21 22 23 24 25 26
 n  o  p  q  r  s  t  u  v  w  x  y  z

Substitution ciphers where letters are replaced by numbers are also easy to crack, if the agent knows the algorithm.

Decipher 4 1 12 2 5 1 20 20 9 5 8 9 7 8 19 3 8 15 15 12
dalbeattie high school

That was easy, but beware: a common trick is to use descending numbers, a=26 to z=1.
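The back substitution for the numeric cipher, including the descending-number trick, can be sketched as follows. The function name `decode_numbers` and its `descending` flag are illustrative assumptions, not part of the slides.

```python
import string


def decode_numbers(numbers, descending=False):
    """Turn a list of numbers back into letters.

    Ascending:  a=1 ... z=26.  Descending: a=26 ... z=1.
    """
    letters = string.ascii_lowercase
    if descending:
        return "".join(letters[26 - n] for n in numbers)
    return "".join(letters[n - 1] for n in numbers)


cipher = [4, 1, 12, 2, 5, 1, 20, 20, 9, 5, 8, 9, 7, 8, 19, 3, 8, 15, 15, 12]
print(decode_numbers(cipher))  # dalbeattiehighschool
```

If the ascending table produces nonsense, trying `descending=True` is a cheap second attempt.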
Now see how using a key number (e.g. 2468) complicates the cipher. Each digit of the key is added in turn to the code numbers, and the key repeats along the message:

plain text     d  a  l  b  e  a  t  t  i  e  h  i  g  h  s  c  h  o  o  l
code numbers   4  1 12  2  5  1 20 20  9  5  8  9  7  8 19  3  8 15 15 12
key number     2  4  6  8  2  4  6  8  2  4  6  8  2  4  6  8  2  4  6  8
CODE TEXT      6  5 18 10  7  5 26 28 11  9 14 17  9 12 25 11 10 19 21 20

This message would be virtually impossible to decode without the agent having the key number. Consecutive code numbers are advanced by differing amounts, so even double letters have different codes. However, reversing the process is simple.

ENCIPHER by SUBSTITUTING numbers
Substitution Ciphers - numeric (continued)

Using the same table (a=1 to z=26):

Decipher 4 1 22 9 14 3 9 13 9 12 1 14
da vinci milan

That was easy. Now see how using a single key letter (e.g. q) also complicates the cipher. The key letter is placed in front of the message, and each CODE TEXT number is the sum of a pair of adjacent code numbers:

plain text      q  d  a  v  i  n  c  i  m  i  l  a  n
code numbers   17  4  1 22  9 14  3  9 13  9 12  1 14
CODE TEXT         21  5 23 31 23 17 12 22 22 21 13 15

Reversing the process is also simple, so long as the agent knows the key letter.
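Both keyed numeric schemes can be sketched in Python under the a=1 to z=26 table. This is a hedged illustration only: all function names below are hypothetical, and the code simply mirrors the two worked examples (key number 2468 and key letter q).

```python
import string

LETTERS = string.ascii_lowercase


def to_numbers(text):
    """a=1 ... z=26; spaces and punctuation are dropped."""
    return [LETTERS.index(c) + 1 for c in text.lower() if c in LETTERS]


def encipher_key_number(plain, key):
    """Add the key digits in turn, repeating the key along the message."""
    digits = [int(d) for d in key]
    return [n + digits[i % len(digits)] for i, n in enumerate(to_numbers(plain))]


def decipher_key_number(code, key):
    digits = [int(d) for d in key]
    return "".join(LETTERS[n - digits[i % len(digits)] - 1]
                   for i, n in enumerate(code))


def encipher_key_letter(plain, key_letter):
    """Prefix the key letter, then add each adjacent pair of code numbers."""
    values = to_numbers(key_letter) + to_numbers(plain)
    return [a + b for a, b in zip(values, values[1:])]


def decipher_key_letter(code, key_letter):
    """Subtract the previous plain value from each code text number."""
    previous = to_numbers(key_letter)[0]
    plain = []
    for total in code:
        previous = total - previous
        plain.append(LETTERS[previous - 1])
    return "".join(plain)


print(encipher_key_number("dalbeattie high school", "2468"))
print(encipher_key_letter("da vinci milan", "q"))
print(decipher_key_letter([21, 5, 23, 31, 23, 17, 12, 22, 22, 21, 13, 15], "q"))
```

In the key-letter scheme one wrong subtraction corrupts every later letter, so the deciphering has to be done carefully from the start of the message.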
To reverse the key-letter cipher, remove the leading key value 17 from the first number, then subtract each recovered code number from the next CODE TEXT number:

CODE TEXT     21  5 23 31 23 17 12 22 22 21 13 15
code           4  1 22  9 14  3  9 13  9 12  1 14
plain text     d  a  v  i  n  c  i  m  i  l  a  n

ENCIPHER by SUBSTITUTING letters
Substitution Ciphers - monoalphabetic

Monoalphabetic substitution ciphers replace each letter with another letter in the same alphabet. The simplest of these ciphers is the:

Atbash Cipher

In this cipher the first letter of the alphabet is replaced by the last [A = Z], the second letter by the penultimate [B = Y] and so on, until we get:

A B C D E F G H I J K L M
Z Y X W V U T S R Q P O N

This cipher was first used in Hebrew, and its name comes from the Hebrew equivalent of A=Z, B=Y, which is aleph=tav, beth=shin.

An interesting quirk of the Atbash cipher is that in English some words encipher into other words, e.g. HOLD IRK ZOO TILT, which deciphers as "slow rip all grog".

ENCIPHER by SUBSTITUTING letters
Substitution Ciphers - monoalphabetic

Substitution ciphers where letters are replaced by other letters can be the most difficult to crack, unless the agent has the key and the algorithm, e.g.

Caesar Cipher

Julius Caesar frequently wrote coded state messages. A common Caesar code replaced each letter with the one 3 places further down the alphabet, thus

PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH

is the cipher for

men freely believe that which they desire

ENCIPHER by SUBSTITUTING letters
Substitution Ciphers - monoalphabetic with key

Caesar Cipher with key

We saw that, using the basic Caesar cipher, from

men freely believe that which they desire

we get

PHQ IUHHOB EHOLHYH WKDW ZKLFK WKHB GHVLUH

But the cipher is made more complex if a key word (e.g.
Julius Caesar) is used to displace the letters:

Plain    a b c d e f g h i j k l m n o p q r s t u v w x y z
CIPHER   J U L I S C A E R T V W X Y Z B D F G H K M N O P Q

In this case we get

XSY CFSSWP USWRSMS HEJH NERLE HESP ISGRFS

Decryption requires the cipher alphabet to be rebuilt using the algorithm:
• Start with the key word(s) without repeated letters
• Fill in the remaining letters starting with the next sequential unused letter

ENCIPHER by SUBSTITUTING letters
Vigenère Cipher (Polyalphabetic)

Plain  a b c d e f g h i j k l m n o p q r s t u v w x y z
  1    B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
  2    C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
  3    D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
  4    E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
  5    F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
  6    G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
  7    H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
  8    I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
  9    J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
 10    K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
 11    L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
 12    M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
 13    N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
 14    O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
 15    P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
 16    Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
 17    R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
 18    S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
 19    T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
 20    U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
 21    V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
 22    W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
 23    X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
 24    Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
 25    Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
 26    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Blaise de Vigenère devised a
way of using a series of cipher alphabets and key word(s). Each letter of the key word selects the row of the square that begins with that letter, and the key word repeats along the message. Thus, using the key words Da Vinci for his quotation "art is never finished, only abandoned", we get

plain text   a r t i s n e v e r f i n i s
key word     D A V I N C I D A V I N C I D
CODE TEXT    D R O Q F P M Y E M N V P Q V

plain text   h e d o n l y a b a n d o n e d
key word     A V I N C I D A V I N C I D A V
CODE TEXT    H Z L B P T B A W I A F W Q E Y

DRO QF PMYEM NVPQVHZL BPTB AWIAFWQEY

The strength of the Vigenère cipher is that repeated letters only occasionally have the same code; here d is coded L, F & Y.

Note: the Caesar cipher is row 3!

Eagles, Tits and Ospreys
ENCIPHER by SUBSTITUTING letters

If all else fails, a text cipher can be deciphered using letter and word frequency, so long as it is relatively long (100+ letters).

Letter frequency per 1000

e 127   t 91   a 82   o 75   i 70   n 67   s 63   h 61   r 60
d 43    l 40   u 28   c 28   m 24   w 24   f 22   y 20   g 20
p 19    b 15   v 10   k 8    x 2    j 2    q 1    z 1

• Eagles, Tits And Ospreys Inhabit North Scotland gives the top 60% in order.
• J, Q, K, Z and X are all <1%.

One- and two-letter words

a i
of to in it is be as at so we he by or on do if me my up an go no us am

• There are only 2 one-letter words.

Common double letters (the 7 commonest, in order)

ss ee tt ff ll mm oo

Other clues
• "The" is the commonest 3-letter word.
• Very few words are without a vowel (e.g. fly, wry).
• Q is always followed by U.
• H frequently goes before E (e.g. the, then, they) but rarely after E.

ARE YOU AGENT MATERIAL?

The preceding slides ran very quickly through the ciphers which I believe you are most likely to meet in the Da Vinci challenge, although I have yet to see a question involving a key-locked code or a polyalphabetic cipher.

To prepare for the challenge it will be necessary to run through the slides a few at a time, to be able to recognise the different ways to encipher and decipher a message.
As well as an interest in code breaking, an agent needs to be:
• Literate
• Numerate
• Accurate
and comfortable with:
• Etymology
• Heuristics
• Quotations

So Mr Bond - here's your first message from M:

UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT

Decoding M's message

The previous slide contained the code UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT. How can we decipher it?

• It does not contain numbers, pigpen shapes, hieroglyphs etc., so it is probably an alphabetic cipher.
• The ? is strange, but it could be a null character or just punctuation. Ignore it for now.
• Does the ? make it a railfence cipher? No: it is not in the middle, and taking alternate letters from each group (UOZA etc.) does not make sense.
• Look at the letter distribution. There are 4 Qs and 5 Fs, so they are likely to be e, t, a or o.
• So is it a transposition cipher or a substitution cipher? A transposition cipher only uses the letters in the original message, and there are too many Qs for this to be a transposition (remember Q is one of the <1% letters).
• Does it use multiple alphabets (i.e. Vigenère)? Not likely, as no key word has been given.
• That makes it monoalphabetic (e.g. Caesar), but what is the advancement? You can find that by using the Vigenère square and finding a row where Q and F are decoded as two of e, t, a or o. Or look at the question: "a message from M" hints that the advance is 12 (i.e. M=a, Q=e and F=t).

We now have UZteDeEteP?OAZtaOtKAGDteaYOAaOT

You can now decode the rest of the message using row 12.

Further Information

Examples of codes and codebreaking online at http://www.counton.org/explorer/codebreaking/

Books

Or have your school's Da Vinci facilitator contact Geoff Allison via Sue Bain, Piers Butler or Lesley Sloan
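For anyone who wants to check their hand decoding of M's message, the row-12 back substitution described under "Decoding M's message" can be sketched in Python. The function name `caesar_decode` is an illustrative assumption; the shift is taken from the hint that M stands for a.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_decode(cipher, shift):
    """Move every letter back by `shift` places; leave other characters alone."""
    plain = []
    for ch in cipher.upper():
        if ch in ALPHABET:
            plain.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26].lower())
        else:
            plain.append(ch)  # e.g. the "?" in M's message
    return "".join(plain)


message = "UZFQDQEFQP?OAZFMOFKAGDFQMYOAMOT"
shift = ALPHABET.index("M")  # M = a means an advance of 12
print(caesar_decode(message, shift))

# The same function undoes the basic Caesar cipher (advance of 3)
print(caesar_decode("PHQ IUHHOB EHOLHYH", 3))
```

Without the hint, looping `shift` over 1 to 25 and reading the 25 candidate plaintexts is a practical alternative to frequency analysis for a message this short.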