CRYPTOGRAPHY – HACKERS AND VIRUSES

OUTLINE
- Cryptography
- Public key systems
- RSA algorithm
- Why RSA algorithm is effective
- RSA algorithm example
- Hackers and viruses
- Virus detection
- Exploiting the difference between the worst case and the average case

INTRODUCTION
Cryptography is the science of writing in secret code, and it is an ancient art. Modern techniques for encoding sensitive financial information have enabled the explosion of electronic commerce. Before computers, any useful cryptographic scheme was necessarily computationally trivial, because both senders and receivers had to carry out their algorithms by hand. With the advent of computers, things changed.

INTRODUCTION
In all but the simplest cryptographic systems, the algorithms used for both encryption and decryption are fixed and known to everyone. Those algorithms take two inputs: the text to be encoded or decoded, and a key. In a symmetric key system, sender and receiver use the same key. No message can be sent unless there has been some prior agreement on a key. Even if there has been such an agreement, if the same key is used over an extended period of time, an eavesdropper may be able to infer the key and break the code.

PUBLIC KEY SYSTEMS
In order to change keys, there must be some way to transmit new keys between senders and receivers. Public key systems, first introduced in the 1970s, remove the need to share a secret key in advance. The most widely used public key system is the RSA algorithm.

RSA ALGORITHM
The RSA algorithm was published by Rivest, Shamir and Adleman in 1978. We assume that Bob and Alice wish to exchange secure messages and that Eve is attempting to eavesdrop. We’ll call the original (unencrypted) text the plaintext and the encrypted text the ciphertext. The most general way to describe RSA is as follows.

RSA ALGORITHM
Assume that Alice wants to send a message to Bob. Then:
1. Bob chooses a key, private, known only to him. Bob exploits a function f to compute his public key, public = f(private).
2. Bob publishes public.
3. Alice exploits Bob’s public key to compute ciphertext = encrypt(plaintext, public), and she sends ciphertext to Bob.
4. Bob exploits his private key to compute plaintext = decrypt(ciphertext, private).
In order for this last step to work, encrypt and decrypt must be designed so that one is the inverse of the other.

RSA ALGORITHM
If there exist efficient algorithms for performing all four of the steps, then Bob and Alice will be able to exchange messages. We assume that Eve knows the algorithms encrypt and decrypt. So she could easily eavesdrop if she could infer Bob’s private key from his public one, or if she could compute decrypt without knowing Bob’s private key.

RSA ALGORITHM
Alice uses the RSA algorithm to send a message to Bob as follows.
1. Bob constructs his public and private keys.
   1. Bob chooses two large prime numbers p and q. From them, he computes n = p*q.
   2. Bob finds a value e such that 1 < e < p*q and gcd(e, (p-1)*(q-1)) = 1. (In other words, he finds an e such that e and (p-1)*(q-1) are relatively prime.)
   3. Bob computes a value d such that d*e (mod (p-1)*(q-1)) = 1. In RSA terminology, this value d, rather than the original numbers p and q, is referred to as Bob’s private key.
2. Bob publishes (n, e) as his public key.

RSA ALGORITHM
3. Alice breaks her message plaintext into segments such that no segment corresponds to a binary number that is larger than n. Then, for each plaintext segment, Alice computes ciphertext = plaintext^e (mod n). Then she sends ciphertext to Bob.
4. Bob recreates Alice’s original message by computing plaintext = ciphertext^d (mod n).

RSA ALGORITHM EXAMPLE
We can illustrate the RSA algorithm with a simple message from Alice to Bob.
1. Bob is expecting to receive messages, so he constructs his keys as follows:
   1. He chooses two prime numbers, p = 19 and q = 31. He computes n = p*q = 589.
   2. He finds an e that has no common divisors with 18*30 = 540. The e he selects is 49.
   3. He finds a value d = 1069.
   Notice that 1069*49 = 52381. Bob needs to ensure that the remainder, when 52381 is divided by 540, is 1. And it is: 52381 = 540*97 + 1. Bob’s private key is now 1069.
2. Bob publishes (589, 49) as his public key.

RSA ALGORITHM EXAMPLE
3. Alice wishes to send the simple message “A”. The ASCII code for A is 65. So Alice computes 65^49 (mod 589). She does this without actually computing 65^49. Instead, she exploits the following two facts:
   n^(i+j) = n^i * n^j
   (n*m) (mod k) = (n (mod k) * m (mod k)) (mod k)
   Combining these, we have:
   n^(i+j) (mod k) = (n^i (mod k) * n^j (mod k)) (mod k)

RSA ALGORITHM EXAMPLE
So, to compute 65^49, first observe that 49 can be expressed in binary as 110001. So 49 = 1 + 16 + 32. Thus 65^49 = 65^(1+16+32). The following table lists the required powers of 65:
   65^1 (mod 589) = 65
   65^2 (mod 589) = 4225 (mod 589) = 102
   65^4 (mod 589) = 102^2 (mod 589) = 10404 (mod 589) = 391
   65^8 (mod 589) = 391^2 (mod 589) = 152881 (mod 589) = 330
   65^16 (mod 589) = 330^2 (mod 589) = 108900 (mod 589) = 524
   65^32 (mod 589) = 524^2 (mod 589) = 274576 (mod 589) = 102

RSA ALGORITHM EXAMPLE
So we have:
   65^49 (mod 589) = 65^(1+16+32) (mod 589)
                   = (65^1 * 65^16 * 65^32) (mod 589)
                   = ((65^1 (mod 589)) * (65^16 (mod 589)) * (65^32 (mod 589))) (mod 589)
                   = (65 * 524 * 102) (mod 589)
                   = ((34060 (mod 589)) * 102) (mod 589)
                   = (487 * 102) (mod 589)
                   = 198
Alice sends Bob the message 198.
4. Bob uses his private key (1069) to recreate Alice’s message by computing 198^1069 (mod 589). Using the same process Alice used, he does this efficiently and retrieves the message 65.

RSA IS EFFECTIVE
The functions encrypt and decrypt are inverses of each other. This is proved using Euler’s generalization of Fermat’s Little Theorem.
Bob can choose primes efficiently using the following algorithm:
1. Randomly choose two large numbers as candidates.
2. Check the candidates to see if they are prime. This can be done efficiently using a randomized algorithm.
3. Repeat steps 1 and 2 until two primes have been chosen.
So, for example, suppose Bob wants to choose a 1000-bit number. The probability of a randomly chosen number near 2^1000 being prime is about 1/693, so on average about 693 random candidates must be tested to find one prime.
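The worked example and the prime-selection loop above can be sketched in Python. This is a minimal illustration under stated assumptions, not a production implementation: the names mod_pow, is_probable_prime and random_prime are hypothetical, and Miller-Rabin is an assumed choice for the "randomized algorithm" (the slides do not name one). Real systems should use a vetted cryptographic library.

```python
import secrets


def mod_pow(base, exponent, modulus):
    """Square-and-multiply: base**exponent % modulus without the huge intermediate."""
    result = 1
    square = base % modulus              # holds base^(2^k) mod modulus
    while exponent > 0:
        if exponent & 1:                 # current binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result


def is_probable_prime(n, rounds=40):
    """Miller-Rabin test (assumed choice of randomized primality test)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2     # random witness in [2, n - 2]
        x = mod_pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = (x * x) % n
            if x == n - 1:
                break
        else:
            return False                     # a proves that n is composite
    return True


def random_prime(bits):
    """Repeat: pick a random candidate, test it, stop at the first probable prime."""
    while True:
        # Force the top bit (exact bit length) and the low bit (odd).
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


# Keys from the example: p = 19, q = 31, e = 49, d = 1069.
p, q, e, d = 19, 31, 49, 1069
n = p * q                                    # 589
ciphertext = mod_pow(65, e, n)               # Alice encrypts "A" (ASCII 65)
recovered = mod_pow(ciphertext, d, n)        # Bob decrypts with his private key
print(ciphertext, recovered)                 # 198 65

print(random_prime(64).bit_length())         # 64
```

With the example keys, the first print reproduces the values from the table: ciphertext 198 and recovered plaintext 65. Python's built-in pow(base, exponent, modulus) performs the same modular exponentiation; mod_pow is written out only to mirror the binary-expansion steps.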
RSA IS EFFECTIVE
Bob can check gcd efficiently using Euclid’s algorithm, so he can compute e. Bob can compute d efficiently, using an extension of Euclid’s algorithm that exploits the quotients that it produces at each step. Alice can implement encrypt efficiently: it is not necessary to compute plaintext^e and then take its remainder mod n. Similarly, Bob can implement decrypt efficiently. Eve can’t recreate plaintext because she can’t simply invert encrypt, and she can’t try every candidate plaintext to see if she gets one that produces ciphertext.

HACKERS AND VIRUSES
Here we discuss two other network security issues. The first is virus detection: we’ll see that undecidability results show that no definitive virus detector can exist. The second involves the difference between the average case and the worst case time complexity of some important algorithms. This difference may allow hackers to launch denial of service attacks and to observe “secret” behavior of remote hosts.

VIRUS DETECTION
Given a known computer virus V, consider the problem of detecting an infection by V. The most straightforward approach to solving this problem is just to scan incoming messages for the text <V>. But viruses can easily evade this technique by altering their text in ways that have no effect on the computation that V performs. For example, source code could be modified to add blanks in meaningless places or to add leading 0’s to numbers.

VIRUS DETECTION
Executable code could be modified by adding jump instructions to the next instruction. So the practical virus detection problem can be stated as: “Given a known virus V and an input message M, does M contain the text of a program that computes the same thing V computes?” We know that the equivalence question is undecidable for Turing machines. Using that result, one can show that the equivalence question for arbitrary programs is also undecidable.

VIRUS DETECTION
So, we can’t solve the virus problem by making a list of known viruses and comparing new code to them.
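The evasion trick described above (adding blanks in meaningless places) can be illustrated with a toy Python sketch. Everything here is hypothetical: KNOWN_V is a harmless stand-in for the text <V>, and naive_scan and run_and_capture are names invented for this example. The sketch only shows why verbatim text matching fails; it says nothing about how real scanners work, and it does not demonstrate the undecidability result itself.

```python
import contextlib
import io


def naive_scan(message: str, signature: str) -> bool:
    """Flag a message only if it contains the signature text verbatim."""
    return signature in message


def run_and_capture(source: str) -> str:
    """Run a harmless toy program and return whatever it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# Harmless stand-in for the known text <V>.
KNOWN_V = "x = 7\nprint(x * 6)"
# Same computation, with extra blanks added in meaningless places.
VARIANT = "x  =  7\nprint( x * 6 )"

print(naive_scan(KNOWN_V, KNOWN_V))                          # True
print(naive_scan(VARIANT, KNOWN_V))                          # False
print(run_and_capture(KNOWN_V) == run_and_capture(VARIANT))  # True
```

The variant slips past the verbatim scan even though it computes exactly what the original does, which is why the practical problem becomes one of program equivalence rather than text matching.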
Suppose that, instead of making a list of forbidden operations, we allowed users to define a “white list” of the operations that are allowed to run on their machines. Then the job of a virus filter is to compare incoming code to the operations on the white list. Any code that is equivalent to some allowed operation can be declared safe. But now we have exactly the same problem: no test for equivalence exists.

EXPLOITING THE DIFFERENCE BETWEEN THE WORST CASE AND THE AVERAGE CASE
Some widely used algorithms have the property that their worst case time complexity is significantly different from their average case time complexity. For example:
- Looking up an entry in a hash table may take, on average, constant time. But if all the entries collide and hash to the same table location, the time required becomes O(n), where n is the number of entries in the table.

EXPLOITING THE DIFFERENCE BETWEEN THE WORST CASE AND THE AVERAGE CASE
- Looking up an entry in a binary search tree may take, on average, O(log n) time. But the tree may become unbalanced. In the worst case, it becomes a list and lookup time again becomes O(n).

EXPLOITING THE DIFFERENCE BETWEEN THE WORST CASE AND THE AVERAGE CASE
HACKERS
Hackers can exploit these facts:
- One way to launch a denial of service attack against a target site S is to send to S a series of messages/requests that has been crafted so that S will exhibit its worst case performance. If S was designed so that it could adequately respond to its traffic in the average case, it will no longer be able to do so.
- One way to get a peek inside a site S and observe properties that were not intended to be observable is to time it. For example, it is sometimes possible to observe the time required by S to perform decryption or password checking and so to infer its private key or a stored password.

REFERENCES
Elaine Rich, Automata, Computability, and Complexity: Theory and Applications [book].
http://www.cs.utexas.edu/~ear/cs341/automatabook/appcsecurity_link.html?http://www.cs.rice.edu/~scrosby/hash