UNIT – 1

Computer security is information security as applied to computers and networks. The field covers all the processes and mechanisms by which computer-based equipment, information and services are protected from unintended or unauthorized access, change or destruction. This includes not only protection from unauthorized activities or untrustworthy individuals, but also from unplanned events and natural disasters.

A Taxonomy of Computer Security
Computer security is frequently associated with three core areas, which can be conveniently summarized by the acronym "CIA":
Confidentiality -- Ensuring that information is not accessed by unauthorized persons
Integrity -- Ensuring that information is not altered by unauthorized persons in a way that is not detectable by authorized users
Availability -- Ensuring that a system is operational and that information and services are accessible to authorized users when required, usually provided through redundancy; loss of availability is often referred to as "denial-of-service"

1. Confidentiality
You may find the notion of confidentiality straightforward: only authorized people or systems can access protected data. However, as we see in later chapters, ensuring confidentiality can be difficult. For example, who determines which people or systems are authorized to access the current system? By "accessing" data, do we mean that an authorized party can access a single bit? The whole collection? Pieces of data out of context? Can someone who is authorized disclose those data to other parties? Confidentiality is the security property we understand best because its meaning is narrower than the other two. We also understand confidentiality well because we can relate computing examples to examples of preserving confidentiality in the real world.

2. Integrity
Integrity is much harder to pin down. As Welke and Mayfield [WEL90, MAY91, NCS91b] point out, integrity means different things in different contexts. When we survey the way some people use the term, we find several different meanings. For example, if we say that we have preserved the integrity of an item, we may mean that the item is:
precise
accurate
unmodified
modified only in acceptable ways
modified only by authorized people
modified only by authorized processes
consistent
internally consistent
meaningful and usable

3. Availability
Availability applies both to data and to services (that is, to information and to information processing), and it is similarly complex. As with the notion of confidentiality, different people expect availability to mean different things. For example, an object or service is thought to be available if:
It is present in a usable form.
It has capacity enough to meet the service's needs.
It is making clear progress, and, if in wait mode, it has a bounded waiting time.
The service is completed in an acceptable period of time.
We can construct an overall description of availability by combining these goals. We say a data item, service, or system is available if:
There is a timely response to our request.
Resources are allocated fairly so that some requesters are not favored over others.
The service or system involved follows a philosophy of fault tolerance, whereby hardware or software faults lead to graceful cessation of service or to work-arounds rather than to crashes and abrupt loss of information.
The service or system can be used easily and in the way it was intended to be used.
Concurrency is controlled; that is, simultaneous access, deadlock management, and exclusive access are supported as required.
Computer security is not restricted to these three broad concepts.
Additional ideas that are often considered part of the taxonomy of computer security include:
Access control -- Ensuring that users access only those resources and services that they are entitled to access and that qualified users are not denied access to services that they legitimately expect to receive
Authentication -- Ensuring that users are the persons they claim to be
Nonrepudiation -- Ensuring that the originators of messages cannot deny that they in fact sent the messages
Privacy -- Ensuring that individuals maintain the right to control what information is collected about them, how it is used, who has used it, who maintains it, and what purpose it is used for

Functional View
Computer security can also be analyzed by function. It can be broken into five distinct functional areas:
Risk avoidance -- A security fundamental that starts with questions like: Does my organization or business engage in activities that are too risky? Do we really need an unrestricted Internet connection? Do we really need to computerize that secure business process? Should we really standardize on a desktop operating system with no access control intrinsics?
Deterrence -- Reduces the threat to information assets through fear. Can consist of communication strategies designed to impress potential attackers with the likelihood of getting caught. See Rule 5: The Fear of Getting Caught is the Beginning of Wisdom.
Prevention -- The traditional core of computer security. Consists of implementing safeguards like the tools covered in this book. Absolute prevention is theoretical, since there is a vanishing point where additional preventative measures are no longer cost-effective.
Detection -- Works best in conjunction with preventative measures. When prevention fails, detection should kick in, preferably while there is still time to prevent damage. Includes log-keeping and auditing activities.
Recovery -- When all else fails, be prepared to pull out backup media and restore from scratch, or cut over to backup servers and network connections, or fall back on a disaster recovery facility. Arguably, this function should be attended to before the others.

Computer Criminals
Computer crime involves the use of information technology to gain illegal or unauthorized access to a computer system with the intent of damaging, deleting or altering computer data. Computer crimes also include activities such as electronic fraud, misuse of devices, identity theft, and data and system interference. Marcus Rogers has identified eight types of cyber-criminals, distinguished by their skill levels and motivations. Rogers is an associate professor at Purdue University in West Lafayette, Ind., where he heads cyberforensics research in the university's department of computer technology.

NOVICE
• Limited computer and programming skills.
• Rely on toolkits to conduct their attacks.
• Can cause extensive damage to systems since they don't understand how the attack works.
• Looking for media attention.

CYBER PUNKS
• Capable of writing their own software.
• Have an understanding of the systems they are attacking.
• Many are engaged in credit card number theft and telecommunications fraud.
• Have a tendency to brag about their exploits.

INTERNALS
a) Disgruntled employees or ex-employees
• May be involved in technology-related jobs.
• Aided by privileges they have or had been assigned as part of their job function.
• Pose the largest security problem.
b) Petty thieves
• Include employees, contractors, consultants.
• Computer literate.
• Opportunistic: take advantage of poor internal security.
• Motivated by greed or by the necessity to pay off other habits, such as drugs or gambling.

CODERS
• Act as mentors to the newbies. Write the scripts and automated tools that others use.
• Motivated by a sense of power and prestige.
• Dangerous: have hidden agendas, use Trojan horses.

OLD GUARD HACKERS
• Appear to have no criminal intent.
• Alarming disrespect for personal property.
• Appear to be interested in the intellectual endeavor.

PROFESSIONAL CRIMINALS
• Specialize in corporate espionage.
• Guns for hire.
• Highly motivated, highly trained, have access to state-of-the-art equipment.

INFORMATION WARRIORS / CYBER-TERRORISTS
• Increase in activity since the fall of many Eastern Bloc intelligence agencies.
• Well funded.
• Mix political rhetoric with criminal activity.

POLITICAL ACTIVISTS
• Possible emerging category.
• Engage in hacktivism.

Computer crimes involve activities such as software theft, wherein the privacy of the users is hampered. These criminal activities involve the breach of human and information privacy, as well as the theft and illegal alteration of system-critical information. The different types of computer crimes have necessitated the introduction and use of newer and more effective security measures.

Types of computer crime
Hacking: Currently defined as gaining illegal or unauthorized access to a file, computer or network.
Identity Theft: Various crimes in which a criminal or large group uses the identity of an unknowing, innocent person.
Phishing: E-mail fishing for personal and financial information disguised as legitimate business e-mail.
Pharming: False websites that fish for personal and financial information by planting false URLs in Domain Name Servers.
Credit Card Fraud: A wide-ranging term for theft and fraud committed using a credit card or any similar payment mechanism as a fraudulent source of funds in a transaction.
Forgery: The process of making, adapting, or imitating objects, statistics, or documents, with the intent to deceive.
Digital Forgery: New technologies are used to create fake checks, passports, visas, and birth certificates with little skill or investment.
Scams: A confidence game or other fraudulent scheme, especially for making a quick profit; to cheat or swindle.
Auctions: Some sellers do not deliver the items or send inferior products.
Stock Fraud: A common method is to buy a stock low, send out e-mail urging others to buy, and then sell when the price goes up.
Click Fraud: Repeated clicking on an ad to either increase a site's revenue or to use up a competitor's advertising budget.

Hacking
Originally, the word "hacking" was not used as defined above. A hack was a clever piece of code constructed by hackers who were smart and creative programmers. From the 1970s to the 1990s, the definition of hacking changed as many people started using, and abusing, computers. By the 1980s, hacking behavior included spreading viruses, pranks, thefts, and phone phreaking. The difference between hackers and other criminals is the purpose of the crime. Hackers commonly try to benefit not only themselves but also other computer users. Therefore, they have some ethics for their actions. They believe sharing computer programs is important in the development of new software. Openness will help people to access anything they need and use it for their personal demands.
Decentralization will prevent authority from abusing information and controlling people. The hands-on imperative (free access to computers) is indispensable to break barriers between people and new technology. These ethics will slowly change as the demand for computers changes.

Identity Theft
Today, threats of identity theft come in many forms. It is important that you learn how to recognize fraudulent activity to protect yourself from identity theft. Identity theft occurs when someone uses your personally identifying information, like your name, Social Security number, or credit card number, without your permission, to commit fraud or other crimes. For identity thieves, this information is as good as gold. Skilled identity thieves may use a variety of methods, such as old-fashioned stealing, skimming, phishing, or dumpster diving, to get hold of your information. Once they have your personal information, identity thieves use it in a variety of ways, for instance credit card fraud, bank/account fraud, or government document fraud. One way to prevent such identity theft is to monitor your personal accounts and bank statements and to check your credit report on a regular basis. If you check your credit report regularly, you may be able to limit the damage caused by identity theft. Filing a police report, checking your credit reports, notifying creditors, and disputing any unauthorized transactions are some of the steps you must take immediately to restore your good name.

Scams
Today there are many ways to get lured into online scams. People who scam others are often referred to as scammers, cheaters and swindlers. Online scams are everywhere; they can be fake auctions and promotions. When people read great deals online they believe what they see, but in reality it is a game of deceit. The frauds can also happen with health care, credit cards, vacations and lotteries. The Internet changed the way we operate, from research to leisure to work. Every day people are cheated by these frauds. People get scammed by fake photos, damaged goods, misleading information, and false advertising. People are tricked into providing private information like Social Security numbers, addresses, phone numbers and credit card information.

Forgery
Digital forgery has become a big problem with the boom of the Internet. Many businesses need proof of identity to perform a service, and with identity fraud being a larger goal for criminals, this proof is difficult to accept as truthful. Criminals have access to much more advanced technology and are willing to go to further lengths to steal people's information. A Social Security number, credit card number, or bank account number is no longer strong enough proof of who someone is. Many companies ask for copies of a Social Security card, birth certificate, or a monthly bill with your name and address on it for further verification. Even going to these lengths is not enough. Digital forgery takes this one step further, with software to recreate and manipulate these private documents and proceed with the intended scam. Unfortunately these scams are being made accessible to even the least educated of Internet criminals. It has reached the point where a thief can obtain your credit card information and recreate your birth certificate for less than it costs to fill up a gas tank. This is frightening because you may never even know it is happening to you. Everyone must be aware that there are always cyber-criminals on the loose, and no information is sacred on the Internet.
Methods of Defense

Security Components
1. Confidentiality: The assets are accessible only by authorized parties. (Keeping data and resources hidden.)
2. Integrity: The assets are modified only by authorized parties, and only in authorized ways. (Data integrity; origin integrity, i.e., authentication.)
3. Availability: Assets are accessible to authorized parties. (Enabling access to data and resources.)

Methods to defend against computer crime (Methods of Defense)
1. Encryption
2. Software controls
3. Hardware controls
4. Policies
5. Physical controls

1. Encryption
Encryption is at the heart of all security methods. It provides confidentiality of data, and some protocols rely on encryption to ensure availability of resources. Encryption does not solve all computer security problems.

2. Software controls
Internal program controls, OS controls, and development controls. Software controls are usually the first aspects of computer security that come to mind.

3. Policies and Mechanisms
Policy says what is, and is not, allowed; this defines "security" for the site/system/etc. Mechanisms enforce policies. Mechanisms can be simple but effective; for example, frequent changes of passwords. Composition of policies: if policies conflict, discrepancies may create security vulnerabilities. Legal and ethical controls are gradually evolving and maturing.

4. Overlapping Controls
Several different controls may apply to one potential exposure: hardware controls + software controls + data controls.

Goals of Security
1. Prevention: prevent attackers from violating security policy.
2. Detection: detect attackers' violation of security policy.
3. Recovery: stop an attack, assess and repair damage, and continue to function correctly even if the attack succeeds.

Cryptography
Cryptography is the science of information security. The word is derived from the Greek kryptos, meaning hidden. Cryptography is closely related to the disciplines of cryptology and cryptanalysis. Cryptography includes techniques such as microdots, merging words with images, and other ways to hide information in storage or transit. However, in today's computer-centric world, cryptography is most often associated with scrambling plaintext (ordinary text, sometimes referred to as cleartext) into ciphertext (a process called encryption), then back again (known as decryption). Individuals who practice this field are known as cryptographers.

Modern cryptography concerns itself with the following four objectives:
1) Confidentiality (the information cannot be understood by anyone for whom it was unintended)
2) Integrity (the information cannot be altered in storage or transit between sender and intended receiver without the alteration being detected)
3) Non-repudiation (the creator/sender of the information cannot deny at a later stage his or her intentions in the creation or transmission of the information)
4) Authentication (the sender and receiver can confirm each other's identity and the origin/destination of the information)

Types of Cryptography
The main building blocks of cryptography are:
Secret key cryptography
Public key cryptography
Hash functions
Trust models

1. Secret key cryptography (Symmetric Key)
Symmetric-key algorithms are a class of algorithms for cryptography that use the same cryptographic keys for both encryption of plaintext and decryption of ciphertext. The keys may be identical or there may be a simple transformation to go between the two keys.
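To make the shared-key idea concrete, here is a minimal Python sketch. It assumes the third-party cryptography package and its Fernet recipe (an assumption; any symmetric cipher library would illustrate the same point): sender and receiver must hold the same key.

```python
# Minimal symmetric-key sketch, assuming the third-party "cryptography"
# package is installed (pip install cryptography).
from cryptography.fernet import Fernet

key = Fernet.generate_key()                    # the single shared secret key
token = Fernet(key).encrypt(b"meet at dawn")   # sender encrypts with the key
plain = Fernet(key).decrypt(token)             # receiver needs the SAME key
assert plain == b"meet at dawn"
```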
The keys, in practice, represent a shared secret between two or more parties that can be used to maintain a private information link. This requirement that both parties have access to the secret key is one of the main drawbacks of symmetric key encryption, in comparison to public-key encryption.

Types of symmetric-key algorithms
Symmetric-key encryption can use either stream ciphers or block ciphers. Stream ciphers encrypt the digits (typically bits) of a message one at a time. Block ciphers take a number of bits and encrypt them as a single unit, padding the plaintext so that it is a multiple of the block size. Blocks of 64 bits have been commonly used. The Advanced Encryption Standard (AES) algorithm approved by NIST in December 2001 uses 128-bit blocks.

2. Asymmetric Key (public-key cryptography)
Asymmetric cryptography or public-key cryptography is cryptography in which a pair of keys is used to encrypt and decrypt a message so that it arrives securely. Initially, a network user receives a public and private key pair from a certificate authority. Any other user who wants to send an encrypted message can get the intended recipient's public key from a public directory. They use this key to encrypt the message, and they send it to the recipient. When the recipient gets the message, they decrypt it with their private key, which no one else should have access to.

Whitfield Diffie and Martin Hellman, researchers at Stanford University, first publicly proposed asymmetric encryption in their 1976 paper, New Directions in Cryptography. (The concept had been independently and covertly proposed by James Ellis several years before, when he was working for the British Government Communications Headquarters.) An asymmetric algorithm, as outlined in the Diffie-Hellman paper, is a trap-door or one-way function. Such a function is easy to perform in one direction, but difficult or impossible to reverse. For example, it is easy to compute the product of two given numbers, but it is computationally much harder to find the two factors given only their product. Given both the product and one of the factors, it is easy to compute the second factor, which demonstrates that the hard direction of the computation can be made easy when access to some secret key is given. The function used, the algorithm, is known universally. This knowledge does not enable the decryption of the message. The only added information that is necessary and sufficient for decryption is the recipient's secret key.

Asymmetric key encryption uses different keys for encryption and decryption. These two keys are mathematically related and they form a key pair. One of these two keys should be kept private, called the private key, and the other can be made public (it can even be sent in mail), called the public key. Hence this is also called public key encryption. A private key is typically used for encrypting a message digest; in such an application the algorithm is called a message-digest encryption algorithm (this is the basis of digital signatures). A public key is typically used for encrypting a secret session key; in such an application the algorithm is called a key encryption algorithm. Popular public-key algorithms are RSA (invented by Rivest, Shamir and Adleman) and DSA (the Digital Signature Algorithm). While for ordinary use of RSA a key size of 768 bits can be used, for corporate use a key size of 1024 bits, and for extremely valuable information a key size of 2048 bits, should be used.
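As a hedged sketch of the key-pair idea just described (again assuming the third-party cryptography package), the following encrypts with a freshly generated public key and decrypts with the matching private key:

```python
# Public-key sketch: the public half encrypts, only the private half decrypts.
# Assumes the third-party "cryptography" package.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()           # safe to publish

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

ciphertext = public_key.encrypt(b"session key material", oaep)
assert private_key.decrypt(ciphertext, oaep) == b"session key material"
```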
Asymmetric key encryption is much slower than symmetric key encryption, and hence asymmetric algorithms are mainly used for key exchange and digital signatures.

3. Hash Function
A cryptographic hash function is a hash function; that is, an algorithm that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an (accidental or intentional) change to the data will (with very high probability) change the hash value. The data to be encoded are often called the "message," and the hash value is sometimes called the message digest or simply digest.

The ideal cryptographic hash function has four main properties:
it is easy to compute the hash value for any given message
it is infeasible to generate a message that has a given hash
it is infeasible to modify a message without changing the hash
it is infeasible to find two different messages with the same hash

Cryptographic hash functions have many information security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for functions with rather different properties and purposes.

There are several well-known hash functions in use today:
Hashed Message Authentication Code (HMAC): Combines authentication via a shared secret with hashing.
Message Digest 2 (MD2): Byte-oriented, produces a 128-bit hash value from an arbitrary-length message, designed for smart cards.
MD4: Similar to MD2, designed specifically for fast processing in software.
MD5: Similar to MD4 but slower because the data is manipulated more. Developed after potential weaknesses were reported in MD4.
Secure Hash Algorithm (SHA): Modeled after MD4 and proposed by NIST for the Secure Hash Standard (SHS), produces a 160-bit hash value.
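The properties above are easy to observe with Python's standard library. This small sketch hashes a message, shows that a one-character change alters the whole digest, and computes an HMAC that binds the digest to a shared secret (the key and messages are made up for illustration):

```python
import hashlib
import hmac

d1 = hashlib.sha256(b"the quick brown fox").hexdigest()
d2 = hashlib.sha256(b"the quick brown fox!").hexdigest()
assert d1 != d2        # a tiny change to the message changes the whole digest

# HMAC: hashing combined with a shared secret, as described above.
mac = hmac.new(b"shared-secret-key", b"the quick brown fox",
               hashlib.sha256).hexdigest()
```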
Substitution cipher
In cryptography, a substitution cipher is a method of encryption by which units of plaintext are replaced with ciphertext, according to a regular system; the "units" may be single letters (the most common), pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing an inverse substitution.

Substitution ciphers can be compared with transposition ciphers. In a transposition cipher, the units of the plaintext are rearranged in a different and usually quite complex order, but the units themselves are left unchanged. By contrast, in a substitution cipher, the units of the plaintext are retained in the same sequence in the ciphertext, but the units themselves are altered.

There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a simple substitution cipher; a cipher that operates on larger groups of letters is termed polygraphic. A monoalphabetic cipher uses a fixed substitution over the entire message, whereas a polyalphabetic cipher uses a number of substitutions at different positions in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice versa.

For example:
Plain Alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher Alphabet: Z Y X W V U T S R Q P O N M L K J I H G F E D C B A
With the above, the plaintext "This is a sample" would encrypt to "Gsrh rh z hznkov."

With substitution ciphers, the secret is in the mapping between the plain and cipher alphabets. However, there are several analytical techniques to help break these ciphers with only the ciphertext; see frequency analysis.

Types of substitution
1. Homophonic substitution
2. Polyalphabetic substitution
3. Polygraphic substitution
4. Mechanical substitution ciphers

Transposition cipher
It is a simple data encryption scheme in which plaintext characters are shifted in some regular pattern to form ciphertext. In manual systems transpositions are generally carried out with the aid of an easily remembered mnemonic. For example, a popular schoolboy cipher is the "rail fence," in which letters of the plaintext are written alternating between rows and the rows are then read sequentially to give the cipher. In a depth-two rail fence (two rows) the message WE ARE DISCOVERED SAVE YOURSELF would be written
W A E I C V R D A E O R E F
 E R D S O E E S V Y U S L
and read off row by row as WAEIC VRDAE OREFE RDSOE ESVYU SL.

Simple frequency counts on the ciphertext would reveal to the cryptanalyst that letters occur with precisely the same frequency in the cipher as in an average plaintext and, hence, that a simple rearrangement of the letters is probable.

The rail fence is the simplest example of a class of transposition ciphers, known as route ciphers, that enjoyed considerable popularity in the early history of cryptology. In general, the elements of the plaintext (usually single letters) are written in a prearranged order (route) into a geometric array (matrix), typically a rectangle, agreed upon in advance by the transmitter and receiver, and then read off by following another prescribed route through the matrix to produce the cipher. The key in a route cipher consists of keeping secret the geometric array, the starting point, and the routes. Clearly both the matrix and the routes can be much more complex than in this example; but even so, they provide little security. One form of transposition (permutation) that was widely used depends on an easily remembered key word for identifying the route in which the columns of a rectangular matrix are to be read: for example, using the key word AUTHOR and ordering the columns by the lexicographic order of the letters in the key word. [The original shows the resulting matrix here.] In decrypting a route cipher, the receiver enters the ciphertext symbols into the agreed-upon matrix according to the encryption route and then reads the plaintext according to the original order of entry. A significant improvement in cryptosecurity can be achieved by reencrypting the cipher obtained from one transposition with another transposition. Because the result (product) of two transpositions is also a transposition, the effect of multiple transpositions is to define a complex route in the matrix, which in itself would be difficult to describe by any simple mnemonic. In the same class also fall systems that make use of perforated cardboard matrices called grilles; descriptions of such systems can be found in most older books on cryptography. In contemporary cryptography, transpositions serve principally as one of several encryption steps in forming a compound or product cipher. A toy implementation of both techniques appears below.
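Here is the promised toy implementation of both examples; it is illustrative only and offers no real security. The reversed cipher alphabet reproduces the substitution table above, and the two-row slicing reproduces the depth-two rail fence.

```python
import string

# Simple (monoalphabetic) substitution with the reversed cipher alphabet.
PLAIN = string.ascii_uppercase
TABLE = str.maketrans(PLAIN, PLAIN[::-1])      # A<->Z, B<->Y, ...
print("THIS IS A SAMPLE".translate(TABLE))     # -> GSRH RH Z HZNKOV

# Depth-two rail fence: alternate letters between two rows, then read
# the rows one after the other.
def rail_fence2(message: str) -> str:
    letters = message.replace(" ", "")
    return letters[0::2] + letters[1::2]       # row 1 then row 2

print(rail_fence2("WE ARE DISCOVERED SAVE YOURSELF"))
# -> WAEICVRDAEOREFERDSOEESVYUSL
```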
Making "Good" Encryption Algorithms
So far, the encryption algorithms we have seen have been trivial, intended primarily to demonstrate the concepts of substitution and permutation. At the same time, we have examined several approaches cryptanalysts use to attack encryption algorithms. Now we examine algorithms that are widely used in the commercial world. Unlike the previous sections, this section does not delve deeply into the details either of the inner workings of an algorithm or its cryptanalysis. (We save that investigation for Chapter 10.)

What Makes a "Secure" Encryption Algorithm?
There are many kinds of encryption, including many techniques beyond those we discuss in this book. Suppose you have text to encrypt. How do you choose an encryption algorithm for a particular application? To answer this question, reconsider what we have learned so far about encryption. We looked at two broad classes of algorithms: substitutions and transpositions. Substitutions "hide" the letters of the plaintext, and multiple substitutions dissipate high letter frequencies to make it harder to determine how the substitution is done. By contrast, transpositions scramble text so that adjacent-character analysis fails.

Sidebar 2-4: Soviet Encryption During World War II
Kahn [KAH96] describes a system that the Soviet Union thought unbreakable during World War II. It combined substitution with a one-time pad. The basic idea was to diffuse high-frequency letters by mapping them to single digits. This approach kept the length of cryptograms small and thus reduced the on-air time as the message was transmitted. To see how the encryption worked, consider the eight most common letters of the English language: ASINTOER, arranged as in "a sin to er(r)" to make them easy to remember. These letters were assigned to single digits, 0 to 7. To encode a message, an analyst would begin by selecting a keyword that became the first row of a matrix. Then, the remaining letters of the alphabet were listed in rows underneath. Moving vertically through the matrix, the digits 0 to 7 were assigned to the eight common letters, and then the two-digit groups from 80 to 99 were mapped to the remaining letters of the alphabet plus any symbols. In the example, the keyword is SUNDAY. [The original shows the resulting keyword matrix here.] Under that assignment, the message "whereis/456/airborne" would be encoded as 99983431 09344556 69361480 7423. (Digits of plaintext numbers were repeated.) Finally, the numerical message was encrypted with a one-time pad from a common reference book with numerical tables, one that would not arouse suspicion, such as a navigator's book of tables.

For each type of encryption we considered, we described the advantages and disadvantages. But there is a broader question: What does it mean for a cipher to be "good"? The meaning of good depends on the intended use of the cipher. A cipher to be used by military personnel in the field has different requirements from one to be used in a secure installation with substantial computer support. In this section, we look more closely at the different characteristics of ciphers.

Shannon's Characteristics of "Good" Ciphers
In 1949, Claude Shannon [SHA49] proposed several characteristics that identify a good cipher.
1. The amount of secrecy needed should determine the amount of labor appropriate for the encryption and decryption.
Principle 1 is a reiteration of the principle of timeliness from Chapter 1 and of the earlier observation that even a simple cipher may be strong enough to deter the casual interceptor or to hold off any interceptor for a short time.

2. The set of keys and the enciphering algorithm should be free from complexity.
This principle implies that we should restrict neither the choice of keys nor the types of plaintext on which the algorithm can work. For instance, an algorithm that works only on plaintext having an equal number of As and Es is useless. Similarly, it would be difficult to select keys such that the sum of the values of the letters of the key is a prime number. Restrictions such as these make the use of the encipherment prohibitively complex. If the process is too complex, it will not be used. Furthermore, the key must be transmitted, stored, and remembered, so it must be short.

3. The implementation of the process should be as simple as possible.
Principle 3 was formulated with hand implementation in mind: a complicated algorithm is prone to error or likely to be forgotten. With the development and popularity of digital computers, algorithms far too complex for hand implementation became feasible. Still, the issue of complexity is important. People will avoid an encryption algorithm whose implementation process severely hinders message transmission, thereby undermining security. And a complex algorithm is more likely to be programmed incorrectly.

4. Errors in ciphering should not propagate and cause corruption of further information in the message.
Principle 4 acknowledges that humans make errors in their use of enciphering algorithms. One error early in the process should not throw off the entire remaining ciphertext. For example, dropping one letter in a columnar transposition throws off the entire remaining encipherment. Unless the receiver can guess where the letter was dropped, the remainder of the message will be unintelligible. By contrast, reading the wrong row or column for a polyalphabetic substitution affects only one character; remaining characters are unaffected.

5. The size of the enciphered text should be no larger than the text of the original message.
The idea behind principle 5 is that a ciphertext that expands dramatically in size cannot possibly carry more information than the plaintext, yet it gives the cryptanalyst more data from which to infer a pattern. Furthermore, a longer ciphertext implies more space for storage and more time to communicate.

These principles were developed before the ready availability of digital computers, even though Shannon was aware of computers and the computational power they represented. Thus, some of the concerns he expressed about hand implementation are not really limitations on computer-based implementation. For example, a cipher's implementation on a computer need not be simple, as long as the time complexity of the implementation is tolerable.

Properties of "Trustworthy" Encryption Systems
Commercial users have several requirements that must be satisfied when they select an encryption algorithm. Thus, when we say that encryption is "commercial grade," we mean that it meets these constraints:
It is based on sound mathematics. Good cryptographic algorithms are not just invented; they are derived from solid principles.
It has been analyzed by competent experts and found to be sound.
Even the best cryptographic experts can think of only so many possible attacks, and the developers may become too convinced of the strength of their own algorithm. Thus, a review by critical outside experts is essential.
It has stood the "test of time." As a new algorithm gains popularity, people continue to review both its mathematical foundations and the way it builds upon those foundations. Although a long period of successful use and analysis is not a guarantee of a good algorithm, the flaws in many algorithms are discovered relatively soon after their release.

Three algorithms are popular in the commercial world: DES (data encryption standard), RSA (Rivest-Shamir-Adleman, named after its inventors), and AES (advanced encryption standard). The DES and RSA algorithms (as well as others) meet our criteria for commercial-grade encryption; AES, which is quite new, meets the first two and is starting to achieve widespread adoption.

Symmetric and Asymmetric Encryption Systems
Recall that the two basic kinds of encryption are symmetric (also called "secret key") and asymmetric (also called "public key"). Symmetric algorithms use one key, which works for both encryption and decryption. Usually, the decryption algorithm is closely related to the encryption one. (For example, the Caesar cipher with a shift of 3 uses the encryption algorithm "substitute the character three letters later in the alphabet" with the decryption "substitute the character three letters earlier in the alphabet.") The symmetric systems provide a two-way channel to their users: A and B share a secret key, and they can both encrypt information to send to the other as well as decrypt information from the other. As long as the key remains secret, the system also provides authentication, proof that a message received was not fabricated by someone other than the declared sender. Authenticity is ensured because only the legitimate sender can produce a message that will decrypt properly with the shared key.

The symmetry of this situation is a major advantage of this type of encryption, but it also leads to a problem: key distribution. How do A and B obtain their shared secret key? And only A and B can use that key for their encrypted communications. If A wants to share encrypted communication with another user C, A and C need a different shared key. Key distribution is the major difficulty in using symmetric encryption. In general, n users who want to communicate in pairs need n * (n - 1)/2 keys. In other words, the number of keys needed increases at a rate proportional to the square of the number of users! So a property of symmetric encryption systems is that they require a means of key distribution. A quick calculation below makes this growth concrete.

Public key systems, on the other hand, excel at key management. By the nature of the public key approach, you can send a public key in an e-mail message or post it in a public directory. Only the corresponding private key, which presumably is kept private, can decrypt what has been encrypted with the public key. But for both kinds of encryption, a key must be kept well secured. Once the symmetric or private key is known by an outsider, all messages written previously or in the future can be decrypted (and hence read or modified) by the outsider. So, for all encryption algorithms, key management is a major issue. It involves storing, safeguarding, and activating keys.
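The quick calculation promised above, in Python:

```python
# Pairwise symmetric keys grow with the square of the number of users;
# public-key systems need only one key pair per user.
def symmetric_keys(n: int) -> int:
    return n * (n - 1) // 2

for n in (10, 100, 1000):
    print(f"{n} users: {symmetric_keys(n)} shared keys vs {n} key pairs")
# 10 users: 45 ...  100 users: 4950 ...  1000 users: 499500 ...
```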
Stream and Block Ciphers
Most of the ciphers studied in this chapter are stream ciphers; that is, they convert one symbol of plaintext immediately into a symbol of ciphertext. (The exception is the columnar transposition cipher.) The transformation depends only on the symbol, the key, and the control information of the encipherment algorithm. A model of stream enciphering is shown in Figure 2-6. [Figure 2-6. Stream Encryption: figure not reproduced here.]

Some kinds of errors, such as skipping a character in the key during encryption, affect the encryption of all future characters. However, such errors can sometimes be recognized during decryption because the plaintext will be properly recovered up to a point, and then all following characters will be wrong. If that is the case, the receiver may be able to recover from the error by dropping a character of the key on the receiving end. Once the receiver has successfully recalibrated the key with the ciphertext, there will be no further effects from this error.

To address this problem and make it harder for a cryptanalyst to break the code, we can use block ciphers. A block cipher encrypts a group of plaintext symbols as one block. The columnar transposition and other transpositions are examples of block ciphers. In the columnar transposition, the entire message is translated as one block. The block size need not have any particular relationship to the size of a character. Block ciphers work on blocks of plaintext and produce blocks of ciphertext, as shown in Figure 2-7. [Figure 2-7. Block Cipher Systems: the central box represents an encryption machine; the previous plaintext pair has been converted to "po", the current one being converted is "IH", and the machine is soon to convert "ES".]

Table 2-3 compares the advantages and disadvantages of stream and block encryption algorithms.

Table 2-3. Comparing Stream and Block Algorithms.

Stream Encryption Algorithms
Advantages:
• Speed of transformation. Because each symbol is encrypted without regard for any other plaintext symbols, each symbol can be encrypted as soon as it is read. Thus, the time to encrypt a symbol depends only on the encryption algorithm itself, not on the time it takes to receive more plaintext.
• Low error propagation. Because each symbol is separately encoded, an error in the encryption process affects only that character.
Disadvantages:
• Low diffusion. Each symbol is separately enciphered. Therefore, all the information of that symbol is contained in one symbol of the ciphertext.
• Susceptibility to malicious insertions and modifications. Because each symbol is separately enciphered, an active interceptor who has broken the code can splice together pieces of previous messages and transmit a spurious new message that may look authentic.

Block Encryption Algorithms
Advantages:
• High diffusion. Information from the plaintext is diffused into several ciphertext symbols. One ciphertext block may depend on several plaintext letters.
• Immunity to insertion of symbols. Because blocks of symbols are enciphered, it is impossible to insert a single symbol into one block. The length of the block would then be incorrect, and the decipherment would quickly reveal the insertion.
Disadvantages:
• Slowness of encryption. The person or machine using a block cipher must wait until an entire block of plaintext symbols has been received before starting the encryption process.
• Error propagation. An error will affect the transformation of all other characters in the same block.
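A toy stream cipher makes the left column of the table concrete: each byte is transformed the moment it arrives, and an error in one byte affects only that byte. This sketch XORs each plaintext byte with a repeating key, which is emphatically not secure; it only illustrates the symbol-at-a-time behavior.

```python
from itertools import cycle

def xor_stream(data: bytes, key: bytes) -> bytes:
    # Each byte is enciphered as soon as it is read, independently of the rest.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

ct = xor_stream(b"attack at dawn", b"key")
assert xor_stream(ct, b"key") == b"attack at dawn"   # same operation decrypts
```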
Data Encryption Standard (DES)
It is a widely used method of data encryption using a private (secret) key that was judged so difficult to break by the U.S. government that it was restricted from exportation to other countries. There are 72,000,000,000,000,000 (72 quadrillion) or more possible encryption keys that can be used. For each given message, the key is chosen at random from among this enormous number of keys. Like other private key cryptographic methods, both the sender and the receiver must know and use the same private key.

DES applies a 56-bit key to each 64-bit block of data. The process can run in several modes and involves 16 rounds or operations. Although this is considered "strong" encryption, many companies use "triple DES", which applies three keys in succession. This is not to say that a DES-encrypted message cannot be "broken." Early in 1997, RSA Data Security, owners of another encryption approach, offered a $10,000 reward for breaking a DES message. A cooperative effort on the Internet of over 14,000 computer users trying out various keys finally deciphered the message, discovering the key after running through only 18 quadrillion of the 72 quadrillion possible keys. Few messages sent today with DES encryption are likely to be subject to this kind of code-breaking effort.

DES originated at IBM and was adopted as a U.S. federal standard in 1977. It is specified in the ANSI X3.92 and X3.106 standards and in the Federal FIPS 46 and 81 standards. Concerned that the encryption algorithm could be used by unfriendly governments, the U.S. government has prevented export of the encryption software. However, free versions of the software are widely available on bulletin board services and Web sites. Since there is some concern that the encryption algorithm will not remain unbreakable, NIST has indicated DES will not be recertified as a standard, and submissions for its replacement have been accepted. The next standard is known as the Advanced Encryption Standard (AES).

As a practical matter, anyone today who wants high security uses a more powerful version of DES called Triple-DES. To start encrypting with Triple-DES, two 56-bit keys are selected. Data is encrypted via DES three times: the first time by the first key, the second time by the second key, and the third time by the first key once more. This process creates an encrypted data stream that is unbreakable with today's code-breaking techniques and available computing power, while remaining compatible with DES.

DES at a glance (the original shows a diagram of the Feistel F-function here):
Designers: IBM
First published: 1977 (standardized in January 1979)
Derived from: Lucifer
Successors: Triple DES, G-DES, DES-X, LOKI89, ICE
Key size: 56 bits
Block size: 64 bits
Structure: balanced Feistel network
Rounds: 16
Best public cryptanalysis: DES is now considered insecure because a brute force attack is possible (see EFF DES cracker). As of 2008, the best analytical attack is linear cryptanalysis, which requires 2^43 known plaintexts and has a time complexity of 2^39 to 2^43 (Junod, 2001).
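Checking the arithmetic above in Python: a 56-bit key gives roughly 72 quadrillion keys, and the 18 quadrillion keys tried in the 1997 challenge amount to about a quarter of the keyspace.

```python
keyspace = 2 ** 56
print(f"{keyspace:,}")                        # 72,057,594,037,927,936
print(18_000_000_000_000_000 / keyspace)      # about 0.25 of all keys
```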
Modes of DES
FIPS 81 describes four approved modes of DES: Electronic Codebook (ECB) mode, Cipher Block Chaining (CBC) mode, Cipher Feedback (CFB) mode, and Output Feedback (OFB) mode. The National Institute of Standards and Technology (NIST) Special Publication 800-38A describes a fifth mode, Counter (CTR). These modes can be used with both DES and Triple DES. Key differences in each mode are error propagation and block vs. stream behavior.
1. Error propagation means an error in a step of encryption or decryption (such as a bit flipped from 0 to 1) propagates to subsequent steps, which causes further errors.
2. A block cipher encrypts a set block size of data (64 bits for ECB and CBC modes).
3. A stream cipher encrypts bits or groups of bits (1–64 bits in CFB, OFB, and CTR modes). Although DES is a block cipher, it emulates stream ciphers in these modes.

1. Electronic Codebook (ECB) Mode
Electronic Codebook is the DES native mode, "a direct application of the DES algorithm to encrypt and decrypt data." In this mode, each block of plaintext is independently encrypted into a respective block of ciphertext. This is done via a Feistel cipher (named after Horst Feistel, one of the creators of Lucifer), which creates 16 subkeys based on the symmetric key and encrypts the plaintext via 16 rounds of transformation. The same process is used (with the symmetric key) to convert ciphertext back into plaintext; the difference is that the 16 subkeys are supplied in reverse order. Repeated blocks of identical plaintext result in repeated blocks of ciphertext, which can aid cryptanalysis of the ciphertext. This effect is best illustrated visually. [The original shows two images here: the SANS logo in bitmap format, and the same bitmap encrypted via DES ECB mode.] Although the bitmap data is encrypted, the original pattern is clearly visible, because repeated blocks of plaintext pixels in the bitmap are encrypted into repeated blocks of respective ciphertext pixels. In this mode, errors do not propagate, as each block is encrypted independently. The term Codebook refers to cryptographic code books, which contain dictionaries of words or phrases (such as "Attack has begun") with a coded equivalent ("The eagle has flown").

2. Cipher Block Chaining (CBC) Mode
Cipher Block Chaining mode is a block cipher mode which XORs (exclusive ORs) each new block of plaintext with the previous block of ciphertext (they are "chained" together). This means repeated blocks of plaintext do not result in repeated blocks of ciphertext. CBC also uses an initialization vector, which is a random initial block used to ensure that two identical plaintexts result in different ciphertexts (due to different initialization vectors). [The original shows the same SANS logo bitmap encrypted with DES CBC mode: no pattern is visible.] This is true for all DES modes other than ECB. In this mode, errors propagate, as each previous step's encrypted output is XORed ("chained") with the new block of plaintext. A toy demonstration of the ECB/CBC difference appears after the OFB description below.

3. Cipher Feedback (CFB) Mode
Cipher Feedback mode is a stream cipher mode that encrypts plaintext by breaking it into units of X (from 1 to 64) bits. This allows bit- or byte-level encryption. CFB mode uses a random initialization vector, and previous units of ciphertext are XORed with subsequent units of plaintext (the cipher is "fed back" to the plaintext). As with CBC, errors propagate.

4. Output Feedback (OFB) Mode
Like CFB mode, Output Feedback mode uses a random initialization vector and encrypts plaintext by breaking it down into a stream, encrypting units of X (from 1 to 64) bits of plaintext. OFB mode differs from CFB mode by creating a pseudo-random stream of bits (called "output"), which is XORed with the plaintext during each step (the "output" is "fed back" to the plaintext). Because the output (and not ciphertext) is XORed with the plaintext, errors do not propagate.
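The promised toy demonstration of the ECB/CBC difference follows. The "block cipher" here is a hash-based stand-in (real block ciphers are invertible; this toy is not), so the sketch only illustrates how the modes combine blocks: identical plaintext blocks produce identical ECB ciphertext blocks, while CBC chaining hides the repetition.

```python
import hashlib

def toy_block_cipher(block: bytes, key: bytes) -> bytes:
    # Stand-in for a real block cipher; for pattern illustration only.
    return hashlib.sha256(key + block).digest()[:4]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, iv = b"KEY!", b"\x01\x02\x03\x04"
blocks = [b"SANS"] * 3                           # repeated plaintext blocks

ecb = [toy_block_cipher(b, key) for b in blocks]
cbc, prev = [], iv
for b in blocks:
    prev = toy_block_cipher(xor(b, prev), key)   # chain in previous ciphertext
    cbc.append(prev)

print(len(set(ecb)))   # 1: all ECB blocks identical, the pattern shows
print(len(set(cbc)))   # 3: CBC blocks all differ
```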
5. Counter (CTR) Mode
Counter mode is a stream cipher mode like OFB mode; the key difference is the addition of counter blocks. The counter can be added or concatenated to a nonce (a random value that is used once), and then incremented for each unit of plaintext that is encrypted. The first counter block acts as an initialization vector. In each round, the encrypted counter blocks are XORed with the plaintext.

Advanced Encryption Standard
AES stands for Advanced Encryption Standard. AES is a symmetric key encryption technique that replaces the commonly used Data Encryption Standard (DES). It was the result of a worldwide call for submissions of encryption algorithms issued by the US Government's National Institute of Standards and Technology (NIST) in 1997 and completed in 2000. The winning algorithm, Rijndael, was developed by two Belgian cryptologists, Vincent Rijmen and Joan Daemen.

AES provides strong encryption and was selected by NIST as a Federal Information Processing Standard in November 2001 (FIPS-197). In June 2003 the U.S. Government (NSA) announced that AES is secure enough to protect classified information up to the TOP SECRET level, which is the highest security level, defined as information which would cause "exceptionally grave damage" to national security if disclosed to the public.

The AES algorithm uses one of three cipher key strengths: a 128-, 192-, or 256-bit encryption key (password). Each encryption key size causes the algorithm to behave slightly differently, so the increasing key sizes not only offer a larger number of bits with which you can scramble the data, but also increase the complexity of the cipher algorithm. BitZipper supports 128- and 256-bit encryption keys, which are the two key strengths supported by WinZip 9. Both key strengths provide significantly better security than standard ZIP 2.0 encryption. It is slightly faster to encrypt and decrypt data protected with 128-bit AES, but with today's fast PCs the time difference is barely noticeable.

Steps in AES
1. SubBytes
2. ShiftRows
3. MixColumns
4. AddRoundKey

1. The SubBytes step
In the SubBytes step, each byte in the state matrix is replaced with its entry in a fixed 8-bit lookup table S, the Rijndael S-box: b[i][j] = S(a[i][j]). This operation provides the non-linearity in the cipher. The S-box used is derived from the multiplicative inverse over GF(2^8), known to have good non-linearity properties. To avoid attacks based on simple algebraic properties, the S-box is constructed by combining the inverse function with an invertible affine transformation. The S-box is also chosen to avoid any fixed points (and so is a derangement), and also any opposite fixed points.

2. The ShiftRows step
In the ShiftRows step, bytes in each row of the state are shifted cyclically to the left. The number of places each byte is shifted differs for each row. The ShiftRows step operates on the rows of the state; it cyclically shifts the bytes in each row by a certain offset. For AES, the first row is left unchanged. Each byte of the second row is shifted one to the left. Similarly, the third and fourth rows are shifted by offsets of two and three respectively. For blocks of sizes 128 bits and 192 bits, the shifting pattern is the same: row n is shifted left circularly by n-1 bytes. In this way, each column of the output state of the ShiftRows step is composed of bytes from each column of the input state. (Rijndael variants with a larger block size have slightly different offsets. For a 256-bit block, the first row is unchanged and the shifting for the second, third and fourth rows is 1 byte, 3 bytes and 4 bytes respectively; this change only applies to the Rijndael cipher when used with a 256-bit block, as AES does not use 256-bit blocks.)
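ShiftRows is simple enough to sketch directly; the state entries below are labels rather than real bytes, chosen to make the rotation visible.

```python
# ShiftRows on a 4x4 AES state: row i is rotated left by i positions
# (counting rows from 0, matching "row n shifted by n-1" above).
state = [["a0", "a1", "a2", "a3"],
         ["b0", "b1", "b2", "b3"],
         ["c0", "c1", "c2", "c3"],
         ["d0", "d1", "d2", "d3"]]

def shift_rows(state):
    return [row[i:] + row[:i] for i, row in enumerate(state)]

for row in shift_rows(state):
    print(row)
# ['a0', 'a1', 'a2', 'a3']
# ['b1', 'b2', 'b3', 'b0']
# ['c2', 'c3', 'c0', 'c1']
# ['d3', 'd0', 'd1', 'd2']
```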
3. The MixColumns step
In the MixColumns step, each column of the state is multiplied with a fixed polynomial c(x). The four bytes of each column of the state are combined using an invertible linear transformation. The MixColumns function takes four bytes as input and outputs four bytes, where each input byte affects all four output bytes. Together with ShiftRows, MixColumns provides diffusion in the cipher. During this operation, each column is multiplied by a known matrix, which for the 128-bit key is
2 3 1 1
1 2 3 1
1 1 2 3
3 1 1 2
The multiplication operation is defined as follows: multiplication by 1 means no change, multiplication by 2 means shifting to the left, and multiplication by 3 means shifting to the left and then performing XOR with the initial unshifted value. After shifting, a conditional XOR with 0x11B should be performed if the shifted value is larger than 0xFF. (A small sketch of this byte arithmetic appears after the DES/AES comparison below.) In a more general sense, each column is treated as a polynomial over GF(2^8) and is then multiplied modulo x^4 + 1 with a fixed polynomial c(x) = 0x03·x^3 + x^2 + x + 0x02. The coefficients are displayed in their hexadecimal equivalent of the binary representation of bit polynomials from GF(2)[x]. The MixColumns step can also be viewed as a multiplication by a particular MDS matrix in a finite field. This process is described further in the article Rijndael mix columns.

4. The AddRoundKey step
In the AddRoundKey step, each byte of the state is combined with a byte of the round subkey using the bitwise XOR operation.

Difference between DES and AES
DES is much older, while AES is relatively new.
DES is breakable, while AES remains unbroken in practice.
DES uses a much smaller key size compared to AES.
DES uses a smaller block size compared to AES.
DES uses a balanced Feistel structure, while AES uses a substitution-permutation network.
DES was originally designed to run in specialized hardware and is considered "computationally expensive" on general-purpose processors; AES was designed to run efficiently on a variety of processors, including general-purpose ones.
AES is more secure than DES, as it uses larger blocks for encryption and its algorithm is more complicated than that of DES.
The number of possible AES 128-bit keys is about 10^21 times greater than the number of DES 56-bit keys; hence a machine that could recover a DES key in a second (i.e., try about 2^55 keys per second) would take about 149 trillion years to crack a 128-bit AES key.
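Here is the sketch of the GF(2^8) byte arithmetic referenced in the MixColumns description: multiplying by 2 is a left shift with a conditional reduction, and multiplying by 3 is "times 2, then XOR the original value". The test values are the standard worked example for this field.

```python
def xtime(b: int) -> int:
    # Multiply by 0x02 in GF(2^8): shift left, then reduce. XORing the 9-bit
    # result with 0x11B is the same as XORing the low 8 bits with 0x1B.
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b > 0xFF else b

def mul3(b: int) -> int:
    # Multiply by 0x03: times 2, then XOR with the original value.
    return xtime(b) ^ b

print(hex(xtime(0x57)))   # 0xae
print(hex(mul3(0x57)))    # 0xf9
```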
RSA
RSA is an algorithm for public-key cryptography that is based on the presumed difficulty of factoring large integers, the factoring problem. RSA stands for Ron Rivest, Adi Shamir and Leonard Adleman, who first publicly described it in 1977. Clifford Cocks, an English mathematician, had developed an equivalent system in 1973, but it was classified until 1997. A user of RSA creates and then publishes the product of two large prime numbers, along with an auxiliary value, as their public key. The prime factors must be kept secret. Anyone can use the public key to encrypt a message, but with currently published methods, if the public key is large enough, only someone with knowledge of the prime factors can feasibly decode the message.[1] Whether breaking RSA encryption is as hard as factoring is an open question known as the RSA problem.

The RSA algorithm was publicly described in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman at MIT; the letters RSA are the initials of their surnames, listed in the same order as on the paper.

RSA Operation
The RSA algorithm involves three steps: key generation, encryption and decryption.

1. Key generation
RSA involves a public key and a private key. The public key can be known to everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted using the private key. The keys for the RSA algorithm are generated the following way:
1. Choose two distinct prime numbers p and q. For security purposes, the integers p and q should be chosen at random, and should be of similar bit-length. Prime integers can be efficiently found using a primality test.
2. Compute n = pq. n is used as the modulus for both the public and private keys.
3. Compute φ(n) = (p - 1)(q - 1), where φ is Euler's totient function.
4. Choose an integer e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1; i.e., e and φ(n) are coprime. e is released as the public key exponent. An e having a short bit-length and small Hamming weight results in more efficient encryption; most commonly e = 0x10001 = 65,537. However, small values of e (such as 3) have been shown to be less secure in some settings.[4]
5. Determine d as d ≡ e^(-1) (mod φ(n)); i.e., d is the multiplicative inverse of e modulo φ(n). This is more clearly stated as: solve for d given d·e ≡ 1 (mod φ(n)). This is often computed using the extended Euclidean algorithm. d is kept as the private key exponent. By construction, d·e ≡ 1 (mod φ(n)).

The public key consists of the modulus n and the public (or encryption) exponent e. The private key consists of the modulus n and the private (or decryption) exponent d, which must be kept secret. (p, q, and φ(n) must also be kept secret because they can be used to calculate d.)

An alternative, used by PKCS#1, is to choose d matching d·e ≡ 1 (mod λ) with λ = lcm(p − 1, q − 1), where lcm is the least common multiple. Using λ instead of φ(n) allows more choices for d. λ can also be defined using the Carmichael function, λ(n). The ANSI X9.31 standard prescribes, IEEE 1363 describes, and PKCS#1 allows, that p and q match additional requirements: they should be strong primes, and be different enough that Fermat factorization fails.

2. Encryption
Alice transmits her public key (n, e) to Bob and keeps the private key secret. Bob then wishes to send message M to Alice. He first turns M into an integer m, with 0 ≤ m < n, by using an agreed-upon reversible protocol known as a padding scheme. He then computes the ciphertext c corresponding to c = m^e mod n. This can be done quickly using the method of exponentiation by squaring. Bob then transmits c to Alice. Note that at least nine values of m could yield a ciphertext c equal to m,[5] but this is very unlikely to occur in practice.

3. Decryption
Alice can recover m from c by using her private key exponent d, computing m = c^d mod n. Given m, she can recover the original message M by reversing the padding scheme. (In practice, there are more efficient methods of calculating c^d mod n using precomputed values.)
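A toy walk-through of the five key-generation steps with deliberately tiny primes (real keys use primes hundreds of digits long; this is illustration only, and the unpadded arithmetic shown here would be insecure in practice):

```python
p, q = 61, 53                 # step 1: two distinct primes
n = p * q                     # step 2: modulus, n = 3233
phi = (p - 1) * (q - 1)       # step 3: phi(n) = 3120
e = 17                        # step 4: 1 < e < phi(n), gcd(e, phi(n)) = 1
d = pow(e, -1, phi)           # step 5: d = e^-1 mod phi(n) = 2753 (Python 3.8+)

m = 65                        # a message already encoded as an integer m < n
c = pow(m, e, n)              # encryption: c = m^e mod n
assert pow(c, d, n) == m      # decryption: m = c^d mod n
```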
DSA
The Digital Signature Algorithm (DSA) is a United States Federal Government standard or FIPS for digital signatures. It was proposed by the National Institute of Standards and Technology (NIST) in August 1991 for use in their Digital Signature Standard (DSS), specified in FIPS 186,[1] adopted in 1993. A minor revision was issued in 1996 as FIPS 186-1.[2] The standard was expanded further in 2000 as FIPS 186-2 and again in 2009 as FIPS 186-3.[3]

DSA is covered by U.S. Patent 5,231,668, filed July 26, 1991, and attributed to David W. Kravitz,[4] a former NSA employee. This patent was given to "The United States of America as represented by the Secretary of Commerce, Washington, D.C." and NIST has made this patent available worldwide royalty-free.[5] Dr. Claus P. Schnorr claims that his U.S. Patent 4,995,082 (expired) covered DSA; this claim is disputed.[6] DSA is a variant of the ElGamal signature scheme.

DSA is a modification of a signature scheme devised by Taher ElGamal in 1984. In a single paper, ElGamal presented both a public-key encryption scheme and a distinct signature scheme. RSA is used for signing by 'encrypting' the message (normally, the hash of the message) using the secret key, and the signature is verified by 'decrypting' using the public key. This reverses the roles of the keys compared with when RSA is used for confidentiality. RSA uses the same operations for encryption and decryption, and the public and secret keys have a reciprocal relationship to one another. However, the ElGamal cryptosystem uses quite different operations for encryption and decryption, so the trick used to convert RSA into a signature scheme doesn't work. This explains why ElGamal presented a separate (but mathematically related) scheme for signing.

The ElGamal signature scheme uses a randomly chosen number in the signing procedure, so that a given message can have a vast number of different signatures. It is not designed to allow computation of the signed message (or signed hash); only verification. So ElGamal gave the world a public-key cryptosystem that can't be used for signing, and a public-key signature scheme that can't be used for encryption. To my understanding, he didn't do this to prove that these things can be done; rather, these properties fell out from the arithmetic he used.

The ElGamal signature scheme in its original form was never adopted as a widely used standard, mainly because of size: the signatures are large, and the computations time-consuming. The security of the scheme is believed to be based on the difficulty of the discrete logarithm problem (DLP). For DLP to be secure in multiplicative groups (the most commonly used setting), the groups must be very large: originally, 1024 bits was recommended; now, probably 2048 or 4096 bits. (As a side note, in practice, signatures must be "harder" than cryptosystems, because secrets usually lose value in 1-30 years, whereas the secure verification of a signature for legal purposes may be necessary after 50+ years.) The ElGamal signing scheme thus leads to signatures of at least 1024 bits, which are rather cumbersome. Today, most signatures are between 128 and 256 bits in length, with 160 being the most common.

The wizards at NSA made a very clever improvement to ElGamal to create DSA. DSA is set in a large group (1024 bits max in the original standard; extensible to much larger sizes in revisions to the standard), but the computations for signature and verification are performed in a much smaller subgroup, chosen to match the size of the hash function. So, for example, you could choose a 2048-bit group, but use a 160-bit subgroup compatible with SHA-1. In this subgroup, the signature is the same size as the hash, and the computations are quicker; but the security is believed to be based on the size of the main group.
APPENDED CORRECTION: All of the ElGamal algorithms (ElGamal encryption, ElGamal signature, and its derivative the DSA) result in two numbers, each the size of the modulus (or, in the case of DSA, the subgroup). So with a 1024-bit modulus, an ElGamal signature occupies 2048 bits, making it even more cumbersome than described above. For DSA with a 160-bit subgroup (chosen for SHA-1 compatibility), the signature is 320 bits in length. This compares favorably with RSA signatures, which must be modulus-length, so today RSA signatures must be at least 1024 bits long, and for long-term security much larger than that.

What is Data Encryption?
Encryption uses a mathematical algorithm to scramble readable text into a form that cannot be read unless the reader has the key to "unlock," or convert, the information back to its readable form. This means that your sensitive data cannot be accessed without you providing a password.

Types of Data Encryption
There are many different types of data encryption, but not all are reliable. In the beginning, 64-bit encryption was thought to be strong, but was proven wrong with the introduction of 128-bit solutions. AES (Advanced Encryption Standard) is the new standard and permits a maximum key size of 256 bits. In general, the stronger the computer, the better chance it has at breaking a data encryption scheme.

Data encryption schemes generally fall into two categories:
1. Symmetric: e.g., DES, AES
2. Asymmetric: e.g., RSA, DSA