Lecture Eleven - Data Security & Recovery
(Reference: C. J. Date, An Introduction to Database Systems)

We can protect our data from unauthorised access with security controls such as system passwords. However, an infiltrator may bypass these controls, for example by physically removing part of the database or by tapping into a communication line. Since it is hard to prevent an infiltrator from obtaining the data, how can we protect it? One of the most effective measures is data encryption: the infiltrator may obtain the data but cannot interpret it. We store and transmit the data in encrypted form.

Terminology in Data Encryption:

Plaintext: The original data.
Encryption algorithm: An algorithm which converts the plaintext into another form.
Encryption key: An input value for use in the encryption algorithm.
Ciphertext: The encrypted form of the plaintext.

An example of data encryption:

Plaintext: AS KINGFISHERS CATCH FIRE
Encryption key: ELIOT

Steps:

1. Divide the plaintext into blocks of length equal to that of the encryption key:
   AS_KI NGFIS HERS_ CATCH _FIRE
   (Note: '_' means blank space)

2. Replace each character of the plaintext by an integer in the range 00-26, using blank space = 00, A = 01, ..., Z = 26:
   0119001109 1407060919 0805181900 0301200308 0006091805

3. Repeat Step 2 for the encryption key:
   ELIOT = 0512091520

4. For each block of the plaintext, replace each character by the sum modulo 27 of its integer encoding and the integer encoding of the corresponding character of the encryption key:

   Plaintext:   0119001109 1407060919 0805181900 0301200308 0006091805
   Key:         0512091520 0512091520 0512091520 0512091520 0512091520
   Sum mod 27:  0604092602 1919152412 1317000720 0813021801 0518180625

5. Replace each integer encoding in the result of Step 4 by its character equivalent:
   FDIZB SSOXL MQ_GT HMBRA ERRFY

Q: Is it really secure?
Ans.: Not really. A scheme this simple is not difficult to break.
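The five steps above can be sketched in a few lines of Python. This is only an illustration of the lecture's example, not a usable cipher; the function names (shift, encrypt, decrypt) are made up for this sketch. Applying the key cyclically is equivalent to dividing the plaintext into key-length blocks.

```python
# Index in this string = integer encoding (blank = 00, A = 01, ..., Z = 26).
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def shift(text, key, sign):
    """Add (sign=+1) or subtract (sign=-1) the key encoding, modulo 27."""
    out = []
    for i, ch in enumerate(text):
        k = ALPHABET.index(key[i % len(key)])   # key repeats for each block
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * k) % 27])
    return "".join(out)

def encrypt(plaintext, key):
    return shift(plaintext, key, +1)

def decrypt(ciphertext, key):
    return shift(ciphertext, key, -1)

plaintext = "AS KINGFISHERS CATCH FIRE"
ciphertext = encrypt(plaintext, "ELIOT")
print(ciphertext)                    # FDIZBSSOXLMQ GTHMBRAERRFY
print(decrypt(ciphertext, "ELIOT"))  # AS KINGFISHERS CATCH FIRE

# One reason the scheme is weak: anyone holding a matching plaintext and
# ciphertext recovers the key by subtraction.
print(shift(ciphertext, plaintext, -1))  # ELIOTELIOTELIOTELIOTELIOT
```

The printed ciphertext is the Step 5 result with the blocks run together and '_' shown as a blank.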
Public-Key Encryption

In this method, both the encryption algorithm and the encryption key are made freely available, so anyone can convert plaintext into ciphertext. The corresponding decryption key, however, is kept secret.

Note that this method uses two keys: an encryption key and a decryption key. The decryption key cannot be deduced from the encryption key, so even the person who performed the original encryption cannot perform the corresponding decryption unless authorised to do so.

The original idea of public-key encryption is due to Diffie and Hellman; one specific approach, the RSA scheme, is due to Rivest, Shamir and Adleman.

The RSA scheme is based on the facts that:
1. There is a known fast algorithm for determining whether a given number is prime.
2. There is no known fast algorithm for finding the prime factors of a given composite number.

Algorithm for RSA scheme:

1. Randomly choose two distinct large prime numbers p and q, then compute the product r = p * q.

2. Randomly choose a large integer e that is relatively prime to the product (p - 1) * (q - 1). The integer e is the encryption key.

3. Take the decryption key d to be the unique "multiplicative inverse" of e modulo (p - 1) * (q - 1); that is,
   d * e = 1 modulo (p - 1) * (q - 1)

4. Publish the integers r and e but not d.

5. To encrypt a piece of plaintext P, replace it by the ciphertext C, computed as follows:
   C = P^e modulo r

6. To decrypt a piece of ciphertext C, replace it by the plaintext P, computed as follows:
   P = C^d modulo r

Example:

Let p = 3, q = 5, so r = 15 and (p - 1) * (q - 1) = 8.
Let e = 11 (a prime greater than both p and q).
d = 11^(-1) modulo 8 = 3 (since 11 * 3 = 33 = 1 modulo 8)

Now let the plaintext P consist of the integer 13. Then the ciphertext C is:

C = P^e modulo r
  = 13^11 modulo 15
  = 1,792,160,394,037 modulo 15
  = 7

The original plaintext P is then recovered as:

P = C^d modulo r
  = 7^3 modulo 15
  = 343 modulo 15
  = 13

Q: If A sends a ciphertext to B, how does B know that the ciphertext did indeed come from A?

Suppose ECA and ECB are the encryption algorithms for messages to be sent to A and B respectively, and DCA and DCB are the corresponding decryption algorithms. If A wishes to send a plaintext P to B, A first applies DCA to P, then applies ECB to the result, and transmits that as ciphertext C:

C = ECB (DCA (P))

When B receives C, B applies DCB to it, followed by ECA, producing the final result P:

ECA (DCB (C)) = ECA (DCB (ECB (DCA (P))))
              = ECA (DCA (P))    ----- DCB and ECB cancel
              = P                ----- ECA and DCA cancel

Because ECA produces P only if DCA was used in the encryption process, and DCA is known only to A, no one else can forge A's message.

Recovery

In a database, recovery means restoring the database to a state that is known to be correct after some failure has caused the current state to be incorrect.

We must make sure that the database is recoverable, i.e. that any piece of information it contains can be reconstructed from some other information stored somewhere in the system.

Transaction Recovery

A transaction is a logical unit of work; any database update can be defined as one. A transaction begins with the successful execution of BEGIN TRANSACTION and ends with the successful execution of a COMMIT or ROLLBACK command.

    Begin Transaction
    Select * from table;
    Update table set ........
    Commit;

    Begin Transaction
    Update table set .....
    Rollback;

Note carefully that COMMIT and ROLLBACK terminate a transaction, not the program. Generally speaking, a single program consists of a sequence of transactions.
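As a minimal sketch of COMMIT and ROLLBACK in practice, the following uses Python's built-in sqlite3 module with an in-memory database. The account table, the starting balances, the transfer function and the "no negative balance" rule are all assumptions made for illustration; only the idea of one program running a sequence of transactions comes from the lecture.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements below control each transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("John", 1500), ("May", 200)])  # hypothetical balances

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
        if balance < 0:                      # assumed business rule
            raise ValueError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.execute("COMMIT")               # both updates become permanent
        return True
    except ValueError:
        conn.execute("ROLLBACK")             # the first update is undone
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

first = transfer(conn, "John", "May", 1000)   # commits
second = transfer(conn, "John", "May", 1000)  # rolls back, program continues
print(first, second, balances(conn))
```

The second call ends its transaction with ROLLBACK, yet the program carries on and can start further transactions, which is the point of the note above.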
Money Transfer: ($1000 from John's account to May's account)

    John's account = John's account - 1000
    May's account = May's account + 1000

There are four important properties of each transaction:

Atomicity: Transactions are atomic (all or nothing).

Consistency: Transactions preserve database consistency. That is, a transaction transforms a consistent state of the database into another consistent state.

Isolation: No matter how many transactions are running concurrently, any given transaction's updates are concealed from all others until that transaction ends.

Q: What's the implication of that?

Durability: Once a transaction has committed, its updates remain valid even if subsequent operations fail, including a system crash.

System Recovery

System Failures: Such as a power failure or OS crash, which affect all transactions currently in progress but do not physically damage the database. A system failure is also called a soft crash.

Media Failures: Such as a disk crash or cartridge failure, which cause damage to the database. A media failure affects at least those transactions currently using the damaged data. A media failure is also called a hard crash.

Q: What mechanism can be used to recover a crashed system?

Checkpoint

At certain prescribed intervals, the system automatically takes a checkpoint, which involves physically writing the contents of the database buffers out to the physical database and writing a special checkpoint record to the physical log.

The checkpoint record gives a list of all transactions that were in progress at the time the checkpoint was taken.

Checkpoint example
(Figure: timeline of transactions T1 to T5 against time, with a checkpoint at Time c and a system failure at Time f.)

When the system restarts after Time f, transactions T3 and T5 will be undone, and T2 and T4 will be redone. T1 had already finished physically writing to the database, so it is not included in the restart process.

Media Recovery

A media failure is a failure such as a disk head crash or a disk controller failure.
Recovery from such cases basically involves reloading the database from a backup copy and then using the log to redo all transactions that completed since that backup copy was taken.

Q: For non-stop applications such as banking systems, which cannot tolerate system downtime, what configuration can be used to tackle the problem?

Media-Fault-Tolerant System Configuration
(Figure: two copies of DATABASE-W, one on Disk A with Disk Controller-1 and CPU-X, the other on Disk B with Disk Controller-2 and CPU-Y.)
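The media recovery procedure described in this section (reload the backup copy, then use the log to redo the transactions that completed after the backup) can be sketched as follows. The log record layout, the function name media_recover and the sample data are assumptions made for this illustration; a real DBMS log is far more elaborate.

```python
def media_recover(backup, log):
    """Rebuild a database from a backup copy plus the log written since.

    backup: dict mapping item -> value at the time the backup was taken.
    log:    list of records in the order they were written, either
            ("write", txn, item, new_value) or ("commit", txn).
    """
    # Only transactions that completed (committed) since the backup are redone.
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    db = dict(backup)                 # step 1: reload the backup copy
    for rec in log:                   # step 2: redo in log order
        if rec[0] == "write" and rec[1] in committed:
            _, txn, item, new_value = rec
            db[item] = new_value
    return db

# Hypothetical data: T1 committed after the backup was taken, T2 did not.
backup = {"John": 1500, "May": 200}
log = [
    ("write", "T1", "John", 500),
    ("write", "T1", "May", 1200),
    ("commit", "T1"),
    ("write", "T2", "John", 400),     # T2 never committed, so it is ignored
]
recovered = media_recover(backup, log)
print(recovered)
```

Because the recovery starts from the backup copy rather than from the damaged disk, nothing needs to be undone: uncommitted work such as T2 is simply never reapplied.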