PART II CS 245: Database System Principles • • • • Notes 08: Failure Recovery Crash recovery (2 lectures) Concurrency control (3 lectures) Transaction processing (2 lects) Information integration (1 lect) Ch.17[17] Ch.18[18] Ch.19[19] Ch.20[21,22] Hector Garcia-Molina CS 245 Notes 08 1 - x is key of relation R - x y holds in R - Domain(x) = {Red, Blue, Green} is valid index for attribute x of R - no employee should make more than twice the average salary Age White 52 Green 3421 Gray 1 CS 245 Notes 08 2 • Predicates data must satisfy • Examples: • Would like data to be “accurate” or “correct” at all times Name Notes 08 Integrity or consistency constraints Integrity or correctness of data EMP CS 245 3 CS 245 Notes 08 4 Constraints (as we use here) may not capture “full correctness” Definition: • Consistent state: satisfies all constraints • Consistent DB: DB in consistent state Example 1 Transaction constraints • When salary is updated, new salary > old salary • When account record is deleted, balance = 0 CS 245 CS 245 Notes 08 5 Notes 08 6 1 Constraints (as we use here) may not capture “full correctness” Note: could be “emulated” by simple constraints, e.g., account Acct # …. Example 2 Database should reflect real world balance deleted? Reality DB CS 245 Notes 08 7 in any case, continue with constraints... Observation: DB cannot be consistent always! Example: a1 + a2 +…. an = TOT (constraint) Deposit $100 in a2: a2 a2 + 100 TOT TOT + 100 CS 245 Notes 08 9 Transaction: collection of actions that preserve consistency Consistent DB CS 245 T Notes 08 CS 245 Notes 08 8 Example: a1 + a2 +…. an = TOT (constraint) Deposit $100 in a2: a2 a2 + 100 TOT TOT + 100 a2 . . 50 . . 150 . . 150 TOT . . 1000 . . 1000 . . 1100 CS 245 Notes 08 10 Big assumption: If T starts with consistent state + T executes in isolation T leaves consistent state Consistent DB’ 11 CS 245 Notes 08 12 2 Correctness How can constraints be violated? (informally) • Transaction bug • DBMS bug • Hardware failure • If we stop running transactions, DB left consistent • Each transaction sees a consistent DB e.g., disk crash alters balance of account • Data sharing e.g.: T1: give 10% raise to programmers T2: change programmers CS 245 Notes 08 13 CS 245 systems analysts Notes 08 14 Will not consider: How can we prevent/fix violations? • Chapter 8[17]: due to failures only • Chapter 9[18]: due to data sharing only • Chapter 10[19]: due to failures and sharing • How to write correct transactions • How to write correct DBMS • Constraint checking & repair That is, solutions studied here do not need to know constraints CS 245 Notes 08 15 Chapter 8[17]: Recovery CS 245 Events • First order of business: Failure Model CS 245 Notes 08 17 CS 245 Notes 08 Desired Undesired Notes 08 16 Expected Unexpected 18 3 Desired events: see product manuals…. Our failure model memory CS 245 Undesired expected events: System crash - memory lost - cpu halts, resets processor CPU M D disk Notes 08 19 Desired events: see product manuals…. CS 245 Notes 08 Undesired Unexpected: 20 Everything else! Examples: • Disk data is lost • Memory lost without CPU halt • CPU implodes wiping out universe…. Undesired expected events: System crash - memory lost - cpu halts, resets that’s it!! Undesired Unexpected: CS 245 Everything else! Notes 08 21 Is this model reasonable? CS 245 Notes 08 Second order of business: Approach: Add low level checks + redundancy to increase probability model holds Storage hierarchy x E.g., Replicate disk storage (stable store) Memory parity CPU checks CS 245 Notes 08 22 23 Memory CS 245 Notes 08 x Disk 24 4 Operations: Operations: • Input (x): block containing x memory • Output (x): block containing x disk • Input (x): block containing x memory • Output (x): block containing x disk • Read (x,t): do input(x) if necessary t value of x in block • Write (x,t): do input(x) if necessary value of x in block t CS 245 Notes 08 25 Key problem Unfinished transaction Example Constraint: A=B T1: A A 2 B B2 CS 245 Notes 08 26 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A: 8 B: 8 A: 8 B: 8 memory CS 245 Notes 08 27 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A: 8 16 B: 8 16 A: 8 16 B: 8 16 28 A: 8 16 B: 8 memory disk Notes 08 disk Notes 08 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); failure! Output (B); A: 8 B: 8 memory CS 245 CS 245 29 CS 245 disk Notes 08 30 5 • Need atomicity: execute all actions of a transaction or none at all CS 245 Notes 08 31 One solution: undo logging due to: Hansel and Gretel, 1812 AD CS 245 Notes 08 Undo logging One solution: undo logging due to: Hansel and Gretel, 1812 AD • Improved in 1813 AD to durable undo logging (Immediate modification) A:8 B:8 Notes 08 Undo logging 33 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A:8 16 B:8 16 disk A:8 16 B:8 16 A:8 B:8 disk Notes 08 log 34 (Immediate modification) A:8 16 B:8 disk memory 35 CS 245 log Notes 08 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A=B <T1, start> <T1, A, 8> memory CS 245 CS 245 Undo logging (Immediate modification) A=B A:8 B:8 memory CS 245 32 T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); (immediate modification) (immediate modification) Notes 08 A=B <T1, start> <T1, A, 8> <T1, B, 8> log 36 6 Undo logging T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A:8 16 B:8 16 <T1, start> <T1, A, 8> <T1, B, 8> disk A:8 16 B:8 16 log Notes 08 (Immediate modification) T1: Read (A,t); t t2 Write (A,t); Read (B,t); t t2 Write (B,t); Output (A); Output (B); A=B A:8 16 B:8 16 memory CS 245 Undo logging (Immediate modification) CS 245 <T1, start> <T1, A, 8> <T1, B, 8> <T1, commit> A:8 16 B:8 16 disk memory 37 A=B log Notes 08 38 One “complication” One “complication” • Log is first written in memory • Not written to disk on every action • Log is first written in memory • Not written to disk on every action memory A: 8 B: 8 A: 8 16 B: 8 16 Log: <T1,start> <T1, A, 8> <T1, B, 8> CS 245 memory DB A: 8 16 B: 8 A: 8 16 B: 8 16 Log: <T1,start> <T1, A, 8> <T1, B, 8> Log Notes 08 39 CS 245 DB Log BAD STATE #1 Notes 08 One “complication” Undo logging rules • Log is first written in memory • Not written to disk on every action (1) For every action generate undo log record (containing old value) (2) Before x is modified on disk, log records pertaining to x must be on disk (write ahead logging: WAL) (3) Before commit is flushed to log, all writes of transaction must be reflected on disk A: 8 16 B: 8 16 Log: <T1,start> <T1, A, 8> <T1, B, 8> <T1, commit> CS 245 A: 8 16 B: 8 DB Log BAD STATE #2 ... memory <T1, B, 8> <T1, commit> Notes 08 41 CS 245 Notes 08 40 42 7 Recovery rules: Undo logging Recovery rules: • For every Ti with <Ti, start> in log: - If <Ti,commit> or <Ti,abort> in log, do nothing - Else For all <Ti, X, v> in log: write (X, v) output (X ) Write <Ti, abort> to log Undo logging • For every Ti with <Ti, start> in log: - If <Ti,commit> or <Ti,abort> in log, do nothing - Else For all <Ti, X, v> in log: write (X, v) output (X ) Write <Ti, abort> to log IS THIS CORRECT?? CS 245 Notes 08 Recovery rules: 43 Undo logging CS 245 • Can writes of <Ti, abort> records be done in any order (in Step 3)? – Example: T1 and T2 both write A – T1 executed before T2 – T1 and T2 both rolled-back – <T1, abort> written but NOT <T2, abort>? – <T2, abort> written but NOT <T1, abort>? in reverse order (latest earliest) do: - write (X, v) - output (X) (3) For each Ti S do - write <Ti, abort> to log CS 245 Notes 08 T1 write A 45 Notes 08 CS 245 T2 write A Notes 08 time/log 46 To discuss: What if failure during recovery? No problem! Undo idempotent CS 245 44 Question (1) Let S = set of transactions with <Ti, start> in log, but no <Ti, commit> (or <Ti, abort>) record in log (2) For each <Ti, X, v> in log, - if Ti S then Notes 08 • • • • • 47 Redo logging Undo/redo logging, why both? Real world actions Checkpoints Media failures CS 245 Notes 08 48 8 Redo Logging Redo logging (deferred modification) First send Gretel up with no rope, then Hansel goes up safely with rope! T1: Read(A,t); t t2; write (A,t); Read(B,t); t t2; write (B,t); Output(A); Output(B) A: 8 B: 8 A: 8 B: 8 DB memory LOG CS 245 Notes 08 49 CS 245 Notes 08 50 Redo logging (deferred modification) Redo logging (deferred modification) T1: Read(A,t); t t2; write (A,t); Read(B,t); t t2; write (B,t); Output(A); Output(B) T1: Read(A,t); t t2; write (A,t); Read(B,t); t t2; write (B,t); Output(A); Output(B) A: 8 16 B: 8 16 A: 8 B: 8 <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> A: 8 16 B: 8 16 DB memory output A: 8 16 B: 8 16 DB memory LOG CS 245 Notes 08 LOG 51 Redo logging (deferred modification) T1: Read(A,t); t t2; write (A,t); Read(B,t); t t2; write (B,t); Output(A); Output(B) A: 8 16 B: 8 16 memory output A: 8 16 B: 8 16 DB <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> <T1, end> LOG CS 245 Notes 08 <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> 53 CS 245 Notes 08 52 Redo logging rules (1) For every action, generate redo log record (containing new value) (2) Before X is modified on disk (DB), all log records for transaction that modified X (including commit) must be on disk (3) Flush log at commit (4) Write END record after DB updates flushed to disk CS 245 Notes 08 54 9 Recovery rules: Redo logging Recovery rules: • For every Ti with <Ti, commit> in log: Redo logging • For every Ti with <Ti, commit> in log: – For all <Ti, X, v> in log: Write(X, v) Output(X) – For all <Ti, X, v> in log: Write(X, v) Output(X) IS THIS CORRECT?? CS 245 Notes 08 Recovery rules: 55 Redo logging Notes 08 57 • Want to delay DB flushes for hot objects CS 245 Actions: write X output X write X output X write X output X write X output X combined <end> (checkpoint) Notes 08 56 • Want to delay DB flushes for hot objects Actions: write X output X write X output X write X output X write X output X Say X is branch balance: T1: ... update X... T2: ... update X... T3: ... update X... T4: ... update X... CS 245 Notes 08 Solution: Checkpoint Combining <Ti, end> Records Say X is branch balance: T1: ... update X... T2: ... update X... T3: ... update X... T4: ... update X... Notes 08 Combining <Ti, end> Records (1) Let S = set of transactions with <Ti, commit> (and no <Ti, end>) in log (2) For each <Ti, X, v> in log, in forward order (earliest latest) do: - if Ti S then Write(X, v) Output(X) (3) For each Ti S, write <Ti, end> CS 245 CS 245 59 58 • no <ti, end> actions> •simple checkpoint Periodically: (1) Do not accept new transactions (2) Wait until all transactions finish (3) Flush all log records to disk (log) (4) Flush all buffers to disk (DB) (do not discard buffers) (5) Write “checkpoint” record on disk (log) (6) Resume transaction processing CS 245 Notes 08 60 10 ... ... CS 245 ... ... <T3,C,21> ... <T2,commit> ... <T2,B,17> • Undo logging: cannot bring backup DB copies up to date • Redo logging: need to keep all modified blocks in memory until commit Checkpoint Redo log (disk): <T1,commit> Key drawbacks: <T1,A,16> Example: what to do at recovery? Crash Notes 08 61 Update <Ti, Xid, New X val, Old X val> page X Notes 08 63 Example: Undo/Redo logging what to do at recovery? Notes 08 • Page X can be flushed before or after Ti commit • Log record flushed before corresponding updated page (WAL) • Flush at commit (log only) CS 245 Notes 08 64 L O G ... Start-ckpt active TR: Ti,T2,... ... end ckpt ... ... ... <T2, D, 40, 41> ... <T2, C, 30, 38> ... <T1, commit> ... <T1, B, 20, 23> <T1, A, 10, 15> <checkpoint> CS 245 ... 62 Non-quiesce checkpoint log (disk): ... Notes 08 Rules Solution: undo/redo logging! CS 245 CS 245 Crash for undo 65 CS 245 dirty buffer pool pages flushed Notes 08 66 11 Non-quiesce checkpoint Examples what to do at recovery time? memory no T1 commit checkpoint process: for i := 1 to M do output(buffer i) L O G [transactions run concurrently] CS 245 Notes 08 67 Examples what to do at recovery time? no T1 commit L O G ... T1,a Undo T1 CS 245 ... Ckpt T1 ... Ckpt end ... T1b Notes 08 ... Ckpt T1 CS 245 ... Ckpt end ... T1b Notes 08 68 Example L O G 69 ... ckpt-s T1 T1 ckptT1 T1 ... T1 ... ... ... ... ... a b c cmt end CS 245 Notes 08 70 Recover From Valid Checkpoint: ckpt-s T1 T1 ckptT1 T1 ... T1 ... ... ... ... ... a b c cmt end L O ... G Notes 08 ckpt start ... ckpt end ... T1 T1 ckpt... start ... ... b c start of latest valid checkpoint Redo T1: (redo b,c) CS 245 ... (undo a,b) Example L O G T1,a ... 71 CS 245 Notes 08 72 12 Recovery process: • Backwards pass (end of log latest valid checkpoint start) – construct set S of committed transactions – undo actions of transactions not in S Real world actions E.g., dispense cash at ATM Ti = a1 a2 …... aj …... an • Undo pending transactions – follow undo chains for transactions in (checkpoint active list) - S • Forward pass $ (latest checkpoint start end of log) – redo actions of S transactions start checkpoint CS 245 backward pass forward pass Notes 08 73 CS 245 Notes 08 Solution 74 ATM (1) execute real-world actions after commit (2) try to make idempotent Give$$ (amt, Tid, time) lastTid: time: give(amt) $ CS 245 Notes 08 75 Media failure (loss of non-volatile storage) CS 245 Notes 08 76 Media failure (loss of non-volatile storage) A: 16 A: 16 Solution: Make copies of data! CS 245 Notes 08 77 CS 245 Notes 08 78 13 Example #2 Example 1 Triple modular redundancy • Keep 3 copies on separate disks • Output(X) --> three outputs • Input(X) --> three inputs + vote X1 CS 245 • Keep N copies on separate disks • Output(X) --> N outputs • Input(X) --> Input one copy - if ok, done - else try another one Assumes bad data can be detected X3 X2 Notes 08 Redundant writes, Single reads 79 CS 245 Notes 08 80 Backup Database Example #3: DB Dump + Log • Just like checkpoint, except that we write full database database backup database log create backup database: for i := 1 to DB_Size do [read DB block i; write to backup] active database [transactions run concurrently] • If active database is lost, – restore active database from backup – bring up-to-date using redo entries in log CS 245 Notes 08 81 Backup Database CS 245 database last needed undo log create backup database: for i := 1 to DB_Size do [read DB block i; write to backup] last needed undo db dump checkpoint time not needed for media recovery [transactions run concurrently] not needed for media recovery redo not needed for undo after system failure • Restore from backup DB and log: Similar to recovery from checkpoint and log Notes 08 82 When can log be discarded? • Just like checkpoint, except that we write full database CS 245 Notes 08 not needed for redo after system failure 83 CS 245 Notes 08 84 14 Summary • Consistency of data • One source of problems: failures - Logging - Redundancy • Another source of problems: Data Sharing..... next CS 245 Notes 08 85 15