Final Exam Review Last Lecture R&G - All Chapters Covered The end crowns all, And that old common arbitrator, Time, Will one day end it. William Shakespeare. Troilus and Cressida. Topics Covered • • • • Relation algebra/calculus Transactions Concurrency Control Crash Recovery Relational Algebra: 5 Basic Operations • Selection ( s ) (horizontal). Selects a subset of rows from relation • Projection ( p ) Retains only wanted columns from relation (vertical). • • • • Cross-product ( ) Allows us to combine two relations. Set-difference ( — ) Tuples in r1, but not in r2. Union ( ) Tuples in r1 and/or in r2. Renaming (r) E.g. r (C(1 -> sid1, 5 -> sid2), S1 x R1) Compound Operations •Intersect ( ) •R S = R (R S) •Join R1 < S1.sid R1.sid •Equijoin: Special case where c contains only equalities •Natural join •Condition Join S1 - Compute R S - Select rows where attributes that appear in both relations have equal values - Project all unique attributes and one copy of each of the common ones. Division example Find the names of sailors who’ve reserved ALL red boats p sname ((p sid, bid Reserves) / (p bid (scolor=red Boats)) Sailors) {S1 | S (SSailors S.sname = S1.sname B Boats ( B.color = ‘red’ R(RReserves S.sid = R.sid B.bid = R.bid)))} File Organization: Indexes • Indexes, clustered vs. unclustered, dense vs. sparse • Understand search keys (vs. key constraints!) • 3 alternatives for data entries Data record with key value k <k, rid of data record with search key value k> <k, list of rids of data records with search key k> – Choice of alternative orthogonal to indexing technique! • Clustered/unclustered • Dense/sparse ACID properties of Transaction Executions • A tomicity: All actions in the Xact happen, or none happen. • C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • I solation: Execution of one Xact is isolated from that of other Xacts. • D urability: If a Xact commits, its effects persist. Concurrency Control • Transaction: basic unit of operation – made up of reads and writes • Goal: ACID Transactions • A & D are provided by Crash Recovery • C & I are provided by Concurrency Control • Bottom line: reads and writes for various transactions MUST be ordered such that the final state of the database is the same as some serial ordering of the transactions Scheduling Transactions • Serial schedule: Schedule that does not interleave the actions of different transactions. • Equivalent schedules: For any database state, the effect (on the set of objects in the database) of executing the first schedule is identical to the effect of executing the second schedule. • Serializable schedule: A schedule that is equivalent to some serial execution of the transactions. (Note: If each transaction preserves consistency, every serializable schedule preserves consistency. ) Conflict Serializable Schedules • Two schedules are conflict equivalent if: – Involve the same actions of the same transactions – Every pair of conflicting actions is ordered the same way • Schedule S is conflict serializable if S is conflict equivalent to some serial schedule Example • A schedule that is not conflict serializable: T1: T2: R(A), W(A), R(A), W(A), R(B), W(B) R(B), W(B) A T1 T2 Dependency graph B • The cycle in the graph reveals the problem. The output of T1 depends on T2, and viceversa. Dependency Graph • Dependency graph: One node per Xact; edge from Ti to Tj if an operation of Ti conflicts with an operation of Tj and Ti’s operation appears earlier in the schedule than the conflicting operation of Tj. • Theorem: Schedule is conflict serializable if and only if its dependency graph is acyclic An Aside: View Serializability • Schedules S1 and S2 are view equivalent if: – If Ti reads initial value of A in S1, then Ti also reads initial value of A in S2 – If Ti reads value of A written by Tj in S1, then Ti also reads value of A written by Tj in S2 – If Ti writes final value of A in S1, then Ti also writes final value of A in S2 T1: R(A) W(A) T2: W(A) T3: W(A) • T1: R(A),W(A) T2: W(A) T3: W(A) View serializability is “weaker” than conflict serializability! – Every conflict serializable schedule is view serializable, but not vice versa! – I.e. admits more legal schedules Approaches to Concurrency Control • 2PL - all objects have Shared and eXclusive locks – once one lock is released, no more locks may be acquired – Strict 2PL: don’t release locks until commit time – Conservative 2PL: acquire all locks at start, release all at end • Locking issues – must either prevent or detect deadlock – may want multiple granularity locks (table, page, record) using IS, IX, SIX, S, X locks (check compatibility matrix!) – locking in B-trees usually not 2PL – phantom problem: locking all records of a given criteria (e.g., age > 20) Other Approaches to CC • • • • Multiple granularity locks Optimistic CC Timestamp CC Multiversion CC NOT COVERED in CLASS – NOT ON FINAL Question from old final - True/False • “Our lock manager uses S and X locks, which guarantees that all schedules are conflictserializable” • “Because we use 2PL, deadlocks can never occur” • “Because we use Strict 2PL, there will be no cascading aborts” Question from old final - True/False • “Our lock manager uses S and X locks, which guarantees that all schedules are conflictserializable” • “Because we use 2PL, deadlocks can never occur” • “Because we use Strict 2PL, there will be no cascading aborts” More questions .. • Read Uncommitted (UR): – Always read data without setting any locks – Always set exclusive locks before writing and hold them until end-of-transaction. • Or: No S-Locks, Strict 2-Phase X-Locks. Which of the following conflicts can occur between a UR Xact, and a Strict 2PL Xact: – Write-Write – Read-Write – Write-Read More questions .. • Read Committed (RC): – Obtain locks in the same way as Strict 2PL, and release exclusive locks at the end-of-transaction just like Strict 2PL. • Or: RC transactions use Short-Term S-Locks, and Strict 2-Phase X-Locks. Which of the following conflicts can occur between a RC Xact, and a Strict 2PL Xact: – Write-Write – Read-Write – Write-Read More questions .. • All SQL lock modes – even the relaxed modes – use Strict 2-Phase Write Locks. Consider what would happen if a transaction released a shared lock immediately after a read, and released an exclusive lock immediately after a write (Short-Term R-Locks and Short-Term X-Locks). Which of the following could happen in those cases? – Write-Write conflicts. – Inconsistent data in the database – Deadlocks Crash Recovery • • • • ACID - need way to ensure A & D We studied approach of Aries system Buffer management Steal, no Force Every Write to a page is first logged in WAS – log record is in stable storage before data page on disk – log record has Xact#, before value, after value • Checkpoints record which pages dirty, which XActs running Buffer Mgmt Plays a Key Role • Force policy – make sure that every update is on disk before commit. – Provides durability without REDO logging. – But, can cause poor performance. • No Steal policy – don’t allow buffer-pool frames with uncommited updates to overwrite committed data on disk. – Useful for ensuring atomicity without UNDO logging. – But can cause poor performance. Of course, there are some nasty details for getting Force/NoSteal to work… Preferred Policy: Steal/No-Force • This combination is most complicated but allows for highest performance. • NO FORCE (complicates enforcing Durability) – What if system crashes before a modified page written by a committed transaction makes it to disk? – Write as little as possible, in a convenient place, at commit time, to support REDOing modifications. • STEAL (complicates enforcing Atomicity) – What if the Xact that performed udpates aborts? – What if system crashes before Xact is finished? – Must remember the old value of P (to support UNDOing the write to page P). Buffer Management summary No Steal No Force Force Steal Fastest Slowest Performance Implications No Steal No Force No UNDO REDO Steal UNDO REDO No UNDO UNDO Force No REDO No REDO Logging/Recovery Implications Basic Idea: Logging • Record REDO and UNDO information, for every update, in a log. – Sequential writes to log (put it on a separate disk). – Minimal info (diff) written to log, so multiple updates fit in a single log page. • Log: An ordered list of REDO/UNDO actions – Log record contains: <XID, pageID, offset, length, old data, new data> – and additional control info (which we’ll see soon). Write-Ahead Logging (WAL) • The Write-Ahead Logging Protocol: Must force the log record for an update before the corresponding data page gets to disk. Must force all log records for a Xact before commit. (alt. transaction is not committed until all of its log records including its “commit” record are on the stable log.) • #1 (with UNDO info) helps guarantee Atomicity. • #2 (with REDO info) helps guarantee Durability. • This allows us to implement Steal/No-Force • Exactly how is logging (and recovery!) done? – We’ll look at the ARIES algorithms from IBM. Transaction Commit • • • • write Commit record to log flush log tail to stable storage remove Xact from Xact table write End record to log Transaction Abort • write Abort record to log • go back through log, undoing each write (and add CLR to log) • when done, write End record to log Crash Recovery - 3 phases • Analysis: starting from checkpoint, go forward in the log to see: – what pages were dirty – what transactions were active at time of crash • Redo: start from oldest transaction that wrote to a dirty page, and redo all writes to dirty pages. • Undo: start at the end of the log (time of crash), work backward undoing all writes made by transactions that were active at time of crash – What happens when you encounter a CLR ?