Quick Review of Apr 24 material

• Sorting (Section 13.4)
• Sort-merge algorithm for external sorting
• Join operation implementations (Section 13.5)
  – Size estimation
  – Nested-loop method (and cost)
  – Sort-merge join (and cost)
  – Hash join (and cost)
  – Indexed join (and cost)
  – 3-way join

Remaining Material
• Read over the remaining sections of chapter 13 on your own (you are responsible for the material)
• We’re going to look at section 14.3 today (following slides)
• We’ve covered a significant part of the material in chapter 14 already, mixed in with the chapter 13 material in the class notes. Read through chapter 14 yourself; you are responsible for the material.
• We have five classes remaining (counting today)
  – the bulk of it will be spent on chapter 15 (15.1-5, 15.9)
  – additional material will be sections 16.1 (lock-based protocols) and 17.4 (log-based recovery)
  – if possible, we will attempt to finish up in time to use May 13 as a review
  – May 15? Optional Study Class?

Transformation of Relational Expressions
• (Section 14.3)
• Two relational algebra expressions are equivalent if they generate the same set of tuples on every legal database instance.
  – Important for optimization
  – Allows one relational expression to be replaced by another without affecting the results
  – We can choose among equivalent expressions according to which has the lower cost (size or speed, depending upon our needs)
  – An equivalence rule says that expression A is equivalent to B; the optimizer may replace A with B and vice versa.
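The definition of equivalence above can be illustrated with a small sketch. The relations, attribute names, and helper functions below are hypothetical, chosen only for illustration. The code evaluates two expressions (select after a join, and the same select pushed below the join) on one sample instance and compares the resulting sets of tuples. Agreement on a single instance does not prove equivalence, which must hold on every legal instance; the point is only to show what "same set of tuples" means and why the cheaper form can be preferable.

```python
# Hypothetical sample instance: each relation is a list of dicts.
account = [
    {"acct": "A-101", "branch": "Downtown", "balance": 500},
    {"acct": "A-215", "branch": "Mianus", "balance": 700},
    {"acct": "A-305", "branch": "Downtown", "balance": 350},
]
depositor = [
    {"name": "Hayes", "acct": "A-101"},
    {"name": "Smith", "acct": "A-215"},
    {"name": "Jones", "acct": "A-305"},
]


def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def natural_join(r, s):
    """Natural join: pair tuples that agree on all shared attributes."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


def as_set(rel):
    """View a relation as a set of tuples, ignoring order."""
    return {frozenset(t.items()) for t in rel}


def downtown(t):
    return t["branch"] == "Downtown"


# Expression A: join first, then select.
joined = natural_join(account, depositor)
expr_a = select(downtown, joined)

# Expression B: select first, then join.
filtered = select(downtown, account)
expr_b = natural_join(filtered, depositor)

print(as_set(expr_a) == as_set(expr_b))  # same set of tuples on this instance
print(len(joined), len(filtered))        # sizes of the intermediate results
```

On this instance both expressions return the same two tuples, but expression B feeds a smaller intermediate result into the join, which is the kind of difference an optimizer weighs when choosing between equivalent forms.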
Equivalence Rules (1)
In the discussion that follows:
  – $E_x$ represents a relational algebra expression
  – $\theta_y$ represents a predicate
  – $L_z$ represents a particular list of attributes
  – $\sigma$ and $\Pi$ are selection and projection as usual
• cascade of selections: a conjunction of selections can be deconstructed into a series of them
  $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$
• selections are commutative
  $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$

Equivalence Rules (2)
• cascade of projections: only the final (outermost) operation in a series of projections is necessary
  $\Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_3}(E))\ldots)) = \Pi_{L_1}(E)$
• selections can be combined with Cartesian products and theta joins
  $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$
  $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$
• joins and theta-joins are commutative (if you ignore the order of attributes, which can be corrected by an appropriate projection)
  $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$

Equivalence Rules (3)
• joins and cross product are associative
  $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$
  (where $\theta_2$ involves attributes from only $E_2$ and $E_3$)
  $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$
  $(E_1 \times E_2) \times E_3 = E_1 \times (E_2 \times E_3)$
  (the above two equivalences are useful special cases of the first one)

Equivalence Rules (4)
• selection distributes over join in two cases:
  – if all the attributes in $\theta_1$ appear only in one expression (say $E_1$)
    $\sigma_{\theta_1}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} E_2$
  – if all the attributes in $\theta_1$ appear only in $E_1$ and all the attributes in $\theta_2$ appear only in $E_2$
    $\sigma_{\theta_1 \wedge \theta_2}(E_1 \bowtie_{\theta} E_2) = (\sigma_{\theta_1}(E_1)) \bowtie_{\theta} (\sigma_{\theta_2}(E_2))$
  (note that the first case is a special case of the second)

Equivalence Rules (5)
• projection distributes over join. Given $L_1$ and $L_2$ being attributes of $E_1$ and $E_2$ respectively:
  – if $\theta$ involves attributes entirely from $L_1$ and $L_2$ then
    $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = (\Pi_{L_1}(E_1)) \bowtie_{\theta} (\Pi_{L_2}(E_2))$
  – if $\theta$ involves attribute lists $L_3$ and $L_4$ (both not in $L_1 \cup L_2$) from $E_1$ and $E_2$ respectively
    $\Pi_{L_1 \cup L_2}(E_1 \bowtie_{\theta} E_2) = \Pi_{L_1 \cup L_2}((\Pi_{L_1 \cup L_3}(E_1)) \bowtie_{\theta} (\Pi_{L_2 \cup L_4}(E_2)))$

Equivalence Rules (6)
• union and intersection are commutative (difference is not)
  $E_1 \cup E_2 = E_2 \cup E_1$
  $E_1 \cap E_2 = E_2 \cap E_1$
• union and intersection are associative (difference is not)
  $(E_1 \cup E_2) \cup E_3 = E_1 \cup (E_2 \cup E_3)$
  $(E_1 \cap E_2) \cap E_3 = E_1 \cap (E_2 \cap E_3)$
• selection distributes over union, intersection, set difference
  $\sigma_{\theta}(E_1 \cup E_2) = \sigma_{\theta}(E_1) \cup \sigma_{\theta}(E_2)$
  $\sigma_{\theta}(E_1 \cap E_2) = \sigma_{\theta}(E_1) \cap \sigma_{\theta}(E_2)$
  $\sigma_{\theta}(E_1 \cap E_2) = \sigma_{\theta}(E_1) \cap E_2$
  $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - \sigma_{\theta}(E_2)$
  $\sigma_{\theta}(E_1 - E_2) = \sigma_{\theta}(E_1) - E_2$

Chapter 15: Transactions
• A transaction is a single logical unit of database work, for example, a transfer of funds from a checking to a savings account. It is a set of operations delimited by statements of the form begin transaction and end transaction
• To ensure database integrity, we require that transactions have the ACID properties
  – Atomic: all or nothing gets done.
  – Consistent: preserves the consistency of the database
  – Isolated: unaware of any other concurrent transactions (as if there were none)
  – Durable: after completion the results are permanent even through system crashes

ACID Transactions
• Transactions access data using two operations:
  – read (X)
  – write (X)
• ACID property violations:
  – Consistency is the responsibility of the programmer
  – System crash half-way through: atomicity issues
  – Another transaction modifies the data half-way through: isolation violation
  – example: transfer $50 from account A to account B
    T: read(A); A:=A-50; write (A); read (B); B:=B+50; write (B)

Transaction States
• Five possible states for a transaction:
  – active (executing)
  – partially committed (after last statement’s execution)
  – failed (can no longer proceed)
  – committed (successful completion)
  – aborted (after transaction has been rolled back)
• Committed or aborted transactions are called terminated
• Aborted transactions may be
  – restarted as a new transaction
  – killed if it is clear that they will fail again
• Rollbacks
  – can be requested by the transaction itself (go back to a given execution state)
  – some actions can’t be rolled back (e.g., a printed message, or an ATM cash withdrawal)

Concurrent Execution
• Transaction processing systems allow multiple transactions to run concurrently
  – improves throughput and resource utilization (I/O on transaction A can be done in parallel with CPU processing on transaction B)
  – reduced waiting time (short transactions need not wait for long ones operating on other parts of the database)
  – however, this can cause problems with the “I” (Isolation) property of ACID

Serializability
• Scheduling transactions so that they are serializable (equivalent to some serial schedule of the same transactions) ensures database consistency (and the “I” property of ACID)
• serial schedule is equivalent to having the transactions execute one at a time
• non-serial or interleaved schedule permits concurrent execution

Serial Schedule
• In the slide’s figure, T1 and T2 are on a serial schedule: T2 begins only after T1 finishes. No problems.

Interleaved Schedule
• The slide’s figure shows an interleaved concurrent schedule that raises a number of database consistency concerns.
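The difference between a serial and an interleaved schedule can be sketched in code. This is a minimal simulation under stated assumptions: T1 is the $50 transfer from the example above, T2 is a hypothetical second transfer of $20 between the same accounts, the starting balances are made up, and the interleaving is one chosen for illustration, not necessarily the one shown in the slide’s figure. Each transaction keeps private local copies of the values it reads, and a schedule is just the order in which the transactions take their next step.

```python
def transfer(amount):
    """Steps of a transfer from A to B, one read/compute/write per step."""
    return [
        lambda db, loc: loc.update(A=db["A"]),             # read(A)
        lambda db, loc: loc.update(A=loc["A"] - amount),   # A := A - amount
        lambda db, loc: db.update(A=loc["A"]),             # write(A)
        lambda db, loc: loc.update(B=db["B"]),             # read(B)
        lambda db, loc: loc.update(B=loc["B"] + amount),   # B := B + amount
        lambda db, loc: db.update(B=loc["B"]),             # write(B)
    ]


def run(schedule, txns, initial):
    """Run the steps in schedule order; each name takes that txn's next step."""
    db = dict(initial)                       # do not mutate the caller's dict
    local = {name: {} for name in txns}      # per-transaction local variables
    pc = {name: 0 for name in txns}          # next step index per transaction
    for name in schedule:
        txns[name][pc[name]](db, local[name])
        pc[name] += 1
    return db


initial = {"A": 1000, "B": 2000}             # assumed starting balances
txns = {"T1": transfer(50), "T2": transfer(20)}

# Serial schedule: all of T1, then all of T2.
serial = run(["T1"] * 6 + ["T2"] * 6, txns, initial)

# Interleaved schedule: T2 reads A before T1 has written its update to A.
interleaved = run(
    ["T1", "T1", "T2", "T2", "T2", "T1",
     "T1", "T1", "T1", "T2", "T2", "T2"],
    txns, initial,
)

print(serial, serial["A"] + serial["B"])
print(interleaved, interleaved["A"] + interleaved["B"])
```

In the serial run the total A + B stays at 3000. In the interleaved run T1 overwrites T2’s write to A, so T2’s debit is lost while its credit to B survives, and the total becomes 3020. That schedule is not equivalent to any serial schedule of T1 and T2, which is the kind of consistency problem serializability is meant to rule out.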
