University of Manitoba, Asper School of Business
3500 DBMS
Bob Travica
Chapter 7: Database Transactions, Integrity & Accuracy
Based on G. Post, DBMS: Designing & Building Business Applications
Updated 2018
DB SYSTEMS

Outline
• Concepts of Database Transaction, Integrity & Accuracy
• Code in DB system
• Triggers
• Transactions
• Concurrent Access
• Locks & Deadlocks
• ACID Transactions
• Keys creation
• DB cursor

Transaction & Integrity Concepts
• A database transaction is a set of processing steps that succeed or fail together. A database transaction behaves as a whole.
• Differentiate it from a business transaction: one instance of a business operation (a purchase, a sale, an inventory input…); refer to TPS.
• The work of a DBS is usually carried out in the form of database transactions.
• Database integrity is about data consistency. Inside a DBS, data with integrity are “correct” in the sense that a particular piece of data is the same across the system. But the data might still be wrong (inaccurate), that is, not corresponding to reality.

Transaction & Integrity Concepts (continued)
• Database integrity is sometimes mixed up with the concept of database accuracy (data corresponding to/reflecting reality); keep this in mind when reading the textbook.
• Better: think of data integrity AND data accuracy as properties of a quality database.
• Example: A $100 million figure is stored and referred to consistently in a DB system (integrity), but the real income is $90 million (accuracy, the lack of it).
• Properly designed and executed transactions ensure data integrity and accuracy.

Code as Part of DB System
• Create code:
  - As triggers = procedures that support biz processes
  - In forms and reports (localized effects)
  - As functions combined with embedded SQL
• User-defined functions: a function combined with SQL statements

Diagram (DB System: Tables and the code around them):
    Trigger:
        If inventory < Minimum Then
            CREATE POrder
        End If
    C++ code:
        if (. . .) {
            // embedded SQL
            SELECT …
        }
    Form:
        If (Click) Then
            MsgBox . . .
        End If

User-Defined Functions*
• A program for increasing a salary by a certain amount (pseudo code).

    CREATE FUNCTION IncreaseSalary (EmpID INTEGER, Amt CURRENCY)
    BEGIN
        INPUT EmpID, Amt
        IF (Amt > 50000) THEN
            RETURN -1        -- error flag, data validation
        ELSE
            UPDATE Employee
            SET Salary = Salary + Amt
            WHERE EmployeeID = EmpID;
            PRINT Salary     -- new values
        END IF
    END

* A function is code (a piece of application software) that returns a result of processing.

Events and Triggers (the topic from pre-midterm)
• An event starts (initiates) a trigger (programming code).
• Common events:
  - SQL-based on rows: INSERT, DELETE, UPDATE
  - SQL-based on tables: ALTER, CREATE, DROP
  - Based on user’s action: LOGOFF, LOGON
  - Based on server action: SERVERERROR, SHUTDOWN, STARTUP
• Example: BEFORE UPDATE of Employee table rows, run a trigger that checks whether a salary ceiling is respected.

Table and Row Triggers
• SQL events support triggers at two points in time: BEFORE and AFTER the data change.

Diagram (time runs left to right, with the update point in the middle):
    Table:  Before Update on a table (all rows)  |  Update point  |  After Update on the table
    Row:    Before Update of a row               |  Update point  |  After Update of the row

An After Row Trigger Example
• The trigger logs employee ID, date, user’s ID, old salary, and new salary into table tblSalaryChanges after each update of the Salary column in the Employee table. Useful for auditing.

    CREATE TRIGGER LogSalaryChanges
    AFTER UPDATE OF Salary ON Employee
    REFERENCING OLD ROW AS oldrow
                NEW ROW AS newrow
    FOR EACH ROW
        INSERT INTO tblSalaryChanges (EmpID, ChangeDate, UserID, OldValue, NewValue)
        VALUES (newrow.EmployeeID, CURRENT_TIMESTAMP, CURRENT_USER, oldrow.Salary, newrow.Salary);

Cascading Triggers
• Increase the scope of automation. Take care of dependencies.
Diagram (each trigger's change fires the next one):
    Sale(SaleID, SaleDate, …)
    SaleItem(SaleID, ItemID, Quantity, …)
        AFTER INSERT
        UPDATE Inventory
        SET Inventory.QOH = Inventory.QOH - newrow.Quantity
        WHERE Inventory.ItemID = newrow.ItemID;
    Inventory(ItemID, QOH, …)
        AFTER UPDATE
        WHEN Inventory.QOH < minQuantity
        INSERT {new Order}
        INSERT {new OrderItem};
    Order(OrderID, OrderDate, …)
    OrderItem(OrderID, ItemID, Quantity, …)

Transactions
• Definition: A transaction is a sequence of processing tasks (a process) that succeed or fail together. Functions, procedures and triggers (preceding slides) can be defined as transactions.
• Reason: to protect database accuracy against system failures.
• Example:
  1. A customer starts transferring money from a savings account to a checking account.
  2. The system crashes after subtracting the amount from Savings, and the amount is not added to Checking.

Transaction steps:
    1. Subtract $1000 from Savings. (Then, system crashes)
    2. Add $1000 to Checking. (Money not added)

Diagram:
    1. Savings Account, Joe Doe:   Bal.: 5340.92   Subtract: 1000.00   New Bal.: 4340.92
       Checking Account, Joe Doe:  Bal.: 1424.27   Add: 1000
    2. Transfer fails. $1000 is subtracted from Savings but the Checking balance is not increased: inaccurate data!
       Checking Account, Joe Doe:  Bal.: 1424.27

Designing Transactions
• Transactions are programmed.
• Mark a transaction start: START TRANSACTION
• Determine a point of temporary saving of data changes.
• Mark a transaction end: COMMIT writes the end result of processing to the database (permanent save).
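The savings-to-checking example can be sketched in runnable form. The following is a minimal illustration in Python with the standard sqlite3 module, not code from the course: the Account table, the transfer and balances helpers, and the crash_midway flag are all hypothetical, balances are kept in cents, and SQLite spells START TRANSACTION as BEGIN.

```python
import sqlite3

def transfer(conn, amount, crash_midway=False):
    """Move `amount` (in cents) from Savings to Checking, all or nothing."""
    try:
        conn.execute("BEGIN")  # SQLite's form of START TRANSACTION
        conn.execute(
            "UPDATE Account SET Balance = Balance - ? WHERE Name = 'Savings'",
            (amount,),
        )
        if crash_midway:
            # Hypothetical stand-in for the system crash between the two steps.
            raise RuntimeError("simulated system crash")
        conn.execute(
            "UPDATE Account SET Balance = Balance + ? WHERE Name = 'Checking'",
            (amount,),
        )
        conn.execute("COMMIT")  # permanent save of both changes
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")  # undo the half-finished transfer
        return False

def balances(conn):
    """Return {account name: balance in cents}."""
    return dict(conn.execute("SELECT Name, Balance FROM Account"))

# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Account (Name TEXT PRIMARY KEY, Balance INTEGER)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)",
    [("Savings", 534092), ("Checking", 142427)],
)

failed = transfer(conn, 100000, crash_midway=True)
after_crash = balances(conn)   # both balances unchanged
ok = transfer(conn, 100000)
after_ok = balances(conn)      # both balances changed together
print(after_crash, after_ok)
```

Because both updates sit inside one transaction, the simulated crash leaves Savings at 5340.92 rather than 4340.92, so the database never shows money that has left one account without arriving in the other.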
Designing Transactions: Error Correction

Diagram (processing sequence):
    Start -> Run simple steps -> SAVEPOINT StartOptional -> Riskier steps -> COMMIT (Save)
    Full ROLLBACK: back to Start
    Partial ROLLBACK: back to SAVEPOINT StartOptional

    START TRANSACTION;
    SELECT * FROM tbl_Customer WHERE CustID=…
    UPDATE tbl_Customer.Address
    SAVEPOINT StartOptional
    UPDATE tbl_Order.newrecord
    UPDATE tbl_Inventory.QOH
    IF error THEN
        ROLLBACK TO SAVEPOINT StartOptional
    END IF
    COMMIT;

Concurrent Access
• Concurrent access: multiple transactions competing for the same data at the same time.
• If the Order process reads Balance before the Payment process ends, the end balance will be incorrect.

Diagram (table Customer Account is read and updated by both transactions):
    Payment transaction            Order transaction
    1) Read Balance       800
    2) Subtract pmt      -200
                                   3) Read old balance = 800
    4) Save new bal.      600
                                   5) Add order         150
                                   6) Write balance     950

• Integrity violation and accuracy violation: $950 is the end balance instead of $750 (600+150). The customer is at a loss.

Pessimistic Locks (Serialization)
• One answer to the problems of concurrent access is to force transactions to run in sequence, one after another.
• A transaction places a SERIALIZABLE lock on data so that no other transaction can access it before the first transaction is completed.

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ WRITE

Diagram:
    Payment transaction locks the table:
        1) Read balance       800
        2) Subtract pmt      -200
        3) Save new bal.      600
    Order transaction, trying to interfere with Payment at step 3, is locked out:
        3) System’s message: “Table is locked, try later”.

• The integrity problem is avoided and the transaction will work with accurate data.

Deadlock!
• Deadlock = a problem with serialized locks when different transactions use multiple tables concurrently.
• Transaction 1 (T1) locks tbl A and requests access to tbl B, while tbl B is locked by Transaction 2 (T2), which in turn requests access to tbl A. Neither transaction can be completed.
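The circular wait between T1 and T2 can be checked mechanically. The sketch below is an assumption for illustration, not the method of any particular DBMS: it follows "who holds what I am waiting for" links until they either end or loop back. The function name find_deadlock and the two dictionaries are hypothetical.

```python
def find_deadlock(holds, wants):
    """Return the transactions that form a circular wait, or None.

    holds: table name -> transaction that currently locks it
    wants: transaction -> table it is waiting for
    """
    for start in wants:
        path = [start]
        current = start
        while True:
            table = wants.get(current)   # table `current` waits for, if any
            holder = holds.get(table)    # transaction locking that table
            if holder is None or holder == current:
                break                    # nothing blocks `current`; no loop here
            if holder in path:
                return path[path.index(holder):]  # the wait chain closed on itself
            path.append(holder)
            current = holder
    return None

# The situation on the slide: T1 locks tbl A and wants tbl B,
# T2 locks tbl B and wants tbl A.
holds = {"tbl A": "T1", "tbl B": "T2"}
wants = {"T1": "tbl B", "T2": "tbl A"}
cycle = find_deadlock(holds, wants)

# A plain wait without a loop: T2 waits for tbl A, T1 waits for nothing.
no_cycle = find_deadlock({"tbl A": "T1"}, {"T2": "tbl A"})
print(cycle, no_cycle)
```

In the second case T2 only has to wait until T1 finishes, which is ordinary locking; only the first case needs one of the transactions to give up its lock.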
Diagram (arrows numbered 1 to 4 connect the transactions to Tbl A and Tbl B):
    Transaction 1 (T1):  1) Lock tbl A   2) Read tbl B
    Transaction 2 (T2):  1) Lock tbl B   2) Update tbl A

• Transactions lock each other out, creating a “deadly embrace” (deadlock).
• Solution 1 to deadlock: Each transaction waits a random time, tries again, releases its locks if unsuccessful, and runs itself all over again.
• Solution 2: automated lock manager (next).

Deadlock Loop
• Sometimes a lock manager must intervene: a closed lock-wait loop blocks many transactions.

Diagram (Tables A to E; Lock and Wait marks connect the transactions to the tables):
    Transaction 1  (needs tbl_B, tbl_D)
    Transaction 2* (needs tbl_D, tbl_A)
    Transaction 3
    Transaction 4  (needs tbl_A, tbl_E)
    Transaction 6  (needs tbl_E, tbl_C)
    Transaction 7  (needs tbl_C, tbl_D)

• The Lock Manager (part of the DBMS) monitors all transactions and disables some temporarily to clear the deadlock.

Optimistic Locks
• Opposite to the serialization lock (pessimistic lock); a solution against the deadlocks and wait times of pessimistic locks.
• Logic:
  - Assume that collisions are rare and that read transactions are frequent.
  - Record the state of the DB at the read time of any transaction (no locks).
  - If a transaction tries to write a new value, the system reads the data again (current data) and compares it with the initial read.
  - If there is a difference, the system sends an error message and calls on the transaction to execute itself again, so that it works with the current data.

ACID Transactions
• The ACID standard ensures data integrity and accuracy (in part) in a relational database. It differentiates DB systems on quality, and relational DB technology from non-relational technology.
• Atomicity: all changes succeed or fail together (as “an atom”), i.e., the DBS supports transaction definition and execution.
• Consistency: all data remain internally consistent (when transactions are executed) and can be validated, i.e., referential integrity is supported.
• Isolation: each transaction is processed separately, i.e., the DBS has lock management.
• Durability: when a transaction is committed, data changes are permanently saved (using log files and other techniques), i.e., the DBS has a recovery capability.

Methods to Generate Keys
1. The DBMS can generate key values automatically whenever a row is inserted into a table (surrogate key). Drawback: concurrent access to the DB system can mix up keys that are generated at the same time (e.g., CustomerID in the Customer table, which needs to be reused in the Order table).
2. A separate key generator is called by a programmer to create a new key for a specified table. This prevents mix-ups but requires that programmers write code to generate a key for every table and each row insertion.

Key Generator
• Appropriate key creation also ensures database integrity and accuracy.
• Create an order for a new customer:
    (1) Create new key for CustomerID
    (2) INSERT row into Customer
    (3) Create key for new OrderID
    (4) INSERT row into Order
• Tables involved:
    Customer Table (CustomerID, Name, …)
    Order Table (OrderID, CustomerID, …)

Database Cursor
• Cursor = a type of variable (memory space) that holds entire records.
• A relic from old DB systems (Declare Cursor… Close Cursor); no relationship with the screen cursor (user interface).
• Purposes:
  - Complex calculations on rows
  - Comparisons between rows

    Year    Sales      Diff.
    1998    104,321
    1999    145,998
    2000    276,004
    2001    362,736

• A cursor would read the sales values for the current and previous year, and calculate the difference.
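The year-over-year calculation can be sketched with a cursor that walks the rows one at a time. This is an illustrative assumption, not the deck's own code: it uses the cursor object of Python's standard sqlite3 module rather than SQL's DECLARE CURSOR syntax, the table name YearlySales is hypothetical, and the four figures from the slide are taken to be the Sales column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YearlySales (Year INTEGER PRIMARY KEY, Sales INTEGER)")
conn.executemany(
    "INSERT INTO YearlySales VALUES (?, ?)",
    [(1998, 104321), (1999, 145998), (2000, 276004), (2001, 362736)],
)

# Open the cursor and read one row at a time, remembering the previous row.
cursor = conn.cursor()
cursor.execute("SELECT Year, Sales FROM YearlySales ORDER BY Year")

diffs = {}          # year -> change in sales versus the previous year
previous = None     # sales of the row read just before the current one
for year, sales in cursor:
    if previous is not None:
        diffs[year] = sales - previous
    previous = sales
cursor.close()

print(diffs)
```

The first year has no earlier row to compare with, so it gets no Diff. value; every later year is compared with the row the cursor visited just before it.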