Information Resources Management April 10, 2001

Agenda
Administrivia
 Database Design
 Denormalization
 Database Administration
 Security
 Backup & Recovery
 Concurrency Controls

Administrivia
Schema Tuning - Staying Normal
Split Tables - Vertical Partitioning
 Highly used vs. infrequently used columns
Don’t partition if result will be more joins
Keys are duplicated
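A minimal sketch of vertical partitioning, using Python's built-in sqlite3 module. The table and column names (EmployeeCore, EmployeeDetail, Resume) and the sample row are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Highly used columns stay in a narrow table.
    CREATE TABLE EmployeeCore (
        EmpID  INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL,
        Salary INTEGER NOT NULL
    );
    -- Infrequently used columns move out; the key EmpID is duplicated.
    CREATE TABLE EmployeeDetail (
        EmpID  INTEGER PRIMARY KEY REFERENCES EmployeeCore(EmpID),
        Resume TEXT
    );
    INSERT INTO EmployeeCore VALUES (75, 'Pat', 25000);
    INSERT INTO EmployeeDetail VALUES (75, 'Long free-text resume');
""")

# The common query touches only the narrow table: no join needed.
hot = conn.execute(
    "SELECT Name, Salary FROM EmployeeCore WHERE EmpID = 75").fetchone()

# The rare query needs both halves, so it pays for a join.
full = conn.execute("""
    SELECT c.Name, d.Resume
    FROM EmployeeCore AS c JOIN EmployeeDetail AS d ON c.EmpID = d.EmpID
    WHERE c.EmpID = 75
""").fetchone()
```

If most queries looked like the second one, the split would add joins rather than remove work, which is the case the slide warns against.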
Schema Tuning - Staying Normal
Variable length fields (VARCHAR, others)
 Indeterminate record lengths
 Row locations vary
Vertically partition row into two tables, one with fixed and one with variable columns
Schema Tuning - Leaving Normal
Normalization
 Eliminates duplication
 Reduces anomalies
 Does not result in efficiency
Denormalize for performance
Denormalization Warnings
Increases chance of errors or inconsistencies
May result in reprogramming if business rules change
Optimizes based on current transaction mix
Increases duplication and space required
Increases programming complexity
Always normalize first, then denormalize

Denormalization
Partition Rows
 Combine Tables
 Combine and Partition
 Replicate Data

Combining Opportunities
One-to-one (optional)
 allow nulls
 Many-to-many (assoc. entity)
 2 tables instead of 3
 Reference data (one-to-many)
 “one” not used elsewhere
 few of “many”

Combining Examples
Employee-Spouse (name and SSN only)
 Owner-PctOwned - Property
 few owners with multiple properties
 Property-Type (description)
 one type per property

Partitioning
Horizontal
 By row type
 Separate processing by type
 Supertype/subtype decision
 Vertical (already seen)
 Both

Replication

Intentionally repeating data

Example: Owner-PctOwned-Property
 Owner includes PctOwned & PropertyID
 Property includes majority OwnerSSN and PctOwned
Performance Tuning
Not a one-time event
 Monitoring probably more important
 Things change
 applications, database (table) sizes, data characteristics
 hardware, operating system, DBMS

Database Administration
Security
 Backup & Recovery
 Concurrency Controls

Security - Authorization
Row Operations
 Read
 Insert
 Update
 Delete
Table Operations
 Index
 Creation/Removal
 Resource
 New Tables
 Alteration
 Drop
Authorization Granularity
Table-level only
 View is the same as a table

Views
Select statement that is given a table name
 Views can select from other views

CREATE VIEW OfficeEmps AS
SELECT O.OfficeNbr, E1.EmpID, E1.Name, M.EmpID AS MgrID,
       E2.Name AS MgrName
FROM Office AS O, Manager AS M, Employee AS E1, Employee AS E2
WHERE O.OfficeNbr = M.OfficeNbr AND M.EmpID = E2.EmpID
  AND O.OfficeNbr = E1.OfficeNbr AND E1.EmpID <> E2.EmpID
Enhancing Granularity Through Views
Specific Columns - SELECT xxxx
 Specific Rows - WHERE xxxx=yyyy
 Both

SQL
GRANT privilege ON table TO user
 (WITH GRANT OPTION)
 REVOKE privilege ON table FROM user
 (RESTRICT or CASCADE) - applies to GRANTs made by that user on that table

Types of Failures
Transaction
 Logical
 System
 System
 Operating System
 Hardware
 Network
 Disk

Recovery Approaches
Switch - mirror DB needed (RAID-1)
 Restore/Rerun
 Previous backup
 Rerun all transactions (needed)
 Log-Based
 Rollback - undo incomplete
 Rollforward - previous backup

Requirements

Permanently write changes without changing the database

Transaction States
 Partially Committed - transaction is done
 Fully Committed - changes have been made
Log-Based Recovery
Log - record of all database activity
 Log Records
 Transaction start
 Transaction write (update)
 new and old values
 Transaction abort
 Transaction commit

Log-Based Recovery
Deferred
 Immediate

Deferred Log
(Diagram: Trans, Log, DB)
Database modification occurs after transaction commits
Deferred Log
Only new values kept in update log record
 Only committed changes need to be reapplied at recovery
 Uncommitted changes can be removed from the log
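The recovery rule for a deferred log can be sketched in a few lines of Python. The tuple layout of the log records and the function name recover_deferred are assumptions made for this illustration, not a real DBMS format.

```python
def recover_deferred(log, db):
    """Rebuild the database from a deferred-modification log.

    Assumed record shapes: ("START", txn), ("WRITE", txn, key, new_value),
    ("COMMIT", txn). Update records carry only the new value.
    """
    db = dict(db)
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            db[rec[2]] = rec[3]      # REDO a committed change
        # Writes of uncommitted transactions never reached the database,
        # so they are simply skipped (their records can be discarded).
    return db

log = [
    ("START", "T1"),
    ("WRITE", "T1", "GRADE", 11),
    ("COMMIT", "T1"),
    ("START", "T2"),
    ("WRITE", "T2", "SALARY", 26500),
    # failure here: T2 never committed
]
state = recover_deferred(log, {"GRADE": 10, "SALARY": 25000})
```

No old values are needed because nothing an uncommitted transaction wrote can be in the database.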

Deferred Log Example
Transaction        Log                            Database
READ(EMP=75)       <T1 START>
GRADE=11
WRITE(EMP=75)      <T1 EMP=75, GRADE=11>
COMMIT             <T1 COMMIT>
**committed**                                     EMP=75, GRADE=11
READ(EMP=75)       <T2 START>
READ(GRADE=11)
SALARY = 26500
WRITE(EMP=75)      <T2 EMP=75, SALARY=26500>
COMMIT             <T2 COMMIT>
**committed**                                     EMP=75, SALARY=26500
Deferred Log Example - Recovery
Same transactions and log as the previous example; the recovery action depends on where the failure occurs
 Failure before <T1 COMMIT>: Recovery only deletes from log
 Failure after <T1 COMMIT>: REDO(T1) - commit vs. actual database update
 Failure after <T2 START>, before <T2 COMMIT>: REDO(T1); Delete T2 from Log
 Failure after <T2 COMMIT>: REDO(T1); REDO(T2)
Failure During Recovery
Recovery from recovery must be possible
 Redo must be executable multiple times without any differences from a single execution
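A small sketch of why redo should install logged values rather than repeat operations. The helper names redo_set and redo_add are hypothetical.

```python
def redo_set(db, key, new_value):
    """Idempotent redo: installs the new value recorded in the log."""
    db[key] = new_value

def redo_add(db, key, delta):
    """Non-idempotent redo: repeats the operation instead of the value."""
    db[key] += delta

db_a = {"SALARY": 25000}
redo_set(db_a, "SALARY", 26500)
redo_set(db_a, "SALARY", 26500)   # recovery failed and was run again

db_b = {"SALARY": 25000}
redo_add(db_b, "SALARY", 1500)
redo_add(db_b, "SALARY", 1500)    # the second run changes the result
```

After two runs, db_a holds the same value as after one run, while db_b does not.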

Immediate Modification
(Diagram: Trans, Log, DB)
Database modified as transaction proceeds
Immediate Modification
Update log records require old and new values
 Recovery requires either a REDO or an UNDO based on whether or not each transaction was committed
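A minimal sketch of recovery under immediate modification, assuming log records shaped as ("WRITE", txn, key, old_value, new_value). The record layout and the function name recover_immediate are assumptions for illustration only.

```python
def recover_immediate(log, db):
    """UNDO uncommitted transactions, then REDO committed ones."""
    db = dict(db)
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    # UNDO pass: newest to oldest, restore old values of uncommitted work.
    for rec in reversed(log):
        if rec[0] == "WRITE" and rec[1] not in committed:
            db[rec[2]] = rec[3]
    # REDO pass: oldest to newest, install new values of committed work.
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            db[rec[2]] = rec[4]
    return db

log = [
    ("START", "T1"),
    ("WRITE", "T1", "GRADE", 10, 11),
    ("COMMIT", "T1"),
    ("START", "T2"),
    ("WRITE", "T2", "SALARY", 25000, 26500),
    # failure before <T2 COMMIT>
]
# State on disk at the failure: both writes had already reached the database.
crashed_db = {"GRADE": 11, "SALARY": 26500}
recovered = recover_immediate(log, crashed_db)
```

Here T2 is undone and T1 is redone, so the grade change survives and the salary returns to its old value.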

Immediate Example
Transaction        Log                                  Database
READ(EMP=75)       <T1 START>
GRADE=11
WRITE(EMP=75)      <T1 EMP=75, GRADE=10,11>             EMP=75, GRADE=11
COMMIT             <T1 COMMIT>
**committed**
READ(EMP=75)       <T2 START>
READ(GRADE=11)
SALARY = 26500
WRITE(EMP=75)      <T2 EMP=75, SALARY=25000,26500>      EMP=75, SALARY=26500
COMMIT             <T2 COMMIT>
**committed**
Immediate Example - Recovery
Same transactions and log as the previous example; the recovery action depends on where the failure occurs
 Failure before <T1 COMMIT>: UNDO(T1)
 Failure after <T1 COMMIT>: REDO(T1)
 Failure after <T2 START>, before <T2 COMMIT>: UNDO(T2); REDO(T1) -- order can be important
 Failure after <T2 COMMIT>: REDO(T1); REDO(T2)
Logging Requirements
Log must always be in “stable storage”
All log writes must be successful
Log kept separate from database
Backup copy of database that coincides with start of a new log
Recovery needed depends on type of failure
Database restart must recover completely before allowing new transactions
Checkpoints
Recovery has to search entire log
 Many REDOs are unnecessary
 Recovery can be a lengthy process
Checkpoints are used to limit the recovery action that is needed
Checkpoints
1. Flush all log records to permanent storage
2. Flush all data buffers to permanent storage
3. Write a <checkpoint> to the permanent storage copy of the log
No updates are allowed while checkpointing
Checkpoint Recovery
1. Search from end of log to most recent <checkpoint>
2. Continue searching backward until the first transaction <START> before the <checkpoint>
3. From that <START> onward, UNDO and REDO all transactions
(Serial execution only)
Advantages of Logging
Less Overhead at Commit
 No Data Fragmentation
 No Need for Garbage Collection
 Faster recovery
 Support for Concurrency

Transactions
Concept
 State
 Serializability
 Maintaining Serializability

Transaction
Single Unit of Work - User’s Perspective
 Multiple Operations


Required Properties (ACID)
Atomicity - all or none
 Consistency - database consistency maintained
 Isolation - appearance of being alone
 Durability - changes persist

Transaction State
(State diagram: Active, Partially Committed, Committed, Failed, Aborted)
Implementing Transactions in SQL

COMMIT WORK

ROLLBACK WORK
Atomicity & Durability

Easiest
 Completely new copy of database
 Update new copy
 Don’t update pointer until commit
 Recoverable from failure at any point provided the acknowledgement of the commit and the update of the pointer occur simultaneously.
Concurrency
Multiple Transactions
 Serial (one at a time) is best but
 combination of slow & fast in single transaction
 short and long transactions


Concurrency must be handled carefully
Example
Employee (EmpID, Grade, Salary)
 75, 10, 25000
Grade (Grade, Midpoint)
 10, 20000
 11, 30000
Example

T1 - Change employee #75 to grade 11
READ (Employee)
Grade = 11
WRITE (Employee)

T2 - Update salaries by 5% of midpoint
READ (Employee)
READ (Grade)
Salary = Salary + (0.05 * Midpoint)
WRITE (Employee)
Example - Serial Execution

T1 then T2
 Result: Salary = 26500 (25000 + .05*30000)

T2 then T1
 Result: Salary = 26000 (25000 + .05*20000)
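The two serial orders can be checked with a short Python sketch. The dictionary layout and the function names t1 and t2 are assumptions; the data values come from the example. Integer arithmetic is used for the 5% so the results are exact.

```python
MIDPOINT = {10: 20000, 11: 30000}   # Grade table: grade -> midpoint

def t1(emp):
    """T1: change the employee to grade 11."""
    emp["grade"] = 11

def t2(emp):
    """T2: raise salary by 5% of the midpoint of the employee's grade."""
    emp["salary"] += MIDPOINT[emp["grade"]] * 5 // 100

a = {"id": 75, "grade": 10, "salary": 25000}
t1(a)
t2(a)            # T1 then T2: raise is based on grade 11

b = {"id": 75, "grade": 10, "salary": 25000}
t2(b)
t1(b)            # T2 then T1: raise is based on grade 10
```

Both orders are valid serial executions even though they give different salaries; serializability only requires matching one of them.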
Concurrent Execution
T1                  T2
READ (Employee)
                    READ (Employee)
Grade = 11
WRITE (Employee)
                    READ (Grade)
                    Salary =
                    WRITE (Employee)
Result?
Concurrent Execution
T1                  T2
                    READ (Employee)
                    READ (Grade)
READ (Employee)
Grade = 11
WRITE (Employee)
                    Salary =
                    WRITE (Employee)
Result?
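One possible answer to "Result?" for the first interleaving, sketched in Python. It assumes that the order is T1 READ, T2 READ, T1 update and WRITE, then T2 READ (Grade), compute, WRITE, and that READ copies the whole employee record into a private buffer while WRITE stores the whole buffer back. Both are assumptions about the slide, not stated in it.

```python
MIDPOINT = {10: 20000, 11: 30000}
db = {"grade": 10, "salary": 25000}        # employee #75

t1_buf = dict(db)                          # T1: READ (Employee)
t2_buf = dict(db)                          # T2: READ (Employee)
t1_buf["grade"] = 11                       # T1: Grade = 11
db = dict(t1_buf)                          # T1: WRITE (Employee)
midpoint = MIDPOINT[t2_buf["grade"]]       # T2: READ (Grade) for its stale grade 10
t2_buf["salary"] += midpoint * 5 // 100    # T2: Salary = Salary + 5% of midpoint
db = dict(t2_buf)                          # T2: WRITE (Employee) overwrites T1

serial_results = [
    {"grade": 11, "salary": 26500},        # T1 then T2
    {"grade": 11, "salary": 26000},        # T2 then T1
]
lost_update = db not in serial_results
```

Under these assumptions T1's grade change is lost, and the final state matches neither serial order.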
Recoverable Schedules
If T2 reads an item updated by T1, T1 must commit before T2 commits
Cascadeless Schedule
 If T2 reads an item updated by T1, T1 must commit before T2 reads
Not Recoverable
T1                  T2
READ (Employee)
WRITE (Employee)
                    READ (Employee)
                    READ (Grade)
                    WRITE (Employee)
                    COMMIT
ROLLBACK
Result?
Recoverable
T1                  T2
READ (Employee)
WRITE (Employee)
                    READ (Employee)
                    READ (Grade)
                    WRITE (Employee)
COMMIT
                    COMMIT
Result?
Recoverable
T1                  T2
READ (Employee)
WRITE (Employee)
                    READ (Employee)
                    READ (Grade)
                    WRITE (Employee)
ROLLBACK
                    ?????
Result?
Cascadeless
T1                  T2
READ (Employee)
WRITE (Employee)
COMMIT
                    READ (Employee)
                    READ (Grade)
                    WRITE (Employee)
                    COMMIT
Result?
Ensuring Serializability

Concurrency Control Schemes

Can’t analyze transactions
 some in progress
 analysis longer than transaction
 already running continue to run
Concurrency Control - Locks
Shared - Read only
 Exclusive - Read/Write


LOCK-S
LOCK-X
Compatibility of Locks
 multiple transactions can have the same lock
 shared locks only
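Lock compatibility can be written as a small lookup table. This is a sketch; the names COMPATIBLE and can_grant are hypothetical, and "S" and "X" stand for LOCK-S and LOCK-X.

```python
# (mode already held by another transaction, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested, held_modes):
    """Grant only if the request is compatible with every lock held by others."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

two_readers = can_grant("S", ["S"])        # shared with shared
writer_vs_reader = can_grant("X", ["S"])   # exclusive must wait
free_item = can_grant("X", [])             # nothing held, so granted
```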
Deadlocks
T1: READ(A), READ(B), WRITE(A)
T2: READ(B), READ(A), WRITE(B)
T1                  T2
LOCK-X(A)
READ(A)
                    LOCK-X(B)
                    READ(B)
LOCK-S(B)
WRITE(A)
                    LOCK-S(A)
                    WRITE(B)
UNLOCK(A)
UNLOCK(B)
                    UNLOCK(A)
                    UNLOCK(B)
Locking Protocol
Set of Rules
 Reduce Possibility of Deadlocks
 Create “appearance” of serial execution to each transaction

Two-Phase Locking Protocol
Growing Phase
 Can obtain but not release locks
 Shrinking Phase
 Can release but not obtain locks


First release of a lock is the transition between phases
Lock point - where the transaction obtains its final lock (end of the growing phase)
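The two-phase rule can be checked mechanically for a single transaction's sequence of lock operations. A minimal sketch; the function name is_two_phase and the tuple format are assumptions.

```python
def is_two_phase(ops):
    """ops: list of (action, item), action in {"LOCK-S", "LOCK-X", "UNLOCK"}."""
    shrinking = False
    for action, _item in ops:
        if action == "UNLOCK":
            shrinking = True      # first release ends the growing phase
        elif shrinking:
            return False          # obtained a lock after releasing one
    return True

ok = [("LOCK-X", "A"), ("LOCK-S", "B"), ("UNLOCK", "A"), ("UNLOCK", "B")]
bad = [("LOCK-X", "A"), ("UNLOCK", "A"), ("LOCK-S", "B"), ("UNLOCK", "B")]
```

The first sequence obeys the protocol; the second releases A and then asks for B, so it does not.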
Two-Phase Locking

Strict
 Prevent cascading rollbacks
 Exclusive locks (LOCK-X) held until commit

Rigorous
 All locks held until commit
Lock Conversion

Changing a Lock
 Upgrade - shared to exclusive
 Downgrade - exclusive to shared
 Can only upgrade in growing phase
 Can only downgrade in shrinking phase
Most Used Locking Scheme
Read -> LOCK-S(A), READ(A)
 Write
 If LOCK-S(A) -> UPGRADE(A), WRITE(A)
 If no lock -> LOCK-X(A), WRITE(A)
 Locks held until COMMIT or ROLLBACK
 Strict - exclusive only
 Rigorous - all locks

Granularity
Lock only what is needed
 Could be
 Row
 Table
 Set of Tables
 Entire Database


Model as a tree with the database at the root and the rows as the leaves
Intention Locking
To lock a row
 Traverse the tree from the root to the row
 Put intention locks on the nodes on the way down
 Intention locks provide knowledge of lower level locks when a higher level lock is desired -- prevents having to traverse the entire tree to lock the database
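A simplified sketch of intention locking on a database/table/row tree. The mode names IS and IX (intention-shared and intention-exclusive) are the conventional ones and are an assumption here, since the slides do not name them. The sketch tracks only which modes are held on each node, not which transaction holds them, and lock_row does not itself check for conflicts.

```python
# For each requested mode: the modes held by others that it can coexist with.
COMPAT = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S":  {"IS", "S"},
    "X":  set(),
}

held = {}   # node path (tuple) -> list of modes held on that node

def lock_row(db, table, row, mode):
    """Lock one row top-down: intention locks on ancestors, then the row."""
    intent = "IS" if mode == "S" else "IX"
    for node in [(db,), (db, table)]:
        held.setdefault(node, []).append(intent)
    held.setdefault((db, table, row), []).append(mode)

def can_lock(node, mode):
    """Look only at the node itself; its descendants are never visited."""
    return all(h in COMPAT[mode] for h in held.get(node, []))

lock_row("db", "Employee", 75, "X")        # one transaction updates row 75
```

After that single row lock, a request to lock the whole database exclusively is refused by inspecting the root alone, which is the saving the slide describes.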

Intention Locking

Locks Acquired
 Top-Down

Locks Released
 Bottom-Up
Deadlocks
Prevention
 Recovery

Deadlock Prevention
1. Acquire all locks simultaneously
2. Rollback instead of waiting for a lock
2a. Lock wait timeouts
Deadlock Recovery
If not prevented, deadlocks must be detected and recovered
 Detection - periodically search for problems
 Recovery
 Select a victim - which one?
 Rollback - how far?
 Avoid Starvation
 Always killing the same victim which never gets to execute
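One common way to "search for problems" is to look for a cycle in a wait-for graph, where an edge from T1 to T2 means T1 is waiting for a lock that T2 holds. The slides do not name a detection method, so this technique and the function name find_deadlock are assumptions for illustration.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    visiting, done = set(), set()

    def visit(txn, path):
        if txn in visiting:
            return path[path.index(txn):]     # back edge: cycle found
        if txn in done:
            return None
        visiting.add(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        visiting.discard(txn)
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None

deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T1"]})
healthy = find_deadlock({"T1": ["T2"], "T2": []})
```

The transactions in the returned cycle are the candidates for victim selection; choosing among them (and avoiding starvation) is a separate policy decision.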

Recovery with Concurrency
Locking Protocol
 Transaction Rollback
 Checkpoints
 Restart

Recovery Locking Protocol

Recovery Dependent on Locking
 Multiple UNDOs may not work correctly if a second transaction reads a value updated by a prior transaction before the prior transaction commits

Use Two-Phase Locking that is at least Strict
Transaction Rollback
Use log records to complete the rollback
 Must rollback from most recent to earlier updates
 Release exclusive locks after rollback is completed

Checkpoints

Multiple transactions can be active at a checkpoint
 Change <checkpoint> log record to include list of all currently active transactions

Still have to halt other processing while checkpointing
Checkpointing - When?
More frequent checkpoints -> faster recovery
 Less often -> longer recovery
 MTBF - all components
 Timing
 Amount of Activity
 # transactions
 # updates
 log file size

Restart Recovery
Redo list - commit found
 Undo list - start found - not on Redo list

Scan log backwards from the end
 Stop at <checkpoint>
 For each transaction on the checkpoint list not on the Redo list, add it to the Undo list

Restart Recovery
1. Starting again at the end of the log, Undo all transactions on the Undo list
2. Return to the most recent checkpoint
3. Move forward and redo all transactions on the redo list
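The restart procedure (build the Redo and Undo lists, undo backward, redo forward from the checkpoint) can be sketched as follows. The record shapes, the function name restart_recovery, and the sample log are hypothetical; the sketch assumes strict two-phase locking, so concurrent transactions do not update the same item.

```python
def restart_recovery(log, db):
    """Assumed records: ("START", t), ("WRITE", t, key, old, new),
    ("COMMIT", t), ("CHECKPOINT", [active transactions])."""
    db = dict(db)
    redo, undo = set(), set()
    cp = 0                                    # index of most recent checkpoint
    for i in range(len(log) - 1, -1, -1):     # scan backwards from the end
        rec = log[i]
        if rec[0] == "COMMIT":
            redo.add(rec[1])
        elif rec[0] == "START" and rec[1] not in redo:
            undo.add(rec[1])
        elif rec[0] == "CHECKPOINT":
            undo.update(t for t in rec[1] if t not in redo)
            cp = i
            break
    # 1. From the end of the log, undo newest first (may pass the checkpoint).
    for rec in reversed(log):
        if rec[0] == "WRITE" and rec[1] in undo:
            db[rec[2]] = rec[3]               # restore old value
    # 2-3. From the checkpoint forward, redo committed work.
    for rec in log[cp:]:
        if rec[0] == "WRITE" and rec[1] in redo:
            db[rec[2]] = rec[4]               # install new value
    return db, redo, undo

log = [
    ("START", "T1"), ("WRITE", "T1", "A", 1, 2),
    ("START", "T2"), ("WRITE", "T2", "B", 10, 20),
    ("CHECKPOINT", ["T1", "T2"]),
    ("WRITE", "T1", "A", 2, 3), ("COMMIT", "T1"),
    ("START", "T3"), ("WRITE", "T3", "C", 100, 200),
]
recovered, redo_list, undo_list = restart_recovery(
    log, {"A": 3, "B": 20, "C": 200})
```

T2 never appears after the checkpoint, yet it lands on the Undo list because the checkpoint record names it as active.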
Homework #8
Database Design
 Database Administration
