Transaction Management in HDBMSs
©2003 Vera Goebel & Denise Ecklund
HDBS Transaction Model
[Figure: Global transactions GTi and GTj are submitted to the GTM (global transaction manager), which splits them into global subtransactions { GSTi1, GSTj1, GSTi2, GSTj2 }. At each site a server (proxy for the GTM) passes the global subtransactions to its local DBMS (DBMS 1 ... DBMS n). Local transactions (LTk, LTl, LTm, LTn) are submitted to the local DBMSs directly.]
Transaction Management
• Local transactions: access data at a single site outside of the
global HDBS control.
• Global transactions: are executed under the HDBS control.
Local DBMSs have three types of autonomy:

| Autonomy Type | Definition | Resulting Problem |
|---|---|---|
| Design | No changes can be made to the local DBMS software to support the HDBMS | Non-serializable schedule for global transactions |
| Execution | Each local DBMS controls execution of global subtransactions and local transactions (the commit/abort decision) | Non-atomic & non-durable global transactions |
| Communication | Local DBMSs do not communicate with each other and they do not exchange execution control information | Distributed deadlock cannot be detected |
Global Serializability Problem
Problem overview: Global Serializability | Atomicity & Durability | Distributed Deadlock
• GTM is responsible for
– A serializable schedule for the set of global transactions
– Coordination of submission and execution of global subtransactions
among the local DBMSs
• Serializing the global schedule?
[Figure: GT1 consists of subtransactions GST11 and GST12; GT2 consists of GST21, GST22, and GST23. The subtransactions run at Local DBMS-1, Local DBMS-2, and Local DBMS-3.]
If GST11 ⟨ GST22 at site DBMS-1 (GT1 ⟨ GT2),
then it must be the case that GST12 ⟨ GST23 at site DBMS-2.
If instead GST23 ⟨ GST12 at site DBMS-2 (GT2 ⟨ GT1): a non-serializable schedule!
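The condition above can be checked mechanically: the orders imposed at the individual sites must merge into one serial order. The following is a minimal sketch, assuming the GTM could observe each site's serialization order (which local autonomy usually prevents). The function name `global_order` and the input layout are hypothetical, not part of the slides.

```python
from graphlib import TopologicalSorter, CycleError

def global_order(site_orders):
    """Merge per-site serialization orders of global transactions.

    site_orders maps a site name to the list of global transactions in the
    order that site serialized them. Returns one compatible global order,
    or None if the sites disagree (the global schedule is non-serializable).
    """
    sorter = TopologicalSorter()
    for order in site_orders.values():
        for txn in order:
            sorter.add(txn)
        for earlier, later in zip(order, order[1:]):
            sorter.add(later, earlier)  # 'later' must follow 'earlier'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Both sites agree on GT1 before GT2.
consistent = global_order({"DBMS-1": ["GT1", "GT2"], "DBMS-2": ["GT1", "GT2"]})
# DBMS-2 orders GT2 before GT1, contradicting DBMS-1.
conflicting = global_order({"DBMS-1": ["GT1", "GT2"], "DBMS-2": ["GT2", "GT1"]})
```

Here `consistent` holds a valid serial order, while `conflicting` is `None` because the two site orders form a cycle.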
Local Transactions and the Global Serializable Schedule
• Local transactions execute outside the control of the GTM
• Local transactions create indirect conflicts with global transactions
• GTM is not aware of local transactions and these indirect conflicts
• In general, the GTM cannot ensure global serializability
Example:
GT1: r1(a) r1(c)
GT2: r2(b) r2(d)
LT3: w3(a) w3(b) (local transaction at LDBMS-1, which stores a and b)
LT4: w4(c) w4(d) (local transaction at LDBMS-2, which stores c and d)
GTM believes GT1 ⟨ GT2 at both sites, but:
LDBMS-1: r1(a) c1 w3(a) w3(b) c3 r2(b) c2
=> LDBMS-1: GT1 ⟨ LT3 ⟨ GT2
LDBMS-2: w4(c) r1(c) c1 r2(d) c2 w4(d) c4
=> LDBMS-2: GT2 ⟨ LT4 ⟨ GT1
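The indirect conflict in this example can be made visible by building a conflict graph from the two local schedules. This is a sketch only: the helper names `conflict_edges` and `has_cycle` are hypothetical, and transactions are identified by the digit used in the schedule notation (1 and 2 are GT1 and GT2, 3 and 4 are LT3 and LT4).

```python
import re

def conflict_edges(schedule):
    """Return the set of (Ti, Tj) pairs where an operation of Ti precedes a
    conflicting operation of Tj on the same item (at least one is a write)."""
    ops = re.findall(r"([rw])(\d+)\((\w+)\)", schedule)  # commits are ignored
    edges = set()
    for i, (kind_a, txn_a, item_a) in enumerate(ops):
        for kind_b, txn_b, item_b in ops[i + 1:]:
            if txn_a != txn_b and item_a == item_b and "w" in (kind_a, kind_b):
                edges.add((txn_a, txn_b))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        state[node] = "done"
        return found

    return any(visit(node) for node in list(graph))

site1 = conflict_edges("r1(a) c1 w3(a) w3(b) c3 r2(b) c2")
site2 = conflict_edges("w4(c) r1(c) c1 r2(d) c2 w4(d) c4")
globally_cyclic = has_cycle(site1 | site2)
```

Each local graph is acyclic on its own; only the union contains the cycle through the local transactions, and that union is exactly the information the GTM does not have.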
Controlling the Execution Order of Global Subtransactions
• Four Strategies:
1) Execute global transactions serially
• No concurrent execution for global transactions!
• Does not solve indirect conflicts with local transactions
• Costs: heavy CC processing at the GTM; low query processing throughput
2) Define a specific order over the global transactions
and use the concurrency control mechanism of each
local DBMS to enforce that order
• Every local DB stores one “ticket” object
• Extend every global subtransaction to access the ticket
GT1: r1(a) w1(a)
GT2: r2(b) w2(b)
newGT1: r1(ticketS1) r1(a) w1(a) w1(ticketS1) c1
newGT2: r2(ticketS1) r2(b) w2(b) w2(ticketS1) c2
• Means GT1 and GT2 will be correctly serialized with respect to all global
transactions and all local transactions executed by the local DBMS at S1
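The ticket idea can be illustrated with a small simulation. This sketch assumes each ticket is a counter that a subtransaction reads and then increments, and that the GTM afterwards compares the ticket values obtained at every site. That validation step, the `Site` class, and `ticket_orders_agree` are assumptions made for illustration; the slide itself only describes adding the ticket accesses.

```python
from graphlib import TopologicalSorter, CycleError

class Site:
    """A local DB holding one ticket object (a plain counter here)."""
    def __init__(self, name):
        self.name = name
        self.ticket = 0

    def take_ticket(self):
        # r(ticket) followed by w(ticket): forces a direct conflict between
        # every pair of global subtransactions at this site.
        value = self.ticket
        self.ticket = value + 1
        return value

def ticket_orders_agree(tickets):
    """tickets maps a global transaction to {site name: ticket value}.
    Returns True if the per-site ticket orders fit into one global order."""
    sorter = TopologicalSorter()
    sites = {site for per_site in tickets.values() for site in per_site}
    for site in sites:
        ranked = sorted((t for t in tickets if site in tickets[t]),
                        key=lambda t: tickets[t][site])
        for earlier, later in zip(ranked, ranked[1:]):
            sorter.add(later, earlier)
    try:
        sorter.prepare()  # raises CycleError if the orders contradict
    except CycleError:
        return False
    return True

s1, s2 = Site("S1"), Site("S2")
tickets = {"GT1": {}, "GT2": {}}
tickets["GT1"]["S1"] = s1.take_ticket()  # GT1 first at S1
tickets["GT2"]["S1"] = s1.take_ticket()
tickets["GT2"]["S2"] = s2.take_ticket()  # GT2 first at S2
tickets["GT1"]["S2"] = s2.take_ticket()
agree = ticket_orders_agree(tickets)
```

In this run the two sites hand out tickets in opposite orders, so `agree` is `False` and one of the global transactions would have to be aborted and restarted.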
3) Use local DBs deploying rigorous CC algorithms
• If all LDBMSs use rigorous 2-phase locking (R2PL)
and support a “prepare-to-commit” interface then
– Global transactions are serializable without a CC algorithm at the GTM
– Local transactions cannot cause indirect conflicts
Ex: (w4(c) r1(c) c1 r2(d) c2 w4(d) c4) is not a rigorous local schedule.
In R2PL, T4 holds all locks until commit, so T1 cannot read object c
until after T4 commits.
4) Relax the serializability requirement
• Use “strong correctness” instead
• Most indirect conflicts have no effect on correctness
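The rigorousness argument used for strategy 3 can be tested directly on a local schedule. The sketch below assumes the schedule notation used in these slides (r, w, c followed by the transaction number) and ignores aborts; the function name `is_rigorous` is hypothetical.

```python
import re

def is_rigorous(schedule):
    """Check a local schedule such as 'w4(c) r1(c) c1' for rigorousness:
    no operation may conflict with an earlier operation of another
    transaction that has not yet committed."""
    tokens = re.findall(r"([rwc])(\d+)(?:\((\w+)\))?", schedule)
    committed = set()
    accesses = []  # (kind, txn, item) for the reads and writes seen so far
    for kind, txn, item in tokens:
        if kind == "c":
            committed.add(txn)
            continue
        for old_kind, old_txn, old_item in accesses:
            conflict = old_item == item and "w" in (kind, old_kind)
            if conflict and old_txn != txn and old_txn not in committed:
                return False
        accesses.append((kind, txn, item))
    return True

# The LDBMS-2 schedule from the slides: T1 reads c before T4 commits.
slide_schedule = is_rigorous("w4(c) r1(c) c1 r2(d) c2 w4(d) c4")
# A schedule R2PL could produce: T4 commits before T1 and T2 touch c and d.
locked_schedule = is_rigorous("w4(c) w4(d) c4 r1(c) c1 r2(d) c2")
```

The first call reports the violation described on the slide; the second shows a reordering in which T4 holds its locks until commit.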
Alternative Consistency Models
• Global schedule is not serializable; it is strongly correct
– Global transactions preserve all data consistency constraints
Constraint-based strategies
• Local serializability: Some HDBS applications have no global
constraints because each DBS is (and should be) independent from
each other => no global concurrency control mechanism needed
So, local serializability ensures strong correctness of global executions.
Ex application: travel reservation service for planes, trains, ferries, hotels, etc.
• Limited global constraints: Some applications need global constraints.
Define 2 types of data: global data and local data. Global constraints may
only span global data, and local transactions may not write to global data.
Use two-level serializability (2LSR): local-SR and global-SR.
Artificial solution: the local site has no autonomy over, or direct access to, global data;
it must submit transactions to the GTM to update global
data stored at the local site => master-slave relationship.
Non-constraint-based strategies
• Diverge from strong correctness and serializability
1) Epsilon Serializability
• Allows a specified number of nonserializable conflicts
2) Sets of Compatible Transactions
• Assume a set of known transactions
• Pre-analyze the transactions for conflicts
• Group non-conflicting transactions into compatible sets
• No CC is required among transactions in a compatible set
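The pre-analysis for compatible sets can be sketched as follows, assuming the read and write sets of every known transaction are available in advance. The greedy grouping and the names `conflicts` and `compatible_sets` are illustrative choices, not a method prescribed by the slides.

```python
def conflicts(a, b):
    """Two transactions conflict if one writes an item the other reads or writes."""
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & a["reads"])

def compatible_sets(txns):
    """Greedily place each transaction in the first set where it conflicts
    with no member; txns maps a name to {'reads': set, 'writes': set}."""
    groups = []
    for name, access in txns.items():
        for group in groups:
            if not any(conflicts(access, txns[other]) for other in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no existing set fits: open a new one
    return groups

known = {
    "T1": {"reads": {"a"}, "writes": {"a"}},
    "T2": {"reads": {"b"}, "writes": {"b"}},
    "T3": {"reads": {"a", "b"}, "writes": set()},
}
groups = compatible_sets(known)
```

T1 and T2 touch disjoint items and share a set; T3 reads items that both of them write, so it lands in a set of its own.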
Global Atomicity and Recovery Problem
• The GTM must guarantee that a global transaction
commits at all sites or aborts at all sites
• Local DBMSs wish to preserve their execution autonomy
– May not implement or export a “prepare-to-commit” interface
[Figure: GT1 is split into GST11 and GST12. The GTM runs 2PC with each GTM proxy, but there is no 2PC between a proxy and its LDBMS. One LDBMS aborts GST11 while the other commits GST12.]
• A local DBMS can unilaterally abort a subtransaction anytime
– Results in non-atomic global transactions and incorrect global schedules
– Local transactions and global subtransactions see committed partial results
Note: The first heterogeneous systems did not support update transactions!
Approaches to Achieve Atomicity and Durability
• If all LDBMSs export a “prepare-to-commit” interface,
then use 2PC between the proxy and the LDBMS
• If some LDBMSs do not export “prepare-to-commit”,
then four approaches:
1) Modify each global subtransaction to
“callback to the proxy” just before local commit
• Blocks the global subtransaction until GTM
completes 2PC with proxies
• Possible only if the LDBMS supports a client
callback service
• Fails if the LDBMS uses optimistic concurrency control
[Figure: The GTM runs 2PC with the GTM proxy; there is no 2PC between the proxy and the LDBMS.]
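The callback approach can be pictured with threads standing in for subtransactions. This is a rough sketch under several assumptions: the `Proxy` class, its method names, and the use of a timeout as an abort vote are all hypothetical, and no real LDBMS callback service is modeled.

```python
import threading

class Proxy:
    """GTM proxy that holds a subtransaction at its callback until the
    global decision arrives (hypothetical interface)."""
    def __init__(self):
        self.reached_callback = threading.Event()
        self.decision_ready = threading.Event()
        self.decision = None

    def callback(self):
        # Called by the subtransaction just before its local commit.
        self.reached_callback.set()   # acts as the "prepared" vote
        self.decision_ready.wait()    # block until the GTM decides
        return self.decision

    def decide(self, decision):
        self.decision = decision
        self.decision_ready.set()

def subtransaction(proxy, log):
    log.append("work done")
    outcome = proxy.callback()
    log.append("local " + outcome)    # commit or abort at the LDBMS

def run_global(proxies, timeout=1.0):
    """GTM side: commit only if every subtransaction reached its callback."""
    all_ready = all(p.reached_callback.wait(timeout) for p in proxies)
    decision = "commit" if all_ready else "abort"
    for p in proxies:
        p.decide(decision)
    return decision

logs = [[], []]
proxies = [Proxy(), Proxy()]
threads = [threading.Thread(target=subtransaction, args=(p, log))
           for p, log in zip(proxies, logs)]
for t in threads:
    t.start()
decision = run_global(proxies)
for t in threads:
    t.join()
```

Each subtransaction stays blocked inside `callback` while holding its local resources, which is why the approach depends on a callback service and does not combine well with optimistic concurrency control.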
• If any global subtransaction aborts:
2) REDO failed write operations from global subtransactions
- Performed by the proxy, which must maintain a local redo log
3) RETRY failed global subtransactions (read & write operations)
- Performed by the proxy
- Inappropriate semantics for many applications or transactions
- No guarantee that the retry can ever be committed
Ex: Banking application – withdrawing money can fail “forever”
4) UNDO committed global subtransactions by
executing compensating transactions
- Performed by the GTM
- Can provide semantic atomicity (called a saga)
Note: Inconsistent data is temporarily visible to other transactions!
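Approach 4 (compensation) can be sketched as a loop that remembers a compensating action for every subtransaction that has already committed. The banking data and the names `run_saga`, `withdraw`, `refund`, and `deposit` are invented for illustration.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order. If an action fails, run the
    compensations of the already-committed steps in reverse order."""
    done = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(done):
                undo()
            return False
        done.append(compensation)
    return True

accounts = {"A": 100, "B": 0}

def withdraw():
    accounts["A"] -= 50   # committed locally at the first site

def refund():
    accounts["A"] += 50   # compensating transaction for withdraw

def deposit():
    # Stands in for a subtransaction that the second site aborts.
    raise RuntimeError("subtransaction aborted by the local DBMS")

ok = run_saga([(withdraw, refund), (deposit, lambda: None)])
```

Between `withdraw` and `refund` the balance of account A is 50, a state other transactions could observe; this is the temporary inconsistency the slide warns about.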
Global Deadlock Problem
• Same problem as in distributed homogeneous DBMSs
[Figure: Global transactions T1 and T2 each run one subtransaction at Site X (T1x, T2x) and one at Site Y (T1y, T2y). At Site X, T1x holds lock Lx and T2x waits for T1x to release Lx. At Site Y, T2y holds lock Ly and T1y waits for T2y to release Ly. T1x waits for T1y to complete and T2y waits for T2x to complete, which closes the cycle. The figure also labels data items a and b (T1x needs a, T2y needs b) with locks La and Lb.]
• In distributed homogeneous DBMSs, the problem is solved by exchanging lock
information to construct the global “waits-for” graph
– This violates design autonomy and communication autonomy
• Therefore the GTM will be unaware of a global deadlock.
• There are no complete solutions to the global deadlock
problem for autonomous multi-database systems.
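The role of the global waits-for graph can be shown with a small cycle search. This sketch collapses each subtransaction into its global transaction (T1, T2) and assumes the sites were willing to export their local waits-for edges, which is exactly what autonomy rules out; the name `find_cycle` is hypothetical.

```python
def find_cycle(waits_for):
    """Return one cycle in a waits-for graph (dict: waiter -> set of holders),
    or None if the graph is acyclic."""
    visited = set()

    def dfs(node, path):
        if node in path:
            return path[path.index(node):]
        if node in visited:
            return None
        visited.add(node)
        for nxt in waits_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

site_x = {"T2": {"T1"}}  # at Site X, T2 waits for T1 to release Lx
site_y = {"T1": {"T2"}}  # at Site Y, T1 waits for T2 to release Ly
merged = {t: site_x.get(t, set()) | site_y.get(t, set())
          for t in site_x.keys() | site_y.keys()}
deadlock = find_cycle(merged)
```

Neither local graph contains a cycle, so each site sees only an ordinary lock wait; the deadlock shows up only in `merged`.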
Status: Transaction Management for HDBS
• Transaction management for HDBSs is a very active research area.
• Distributed transactions over the Internet define new semantics for
transaction consistency, allowing development of new solutions.
Open issues:
• What can be done if some of the local subsystems (e.g., file
systems) do not support transaction management?
• Performance implications of transaction management strategy?
• Handling of different degrees of consistency?
Conclusions
An HDBS allows a uniform view of the combination of data
maintained by different autonomous database systems.
• available: prototypes & commercial products with a set of fixed /
specific drivers (so-called gateways) for existing, widely used data
management systems (conventional DBS and file systems)
• missing: systematic support for individual integration of arbitrary
data management systems
– Examples: geographical DBs, multimedia DBs, Internet storefronts, etc.