Database Management Systems Chapter 10 Distributed Databases and the Internet

Jerry Post
McGraw-Hill/Irwin
Copyright © 2005 by The McGraw-Hill Companies, Inc. All rights reserved.

Distributed Databases
• Definition
• Advantages / Uses
• Problems / Complications
• Client-Server / SQL Server
• Microsoft Access

[Figure: databases in Germany, Britain, France, and Italy, combined with a query such as:]
SELECT Sales
FROM Britain.Sales
UNION
SELECT Sales
FROM France.Sales
UNION
SELECT Sales
FROM Italy.Sales

Distributed Database Definition
• Multiple independent databases
  • Each DBMS is a complete DBMS (engine, queries, locking, transactions, etc.)
  • Usually on different machines.
  • Usually in different locations.
  • Connected by a network.
• Might be different environments
  • Hardware
  • Operating System
  • DBMS Software
[Figure: networked databases named Apollo, Zeus, and Athena at sites in England, France, and the United States]

Distributed Database Rules
• C.J. Date
• Rule 0: Transparency: the user should not know or care that the database is distributed.
   1. Local autonomy.
   2. No reliance on a central site.
   3. Continuous operation.
   4. Location independence.
   5. Fragmentation independence (physical storage).
   6. Replication independence.
   7. Distributed query processing.
   8. Distributed transaction management.
   9. Hardware independence.
  10. Operating system independence.
  11. Network independence.
  12. DBMS independence.

Distributed Features
• Each database can continue to run even if a portion fails.
• Data and hardware can be moved without affecting operations or users.
  • Expanding operations.
  • Performance issues.
• System expansion and upgrades.
  • Add new section without affecting others.
  • Upgrade hardware, network and DBMS.

Advantages and Applications
• Business operations are often distributed
  • Work and data are segmented by department.
  • Work and data are segmented by geographical location.
• Improved performance
  • Most updates and queries are performed locally.
  • Maintain local control and responsibility over data.
  • Can still combine data across the system.
• Scalability and expansion
  • Add on, not replacement.
[Figure labels: local transactions; future expansion]

Creating a Distributed Database
• Design administration plan.
• Choose hardware and DBMS vendor, and network.
• Set up network and DBMS connections.
• Choose locations for data.
• Choose replication strategy.
• Create backup plan and strategy.
• Create local views and synonyms.
• Perform stress test: loads and failures.

Distributed Query Processing
• Networks are slow
  • Drives: 20 - 60 MB per sec.
  • LANs: 1 - 10 MB per sec (10 - 100 Mbps).
  • WANs: 0.01 - 5 MB per sec.
  • Faster is possible but expensive!
  • SANs: 10 - 100 MB per sec.
• Goal is to minimize transmissions.
• Each system must be capable of evaluating queries, preferably SQL.
• Results depend heavily on how the system joins tables.
[Figure labels, in source order: WAN, 0.1 - 5 MB; 10 - 20 MB, Disk drive; 10 - 100 MB, LAN]
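
To make the cost of transmission concrete, the following sketch estimates how long one intermediate result takes to move at rates taken from the ranges listed above. The 500 MB result size and the single rate picked from each range are assumptions chosen for illustration; they are not figures from the slide.

```java
public class TransferTimeSketch {
    /** Seconds needed to move sizeMb megabytes at rateMbPerSec megabytes per second. */
    static double seconds(double sizeMb, double rateMbPerSec) {
        return sizeMb / rateMbPerSec;
    }

    public static void main(String[] args) {
        double sizeMb = 500.0; // hypothetical size of an intermediate query result
        System.out.printf("Disk (40 MB/s):  %.1f s%n", seconds(sizeMb, 40.0));
        System.out.printf("LAN  (10 MB/s):  %.1f s%n", seconds(sizeMb, 10.0));
        System.out.printf("WAN  (0.1 MB/s): %.1f s%n", seconds(sizeMb, 0.1));
    }
}
```

Under these assumed rates the same result takes about 100 times longer to cross the WAN than the LAN, which is why the goal is to send as few rows as possible between sites.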

Distributed Query Processing
• Example
  • NY: Customers: 1 M rows
  • LA: Production: 10 M rows
  • Chicago: Sales: 20 M rows
  • Query: List customers who bought blue products on March 1
• Bad idea #1
  • Transfer all rows to Chicago
  • Then JOIN and select.
• Better idea #2 (probably)
  • Transfer blue products from LA to Chicago
• Better idea #3
  • Get sale items on March 1
  • Get blue products from LA
  • Send C# to NY
[Figure: three sites and the data exchanged between them.
  NY: Customers(C#, …), 1,000,000 rows
  Chicago: Sales(S#, C#, Sdate), 20,000,000 rows; SaleItem(S#, P#,…), 50,000,000 rows
  LA: Products(P#, Color…), 10,000,000 rows
  Between Chicago and LA: P# sold on March 1; Blue P# sold on March 1
  Between Chicago and NY: C# list from desired P#; Matching Customer data]

Data Replication
• Goals
  • Minimize transmissions
  • Improve performance
  • Support heavy multiuser access.
• Problems
  • Updating copies
    • Bulk transmissions
    • Site unavailable
  • Concurrency
    • Easier for two people to change the same data at the same time.
• Decision support systems.
• Data warehouse.
[Figure: Britain and Spain each hold copies of the Britain, France, and Spain Customers & Sales data. Britain: market research & data corrections. Spain: update data. Periodic updates pass between the sites.]

Concurrency and Locks
• Each DBMS must maintain a lock facility.
• To update, each DBMS must utilize and recognize other lock mechanisms and return codes.
• Each DBMS must have a deadlock resolution protocol that recognizes the distributed databases.
  • Random wait.
  • Optimistic updates.
  • Two-phase commit.
[Figure: deadlock across two systems.
  DBMS #1, Accounts, Jones 8898: Transaction A Locked, Transaction B Waiting
  DBMS #2, Accounts, Jones 3561: Transaction A Waiting, Transaction B Locked]

Transactions & Two-Phase Commit
• Two (or more) separate lock managers.
• DBMS initiating update serves as the coordinator.
• Two phases
  • Coordinator sends message and data to all machines to "get ready."
  • Local machines save data in logs, verify update status and return message.
  • If all locals report OK, then coordinator writes log and instructs others to proceed. If any fail, it sends Rollback message.
[Figure: Database 1 initiates the transaction: 1. Prepare to commit. All agree? 2. Commit. Database 2 and Database 3: Lock tables. Save log. Update all tables.]
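
The two phases above can be sketched in a few lines. This is a simplified simulation under stated assumptions: the Participant class, its canPrepare flag, and the state names are hypothetical, and the sketch leaves out real logging, locking, timeouts, and recovery from a coordinator failure.

```java
import java.util.List;

public class TwoPhaseCommitSketch {
    /** A hypothetical local DBMS taking part in the distributed update. */
    static class Participant {
        final String name;
        final boolean canPrepare; // simulates whether the local "get ready" step succeeds
        String state = "IDLE";

        Participant(String name, boolean canPrepare) {
            this.name = name;
            this.canPrepare = canPrepare;
        }

        /** Phase 1: save data in the local log and report whether it is OK to commit. */
        boolean prepare() {
            state = canPrepare ? "PREPARED" : "FAILED";
            return canPrepare;
        }

        /** Phase 2 when everyone agreed: make the update permanent. */
        void commit() { state = "COMMITTED"; }

        /** Phase 2 when anyone failed: undo the prepared work. */
        void rollback() { state = "ROLLED_BACK"; }
    }

    /** Coordinator logic: returns true if the transaction committed at every site. */
    static boolean run(List<Participant> participants) {
        boolean allAgree = true;
        for (Participant p : participants) {      // phase 1: prepare to commit
            if (!p.prepare()) {
                allAgree = false;
                break;                            // one failure is enough to abort
            }
        }
        for (Participant p : participants) {      // phase 2: commit or roll back everywhere
            if (allAgree) p.commit(); else p.rollback();
        }
        return allAgree;
    }

    public static void main(String[] args) {
        List<Participant> sites = List.of(
            new Participant("Database 2", true),
            new Participant("Database 3", false));
        System.out.println("Committed: " + run(sites));
        for (Participant p : sites) System.out.println(p.name + ": " + p.state);
    }
}
```

The key property is that no site commits until every site has reported that it is ready, so a single failure leaves all copies unchanged.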

Distributed Transaction Managers
[Figure: three systems, each with a Transaction Manager, Resource Manager, and DBMS, connected to a Transaction Processing Monitor]
The distributed transaction coordinator/transaction processing monitor handles the transaction decisions and coordinates across the participating systems.

Distributed Design Questions
Question                                            Concurrent         Replication
What level of data consistency is needed?           High               Low – Medium
How expensive is storage?                           Medium – High      Low
What are the shared access requirements?            Global             Local
How often are the tables updated?                   Often              Seldom
Required speed of updates (transactions)?           Fast               Slow
How important are predictable transaction times?    High               Low
DBMS support for concurrency and locking?           Good – Excellent   Poor
Can shared access be avoided?                       No                 Yes

Distributed Databases In Oracle
• Database Links
  • Full database names.
      Schema.Table@Location
      Scott.Emp@hq.acme.com
  • CONNECT command.
• Linking through synonyms.
  • CREATE SYNONYM …
  • Central control over permissions.
• Linking through Views/queries.
  • CREATE VIEW AS …
  • Can assign local permissions.
• Linking through stored procedures.
  • DELETE …
  • Strong control over actions.
[Figure: Server database. Synonym: Employee. View: user permissions. Procedure: DELETE FROM Employee WHERE ... User can only run procedure. No other access.]

Client-Server
[Figure: the Server holds the Shared Database; the Clients run the Front-end User Interface.]

LAN File Server
• Not a distributed database.
  • Data file stored on server.
  • Server is passive, appears as giant disk drive to PC.
  • PC processes all data.
  • Retrieves all needed data across the network.
• Performance improvements.
  • Indexes are crucial.
  • Store some data on each PC (replication).
  • Store applications on PC (graphics & forms).
  • Convert to SQL-Server
[Figure: the File Server holds the DBMS data file (Shared Data); the Application runs on the PC.]
All data from all tables are read by PC, which performs JOIN and WHERE test. If available, reads index first.
SELECT Name, SaleDate
FROM Customer INNER JOIN Sales
ON Customer.C# = Sales.C#
WHERE SaleDate BETWEEN #1-Mar-97#
AND #9-Mar-97#;

LAN File Server: Slow
[Figure: the File Server holds MyFile.mdb, containing Forms, Order ..., and a Customer table (CustID, Name, …) with rows such as 115 Jenkins … and 125 Juarez ...
  DBMS software transferred.
  Application and query transferred.
  One row at a time transferred, until all rows are examined.]
SELECT *
FROM Customer
WHERE City = "Sandy"

Client-Server Databases
• One machine is dominant (server) and handles data for many clients.
• Client machines handle front-end tasks and small data tables that are not shared.
[Figure: the client application sends an SQL statement to the server (DBMS, SQL Server, Shared Data); the server returns the matching data.]

ADO and Direct Connections
• The database vendor provides its own data transport (e.g., Oracle or SQL Server) installed on the server and the client.
• ADO provides a driver that connects your application to the transport services.
• ODBC can serve as the data transport if nothing else is available.
[Figure: Server Computer: Database Server, DBMS transport. Client Computer: DBMS transport, ADO, Visual Basic application.]

Three-Tier Client-Server
• Server Databases
• Client front-end
• Middle
  • Locate databases
  • Business rules
  • Program code
[Figure: three tiers.
  Database Servers: Databases. Transactions. Legacy applications.
  Middleware: Database links. Business rules. Program code.
  Client: Application. Front-end. User Interface.]

Database Independence on the Client
[Figure: the Application connects through ADO to the Original DBMS, and through ADO to the New DBMS.]

Database Independence with Queries
Independent Application Query: works with any DBMS
SELECT SaleID, SaleDate, CustomerID, CustomerName
FROM SaleCustomer

Saved Oracle Query
SELECT SaleID, SaleDate, Sale.CustomerID,
LastName || ', ' || FirstName AS CustomerName
FROM Sale, Customer
WHERE Sale.CustomerID = Customer.CustomerID

Saved SQL Server Query
SELECT SaleID, SaleDate, Sale.CustomerID,
LastName + ', ' + FirstName AS CustomerName
FROM Sale INNER JOIN Customer
ON Sale.CustomerID = Customer.CustomerID
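
One way to picture this arrangement from the application side is sketched below. The application holds a single query against SaleCustomer, while the DBMS-specific text lives in a per-vendor definition. The sketch assumes each saved query is stored as a view; the CREATE VIEW wrappers, the map keys, and the class and method names are illustrative assumptions rather than part of the original slide.

```java
import java.util.Map;

public class QueryIndependenceSketch {
    /** The application always runs this query, whichever DBMS is behind it. */
    static final String APPLICATION_QUERY =
        "SELECT SaleID, SaleDate, CustomerID, CustomerName FROM SaleCustomer";

    /** DBMS-specific definitions of the saved query; the keys are hypothetical labels. */
    static final Map<String, String> SAVED_QUERY = Map.of(
        "oracle",
        "CREATE VIEW SaleCustomer AS "
            + "SELECT SaleID, SaleDate, Sale.CustomerID, "
            + "LastName || ', ' || FirstName AS CustomerName "
            + "FROM Sale, Customer "
            + "WHERE Sale.CustomerID = Customer.CustomerID",
        "sqlserver",
        "CREATE VIEW SaleCustomer AS "
            + "SELECT SaleID, SaleDate, Sale.CustomerID, "
            + "LastName + ', ' + FirstName AS CustomerName "
            + "FROM Sale INNER JOIN Customer "
            + "ON Sale.CustomerID = Customer.CustomerID");

    /** Returns the one-time setup statement for the chosen DBMS. */
    static String setupFor(String dbms) {
        String ddl = SAVED_QUERY.get(dbms);
        if (ddl == null) {
            throw new IllegalArgumentException("No saved query for " + dbms);
        }
        return ddl;
    }

    public static void main(String[] args) {
        System.out.println("Setup (Oracle):     " + setupFor("oracle"));
        System.out.println("Setup (SQL Server): " + setupFor("sqlserver"));
        System.out.println("Application query:  " + APPLICATION_QUERY);
    }
}
```

Switching vendors then means running a different setup statement once; the application query string does not change.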

The Internet as Client-Server
[Figure: the Client Browser sends a request (http://server.location/page) through Internet routers to the Web Server, which returns information: HTML pages, Forms, Graphics.]
HTML Limited Clients
<HTML>
<HEAD>
<TITLE>My main page</TITLE></HEAD>
<BODY BACKGROUND="graphics/back0.jpg">
<P>My text goes in paragraphs.</P>
<P>Additional tags set <B>boldface</B> and <I>Italic</I>.</P>
<P>Tables are more complicated and use a set of tags for rows and
columns.</P>
<TABLE BORDER=1>
<TR><TD>First cell</TD><TD>Second cell</TD></TR>
<TR><TD>Next row</TD><TD>Second column</TD></TR>
</TABLE>
<P>There are form tags to create input forms for collecting data.
But you need CGI program code to convert and use the input data.</P>
</BODY>
</HTML>
HTML Output
My text goes in paragraphs.
Additional tags set boldface and Italic.
Tables are more complicated and use a set of tags
for rows and columns.
First cell    Second cell
Next row      Second column
There are form tags to create input forms for
collecting data. But you need CGI program code
to convert and use the input data.
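
Web Server Database Fundamentals
[Figure: interaction among the Client/Browser, the Web Server, and the DBMS/Database.
  0. Client requests Server/Form.html.
  1. Web Server returns the HTML form (Form.html).
  2. Client sends the form data; program code (Template + Code) on the Web Server sends a Query to the DBMS.
  3. DBMS returns the Result; the Web Server returns the Result Page.]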

Database Example: Client Side
[Figure: client-side view of the exchange with the Server: 0 Request Server/Form.html; 1 Initial form; 2; 3 Results]

Client-Server Data Transfer
Order Form
  Order ID:    1015
  Customer:    Jones, Martha
  Order Date:  12-Aug
• What if there are 10,000 customers?
• How much time to load the combo box?
• How do you refresh/reload the combo box?
• Alternatives?

Latency
[Figure: timeline. Server: Generate form. Transmission delay. Client: Form received. User delay. Transmission delay. Server: Receive form data.]
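
The timeline can be read as a simple sum of its parts. The sketch below adds them up for one form round trip; the millisecond values and the assumption that both transmission delays are equal are illustrative, not taken from the slide.

```java
public class LatencySketch {
    /**
     * Elapsed milliseconds from the moment the server starts generating the form
     * until it receives the completed form data.
     */
    static long roundTripMillis(long generateMs, long oneWayDelayMs, long userDelayMs) {
        // generate form -> transmission -> user fills in form -> transmission back
        return generateMs + oneWayDelayMs + userDelayMs + oneWayDelayMs;
    }

    public static void main(String[] args) {
        // Hypothetical values: 50 ms to generate, 200 ms each way, 30 s of user time.
        System.out.println(roundTripMillis(50, 200, 30_000) + " ms");
    }
}
```

With these assumed numbers the user delay dominates, so the server cannot predict when, or whether, the form data will come back.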

XML: Transferring Data
Order: OrderID, OrderDate, ShippingCost, Comment
    Item: ItemID, Description, Quantity, Cost
    Item: ItemID, Description, Quantity, Cost
    Item: ItemID, Description, Quantity, Cost
Many XML files contain hierarchical data.

XML: Schema Definition xsd
Partial file, generated by .NET xsd.exe
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="OrderList" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="OrderList" msdata:IsDataSet="true">
<xs:complexType>
<xs:choice maxOccurs="unbounded">
<xs:element name="Order">
<xs:complexType>
<xs:sequence>
<xs:element name="OrderID" type="xs:string" minOccurs="0" />
<xs:element name="OrderDate" type="xs:date" minOccurs="0" />
<xs:element name="ShippingCost" type="xs:string" minOccurs="0" />
<xs:element name="Comment" type="xs:string" minOccurs="0" />
<xs:element name="Items" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="ItemID" nillable="true" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:simpleContent msdata:ColumnName="ItemID_Text" msdata:Ordinal="0">
<xs:extension base="xs:string">
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="Description" nillable="true" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:simpleContent msdata:ColumnName="Description_Text" msdata:Ordinal="0">
<xs:extension base="xs:string">
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
XML Data Example
<?xml version="1.0"?>
<!DOCTYPE OrderList SYSTEM "orderlist.dtd">
<OrderList>
<Order>
<OrderID>1</OrderID>
<OrderDate>3/6/2004</OrderDate>
<ShippingCost>$33.54</ShippingCost>
<Comment>Need immediately.</Comment>
<Items>
<ItemID>30</ItemID>
<Description>Flea Collar-DogMedium</Description>
<Quantity>208</Quantity>
<Cost>$4.42</Cost>
<ItemID>27</ItemID>
<Description>Aquarium Filter &amp; Pump</Description>
<Quantity>8</Quantity>
<Cost>$24.65</Cost>
</Items>
</Order>
</OrderList>
XML: Extensible Markup Language
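
A short sketch of reading such a file with the XML parser that ships with Java follows. It uses a trimmed copy of the order data held in a string, with no DOCTYPE line so that no external DTD file is needed; the class and method names are hypothetical. The ampersand is written as &amp; because a bare & is not well-formed XML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class OrderXmlSketch {
    /** Trimmed copy of the order data, embedded so the example is self-contained. */
    static final String ORDER_XML =
        "<?xml version=\"1.0\"?>"
      + "<OrderList><Order>"
      + "<OrderID>1</OrderID>"
      + "<Items>"
      + "<ItemID>30</ItemID><Description>Flea Collar-DogMedium</Description>"
      + "<ItemID>27</ItemID><Description>Aquarium Filter &amp; Pump</Description>"
      + "</Items>"
      + "</Order></OrderList>";

    /** Returns the text of every Description element, in document order. */
    static List<String> descriptions(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Description");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            // Malformed XML (for example an unescaped &) ends up here.
            throw new IllegalArgumentException("Could not parse order XML", e);
        }
    }

    public static void main(String[] args) {
        for (String d : descriptions(ORDER_XML)) {
            System.out.println(d);
        }
    }
}
```

The parser turns &amp; back into a plain ampersand, so the application sees the description as the sender wrote it.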
XML Example in Explorer
Java and JDBC
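import java.sql.*;

public class AnimalQuery {
    public static void main(String[] args) throws SQLException {
        // The URL, login, and password are placeholders; the URL format depends on the JDBC driver.
        try (Connection con = DriverManager.getConnection(
                 "jdbc:myDriver:myDBName", "myLogin", "myPassword");
             Statement smt = con.createStatement();
             ResultSet rst = smt.executeQuery(
                 "SELECT AnimalID, Name, Category, Breed FROM Animal")) {
            while (rst.next()) {
                int iAnimal = rst.getInt("AnimalID");
                String sName = rst.getString("Name");
                String sCategory = rst.getString("Category");
                String sBreed = rst.getString("Breed");
                // Now do something with these four variables
            }
        } // try-with-resources closes the result set, statement, and connection
    }
}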
End of
Chapter 10