Distributed Computing & Database Systems Introduction: Distributed Computing & Database Systems

Distributed Computing & Database Systems
COT5200: DISTRIBUTED DATABASE SYSTEMS
Introduction: Distributed Computing & Database Systems

Distributed Database Systems: What we will cover?
• Introduction to Distributed Systems
• Architecture of a distributed database system
• Date’s Rules for DDBS

DDBS: Recommended References
Date, C. J. (1995). An Introduction to Database Systems, Volume 1, 6th Edition. Addison-Wesley.
Elmasri, R. and Navathe, S. B. (1994). Fundamentals of Database Systems, 2nd Edition. Benjamin Cummings.
Korth, H. F. and Silberschatz, A. (1991). Database System Concepts, 2nd Edition. McGraw-Hill.
Date, C. J. (1990). Relational Database: Writings 1985-1989, Chapter 10. Addison-Wesley.
Özsu, M. T. and Valduriez, P. (1991). Principles of Distributed Database Systems. Prentice Hall.
Coulouris, G., Dollimore, J. and Kindberg, T. Distributed Systems, 2nd Edition. Addison-Wesley.
Goscinski, A. (1992). Distributed Operating Systems. Addison-Wesley.

What is a True Distributed System
“A system that runs on a collection of machines that do
not have shared memory, yet looks to its users like a
single computer”
• E.g. Amoeba, Sprite, Chorus, Clouds
• Global IPC
• Single set of system calls on each node
• All machines run the same kernel
• Each kernel controls its own resources (?)

DS: Design Issues
• Must be transparent
• Provide flexibility
• Be reliable
• Good performance
• Scalable

DS: Reliability
• Availability is a related factor
• Design should not require the simultaneous
functioning of a substantial number of critical
components
• More redundancy gives greater availability, but also a greater
risk of inconsistency
• Fault tolerance: the ability to mask failures from the
user

DS: Performance Issues
• The rest are useless without this
• Hard to measure, benchmarks are meaningless
• Balance number of messages and grain size of
distributed computations

DS: Scalability
• A maxim for developing distributed systems
• Avoid centralised components, tables and
algorithms
• Only decentralised algorithms should be used

DS: Characteristics of decentralised algorithms
• No machine has complete information about the state
of the system
• Machines make decisions based only on locally
available information
• Failure of one machine does not ruin the algorithm
• There is no implicit assumption of the existence of a
global clock
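
One well-known way to order events without a global clock is a Lamport logical clock. The sketch below is an illustration only and is not taken from the lecture; the class and method names are assumptions. Each machine keeps its own counter and uses only locally available information plus the timestamp carried by an incoming message.

```python
class LamportClock:
    """Logical clock kept by one machine; no shared or global clock is used."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances only this machine's counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Move past the sender's timestamp so the send orders before the receive.
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                    # local event on machine a
stamp = a.send()            # a sends a message stamped with its counter
b_time = b.receive(stamp)   # b adjusts its counter using only the stamp
```

The receive timestamp is always larger than the send timestamp, which is the only ordering guarantee this sketch provides.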

Types of Distributed Systems
• Workstation Server
• Processor Pool
• Hybrid
• Integrated

DS: Workstation Server
[Figure]

DS: Processor Pool
[Figure]

DS: Hybrid style
[Figure]

DS: Integrated systems
[Figure]

Distributed System Communications
• A distributed system relies on communication,
generally involving:
♦ Message Passing
⇒ When a client wants to communicate with a server it sends a
message. The server replies with a response. A message passing
mechanism may be:
• Reliable or Unreliable
• Blocking or Nonblocking
♦ Remote Procedure Calls
⇒ A reliable send followed by a get is much like a procedure call. A
remote procedure call can be provided by including a stub with the
server and the client. The stub is responsible for dealing with the
communication between systems. With stubs, the client makes what
looks like a local procedure call, and the server routine is likewise
called locally by its own stub.
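
The stub idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a real RPC library: the names `client_stub`, `server_stub` and `transport` are hypothetical, JSON is chosen arbitrarily as the message format, and the "network" is a direct function call standing in for a reliable send followed by a blocking get.

```python
import json


def add(x, y):
    """The real procedure, living on the server."""
    return x + y


PROCEDURES = {"add": add}


def server_stub(request_bytes):
    # Unmarshal the request, call the server routine locally, marshal the reply.
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request_bytes):
    # Stand-in for the network: a real system would send the bytes
    # to another machine and block until the reply arrives.
    return server_stub(request_bytes)


def client_stub(proc, *args):
    # Marshal the call into a message, send it, and wait for the reply.
    request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply = json.loads(transport(request).decode("utf-8"))
    return reply["result"]


def remote_add(x, y):
    # To the caller this looks like an ordinary local procedure.
    return client_stub("add", x, y)


total = remote_add(2, 3)
```

The caller of `remote_add` never sees the message format or the transport, which is the point of placing stubs on both sides.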

Layered View on a Distributed System
Applications
DBMS, TPS, ...
Distributed OS
Hardware

Main components of Distributed OS
User & Performance Oriented Issues
♦ Communication model
♦ Paradigms for process interaction
♦ Transparency
♦ Heterogeneity
♦ Autonomy and/or interdependence
♦ Reliable computing
♦ Replication of information and data consistency
Components Oriented Issues
♦ File Management
♦ Resource Management
♦ Memory Management
♦ Process Management
Kernel
♦ Interprocess communication
♦ Synchronisation
♦ Addressing & naming
♦ Process management
♦ Resource allocation
♦ Deadlock detection & resolution
♦ Resource protection
♦ Communication security & authentication

Broad Characteristics of Distributed OS
• The task of a Distributed OS is to enable a distributed system to be
conveniently programmed, so that it can be used to implement the
widest possible range of applications.
• It does this by presenting applications with general, problem-oriented
abstractions of the resources in a distributed system. Examples of
such abstractions are communication channels and processes, instead of
networks and processors.
• In an open distributed system, the distributed operating system is
implemented by a collection of kernels and servers (server
processes).
• This lecture focuses on the part of a distributed operating system that
acts as an infrastructure for general, network-transparent resource
management.

Distributed OS: Facilities for Encapsulating Resources
A distributed operating system must provide facilities for
encapsulating resources in a modular and protected fashion,
while providing clients with network-wide access to them.
Kernels and servers are both resource managers. They contain
resources, and as such they have to provide:
♦ Encapsulation: They should provide a useful service interface to their
resources, that is, a set of operations that meet their clients' needs. The
details of management of memory and devices used to implement
resources should be hidden from clients, even when they are local.
♦ Concurrent processing: Clients may share resources and access them
concurrently. Resource managers are responsible for achieving
concurrency transparency.
♦ Protection: Resources require protection from illegitimate accesses. For
example, files are protected from being read by users without read
permissions, and device registers are protected from application
processes.
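
The three duties above can be seen together in a toy resource manager. The sketch is illustrative only: `FileManager`, its methods and the in-memory "files" are assumptions made for this example, and threads stand in for concurrent clients.

```python
import threading


class FileManager:
    """Toy resource manager for named in-memory 'files'."""

    def __init__(self):
        self._files = {}               # storage details hidden from clients
        self._readers = {}             # file name -> users allowed to read
        self._lock = threading.Lock()  # serialises concurrent clients

    def create(self, name, data, readers):
        with self._lock:
            self._files[name] = data
            self._readers[name] = set(readers)

    def read(self, user, name):
        with self._lock:
            # Protection: refuse users without read permission.
            if user not in self._readers.get(name, set()):
                raise PermissionError(f"{user} may not read {name}")
            return self._files[name]

    def append(self, name, data):
        # Concurrency: the lock makes each append appear atomic to clients.
        with self._lock:
            self._files[name] += data


manager = FileManager()
manager.create("log", "", readers={"alice"})


def worker():
    for _ in range(100):
        manager.append("log", "x")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

log_length = len(manager.read("alice", "log"))
```

Clients only ever call `create`, `read` and `append`; how the data is stored and locked stays inside the manager.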

Distributed OS: Access to Resources
• Clients access resources by identifying them in arguments to operations -
for example, remote procedure calls to a server, or system calls to a
kernel.
• We call an access to an encapsulated resource an invocation, regardless
of how it is implemented. A combination of client libraries, kernels and
servers may be called upon to perform the following invocation-related
tasks:
♦ Name resolution: The server (or kernel) that manages a resource has to be
located from the resource's identifier.
♦ Communication: Operation parameters and results have to be passed to and
from resource managers, over a network or within a computer.
♦ Scheduling: This is related to concurrency: when an operation is invoked, its
processing must be scheduled within the kernel or server.
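
A minimal sketch of the three invocation-related tasks follows. Everything in it is hypothetical: the `Server` class, the `NAME_TABLE` lookup, and the `kind:path` identifier format are assumptions chosen to keep the example short, and the queue is served in simple arrival order.

```python
from collections import deque


class Server:
    def __init__(self, name):
        self.name = name
        self.queue = deque()   # operations waiting to be scheduled
        self.resources = {}

    def deliver(self, operation, resource_id, args):
        # Communication: operation parameters arrive at the resource manager.
        self.queue.append((operation, resource_id, args))

    def run_next(self):
        # Scheduling: serve queued operations in arrival (FIFO) order.
        operation, resource_id, args = self.queue.popleft()
        if operation == "write":
            self.resources[resource_id] = args[0]
            return None
        if operation == "read":
            return self.resources.get(resource_id)
        raise ValueError(f"unknown operation {operation}")


# Assumed naming scheme: the prefix of an identifier names its manager.
file_server = Server("file_server")
NAME_TABLE = {"file": file_server}


def invoke(operation, resource_id, *args):
    # Name resolution: locate the manager from the resource identifier.
    kind = resource_id.split(":", 1)[0]
    server = NAME_TABLE[kind]
    server.deliver(operation, resource_id, args)
    return server.run_next()


invoke("write", "file:/tmp/a", "hello")
value = invoke("read", "file:/tmp/a")
```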

Relationship of DBMS Components
Runtime Support
Query Optimiser
Runtime
Development Support
Catalog Manager
Transaction Manager
File Manager
Recovery Manager
Cache Manager
Log Manager
Security Manager
CC Manager

Additional Components for DDBMS
Runtime Support
Runtime
Development Support
Catalog Manager
Name Services
Distributed Optimiser
Transaction Manager
Security Manager - Kerberos
File Manager
Distributed File System
Recovery Manager
CC Manager
Cache Manager
Log Manager
Replication Manager

Distributed Systems Architecture
STANDARD COMMUNICATIONS FRAMEWORK
OPEN SERVICES
SHARED DATA
TRANSACTIONAL SERVICES
SECURITY
ENHANCED SERVICES
Performance and Availability
REPLICATION

Characteristics of DS
• The ISO Reference Model for Open Distributed Processing (RM-ODP)
has identified the following types of transparency
♦ Access
♦ Location
♦ Concurrency
♦ Replication
♦ Failure
♦ Migration
♦ Performance
♦ Scaling

Distributed Computing: Paradigms for Process Interaction
• The client/server model
• The integrated model
• The pipe model
• The remote procedure call model

Distributed Computing: Client/Server Model
• Three major problems with the client/server model:
♦ Control of individual resources is centralised in a single server
♦ Each single server is a potential bottleneck
♦ To improve the performance, multiple implementations of similar functions must
be used
[Figure: client processes 1 to N send service requests to a single server process, which controls the resource]

The Client/Server Model in a Distributed System
[Figure: computers 1 to N connected by a network, each running its own kernel; the machines host a client, a file server, a print server and a mail server]

Distributed Computing: The Integrated Model
• The deficiencies of the client-server model led to the development of
the integrated model. According to this model, each computer's
software is designed as a complete facility with a general file system
and name interpretation mechanisms. This implies that each computer
in a distributed system would run the same software.
• Note that a distributed system that has been developed based on the
integrated model can be easily made to look like a client/server based
system if suitable configuration flexibility has been provided.

Distributed Computing: Pipe Model
• The pipe model is based on the concept of a process. A pipe is a
communication facility which allows transfer of data between processes
based on a first-in-first-out (FIFO) strategy. Pipes also allow
synchronisation of process execution.
• Traditionally, pipes are implemented using the file system for data storage.
• The most distinctive feature of a pipe is that it allows a process to send bulk
data to a remote node.
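
The FIFO and synchronisation properties can be observed with an ordinary unnamed pipe. The sketch below is a local illustration only: it uses `os.pipe` on one machine, with a thread standing in for the second process, so it does not show the remote case described above.

```python
import os
import threading

read_fd, write_fd = os.pipe()   # an unnamed pipe: one read end, one write end


def producer():
    with os.fdopen(write_fd, "wb") as out:
        for chunk in (b"first ", b"second ", b"third"):
            out.write(chunk)
    # Leaving the with-block closes the write end, which signals end-of-data.


writer = threading.Thread(target=producer)
writer.start()

with os.fdopen(read_fd, "rb") as inp:
    received = inp.read()   # blocks until the writer has closed its end

writer.join()
```

The reader sees the bytes in exactly the order they were written, and its blocking read is what synchronises it with the producer.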
[Figure: Sharing access to an unnamed pipe, drawn on a process tree (Process A; A1, A2, A3; A21, A22; A211). Labels: "Calls a pipe", "Share a pipe", "Cannot share a pipe"]

Distributed Computing: RPC Model
• The remote procedure call model was discussed in detail in the previous
lecture.
• Communication models based on remote procedure calls allow a process
to call a procedure at a remote computer. This operation is performed in
the same manner in which a local procedure is called.
• A remote procedure call blocks the caller until the call is complete and a
reply has been received.
• When a call is made, a request message is sent to a remote computer
where a desired procedure resides, a process is created to execute this
procedure, and after this process completes, a reply message is sent to the
calling process.

Reasons for Data Distribution
• Centralised DBMS vs. Distributed Database System
• A distributed database is a collection of data that belongs logically to the
same system but is physically spread over the sites of a computer
network
• Several factors have led to the development of DDBS:
♦ Distributed nature of some database applications
♦ Increased reliability and availability
♦ Allowing data sharing while maintaining some measure of local control
♦ Improved performance

Additional Functionality of DDBS
• Distribution leads to increased complexity in the system design and
implementation
• A DDBMS must be able to provide additional functions to those of a
centralised DBMS. Some of these are:
♦ To access remote sites and transmit queries and data among the various
sites via a communication network.
♦ To keep track of the data distribution and replication in the DDBMS catalog.
♦ To devise execution strategies for queries and transactions that access data
from more than one site.
♦ To decide on which copy of a replicated data item to access.
♦ To maintain the consistency of copies of a replicated data item.
♦ To maintain the global conceptual schema of the distributed database.
♦ To recover from individual site crashes and from new types of failures such as
failure of a communication link.

DBMS Implementation Alternatives
[Figure: DBMS implementation alternatives classified along three axes: Distribution, Autonomy and Heterogeneity. Labelled alternatives:]
• Logically integrated and homogeneous multiple DBMSs
• Distributed homogeneous DBMS
• Distributed homogeneous federated DBMS
• Distributed homogeneous multidatabase system
• Distributed heterogeneous DBMS
• Distributed heterogeneous federated DBMS
• Distributed heterogeneous multidatabase system
• Multidatabase system
• Single site homogeneous federated DBMS
• Single site heterogeneous federated DBMS
• Heterogeneous integrated DBMS
• Heterogeneous multidatabase system

Physical Architecture of DDBS
[Figure: sites 1 to N connected by a communication network; each site contains a DP and an AP, and two further labels mark a front-end machine and a back-end machine]

Possible Network Topologies
[Figure: five example topologies over nodes A to F: fully connected network, partially connected network, tree-structured network, star network, ring network]

Components of Distributed DBMS
[Figure: components of a distributed DBMS. The USER sends user requests and receives system responses. The figure is divided into an AP part and a DP part.]
Processing components:
• User Interface Handler
• Semantic Data Controller
• Global Query Optimiser
• Global Execution Monitor
• Local Query Processor
• Local Recovery Manager
• Runtime Support Processor
Schemas and data:
• External Schema
• Global Conceptual Schema
• GD/D
• Local Conceptual Schema
• Local Internal Schema
• System Log

Date’s 12 Rules for Distributed Systems
Rule 0. TO THE USER, A DISTRIBUTED SYSTEM SHOULD LOOK EXACTLY
LIKE A NONDISTRIBUTED SYSTEM
1. Local autonomy
2. No reliance on a central site
3. Continuous operation
4. Location independence
5. Fragmentation independence
6. Replication independence
7. Distributed query processing
8. Distributed transaction management
9. Hardware independence
10. Operating system independence
11. Network independence
12. DBMS independence

Rule 1: Local Autonomy
Autonomy objective: Sites should be autonomous to the maximum extent possible
• Local data is locally owned and managed, with local accountability
♦ security considerations
♦ integrity considerations
• Local operations remain purely local
• All operations at a given site are controlled by that site; no site X
should depend on some other site Y for its successful functioning
• In some situations some slight loss of autonomy is inevitable
♦ fragmentation problem - Rule 5
♦ replication problem - Rule 6
♦ update of replicated relation - Rule 6
♦ multiple-site integrity constraint problem - Rule 7
♦ a problem of participation in a two-phase commit process - Rule 8

Rule 2: No Reliance on a Central Site
There must not be any reliance on a central "master" site for some central
service, such as centralized query processing or centralized transaction
management, such that the entire system is dependent on that central site
• Reliance on a central site would be undesirable for at least the following
two reasons:
♦ that central site might be a bottleneck
♦ the system would be vulnerable: if the central site went down, the whole system would go down
• In a distributed system, therefore, the following functions (among others)
must all be distributed:
♦ Dictionary management
♦ Query processing
♦ Concurrency control
♦ Recovery control

Rule 3: Continuous Operation
There should ideally never be any need for a planned entire system shutdown
• Incorporating a new site X into an existing distributed system D should
not bring the entire system to a halt
• Incorporating a new site X into an existing distributed system D should
not require any changes to existing user programs or terminal activities
• Removing an existing site X from the distributed system should not
cause any unnecessary interruptions in service
• Within the distributed system, it should be possible to create and
destroy fragments and replicas of fragments dynamically
• It should be possible to upgrade the DBMS at any given component site
to a newer release without taking the entire system down

Rule 4: Location Independence (Transparency)
Users should not have to know where data is physically stored, but rather
should be able to behave - at least from a logical standpoint - as if the data
was all stored at their own local site
• Simplifies user programs and terminal activities
• Allows data to migrate from site to site
• It is easier to provide location independence for simple retrieval
operations than it is for update operations
• Distributed data naming scheme and corresponding support from
the dictionary subsystem
• User naming scheme
♦ User U has to have a valid logon ID at each of multiple sites to operate
♦ User profile for each valid logon ID in the dictionary
♦ Granting of access privileges at each component site

Rule 5: Fragmentation Independence (Transparency)
• A distributed system supports data fragmentation if a given relation can
be divided up into pieces or "fragments" for physical storage purposes
• A system that supports data fragmentation should also support
fragmentation independence (also known as fragmentation transparency)
• Users should be able to behave (at least from a logical
standpoint) as if the data were in fact not fragmented at all
• Fragmentation is desirable for performance reasons
• Horizontal fragmentation: SELECT
• Vertical fragmentation: PROJECT
• Fragmentation must be defined within the context of a distributed
database
• Fragmentation independence (like location independence) is desirable
because it simplifies user programs and terminal activities
• Fragmentation independence implies that users should normally be
presented with a view of the data in which the fragments are logically
combined together by means of suitable joins and unions

Rule 5: An Example of Fragmentation
User Perception: Employee
EMP #   DEPT #   SALARY
E1      DX       45K
E2      DY       40K
E3      DZ       50K
E4      DY       63K
E5      DZ       40K

New York fragment (physical storage, New York)
EMP #   DEPT #   SALARY
E1      DX       45K
E3      DZ       50K
E5      DZ       40K

London fragment (physical storage, London)
EMP #   DEPT #   SALARY
E2      DY       40K
E4      DY       63K
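
The Employee example can be replayed in a few lines to show how fragments are produced and then recombined. This is a sketch under stated assumptions: the fragmentation predicates (departments DX and DZ in New York, DY in London) are inferred from the example rows rather than stated in the slides, salaries are written in thousands, and the vertical split is an extra illustration that does not appear in the example.

```python
EMPLOYEE = [
    {"emp": "E1", "dept": "DX", "salary": 45},
    {"emp": "E2", "dept": "DY", "salary": 40},
    {"emp": "E3", "dept": "DZ", "salary": 50},
    {"emp": "E4", "dept": "DY", "salary": 63},
    {"emp": "E5", "dept": "DZ", "salary": 40},
]


def select(rows, predicate):
    # Relational SELECT: keep whole rows that satisfy the predicate.
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    # Relational PROJECT: keep only the named columns of every row.
    return [{c: row[c] for c in columns} for row in rows]


# Horizontal fragmentation (assumed predicates), rebuilt with a union.
new_york = select(EMPLOYEE, lambda r: r["dept"] in ("DX", "DZ"))
london = select(EMPLOYEE, lambda r: r["dept"] == "DY")
rebuilt_from_rows = sorted(new_york + london, key=lambda r: r["emp"])

# Vertical fragmentation: each fragment keeps the key, rebuilt with a join.
dept_part = project(EMPLOYEE, ["emp", "dept"])
salary_part = project(EMPLOYEE, ["emp", "salary"])
salary_by_emp = {r["emp"]: r["salary"] for r in salary_part}
rebuilt_from_columns = [
    {**r, "salary": salary_by_emp[r["emp"]]} for r in dept_part
]
```

Both reconstructions give back the relation the user perceives, which is what fragmentation independence asks the system to do on the user's behalf.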

Rule 6: Replication Independence (Transparency)
Users should be able to behave as if the data were in fact
not replicated at all
• A distributed system supports data replication if a given relation (more
generally, a given fragment of a relation) can be represented at the
physical level by many distinct stored copies or replicas, at many distinct
sites.
• Replication, like fragmentation, should be “transparent to the user”
• Replication is desirable for at least two reasons:
♦ Performance
♦ Availability
• Update propagation problem
• Replication independence (like location and fragmentation independence)
is desirable because it simplifies user programs and terminal activities
• Snapshots
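
The two benefits and the update propagation problem can be seen in a small model. The sketch assumes a simple read-one, write-all policy, which is only one of several possible schemes and is not prescribed by the slides; `ReplicatedItem` and its fields are hypothetical names.

```python
class ReplicatedItem:
    """One data item stored as a copy at each of several sites."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}
        self.down = set()   # sites that are currently unreachable

    def read(self, preferred_site):
        # Performance: use the preferred (local) copy when it is reachable.
        # Availability: otherwise fall back to any other reachable copy.
        for site in [preferred_site] + sorted(self.copies):
            if site in self.copies and site not in self.down:
                return site, self.copies[site]
        raise RuntimeError("no replica is reachable")

    def write(self, value):
        # Update propagation: under write-all, every copy must be updated,
        # so a single unreachable site blocks the update.
        if self.down:
            raise RuntimeError("cannot update: a replica is unreachable")
        for site in self.copies:
            self.copies[site] = value


salary = ReplicatedItem(["London", "New York"], 45)
salary.write(48)                  # both copies are updated
salary.down.add("London")         # London becomes unreachable
site, value = salary.read("London")
```

Reads survive the loss of London, but writes do not, which is the trade-off behind the update propagation problem.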

Rule 6: Example of Data Replication
User Perception: Employee
EMP #   DEPT #   SALARY
E1      DX       45K
E2      DY       40K
E3      DZ       50K
E4      DY       63K
E5      DZ       40K

Physical storage, New York
New York fragment
EMP #   DEPT #   SALARY
E1      DX       45K
E3      DZ       50K
E5      DZ       40K
Replica of London fragment
EMP #   DEPT #   SALARY
E2      DY       40K
E4      DY       63K

Physical storage, London
London fragment
EMP #   DEPT #   SALARY
E2      DY       40K
E4      DY       63K
Replica of New York fragment
EMP #   DEPT #   SALARY
E1      DX       45K
E3      DZ       50K
E5      DZ       40K

Rule 7: Distributed Query Processing
It is crucially important for distributed database systems to choose a
good strategy for distributed query processing
• Query processing in a distributed system involves
♦ local CPU and I/O activity at several distinct sites
♦ some amount of data communication among those sites
• Amount of data communication is a major performance factor
[Figure: Query Q over relation Ry (300 records) and relation Rz (2M records)]
• Query compilation ahead of time
• Views that span multiple sites
• Integrity constraints within a DDBS that span multiple
sites
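
A rough calculation shows why the amount of data communication dominates. The sketch assumes that Ry and Rz are stored at different sites and that the query needs both relations at one site; the record size, line speed and per-message delay are invented for illustration and do not come from the slides.

```python
def transfer_seconds(records, bits_per_record=200, bits_per_second=50_000,
                     messages=1, delay_per_message=0.1):
    # Assumed cost model: fixed delay per message plus raw transmission time.
    transmission = records * bits_per_record / bits_per_second
    return messages * delay_per_message + transmission


RY_RECORDS = 300          # from the figure
RZ_RECORDS = 2_000_000    # from the figure

ship_ry_to_rz_site = transfer_seconds(RY_RECORDS)
ship_rz_to_ry_site = transfer_seconds(RZ_RECORDS)
```

Under these assumed numbers, moving the small relation takes on the order of a second while moving the large one takes over two hours, so the choice of strategy matters far more than local processing cost.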

Rule 8: Distributed Transaction Management
Two major aspects of transaction management, recovery control
and concurrency control, require extended treatment in the
distributed environment
• In a distributed system, a single transaction can involve the execution of code at
multiple sites and can thus involve updates at multiple sites
• Each transaction is therefore said to consist of multiple "agents," where an agent
is the process performed on behalf of a given transaction at a given site
[Figure: Global deadlock: neither site can detect it using only information that is internal to that site]
Site X: T1x holds lock Lx; T2x waits for T1x to release Lx
Site Y: T2y holds lock Ly; T1y waits for T2y to release Ly
Across sites: T1x waits for T1y to complete; T2y waits for T2x to complete
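
The deadlock in the figure can be checked mechanically by treating each "waits for" relationship as an edge in a wait-for graph. The sketch below merges the edges in one place purely for illustration; a real system would need a distributed detection scheme, and the function name `has_cycle` is an assumption.

```python
def has_cycle(edges):
    """Return True if the wait-for graph given as (waiter, holder) pairs has a cycle."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, []).append(holder)
        graph.setdefault(holder, [])

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True        # back edge: a cycle, hence a deadlock
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


site_x = [("T2x", "T1x")]                      # T2x waits for T1x to release Lx
site_y = [("T1y", "T2y")]                      # T1y waits for T2y to release Ly
cross_site = [("T1x", "T1y"), ("T2y", "T2x")]  # agents waiting for remote agents

local_x = has_cycle(site_x)
local_y = has_cycle(site_y)
global_view = has_cycle(site_x + site_y + cross_site)
```

Neither local graph contains a cycle; the cycle T2x, T1x, T1y, T2y, T2x appears only once the cross-site edges are included.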

Rule 9: Hardware Independence (Transparency)
Users should be presented with the “single-system image” regardless
of any particular hardware platform
• It is desirable to be able to run the same DBMS on different hardware systems
• It is desirable to have those different hardware systems all participate as equal
partners (where appropriate) in a distributed system
• The strict homogeneity assumption is not relaxed; it is still assumed that the same
DBMS is running on all those different hardware systems

Rule 10: Operating System Independence
It is obviously desirable, not only to be able to run the same DBMS
on different hardware systems, but also to be able to run it on
different operating systems - even different operating systems on
the same hardware
• From a commercial point of view, the most important operating system
environments, and hence the ones that (at a minimum) the DBMS
should support, are probably MVS/XA, MVS/ESA, VM/CMS, VAX/VMS,
UNIX (various flavors), OS/2, MS-DOS, Windows

Rule 11: Network Independence
It is obviously desirable to be able to support a variety of
disparate communication networks
• From the point of view of the distributed DBMS, the network is merely the
provider of a reliable message transmission service
• By "reliable" here is meant that, if the network accepts a message from
site X for delivery to site Y, then it will eventually deliver that message to
site Y
• Messages will not be garbled, will not be delivered more than once, and
will be delivered in the order sent
• The network should also be responsible for site authentication
• Ideally the system should support both local area networks and wide-area networks
• A distributed system should support a variety of different network
architectures

Rule 12: DBMS Independence
An ideal distributed system should provide DBMS
independence (or transparency)
[Figure: an INGRES user issues INGRES (SQL) requests to INGRES/STAR, which reaches an INGRES database directly and an ORACLE database through a gateway using ORACLE (SQL); the whole is labelled "distributed INGRES database"]

Distributed Database Systems: Conclusions
• Distributed computing systems: concepts and terminology
• Distributed OS and architectural models of DS
• Architecture of a distributed database system
• Date’s Rules for DDBS
• Tradeoffs in distributing the database
♦ Advantages
♦ Disadvantages
• Problems of distributed database systems

Distributed Database Systems: Keywords
• data distribution
• data replication
• global failure
• DBMS catalog
• global conceptual schema
• site crash
• external schema
• local conceptual schema
• local internal schema
• Rule 0
• local autonomy
• no reliance on central site
• continuous operation
• location transparency
• fragmentation independence
• replication independence
• distributed transaction manager
• hardware independence
• OS independence
• network independence
• DBMS independence
• distributed locking
• distributed commitment
• recovery of nodes

What’s next? Client/Server Database Systems
• Client/Server Distributed Computing
• Client/Server advantages/disadvantages
• Oracle server concepts