Database Concepts

DISTRIBUTED DATABASES AND DDBMS
Learning Objectives
• Understand the concept of “Distributed Data”
• Describe various Distributed Data and DDBMS implementations
• Explain how database design affects the DDBMS environment
• Apply DDBMS principles to solve problems
Definitions

Distributed Database: A single logical database that is spread physically across computers in multiple locations connected by a data communications link.

Decentralized Database: A collection of independent databases on non-networked computers.

They are not the same thing!
What are we talking about here?
Key Questions:
• Are components of the application in more than one place?
• Are the data in more than one place?
• Does the app use more than one DBMS or “system” for data management?
• Which facets, if any, are transparent to users?
Why distribute your app or data?
• It’s hard.
• It’s complex.
• So why do it?
• Scalability.
• Redundancy.
Application Complexity

Monolithic
• Everything works / is contained within one computer.
• Ex. MS Word

Distributed
• Various working pieces are in different physical places, working over a computer network.
• Ex. Google Docs
Data Distribution

Single Site Data (Simple)
• All data stored in / retrieved from one place on a network.
• Ex. WordPress

Multi-Site Data (Complex)
• Various parts of the data come from various sites on a network.
• Ex. My Slice, DNS
Data Complexity

Homogeneous (Easier)
• All data associated with the application is stored in the same DBMS
• Ex. WordPress

Heterogeneous (More Difficult)
• Various data components of the application are stored in different DBMSes
• Ex. SU Blackboard, Facebook
Multisite Data DBMS Options
• Horizontal Partitioning – Distributing data by row
• Vertical Partitioning – Distributing data by table or column
• Replication – Copying data either on a schedule or in real time
Summary: The taxonomy
• App: Monolithic or Distributed
• Data: Single Site or Multi Site
• Multi Site DBMS: Homogeneous or Heterogeneous
• Multi Site layout: Replicated, Horizontally Partitioned, or Vertically Partitioned
Homogeneous == Same DBMS

User’s View of Db
CRM Db
•Customers
•Sales Staff
•Orders

Actual Implementation
N. America (Oracle) and Europe (Oracle): the same DBMS at both sites
•Customers
•Sales Staff
•Orders
Heterogeneous == Multiple DBMS

User’s View of Db
CRM Db
•Customers
•Sales Staff
•Orders

Actual Implementation
Sites: N. America, Europe
Data stores: Oracle, MySQL, File System
•Customers
•Sales Staff
•Orders
•Orders Invoices
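
The heterogeneous case can be illustrated with a rough sketch in which one user-facing view is assembled from two different kinds of storage. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for a relational DBMS such as Oracle or MySQL, a temporary directory of JSON files stands in for the file system, and the names `customer_view` and `invoice_dir` are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Store 1 (assumed): a relational DBMS holding customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?)", (1, "Acme"))

# Store 2 (assumed): invoices kept as one JSON file per customer.
invoice_dir = Path(tempfile.mkdtemp())
(invoice_dir / "1.json").write_text(json.dumps([{"invoice": 1001, "total": 250}]))


def customer_view(customer_id):
    """Return one combined record, hiding which store each part came from."""
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    path = invoice_dir / f"{customer_id}.json"
    invoices = json.loads(path.read_text()) if path.exists() else []
    return {"id": row[0], "name": row[1], "invoices": invoices}


record = customer_view(1)
```

The caller sees a single CRM record; the extra difficulty of the heterogeneous case lives in the glue code that has to speak to each store in its own way.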
Example of Replication

User’s View of Db
CRM Db
•Customers
•Sales Staff
•Orders

Actual Implementation
N. America (Master)
•All Customers
•All Sales Staff
•All Orders
Europe (Replica)
•All Customers
•All Sales Staff
•All Orders
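
A minimal sketch of the master/replica idea, contrasting real-time copying with copying on a schedule. The classes `Site` and `Master` and the `sync` method are hypothetical names for illustration, and plain dictionaries stand in for real database tables.

```python
class Site:
    """One site holding a full copy of the data (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # primary key -> row


class Master(Site):
    """Accepts writes and copies them to its replicas."""

    def __init__(self, name, replicas, real_time=True):
        super().__init__(name)
        self.replicas = replicas
        self.real_time = real_time
        self.pending = []  # changes not yet copied to the replicas

    def write(self, key, row):
        self.rows[key] = row
        if self.real_time:
            self._copy(key, row)  # replicate immediately
        else:
            self.pending.append((key, row))  # replicate at the next sync

    def sync(self):
        """Scheduled replication: push every queued change."""
        for key, row in self.pending:
            self._copy(key, row)
        self.pending.clear()

    def _copy(self, key, row):
        for replica in self.replicas:
            replica.rows[key] = dict(row)


europe = Site("Europe")
north_america = Master("N. America", [europe], real_time=False)
north_america.write(1, {"customer": "Acme"})
stale_before_sync = 1 not in europe.rows  # replica lags until the schedule runs
north_america.sync()
```

With `real_time=False` the replica is briefly out of date, which is the trade-off scheduled replication accepts in exchange for less network traffic per write.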
Example of Horizontal Partitioning

User’s View of Db
CRM Db
•Customers
•Sales Staff
•Orders

Actual Implementation
N. America
•NA Customers
•NA Sales Staff
•NA Orders
Europe
•E Customers
•E Sales Staff
•E Orders
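
Horizontal partitioning can be sketched as routing each row to a site based on one of its values. The sample rows, the `region` column, and the function names below are assumptions made for illustration.

```python
customers = [
    {"id": 1, "name": "Acme", "region": "NA"},
    {"id": 2, "name": "Globex", "region": "E"},
    {"id": 3, "name": "Initech", "region": "NA"},
]


def partition_by_region(rows):
    """Split one logical table into per-site row sets."""
    sites = {}
    for row in rows:
        sites.setdefault(row["region"], []).append(row)
    return sites


def all_customers(sites):
    """User's view: the union of every site's rows."""
    merged = [row for rows in sites.values() for row in rows]
    return sorted(merged, key=lambda r: r["id"])


sites = partition_by_region(customers)
na_ids = [row["id"] for row in sites["NA"]]
```

Every site has the same columns but a different subset of rows, and the user's view is rebuilt as the union of the partitions.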
Example of Vertical Partitioning

User’s View of Db
ERP System
•Financials
•Customer Service
•Prod. Support
•Human Resources

Actual Implementation
N. America
•Financials
•Human Resources
Europe
•Customer Service
•Prod. Support
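
Vertical partitioning can be sketched two ways: by table (each table lives at one site, as in the ERP example) and by column (one table is split and rejoined on its key). The table names mirror the example above; the `employees` rows and the helper names are hypothetical.

```python
# By table: a catalog that says which site owns each table.
TABLE_SITE = {
    "financials": "N. America",
    "human_resources": "N. America",
    "customer_service": "Europe",
    "prod_support": "Europe",
}

# By column: split one table into two narrower ones that share the key.
employees = [
    {"id": 1, "name": "Ada", "salary": 100},
    {"id": 2, "name": "Grace", "salary": 120},
]


def split_columns(rows, key, left_cols):
    """Put `left_cols` in one partition and all other columns in the other."""
    left, right = [], []
    for row in rows:
        left.append({c: row[c] for c in (key, *left_cols)})
        right.append({c: v for c, v in row.items() if c == key or c not in left_cols})
    return left, right


def rejoin(left, right, key):
    """User's view: join the two partitions back together on the key."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]


names, salaries = split_columns(employees, "id", ["name"])
rebuilt = rejoin(names, salaries, "id")
```

Both partitions must carry the key column, otherwise the original rows could not be reassembled.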
5 Typical Distributed Databases
• Centralized with Single Site Data
• Replicated with Snapshots (on demand, or on a schedule)
• Replicated with Synchronization (in real time)
• Integrated Partitions (partitioning within a data center)
• Independent Partitions (geographically distributed partitioning)
Transparency

Location Transparency
• User/application does not need to know where data resides

Replication Transparency
• User/application does not need to know about duplication of data

Failure Transparency
• Either all or none of the actions of a transaction are committed

Transparency is difficult but important. The greater the distribution of data, the greater the need for transparency to offset the complexity.
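
Location and failure transparency can be sketched with a small in-memory model. This is an illustration under stated assumptions, not how a real DDBMS works: `Site`, `DistributedDb`, and `SiteDown` are hypothetical names, and the check-then-apply step is a much simpler stand-in for a real commit protocol such as two-phase commit.

```python
class SiteDown(Exception):
    """Raised when a site cannot take part in an operation."""


class Site:
    def __init__(self, name):
        self.name = name
        self.tables = {}  # table name -> list of rows
        self.up = True

    def check(self):
        if not self.up:
            raise SiteDown(self.name)


class DistributedDb:
    def __init__(self, catalog):
        self.catalog = catalog  # table name -> Site that stores it

    def read(self, table):
        # Location transparency: the caller names a table, never a site.
        site = self.catalog[table]
        site.check()
        return list(site.tables.get(table, []))

    def transaction(self, writes):
        """Apply every (table, row) write, or none of them."""
        # Step 1: make sure every site involved is available.
        for table, _ in writes:
            self.catalog[table].check()
        # Step 2: only then apply the writes.
        for table, row in writes:
            self.catalog[table].tables.setdefault(table, []).append(row)


north_america, europe = Site("N. America"), Site("Europe")
db = DistributedDb({"orders": north_america, "customers": europe})
db.transaction([("orders", {"id": 1}), ("customers", {"id": 7})])

europe.up = False
try:
    db.transaction([("orders", {"id": 2}), ("customers", {"id": 8})])
    rolled_back = False
except SiteDown:
    rolled_back = True  # neither write was applied
```

Because the second transaction touches a site that is down, the order for `id` 2 is not written to N. America either, which is the all-or-nothing behavior failure transparency asks for.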
Applying The Concepts Via Example:
• Monolithic or Distributed?
• Single Site or Multi Site data?
• If multi-site:
   • H / V Partitioned or Replicated?
   • Homogeneous or Heterogeneous?
• Location Transparency?
• Replication Transparency?
• Failure Transparency?
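
The checklist above can be captured as a small record, which makes it easy to classify several systems side by side. This is only a sketch: the class and field names are hypothetical, and the sample values describe an imagined replicated CRM rather than any real system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class App(Enum):
    MONOLITHIC = "monolithic"
    DISTRIBUTED = "distributed"


class Layout(Enum):
    H_PARTITIONED = "horizontally partitioned"
    V_PARTITIONED = "vertically partitioned"
    REPLICATED = "replicated"


@dataclass
class Classification:
    app: App
    multi_site: bool
    layout: Optional[Layout] = None  # only meaningful for multi-site data
    heterogeneous: bool = False
    location_transparent: bool = False
    replication_transparent: bool = False
    failure_transparent: bool = False

    def __post_init__(self):
        # The partitioning/replication question only applies to multi-site data.
        if self.multi_site and self.layout is None:
            raise ValueError("multi-site data needs a layout")
        if not self.multi_site and self.layout is not None:
            raise ValueError("single-site data has no layout")


crm = Classification(
    app=App.DISTRIBUTED,
    multi_site=True,
    layout=Layout.REPLICATED,
    location_transparent=True,
    replication_transparent=True,
)
```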
DISTRIBUTED DATABASE AND DDBMS
Questions?