CMSC424: Database Design Instructor: Amol Deshpande CMSC424, Spring 2005

CMSC424: Database Design
Instructor: Amol Deshpande
amol@cs.umd.edu
CMSC424, Spring 2005
Data Modeling
• Goals:
• Conceptual representation of the data
• “Reality” meets “bits and bytes”
• Must make sense, and be usable by other people
• Today:
• Entity-relationship Model
• Relational Model
Motivation
• You’ve just been hired by Bank of America as their DBA for their online banking web site.
• You are asked to create a database that monitors:
• customers
• accounts
• loans
• branches
• transactions, …
• Now what??!!!
Database Design Steps
Three Levels of Modeling
• info
• Conceptual DB design → Conceptual Data Model
• Entity-relationship Model: typically used for conceptual database design
• Logical DB design → Logical Data Model
• Relational Model: typically used for logical database design
• Physical DB design → Physical Data Model
Entity-Relationship Model
• Two key concepts
• Entities:
• An object that exists and is distinguishable from other objects
• Examples: Bob Smith, BofA, CMSC424
• Have attributes (people have names and addresses)
• Form entity sets with other entities of the same type that share the same properties
• Set of all people, set of all classes
• Entity sets may overlap
• Customers and Employees
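A minimal Python sketch of entities and entity sets may make the idea concrete. The `Person` class, the second person, and the address values are assumptions made up for illustration; only Bob Smith comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """An entity: distinguishable from other objects and described by attributes."""
    name: str
    address: str


# Two entities (the address values and Alice Jones are hypothetical).
bob = Person("Bob Smith", "College Park")
alice = Person("Alice Jones", "Greenbelt")

# Entity sets: collections of entities of the same type.
customers = {bob, alice}
employees = {bob}

# Entity sets may overlap: Bob is both a customer and an employee.
both = customers & employees
```

Here `both` contains only `bob`, which mirrors the Customers and Employees overlap mentioned above.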
Entity-Relationship Model
• Two key concepts
• Relationships:
• Relate 2 or more entities
• E.g. Bob Smith has account at College Park Branch
• Form relationship sets with other relationships of the same type that share the same properties
• Customers have accounts at Branches
• Can have attributes:
• “has account at” may have an attribute start-date
• Can involve more than 2 entities
• Employee works at Branch at Job
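One way to picture relationship sets is as sets of tuples, one tuple per relationship. The sketch below is only an illustration: the class names, the date value, the employee "Jane Doe", and the job "teller" are assumptions, not part of the slides.

```python
from typing import NamedTuple


class HasAccountAt(NamedTuple):
    """A binary relationship between a customer and a branch."""
    customer: str
    branch: str
    start_date: str  # attribute of the relationship itself (value is hypothetical)


class WorksAt(NamedTuple):
    """A relationship involving more than 2 entities."""
    employee: str
    branch: str
    job: str


# Relationship sets: all relationships of the same type.
has_account_at = {HasAccountAt("Bob Smith", "College Park Branch", "2005-01-15")}
works_at = {WorksAt("Jane Doe", "College Park Branch", "teller")}


def branches_of(customer: str) -> set:
    """Return the branches at which the given customer has an account."""
    return {r.branch for r in has_account_at if r.customer == customer}
```

Note that `start_date` lives on the relationship, not on the customer or the branch.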
ER Diagram: Starting Example
[ER diagram: entity set customer (cust-id, cust-name, cust-street, cust-city) connected through relationship set has (access-date) to entity set account (number, balance)]
• Rectangles: entity sets
• Diamonds: relationship sets
• Ellipses: attributes
Rest of the class
• Details of the ER Model
• How to represent various types of constraints/semantic information etc.
• Design issues
• A detailed example
Next: Relationship Cardinalities
• We may know:
• One customer can only open one account, OR
• One customer can open multiple accounts
• Representing this is important
• Why?
• Better manipulation of data
• Can enforce such a constraint
• Remember: if it is not represented in the conceptual model, the domain knowledge may be lost
Mapping Cardinalities
• Express the number of entities to which another entity can be associated via a relationship set
• Most useful in describing binary relationship sets
Mapping Cardinalities
• One-to-One [diagram: customer, has, account]
• One-to-Many [diagram: customer, has, account]
• Many-to-One [diagram: customer, has, account]
• Many-to-Many [diagram: customer, has, account]
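The four cases can be told apart by counting, for a given set of (customer, account) pairs, how many partners each entity has. The sketch below is an assumption-laden illustration: the function name is made up, and it only reports what one particular instance exhibits, whereas a mapping cardinality in the ER model is a design-time constraint on all legal instances.

```python
from collections import Counter


def mapping_cardinality(pairs) -> str:
    """Classify the (customer, account) pairs of a binary relationship set."""
    pairs = set(pairs)
    per_customer = Counter(c for c, _ in pairs)
    per_account = Counter(a for _, a in pairs)
    # Some customer is related to several accounts.
    customer_has_many = any(n > 1 for n in per_customer.values())
    # Some account is related to several customers.
    account_has_many = any(n > 1 for n in per_account.values())
    if customer_has_many and account_has_many:
        return "many-to-many"
    if customer_has_many:
        return "one-to-many"   # one customer, many accounts
    if account_has_many:
        return "many-to-one"   # many customers, one account
    return "one-to-one"
```

For example, `mapping_cardinality({("c1", "a1"), ("c1", "a2")})` classifies the instance as one-to-many, since customer `c1` holds two accounts and no account is shared.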
Mapping Cardinalities
• N-ary relationships?
Next: Types of Attributes
• Simple vs Composite
• Single value per attribute?
• Single-valued vs Multi-valued
• E.g. phone numbers are multi-valued
• Derived
• If date-of-birth is present, age can be derived
• Can help in avoiding redundancy, enforcing constraints etc.
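The attribute types above map fairly naturally onto a small Python class. This is a sketch under assumptions: the class layout and the `age` method signature are illustrative choices, not something the slides prescribe.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Customer:
    cust_id: str                                      # simple, single-valued
    date_of_birth: date                               # stored; a date is itself made of year, month, day
    phone_numbers: set = field(default_factory=set)   # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth instead of being stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)
```

Because `age` is computed on demand, it cannot drift out of sync with `date_of_birth`, which is the redundancy the slide warns about.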
Types of Attributes
[ER diagram: entity set customer (cust-id, cust-name, cust-street, cust-city) connected through relationship set has (access-date) to entity set account (number, balance)]
Types of Attributes
[ER diagram: customer (cust-id, cust-name, cust-street, cust-city, date-of-birth, age, phone no.) connected through has (access-date) to account (number, balance)]
• multi-valued (double ellipse): phone no.
• derived (dashed ellipse): age
Types of Attributes
[ER diagram: customer (cust-id, cust-name, cust-street, cust-city, date-of-birth, age, phone no.) connected through has (access-date) to account (number, balance); a composite attribute is drawn with components month, day, year]
Composite Attribute
Next: Keys
• Key = set of attributes identifying individual entities or relationships
Entity Keys
Possible Keys:
• {cust-id}
• {cust-name, cust-city, cust-street}
• {cust-id, age}
• cust-name ?? Probably not.
Domain knowledge dependent !!
[ER diagram: customer with attributes cust-id, cust-name, cust-street, cust-city, date-of-birth, age, phone no.]
Entity Keys
• Superkey
• any attribute set that can distinguish entities
• Candidate key
• a minimal superkey
• Can’t remove any attribute and preserve key-ness
• {cust-id, age} is a superkey but not a candidate key (age can be removed)
• {cust-name, cust-city, cust-street} is a candidate key
• assuming cust-name is not unique
• Primary key
• Candidate key chosen as the key by the DBA
• Underlined in the ER Diagram
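The definitions can be checked mechanically against sample data. The sketch below is illustrative only: the function names and the sample rows are assumptions, and passing the check on one set of rows does not prove that an attribute set is a key, since keys depend on domain knowledge about all possible data.

```python
from itertools import combinations


def is_superkey(rows, attrs) -> bool:
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs) -> bool:
    """True if attrs is a superkey and dropping any one attribute breaks that."""
    if not is_superkey(rows, attrs):
        return False
    smaller = combinations(attrs, len(attrs) - 1)
    return not any(is_superkey(rows, sub) for sub in smaller)


# Hypothetical sample customers.
rows = [
    {"cust_id": "c1", "cust_name": "Bob Smith", "cust_city": "College Park", "age": 30},
    {"cust_id": "c2", "cust_name": "Bob Smith", "cust_city": "Greenbelt", "age": 30},
    {"cust_id": "c3", "cust_name": "Ann Lee", "cust_city": "Greenbelt", "age": 41},
]
```

On these rows, `("cust_id", "age")` passes `is_superkey` but fails `is_candidate_key`, because `("cust_id",)` alone already distinguishes every row. Checking only the subsets with one attribute removed is enough, since any superset of a superkey is also a superkey.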
Entity Keys
• {cust-id} is a natural primary key
• Typically, SSN forms a good primary key
• Try to use a candidate key that rarely changes
• e.g. something involving address is not a great idea
[ER diagram: customer with attributes cust-id, cust-name, cust-street, cust-city, date-of-birth, age, phone no.]
Relationship Set Keys
• What attributes are needed to represent a relationship completely and uniquely?
• Union of primary keys of the entities involved, and relationship attributes
[ER diagram: customer (cust-id), has (access-date), account (number)]
• {cust-id, access-date, account number} describes a relationship completely
Relationship Set Keys
• Is {cust-id, access-date, account number} a candidate key?
• No. Attribute access-date can be removed from this set without losing key-ness
• In fact, union of primary keys of associated entities is always a superkey
[ER diagram: customer (cust-id), has (access-date), account (number)]
Relationship Set Keys
• Is {cust-id, account-number} a candidate key?
• Depends
[ER diagram: customer (cust-id), has (access-date), account (number)]
• If one-to-one relationship, either {cust-id} or {account-number} is sufficient
• Since a given customer can only have one account, she can only participate in one relationship
• Ditto account
Relationship Set Keys
• Is {cust-id, account-number} a candidate key?
• Depends
[ER diagram: customer (cust-id), has (access-date), account (number), drawn as a one-to-many relationship]
• If one-to-many relationship (as shown), {account-number} is a candidate key
• A given customer can have many accounts, but at most one account holder per account is allowed
Relationship Set Keys
• General rule for binary relationships
• one-to-one: primary key of either entity set
• one-to-many: primary key of the entity set on the many side
• many-to-many: union of primary keys of the associated entity sets
• n-ary relationships
• More complicated rules
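The general rule for binary relationships can be written down as a small lookup. This is a sketch for the customer/account example only; the function name and the string labels are assumptions, and the many-to-one branch is simply the mirror image of the one-to-many rule.

```python
def relationship_key(cardinality: str, customer_key, account_key) -> list:
    """Return the candidate key(s) of the binary relationship set customer-has-account."""
    customer_key, account_key = set(customer_key), set(account_key)
    if cardinality == "one-to-one":
        return [customer_key, account_key]      # either primary key works
    if cardinality == "one-to-many":            # one customer, many accounts
        return [account_key]                    # key of the entity set on the many side
    if cardinality == "many-to-one":            # many customers, one account
        return [customer_key]
    if cardinality == "many-to-many":
        return [customer_key | account_key]     # union of both primary keys
    raise ValueError(f"unknown cardinality: {cardinality}")
```

For instance, `relationship_key("one-to-many", {"cust-id"}, {"account-number"})` yields `[{"account-number"}]`, matching the one-to-many case discussed for the has relationship.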
Next: Data Constraints
• Representing semantic data constraints
• We already saw constraints on relationship cardinalities
Participation Constraint
• Given an entity set E, and a relationship set R it participates in:
• If every entity in E participates in at least one relationship in R, the participation of E in R is total
• partial otherwise
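The definition translates directly into a subset test. In this sketch the function name, the tuple layout of the relationship set, and the sample ids are assumptions made for illustration.

```python
def participation(entity_set, relationship_set, position: int) -> str:
    """Classify the participation of entity_set in relationship_set.

    position is the index inside each relationship tuple at which
    entities of this entity set appear.
    """
    participating = {r[position] for r in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


# Hypothetical data: every customer has an account, but account a3 has no customer.
customers = {"c1", "c2"}
accounts = {"a1", "a2", "a3"}
has = {("c1", "a1"), ("c2", "a2")}
```

Here `participation(customers, has, 0)` is total, while `participation(accounts, has, 1)` is partial because `a3` appears in no relationship.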
Participation Constraint
[ER diagram: customer (cust-id, cust-name, cust-street, cust-city), has (access-date), account (number, balance), annotated “Total participation”]
Cardinality Constraints
How many relationships can an entity participate in?
[ER diagram: customer (cust-id), has (access-date), account (number), with the two edges labeled 0..* and 1..1]
• 0..*: minimum 0, maximum no limit
• 1..1: minimum 1, maximum 1
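A min..max constraint can be checked by counting how many relationships each entity takes part in. The sketch below assumes a tuple layout for relationships and a made-up function name; which edge of the diagram carries which label is not assumed here.

```python
from collections import Counter


def satisfies(entity_set, relationship_set, position: int,
              minimum: int, maximum=None) -> bool:
    """Check a minimum..maximum cardinality constraint.

    maximum=None means no limit (written * in the diagram).
    """
    counts = Counter(r[position] for r in relationship_set)
    for entity in entity_set:
        n = counts[entity]  # 0 for entities that never participate
        if n < minimum or (maximum is not None and n > maximum):
            return False
    return True


# Hypothetical data.
customers = {"c1", "c2"}
accounts = {"a1", "a2"}
has = {("c1", "a1"), ("c1", "a2")}
```

With this data, `satisfies(customers, has, 0, 0)` checks a 0..* constraint and `satisfies(accounts, has, 1, 1, 1)` checks a 1..1 constraint; both hold, and adding `("c2", "a1")` would break the second.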
Next: Recursive Relationships
• Sometimes a relationship associates an entity set to itself
Recursive Relationships
[ER diagram: entity set employee (emp-id, emp-name, emp-street, emp-city) connected twice to relationship set works-for, once in the role manager and once in the role worker]
Must be declared with roles
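Roles matter because both ends of a recursive relationship come from the same entity set, so position alone does not say who manages whom. A minimal sketch, with hypothetical employee ids and an assumed helper name:

```python
from typing import NamedTuple


class WorksFor(NamedTuple):
    """One works-for relationship; both fields hold emp-id values."""
    worker: str    # employee playing the worker role
    manager: str   # employee playing the manager role


employees = {"e1", "e2", "e3"}
works_for = {
    WorksFor(worker="e2", manager="e1"),
    WorksFor(worker="e3", manager="e1"),
}


def reports_of(manager: str) -> set:
    """Return the employees who work for the given manager."""
    return {r.worker for r in works_for if r.manager == manager}
```

Naming the fields `worker` and `manager` plays the part of the role labels on the two edges of the diagram.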
Next: Weak Entity Sets
• An entity set without enough attributes to have a primary key
• E.g. Transaction Entity
• Attributes:
• transaction-number, transaction-date, transaction-amount, transaction-type
• transaction-number: may not be unique across accounts
Weak Entity Sets
• A weak entity set must be associated with an identifying or owner entity set
• Account is the owner entity set for Transaction
Weak Entity Sets
Still need to be able to distinguish between different weak entities associated with the same strong entity
[ER diagram: account (number, balance) connected through has to weak entity set Transaction (trans-number, trans-date, trans-type, trans-amt)]
Weak Entity Sets
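Discriminator: a set of attributes that can be used to distinguish between weak entities associated with the same strong entity
[ER diagram: account (number, balance) connected through has to weak entity set Transaction (trans-number, trans-date, trans-type, trans-amt)]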
Weak Entity Sets
• Primary key:
• Primary key of the associated strong entity + discriminator attribute set
• For Transaction:
• {account-number, transaction-number}
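The composite key of a weak entity can be sketched as a tuple that pairs the owner's key with the discriminator. The class layout, account numbers, and amounts below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    account_number: str       # primary key of the owner (strong) entity
    transaction_number: int   # discriminator: unique only within one account
    amount: float

    def primary_key(self) -> tuple:
        """Owner's primary key plus the discriminator."""
        return (self.account_number, self.transaction_number)


# Hypothetical transactions: same transaction number, different accounts.
t1 = Transaction("A-101", 1, 50.0)
t2 = Transaction("A-102", 1, 75.0)
```

The two transactions share `transaction_number` 1, yet their primary keys differ because the account number is part of the key.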
Next: Specialization
• Consider entity person:
• Attributes: name, street, city
• Further classification:
• customer
• Additional attributes: customer-id, credit-rating
• employee
• Additional attributes: employee-id, salary
• Note similarities to object-oriented programming
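The object-oriented analogy can be sketched with class inheritance. The attribute names follow the slide; the sample values are hypothetical. One caveat of the analogy: a Python object has a single class, whereas in the ER model the same person may belong to both customer and employee.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):
    """Specialization of Person with additional attributes."""
    customer_id: str
    credit_rating: str


@dataclass
class Employee(Person):
    """Specialization of Person with additional attributes."""
    employee_id: str
    salary: float


# A customer inherits name, street, and city from Person (values are hypothetical).
c = Customer("Bob Smith", "Main St", "College Park", "C-1", "good")
```

`c.city` is available on the customer even though `city` is declared only on `Person`, which is the attribute inheritance that specialization provides.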
Specialization: Example
Finally: Aggregation
• Relationships between relationships are not allowed
• E.g.: Associate account officers with the “has account” relationship set
[ER diagram: customer, has, account; relationship set account officer connects to employee, and its link to has is marked ?]
Finally: Aggregation
• Associate an account officer with each account?
• What if different customers for the same account can have different account officers?
[ER diagram: customer, has, account; relationship set account officer connects to employee, and its other endpoint is marked ?]
Finally: Aggregation
• Solution: Aggregation
[ER diagram: customer, has, and account grouped into one aggregate, which relationship set account officer connects to employee]
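Aggregation can be pictured as treating each (customer, account) pair of the has relationship set as a single unit that other relationships may refer to. The ids below are hypothetical, and using a dictionary assumes at most one account officer per pair, which the slides do not state.

```python
# Relationship set "has": (cust-id, account number) pairs.
has = {("c1", "A-101"), ("c2", "A-101")}

# Aggregation: each pair in "has" acts as one unit, and the account-officer
# relationship maps that unit to an employee.
account_officer = {
    ("c1", "A-101"): "e7",
    ("c2", "A-101"): "e9",  # same account, different customer, different officer
}


def officer_for(cust_id: str, account_number: str):
    """Look up the officer assigned to one customer-account pair."""
    pair = (cust_id, account_number)
    if pair not in has:
        raise KeyError(f"{pair} is not in the has relationship set")
    return account_officer.get(pair)
```

Keying on the whole pair is what lets two customers of the same account have different officers, which a plain account-to-officer link could not express.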
More…
• Read Chapter 2 for:
• Specialization/Aggregation details
• Different types of specializations etc.
• Generalization: opposite of specialization
• Lower- and higher-level entities
• Attribute inheritance
• …
E/R Data Model
Design Issue #1: Entity Sets vs. Attributes
An Example: Employees can have multiple phones
(a) [ER diagram: Employee with attributes phone_no and phone_loc]
vs
(b) [ER diagram: Employee connected through relationship set Uses to entity set Phone (no, loc)]
To resolve, determine how phones are used
1. Can many employees share a phone? (If yes, then (b))
2. Can employees have multiple phones? (If yes, then (b), or (a) with multivalued attributes)
3. Else (a), perhaps with composite attributes [ER diagram: Employee with composite attribute phone (no, loc)]
E/R Data Model
Design Issue #2: Entity Sets vs. Relationship Sets
An Example: How to model bank loans
(a) [ER diagram: Customer (ssn, name) connected through relationship set Borrows to entity set Loan (lno, amt)]
vs
(b) [ER diagram: Customer (ssn, name) connected through relationship set Loans (lno, amt) to entity set Branch (bname, bcity)]
To resolve, determine how loans are issued
1. Can there be more than one customer per loan?
• If yes, then (a). With (b), loan info would have to be replicated for each customer (wasteful, potential update anomalies)
2. Is loan a noun or a verb?
• Both, but more of a noun to a bank (hence (a) is probably more appropriate)
E/R Data Model
Design Issue #3: N-ary vs Binary Relationship Sets
An Example: Works_At
Ternary:
[ER diagram: Employee, Dept, and Branch all connected to one relationship set Works_at]
(Joe, Moody, Acct) ∈ Works_At
vs
Binary:
[ER diagram: entity set WA connected to Employee through WAE, to Branch through WAB, and to Dept through WAD]
(Joe, w3) ∈ WAE
(Moody, w3) ∈ WAB
(Acct, w3) ∈ WAD
Choose n-ary when possible! (Avoids redundancy, update anomalies)
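The two encodings can be compared side by side in a short sketch. The tuples are the ones from the slide; the function name and the join logic are assumptions added for illustration.

```python
# Ternary relationship set: one tuple records the whole fact.
works_at = {("Joe", "Moody", "Acct")}

# Binary encoding: an artificial WA entity (w3) ties three binary relationships together.
wae = {("Joe", "w3")}
wab = {("Moody", "w3")}
wad = {("Acct", "w3")}


def rebuild_ternary(wae, wab, wad) -> set:
    """Join the three binary relationship sets on the shared WA entity."""
    return {
        (first, second, third)
        for first, w1 in wae
        for second, w2 in wab
        for third, w3 in wad
        if w1 == w2 == w3
    }
```

Recovering the original fact from the binary form takes a three-way join and an extra entity, while the ternary form stores it directly, which is one reason to prefer the n-ary version here.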
Example Design
• We will model a university database
• Main entities:
• Professor
• Projects
• Departments
• Graduate students
• etc…
[ER diagram: entity sets professor, project, grad, and dept. Attributes shown across the four entity sets: SSN, name, area, rank, proj-number, sponsor, start, budget, dept-no, office, homepage, age, degree]
[ER diagram: entity sets professor, project, grad, and dept with their attributes, now connected by relationship sets PI, Co-PI, Appt, Chair, Supervises, RA, Major, and Mentor (roles advisor and advisee), with a relationship attribute Time (%)]
And so on…
Summary
• Entity-relationship Model
• Intuitive diagram-based representation of domain knowledge, data properties etc.
• Two key concepts:
• Entities
• Relationships
• We also looked at:
• Relationship cardinalities
• Keys
• Participation Constraints
• …
Summary
• Details unimportant
• Key idea: We can represent many data properties and constraints conceptually using this
• Read Chapter 2
• Assignment will require you to do this anyway!