Database design 3

advertisement
Entity Relationship Approach
 Top-down
approach to data modeling
 Uses diagrams
 Normalization - confirms technical
soundness
 Entity Relationship - reflects business
requirements
1
Using Diagrams
 Define
a procedure for representing existing
models
 Develop diagrams directly from business
requirements
 Business-oriented terminology based on...
– Entities (things of interest to business)
– Relationships between entities
2
Hospital Example
 Graphic
2-13
 Format does not encourage a quick
appreciation of the main concepts and rules
– e.g. Each operation can be performed by only
one surgeon but this is not immediately
apparent
 This
is a very simple model, as we progress
to more complex models the problem of
presentation becomes more serious
3
Data Model Diagram
 Sometimes
called data structure diagram,
they are based on two symbols
– A "box" represents a table (a rounded rectangle
is usually used)
– A line drawn between two boxes represents a
foreign key pointing back to the table that has
the primary key
4
Existing Tables
 Simply
draw a box for every table in the
model with the name of it inside
 Graphic
3-2
5
Existing Relationships
 Draw
a line between the two boxes and
indicate the direction of the link by putting
a "crows foot" at the foreign key end
– Think of the crows foot as an arrow pointing
back to the table with the primary key
 Graphic
3-3
6
Interpreting the Diagram
 The
model specifies a Surgeon table (data
about surgeons)
 The model specifies an Operation table
(data about operations)
 Each Operation can be associated with only
one surgeon
 Each Surgeon can be associated with many
Operations
7
Summarizing Relationships
 This
represents the relationship between
tables (implied by the primary and foreign
keys) without having to list any column
names at all
8
Asking Questions
 We
could now ask the business specialist
– Is it true that each operation is performed by
one surgeon only?
– Can we count on this in the future?
 We
can make changes now (in the model)
while the cost is still low
 Question: If more than one Surgeon is
allowed, how would we handle it?
9
Another Solution
 We
could choose to track only the surgeon
who managed the operation
 Record decisions like this in the diagram
– Avoids the question being revisited
– Specifies precisely what data will be held
 Cannot
answer question - "In how many
operations did surgeon number 12 at
hospital number 18 participate?"
10
Annotated Relationship
 Graphic
3-4
 As
well as annotating the diagram, we
should change the column name from
Surgeon Number to Managing Surgeon
Number
11
Optionality
 What
if there are no surgeons required for
an operation?
– e.g. a small cut with antibiotics given
 Operation
table may have a NULL value for
Surgeon Number
 To show the difference between optional or
mandatory relationships use the convention
shown in Graphic 3-5
12
Exercise
 Draw
the relationships for the rest of the
model
13
First Pass
 Graphic
3-6
 Checking
the diagram will often reveal
unsound assumptions and
misunderstandings
 Conversely, it may increase confidence in
the model for both user and designer
14
Operation and Operation Type
 Are
we sure that each operation can be of
only one type?
 How would we represent a gall bladder
removal and appendectomy operation?
– 1. Allow only "simple" operations - need a new
table for repeating group
– 2. Allow complex operation types such as
"Combined Gall Bladder removal and
Appendectomy"
15
Both will Work
 Option
2 will be easier to implement
 Option 1 will be more elegant
– Can ask "list all operations that involved
appendectomies"
16
Redundant Lines
 Note
the Lines connecting Hospital,
Operation, and Surgeon tables
– line from Hospital to Surgeon
– line from Surgeon to Operation
– line from Operation to Hospital
» Does this line add anything to our knowledge of the
business rules?
17
Rules for Removing Lines
 Graphic
3-7
 If A derives B and B derives C than an A
derives C connecting line is not needed
– If B is optional, than we cannot remove the A
derives C line
– If A derives C provides different information
than A derives B we can not remove A derives
C
18
Exercise
 Remove
the remaining Redundant lines
from Graphic 3-7
19
Solution
 Graphic
3-8
20
Top-Down Approach
 Why
Normalize at all? Why not just use
this Top-down E-R approach?
 In practice this is what is most often done,
with Normalization being a final check
 Using E-R first...
– we can start with "What data do we need to
keep information about
– no need to start with a single overly complex
table
21
Terminology
 The
Relational Models (used with
Normalization) were built on three basic
concepts: tables, columns, and keys
 The E-R Models: Entities, Attributes, and
Relationships
 Easier to say, "The relationship between a
hospital and surgeon", then "the existence
of the primary key of Hospital as a foreign
key in the Surgeon table"
22
Entities
 An
entity is the "real world" class of things
that a table represents
– entities: St. John's Hospital
– entity type: Hospital
 In
practice: Entity means Entity type and
Entity Instance (row) for Entity
23
Entities - continued
 Some
things will need to be represented by
more than one entities
– e.g. Invoices would be represented by two
entities: Invoice Header and Invoice Item
– e.g. Quarterly Profit would be derived from
sales and expense figures from other entities
24
Multiple "Things" as One Entity
 Example...
– Preferred Customer
– Corporate Customer
– Becomes a Customer Type with a Customers
Table
25
Entity Naming
 The
name of an entity must be in the
singular. (Exactly the opposite of what it
says in the SQL book!)
– e.g. Account instead of Accounts
 Three
reasons:
– Consistency - standards
– Communication - an 'entity is something we
need to keep information about
– Compatibility with Relationship Name - (we'll
look at this later)
26
Relationships
 The
lines between the boxes are the
relationships between the entities
 We name relationships in both directions
– "Each company may issue one or more shares"
– "Each share must be issued by one company"
 Graphic
3-9 show notation were using
 Graphic 3-10 shows alternative
diagramming notations
27
Exercise
 Draw
the E-R model that would represent
the relationship between a Manager and
Departments within his company. A
Manager may be in charge of more than one
department
28
Solution and More Examples
 Graphic
3-11
 Graphic 3-12
29
Suggested Diagramming Tips
 Orient
your diagram so that the Crows feet
are nearer to the bottom of the page
 Place Crow's feet on the right
 Eliminate crossing lines where possible (but
clarity is most important
 Duplicate Entities to avoid long relationship
lines
– Can use a dotted-line to connect the same entity
30
Many-to-Many Relationships
 Many-to-Many
Relationships are frequently
modeled
 Graphic 3-13
 How would we implement the relationship
using foreign keys?
31
We Can't
 We
can't hold the key to Qualification in the
Employee table because an Employee could
have several qualifications
 Like wise the Qualification table would
need to support multiple Employee Keys
 A Normalized model cannot represent
many-to-many relationships with foreign
keys, yet such relationships certainly exist
32
Representing Many-to-Many
 Can't
use foreign keys
 Can use a table
33
Normalized Many-to-Many
 Graphic
3-16
 Whenever we encounter a many-to-many
relationship between two entities we
represent it with a new entity linked to the
two original entities.
 The Primary keys of each original table
become together become the Primary Key
of the new entity ("the resolution entity")
34
Choice of Representation
 Conceptual
vs. Physical Model
 Conversion is not totally mechanical
– New Non-key attributes in the resolution
entity?
– Different Name for the resolution entity?
» e.g. instead of Employee-Qualification table we
might use "Certifications" table
35
One-to-One Relationships
 Should
not automatically combine the
entities into a single entity
 When to split
– distinct real-world concepts (e.g. PersonPassport)
– Separating attribute groups (e.g. Detail vs.
default)
– Transferables (e.g. Part type -stored in-Bin)
36
Self-Referencing Relationships
 Graphic
3-18
 Each employee may manage one or more
employees
 Each employee may be managed by one
employee
– carry the foreign key in the same table as the
Primary key
– e.g. Manager Id -> Employee Id
37
Attributes
 Sometimes
show a few attributes to clarify
meaning (e.g. Primary, Foreign keys)
 Don't show all the attributes, because we
want the big picture
 Keep in separate lists for each entity
38
More Normalization
 If
in the process of listing attributes, we
find repeating groups or lookup tables
– Normalize the design
– Update the E-R model
39
Summary
 Data
models can be presented
diagrammatically by using a box to
represent each table and a line for each
foreign key relationship
 This provides a language for "top-down"
data models; prior to developing attributes
 Many-to-Many relationships are resolved
with a "resolution entity"
40
Last Slide - Entity Relationship
 Assignment
#10 due next week
41
Download