T81-490b
Systems Analysis and
Development Project
Database Design – Creating the
Physical Data Model – Part 1
T81-490b -- Database Design – Creating the Physical Data Model – Part 1 -- Class #12
Announcements
Change to the Schedule
Where we're going:
Assignment #6: Process Modeling - due next week
Talk about Physical Database tonight and next week
Exam #2
Assignment #7: Data Dictionary - will be assigned tonight
Assignment #8: Report and Screen Design
5 more class nights
Tonight's Topic:
Going from the Conceptual/Logical Model to the Physical Model
Make Assignment #7 tonight
Take Quiz on Reading Assignment #6.
Make Assignment #7 now.
Making the transition from Conceptual data design to physical data
design.
In Assignment #5, you created a model that was partly conceptual and
partly logical.
Here are some specific definitions.
Conceptual Data Model:
- high-level, business-oriented view
- non-critical details left out
- emphasize the most important entities, attributes, and relationships
Goal: clarity at a high level
Logical Data Model:
- fully normalized
- all attributes defined
- all candidate keys specified
- primary key identified
- foreign key definitions clearly defined or implied
- any remaining many-to-many relationships are translated into associative entities
- cardinality has been specified
- optionality has been specified
Goal: a complete document from which a physical database can be
developed
Physical Data Model:
- dependent on your physical DBMS
- specified by DDL statements which will actually be used to create the
database objects
- may not be fully normalized
Goal: creation of a physical database
Normalization -- a quick review
Before we begin to create the Physical Model, we must make sure that
our Logical Model is normalized.
Normalization is essentially the process of identifying the one best place
where each fact belongs.
It is important for data integrity, and for ease of loading data into our
database.
The Normal Forms
1NF
1. Eliminate repeating groups
2. Eliminate/resolve non-atomic data
2NF
1. Every non-key attribute depends on the whole primary key (no partial dependencies)
3NF
1. No non-key attribute depends on another non-key attribute (no transitive dependencies)
This is corny and over-worn, but I'll say it anyway. In 3NF,
"Every attribute depends upon the key, the whole key, and nothing but
the key...
... so help me Codd."
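A minimal sketch of the "one best place for each fact" idea, using Python's built-in sqlite3 module as a stand-in DBMS. The tables and columns (player_flat, team, player) are hypothetical illustrations, not the class Soccer exercise. In the flat table, team_name depends on team_id rather than on the key, so the 3NF version moves it to its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flat table: team_name depends on team_id, not on the key
# player_id, so the same fact ("team 10 is called Rovers") is stored twice.
conn.executescript("""
CREATE TABLE player_flat (
    player_id   INTEGER PRIMARY KEY,
    player_name TEXT NOT NULL,
    team_id     INTEGER NOT NULL,
    team_name   TEXT NOT NULL
);
INSERT INTO player_flat VALUES (1, 'Ann', 10, 'Rovers');
INSERT INTO player_flat VALUES (2, 'Bob', 10, 'Rovers');
INSERT INTO player_flat VALUES (3, 'Cy',  20, 'United');

-- 3NF decomposition: each fact now has exactly one home.
CREATE TABLE team (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL
);
CREATE TABLE player (
    player_id   INTEGER PRIMARY KEY,
    player_name TEXT NOT NULL,
    team_id     INTEGER NOT NULL REFERENCES team (team_id)
);
INSERT INTO team   SELECT DISTINCT team_id, team_name FROM player_flat;
INSERT INTO player SELECT player_id, player_name, team_id FROM player_flat;
""")

# Renaming a team touches every player row in the flat table,
# but only a single row in the normalized team table.
flat_rows_changed = conn.execute(
    "UPDATE player_flat SET team_name = 'Rangers' WHERE team_id = 10"
).rowcount
team_rows_changed = conn.execute(
    "UPDATE team SET team_name = 'Rangers' WHERE team_id = 10"
).rowcount
```

The flat design needs as many updates as there are players on the team. If one of them is missed, the data contradicts itself, which is the integrity problem normalization is meant to prevent.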
To ensure a working knowledge of Normalization,
Do the Soccer exercise.
Break
The Physical Data Model
The physical data model is created by transforming the logical data
model into a physical implementation based on the DBMS to be used
for deployment.
It is very vendor-specific. You will need a good working knowledge of
the DBMS.
The term "model" is a little misleading. It is not a diagram, like ERD or
DFD. It is basically a set of DDL to create the database objects.
(Please note: this whole discussion assumes you will be using a
Relational Database Management System.)
Basic Transformations
1. Transform Entities to Tables
2. Transform Attributes to Columns
The naming rules of the DBMS may not let you keep the same names you
had in the logical model. --> Look at handout on abbreviations.
3. Transform Domains to Data Types
Each column must have a data type and size. Maybe decimals too. More
about data types later.
Maybe constraints on the columns.
NOT NULL constraints
Uniqueness constraints
"Check" constraints: specific or range of values
4. Transform Relationships into Referential Constraints
Primary Keys and Foreign Keys
A good CASE tool will generate DDL from your data model.
Handout and discuss: ERD to DDL
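The four transformations can be sketched as follows. This is an illustration only, not the class handout: the student / course / enrollment model is hypothetical, and SQLite (via Python's sqlite3 module) stands in for whatever DBMS you deploy on. Data types and constraint syntax vary by vendor. SQLite, for example, accepts the declared VARCHAR sizes but does not enforce them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
-- Entities become tables; attributes become columns with data types.
CREATE TABLE student (
    student_id INTEGER     PRIMARY KEY,
    email      VARCHAR(80) NOT NULL UNIQUE,
    full_name  VARCHAR(60) NOT NULL
);
CREATE TABLE course (
    course_id VARCHAR(10) PRIMARY KEY,
    title     VARCHAR(60) NOT NULL,
    credits   INTEGER     NOT NULL CHECK (credits BETWEEN 1 AND 6)
);
-- The many-to-many relationship becomes an associative table whose
-- foreign keys are the referential constraints.
CREATE TABLE enrollment (
    student_id INTEGER     NOT NULL REFERENCES student (student_id),
    course_id  VARCHAR(10) NOT NULL REFERENCES course (course_id),
    grade      CHAR(1)     CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'ann@example.edu', 'Ann Lee')")
conn.execute("INSERT INTO course VALUES ('DB101', 'Intro to Databases', 3)")
conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'A')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


bad_fk = rejected("INSERT INTO enrollment VALUES (99, 'DB101', 'B')")
bad_check = rejected("INSERT INTO course VALUES ('X1', 'Too Big', 12)")
bad_unique = rejected("INSERT INTO student VALUES (2, 'ann@example.edu', 'Ann Two')")
```

All three bad inserts are refused by the database itself, so the rules hold no matter which program or ad hoc user submits the data.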
Other physical model structures to be discussed later (next week)
include:
Physical data structures -- tablespaces, datafiles, extents, blocks, rows
Performance structures -- indexes
Security features -- grants, views
Performance -- preliminary introduction
What are the performance issues?
- Essentially: How fast does it run? (Does it run fast enough?)
- Possibly also: How well does it scale? (well enough?)
Scalability
- It worked in test with small amounts of data.
- Results were correct.
- It ran fast.
- Why wouldn't it be the same in production?
- You will soon scale up. Maybe data is gradually added to the system.
- The results (hopefully) will still be correct.
- As volume increases, things start to slow down. Why?
What's the solution to these issues?
Performance Tuning / Design for Performance
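One common answer to the "Why?" above, sketched with Python's sqlite3 module: without a suitable index, a lookup reads every row, so the work grows with the table. The orders table and idx_orders_customer index are hypothetical, and the exact wording of the plan text differs between SQLite versions and between DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(20000)],
)

query = "SELECT COUNT(*) FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query)  # a scan: every row is read
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # a search: only matching index entries are read
matches = conn.execute(query).fetchone()[0]
```

The query result is the same before and after. Only the amount of work changes, which is why the problem stays hidden in test with small volumes.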
Denormalization
Why denormalize?
One reason only: performance.
Don't denormalize because you think it might be helpful.
Try your best to tune the performance first.
Do it as a last resort in a long performance tuning effort.
It is disruptive. It takes time (maybe downtime to the end users).
It could be interpreted as you fixing an implementation mistake.
Document what you did, why you did it, and when.
The downside: redundant data.
Now, when you load that data, you must load it in more than one place.
How to handle that fact?
If you do it in programs
- you might forget, in a new program.
- it's a lot of work that way
- an ad hoc user adds a row of data, and suddenly the copies are out of sync
You might use Triggers
- at the database level
- independent of programs or ad hoc users
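A minimal sketch of the trigger approach, again using SQLite through Python's sqlite3 module. The customer / orders tables and trigger names are hypothetical, and trigger syntax is vendor-specific. The redundant customer_name column in orders is maintained by the database, so no program has to remember to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- Denormalized: customer_name is a redundant copy kept for fast reads.
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer (customer_id),
    customer_name TEXT
);
-- Fill in the redundant copy whenever an order is inserted.
CREATE TRIGGER orders_fill_name AFTER INSERT ON orders
BEGIN
    UPDATE orders
       SET customer_name = (SELECT name FROM customer
                             WHERE customer_id = NEW.customer_id)
     WHERE order_id = NEW.order_id;
END;
-- Propagate later name changes to every copy.
CREATE TRIGGER customer_sync_name AFTER UPDATE OF name ON customer
BEGIN
    UPDATE orders SET customer_name = NEW.name
     WHERE customer_id = NEW.customer_id;
END;
""")

# The inserting "program" never mentions customer_name at all.
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders (order_id, customer_id) VALUES (100, 1)")
name_after_insert = conn.execute(
    "SELECT customer_name FROM orders WHERE order_id = 100"
).fetchone()[0]

conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE customer_id = 1")
name_after_update = conn.execute(
    "SELECT customer_name FROM orders WHERE order_id = 100"
).fetchone()[0]
```

The trade-off is that every insert and update now does extra work, so triggers shift the cost of redundancy from read time to write time.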
Examples of reasons you might denormalize: (NOT an exhaustive list)
1. Prejoined Tables
If the join must be done often, and is prohibitively expensive.
Advantage: you do the join only once.
Must be periodically refreshed or rebuilt (which will do the join again).
In Oracle, this is called a Materialized View.
2. Report Tables
Often a report cannot be generated using SQL only. You can create
a Prejoined Table with just the information needed for the report.
Then write a program to do a simple SELECT from the report table and
then do the remainder of the formatting.
3. Derivable Data
If the cost (in cpu cycles) is prohibitive, you might physically store
such calculated data instead of calculating it on the fly each time.
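The three cases above can be combined in one small sketch: a report table that is prejoined and also stores a derived total. This uses SQLite through Python's sqlite3 module. The table names and the refresh_report helper are hypothetical, and a DBMS with materialized views would let you declare this instead of rebuilding it by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE order_line (
    line_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
    quantity    INTEGER NOT NULL,
    unit_price  REAL    NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO order_line VALUES (1, 1, 2, 10.0), (2, 1, 1, 5.0), (3, 2, 4, 2.5);
""")


def refresh_report(conn):
    """Rebuild the prejoined report table; this is where the join is paid for."""
    conn.executescript("""
    DROP TABLE IF EXISTS customer_sales_rpt;
    CREATE TABLE customer_sales_rpt AS
        SELECT c.customer_id,
               c.name,
               SUM(l.quantity * l.unit_price) AS total_sales  -- derived, stored
          FROM customer c
          JOIN order_line l ON l.customer_id = c.customer_id
         GROUP BY c.customer_id, c.name;
    """)


refresh_report(conn)
# The report program now needs only a simple single-table SELECT.
report = conn.execute(
    "SELECT name, total_sales FROM customer_sales_rpt ORDER BY customer_id"
).fetchall()

# New base data is not visible in the report until the next refresh.
conn.execute("INSERT INTO order_line VALUES (4, 2, 1, 10.0)")
stale = conn.execute(
    "SELECT total_sales FROM customer_sales_rpt WHERE customer_id = 2"
).fetchone()[0]
refresh_report(conn)
fresh = conn.execute(
    "SELECT total_sales FROM customer_sales_rpt WHERE customer_id = 2"
).fetchone()[0]
```

Between refreshes the report table is out of date, so how often to rebuild it is a design decision to document along with the denormalization itself.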
Hand out the Soccer "partially denormalized" solution -- discuss.
A note about Disk Utilization
In the olden days, disk was very expensive.
Now, it is "cheap".
But don't be wasteful due to 1) incompetence, or 2) apathy
Another Triple constraint triangle (no, this is not on the Exam)
- Storage -- don't squander disk
- Performance -- don't spend time packing/unpacking, compressing/decompressing, decoding
- Maintainability -- don't be so cryptic that nobody can understand your code
Watch out for the ever expanding data store where no data is ever
deleted.
Design a plan to archive and delete old data that is no longer actively
used
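A minimal sketch of one such archive-and-delete step, using SQLite through Python's sqlite3 module. The event / event_archive tables, the archive_before helper, and the cutoff date are all hypothetical. A real plan would also cover where the archive lives, how long it is kept, and how often the job runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_id   INTEGER PRIMARY KEY,
    event_date TEXT NOT NULL,      -- ISO 8601 text sorts chronologically
    payload    TEXT
);
CREATE TABLE event_archive (
    event_id   INTEGER PRIMARY KEY,
    event_date TEXT NOT NULL,
    payload    TEXT
);
INSERT INTO event VALUES
    (1, '2003-02-21', 'old'),
    (2, '2004-06-21', 'also older than the cutoff'),
    (3, '2005-02-21', 'current');
""")


def archive_before(conn, cutoff):
    """Move rows older than cutoff into the archive table, atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO event_archive SELECT * FROM event WHERE event_date < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM event WHERE event_date < ?", (cutoff,)
        ).rowcount
    return moved


moved = archive_before(conn, "2005-01-01")
active = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM event_archive").fetchone()[0]
```

Doing the copy and the delete in one transaction means a failure cannot leave rows deleted but not archived, or archived twice.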
See chart on next slide…
[Chart: SMT_DATA Trend. Size (in millions) by date, plotted every two months from 2/21/2000 through 2/21/2005; the size axis runs from 0 to 1800.]