Bringing It All Together in a Commercial RDBM Package user to:

advertisement
Bringing It All Together in a Commercial
RDBM Package
• Good relational database software packages allows the
user to:
– Record and display the design of every table, including field
names, descriptions, types, ranges, key fields, foreign keys and
indexes in a data dictionary;
– Record and display relationships between tables;
– Support at least the fundamental relational functions SELECT,
PROJECT and JOIN;
Packages such as Microsoft Access, FoxPro and Oracle
support this functionality
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
1
Connectivity Clarification
• One-to-one (1-1) relationship:
– a single occurrence of one entity is associated with a single
occurrence of another entity
For example:
1
1
PASSPORT
PERSON
• One-to-many (1-M) relationship:
– a single occurrence of one entity is associated with one or more
occurrences of another entity
For example:
1
M
CAR
INSPECTION
• Many-to-many (M-M) relationship:
– one or more occurrences of one entity is associated with one or
more occurrences of another entity
For example:
CUSTOMER
Tuesday, September 14,
1999
M
M
PRODUCT
90-728 MIS Lecture Notes
2
Complex Databases and the EntityRelationship Model
Last lecture:
–
–
–
–
Presented table designs for transactions
Introduced codes for data consistency
Defined basic data operations
Showed 1-to-many entity relationship  foreign key links between
tables
– Showed table joins  PK/FK links between tables
Today:
– More abstract view of database design: entity-relationship diagrams
as representation of business rules
– Service delivery life cycles: tracking events as data object passes
through information system
– Implementing a database using the E-R model
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
3
Data Models and Levels of Abstraction
A data model uses E-R diagrams to represent important policies
and procedures of an organization
Data models can be used by senior managers and by
programmer/analysts.
There are three kinds of data models in I/T design:
– Conceptual models, in which entities and relationships are represented
without reference to hardware or software platforms:
– Internal models, in which conceptual E-R diagram is modified for
specific database software platform used;
– External models, in which E-R diagram is divided into functional
modules with explicit business constraints and common entities;
– Physical models, which adapt abstract models to hardware- and
software-specific design considerations
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
4
Data Models for RDBMSs
Relational database models (RDBMs) shield physical and
software-specific details from the end-user. Thus, we will
be concerned with the conceptual model of data storage.
Data modeling caveats:
– Many I/T professionals work at the physical level almost
exclusively
– Many I/T professionals focus on “data flows” rather than “entities”
– Much real-world database design is done without explicit abstract
data models
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
5
Entities and Attributes
An entity is a fundamental data element. It corresponds to a table
in the relational model.
An attribute is a feature or partial description of an entity. It
corresponds to a field in the relational model.
– Composite attribute: can be divided to yield further attributes
– Simple attribute: cannot be divided into other attributes
– Single-valued attributes: attributes which can take on a single value from
their domains.
– Multi-valued attributes: attributes which can take on two or more values
from their domains.
– Derived attributes: attributes whose values are calculated via an
algorithm and which do not have to be stored in the database.
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
6
Rules for Attribute Representation
• Attributes in E-R diagrams:
Include attributes in E-R diagrams only in preliminary stages.
Detailed descriptions of attributes are stored in the data dictionary.
STUDENT
Table
COURSE
COURSE
COURSE
COURSE
COURSE
STUDENT
Attribute
CrsNbr
CName
Credit
MaxEnrl
FID
SID
Last Name
Definition
Course Number
Course Name
Course Credits
Maximum Enrollment
Faculty Number
Student Number
Data Type
Character
Character
Numeric
Numeric
Numeric
Numeric
Key
PK
FK
FK
PK
• Composite versus Simple Attributes:
Use simple attributes whenever necessary to minimize chances of
data key error or data extraction complications
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
7
Rules for Attribute Representation (cont’d)
• Single-Valued versus Multi-Valued Attributes:
Multi-valued attributes cannot be represented directly within the
relational model. Instead, either:
– Define new single-valued attributes, or
– Define a new entity set
1
Fender_Color
Car
Hood_Color
Trim_Color
Car
M
Color
or
Car_ID
Part_Color
Car_Part
What are the pros and cons of these approaches?
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
8
Rules for Attribute Representation (cont’d)
• Derived Attributes:
For transaction processing, use derived attributes as opposed to
stored attributes whenever possible.
– Minimizes number of columns in table
– Attributes may have to be re-calculated for reports or queries
anyway
For data warehouses, derived attributes may be stored as explicit
fields since focus is on data aggregation rather than view
generation.
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
9
Entity Relationships
Relationships describe the type of association between entities:
– Business rule representation: text description defines a business rule;
– Number of associations between related entities: n-ary relationships;
– Connectivity: number of instances of one entity that are uniquely
associated with one or more instances of another entity;
– Cardinality: the number of entity occurences associated with a specific
entity;
– Existence Dependency: an entity may or may not exist if a related entity
does not exist;
– Relationship Participation: an entity can exist independent of another
entity (optional) or must be associated with an entity (mandatory)
– Weak Entities: an entity is existence-dependent and has a primary key
which is derived from that of the associated entity
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
10
Unary (Recursive) Relationships
Entities can be related to themselves in a variety of ways:
– A course can be a prerequisite for another course;
– A part can be assembled from one or more other parts;
– An employee can be supervised by another employee;
Representation of unary relationships depends on the
connectivity associated with the recursion :
If a course has at most
one prerequisite, then
add a prerequisite to
the COURSE table.
If a course can
have many
prerequisites, use a
linking table.
CrsNbr
101
102
103
309
311
Tuesday, September 14,
1999
CrsNbr
101
102
103
309
311
Cname
Algebra 1
Algebra 2
Trig
Pascal
C
Cname
Algebra 1
Algebra 2
Trig
Pascal
C
Prerequisite
1
1
COURSE
101
102
101
309
CrsNbr
103
103
311
311
is a
prereq
Prerequisite
101
102
102
309
90-728 MIS Lecture Notes
M
N
COURSE
CCLink
11
Cardinalities and Business Rules
Cardinality determines how many times a row related in one
table will appear in another table.
For example, a business rule associated with student
preferences for school transfers may specify that:
– a student can list at most nine schools to which he/she may wish to
be considered for acceptance next year, and
– a student must list at least one school.
1
M
PREFERENCES
may rank
STUDENT
(1, 9)
(1, 1)
A business rule requiring that a student list his/her current school
as one of the preferences may be implemented only at the
application software level
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
12
Weak Entity Sets
Weak entity sets are useful when business rules do not permit keys
that are unique to various entities.
– For example, a tips hotline for reporting cars that may be stolen could be
implemented as:
CALLER
1
M
CALL
1
M
CVLINK
M
1
VEHICLE
CALLER (SSN, First Name, Last Name, Phone Number, . . .)
VEHICLE (VIN, Plate Number, State, Make, Model, Year, Color, . . .)
CALL (Call#, Date, Time, Address, SSN@, VIN@, . . .)
CVLINK (Call#@, VIN@)
However, callers may be unwilling to identify themselves, and there may
be only sketchy information on the vehicles.
CALLER and VEHICLE become weak entities, each unable to have
unique keys.
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
13
Weak Entity Sets (cont’d)
CALLER and VEHICLE share the (unique) key CALL. Thus,
if the same car is reported by three different callers, the car
appears in VEHICLE three times. Also, every time a
person makes a call, the caller is included in the database
again:
1
CALLER
1
1
M
CALL
VEHICLE
CALLER (Call#, SSN, First Name, Last Name, Phone Number, . . .)
VEHICLE (Call#, Vehicle#, VIN, Plate Number, State, Make, Model, Year, Color, . . .)
CALL (Call#, Date, Time, Address, SSN@, VIN@, . . .)
This is another example of the identification and application of
business rules
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
14
Generating “Views” of Data
Users often want to see data from multiple tables combined in an
intuitive way. To create views of data, perform multi-table joins:
– (i) Choose an entity set (table) that has not yet been processed. Call the
chosen table the "row driver" of the view.
– (ii) Include all tables into the view that fall along relationship paths
starting from the row driver that have a cardinality of 1 pointing away
from the row driver.
– (iii) Return to step (i) until all desired tables have been processed.
Example: How can we generate views of data associated with sales
events?
PRODUCT
1
1
M
SPLINK
Tuesday, September 14,
1999
(appears in)
M
SALE
M
(conducts)
(requests)
CUSTOMER
1
SALESPERSON
90-728 MIS Lecture Notes
15
Generating Views of Data (cont’d)
Trivial views: generated by a single table
Non-trivial views: generated by multiple tables in one-to-one or
an one-to-many relationships. To do this, have each
component entity added to the view one by one until all
desired tables are added.
For example:
– SALESPERSON, CUSTOMER, and PRODUCT have only the
trivial views of themselves
– SALE has the view: SALE + SALESPERSON + CUSTOMER
– SPLINK has the view consisting of every table:
SPLINK + SALE + CUSTOMER + SALESPERSON
In Access, views are implemented as queries consisting of a
series of joins.
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
16
Multiple Linking Tables
Business rules may require that a database may have more than one
linking table.
– For example, a hospital operations database may have the following rules:
• An operation has a single patient but many doctors, each with a different role
• A patient may have more than one procedure performed in a single operation
• A patient may have several post-operative drugs.
+--------+
+-----------+
¦ ROLE ¦
¦ PATIENT
¦
+--------+
+-----------+
¦1
¦1
¦M
¦M
+--------+M
1+-----------+1
M+---------+
¦ ODLINK +---------¦ OPERATION +--------¦OPDLINK ¦
+--------+
+-----------+
+---------+
¦M
¦1
¦M
¦1
¦M
¦1
+--------+
+-----------+
+---------+
¦ DOCTOR ¦
¦ OPLINK
¦
¦POST-OP ¦
+--------+
+-----------+
¦DRUG
¦
¦M
+---------+
¦1
+------------+
¦ PROCEDURE ¦
+------------+
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
17
Multiple Linking Tables (cont’d)
The E-R diagram could be implemented as:
ROLE (Role Code)
PATIENT (Patient#, . . .)
DOCTOR (Doctor#, . . .)
PROCEDURE (Procedure Code, Procedure Name)
POST-OP DRUG (Drug Code, Drug Name)
OPERATION (Operation#, Patient#@, Date, Start Time, . . .)
ODLINK (Operation#@, Doctor#@, Role Code@)
OPLINK (Operation#@, Procedure Code@)
OPDLINK (Operation#@, Drug Code@)
Data views of interest could include:
– Number of operations of various types performed by each doctor:
ODLINK + DOCTOR + OPERATION
– Drugs used on patients during recent operations: OPDLINK +
POST-OP DRUG + OPERATION + PATIENT
– All procedures performed on patients: OPLINK + PROCEDURE +
OPERATION + PATIENT
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
18
Service Delivery Life Cycles
It may be useful to track the operations or steps associated with a
particular event as it winds its way through a system:
– Patient intake and treatment in a hospital
– Handgun tracing to detect firearms used in crimes
– Charitable pledge tracking
Example: Lost and found department of an agency
–
–
–
Policy is that the same day an item is found, a notice needs to be
posted describing the item and stating that it has been found.
After 30 days, if no one has claimed an item, a second notice is
posted.
Finally, 30 days after the second notice, if no one has claimed an item,
the item is disposed.
Problem: Find a way to track each found item through its life
cycle stages, until it comes to a final disposition
Solution: Use a decision tree with codes for each branch.
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
19
Service Delivery Life Cycles (cont’d)
Owner Found
After First
Notice
Item Found
OwnerFound
After Second
Notice
First Notice
Sent
Second
Notice Sent
For decision support purposes, it may
be useful to associate probabilities
with each potential event (tree
branch)
Item
Disposed
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
20
Service Delivery Life Cycles (cont’d)
• What are the branching
frequencies?
• What are the branching
probabilities?
• What are the average durations
until owners are found?
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
21
Converting an E-R Diagram into a
Database Structure
• Define tables and primary keys
• Define attributes based on cardinality restrictions and primary
key definitions
• Define indexes for certain (combinations of ) attributes
• Define table relationships
• Allow cascade updates/cascade deletes if business rules allow
• Build data dictionary:
–
–
–
–
Attribute name, description and data type
Attribute cardinality
Attribute domain
Example data elements
Computer-aided software engineering (CASE) tools
automate many of these steps
Tuesday, September 14,
1999
90-728 MIS Lecture Notes
22
Download