SA&D

advertisement
CSUN Information Systems
Systems Analysis & Design
http://www.csun.edu/~dn58412/IS431/IS431_SP16.htm
Data
Modeling
IS 431: Lecture 5
1
Data Modeling
 Elements of Entity Relationship Diagram (ERD)
 Relational Data Model
IS 431 : Lecture 5
2
Data Modeling
Data Modeling (database modeling, information
modeling) is a technique for organizing and
documenting a system’s data in a model.
Entity Relationship Diagram (ERD) depicts data in
terms of the entities and relationships described by the
data.
IS 431 : Lecture 5
3
From Data Model
to DB Implementation
 ERD: a conceptual model of data entities (things of
interest) , their attributes (characteristics of interest),
and their relationships (with other things) in an
information system (technical independent).(Analysis)
 Relational Data Model: a blueprint for implementation
of a conceptual data model (ERD) in relational
database environment (software independent)(Design)
 MS Access Relationship Window: a graph showing
how a data model is implemented with Microsoft
Access (a specific DBMS software)(Implementation)
IS 431 : Lecture 5
4
Entity-Relationship
Diagrams
 Database = data + relationship
 ERD is used to model data and their relationship
 ERD is a graphical representation of a conceptual
data model
 ERD is resource independent: it does not commit
to any particular database environment.
IS 431 : Lecture 5
5
Entities
 Entity is a group of attributes corresponding to the
same conceptual object about which we need to
capture and store data
– objects, persons, places, events concepts whose
existence is not dependent on many other entities
 Entity is a set of occurrences (instances) of the
object that it represents
 Entity must have a unique name (a singular noun),
unique identifier, and at least one attribute (the
identifier itself)
IS 431 : Lecture 5
6
Entities: Examples

Persons: agency, contractor, customer,
department, division, employee, instructor,
student, supplier.

Places: sales region, building, room,
branch office, campus.

Objects: book, machine, part, product, raw material, software
license, software package, tool, vehicle model, vehicle.

Events: application, award, cancellation, class, flight, invoice,
order, registration, renewal, requisition, reservation, sale, trip.

Concepts: account, block of time, bond, course, fund, qualification,
stock.
IS 431 : Lecture 5
7
Entity Instance: Example
Entity instance – a single occurrence of an entity.
Attributes
Entity
Student ID Last Name First Name
Instances
2144
Arnold
Betty
3122
Taylor
John
3843
Simmons
Lisa
9844
Macy
Bill
2837
Leath
Heather
2293
Wrench
Tim
IS 431 : Lecture 5
8
Entities: Attributes
An attribute is a descriptive property or characteristic
of interest of an entity. Also called element, property,
and field.
The data type for an attribute defines what type of data
can be stored in that attribute.
The domain of an attribute defines what values an
attribute can legitimately take on.
The default value for an attribute is the value that will
be recorded if not specified by the user.
IS 431 : Lecture 5
9
Data Type
Representative Logical Data Types for Attributes
Logical
Data Type
Logical Business Meaning
NUMBER
Any number, real or integer.
TEXT
A string of characters, inclusive of numbers. When numbers are included
in a TEXT attribute, it means that we do not expect to perform arithmetic
or comparisons with those numbers.
MEMO
Same as TEXT but of an indeterminate size. Some business systems
require the ability to attach potentially lengthy notes to a give database
record.
DATE
Any date in any format.
TIME
Any time in any format.
YES/NO
An attribute that can assume only one of these two values.
VALUE SET A finite set of values. In most cases, a coding scheme would be
established (e.g., FR=Freshman, SO=Sophomore, JR=Junior,
SR=Senior).
IMAGE
Any picture or image.
IS 431 : Lecture 5
10
Data Domains
Representative Logical Domains for Logical Data Types
Data Type
Domain
Examples
NUMBER
For integers, specify the range.
For real numbers, specify the range and
precision.
{10-99}
{1.000-799.999}
TEXT
Maximum size of attribute.
Actual values are usually infinite; however,
users may specify certain narrative
restrictions.
Text(30)
DATE
Variation on the MMDDYYYY format.
MMDDYYYY
MMYYYY
TIME
For AM/PM times: HHMMT
For military (24-hour times): HHMM
HHMMT
HHMM
YES/NO
{YES, NO}
{YES, NO} {ON,
OFF}
VALUE
SET
{value#1, value#2,…value#n}
{table of codes and meanings}
{M=Male
F=Female}
IS 431 : Lecture 5
11
Entities: Identification
 A key is an attribute, or a group of attributes, that assumes a
unique value for each entity instance.
 A group of attributes that uniquely identifies an instance of an
entity is called a composite (or concatenated) key.
 A candidate key is a “candidate to become the primary key”
of instances of an entity. (StudentID, SSN, DriverLicenseNo)
 A primary key (identifier) is that candidate key that will most
commonly be used to uniquely identify a single entity instance.
(StudentID)
 Any candidate key that is not selected to become the primary
key is called an alternate key. (SSN, DriverLicenseNo)
 A secondary key is an attribute whose values divide all entity
instances into useful subgroups/sub-criteria. (Major, Gender,
etc)
IS 431 : Lecture 5
12
Entities ...
ENTITY NAME
CUSTOMER
- entity id
- attribute 1
- attribute 2
- …………..
- attribute n
- Customer_ID
- Cust_Name
- Cust_Address
- Cust_Phone
IS 431 : Lecture 5
13
Relationships
 A Relationship documents an association between
one, two, or more entities
 It must have a name (and may carry data)
 Degree of Relationship (number of entities)
 Cardinalities of Relationship (number of
instances/members)
IS 431 : Lecture 5
14
Relationships: Degree
 Degree of Relationship defines how many entities
are involved in a relationship:
–
–
–
–
Recursive (Unary)
Binary
Ternary
….
IS 431 : Lecture 5
15
Relationships: Cardinalities
 Cardinalities document how many
occurrences/members of one entity can relate to a
single occurrence/member of another entity in a
relationship.
 Max / Min number of occurrences
 Reflect business policies or general business
practices (e.g., how many classes a student can take,
how many students a class can hold).
(15, 35)
Student
(1, 5)
Take
IS 431 : Lecture 5
Class
16
Cardinalities ...
One-to-One (1:1) Relationship
Sales
1
Pay
1
Cash
Collections
M
Cash
Collections
Ex: Cash Sales
One-to-Many (1:M) Relationship
Sales
1
Pay
Ex: Installment Payments
IS 431 : Lecture 5
17
Cardinalities ...
Many-to-One (M:1) Relationship
Sales
M
Pay
1
Cash
Collections
Ex: Pay many credit purchases in full
Many-to-Many (M:N) Relationship
Sales
M
Pay
N
Cash
Collections
Ex:Pay credit purchases with partial
payments over some months
IS 431 : Lecture 5
18
Cardinalities ...
 Maximum / Minimum Cardinality
Employee
(0,E)
Possess
(1,S)
Skill
An employee must have minimum one skill
A particular skill may not be possessed by any employee
(1,E)
Employee
(0,P)
Belong
Project
A project must have minimum one employee
A particular employee may not belong to any project
IS 431 : Lecture 5
19
Relationships ...
 Recursive relationships involve only one entity
(occurrences in the same entity)
N
N
TOWN
ENTITY
- entityID
- attribute 1
- attribute 2 M
Relationship Name
Attrib 1, Attrib 2
TRAVEL
Distance
-TownID
-Town_Name
M
IS 431 : Lecture 5
20
Relationships ...
 Binary relationship
ENTITY 1
-entity1_ID
-attribute 11
-attribute 12
ENTITY 2
M
Relationship Name
Attrib 1, Attrib 2
N
EMPLOYEE
PROJECT
M
- Emp_ID
- Emp_Name
- Emp_Title
-entity2_ID
-attribute 21
-attribute 22
Manage
Date
IS 431 : Lecture 5
N
- Project_ID
- Proj_Name
- Proj_Due
21
Relationships ...
 Ternary relationship
EMPLOYEE
M
- EmpID
- Emp_Name
- Emp_Title
Assign
Date
N
PROJECT
- ProjectID
- Proj_Name
- Proj_Due
P
TASK
- TaskID
- TaskName
IS 431 : Lecture 5
22
Logical Data Modeling
Stages
1. Context Data model
–
To establish project scope
2. Key-base data model
–
–
–
–
Eliminate nonspecific relationships
Add associative entities
Include primary and alternate keys
Precise cardinalities
3. Fully attributed data model
–
–
All remaining attributes
Subsetting criteria
4. Normalized data model
IS 431 : Lecture 5
23
What is a Good
Data Model?
 A good data model is simple.
– Data attributes that describe any given entity should describe only
that entity.
– Each attribute of an entity instance can have only one value.
 A good data model is essentially non redundant.
– Each data attribute, other than foreign keys, describes at most one
entity.
– Look for the same attribute recorded more than once under different
names.
 A good data model should be flexible and adaptable to future
needs.
IS 431 : Lecture 5
24
Building ERD
 Identify entities of interest (Use REAL framework –
Resources/Event/Agents/Locations – more in Lecture
5B) – (Top Down)
 Identify all the attributes with sufficient details (context
specific)
 Assign attributes to entities – (Bottom Up)
 Identify degrees of relationships between entities
(context specific)
 Complete the relationships with cardinalities (context
specific)
 Build the model
IS 431 : Lecture 5
25
Rules in ERD Building
 Each entity must have a name
 Each entity must have an identifier
 An occurrence itself cannot be an entity (a
constant or a table with one unique record?)
 Each relationship must have a name (may
or may not carry data)
 Reasonable cardinalities (context specific)
IS 431 : Lecture 5
26
Alternative ERD Notations
Attribute 1
Attribute 3
Attribute 4
Attribute 1
Attribute 2
relates to
Entity 1
Entity 2
is related to
Attribute 5
IS 431 : Lecture 5
Attribute 2
Attribute 3
Attribute 4
27
Alternative ERD Notations
Entity
1
0
Entity
Entity
Entity
1
1
Entity
Entity
Entity
M
0
Entity
Entity
Entity
M
1
Entity
Entity
(0, 1)
(1, 1)
(0 , *)
(1 , *)
IS 431 : Lecture 5
28
Associative Entity
•An associative entity is an entity that inherits its primary
key from more than one other entity (called parents).
•Each part of that composite (concatenated) key points to
one and only one instance of each of the connecting
entities.
•Represent an M:N relationship carrying data
EMPLOYEE
PROJECT
1
- Emp_ID
- Emp_Name
- Emp_Title
M
ASSIGN
Date
IS 431 : Lecture 5
N
1
- Project_ID
- Proj_Name
- Proj_Due
29
M:N Relationship vs.
Associative Entity
EMPLOYEE
PROJECT
M
- Emp_ID
- Emp_Name
- Emp_Title
EMPLOYEE
- Emp_ID
- Emp_Name
- Emp_Title
1
ASSIGN
Date
ASSIGNMENT
M
N
- Emp_ID
- Proj_ID
- Date
IS 431 : Lecture 5
N
- Project_ID
- Proj_Name
- Proj_Due
PROJECT
1
- Project_ID
- Proj_Name
- Proj_Due
30
Foreign Keys in
Relational Database
A foreign key (FK) in Entity E1(CustID in
ORDER) is a primary key of another Entity E2
(CustID in CUSTOMER), which is used to identify
(link) a 1:M relationship between E1 and E2
(CUSTOMER and ORDER).
Foreign key is made on the many side
(CUSTOMER has many ORDERS, therefore
ORDER carries CustID as FK to show which
Customer places that Order)
IS 431 : Lecture 5
31
Foreign Key
CUSTOMER
CUSTOMER
CustomerID
ORDER
ORDER
OrderID
CustomerID
IS 431 : Lecture 5
1:M Relationship
Primary Key
Foreign Key
32
Foreign Keys in
Relational Database. . .
In M:N relationship, the associative/junction table
with a composite key will be used to capture the
relationship.
– ORDER involved many PRODUCTS, PRODUCT
involved in many ORDERS. Composite key
ProductID-OrderID for LINE ITEM to indicate
which product involves in which sales
 Each part of the composite key serves like a foreign
key.
Sometimes, a “surrogate” key (RecordNo) is used as
primary key to simplify the identification of record.
IS 431 : Lecture 5
33
Composite Key
ORDER
PRODUCT
ORDER
OrderID
PRODUCT
ProductID
M:N Relationship
Primary Key
LINE_ITEM
RecordNo
OrderID
ProductID
Composite Key
JUNCTION TABLE
IS 431 : Lecture 5
34
ERD Case Studies
• ERD 1: Group Consultants Database
• ERD 2: Video Rental Database
• ERD 3: Health Club Database
• ERD 4: Music Store Database
to be discussed in class
IS 431 : Lecture 5
35
Convert ERD
to Relational Data Model
 Legend
–
–
–
–
E1, E2, E3 , … represent entities
a1, a2, a3, …, b1, b2, b3, … represent entity attributes
R represents a relationship
r represents one or many attributes that can be carried
by a relationship
IS 431 : Lecture 5
36
Convert ERD
to Relational Data Model …
 Pattern #1
E1
- a1
- a2
E1( a1, a2 )
TASK
- TaskID
- TaskName
TASK( TaskID , TaskName )
IS 431 : Lecture 5
37
Convert ERD
to Relational Data Model …
 Pattern #2
E1
- a1
- a2
1
E1( a1, a2 )
R
r
R( a1, a1’, r)
1
1
INDIVIDUAL
- SSN
- Name
MARRY
Date
1
INDIVIDUAL ( SSN , Name)
MARRY ( SSN , SSN’, Date)
Husband
IS 431 : Lecture 5
Wife
38
Convert ERD
to Relational Data Model …
 Pattern #3
E1
- a1
- a2
1
E1( a1, a2 )
R
R( a1’, a1 )
M
1
INDIVIDUAL
- SSN
- Name
LEAD
M
INDIVIDUAL ( SSN , Name)
LEAD ( SSN’ , SSN )
Member
IS 431 : Lecture 5
Leader
39
Convert ERD
to Relational Data Model …
 Pattern #4
E1
- a1
- a2
TOWN
- TownID
- Name
M
R
E1( a1, a2 )
r
R( a1, a1’, r)
N
M
Tour
Date
N
TOWN ( TownID , Name)
TOUR ( TownID, TownID’ , Date )
IS 431 : Lecture 5
40
Convert ERD
to Relational Data Model …
 Pattern #5
E1
- a1
- a2
1
EMPLOYEE
1
- EmpID
- EmpName
R
1
MANAGE
E2
- b1
- b2
- b3
E1( a1, a2 )
E2( b1, b2 , b3 , a1)
OR
E1( a1, a2 , b1 )
E2( b1, b2 , b3 )
PROJECT
1 - ProID
- ProName
- ProDue
EMPLOYEE(EmpID, EmpName)
PROJECT(ProID, ProName, ProDue, EmpID)
OR
EMPLOYEE(EmpID, EmpName, ProID)
PROJECT(ProID, ProName, ProDue )
IS 431 : Lecture 5
41
Convert ERD
to Relational Data Model …
 Pattern #6
E1
- a1
- a2
1
CUSTOMER
1
- CustID
- CustName
R
M
INVOLVE
E2
- b1
- b2
- b3
E1( a1, a2 )
E2( b1, b2 , b3 , a1)
PURCHASE
M - PurchID
- Item
- Quantity
CUSTOMER (CustID, CustName)
PURCHASE(PurchID, Item, Quantity, CustID)
IS 431 : Lecture 5
42
Convert ERD
to Relational Data Model …
 Pattern #7
E1
- a1
- a2
M
CUSTOMER
M
- CustID
- CustName
R
r
SALE
Date
N
E2
- b1
- b2
- b3
E1( a1, a2 )
E2( b1, b2 , b3)
R( a1, b1, r)
SALESREP
N - RepID
- RepName
CUSTOMER (CustID, CustName)
SALESREP (RepID, RepName)
SALE(CustID, RepID, Date)
IS 431 : Lecture 5
43
Convert ERD
to Relational Data Model …
 Pattern #8
E1
- a1
- a2
1
R
r
N
E2
- b1
- b2
- b3
P
E3
- c1
- c2
- c3
IS 431 : Lecture 5
E1( a1, a2 )
E2( b1, b2 , b3)
E3( c1, c2 , c3 )
R( b1, c1 , a1, r)
44
Convert ERD
to Relational Data Model …
 Pattern #8 (example)
EXECUTIVE
1
- ExecID
- ExecName
RESPONSIBLE
AppointDate
DEPT
N - DeptID
- DeptName
P
PROJECT
- ProjectID
EXECUTIVE( ExecID, ExecName)
DEPT( DeptID, DeptName)
PROJECT (ProjectID)
RESPONSIBLE (DeptID, ProjectID, ExecID, AppointDate )
IS 431 : Lecture 5
45
Convert ERD
to Relational Data Model …
 Pattern #9
E1
- a1
- a2
E2
M
R
r
N
- b1
- b2
- b3
P
E3
- c1
- c2
- c3
IS 431 : Lecture 5
E1( a1, a2 )
E2( b1, b2 , b3)
E3( c1, c2 , c3 )
R(a1, b1, c1 , r)
46
Convert ERD
to Relational Data Model …
 Pattern #9 (example)
EMPLOYEE
M
- EmpID
- EmpName
ASSIGN
TASK
N - TaskID
Date
P
PROJECT
- ProjectID
EMPLOYEE( EmpID, EmpName )
TASK( TaskID)
PROJECT( ProjectID)
ASSIGN(EmpID, TaskID, ProjectID, Date)
IS 431 : Lecture 5
47
Generalization
• Generalization is a technique wherein the attributes
that are common to several types of an entity are
grouped into their own entity, called a supertype.
An entity supertype is an entity whose instances
store attributes that are common to one or more entity
subtypes.
An entity subtype is an entity whose instances inherit
some common attributes from an entity supertype and
then add other attributes that are unique to an instance
of the subtype.
•
•
IS 431 : Lecture 5
48
Type & Subtypes
Employee
Name
DOB
SSN
DateHire
Executive
Position
Location
DateAppoint
Salary
Foreperson
Factory
Station
DateAppoint
Worker
Rank
Wage Rate
Wage Rate
IS 431 : Lecture 5
49
Generalization Hierarchy
IS 431 : Lecture 5
50
Data Analysis &
Normalization
Data analysis is a process that prepares a data model
for implementation as a simple, non-redundant,
flexible, and adaptable database. The specific
technique is called normalization.
Normalization is a technique to organize data
attributes such that they are grouped to form nonredundant, stable, flexible, and adaptive entities.
IS 431 : Lecture 5
51
Normalization (in Latin)
First normal form (1NF) – an entity whose attributes have no
more than one value for a single instance of that entity
– Any attributes that can have multiple values actually describe a
separate entity, possibly an entity and relationship.
Second normal form (2NF) – an entity whose nonprimary-key
attributes are dependent on the full primary key.
– Any nonkey attributes that are dependent on only part of the primary
key should be moved to any entity where that partial key is actually
the full key. This may require creating a new entity and relationship
on the model.
Third normal form (3NF) – an entity whose nonprimary-key
attributes are not dependent on any other non-primary key
attributes.
– Any nonkey attributes that are dependent on other nonkey attributes
must be moved or deleted. Again, new entities and relationships may
have to be added to the data model.
IS 431 : Lecture 5
52
Normalization
(in Plain English !!!)
 First normal form (1NF) :
– No repeating group of a same attribute (multi-valued attribute)
– If not: create a new record / entity for this group.
 Second normal form (2NF)
– Attributes should depend on the whole (composite) key, not
part of it (partial functional dependency).
– If not: create a new entity for these partial depended attributes
 Third normal form (3NF)
– Attributes should depend on the (primary) key only, not on each
other (transitive dependency)
– If not: create new entity for these partial depended attributes
IS 431 : Lecture 5
53
Un-normalized Relation
Observation: Repeating groups / multi-value attributes !!!
IS 431 : Lecture 5
54
Relation in 1NF
Observation: Attributes still depend on a part of the key !!!
IS 431 : Lecture 5
55
Relations in 2NF
Observation: Attributes still
depend on a non-key attribute !!!
IS 431 : Lecture 5
56
Relations in 3NF
Observation: Now each
table stores data about one
thing only.
IS 431 : Lecture 5
57
Example of
Relational Database
IS 431 : Lecture 5
58
Example of
Relational Database . . .
IS 431 : Lecture 5
59
Data Dictionary
EMPLOYEE
Attributes
Type
Size
Description
Authorization
EmpID
Numeric
6
Identifier
HR Manager
EmpFirstName
Text
10
Employee First Name
HR Manager
EmpLastName
Text
10
Employee Last Name
HR Manager
Address
Text
50
Employee Address
HR Manager
City
Text
10
Employee City
HR Manager
State
Text
2
Employee Last Name
HR Manager
Zip
Text
XXXXX
Employee Last Name
HR Manager
Phone
Text
XXX-XXX-XXXX
Employee Last Name
HR Manager
Date Hired
Date
MM/DD/YY
Date Hired Employee
HR Manager
Position
Text
15
Position of Employee
HR Manager
Attributes
Types
Size
Description
Authorization
EntryNumber
Numeric
6
Identifier
Project Manager
EntryDate
Date
MM/DD/YY
Date of Entry
Project Manager
HoursWorked
Numeric
3
Hours on Task
Project Manager
HotelExpense
Currency
4
Fund Spent on Hotel
Project Manager
TravelExpense
Currency
4
Fund Spent on Travel
Project Manager
FoodExpense
Currency
4
Fund Spent on Food
Project Manager
Approved
Y/N
1
Approved / Not Yet
Project Manager
EXPENSES
IS 431 : Lecture 5
60
Data to Process Matrix
IS 431 : Lecture 4
61
Data-to-Location
CRUD Matrix
IS 431 : Lecture 5
62
Download