Second Normal Form

advertisement
Announcements
Difference between “excused” and “absent”
 Office hours next week

•
•
•
•

May not be here Monday
Tuesday: 1:00 to 2:00
Wednesday: 10:00 to 11:00
Friday: 1:30 to 2:30
Questions regarding Syllabus, Class?
The Coffee Roasting Company
Phase I
Designing the Database
Overview

Apply normalization process to
purchasing cycle documents
• Purchase Order
• Receiving Report
• Payment Voucher

Combine three document sub-schemas
into overall schema for purchasing cycle
What is a Database?

Collection of data organized into:
• Tables (similar to a spreadsheet) – typically
associated with an object/item of interest such as
a customer, product, etc.
– Columns represent fields
• Attributes and characteristics that describe the object/item in
the table
• A customer has a name, address, phone, etc.
– Rows represent an instance of each object/item
• Plow House, Stayton OR, (503) 123-4567
Database Rules

Only one type of record in each table
• If put both customers and suppliers in same
table, how can you tell a customer from a
supplier?
• If use an “external entity” table for both
customers and suppliers, then a field in the
table would be used to identify the type of
external entity.
Database Rules

Each record (row) in a table must be unique
• If two rows were exactly the same:
• Plow House, Stayton OR, (503) 123-4567
• Plow House, Stayton OR, (503) 123-4567
– Is this an error – was the record entered twice by
mistake
– Which record should you use?
Database Rules

Each row within a table should have a
primary key
• Given that each record must be different in
some way, how can you easily find a given
record? That is, what field should you use to
find each different record
•
•
•
•
Plow House, Stayton OR, (503) 123-4567
Plow House, Corvallis OR, (541) 789-1234
Plow House, Bend OR, (503) 987-6543
Hitching Post, Bend OR, (503) 987-1234
Database Rules

Characteristics of a Primary Key
• Must be unique for each row in the table
• Must be stable over the life of the database
– Names are not a good choice – names change
– Normally assign an ID number or code
• Hence extensive use of SSN
– Internally generated numbers and codes will be unique and will not
change
• Should make sense from a business perspective
– Phone number?
• Easy to use
– Starting to use alphanumeric codes instead of numbers
Design of the Database

Purpose: Capture the information off of the
Purchase Cycle forms and set up a database
that:
• Allows easy search and manipulation
• Reduces amount of redundant information
• Allows for future expansion and modification of
data
What is Database Design?
Process of determining the content and
arrangement of data needed to support
various activities.
 Could use a “Common-sense” approach

• No way to assure that design works as intended

We will use the “Normalization” Model
What is Normalization?

Technique to “optimize” the design of a
Relational Database
• Break large complex table into smaller, simpler
tables
• Ensure that the smaller tables fit back together.
– When break a Whole into its Parts, need to be sure you
can recreate the Whole from the Parts
What is Normalization?

Must be familiar with business processes
• Must understand relationships between purchase
order, receiving report and payment voucher
– Can multiple purchases arrive on the same receiving
report?
– Can a single purchase arrive on multiple receiving
reports?
– Can multiple purchases be paid with the same payment
voucher?
– If multiple shipments are received, when do we pay?
Overview of Normalization

1NF: Create two-dimensional tables
• Identify and define relevant data elements
• Remove repeating groups
• Select primary key
2NF: Ensure full functional dependence on
Primary Key
 3NF: Remove transitive dependencies

Normalization Rules

No repeating groups
• Typically occurs when multiple products on
single form
• Each product should occupy a separate row in
the table
First Normal Form
Purpose

Organize data in table
• Rows and Columns
• Allows flexibility for future expansion

Does not reduce redundancy
• Merely an organization step to set up for
normalization
• Does include specification of the primary
keys
First Normal Form
Step 1

Identify and define relevant data elements
• Usually, do not include calculated fields
– Can be easily derived when needed
• Exception is when need to store information for
historical purposes
– We will compute the sales price for a product
– Once computed and conveyed to the customer,
should not change the price
– For example, the underlying base price may change
after the order confirmation is sent to the customer
First Normal Form
Step 1 – cont’d

As identify fields, define:
• Data element name
• Field name within database (normally shorter
without spaces or special characters)
• Type of data to be stored (number, text, date,
etc)
• Size and format of stored data
– Number can be integer, single precision, currency, …
First Normal Form
Step 1 – cont’d
Which items should be included from this form?
First Normal Form
Step 2
Select relevant data elements
 Eliminate Repeating Groups

• Repeating groups occur when multiple items
allowed on form
– Product name, type, pounds ordered, cost-perpound are repeated on form
• Information on each product ordered should be
entered into separate row in table
First Normal Form
Repeating Groups
What information is repeated on this form?

Information on each product ordered should
be entered into separate row in table
• Note that we can now add multiple products per
order by adding rows to the table.
First Normal Form
Step 1
Select relevant data elements
 Eliminate Repeating Groups

• Each product ordered is entered into
separate row in table

Select Primary Key
• Combination of values within certain columns
used to distinguish one row from another
Select Primary Key

On forms with repeating information
(products):
• Need one component of key to identify
information in the non-repeating section
– Top portion of PO form
• Need one component of key to identify
information in the repeating section
– Bottom portion of PO form
First Normal Form
Select Primary Key
Which fields to use? Non-repeating, Repeating
Select Primary Key

Need to develop a product code to use instead
of product name
• Internally assigned, so we know they are unique
and stable
• As new products are added, we will assign new
codes
• Easy to use
– First letters
of product
name
Select Primary Key

Pick combination of:
• Purchase Order Number
– internally assigned, unique
• Product Code
– internally assigned, unique
Purchase Order Form in
First Normal Form
PONum and ProdCode will be primary key
 No repeating columns of information

Second Normal Form

Only required if primary key is “concatenated”
• Combination of data elements used in key

Technical definition is Full Functional
Dependence
• All components of key are needed to identify each
non-key field in the table

Purpose: Reduce redundancy & eliminate
modification anomalies
• What happens if need to change delivery date of
PO 412004?
Second Normal Form

Identify all possible combinations of Primary
Key Components
• PONum + ProdCode
• PONum
• ProdCode

For each non-key data element, determine
which combination of key components is
needed for unique identification
• May need to look through multiple documents to
understand relationships
Second Normal Form
Which non-key fields are functionally dependent on
PONum? ProdCode? Combination?
Business Process Assumptions

Unit Cost is negotiated
• Note cost difference between 412005 and
412006
Second Normal Form

PO Products – Combination of PONum &
ProdCode is primary key.
• All non-key fields have full functional dependence
on PONum & ProdCode?
– Do we always order 400 pounds of GA?
Second Normal Form

Products – ProdCode is primary key.
• All non-key fields have full functional dependence
on ProdCode
Second Normal Form

PO Details – PONum is primary key.
• All non-key fields have full functional dependence
on PONum
Third Normal Form

Remove Transitive Dependencies
• Technical definition: a non-key field has
functional dependence on a non-key field
• Non-Technical: Fields A, B and C where A is key.
– By definition, B and C have dependence on A
– What if you could also determine C based on
knowledge of B
Transitive Dependency
• Example
– Supplier name is not part of key, yet name can be used to
determine sales office, contact, and phone
• Assuming one contact per supplier
Solution - split Supplier Information
into another table

When create supplier table, must select a
primary key
• Name is not a good choice for the key

Create an internally assigned Supplier Code
Transitive Dependency
• Another problem – Ship Office
– Droubi’s has two shipping offices – name/code does not
uniquely identify shipping office – need separate table
Solution - split Supplier Shipping
Information into another table

Create supplier shipping code
Third Normal Form Tables
Still have a transitive dependency
Transitive Dependency

Does Zip Code determines City and State?
Third Normal Form Tables
Third Normal Form Tables

One-to-Many Relationships
• Single-field primary key always has cardinality of
One
• Multiple-field primary key has cardinality of Many
• Non-key field that is part of a relationship has
cardinality of Many
Receiving Report

First Normal Form
•
•
•
•
Select and define data elements of interest
Eliminate repeating groups
Identify primary key
Specify any assumptions you had to make.
Business Process Assumptions
Only one receiving warehouse
 Do we need to assume that the quantities
received are the same as quantities ordered?

Eliminate Repeating Groups

Multiple data elements for:
• Product Name
• Product Type
• Quantity Received

Place each “Product” in separate row
Identify Primary Key
Select data element from top (non-repeating)
and bottom (repeating)
 Best Key: Combination of:

• RRNumber
• Product Code
– Created for each product
Second Normal Form

Is the FNF table in Second Normal Form?
• Why or why not?
If not, what are the functional dependencies?
 What business process assumptions did you
make?

Second Normal Form
Required since primary key is concatenated
 Three Combinations

• RRNum
• ProdCode
• RRNum + ProdCode
Third Normal Form

Are the SNF tables in Third Normal Form?
• Why or why not?
If not, what are the transitive dependencies?
 What business process assumptions did you
make?

Third Normal Form

From business perspective:
• Makes sense to put supplier information in
separate table -- reduces redundancy

From Normalization perspective:
• Transitive dependence between:
– RRNum
– Supplier Name
– Supplier Ship Address, CSZ
• Will also deal with transitive dependence
between zip and city state
Practice Exercise
Payment Voucher

First Normal Form
•
•
•
•
Select and define data elements of interest
Eliminate repeating groups
Identify primary key
Business Process Assumptions
Primary Key?

One from non-repeating section
• PV-Num

One from each repeating section
• ProdCode
• RRNum
Second Normal Form

Is the FNF table in Second Normal Form?
• Why or why not?
If not, what are the functional dependencies?
 What business process assumptions did you
make?

Second Normal Form

Required since concatenated primary key
• Seven combinations
– PVNum
ProdCode
RRNum
– PVNum + ProdCode
– PVNum + RRNum
– RRNum + ProdCode
– PVNum + ProdCode + RRNum
Identify Functional
Dependencies
Third Normal Form

Are the SNF tables in Third Normal Form?
• Why or why not?
If not, what are the transitive dependencies?
 What business process assumptions did you
make?

Third Normal Form

Transitive dependencies?
• Supplier Information
– PVNum identifies Supplier Name
– Supplier Name identifies Supplier Address, CSZ
Common Tables?
ZipCityState
 Products
 Supplier Shipping Location

Database Schema for
Purchasing Cycle

These are the tables and relationships used
to generate the Purchase Order, Receiving
Report, and Payment Voucher
Assignment

Design Third Normal Form Datadase for Order
Fulfillment of CRC
• You must be able to create these reports from data
within the tables.
– Sales Order Confirmation
– Bill of Lading
– Backorder Notice
• Focus on the data, not the process
– Do not worry about what triggers the Backorder Notice,
just be able to create the report from the underlying
tables
Assignment

Once you have designed the tables:
 Use Access to document the relationships
between the tables and as your data dictionary
 Create empty tables in Access
 Enter the abbreviated field names, specify the data
type and size, enter the long name as the caption
 Set the primary key fields
Assignment

Use the “relationships” feature in Access to
define the relationships between tables
 Always set to enforce referential integrity
Assignment

Provide a description of the processes that
you used to design the third normal form
tables. Specifically:
 You must justify your selection of fields for the
primary keys
 You must describe how you know that each table
is in third normal form.

Provide all assumptions you had to make
regarding the nature of the business
operations to support the placement of the
various data elements within the tables.
Questions?
Download