Functional Dependencies and Normalisation

advertisement
Functional
Dependencies and
Normalisation
Why Normalisation

Normalisation with the use of Functional
Dependency is the key way of eliminating
data redundancy during the design of your
database.
Functional Dependencies
A functional dependency is a relationship
of one attribute or field in a record to
another.
 In a database, we often have the case
where one field defines the other.
 For example, we can say that Social
Security Number (SSN) defines a name.

Functional Dependencies
This means that if I have a database with
SSNs and names, and if I know
someone's SSN, then I can find their
name.
 By using the word “defines” we are saying
that for every SSN we will have one and
only one name.

Functional Dependencies
And hence we can come to a conclusion
that we have defined name as being
functionally dependent on SSN.
 The idea of a functional dependency is to
define one field as an anchor (link) from
which one can always find a single value
for another field.

Functional Dependencies

Another example, suppose that a
company assigned each employee a
unique employee number. Each
employee has a number and a name.
Names might be the same for two
different employees, but their employee
numbers would always be different and
unique because the company defined
them that way.
Functional Dependencies
We write a functional dependency (FD)
connection with an arrow:
SSN → Name
Or
EmpNo → Name
 The expression SSN → Name is read
"SSN defines Name" or "SSN implies
Name"

Functional Dependencies

Consider the table below:
Functional Dependencies (FD)
You have two people named Fred! It is
usual that Name will not be unique and it
is commonplace for two people to have
the same name.
 However, no two people have the same
EmpNo and for each EmpNo, there is a
Name.

Functional Dependencies (FD)
The FD X → Y means that for every
occurrence of X you will get the same
value of Y.
 consider another example

Functional Dependencies (FD)



Here, we will define two FDs: SSN → Name
and School → Location.
Further, we will define this FD: SSN →
School.
First, have we violated any FDs with our
data? Because all SSNs are unique, there
cannot be a FD violation of SSN → Name.
Why? Because a FD X → Y says that given
some value for X, you always get the same Y.
Because the X's are unique, you will always
get the same value. The same comment is
true for SSN → School.
Functional Dependencies (FD)
Note:
 If we define a functional dependency X →
Y and we define a functional dependency
Y → Z, then we know by inference that X
→ Z.
 We can define SSN → School. We also
can define School → Location, so we can
infer that SSN → Location.
Functional Dependencies (FD)

The inference we have illustrated is called
the transitivity rule of FD inference.
Transitivity rule:Given X → Y
Given Y → Z
Then X → Z
Functional Dependencies (FD)

A primary key constraint is a special case
of an FD.

The attributes in the key play the role of X,
and the set of all attributes in the relation
plays the role of Y.
Normal Forms and Normalisation

Normalization: The process of decomposing
unsatisfactory "bad" relations by breaking up
their attributes into smaller relations

Normal form: Condition using keys and FDs of
a relation to certify whether a relation schema is
in a particular normal form
Normal Forms and
Normalisation
In a given relation schema, we need to
decide whether it is a good design or
whether it requires a decomposition into
smaller relations.
 Such a decision must be guided by an
understanding of what problems from
current schema.
 Normal forms are used for such
guidance.

Normal Forms

Normal forms based on FDs are:
 First
normal form (1NF)
 Second normal form (2NF)
 Third normal form (3NF)
 Boyce-Codd normal form (BCNF)

The process of using normal forms to
improve database design by breaking
up relations into smaller relation is
called Normalization
Normal Forms

These forms have increasingly resrictive
requirements:
 Every
relation in BCNF is also in 3NF
 Every relation in 3NF is also in 2NF
 Every relation in 2NF is also in 1NF
Unnormalised Data-Set to First
normal form (1NF)
A relation is in first normal forms (1NF ) if
every field contains only atomic values,
that is, not lists or sets.
 Consider the data set below

Unnormalised Data-Set to First
normal form (1NF)
In the dataset above suppose we choose
the data item moduleName as the key of
this data-set.
 There are some cells in the table such as
studentNo, studentName, assGrade and
assType contains multiple values.

 All
repeat with respect to moduleName
First Normal Form
The attribute studentNo, studentName,
assGrade and assType are not
functionally dependent on our chosen
primary key moduleName.
 The attributes staffNo and staffName
are dependent to key.
 Form two tables; one for the functionally
dependent attributes and other for nonfunctionally dependent.

1NF Tables
1NF to 2NF
Remove part key dependencies
 Involves examining those tables that have
a compound key and for each non-key
data item in the table asking the question:
can the data-item be uniquely identified by
part of the compound key?

1NF to 2NF
A relation is in second normal form if and
only if it is in first normal form and every
non-key attribute is fully functionally
dependent on the primary key.
 E.g table Assessments have three-part
compound key moduleName, studentNo
and assType

1NF to 2NF
We observed that ModuleName has no
influence on the studentName.
 StudentNo alone determine
studentName.
 Hence we have to break out the
determinant and dependent data-items
into their own table.
 This lead to decomposition of the tables
as follows:

2NF Tables
2NF to 3NF



To move from second normal form to third
normal form we remove inter-data
dependencies.
To do this examine every table and ask of
each pair of non-key data-items; is the value
of data item A dependent on the value of
data-item B, or vice versa?
If the answer is yes, split off the relevant dataitems into a separate table
2NF to 3NF



A relation is in third normal form if and only if it is
in second normal form and every-key attribute is
non-transitively dependent on the primary key.
The only place where this is relevant is in the
table called Modules.
Here staffNo determines staffName. staffName
hence is transitively dependent on moduleName
2NF to 3NF
StaffNo is therefore asking to be a primary
key.
 Hence, we create a separate table to be
called Lecturers with staffNo as the
primary key as illustrated in the below
tables:

3NF Tables
Another Example

Consider the table below:
Title
Author1
Author
2
ISBN
Subject
Pages
Publisher
Database
System
Concepts
Abraham
Silberschatz
Henry F.
Korth
0072958863
MySQL,
Computers
1168
McGraw-Hill
Operating
System
Concepts
Abraham
Silberschatz
Henry F.
Korth
0471694665
Computers
944
McGraw-Hill
Download