Normalisation presentation

Project and Data Management Software
Data Analysis and Data Modelling
Normalisation
Normalisation
• Normalisation provides a procedure for reducing complex data structures to simpler ones
• Formalised as a set of rules, the normal forms, first defined by E. F. Codd
• Tidies up the data to remove redundancy
• Ensures data is grouped logically
Why Use Normalisation?
• Relations formed by the process make the data easier to understand and manipulate.
• Provides a stable base for future database growth.
• Simplifies relations and reduces update, insertion and deletion anomalies.
Stages of Normalisation
• This presentation covers 3 stages:
  • 1st Normal Form – 1NF
  • 2nd Normal Form – 2NF
  • 3rd Normal Form – 3NF
• Higher normal forms also exist, such as Boyce-Codd Normal Form (BCNF) and 4NF
First Normal Form – 1NF
• For a relation to be in 1NF all its attributes must be atomic
• Each attribute must contain a single value, not a repeating group of values
• Every non-key attribute must be functionally dependent on the primary key
Un-normalised data
Course Code
Course Desc
Employee Number
Name
Block
Room No
Date Joined Course
Allocated Hours
Un-normalised data
• A list of fields needed for the system
• E.g. Staff Development Course
  • All staff are released for two hours a week for staff development.
  • Employees work at their own pace in a lab.
  • A total of six attributes are recorded about each employee, including their normal office location (block and room), the date they joined the course and how many hours they are planned to work on it.
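As a minimal sketch, the un-normalised record described above could be held as a nested structure like the one below. All values are made-up sample data, and the snake_case field names and the helper `has_repeating_group` are illustrative assumptions rather than part of the original material.

```python
# Made-up sample data: one un-normalised course record.
# The six employee attributes form a repeating group inside the course.
unnormalised = [
    {
        "course_code": "C101",
        "course_desc": "Spreadsheets",
        "employees": [
            {"emp_no": 1, "name": "Avery", "block": "A", "room_no": "101",
             "date_joined": "2024-01-08", "allocated_hours": 20},
            {"emp_no": 2, "name": "Blake", "block": "B", "room_no": "205",
             "date_joined": "2024-01-15", "allocated_hours": 12},
        ],
    },
]


def has_repeating_group(record):
    """Return True if any field holds a list, i.e. is not a single value."""
    return any(isinstance(value, list) for value in record.values())
```

A record for which `has_repeating_group` returns True still contains a repeating group, so it is not yet in first normal form.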
First Normal Form (1NF)
• An entity is in 1NF if, and only if, it has an identifying key and there are no repeating attributes or groups of attributes
• To get to 1NF we must remove all repeating groups of data elements
Our Example
COURSE
Course Code
Course Desc.
EMP_ON_COURSE
Course Code
Employee Number
Name
Block
Room No
Date Joined Course
Allocated Hours
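The split into COURSE and EMP_ON_COURSE can be sketched in code. This is one possible illustration, assuming the un-normalised data is held as nested dictionaries; the sample values, field names and the function `to_first_normal_form` are hypothetical.

```python
# Made-up sample data: one course with a repeating group of employees.
unnormalised = [
    {
        "course_code": "C101",
        "course_desc": "Spreadsheets",
        "employees": [
            {"emp_no": 1, "name": "Avery", "block": "A", "room_no": "101",
             "date_joined": "2024-01-08", "allocated_hours": 20},
            {"emp_no": 2, "name": "Blake", "block": "B", "room_no": "205",
             "date_joined": "2024-01-15", "allocated_hours": 12},
        ],
    },
]


def to_first_normal_form(records):
    """Split nested course records into flat COURSE and EMP_ON_COURSE rows."""
    course, emp_on_course = [], []
    for record in records:
        course.append({"course_code": record["course_code"],
                       "course_desc": record["course_desc"]})
        for emp in record["employees"]:
            # Copy the course key into every row so each row stands alone.
            emp_on_course.append({"course_code": record["course_code"], **emp})
    return course, emp_on_course


course, emp_on_course = to_first_normal_form(unnormalised)
```

Each EMP_ON_COURSE row now holds only single values and is identified by Course Code together with Employee Number.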
Second Normal Form (2NF)
• An entity is in 2NF if, and only if, it is in 1NF and has no non-key attributes that depend on only part of the key
• To get to 2NF we remove part-key dependencies
• All non-key data items must be dependent on the whole of the primary key
Our Example
• Course is already in 2NF
• Emp_On_Course is not because

Attribute      Depends On
Name           Employee No
Block          Employee No
Room No        Employee No
Date Joined    Employee No + Course Code
Hours          Employee No + Course Code
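One way to test a suspected dependency is to check it against sample rows. The sketch below is an illustration only: the function `determines` and the sample data are assumptions, and sample rows can only disprove a dependency, never prove that it holds in general.

```python
# Made-up EMP_ON_COURSE rows in 1NF; employee 1 is on two courses.
emp_on_course = [
    {"course_code": "C101", "emp_no": 1, "name": "Avery", "block": "A",
     "room_no": "101", "date_joined": "2024-01-08", "allocated_hours": 20},
    {"course_code": "C102", "emp_no": 1, "name": "Avery", "block": "A",
     "room_no": "101", "date_joined": "2024-03-04", "allocated_hours": 10},
    {"course_code": "C101", "emp_no": 2, "name": "Blake", "block": "B",
     "room_no": "205", "date_joined": "2024-01-15", "allocated_hours": 12},
]


def determines(rows, lhs, rhs):
    """True if, in these rows, each lhs value maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True


part_key = determines(emp_on_course, ["emp_no"], ["name", "block", "room_no"])
needs_full_key = not determines(emp_on_course, ["emp_no"], ["date_joined"])
```

Here `part_key` being True signals a part-key dependency, which is what 2NF removes, while `needs_full_key` shows that the date joined cannot be found from the employee number alone.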
So we...
• Take out details that depend only on the employee into a separate table
• If in any doubt, ask a question such as ‘Are these fields affected when they join a course?’

Attribute      Depends On
Name           Employee No
Block          Employee No
Room No        Employee No
Cont.
COURSE
Course Code
Course Desc
EMP_ON_COURSE
Course Code
Emp No
Date Joined Course
Allocated Hours
EMPLOYEE
Emp No
Name
Block
Room No
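The move to 2NF can be sketched as a small function that separates the employee-only attributes from the attributes that need the whole key. The function name `to_second_normal_form`, the field names and the sample rows are hypothetical.

```python
# Made-up EMP_ON_COURSE rows in 1NF; employee 1 is on two courses.
emp_on_course_1nf = [
    {"course_code": "C101", "emp_no": 1, "name": "Avery", "block": "A",
     "room_no": "101", "date_joined": "2024-01-08", "allocated_hours": 20},
    {"course_code": "C102", "emp_no": 1, "name": "Avery", "block": "A",
     "room_no": "101", "date_joined": "2024-03-04", "allocated_hours": 10},
    {"course_code": "C101", "emp_no": 2, "name": "Blake", "block": "B",
     "room_no": "205", "date_joined": "2024-01-15", "allocated_hours": 12},
]


def to_second_normal_form(rows):
    """Split rows into EMPLOYEE (keyed by emp_no) and a slimmer EMP_ON_COURSE."""
    employee = {}
    emp_on_course = []
    for row in rows:
        # Attributes that depend on emp_no alone are stored once per employee.
        employee[row["emp_no"]] = {k: row[k] for k in
                                   ("emp_no", "name", "block", "room_no")}
        # Attributes that need the whole key stay with the enrolment.
        emp_on_course.append({k: row[k] for k in
                              ("course_code", "emp_no",
                               "date_joined", "allocated_hours")})
    return list(employee.values()), emp_on_course


employee, emp_on_course = to_second_normal_form(emp_on_course_1nf)
```

Employee 1's name and office location are now stored once, however many courses that employee joins.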
Problems
• Block and Room Number are related, so if one is updated the other will be affected.
• If a block name changes, every employee record in that block will have to be altered
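The update problem can be illustrated by counting how many EMPLOYEE rows a single block rename touches. The sample rows and the function `rename_block` are assumptions made for this sketch.

```python
# Made-up EMPLOYEE rows in 2NF: the block is repeated on every employee.
employee = [
    {"emp_no": 1, "name": "Avery", "block": "A", "room_no": "101"},
    {"emp_no": 2, "name": "Blake", "block": "A", "room_no": "101"},
    {"emp_no": 3, "name": "Casey", "block": "B", "room_no": "205"},
]


def rename_block(rows, old_name, new_name):
    """Rename a block in place and return how many rows had to change."""
    changed = 0
    for row in rows:
        if row["block"] == old_name:
            row["block"] = new_name
            changed += 1
    return changed


rows_changed = rename_block(employee, "A", "North")
```

One real-world change (a single block being renamed) alters two employee rows in this sample, and missing any of them would leave the data inconsistent.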
Third Normal Form (3NF)
• An entity is in 3NF if, and only if, it is in 2NF and no non-key attribute depends on another non-key attribute
• To get to 3NF we must remove attributes that depend on other non-key attributes
• This removes any mutual dependence between non-key attributes
Third Normal Form 3NF
• In other words:
“The attributes in a relation in 3NF must depend on the key, the whole key and nothing but the key”!
How to do that: Dependency
• Decide on the direction of the dependency between the attributes
• If B determines A, then A is dependent on B
• If A depends on B, create a new entity, keyed by B, with A as an attribute
• Leave B in the original entity and mark it as a foreign key, but remove A from the original entity
Our Example: Dependency
• If, given a value for A, there is only one possible value for B, then B is dependent on A
• In our example, given a value for Room No, there is only one value for Block. The reverse is not true: a block contains many rooms.
• Hence Block is dependent on Room No.
• Leave Room No in the original entity and mark it as a foreign key, but remove Block from the original entity
Our Example
• Hence the EMPLOYEE (2NF) entity becomes

EMPLOYEE
Employee No
Name
Room No *
LOCATION
Room No
Block
* Room No is a foreign key in the Employee entity
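The split of EMPLOYEE into EMPLOYEE and LOCATION can be sketched as follows. The function `to_third_normal_form` and the sample rows are hypothetical, and the sketch assumes, as the example does, that each room number belongs to exactly one block.

```python
# Made-up EMPLOYEE rows in 2NF.
employee_2nf = [
    {"emp_no": 1, "name": "Avery", "block": "A", "room_no": "101"},
    {"emp_no": 2, "name": "Blake", "block": "A", "room_no": "101"},
    {"emp_no": 3, "name": "Casey", "block": "B", "room_no": "205"},
]


def to_third_normal_form(rows):
    """Move block into a LOCATION table keyed by room_no."""
    block_by_room = {}
    employee = []
    for row in rows:
        room, block = row["room_no"], row["block"]
        # Refuse data that contradicts the assumed dependency room_no -> block.
        if block_by_room.setdefault(room, block) != block:
            raise ValueError(f"room {room} appears in more than one block")
        # room_no stays behind as the foreign key; block is removed.
        employee.append({"emp_no": row["emp_no"], "name": row["name"],
                         "room_no": room})
    location = [{"room_no": r, "block": b} for r, b in block_by_room.items()]
    return employee, location


employee, location = to_third_normal_form(employee_2nf)
```

Renaming a block now means changing one LOCATION row per room in that block, regardless of how many employees work there.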
Entity Relationship Modelling
Course
Location
Emp_On_Course
Employee
Background - Keys
• Primary key
  • Unique identifier
  • Can be made up of more than one attribute, in which case it is called a composite key
  • If there is no obvious choice, use a number
• Foreign key
  • Does not belong to the entity
  • Used to relate entity to entity
  • A primary key in another table
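To show how these keys might look in practice, here is a sketch of the four entities from the example as tables, using Python's built-in `sqlite3` module with an in-memory database. The table names, column names, column types and the helper `try_insert` are assumptions made for illustration. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE location (
    room_no TEXT PRIMARY KEY,
    block   TEXT NOT NULL
);
CREATE TABLE employee (
    emp_no  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    room_no TEXT NOT NULL REFERENCES location(room_no)  -- foreign key
);
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    course_desc TEXT NOT NULL
);
CREATE TABLE emp_on_course (
    course_code     TEXT    NOT NULL REFERENCES course(course_code),
    emp_no          INTEGER NOT NULL REFERENCES employee(emp_no),
    date_joined     TEXT    NOT NULL,
    allocated_hours INTEGER NOT NULL,
    PRIMARY KEY (course_code, emp_no)  -- composite key
);
""")


def try_insert(sql, params):
    """Run an INSERT; return False if a key constraint rejects it."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return False
    return True
```

With this schema, inserting an employee whose room is not in `location`, or enrolling the same employee on the same course twice, should be rejected by the database rather than silently stored.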
To Normalise
Follow 3 simple steps
1. Remove all repeating data elements
2. Ensure data items are dependent on the whole of the primary key
3. Remove all fields dependent on non-key fields
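A useful check after following the three steps is that no information has been lost: joining the normalised tables on their keys should give back the original flat rows. The sketch below assumes the four tables are held as lists of dictionaries; the function `rejoin` and the sample values are hypothetical.

```python
# Made-up tables in 3NF.
course = [{"course_code": "C101", "course_desc": "Spreadsheets"}]
location = [{"room_no": "101", "block": "A"}]
employee = [{"emp_no": 1, "name": "Avery", "room_no": "101"}]
emp_on_course = [{"course_code": "C101", "emp_no": 1,
                  "date_joined": "2024-01-08", "allocated_hours": 20}]


def rejoin(course, emp_on_course, employee, location):
    """Follow the foreign keys to rebuild one flat row per enrolment."""
    desc_by_code = {c["course_code"]: c["course_desc"] for c in course}
    emp_by_no = {e["emp_no"]: e for e in employee}
    block_by_room = {l["room_no"]: l["block"] for l in location}
    rows = []
    for link in emp_on_course:
        emp = emp_by_no[link["emp_no"]]
        rows.append({
            "course_code": link["course_code"],
            "course_desc": desc_by_code[link["course_code"]],
            "emp_no": emp["emp_no"],
            "name": emp["name"],
            "block": block_by_room[emp["room_no"]],
            "room_no": emp["room_no"],
            "date_joined": link["date_joined"],
            "allocated_hours": link["allocated_hours"],
        })
    return rows


flat_rows = rejoin(course, emp_on_course, employee, location)
```

Each rebuilt row carries all eight of the original fields, so the normalised design stores the same facts without repeating them.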