Data Design Normalization - Mr. Ahmad Al-Ghoul learning Objectives Explain the concept of normalization 2 A Normalization Example The three steps of data normalization are Remove all repeating groups and identify the primary key Ensure that all nonkey attributes are fully dependent on the primary key Remove any transitive dependencies, attributes which are dependent on other nonkey attributes 3 A Normalization Example A Normalization Example To show the normalization process, consider the familiar situation which might depict several entities in a school advising system: ADVISOR, COURSE, and STUDENT The relationships among the three entities are shown in the ERD below [1] 4 A Normalization Example In this example we will discuss the normalization rules for these three entities, to show how you can design a validate ERD Note that the relationship between student and course is many to many relationship (M:N) which can not create physically and as we said previously we must solve this problem by dividing this relationship to two one to many (1:M) relationship by adding a new entity called Associative entity the STUDENT table contains fields that relate to the ADVISOR and COURSE entities, so you can decide to begin with the initial design for the STUDENT table. 5 A Normalization Example Unnormalized student table STUDENT (STUDENT-NUMBER, STUDENT-NAME, TOTALCREDITS, GPA, ADVISOR-NUMBER, ADVISOR-NAME, (COURSE-NUMBER, COURSE-DESC, NUM-CREDITS, GRADE)) Repeating groups The STUDENT table is unnormalized because it contains a repeating group that represents the courses each student has taken. [1] 6 A Normalization Example 1NF To convert the STUDENT record to 1NF, you must expand the primary key to include the key of the repeating group STUDENT (STUDENT-NUMBER, STUDENTNAME, TOTAL-CREDITS, GPA, ADVISORNUMBER, ADVISOR-NAME, COURSE-NUMBER, COURSE-DESC, NUM-CREDITS, GRADE) 7 A Normalization Example Choose data items Identify keys Split groups Eliminate repeating groups UNF STUDENT-NUMBER, STUDENT-NAME, TOTAL-CREDITS, GPA, ADVISOR-NUMBER, ADVISOR-NAME, COURSE-NUMBER, COURSE-DESC, NUM-CREDITS, GRADE UNF to 1NF level 1 1 1 1 1 1 2 2 2 2 1NF STUDENT-NUMBER, STUDENT-NAME, TOTAL-CREDITS, GPA, ADVISOR-NUMBER, ADVISOR-NAME, COURSE-NUMBER, COURSE-DESC, NUM-CREDITS, GRADE 8 A Normalization Example The STUDENT table in 1NF. Notice that the primary key has been expanded to include STUDENT-NUMBER and COURSE-NUMBER. Also, the repeating group has been eliminated. [1] 9 A Normalization Example 2NF No attribute dependent on a portion of a primary key The student name, total credits, GPA, advisor number, and advisor name all relate only to the student number The course description depends on the course number Grade field depends on the entire primary key 10 A Normalization Example Does this attribute depend on the whole of the primary key? 1NF STUDENT-NUMBER, STUDENT-NAME, TOTAL-CREDITS, GPA, ADVISOR-NUMBER, ADVISOR-NAME, COURSE-NUMBER, COURSE-DESC, NUM-CREDITS, GRADE 1NF to 2NF 2NF STUDENT-NUMBER STUDENT-NAME TOTAL-CREDITS GPA ADVISOR-NUMBER ADVISOR-NAME COURSE-NUMBER COURSE-DESC NUM-CREDITS STUDENT-NUMBER COURSE-NUMBER GRADE 11 A Normalization Example You would create a new table for each field and combination of fields in the primary key STUDENT (STUDENT-NUMBER, STUDENT-NAME, TOTAL-CREDITS, GPA, ADVISOR-NUMBER, ADVISORNAME) COURSE (COURSE-NUMBER, COURSE-DESC, NUMCREDITS) GRADE (STUDENT-NUMBER, COURSE-NUMBER, GRADE) 12 A Normalization Example STUDENT, COURSE, and GRADE tables in 2NF. Notice that all fields are functionally dependent on the entire primary key of their respective tables. [1] 13 A Normalization Example 3NF Remove any transitive dependencies No attribute dependent on a nonkey attribute The COURSE and GREADE are in 3NF STUDENT is not in 3NF, because the ADVISOR-NAME field depends on the ADVISOR-NUMBER field, which is not part of the STUDENT primary key To convert STUDENT to 3NF, you remove the ADVISORNAME field from the STUDENT table and place it into a table with ADVISOR-NUMBER as the primary key 14 A Normalization Example Is this attribute dependent on any other non-key attribute(s)? 2NF STUDENT-NUMBER STUDENT-NAME TOTAL-CREDITS GPA ADVISOR-NUMBER ADVISOR-NAME COURSE-NUMBER COURSE-DESC NUM-CREDITS STUDENT-NUMBER COURSE-NUMBER GRADE 3NF STUDENT-NUMBER STUDENT-NAME TOTAL-CREDITS GPA ADVISOR-NUMBER ADVISOR-NUMBER ADVISOR-NAME COURSE-NUMBER COURSE-DESC NUM-CREDITS STUDENT-NUMBER COURSE-NUMBER GRADE 15 A Normalization Example 3NF STUDENT-NUMBER STUDENT-NAME TOTAL-CREDITS GPA ADVISOR-NUMBER Table Names STUDENT ADVISOR ADVISOR-NUMBER ADVISOR-NAME COURSE-NUMBER COURSE-DESC NUM-CREDITS STUDENT-NUMBER COURSE-NUMBER GRADE COURSE GRADE 16 A Normalization Example 3NF The final 3NF design is STUDENT (STUDENT-NUMBER, STUDENT-NAME, TOTAL-CREDITS, GPA, ADVISOR-NUMBER) ADVISOR (ADVISOR-NUMBER, ADVISOR-NAME) COURSE (COURSE-NUMBER, COURSE-DESC, NUMCREDITS) GRADE (STUDENT-NUMBER, COURSE-NUMBER, GRADE) 17 A Normalization Example ADVISOR ADVISES STUDENT RECEIVES is on COURSE SHOWS GRADE Associative entity The ERD for STUDENT, ADVISOR, and COURSE after normalization 18 A Normalization Example Compare between the ERD before normalization and after After normalization we have four entitles: STUDENT, ADVISOR, COURSE, and GRADE, which is an associative entity You can see that (M:N) relationship between STUDENT and COURSE has been converted into two 1:M relationships One relationship between STUDENT and GRADE and the other relationship between COURSE and GRADE You should know that normal forms beyond 3NF exist, but they rarely are used in business-oriented systems 19 Sequence Summary 1NF 2NF All key attributes defined No repeating groups in a table All attributes dependent on a primary key No attribute dependent on a portion of a primary key Attributes may be functionally dependent on nonkey attributes 3NF Remove any transitive dependencies No attribute dependent on a nonkey attribute 20 Sequence Summary In this Sequence we have Explained the three steps of data normalization through an example 21 Reference [1] System Analysis and Design, Sixth Edition Authors: Gary B. Shelly, Thomas J. Cashman and Harry J. Rosenblatt Publisher: SHELLY CASHMAN SEWIES. 22