Introduction to Databases
Normalization

Functional Dependency
Given a relation R, we say that attribute y of R is functionally dependent on attribute x of R if and only if each value of x is associated with precisely one value of y in R at any one time.
Example: Sup-name and City are functionally dependent on Sup-no. Also, QTY is functionally dependent on the combination (Sup-no, Part-no).

Normalization
Normalization is a step-by-step process of replacing the relationships between data with two-dimensional tables. A normalized relation must have the following properties:
1- Each entry in a table represents one data item (no repeating groups).
2- Each column is homogeneous: all of its entries are of the same kind.
3- Each column is assigned a distinct name.
4- All rows are distinct.
5- Both the rows and the columns can be viewed in any order.

Steps
1- Start from the initial (unnormalized) relation and convert it into First Normal Form (1NF).
2- Convert the 1NF relations into Second Normal Form (2NF).
3- Convert the 2NF relations into Third Normal Form (3NF).
4- Convert the 3NF relations into Optimized Third Normal Form (Optimized 3NF).

Unnormalized relation -> 1NF relations -> 2NF relations -> 3NF relations -> Optimized relations

* A relation R is in 1NF if and only if all underlying domains contain atomic values only (no repeating groups).
* A relation R is in 2NF if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key. In practice:
1- For each field in the 1NF relation, ask whether it relates to the whole key or to only part of the key.
2- If a field relates to only part of the key, move that field out of the 1NF relation into a new relation, which also contains the part of the key on which the field depends.
* Converting from 2NF to 3NF is similar to the conversion to 2NF, in that we again look for dependencies between the fields of the relation. This time the dependencies of interest are those between non-key fields: a field that depends on another non-key field is moved, together with that field, into a new relation.

Example 1: An organization needs to manage its training courses using the following data items: Employee Number, EmployeeName, Employee Address, CourseCode, CourseName, CoursePrice, CourseLocation, CoursePrerequisite.
The following semantic assumptions hold:
1- Employee Number is not duplicated.
2- CourseCode is not duplicated.
3- Courses have fixed prices and locations.
4- Each EmployeeName has a unique address.
It is required to produce an optimized 3NF database.

Abbreviations: e-n = Employee Number, e-na = EmployeeName, e-a = Employee Address, cc = CourseCode, cn = CourseName, cp = CoursePrice, cl = CourseLocation, cpre = CoursePrerequisite.

Unnormalized relation = (e-n, e-na, e-a, cc, cn, cp, cl, cpre)
1NF: R1 = (e-n, e-na, e-a) & R2 = (e-n, cc, cn, cp, cl, cpre)
2NF: R1 & R21 = (cc, cp) & R22 = (cc, cl) & R23 = (e-n, cc, cn, cpre)
3NF: R11 = (e-na, e-a) & R12 = (e-n, (e-na or e-a)) & R21 & R22 & R23
Optimized 3NF: R11 & R12 & R23 & R4 = (cc, cp, cl)
(R21 and R22 have the same key, cc, so they are merged into the single relation R4.)

Example 2: An equipment company needs to convert its documents into optimized databases. The body of the company's reports contains the following attributes: Salesperson-number, S-name, S-working-area, Customer-number, C-name, Warehouse-number, W-location, Sales.
We have the following semantic assumptions:
1- The Salesperson-number is unique.
2- No two S-names have the same S-working-area.
3- No two W-locations have the same Warehouse-number.

Example 3: A company needs to computerize its purchase orders. It uses a special document that must be filled in by the employee, and this document contains the following fields: Order-number, O-date, O-due-date, Supplier-number, S-name, S-address, Part-number, P-name, P-color, P-price, P-Quantity.
The company has the following semantic assumptions:
1- Order-numbers are never duplicated.
2- No two suppliers have the same address.
3- Same parts have the same price.
4- Same parts have the same color.
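
The functional dependency definition and the 2NF question from these notes ("does this field relate to the whole key or only part of the key?") can be tried out on sample data. The following is a minimal Python sketch, not part of the original notes. The function names fd_holds and partial_dependencies, the column names, and the sample rows are all assumptions made for illustration. A check on sample rows can only show that the data does not contradict a dependency; the real dependencies still come from the semantic assumptions.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if, in rows, each combination of lhs values is
    associated with exactly one combination of rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """For each non-key attribute, list the proper, non-empty parts of the
    key that already determine it in rows (candidates to move out for 2NF)."""
    found = {}
    for attr in non_key:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                if fd_holds(rows, list(part), [attr]):
                    found.setdefault(attr, []).append(part)
    return found


# Hypothetical sample rows for the supplier example (values are invented).
shipments = [
    {"sup_no": "S1", "sup_name": "Smith", "city": "London", "part_no": "P1", "qty": 300},
    {"sup_no": "S1", "sup_name": "Smith", "city": "London", "part_no": "P2", "qty": 200},
    {"sup_no": "S2", "sup_name": "Jones", "city": "Paris", "part_no": "P1", "qty": 400},
]

print(fd_holds(shipments, ["sup_no"], ["sup_name", "city"]))   # True
print(fd_holds(shipments, ["sup_no", "part_no"], ["qty"]))     # True
print(fd_holds(shipments, ["sup_no"], ["qty"]))                # False

# sup_name and city depend on sup_no alone; qty needs the whole key.
print(partial_dependencies(shipments, ["sup_no", "part_no"],
                           ["sup_name", "city", "qty"]))
```

In this sample, sup_name and city are reported as depending on sup_no alone, so under the 2NF rule they would move to a new relation keyed by sup_no, while qty stays with the full key (sup_no, part_no).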