IT360 Lab 7: Normalization

DUE: March 4, 2014, 2359 (paper copy BEFORE start of lab)

Some of the most difficult decisions that you face as a database developer are what tables to create, what columns to place in each table, and how to relate the tables that you create. Normalization is the process of applying a series of rules to reduce redundancy in your database and to avoid insertion, update, and deletion anomalies. Normal forms are a progression of these rules. Each successive normal form is stricter than the previous one and eliminates additional kinds of redundancy. Although we discussed several levels of normal forms, this lab focuses on 1st Normal Form (1NF) and Boyce-Codd Normal Form (BCNF). If you do not understand functional dependencies, review the discussion of functional dependencies in the slides and your notes.

1. Functional Dependencies:

Consider the following relation R with attributes A, B, and C, with 4 rows in it:

A  B  C
1  3  2
1  4  2
1  5  2
2  6  2

For each of the following functional dependencies, specify whether the dependency is true or false for the specific instance of the table given above:

              True/False
A → B         _________
A → C         _________
B → A         _________
B → C         _________
C → A         _________
C → B         _________
(A,B) → C     _________
(A,C) → B     _________
(B,C) → A     _________

2: 1st Normal Form (1NF)

Consider the Students table, with the primary key underlined, and the following data:

Students:

Alpha    Name           Email             Courses              StudentGrades
100111   John Doe       doe@usna.edu      NN204, SI204, IT221  C,B,B
092244   Matt Smith     smith@usna.edu    SM223, EE301         A,A
113221   Melinda Black  black@usna.edu    SI204                B
090112   Tom Johnson    Johnson@usna.edu  NN204, SI204, IT221  A,C,B

a) Is the Students table in 1NF? Why?

b) If the Students table is not in 1NF, redesign the table such that the resulting tables are in 1NF. For each of the resulting tables, give the table name, column names, primary keys, and foreign keys. Write a few rows in each of the new tables, based on the rows in the initial table.
Hint: remember the Enroll table that we used in class while learning SQL.

3. Functional Dependencies and BCNF

Consider the following relational schema:

StudentCourses(studentID, studentName, courseID, profID, profOffice)

Each row in table StudentCourses encodes the fact that the student with the given ID and name took the given course from the professor with the given ID and office. Assume that students have unique IDs but not necessarily unique names, and that professors have unique IDs but several professors might share an office. Each student has only one name; each professor has only one office.

a) Specify the functional dependencies for table StudentCourses that encode the assumptions described above and no additional assumptions.

b) Specify a possible primary key for the StudentCourses table, based on the stated assumptions.

c) Give one example of a deletion anomaly in the StudentCourses table.

d) Is StudentCourses in Boyce-Codd Normal Form (BCNF)? Why?

e) If StudentCourses is not in BCNF, decompose StudentCourses such that the resulting tables are in BCNF. For each of the resulting tables, give the table name, column names, primary keys, and foreign keys. (No need to write the actual data in the resulting tables.)

4: SQL

Given the following tables:

ITEM(ItemID, Description, PurchaseDate, Store, City, Weight, PriceInLocalCurrency, ExchangeRate)
SHIPMENT(ShipmentID, ShipperName, ShipperInvoiceNumber, DepartureDate, ArrivalDate, InsuredValue)
SHIPPED_ITEM(ShipmentID, ShipmentItemNb, ItemID, Value)

Write the ANSI SQL query to find the ItemID and Description for the items with the lowest shipped Value.

5: (Extra credit) Boyce-Codd Normal Form (BCNF)

Below is the Rentals table created for the "Movies On-demand" division of CamCast.
Rentals:

RentalID  Title                  CustomerID  RentalDate  Director                 MovieCategory  Price
1         Die Hard III           1111        3/3/2012    John McTiernan           Old            $4.25
2         The last man standing  1111        3/3/2012    Walter Hill              Old            $4.25
3         Wedding Crashers       1111        3/3/2012    David Dobkin             New            $5.50
4         Dodgeball              1222        3/4/2012    Rawson Marshall Thurber  New            $5.50
5         Die Hard III           1222        3/4/2012    John McTiernan           Old            $4.25
6         As good as it gets     1332        1/7/2013    James Brooks             Old            $4.25
7         Forest Gump            1111        1/7/2013    Robert Zemeckis          Old            $4.25

The primary key of the Rentals table is RentalID.

d) Explain (in English) the conditions under which the following functional dependency is true: RentalID → CustomerID

e) Based on the sample data in the table, is the functional dependency RentalID → CustomerID true?

f) Explain the conditions under which the following functional dependency is true: Director → Title

g) Based on the sample data in the table, is the functional dependency Director → Title true?

h) Based on your general knowledge of movies and rentals, is the functional dependency Director → Title true?

i) Write a functional dependency that expresses the fact that all movies in a given category have the same price.

j) We discussed insertion anomalies, deletion anomalies, and update anomalies as examples of problems that can appear in tables that are not normalized. The following is an example of an insertion anomaly in the Rentals table: if we want to create a new category of movies, "Must See", there is no way to store the price of this type of movie in the database until someone rents a movie in this category and the rental information is recorded in the Rentals table. Give one example of a deletion anomaly in the Rentals table.

k) State what you believe are reasonable functional dependencies for the Rentals table for a Movies On-demand business (include the functional dependencies from points a) to f) that you believe are/should be true).
l) Given your answer above, decompose the Rentals table such that the resulting tables are in BCNF. For each of the resulting tables, give the table name, column names, primary keys, and foreign keys. (No need to write the actual data in the resulting tables.)

6 (Extra credit): 4th Normal Form (4NF)

Do some research (internet, textbook, slides) to learn about multivalued dependencies and 4NF. Suppose we have the following Courses table with columns CourseID, Instructor, and Book, which stores the courses, the instructors teaching each course, and the recommended books for each course. The books recommended for a course do not depend on the instructor teaching the course, only on the course. Here is an example instance of this table:

Courses:

CourseID  Instructor  Book
IT360     J. Jones    Kroenke
IT360     J. Jones    Welling
IT360     Burke       Kroenke
IT360     Burke       Welling
SI440     J. Jones    Kroenke
SI440     J. Jones    Ramakrishnan
SI440     J. Jones    Stonebraker

a) Give an example of a multivalued dependency in the Courses table.

b) Is the Courses table in 4NF? If the answer is yes, say why. If not, decompose the table such that the resulting tables are in 4th normal form. For each of the resulting tables, give the table name, column names, primary keys, and foreign keys.

Turn in paper copies (due before start of lab on March 4, 2014):

Electronic (NOT required for this lab):
1. Upload the yourlastname_yourfirstname_lab7.doc file containing all answers, including the SQL question, to the Lab 7 assignment on Blackboard.

Hard-copies:
1. The completed assignment coversheet. Your comments will help us improve the course.
2. A hard copy of your answers for each question.
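
Study note: the instance-level check used in question 1 (does a functional dependency hold for the rows shown?) can be sketched in a few lines of Python. This is only an illustration and not part of the required answers. The function name fd_holds and the sample relation with attributes X, Y, and Z are hypothetical and deliberately different from the tables in this lab. The idea is that a dependency holds in an instance when no two rows agree on the left-hand side but differ on the right-hand side.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this instance.

    rows: list of dicts mapping attribute name to value (one dict per row)
    lhs, rhs: lists of attribute names
    """
    seen = {}  # maps each left-hand-side value to the first right-hand-side value seen
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Two rows with the same left value but different right values
        # violate the dependency.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample instance (NOT a table from this lab).
rows = [
    {"X": 1, "Y": "a", "Z": 10},
    {"X": 1, "Y": "a", "Z": 20},
    {"X": 2, "Y": "b", "Z": 20},
]

print(fd_holds(rows, ["X"], ["Y"]))       # True: each X value has a single Y value
print(fd_holds(rows, ["X"], ["Z"]))       # False: X = 1 appears with Z = 10 and Z = 20
print(fd_holds(rows, ["X", "Z"], ["Y"]))  # True: composite left-hand side
```

Keep in mind that a check like this only describes one instance of a table. Whether a dependency should hold in general depends on the rules of the business, which is the distinction that parts g) and h) of question 5 ask about.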