Homework 1 3.34. Consider a CONFERENCE_REVIEW database in which researchers submit their research papers for consideration. Reviews by reviewers are recorded for use in the paper selection process. The database system caters primarily to reviewers who record answers to evaluation questions for each paper they review and make recommendations regarding whether to accept or reject the paper. The data requirements are summarized as follows: ▪ Authors of papers are uniquely identified by e-mail id. First and last names are also recorded. ▪ Each paper is assigned a unique identifier by the system and is described by a title, abstract, and the name of the electronic file containing the paper. ▪ A paper may have multiple authors, but one of the authors is designated as the contact author. ▪ Reviewers of papers are uniquely identified by e-mail address. Each reviewer’s first name, last name, phone number, affiliation, and topics of interest are also recorded. ▪ Each paper is assigned between two and four reviewers. A reviewer rates each paper assigned to him or her on a scale of 1 to 10 in four categories: technical merit, readability, originality, and relevance to the conference. Finally, each reviewer provides an overall recommendation regarding each paper. ▪ Each review contains two types of written comments: one to be seen by the review committee only and the other as feedback to the author(s). Design an entity–relationship diagram for the CONFERENCE_REVIEW database and build the design using a data modeling tool such as ERwin or Rational Rose. 4.30. Consider a database system for a baseball organization such as the major leagues. The data requirements are summarized as follows: ▪ The personnel involved in the league include players, coaches, managers, and umpires. Each is identified by a unique personnel id. They are also described by their first and last names along with the date and place of birth. ▪ Players are further described by other attributes such as their batting orientation (left, right, or switch) and have a lifetime batting average (BA). ▪ Within the players group is a subset of players called pitchers. Pitchers have a lifetime ERA (earned run average) associated with them. ▪ Teams are uniquely identified by their names. Teams are also described by the city in which they are located and the division and league in which they play (such as Central division of the American League). ▪ Teams have one manager, a number of coaches, and a number of players. ▪ Games are played between two teams, with one designated as the home team and the other the visiting team on a particular date. The score (runs, hits, and errors) is recorded for each team. The team with the most runs is declared the winner of the game. ▪ With each finished game, a winning pitcher and a losing pitcher are recorded. In case there is a save awarded, the save pitcher is also recorded. ▪ With each finished game, the number of hits (singles, doubles, triples, and home runs) obtained by each player is also recorded. Design an enhanced entity–relationship diagram for the BASEBALL database and enter the design using a data modeling tool such as ERwin or Rational Rose. 5.11. Suppose that each of the following Update operations is applied directly to the database state shown in Figure 5.6. Discuss all integrity constraints violated by each operation, if any, and the different ways of enforcing these constraints. a. Insert <‘Robert’, ‘F’, ‘Scott’, ‘943775543’, ‘1972-06-21’, ‘2365 Newcastle Rd, Bellaire, TX’, M, 58000, ‘888665555’, 1> into EMPLOYEE. b. Insert <‘ProductA’, 4, ‘Bellaire’, 2> into PROJECT. c. Insert <‘Production’, 4, ‘943775543’, ‘2007-10-01’> into DEPARTMENT. d. Insert <‘677678989’, NULL, ‘40.0’> into WORKS_ON. e. Insert <‘453453453’, ‘John’, ‘M’, ‘1990-12-12’, ‘spouse’> into DEPENDENT. f. Delete the WORKS_ON tuples with Essn = ‘333445555’. g. Delete the EMPLOYEE tuple with Ssn = ‘987654321’. h. Delete the PROJECT tuple with Pname = ‘ProductX’. i. Modify the Mgr_ssn and Mgr_start_date of the DEPARTMENT tuple with Dnumber = 5 to ‘123456789’ and ‘2007-10-01’, respectively. j. Modify the Super_ssn attribute of the EMPLOYEE tuple with Ssn = ‘999887777’ to ‘943775543’. k. Modify the Hours attribute of the WORKS_ON tuple with Essn = ‘999887777’ and Pno = 10 to ‘5.0’. 9.4. Figure 9.8 shows an ER schema for a database that can be used to keep track of transport ships and their locations for maritime authorities. Map this schema into a relational schema and specify all primary keys and foreign keys. 9.5. Map the BANK ER schema of Exercise 3.23 (shown in Figure 3.21 4.9) into a relational schema. Specify all primary keys and foreign keys. Repeat for the AIRLINE schema (Figure 3.20) of Exercise 3.19 and for the other schemas for Exercises 3.16 through 3.24. Homework 2 8.16. Specify the following queries on the COMPANY relational database schema shown in Figure 5.5 using the relational operators discussed in this chapter. Also show the result of each query as it would apply to the database state in Figure 5.6. a. Retrieve the names of all employees in department 5 who work more than 10 hours per week on the ProductX project. d. For each project, list the project name and the total hours per week (by all employees) spent on that project. e. Retrieve the names of all employees who work on every project. g. For each department, retrieve the department name and the average salary of all employees working in that department. i. Find the names and addresses of all employees who work on at least one project located in Houston but whose department has no location in Houston. 8.18. Consider the LIBRARY relational database schema shown in Figure 8.14, which is used to keep track of books, borrowers, and book loans. Referential integrity constraints are shown as directed arcs in Figure 8.14, as in the notation of Figure 5.7. Write down relational expressions for the following queries: b. How many copies of the book titled The Lost Tribe are owned by each library branch? d. For each book that is loaned out from the Sharpstown branch and whose Due date is today, retrieve the book title, the borrower’s name, and the borrower’s address. f. Retrieve the names, addresses, and number of books checked out for all borrowers who have more than five books checked out. 8.19. Specify the following queries in relational algebra on the database schema given in Exercise 5.14: CUSTOMER(Cust#, Cname, City) ORDER(Order#, Odate, Cust#, Ord_amt) ORDER_ITEM(Order#, Item#, Qty) ITEM(Item#, Unit_price) SHIPMENT(Order#, Warehouse#, Ship_date) WAREHOUSE(Warehouse#, City) d. List the orders that were not shipped within 30 days of ordering. e. List the Order# for orders that were shipped from all warehouses that the company has in New York. Homework 3 6.10. Specify the following queries in SQL on the COMPANY relational database schema shown in Figure 5.5. Show the result of each query if it is applied to the COMPANY database in Figure 5.6. a. Retrieve the names of all employees in department 5 who work more than 10 hours per week on the ProductX project. b. List the names of all employees who have a dependent with the same first name as themselves. c. Find the names of all employees who are directly supervised by ‘Franklin Wong’. 6.12. Specify the following queries in SQL on the database schema of Figure 1.2. b. Retrieve the names of all courses taught by Professor King in 2007 and 2008. d. Retrieve the name and transcript of each senior student (Class = 4) majoring in CS. A transcript includes course name, course number, credit hours, semester, year, and grade for each course completed by the student. Homework 5 14.24. Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} and the set of functional dependencies F = {{A, B} → {C}, {A} → {D, E}, {B} → {F}, {F} → {G, H}, {D} → {I, J}}. What is the key for R? Decompose R into 2NF and then 3NF relations. 14.25. Repeat Exercise 14.24 above for the following different set of functional dependencies: G = {{A, B} → {C}, {B, D} → {E, F}, {A, D} → {G, H}, {A} → {I}, {H} → {J}}. 15.26. Apply Algorithm 15.2(a) to the relation in Exercise 14.24 to determine a key for R. Create a minimal set of dependencies G that is equivalent to F, and apply the synthesis algorithm (Algorithm 15.4) to decompose R into 3NF relations. 15.31. Consider the following decompositions for the relation schema R of Exercise 14.24. Determine whether each decomposition has (1) the dependency preservation property, and (2) the lossless join property, with respect to F. Also determine which normal form each relation in the decomposition is in. a. D1 = {R1, R2, R3, R4, R5}; R1 = {A, B, C}, R2 = {A, D, E}, R3 = {B, F}, R4 = {F, G, H}, R5 = {D, I, J} b. D2 = {R1, R2, R3}; R1 = {A, B, C, D, E}, R2 = {B, F, G, H}, R3 = {D, I, J} c. D3 = {R1, R2, R3, R4, R5}; R1 = {A, B, C, D}, R2 = {D, E}, R3 = {B, F}, R4 = {F, G, H}, R5 = {D, I, J} 5.14. Consider the following six relations for an order-processing database application in a company: CUSTOMER(Cust#, Cname, City) ORDER(Order#, Odate, Cust#, Ord_amt) ORDER_ITEM(Order#, Item#, Qty) ITEM(Item#, Unit_price) SHIPMENT(Order#, Warehouse#, Ship_date) WAREHOUSE(Warehouse#, City) Here, Ord_amt refers to total dollar amount of an order; Odate is the date the order was placed; and Ship_date is the date an order (or part of an order) is shipped from the warehouse. Assume that an order can be shipped from several warehouses. Specify the foreign keys for this schema, stating any assumptions you make. What other constraints can you think of for this database? Mentioned by Exercise 9.5 3.16. Which combinations of attributes have to be unique for each individual SECTION entity in the UNIVERSITY database shown in Figure 3.20 to enforce each of the following miniworld constraints: a. During a particular semester and year, only one section can use a particular classroom at a particular DaysTime value. b. During a particular semester and year, an instructor can teach only one section at a particular DaysTime value. c. During a particular semester and year, the section numbers for sections offered for the same course must all be different. Can you think of any other similar constraints? 3.17. Composite and multivalued attributes can be nested to any number of levels. Suppose we want to design an attribute for a STUDENT entity type to keep track of previous college education. Such an attribute will have one entry for each college previously attended, and each such entry will be composed of college name, start and end dates, degree entries (degrees awarded at that college, if any), and transcript entries (courses completed at that college, if any). Each degree entry contains the degree name and the month and year the degree was awarded, and each transcript entry contains a course name, semester, year, and grade. Design an attribute to hold this information. Use the conventions in Figure 3.5. 3.18. Show an alternative design for the attribute described in Exercise 3.17 that uses only entity types (including weak entity types, if needed) and relationship types. 3.19. Consider the ER diagram in Figure 3.21, which shows a simplified schema for an airline reservations system. Extract from the ER diagram the requirements and constraints that produced this schema. Try to be as precise as possible in your requirements and constraints specification. 3.20. In Chapters 1 and 2, we discussed the database environment and database users. We can consider many entity types to describe such an environment, such as DBMS, stored database, DBA, and catalog/data dictionary. Try to specify all the entity types that can fully describe a database system and its environment; then specify the relationship types among them, and draw an ER diagram to describe such a general database environment. 3.21. Design an ER schema for keeping track of information about votes taken in the U.S. House of Representatives during the current two-year congressional session. The database needs to keep track of each U.S. STATE’s Name (e.g., ‘Texas’, ‘New York’, ‘California’) and include the Region of the state (whose domain is {‘Northeast’, ‘Midwest’, ‘Southeast’, ‘Southwest’, ‘West’}). Each CONGRESS_PERSON in the House of Representatives is described by his or her Name, plus the District represented, the Start_date when the congressperson was first elected, and the political Party to which he or she belongs (whose domain is {‘Republican’, ‘Democrat’, ‘Independent’, ‘Other’}). The database keeps track of each BILL (i.e., proposed law), including the Bill_name, the Date_of_vote on the bill, whether the bill Passed_or_failed (whose domain is {‘Yes’, ‘No’}), and the Sponsor (the congressperson(s) who sponsored—that is, proposed—the bill). The database also keeps track of how each congressperson voted on each bill (domain of Vote attribute is {‘Yes’, ‘No’, ‘Abstain’, ‘Absent’}). Draw an ER schema diagram for this application. State clearly any assumptions you make. 3.22. A database is being constructed to keep track of the teams and games of a sports league. A team has a number of players, not all of whom participate in each game. It is desired to keep track of the players participating in each game for each team, the positions they played in that game, and the result of the game. Design an ER schema diagram for this application, stating any assumptions you make. Choose your favorite sport (e.g., soccer, baseball, football). 3.23. Consider the ER diagram shown in Figure 3.22 for part of a BANK database. Each bank can have multiple branches, and each branch can have multiple accounts and loans. a. List the strong (nonweak) entity types in the ER diagram. b. Is there a weak entity type? If so, give its name, partial key, and identifying relationship. c. What constraints do the partial key and the identifying relationship of the weak entity type specify in this diagram? d. List the names of all relationship types, and specify the (min, max) constraint on each participation of an entity type in a relationship type. Justify your choices. e. List concisely the user requirements that led to this ER schema design. f. Suppose that every customer must have at least one account but is restricted to at most two loans at a time, and that a bank branch cannot have more than 1,000 loans. How does this show up on the (min, max) constraints? 3.24. Consider the ER diagram in Figure 3.23. Assume that an employee may work in up to two departments or may not be assigned to any department. Assume that each department must have one and may have up to three phone numbers. Supply (min, max) constraints on this diagram. State clearly any additional assumptions you make. Under what conditions would the relationship HAS_PHONE be redundant in this example?