Concepts of Database Management, Fifth Edition

Chapter 6
Database Design 2: Design Methodology

At a Glance

Table of Contents
Overview
Objectives
Instructor Notes
Quick Quizzes
Key Terms

Lecture Notes

Overview

In this chapter, students learn about the process of determining the tables (relations) and columns (attributes) that will make up the database. Students also learn to determine the relationships between the various tables.

Students discover that database design is a two-step process. In the first step, information-level design, database designers design a database that satisfies the organization’s requirements as cleanly as possible. In the second step, physical-level design, designers transform the information-level design into a design for the specific DBMS that the organization will use. Information-level design is completed independently of any particular database management system.

After examining the information-level design process, students explore general database design methodology. They learn how to construct entity-relationship diagrams. Students compare top-down and bottom-up approaches to database design. They learn how to use survey forms and existing documents in database design. They learn about entity subtypes. Finally, students learn about the entity-relationship model.

Chapter Objectives

Discuss the general process and goals of database design.
Define user views and explain their function.
Define Database Design Language (DBDL) and use it to document database designs.
Create an entity-relationship (E-R) diagram to visually represent a database design.
Present a methodology for database design at the information level and view examples illustrating this methodology.
Explain the physical-level design process.
Discuss top-down and bottom-up approaches to database design and examine the advantages and disadvantages of both methods.
Use a survey form to obtain information from users prior to beginning the database design process.
Review existing documents to obtain information prior to beginning the database design.
Discuss special issues related to implementing one-to-one relationships and many-to-many relationships involving more than two entities.
Discuss entity subtypes and their relationships to nulls.
Learn how to avoid potential problems when merging third normal form relations.
Examine the entity-relationship model for representing and designing databases.

Instructor Notes

User Views

Database design is a two-step process. The first step is called information-level design, and it is completed independently of any particular DBMS. In the second step, physical-level design, designers transform the information-level design into a design for a specific DBMS.

A user view is a set of requirements that is necessary to support the operations of a particular database user. For each user view, designers must design the database structure to support the view and then merge it into a cumulative design, that is, a design that supports all the user views encountered thus far in the design process.

Information-Level Design Methodology

The information-level design methodology presented in this text involves representing individual user views, refining them to eliminate any problems, and then merging them into a cumulative design. When creating user views, a “user” can be a person or group that will use the system, a report, or a type of transaction. For each user view, the methodology requires you to complete the following steps:

1. Represent the user view as a collection of tables.
2. Normalize these tables.
3. Identify all keys in these tables.
4. Merge the results of Steps 1 through 3 into the cumulative design.

Represent the User View as a Collection of Tables

When given a user view or some sort of stated requirement, you must develop a collection of tables that will support it. The steps involved in doing so are:
1. Determine the entities involved and create a separate table for each type of entity.
2. Determine the primary keys for each of these tables.
3. Determine the properties for each of these entities.
4. Determine the relationships among the entities.

There are three types of relationships: one-to-many, many-to-many, and one-to-one. There is a one-to-many relationship between the Rep table and the Customer table in the Premiere Products database. There is a many-to-many relationship between the Orders table and the Part table. To implement a many-to-many relationship, create a new table whose primary key is the combination of the primary keys of the original tables. In the Premiere Products database, the primary key for the OrderLine table is the combination of PartNum and OrderNum. The simplest way to implement a one-to-one relationship is to treat it as a one-to-many relationship.

Normalize the Tables

Once relationships are established between entities, the next task is to normalize each table, with the target being third normal form.

Represent All Keys

For each table, you must identify the primary key, any alternate keys, secondary keys, and foreign keys. An alternate key is a column or collection of columns that could have been chosen as the primary key but was not. Secondary keys are columns that are of interest strictly for the purpose of retrieval. A foreign key is a column or collection of columns in one table that is required either to match the value of the primary key for some row in another table or to be null. Foreign keys are used to enforce referential integrity.

Types of Primary Keys

There are three types of primary keys that can be used in a database design: natural key, artificial key, and surrogate key. A natural key (also called a logical key or an intelligent key) is a primary key that consists of a column that uniquely identifies an entity. Such a column is inherent to the entity and visible to the user.
A column that is created for an entity to serve solely as the primary key and that is visible to users is called an artificial key. A surrogate key (or synthetic key) is a system-generated primary key that is usually hidden from users.

Database Design Language (DBDL)

Database Design Language (DBDL) is a mechanism for representing tables and keys. In DBDL, you represent a table by listing all columns and then underlining the primary key. Below the table definition, you list any alternate keys, secondary keys, and foreign keys. Use Figure 6.1 to explain DBDL.

Entity-Relationship Diagrams

An entity-relationship (E-R) diagram visually represents the structure of a database. There are several different styles of E-R diagrams. This text uses a style called IDEF1X. Use Figure 6.2 to describe E-R diagrams.

Quick Quiz

1. A(n) _____ key is a column that is used strictly for retrieval purposes.
   Answer: secondary
2. A(n) _____ key is a system-generated primary key that is usually hidden from users.
   Answer: surrogate (or synthetic)
3. A popular type of diagram that visually represents the structure of a database is the _____ diagram.
   Answer: entity-relationship (E-R)

Merge the Result into the Design

As soon as you have completed Steps 1 through 3 (represent the user view as a collection of tables, normalize the tables, identify all keys) for a given user view, you can merge the results into the cumulative design. You combine tables that have the same primary key to form a new table, eliminate duplicate columns, and check the design to determine whether the new tables are in third normal form. Repeat the process for each user view.

Database Design Examples

This section walks you through the information-level design of the two databases used in this text, Premiere Products and Henry Books. Spend time working through these examples with students.

Physical-Level Design

The physical-level design begins after the information-level design is completed. You must implement the design for a specific DBMS.
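As a rough illustration of carrying an information-level design into one specific DBMS, the following sketch uses SQLite through Python's standard sqlite3 module. The Rep, Customer, Orders, Part, and OrderLine tables and the OrderNum and PartNum columns come from the Premiere Products discussion earlier in these notes; the remaining column names, the choice of Description as an alternate key, and all sample values are assumptions made only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Rep (
    RepNum TEXT PRIMARY KEY                      -- primary key
);
CREATE TABLE Customer (
    CustomerNum  TEXT PRIMARY KEY,
    CustomerName TEXT NOT NULL,                  -- hypothetical column
    RepNum       TEXT REFERENCES Rep (RepNum)    -- foreign key, may be null
);
-- Secondary key: an index that supports rapid retrieval by name.
CREATE INDEX CustomerNameIdx ON Customer (CustomerName);
CREATE TABLE Orders (
    OrderNum    TEXT PRIMARY KEY,
    CustomerNum TEXT NOT NULL REFERENCES Customer (CustomerNum)
);
CREATE TABLE Part (
    PartNum     TEXT PRIMARY KEY,
    Description TEXT UNIQUE                      -- hypothetical alternate key
);
-- Many-to-many between Orders and Part: a new table whose primary key
-- combines the primary keys of the two original tables.
CREATE TABLE OrderLine (
    OrderNum   TEXT NOT NULL REFERENCES Orders (OrderNum),
    PartNum    TEXT NOT NULL REFERENCES Part (PartNum),
    NumOrdered INTEGER,                          -- hypothetical column
    PRIMARY KEY (OrderNum, PartNum)
);
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample rows.
try_insert("INSERT INTO Rep VALUES (?)", ("R1",))
try_insert("INSERT INTO Customer VALUES (?, ?, ?)", ("C1", "Example Customer", "R1"))
try_insert("INSERT INTO Orders VALUES (?, ?)", ("O1", "C1"))
try_insert("INSERT INTO Part VALUES (?, ?)", ("P1", "Example part"))
try_insert("INSERT INTO OrderLine VALUES (?, ?, ?)", ("O1", "P1", 5))

results = {
    # A foreign key may be null.
    "customer_with_null_rep": try_insert(
        "INSERT INTO Customer VALUES (?, ?, ?)", ("C2", "No Rep Yet", None)),
    # A non-null foreign key must match a primary key value in Rep.
    "customer_with_unknown_rep": try_insert(
        "INSERT INTO Customer VALUES (?, ?, ?)", ("C3", "Bad Rep", "R9")),
    # The composite primary key rejects a repeated (OrderNum, PartNum) pair.
    "duplicate_order_line": try_insert(
        "INSERT INTO OrderLine VALUES (?, ?, ?)", ("O1", "P1", 2)),
}
print(results)
```

In this sketch the DBMS itself enforces the keys. Other systems may expose the same ideas through different syntax or require them to be enabled explicitly, as the foreign-key pragma does here.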
Most DBMSs support primary, alternate, secondary, and foreign keys. If a DBMS does not support these keys, you must devise an alternative scheme to ensure the uniqueness of primary and alternate keys. You also must ensure that values in foreign keys are legitimate, that is, that they match the value of the primary key in some row in another table. For secondary keys, it should be possible to retrieve data rapidly on the basis of a value of the secondary key.

Top-Down Versus Bottom-Up

You can design a database using a bottom-up design methodology or a top-down design methodology. In a bottom-up design methodology, specific user requirements are synthesized into a design. A top-down design methodology begins with a general database design that models the overall enterprise and then repeatedly refines the model to achieve a design that supports all necessary applications. One advantage of a top-down approach is that it gives the project a more global feel. A bottom-up approach provides a rigorous way of tackling each separate requirement and ensuring that it will be met. The ideal strategy combines the best of both approaches.

Survey Form

When designing a database, it may be helpful to design a survey form to obtain the required information from users. The survey form should contain the following information: entity information, attribute (column) information, relationships, functional dependencies, and processing information. If you want to survey a large population of users, consider using web-based survey forms. There are several inexpensive survey packages that allow you to create forms and store the responses electronically for later analysis. You also can use Microsoft FrontPage to create web-based survey forms.

Obtaining Information from Existing Documents

Existing documents, such as invoices, can provide helpful information concerning the database design.
If an organization has a computerized system, current file layouts can furnish additional information about entities and attributes.

One-to-One Relationship Considerations

This section discusses potential problems and possible solutions for implementing one-to-one relationships between tables.

Many-to-Many Relationship Considerations

This section discusses potential problems and possible solutions for implementing many-to-many relationships between tables. It also describes many-to-many-to-many relationships that involve three entities.

Nulls and Entity Subtypes

A null represents the absence of a value in a field. Nulls are used when a value is either unknown or inapplicable. Suggested methods for handling null values are discussed. When you remove a column that contains null values from a table and create a separate table containing only non-null values, you create an entity subtype. There are specific methods for representing entity subtypes in E-R diagrams. Use Figures 6.27 through 6.29 to explain creating entity subtypes.

Avoiding Problems with Third Normal Form When Merging Tables

When you combine third normal form tables, the result may not be in third normal form. You can attempt to avoid the problem of creating a table that is not in third normal form by being cautious when representing user views. The problem occurs when a column A in one user view functionally determines a column B in a second user view. Thus, column A is a determinant for column B, yet column A is not a column in the second user view.

The Entity-Relationship Model

The entity-relationship (E-R) model is an approach to representing data in a database. There are several different versions, but this section examines the E-R model proposed by Chen. This model has wide acceptance in the database field. In the standard E-R model, entities are represented as rectangles, relationships are drawn as diamonds, and attributes are represented as ovals. Each object has a name.
Relationship types are indicated by placing letters on the lines that connect entities. Use Figures 6.34 through 6.43 to explain E-R diagrams.

Key Terms

All key terms are defined in the Glossary section of the textbook.

artificial key
bottom-up design methodology
cardinality
category
complete category
composite entity
cumulative design
Database Design Language (DBDL)
dependent entity
entity subtype
entity-relationship (E-R) diagram
IDEF1X
identifying relationship
incomplete category
independent entity
information-level design
intelligent key
logical key
mandatory role
many-to-many relationship
many-to-many-to-many relationship
natural key
nonidentifying relationship
optional role
physical-level design
secondary key
surrogate key
synthetic key
top-down design methodology
user view
weak entity