Database Design and Maintenance Review • Proper database design ensures that the data is represented properly, tables are joined correctly, and that data can be easily and accurately retrieved. Relational Database Terms • Relation = Table • Tuple = Row • Attribute = Field Relational Database Characteristics • No two tuples can be exactly the same. • The order of tuples has no significance. • Each attribute must describe the relation, and have a unique name. • Each attribute can have only one value in a tuple. • An attribute must have the same set of possible values (domain) in all tuples. Building a Relational Database • • • • Designing tables Creating tables Joining tables Designing and creating other objects Recognizing Table Types • Master tables – contain data about people and things. • Lookup tables – contain data about groups or categories of information. • Bridge or Transaction tables – contain data about transactions and events. Often used to simplify many-to-many joins. Identifying a Primary Key • One or more fields that uniquely identify each record • Primary key field must not be blank in any record. Data Dependency • Functional dependency - when any attribute determines the value of another attribute. • Transitive dependency – when a non-key attribute determines the value of another attribute. • Partial dependency – when only one field in a multiple-field primary key determines the value of another attribute. Well-Structured Relations What constitutes a well-structured relation? Intuitively, a well-structured relation contains minimal redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies. EmpID Name Dept Salary 230 Pillsbury Marketing 58,000 241 Marshall Finance 68,400 277 Marco Accounting 66,000 279 Gunston Marketing 42,400 290 Jaffe Planning 49,000 EMPLOYEE1 Table Well-Structured Relations EMPLOYEE1 is a well-structured relation. Each row of the table contains data describing one employee, and any modification of an employee’s data (such as a change in salary) is confined to one row in the table. EmpID Name Dept Salary 230 Pillsbury Marketing 58,000 241 Marshall Finance 68,400 277 Marco Accounting 66,000 279 Gunston Marketing 42,400 290 Jaffe Planning 49,000 EMPLOYEE1 Table Well-Structured Relations In contrast, EMPLOYEE2 is not a well-structured relation. Notice the redundancy. For example, values for EmpID, Name, Dept, and Salary appear in two separate rows for employees 241 and 290. EmpID Name Dept Salary Course Date 230 Pillsbury Marketing 58,000 C++ 2/12/06 241 Marshall Finance 68,400 SPSS 5/30/07 241 Marshall Finance 68,400 Web Design 11/2/08 277 Marco Accounting 66,000 C# 12/8/07 279 Gunston Marketing 42,400 Java 9/10/06 290 Jaffe Planning 49,000 Tax Acct 4/22/06 290 Jaffe Planning 49,000 Bus Adm 6/6/08 EMPLOYEE2 Table Data Anomalies Redundancies in a table may result in errors or inconsistencies (called anomalies) when a user attempts to update the data in the table. There are three types of data anomalies: • Insertion anomaly – Suppose we need to add a new employee to EMPLOYEE2. Since the primary key is (EmpID, Course), to insert a row both EmpID and Course must be supplied. This is an anomaly, because the user should be able to enter employee data without supplying Course data. • Deletion anomaly – Suppose that the data for employee 241 are deleted. This will result in losing information that this employee completed a course (SPSS) on 5/30/07. • Modification anomaly – Suppose that employee 290 gets a salary increase. We must record the increase in each of the rows for that employee; otherwise the data will be inconsistent. The problem with relation EMPLOYEE2 is that it contains data about two entities: EMPLOYEE and COURSE. We will use normalization techniques to split EMPLOYEE2 into two relations, one for employee data and one for course data. Normalizing Tables On the previous four slides we presented an intuitive discussion of wellstructured relations. We need a more formal procedure for designing them. Normalization is the process of successively reducing relations with anomalies to produce smaller, well-structured relations. Some of the goals are: Minimize data redundancy, thereby avoiding anomalies and conserving storage space. Simplify the enforcement of referential integrity constraints. Make it easier to maintain data (insert, delete, update). Provide a better design that is an improved representation of the real world and a stronger basis for future growth. First Normal Form • All fields describe the entity represented by the table. • All fields contain simplest possible values. • No multivalued attributes (also called repeating groups). 1st NORMAL City Chicago State IL NOT Home Town Chicago, IL Not 1NF EmpID Dept CourseName DateCompleted 203 Finance Tax Accounting 6/22/07 421 Info Systems Java Database Mgt 10/7/07 6/4/06 666 Marketing A multivalued attribute Another multivalued attribute 1NF - Eliminating multivalued attributes EmpID Dept CourseName DateCompleted 203 Finance Tax Accounting 6/22/07 421 Info Systems Java 10/7/07 421 Info Systems Database Mgt 6/4/06 666 Marketing This new table does have only single-valued attributes and so satisfies 1NF. However, as we saw, the table still has some undesirable properties. Second Normal Form • Table is in First Normal Form. • No partial dependencies exist. (No nonkey fields are determined by only part of a multiple-field primary key, i.e., nonkeys are identified by the whole primary key) 2nd NORMAL Student ID* Course#* 12345 CIS 101 NOT Grade B Student ID* Course#* 12345 CIS 101 Determines Determines *Primary key Name Higgins Third Normal Form • Table is in Second Normal Form. • No transitive dependencies (no nonkey fields are determined by other nonkey fields, i.e., nonkeys are identified by only the primary key). 3rd NORMAL Course# * CIS 101 NOT Textbook Credits Intro to CIS 3 Course# * CIS 101 Textbook Book Price Intro to CIS $45.99 Determines Determines *Primary key Fourth and Fifth Normal Form • Fourth Normal Form – Table is 3NF and has at most one multivalued dependency. Can produce records with many blank values. • Fifth Normal Form – the table cannot be split into further tables. Advanced Field Properties • Lookup fields • Multiple-field primary keys • Indexes Lookup Field • Looks up a value in a joined table. • Specify “Lookup wizard” in Data Type list. • Creates an editable query. Using a Lookup Field Looks up data values from another table (or you can create your own list). Order Customer ID 1008 Smith,Ben S349 S349 Orders Customer ID Name S349 Smith,Ben Looks up Name in Customer table Customers Multiple-field Primary Keys • Also called compound keys or composite keys • A value in one field in the key can be repeated in multiple records, but not in all fields of the primary key. Student ID* Course# * 12345 CIS 101 Grade A 12345 B CIS 200 *Primary key Index • Field property that increases search speed. • Speeds up sorting and searching in Datasheet view and all database objects. Joining Tables • One-to-Many join is the most common. • Other join types: – One-to-One – Many-to-Many One to Many Table 1 Table 2 Rec. 1 Rec. 1 Rec. 2 One to One Many to Many Rec. 1 Rec. 1 Rec. 2 Rec. 1 Rec. 1 Rec. 2 Join Types • Inner Join • Left Outer Join • Right Outer Join Join types are discussed in the slides to follow, but also see the discussion at Join-queries.htm. Inner Join The default type - includes records with corresponding values in both tables. LASTNAME Gray Gray Cedarman Malderer Nale FIRSTNAME Eric Nadine YvonneHRS Kevin Rusty DEPARTMENT CIS CIS HRS HRS HRS HRS HRS Only red records are included in join. No department for Nale DEPARTMENT CODE 1 CIS HRS WHS NAME Computer Information Systems Human Resources Warehouse No Warehouse employees Left Outer Join Includes all records from One table and corresponding records from Many table LASTNAME Gray Gray Cedarman Malderer Nale FIRSTNAME Eric Nadine YvonneHRS Kevin Rusty DEPARTMENT CODE 1 CIS HRS WHS DEPARTMENT CIS HRS HRS Rusty Nale not included – no department assigned. NAME Computer Information Systems Human Resources Warehouse How to Change the Join Properties in Query Design View Double-click on the join line that connects the tables. Click the bullet next to the join property desired. For example, in the 240students.mdb database, suppose we want all students and the pets they own. All students should be listed regardless of whether they own a pet or not. This is perfect for a Left Outer Join, because it will select a student from tblStudents (the “One” table) even if there isn’t a match (between ID and OwnerID) in tblPets (the “Many” table). Here’s the SQL and the result of running the query SELECT ID, FirstName, Name, Breed FROM tblStudents LEFT JOIN tblPets ON tblStudents.ID = tblPets.OwnerID • The result of running this query (in the 240students.mdb database) Left-Join.htm. Right Outer Join Includes all records from Many table and corresponding records from One table. LASTNAME Gray Gray Cedarman Malderer Nale FIRSTNAME Eric Nadine YvonneHRS Kevin Rusty DEPARTMENT CODE 1 CIS HRS WHS DEPARTMENT CIS HRS HRS Warehouse dept. not included – no employees assigned. NAME Computer Information Systems Human Resources Warehouse Join Types Joins displayed as Venn Diagrams A A B Inner Join A B A B Left Outer Join A is “one” table B Right Outer Join B is “many” table U A and B are tables Green striped area is join dynaset. Join Types Selected by double clicking on join line between two tables in Relationship window, and clicking Join Type button. Referential Integrity • Referential integrity keeps the relationships between tables valid. • All foreign keys have values that correspond to records in the referenced table • Maintain referential integrity by: – Updating and deleting records when matching records in a joined table are updated and deleted. – Eliminating unmatched and duplicated records in joined tables. Enforcing Referential Integrity • • • • • Normalize tables Set field properties Use lookup fields Select specific join type settings Create and run Find Duplicate and Find Unmatched queries Find Duplicate Records Query • Locates records with duplicate values in Many table. Duplicates One table Many table Find Unmatched Records Query Locates records in Many table not associated with record in One table No match One table Many table Maintaining Databases • • • • • Older versions of Access can be converted to newer versions, and vice versa. Databases can be compacted and repaired using the Tools, Database Utilities command. Databases can be split into two databases: data and objects (back and front end). Databases can be documented using the Tools, Analyze, Documenter command. Database performance can be analyzed using the Tools, Analyze, Performance command. Object Groups • Named set of shortcuts that point to database objects • Grouped objects are listed together in a single window. Modifying Access Environment • Tools, Options command allows changes to the behavior of Access. • Access standard toolbars can be modified.