Introduction to Databases – Lecture 1

Databases are used to store electronic information. To construct a database, you need to understand how data can and should be organized. To use the data to obtain valuable information, you also have to understand how to find it in the database. Before we begin building and using our database, we will first discuss the benefits of using databases, the components of database management systems, different ways to organize information, and types of databases.

We refer to an electronic database as secondary storage. This means that the information is housed on some sort of media such as magnetic disk (e.g., a hard drive) or tape. Today’s fast-access hard drives with large storage capacities make databases extremely useful for organizations that need to keep track of large amounts of data.

Benefits of Databases

Without a central database, an organization may need multiple files on the same subject or person. Duplicate information can cause a lack of data integrity, where not all files contain the same information; it also wastes a lot of storage space.

Think about a University student records system. When you applied to USA, your information was entered into a central database by the Admissions office. Those same records are accessible by the Registrar, the colleges and departments within the University, as well as other organizations. If, for example, you change your address, that information only needs to be changed in one central location, and all organizations and departments can then access the updated information.

The many advantages of databases include:
1) Data Sharing – information in one department or organization can be readily shared with others.
2) Security – users can be given passwords and access to only the specific information they should have, while all of the information is still stored only once in the database.
3) Reduced Data Redundancy – fewer files are necessary; the data is stored only once and in one location. This reduces the storage space needed.
4) Data Integrity – a change made in one place updates every use of that information.

Database Management Systems

A Database Management System (DBMS) is special software that allows you to create, modify, and gain access to a database. Many are created specifically for microcomputers, and others are created for the minicomputers and mainframes used by larger organizations.

A data dictionary contains a description of the structure of the data used in the database. For a particular record, it defines the field names, what type of data can go into each field, the size of each field, and which field is the key field.

The part of the DBMS that allows access to the information in the database is the query language. The most widely used query language is SQL (Structured Query Language). Its commands allow us to view or extract information based on certain criteria. Later we will see how useful queries can be for extracting meaningful information from the database, which can also be used in reports.

There are many benefits to creating an electronic database to store information. For example, users can access information quickly by finding it on the computer instead of having to locate something in a file cabinet.

A membership list for an organization is a practical use for a database. To establish a record, it must be determined what fields or pieces of information are wanted for each member. Keeping the first name and last name separate, as well as the city, state, and zip code fields, will allow more flexibility in using other features. To create an organization membership directory, the entries would be alphabetized by last name and formatted so that each entry appears separately with the basic information included.
If the organization has a phone committee that calls members, a report can be created with only names and phone numbers included. When letters are mailed, the mail merge feature allows individualized letters to be created, and mailing labels can be produced for the mailings. If only the executive committee or members from a certain town need to be found, a query can be run to show only the selected members. All of these tasks can be completed while the original data only has to be entered once into the application.

Often students ask, why not just store all of the information in a spreadsheet? Consider the temporary hiring agency example below.

Employer ID | Employer Name | Address | Phone | Position ID | Position Title
10122 | BeanTown Tours | 105 State Street, Boston, MA 02109 | 617-4511970 | 2045 | Tour Guide
10125 | Boston Harbor Excursions | 75 Atlantic Avenue, Boston, MA 021110 | 617-2351800 | 2082 | Reservationist
10126 | DaySide Inn & Country Club | 354 Oceanside Drive, Brewster, MA 02631 | 508-2835775 | 2040 | Waiter/Waitress
10190 | The Briar Rose Inn | 105 Queen Street, Charlottetown PE CIA8R4 | (902) 6361595 | 2053 | Host/Hostess
10191 | Windsor Alpine Tours | 14 Longmeadow Road, Laconia, NH 03246 | 603-2669233 | 2078 | Ski Patrol
10198 | Trudel Spa & Resort | 40 Rue Rivard, North Hatley, QC J0B 2CO | 8198427783 | 2066 | Lifeguard
10126 | Baside Inn & Country Club | 354 Oceanside Drive, Brewster, MA 02631 | 508-2835775 | 2073 | Pro Shop Clerk
10191 | Windsor Alpine Tours | 14 Longmeadow Road, Laconia, NH 03246 | 603-266-9233 | 2079 | Day Care Worker
10126 | Bayside Inn & Club | 354 Oceanside Drive, Brewster, MA 02631 | 508-2835775 | 2111 | Kitchen Help

Look closely at the data in this figure. What problems do you see? If you examine it carefully, there are problems with inconsistent data (data integrity issues), redundant data, inconsistent formats, and possibly inaccurate data, as noted in the figure below.
A better solution is to break the data out into separate tables and use the tools of the DBMS to control formatting and minimize redundancy, which will reduce storage requirements and limit integrity problems.

Table 1 – Employer Info

Employer ID | Employer Name | Address | City | State/Province | Zip | Phone
10122 | BeanTown Tours | 105 State Street | Boston | MA | 02109 | 617-4511970
10125 | Boston Harbor Excursions | 75 Atlantic Avenue | Boston | MA | 21110 | 617-2351800
10126 | BaySide Inn & Country Club | 354 Oceanside Drive | Brewster | MA | 02631 | 508-2835775
10190 | The Briar Rose Inn | 105 Queen Street | Charlottetown | PE | CIA8R4 | 902-6361595
10191 | Windsor Alpine Tours | 14 Longmeadow Road | Laconia | NH | 03246 | 603-2669233
10198 | Trudel Spa & Resort | 40 Rue Rivard | North Hatley | QC | J0B 2CO | 819-8427783

Table 2 – Position Info

Employer ID | Position ID | Position Title
10122 | 2045 | Tour Guide
10125 | 2082 | Reservationist
10126 | 2040 | Waiter/Waitress
10190 | 2053 | Host/Hostess
10191 | 2078 | Ski Patrol
10198 | 2066 | Lifeguard
10126 | 2073 | Pro Shop Clerk
10191 | 2079 | Day Care Worker
10126 | 2111 | Kitchen Help

Notice that some redundancy is required. The Employer ID that occurs in both tables is an example of a common data element, as mentioned earlier. It is necessary to establish a link between the records in each table, as we will see later.

We mentioned earlier that the database is used to provide greater data integrity, reduced redundancy, and greater access to information. Be aware that a poorly designed database will neither be efficient nor prevent users from making mistakes; as the author notes, garbage in, garbage out.

A Relational Database

Access is a relational database management system. A relational database is formed when there are related tables within one database. These relationships, or logical associations between records, are what allow us to minimize the amount of redundancy in our database and help ensure greater data integrity.
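
To make the two-table design above concrete, here is a minimal sketch in Python using the built-in sqlite3 module rather than Access. The table and column names (employers, positions, employer_id, and so on) are assumptions chosen for illustration, and only a few rows from Table 1 and Table 2 are loaded. It shows how the shared Employer ID lets a query bring the two tables back together.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employers (
        employer_id   INTEGER PRIMARY KEY,
        employer_name TEXT NOT NULL,
        phone         TEXT
    );
    CREATE TABLE positions (
        position_id    INTEGER PRIMARY KEY,
        employer_id    INTEGER NOT NULL,   -- common field linking to employers
        position_title TEXT NOT NULL
    );
""")

# Each employer is entered exactly once.
conn.executemany(
    "INSERT INTO employers VALUES (?, ?, ?)",
    [
        (10126, "BaySide Inn & Country Club", "508-2835775"),
        (10191, "Windsor Alpine Tours", "603-2669233"),
    ],
)

# Positions repeat only the Employer ID, not the employer's name or phone.
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        (2040, 10126, "Waiter/Waitress"),
        (2073, 10126, "Pro Shop Clerk"),
        (2111, 10126, "Kitchen Help"),
        (2078, 10191, "Ski Patrol"),
        (2079, 10191, "Day Care Worker"),
    ],
)


def positions_with_employer(connection):
    """Return (position_title, employer_name) pairs by joining on employer_id."""
    return connection.execute(
        """
        SELECT p.position_title, e.employer_name
        FROM positions AS p
        JOIN employers AS e ON e.employer_id = p.employer_id
        ORDER BY p.position_id
        """
    ).fetchall()


rows = positions_with_employer(conn)
for title, name in rows:
    print(f"{title}: {name}")
```

Because the employer's name and phone are stored once, correcting them means changing a single row in employers; every position listed for that employer picks up the change the next time the query runs.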

In a relational database, we typically have one-to-many relationships between tables. This is when the primary key of one table is also present in another related table. It supports data integrity by ensuring that there is only one place where changes to each field are made (for instance, a student record that is related to the courses table). When the primary key is duplicated in the related table, it is referred to as a foreign key. Unlike primary keys, foreign keys are not explicit; they are defined through the relationship we create between the fields in the related tables.

When a relationship is established, you are given an option for enforcing referential integrity. Referential integrity is used to ensure that all child records (i.e., records with the foreign key) have a parent record (i.e., a record with the corresponding primary key). Consider our example Student Information System. If the master student record containing your JAG number was deleted, how would we know that JAG number J9999999 belonged to John Smith of Mobile, AL? A child record that does not have a corresponding parent record is referred to as an orphan record.

In addition, when referential integrity is enforced, Access first checks that the foreign key and the primary key are of the same data type and, if there are already records in the tables, that every foreign key in the child table has a corresponding primary key in the parent table. If you are unable to establish a relationship, these are the things to check first.

In the chapter 1 presentation, you will be introduced to the Access application, data manipulation, and relationships. Be sure to read through the notes that accompany the slides and the textbook material carefully. This material is integral to understanding database design and will help you better understand the concepts presented.
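
The referential integrity rule discussed in this lecture can be tried out with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Access, so the mechanics differ (SQLite needs foreign key enforcement switched on explicitly), and the table names, column names, and course labels are hypothetical. With enforcement on, the database refuses both to add an orphan record and to delete a parent record that still has children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE students (                 -- parent table
        jag_number TEXT PRIMARY KEY,
        full_name  TEXT NOT NULL
    );
    CREATE TABLE enrollments (              -- child table
        enrollment_id INTEGER PRIMARY KEY,
        jag_number    TEXT NOT NULL REFERENCES students (jag_number),
        course        TEXT NOT NULL
    );
""")

# A parent record and one child record that points to it.
conn.execute("INSERT INTO students VALUES ('J9999999', 'John Smith')")
conn.execute(
    "INSERT INTO enrollments (jag_number, course) VALUES ('J9999999', 'Course A')"
)


def try_statement(connection, sql):
    """Run one statement; return True if accepted, False if integrity is violated."""
    try:
        connection.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Attempt 1: a child record whose JAG number has no parent (an orphan).
orphan_insert_ok = try_statement(
    conn,
    "INSERT INTO enrollments (jag_number, course) VALUES ('J0000000', 'Course B')",
)

# Attempt 2: deleting the parent while a child record still refers to it.
parent_delete_ok = try_statement(
    conn, "DELETE FROM students WHERE jag_number = 'J9999999'"
)

print("orphan insert accepted:", orphan_insert_ok)
print("parent delete accepted:", parent_delete_ok)
```

Both attempts are rejected, which is the behavior referential integrity is meant to guarantee: every foreign key value in the child table keeps a matching primary key in the parent table.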

Lecture 2

Additional Discussion on Relationships

We've discussed why we break our data into separate tables: to minimize redundancy, reduce storage requirements, and limit data integrity problems. Let's look at a few more examples to illustrate the flexibility and usefulness of a relational database.

Consider an Employee Time Cards database where we wish to generate payroll reports. Listed below, we have three tables as part of our database design. The Employees table stores general information about our employees. It would not be efficient to store a field for every possible occurrence of a time card record, so instead we create a Time Cards table to store the information for each time period. To link this information back to a particular employee, the primary key value (in this case, the SSN) is duplicated in the Time Cards table.

Note that the SSN field in the Employees table (the primary key) is linked to a field called EmployeeID in the Time Cards table (referred to as a foreign key when stored in the Time Cards table). These fields do not need to have the same field name, but they must be the same data type. In addition, each EmployeeID in the Time Cards table must have a corresponding value in the Employees table. (Remember our discussion on orphan records from Chapter 1.)

Those of you who have taken an Accounting course should remember that salaries are usually allocated by department for budgetary and cost accounting reasons. We have therefore also included a DepartmentID field to "charge back" the salary for that time period to one department. (The DepartmentID could have been included in the Employees table instead if a particular employee's salary was always tied to one department.)

Now let's consider one more example for an Employee database. Suppose that we wish to maintain employee work location and health plan information. Listed below, we have three tables as part of our database design.
The Employees table stores general information about our employees. Each employee is assigned to one work location. Instead of maintaining duplicate location information for each employee at the same site, we create a Locations table and then store a location identifier (LocationID) in the Employees table.

Also, in many organizations, employees are allowed to select from one of several available health plans. Again, to minimize redundant information and to avoid having to update multiple occurrences of health plan information such as yearly deductibles or premium changes, we create a Health Plans table and store a reference to it (a foreign key, the PlanID field) in the Employees table.

Also notice on the diagrams presented above that the relationship line shows a "1" on the side of the link where the primary key is located and an infinity symbol on the side where the foreign key is located. In our relationship from Locations to Employees, this represents the one-to-many principle that a location will have multiple employees assigned, but each employee will only be assigned to one location. For our Health Plans to Employees relationship, each employee picks one health plan, but many employees may select the same plan.

As we will see in our discussion in Chapters 3 and 4, we can create queries and reports displaying information from multiple tables. With our relationships established, it will be very easy to generate the information that we need. Now on to our discussion of Forms.

Forms

As discussed in chapter 1, forms are much more user-friendly than tables for entering and displaying data. We can structure a form to resemble a paper form, control its formatting, and restrict access to information that we may not want the user to see. Form view displays a completed form and is used to enter and display data from the underlying table. Design view allows you to create and modify a form.

All objects on a form are one of three types of controls: bound, unbound, or calculated. To display data from the underlying table or to enter data into a table, the control must be a bound control. Unbound controls are often used to provide information such as a heading for a form or a label that identifies a field’s contents, or they may be used for aesthetics, such as lines, graphics, or pictures. Calculated controls have a mathematical expression as their source of data. Examples of calculated controls include showing a current GPA on a student record or calculating the revenue on a purchase order by multiplying an item price by a quantity. All controls can be resized and moved to the desired location. (See the information below regarding bound control placement in the Detail section of the form.)

Properties: Forms and controls also have associated properties, which determine how the object looks and behaves. As noted in the text discussion, format properties may be set through menu commands and toolbar options, or by directly modifying the property sheet.

Creating Forms: There are several methods for creating forms: AutoForm, the Form Wizard, and manually through Design View. The AutoForm tool allows you to create a form quickly based on all fields in the underlying table or query. The Form Wizard allows you to select several or all fields and also gives you the option of selecting a particular layout and design style. Creating a form through Design View requires that you manually specify everything. You will probably find that the Form Wizard offers a good balance: it is flexible, yet still lets you create a form in much less time than working manually in Design View. Regardless of which method you choose, you can customize the form as you wish by adding titles, additional fields, and elements for aesthetics.
The Detail section displays the fields from the underlying table; form titles and logos are usually added in the Header section, and other descriptive information may be placed in the Footer section.

In the lab for this week, you will create a new form using the Form Wizard, add controls and set control properties, and add additional elements to create a finished form.

Additional Controls

Additional controls that improve the usability of a form are drop-down list boxes, check boxes, option groups, and command buttons.

Drop-down list boxes display a list of possible selections to the user. A drop-down list box is created automatically if the Lookup Wizard is chosen as the field's data type in the table. This is useful for displaying a list of values that exist in another table, or a predetermined list, that you want to associate with the current record. For example, suppose that in our student information system we need to allow our user to select from a list of majors. Allowing the user to select from the list of valid options makes data entry easier and also protects the integrity of the database.

Check boxes are used to display fields with Yes/No data types.

Option groups are used to allow the user to select from a list of possible options. The difference between the drop-down list box and the option group is that the drop-down list is only displayed when the control is selected by the user, whereas option groups are always displayed on the screen.

Command buttons enable the user to perform procedures without having to know menu commands, which makes the form more user-friendly. For example, a novice user can work within Access without needing to know which menu option saves a record.

In this lab, you will also use the Lookup Wizard to modify a table, and then you will modify the form to add the controls that we have just discussed.