Introduction to Databases – Lecture 1

Databases are used to store electronic information. In order to construct a database, you
need to understand how data can and should be organized. In order to be able to use the
data to obtain valuable information, you also have to understand how to find it in the
database. Before we begin building and using our database, we will first discuss the
benefits of using databases, components of database management systems, different ways
to organize information, and types of databases.
We refer to an electronic database as secondary storage. This means that the information
is contained or housed in some sort of media such as magnetic disk (i.e. hard drive) or
tape. Today’s fast access hard drives with large storage capacities make databases
extremely useful for organizations to keep track of large amounts of data.
Benefits of Databases
Without a central database, an organization may need multiple files on the same subject or
person. Duplicate information can cause a lack of data integrity where not all files contain
the same information; in addition there would be a lot of wasted storage space. Think
about a University student records system. When you applied to USA, your information was
entered into a central database by the Admissions office. Those same records are
accessible by the Registrar, the colleges and departments within the University, as well as
other organizations. If, for example, you change your address, that information only needs
to be changed in one central location and all organizations and departments can get access
to the updated information. The many advantages of databases include:
1) Data Sharing – information in one department or organization can be readily shared
with others.
2) Security – users can be given passwords and/or access only to the specific information
they should have, while all of the information is stored only once in the database.
3) Reduced Data Redundancy – fewer files are necessary; the data is stored only once and
in one location. This reduces the storage space needed.
4) Data Integrity – changes made in the file update all occurrences of the
information.
Database Management Systems
A Database Management System (DBMS) is special software that allows you to create,
modify, and gain access to a database. Many are created specifically for microcomputers
and others are created for minicomputers and mainframes, used by larger organizations.
A data dictionary contains a description of the structure of the data used in the database.
For a particular record, it defines the field names, what type of data can go into the field,
the size of the field, and also defines which field is the key field.
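As a rough illustration of what a data dictionary records, the sketch below uses Python's built-in sqlite3 module rather than Access. The members table and its field names are hypothetical. SQLite's PRAGMA table_info lists each field's name, its declared type and size, and whether it is the key field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: member_id is the key field, the names are text fields.
conn.execute(
    "CREATE TABLE members ("
    " member_id INTEGER PRIMARY KEY,"
    " last_name VARCHAR(30),"
    " first_name VARCHAR(30))"
)

# PRAGMA table_info returns one row per field:
# (position, name, declared type, not-null flag, default value, key flag)
dictionary = [
    {"field": name, "type": col_type, "is_key": bool(pk)}
    for _pos, name, col_type, _notnull, _default, pk
    in conn.execute("PRAGMA table_info(members)")
]

for entry in dictionary:
    print(entry)
```

Note that SQLite records the declared size (30) but does not enforce it, so this only illustrates the kind of description a data dictionary holds.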
The part of the DBMS that allows access to the information in the database is the query
language. The most widely used query language is SQL (Structured Query Language).
SQL commands allow us to view or extract information based on certain criteria. Later we
will see how useful queries can be for extracting meaningful information from the database,
which can also be used in reports.
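To make the idea of a query concrete, here is a minimal sketch that runs one SQL statement through Python's built-in sqlite3 module. The students table, its fields, and the sample rows are made up for illustration; Access has its own query tools, so treat this only as a picture of what "extracting information based on criteria" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id TEXT PRIMARY KEY, last_name TEXT, major TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("J0000001", "Smith", "Accounting"),
        ("J0000002", "Nguyen", "Biology"),
        ("J0000003", "Jones", "Accounting"),
    ],
)

# The WHERE clause states the criterion; ORDER BY sorts the result.
accounting_majors = conn.execute(
    "SELECT last_name FROM students WHERE major = ? ORDER BY last_name",
    ("Accounting",),
).fetchall()

print(accounting_majors)  # [('Jones',), ('Smith',)]
```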
There are many benefits to creating an electronic database to store information. For example, users can access
information quickly by finding it on the computer instead of having to locate it in a file cabinet.
For example, a membership list for an organization is a practical use for a database. To
establish a record, you must first decide which fields, or pieces of information, are wanted for
each member. Keeping first name and last name separate, as well as the city, state, and zip
code fields, allows more flexibility when using other features. To create a membership
directory, the entries would be sorted alphabetically by last name and formatted so that each
entry appears separately with the basic information included. If the organization has a
phone committee that calls members, a report can be created with only names and phone
numbers included. When letters are mailed, the mail merge feature allows individualized
letters to be created, and mailing labels can be printed for the mailings. If only the executive
committee or members from a certain town need to be found, a query can be run to show
only the selected members. All of these tasks can be completed even though the original data
only has to be entered once into the application.
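The membership example can be sketched in a few lines. This uses Python's built-in sqlite3 module instead of Access, and the table layout, field names, and sample members are all invented for illustration. The point is that the data is entered once and then reused for the directory, the phone list, and the town query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members ("
    " member_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,"
    " city TEXT, state TEXT, zip TEXT, phone TEXT, on_exec_committee INTEGER)"
)
# Made-up sample rows, entered only once.
conn.executemany(
    "INSERT INTO members (first_name, last_name, city, state, zip, phone,"
    " on_exec_committee) VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Maria", "Lopez", "Mobile", "AL", "36608", "555-0101", 1),
        ("Sam", "Carter", "Daphne", "AL", "36526", "555-0102", 0),
        ("Lee", "Adams", "Mobile", "AL", "36608", "555-0103", 0),
    ],
)

# Directory: every member, alphabetized by last name.
directory = conn.execute(
    "SELECT last_name, first_name, city FROM members ORDER BY last_name"
).fetchall()

# Phone committee report: names and phone numbers only.
phone_list = conn.execute(
    "SELECT first_name, last_name, phone FROM members ORDER BY last_name"
).fetchall()

# Query: only the members from one town.
mobile_members = conn.execute(
    "SELECT first_name, last_name FROM members WHERE city = ? ORDER BY last_name",
    ("Mobile",),
).fetchall()

print(directory)
print(phone_list)
print(mobile_members)
```

Because first name, last name, and city are separate fields, each output can sort or filter on exactly the field it needs.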
Often students ask, why not just store all of the information in a spreadsheet? Consider the
temporary hiring agency example below.
Employer ID | Employer Name              | Address                                   | Phone         | Position ID | Position Title
------------+----------------------------+-------------------------------------------+---------------+-------------+----------------
10122       | BeanTown Tours             | 105 State Street, Boston, MA 02109        | 617-4511970   | 2045        | Tour Guide
10125       | Boston Harbor Excursions   | 75 Atlantic Avenue, Boston, MA 021110     | 617-2351800   | 2082        | Reservationist
10126       | DaySide Inn & Country Club | 354 Oceanside Drive, Brewster, MA 02631   | 508-2835775   | 2040        | Waiter/Waitress
10190       | The Briar Rose Inn         | 105 Queen Street, Charlottetown PE CIA8R4 | (902) 6361595 | 2053        | Host/Hostess
10191       | Windsor Alpine Tours       | 14 Longmeadow Road, Laconia, NH 03246     | 603-2669233   | 2078        | Ski Patrol
10198       | Trudel Spa & Resort        | 40 Rue Rivard, North Hatley, QC J0B 2CO   | 8198427783    | 2066        | Lifeguard
10126       | Baside Inn & Country Club  | 354 Oceanside Drive, Brewster, MA 02631   | 508-2835775   | 2073        | Pro Shop Clerk
10191       | Windsor Alpine Tours       | 14 Longmeadow Road, Laconia, NH 03246     | 603-266-9233  | 2079        | Day Care Worker
10126       | Bayside Inn & Club         | 354 Oceanside Drive, Brewster, MA 02631   | 508-2835775   | 2111        | Kitchen Help
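Look closely at the data in this figure. What problems do you see? If you examine it carefully,
there are problems with inconsistent data (data integrity issues), redundant data,
inconsistent formats, and possibly inaccurate data, as noted in the figure below. For instance,
the name of employer 10126 is spelled three different ways, and the phone numbers appear in
several different formats.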
A better solution is to break out the data into separate tables and use the tools of the DBMS
to control formatting and minimize redundancy which will reduce storage requirements and
limit integrity problems.
Table 1 – Employer Info
Employer ID | Employer Name              | Address             | City          | State/Province | Zip     | Phone
------------+----------------------------+---------------------+---------------+----------------+---------+------------
10122       | BeanTown Tours             | 105 State Street    | Boston        | MA             | 02109   | 617-4511970
10125       | Boston Harbor Excursions   | 75 Atlantic Avenue  | Boston        | MA             | 21110   | 617-2351800
10126       | BaySide Inn & Country Club | 354 Oceanside Drive | Brewster      | MA             | 02631   | 508-2835775
10190       | The Briar Rose Inn         | 105 Queen Street    | Charlottetown | PE             | CIA8R4  | 902-6361595
10191       | Windsor Alpine Tours       | 14 Longmeadow Road  | Laconia       | NH             | 03246   | 603-2669233
10198       | Trudel Spa & Resort        | 40 Rue Rivard       | North Hatley  | QC             | J0B 2CO | 819-8427783
Table 2 – Position Info
Employer ID | Position ID | Position Title
------------+-------------+----------------
10122       | 2045        | Tour Guide
10125       | 2082        | Reservationist
10126       | 2040        | Waiter/Waitress
10190       | 2053        | Host/Hostess
10191       | 2078        | Ski Patrol
10198       | 2066        | Lifeguard
10126       | 2073        | Pro Shop Clerk
10191       | 2079        | Day Care Worker
10126       | 2111        | Kitchen Help
Notice that some redundancy is required. The Employer ID that occurs in both tables is an
example of a common data element as mentioned earlier. This is necessary to establish a
link between the records in each table, as we will see later.
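The link through Employer ID can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for Access. The rows are a subset of the two tables above; the lowercase table and field names are assumptions chosen for this sketch. The employer name is stored once, and a join on Employer ID brings it back alongside each position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employers (employer_id INTEGER PRIMARY KEY, employer_name TEXT)"
)
conn.execute(
    "CREATE TABLE positions ("
    " position_id INTEGER PRIMARY KEY, employer_id INTEGER, position_title TEXT)"
)
# Each employer name is stored exactly once.
conn.executemany(
    "INSERT INTO employers VALUES (?, ?)",
    [(10126, "BaySide Inn & Country Club"), (10191, "Windsor Alpine Tours")],
)
# Positions repeat only the Employer ID, the common data element.
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        (2040, 10126, "Waiter/Waitress"),
        (2078, 10191, "Ski Patrol"),
        (2073, 10126, "Pro Shop Clerk"),
        (2079, 10191, "Day Care Worker"),
        (2111, 10126, "Kitchen Help"),
    ],
)

# Join the two tables on the shared Employer ID.
bayside_positions = conn.execute(
    "SELECT e.employer_name, p.position_title"
    " FROM positions AS p"
    " JOIN employers AS e ON e.employer_id = p.employer_id"
    " WHERE e.employer_id = ?"
    " ORDER BY p.position_id",
    (10126,),
).fetchall()

for name, title in bayside_positions:
    print(name, "-", title)
```

Since the name lives in one row of the employers table, it cannot drift into three different spellings the way it did in the spreadsheet.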
We mentioned earlier that the database is used to provide greater data integrity, reduced
redundancy and greater access to information. Be aware that a poorly designed database
will not be efficient or prevent users from making mistakes – as the author notes, garbage
in, garbage out.
A Relational Database
Access is a relational database. A relational database is formed when there are related
tables within one database. These relationships, or logical associations between records,
are what allow us to minimize the amount of redundancy in our database and help ensure
greater data integrity. In a relational database, we typically have One-to-Many relationships
between tables – this is when the primary key of one table is also present
in another related table. This allows for data integrity by ensuring that there is only one
place for changes to be made to each field (a student record, for instance, that is related to
the courses table). When the primary key is duplicated in the related table, it is referred to
as a foreign key. Unlike primary keys, foreign keys are not explicit; they are defined
through the relationship we create between the fields in the related tables.
When a relationship is established, you are given an option for enforcing referential
integrity. Referential integrity is used to ensure that all child records (i.e. records with the
foreign key) will have a parent record (i.e. a record with the same corresponding primary
key). Consider our example Student Information System. If the master student record
containing your JAG number were deleted, how would we know that JAG number
J9999999 belonged to John Smith of Mobile, AL? If a child record does not have a
corresponding parent record, this is referred to as an orphan record. In addition, when
referential integrity is enforced, Access checks first to make sure that both the foreign key
and primary key are of the same data type and that if there are records in the tables
already, that every foreign key in the child table has a corresponding primary key in the
parent table. If you are unable to establish a relationship, these are the things to check
first.
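The effect of enforcing referential integrity can be sketched outside Access with Python's built-in sqlite3 module. The students and enrollments tables, their field names, and the course title are hypothetical. Unlike Access, SQLite declares the foreign key in the table definition and only enforces it once PRAGMA foreign_keys is switched on, so this shows the idea rather than Access's exact behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Parent table: one row per student, keyed by JAG number.
conn.execute("CREATE TABLE students (jag_number TEXT PRIMARY KEY, name TEXT)")
# Child table: jag_number is the foreign key back to students.
conn.execute(
    "CREATE TABLE enrollments ("
    " enrollment_id INTEGER PRIMARY KEY,"
    " jag_number TEXT NOT NULL REFERENCES students(jag_number),"
    " course TEXT)"
)
conn.execute("INSERT INTO students VALUES ('J9999999', 'John Smith')")
conn.execute(
    "INSERT INTO enrollments (jag_number, course)"
    " VALUES ('J9999999', 'Intro to Databases')"
)


def try_sql(sql):
    """Run one statement and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# A child record with no matching parent would be an orphan.
orphan_insert = try_sql(
    "INSERT INTO enrollments (jag_number, course)"
    " VALUES ('J0000000', 'Intro to Databases')"
)
# Deleting a parent that still has child records would also create orphans.
parent_delete = try_sql("DELETE FROM students WHERE jag_number = 'J9999999'")

print(orphan_insert)
print(parent_delete)
```

Both statements are refused, so every enrollment record continues to point at an existing student record.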
In the chapter 1 presentation, you will be introduced to the Access application, data
manipulation, and relationships. Be sure that you read through the notes to accompany the
slides and the textbook material carefully. This material is integral to understanding
database design and will help you better understand the concepts presented.
Lecture 2
Additional Discussion on Relationships
We've discussed why we break our data into separate tables: to minimize
redundancy, reduce storage requirements, and limit data integrity problems. Let's
look at a few more examples to again illustrate the flexibility and usefulness of a relational
database.
Consider an Employee Time Cards database where we wish to generate payroll reports.
Listed below, we have three tables as part of our database design. The Employees table
stores general information about our employees. It would not be efficient to store a field for
every possible occurrence of a time card record, so instead we create a Time Cards table to
store information pertaining to a time period instance. To link this information back to a
particular employee, the primary key value (in this case, the SSN) is duplicated in the Time
Cards table. Note that the SSN field in the Employee table (the primary key) is linked to a
field called EmployeeID in the Time Cards table (referred to as a foreign key when stored in
the Time Cards table). These fields do not need to be called by the same field name, but
they must be the same data type. In addition, the EmployeeID in the Time Cards table
must have a corresponding value in the Employees table. (Remember our discussion on
orphan records from Chapter 1.) Those of you who have taken an Accounting course
should remember that salaries are usually allocated by departments for budgetary and cost
accounting reasons. We have also included a DepartmentID field to "charge back" that
particular salary for that time period to one department. (The DepartmentID could have
been included in the Employees table if that particular employee's salary was always tied to
one department.)
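A minimal sketch of the Time Cards design follows, written with Python's built-in sqlite3 module rather than Access. Only SSN, EmployeeID, and DepartmentID come from the discussion above; the departments table, the remaining fields, the placeholder SSNs, and the pay figures are assumptions added to make a payroll total possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employees (
        ssn TEXT PRIMARY KEY,
        name TEXT,
        hourly_rate REAL
    );
    CREATE TABLE departments (
        department_id INTEGER PRIMARY KEY,
        department_name TEXT
    );
    -- employee_id has a different name than ssn but the same data type.
    CREATE TABLE time_cards (
        time_card_id INTEGER PRIMARY KEY,
        employee_id TEXT NOT NULL REFERENCES employees(ssn),
        department_id INTEGER NOT NULL REFERENCES departments(department_id),
        hours REAL
    );
    INSERT INTO employees VALUES
        ('000-00-0001', 'Ann', 20.0), ('000-00-0002', 'Bob', 15.0);
    INSERT INTO departments VALUES (10, 'Front Desk'), (20, 'Kitchen');
    INSERT INTO time_cards (employee_id, department_id, hours) VALUES
        ('000-00-0001', 10, 40), ('000-00-0001', 20, 10), ('000-00-0002', 10, 20);
    """
)

# Payroll charged back to each department: hours worked times hourly rate.
payroll = conn.execute(
    "SELECT d.department_name, SUM(t.hours * e.hourly_rate)"
    " FROM time_cards AS t"
    " JOIN employees AS e ON e.ssn = t.employee_id"
    " JOIN departments AS d ON d.department_id = t.department_id"
    " GROUP BY d.department_name"
    " ORDER BY d.department_name"
).fetchall()

print(payroll)  # [('Front Desk', 1100.0), ('Kitchen', 200.0)]
```

Each time card is one row, so an employee can have any number of them without adding fields to the Employees table.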
Now let's consider one more example for an Employee database. Suppose that we wish to
maintain employee work location and health plan information. Listed below, we have three
tables as part of our database design. The Employees table stores general information
about our employees. Each Employee is assigned to one work location. Instead of
maintaining duplicate location information for each employee at the same site, we instead
create a Locations table and then store a Location identifier (LocationID) in the Employees
table. Also in many organizations, employees are allowed to select from one of several
available health plans. Again, to minimize redundant information and to keep from having
to update multiple occurrences of health plan information such as yearly deductibles or
premium changes, we create a Health Plans table and store a reference (a foreign key, the
PlanID field) in the Employees table.
Also notice on the diagrams presented above that the relationship line shows a "1" on the
side of the table link where the primary key is located and an infinity symbol on the side
where the foreign key is located. In our relationship from Locations to Employees, this
represents the one-to-many relationship principle that a location will have multiple
employees assigned but that the employee will only be assigned to one location. For our
Health Plans to Employees relationship, each employee picks one health plan, but many
employees may select the same plan.
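The one-to-many idea can be sketched with Python's built-in sqlite3 module. LocationID and PlanID come from the discussion above; the other field names, site names, plan names, and deductible amounts are invented for illustration. A plan's deductible is stored once, so a single update is seen by every employee who selected that plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, site_name TEXT);
    CREATE TABLE health_plans (
        plan_id INTEGER PRIMARY KEY, plan_name TEXT, deductible INTEGER
    );
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name TEXT,
        location_id INTEGER REFERENCES locations(location_id),
        plan_id INTEGER REFERENCES health_plans(plan_id)
    );
    INSERT INTO locations VALUES (1, 'Main Office'), (2, 'Warehouse');
    INSERT INTO health_plans VALUES (1, 'Basic', 1000), (2, 'Premium', 250);
    INSERT INTO employees VALUES
        (1, 'Ann', 1, 1), (2, 'Bob', 1, 2), (3, 'Cal', 2, 1);
    """
)

# One change on the "one" side of the relationship...
conn.execute("UPDATE health_plans SET deductible = 1200 WHERE plan_name = 'Basic'")

# ...is seen by every employee on the "many" side.
deductibles = conn.execute(
    "SELECT e.name, p.deductible FROM employees AS e"
    " JOIN health_plans AS p ON p.plan_id = e.plan_id"
    " ORDER BY e.employee_id"
).fetchall()

# One location, many employees.
headcount = conn.execute(
    "SELECT l.site_name, COUNT(*) FROM employees AS e"
    " JOIN locations AS l ON l.location_id = e.location_id"
    " GROUP BY l.site_name ORDER BY l.site_name"
).fetchall()

print(deductibles)  # [('Ann', 1200), ('Bob', 250), ('Cal', 1200)]
print(headcount)    # [('Main Office', 2), ('Warehouse', 1)]
```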
As we will see in our discussion in Chapters 3 and 4, we can create queries and reports
displaying information from multiple tables. With our relationships established, it will be
very easy to generate the information that we need.
Now on to our discussion for Forms.
Forms
As discussed in chapter 1, forms are much more user-friendly for entering and displaying
data. We can also structure the form to resemble a paper form and can also control the
formatting and control access to information that we may not want the user to see. Form
view displays a completed form and is used to enter and display data from the underlying
table. Design view allows you to create and modify a form.
All objects on a form are one of three types of controls: bound, unbound, or calculated. In
order to display data from the underlying table or to enter data into a table, the control
must be a bound control. Unbound controls are often used to provide information such
as a heading for a form, to identify a field’s contents (i.e. as a label identifier), or may be
used for aesthetics such as lines, graphics, or pictures. Calculated controls have a
mathematical expression as their source of data. An example of a calculated control would be
showing a current GPA on a student record or when you calculate revenue from a purchase
order by multiplying an item price by quantity. All controls can be resized and moved to the
desired location. (See information below regarding bound control placement in the detail
section of the form).
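As a loose analogy for the difference between bound and calculated controls, consider the Python sketch below. It is not Access code; the OrderLine class and its fields are hypothetical. The stored fields play the role of bound controls, while revenue is never stored and is recomputed from an expression each time it is displayed.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class OrderLine:
    # Stored values, comparable to bound controls tied to table fields.
    item: str
    price: Decimal
    quantity: int

    @property
    def revenue(self) -> Decimal:
        # Comparable to a calculated control: price multiplied by quantity.
        return self.price * self.quantity


line = OrderLine("Notebook", Decimal("19.99"), 3)
print(line.revenue)  # 59.97

line.quantity = 4    # change a stored value...
print(line.revenue)  # 79.96, the calculated value follows automatically
```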
Properties: Forms and controls also have associated properties that determine how the
object looks and behaves. As noted in the text discussion, format properties may be set
through menu commands and toolbar options, or by directly modifying the property sheet.
Creating Forms: There are several methods for creating forms: AutoForm, Form Wizard,
and manually through Design View. The AutoForm tool allows you to create a form quickly
based on all fields in the underlying table or query. The Form Wizard will allow you to
select from several or all fields and also give you the option for selecting a particular layout
and design style. Creating a form through Design View requires that you manually specify
everything. You will probably feel that the Form Wizard provides the most flexibility while
still allowing you to create a form in much less time.
Regardless of which method you choose, you can customize the form as you wish adding
titles, additional fields, and elements for aesthetics. The Detail section displays the fields
from the underlying table; usually form titles and logos are added in the Header section;
other descriptive information may also be placed in the Footer section.
In the lab for this week, you will create a new form using the Form Wizard, add controls and
set control properties, and add additional elements to create a finished form.
Additional Controls
Additional controls to improve the usability of a form are drop-down list boxes, check boxes,
option groups, and command buttons. Drop-down list boxes display a list of possible
selections to the user. The drop-down list box will automatically be created if the Lookup
Wizard is used in the table field data type. This is useful when displaying a list of values
that exist in another table or in a predetermined list that you want to associate with the
current record. For example, suppose that in our student information system we need to
allow our user to select from a list of majors. Allowing the user to select from the list of
valid options makes data entry easier and will also protect the integrity of the database.
Check boxes are used to display fields with Yes/No data types. Option groups are used
to allow the user to select from a list of possible options. The difference between the
drop-down list box and the option group is that the drop-down list is only displayed when the
control is selected by the user. Option groups are always displayed on the screen.
Command buttons are used to enable the user to perform procedures without having to
know menu commands and to make the form more user-friendly. This makes it easier for a
novice user to work within Access without having to understand details such as which menu
bar options save a record, for example. In this lab, you will also use the Lookup Wizard
to modify a table and then you will modify the form to add the controls that we have just
discussed.