00-DBS201 Definitions Part 1
DBMS
A DBMS (Database Management System) is a program, or collection of programs, through which users interact with a database. It
both stores and manipulates data, which is held in structures called "tables".
Table
A table is a two-dimensional representation of data in rows and columns.
Database
A structure that can store information about multiple entities, the attributes of those entities, and the relationships between
those entities is called a database. Each entity is stored in a table and the attributes are columns of the table. Relating
many separate tables makes up a database.
Database Design
Database design is the process of defining entities, their attributes, and the relationships between the entities.
Database Application
User-oriented programs to enter, update, and delete stored data.
Form, Report, or View
Screen or print objects used to view, print, and maintain data from a database.
Database Models
There are 4 types of database models for organizing the database structure:
1. Network
2. Hierarchical
3. Relational
4. Object-oriented
This DBS201 course focuses on the relational database model.
The Relational Database Model
A relational database is a collection of relations (see "relation" definition below).
The relational database model consists of four components:
1. Entity
(Chapter 1, page 6)
An entity represents a real-world object: a person, place, thing, or event for which we intend to store and process data.
Entities are represented in the relational database model as tables (or relations).
Table (Relation)
(Chapter 2, page 30)
A relation is a table (a two-dimensional representation of data in rows and columns), which also has the following
properties:
a) All data entries in the table are single-valued,
i.e. each "cell" (the intersection of a row and a column) holds one, and only one, value.
This one value must be broken down into its smallest component.
b) Each column (attribute) must have a distinct name.
c) All data values in a column must be of the same data type (for example text, number, or date).
d) Each row must be distinct (unique), and is identified by a single column (attribute) value or a combination of
columns (called the Primary Key). There is only 1 PK even if it is made up of several attributes.
e) The row order is not important.
f) The column order is not important.
The Primary Key column (attribute) uniquely identifies each row of a relation (table).
We must make sure that we choose appropriate columns as the PK.
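As a minimal sketch of property (d), the following uses Python's built-in sqlite3 module with a hypothetical employee table (the table and column names are assumptions made for illustration, not part of the course material). It shows the Primary Key keeping rows distinct even when every other column value repeats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,  -- single-column Primary Key
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    )
    """
)

# Two rows with the same name are still distinct because their PK values differ.
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 'Lee')")
conn.execute("INSERT INTO employee VALUES (2, 'Ann', 'Lee')")

# Reusing an existing PK value is rejected by the database.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Bob', 'Ray')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

In this sketch the third insert fails, so the table keeps only the two rows whose emp_id values are unique.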
Candidate Key
An attribute, or group of attributes, that uniquely identifies each row and could therefore be chosen as the Primary Key. A
candidate key may be passed over as the Primary Key for design reasons such as data security or table efficiency.
Example:
Every working person in Canada has a Social Insurance Number (SIN). We could use the SIN to identify our
employees and this would work great. However, a SIN is confidential information, just as a STUDENT NUMBER
is confidential. Therefore, we don't use SIN or STUNUM as PK. Instead, we create another identifier column such
as EMP_ID or STU_ID and give each person a unique number.
Example:
FirstName + LastName might be considered as a candidate key for an EMPLOYEE table, but because two employees can share the
same name the combination is an inappropriate choice. Therefore, we decide to choose an EmpID column as the Primary Key.
EmpID becomes the Primary Key; FirstName + LastName remains a candidate key only if the combination is guaranteed to be unique.
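One way to sanity-check a proposed identifier against sample data is sketched below. The helper name is_unique and the sample rows are hypothetical. Note that uniqueness in a sample is only a necessary condition: a column combination that happens to be unique today may still admit duplicates later.

```python
def is_unique(rows, columns):
    """Return True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical sample data: two different employees with the same name.
employees = [
    {"emp_id": 1, "first_name": "Ann", "last_name": "Lee"},
    {"emp_id": 2, "first_name": "Ann", "last_name": "Lee"},
]

emp_id_is_unique = is_unique(employees, ["emp_id"])
name_is_unique = is_unique(employees, ["first_name", "last_name"])
```

Here emp_id passes the check while the name combination fails, which matches the reasoning in the example above.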
2. Attribute
Attributes are columns of a table. Each column, or attribute, represents a characteristic of an entity (or, a piece of
information about an entity). See definition of "relation" above for attribute rules.
A Derived Attribute is an attribute whose values can be calculated (or generated) from other attributes. In general, we do
NOT store derived attributes in our relations (entities); instead, we store the attributes that are used to calculate the
derived value.
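A small sketch of this idea, assuming a hypothetical order_line table: only quantity and unit_price are stored, and the derived line total is calculated when the data is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_line ("
    "line_id INTEGER PRIMARY KEY, quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(1, 3, 2.50), (2, 10, 1.25)],
)

# line_total is derived at query time instead of being stored as a column.
totals = conn.execute(
    "SELECT line_id, quantity * unit_price AS line_total "
    "FROM order_line ORDER BY line_id"
).fetchall()
```

Because the total is never stored, it cannot fall out of step with the quantity or price it is calculated from.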
3. Record
A record is a row of a table, also called a tuple.
4. Relationship
(Chapter 1, page 6)
A relationship is an association between entities.
Associations (relationships) between entities are formed when an entity's Primary Key attribute is copied as an attribute of
a second entity. This second entity's attribute is called the Foreign Key attribute. The Foreign Key attribute is used to
establish a relationship between two tables.
The Foreign Key column is an attribute whose value must be in the range of primary key values of another entity. The
value entered into a foreign key “cell” must already exist as one of the primary key values in the other entity. It is through
this common value that tables can be joined together.
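The following sketch illustrates a Foreign Key with sqlite3, using hypothetical department and employee tables. One assumption specific to SQLite: foreign key enforcement is off by default and has to be switched on with a PRAGMA for each connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES department (dept_id)  -- Foreign Key
    )
    """
)

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 10)")  # 10 exists as a PK value

# A Foreign Key value with no matching Primary Key value is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The common dept_id value is what lets the two tables be joined.
joined = conn.execute(
    "SELECT e.emp_name, d.dept_name "
    "FROM employee e JOIN department d ON d.dept_id = e.dept_id"
).fetchall()
```

Each department can appear many times in employee.dept_id, so this sketch also happens to be a 1:M relationship.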
Associations can be of three types:
(a) 1:1 - one to one
One instance of the first entity can be related to only one instance of the second entity.
Or, put another way, the first entity's Primary Key value can be found only once as a Foreign Key value in the second
entity. This does not happen very often.
(b) 1:M - one to many
One instance of the first entity can be related to many instances of the second entity.
Or, put another way, the first entity's Primary Key value can be found many times as a Foreign Key value in the second
entity. This is NOT true in the other direction.
(c) M:N - many to many
One instance of the first entity can be related to many instances of the second entity, and one instance of the second
entity can be related to many instances of the first entity.
Or, put another way, the first entity's Primary Key value can be found many times as a Foreign Key value in the second
entity. AND, the second entity's Primary Key value can be found many times as a Foreign Key value in the first entity.
Many-to-many relationships are hard to implement and are actually implemented through the use of another table. See
next paragraph.
A Bridge entity/table, or Composite entity/table, is an entity/table in the relational database model that is required to
implement many to many relationships between entities.
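A bridge table can be sketched as follows, assuming hypothetical student, course, and enrollment tables with made-up sample rows. The bridge table holds one Foreign Key to each side, and the two together form its Primary Key, so the M:N relationship becomes two 1:M relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
    CREATE TABLE course  (course_code TEXT PRIMARY KEY);

    -- Bridge (composite) table: one row per student/course pairing.
    CREATE TABLE enrollment (
        stu_id      INTEGER REFERENCES student (stu_id),
        course_code TEXT    REFERENCES course (course_code),
        PRIMARY KEY (stu_id, course_code)
    );

    INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO course  VALUES ('DBS201'), ('XYZ100');
    INSERT INTO enrollment VALUES (1, 'DBS201'), (1, 'XYZ100'), (2, 'DBS201');
    """
)

# One student can be related to many courses ...
courses_for_ann = [
    row[0]
    for row in conn.execute(
        "SELECT course_code FROM enrollment WHERE stu_id = 1 ORDER BY course_code"
    )
]

# ... and one course can be related to many students.
students_in_dbs201 = [
    row[0]
    for row in conn.execute(
        "SELECT stu_id FROM enrollment WHERE course_code = 'DBS201' ORDER BY stu_id"
    )
]
```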
The Normalization Steps
Normalization is a process that tries to minimize problems that occur when we store or manipulate data (add, change and
delete). Problems occur most often when data is stored in more than one location. This is called redundancy.
(Book Chapter 1 page 3)
Redundancy occurs when data has been duplicated within a single table or between 2 or more tables. Some of the
problems it causes are: (1) wasted storage space, (2) cumbersome and time-consuming data changes, and (3)
inconsistencies.
An Inconsistency occurs when the same piece of data is stored in more than one place with more than one spelling or
format. Redundant data therefore requires more complex SQL operations whenever data is updated, inserted, or deleted,
because every copy must be changed together.
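To illustrate how redundancy leads to an inconsistency, the sketch below assumes a hypothetical orders table that repeats the customer's city on every order row. Updating only one of the copies leaves the same customer with two different cities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, cust_city TEXT)"
)

# The city of customer 7 is stored redundantly, once per order.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 7, "Toronto"), (2, 7, "Toronto")],
)

# The update touches only one of the two copies.
conn.execute("UPDATE orders SET cust_city = 'Ottawa' WHERE order_id = 1")

cities_for_customer = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT cust_city FROM orders WHERE cust_id = 7 ORDER BY cust_city"
    )
]
```

After the update the query returns two cities for one customer, which is exactly the kind of problem normalization tries to prevent.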
UNF (Un-Normal Form Relation)
A table (relation) that has one or more repeating groups is said to be in un-normalized form.
(Chapter 02, page 32)
(Chapter 05, page 145)
Each row-and-column intersection should store a single piece of data. Problems will occur if there are multiple entries of
data for one row and column; such a set of entries is called a repeating group.
When creating UNF tables:
- Place brackets around repeating groups
- Calculated (derived) attributes are to be removed
- An identifier should be chosen which best reflects the information in the view (an ID or CODE or other identifier if one
exists)
1NF (First Normal Form Relation)
First Normal Form
A table (relation) is in first normal form (1NF) if it does not contain repeating groups.
1. All key (prime) attributes are defined.
2. There are no repeating groups in the table's composite key.
3. All attributes are dependent on the primary key.
When creating 1NF tables:
- Start with the most embedded group and join its columns and identifier to the identifier of the parent group. This join will
create a bridge/composite table whose Primary Key is made up of two or more attributes (the identifiers of the joined
groups). This table implements a M:N relationship between the two identified groups of attributes.
- If a proper identifier (Primary Key) does not exist, then you must create an appropriate identifier.
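The step from UNF to 1NF can be sketched in plain Python. The order and part data below are hypothetical, and the nested "lines" list stands in for a bracketed repeating group. The function moves each repeated entry to its own row and joins the parent identifier to the group's identifier.

```python
# Hypothetical UNF data: each order carries a repeating group of order lines.
unf_orders = [
    {"order_id": 1, "order_date": "2024-01-05",
     "lines": [{"part_id": "A1", "qty": 2}, {"part_id": "B2", "qty": 1}]},
    {"order_id": 2, "order_date": "2024-01-06",
     "lines": [{"part_id": "A1", "qty": 5}]},
]


def to_1nf(orders):
    """Split the repeating group into its own rows keyed by (order_id, part_id)."""
    order_rows = []
    order_line_rows = []
    for order in orders:
        order_rows.append((order["order_id"], order["order_date"]))
        for line in order["lines"]:
            # The parent identifier is joined to the repeating group's identifier.
            order_line_rows.append((order["order_id"], line["part_id"], line["qty"]))
    return order_rows, order_line_rows


order_rows, order_line_rows = to_1nf(unf_orders)
```

Every resulting row holds single values only, and each order line is identified by the combination of order_id and part_id.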
2NF (Second Normal Form Relation)
Second Normal Form
A table (relation) is in second normal form (2NF) if it is in first normal form and no nonkey attribute is dependent on only a
portion of the primary key.
1. Table is in 1NF
2. The table has no partial dependencies
When creating 2NF tables:
- Split apart tables whose Primary Key is made up of 2 or more attributes: assign each column to the new table whose key it
depends on, or leave the column in the composite table if it depends on the whole key.
- Break all combined columns into their smallest forms, such as a person's name becoming Fname and Lname, or an
address becoming Street, City, Province, and Postal.
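A partial dependency and its removal can be sketched as follows. The rows are hypothetical: the Primary Key is (order_id, part_id), but part_desc depends on part_id alone, so it moves to a separate part table.

```python
# Hypothetical 1NF rows: (order_id, part_id, part_desc, qty).
# part_desc depends only on part_id, which is just a portion of the key.
rows_1nf = [
    (1, "A1", "Bolt", 2),
    (1, "B2", "Nut", 1),
    (2, "A1", "Bolt", 5),
]

# New table keyed by part_id alone; the set removes the repeated descriptions.
part = sorted({(part_id, part_desc) for _, part_id, part_desc, _ in rows_1nf})

# qty depends on the whole key, so it stays in the composite table.
order_line = [(order_id, part_id, qty) for order_id, part_id, _, qty in rows_1nf]
```

The description "Bolt" is now stored once instead of once per order line.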
3NF (Third Normal Form Relation)
Third Normal Form
A table is in third normal form (3NF) if it is in second normal form and if the only determinants it contains are candidate
keys.
Any column (or collection of columns) that determines another column is called a determinant.
1. Table is in 2NF
2. The table has no transitive dependencies. Transitive dependencies are broken into separate tables.
3. The primary key and nothing but the primary key defines each non-key attribute.
When creating 3NF tables:
- When you move a transitively dependent group of columns into its own 3NF table, its identifier is left behind in the
original table as a Foreign Key. This implements a 1:M relationship.
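The removal of a transitive dependency can be sketched in the same way. In the hypothetical rows below, emp_id determines dept_id and dept_id determines dept_name, so dept_name moves to a department table and dept_id stays behind as the Foreign Key.

```python
# Hypothetical 2NF rows: (emp_id, emp_name, dept_id, dept_name).
rows_2nf = [
    (1, "Ann", 10, "Sales"),
    (2, "Bob", 10, "Sales"),
    (3, "Cy", 20, "Service"),
]

# New table for the determinant dept_id and the column it determines.
department = sorted({(dept_id, dept_name) for _, _, dept_id, dept_name in rows_2nf})

# dept_id is left behind in employee as a Foreign Key (the "many" side of 1:M).
employee = [(emp_id, emp_name, dept_id) for emp_id, emp_name, dept_id, _ in rows_2nf]

# Joining on the Foreign Key rebuilds the original rows, so no information was lost.
dept_lookup = dict(department)
rebuilt = [
    (emp_id, emp_name, dept_id, dept_lookup[dept_id])
    for emp_id, emp_name, dept_id in employee
]
```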