Midterm

advertisement
RELATIONAL DATABASE
Three classical data models
1. hierarchical model
2. network model
o each tuple is a separate record, e.g., (AN 145) --- [Perugini] --- (CPS 430)
o no separation between logical and physical views
o used record-at-a-time languages
o too low-level
3. relational model
o proposed by E.F. Codd in 1969
o most popular and successful model
o de facto standard for databases
o (relational) databases are one of the most popular success stories of simple theoretical ideas
o the focus of CPS 430/542
Relational Database
• Definition:
•
Data stored in tables that are associated by shared attributes (keys).
•
Any data element (or entity) can be found in the database through the name of the table, the attribute
name, and the value of the primary key.
Relational Database Definitions
• Entity: Object, Concept or event (subject)
•
Attribute: a Characteristic of an entity
•
Row or Record: the specific characteristics of one entity
•
Table: a collection of records
•
Database: a collection of tables
The Relational Database model
• Developed by E.F. Codd, C.J. Date (70s)
•
Table = Entity = Relation
•
Table row = tuple = instance
•
Table column = attribute
•
Table linkage by values
•
Entity-Relationship Model
The Relational Model
• Each attribute has a unique name within an entity
•
All entries in the column are examples of it
•
Each row is unique
•
Ordering of rows and columns is unimportant
•
Each position (tuple) is limited to a single entry.
Data Model: What’s a model?
• A data model is a representation of reality
•
It’s used to define the storage and manipulation of a data base.
Relational Database Definitions
• Entity: Object, Concept or event (subject)
•
Attribute: a Characteristic of an entity
•
Row or Record: the specific characteristics of one entity
•
Table: a collection of records
•
Database: a collection of tables
The Relational Database model
• Developed by E.F. Codd, C.J. Date (70s)
•
Table = Entity = Relation
•
Table row = tuple = instance
•
Table column = attribute
•
Table linkage by values
•
Entity-Relationship Model
The Relational Model
• Each attribute has a unique name within an entity
•
All entries in the column are examples of it
•
Each row is unique
•
Ordering of rows and columns is unimportant
•
Each position (tuple) is limited to a single entry.
Advantages of Relational Database over Flat File
1. Avoids Data Duplication
2. Avoid Inconsistent Records
3. Easier to change Data
4. Easier to change data format
5. Data can be added and remove easily
6. Easier to maintain security
Database Table Keys
Definition:
A key of a relation is a subset of attributes with the following attributes:
• Unique identification
•
Non-redundancy
Types of Keys
a. PRIMARY KEY
•
Serves as the row level addressing mechanism in the relational database model.
•
It can be formed through the combination of several items.
•
Indicates uniqueness within records or rows in a table
b. FOREIGN KEY
•
A column or set of columns within a table that are required to match those of a primary key of a second
table.
•
the primary key from another table, this is the only way join relationships can be established.
These keys are used to form a RELATIONAL JOIN - thereby connecting row to row across the individual
tables.
Keys
• Primary key – minimal subset of fields that is unique identifier for a tuple
•
•
sid is primary key for Students
•
cid is primary key for Courses
Foreign key –connections between tables
•
Courses (cid, instructor, quarter, dept)
•
Students (sid, name, login, age, gpa)
•
How do we express which students take each course?
Constructing Join Relationships
• One-to-many relationships include the Primary Key of the ‘one’ table and a Foreign Key (FK) in the ‘many’
table.
•
Cardinality: one-to-one, one-to-many, many-to-many relationships
Optionality: the relationship is either mandatory or optional
Normalization
• Normalization: a process for analyzing the design of a relational database
•
Database Design - Arrangement of attributes into entities
•
It permits the identification of potential problems in your database design
•
Concepts related to Normalization:
•
KEYS and FUNCTIONAL DEPENDENCE
1. Database Normalization (1)
•
Sample Student Activities DB
•
Poorly Designed
•
Non-unique records
•
•
Table
John Smith
Test the Design by developing
and queries
2. Database Normalization (2)
● Created a unique “ID” for each Record
Table
● Required the creation of an “ID” lookreporting (Students Table)
● Converted the “Flat-File into a
Database
sample reports
in the Activities
up table for
Relational
3. Database Normalization (3)
• Wasted Space
•
Redundant data entry
•
What about taking a 3rd Activity?
•
Query Difficulties - trying to find all swimmers
•
Data Inconsistencies - conflicting
prices
4. Database Normalization (4)
● Students table is fine
●
Elimination of two columns and
Table restructuring, Simplifies the
an Activities
Table
●
BUT, we still have Redundant
fees) and data insertion
data (activity
anomalies.
5. Database Normalization (5)
•
Modify the Design to ensure that
key field is dependent on the
• Creation of the Participants
“every nonwhole key”
Table,
corrects our problems and forms a union between 2 tables.
Database Design: Basic Steps
Step 1: Determine the entities involved and create a separate table for each type of entity (thing, concept, event,
theme) and name it.
Step 2: Determine the Primary Key for each table.
Step 3: Determine the properties for each entity (the non-key attributes).
Step 4: Determine the relationships among the entities
Database Views
• A View is an individual’s picture of a database. It can be composed of many tables, unbeknownst to the user.
•
It’s a simplification of a complex data model
•
It provides a measure of database security
•
Views are useful, primarily for READ-only users and are not always safe for CREATE, UPDATE, and
DELETE.
Table Indexing
• An Index is a means of expediting the retrieval of data.
•
Indexes are “built” on a column(s).
•
Indexes occupy disk space; occasionally a lot.
•
Indexes aren’t technically necessary for operation and must be maintained by the database administrator.
.
SUMMATIVE QUIZ
Self-Check Activity 2.1
Problem No. 1 (5 points)
a. A company database needs to store information about employees (identified by ssn, with salary and phone
as attributes), departments (identified by dno, with dname and budget as attributes), and children of employees
(with name and age as attributes).
b. Employees work in departments; each department is managed by an employee; a child must be identified
uniquely by name when the parent (who is an employee; assume that only one parent works for the company)
is known. We are not interested in information about a child once the parent leaves the company.
Note: Populate your tables with at least 10 records.
Problem No. 2 (10 points)
a. Although you always wanted to be an artist, you ended up being an expert on databases because you love
to cook data and you somehow confused database with data baste. Your old love is still there, however, so you
set up a database company, ArtBase that builds a product for art galleries. The core of this product is a
database with a schema that captures all the information that galleries need to maintain.
b. Galleries keep information about artists, their names (which are unique), birthplaces, age,and style of art.
For each piece of artwork, the artist, the year it was made, its unique title, its type of art (e.g., painting,
lithograph, sculpture, photograph), and its price must be stored. Pieces of artwork are also classified into
groups of various kinds, for example, portraits, still lifes, works by Picasso, or works of the 19th century; a
given piece may belong to more than one group.
c. Each group is identified by a name (like those just given) that describes the group. Finally, galleries keep
information about customers. For each customer, galleries keep that person’s unique name, address, total
amount of dollars spent in the gallery (very important!), and the artists and groups of art that the customer
tends to like.
Note: Populate your tables with at least 10 records.
Problem No. 3 (10 points)
Design a Relational Database for Access Registrar’s Office. The Office maintains data about each class, including the
instructor, the number of student enrolled, and the time and place of the class meetings. For each student-class pair, a
grade is recorded.
Underlined attributes indicate the primary key.
Student (StudentID, Name, Program)
Course (CourseNo, Title, Syllabus, Credits)
CourseOffering (CourseNo, SectionNo, Year, Semester, Time, Room)
Instructor (InstructoNo, Name, Department, Title)
Enrols (StudentNo, CourseNo, SectionNo, Semester, Year, Grade)
Teaches (CourseNo, SectionNo, Semester, Year, InstructorNo)
Requires (MainCourse, PreRequisite)
Download