A2 Teacher Up skilling
LECTURE 1
Relational Databases
Entity Relationship Model
Database Design
Introduction
• Lectures + Practical
  • Wednesday 5.30pm-8.30pm
  • ECS1/02/014
  • Minor Lab
• Staff
  • Narelle Allen n.allen@qub.ac.uk
  • Neil Anderson n.anderson@qub.ac.uk
  • Angela Allen angela.allen@qub.ac.uk
  • Philip Hanna p.hanna@qub.ac.uk
  • Craig Dooley c.dooley@qub.ac.uk
www.computingatqueens.co.uk
• General information / contacts
• Teaching Resources:
  • Week by week overview
  • Weekly slides & exercises
• Comment & Discussion
  • Feedback for us
  • Problems & successes you’ve had
  • Supporting each other / learning from each other
• Teaching Community
  • Sharing materials / resources you have developed
• Portal Support: Craig Dooley [c.dooley@qub.ac.uk]
What’s to come today?
• Relational Databases
  • Database Schema
  • Keys
• Database Design
  • Entity Relationship Model
  • Normalisation
Goals: Design, Implement & Use
• Design a database
  • Using database design methodologies and database theories to
    create the best structure for a relational database
• Implement and maintain a database using a data definition
  language (DDL); we will use SQL for data definition
  • Defining the structure of a database
  • Making changes to the structure
• Access a database system (access and make changes to the
  contents) using a data manipulation language (DML), commonly
  called a query language (we will use SQL for data manipulation)
  via C#
  • Populating the database
  • Updating
  • Querying
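The split between data definition and data manipulation can be sketched as follows. The course itself issues SQL via C#; this sketch instead assumes Python's built-in sqlite3 module and an in-memory SQLite database, and borrows the customer table from the Customer relation used later in the lecture. The change of address is a hypothetical update.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course's DBMS.
conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute(
    "CREATE TABLE customer ("
    " customer_name TEXT PRIMARY KEY,"
    " customer_street TEXT)"
)
# DDL: make a change to the structure.
conn.execute("ALTER TABLE customer ADD COLUMN customer_city TEXT")

# DML: populate the database.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("Jones", "Main", "Harrison"), ("Smith", "North", "Rye")],
)
# DML: update existing contents (hypothetical change of address).
conn.execute(
    "UPDATE customer SET customer_street = ? WHERE customer_name = ?",
    ("Park", "Smith"),
)
# DML: query the contents.
rows = conn.execute(
    "SELECT customer_name, customer_street, customer_city"
    " FROM customer ORDER BY customer_name"
).fetchall()
print(rows)
```

The two CREATE/ALTER statements change only the structure, while INSERT, UPDATE and SELECT touch only the contents.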
Why learn about databases?
• Incredibly prevalent
  • Websites, telecommunications systems, banking systems, video
    games, health records, censuses, search engines, just about
    any other software system or electronic device that maintains
    some amount of persistent information.
• Properties making them exceptionally useful and convenient
  • Persistence, reliability, efficiency, scalability, concurrency
    control, safety, data abstractions, high-level query languages
What is a Database?
• A database is a store of structured data which can be accessed
  via a high-level query language.
• We use a DBMS (Database Management System) to create,
  maintain and access/query a database containing information about a
  particular application
  • Collection of structured data
  • Set of programs/commands to create and maintain the
    database structure for storing data and to access the data
• An environment that is both convenient and efficient to use
• An example of a DBMS is MySQL.
Structure of Relational Database
• A relational database basically consists of a number of tables,
  each of which is called a relation instance, or simply relation.
• Each table contains a number of rows and a number of columns.
• Each row is called a tuple.
• Each column is called an attribute.
customer_name  customer_street  customer_city
Jones          Main             Harrison
Smith          North            Rye
Curry          North            Rye
Lindsay        Park             Pittsfield
The column headings are the attributes (or columns); each line below them is a tuple (or row).
Customer Relation
Account Relation
Depositor Relation
• Notice how the Depositor relation links customer names
  (customer_name attribute from the Customer relation) to account
  numbers (account_number attribute from the Account relation). This
  is to show that a certain customer made a deposit towards a certain
  account.
Database Schema
• The schema of a table/relation is the structure of the
  table/relation, called a relation schema.
• The structure of a relational database is specified by a database
  schema, which contains a set of relation schemas.
• Each relation schema consists of a relation name and a number of
  attributes.
• Each attribute has a particular attribute type (similar to data types
  in programming), that is, a domain of values.
• A relation schema describes the structure and semantics of a relation.
  Example:
  Customer(customer_name, customer_street, customer_city)
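As a sketch of how a relation schema maps onto a table definition, the Customer schema above could be declared and inspected as follows. The column type TEXT and the use of Python's sqlite3 module are assumptions for illustration; the slides do not state the attribute domains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation schema Customer(customer_name, customer_street, customer_city).
# Assumption: every attribute has the domain TEXT.
conn.execute(
    "CREATE TABLE customer ("
    " customer_name TEXT,"
    " customer_street TEXT,"
    " customer_city TEXT)"
)

# PRAGMA table_info returns one row per attribute:
# (cid, name, type, notnull, dflt_value, pk)
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
print(schema)
```

The schema exists independently of any rows: the table is still empty, yet its attribute names and types can already be listed.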
Attribute Values
• The set of allowed values for each attribute is called the domain of
  the attribute
• Attribute values are (normally) required to be atomic; that is,
  indivisible
  • E.g. the value of an attribute can be an account number,
    but cannot be a set of account numbers
• The special value null is a member of every domain
Keys – Primary Key
• Each tuple in a relation/table needs to be uniquely identified, e.g., a
  tuple may represent a student record, a module, or an employee
  record
• Simply put, a key is an attribute (or set of attributes) K whose
  values identify a unique tuple in each possible relation r(R),
  e.g., K = {customer_name}.
• Primary key: an attribute or set of attributes that is chosen as the
  principal means of identifying tuples within a relation
  • Choose an attribute whose value never, or very rarely,
    changes, e.g., national insurance number or customer_id.
    For instance, an email address is unique, but may change.
• We normally underline the primary key.
  For instance, instructor(ID, name, dept_name, salary), where ID is
  underlined as the primary key.
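A minimal sketch of what choosing a primary key buys us, assuming Python's sqlite3 module. The customer table, its customer_id values and the example.com email addresses are hypothetical; only the idea of a stable customer_id alongside a changeable email comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: customer_id is a hypothetical, never-changing identifier.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " customer_name TEXT,"
    " email TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Jones', 'jones@example.com')")

# A second tuple with the same primary key value is rejected by the DBMS.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Smith', 'smith@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The email can change without affecting how the tuple is identified.
conn.execute(
    "UPDATE customer SET email = 'j.jones@example.com' WHERE customer_id = 1"
)
row = conn.execute("SELECT * FROM customer WHERE customer_id = 1").fetchone()
print(duplicate_rejected, row)
```

Had email been the primary key, the UPDATE would have changed the identity of the tuple, and every reference to it elsewhere would need to change too.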
Keys – Foreign Key
• An attribute of a relation schema is called a foreign key if it
  corresponds to the primary key of another relation schema.
  E.g. the customer_name and account_number attributes in depositor
  are foreign keys that correspond to the primary keys of customer
  and account respectively.
• Only values occurring in the primary key attribute of the referenced
  relation may occur in the foreign key attribute of the referencing
  relation. This is known as Referential Integrity.
Referential Integrity
• Referential Integrity is a set of constraints imposed by a Relational
  Database Management System that prevents users from having
  inconsistent data.
• In our example, the Depositor relation has 2 foreign keys
  (customer_name and account_number) that reference the primary
  keys of the Customer and Account relations respectively.
• Through referential integrity, we cannot add a row to the Depositor
  relation that contains an account number that does not exist in the
  Account relation. We also cannot add a customer name to the
  Depositor relation if that customer does not exist in the Customer
  relation.
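The rule above can be tried out with a small sketch, assuming Python's sqlite3 module. The account numbers (A-101, A-999) and the helper try_insert are hypothetical, and the tables are cut down to their key attributes. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (customer_name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE depositor ("
    " customer_name TEXT REFERENCES customer(customer_name),"
    " account_number TEXT REFERENCES account(account_number),"
    " PRIMARY KEY (customer_name, account_number))"
)
conn.execute("INSERT INTO customer VALUES ('Jones')")
conn.execute("INSERT INTO account VALUES ('A-101')")  # hypothetical account number


def try_insert(name, number):
    """Return True if the depositor row was accepted, False if it was rejected."""
    try:
        conn.execute("INSERT INTO depositor VALUES (?, ?)", (name, number))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Jones", "A-101"),  # customer and account both exist
    try_insert("Jones", "A-999"),  # no such account
    try_insert("Brown", "A-101"),  # no such customer
]
print(results)
```

Only the first row is accepted; the other two would leave the Depositor relation pointing at a customer or account that does not exist.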
Referential Integrity (cont.)
• Furthermore, referential integrity may also specify that when you
  delete a primary key record from a certain table, any foreign key
  records linked to that primary key from a different table are also
  deleted.
• In our case, if you delete a Customer record from the Customer
  relation, then all of the records in the Depositor relation that
  reference that Customer are also deleted. This is known as a
  cascade delete.
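A cascade delete can be sketched with the ON DELETE CASCADE clause, again assuming Python's sqlite3 module. The account numbers are hypothetical, and the Depositor table here declares only the customer_name foreign key to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the cascade
conn.execute("CREATE TABLE customer (customer_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE depositor ("
    " customer_name TEXT REFERENCES customer(customer_name) ON DELETE CASCADE,"
    " account_number TEXT)"
)
conn.executemany("INSERT INTO customer VALUES (?)", [("Jones",), ("Smith",)])
conn.executemany(
    "INSERT INTO depositor VALUES (?, ?)",
    [("Jones", "A-101"), ("Jones", "A-102"), ("Smith", "A-215")],  # hypothetical rows
)

# Deleting the customer also deletes every depositor row that references it.
conn.execute("DELETE FROM customer WHERE customer_name = 'Jones'")
remaining = conn.execute(
    "SELECT customer_name, account_number FROM depositor"
).fetchall()
print(remaining)
```

Without the ON DELETE CASCADE clause, the same DELETE would be rejected while depositor rows still reference Jones.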
Schema Diagram
• A Schema diagram shows the connections between each of the
  relation schemas.
Entity Relationship Model
• Given an application problem, we need to create a data model to
  capture the data and relationships between data specified in the
  given problem
  • We already know how to use a relational model
  • But how do we get it in the first place?
• Create an Entity-Relationship model by designing an E-R diagram,
  a graphical representation of entities and relationships between
  entities
• We can then convert the E-R diagram into a relational model by
  following the rules of normalization.
Entity Relationship Modelling
• In terms of an E-R model, a database can be modeled as:
  • a collection of entities,
  • relationships among entities.
• An entity is an object that exists and is distinguishable from other
  objects.
  • Example: specific person (e.g., John, Mary), company (e.g.,
    IBM, Microsoft), event (e.g., car accidents, traffic jams)
• Entities have attributes that characterize them
  • Example: people have names and addresses
• An entity set is a set of entities of the same type that share the
  same properties.
  • Example: set of all persons, companies, trees, holidays
Entity Sets (Instructor, Student)
instructor
ID     name        salary
76766  Crick       72000
45565  Katz        75000
10101  Srinivasan  65000
98345  Kim         80000
76543  Singh       80000
22222  Einstein    95000
…
student
ID     name     Tot_cred
98988  Tanaka   120
12345  Shankar  32
10128  Zhang    102
76543  Brown    58
76653  Aoi      60
23121  Chavez   110
44553  Peltier  56
…
What does this set represent?
advisor
instructor_ID  student_ID
76766          98988
45565          12345
45565          10128
10101          76543
98345          76653
76543          23121
22222          44553
…
This table represents a relationship set, which contains a set
of advisor relationships between instructors and students.
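To see how a relationship set ties two entity sets together, the following sketch loads a few of the rows above and joins them. It assumes Python's sqlite3 module; the column types are assumptions, and only the first rows of each table are included to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT, tot_cred INTEGER)")
conn.execute("CREATE TABLE advisor (instructor_ID TEXT, student_ID TEXT)")

conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("76766", "Crick", 72000), ("45565", "Katz", 75000)],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("98988", "Tanaka", 120), ("12345", "Shankar", 32), ("10128", "Zhang", 102)],
)
conn.executemany(
    "INSERT INTO advisor VALUES (?, ?)",
    [("76766", "98988"), ("45565", "12345"), ("45565", "10128")],
)

# Each advisor row pairs one instructor entity with one student entity.
pairs = conn.execute(
    "SELECT i.name, s.name FROM advisor a"
    " JOIN instructor i ON i.ID = a.instructor_ID"
    " JOIN student s ON s.ID = a.student_ID"
    " ORDER BY i.name, s.name"
).fetchall()
print(pairs)
```

Katz appears twice in the result because instructor 45565 occurs in two advisor rows.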
Relationship Set (Advisor)
[Figure: the instructor (ID, name, salary) and student (ID, name, Tot_cred) entity sets side by side, with links drawn between advisors and their students]
Each of the links represents an advisor relationship between an
instructor and a student.
Note that instructor Katz is an advisor to 2 students.
Mapping Cardinality Constraints
• Express the number of entities in one entity set to which an
  entity in another entity set can be associated via a relationship set.
• The mapping cardinality must be one of the following types:
  • One to one
  • One to many
  • Many to one
  • Many to many
One-to-One Mapping
• One to one mapping means that
  • one entity on the left can be associated with at most one entity
    on the right and
  • one entity on the right can be associated with at most one
    entity on the left
• For example, each instructor has at most one advisee and each
  student has at most one advisor
One-to-One Mapping Example
[Figure: the instructor and student entity sets side by side, with links illustrating a one-to-one mapping]
One-to-Many Mapping
• One to many mapping means that
  • one entity on the left can be associated with many entities
    (possibly 0) on the right and
  • one entity on the right can be associated with at most one
    entity on the left
• For example, each instructor has several advisees (possibly 0)
  and each student has at most one advisor
One-to-Many Mapping Example
[Figure: the instructor and student entity sets side by side, with links illustrating a one-to-many mapping]
Many-to-One Mapping
• Many to one mapping means that
  • one entity on the left can be associated with at most one entity
    on the right and
  • one entity on the right can be associated with many entities
    (possibly 0) on the left
• For example, each instructor has at most one advisee and each
  student has several advisors (possibly 0)
Many-to-One Mapping Example
[Figure: the instructor and student entity sets side by side, with links illustrating a many-to-one mapping]
Many-to-Many Mapping
• Many to many mapping means that
  • one entity on the left can be associated with many entities
    (possibly 0) on the right and
  • one entity on the right can be associated with many entities
    (possibly 0) on the left
• For example, each instructor has several advisees (possibly 0)
  and each student has several advisors (possibly 0)
Many-to-Many Mapping Example
[Figure: the instructor and student entity sets side by side, with links illustrating a many-to-many mapping]
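The four mapping types can be told apart by counting how often each entity appears in a relationship set. The sketch below is one possible way to do that in Python; the function name mapping_cardinality is hypothetical. It reports the tightest mapping type that the given pairs are consistent with, whereas a cardinality constraint is a design decision about which pairs are allowed at all.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a relationship set given as (left_id, right_id) pairs."""
    pairs = set(pairs)  # a relationship set holds each pair at most once
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # Does some left entity link to several right entities, and vice versa?
    left_has_many = any(n > 1 for n in left_counts.values())
    right_has_many = any(n > 1 for n in right_counts.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# (instructor_ID, student_ID) pairs from the advisor table earlier in the lecture.
advisor = [
    ("76766", "98988"), ("45565", "12345"), ("45565", "10128"),
    ("10101", "76543"), ("98345", "76653"), ("76543", "23121"),
    ("22222", "44553"),
]
print(mapping_cardinality(advisor))
```

For the advisor data the result is one-to-many, because instructor 45565 (Katz) has two advisees while no student has more than one advisor.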
ER Diagrams
• Rectangles represent entity sets.
• Diamonds represent relationship sets.
• Attributes are listed inside the entity rectangle.
• Underline indicates primary key attributes.
• Lines link entity sets to relationship sets.
Representing Cardinality Constraints
• We express cardinality constraints by drawing either a directed line
  (→), signifying “one”, or an undirected line (—), signifying
  “many”, between the relationship set and the entity set.
• One-to-one relationship:
  • A student is associated with at most one instructor via the
    relationship advisor
  • A student is associated with at most one department via
    stud_dept
One-to-One Relationship
• one-to-one relationship between an instructor and a student
  • an instructor is associated with at most one student via advisor
  • and a student is associated with at most one instructor via
    advisor
One-to-Many Relationship
• one-to-many relationship between an instructor and a student
  • an instructor is associated with several (possibly 0) students
    via advisor
  • a student is associated with at most one instructor via advisor
Many-to-One Relationship
• In a many-to-one relationship between an instructor and a student,
  • an instructor is associated with at most one student (possibly
    0) via advisor,
  • and a student is associated with several (possibly 0) instructors
    via advisor
Many-to-Many Relationship
• An instructor is associated with several (possibly 0) students via
  advisor
• A student is associated with several (possibly 0) instructors via
  advisor
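Once the diagram is turned into tables, a cardinality constraint can be enforced by the DBMS. The sketch below shows one way to do this for the one-to-many case, assuming Python's sqlite3 module and an advisor table reduced to its two ID columns: a UNIQUE constraint on student_ID lets each student appear at most once, while instructor_ID may repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One-to-many from instructor to student:
# instructor_ID may repeat, student_ID may appear at most once.
conn.execute(
    "CREATE TABLE advisor ("
    " instructor_ID TEXT NOT NULL,"
    " student_ID TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO advisor VALUES ('45565', '12345')")
conn.execute("INSERT INTO advisor VALUES ('45565', '10128')")  # same instructor: allowed

# A second advisor for student 12345 would break the "at most one" side.
try:
    conn.execute("INSERT INTO advisor VALUES ('76766', '12345')")
    second_advisor_allowed = True
except sqlite3.IntegrityError:
    second_advisor_allowed = False

count = conn.execute("SELECT COUNT(*) FROM advisor").fetchone()[0]
print(second_advisor_allowed, count)
```

Under the same approach, a one-to-one relationship would also declare instructor_ID as UNIQUE, and a many-to-many relationship would make only the pair (instructor_ID, student_ID) unique.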
Normalization
• Normalization is the process of organizing data in a database into an
  appropriate design.
• Normalization is important as it imposes a set of rules that, when
  followed, ensure that our database design is good (minimizes data
  duplication and redundancy).
• In this course, we will consider 1st, 2nd and 3rd Normal Form.
• The forms are progressive, so in order to be in 2nd Normal Form, the
  database must also satisfy the rules for 1st Normal Form and so on.
• You should strive to have a database that is in 3rd Normal Form.
1st Normal Form
• A database is in 1st Normal form if it satisfies the following
  conditions:
  • Does not contain any repeating groups
  • All attributes are atomic (i.e. indivisible units)
• Suppose we had a Student relation that was not in 1st Normal form
  because not all attributes are atomic. Let’s assume that the Student
  attribute is the primary key.
1st Normal Form (cont.)
• We would separate this data into multiple rows so that all attributes
  are atomic. Now we have a Student table in 1st Normal Form.
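The split into multiple rows can be sketched in a few lines of Python. The row values below are hypothetical, since the original table is not reproduced here; the attribute names Student, Age and Subject are the ones this lecture uses for the Student relation, and the function name to_first_normal_form is an assumption.

```python
def to_first_normal_form(rows):
    """Split a comma-separated Subject value into one row per subject."""
    atomic_rows = []
    for student, age, subjects in rows:
        for subject in subjects.split(","):
            atomic_rows.append((student, age, subject.strip()))
    return atomic_rows


# Hypothetical data: the Subject attribute of the first row is not atomic.
unnormalised = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
]
student_1nf = to_first_normal_form(unnormalised)
for row in student_1nf:
    print(row)
```

After the split, Adam appears in two rows, so the Student attribute on its own no longer identifies a single row.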
2nd Normal Form
• A database is in 2nd Normal form if it satisfies the following
  conditions:
  • It is in 1st Normal form.
  • There are no partial dependencies, i.e. no attribute depends on
    only some of the columns (attributes) of the primary key.
• In our Student relation, the Age attribute depends only upon the
  Student attribute, not upon Subject. Therefore we will extract the
  Student attribute and the attribute that Age does not depend on
  (Subject) to a new table. These extracted attributes will form a
  composite primary key (Student, Subject) in the new table.
2nd Normal Form (cont.)
• Now we have 2 relations, the Student relation and the Subject
  relation.
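The decomposition into two relations can be sketched as follows. The rows are hypothetical, and the function name to_second_normal_form is an assumption; the attribute names and the resulting Student and Subject relations follow the description above.

```python
def to_second_normal_form(rows):
    """Split 1NF (student, age, subject) rows into two relations."""
    # Student(Student, Age): Age depends on Student alone.
    student = sorted({(name, age) for name, age, _ in rows})
    # Subject(Student, Subject): composite primary key (Student, Subject).
    subject = sorted({(name, subj) for name, _, subj in rows})
    return student, subject


# Hypothetical rows, already in 1st Normal Form.
rows_1nf = [
    ("Adam", 15, "Biology"),
    ("Adam", 15, "Maths"),
    ("Alex", 14, "Maths"),
]
student_rel, subject_rel = to_second_normal_form(rows_1nf)
print(student_rel)
print(subject_rel)
```

Adam's age is now stored once instead of once per subject, which is the duplication that 2nd Normal Form removes.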
3rd Normal Form
• A database is in 3rd Normal form if it satisfies the following
  conditions:
  • It is in 2nd Normal form.
  • All non-primary fields are dependent on the primary key, and
    not on another non-primary field.
• Suppose we had a Student detail table with Student_id as the
  primary key.
• In this table, the street, city and state attributes depend upon the Zip
  attribute, which is not the primary key. Therefore this fails the
  condition that all non-primary fields are dependent on the primary
  key.
3rd Normal Form (cont.)
• To apply 3rd Normal Form to this table, we move the attributes that
  are not dependent upon the primary key to a new table, along with
  the attribute that they are actually dependent upon.
• In our example, we move the street, city and state attributes to a
  new table, with the Zip attribute as the primary key. We will call this
  new table the Address Table.
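A sketch of this last step in Python, under stated assumptions: the rows and the function name to_third_normal_form are hypothetical, and the table is reduced to the attributes named above (Student_id, street, city, state, Zip).

```python
def to_third_normal_form(rows):
    """Move street, city and state into an Address table keyed on zip."""
    # The Student detail table keeps zip as a foreign key into Address.
    students = [(r["student_id"], r["zip"]) for r in rows]
    # Address table: zip is the primary key, so each zip is stored once.
    address = {}
    for r in rows:
        address[r["zip"]] = (r["street"], r["city"], r["state"])
    return students, address


# Hypothetical rows: students 1 and 2 share a zip, so their address repeats.
student_detail = [
    {"student_id": 1, "street": "Main", "city": "Rye", "state": "NY", "zip": "11111"},
    {"student_id": 2, "street": "Main", "city": "Rye", "state": "NY", "zip": "11111"},
    {"student_id": 3, "street": "North", "city": "Harrison", "state": "NY", "zip": "22222"},
]
students, address = to_third_normal_form(student_detail)
print(students)
print(address)
```

The address shared by students 1 and 2 is now held in a single Address row, and a change to it only has to be made in one place.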
What’s to come next time
• Week 2
  • Querying database
  • SQL (Structured Query Language)