data model

Introduction to relational databases
Introduction to the database field:
 Applications, concepts and
terminology
 Introduction to the relational
model
FEN 2014-02-02
Databases and Database Applications
Textual and numerical
databases
Multimedia Databases
GIS (Geographic
Information Systems)
Data Warehouse
Real-time and active
databases
And many, many more
What is a Database?
A database is a collection of related data.
A database is a logically coherent collection of data
with some inherent meaning. Hence a random
collection of data is not a database.
A database is designed, built and populated with
data for a specific purpose for a specific group of
users with requirements for specific applications.
A database represents some aspect of the real
world, sometimes called the miniworld or domain.
What is a DBMS?
Sometimes called a
Database Server or Database Engine
A DBMS (Database Management System) is a
collection of programs that enables the users
to create and maintain databases.
This involves
Defining a database (data types, structures and
constraints) and handling metadata.
Storing the data on some suitable storage
medium and controlling the storage.
Manipulating the database (executing queries and
updating the database)
Sharing the database, allowing multiple users and
applications to access the database
simultaneously.
Providing interfaces to interactive users and/or
application programs.
Do you know
any DBMSs?
For instance:
Oracle
MySQL
MS SQL Server
PostgreSQL
and many more
What is a Database System?
A database system is a database and a database
management system put together:
Database + DBMS == Database system
Overview
Database
DBMS
Database System
Data Models
Database systems are built upon a data model.
A data model is a collection of concepts used to describe the
structure of a database.
A data model provides data abstraction, hides storage details
and gives users (and developers) a conceptual view of the
database.
A data model should provide means to describe:
Structure
Data types
Relationships
Constraints
Operations
Data Models
Legacy (pre-relational)
Hierarchical
Network
Record based
Relational
Post-relational
Object oriented
Object-relational
Temporal
XML
Cubes in Data Warehouse
...
So the relational
model is very central.
Relational Databases
All data is organised in tables
with atomic values
Associations are represented
by primary key/foreign key
connections
Every operation operates on
tables and returns tables
Small exercise (5 min.):
Describe step by step how you
will retrieve this information:
When did Mr Smith attend the
“Intro to Computer Science”
course, who was the instructor
and what grade did he receive?
The Relational Model
A sound theoretical data model
(Codd, 1970).
Based on the mathematical
theory of relations, sets and
first order predicate logic.
De facto standard since the late
eighties.
Many, many implementations,
most of them SQL-based.
For instance:
Oracle
MySQL
MS SQL Server
PostgreSQL
[Figure: The Notorious Supplier-Part Database (Date)]
The Relational Model: Concepts
The Notorious Supplier-Part Database (Date)
Central concepts:
Tables (relations).
Columns (attributes).
Type (domain).
Rows (tuples).
Tuples are
unordered.
Tuples are unique.
A relation is a set
(mathematical) of
tuples.
Primary and foreign
keys
The Relational Model
Data is organised in a number of tables (relations).
Each table has a number (>=1) of columns (attributes).
Attributes are atomic and defined over some domain.
A table holds a number (maybe none) of rows (tuples).
Tuples are unordered.
Tuples are unique (existence of a key is guaranteed).
A relation is a set (mathematical) of tuples.
Attributes and Domains
A domain defines the valid values of an attribute.
Domains are based on the built-in standard data types
(int, char etc.) offered by the DBMS.
Theoretically it should be possible to define problem-specific
domains such as CPR numbers, account numbers, IP
addresses etc. and complex aggregate (structured)
domains such as maps, diagrams, pictures, sound bites, video
clips etc.
Several attributes may be defined over the same domain.
An attribute may have the value “empty” (not known
/ not defined for this instance). Empty is denoted NULL.
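A minimal sketch of a problem-specific domain and of NULL, using Python's built-in sqlite3 module. The table, the four-digit rule for account numbers and the data are assumptions made up for illustration; other DBMSs offer similar constructs with different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK clause narrows the built-in INTEGER type
# to a problem-specific domain (four-digit account numbers).
conn.execute("""
    CREATE TABLE Account (
        accNo   INTEGER NOT NULL CHECK (accNo BETWEEN 1000 AND 9999),
        comment TEXT    -- no NOT NULL here, so the value may be "empty"
    )
""")

# A valid account number with an empty (NULL) comment is accepted.
conn.execute("INSERT INTO Account VALUES (1234, NULL)")

# A value outside the domain is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Account VALUES (7, 'outside the domain')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# NULL is not an ordinary value: it is found with IS NULL, not with "=".
found_with_is_null = conn.execute(
    "SELECT accNo FROM Account WHERE comment IS NULL").fetchall()
found_with_equals = conn.execute(
    "SELECT accNo FROM Account WHERE comment = NULL").fetchall()
```

Here `rejected` ends up true, the `IS NULL` query finds account 1234, and the `= NULL` query finds nothing.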
Properties of a Relation
Follows from the fact that relations are (mathematically) sets:
Tuples must be unique within a relation (hence a primary key always exists)
Tuples are unordered (vertically)
Attributes are unordered (horizontally)
Attribute values are atomic
Note the difference to the usual notion of a table
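The set properties can be tried out with a small sketch in which a relation is a Python set and each tuple is a frozenset of (attribute, value) pairs. The helper name `make_tuple` and the sample data are assumptions for illustration only.

```python
def make_tuple(**attributes):
    """Build a tuple as an unordered collection of (attribute, value) pairs."""
    return frozenset(attributes.items())

# A relation is a set of tuples (sample data, not from the slides).
customer = {
    make_tuple(custNo=1, name="Peter"),
    make_tuple(custNo=2, name="Anna"),
}

# Tuples are unique: adding an existing tuple changes nothing.
customer.add(make_tuple(custNo=1, name="Peter"))
size_after_duplicate = len(customer)

# Attributes are unordered: the same pairs in another order are the same tuple.
same_tuple = make_tuple(name="Anna", custNo=2) in customer

# Tuples are unordered: the same tuples listed in another order are the same relation.
same_relation = customer == {
    make_tuple(custNo=2, name="Anna"),
    make_tuple(custNo=1, name="Peter"),
}
```

A table in an SQL product, by contrast, may hold duplicate rows unless a key is declared, and its columns have a fixed left-to-right order.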
Keys
A key is a combination of attributes that is:
Unique and
Minimal
An attribute combination that is unique,
but not necessarily minimal, is called a superkey
The set of all attributes will always be a
superkey, hence a superkey (and a key)
always exists.
A relation (table) may have several
candidate keys.
One of these is appointed primary key.
Any primary keys here?
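The two requirements (unique and minimal) can be sketched as checks on the rows of a table. The function names and the sample rows are assumptions for illustration. Note that a key is a property of the schema, so a check on one set of rows can only refute a key, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """Unique: no two rows agree on all the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def is_key(rows, attrs):
    """Unique and minimal: no proper subset of attrs is itself unique."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(len(attrs)):            # sizes 0 .. len(attrs) - 1
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False
    return True

# Sample rows (illustrative only).
accounts = [
    {"accNo": 101, "custNo": 1, "balance": 1500},
    {"accNo": 102, "custNo": 1, "balance": 300},
    {"accNo": 103, "custNo": 2, "balance": 1500},
]

acc_no_is_key = is_key(accounts, ("accNo",))
pair_is_superkey = is_superkey(accounts, ("accNo", "custNo"))
pair_is_key = is_key(accounts, ("accNo", "custNo"))
cust_no_is_superkey = is_superkey(accounts, ("custNo",))
```

In these rows `("custNo", "balance")` also happens to pass `is_key`, although a customer could easily get two accounts with the same balance. This is why keys are decided from the meaning of the data, not from the current rows.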
Associations Between Relations
Associations are represented by foreign keys.
A foreign key is an attribute
(combination) that corresponds to an
attribute (combination) of the primary
key of some other relation.
A foreign key references a tuple in
another relation and indicates where
more information about the entity can be found.
Foreign key attributes and
corresponding primary key attributes
must be defined over compatible
domains (or even the same domain).
Any foreign keys here?
Integrity Constraints
Domain constraints
Attributes may only hold valid values
Entity Integrity
Primary key attributes may not hold NULL-values
Referential Integrity (foreign key constraint)
A foreign key must either be NULL or reference an
existing primary key in the other relation
Semantic Integrity
Constraints depending on the problem domain
Any constraints here?
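Entity integrity and referential integrity, as defined above, can be sketched as plain checks over rows, with Python's None standing in for NULL. The function names and the sample data are assumptions for illustration; a real DBMS enforces these rules itself once they are declared.

```python
def entity_integrity_ok(rows, pk):
    """No primary key attribute is NULL, and primary key values are unique."""
    keys = [tuple(row[a] for a in pk) for row in rows]
    no_nulls = all(value is not None for key in keys for value in key)
    return no_nulls and len(set(keys)) == len(keys)

def referential_integrity_ok(child_rows, fk, parent_rows, pk):
    """Every foreign key is either entirely NULL or matches an existing primary key."""
    existing = {tuple(row[a] for a in pk) for row in parent_rows}
    for row in child_rows:
        value = tuple(row[a] for a in fk)
        if all(v is None for v in value):
            continue                      # NULL foreign key: allowed
        if value not in existing:
            return False                  # dangling reference
    return True

# Sample data (illustrative only).
customers = [{"custNo": 1, "name": "Peter"}, {"custNo": 2, "name": "Anna"}]
accounts_ok = [{"accNo": 101, "custNo": 1}, {"accNo": 102, "custNo": None}]
accounts_bad = accounts_ok + [{"accNo": 103, "custNo": 9}]

pk_ok = entity_integrity_ok(customers, ["custNo"])
pk_null = entity_integrity_ok(customers + [{"custNo": None, "name": "X"}], ["custNo"])
fk_ok = referential_integrity_ok(accounts_ok, ["custNo"], customers, ["custNo"])
fk_bad = referential_integrity_ok(accounts_bad, ["custNo"], customers, ["custNo"])
```

Semantic integrity has no general check of this kind, since it depends on the rules of the problem domain.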
Example: MiniBank
Two tables:
Customers
Accounts
Associated:
An account must
belong to exactly
one customer
Association
Any constraints here?
(primary keys – foreign keys)
Example: MiniBank
What happens if:
We try to insert a customer
with an existing custNo?
We try to insert an account
with a non-existing custNo?
Let’s try in MS SQL Server
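The two experiments can also be tried without a server. The sketch below swaps in SQLite (through Python's built-in sqlite3 module) for MS SQL Server. The table names Customer and Account, the column types and the data are assumptions; only custNo, accNo, balance and name appear on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when asked to

conn.execute("""
    CREATE TABLE Customer (
        custNo INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE Account (
        accNo   INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL,
        custNo  INTEGER NOT NULL REFERENCES Customer(custNo)  -- exactly one customer
    )
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Peter')")

def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"

# A customer with an existing custNo violates the primary key.
duplicate_customer = try_insert("INSERT INTO Customer VALUES (1, 'Anna')")

# An account with a non-existing custNo violates the foreign key.
dangling_account = try_insert("INSERT INTO Account VALUES (100, 500, 42)")

# An account for an existing customer is fine.
valid_account = try_insert("INSERT INTO Account VALUES (101, 500, 1)")
```

Both faulty inserts are rejected and the tables are left unchanged. MS SQL Server should behave the same way, although its error messages differ.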
Example: MiniBank
Table definitions
(schemas):
Constraint
Querying a relational database
Database Languages:
Data Definition DDL
Should provide constructs for defining all of the above
(such as “create table”)
Data Manipulation DML (queries, insert, delete, update)
procedural (How?)
nonprocedural (What?)
The Relational Algebra is a procedural DML
SQL includes a (sort of) nonprocedural DML
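The difference between "How?" and "What?" can be sketched by answering the same question twice: once with an explicit loop, once with an SQL query run through Python's built-in sqlite3 module. The table layout and the data are assumptions for illustration.

```python
import sqlite3

# Sample data as (accNo, balance, custNo); illustrative only.
accounts = [(101, 1500, 1), (102, 300, 1), (103, 2500, 2)]

# Procedural: we spell out HOW to find the accounts, step by step.
procedural = []
for acc_no, balance, cust_no in accounts:
    if balance >= 1000:
        procedural.append(acc_no)
procedural.sort()

# Nonprocedural: we state WHAT we want; the DBMS decides how to find it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (accNo INTEGER PRIMARY KEY, balance INTEGER, custNo INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?, ?)", accounts)
nonprocedural = [
    row[0]
    for row in conn.execute(
        "SELECT accNo FROM Account WHERE balance >= 1000 ORDER BY accNo")
]
```

Both lists hold the same account numbers. Only the first version fixes the order in which rows are visited and tested.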
The Relational Algebra
Data Manipulation in the Relational Model
Operates on relations: the input to the
operations is tables and the result is a table
Operations
Row selection (RESTRICT/SELECT)
Column selection (PROJECT)
Combining tables (JOIN)
Set operations (UNION, INTERSECTION, DIFFERENCE,
PRODUCT)
More advanced operations (LEFT/RIGHT OUTER JOIN)
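A minimal sketch of the basic operations, with a relation as a Python set of tuples and each tuple a frozenset of (attribute, value) pairs. The function names, the choice of a natural join on common attributes, and the data are assumptions for illustration. The set operations UNION, INTERSECTION and DIFFERENCE are simply Python's `|`, `&` and `-` on two relations with the same attributes.

```python
def rel(*rows):
    """Build a relation (a set of tuples) from dictionaries."""
    return {frozenset(row.items()) for row in rows}

def restrict(r, predicate):
    """Row selection: keep the tuples that satisfy the predicate."""
    return {t for t in r if predicate(dict(t))}

def project(r, attrs):
    """Column selection: keep the given attributes; duplicates vanish."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}

def natural_join(r, s):
    """Combine tuples that agree on all common attributes."""
    result = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                result.add(frozenset({**dt, **du}.items()))
    return result

# Sample data (illustrative only).
customer = rel({"custNo": 1, "name": "Peter"}, {"custNo": 3, "name": "Tommy"})
account = rel(
    {"accNo": 101, "balance": 1500, "custNo": 3},
    {"accNo": 102, "balance": 300, "custNo": 1},
    {"accNo": 103, "balance": 2500, "custNo": 3},
)

tommy = restrict(customer, lambda t: t["name"] == "Tommy")
tommys_accounts = natural_join(tommy, account)
customers_with_accounts = project(account, {"custNo"})
```

Every function takes relations and returns a relation, so results can be fed straight into the next operation, as in `natural_join(tommy, account)`.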
Relational Algebra - Overview
Example: MiniBank
Retrieve information about
customer number 3:
Row selection on custNo = 3 from
Customer
Retrieve account number,
balance and customer number
for accounts with a balance
between 1000 and 2000:
Row selection on 1000 <= balance
and balance <= 2000 from Account
Column selection on accNo,
balance, custNo
Retrieve information about
customer Tommy and his
accounts:
Row selection on name = ‘Tommy’
from Customer
Join with Account on custNo
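The three retrievals could be written in SQL roughly as follows, here run through Python's built-in sqlite3 module. The column types and the sample rows are assumptions for illustration; the table and column names follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        custNo INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    CREATE TABLE Account (
        accNo   INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL,
        custNo  INTEGER NOT NULL REFERENCES Customer(custNo)
    );
    -- Sample rows (illustrative only)
    INSERT INTO Customer VALUES (1, 'Peter'), (2, 'Anna'), (3, 'Tommy');
    INSERT INTO Account VALUES (101, 1500, 3), (102, 300, 1),
                               (103, 2500, 3), (104, 2000, 2);
""")

# Row selection on custNo = 3 from Customer.
customer_3 = conn.execute(
    "SELECT * FROM Customer WHERE custNo = 3").fetchall()

# Row selection on the balance, then column selection on accNo, balance, custNo.
mid_balance = conn.execute("""
    SELECT accNo, balance, custNo
    FROM Account
    WHERE 1000 <= balance AND balance <= 2000
    ORDER BY accNo
""").fetchall()

# Row selection on name = 'Tommy', then join with Account on custNo.
tommy_accounts = conn.execute("""
    SELECT c.custNo, c.name, a.accNo, a.balance
    FROM Customer AS c
    JOIN Account AS a ON a.custNo = c.custNo
    WHERE c.name = 'Tommy'
    ORDER BY a.accNo
""").fetchall()
```

Each result is again a table (a list of rows), in line with the rule that every operation operates on tables and returns tables.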