Lecture 1: Infinite Relational Database

Database Design
Dr. M.E. Fayad, Professor
Computer Engineering Department, Room #283I
College of Engineering
San José State University
One Washington Square
San José, CA 95192-0180
http://www.engr.sjsu.edu/~fayad,
m.fayad@sjsu.edu
2003
SJSU -- CmpE
L1-S1
Infinite R-DB
Lesson 1:
Infinite Relational Database
Lesson Objectives
• Understand infinite relational databases
• Explore the view level
• Understand the logical view
• Abstract data types
Infinite Relational Databases
• Data abstraction: allows people to forget unimportant details
  – View level – a way of presenting data to a group of users
  – Logical level – how the data is understood when writing queries
The View Level





• The highest level of data abstraction is the view level.
• A view is a way of presenting data to a particular group of users.
• Data presentation may depend on users' preferences.
• Each view has to be functional for its users. This means that when designing a view we must keep in mind the functions to be performed on the data.
The View Level
• View level presentation of the data: science, art, or both? (discussion)
• We will illustrate examples from different computer fields, such as computer graphics, for view level presentation of complex data, especially spatiotemporal data, such as realistic display of images and movies.
The View Level
• Examples:
  – Charts
  – Graphs
  – Drawings
  – Maps
  – Video or animation
Examples?
What is a view?
What is a model?
What are the differences between a model and a view?
The Logical Level


Example: Infinite relational data model
• Relation – table
  (Each table has a name and defines a relation.)
• Relation scheme – top row / list of attributes
  (Each entry in the top row of a table is called an attribute name.)
  (The ordered set of attributes of a table is called a relation scheme.)
• Arity or dimension – number of attributes of a relation
  (We will use arity and dimension interchangeably, with a preference for dimension in the case of spatiotemporal relations.)
The Logical Level


Example: Infinite relational data model
• Database scheme (schema) – set of relation names and schemes
• Tuple / point – each row below the scheme
  (We will use these two terms interchangeably, with a preference for point in the case of spatiotemporal relations.)
• Instance – the set of tuples in a table
  (Each row describes an instance of the scheme.)
  (Please remember that relation schemes are usually fixed, while relation instances may change over time due to database updates.)
Example (1)
SSN          Surname  First Name(s)   Telephone Number
123-45-6789  Doe      Jane Q.         512-555-1234
987-65-4321  Fulano   Juan            210-543-9876
567-89-0123  Roe      Richard Rodney  512-987-6431

SSN          Wages    Interest  Capital Gain
123-45-6789  100,000  3,400     0
987-65-4321  83,640   2,821     3,400
567-89-0123  46,000   501       1,200
Example (2)

• Name the relations!
• What is the arity of each relation?
• What is the relation scheme of each relation?
• What is the database scheme?
• How many tuples are in each relation?
• How many instances of each of these relations are there?
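One way to check answers to these questions is to write the two example tables down as data. The following is a minimal Python sketch, not part of the original slides. The relation name `Person` is a hypothetical label chosen here (naming the relations is part of the exercise), and `Taxrecord` is borrowed from the example scheme given elsewhere in this lecture.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    """A relation: a name, a scheme (ordered attribute names), and one instance."""
    name: str
    scheme: tuple    # ordered attribute names (the top row of the table)
    instance: set    # the set of tuples currently in the table

    @property
    def arity(self) -> int:
        # Arity (dimension) = number of attributes in the scheme.
        return len(self.scheme)


# "Person" is a hypothetical name; the slides leave naming as an exercise.
person = Relation(
    "Person",
    ("SSN", "Surname", "First Name(s)", "Telephone Number"),
    {
        ("123-45-6789", "Doe", "Jane Q.", "512-555-1234"),
        ("987-65-4321", "Fulano", "Juan", "210-543-9876"),
        ("567-89-0123", "Roe", "Richard Rodney", "512-987-6431"),
    },
)

taxrecord = Relation(
    "Taxrecord",
    ("SSN", "Wages", "Interest", "Capital Gain"),
    {
        ("123-45-6789", 100_000, 3_400, 0),
        ("987-65-4321", 83_640, 2_821, 3_400),
        ("567-89-0123", 46_000, 501, 1_200),
    },
)

# The database scheme is the set of relation names with their schemes.
database_scheme = {r.name: r.scheme for r in (person, taxrecord)}

for r in (person, taxrecord):
    print(r.name, "arity:", r.arity, "tuples:", len(r.instance))
```

Keeping `scheme` and `instance` as separate fields mirrors the distinction in the slides: the scheme stays fixed while the instance is the part that updates would change.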
Relation schemes & Instances (1)
T or F:
Relation schemes are usually fixed (T)
Relation instances change with updates (T)
Example Scheme:
Taxrecord(SSN,Wages,Interest,Capital_gain)
Taxtable(Income,Tax)
Relation schemes & Instances (2)
Example:
Streets(Name, X, Y )
Streets contains pairs of street names and (x,y) points such
that the point belongs to the street. There are an infinite
number of (x, y) locations associated with each street.
Example:
Crops(Corn,Rye,Sunflower, Wheat)
Crops contains all possible combinations of four crops that a
farmer could plant. There are an infinite number of tuples in
any instance of this relation.
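An infinite relation cannot be stored row by row, but it can be described by a condition that its tuples satisfy. The sketch below illustrates this idea in Python under assumptions that are not in the slides: a hypothetical street "Main" modeled as a straight segment from (0, 0) to (10, 0), and a hypothetical limit of 100 acres for the Crops relation.

```python
def in_streets(name: str, x: float, y: float) -> bool:
    """Membership test for Streets(Name, X, Y).

    Assumption: the only street is a hypothetical "Main", running along
    the x-axis from (0, 0) to (10, 0).
    """
    return name == "Main" and y == 0 and 0 <= x <= 10


def in_crops(corn: float, rye: float, sunflower: float, wheat: float,
             total_acres: float = 100.0) -> bool:
    """Membership test for Crops(Corn, Rye, Sunflower, Wheat).

    Assumption: a tuple is a planting plan in acres, and a plan is possible
    when every amount is non-negative and the amounts fit in total_acres.
    """
    amounts = (corn, rye, sunflower, wheat)
    return all(a >= 0 for a in amounts) and sum(amounts) <= total_acres


# Any finite sample of tuples can be tested, although each relation itself
# contains infinitely many tuples.
print(in_streets("Main", 2.5, 0))   # a point on the street
print(in_streets("Main", 2.5, 1))   # a point off the street
print(in_crops(25, 25, 25, 25))     # uses exactly 100 acres
print(in_crops(60, 60, 0, 0))       # would need 120 acres
```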
Infinite Relational Data Model
• Other examples:
  – Temporal data
  – Spatial data
  – Operations research
Temporal & Spatial Data

• In many application areas of machine learning and data mining, researchers face challenges entailed by temporal and spatial data.
• What are the differences between temporal and spatial data?
Temporal Data Type (1)
The user-defined temporal data type is a time
representation specially designed to meet the
specific needs of the user. For example, the
designers of a database used for class scheduling
in a school might base it on a
"Year:Term:Day:Period" format. Terms belonging
to a user-defined temporal data type get the same
query language support as do terms belonging to
built-in temporal data types such as the DATE
data type.
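As a rough illustration of what such a user-defined type could look like, here is a minimal Python sketch. The class name `SchoolTime`, the choice of integer components, and the numbering of terms, days, and periods are all assumptions made for this example; the slides only give the "Year:Term:Day:Period" format.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SchoolTime:
    """Hypothetical user-defined time in "Year:Term:Day:Period" form."""
    year: int
    term: int
    day: int
    period: int

    @classmethod
    def parse(cls, text: str) -> "SchoolTime":
        # Split a string such as "2003:1:2:4" into four integer components.
        year, term, day, period = (int(part) for part in text.split(":"))
        return cls(year, term, day, period)

    def __str__(self) -> str:
        return f"{self.year}:{self.term}:{self.day}:{self.period}"


first = SchoolTime.parse("2003:1:2:4")
second = SchoolTime.parse("2003:2:1:1")

# order=True compares year, then term, then day, then period.
print(first < second)
print(sorted([second, first])[0])
```

Supporting comparison and sorting is one small piece of the "same query language support" idea: values of the custom type can be ordered and filtered like built-in dates.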
Temporal Databases
• A temporal database is a database that supports some aspect of time, not counting user-defined time.
Spatiotemporal
• The modifier spatiotemporal is used to indicate that the modified concept concerns simultaneous support of some aspect of time and some aspect of space, in one or more dimensions.
Abstract Data Types (1)
• Domain – range of values for an attribute
  – strings, integers, or real numbers
• Scalar domain – always a single value
  – (ex: string, integer, or real number)
• Abstract data type domains – composed of scalar domains
Abstract Data Types (2)
Example:
Vertices(Cities)
The domain of Cities is a set of strings.
Example:
Streets(Name, Extent)
The domain of Extent is a set of (x,y) points.
Database Glossary (1)

A database is a collection of related data.

A database management system (DBMS) is a
collection of programs that enables users to create
and maintain a database.

A database system = database + DBMS
Database Glossary (2)

A database can be of any size and of varying complexity.
• IRS database
  – Assume there are 100 million taxpayers.
  – Each taxpayer file has an average of 5 forms.
  – Each form is approx. 200 characters.
  – Assume also that the IRS keeps the past three returns for each taxpayer.
  – What is the size of the IRS's database?
(100 × 10^6) × 5 × 200 × 4 = 4 × 10^11 bytes = 400 gigabytes
(The factor of 4 presumably counts the current return plus the past three, at one byte per character.)
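The 400-gigabyte figure can be checked with a few lines of arithmetic. The sketch assumes one byte per character and four stored returns per taxpayer (the current one plus the past three), which is the reading that makes the stated total come out; neither assumption is spelled out on the slide.

```python
taxpayers = 100 * 10**6     # 100 million taxpayers
forms_per_return = 5        # average number of forms per taxpayer file
chars_per_form = 200        # approximate characters per form
returns_kept = 1 + 3        # assumption: current return plus the past three
bytes_per_char = 1          # assumption: one byte per character

size_bytes = (taxpayers * forms_per_return * chars_per_form
              * returns_kept * bytes_per_char)
size_gigabytes = size_bytes / 10**9   # decimal gigabytes

print(size_bytes)       # 400000000000
print(size_gigabytes)   # 400.0
```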
Characteristics of the
Database Approach
• Self-describing nature of a database system
  – A database system contains not only the database itself but also the definition or description of the database structure and constraints.
  – The definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data.
  – The information stored in the catalog is called meta-data.
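To see a catalog in practice, the sketch below uses SQLite through Python's standard `sqlite3` module. This is only one concrete system chosen for illustration; the slides do not name a DBMS. The table definition reuses the `Taxrecord` scheme from this lecture, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Taxrecord ("
    "SSN TEXT, Wages INTEGER, Interest INTEGER, Capital_gain INTEGER)"
)

# sqlite_master is SQLite's catalog table: it stores the definition of
# every table, index, and view in the database.
catalog_rows = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row of meta-data per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(Taxrecord)").fetchall()
column_names = [row[1] for row in columns]
column_types = [row[2] for row in columns]

print(catalog_rows[0][1])
print(column_names)
print(column_types)
conn.close()
```

The program never hard-codes the column list when reading it back; it asks the database to describe itself, which is the sense in which the system is self-describing.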
Characteristics of the
Database Approach
• Insulation between programs and data, and data abstraction
  – In OO databases, users can define operations on data as part of the database definitions.
  – An operation (also called a function) is specified in two parts: the interface (or signature) and the implementation.
  – Data abstraction
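The split between interface and implementation can be sketched in an object-oriented programming language. The Python example below is an analogy rather than an OO database feature; the class names `TaxRecordOps` and `StoredTaxRecord` and the operation `total_income` are hypothetical.

```python
from abc import ABC, abstractmethod


class TaxRecordOps(ABC):
    """Interface (signature): the part that callers rely on."""

    @abstractmethod
    def total_income(self) -> int:
        """Return wages + interest + capital gain."""


class StoredTaxRecord(TaxRecordOps):
    """Implementation: how the operation is carried out; free to change."""

    def __init__(self, wages: int, interest: int, capital_gain: int) -> None:
        self._wages = wages
        self._interest = interest
        self._capital_gain = capital_gain

    def total_income(self) -> int:
        return self._wages + self._interest + self._capital_gain


# Calling code is written against the interface only.
record: TaxRecordOps = StoredTaxRecord(100_000, 3_400, 0)
print(record.total_income())   # 103400
```

Because callers depend only on `total_income()`, the stored representation could change without affecting them, which is the insulation the slide refers to.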
Characteristics of the
Database Approach
• Support of multiple views of the data
  – Dealing with raw data
  – Many users = different perspectives or views of the database
  – Facilities for multiple views
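As a hedged illustration of multiple views over the same stored data, the sketch below defines two SQL views in SQLite via Python's standard `sqlite3` module. The view names `WagesView` and `IncomeView` and the user groups they are meant for are hypothetical; the rows come from the tax table in Example (1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Taxrecord ("
    "SSN TEXT, Wages INTEGER, Interest INTEGER, Capital_gain INTEGER)"
)
conn.executemany(
    "INSERT INTO Taxrecord VALUES (?, ?, ?, ?)",
    [
        ("123-45-6789", 100000, 3400, 0),
        ("987-65-4321", 83640, 2821, 3400),
        ("567-89-0123", 46000, 501, 1200),
    ],
)

# View for a hypothetical payroll group: wages only.
conn.execute("CREATE VIEW WagesView AS SELECT SSN, Wages FROM Taxrecord")

# View for a hypothetical audit group: total income per taxpayer.
conn.execute(
    "CREATE VIEW IncomeView AS "
    "SELECT SSN, Wages + Interest + Capital_gain AS Income FROM Taxrecord"
)

wages_rows = conn.execute("SELECT * FROM WagesView ORDER BY SSN").fetchall()
income_rows = conn.execute("SELECT * FROM IncomeView ORDER BY SSN").fetchall()
print(wages_rows)
print(income_rows)
conn.close()
```

Both views read the same base table, so each group sees its own presentation without the data being stored twice.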
Characteristics of the
Database Approach
• Sharing of data and multiuser transaction processing
  – A multiuser DBMS must allow multiple users to access the database at the same time.
  – Concurrency control – to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct.
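The idea of controlled concurrent updates can be shown in miniature with threads and a lock. This Python sketch is an analogy only: a real DBMS uses transactions and more elaborate protocols, and the `Account` class and the deposit amounts are hypothetical.

```python
import threading


class Account:
    """A shared value that several threads update at the same time."""

    def __init__(self) -> None:
        self.balance = 0
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # The lock makes the read-modify-write step atomic, so no update
        # is lost when several threads deposit at once.
        with self._lock:
            self.balance += amount


def worker(account: Account, times: int) -> None:
    for _ in range(times):
        account.deposit(1)


account = Account()
threads = [threading.Thread(target=worker, args=(account, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(account.balance)   # 40000
```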
Actors on the Scene
• Database administrators
• Database designers
• End users (casual end users, naïve or parametric end users, sophisticated end users, and stand-alone users)
• System analysts and application programmers or software engineers
Workers Behind the Scene
• DBMS system designers and implementers
• Tool developers
• Operators and maintenance personnel
Advantages of Using DBMS (1)

Controlling redundancy

Redundancy means storing the same data
multiple times, which leads to several problems:
1. Duplication of effort
2. Waste of storage space
3. Inconsistent data
Advantages of Using DBMS (1)
• Restricting unauthorized access
  – A DBMS should provide security and authorization mechanisms that specify account restrictions.
  – A DBMS should enforce these restrictions automatically.
Advantages of Using DBMS (1)

• Providing persistent storage for program objects and data structures
  – In OO database systems, an object is said to be persistent if it survives the termination of program execution and can later be retrieved by another program.
  – Compatibility – OODBs offer data structures compatible with one or more OO programming languages.
  – Traditional DB systems often suffer from the so-called impedance mismatch problem.
Advantages of Using DBMS (1)
• Permitting inferencing and actions using rules
  – Some database systems provide capabilities for defining deduction rules for inferencing new information from the stored database facts.
  – Such systems are called deductive database systems.
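To make the idea of deduction rules concrete, the sketch below derives new facts from stored ones by applying two rules repeatedly until nothing new appears. It is a toy Python illustration, not how any particular deductive database system works; the `parent` facts and the `ancestor` rules are hypothetical examples.

```python
# Stored facts: ("Ann", "Bob") means Ann is a parent of Bob (hypothetical data).
parent = {("Ann", "Bob"), ("Bob", "Cid"), ("Cid", "Dee")}


def infer_ancestors(parent_facts: set) -> set:
    """Apply two deduction rules until no new facts appear.

    Rule 1: ancestor(X, Y) if parent(X, Y).
    Rule 2: ancestor(X, Z) if parent(X, Y) and ancestor(Y, Z).
    """
    ancestor = set(parent_facts)               # Rule 1
    while True:
        derived = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in ancestor
            if y == y2
        }                                      # Rule 2
        if derived <= ancestor:                # nothing new: stop
            return ancestor
        ancestor |= derived


ancestors = infer_ancestors(parent)
print(sorted(ancestors))
```

Only the three `parent` facts are stored; the other ancestor pairs, such as ("Ann", "Dee"), are inferred from the rules.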
Advantages of Using DBMS (2)
• Providing multiple user interfaces
• Representing complex relationships among data
• Enforcing integrity constraints
• Providing backup and recovery
Additional Advantages of Using DBMS (2)
• Potential for enforcing standards
• Reducing application development time
• Flexibility
• Availability of up-to-date information
• Economies of scale
Discussion Questions
T/F:
a. A view is a way of presenting data to a particular group of users.
b. Any relation can be presented by multiple views
c. Arity = the number of columns in the relation.
d. An instance = any row of a relation
e. A spatial database is a database that supports some aspect of time, not
counting user-defined time.
f. Spatial data is data in the form of two- or three-dimensional images.
g. Spatial data is any information about the location and shape of, and
relationships among, geographic features. This includes remotely
sensed data as well as map data.
Tasks for Next Lecture
Task 1: Data Modeling Using the Entity-Relationship Model