Uploaded by Cansın Bayrak

Ch01 OK

IS 503
DATABASE CONCEPTS AND APPLICATIONS
IS 503: GRADING & OTHER…
25% Homeworks
25% Midterm
25% Final
20% Project
Contact: betincan@metu.edu.tr
• The course materials are on ODTUClass. You are expected to upload your assignments there.
• Course book: Ramez Elmasri and Shamkant B. Navathe, ‘Fundamentals of Database Systems’, 6th Edition, 2010
• Slides are mostly based on Elmasri’s presentations.
SYLLABUS

Chapter 1: Introduction to Conceptual Modeling
Chapter 2: Database System Concepts and Architecture
Chapter 3: Database modeling using the Entity-Relationship (ER) and Extended ER (EER)
Chapter 4: The Relational Data Model and Relational Database Constraints
Chapter 5: Relational Database Design by ER-to-Relational Mapping
Chapter 6: EER-to-Relational Mapping
Chapter 7: The Relational Algebra and Calculus
Chapter 8: SQL
Chapter 9: SQL Programming
Lab and Examples
Chapter 10: Functional Dependencies and Normalization for Relational Databases
Chapter 11: Relational Database Design Algorithms and Further Dependencies
Chapter 12: Introduction to Transaction Processing Concepts and Theory
CHAPTER 1
Introduction and Conceptual Modeling
Copyright © 2004 Pearson Education, Inc.
BASIC DEFINITIONS
• Data: Known facts that can be recorded and have an implicit meaning.
• Database: A collection of related data.
• Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.
• Database Management System (DBMS): A software package/system to facilitate the creation and maintenance of a computerized database.
• Database System: The DBMS software together with the data itself. Sometimes, the applications are also included.
EXAMPLE OF A DATABASE (WITH A CONCEPTUAL DATA MODEL)
• Mini-world for the example: Part of a UNIVERSITY environment.
• Some mini-world entities:
  • STUDENTs
  • COURSEs
  • SECTIONs (of COURSEs)
  • (academic) DEPARTMENTs
  • INSTRUCTORs
EXAMPLE
A database that stores student information

Name   StudentNumber  Class  Major
Smith  17             1      CS
Brown  8              2      CS
EXAMPLE
Suppose we have the following information in our database:

Student
Name   StudentNumber  Class  Major
Smith  17             1      CS
Brown  8              2      CS

Section
SectionIdentifier  CourseNumber  Semester  Year  Instructor
85                 MATH2410      Fall      98    King
92                 CS1310        Fall      98    Anderson
102                CS3320        Spring    99    Knuth
112                MATH2410      Fall      99    Chang
119                CS1310        Fall      99    Anderson
135                CS3380        Fall      99    Stone

Grade Report
StudentNumber  SectionIdentifier  Grade
17             112                B
17             119                C
8              85                 A
8              92                 A
8              102                B
8              135                A
EXAMPLE OF A DATABASE (WITH A CONCEPTUAL DATA MODEL)
• Some mini-world relationships:
  • SECTIONs are of specific COURSEs
  • STUDENTs take SECTIONs
  • COURSEs have prerequisite COURSEs
  • INSTRUCTORs teach SECTIONs
  • COURSEs are offered by DEPARTMENTs
  • STUDENTs major in DEPARTMENTs
• Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.
TYPICAL DBMS FUNCTIONALITY
• Define a database: in terms of data types, structures and constraints
• Construct or load the database on a secondary storage medium
• Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content
• Concurrent processing and sharing by a set of users and programs, while keeping all data valid and consistent
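A minimal sketch of the define, construct and manipulate steps, assuming Python's built-in sqlite3 module as a stand-in DBMS. The table layout and the snake_case column names are assumptions based on the STUDENT example, and the change of Smith's class is a made-up update. The in-memory database is for illustration only; passing a file path instead would place the database on secondary storage.

```python
import sqlite3

# Hypothetical in-memory database; SQLite stands in for a full DBMS.
conn = sqlite3.connect(":memory:")

# Define: declare data types, structure and constraints.
conn.execute(
    """
    CREATE TABLE student (
        name           TEXT    NOT NULL,
        student_number INTEGER PRIMARY KEY,
        class          INTEGER NOT NULL,
        major          TEXT    NOT NULL
    )
    """
)

# Construct: load the initial data.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Smith", 17, 1, "CS"), ("Brown", 8, 2, "CS")],
)

# Manipulate: modify the content, then query it.
conn.execute("UPDATE student SET class = 2 WHERE student_number = 17")
cs_students = conn.execute(
    "SELECT name, class FROM student WHERE major = 'CS' ORDER BY name"
).fetchall()
print(cs_students)
```

The fourth function on the slide, concurrent sharing, is not shown here because it needs several connections working on the same database.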
TYPICAL DBMS FUNCTIONALITY
• Protection or security measures to prevent unauthorized access
• Protection against hardware and software malfunction (crashes)
• Security against unauthorized or malicious access
• Presentation and visualization of data
MAIN CHARACTERISTICS OF THE DATABASE APPROACH
• Self-describing nature of a database system: A DBMS catalog stores the description of the database. The description is called meta-data. This allows the DBMS software to work with different databases.
A database that stores student information

Name   StudentNumber  Class  Major
Smith  17             1      CS
Brown  8              2      CS

Internal storage format for a STUDENT record

Data Item Name  Starting Position in Record  Length in Characters (bytes)
Name            1                            30
StudentNumber   31                           4
Class           35                           4
Major           39                           4
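The storage format above can be treated as catalog meta-data that drives a generic record reader. The sketch below is one possible illustration in Python, not the way any particular DBMS stores records: the space-padded text encoding and the helper names `encode_record` and `decode_record` are assumptions, since the slide gives only positions and lengths.

```python
# Catalog entries copied from the storage-format table:
# (data item name, 1-based starting position, length in characters).
STUDENT_CATALOG = [
    ("Name", 1, 30),
    ("StudentNumber", 31, 4),
    ("Class", 35, 4),
    ("Major", 39, 4),
]


def encode_record(values, catalog):
    """Pack values into one fixed-length, space-padded record (assumed encoding).

    Relies on the catalog listing contiguous fields in storage order.
    """
    return "".join(
        str(values[name]).ljust(length)[:length] for name, _, length in catalog
    )


def decode_record(record, catalog):
    """Slice a record using only the positions and lengths found in the catalog."""
    return {
        name: record[start - 1 : start - 1 + length].rstrip()
        for name, start, length in catalog
    }


raw = encode_record(
    {"Name": "Smith", "StudentNumber": 17, "Class": 1, "Major": "CS"},
    STUDENT_CATALOG,
)
student = decode_record(raw, STUDENT_CATALOG)
print(len(raw), student)
```

Because `decode_record` reads the layout from the catalog rather than hard-coding it, the same function would work for a different record type given a different catalog.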
MAIN CHARACTERISTICS OF THE DATABASE APPROACH
• Insulation between programs and data: Called program-data independence. Allows changing data storage structures and operations without having to change the DBMS access programs.
MAIN CHARACTERISTICS OF THE DATABASE APPROACH
• Data Abstraction: A data model is used to hide storage details and present the users with a conceptual view of the database.
• Support of multiple views of the data: Each user may see a different view of the database, which describes only the data of interest to that user.
EXAMPLE
One view for those who want to see the transcript of the students

Student Transcript
StudentName  CourseNumber  Grade  Semester  Year  SectionId
Smith        CS1310        C      Fall      99    119
             MATH2410      B      Fall      99    112
Brown        MATH2410      A      Fall      98    85
             CS1310        A      Fall      98    92
             CS3320        B      Spring    99    102
             CS3380        A      Fall      99    135
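A transcript view like this one could be derived from the Student, Section and Grade Report data rather than stored separately. The sketch below shows one way to do that, assuming Python's sqlite3 module; the snake_case table and column names and the view name `transcript` are illustrative choices, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (name TEXT, student_number INTEGER PRIMARY KEY,
                          class INTEGER, major TEXT);
    CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                          semester TEXT, year INTEGER, instructor TEXT);
    CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);

    INSERT INTO student VALUES ('Smith', 17, 1, 'CS'), ('Brown', 8, 2, 'CS');
    INSERT INTO section VALUES
        (85,  'MATH2410', 'Fall',   98, 'King'),
        (92,  'CS1310',   'Fall',   98, 'Anderson'),
        (102, 'CS3320',   'Spring', 99, 'Knuth'),
        (112, 'MATH2410', 'Fall',   99, 'Chang'),
        (119, 'CS1310',   'Fall',   99, 'Anderson'),
        (135, 'CS3380',   'Fall',   99, 'Stone');
    INSERT INTO grade_report VALUES
        (17, 112, 'B'), (17, 119, 'C'),
        (8, 85, 'A'), (8, 92, 'A'), (8, 102, 'B'), (8, 135, 'A');

    -- The view stores no data of its own; it is computed from the base tables.
    CREATE VIEW transcript AS
    SELECT s.name AS student_name, sec.course_number, g.grade,
           sec.semester, sec.year, sec.section_id
    FROM grade_report AS g
    JOIN student AS s   ON s.student_number = g.student_number
    JOIN section AS sec ON sec.section_id = g.section_id;
    """
)

smith_rows = conn.execute(
    "SELECT course_number, grade, semester, year, section_id "
    "FROM transcript WHERE student_name = 'Smith' ORDER BY section_id"
).fetchall()
brown_count = conn.execute(
    "SELECT COUNT(*) FROM transcript WHERE student_name = 'Brown'"
).fetchone()[0]
print(smith_rows, brown_count)
```

Users of the view see only transcript columns; the instructor and major data remain in the base tables and are simply not exposed.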
MAIN CHARACTERISTICS OF THE DATABASE APPROACH
• Sharing of data and multiuser transaction processing: allowing a set of concurrent users to retrieve and to update the database.
  • Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.
  • OLTP (Online Transaction Processing) is a major part of database applications.
MAIN CHARACTERISTICS OF THE DATABASE APPROACH
• Transaction: an executing program or process that includes one or more database accesses, such as reading or updating of database records.
• The isolation property ensures that each transaction appears to execute in isolation from other transactions, even though hundreds of transactions may be executing concurrently.
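The "correctly executed or completely aborted" behaviour of a transaction can be sketched with Python's sqlite3 module. This is only an illustration under stated assumptions: the composite primary key on `grade_report` and the helper `record_grades` are hypothetical, and the example shows all-or-nothing execution on a single connection, not isolation between concurrent transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report ("
    " student_number INTEGER, section_id INTEGER, grade TEXT,"
    " PRIMARY KEY (student_number, section_id))"  # assumed key, not from the slides
)
conn.execute("INSERT INTO grade_report VALUES (17, 112, 'B')")
conn.commit()


def record_grades(connection, rows):
    """Insert all rows as one transaction; return True if it committed."""
    try:
        # The connection context manager commits on success and rolls back
        # if an exception escapes the block.
        with connection:
            connection.executemany(
                "INSERT INTO grade_report VALUES (?, ?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The second row repeats an existing key, so the whole transaction is aborted,
# including the first row that had already been inserted.
committed = record_grades(conn, [(17, 119, "C"), (17, 112, "A")])
row_count = conn.execute("SELECT COUNT(*) FROM grade_report").fetchone()[0]
print(committed, row_count)
```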
Why use a database system instead of a file?
• Due to characteristics of the database approach:
  • Self-describing nature of a database system
  • Insulation between programs and data
  • Sharing of data and multiuser transaction processing
  • Support of multiple views of the data
ADVANTAGES OF USING THE DATABASE APPROACH
• Enforcing integrity constraints on the database.
  • E.g. the value of the Class data item within each STUDENT record must be an integer between 1 and 5.
  • The value of Name must be a string of no more than 30 alphabetic characters.
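The two constraints above could be declared to the DBMS so that it rejects invalid records itself. The sketch below assumes Python's sqlite3 module and SQLite CHECK constraints; treating "alphabetic" as the ASCII letters A-Z and a-z is a simplifying assumption, and `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        -- At most 30 characters, none of them outside A-Z / a-z (assumed alphabet).
        name           TEXT NOT NULL
                       CHECK (length(name) <= 30 AND name NOT GLOB '*[^A-Za-z]*'),
        student_number INTEGER PRIMARY KEY,
        -- Must be stored as an integer between 1 and 5.
        class          INTEGER NOT NULL
                       CHECK (typeof(class) = 'integer' AND class BETWEEN 1 AND 5)
    )
    """
)


def try_insert(name, student_number, class_value):
    """Return True if the DBMS accepted the record, False if a constraint failed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO student VALUES (?, ?, ?)",
                (name, student_number, class_value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Smith", 17, 1),    # satisfies both constraints
    try_insert("Brown", 8, 6),     # Class outside 1..5
    try_insert("R2D2", 9, 2),      # Name contains non-alphabetic characters
    try_insert("A" * 31, 10, 2),   # Name longer than 30 characters
]
print(results)
```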
ADVANTAGES OF USING THE DATABASE APPROACH
• Controlling redundancy in data storage and in development and maintenance efforts.
  • Redundancy is where the same data is stored in more than one file, leading to a waste of space and possible integrity errors. It is the duplication of data in different files.
  • Hazards of redundancy:
    1. Duplication of space and effort of maintenance
       • E.g. an update to “grade” should be reflected in all places where grade is stored
    2. Prone to inconsistencies
ADVANTAGES OF USING THE DATABASE APPROACH
• Controlling redundancy in data storage and in development and maintenance efforts.
  • Redundancy is where the same data is stored in more than one file.
  • Hazards of redundancy:
    1. Duplication of space and effort of maintenance
    2. Prone to inconsistencies
       • A user may forget to update all places where “grade” is stored
       • The data may still be inconsistent because updates are applied independently by each user group.
       • E.g. group1 enters the grade as A and group2 enters the grade erroneously as B.
  • Store each logical data item in only one place (data normalization)
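The inconsistency hazard can be illustrated with a small Python sketch. Everything here is hypothetical: plain dictionaries stand in for the separate files and for the single stored copy, and the key (17, 112) is borrowed from the grade report example.

```python
# File-based setup: each user group keeps its own copy of the grade
# for student 17 in section 112.
group1_file = {(17, 112): "B"}
group2_file = {(17, 112): "B"}

# The groups update their copies independently:
# group1 enters A, group2 erroneously enters B.
group1_file[(17, 112)] = "A"
group2_file[(17, 112)] = "B"
files_disagree = group1_file[(17, 112)] != group2_file[(17, 112)]

# Database approach: one stored copy that both groups read and update.
grade_report = {(17, 112): "B"}


def grade_seen_by(group, student_number, section_id):
    """Every group reads the same stored item, so `group` does not affect the result."""
    return grade_report[(student_number, section_id)]


grade_report[(17, 112)] = "A"  # a single update, visible to everyone
views_agree = grade_seen_by("group1", 17, 112) == grade_seen_by("group2", 17, 112)

print(files_disagree, views_agree)
```

With a single stored copy a wrong value can still be entered, but the two groups can no longer hold different values for the same fact.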
ADVANTAGES OF USING THE DATABASE APPROACH
• Controlling redundancy in data storage and in development and maintenance efforts.
  • Redundancy
  • Duplication is where more than one copy of the same record occurs or there is duplication of at least one attribute value.
    • If the data can be removed without causing a loss of information, then this duplication may be acceptable.
    • When the duplication occurs in different files it is called Data Redundancy.
ADVANTAGES OF USING THE DATABASE APPROACH
• Controlling redundancy in data storage and in development and maintenance efforts.
  • Redundancy
  • Duplication
  • Data Integrity (or consistency) is the problem of ensuring that the data is accurate. An inconsistency between two entries that are intended to represent the same "fact" is known as an integrity error.
    • This can only arise when there is data redundancy or, worse still, data duplication.
    • Data Integrity is brought under control through elimination of Data Redundancy.
ADVANTAGES OF USING THE DATABASE APPROACH
• Enforcing integrity constraints on the database.
• Controlling redundancy in data storage and in development and maintenance efforts.
• Sharing of data among multiple users.
• Restricting unauthorized access to data.
• Providing persistent storage for program objects.
• Providing storage structures for efficient query processing.
ADVANTAGES OF USING THE DATABASE APPROACH
• Providing backup and recovery services.
• Providing multiple interfaces to different classes of users.
• Representing complex relationships among data.
• Drawing inferences and actions using rules
  • E.g. determine when students are on probation
ADDITIONAL IMPLICATIONS OF USING THE DATABASE APPROACH
• Potential for enforcing standards: this is crucial for the success of database applications in large organizations. Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), etc.
• Reduced application development time: the incremental time to add each new application is reduced.
ADDITIONAL IMPLICATIONS OF USING THE DATABASE APPROACH
• Flexibility to change data structures: the database structure may evolve as new requirements are defined.
• Availability of up-to-date information: very important for on-line transaction systems such as airline, hotel, and car reservations.
• Economies of scale: by consolidating data and applications across departments, wasteful overlap of resources and personnel can be avoided.
EXTENDING DATABASE CAPABILITIES
• New functionality is being added to DBMSs in the following areas:
  • Scientific Applications
  • Image Storage and Management
  • Audio and Video data management
  • Data Mining
  • Spatial data management
  • Time Series and Historical Data Management
• The above gives rise to new research and development in incorporating new data types, complex data structures, new operations, and storage and indexing schemes in database systems.
WHEN NOT TO USE A DBMS
• Main inhibitors (costs) of using a DBMS:
  • High initial investment and possible need for additional hardware.
  • Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
• When a DBMS may be unnecessary:
  • If the database and applications are simple, well defined, and not expected to change.
  • If there are stringent real-time requirements that may not be met because of DBMS overhead.
  • If access to data by multiple users is not required.
WHEN NOT TO USE A DBMS
• When no DBMS may suffice:
  • If the database system is not able to handle the complexity of data because of modeling limitations.
  • If the database users need special operations not supported by the DBMS.