Data Models

Chapter 1
Introduction to Database System
< PART 2 >
Instructors:
Churee Techawut
CS (204)321 Database System I
Outlines
 1) Basic definitions
 2) Database system environment
 3) Examples of database
 4) Typical DBMS functionality
 5) Major characteristics of database approach
 6) Different types of database users
 7) Additional characteristics of database approach
 8) When not to use a DBMS
 9) Components of a database system
10) Database system concepts and architecture
Database System Concepts and Architecture
 1) Data models
 2) Schemas VS. instances
 3) Three-schema architecture
 4) Data Independence
 5) DBMS language
 6) DBMS Interface
 7) Database system environment
 8) Database system utilities
 9) Database architectures
10) Classification of DBMS
Data Models
 Data model
“A set of concepts that can be used to describe the structure of a
database (data types and relationships) and certain constraints that
the database should obey.”
 Data model operations
“Operations for specifying database retrievals and updates by
referring to the concepts of the data model.”
Operations on the data model may include basic operations and
user-defined operations.
(e.g. COMPUTE_GPA is a user-defined operation that can be
applied to a STUDENT object.)
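As a rough illustration of a user-defined operation, the sketch below attaches a compute_gpa method to a STUDENT object. The 4-point grade scale, the credit weighting, and the field names are assumptions made for the example; the slides do not say how COMPUTE_GPA is defined.

```python
from dataclasses import dataclass, field

# Assumed 4-point scale; the slides do not define one.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


@dataclass
class Student:
    name: str
    student_number: str
    # Each entry is a (letter_grade, credit_hours) pair.
    grades: list = field(default_factory=list)

    def compute_gpa(self) -> float:
        """User-defined operation: credit-weighted grade point average."""
        total_credits = sum(credits for _, credits in self.grades)
        if total_credits == 0:
            return 0.0
        points = sum(GRADE_POINTS[grade] * credits for grade, credits in self.grades)
        return points / total_credits


# Smith: a B in a 3-credit course and a C in a 4-credit course.
smith = Student("Smith", "17", [("B", 3), ("C", 4)])
gpa = smith.compute_gpa()
```

Basic operations (insert, delete, modify, retrieve) would be supplied by the data model itself; only compute_gpa is specific to this application.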
Data Models
 Categories of data models
1) Conceptual (high-level, semantic) data models:
Provide concepts that are close to the way many users perceive data.
2) Physical (low-level, internal) data models:
Provide concepts that describe the details of how data is stored in the
computer.
3) Implementation (representational) data models:
Provide concepts that fall between the above two, balancing user views with
some computer storage details.
Schemas VS. Instances
 In any data model it is important to distinguish between the description
of the database and the database itself.
 Database schema (or meta-data)
“The description of a database. It includes descriptions of the database
structure and the constraints that should hold on the database.”
The database schema is specified during database design and is not
expected to change frequently.
e.g.
Name: string
StudentNumber: string
Class: integer
Major: string
Schemas VS. Instances
 Schema diagram
“A diagrammatic display of a database schema – structure of each record
type (not the actual instances of a record).”
STUDENT: Name, StudentNumber, Class, Major
COURSE: CourseName, CourseNumber, CreditHours, Department
PREREQUISITE: CourseNumber, PrerequisiteNumber
SECTION: SectionIdentifier, CourseNumber, Semester, Year, Instructor
GRADE_REPORT: StudentNumber, SectionIdentifier, Grade
Schemas VS. Instances
 Schema construct
“An object within the schema.”
e.g. STUDENT, COURSE.
 Database instances
“The actual data stored in a database at a particular moment in time.
Also called database state or occurrence.”
Many database instances can be constructed to correspond to a
particular database schema.
Schemas VS. Instances
[Figure: database states]
Define a new DB (specify DB schema in the DBMS catalog) -> database in empty state
Data first loaded -> database in initial state
Update operation -> DBMS ensures the database is in a valid state
Schemas VS. Instances
 Distinction
The database schema does not frequently change, but the
database state changes every time the database is updated.
Schema is also called intension, whereas state is called extension.
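The distinction can be observed in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption; the slides name no particular system) and the STUDENT record type from the earlier example. The stored table definition stays the same while the rows change with every update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Specifying the schema leaves the database in its empty state.
conn.execute(
    "CREATE TABLE STUDENT (Name TEXT, StudentNumber TEXT, Class INTEGER, Major TEXT)"
)


def schema_of(connection):
    """Return the stored table definitions (the intension)."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def state_of(connection):
    """Return the current rows (the extension)."""
    return connection.execute(
        "SELECT Name, StudentNumber, Class, Major FROM STUDENT ORDER BY Name"
    ).fetchall()


schema_before = schema_of(conn)
empty_state = state_of(conn)

# Loading data for the first time gives the initial state.
conn.execute("INSERT INTO STUDENT VALUES ('Smith', '17', 1, 'COSC')")
conn.execute("INSERT INTO STUDENT VALUES ('Brown', '8', 2, 'COSC')")
initial_state = state_of(conn)

# Every update produces a new state; the schema is untouched.
conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = 'Smith'")
later_state = state_of(conn)
schema_after = schema_of(conn)
```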
Three-Schema Architecture
 The three-schema architecture was proposed to support two DBMS
characteristics:
Program-data independence.
Supporting multiple views of the data.
 The goal of the three-schema architecture is to separate the user
applications and the physical database.
Three-Schema Architecture
 Schemas can be defined at the following three levels.
1) Internal schema at the internal level
Describes physical storage structures and access paths.
Typically uses a physical data model.
2) Conceptual schema at the conceptual level
Describes the structure (such as entities, data types, relationships) and
constraints for the whole database.
Uses a conceptual or implementation data model.
3) External schemas at the external level
Describe the various user views.
Usually uses the same data model as the conceptual level.
Three-Schema Architecture
[Figure: the three-schema architecture]
External level: EXTERNAL VIEW 1 ... EXTERNAL VIEW n (seen by END USERS)
Conceptual level: CONCEPTUAL SCHEMA
Internal level: INTERNAL SCHEMA
STORED DATABASE
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems.
Three-Schema Architecture
 Mappings among schema levels are needed to transform requests
and data.
If the request is a database retrieval, the data extracted from the
stored database must be reformatted to match the user’s external
view.
 Programs refer to an external schema, and are mapped by the
DBMS to the internal schema for execution.
 Notice that the three schemas are only descriptions of data; the only
data that actually exists is at the physical level.
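One way to picture the mapping between levels is a view defined over base tables. The sketch below again uses sqlite3 as a stand-in DBMS; the TRANSCRIPT view and its column names are hypothetical, chosen only to show an external schema that differs from the conceptual one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the structures for the whole database.
CREATE TABLE STUDENT (Name TEXT, StudentNumber INTEGER, Class INTEGER, Major TEXT);
CREATE TABLE GRADE_REPORT (StudentNumber INTEGER, SectionIdentifier INTEGER, Grade TEXT);
INSERT INTO STUDENT VALUES ('Smith', 17, 1, 'COSC'), ('Brown', 8, 2, 'COSC');
INSERT INTO GRADE_REPORT VALUES (17, 112, 'B'), (17, 119, 'C'), (8, 85, 'A');

-- External level: a hypothetical view for one group of users.
CREATE VIEW TRANSCRIPT AS
    SELECT s.Name AS StudentName, g.SectionIdentifier AS Section, g.Grade
    FROM STUDENT s JOIN GRADE_REPORT g ON s.StudentNumber = g.StudentNumber;
""")

# The program refers only to the external view. The DBMS maps the request
# onto the stored tables and reformats the result to match the view.
transcript = conn.execute(
    "SELECT StudentName, Section, Grade FROM TRANSCRIPT "
    "WHERE StudentName = 'Smith' ORDER BY Section"
).fetchall()
```

The view stores no rows of its own, which matches the remark that the schemas are only descriptions of data.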
Data Independence
 Two types of data independence:
1) Logical data independence
The capacity to change the conceptual schema without having to
change the external schemas and their application programs.
2) Physical data independence
The capacity to change the internal schema without having to
change the conceptual (or external) schema.
Data Independence
 When a schema at a lower level is changed, only the mappings
between this schema and higher-level schemas need to be changed in a
DBMS that fully supports data independence.
 The higher-level schemas themselves are unchanged. Therefore, the
application programs need not be changed since they refer to the external
schemas.
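Both kinds of data independence can be tried out on a small scale. In the sketch below (sqlite3 as a stand-in DBMS; the STUDENT_MAJORS view, the index name, and the Email column are hypothetical), the same application query is run before and after a change at the internal level and a change at the conceptual level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Name TEXT, StudentNumber INTEGER, Class INTEGER, Major TEXT);
INSERT INTO STUDENT VALUES ('Smith', 17, 1, 'COSC'), ('Brown', 8, 2, 'COSC');
CREATE VIEW STUDENT_MAJORS AS SELECT Name, Major FROM STUDENT;
""")


def application_query(connection):
    """Stands in for an application program written against the external view."""
    return connection.execute(
        "SELECT Name, Major FROM STUDENT_MAJORS ORDER BY Name"
    ).fetchall()


before = application_query(conn)

# Internal-level change: add an access path (physical data independence).
conn.execute("CREATE INDEX idx_student_name ON STUDENT (Name)")
after_index = application_query(conn)

# Conceptual-level change: add an attribute (logical data independence).
conn.execute("ALTER TABLE STUDENT ADD COLUMN Email TEXT")
after_new_column = application_query(conn)
```

The application query text never changes; only the DBMS's internal handling of it does.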
DBMS Language
Once the design of a database is completed and a DBMS is chosen to
implement the database:
 Data Definition Language (DDL) is used by the DBA and by
database designers to define the conceptual schema for the
database. In DBMSs that do not strictly separate the levels, the DDL also
defines the internal schema and any mapping between the two.
 Storage Definition Language (SDL) is used to specify the internal
schema.
 View Definition Language (VDL) is used to specify user views (external schemas) and their mappings to the conceptual schema.
DBMS Language
Once the database schemas are compiled and the database is populated
with data:
 Data Manipulation Language (DML) is used to specify database
retrievals and updates.
 DML commands (data sublanguage) can be embedded in a general-purpose
programming language (host language), such as COBOL, C or
an assembly language.
 In object-oriented systems, the host and data sublanguages typically
form one integrated language such as C++.
 Alternatively, a high-level DML used in a stand-alone interactive manner is
called a query language.
DBMS Language
 Types of DML
1) Procedural DML (record-at-a-time or low-level DML)
Must be embedded in a programming language.
Typically retrieve individual records from the database, and use looping
and other constructs of the host programming language to retrieve
multiple records.
Specify how to retrieve data.
e.g. DML calls embedded in COBOL, C, etc.
DBMS Language
2) Declarative or Non-procedural DML (set-at-a-time or high-level DML)
Used as a stand-alone query language or embedded in a programming
language.
Typically retrieves information from multiple related database records in
a single command.
Specifies what data to retrieve rather than how to retrieve it.
Also called declarative languages.
e.g. SQL
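The two styles can be contrasted on the same request. This is only an emulation under stated assumptions: sqlite3 has no true record-at-a-time DML, so the first function imitates that style by fetching one record per call and filtering in the host language, while the second states the condition declaratively. The function names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SECTION (SectionIdentifier INTEGER, CourseNumber TEXT,
                      Semester TEXT, Year INTEGER, Instructor TEXT);
INSERT INTO SECTION VALUES
    (85, 'MATH2410', 'Fall', 91, 'King'),
    (92, 'COSC1310', 'Fall', 91, 'Anderson'),
    (102, 'COSC3320', 'Fall', 92, 'Knuth'),
    (112, 'MATH2410', 'Fall', 92, 'Chang'),
    (119, 'COSC1310', 'Fall', 92, 'Anderson'),
    (135, 'COSC3380', 'Fall', 92, 'Stone');
""")


def sections_record_at_a_time(connection, instructor):
    """Procedural style: fetch one record per call and test it in the host language."""
    cursor = connection.execute(
        "SELECT SectionIdentifier, Instructor FROM SECTION ORDER BY SectionIdentifier"
    )
    found = []
    while True:
        record = cursor.fetchone()
        if record is None:
            break
        if record[1] == instructor:
            found.append(record[0])
    return found


def sections_set_at_a_time(connection, instructor):
    """Declarative style: state the condition and let the DBMS find the rows."""
    rows = connection.execute(
        "SELECT SectionIdentifier FROM SECTION WHERE Instructor = ? "
        "ORDER BY SectionIdentifier",
        (instructor,),
    ).fetchall()
    return [row[0] for row in rows]
```

Both functions return the same section identifiers; the difference is who decides how the records are located, the program or the DBMS.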
DBMS Interface
 Stand-alone query language interfaces
 Programmer interfaces for embedding DML in programming
languages:
1) Pre-compiler Approach
2) Procedure Call Approach
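The procedure call approach can be sketched with Python's built-in sqlite3 module, where DML statements are passed as strings to library functions. The precompiler approach is not shown because it needs a separate preprocessing tool. The function name grades_for_student is made up for the example.

```python
import sqlite3


def grades_for_student(connection, student_number):
    """Pass a DML statement and its parameter to the DBMS through library calls."""
    cursor = connection.cursor()
    cursor.execute(
        "SELECT SectionIdentifier, Grade FROM GRADE_REPORT "
        "WHERE StudentNumber = ? ORDER BY SectionIdentifier",
        (student_number,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GRADE_REPORT (StudentNumber INTEGER, SectionIdentifier INTEGER, Grade TEXT)"
)
conn.executemany(
    "INSERT INTO GRADE_REPORT VALUES (?, ?, ?)",
    [(17, 112, "B"), (17, 119, "C"), (8, 85, "A")],
)
smith_grades = grades_for_student(conn, 17)
```

With this approach the host program stays ordinary Python; no DML appears outside string arguments to the library.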
DBMS Interface
 User-friendly interfaces provided by a DBMS
1) Menu-based interface
No need to memorize the specific commands and syntax of a query
language.
2) Graphical interface
Specify a query via a schema diagram; can be combined with menus.
3) Forms-based interface
Usually programmed for parametric users as interfaces to canned
transactions; users fill out the form entries to insert new data.
4) Natural language interface
Accept and interpret requests written in English or some other language.
5) Combination of above
Other DBMS Interface
 Speech as Input and Output
 Web Browser as an interface
 Interfaces for parametric users (e.g., bank tellers)
Have a small set of operations.
Use function keys for minimizing number of keystrokes.
 Interface for the DBA
Use privileged commands for creating accounts, setting system
parameters, granting account authorization, changing schema, and
reorganizing the storage structure of a database.
Database System Environment
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems.
Database System Environment
 DBMS component modules are as follows.
1) Stored data manager controls access to DBMS information
stored on disk.
2) DDL compiler processes schema definitions, specified in the DDL,
and stores descriptions of the schemas (meta-data) in the DBMS
catalog. It also compiles commands into object code for database
access.
3) Run-time database processor handles database access at run
time by executing the request.
4) Query compiler parses and analyzes a query.
5) Precompiler extracts DML commands from an application
program written in a host programming language.
6) DML compiler compiles DML commands into object code for
database access. The rest of the program is sent to the host language
compiler.
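The catalog that holds the compiled schema descriptions can be inspected in a real system. As a sketch, SQLite (used here only as a convenient stand-in) records each table definition in its sqlite_master table and exposes column descriptions through PRAGMA table_info.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: processing them stores schema descriptions in the catalog.
conn.execute(
    "CREATE TABLE COURSE (CourseName TEXT, CourseNumber TEXT, "
    "CreditHours INTEGER, Department TEXT)"
)
conn.execute("CREATE TABLE PREREQUISITE (CourseNumber TEXT, PrerequisiteNumber TEXT)")

# Meta-data lookup: which tables does the catalog describe?
catalog_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
course_columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(COURSE)")
]
```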
Database System Utilities
 Common database utilities have the following types of functions:
1) Loading existing data files into the database.
2) Backing up the database periodically.
3) Reorganizing database file structures to improve performance.
4) Report generation utilities.
5) Monitoring database usage and providing statistics to the DBA.
 Other functions, such as sorting, user monitoring, data compression, etc.
Database System Utilities
 Data dictionary is an important and very useful utility.
Used to store schema descriptions and other information such as
design decisions, application program descriptions, user information,
usage standards, etc.
 Data dictionary vs. DBMS catalog
 Combination of catalog/data dictionary:
Active data dictionary is accessed by DBMS software and users/DBA.
Passive data dictionary is accessed by users/DBA only.
Database Architectures
[Figure: two-tier and three-tier architectures]
Two-tier architecture: user -> application (client) -> network -> database system (server)
Three-tier architecture: user -> application client (client) -> network -> application server -> database system (server)
Source: Silberschatz A., Korth, H.F. & Sudarshan S. (2006) Database system concepts.
Database Architectures
 Two Tier Client-Server Architectures
 Application on client machine invokes database system functionality
at the server machine through query language statements.
 Application program interfaces such as ODBC (Open Database
Connectivity) and JDBC (Java Database Connectivity) are used for
interaction.
 Three Tier Client-Server Architectures
 The client machine communicates only with an application server, so
it contains no direct database calls.
 Application server communicates with a database system to access
data.
 Appropriate for large applications, and web applications.
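The division of responsibilities in a three-tier design can be sketched in a single process. The class names below are hypothetical and plain method calls stand in for the network, so this shows only which tier is allowed to issue database calls, not a deployable architecture.

```python
import sqlite3


class DatabaseTier:
    """Database system: the only tier that holds a connection and runs SQL."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE SECTION (SectionIdentifier INTEGER, "
            "CourseNumber TEXT, Instructor TEXT)"
        )
        self._conn.executemany(
            "INSERT INTO SECTION VALUES (?, ?, ?)",
            [(85, "MATH2410", "King"), (112, "MATH2410", "Chang"),
             (135, "COSC3380", "Stone")],
        )

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


class ApplicationServer:
    """Middle tier: holds the application logic and talks to the database."""

    def __init__(self, database):
        self._database = database

    def instructors_for(self, course_number):
        rows = self._database.query(
            "SELECT DISTINCT Instructor FROM SECTION "
            "WHERE CourseNumber = ? ORDER BY Instructor",
            (course_number,),
        )
        return [row[0] for row in rows]


class Client:
    """Client tier: presentation only, with no direct database calls."""

    def __init__(self, server):
        self._server = server

    def show_instructors(self, course_number):
        names = self._server.instructors_for(course_number)
        return f"{course_number}: {', '.join(names)}"


client = Client(ApplicationServer(DatabaseTier()))
line = client.show_instructors("MATH2410")
```

In a two-tier design the Client class would call DatabaseTier.query directly, typically through an interface such as ODBC or JDBC.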
Classification of DBMS
 Based on the data model used:
 Traditional: Relational, Network, Hierarchical
 Emerging: Object-oriented, Object-relational
 Other classifications
 Single-user (typically used with micro-computers) vs. multi-user
(most DBMSs).
 Centralized (uses a single computer with one database) vs.
distributed (uses multiple computers and multiple databases).
 Distributed systems include client-server based database systems, in
which a set of database servers supports a set of clients.
Classification of DBMS
 Data model is the main criterion used to classify DBMS.
1) Relational data model represents a database as a collection of tables.
2) Network model represents data as record types and a limited type of
1:N relationship, called a set type.
3) Hierarchical model represents data as hierarchical tree structures.
Each hierarchy represents a number of related records.
4) Object-oriented model defines a database in terms of objects, their
properties, and their operations. Objects with the same structure and
behavior belong to a class.
5) Object-relational model combines the relational data model and the
object-oriented model to define complex data types.
Example of Relational Data Model
STUDENT
Name   StudentNumber  Class  Major
Smith  17             1      COSC
Brown  8              2      COSC

COURSE
CourseName                 CourseNumber  CreditHours  Department
Intro to Computer Science  COSC1310      4            COSC
Data Structures            COSC3320      4            COSC
Discrete Mathematics       MATH2410      3            MATH
Database                   COSC3380      3            COSC

PREREQUISITE
CourseNumber  PrerequisiteNumber
COSC3380      COSC3320
COSC3380      MATH2410
COSC3320      COSC1310
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems
Example of Relational Data Model (Cont.)
SECTION
SectionIdentifier  CourseNumber  Semester  Year  Instructor
85                 MATH2410      Fall      91    King
92                 COSC1310      Fall      91    Anderson
102                COSC3320      Fall      92    Knuth
112                MATH2410      Fall      92    Chang
119                COSC1310      Fall      92    Anderson
135                COSC3380      Fall      92    Stone

GRADE_REPORT
StudentNumber  SectionIdentifier  Grade
17             112                B
17             119                C
8              85                 A
8              92                 A
8              102                B
8              135                A
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems
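To see how the tables relate through shared values, the sketch below loads the example data into sqlite3 (a stand-in DBMS; the column types are assumptions) and joins four tables to list a student's courses and grades. The function name transcript is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Name TEXT, StudentNumber INTEGER, Class INTEGER, Major TEXT);
CREATE TABLE COURSE (CourseName TEXT, CourseNumber TEXT,
                     CreditHours INTEGER, Department TEXT);
CREATE TABLE SECTION (SectionIdentifier INTEGER, CourseNumber TEXT,
                      Semester TEXT, Year INTEGER, Instructor TEXT);
CREATE TABLE GRADE_REPORT (StudentNumber INTEGER, SectionIdentifier INTEGER, Grade TEXT);

INSERT INTO STUDENT VALUES ('Smith', 17, 1, 'COSC'), ('Brown', 8, 2, 'COSC');
INSERT INTO COURSE VALUES
    ('Intro to Computer Science', 'COSC1310', 4, 'COSC'),
    ('Data Structures', 'COSC3320', 4, 'COSC'),
    ('Discrete Mathematics', 'MATH2410', 3, 'MATH'),
    ('Database', 'COSC3380', 3, 'COSC');
INSERT INTO SECTION VALUES
    (85, 'MATH2410', 'Fall', 91, 'King'),
    (92, 'COSC1310', 'Fall', 91, 'Anderson'),
    (102, 'COSC3320', 'Fall', 92, 'Knuth'),
    (112, 'MATH2410', 'Fall', 92, 'Chang'),
    (119, 'COSC1310', 'Fall', 92, 'Anderson'),
    (135, 'COSC3380', 'Fall', 92, 'Stone');
INSERT INTO GRADE_REPORT VALUES
    (17, 112, 'B'), (17, 119, 'C'),
    (8, 85, 'A'), (8, 92, 'A'), (8, 102, 'B'), (8, 135, 'A');
""")


def transcript(connection, student_name):
    """Follow StudentNumber, SectionIdentifier and CourseNumber across the tables."""
    return connection.execute(
        """
        SELECT c.CourseName, g.Grade
        FROM STUDENT s
        JOIN GRADE_REPORT g ON g.StudentNumber = s.StudentNumber
        JOIN SECTION sec ON sec.SectionIdentifier = g.SectionIdentifier
        JOIN COURSE c ON c.CourseNumber = sec.CourseNumber
        WHERE s.Name = ?
        ORDER BY c.CourseName
        """,
        (student_name,),
    ).fetchall()


smith_transcript = transcript(conn, "Smith")
```

No pointers link the tables; the relationships exist only through matching column values.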
Example of a Network Schema
Record types: STUDENT, COURSE, SECTION, PREREQUISITE, GRADE_REPORT
Set types: COURSE_OFFERINGS, STUDENT_GRADES, SECTION_GRADES, IS_A, HAS_A
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems
Example of a Hierarchical Schema
DEPARTMENT: DNAME, DNUMBER, MGRNAME, MGRSTARTDATE
  EMPLOYEE: NAME, SSN, BDATE, ADDRESS
  PROJECT: PNAME, PNUMBER, PLOCATION
Source: Elmasri R. & Navathe S.B. (1994) Fundamentals of database systems
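A tree-structured schema like this one can be approximated with nested records. The sketch assumes that EMPLOYEE and PROJECT are child record types of DEPARTMENT, which the extracted slide text does not state explicitly, and every data value in it is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Employee:
    name: str
    ssn: str
    bdate: str
    address: str


@dataclass
class Project:
    pname: str
    pnumber: int
    plocation: str


@dataclass
class Department:
    dname: str
    dnumber: int
    mgrname: str
    mgrstartdate: str
    # Child records hang directly off their parent record.
    employees: List[Employee] = field(default_factory=list)
    projects: List[Project] = field(default_factory=list)


def child_records(department):
    """Walk the tree from the parent record to each child record."""
    names = [employee.name for employee in department.employees]
    names += [project.pname for project in department.projects]
    return names


# All values below are made up for illustration.
research = Department("Research", 5, "A. Manager", "2000-01-01")
research.employees.append(
    Employee("B. Worker", "000-00-0000", "1980-01-01", "1 Example St")
)
research.projects.append(Project("ProjectX", 1, "Campus"))
children = child_records(research)
```

Access always starts at the root record and moves down, which is the main practical difference from the relational example, where any table can be queried directly.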