Database Modelling

DATABASE CONCEPTS
Dr. Awad Khalil
Computer Science & Engineering Department
AUC
Databases are a major component of almost all modern computer application
systems, in fields including business, management, engineering, education,
medicine, and science. Database technology has had a major impact on the
growing use of computer systems.
What is a Database?


A database is simply a collection of related data.
Data are known facts that can be recorded and that have implicit meaning.
Database Properties
1. A database is a logically coherent collection of data with some inherent meaning. A
random collection of data cannot be considered a database.
2. A database represents some aspect of a real-world system. Examples of such
systems:
• A hospital
• A bank
• A factory
• A company
• A university
3. A database is designed and implemented to meet the informational needs of a specific
group of users. These needs are supported by a set of application systems
running against the implemented database.
Example I
Consider a small personal database that maintains the names, telephone numbers,
and addresses of the people you know.

TELEPHONE DIRECTORY
NAME      PHONE NO   ADDRESS
Ahmed     2451615    Heliopolis
Aly       2467295    Heliopolis
Badran    2977251    Nozah
Sadek     3401312    Zamalek
Salah     3409123    Dokkie
Example II
Consider a suppliers-and-parts database that maintains data about the suppliers that supply parts
to a certain company.

SUPPLIER
SNO   Sname    Status   City
S1    Ahmed    20       Cairo
S2    Badran   10       Cairo
S3    Aly      10       Alex.
S4    Saleh    30       Tanta
S5    Sadek    20       Cairo

PART
PNO   Pname    Color    Weight
P1    Nut      Red      12
P2    Bolt     Green    15
P3    Screw    Blue     15
P4    Cam      Red      17
P5    Cog      Black    20
P6    Screw    Black    14

SUPPLY
SNO   PNO   QTY
S1    P1    100
S1    P2    200
S1    P3    100
S2    P1    150
S2    P3    100
S4    P2    200
S4    P3    300
S5    P2    100
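The suppliers-and-parts data of Example II can be loaded into a real DBMS to see how the three tables relate. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the column types and the helper names `suppliers_of` and `total_quantity` are illustrative choices, not part of the original notes.

```python
import sqlite3

# In-memory database; table and column names follow Example II.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (SNO TEXT PRIMARY KEY, Sname TEXT, Status INTEGER, City TEXT);
CREATE TABLE PART     (PNO TEXT PRIMARY KEY, Pname TEXT, Color TEXT, Weight INTEGER);
CREATE TABLE SUPPLY   (SNO TEXT REFERENCES SUPPLIER(SNO),
                       PNO TEXT REFERENCES PART(PNO),
                       QTY INTEGER,
                       PRIMARY KEY (SNO, PNO));
""")
conn.executemany("INSERT INTO SUPPLIER VALUES (?, ?, ?, ?)", [
    ("S1", "Ahmed", 20, "Cairo"), ("S2", "Badran", 10, "Cairo"),
    ("S3", "Aly", 10, "Alex."), ("S4", "Saleh", 30, "Tanta"),
    ("S5", "Sadek", 20, "Cairo"),
])
conn.executemany("INSERT INTO PART VALUES (?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12), ("P2", "Bolt", "Green", 15),
    ("P3", "Screw", "Blue", 15), ("P4", "Cam", "Red", 17),
    ("P5", "Cog", "Black", 20), ("P6", "Screw", "Black", 14),
])
conn.executemany("INSERT INTO SUPPLY VALUES (?, ?, ?)", [
    ("S1", "P1", 100), ("S1", "P2", 200), ("S1", "P3", 100),
    ("S2", "P1", 150), ("S2", "P3", 100), ("S4", "P2", 200),
    ("S4", "P3", 300), ("S5", "P2", 100),
])

def suppliers_of(part_no):
    """Names of the suppliers of one part, found by joining SUPPLY with SUPPLIER."""
    rows = conn.execute(
        "SELECT s.Sname FROM SUPPLIER s JOIN SUPPLY sp ON sp.SNO = s.SNO "
        "WHERE sp.PNO = ? ORDER BY s.SNO", (part_no,))
    return [name for (name,) in rows]

def total_quantity(part_no):
    """Total quantity of one part supplied by all suppliers."""
    (total,) = conn.execute(
        "SELECT SUM(QTY) FROM SUPPLY WHERE PNO = ?", (part_no,)).fetchone()
    return total
```

For instance, `suppliers_of("P2")` combines rows from SUPPLY and SUPPLIER through the shared SNO column, which is what makes the data "related" rather than a random collection.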
Database System Environment
In a database system environment, four main components can be recognized:
1. Database.
2. Database Management System (DBMS).
3. Application programs.
4. Users.
[Figure: A database system environment. Users/Programmers submit Application Programs/Queries to the DBMS Software, which consists of Software to Process Queries/Programs and Software to Access Stored Data; these operate on the Stored Database Definition (Meta-Data) and the Stored Database.]
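(1) The Database
Data in the database is both integrated and shared: integrated in that otherwise separate
files are unified, with redundancy among them largely removed, and shared in that different
users and applications can access the same data.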
[Figure: End-users and application programs access the database through the Database Management System (DBMS). The database is illustrated with STUDENT and ENROLLMENT records, with fields such as Name, Address, Course, and Department.]
(2) The Database Management System (DBMS)
The Database Management System (DBMS) is general-purpose software that enables users to
create and maintain a database. The DBMS facilitates the process of defining, constructing, and
manipulating databases for various applications.
1. Defining: specifying the data types, structures, and constraints, using a Data Definition Language (DDL).
2. Constructing: storing the data itself on a storage medium controlled by the DBMS.
3. Manipulating: querying and updating the database, using a Data Manipulation Language (DML).
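The three activities can be seen side by side in a few statements. This is a minimal sketch, assuming Python's sqlite3 module as the DBMS and reusing the telephone directory of Example I; the table name `DIRECTORY` and the change of address are made up for illustration. Note that in SQL the INSERT statement used for constructing is itself part of the DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Defining (DDL): declare the structure of the data.
conn.execute(
    "CREATE TABLE DIRECTORY (NAME TEXT PRIMARY KEY, PHONE_NO TEXT, ADDRESS TEXT)")

# 2. Constructing: store the actual data under control of the DBMS.
conn.executemany("INSERT INTO DIRECTORY VALUES (?, ?, ?)", [
    ("Ahmed", "2451615", "Heliopolis"),
    ("Aly", "2467295", "Heliopolis"),
    ("Badran", "2977251", "Nozah"),
])

# 3. Manipulating (DML): update the data, then query it.
conn.execute("UPDATE DIRECTORY SET ADDRESS = 'Zamalek' WHERE NAME = 'Badran'")
heliopolis = [name for (name,) in conn.execute(
    "SELECT NAME FROM DIRECTORY WHERE ADDRESS = 'Heliopolis' ORDER BY NAME")]
badran_address = conn.execute(
    "SELECT ADDRESS FROM DIRECTORY WHERE NAME = 'Badran'").fetchone()[0]
```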
DBMS Layers
1. Software to process queries and programs (DML). SQL (Structured Query
Language) is a typical example of a database query language.
2. Software to access stored data.
(3) The Application Programs
These are the programs written to support the end users' requirements. A given end user can
access the database via one of the online applications, where he or she operates by choosing
items from a menu or filling in items on a form. Such menu-driven or forms-driven interfaces tend to be
easier to use for people who do not have formal training in data processing.
[Figure: Personnel users work through the Personnel Application and Payroll users through the Payroll Application; both applications access a Common Database.]
(4) Users
Database Users
• Professional Users
  - Database Designers
  - Database Administrators
  - System Analysts & Application Programmers
• End Users
  - Casual end users
  - Parametric end users
  - Sophisticated end users
  - Stand-alone end users
Characteristics of the Database Approach
1. Self-describing nature of a database
2. Program-data independence and data abstraction
3. Support for multiple views of the data
4. Sharing of data and multiuser transaction processing
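The first characteristic, the self-describing nature of a database, means that the description of the data is stored in the database system itself rather than inside each program. The following minimal sketch assumes SQLite (through Python's sqlite3 module), whose catalog can be read through the `sqlite_master` table and `PRAGMA table_info`; other DBMSs expose their catalog differently. The `STUDENT` table and the `describe` helper are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT NOT NULL, Address TEXT)")

def describe(table):
    """Read column names and declared types from the catalog, not from program code."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

# The list of tables is itself stored data that any program can query.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = describe("STUDENT")
```

A general-purpose tool can therefore work with any database by asking the DBMS for its structure first.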
[Figure: Traditional file processing approach. Personnel users work with the Personnel Application on personnel files; Payroll users work with the Payroll Application on separate payroll files.]
[Figure: The database approach. Personnel users and Payroll users work with the Personnel Application and the Payroll Application, both of which access a Common Database.]
DATA STORING APPROACHES
1. The File-based Approach
Each application has its own set of files.
Disadvantages:
• Inflexibility: the system is a "mass production facility", committed to processing particular queries.
• Uncontrolled Redundancy: if separate applications need to process the same data, each must
keep its own copy in its own data files, so several copies of the same data may exist in
different applications. This leads to:
  - Wasted storage space.
  - The need to input the same data into several files.
  - Data inconsistency (one fact may have more than one value, because different versions can arise).
• Poor Enforcement of System Standards: data names, formats, access restrictions, etc. are not
standardized across the organization, and there may be many synonyms and homonyms. This makes
modifications difficult and hinders sharing of data.
• Limited Data Sharing: each application has its own private files, providing little opportunity for users
to share existing data. In addition, new applications may be unable to use existing files,
leading to low productivity.
• Program-Data Dependency: descriptions of files, records, and data items are embedded within
application programs. Any modification to a data file requires that the application programs using
that file also be changed. In other words, program maintenance becomes excessive.
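The inconsistency caused by uncontrolled redundancy can be shown with two private "files". This is a minimal sketch in plain Python, where each file is modeled as a dict; the employee, department, and salary values are made up for illustration.

```python
# Each application keeps its own private "file" (here, a dict keyed by employee name).
personnel_file = {"Ahmed": {"address": "Heliopolis", "department": "Sales"}}
payroll_file = {"Ahmed": {"address": "Heliopolis", "salary": 3000}}

def change_address_in_personnel(name, new_address):
    """The personnel application updates only the file it owns."""
    personnel_file[name]["address"] = new_address

def inconsistent_employees():
    """Employees whose address differs between the two files."""
    return [name for name in personnel_file
            if name in payroll_file
            and personnel_file[name]["address"] != payroll_file[name]["address"]]

before = inconsistent_employees()           # both copies agree
change_address_in_personnel("Ahmed", "Zamalek")
after = inconsistent_employees()            # the payroll copy is now stale
```

Nothing in the file-based approach forces the second copy to be updated, so the same fact ends up with two values.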
2. The Database Approach
The database approach improves upon file-based systems. A DBMS (Database Management
System) does not fragment data into separate files but regards data as being stored in one large
conceptual repository called the database. The DBMS handles the addition, storage, update, and
retrieval of data. DBMSs are based on semantically rich data models, which can represent
real-world data accurately. DBMSs provide:
• Persistence of Data
• Transaction Control
• Concurrency Control
• Recovery Control
• Querying
• Integrity Control
• Data Security
• Version Control
• Performance Tuning
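Transaction control and integrity control can be illustrated together. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; the `ACCOUNT` table, its CHECK constraint, and the `transfer` helper are hypothetical and serve only to show that a transaction either completes entirely or leaves the data unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity control: the DBMS itself refuses negative balances.
conn.execute(
    "CREATE TABLE ACCOUNT (ID TEXT PRIMARY KEY, BALANCE INTEGER CHECK (BALANCE >= 0))")
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ID = ?", (amount, dst))
            conn.execute(
                "UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ID = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Current balance of every account as a dict."""
    return dict(conn.execute("SELECT ID, BALANCE FROM ACCOUNT"))

ok = transfer("A", "B", 30)       # both updates are applied
failed = transfer("A", "B", 500)  # second update violates the CHECK; the first is undone
final = balances()
```

In the failed transfer the credit to B has already been executed when the debit is rejected, yet the rollback removes it, so no partial result is ever visible.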
Components of DBMS
• DBMS Engine
• Interface Subsystem (DDL, DML, DCL, Graphical User Interface, Forms Interface,
Natural Language Interface)
• Data Dictionary Subsystem
• Performance Management Subsystem
• Concurrency Control Subsystem
• Data Integrity Management Subsystem
• Backup and Recovery Subsystem
• Application Development Subsystem
• Security Management Subsystem
Benefits of the Database Approach
• Ease of application development
• Minimal data redundancy
• Enforcement of standards
• Data can be shared
• Physical data independence
• Logical data independence
• Better modeling of real-world data
• Uniform security and integrity controls
• Economy of scale
Risks of the Database Approach
• New specialized personnel
• Need for explicit backup
• Organizational conflict
• Large size
• Expensive
• High impact of failure
Database Modelling
Database Structure
• A database structure is the description and definition of all basic structures, such as simple
conceptual files, data types, relationships, and constraints that should hold on the data.
• In any data model it is important to distinguish between the description of the database
(schema) and the database itself (instance).
• Database Schema: The description of a database is called the database schema (or
the meta-data).
• A database schema is specified during database design and is not expected to change
frequently.
[Figure: A simplified database schema diagram.]
A Database Schema in SQL
-- SSN_TYPE is not defined in these notes; it is assumed here to be a
-- user-defined domain, e.g. CREATE DOMAIN SSN_TYPE AS CHAR(9).
-- EMPLOYEE and DEPARTMENT reference each other, so in practice one of the two
-- foreign keys has to be added after both tables exist (ALTER TABLE).

CREATE TABLE EMPLOYEE
  (FNAME           VARCHAR(15)    NOT NULL,
   MINIT           CHAR,
   LNAME           VARCHAR(15)    NOT NULL,
   SSN             SSN_TYPE       NOT NULL,
   BDATE           DATE,
   ADDRESS         VARCHAR(30),
   SEX             CHAR,
   SALARY          DECIMAL(10,2),
   SUPERSSN        SSN_TYPE,
   DNO             INT            NOT NULL,
   PRIMARY KEY (SSN),
   FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN),
   FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER));

CREATE TABLE DEPARTMENT
  (DNAME           VARCHAR(15)    NOT NULL,
   DNUMBER         INT            NOT NULL,
   MGRSSN          SSN_TYPE       NOT NULL,
   MGRSTARTDATE    DATE,
   PRIMARY KEY (DNUMBER),
   UNIQUE (DNAME),
   FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE(SSN));

CREATE TABLE DEPT_LOCATIONS
  (DNUMBER         INT            NOT NULL,
   DLOCATION       VARCHAR(15)    NOT NULL,
   PRIMARY KEY (DNUMBER, DLOCATION),
   FOREIGN KEY (DNUMBER) REFERENCES DEPARTMENT(DNUMBER));

CREATE TABLE PROJECT
  (PNAME           VARCHAR(15)    NOT NULL,
   PNUMBER         INT            NOT NULL,
   PLOCATION       VARCHAR(15),
   DNUM            INT            NOT NULL,
   PRIMARY KEY (PNUMBER),
   UNIQUE (PNAME),
   FOREIGN KEY (DNUM) REFERENCES DEPARTMENT(DNUMBER));

CREATE TABLE WORKS_ON
  (ESSN            SSN_TYPE       NOT NULL,
   PNO             INT            NOT NULL,
   HOURS           DECIMAL(3,1)   NOT NULL,
   PRIMARY KEY (ESSN, PNO),
   FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN),
   FOREIGN KEY (PNO) REFERENCES PROJECT(PNUMBER));

CREATE TABLE DEPENDENT
  (ESSN            SSN_TYPE       NOT NULL,
   DEPENDENT_NAME  VARCHAR(15)    NOT NULL,
   SEX             CHAR,
   BDATE           DATE,
   RELATIONSHIP    VARCHAR(8),
   PRIMARY KEY (ESSN, DEPENDENT_NAME),
   FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN));
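The constraints declared in such a schema are enforced by the DBMS, not by the application. The following minimal sketch, assuming SQLite through Python's sqlite3 module, uses a cut-down DEPARTMENT table (without MGRSSN, to avoid needing EMPLOYEE) together with DEPT_LOCATIONS; the department names, numbers, and the `try_insert` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DNAME     VARCHAR(15) NOT NULL,
    DNUMBER   INT         NOT NULL,
    PRIMARY KEY (DNUMBER),
    UNIQUE (DNAME));
CREATE TABLE DEPT_LOCATIONS (
    DNUMBER   INT         NOT NULL,
    DLOCATION VARCHAR(15) NOT NULL,
    PRIMARY KEY (DNUMBER, DLOCATION),
    FOREIGN KEY (DNUMBER) REFERENCES DEPARTMENT(DNUMBER));
""")

def try_insert(sql, params):
    """Run one INSERT; return True if accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("INSERT INTO DEPARTMENT VALUES (?, ?)", ("Research", 5)),
    try_insert("INSERT INTO DEPT_LOCATIONS VALUES (?, ?)", (5, "Cairo")),
    # No department 9 exists, so the FOREIGN KEY constraint rejects this row.
    try_insert("INSERT INTO DEPT_LOCATIONS VALUES (?, ?)", (9, "Tanta")),
    # A second department named Research violates UNIQUE (DNAME).
    try_insert("INSERT INTO DEPARTMENT VALUES (?, ?)", ("Research", 6)),
]
```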
Database State (Instance)

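The data in the database at a particular moment in time is called the database state (or
instance). Every insertion, deletion, or update moves the database to a new state, while the
schema stays the same.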
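The difference between schema and state can be observed directly. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the two-column `PROJECT` table and the project names are simplified, illustrative stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECT (PNUMBER INT PRIMARY KEY, PNAME VARCHAR(15) NOT NULL)")

def schema():
    """The stored description of the database (its table definitions)."""
    return [sql for (sql,) in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]

def state():
    """The data held in the database right now."""
    return conn.execute("SELECT PNUMBER, PNAME FROM PROJECT ORDER BY PNUMBER").fetchall()

schema_before, state_before = schema(), state()
conn.execute("INSERT INTO PROJECT VALUES (1, 'ProductX')")
conn.execute("INSERT INTO PROJECT VALUES (2, 'ProductY')")
schema_after, state_after = schema(), state()
```

After the two insertions the state has changed from empty to two rows, while the schema read from the catalog is identical.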
The Three-Layer Architecture
[Figure: The three-level architecture]
End Users
External Level: External View 1 ... External View n
   (external/conceptual mapping)
Conceptual Level: Conceptual Schema
   (conceptual/internal mapping)
Internal Level: Internal Schema
Stored Database
Data Independence

Data independence is the capacity to change the schema at one level of a database
system without having to change the schema at the next higher level. Two types are distinguished:
• Logical data independence is the capacity to change the conceptual schema without
having to change external schemas or application programs.
• Physical data independence is the capacity to change the internal schema without having
to change the conceptual (or external) schemas.
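Both kinds of data independence can be approximated in a small experiment. In this minimal sketch, assuming SQLite through Python's sqlite3 module, a view plays the role of an external schema, the base table the conceptual schema, and an index the internal schema; this is a rough analogy rather than a full three-level implementation, and all names and values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, LNAME TEXT, SALARY INTEGER);
INSERT INTO EMPLOYEE VALUES ('1', 'Aly', 4000), ('2', 'Sadek', 3000);
-- External view used by the application program.
CREATE VIEW EMP_NAMES AS SELECT SSN, LNAME FROM EMPLOYEE;
""")

def application_query():
    """The application program: written against the external view only."""
    return conn.execute("SELECT LNAME FROM EMP_NAMES ORDER BY SSN").fetchall()

original = application_query()

# Physical change: add an index (a storage-level decision).
conn.execute("CREATE INDEX EMP_LNAME_IDX ON EMPLOYEE (LNAME)")
after_physical_change = application_query()

# Logical change: extend the conceptual schema with a new column.
conn.execute("ALTER TABLE EMPLOYEE ADD COLUMN ADDRESS TEXT")
after_logical_change = application_query()
```

The application function is never edited, yet it returns the same result after both changes.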
Data Models

A data model is a set of concepts that can be used to describe a database structure.
Data models fall into three categories:
• High-level (conceptual) data models
• Implementation data models: relational, hierarchical, network, and object-oriented
• Low-level (physical) data models
Classification of DBMSs

Classification according to Data Model
DBMSs are classified by the data model they support: relational, hierarchical, network, or
object-oriented.
The Relational Data Model
• The relational data model represents the database as a collection of tables, where each
table can be stored as a separate file.
• Examples of commercial relational DBMSs:
  - DB2 from IBM
  - ORACLE from Oracle Corporation
  - Informix from Informix
  - Sybase from Sybase
  - SQL Server from Microsoft
  - MS-ACCESS from Microsoft
[Figure: An example of a relational database.]
The Network Data model

The network data model represents data as record types, together with a limited kind of
one-to-many relationship between them called a set type. An example of a network
model is the CODASYL DBTG model.
The Hierarchical Data model

The hierarchical data model represents data as hierarchical tree structures. Each
hierarchy represents a number of related records.
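The tree structure of the hierarchical model can be mimicked with nested records. The following is a minimal sketch in plain Python, not the storage format of any actual hierarchical DBMS; the `Record` class, the `descendants` helper, and the sample department, employees, and dependent are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """A record in a hierarchy: it has one parent and any number of child records."""
    record_type: str
    data: dict
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child record below this one and return the child."""
        self.children.append(child)
        return child

def descendants(record, record_type):
    """Collect all records of a given type below `record`, walking the tree top-down."""
    found = []
    for child in record.children:
        if child.record_type == record_type:
            found.append(child)
        found.extend(descendants(child, record_type))
    return found

# DEPARTMENT is the root; EMPLOYEE records hang below it, DEPENDENT records below those.
dept = Record("DEPARTMENT", {"DNAME": "Research"})
emp = dept.add(Record("EMPLOYEE", {"LNAME": "Aly"}))
emp.add(Record("DEPENDENT", {"DEPENDENT_NAME": "Salah"}))
dept.add(Record("EMPLOYEE", {"LNAME": "Sadek"}))

employee_names = [r.data["LNAME"] for r in descendants(dept, "EMPLOYEE")]
dependent_names = [r.data["DEPENDENT_NAME"] for r in descendants(dept, "DEPENDENT")]
```

Data is reached by navigating from the root downward, in contrast to the relational model, where tables are combined by matching values.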
The Object-Oriented Data model

The object-oriented data model defines a database in terms of objects, their properties,
and their operations. Objects with the same structure and behavior belong to a class, and
classes are organized into hierarchies or acyclic graphs. The operations of each class are
specified in terms of predefined procedures called methods.
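The vocabulary of the object-oriented model (objects, properties, classes, class hierarchies, methods) maps directly onto an object-oriented programming language. The following is a minimal sketch in plain Python and does not show persistence or any particular OO DBMS; the `Person` and `Employee` classes and their values are illustrative.

```python
class Person:
    """A class: objects with the same structure (properties) and behavior (methods)."""

    def __init__(self, name, birth_year):
        self.name = name
        self.birth_year = birth_year

    def age_in(self, year):
        """A method: a predefined operation available on every Person object."""
        return year - self.birth_year

class Employee(Person):
    """A subclass in the class hierarchy: it inherits Person's properties and methods."""

    def __init__(self, name, birth_year, salary):
        super().__init__(name, birth_year)
        self.salary = salary

    def raise_salary(self, percent):
        """Increase the salary by a percentage and return the new value."""
        self.salary = round(self.salary * (1 + percent / 100), 2)
        return self.salary

emp = Employee("Aly", 1970, 3000.0)
age = emp.age_in(2000)             # inherited from Person
new_salary = emp.raise_salary(10)  # defined on Employee
```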
Experimental OO prototypes:
• The ORION system developed at MCC,
• The OpenOODB system at Texas Instruments,
• The IRIS system developed at HP laboratories,
• The ODE system at AT&T Bell Labs, and
• The ENCORE/ObServer project at Brown University.
Commercially available OO systems:
• GEMSTONE/OPAL of ServioLogic,
• ONTOS of Ontologic,
• Objectivity of Objectivity Inc.,
• Versant of Versant Technologies,
• ObjectStore of Object Design, and
• O2 of O2 Technology.
Classification of DBMSs (Cont’d)

Classification according to Number of Users
• Single-user systems support only one user at a time and are mostly used with personal
computers.
• Multiuser systems, which include the majority of DBMSs, support many users
concurrently.

Classification according to Number of Sites
• A centralized DBMS stores the data at a single computer site. Most DBMSs
are centralized. A centralized DBMS can support multiple users, but the DBMS and
the database themselves reside entirely at a single computer site.
• A distributed DBMS (DDBMS) can have the actual database and the DBMS software
distributed over many sites connected by a computer network. Many DDBMSs use a
client-server architecture.