Truba Institute of Engineering & Information Technology Bhopal
BE-205
UNIT V









• Database Management System: Introduction
• File-oriented approach and database approach
• Data Models
• Architecture of Database System
• Data independence
• Data dictionary
• DBA
• Primary Key
• Data Definition Language and Data Manipulation Language
Submitted by: Sugan Patel Computer Science & Engineering Department
1. Introduction
A database is a collection of stored operational data that belongs to a particular enterprise and is used
by its applications and users, or by a set of authorized outside applications and users.
Database Management System:
A Database Management System (DBMS) is a software system that manages the execution of user
applications that access and modify database data, so that data security, data integrity, and data
reliability are guaranteed for each application, and so that each application can be written as if it
were the only application active in the database.
Data
Different viewpoints:
– A sequence of characters stored in computer memory or storage
– An interpreted sequence of characters stored in computer memory or storage
– An interpreted set of objects
– A database supports concurrent access to the data
File Systems:
•A file is an uninterpreted, unstructured collection of information
•File operations: delete, catalog, create, rename, open, close, read, write, find, …
•Access methods: algorithms that implement these operations, together with the internal file organization
•Examples: a file of customers, a file of students. An access method is the implementation of a set of
operations on such a file.
File Management System Problems:
•Data redundancy
•Data access: every new request needs a new program
•Data is not isolated from the access implementation
•Concurrent program execution on the same file
•Difficulties with security enforcement
•Integrity issues.
Database Applications:
•Airline Reservation Systems – Data items are: single passenger reservations; information about flights
and airports; information about ticket prices and ticket restrictions.
•Banking Systems – Data items are accounts, customers, loans, mortgages, balances, etc. Failures are
not tolerable. Concurrent access must be provided.
•Corporate Records – Data items are: sales, accounts, bill of materials records, employees and their
dependents.
ADVANTAGES OF A DBMS:
Data independence: Application programs should be as independent as possible from details of data
representation and storage. The DBMS can provide an abstract view of the data to insulate application
code from such details.
Efficient data access: A DBMS utilizes a variety of sophisticated techniques to store and retrieve data
efficiently. This feature is especially important if the data is stored on external storage devices.
Data integrity and security: If data is always accessed through the DBMS, the DBMS can enforce
integrity constraints on the data. For example, before inserting salary information for an employee, the
DBMS can check that the department budget is not exceeded. Also, the DBMS can enforce access
controls that govern what data is visible to different classes of users.
Data administration: When several users share the data, centralizing the administration
of data can offer significant improvements. Experienced professionals who understand the nature of the
data being managed, and how different groups of users use it, can be responsible for organizing the data
representation to minimize redundancy and for fine-tuning the storage of the data to make retrieval
efficient.
Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the data in such a manner
that users can think of the data as being accessed by only one user at a time. Further, the DBMS protects
users from the effects of system failures.
Reduced application development time: Clearly, the DBMS supports many important functions that are
common to many applications accessing data stored in the DBMS. This, in conjunction with the
high-level interface to the data, facilitates quick development of applications. Such applications are also
likely to be more robust than applications developed from scratch because many important tasks are
handled by the DBMS instead of being implemented by the application.
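To make the data integrity advantage concrete, the salary-versus-budget check described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the tables Departments and Employees, their columns, the trigger name check_budget and the helper functions are all hypothetical, and the notes do not tie the idea to any particular DBMS.

```python
import sqlite3

def make_db():
    """Create an in-memory database whose budget rule is enforced by the DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Departments (dname TEXT PRIMARY KEY, budget REAL NOT NULL);
        CREATE TABLE Employees (
            ename  TEXT PRIMARY KEY,
            dname  TEXT NOT NULL,
            salary REAL NOT NULL CHECK (salary >= 0)
        );
        -- Hypothetical rule: reject an insert if the department's total
        -- salaries, including the new one, would exceed its budget.
        CREATE TRIGGER check_budget BEFORE INSERT ON Employees
        BEGIN
            SELECT RAISE(ABORT, 'department budget exceeded')
            WHERE (SELECT COALESCE(SUM(salary), 0) FROM Employees
                   WHERE dname = NEW.dname) + NEW.salary
                  > (SELECT budget FROM Departments WHERE dname = NEW.dname);
        END;
    """)
    conn.execute("INSERT INTO Departments VALUES ('CSE', 1000)")
    return conn

def try_hire(conn, ename, dname, salary):
    """Return True if the DBMS accepted the row, False if it rejected it."""
    try:
        conn.execute("INSERT INTO Employees VALUES (?, ?, ?)", (ename, dname, salary))
        return True
    except sqlite3.Error:
        return False
```

Because the rule lives in the database rather than in each program, every application that inserts into Employees is subject to the same check.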
Data Levels and their Roles
Physical – corresponds to the first view of data:
How data is stored, how it is accessed and modified, whether it is ordered, how it is allocated to
computer memory and/or peripheral devices, and how data items are actually represented (ASCII,
EBCDIC, …). The physical schema specifies additional storage details. Essentially, the physical schema
summarizes how the relations described in the conceptual schema are actually stored on secondary
storage devices such as disks and tapes. We must decide what file organizations to use to store the
relations, and create auxiliary data structures called indexes to speed up data retrieval operations.
Conceptual – corresponds to the second view of data:
What we want the data to express, what relationships between data items we must express, what "story"
the data tells, and whether all the data is necessary for that "story". The conceptual schema (sometimes
called the logical schema) describes the stored data in terms of the data model of the DBMS. In a relational
DBMS, the conceptual schema describes all relations that are stored in the database. In our sample
university database, these relations contain information about entities, such as students and faculty, and
about relationships, such as students' enrollment in courses. All student entities can be described using
records in a Students relation. In fact, each collection of entities and each collection
of relationships can be described as a relation, leading to the following conceptual schema:
Students(sid: string, name: string, login: string, age: integer, gpa: real)
Faculty(fid: string, fname: string, sal: real)
Courses(cid: string, cname: string, credits: integer)
Rooms(rno: integer, address: string, capacity: integer)
Enrolled(sid: string, cid: string, grade: string)
Teaches(fid: string, cid: string)
Meets_In(cid: string, rno: integer, time: string)
The choice of relations, and the choice of fields for each relation, is not always obvious,
and the process of arriving at a good conceptual schema is called conceptual
database design.
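As a rough sketch, the conceptual schema above could be declared to a relational DBMS as follows, here using Python's built-in sqlite3 module. Several details are assumptions rather than part of the notes: string, integer and real are mapped to SQLite's TEXT, INTEGER and REAL, the primary keys are a guess at sensible identifiers, and the helper names build_university_db and table_names are hypothetical.

```python
import sqlite3

# One CREATE TABLE statement per relation in the conceptual schema.
# The PRIMARY KEY choices are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
CREATE TABLE Rooms    (rno INTEGER PRIMARY KEY, address TEXT, capacity INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT, PRIMARY KEY (sid, cid));
CREATE TABLE Teaches  (fid TEXT, cid TEXT, PRIMARY KEY (fid, cid));
CREATE TABLE Meets_In (cid TEXT, rno INTEGER, time TEXT, PRIMARY KEY (cid, rno, time));
"""

def build_university_db():
    """Create an empty in-memory database containing the seven relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def table_names(conn):
    """List the relations the DBMS now knows about, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]
```

Note that entities (Students, Faculty, Courses, Rooms) and relationships (Enrolled, Teaches, Meets_In) are both stored as tables.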
View – corresponds to the third view of data:
What part of the data is seen by a specific application? External schemas, which usually are also in
terms of the data model of the DBMS, allow data access to be customized (and authorized) at the level
of individual users or groups of users. The external schema design is guided by end user requirements.
For example, we might want to allow students to find out the names of faculty members teaching courses,
as well as course enrollments. This can be done by defining the following view:
Courseinfo(cid: string, fname: string, enrollment: integer)
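One way such an external schema might be realized in a relational DBMS is as a view computed from the stored relations. The sketch below uses Python's sqlite3 module; the view definition, the sample rows and the function name build_db_with_view are assumptions for illustration, since the notes give only the view's fields.

```python
import sqlite3

def build_db_with_view():
    """Create three stored relations and a Courseinfo view derived from them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
        CREATE TABLE Teaches  (fid TEXT, cid TEXT);
        CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
        -- External schema: exposes names and head counts, hides sal and grade.
        CREATE VIEW Courseinfo AS
            SELECT t.cid AS cid, f.fname AS fname, COUNT(e.sid) AS enrollment
            FROM Teaches t
            JOIN Faculty f ON f.fid = t.fid
            LEFT JOIN Enrolled e ON e.cid = t.cid
            GROUP BY t.cid, f.fname;
    """)
    # Made-up sample data.
    conn.execute("INSERT INTO Faculty VALUES ('f1', 'Rao', 50000.0)")
    conn.executemany("INSERT INTO Teaches VALUES (?, ?)", [("f1", "c1"), ("f1", "c2")])
    conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                     [("s1", "c1", "A"), ("s2", "c1", "B")])
    return conn

def course_info(conn):
    """What a student-facing application would see through the view."""
    return conn.execute(
        "SELECT cid, fname, enrollment FROM Courseinfo ORDER BY cid").fetchall()
```

The view stores no data of its own; it is recomputed from Faculty, Teaches and Enrolled whenever it is queried.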
3. DATA MODEL:
E-R modeling is a conceptual-level model
• Entities are real-world objects about which we collect data
• Attributes describe the entities
• Relationships are associations among entities
• Entity set – set of entities of the same type
• Relationship set – set of relationships of the same type
• Relationship sets may have descriptive attributes
• Represented by E-R diagrams
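The E-R vocabulary above can be illustrated with a few lines of Python. This is a sketch of the concepts only, not a database: the classes Student, Course and Enrolled, their attributes and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Student:            # an entity type; sid and name are its attributes
    sid: str
    name: str

@dataclass(frozen=True)
class Course:
    cid: str
    cname: str

@dataclass(frozen=True)
class Enrolled:           # a relationship between one Student and one Course
    student: Student
    course: Course
    grade: str            # descriptive attribute of the relationship

s1 = Student("s1", "Asha")
s2 = Student("s2", "Ravi")
dbms = Course("c1", "DBMS")

students = {s1, s2}                      # entity set
courses = {dbms}                         # entity set
enrolled = {Enrolled(s1, dbms, "A")}     # relationship set

def courses_of(student, relationship_set):
    """Names of the courses a student is associated with."""
    return sorted(r.course.cname for r in relationship_set if r.student == student)
```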
Object-oriented Model
• Uses E-R modeling as a basis, extended to include encapsulation and inheritance
• Objects have both state and behavior
  o State is defined by attributes
  o Behavior is defined by methods (functions or procedures)
• Designer defines classes with attributes, methods, and relationships
• Class constructor method creates object instances
• Each object has a unique object ID
• Classes are related by class hierarchies
• Database objects have persistence
• Both a conceptual-level and a logical-level model
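The object-oriented ideas listed above (state, behavior, constructors, object IDs and class hierarchies) can be sketched in plain Python. The classes Person and Student and the oid counter are hypothetical, and persistence is deliberately left out: these objects live only in memory, whereas an object database would store them.

```python
import itertools

class Person:
    _next_oid = itertools.count(1)       # source of unique object IDs

    def __init__(self, name):            # constructor creates an object instance
        self.oid = next(Person._next_oid)
        self.name = name                 # state is held in attributes

    def describe(self):                  # behavior is defined by methods
        return f"{self.name} (oid={self.oid})"

class Student(Person):                   # class hierarchy: Student inherits from Person
    def __init__(self, name, gpa):
        super().__init__(name)
        self.gpa = gpa

    def describe(self):
        return f"{super().describe()}, gpa={self.gpa}"
```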
The Hierarchical Data Model
The Hierarchical Data Model structures data in a tree of records, with each record except the root
having exactly one parent record and possibly many children.
A hierarchical database has the following characteristics:
1. It contains nodes connected by branches.
2. The top node is called the root.
3. If multiple nodes appear at the top level, the nodes are called root segments.
4. The parent of node nx is a node directly above nx and connected to nx by a branch.
5. Each node (with the exception of the root) has exactly one parent.
6. The child of node nx is the node directly below nx and connected to nx by a branch.
7. One parent may have many children.
By introducing data redundancy, complex network structures can also be represented as hierarchical
databases. This redundancy is eliminated in physical implementation by including a 'logical child'. The
logical child contains no data but uses a set of pointers to direct the database management system to the
physical child in which the data is actually stored. Associated with a logical child are a physical parent
and a logical parent. The logical parent provides an alternative (and possibly more efficient) path to
retrieve logical child information.
The Network Data Model
The Network Data Model uses a lattice structure in which a record can have many parents as well as
many children.
Like the Hierarchical Data Model, the Network Data Model also consists of nodes and branches,
but a child may have multiple parents within the network structure instead of being restricted to just
one.
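The difference between the two models comes down to how many parents a node may have. The following sketch uses a hypothetical Node class and link function (neither comes from the notes) to enforce the single-parent rule for a hierarchy and relax it for a network.

```python
class Node:
    """A record with links to its parent and child records."""
    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []

def link(parent, child, hierarchical=True):
    """Connect parent to child with a branch.

    In hierarchical mode a child may have only one parent; in network mode
    (hierarchical=False) it may have several.
    """
    if hierarchical and child.parents:
        raise ValueError(f"{child.name} already has a parent")
    parent.children.append(child)
    child.parents.append(parent)
```

For example, an Employee record can hang under a Department record in either model, but attaching the same Employee record to a Project record as well is possible only in the network model.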
Relational Model
• Record- and table-based model
• Relational database modeling is a logical-level model
• Proposed by E.F. Codd
• Based on mathematical relations
• Uses relations, represented as tables
• Columns of tables represent attributes
• Tables represent relationships as well as entities
• Successor to earlier record-based models (network and hierarchical)
4. DBMS ARCHITECTURE
A DBMS (Database Management System) acts as an interface between the user and the database.
The user requests the DBMS to perform various operations (insert, delete, update and retrieval)
on the database. The components of the DBMS perform these requested operations on the database
and provide the necessary data to the users. The main components of a DBMS are:
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in
the DDL. It records metadata such as the names of the files, data items, storage
details of each file, mapping information and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve, issued by the application program, are sent to the DML compiler for compilation into object
code for database access. The query optimizer then optimizes the object code to find the best way
to execute the query, and the result is sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
• It converts operations in users' queries, coming from the application programs or from the Query
Processor (the combination of DML Compiler and Query Optimizer), from the user's logical view to the
physical file system.
• It controls access to the DBMS information that is stored on disk.
• It handles buffers in main memory.
• It enforces constraints to maintain the consistency and integrity of the data.
• It synchronizes the simultaneous operations performed by concurrent users.
• It controls backup and recovery operations.
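The query optimizer's role can be observed in a real system. As one illustration, SQLite (used here through Python's sqlite3 module) reports its chosen access strategy through EXPLAIN QUERY PLAN. The table, the index name idx_login and the helper plan_for are assumptions for this sketch, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the optimizer's plan for a query as a list of text steps."""
    # The last column of each EXPLAIN QUERY PLAN row is the textual detail.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, login TEXT)")
query = "SELECT name FROM Students WHERE login = ?"

# With no index available, the only option is to scan the whole table.
plan_without_index = plan_for(conn, query, ("asha",))

# Once an index exists, the optimizer can choose it without any change to the query.
conn.execute("CREATE INDEX idx_login ON Students(login)")
plan_with_index = plan_for(conn, query, ("asha",))
```

The application submits the same query text both times; only the plan chosen inside the DBMS changes.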
5. Data Independence
Data independence means that the higher levels of the database model are designed to be unaffected by
changes to the lower levels (internal and physical). There are two types of data independence:
- Logical data independence
- Physical data independence
Logical Data Independence means that the external schema is unaffected by changes in the conceptual
schema. For example, a new field can be added to a table (relation) without requiring any changes to
application programs.
Physical Data Independence means that the conceptual schema is not affected by changes made
to the internal schema. An example of a change to the internal schema would be changing the
storage device used to store the database data. This would not affect the conceptual or external
schemas / layers.
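Both kinds of independence can be demonstrated on a small scale. In the sketch below (Python's sqlite3 module, with a hypothetical Students table and student_names function standing in for an application program), the same application query keeps working after a field is added to the relation and after an index changes how the data is accessed. Adding an index is used here as a stand-in for a storage-level change; it is an assumption of this example, not the storage-device change mentioned above.

```python
import sqlite3

def student_names(conn):
    """The 'application program': it names only the columns it needs."""
    return [r[0] for r in conn.execute("SELECT name FROM Students ORDER BY sid")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)",
                 [("s1", "Asha"), ("s2", "Ravi")])
before = student_names(conn)

# Logical data independence: a new field is added to the relation.
conn.execute("ALTER TABLE Students ADD COLUMN gpa REAL")
after_new_column = student_names(conn)

# Physical data independence: the access path changes, the relation does not.
conn.execute("CREATE INDEX idx_students_name ON Students(name)")
after_new_index = student_names(conn)
```

A program written as SELECT * with positional access to columns would be less well insulated, which is one reason to name columns explicitly.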
6. Data Dictionary
The data dictionary is a repository of descriptions of the data in the database. It contains information about:
• Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of
rows in each table.
• Relationships between database transactions and the data items they reference, which are useful in
determining which transactions are affected when certain data definitions are changed.
• Constraints on data, i.e. the range of values permitted.
• Detailed information on physical database design, such as storage structures, access paths, and file
and record sizes.
• Access authorization - the description of database users, their responsibilities and their access
rights.
• Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy. It may be
used as an important part of the DBMS.
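Most relational systems expose part of their data dictionary as something that can itself be queried. As an illustration, SQLite keeps table definitions in its sqlite_master catalog and reports column details through PRAGMA table_info; the sketch below reads both from Python. The Courses table and the two helper functions are assumptions for this example, and other DBMS products organize their catalogs differently.

```python
import sqlite3

def list_tables(conn):
    """Names of the tables recorded in SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

def describe_table(conn, table):
    """Return (column name, declared type) pairs for one table.

    PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so it must come from a trusted
    source such as list_tables().
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER)")
```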
Importance of Data Dictionary –
A data dictionary is necessary in a database for the following reasons:
• It improves the DBA's control over the information system and the users' understanding of how to use
the system.
• It helps in documenting the database design process by storing the results of every
design phase and the design decisions.
• It helps in looking up the views on the database and the definitions of those views.
• It provides great assistance in producing a report of which data elements (i.e. data values) are used in
all the programs.
• It promotes data independence, i.e. application programs are not affected by additions to or
modifications of structures in the database.
7. Database Administrator
The database administrator (DBA) is the person (or group of people) responsible for overall control of
the database system. The DBA's responsibilities include the following:
• Deciding the information content of the database, i.e. identifying the entities of interest to the
enterprise and the information to be recorded about those entities. This is defined by writing the
conceptual schema using the DDL.
• Deciding the storage structure and access strategy, i.e. how the data is to be represented, by
writing the storage structure definition. The associated internal/conceptual mapping must also be
specified using the DDL.
• Liaising with users, i.e. ensuring that the data they require is available and writing the necessary
external schemas and conceptual/external mappings (again using the DDL).
• Defining authorization checks and validation procedures. These are extensions to the conceptual
schema and can be specified using the DDL.
• Defining a strategy for backup and recovery, for example periodic dumping of the database to a
backup tape and procedures for reloading the database from backup, or use of a log file in which each
log record contains the values of database items before and after a change and can be used for
recovery purposes.
• Monitoring performance and responding to changes in requirements, i.e. changing details of
storage and access, thereby organizing the system so as to get the performance that is `best for
the enterprise'.
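The backup part of the DBA's role can be sketched with the backup method that Python's sqlite3 connections provide (available from Python 3.7). Here the copy goes to a second in-memory database rather than to tape; the Students table and the function name backup_database are assumptions for illustration.

```python
import sqlite3

def backup_database(source):
    """Copy the whole of source into a new in-memory database and return it."""
    target = sqlite3.connect(":memory:")
    source.backup(target)
    return target

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO Students VALUES ('s1', 'Asha')")
live.commit()

backup_copy = backup_database(live)

# Simulate data loss on the live database; the backup copy is unaffected.
live.execute("DELETE FROM Students")
live.commit()
recovered = backup_copy.execute("SELECT sid, name FROM Students").fetchall()
```

A periodic dump of this kind recovers the state as of the last backup only; the log file mentioned above is what allows later changes to be recovered.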
Data Redundancy
In non-database systems each application has its own private files. This can often lead to redundancy in
stored data, with resultant waste in storage space. In a database the data is integrated.
The database may be thought of as a unification of several otherwise distinct data files, with any
redundancy among those files partially or wholly eliminated.
Data integration is generally regarded as an important characteristic of a database. The avoidance of
redundancy should be an aim; however, the vigour with which this aim should be pursued is open to
question.
Redundancy is
• direct if a value is a copy of another
• indirect if the value can be derived from other values:
  o it simplifies retrieval but complicates update
  o conversely, integration makes retrieval slower and updates easier
Data redundancy can lead to inconsistency in the database unless controlled.
• The system should be aware of any data duplication - the system is responsible for ensuring
updates are carried out correctly.
• A DB with uncontrolled redundancy can be in an inconsistent state - it can supply incorrect or
conflicting information.
• A given fact represented by a single entry cannot result in inconsistency - few systems are
capable of propagating updates, i.e. most systems do not support controlled redundancy.
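A small Python sketch shows how uncontrolled redundancy produces inconsistency, and how storing a fact once avoids it. The billing and shipping "files" and the customer data are invented for the example.

```python
# Uncontrolled redundancy: the same fact (a customer's city) is stored twice,
# once in each application's private file.
billing_file = {"c1": {"name": "Asha", "city": "Bhopal"}}
shipping_file = {"c1": {"name": "Asha", "city": "Bhopal"}}

billing_file["c1"]["city"] = "Indore"    # only one copy is updated
inconsistent = billing_file["c1"]["city"] != shipping_file["c1"]["city"]

# Integrated design: the fact is stored once and referenced by key.
customers = {"c1": {"name": "Asha", "city": "Bhopal"}}
billing = {"c1": {"amount_due": 250}}
shipping = {"c1": {"carrier": "post"}}

customers["c1"]["city"] = "Indore"       # a single update reaches every user

def city_for(customer_id):
    """Both billing and shipping look the city up in the one shared place."""
    return customers[customer_id]["city"]
```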
Data Integrity
This describes the problem of ensuring that the data in the database is accurate.
• Inconsistencies between two entries representing the same `fact' are an example of lack of
integrity (caused by redundancy in the database).
• Integrity constraints can be viewed as a set of assertions to be obeyed when updating a DB to
preserve an error-free state.
• Even if redundancy is eliminated, the DB may still contain incorrect data.
• The important integrity checks are checks on data items and record types.
9. DDL
Data Definition Language (DDL) statements are used to define the database structure or schema.
Some examples:
o CREATE - creates objects in the database
o ALTER - alters the structure of the database
o DROP - deletes objects from the database
o TRUNCATE - removes all records from a table and releases the space allocated for them
o COMMENT - adds comments to the data dictionary
o RENAME - renames an object
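A few of these DDL statements can be tried out with Python's sqlite3 module, as in the sketch below. The table and column names are made up. SQLite is used only because it ships with Python: it spells renaming as ALTER TABLE ... RENAME TO and has no TRUNCATE or COMMENT statement, so those two are not shown.

```python
import sqlite3

def table_list(conn):
    """Tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")

conn.execute("CREATE TABLE Rooms (rno INTEGER PRIMARY KEY, address TEXT)")  # CREATE
conn.execute("ALTER TABLE Rooms ADD COLUMN capacity INTEGER")               # ALTER
conn.execute("ALTER TABLE Rooms RENAME TO Classrooms")                      # RENAME
after_rename = table_list(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(Classrooms)")]

conn.execute("DROP TABLE Classrooms")                                       # DROP
after_drop = table_list(conn)
```

Each statement changes the structure of the database, not the rows stored in it.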
DML
Data Manipulation Language (DML) statements are used for managing data within schema objects.
Some examples:
o SELECT - retrieves data from a database
o INSERT - inserts data into a table
o UPDATE - updates existing data within a table
o DELETE - deletes records from a table (all of them if no condition is given); the space allocated
for the records remains
o MERGE - UPSERT operation (insert or update)
o CALL - calls a PL/SQL or Java subprogram
o EXPLAIN PLAN - explains the access path to data
o LOCK TABLE - controls concurrency
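The four core DML statements can be exercised as follows with Python's sqlite3 module. The Students table and its rows are sample data invented for the sketch; MERGE, CALL, EXPLAIN PLAN and LOCK TABLE are product-specific and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, gpa REAL)")

# INSERT: add rows to the table.
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)",
                 [("s1", "Asha", 3.2), ("s2", "Ravi", 2.9), ("s3", "Meena", 3.8)])

# UPDATE: change existing data in selected rows.
conn.execute("UPDATE Students SET gpa = 3.0 WHERE sid = ?", ("s2",))

# DELETE with a condition removes only the matching rows.
conn.execute("DELETE FROM Students WHERE sid = ?", ("s1",))

# SELECT: retrieve data.
rows = conn.execute("SELECT sid, name, gpa FROM Students ORDER BY sid").fetchall()

# DELETE without a condition removes every row but keeps the table itself.
conn.execute("DELETE FROM Students")
remaining = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
```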
DCL
Data Control Language (DCL) statements. Some examples:
o GRANT - gives users access privileges to the database
o REVOKE - withdraws access privileges given with the GRANT command
TCL
Transaction Control Language (TCL) statements are used to manage the changes made by DML statements.
They allow statements to be grouped together into logical transactions.
o COMMIT - saves the work done
o SAVEPOINT - identifies a point in a transaction to which you can later roll back
o ROLLBACK - restores the database to its state as of the last COMMIT
o SET TRANSACTION - changes transaction options such as the isolation level and which rollback
segment to use
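COMMIT, SAVEPOINT and ROLLBACK can be seen working together in the sketch below, which uses Python's sqlite3 module. The Accounts table, the amounts and the savepoint name are invented. SQLite requires the ROLLBACK TO form to return to a savepoint, and it has no SET TRANSACTION statement, so that one is not shown.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# itself, so the BEGIN/COMMIT/ROLLBACK below are exactly what the DBMS sees.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")

def balances():
    """Current balances as a dict of name -> amount."""
    return dict(conn.execute("SELECT name, balance FROM Accounts"))

conn.execute("BEGIN")
conn.execute("INSERT INTO Accounts VALUES ('a', 100)")
conn.execute("INSERT INTO Accounts VALUES ('b', 50)")
conn.execute("COMMIT")                       # save the work done

conn.execute("BEGIN")
conn.execute("UPDATE Accounts SET balance = balance - 30 WHERE name = 'a'")
conn.execute("SAVEPOINT after_debit")
conn.execute("UPDATE Accounts SET balance = balance + 999 WHERE name = 'b'")
conn.execute("ROLLBACK TO after_debit")      # undo only the mistaken credit
conn.execute("UPDATE Accounts SET balance = balance + 30 WHERE name = 'b'")
conn.execute("COMMIT")
after_transfer = balances()

conn.execute("BEGIN")
conn.execute("DELETE FROM Accounts")
conn.execute("ROLLBACK")                     # back to the last committed state
after_rollback = balances()
```

The debit and the credit form one logical transaction: either both are committed or neither is.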