Operating System (OS) - Chapter 6

Chapter – 2 Database Management Systems
Basic Terms used in DBMS:
Data & Information:
Data consists of a series of facts or statements that may have been collected, stored, processed and/or manipulated but
have not been organized or placed into context. When data is organized, it becomes information.
Database:
A collection of data stored in a standardized format, designed to be processed so as to provide a consistent and
controlled pool of data. This data is common to all users of the system, but is independent of programs that use the
data.
Database Management System (DBMS):
It is software that defines a database and controls the data, including overall organization, storage, retrieval, security
and data integrity. It supports a query language, produces reports, and creates data entry forms. Examples of DBMS
software include MS-Access, Oracle and FoxPro.
Objectives of Database Systems:
The following are the objectives of Database System:
a) To provide large storage space for relevant data
b) To allow the user easy access to the data
c) To provide a quick response to user requests for any information or data
d) To allow the database to be updated with the latest modifications
e) To remove duplicate (redundant) data
f) To allow multiple users to access the database at the same time
g) To allow the growth of the database system
h) To provide maximum protection to data from any physical damage and unauthorized access
Database Systems Vs File-Based Systems
In a file processing system, each file is independent of (or unrelated to) other files, and data in different files can be
integrated only by writing an individual program for each application. The drawbacks of traditional file-based
methods are:
a) Uncontrolled Duplications and Redundancies
- Wastes space
- Hard to update all files
b) Inconsistent data
c) Inflexibility
- Hard to change data
- Hard to change programs
d) Limited data sharing
e) Poor enforcement of standards
f) Poor programmer productivity
g) Excessive program maintenance
Advantages of Database over Traditional File-Based Systems:
The following are the main advantages of Database:
a) Sharing of Data: A main objective of designing a database system is to let a number of applications share
the same data, and a DBMS makes this sharing possible.
b) Data Integrity: To avoid data integrity errors, a DBMS uses validation procedures, which define acceptable input
ranges for each field in the record.
c) Data Independence: The term data independence refers to the storage of data in such a way that it is not locked
into use by a particular application.
d) Avoid Data Redundancy: A DBMS identifies common data and avoids recording it more than once. Data
redundancy is the duplication of the same data in several places; a DBMS works to minimize it.
e) Data Security: This is concerned with protecting access to data. Protection is needed at many levels for
access, modification, deletion etc.
This document is downloaded from www.sushilupreti.com.np
1 of 7
mail@sushilupreti.com.np
f) Data Maintenance: A DBMS includes procedures for data maintenance: adding records, updating records
and deleting records.
g) Management Control: A DBMS controls the addition, deletion, change and disposition of data. Data must be
protected to satisfy legal, accounting and auditing requirements.
Disadvantages of DBMS:
The principal disadvantages of a DBMS are:
a) Expensive: Database software is very expensive for large computer systems.
b) Increased Vulnerability: Database systems also increase vulnerability, since a failure in one part of the system can
make the entire system inoperative.
c) Specialized Personnel Required: Trained individuals must be hired to maintain the new database
software and to develop and enforce new programming standards.
d) Need for Explicit Backup: This adds cost, as new storage space is needed to hold the backup data.
e) Costs: Sophisticated database software has a high purchase cost and also high operational costs.
f) Interference with Shared Data: Concurrent access to the same information by several users might cause problems.
Schemas and Views of Data
A database schema is a logical description of each piece of data and its relationship with other data elements. Schemas
are generally stored in a data dictionary.
The underlying physical storage, which is managed by the DBMS, is the physical schema. Database designers and end-users
do not need to be concerned about physical storage. Similarly, database designers, database administrators, and
some application programmers are aware of and use the logical schema, whereas end-users and application
programmers are concerned with the external schema (user view).
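The difference between the logical schema and an external schema (user view) can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table employee, the view employee_directory and the sample rows are assumptions made for the example, and the physical schema is whatever SQLite chooses internally.

```python
import sqlite3

# In-memory database; all table, view and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Logical schema: the full description of the stored data.
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Bikram", 48000.0)],
)

# External schema (user view): exposes only the columns this group of users needs.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_no, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_no")
view_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
print(view_columns, directory_rows)
```

Users of the view never see the salary column, and they need not know how SQLite lays the rows out on disk.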
ER Modeling:
It is a technique for organizing and documenting a system’s data. The tool for data modeling is Entity Relationship
(ER) Diagram.
ER Diagram:
An ER diagram is a graphical representation of different entities and their relationships. It is a tool for relational
database design.
Entity: An entity is any ‘thing’ about which data can be stored. For example, if the system needs to store data about
customers or products, then the model would have customer or product entities.
Attributes: The attributes of any entity are those facts that need to be stored about the entity. For example, the
attributes of a customer might include the account number, name, address etc.
Relationships: Relationships exist between various entities within a system. For example, there may be a relationship
between the customer and an order.
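As a rough sketch, the customer and order example can be written as Python dataclasses: each class is an entity, each field an attribute, and the list of orders held by a customer stands for the relationship. The class names, field names and sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Order:                 # entity
    order_id: int            # key attribute
    item: str                # attribute


@dataclass
class Customer:              # entity
    account_number: int      # key attribute
    name: str                # attribute
    address: str             # attribute
    orders: list = field(default_factory=list)  # relationship: customer places orders


def place_order(customer, order):
    """Record the relationship between one customer and one order."""
    customer.orders.append(order)
    return customer


alice = Customer(1001, "Alice", "Kathmandu")
place_order(alice, Order(1, "Keyboard"))
place_order(alice, Order(2, "Monitor"))
order_items = [order.item for order in alice.orders]
print(order_items)
```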
Here are some symbols used in ER diagram:
S.N.  Symbol                         Description
1     Rectangle                      It is used for entity representation. It contains objects used in relational database.
2     Oval (ellipse)                 The oval or ellipse is used to represent attributes of entities.
3     Ellipse with underlined label  The ellipse with underline label is used for key attributes.
4     Diamond                        Diamond represents relationship.
5     Line                           It is used to link attributes to entity sets and entity set to relationship.
Example: The students appear in examination. The name, subjects, address etc. are attributes of student and subjects,
schedules etc. are attributes of examination.
Fig. Example of ER Diagram (entities Students and Examination linked by the relationship Appear; attribute labels
include sid, Name, Address, Subjects, eid and schedule)
Database or Data Model:
A database model is a collection of concepts used to describe the types of data, their relationships, their semantics
and their consistency constraints. There are three types of database model or structure:
a) Hierarchical Database Model:
Its structure is tree-like, so it is also called the tree data model. A tree has only one root and many branches. In
this model, a single parent (root) node owns many child nodes.
[Figure: a tree with a Parent node at the root and child nodes labelled Child 1, Child 2, Child x and Child y]
Advantages:
- A node has one or many attributes.
- It supports 1:1 and 1:M relationships.
- All the child nodes are accessed through their parent nodes.
- It is helpful in building complex systems from simple components.
Disadvantages:
- It does not support M:M relationships.
- When a parent node is deleted, all of its child nodes are deleted automatically.
- The dependency on the parent node is not always beneficial.
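The behaviour of the hierarchical model (one parent per node, access through the parent, and deletion of a parent taking its children with it) can be sketched with a small tree class. The Node class and the delete function are hypothetical helpers, and the shape of the tree is an assumption; only the node labels come from the figure.

```python
class Node:
    """One record in a hierarchical database: exactly one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Children are reached only through their parents, starting at the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

    def subtree_names(self):
        names = [self.name]
        for child in self.children:
            names.extend(child.subtree_names())
        return names


def delete(node):
    """Detach a node from its parent; its whole subtree is removed with it."""
    removed = node.subtree_names()
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    return removed


root = Node("Parent")
child_1 = Node("Child 1", root)
child_2 = Node("Child 2", root)
child_x = Node("Child x", child_1)   # assumed to hang under Child 1
child_y = Node("Child y", child_1)   # assumed to hang under Child 1

path_to_x = child_x.path()
removed = delete(child_1)
remaining = root.subtree_names()
print(path_to_x, removed, remaining)
```

Deleting Child 1 also removes Child x and Child y, which is the second disadvantage listed above.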
b) Network Database Model:
In network databases, collections of records are connected to one another by means of links. It supports all
types of relationships among entities.
Advantages:
- Many-to-many relationships among real-world entities can be represented.
Disadvantages:
- As entities and attributes grow, the complexity of the structure grows with them.
[Figure: a network of linked record types: Stores, Clerks, Customers, Transactions and Items]
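A minimal sketch of the network idea, assuming records are identified by plain strings and links are kept in a dictionary of sets. The connect function and the sample customers and items are invented for illustration; a real network DBMS stores links very differently.

```python
from collections import defaultdict

# Each record maps to the set of records it is linked to.
links = defaultdict(set)


def connect(record_a, record_b):
    """Create a two-way link, so either record can be reached from the other."""
    links[record_a].add(record_b)
    links[record_b].add(record_a)


# One customer buys many items, and one item is bought by many customers (M:M).
connect("Customer:Ram", "Item:Pen")
connect("Customer:Ram", "Item:Book")
connect("Customer:Sita", "Item:Pen")

items_of_ram = sorted(links["Customer:Ram"])
buyers_of_pen = sorted(links["Item:Pen"])
print(items_of_ram, buyers_of_pen)
```

Unlike the tree model, a record here can have any number of "parents", which is what allows many-to-many relationships.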
c) Relational Database Model:
A relational database is composed of one or more relations, sometimes known as tables. Each relation can be visualized
as a table of data or a file. Each row (tuple or record) in the relation represents one entity (such as one student in a
student table). Each column (attribute), e.g. name or age, is called an item or field, and its values are drawn from a
domain (here, the domains of names and ages).
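The vocabulary above (relation, row, column) can be tried out with Python's built-in sqlite3 module. The student table, its columns and the sample rows below are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A relation (table) with three attributes (columns).
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Each tuple (row) represents one student entity.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Anita", 19), (2, "Bimal", 21), (3, "Chandra", 20)],
)

# An additional index can be built on any column to speed up lookups.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

cursor = conn.execute("SELECT * FROM student WHERE name = ?", ("Bimal",))
attributes = [col[0] for col in cursor.description]
bimal = cursor.fetchone()
print(attributes, bimal)
```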
Advantages:
- Additional indexes can be constructed.
- It has very little redundancy.
- Normalization of the database is possible.
Disadvantages:
- In some cases, the index portion of a file may be larger than the file itself. This wastes storage space.
- The file index must be searched sequentially before the actual file records are obtained.
Normalization:
Normalization is the process of converting a table from one normal form to another. It is designed to simplify
relationships and establish logical links between files without losing information. It is a process of simplifying
tables without loss of data. In other words, normalization is the process of organizing data in a relational database to
reduce redundancy.
Need for Normalization:
A bad database design may lead to certain undesirable things like:
- Repetition of information.
- Inability to represent certain information.
- Loss of information.
All these may lead to the re-writing of the application. The normalization process helps to produce a good database
design and thereby ensures the continued efficiency of the database.
First Normal Form (1NF):
For a table to be in First Normal Form (1NF), all of its attributes must be atomic, that is, each field must hold a single
value. 1NF simplifies attributes and makes queries easier. For example,
Table: Department
Deptno  Dname      Location
10      IT         Leeds, Bradford, Kent
20      Research   Hundredfold
30      Marketing  Leeds
In the above table, the attribute Location is not atomic. In the first record, it holds multiple values.
Table: Department
Deptno  Dname
10      IT
20      Research
30      Marketing
Table: Location
Deptno  Location
10      Leeds
10      Bradford
10      Kent
20      Hundredfold
30      Leeds
Here the table ‘Department’ is divided into two tables ‘Department’ and ‘Location’ in order to make each column
atomic. Hence it is in 1NF.
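The same split can be sketched in Python. The function to_first_normal_form is a hypothetical helper, and it assumes that the multi-valued Location field is stored as a comma-separated string.

```python
# The unnormalized Department table: Location may hold several values.
department_unnormalized = [
    (10, "IT", "Leeds, Bradford, Kent"),
    (20, "Research", "Hundredfold"),
    (30, "Marketing", "Leeds"),
]


def to_first_normal_form(rows):
    """Split the table so that every field holds exactly one value."""
    department, location = [], []
    for deptno, dname, locations in rows:
        department.append((deptno, dname))
        for place in locations.split(","):
            location.append((deptno, place.strip()))
    return department, location


department, location = to_first_normal_form(department_unnormalized)
print(department)
print(location)
```

Deptno is repeated in the Location rows so that each location can still be traced back to its department.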
Second Normal Form (2NF):
A relation is in Second Normal Form if it is in First Normal Form and every non-key attribute is functionally dependent
on the whole primary key, not on just a part of it. 2NF improves data integrity and prevents update, insert, and delete
anomalies. For example,
Table: Employee
DNo  DName  DLoc  EmpNo  EName  Salary  Address  HoursNo
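In the above table ‘Employee’, the primary key is taken to be the combination of DNo and EmpNo.
The value of ‘HoursNo’ depends upon DNo as well as EmpNo, but DName and DLoc depend on DNo alone, and EName,
Salary and Address depend on EmpNo alone.
So, to bring the relation into 2NF, we decompose it into three relations as shown below.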
Table: Location
DNo  DName  DLoc
Table: Employee
EmpNo  EName  Salary  Address
Table: Hours
DNo  EmpNo  HoursNo
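The decomposition can be sketched as three projections of the original table. The project function is a hypothetical helper and the sample rows are invented; only the column names come from the tables above.

```python
# Sample rows of the original Employee table (values are invented for illustration).
rows = [
    {"DNo": 1, "DName": "IT", "DLoc": "Leeds", "EmpNo": 100,
     "EName": "Asha", "Salary": 500, "Address": "Kent", "HoursNo": 20},
    {"DNo": 2, "DName": "Research", "DLoc": "Kent", "EmpNo": 100,
     "EName": "Asha", "Salary": 500, "Address": "Kent", "HoursNo": 15},
    {"DNo": 1, "DName": "IT", "DLoc": "Leeds", "EmpNo": 101,
     "EName": "Bikram", "Salary": 450, "Address": "Leeds", "HoursNo": 40},
]


def project(table, columns):
    """Relational projection: keep the given columns and drop duplicate rows."""
    seen, result = set(), []
    for row in table:
        values = tuple(row[column] for column in columns)
        if values not in seen:
            seen.add(values)
            result.append(values)
    return result


location = project(rows, ["DNo", "DName", "DLoc"])                 # depends on DNo only
employee = project(rows, ["EmpNo", "EName", "Salary", "Address"])  # depends on EmpNo only
hours = project(rows, ["DNo", "EmpNo", "HoursNo"])                 # depends on the whole key
print(location, employee, hours, sep="\n")
```

After the split, each department and each employee is stored once, however many hours records refer to them.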
Third Normal Form (3NF):
A relation is in Third Normal Form if it is in Second Normal Form and no transitive dependencies exist among the
attributes. A transitive dependency can be described as follows: “if A determines B, and B determines C, then A
determines C.” For example,
Table: Employee
EmpNo  EName  Salary  Address
In the above relation ‘Employee’, assume the following functional dependency holds:
EName → Address
Here, both EName and Address are non-key attributes in ‘Employee’. Since Address depends on the non-prime attribute
EName, which in turn depends on the primary key (EmpNo), a transitive dependency exists.
So the solution is to move any transitive dependencies into a smaller table.
Table: Salary
EmpNo  EName  Salary
Table: Address
EName  Address
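A functional dependency can be checked mechanically on sample data. In the sketch below, fd_holds is a hypothetical helper and the rows are invented. The check only shows that the dependency is not contradicted by these rows, not that it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every value of lhs is paired with exactly one value of rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# Sample rows of the Employee relation (values are invented for illustration).
employee = [
    {"EmpNo": 1, "EName": "Asha", "Salary": 500, "Address": "Leeds"},
    {"EmpNo": 2, "EName": "Bikram", "Salary": 450, "Address": "Kent"},
    {"EmpNo": 3, "EName": "Asha", "Salary": 600, "Address": "Leeds"},
]

ename_determines_address = fd_holds(employee, "EName", "Address")
ename_determines_salary = fd_holds(employee, "EName", "Salary")

# 3NF decomposition: the transitive dependency moves into its own table.
salary_table = [(r["EmpNo"], r["EName"], r["Salary"]) for r in employee]
address_table = sorted({(r["EName"], r["Address"]) for r in employee})
print(ename_determines_address, ename_determines_salary)
print(salary_table, address_table, sep="\n")
```

Each address is now stored once per EName instead of once per employee record.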
Components of DBMS:
A database management system has mainly two components. They are:
1. Data Dictionary System (DDS)
2. Database Management Languages (DBML)
   - Data Definition Language (DDL)
   - Data Manipulation Language (DML)
Data Dictionary System (DDS):
The data dictionary system is an encyclopedia of information concerning each data element. It describes the data and
its characteristics, such as location, and data type. It also identifies the origin, use, ownership and also the methods of
data access and data security.
Database Management Languages (DBML):
There are mainly two types of database management language, which are used to define the data, manipulate the data
and query the data according to certain criteria.
Data Definition Language (DDL): It is used to create and describe the data structures and to define the schema in the DBMS.
Data Manipulation Language (DML): It processes and manipulates the data in the database. It also allows the user to
query the database and receive summary reports and/or customized reports. DML enables the user to access, update,
replace, delete and protect database records from unauthorized access.
Structured Query Language (SQL): It is a non-procedural language that deals exclusively with data: data integrity, data
manipulation, data access, data retrieval, data query and data security. It was originally developed at IBM in the 1970s
and was later standardized by the American National Standards Institute (ANSI). SQL is the standard language for
relational databases and includes the capability of manipulating both the structure of a database and its data.
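The split between DDL and DML can be seen in a few SQL statements, run here through Python's built-in sqlite3 module. The department table and its rows are assumptions made for the sketch, and the statements use SQLite's dialect of SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the structure (schema) of the data.
conn.execute(
    "CREATE TABLE department (deptno INTEGER PRIMARY KEY, dname TEXT NOT NULL)"
)

# DML: adds, changes, removes and queries the data itself.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(10, "IT"), (20, "Research"), (30, "Marketing")],
)
conn.execute("UPDATE department SET dname = ? WHERE deptno = ?", ("R&D", 20))
conn.execute("DELETE FROM department WHERE deptno = ?", (30,))

departments = conn.execute(
    "SELECT deptno, dname FROM department ORDER BY deptno"
).fetchall()
print(departments)
```

The SELECT statement says what data is wanted, not how to find it, which is what "non-procedural" means.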
Centralized and Distributed Database Systems:
A database whose data all resides in one place, either on a single machine or on several connected machines that
appear to outside users as one, is called a centralized database.
Distributed databases can be defined as a collection of multiple, logically interrelated databases distributed over a
computer network. In a distributed database system, the database is stored on several computers. The computers in a
distributed system communicate with one another through various communication media, such as high-speed
networks or telephone lines.
The main difference between centralized and distributed databases is that distributed databases are typically
geographically separated, are separately administered, and have slower interconnections. Also, in distributed databases
we differentiate between local and global transactions.
Database Administrator (DBA):
An information specialist who has responsibility for the database is called a Database Administrator (DBA). His/her
duties fall into four major areas: database planning, implementation, operation and security. Typical responsibilities
of a DBA are:
Managing Data Activities: The DBA ensures the integrity, security and privacy of data.
Managing Database Structure: The DBA is also responsible for designing the logical database, and for its
implementation and maintenance.
Managing DBMS: S/he also maintains a data dictionary.
Managing Database Users and Providing Security: The DBA is responsible for controlling database users and
determining the proper security of data.
Performing Backup and Recovery Duties: The DBA establishes a regular schedule for database backup and prepares
different recovery techniques.
Data Integrity:
Data integrity means that data remains stable, secure and accurate. It is maintained by internal constraints known as
integrity rules that are invisible to users.
There are two integrity rules associated with the relational model:
a) Entity Integrity:
Entity integrity is the rule that no column that is part of the primary key may accept null values. This rule
guarantees that each record will indeed have its own identity and ensures that one record can be distinguished
from another.
b) Referential Integrity:
The Referential Integrity rule states that “If the relational table has a foreign key, then every value of the foreign
key must either be NULL or match with the values in another relational table in which that foreign key is a
primary key.”
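Both rules can be seen in action with Python's built-in sqlite3 module. The tables, columns and the try_insert helper are assumptions made for the sketch. Two SQLite details matter here: foreign keys are checked only after PRAGMA foreign_keys = ON, and the key column is declared NOT NULL explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE department (dept_code TEXT PRIMARY KEY NOT NULL, dname TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"
    " ename TEXT,"
    " dept_code TEXT REFERENCES department (dept_code))"
)
conn.execute("INSERT INTO department VALUES ('IT', 'Information Technology')")


def try_insert(sql, params):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Entity integrity: a primary key column may not be NULL.
null_key = try_insert("INSERT INTO department VALUES (?, ?)", (None, "Nameless"))

# Referential integrity: a foreign key must match an existing key or be NULL.
valid_fk = try_insert("INSERT INTO employee VALUES (?, ?, ?)", (1, "Asha", "IT"))
null_fk = try_insert("INSERT INTO employee VALUES (?, ?, ?)", (2, "Bikram", None))
bad_fk = try_insert("INSERT INTO employee VALUES (?, ?, ?)", (3, "Chandra", "HR"))
print(null_key, valid_fk, null_fk, bad_fk)
```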
Data Security:
Data security is the protection of data. It is mainly concerned with:
a) Preventing the loss of data
b) Preventing the misuse or unwanted modification of data
c) Preventing the disclosure of data to unauthorized persons
The following measures can be taken to ensure all three types of security:
a) The use of backup copies on tapes or disks
b) Physical protection
c) The use of passwords to prevent unauthorized use of computer terminals or unauthorized access to on-line files
d) Constant checks of security
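Measure a), keeping backup copies, can be sketched with the backup method of Python's built-in sqlite3 module. A second in-memory database stands in for the tape or disk, and the account table is an assumption made for the example.

```python
import sqlite3

# The live database (table name and row are illustrative assumptions).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT)")
source.execute("INSERT INTO account VALUES (1, 'Asha')")
source.commit()

# Take a backup copy; in practice the target would be a file on separate media.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate a loss of data in the live database.
source.execute("DELETE FROM account")
source.commit()

lost = source.execute("SELECT COUNT(*) FROM account").fetchone()[0]
recovered = backup.execute("SELECT id, owner FROM account").fetchall()
print(lost, recovered)
```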