Uploaded by sheen2136

DBMS Unit-1

Database Management System
Overview:
The database management system consists of two parts. They are:
1. Database and
2. Management System
What is a Database?
To understand database, first we need to start from data, which is the basic building block of
any DBMS.
Data: Facts, figures, statistics etc. having no particular meaning (e.g. 1, Raman, 19 etc).
Record: Collection of related data items, e.g. in the above example the three data items had
no meaning. But if we organize them in the following way, then they collectively represent
meaningful information.
Roll    Name     Age
1       Raman    19
Table or Relation: Collection of related records.
Roll    Name     Age
1       Raman    19
2       Jay      20
3       Raj      18
The columns of this table or relation are called Fields or Attributes. The set of values that a column may take is called its Domain.
The rows are called Tuples or Records.
Database: A database is a collection of inter-related data, organized so that the data can be
retrieved, inserted and deleted efficiently. In relational terms, it is a collection of related
relations. Consider the following collection of tables:
Roll    Name     Age
1       Raman    19
2       Jay      20
3       Raj      18
For example: The college Database organizes the data about the admin, staff, students and
faculty etc.
In a relational database, data is organized in rows and columns.
The rows are called Tuples or Records. The data items within one row may belong to different
data types.
The columns, on the other hand, are called Attributes. All the data items within a single
attribute come from the same domain and are therefore of the same data type.
Database Management System: A database management system is software that is
used to manage the database. For example, MySQL, Oracle and SQL Server are popular
database management systems used in many different applications.
DBMS and its applications: A database management system is a computerized record-keeping system.
The overall purpose of DBMS is to allow the users to define, store, retrieve and update the
information contained in the database on demand. Some of the major areas of application are
as follows:
1. Banking
2. Airlines
3. Universities
4. Manufacturing and selling
5. Human resources
File System: A data file is a collection of related records stored on a storage medium such as a
hard disk or optical disc. A database, by contrast, is a collection of data organized in a manner
that allows access, retrieval, and use of that data.
Keeping information in a file processing system has a number of major disadvantages:
● Data redundancy and inconsistency
✔ Multiple file formats, duplication of information in different files
● Difficulty in accessing data
✔ Need to write a new program to carry out each new task
● Data isolation
✔ multiple files and formats
● Integrity problems
✔ Hard to add new constraints or change existing ones
● Atomicity problems
✔ Failures may leave database in an inconsistent state with partial updates carried out
✔ E.g. transfer of funds from one account to another should either complete or not
happen at all
● Security problems
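The atomicity problem above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the `account` table, its columns, the balances and the `transfer` helper are all hypothetical, and Python's built-in sqlite3 module stands in for a full DBMS. The point is that a transfer made of two updates is either applied completely or not at all.

```python
import sqlite3

# Hypothetical two-account ledger; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back together with it.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

With plain files, the program itself would have to detect the half-finished transfer and undo the first write.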
DBMS vs File System
● DBMS: A DBMS manages a collection of data; the user is not required to write the procedures for managing it.
  File System: A file system also holds a collection of data, but the user has to write the procedures for managing it.
● DBMS: Gives an abstract view of data that hides the details.
  File System: Exposes the details of data representation and storage.
● DBMS: Provides a crash recovery mechanism, i.e., the DBMS protects the user from system failure.
  File System: Has no crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file may be lost.
● DBMS: Provides a good protection mechanism.
  File System: It is very difficult to protect a file under the file system.
● DBMS: Takes care of concurrent access to data using some form of locking.
  File System: Concurrent access causes many problems, such as one user reading a file while another is deleting or updating some of its information.
Database Abstraction: Database systems are made up of complex data structures. To ease
user interaction with the database, the developers hide irrelevant internal details from users.
This process of hiding irrelevant details from the user is called data abstraction.
Internal or Physical level: This is the lowest level of data abstraction. It describes how data
is actually stored in the database. The complex data structure details are found at this level.
Conceptual or logical level: This is the middle level of the 3-level data abstraction architecture.
It describes what data is stored in the database.
View level: This is the highest level of data abstraction. It describes the user's interaction with
the database system.
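As a minimal sketch of the conceptual and view levels, the example below uses Python's built-in sqlite3 module. The `student` table reuses the Roll/Name/Age data from earlier in these notes; the view name `student_directory` is a hypothetical choice made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure of the table.
conn.execute(
    "CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Raman", 19), (2, "Jay", 20), (3, "Raj", 18)],
)

# View level: one group of users sees only roll numbers and names.
conn.execute("CREATE VIEW student_directory AS SELECT roll, name FROM student")

directory = conn.execute(
    "SELECT * FROM student_directory ORDER BY roll"
).fetchall()
```

How SQLite lays the rows out on disk (the physical level) never appears in this code, which is the purpose of the abstraction.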
Database Architecture: The DBMS design depends upon its architecture.
DBMS architecture depends upon how users are connected to the database to get their
requests done.
Types of DBMS Architecture:
1-Tier Architecture:
● In this architecture, the database is directly available to the user.
● The 1-Tier architecture is used for development of local applications, where
programmers can communicate directly with the database for a quick response.
2-Tier Architecture:
The 2-Tier architecture is the same as the basic client-server model. In the two-tier architecture,
applications on the client end can communicate directly with the database at the server
side. For this interaction, APIs such as ODBC and JDBC are used.
3-Tier Architecture: The 3-Tier architecture contains another layer (an application server)
between the client and the database server. In this architecture, the client cannot communicate
directly with the database server; its requests pass through the middle layer.
Data Independence: Data independence is defined as a property of a DBMS that lets
you change the database schema at one level of a database system without
having to change the schema at the next higher level.
In DBMS there are two types of data independence:
1. Physical data independence
2. Logical data independence
Data independence is defined in terms of the three levels of a database: the internal
(physical) level, the conceptual (logical) level and the view level.
Physical Data Independence
● Physical data independence helps you to separate conceptual levels from the
internal/physical levels.
● It allows you to provide a logical description of the database without the need to
specify physical structures.
● With physical independence, you can easily change the physical storage
structures or devices without an effect on the conceptual schema.
Examples of changes under Physical Data Independence
Due to physical independence, none of the changes below will affect the conceptual
layer.
● Using a new storage device such as a hard drive or magnetic tape
● Modifying the file organization technique in the database
● Switching to different data structures
● Changing the access method
● Modifying indexes
● Changes to compression techniques or hashing algorithms
● Change of location of the database, say from the C drive to the D drive
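One of the changes listed above, modifying indexes, can be sketched as follows. This is an illustration only, using Python's built-in sqlite3 module; the `student` table and the index name `idx_student_age` are assumptions made for the example. The same query is run before and after the physical change and returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Raman", 19), (2, "Jay", 20), (3, "Raj", 18)],
)

query = "SELECT name FROM student WHERE age >= 19 ORDER BY name"
before = conn.execute(query).fetchall()

# Physical change: add an index on age. The table definition (conceptual
# schema) and the query text are left untouched.
conn.execute("CREATE INDEX idx_student_age ON student (age)")

after = conn.execute(query).fetchall()
```

Only the access path changed; nothing that refers to the table by its logical structure had to be rewritten.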
Logical Data Independence
Logical data independence is the ability to change the conceptual schema without
changing
1. External views
2. External programs
Compared to physical data independence, logical data independence is more challenging
to achieve.
Examples of changes under Logical Data Independence
● Adding, modifying or deleting an attribute, entity or relationship is possible without a
rewrite of existing application programs.
● Merging two records into one.
● Breaking an existing record into two or more records.
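The first change listed above, adding an attribute, can be sketched like this. It is a minimal illustration with Python's built-in sqlite3 module; the `student` table, the view `student_names` and the new `email` column are hypothetical names. The external view returns the same result before and after the conceptual schema changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)", [(1, "Raman"), (2, "Jay")]
)

# External view used by an existing application.
conn.execute("CREATE VIEW student_names AS SELECT roll, name FROM student")
view_query = "SELECT * FROM student_names ORDER BY roll"
before = conn.execute(view_query).fetchall()

# Conceptual change: a new attribute is added to the table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = conn.execute(view_query).fetchall()
```

An application that read the table with `SELECT *` instead of going through the view would see the extra column, which is one reason logical independence is harder to achieve in practice.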
Importance of Data Independence:
● Helps you to improve the quality of the data.
● Database system maintenance becomes affordable.
● Enforcement of standards and improvement in database security.
● You don't need to alter data structures in application programs.
Models of Database Architecture: Hierarchical, Network and Relational Models
Conceptually, there are three broad options with regard to database models. These are:
a. Hierarchical model
b. Network model
c. Relational model
(a) Hierarchical model:
This model presents data to users in a hierarchy of data elements that can be represented in a
sort of inverted tree.
(b) Network model:
In the network model of database, there are no levels and a record can have any number of
owners and can also have ownership of several records. This model is the same as the
hierarchical model; the only difference is that a record can have more than one parent.
(c) Relational model:
The most recent of the three models, and the most popular, is the relational database model. This
model was developed to overcome the problems of complexity and inflexibility of the earlier two
models in handling databases with many-to-many relationships between entities.
Database Schema and instance:
A database schema is the skeleton structure that represents the logical view of the entire
database.
A database schema is a blueprint or architecture of how our data will look. It doesn't hold data
itself, but instead describes the shape of the data and how it might relate to other tables or
models.
A database instance is the data actually stored in the database at a particular moment. Every
entry in it conforms to the schema and contains all of the properties the schema describes.
Database Languages: Database languages are used to read, update and store data in
a database. There are several such languages that can be used for this purpose; one of
them is SQL (Structured Query Language).
1. Data Definition Language: DDL stands for Data Definition Language.
● It is used to define database structure or pattern.
● It is used to create schema, tables, indexes, constraints, etc. in the database.
● Using the DDL statements, you can create the skeleton of the database.
● Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.
● Here are some tasks that come under DDL:
Create: It is used to create objects in the database.
Alter: It is used to alter the structure of the database.
Drop: It is used to delete objects from the database.
Truncate: It is used to remove all records from a table.
Rename: It is used to rename an object.
Comment: It is used to add comments to the data dictionary.
These commands are used to update the database schema, which is why they come under
Data definition language.
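A few of the DDL tasks above can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: the table names `student` and `learner` and the helper `tables` are hypothetical, and SQLite is only one possible DBMS. SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename and Drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def tables(conn):
    """Return the names of all tables recorded in SQLite's catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


# Create: define a new object.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
# Alter: change the structure of an existing object.
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")
# Rename: give the object a new name.
conn.execute("ALTER TABLE student RENAME TO learner")

after_rename = tables(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]

# Drop: delete the object, structure and data together.
conn.execute("DROP TABLE learner")
after_drop = tables(conn)
```

The catalog table `sqlite_master` queried by the helper is an example of the metadata that DDL statements maintain.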
2. Data Manipulation Language: DML stands for Data Manipulation Language. It is
used for accessing and manipulating data in a database.
Here are some tasks that come under DML:
Select: It is used to retrieve data from a database.
Insert: It is used to insert data into a table.
Update: It is used to update existing data within a table.
Delete: It is used to delete records from a table (all of them, or only those matching a condition).
Merge: It performs UPSERT operation, i.e., insert or update operations.
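The Select, Insert, Update and Delete tasks can be sketched with Python's built-in sqlite3 module. The `student` table and its rows are taken from the example data earlier in these notes; the specific update and delete conditions are arbitrary choices for illustration. Merge is left out because SQLite has no MERGE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Raman", 19), (2, "Jay", 20), (3, "Raj", 18)],
)
# Update: change existing data in the rows that match the condition.
conn.execute("UPDATE student SET age = 21 WHERE roll = 2")
# Delete: remove the rows that match the condition.
conn.execute("DELETE FROM student WHERE roll = 3")
# Select: retrieve data.
remaining = conn.execute(
    "SELECT roll, name, age FROM student ORDER BY roll"
).fetchall()
```

Omitting the WHERE clause from the DELETE statement would remove every row while keeping the table itself.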
3. Data Control Language: DCL stands for Data Control Language. It is used to
control access to the data stored in the database.
DCL commands are as follows
1. GRANT
2. REVOKE
They are used to grant access permissions to, or revoke them from, any database user.
Interfaces in DBMS:
A database management system (DBMS) interface is a user interface that allows
queries to be submitted to a database without using the query language itself.
● Menu-Based Interfaces for Web Clients or Browsing
● Forms-Based Interfaces
● Graphical User Interface
● Interfaces for DBA
Data Models in DBMS:
● A data model is a logical structure of a database.
● Data models are used to show how data is stored, connected, accessed and
updated in the database management system.
● We use a set of symbols and text to represent the information so that members
of the organization can communicate and understand it.
Entity-Relationship Data Model:
● An ER model is the logical representation of data as objects and relationships among
them.
● In ER modeling, the database structure is portrayed as a diagram called an
entity-relationship diagram.
● It is simple to understand, so developers can use it to
communicate with the stakeholders.
ER diagram has the following three components:
Entities: An entity is a real-world thing. It can be a person, place, or even a
concept. Example: Teachers, Students, Course, Building, Department etc. are some of
the entities of a School Management System.
● In the ER diagram, an entity is represented as a rectangle.
Attributes: An attribute is a real-world property of an entity, i.e. a
characteristic of that entity. Example: The entity teacher has properties like
teacher id, salary, age, etc.
● An ellipse is used to represent an attribute.
Relationship: A relationship tells how two entities are related. Example: A teacher
works for a department.
● A diamond or rhombus is used to represent the relationship.
Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't
contain any key attribute of its own. The weak entity is represented by a double
rectangle.
Attributes:
● Key Attribute
The key attribute is used to represent the main characteristic of an entity. It
represents a primary key. The key attribute is represented by an ellipse with the text
underlined.
● Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute.
The composite attribute is represented by an ellipse, and the ellipses of its component
attributes are connected to it.
● Multivalued Attribute
An attribute that can have more than one value is known as a
multivalued attribute. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
● Derived Attribute
An attribute that can be derived from another attribute is known as a derived attribute. It
is represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another
attribute like Date of birth.
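The age example can be sketched in Python. This is only an illustration: the `Person` class, its fields and the date of birth used are hypothetical. The stored attribute is the date of birth, and the age is computed on request instead of being stored.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    date_of_birth: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: whole years between date_of_birth and today."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        birthday = (self.date_of_birth.month, self.date_of_birth.day)
        if (today.month, today.day) < birthday:
            years -= 1
        return years


raman = Person("Raman", date(2005, 6, 15))  # hypothetical date of birth
age_before_birthday = raman.age(date(2024, 6, 14))
age_on_birthday = raman.age(date(2024, 6, 15))
```

Passing `today` explicitly keeps the example reproducible; a real application would normally use `date.today()`.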
Mapping Constraint:
A mapping constraint is a data constraint that expresses the number of entities to
which another entity can be related via a relationship set.
There are four types of relationships:
1. One to One
2. One to Many
3. Many to One
4. Many to Many
1. One to One Relationship:
When a single instance of an entity is associated with a single instance of another entity,
it is called a one to one relationship.
2. One to Many Relationship:
When a single instance of an entity is associated with more than one instance of
another entity, it is called a one to many relationship.
3. Many to One Relationship
When more than one instance of an entity is associated with a single instance of
another entity, it is called a many to one relationship. For example, many students
can study in a single college but a student cannot study in many colleges at the same
time.
4. Many to Many Relationship
When more than one instance of an entity is associated with more than one
instance of another entity, it is called a many to many relationship.
Keys:
● Keys play an important role in the relational database.
● A key is used to uniquely identify any record or row of data in a table. It is also
used to establish and identify relationships between tables.
For example: In a Student table, ID is used as a key because it is unique for each student.
In a PERSON table, passport number, license number and SSN are keys since they are
unique for each person.
Types of key:
1. Super Key:
● A super key is a set of one or more attributes that can uniquely identify a tuple.
● For example, in an EMPLOYEE table, the names of two employees can be the same, but
their EMPLOYEE_ID can't be the same. Hence the combination
(EMPLOYEE_ID, EMPLOYEE_NAME) can also be a key: it is a super key because it contains EMPLOYEE_ID.
2. Candidate key
● A candidate key is an attribute or set of attributes that can uniquely identify a
tuple.
● A candidate key is a super key with no redundant attributes, i.e. removing any
attribute from it would stop it from identifying tuples uniquely.
● Every table must have at least one candidate key. A table can have
multiple candidate keys but only a single primary key.
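The difference between super keys and candidate keys can be sketched in Python. This is an illustration under assumptions: the EMPLOYEE rows, the attribute names and both helper functions are hypothetical. It also checks uniqueness only on the sample rows, whereas a real key is a property of the schema that must hold for every possible set of rows.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows: names repeat, IDs and passport numbers do not.
rows = [
    {"id": 1, "name": "Raman", "passport": "P100"},
    {"id": 2, "name": "Jay", "passport": "P200"},
    {"id": 3, "name": "Raman", "passport": "P300"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Super keys that contain no smaller super key."""
    attributes = sorted(rows[0])
    keys = []
    # Smaller sets are tried first, so any non-minimal super key is
    # rejected because it contains a key that was already found.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


name_alone = is_super_key(rows, ("name",))
id_with_name = is_super_key(rows, ("id", "name"))
keys = candidate_keys(rows)
```

Here `("id", "name")` is a super key but not a candidate key, because `("id",)` alone is already unique.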
3. Primary key:
● It is the candidate key chosen to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON table.
● In the EMPLOYEE table, ID can be the primary key since it is unique for each employee.
License Number or Passport Number could be selected as the primary key instead,
since they are also unique.
4. Foreign key:
● A foreign key is a column (or set of columns) of a table that is used to point to the
primary key of another table.
● In a company, every employee works in a specific department, and employee and
department are two different entities. So we don't store the information of the
department in the employee table. Instead, we link these two tables through the
primary key of one table.
● The use of a foreign key is simply to link the attributes of two tables together with
the help of a primary key attribute. Thus, it is used for creating and maintaining the
relationship between the two relations.
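The employee and department example can be sketched with Python's built-in sqlite3 module. The table names, column names and rows are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, emp_name TEXT, "
    # Foreign key: points to the primary key of department.
    "dept_id INTEGER REFERENCES department (dept_id))"
)
conn.execute("INSERT INTO department VALUES (10, 'CS')")
conn.execute("INSERT INTO employee VALUES (1, 'Raman', 10)")

try:
    # Department 99 does not exist, so this row must be rejected.
    conn.execute("INSERT INTO employee VALUES (2, 'Jay', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

joined = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM employee e "
    "JOIN department d ON e.dept_id = d.dept_id"
).fetchall()
```

Department details are stored once in `department`, and the join recovers them for each employee through the foreign key.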
SID    Name    Marks    Department    Course
       A       78       CS            C1
       B       60       EE            C2
       A       78       CS            C2
       B       60       EE            C3
       C       80       IT            C2
Participation Constraint:
● Participation constraint specifies the existence of an entity when it is related to
another entity in a relationship type.
● Minimum cardinality is the minimum number of instances of an entity that can be
associated with each instance of another entity.
● Maximum cardinality is the maximum number of instances of an entity that can be
associated with each instance of another entity.
There are two types of participation constraints: Total and Partial Participation
Total Participation:
● It specifies that each entity in the entity set must compulsorily participate in at least one
relationship instance in that relationship set.
● Total participation is represented using a double line between the entity set and
relationship set.
Partial Participation:
● It specifies that each entity in the entity set may or may not participate in the relationship
instance in that relationship set.
● Partial participation is represented using a single line between the entity set and
relationship set.
Generalization:
Generalization uses a bottom-up approach in which two or more lower level entities
combine to form a new higher level entity.
For example, if two lower level entities have two common attributes, Name and Address,
we can make a generalized entity with these common attributes.
The new generalized entity, Person, has the common
attributes of both the entities.
Specialization:
● It is a process in which an entity is divided into sub-entities.
● Specialization is a top-down process.
● The idea behind specialization is to find the subsets of entities that have a few
distinguishing attributes.
● For example, consider an entity Employee, which can be further classified into the
sub-entities Technician, Engineer and Accountant because these sub-entities have some
distinguishing attributes.
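The Employee example can be sketched with Python classes, where inheritance plays the role of the sub-entity relationship. This is a loose analogy, not a database design: the distinguishing attributes (`tool_certification`, `engineering_field`, `license_no`) and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    # Attributes shared by every employee (the higher level entity).
    emp_id: int
    name: str


@dataclass
class Technician(Employee):
    tool_certification: str  # hypothetical distinguishing attribute


@dataclass
class Engineer(Employee):
    engineering_field: str  # hypothetical distinguishing attribute


@dataclass
class Accountant(Employee):
    license_no: str  # hypothetical distinguishing attribute


staff = [
    Technician(1, "Raman", "welding"),
    Engineer(2, "Jay", "civil"),
    Accountant(3, "Raj", "A-17"),
]
all_are_employees = all(isinstance(member, Employee) for member in staff)
shared_attributes = [(member.emp_id, member.name) for member in staff]
```

Read top-down, the classes show specialization; read bottom-up, `Employee` is the generalization of the three sub-entities.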
Aggregation:
Aggregation is used when a single entity alone does not make sense in a
relationship, so the relationship between two entities is treated as one higher level entity.
Reduction of ER diagram to Table:
The database can be represented using the notations, and these notations can be
reduced to a collection of tables.
1. A strong entity set with only simple attributes will require only one table in relational
model.
● Attributes of the table will be the attributes of the entity set.
● The primary key of the table will be the key attribute of the entity set.
2. A strong entity set with any number of composite attributes will require only one table in
relational model.
● During conversion, the simple attributes that make up each composite attribute are
taken into account, not the composite attribute itself.
3. A strong entity set with any number of multi valued attributes will require two tables in
relational model.
● One table will contain all the simple attributes with the primary key.
● Other table will contain the primary key and all the multi valued attributes.
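Rule 3 can be sketched with Python's built-in sqlite3 module, using the earlier example of a student with more than one phone number. The table names, column names and phone numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table 1: the simple attributes together with the primary key.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
# Table 2: the primary key plus the multivalued attribute, one row per value.
conn.execute(
    "CREATE TABLE student_phone ("
    "roll INTEGER REFERENCES student (roll), "
    "phone TEXT, "
    "PRIMARY KEY (roll, phone))"
)

conn.execute("INSERT INTO student VALUES (1, 'Raman')")
conn.executemany(
    "INSERT INTO student_phone VALUES (?, ?)",
    [(1, "555-0101"), (1, "555-0102")],
)

phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM student_phone WHERE roll = 1 ORDER BY phone"
    )
]
```

Each additional phone number becomes another row in the second table, so the first table never needs a repeated or list-valued column.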
5. Translating Relationship Set into a Table
● A relationship set will require one table in the relational model.
● Attributes of the table are:
   1. Primary key attributes of the participating entity sets.
   2. Its own descriptive attributes.
6. For Binary Relationships with Cardinality Ratios
The following four cases are possible:
Case-01: Binary relationship with cardinality ratio m:n
Case-02: Binary relationship with cardinality ratio 1:n
Case-03: Binary relationship with cardinality ratio m:1
Case-04: Binary relationship with cardinality ratio 1:1
● For Binary Relationship With Cardinality Ratio m:n
Here, three tables will be required:
1. A ( a1 , a2 )
2. R ( a1 , b1 )
3. B ( b1 , b2 )
● For Binary Relationship With Cardinality Ratio 1:n
Here, two tables will be required:
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
● For Binary Relationship With Cardinality Ratio m:1
Here, two tables will be required:
1. AR ( a1 , a2 , b1 )
2. B ( b1 , b2 )
● For Binary Relationship With Cardinality Ratio 1:1
Here, two tables will be required. Either combine ‘R’ with ‘A’ or ‘B’
Way-01:
1. AR ( a1 , a2 , b1 )
2. B ( b1 , b2 )
Way-02:
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
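The m:n case can be sketched with Python's built-in sqlite3 module. The table and column names follow the A, R, B notation used above; the column types and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 TEXT);
    CREATE TABLE B (b1 INTEGER PRIMARY KEY, b2 TEXT);
    -- R holds the primary keys of both participating entity sets.
    CREATE TABLE R (
        a1 INTEGER REFERENCES A (a1),
        b1 INTEGER REFERENCES B (b1),
        PRIMARY KEY (a1, b1)
    );
    """
)
conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "x"), (2, "y")])
conn.executemany("INSERT INTO B VALUES (?, ?)", [(10, "p"), (20, "q")])
# One A row may relate to many B rows, and one B row to many A rows.
conn.executemany("INSERT INTO R VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])

pairs = conn.execute(
    "SELECT A.a2, B.b2 FROM R "
    "JOIN A ON R.a1 = A.a1 JOIN B ON R.b1 = B.b1 "
    "ORDER BY A.a2, B.b2"
).fetchall()
```

In the 1:n and m:1 cases the separate R table disappears, because the key of the "one" side can be stored directly in the table of the "many" side (BR or AR above).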
Extended ER Diagram:
● Enhanced entity-relationship (EER) diagrams are basically an expanded version of
ER diagrams.
● EER models are helpful tools for designing databases with high-level models.
● With their enhanced features, you can plan databases more thoroughly by delving into
the properties and constraints with more precision.
An EER diagram provides you with all the elements of an ER diagram while adding:
● Subclasses and Super classes.
● Specialization and Generalization.
● Category or union type.
● Aggregation.
Features of EER Model
● EER creates a design more accurate to database schemas.
● It reflects the data properties and constraints more precisely.
● It includes all modeling concepts of the ER model.
● Its diagrammatic technique helps in displaying the EER schema.
● It includes the concept of specialization and generalization.
● It is used to represent a collection of objects that is a union of objects of different
entity types.
A. Sub Class and Super Class
● The sub class and super class relationship leads to the concept of inheritance.
● The relationship between sub class and super class is denoted with its own symbol
in the diagram.
1. Super Class
● A super class is an entity type that has a relationship with one or more subtypes.
● An entity cannot exist in the database merely by being a member of a sub class; it must
also be a member of the super class.
For example: The Shape super class has the sub groups Square, Circle and Triangle.
2. Sub Class
● A sub class is a group of entities with unique attributes.
● A sub class inherits properties and attributes from its super class.
For example: Square, Circle and Triangle are the sub classes of the Shape super class.
B. Specialization and Generalization
1. Generalization
● Generalization is the process of forming a general entity that contains the properties
common to all the entities being generalized.
● It is a bottom-up approach, in which two lower level entities combine to form a higher level
entity.
● Generalization is the reverse process of Specialization.
● It defines a general entity type from a set of specialized entity types.
● It minimizes the difference between the entities by identifying the common features.
2. Specialization
● Specialization is a process that defines a group of entities which is divided into sub groups
based on their characteristics.
● It is a top-down approach, in which one higher level entity can be broken down into two
lower level entities.
● It maximizes the difference between the members of an entity by identifying the unique
characteristics or attributes of each member.
● It defines one or more sub classes for the super class and also forms the
superclass/subclass relationship.
C. Category or Union
● A category represents a single superclass/subclass relationship with more than one
super class.
● It can be a total or partial participation.
For example, in car booking, a car owner can be a person, a bank (which holds possession
of a car) or a company. The category (sub class) Owner is a subset of the union of the three
super classes Company, Bank, and Person. A category member must exist in at least
one of its super classes.
D. Aggregation
● Aggregation is a process that represents a relationship between a whole object and its
component parts.
● It abstracts a relationship between objects and views the relationship as an object.
● It is a process in which two entities are treated as a single entity.
Degree of Relationship:
● The degree of a relationship is the number of entity types that participate in a relationship.
● By looking at an E-R diagram, we can simply tell the degree of a relationship, i.e. the number
of entity types connected to a relationship is the degree of that relationship.
For example, if we have two entity types ‘Customer’ and ‘Account’ and they are linked using
the primary key and foreign key, we can say that the degree of the relationship is 2 because
here two entities are taking part in the relationship.
Based on the number of entity types that are connected we have the following degrees of
relationship:
● Unary
● Binary
● Ternary
● N-ary
Unary (degree 1): A unary relationship exists when both participants in the relationship are
of the same entity type. In this case we say that the degree of the relationship is 1.
● For example, suppose we have many students who belong to a particular club, such as a
dance club or basketball club, and some of them are club leads. A particular group
of students is managed by their respective club lead, and the club leads are chosen from
the students.
● So, ‘Student’ is the only entity type participating here.
● We can say that the minimum degree of a relationship can be one.
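The club lead example can be sketched with Python's built-in sqlite3 module as a table that refers to itself. The column `lead_roll` and the sample rows are hypothetical; the point is that both ends of the relationship are rows of the same `student` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "roll INTEGER PRIMARY KEY, name TEXT, "
    # Unary relationship: a student's lead is another student.
    "lead_roll INTEGER REFERENCES student (roll))"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Raman", None), (2, "Jay", 1), (3, "Raj", 1)],
)

# Join the table with itself: m is the member, l is that member's lead.
managed_by = conn.execute(
    "SELECT m.name, l.name FROM student m "
    "JOIN student l ON m.lead_roll = l.roll ORDER BY m.roll"
).fetchall()
```

Raman has no lead here, so the `lead_roll` value is NULL and the inner join leaves that row out of the result.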
Binary (degree 2):
● A binary relationship exists when exactly two entity types participate.
● When such a relationship is present we say that the degree is 2.
● Such relationships are easy to deal with, as they can be easily converted into relational
tables.
For example, we have two entity types ‘Customer’ and ‘Account’, where each ‘Customer’ has
an ‘Account’ which stores the account details of the ‘Customer’.
Since we have two entity types participating, we call it a binary relationship.
Ternary (degree 3):
● A ternary relationship exists when exactly three entity types participate.
● When such a relationship is present we say that the degree is 3.
● As the number of entities in the relationship increases, it becomes more complex to convert
them into relational tables.
For example, we have three entity types ‘Employee’, ‘Department’ and ‘Location’. The
relationship between these entities is defined as: an employee works in a department, and an
employee works at a particular location. Since three entities participate
in the relationship, it is a ternary relationship. The degree of this relationship is 3.
N-ary (n degree):
● An N-ary relationship exists when ‘n’ entities are participating.
● So, any number of entities can participate in a relationship. There is no limitation on the
maximum number of entities that can participate.
Database Structure:
DBMS is software that allows access to data stored in a database and provides an easy
and effective method of:
● Defining the information.
● Storing the information.
● Manipulating the information.
● Protecting the information from system crashes or data theft.
● Differentiating access permissions for different users.
A database system is partitioned into modules that deal with each of the responsibilities of
the overall system. The functional components of a database system can be broadly divided
into the storage manager and the query processor components.
● The storage manager is important because databases typically require a large amount of
storage space.
● The query processor is important because it helps the database system simplify and
facilitate access to data.
1. Query Processor:
It interprets the requests (queries) received from end user via an application program
into instructions. It also executes the user request which is received from the DML
compiler.
2. Storage Manager:
Storage Manager is a program that provides an interface between the data stored in the
database and the queries received. It is also known as Database Control System. It
maintains the consistency and integrity of the database by applying the constraints and
executes the DCL statements. It is responsible for updating, storing, deleting, and
retrieving data in the database.