1.Basic Concepts Of DBMS

advertisement
INTRODUCTION
Basic Terms:
Data:
• Data is defined as a collection of facts and figures that can
be recorded and has implicit meaning.
• It is a random, unorganized collection of measurements of
certain qualities or attributes relating to an entity.
Information:
• Information is a data that is collected, processed, logically
organized and analyzed as to be used by the decision maker.
Data Processing:
• Capturing
• Verifying
• Classifying
• Sorting/Arranging
• Summarizing
• Calculating
• Storing
• Retrieving
• Reproducing
• Communication
Database:
• A database is a collection of data, typically describing the
activities of one or more related organizations.
• For example, a University Database might contain
information about the following:
• students, faculty, courses, and classrooms.
• Relationships between them is :
•
students enrollment in courses,
• Faculty teaching courses, and
• the use of rooms for courses.
Database Management System:
• DBMS is a collection of programs that enables
users to create and maintain a database.
• It facilitates the process of:
– Defining a Database.
– Constructing a Database.
– Manipulating a Database.
• Defining a Database:
– Specifying the structure of a Database.
– The name of the entity.
– The attributes of the entity.
– Data Types.
– Size
– Constraints
– Operations
• Constructing a Database:
– Storing the database itself on some storage medium.
• Manipulating a Database:
– Querying
– Updating
– Generating Reports etc.
• Database systems are designed to manage large bodies
of information.
• Management of data involves both defining structures for
storage of information and providing mechanisms for the
manipulation of information.
• In addition, the database system must ensure the safety
of the information stored, despite system crashes or
attempts at unauthorized access.
• If data are to be shared among several users, the
system must avoid possible anomalous results.
Properties of Database:
1.
A database represents some aspects of the real world.
2.
A database is a logically coherent collection of data with some inherent
meaning.
3.
A random assortment of data cannot be termed as database.
4.
A database is designed, built and populated with data for a specific
purpose. It has an intended group of users and preconceived
applications in which these users are interested.
Types of Databases:
A) Classification Based on Nature Of Data:
1.
Traditional Database:

2.
Textual or numeric in nature.
Geographic Information System:

3.
Maps, weather data, satellite images.
Multimedia Database:

4.
Pictures, video clips, sound messages.
Data Warehouses and Online Analytical Processing:

5.
Integration of data and for decision making.
Real Time and Active Database:

Controlling industrial and manufacturing processes.
B) Classification based on Data Model:
1. Hierarchical DB.
2. Network DB.
3. Relational DB.
4. Object DB.
C) Classification based on number of Users:
1. Single User Systems.
2. Multi user Systems.
D) Classification based on Number of Sites:
1. Centralized DBMS
2. Distributed DBMS
E) Classification based on Type of Processing:
1. OLTP Systems
2. OLAP Systems
Database Applications:
– Banking: customer information, accounts, and loans, and
banking transactions.
– Airlines: For reservations and schedule information.
– Universities: For student information, course
registrations, and grades.
– Sales: customers, products, purchases
– Telecommunications: For keeping records of calls
made, generating monthly bills, maintaining balances on
prepaid calling cards, and storing information about the
communication networks.
– Manufacturing: production, inventory, orders, supply
chain.
– Human resources: employee records, salaries, tax
deductions
Purpose of Database Systems:
• In the early days, database applications were built
directly on top of file systems.
• What is a File System?
• In file system permanent records are stored in various
files and different application programs are written to
extract records from, and to add records to, the
appropriate files.
Drawbacks of using file systems to store data:
1. Data redundancy and inconsistency:
• Multiple file formats,
• Programs in multiple programming languages,
• duplication of information in different files,
• data inconsistencies.
2. Difficulty in accessing data:
• Need to write a new program to carry out each new task.
3. Data isolation:
• Data scattered in multiple files.
• multiple files and formats.
• Difficult to write new application program to retrieve data.
4. Integrity problems:
• Consistency Constraints
• Integrity constraints (e.g. account balance >1000) become
“buried” in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
5. Atomicity of updates:
• Failures may leave database in an inconsistent state with
partial updates carried out.
• Example: Transfer of funds from one account to another
should either complete or not happen at all.
• Difficult to ensure this property in file system.
6. Concurrent access by multiple users:
• Concurrent access needed for performance.
• Uncontrolled
concurrent
accesses
can
lead
to
inconsistencies.
• Example: Two people reading a balance and updating it at
the same time.
7. Security problems:
• Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above
problems.
Main Characteristics of Database Technology:
1. Self-Describing nature of a database system:
•
•
Database system stores:
–
Database itself;
–
Description of DB structure & constraints.
A DBMS catalog stores the description
of the
database. The description is called meta-data. This
allows the DBMS software to work with different
databases.
2. Insulation between programs and data:
•
This is called program-data independence.
•
Allows changing data storage structures and operations
without having to change the DBMS access programs.
3. Data Abstraction:
•
A data model is used to hide storage details and present
the users with a conceptual view of the database.
• Conceptual representation does not include details of how
data is stored or how operations are implemented.
4. Support of multiple views of the data:
• Each user may see a different view of the database, which
describes only the data of interest to that user.
• A view is a subset of the database or it may contain virtual
data that is derived from database but is not explicitly stored.
5. Sharing of data
Processing:
& Multi-user transaction
• OLTP Transactions.
• Concurrency Control Mechanism.
Additional Benefits of Database Technology:
-
Controlling redundancy in data storage.
-
Sharing of data among multiple users.
-
Restricting unauthorized access to data.
-
Providing multiple interfaces to different classes of users.
-
Representing complex relationships among data.
- Enforcing integrity constraints on the database.
-
Providing backup and recovery services.
-
Potential for enforcing standards.
-
Flexibility to change data structures.
-
Reduced application development time.
-
Availability of up-to-date information.
-
Economies of scale.
Users of Database:
1.
Database Administrators:
–
–
–
–
authorizing access to database,
Coordinating & monitoring database use,
Acquiring software & hardware resources as needed,
Accountable for breach of security or poor system response
time.
2. Database Designers:
–
–
–
–
Identify data to be stored in db,
Select appropriate structure for storing data,
Communicate with all users and understand
requirements,
Design a database that meets user requirements.
their
3. End Users:
1. Casual End Users:
•
•
•
•
Occasionally access a db,
Needs different information each time,
Use sophisticated database query language.
E.g. middle or high level managers.
2. Naïve or Parametric End Users:
•
•
•
Constantly query and update the database,
Uses standard types of Queries called
transactions.
E.g. Bank tellers, Railway reservation clerks etc.
canned
3. Sophisticated End Users:
•
•
Users who use database to meet complex requirements.
E.g. Engineers, scientists, business analysts.
4. Stand alone Users:
•
Use readymade program package to interact with
database.
4. System Analyst & Application Programmers:
–
System analyst determine the requirements of end users;
–
Application programmers implement these specifications as
programs.
–
Both are called software engineers.
5. DBMS System Designers & Implementers:
–
Designs & implements DBMS modules and interfaces as a
S/W package.
5. Operators & Maintenance Personnel:
–
Responsible for the actual running and maintenance of H/W &
S/W environment for the database system.
Architecture:
Data Abstraction:
• Data abstraction means the details of data storage are hided from
the users who do not need them.
Levels of Abstraction:
1. Physical level:
Describes how a record (e.g., customer) is stored.
2. Logical level:
Describes data stored in database, and the relationships
among the data.
3. View level:
Application programs hide details of data types. Views can also
hide information (such as an employee’s salary) for security
purposes.
Database Schema:
• The description of a database is called as database
schema.
•It is specified during database design.
• It includes the descriptions of the database structure and
the constraints that should hold on the database.
• It is not expected to change frequently.
Schema Diagram:
• A diagrammatic display of (some aspects of) a database
schema is called schema diagram.
Database Instance/Database State:
•The actual data stored in a database at a particular
moment in time is called as database instance , database
state or database occurrence.
Schemas VS Instances:
•The database schema changes very infrequently . The
database state changes every time the database is
updated . Schema is also called intension, whereas state
is called extension.
DBMS 3 - Tier Architecture:
In this DBMS architecture, schemas can de defined in
three levels:
1. Internal level.
2. Conceptual level.
3. External level.
1. Internal Level:
• The internal level has an internal schema.
• It describes the physical storage structure of the
database.
• The internal schema uses physical data model, which
describes the complete details of :
– data storage,
– access paths for the database, and
– how the data’s are retrieved or inserted in the database.
2. Conceptual Level:
• The conceptual level has a conceptual schema.
• It describes the whole database for different users
who
access the database.
• The conceptual schema hides the details of the physical
storage structures and concentrates basically on entities,
relationships, and constraints.
3. External Level:
• External level includes a number of external schemas or
user views.
•
Each external schema describes the part of the database
that a particular user group is interested in and hides the
rest of the database from that user groups.
• The three schemas are only descriptions of data.
• The data actually exists is at the physical level.
• DBMS transforms users request specified on an external
schema into a request against the conceptual schema and
then into a request against internal schema for processing
over the database.
• The request and retrieval must be reformatted to match
user’s external view.
• The process of transforming requests and results between
levels are called mappings.
Data Independence:
•
Data independence is the defined as the capacity to
change the schema at one level of database system
without having to change the schema at next higher level.
•
There are two types of data independence:
1. Logical Data Independence:
2. Physical Data Independence:
1. Logical Data Independence:
•
The capacity to change the conceptual schema without
having to change the external schemas and their
application programs.
•
Conceptual schema is changed to:
–
expand the Database (Adding a record type or data item)
–
Reduce the Database(Removing a record type or data item)
•
View definition and mappings need to be changed.
•
Application programs that reference the external schema
constructs must work as before.
2.
Physical Data Independence:
•
The capacity to change the internal schema without having to
change the conceptual schema.
•
Changes are needed because some physical files had to be
reorganized.
•
E.g.: Creation of additional access structure to improve the
performance of retrieval & update.
•
If the same data as before remains in the database conceptual
schema need not be changed.
Database Languages:
• Data Definition Language DDL.
• Storage Definition Language SDL
• View Definition Language VDL
• Data Manipulation Language DML
– High level or nonprocedural DML. (set at a time)
– Low level or Procedural DML. (Record at a time)
DBMS Interfaces:
1. Menu Based interfaces for browsing:
– Users are provided with a list of options (menus).
– Query is composed by picking up the options.
– No need to memorize the syntax of queries.
2. Forms based interfaces:
– A Form is displayed to each user.
– Data entry, retrieval is done through forms.
– Used by Naïve/parametric users.
– DBMS Supports for form specification language.
3. Graphical user interfaces:
– A schema is displayed to users in diagrammatic form.
– A query is constructed by manipulating a diagram.
– GUI utilizes menus & forms.
4. Natural language interfaces:
– These interfaces accepts requests written in English and tries
to understand it.
– Natural language interfaces has its own schema similar to DB
conceptual schema.
– Mapping is done.
5. Interfaces for parametric users:
– Parametric users(bank tellers) has a small set of operations
that are repeatedly performed.
– A small set of abbreviated commands is included.
– E.g. Function keys can be programmed to initiate various
commands.
6. Interfaces for the DBA:
– Privileged commands are created.
– These include commands for creating accounts, setting system
parameters, granting account authorization, changing a
schema, reorganizing storage structure.
Query processor:
• It handles high-level queries.
• It parses, validates, optimizes, and compiles or interprets a query
which results in the query plan.
Run-time DB processor:
• It handles database accesses at run-time by receiving retrieval or
update operations and carries them out on the database.
DDL compiler:
• It processes schema definitions (DDL statements) and stores
descriptions of the schemas (metadata) on the DBMS catalog.
Transaction Manager:
• With the cooperation of the concurrency and recovery subsystems,
this module ensures that the transactions which are constituted of
queries and other actions are executed atomically, in consistency, in
isolation and in durability.
Concurrency subsystem:
• Assures that individual actions of multiple transactions are executed
in such an order that the result is the same as if the transactions had
executed entirely at one time.
Recovery subsystem:
• Responsible to store every change to the database separately on
disk (I.e. log file) . Such information will be used to restore the system
to a consistent state after a any failure has occurred.
Security & authorization subsystem:
• Responsible to protect the database against unauthorized accesses.
File manager:
• File manager controls accesses to DBMS information that is stored
on the disk and can use OS’s File manager services.
Buffer manager:
• Manages the main memory buffers which store data, metadata,
indexes, statistics, and the log. Recall that all this information must be
in main memory (i.e. in buffers) before it can be used.
Note:
Buffer Manager & File Manager together are often called Stored Data
Manager.
Examples of DBMS modules interactions:
• Suppose a user requests to update a record of a relation (see Figure)
• The DML statement is issued to the query processor (1) which parses
it, checks if the user has the privileges to do such operation using the
metadata provided from the catalog information found in the main
memory buffers (2) and finally issues the query plan which is the
optimized way to execute the statement (3).
• The run-time DB processor takes the query plan and executes it by
writing the update in the buffer if the record is there (4). Otherwise, it
requests the buffer manager to get the record (5).
• The buffer Manager requests the corresponding block from the File
manager (6) who gets it from the disk (7), gives it back to the buffer
manager who puts it in the buffer (8).
• The run-time DB processor can then execute the statement.
• Before executing the statement, the run-time DB processors informs
the concurrency subsystem (9) in order to check concurrent accesses
for no inconsistency and the recovery subsystem to take the
necessary actions to allow a recovery in case of failure (10).
System Architectures for DBMS
We distinguish between two different DBMS architectures:
– Centralized DBMS architecture –for earlier systems
– Client-Server DBMS architecture –for current systems
Centralized DBMS Architecture:
• Earlier computer system architectures were based on mainframe
computers.
• Mainframe computers provide the main processing of all system
• functions.
• Most users accessed such systems via computer terminals that
provide display capabilities with no processing power.
• Hence, in the centralized DBMS architecture, all functions of the
system,
including
user
application
programs,
user
interface
programs, as well as all the DBMS functionality are executed in the
mainframe computer.
Client-Server Architecture:
• Client-server computer systems architectures emerged with new
computing environments.
• Large number of PCs, workstations, specialized servers and another
equipments are connected together via a network.
• A client-Server architecture contains:
– Specialized servers with specific functionalities providing resources
• File server, printer server, web server, E-mail server, etc.
– Client machines provide users with appropriate interface to utilize
these servers as well as with local processing power to run local
applications
Download