DATABASE MANAGEMENT SYSTEM [DBMS] 2620003 [Unit: 1]
Prepared By Lavlesh Pandit, SPCE MCA, Visnagar

INTRODUCTION OF DBMS

Data
Data are known facts that can be recorded and that have implicit meaning. Data are raw facts or figures collected about an entity. Anything can be data. Data is information that has been translated into a form that is more convenient to move or process.
For example, student information:
• Student enrollment number
• Student name
• Student address
• Contact number

Entity
An entity is any real-world object (a place, person, thing, concept, or theory) about which we store data.

Attributes
Attributes are properties (characteristics) of an entity. An attribute is a specification that defines a property of an object.

Database
A database is an organized collection of related information. It is a collection of persistent data that is used by the application systems of some given enterprise. A collection of computerized data files is also known as a database.

Database system
A database system is basically a computerized record-keeping system. Its overall purpose is to store information and to allow users to retrieve and update that information on demand.
A database system has four major components:
• Data
• Hardware
• Software
• Users

Data
Data is a very important component of the database system. Most organizations generate, store, and process large amounts of data. The data stored in the system is partitioned into one or more databases.
The data is divided into different parts:
• Integrated data
• Shared data
• Metadata

Integrated data: Data integration involves combining data residing in different sources and providing users with a unified view of these data.

Shared data: Individual pieces of data in the database can be shared among different users, in the sense that each of those users can have access to the same piece of data, possibly for different purposes.

Metadata: A description of the structure of the database is known as metadata. It basically means "data about data". System tables store the metadata, which includes:
- Number of tables and table names
- Number of fields and field names
- Primary key fields
Application metadata stores the structure and format of queries, reports, and other application components.

Hardware: Your PC (personal computer) is a system consisting of many components. Some of those components, such as Windows XP and all your other programs, are software. The parts you can actually see and touch (and would likely break if you threw them out of a fifth-story window) are hardware. For example, a printer to print the information and a hard disk to store it.

Software: Computer software, or just software, is a collection of computer programs and related data that tells a computer what to do and how to do it. Software is often divided into two categories:
• Systems software includes the operating system and all the utilities that enable the computer to function.
• Applications software includes programs that do real work for users. For example, word processors, spreadsheets, and database management systems fall under the category of applications software.

User: Users are the people who need information from the database to carry out their primary business responsibilities, e.g. personnel, staff, clerical workers, managers, and executives.
On the basis of their job and requirements, they are given full or partial access to the database. People like us are considered database application users.

Information
Meaningful data is known as information. Information is the summarization of data in a presentable form.

Fundamentals of Database
Users of the system can perform a variety of operations on the files in a database, for example:
• Adding new, empty files to the database
• Inserting data into existing files
• Retrieving data from existing files
• Changing data in existing files
• Deleting data from existing files
• Removing existing files from the database

Database Management System (DBMS)
A database management system is a software system that is used both to create databases and to manage the information stored within them. DBMS software enables end users and application programmers to share data. It provides a systematic method for creating, updating, retrieving, and storing information in a database.

Characteristics of DBMS
• Self-describing nature of the database system
• Insulation between programs and data, and data abstraction
• Support of multiple views of data
• Shared between different users
• Always persistent
• Correctness of data
• Provides high security
• Non-redundant

Self-Describing Nature of the Database System
• A fundamental characteristic of the database approach is that the database system contains not only the database itself but also a complete description of the database structure and constraints.
• This definition is stored in the DBMS catalog, which contains information such as the structure of each file, the data type and storage format of each data item, and the various constraints on the data.

Insulation between Programs and Data, and Data Abstraction
• In a traditional file-processing system, the structure of the data files is embedded in the application programs, so any change to the structure of a file may require changing every program that accesses it.
• By contrast, DBMS access programs do not require such changes in most cases. The structure of the data files is stored in the DBMS catalog separately from the access programs. We call this property program-data independence.

Support of Multiple Views of Data
A database typically has many users, each of whom may require a different view of the data. A view may be a subset of the database, or it may contain virtual data that is derived from the database files.

Shared between Different Users
• Data in a database is shared among different users and applications.
• A multi-user DBMS, as its name implies, must allow multiple users to access the database at the same time.

Always Persistent
• Data in a database exists permanently, in the sense that the data can live beyond the scope of the process that created it.

Correctness of Data
• Data should be correct with respect to the real-world entity that it represents.

Provides High Security
• Data should be protected from unauthorized access.

Non-Redundant
• No two data items in a database should represent the same real-world entity. E.g., the same student should not be stored under two different roll numbers.

Elements of Database
The following are the various elements of a database:
I. Data items
II. Entity and attribute
III. Logical and physical data
IV. Schema

I. Data items
• A data item is a piece of information. Data items are stored in files.

II. Entity and attribute
• An entity is any real-world object (a place, person, thing, concept, or theory) about which we store data.
• Attributes are properties (characteristics) of an entity.

III. Logical and physical data
• The logical structure refers to the way programmers look at the database. The physical structure refers to the way the data is actually stored in the database on a specific storage device.

IV. Schema
• The term schema means an overall chart of the data items and records stored in a database.
• The description of a database is called the database schema.

Instance and Schema
• A database schema corresponds to the variable declarations (along with the associated type definitions) in a program.
• Formally, a database schema is a set of formulas that specify the integrity constraints imposed on the database.
• A schema specifies the record types, the data types, and the attributes of each record type.
• The schema defines the tables, fields, relationships, views, indexes, packages, procedures, functions, queues, triggers, types, sequences, materialized views, synonyms, database links, Java and XML schemas, and other elements.
• A schema does not change frequently.

Schemas Example

account_number   branch_name       balance
A-101            Kalawad Road      500
A-215            Maninagar         700
A-102            Vastrapur         400
A-305            S.G.Road          350
A-201            University Road   900
A-222            M.G.Road          700
A-217            University Road   750

branch_name       branch_city   assets
Kalawad Road      Rajkot        7100000
University Road   Rajkot        9000000
Maninagar         Ahemdabad     400000
Patel Nagar       Delhi         3700000
Vastrapur         Ahemdabad     1700000
Hawda             Kolkatta      300000
Kandiwali         Mumbai        2100000
S.G.Road          Ahemdabad     8000000

Account_schema = (account_number, branch_name, balance)
Branch_schema = (branch_name, branch_city, assets)

Database Instance
The values of the variables in a program at a point in time correspond to an instance of a database schema. The collection of information stored in the database at a particular time is called an instance of the database, also known as the database state or snapshot. The instance changes every time data is added or updated. The schema is called the intension and the instance is called the extension.

Types of Schemas
The types of database schemas are listed below:
1) Physical schema: describes the database design at the physical level.
2) Logical schema: describes the database design at the logical level.
3) Subschema: a database may have several schemas at the view level
that describe different views of the database.

Database System Applications
1) Banking: for customer information, accounts, loans, and banking transactions.
2) Airlines: for reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner.
3) Universities: for student information, course registration, and grades.
4) Credit card transactions: for purchases on credit cards and generation of monthly statements.
5) Telecommunication: for keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.
6) Sales: for customer, product, and purchase information.

Advantages of DBMS
• Data can be shared between users
• Redundancy can be controlled
• Unauthorized access can be restricted
• Inconsistency can be avoided (to some extent)
• Multiple user interfaces can be provided
• Transaction support can be provided
• Standards can be enforced
• Complex relationships among data can be represented
• Backup and recovery are provided

• Data can be shared between users
o A database allows the data under its control to be shared by any number of application programs or users.
o Many users can see the data at a given time.

• Redundancy can be reduced
o Centralized control of the data by the DBA avoids unnecessary duplication of data and effectively reduces the total amount of data.

• Unauthorized access can be restricted
o The DBA can ensure that the only means of access to the database is through the proper channels, and hence can define security constraints to be checked whenever access to sensitive data is attempted.
o Different constraints can be established for each type of access (retrieve, insert, delete, etc.) to each piece of information in the database.

• Inconsistency can be avoided
o If a given fact is represented by a single entry (i.e. if the redundancy is removed), then two entries can never disagree, so such an inconsistency cannot occur.
o Alternatively, if the redundancy is not removed but is controlled (by making it known to the DBMS), then the DBMS can guarantee that the database is never inconsistent as seen by the user.

• Transaction support can be provided
- A transaction is a logical unit of work, typically involving several database operations (in particular, several update operations).
- The standard example involves the transfer of a cash amount from one account A to another account B. Clearly two updates are required here: one to withdraw the cash from account A and the other to deposit it into account B.

View of Data

Data Abstraction
• For the system to be usable, it must retrieve data efficiently.
• The need for efficiency leads to complex data structures for representing data in the database.
• Since many database users are not computer trained, developers hide this complexity from users through several levels of abstraction, to simplify users' interaction with the database system.

The Three-Level Architecture of DBMS
1. Physical level (internal level)
2. Logical level (conceptual level)
3. View level (external level or user level)

1. Physical Level (Internal Level)
• The internal or physical level is the physical representation of the database on the computer.
• It is the lowest level of abstraction.
• It describes how the data is actually stored.
• It describes low-level data structures in detail.
• It covers how the data is stored physically and where it is stored in the database.

The three levels answer three questions:
3. View level: What data do users and application programs see?
2. Logical level: What data is stored? It describes data properties such as data semantics and data relationships.
1. Physical level: How is the data actually stored? E.g., are we using disks? Which file system?

2. Logical Level (Conceptual Level)
This is the next higher level of abstraction.
• It describes what information or data is stored in the database (for example, the data type or format of the data).
• It also describes what relationships exist among those data.
• Here the whole database is described in terms of a small number of relatively simple structures.
• Users at this level need not be aware of the physical-level complexity used to implement the simple structures.
• Generally, database administrators (DBAs) work at the logical level of abstraction.

3. View Level (External Level or User Level)
• It is the highest level of abstraction.
• The external level is the user's view of the database.
• It describes only a part of the whole database for a particular group of users.
• This view hides all complexity.
• It exists only to simplify user interaction with the system.
• The system may provide many views of the same database.

Data Model
• A data model describes the underlying structure of a database.
• It is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.
• A data model provides a way to describe the design of a database at the physical, logical, and view levels.
• Data models can be classified into different categories:
  - E-R data model
  - Relational data model
  - Object-oriented data model
  - Semistructured data model
  - Hierarchical data model
  - Network data model

E-R Data Model
• The E-R (entity-relationship) model is a high-level data model.
• The E-R model is basically used to design the logical structure of a database.
• The logical structure of a database can be expressed graphically by an E-R diagram.
• The E-R model is widely used in database design.
• Specific symbols or blocks are used to draw an E-R diagram.

Advantages of the E-R Model
• Easy conversion from E-R to other data models.
• Graphical representation for better understanding.

To build an E-R model we use the following components:
1. Entities: specify the items of the application about which data is stored.
2. Relationships: connect entities and define the association between them.
3. Attributes: define the properties of entities.
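The three E-R components can be illustrated with a small sketch. It is only one possible way to mirror them in code, assuming Python dataclasses: each entity becomes a class, each attribute a field, and the relationship a class that links the two entities. The `Branch` and `Account` names reuse the schema example from earlier in this unit; `AccountBranch` is a hypothetical name chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Branch:
    """Entity: a bank branch. Each field is an attribute of the entity."""
    branch_name: str
    branch_city: str
    assets: int


@dataclass
class Account:
    """Entity: a bank account with two attributes."""
    account_number: str
    balance: int


@dataclass
class AccountBranch:
    """Relationship (hypothetical name): links one Branch to many Accounts."""
    branch: Branch
    accounts: List[Account] = field(default_factory=list)

    def add(self, account: Account) -> None:
        self.accounts.append(account)


university_road = Branch("University Road", "Rajkot", 9000000)
holds = AccountBranch(university_road)
holds.add(Account("A-201", 900))
holds.add(Account("A-217", 750))

# Follow the relationship from the branch to its accounts and sum the balances.
total = sum(account.balance for account in holds.accounts)
print(total)  # 1650
```

Because one branch is linked to several accounts, this particular relationship is one-to-many.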
An E-R diagram is drawn with the following components:
I. Rectangles: represent entities.
II. Ellipses: represent attributes.
III. Diamonds: represent relationships among entities.
IV. Lines: link attributes to entities and entities to relationships.

Figure: Sample E-R Diagram

Types of Relationship in the E-R Data Model
I. One-to-one (1:1)
II. One-to-many (1:M)
III. Many-to-one (M:1)
IV. Many-to-many (M:N)

Relational Data Model
• The relational model uses a collection of tables to represent both data and the relationships among those data.
• Each table has multiple columns, and each column has a unique name.
• Each table contains records of a particular type.
• A relational model represents the database as a collection of relations.
• The relational data model simplifies the user's view of the database.
• Tables are related to each other by sharing common entity characteristics.

• Suppose there are two tables, CUSTOMER and AGENT.
• The CUSTOMER table contains the fields CUST_ID, CUST_NAME, CUST_PHONE, and AGENT_ID.
• The AGENT table contains the fields AGENT_ID, AGENT_NAME, AGENT_PHONE, and AGENT_SAL.
• Even though the data is stored in different tables, the common link between the CUSTOMER and AGENT tables is AGENT_ID.
• Although the two tables are stored independently of one another, data in them can easily be connected using the common link.

CUSTOMER      AGENT
CUST_ID       AGENT_ID
CUST_NAME     AGENT_NAME
CUST_PHONE    AGENT_PHONE
AGENT_ID      AGENT_SAL

Advantages of the Relational Data Model
• Simplicity
• Flexible and powerful query capability
• Ease of design, implementation, maintenance, and use

Terms Used in the Relational Data Model
1. A relational database stores data in the form of tables.
2. The rows of a table are called tuples.
3. The columns of a table are known as attributes.
4. Every attribute has a data type associated with it.
5. Tables are called relations.
6.
Table names are called relation variables.

Object-Oriented Data Models
• Relational databases are widely successful in a variety of application areas.
• However, early relational systems offered limited support for distributing a database across a number of servers and for complex data types.
• For these reasons, the object-oriented DBMS (ODBMS) was developed.
• The ODBMS is among the more recent approaches to the DBMS.
• ODBMSs originated in engineering and other specialized application domains.
• An ODBMS can store different types of data, for example pictures, voice, and video, as well as text and numbers.
• It combines object-oriented programming with database technology and thus provides an integrated application development system.
• It provides powerful features such as inheritance and polymorphism.

Semistructured Data Model
• The semistructured data model permits the specification of data where individual data items of the same type may have different sets of attributes.
• XML (Extensible Markup Language) is widely used to represent semistructured data.

Database Language
• A database system provides a data-definition language (DDL) to specify the database schema and a data-manipulation language (DML) to express database queries and updates.
• These are not two separate languages. They simply form parts of a single database language, such as SQL.

Data-Definition Language
• A language in which we specify a database schema by a set of definitions is called a data-definition language.
• The DDL is also used to specify additional properties of the data.
• We specify the storage structure and access methods used by the database system by a set of statements in a special type of DDL called a data storage and definition language.
• These statements define the implementation details of the database schemas, which are usually hidden from the users.

• The data values stored in the database must satisfy certain consistency constraints, e.g. the balance on an account should not fall below $100.
• The DDL provides facilities to specify such constraints.
• The database system checks these constraints every time the database is updated.
• DDL commands are: CREATE, ALTER, DROP, TRUNCATE, RENAME.

Data Manipulation Language
A data-manipulation language is a language that enables users to access or manipulate data as organized by the appropriate data model. The types of access are:
1) Retrieval of information stored in the database
2) Insertion of new information into the database
3) Deletion of information from the database
4) Modification of information stored in the database

DML commands are:
• INSERT INTO: inserts new data into a database table
• SELECT: extracts data from a database table
• UPDATE: updates data in a database table
• DELETE: deletes data from a database table

DCL (Data Control Language)
• GRANT: gives users access privileges to the database
• REVOKE: withdraws access privileges given with the GRANT command

TCL (Transaction Control Language)
• COMMIT: saves the work done
• ROLLBACK: restores the database to its state as of the last COMMIT

DQL (Data Query Language)
• SELECT

Data Storage and Querying
• A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
• The functional components of a database system can be broadly divided into the storage manager and the query processor components.

Storage Manager
• A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
• The storage manager is responsible for the interaction with the file manager.
• The storage manager translates the various DML statements into low-level file-system commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.
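The DDL, DML, and TCL commands listed in this section can be tried out with a short sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for a full DBMS (SQLite has no GRANT or REVOKE, so DCL is not shown). The `account` table follows the schema example from earlier in this unit, the CHECK clause encodes the "balance should not fall below 100" constraint, and the `transfer` function is a hypothetical helper written for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: CREATE defines the schema, including a consistency constraint.
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        branch_name    TEXT NOT NULL,
        balance        INTEGER NOT NULL CHECK (balance >= 100)
    )
    """
)

# DML: INSERT adds rows; TCL: COMMIT makes them permanent.
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", "Kalawad Road", 500), ("A-215", "Maninagar", 700)],
)
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one transaction."""
    try:
        # DML: two UPDATE statements form one logical unit of work.
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE account_number = ?",
            (amount, source),
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE account_number = ?",
            (amount, target),
        )
        conn.commit()      # TCL: both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # TCL: undo everything since the last COMMIT
        return False


def balances(conn):
    """DML/DQL: SELECT retrieves the current data."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))


ok = transfer(conn, "A-101", "A-215", 100)       # allowed: A-101 keeps 400
rejected = transfer(conn, "A-101", "A-215", 350)  # would leave 50, violates CHECK
print(ok, rejected, balances(conn))
```

The second transfer fails the constraint check, so the ROLLBACK leaves both balances as they were after the first transfer.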
The storage manager components include:
• Authorization and integrity manager: tests for the satisfaction of integrity constraints and checks the authority of users to access data.
• Transaction manager: ensures that concurrent transaction executions proceed without conflicting.
• File manager: manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
• Buffer manager: is responsible for fetching data from disk storage into main memory.

The Query Processor
The query processor components include:
• DDL interpreter: interprets DDL statements and records the definitions in the data dictionary.
• DML compiler: translates DML statements into low-level instructions that the query evaluation engine understands. A query can usually be translated into any of a number of alternative evaluation plans that all give the same result. The DML compiler also performs query optimization, that is, it picks the lowest-cost evaluation plan from among the alternatives.
• Query evaluation engine: executes the low-level instructions generated by the DML compiler.

The storage manager implements the following data structures:
• Data files: store the database itself.
• Data dictionary: stores metadata about the structure of the database, in particular the schema of the database.
• Indices: provide fast access to data items that hold particular values.

Statistical database: A statistical database is a database used for statistical analysis purposes.

Application Architectures

Two-tier Architecture
• Database applications are usually partitioned into two or three parts.
• In a two-tier architecture, the application is partitioned into a component that resides at the client machine and invokes database system functionality at the server machine through query language statements. Application program interface standards such as ODBC and JDBC are used for interaction between the client and the server.
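A rough sketch of the two-tier idea follows. It assumes Python's sqlite3 module, whose DB-API interface plays a role comparable to ODBC or JDBC here, and an in-memory SQLite database standing in for the server machine. The function names `open_connection` and `branches_in_city` are hypothetical. The point to notice is that the client-side code itself contains the query language statement.

```python
import sqlite3


def open_connection():
    """Stand-in for connecting to the database server (assumption: in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE branch (branch_name TEXT PRIMARY KEY, branch_city TEXT, assets INTEGER)"
    )
    conn.executemany(
        "INSERT INTO branch VALUES (?, ?, ?)",
        [
            ("Kalawad Road", "Rajkot", 7100000),
            ("University Road", "Rajkot", 9000000),
            ("Maninagar", "Ahemdabad", 400000),
        ],
    )
    conn.commit()
    return conn


def branches_in_city(conn, city):
    """Client-side code: sends a query statement straight to the database."""
    rows = conn.execute(
        "SELECT branch_name FROM branch WHERE branch_city = ? ORDER BY branch_name",
        (city,),
    )
    return [name for (name,) in rows]


conn = open_connection()
print(branches_in_city(conn, "Rajkot"))  # ['Kalawad Road', 'University Road']
```

Since every client holds its own connection and its own copy of the SQL, a change to the table layout means updating every client program.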
Three-tier Architecture
• In a three-tier architecture, the client machine acts merely as a front end and does not contain any direct database calls.
• Instead, the client end communicates with an application server, usually through a forms interface.
• The application server in turn communicates with a database system to access data.
• The business logic of the application, which says what actions to carry out under what conditions, is embedded in the application server instead of being distributed across multiple clients.
• Three-tier applications are more appropriate for large applications and for applications that run on the World Wide Web (WWW).

Advantages of 2-tier architecture:
• Understanding and maintenance are easier.

Disadvantages of 2-tier architecture:
• Performance is reduced when there are more users.

Advantages of 3-tier architecture over 2-tier architecture:
• Easy to modify without affecting other modules
• Fast communication
• Good performance

Figure: Database System Architectures

Database Users and Administrators
There are four different types of database-system users, differentiated by the way they expect to interact with the system:
I. Naive users
II. Application programmers
III. Sophisticated users
IV. Specialized users

I. Naive users
• Naive users interact with the system by invoking one of the application programs that have been written previously.
• For example, a bank teller who needs to transfer $50 from account A to account B invokes a program called transfer.
• This program asks the teller for the amount of money to be transferred, the account from which the money is to be transferred, and the account to which the money is to be transferred.
• The typical user interface for naive users is a forms interface, where the user can fill in the appropriate fields of the form. Naive users may also simply read reports generated from the database.

II.
Application programmers
• Application programmers are computer professionals who write application programs.
• Application programmers can choose from many tools to develop user interfaces.
• Some programming languages combine imperative control structures (for example, for loops, while loops, and if-then-else statements) with statements of the data-manipulation language.
• These languages are sometimes called fourth-generation languages.

III. Sophisticated users
• Sophisticated users interact with the system without writing programs. Instead, they form their requests in a database query language.
• They submit each such query to a query processor, whose function is to break down DML statements into instructions that the storage manager understands.
• Analysts who submit queries to explore data in the database fall into this category.

IV. Specialized users
• Specialized users are sophisticated users who write specialized database applications that do not fit into the traditional data-processing framework.
• Among these applications are computer-aided design systems, knowledge-base and expert systems, and systems that store data with complex data types.
• Examples of complex data types are graphics data and audio data.

Database Administrator
• A person who has central control over the system is called a database administrator (DBA).
• A database administrator is a knowledgeable person who is responsible for the physical design and management of the database.
• In short, the DBA is the person responsible for defining, updating, and controlling access to a database.

The functions of a DBA include:
• Schema definition: The DBA creates the original database schema by executing a set of data definition statements in the DDL.
• Storage structure and access-method definition: The DBA defines the storage structures and access methods.
• Schema and physical-organization modification: The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization, or to alter the physical organization to improve performance.
• Granting of authorization for data access: The DBA decides which users can access the data in the database and what limits apply to each user for reading and writing that data.
• Routine maintenance:
  - Periodically backing up the database
  - Ensuring that enough free disk space is available for normal operations
  - Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive tasks submitted by some users
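Two of the routine maintenance tasks, backing up the database and checking free disk space, can be sketched as follows. This is a minimal illustration and not a production procedure. It assumes Python's sqlite3 module (whose `Connection.backup` method is available from Python 3.7) and `shutil.disk_usage`. The helper names `backup_database` and `enough_free_space` are hypothetical, and the backup is written to a second in-memory database only to keep the sketch self-contained.

```python
import shutil
import sqlite3


def backup_database(source):
    """Copy the whole `source` database into a new in-memory database."""
    target = sqlite3.connect(":memory:")
    source.backup(target)  # copies every page of the source database
    return target


def enough_free_space(path, required_bytes):
    """Return True if the disk holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


# A small "live" database to back up.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
live.execute("INSERT INTO account VALUES ('A-101', 500)")
live.commit()

copy = backup_database(live)
print(copy.execute("SELECT * FROM account").fetchall())  # [('A-101', 500)]
print(enough_free_space(".", 1024))
```

In practice a DBA would write the backup to a file on separate storage and run both checks on a schedule.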