Database Management System II
Lesson 1: Overview of Database System Concepts
Nurudeen Mohammed, PhD
UPSA IT Department

Outline
• DBMS?
• Types of DBMS
• Database-System Applications
• Benefits and Challenges of DBMS
• Data Abstraction
• Data Independence
• Database Engine
• Database Architecture
• Database Users and Administrators
• History of Database Systems

DBMS
• Data?
  – Raw facts about an object or entity under consideration that can be recorded.
  – Formats: text, images, voice, video, numbers, and so on.
• Database?
  – A collection of related data that represents some aspect of the real world.
  – An organized collection of data, generally stored and accessed electronically from a computer system.
• DBMS?
  – Software for storing and retrieving users' data while applying appropriate security measures.
  – A collection of interrelated data.
  – A set of programs to access the data.
  – An environment that is both convenient and efficient for users.
• What kind of data should a DBMS manage?
  – Highly valuable
  – Relatively large
  – Accessed by multiple users and applications, often at the same time

DBMS Software
• MySQL
• Microsoft Access
• Oracle
• PostgreSQL
• dBASE
• FoxPro
• SQLite
• IBM DB2
• LibreOffice Base
• MariaDB
• Microsoft SQL Server

Types of DBMS
Hierarchical DBMS
• Data is organized in a tree-like structure.
• Data is stored hierarchically (top-down or bottom-up).
• Data is represented using parent-child relationships.
• A parent may have many children, but each child has only one parent.

Types of DBMS
Network Model
• Each child may have multiple parents.
• This allows many-to-many relationships.
• Entities are organized in a graph, and there can be several paths to each entity.

Types of DBMS
Relational Model
• It is the most widely used DBMS model.
• It is based on normalizing data into the rows and columns of tables.
• Data is stored in fixed structures and manipulated using SQL.

Types of DBMS
Object-Oriented Model
• In the object-oriented model, data is stored in the form of objects.
• Objects are grouped into classes, which present the data held within them.
• It defines a database as a collection of objects that store both data values and operations.

Schematic Diagram of DBMS
[Figure: schematic diagram of a DBMS]

Database-System Applications
• Banking applications, for:
  – customer information
  – account activities
  – payments
  – deposits
  – loans

Database-System Applications
• Flight ticket reservation
• Booking a cab (Uber, Yango, Bolt)
• Hotel reservation
• Hospital appointments
• Consular services
• Online restaurants

Other Applications of DBMS
• Universities: student information, course registrations, colleges, and grades.
• Telecommunication: keeping call records, monthly bills, maintaining balances, etc.
• Finance: storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds.
• Sales: storing customer, product, and sales information.
• Manufacturing: managing the supply chain, tracking production of items, and tracking inventory status in warehouses.
• HR Management: information about employees, salaries, payroll, deductions, generation of paychecks, etc.

Benefits of DBMS
• The DBMS was designed to solve the fundamental problems associated with storing, managing, accessing, securing, and auditing data in traditional file systems.
• Traditional data applications were built directly on top of file systems, which led to challenges such as:
  – Data redundancy
  – Data isolation
  – Difficulty enforcing integrity constraints
  – Difficulty managing data access

Challenges of DBMS
• The cost of hardware and software for a DBMS is quite high.
• Most database management systems are complex, so users need training before they can use them.
• In some organizations, all data is integrated into a single database, which can be damaged by a power failure.
• Simultaneous use of the same program by many users can sometimes lead to the loss of some data.
• A DBMS is generally not well suited to performing sophisticated calculations.

Data Abstraction
• A major purpose of a database system is to provide users with an abstract view of the data.
• There are three levels of data abstraction:
  – External schema
  – Conceptual schema
  – Internal schema
• Data abstraction allows us to achieve data independence.

Data Independence
• Data independence is the ability to modify the schema at one level of the database system without altering the schema at the next higher level.
• There are two types of data independence, based on the three levels of data abstraction.
• Physical data independence
  – The ability to modify the physical schema without changing the logical schema.
  – Changing the location of a database should not change the structure of our tables.
  – Changing data structures or access methods, or modifying indexes or file organization techniques, are changes to the physical schema; they should not affect the logical (conceptual) schema.

Data Independence
• Logical data independence
  – The ability to modify the logical schema without affecting the applications or the external schema.
  – E.g., modifying attributes or entities in a table should not require modifications to user applications or the external schema.
  – Likewise, changing the relationships between tables, merging or dividing records, or altering table structure should not require modifications to user applications or the external schema.
• In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

Data Independence
Benefits:
• Performance can be improved, because the physical and logical schemas can each be changed without complications at the other levels.
• Modifications to data structures do not require modifications to application programs.
• Implementation details can be hidden from the user, so the user is shown only the data relevant to them.
  All other data is hidden from the user for better user interaction.

Database Engine
• A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
• The functional components of a database system can be divided into:
  – The storage manager
  – The query processor component
  – The transaction management component

Storage Manager
• A program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
• The storage manager is responsible for the following tasks:
  – Interaction with the OS file manager
  – Efficient storing, retrieving, and updating of data
• The storage manager components include:
  – Authorization and integrity manager
  – Transaction manager
  – File manager
  – Buffer manager

Storage Manager (Cont.)
• The storage manager implements several data structures as part of the physical system implementation:
  – Data files -- store the database itself.
  – Data dictionary -- stores metadata about the structure of the database, in particular the schema of the database.
  – Indices -- provide fast access to data items. A database index provides pointers to those data items that hold a particular value.

Query Processor
• The query processor components include:
  – DDL interpreter -- interprets DDL statements and records the definitions in the data dictionary.
  – DML compiler -- translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. The DML compiler also performs query optimization; that is, it picks the lowest-cost evaluation plan from among the various alternatives.
  – Query evaluation engine -- executes the low-level instructions generated by the DML compiler.

Query Processing
1. Parsing and translation
2. Optimization
3.
Evaluation

Transaction Management
• A transaction is a collection of operations that performs a single logical function in a database application.
• The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
• The concurrency-control manager controls the interaction among concurrent transactions to ensure the consistency of the database.

Database Architecture
• Centralized databases
  – One to a few cores, shared memory
• Client-server
  – One server machine executes work on behalf of multiple client machines.
• Parallel databases
  – Many-core shared memory
  – Shared disk
  – Shared nothing
• Distributed databases
  – Geographical distribution
  – Schema/data heterogeneity

Database Architecture (Centralized/Shared-Memory)
[Figure: centralized/shared-memory database architecture]

Database Applications
Database applications are usually partitioned into two or three parts:
• Two-tier architecture -- the application resides at the client machine, where it invokes database system functionality at the server machine.
• Three-tier architecture -- the client machine acts as a front end and does not contain any direct database calls. The client communicates with an application server, usually through a forms interface. The application server in turn communicates with a database system to access data.

Two-tier and three-tier architectures
[Figure: two-tier and three-tier architectures]

Database Users

Database Administrator
A person who has central control over the system is called a database administrator (DBA).
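Side note on the Transaction Management slide above: the idea that a transaction "performs a single logical function" can be sketched with a bank transfer. This is a minimal illustration using Python's standard sqlite3 module, not a description of how any particular DBMS implements its transaction manager. The `account` table, the `open_bank`, `transfer`, and `balances` helpers, and the sample balances are all hypothetical names invented for this sketch.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    try:
        # `with conn:` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, so a failed debit leaves partial work to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects an overdraft with IntegrityError.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account"))


conn = open_bank()
ok = transfer(conn, 1, 2, 30)       # enough funds: both updates are committed
failed = transfer(conn, 1, 2, 500)  # overdraft: the credit is rolled back too
print(ok, failed, balances(conn))
```

The second transfer credits account 2 before the debit on account 1 violates the constraint. Because both updates belong to one transaction, the rollback undoes the credit as well, so the database stays in a consistent state.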
Functions of a DBA include:
• Schema definition
• Storage structure and access-method definition
• Schema and physical-organization modification
• Granting of authorization for data access
• Routine maintenance
  – Periodically backing up the database
  – Ensuring that enough free disk space is available for normal operations, and upgrading disk space as required
  – Monitoring jobs running on the database

History of Database Systems
• 1950s and early 1960s:
  – Data processing using magnetic tapes for storage
    – Tapes provided only sequential access
  – Punched cards for input
• Late 1960s and 1970s:
  – Hard disks allowed direct access to data
  – Network and hierarchical data models in widespread use
  – Ted Codd defines the relational data model
    – Would win the ACM Turing Award for this work
    – IBM Research begins System R prototype
    – UC Berkeley (Michael Stonebraker) begins Ingres prototype
    – Oracle releases first commercial relational database
  – High-performance (for the era) transaction processing

History of Database Systems (Cont.)
• 1980s:
  – Research relational prototypes evolve into commercial systems
    – SQL becomes industry standard
  – Parallel and distributed database systems
    – Wisconsin, IBM, Teradata
  – Object-oriented database systems
• 1990s:
  – Large decision support and data-mining applications
  – Large multi-terabyte data warehouses
  – Emergence of Web commerce

History of Database Systems (Cont.)
• 2000s:
  – Big data storage systems
    – Google BigTable, Yahoo PNuts, Amazon
    – "NoSQL" systems
  – Big data analysis: beyond SQL
    – MapReduce and friends
• 2010s:
  – SQL reloaded
    – SQL front end to MapReduce systems
    – Massively parallel database systems
    – Multi-core main-memory databases

END
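Worked example: logical data independence. The Data Independence slides say that dividing records or altering table structure should not force changes in user applications. One common way to get this, assumed here for illustration, is to let the application read through a view that plays the role of the external schema. The sketch uses Python's standard sqlite3 module. The names `student`, `student_name`, `student_email`, `student_view`, `APP_QUERY`, and `app_report`, and the sample rows, are all hypothetical.

```python
import sqlite3

# The only SQL the "application" knows; it never changes in this example.
APP_QUERY = "SELECT name, email FROM student_view ORDER BY id"


def app_report(conn):
    """Application code: reads through the view (the external schema)."""
    return conn.execute(APP_QUERY).fetchall()


conn = sqlite3.connect(":memory:")

# Original logical schema: one table, exposed to the application via a view.
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO student VALUES (1, 'Ama', 'ama@example.com');
    INSERT INTO student VALUES (2, 'Kofi', 'kofi@example.com');
    CREATE VIEW student_view AS SELECT id, name, email FROM student;
""")
before = app_report(conn)

# Logical schema change: divide the records across two tables,
# then redefine the view so it presents the same columns as before.
conn.executescript("""
    CREATE TABLE student_name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_email (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO student_name SELECT id, name FROM student;
    INSERT INTO student_email SELECT id, email FROM student;
    DROP VIEW student_view;
    DROP TABLE student;
    CREATE VIEW student_view AS
        SELECT n.id, n.name, e.email
        FROM student_name AS n
        JOIN student_email AS e ON e.id = n.id;
""")
after = app_report(conn)

print(before == after)
```

The application query is identical before and after the table is split. Only the view definition, which sits between the logical schema and the application, had to be updated.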