Relational Database
By T. El-Shishtawy, Prof. Ass. of Computer Engineering

Chapter 1: The Database Environment
Modern Database Management, 6th Edition
Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden

Definitions
Data:
– Meaningful facts, text, graphics, images, sound, video segments
– Data are raw facts that constitute the building blocks of information
Information:
– Data processed to be useful in decision making
Metadata:
– Data that describes data
– The data in a DBMS can be broadly classified into two types: the collection of information needed by the organization, and “metadata”, which is the information about the database.

Figure 1-1a Data in Context
Large volume of facts, difficult to interpret or to base decisions on

Figure 1-1b Summarized data
Useful information that managers can use for decision making and interpretation

Table 1-1 Metadata
Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and documentation

Database
An organized collection of logically related data
– A database is an organized collection of data that are related in a meaningful way
– Database systems are systems in which the interpretation and storage of information are of primary importance
– Several users can access the data in an organization, yet the integrity of the data must still be maintained.
– A database is integrated when the same information is not recorded in two places.
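
The difference between data, information, and metadata from the Definitions slide can be made concrete with a small sketch. It assumes SQLite through Python's standard `sqlite3` module; the `student` table, its columns, and its rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table, column names, and rows, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student (id, name, gpa) VALUES (?, ?, ?)",
    [(1, "Amal", 3.6), (2, "Omar", 2.9), (3, "Sara", 3.2)],
)

# Data: the raw facts stored in the table.
data = conn.execute("SELECT id, name, gpa FROM student ORDER BY id").fetchall()

# Information: data processed into a form useful for a decision.
(average_gpa,) = conn.execute("SELECT AVG(gpa) FROM student").fetchone()

# Metadata: data that describes the data (column names and declared types).
# PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(data)
print(round(average_gpa, 2))
print(metadata)
```

The same connection returns all three: the rows are the data, the average is a piece of information derived from them, and the column descriptions are metadata kept by the DBMS rather than by the program.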

Disadvantages of File Processing
Prior to DBMS, the file system provided by the OS was used to store information
Program-Data Dependence
– All programs maintain metadata for each file they use
Data Redundancy (Duplication of data)
– Different systems/programs have separate copies of the same data
Limited Data Sharing, Incompatible File Formats
– No centralized control of data
Lengthy Development Times
– Programmers must design their own file formats
Excessive Program Maintenance
– 80% of the information systems budget

Figure 1-2 Three file processing systems at Pine Valley Furniture
Duplicate Data

Problems with Data Dependency
Each application programmer must maintain their own data
Each application program needs to include code for the metadata of each file
Each application program must have its own processing routines for reading, inserting, updating and deleting data
Lack of coordination and central control
Non-standard file formats

Problems with Data Redundancy
Waste of space to have duplicate data
Causes more maintenance headaches
The biggest problem:
– When data changes in one file, the other copies can become inconsistent with it
– Compromises data integrity

Incompatible File Formats
The structure of the file depends on the application programming language.
– For example, the structure of a file generated by a FORTRAN program may be different from the structure of a file generated by a C program.
– The incompatibility of such files makes them difficult to process jointly

SOLUTION: The DATABASE Approach
Central repository of shared data
Data is managed by a controlling agent
Stored in a standardized, convenient form
Requires a Database Management System (DBMS)

Database Management System
A DBMS is a data storage and retrieval system which permits data to be stored nonredundantly while making it appear to the user as if the data is well-integrated.
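
The redundancy problem and the shared-repository solution described above can be sketched in a few lines. This is only an illustration in plain Python: the dictionaries stand in for application-owned files and for a central store, and every name and value here is hypothetical.

```python
# Hypothetical "file processing" setup: two programs keep private copies
# of the same customer record (names and values are illustrative only).
billing_file = {101: {"name": "Pine Valley", "city": "Cairo"}}
shipping_file = {101: {"name": "Pine Valley", "city": "Cairo"}}


def update_city_in_billing(customer_id, new_city):
    """The billing program updates only the copy it owns."""
    billing_file[customer_id]["city"] = new_city


update_city_in_billing(101, "Alexandria")

# The two copies now disagree: this is data inconsistency.
copies_disagree = billing_file[101]["city"] != shipping_file[101]["city"]

# Database approach: one shared copy that every program reads and updates.
shared_customers = {101: {"name": "Pine Valley", "city": "Cairo"}}


def update_city(customer_id, new_city):
    """Any program updates the single shared record."""
    shared_customers[customer_id]["city"] = new_city


def city_for_billing(customer_id):
    return shared_customers[customer_id]["city"]


def city_for_shipping(customer_id):
    return shared_customers[customer_id]["city"]


update_city(101, "Alexandria")
shared_views_agree = city_for_billing(101) == city_for_shipping(101)

print(copies_disagree, shared_views_agree)
```

With separate copies, one update leaves the other copy stale. With a single stored copy, both programs see the change, so there is nothing left to disagree.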

Database Management System
A DBMS consists of:
– A collection of interrelated and persistent data. This part of the DBMS is referred to as the database (DB).
– A set of application programs used to access, update, and manage data. This part constitutes the data management system (MS).
A DBMS is general-purpose software, i.e., not application specific. The same DBMS (e.g., Oracle, Sybase, etc.) can be used in a railway reservation system, library management, a university, etc.
A DBMS takes care of storing and accessing data, leaving only application-specific tasks to application programs.

Database Management System
[Figure: Application #1, Application #2, and Application #3 all access the DBMS, which manages a database containing centralized shared data]
DBMS manages data resources like an operating system manages hardware resources

Advantages of Database Approach
Program-Data Independence
– Metadata stored in DBMS, so applications don’t need to worry about data formats
– Data queries/updates managed by DBMS so programs don’t need to process data access routines
– Results in: increased application development and maintenance productivity
Minimal Data Redundancy
– Leads to increased data integrity/consistency

Advantages of Database Approach
Improved Data Sharing
– Different users get different views of the data
Enforcement of Standards
– All data access is done in the same way
Improved Data Quality
– Constraints, data validation rules
Better Data Accessibility/Responsiveness
– Use of standard data query language (SQL)
Security, Backup/Recovery, Concurrency
– Disaster recovery is easier

Notes
Data redundancy – duplication of data. Redundant data occupies more space, hence it is not desirable.
Data independence – independence between the application program and the data. The advantage is that when the data representation changes, it is not necessary to change the application program.
Data inconsistency – different copies of the same data have different values.
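
Data independence and user-specific views can be illustrated together. The sketch below assumes SQLite through Python's `sqlite3` module; the `employee` table, the `employee_public` view, and the function name are hypothetical. The application reads only through the view, so a change to the underlying table does not force a change to the program.

```python
import sqlite3

# Hypothetical schema: a base table and a view that hides the salary column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.execute("INSERT INTO employee (id, name, salary) VALUES (1, 'Mona', 9000.0)")
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")


def list_employee_names(connection):
    """Application code: depends only on the view, not on the table layout."""
    rows = connection.execute("SELECT name FROM employee_public ORDER BY id")
    return [name for (name,) in rows]


before = list_employee_names(conn)

# The data representation changes: the designer adds a column to the table.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")

# The application function is unchanged and still returns the same result.
after = list_employee_names(conn)

# The view exposes only the columns it was defined with.
view_columns = [
    column[0] for column in conn.execute("SELECT * FROM employee_public").description
]

print(before, after, view_columns)
```

Users who query `employee_public` never see `salary`, which is one way a view can support security as well as sharing.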

Notes
Centralizing the data – data can be easily shared between the users, but the main concern is data security.
The main threat to data integrity comes from several different users attempting to update the same data at the same time.
Support for multiple views means the DBMS allows different users to see different “views” of the database, according to the perspective each one requires. This concept is used to enhance the security of the database.

Costs and Risks of the Database Approach
Up-front costs:
– Installation Management Cost and Complexity
– Conversion Costs
Ongoing Costs
– Requires New, Specialized Personnel
– Need for Explicit Backup and Recovery
Organizational Conflict
– Old habits die hard

Data Abstraction
The main objective of a DBMS is to store and retrieve information efficiently.
It is not necessary for the users to know physical database storage details
The developers hide the complexity from users through several levels of abstraction
– 1. Physical level or internal level
– 2. Logical level or conceptual level
– 3. View level or external level

Data Abstraction (figure)

Database Schema
The overall design of the database is called the database schema
A schema can contain tables, views, triggers, functions, packages, and other objects.
Physical schema
– Describes the database design at the physical level
Logical schema
– Describes the database design at the logical level
Subschema
– Describes different views of the database

Data Models
A data model is a collection of conceptual tools for describing data, relationships between data, and consistency constraints

Figure 1-3 Segment from enterprise data model

People Interacting with Database (figure)

Database Administrator
A person having central control over data and programs accessing that data.
The objectives of the database administrator are:
– To control the database environment
– To standardize the use of the database and associated software
– To support the development and maintenance of the database.
The responsibilities of the database administrator are:
– Authorizing access to the database.
– Coordinating and monitoring its use.
– Acquiring hardware and software resources as needed.
– Backup and recovery.

Database Designer
A database designer can be either a logical database designer or a physical database designer.
The logical database designer is concerned with identifying the data, the relationships between the data, and the constraints on the data
The physical database designer takes the logical data model and decides the way in which it can be physically implemented

Database Users
Database users are the people who need information from the database to carry out their business.
They can be classified into:
– application programmers
– end users
Application programmers write application programs and interact with the database through SQL and a host language such as Pascal, C, or COBOL

Data Dictionary or System Catalog
It contains information about the database:
– tables,
– the fields of the tables,
– data types,
– primary keys, foreign keys,
– indexes
It can be considered as a file that stores metadata
One of its major functions is to enforce the constraints placed upon the database by the designer, such as referential integrity and cascade delete.

Functional Components of Database System Structure
1. Storage manager
– It is responsible for storing, retrieving, and updating data in the database.
Storage manager components are:
– 1. Authorization and integrity manager
– 2. Transaction manager
– 3. File manager
– 4. Buffer manager
2. Query processor

1- Authorization and Integrity Manager
Checks the integrity constraints and authority of users to access data.
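
Referential integrity and cascade delete, the two constraints named on the data dictionary slide, can be seen in a small sketch. It assumes SQLite through Python's `sqlite3` module, where foreign-key enforcement has to be switched on per connection; the `customer` and `orders` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each order must reference an existing customer,
# and deleting a customer deletes that customer's orders (cascade delete).
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id) ON DELETE CASCADE)"
)
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Pine Valley')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")

# Referential integrity: an order for a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascade delete: removing the customer also removes the dependent order.
conn.execute("DELETE FROM customer WHERE id = 1")
(remaining_orders,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()

print(orphan_rejected, remaining_orders)
```

The application never checks the constraint itself. The rule is declared once in the schema and the DBMS rejects or propagates changes accordingly.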
The DBMS can refuse to store or update data due to:
– Authority violation, or
– Constraint violation

2- Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application.
The transaction-management component ensures that the database remains in a consistent state despite system failures and transaction failures.
The concurrency control manager controls the interaction among concurrent transactions to ensure the consistency of the database

3- File Manager
The file manager manages the allocation of space on disk storage.
The file manager can:
– Create a file
– Delete a file
– Update a record in a file
– Retrieve a record from a file

Database Architecture
Database architecture can be broadly classified into two-tier, three-tier, and multitier architecture.
Two-Tier Architecture
– Two-tier architecture is a client–server architecture
– The client contains the presentation code and the SQL statements for data access.
– The database server processes the SQL statements and sends query results back to the client

Figure 1-8 Client server

Two-Tier Architecture
Advantages of Two-tier Architecture
– It is a good approach for a moderate number of clients.
– It is the simplest to implement, due to the number of good commercial development environments.
Drawbacks of Two-tier Architecture
– Software maintenance can be difficult because PC clients contain a mixture of presentation, validation, and business logic code.
– Performance can be poor when a large number of clients submit requests, because the database server may be overwhelmed with managing messages.

Three-tier Architecture
Provides greater application scalability, lower maintenance, and increased reuse of components.
Through standard tiered interfaces, services are made available to the application.
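
The separation of responsibilities in a three-tier design can be sketched as three groups of functions. This is a simplification under stated assumptions: in a real deployment the tiers usually run as separate processes or machines, whereas here they share one Python process, SQLite stands in for the database server, and all table and function names are hypothetical.

```python
import sqlite3


# --- Data tier: the only layer that issues SQL. ---
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO product (id, name, price) VALUES (?, ?, ?)",
        [(1, "Desk", 250.0), (2, "Chair", 80.0)],
    )
    return conn


def fetch_product(conn, product_id):
    """Return (name, price) for a product, or None if it does not exist."""
    return conn.execute(
        "SELECT name, price FROM product WHERE id = ?", (product_id,)
    ).fetchone()


# --- Application tier: validation and business rules, no SQL, no formatting. ---
def quote_price(conn, product_id, quantity):
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    row = fetch_product(conn, product_id)
    if row is None:
        raise LookupError(f"unknown product {product_id}")
    name, price = row
    return name, price * quantity


# --- Presentation tier: formatting for the user only. ---
def render_quote(conn, product_id, quantity):
    name, total = quote_price(conn, product_id, quantity)
    return f"{quantity} x {name}: {total:.2f}"


conn = open_database()
print(render_quote(conn, 2, 3))  # prints "3 x Chair: 240.00"
```

Unlike the two-tier client described earlier, the presentation code here contains no SQL and no validation, so each tier can be changed or reused without touching the others.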
Multiple-tier architectures provide more flexibility in the division of processing.

Three-tier Architecture (figure)

Figure 1-5 Client/server system for Pine Valley Furniture Company

Multitier Architecture
An N-tier implementation employs a three-tier logical architecture superimposed on a distributed physical model
Application servers can access other application servers in order to supply services to the client application as well as to other application servers

Multitier Architecture (figure)

The Range of Database Applications
Personal Database – standalone desktop database
Workgroup Database – local area network (<25 users)
Department Database – local area network (25-100 users)
Enterprise Database – wide-area network (hundreds or thousands of users)

Figure 1-7 Typical data from a personal computer database

Figure 1-9 An enterprise data warehouse

Components of the Database Environment
CASE Tools – computer-aided software engineering
Repository – centralized storehouse of metadata
Database Management System (DBMS) – software for managing the database
Database – storehouse of the data
Application Programs – software using the data
User Interface – text and graphical displays to users
Data Administrators – personnel responsible for maintaining the database
System Developers – personnel responsible for designing databases and software
End Users – people who use the applications and databases

Figure 1-10 Components of the database environment

Evolution of DB Systems
Flat files – 1960s-1980s
Hierarchical – 1970s-1990s
Network – 1970s-1990s
Relational – 1980s-present
Object-oriented – 1990s-present
Object-relational – 1990s-present
Data warehousing – 1980s-present
Web-enabled – 1990s-present