DATABASE CONCEPTS Dr. Awad Khalil Computer Science & Engineering Department AUC Databases are considered as major components in almost all recent computer application systems, including business, management, engineering, education, medicine, science, ... etc. Database technology has a major impact on the growing use of computer systems. What is a Database? A database is simply a collection of related data. Data mean known facts that can be recorded and that have implicit meaning. Database Properties 1. A database is a logically coherent collection of data with some inherent meaning. A random collection of data cannot be considered as a database. 2. A database represents aspects of a real system in real world. Examples of these real systems : A hospital A bank A factory A company A university 3. A database is designed and implemented to afford the informational needs of a specific group of users. These informational needs are supported by a set of application systems running against the implemented database. Example I Consider a small personal database to maintain data concerning the names, telephone numbers, and addresses of the people you know TELEPHONE DIRECTORY NAME Ahmed Aly Badran Sadek Salah PHONE NO 2451615 2467295 2977251 3401312 3409123 1 ADDRESS Heliopolis Heliopolis Nozah Zamalek Dokkie Example II Consider a suppliers-and-parts database to maintain data concerning suppliers supplying parts for a certain company. SUPPLIER SNO S1 S2 S3 S4 S5 Sname Ahmed Badran Aly Saleh Sadek Status 20 10 10 30 20 City Cairo Cairo Alex. Tanta Cairo PNO P1 P2 P3 P4 P5 P6 Pname Nut Bolt Screw Cam Cog Screw Color Red Green Blue Red Black Black Weight 12 15 15 17 20 14 PART SUPPLY SNO S1 S1 S1 S2 S2 S4 S4 S5 PNO P1 P2 P3 P1 P3 P2 P3 P2 QTY 100 200 100 150 100 200 300 100 Database System Environment In a database system environment, four main components can be recognized 1. Database. 2. Database Management System (DBMS). 3. Application programs. 4. Users. 2 Users/Programmers DATABASE SYSTEM Application Programs/Queries DBMS Software Software to Process Queries/Programs Software to Access Stored Data Stored Database Definition (Meta-Data) Stored Database (1) The Database Data in the database will be both integrated and shared . Database management system (DBMS) Database Application programs End-users STUDENT Name Address ENROLLMENT Name Course Department ........... 3 ........... (2) The Database Management System (DBMS) The Database Management System (DBMS) is a general-purpose software that enables users to create and maintain a database. The DBMS facilitates the process of defining, constructing, and manipulating databases for various applications. 1. Defining: Data Definition Language (DDL). 2. Constructing. 3. Manipulating: Data Manipulation Language (DML). DBMS Layers 1. Software to process Queries and Programs: (DML). The language SQL (Structured Query Language) is a typical example of a database query language. 2. Software to access stored data (3) The Application Programs These are the programs written to support the end users requirements. A given end user can access the database via one of the online applications, where he or she operates by choosing items from a menu or filling in items on a form. Such menu- or forms-driven interfaces tend to be easier to use for people who do not have a formal training in data processing. Personnel users Personnel Application Common Database Payroll users Payroll Application (4) USERS Database Users Professional Users Database Designers Database Adnibistrators End Users System Analysts & Application Programmers Casual end users 4 Parametric end users Sophisticated end users Stand-alone end users Characteristics of the Database Approach 1. 2. 3. 4. Self-describing nature of a database Program-data independence and Data Abstraction Support multiple views of the data Sharing of data and multiuser transaction processing Personnel files Personnel Application Payroll files Payroll Application Personnel users Payroll users Traditional file processing approach Personnel Application Personnel users Common Database Payroll Application The Database approach. 5 Payroll users DATA STORING APPROACHES 1. The File-based Approach Each application has its own set of files. Disadvantages: Inflexibility: a “mass production facilty” – committed to processing particular queries. Uncontrolled Redundancy: If these separate applications need to process the same data then must duplicate copies of the data i.e., each application has its own data files, several copies of the same data may exist in different applications. This leads to: Wastage of valuable storage space. Need to input data to several files. Data inconsistency (one fact may have more than one value – various versions may occur). Poor Enforcement of System Standards: Data names, formats, access restrictions, … etc. are not standardized across an organization, may have many synonyms and homonyms. This makes modifications difficult and hinders sharing of data. Limited Data Sharing: Each application has its own private file providing little opportunity for users to share existing data. Additionally any new applications would not be able to use existing files leading to low productivity. Progran – Data Dependency: Descriptions of files, records, data items are embedded within application programs. Any modification to a data file requires that the application programs using that file must also be changed. In other words, program maintenance will be excessive. 2. The Database Approach The database approach improves upon file-based systems. A DBMS (Data Base Management System) does not fragment data into separate files but regards data as being stored in a large concptual repository termed as database. The DBMS handles the addition, storage, update, and retrieval of data. DBMSs are based on semantically rich data models, which can accurately represent real world data. DBMSs allow: Persistence of Data Transaction Control Concurrency Control Recovery Control Querying Integrity Control Data Security Version Control Performance Tuning Components of DBMS DBMS Engine Interface Subsystem (DDL, DML, DCL, Graphical User Interface, Forms Interface, Natural Language Interface) Data Dictionary Subsystem Performance Management Subsystem Concurrency Control Subsystem 6 Data Integrity Management Subsystem Backup and Recovery Subsystem Application Development Subsystem Security Management Subsystem Benefits of the Database Approach Ease of application development Minimal data redundancy Enforcement of standards Data can be shared Physical data independence Logical data independence Better modeling of real world data Uniform security and integrity controls Economy of scale Risks of the Database Approach New specialized personnel Need for explicit backup Organizational conflict Large size Expensive High impact of failure Database Modelling Database Structure A database structure is the description and definition of all basic structures such as simple conceptual files, datatypes, relationships, and constraints that should hold on the data. In any data model it is important to distinguish between the description of the database (Schema) and the database itself (Instance). Database Schema: The description of a database is called the database schema (or the meta-data). A database schema is specified during database design and is not expected to change frequently. The following is a simplified database schema diagram: 7 A Database Schema in SQL CREATE TABLE EMPLOYEE (FNAME VARCHAR(15) NOT NULL, MINIT CHAR, LNAME VARCHAR(15) NOT NULL, SSN SSN_TYPE NO T NUL L , BDATE DATE ADDRESS VARCHAR(30), SEX CHAR, SALARY DECIMAL(10,2), SUPERSSN SSN_TYPE, DNO INT NO T NUL L , PRIMARY KEY (SSN), FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN), FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER)); CREATE TABLE DEPARTMENT (DNAME VARCHAR(15) NOT NULL, DNUMBER INT, NO T NUL L , MGRSSN SSN_TYPE NO T NUL L , MGRSTARTDATE DATE PRIMARY KEY (DNUMBER), UNIQUE (DNAME) FOREIGN KEY (MGRSSN) REFERENCES EMPLOYEE(SSN)); CREATE TABLE DEPT_LOCATIONS (DNUMBER INT NO T NUL L , DLOCATION VARCHAR(15) NOT NULL, PRIMARY KEY (DNUMBER, DLOCATION), FOREIGN KEY (DNUMBER) REFERENCES DEPARTMENT(DNUMBER)); CREATE TABLE PROJECT (PNAME VARCHAR(15) NO T NUL L , PNUMBER INT NO T N UL L , PLOCATION VARCHAR(15) DNUM INT NO T NUL L , PRIMARY KEY (PNUMBER), UNIQUE (PNAME) FOREIGN KEY (DNUM) REFERENCES DEPARTMENT(DNUMBER)); CREATE TABLE W ORKS_ON (ESSN SSN_TYPE NO T NUL L , PNO INT NO T NUL L , HO UR S DECIMAL(3,1) NOT NULL, PRIMARY KEY (ESSN, PNO), FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN), FOREIGN KEY (PNO) REFERENCES PROJECT(PNUMBER)); CREATE TABLE DEPENDENT (ESSN SSN_TYPE NOT NULL, DEPENDENT_NAME VARCHAR(15) NOT NULL, SEX CHAR, BDATE DATE, RELATIONSHIP VARCHAR(8) PRIMARY KEY (ESSN, DEPENDENT_NAME), FOREIGN KEY (ESSN) REFERENCES EMPLOYEE(SSN)); 8 Database State (Instance) The data in the database at a particular moment of time is called the database state (or instance). The Three-Layer Architecture End Users External Level External View1 External Viewn external/conceptual mapping Conceptual Level Conceptual Schema conceptual/internal mapping Internal Level Internal Schema Stored Database 9 Data Independence Data independence is the capacity to change the schema at one level of a database system without having to change the schema at the next higher level. Defined as: Logical data independence is the capacity to change the conceptual schema without having to change external schemas or Logical data application programs. Physical data independence is the capacity to change the internal schema without having to change the conceptual (or external) schemas. Data Models A data model is a set of concepts that can be used to describe a database structure. Data Models High-level (conceptual) Data Models Relational Implementation Data Models Hierarchical Network Low-level (physical) Data Models Objectoriented Classification of DBMSs Classification according to Data Model DBMSs Relational Hierarchical Network 10 Objectoriented The Relational Data Model The relational data model represents the database as a collection of tables, where each table can be stored as separate file. Examples of commercial relational DBMSs: DB2 from IBM ORACLE from Oracle Corporation Informix from Informix SyBase from OpenSoft SQL Server from Microsoft MS-ACCESS from Microsoft An Example of a Relational Database 11 The Network Data model The network data model represents data as a record types. An example of a network model is known as the CODASYL DBTG model. The Hierarchical Data model The hierarchical data model represents data as hierarchical tree structure. Each hierarchical represents a number of related records. 12 The Object-Oriented Data model The object-oriented data model defines a database in terms of objects, their properties, and their operations. Objects with the same structure and behavior belong to a class, and classes are organized into hierarchies or a cyclic graphs. The operations of each class are specified in terms of predefined procedures called methods. Experimental OO prototypes The ORION system developed at MCC, The OpenOODB system at Texas Instruments, The IRIS system developed at HP laboratories, The ODE system at ATT Bell Labs, and The ENCORE/ObServer project at Brown University. Commercially available OO systems GEM-STONE/OPAL of SerioLogic, ONTOS of Ontologic, Objectivity of Objectivity Inc., Versant of Versant Technologies, ObjectStore of Object Design, and, O2 of O2 Technology. Classification of DBMSs (Cont’d) Classification according to Number of Users Single user systems support only one user at a time and are mostly used with personal computers. Multiuser systems, which include the majority of DBMSs, support many users concurrently. Classification according to Number of Sites Centralized DBMS where the data is stored at a single computer site. Most DBMSs are centralized. A centralized DBMS can support multiple users, but the DBMS and the database themselves reside totally at a single computer site. Distributed DBMS (DDBMS) can have the actual database and DBMS software distributed over many sites, connected by a computer network. Many DDBMSs use a client-server architecture. 13