DATABASE DESIGN
Chapter 2: Database System Concepts and Architecture
Lec. 1
Fatimah Al-Aqeel
Lecture slides
Reference: Elmasri and Navathe, Fundamentals of Database Systems, Fifth Edition

DATA MODELS
Data Model: A set of concepts that describe the structure of a database (data types, entities and relationships) and the constraints that the database should obey.
Structure of a database:
• Entity: Represents a real-world object or concept (e.g., employee).
• Attribute: Represents a property of interest that describes an entity (e.g., salary, name).
• Relationship: An association among two or more entities (e.g., a works-on relationship between an employee and a project).

DATA MODELS
Data Model Operations: Operations for specifying database retrievals and updates by referring to the concepts of the data model.
Operations on the data model may include basic operations and user-defined operations. (An example of a user-defined operation is COMPUTE-GPA, which can be applied to the Student object.)

CATEGORIES OF DATA MODELS
• Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
• Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer.
• Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.
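The entity, attribute, relationship and user-defined operation concepts introduced above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the class names, the 0.0 to 4.0 grade scale and the credit-weighted GPA formula are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Course:
    """Entity: a course, described by its attributes."""
    code: str
    credit_hours: int


@dataclass
class Enrollment:
    """Relationship between a student and a course, carrying a grade."""
    course: Course
    grade_points: float  # assumed 0.0-4.0 scale (hypothetical)


@dataclass
class Student:
    """Entity: a student with a name attribute and related enrollments."""
    name: str
    enrollments: List[Enrollment] = field(default_factory=list)

    def compute_gpa(self) -> float:
        """User-defined operation: credit-weighted mean of grade points."""
        total_hours = sum(e.course.credit_hours for e in self.enrollments)
        if total_hours == 0:
            return 0.0
        weighted = sum(
            e.grade_points * e.course.credit_hours for e in self.enrollments
        )
        return weighted / total_hours


student = Student("Sara")
student.enrollments.append(Enrollment(Course("DB101", 3), 4.0))
student.enrollments.append(Enrollment(Course("MATH101", 1), 2.0))
print(student.compute_gpa())  # 3.5
```

Here `compute_gpa` plays the role of COMPUTE-GPA: it is defined on the Student object and expressed in terms of the model's own concepts (entities, attributes, relationships) rather than storage details.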
CATEGORIES OF DATA MODELS
• Conceptual model: hardware independent, software independent; (high-level, semantic) data models
• Logical model: hardware independent, software dependent; (representational) data models
• Physical model: hardware dependent, software dependent; (low-level, internal) data models

CATEGORIES OF DATA MODELS
• High level (conceptual data model): Provides concepts close to the way many users see data. (Entities, attributes, relationships)
• Representational (implementation data model): Provides concepts that may be understood by end users but are not far from the way the data is organized. The relational data model is widely used.
• Low level (physical data model): Details of how data is stored in the computer, meant for computer specialists. (Record formats, ordering, access paths; an access path makes the search for a record more efficient.)

SCHEMAS VERSUS INSTANCES
• Database Schema: The description of a database, including descriptions of the database structure and the constraints that should hold on the database.
• Schema Diagram: A diagrammatic display of (some aspects of) a database schema.
• Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE.

DATABASE SCHEMA
Three-Schema Architecture: Proposed to support the DBMS characteristics of:
• Program-data independence.
• Support of multiple views of the data.

THREE-SCHEMA ARCHITECTURE
• Internal level: describes how the data is stored. This is the database as perceived by the DBMS and the OS.
• Conceptual level: describes what data is stored in the DB and the relationships among the data. This is the database as perceived by the DBA and programmers.
• External level: describes the part of the DB that a particular user is interested in. This is the database as perceived by the end users.
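The external and conceptual levels described above can be illustrated with SQLite from Python's standard library. This is a minimal sketch under stated assumptions: the view names `staff_directory` and `payroll` and the sample row are hypothetical, and the internal level is left entirely to SQLite's storage engine, which is exactly the part the user never sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: what data is stored and how it is typed.
conn.execute(
    """
    CREATE TABLE employee (
        emp_no  TEXT PRIMARY KEY,
        fname   TEXT,
        lname   TEXT,
        dept_no TEXT,
        salary  INTEGER
    )
    """
)
conn.execute(
    "INSERT INTO employee VALUES ('E00001', 'Amal', 'Saleh', 'D01', 5000)"
)

# External level: each view exposes only the part one user group needs.
conn.execute(
    "CREATE VIEW staff_directory AS "
    "SELECT emp_no, fname, lname, dept_no FROM employee"
)
conn.execute(
    "CREATE VIEW payroll AS "
    "SELECT emp_no AS staff_no, lname, salary FROM employee"
)

directory_rows = conn.execute("SELECT * FROM staff_directory").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll").fetchall()
print(directory_rows)  # [('E00001', 'Amal', 'Saleh', 'D01')]
print(payroll_rows)    # [('E00001', 'Saleh', 5000)]
```

Both views are defined over the same conceptual table, so each user group sees its own shape of the data without the data being stored twice.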
THREE-SCHEMA ARCHITECTURE
[Figure: the same employee data described at the three levels]
External level:
• View 1: Emp_No, FName, LName, Dept_No
• View 2: Staff_No, LName, Salary
Conceptual level (EMPLOYEE):
• Emp_No CHAR(6)
• FName CHAR(15)
• LName CHAR(15)
• Dept_No CHAR(3)
• Salary NUMBER(5)
Internal level:
• PREFIX TYPE=BYTE(6),OFFSET=0
• EMP# TYPE=BYTE(6),OFFSET=6,INDEX=EMPX
• LNM TYPE=BYTE(15),OFFSET=12
• FNM TYPE=BYTE(15),OFFSET=27
• DPT# TYPE=BYTE(4),OFFSET=42
• PAY TYPE=FULLWORD,OFFSET=46

MAPPING
Mapping is the process of transforming requests and results between the internal, conceptual and external levels.
Two types of mapping:
• External/Conceptual mapping
• Conceptual/Internal mapping

DATABASE STATE
• Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence).
• Database State: Refers to the content of a database at a moment in time.
• Initial Database State: Refers to the database when it is first loaded.
• Valid State: A state that satisfies the structure and constraints of the database.
Distinction:
• The database schema changes very infrequently.
• The database state changes every time the database is updated.
• The schema is also called the intension, whereas the state is called the extension.

DATA INDEPENDENCE
• Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
• Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.

DATA INDEPENDENCE
When a schema at a lower level is changed, only the mappings between this schema and the higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed, since they refer to the external schemas.
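The two kinds of data independence defined above can be observed in a small SQLite experiment. This is a sketch, not a statement about every DBMS: the table, view and index names are hypothetical, and the `payroll_report` function stands in for an application program that refers only to an external schema (a view).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT, lname TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("E1", "Saleh", 5000), ("E2", "Omar", 4200)],
)
conn.execute("CREATE VIEW payroll AS SELECT emp_no, salary FROM employee")


def payroll_report(connection):
    """Application code written only against the external schema (the view)."""
    return connection.execute(
        "SELECT emp_no, salary FROM payroll ORDER BY emp_no"
    ).fetchall()


before = payroll_report(conn)

# Logical data independence: the conceptual schema gains a data item,
# but the view and the program that uses it are untouched.
conn.execute("ALTER TABLE employee ADD COLUMN dept_no TEXT")
after_logical = payroll_report(conn)

# Physical data independence: a new access structure changes how rows
# are found on disk, not what the query returns.
conn.execute("CREATE INDEX emp_no_idx ON employee (emp_no)")
after_physical = payroll_report(conn)

print(before == after_logical == after_physical)  # True
```

In both cases only the DBMS's internal bookkeeping changes; the application function is never edited, which is the practical meaning of "the higher-level schemas themselves are unchanged".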
DATA INDEPENDENCE
[Figure: data independence across the three schema levels]
• External schemas
• External/Conceptual mapping (logical data independence)
• Conceptual schema
• Conceptual/Internal mapping (physical data independence)
• Internal schema

1- LOGICAL DATA INDEPENDENCE
The ability to change the conceptual schema without affecting the external schemas or the application programs.
Ex: adding or removing a record type or a data item. External schemas that refer only to the remaining data should not be affected.

2- PHYSICAL DATA INDEPENDENCE
The ability to change the internal schema without affecting the conceptual or external schemas.
Such a change might be made to reorganize the physical files.
Ex: creating new access structures to improve the performance of retrieval or update (i.e., a change in the physical files does not affect the queries).

DBMS LANGUAGES
• DDL (Data Definition Language): defines the conceptual schema; also the internal schema (when schema separation is not clear) and the external schemas (in most DBMSs).
• SDL (Storage Definition Language): defines the internal schema.
• VDL (View Definition Language): defines the external views.
• DML (Data Manipulation Language): manipulates the populated data.
• Modern DBMSs: all in one language (e.g., SQL), with a separate SDL.

DBMS LANGUAGES
Data Definition Language (DDL):
• Used by the DBA and database designers to specify the conceptual schema of a database.
• In many DBMSs, the DDL is also used to define internal and external schemas (views).
• In some DBMSs, a separate Storage Definition Language (SDL) and View Definition Language (VDL) are used to define the internal and external schemas.

DBMS LANGUAGES
• The Data Definition Language (DDL) is a descriptive language for defining the database schema.
• The DDL compiler generates meta-data that is stored in the data dictionary.

DBMS LANGUAGES
Data Manipulation Language (DML):
• A language for retrieving and updating (insert, delete and modify) the data in the DB.
• DML commands (the data sublanguage) can be embedded in a general-purpose programming language (the host language), such as COBOL, C or an assembly language.
• Alternatively, stand-alone DML commands can be applied directly (query language).

DBMS LANGUAGES
Two classes of languages:
• High-level or non-procedural languages:
  • e.g., SQL
  • are set-oriented and specify what data to retrieve rather than how to retrieve it
  • also called declarative languages
• Low-level or procedural languages:
  • record-at-a-time
  • specify how to retrieve data and include constructs such as looping

DBMS INTERFACES
• Stand-alone query language interfaces.
• Programmer interfaces for embedding DML in programming languages:
  • Pre-compiler approach
  • Procedure (subroutine) call approach
• User-friendly interfaces:
  • Menu-based, popular for browsing on the web
  • Forms-based, designed for naïve users
  • Graphics-based (point and click, drag and drop, etc.)
  • Natural language: requests in written English
  • Combinations of the above

OTHER DBMS INTERFACES
• Speech as input (?) and output
• Web browser as an interface
• Parametric interfaces (e.g., bank tellers) using function keys
• Interfaces for the DBA:
  • Creating accounts, granting authorizations
  • Setting system parameters
  • Changing schemas or access paths

SOFTWARE COMPONENTS OF A DBMS
• DBMS component modules
• Database system utilities
• Tools
• Application environments

DBMS COMPONENT MODULES
[Figure: DBMS component modules. Labels recoverable from the diagram:]
• Users: DBA staff, casual users, application programmers, parametric users
• Inputs: DDL statements, privileged commands, interactive query, application programs, compiled (canned) transactions
• Compilers: DDL compiler, query compiler, precompiler, host language compiler, DML compiler (DML statements)
• Execution: run-time database processor, system catalog/data dictionary, concurrency control/backup/recovery subsystems, stored data manager
• Storage: stored database

DBMS COMPONENT MODULES (2/2)
Disk access control:
• OS: The DB and the catalog are stored on disk. Access to the disk is controlled by the OS, which schedules disk input/output.
• Stored Data Manager: controls access to DBMS information stored on the disk.
Compilers:
• DDL Compiler: processes schema definitions specified in the DDL and stores the description of the schema (meta-data) in the DBMS catalog.
• Query Compiler: handles high-level queries that are entered interactively.
• DML Compiler: compiles DML commands into object code for database access.
• Precompiler / Host language compiler: the precompiler extracts DML commands from an application program written in a host programming language. These commands are sent to the DML compiler for compilation into object code for database access; the rest of the program goes to the host language compiler.
Handling DB access at runtime:
• Runtime database processor

DATABASE SYSTEM UTILITIES
1. Loading: loads or transfers existing data files (e.g., text files or sequential files) into the database.
2. Backup: creates a backup copy of the database by dumping the entire database onto tape.
3. File reorganization: reorganizes database files to improve performance.
4. Performance monitoring: monitors database usage and provides statistics to the DBA.

TOOLS
• CASE (Computer-Aided Software Engineering) tools:
  • Used in the design phase of a database system.
  • Used by DB designers, DBAs and users.
  • Examples: Oracle’s Designer/Developer, Powersoft’s S-Designer, and the Information Engineering Facility (IEF) to implement the ER model.
• Data repository systems (information repository):
  • Used in large organizations; accessed directly by users or the DBA.
  • Store catalog information (schema, constraints), design decisions, usage standards, application program descriptions and user information.

APPLICATION DEVELOPMENT ENVIRONMENTS
E.g., PowerBuilder, JBuilder
Support the development of database applications:
• Database design
• GUI development
• Querying and updating
• Application program development

DBMS CLASSIFICATION BASED ON DATA MODEL
• Relational data model
• Hierarchical data model
• Network data model

DBMS CLASSIFICATION BASED ON NUMBER OF SITES (VIEW OF CONTROL)
• Centralized systems
• Distributed DBMSs (DDBMS):
  • Homogeneous DDBMS
  • Heterogeneous DDBMS (federated DBMS / multidatabase systems)

CHAPTER 2 SUMMARY
• Data model
• Database schema and database state
• Three-schema architecture
• Mapping
• Data independence
• Database languages and interfaces
• The database system environment
• DBMS classifications
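As a recap of the database languages item in the summary, the sketch below contrasts DDL with DML, and a declarative (set-oriented) request with a record-at-a-time loop in a host language. It uses SQLite from Python's standard library; the table, its rows and the salary threshold are hypothetical. The loop only imitates the procedural style, since SQLite itself is still accessed through SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema; the description ends up in the system catalog.
conn.execute("CREATE TABLE employee (emp_no TEXT, salary INTEGER)")

# DML embedded in a host language (Python here): populates the database.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("E1", 5000), ("E2", 4200), ("E3", 6100)],
)

# High-level, declarative style: state WHAT is wanted; the DBMS decides how.
declarative = conn.execute(
    "SELECT emp_no FROM employee WHERE salary > 4500 ORDER BY emp_no"
).fetchall()

# Low-level, procedural style (imitated): the program fetches one record
# at a time and spells out HOW to filter, using an explicit loop.
procedural = []
for emp_no, salary in conn.execute("SELECT emp_no, salary FROM employee"):
    if salary > 4500:
        procedural.append((emp_no,))
procedural.sort()

print(declarative)                # [('E1',), ('E3',)]
print(declarative == procedural)  # True
```

Both styles return the same rows; the difference is who carries the retrieval logic, the DBMS or the application program.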