Chapter 1
Introduction

Introduction
Definition: A database management system (DBMS) is a general-purpose software system that facilitates the process of defining, constructing, and manipulating databases for various applications.
Definition: A database is a collection of related data.
Definition: Data are known facts that can be recorded and that have implicit meaning.
Definition: File processing systems are business computer systems that store groups of records in separate files and are used to process business records and produce information.

Introduction
DBMS vs. file processing systems:
- Data redundancy & inconsistency
  DBMS: Reduced by ensuring that a physical piece of data is available to all programs
  File processing systems: Data is often duplicated, causing higher storage and access costs, poor data integrity, and data inconsistency
- Accessing data
  DBMS: Allows flexible access to data (e.g., using queries for data retrieval)
  File processing systems: Allow only predetermined access to data (i.e., compiled programs); application programs are dependent on file formats
- Concurrent access
  DBMS: Designed to coordinate multiple users accessing the same data at the same time
  File processing systems: Designed to allow a file to be accessed by two programs concurrently only if both programs have read-only access to the file
- Data security & integrity
  DBMS: High, enforced
  File processing systems: Loose, not enforced

Data Abstraction
Provides an abstract view of data
- Physical level: the lowest level of abstraction; describes the storage structure of data.
- Conceptual level: the next-higher level of abstraction; describes the logical structure of the database.
- View level: the highest level of abstraction; describes only part of the entire database. Many views can be provided for the same database.
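The three levels of abstraction can be illustrated with a small sketch using Python's built-in sqlite3 module. The table name student, the view name student_directory, the index name, and the sample rows are all made up for illustration; they are not taken from the slides. The table stands in for the conceptual level, the view for the view level, and the index for a physical-level storage decision.

```python
import sqlite3


def build_demo():
    """Return an in-memory database with one table, one view, and one index.

    All names and rows here are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    # Conceptual level: the logical structure of the whole database.
    conn.execute(
        "CREATE TABLE student ("
        "id TEXT PRIMARY KEY, sname TEXT, addr TEXT, major TEXT)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?, ?)",
        [("s1", "Ann", "12 Oak St", "CS"), ("s2", "Bob", "9 Elm St", "Math")],
    )
    # View level: exposes only part of the database (addresses are hidden).
    conn.execute(
        "CREATE VIEW student_directory AS SELECT sname, major FROM student"
    )
    # Physical level: an index affects how rows are located on storage,
    # not what the table or the view means to its users.
    conn.execute("CREATE INDEX idx_student_major ON student (major)")
    conn.commit()
    return conn


def directory(conn):
    """Read through the view, as a user restricted to the view level would."""
    return conn.execute(
        "SELECT sname, major FROM student_directory ORDER BY sname"
    ).fetchall()


if __name__ == "__main__":
    print(directory(build_demo()))  # [('Ann', 'CS'), ('Bob', 'Math')]
```

A second view over the same table (for example, one listing only addresses) could be added without touching the table definition, which is what "many views for the same database" amounts to in practice.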
Database Terminology
- Database Schema or Conceptual View: describes the overall logical structure of the entire database
- Database Instance: describes the content of the database
- Analogy: a schema corresponds to the type of a variable; an instance corresponds to its value

Data Independence
The capacity to change the schema definition at one level without having to change the schema definition at the next higher level
- Physical data independence: capacity to change the physical schema without having to rewrite the application programs
- Logical data independence: capacity to change the conceptual schema without having to rewrite the application programs
Logical data independence is more difficult to achieve than physical data independence.

Data Models
Describe relationships among data, data semantics, integrity, and semantic constraints at the conceptual and view levels
I. Object-Based Logical Models
- DB is structured in variable-length records
- Provide flexible structuring capabilities
- Allow explicit specification of data constraints
- Widely used data models: Entity-Relationship and Object-Oriented
II. Record-Based Logical Models
- DB is structured in fixed-format records of different types
- Three widely used data models: Relational, Hierarchical, and Network
III. Semi-Structured Data Model
- Data items of the same type can have different sets of attributes
- Widely used data model: XML (Extensible Markup Language)

Entity-Relationship (E-R) Model
- An object-based model
- A graphical structure (Chapter 7)
- Widely used in database design
- Consists of real-world objects called entities and relationships among entities
- An entity is a distinguishable object with a set of attributes
- An entity set is a set of entities of the same type
- A relationship set is a set of relationships of the same type
- Mapping cardinalities express the number of entities to which another entity can be associated through a relationship set

An Entity-Relationship Diagram
[Figure: E-R diagram. Entities: FACULTY, STUDENT, COURSE. Attributes shown: FNAME, DEPT, ID# (key attribute), SNAME, ADDR, MAJOR, COURSE#, CRHRS. Relationships: ADVISES (a binary relationship, with mapping cardinalities 1 and n), HAS_TAKEN and IS_TAKING (each with mapping cardinalities n and m).]

The Object-Oriented Model
- An object-based model
- A collection of objects with unique identities
- An operation/function can be performed on objects of particular classes
- Provides a “public interface” for objects of a particular class
- Classes consist of objects
- Classes correspond to abstract data types (ADTs)
- Users can define their own classes
- The only way to operate on an object is by means of the operators defined for it
- Objects can be simple, complex, or made up of other objects
- Objects contain methods, i.e., code that operates on the objects
- Object identity (object-based) vs. value identity (record-based)
- Message passing is used for accessing data in different objects
- A given method is applied to a given object by sending a message

Object-Oriented Database Systems
Figure. An employee object.
Employee
  EmpName: PersonName
    first: String = Ray
    last: String = Ross
  SSNo: SmallInteger = 11122333
  HomeAddress: Address
  Salary
Address object (the value of HomeAddress):
  stNumber: SmallInteger = 1055
  street: String = Alameda
  city: String = Gresham
  (unlabeled field): SmallInteger = 45558

The Relational Model
- A record-based model
- Data are organized and stored in two-dimensional tables (called relations)
- Flexible to use and easy to understand
- A relational database schema consists of a number of relation schemas of the form R(A1, A2, …, An), where R is a relation name and Ai, 1 ≤ i ≤ n, is an attribute name.

The Relational Model
Relational term    Informal equivalent
Relation           Table
Tuple              Row/Record
Cardinality        Number of rows
Attribute          Column/Field
Degree             Number of columns
Domain             Set of legal values
Primary Key        Unique identifier

The Relational Database Model
Relation Supplier (primary key: Supplier#)
Supplier#   SNAME   STATUS   CITY
S1          Smith   20       London
S2          Jones   10       Paris
S3          Blake   30       Paris
S4          Clark   20       London
S5          Adams   30       Athens
- Domains: Supplier#, NAME, STATUS, CITY (e.g., CITY: London, Paris, etc.)
- Tuples: the rows; the cardinality is the number of tuples (5 here)
- Attributes: the columns; the degree is the number of attributes (4 here)

Data Definition Languages (DDL) & Data Manipulation Languages (DML)
DDL: declares the DB schema and compiles the schema into tables
DML: accesses/manipulates (retrieves, inserts, deletes, and modifies) the DB
• Procedural (or descriptive): specify what is needed and how to get it
• Non-procedural (or declarative): specify what is needed but not how to get it

Overall System Structure
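Returning to the Supplier relation and the DDL/DML distinction from the preceding slides, the following is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not the system the slides describe: the column name supplier_no stands in for Supplier# (which is not a plain SQL identifier), and the function names are hypothetical. The rows are the five tuples shown in the Supplier relation.

```python
import sqlite3

# The five tuples of the Supplier relation from the slide.
SUPPLIERS = [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]


def load_suppliers():
    """Create the relation with DDL and fill it with DML."""
    conn = sqlite3.connect(":memory:")
    # DDL: declares the schema (supplier_no is an assumed name for Supplier#).
    conn.execute(
        "CREATE TABLE supplier ("
        "supplier_no TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)"
    )
    # DML: inserts tuples into the relation.
    conn.executemany("INSERT INTO supplier VALUES (?, ?, ?, ?)", SUPPLIERS)
    conn.commit()
    return conn


def paris_declarative(conn):
    """Non-procedural DML: state what is needed, not how to find it."""
    rows = conn.execute(
        "SELECT sname FROM supplier WHERE city = 'Paris' ORDER BY supplier_no"
    )
    return [sname for (sname,) in rows]


def paris_procedural(rows):
    """Procedural style: spell out how to get the result, tuple by tuple."""
    names = []
    for supplier_no, sname, status, city in rows:
        if city == "Paris":
            names.append(sname)
    return names


def cardinality_and_degree(conn):
    """Cardinality = number of tuples; degree = number of attributes."""
    cur = conn.execute("SELECT * FROM supplier")
    degree = len(cur.description)
    cardinality = len(cur.fetchall())
    return cardinality, degree


if __name__ == "__main__":
    conn = load_suppliers()
    print(paris_declarative(conn))        # ['Jones', 'Blake']
    print(paris_procedural(SUPPLIERS))    # ['Jones', 'Blake']
    print(cardinality_and_degree(conn))   # (5, 4)
```

Both query styles return the same names; the difference is that the declarative version leaves the access strategy to the DBMS, while the procedural version fixes it in the program.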