1 Week 3 September 12 • Three-Level Architecture • Database Management System (DBMS) • Relational Data Model • Views R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Privacy and Confidentiality SSN: 123 45 6789 Customer: John K Smith Address: 1234 Main Street Dallas, TX 68213 Account: 5432 1234 4567 8901 Credit limit: 20,000 Current balance: 9,123.00 Employer: Enron Corp. Monthly income: 100,000.00 R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 2 Schema and Subschemas 3 User User Subschema User User Subschema Schema User User Individual Subschema Views Complete catalog of all data retained in the database DBMS Software Manages the database Physical Database R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Database Environment Three Level Architecture External Level 4 User’s view of the database Conceptual Level Internal Level Conceptual Schema Community view Internal Schema Physical representation Physical data organization R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Physical storage Independence • Each user should be able to access the same data, but have a different customized view of the data • Users should not have to deal directly with physical database storage details • The DBA should be able to change the database storage structures without affecting the users’ views • The internal structure of the database should be unaffected by changes to the physical aspects of storage • The DBA should be able to change the conceptual or global structure of the database without affecting all users R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 5 Three-Level Architecture • External Level Describes that part of the database that is relevant to a particular user • Conceptual Level Describes what data is stored in the database and relationships among the data • Internal Level Describes how the data is stored in the database R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 6 Internal Conceptual External Database Environment Three Level Architecture Subschema External/conceptual mapping Conceptual/internal mapping Subschema 7 Subschema Conceptual Schema Logical data independence Internal Schema Physical data independence R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Internal Conceptual External Database Environment Three Level Architecture Subschema External/conceptual mapping Conceptual/interna l mapping Subschema 8 Subschema Conceptual Schema Logical data independence Internal Schema Physical data independence R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Internal Conceptual External Database Environment Three Level Architecture Subschema External/conceptual mapping Conceptual/internal mapping Subschema 9 Subschema Conceptual Schema Internal Schema R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Logical data independence Physical data independence Data Independence • Logical data independence Immunity of external schemas to changes in the conceptual schema • Physical data independence Immunity of the conceptual schema to changes in the internal schema “Plug and Play!” R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 10 Database Environment User User Subschema User 11 User Subschema Schema User User Individual Subschema Views Complete catalog of all data retained in the database DBMS Software Manages the database Shared and Managed Physical Database R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento File-Based Systems User User Subschema File File Each user has his/her file User 12 User User Subschema User Subschema Integrity Problems File Schema DBMS Software File R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Everyone has access to all data in the file Database Languages: DDL vs. DML • Data definition language (DDL) Used to describe name the entities required for the application and the relationships that may exist between the different entities – Specify or modify the database schema and subschemas • Data manipulation language (DML) Provides a set of operations that support the basic data manipulation operations the data – Read and update (i.e., insert, update, delete) the database R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 13 Models • Represents the real thing • Identifies the components and their interactions • Specifies the behavior For example... R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 14 Data Models • An integrated collection of concepts for describing and manipulating data, relationships between data and constraints on the data in an organization • Three components: – Structural part - set of rules applied to the construction of the database – Manipulative part - defines the types of operations allowed on the data – Integrity rules - ensures the accuracy of the data R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 15 Data Models • Object-based – Entity-relationship – Semantic Knowledge-based – Functional – Object-oriented Object-relational • Record-based (transactions) – Relational Transaction-based – Network – Hierarchical • Physical – Unifying How data are stored – Frame memory R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 16 Record-Based Data Models • Relational (Oracle, DB2, Sybase, Informix, SQL 7, Ingres, etc.) – Based on concepts of mathematical relations – Tables, rows, columns • Network (CODASYL - COnference on DAta SYstem Languages) (Image) – Many-to-many relationships – Record types, data items • Hierarchical (IMS) – Segment types, fields In COBOL: files, records, fields R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 17 Database 18 Management Queries Physical Database DBMS Manages the database R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Application Programs Other Software Functions of a DBMS • • • • • • • • • • Data storage, retrieval and update User-accessible catalog Transaction support Concurrency control services Recovery services Authorization services Support for data communications Integrity services Services to promote data independence Utility services R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 19 DBMS 20 Programmers Application Programs Users DBA Queries Database Schema DML Processor Query Processor DDL Processor Program Object Code Database Manager Dictionary Manager DBMS Access Methods File Manager System Buffers R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Operating System Database Manager Processes query Integrity Checker Checks integrity constraints Data Manager 21 Authorization Control Command Processor Checks user authorization Determines optimal strategy Query Optimizer Transaction Manager Scheduler Buffer Manager Recovery Manager R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Database Manager 22 Authorization Control Integrity Checker Command Processor Query Optimizer Manages concurrent operations Performs command operation Transaction Manager Scheduler Ensures recovery in case of failures Transfers data between primary and secondary storage Buffer Manager R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Recovery Manager Database Manager 23 Authorization Control Integrity Checker Command Processor Query Optimizer Manages concurrent operations Query TransactionTransaction Journal Buffered OS Performs command Scheduler operation Manager Ensures recovery in case of failures Transfers data between primary and secondary storage Buffer Manager R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Recovery Manager System Catalog • A repository of information describing the data in the database (metadata) • Stores – Names of users authorized to access the DBMS – Names of all data items in the database • Types and sizes • Constraints – Data items and authorization level granted to each user • Active vs. Passive • Integrated vs. Standalone R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 24 25 Relational Data Model R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Relational Model 26 • Objectives – A degree of data independence – Address data semantic, consistency and redundancy problems – Set-oriented data manipulation language • Structured Query Language (SQL) Database Criteria Data Set R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Presentation method Information Data Set Presentation method 27 Information Criteria Data Set Presentation method Information Database Data Set R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Presentation method Information Domain = all values an attribute can assume 28 Entity Tuples (rows) • Cardinalitiy = number of tuples Relation Attributes (columns) • Degree of a relation = number of attributes R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Domain of an Attribute 29 • Set of allowable values for one or more attributes Attribute 1 Domain Attribute 2 Union or Intersection Information R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Domain Properties of Relations • Distinct (i.e., unique) relation name • Each cell contains exactly one atomic (single) value – No repeating groups • Distinct attribute name • The values of an attribute come from the same domain • Order of attributes has no significance • Each tuple is distinct (i.e., unique) – No duplicate tuples • Order of tuples has no significance R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 30 Unique Identification of a Relation Relation key Superkey Candidate key Primary key Foreign key ? R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 31 Identifying a Tuple • Superkey An attribute or a set of attributes that uniquely identifies a tuple within a relation • Candidate key A super key such that no proper subset is a superkey within the relation – Uniquely identifies the tuple (uniqueness) – Contains no unique subset (irreducibility) • Primary key The candidate key that is selected to identify tuples uniquely within a relation – Should remain constant over the life of the tuple – Most efficient way of identifying a tuple R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 32 Finding the Primary Key Super Key Candidate Key Primary key R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 33 Keys 34 CDs Relation 129341 Help! Beatles Columbia 1-29150-8384-0 129342 Hard Day’s Night Beatles Columbia 1-29150-7115-0 129343 Sergeant Pepper’s Beatles Columbia 1-29150-2484-0 129344 Magical Mystery Tour Beatles Columbia 1-29150-7515-0 129345 Abbey Road Beatles Apple 1-15700-9510-0 Attributes • Catalog number • Record title • Artist name • Record label • UPC R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Superkey? Candidate key? Primary key? Selecting a Key • Criteria – An efficient way of identifying an entity – The attribute (value) remains constant over the life of the entity • Never changes R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 35 Identifying a Tuple 36 • Foreign key An attribute or set of attributes within one relation that matches the candidate key of some (possibly the same) relation Relation key foreign key Relation key R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Foreign Key 37 CDs Relation 129341 Help! Beatles COL 1-29150-8384-0 129342 Hard Day’s Night Beatles COL 1-29150-7115-0 129343 Sergeant Pepper’s Beatles COL 1-29150-2484-0 129344 Magical Mystery Tour Beatles COL 129345 Abbey Road Beatles APP Recording Label Relation (home relation) Must match! 1-29150-7515-0 1-15700-9510-0 COL Columbia Records APP Apple Records R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Relational Integrity Constraints placed on the set of values allowed for the attributes of a relation. • Entity integrity – No attribute of a primary key can be null (every tuple must be uniquely identified) • Referential integrity – If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation, or the foreign key value must be wholly null (i.e., no key exists in the home relation) • Enterprise constraints (organizational) R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 38 Null Value • Absence of any value (i.e., unknown or nonapplicable to a tuple) R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 39 Views • A view is a virtual relation or one that does not actually exist, but dynamically derived – Can be constructed by performing operations (i.e., select, project, join, etc.) on values of existing base relations • Base relation - a named relation, corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database • View - a dynamic result of one or more relational operations operating on the base relations to produce another R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 40 Schema and Subschemas 41 Internal Level Physical Database DBMS Some end-user applications can be supported by views External Subschema Level User User DBMS Software Schema Conceptual Level Subschema User R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento User Subschema User User Views 42 Base Relation R Key Base Relation S Foreign Key Key Criterion View R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento Purpose of Views • Provides a powerful and flexible security mechanism by hiding parts of the database from certain users • Permits user access in a way that is customized to their needs • Simplify complex operations on the base relations • Designed to support the external model • Provides logical independence R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 43 Updating Views • Allowed on views – Derived from a single base relation, and – Containing the primary key or a candidate key • NOT allowed on views – Derived from multiple base relations – Involving aggregations (i.e., summations) or groups operations • Vendors may have other constraints on updating views R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento 44 45 R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento