
CSE

4701

Chapters 1 & 2 (6e/5e): Introduction to DB

Prof. Steven A. Demurjian, Sr.

Computer Science & Engineering Department

The University of Connecticut

191 Auditorium Road, Box U-155

Storrs, CT 06269-3155 steve@engr.uconn.edu

http://www.engr.uconn.edu/~steve

(860) 486 - 4818

The majority of these slides are being used with the permission of Dr. Ling Lui, Associate Professor, College of Computing, Georgia Tech.

Some slides have been adapted from the AWL web site for the textbook.

Chaps1&2-1

What is a Database?


Database (DB) is a Structured Collection of Data about the Entities that Exist in an Enterprise and that are Used (Shared) by Applications of an Enterprise

The Structure of the Database is Dependent Upon the

Data Model that is Used - Understanding the Terms:

Database (DB)

Database System (DBS)

 Database Management Systems (DBMS)

What can be a Database?

 Address Book, Contact List, Calendar, Bookmarks,

Flat File Against Which a Program is Executed,

Real-Time Data Sensor Feed into Embedded

Program, Set of Word or Excel Documents, Fitness

Data from App or Device, Recipes, etc.

What Does a Database Contain?

 Conceptually – One View

 Consider Excel Spreadsheet with Multiple Tabs

Tabs: Employee, Department, Projects,

Each Tab has Columns that Each have Own Types

Each Tab has Data for All Columns/Multiple Rows

Data is Static, Human Readable/Updateable


Excel Spreadsheet Example

What Does a Database Contain?

 Conceptually – Another View

 Consider a set of Java Classes

Employee, Manager ISA Employee, Department,

Projects, etc.

Each Class has Single Instance

Need to Use Collection in Java for Set of Instances

class Employee {
    private String Name, SSN, Address;
    private Date BirthDate;   // java.util.Date
    private double Salary;
    private int DeptNum;      // Java's primitive type is int, not integer
    public Employee() { ... }
}

What are Problems with each?

Both Cases:

 Data is Not Stored in a Permanent Repository

 Data Not Easily Sharable/Modified by Multiple Users Simultaneously

 Data Consistency Constraints Not Enforceable

  DeptNum in Employee/Project vs. Number in Department

  Same Issue across Classes

Need

 Way to Persistently Store Data

 Accessible by Multiple Users

Defining a Database Table

CREATE TABLE EMPLOYEE
( FNAME    VARCHAR(15)   NOT NULL ,
  MINIT    CHAR ,
  LNAME    VARCHAR(15)   NOT NULL ,
  SSN      CHAR(9)       NOT NULL ,
  BDATE    DATE ,
  ADDRESS  VARCHAR(30) ,
  SEX      CHAR ,
  SALARY   DECIMAL(10,2) ,
  SUPERSSN CHAR(9) ,
  DNO      INT           NOT NULL ,
  PRIMARY KEY (SSN) ,
  FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN) ,
  FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER) ) ;
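The earlier complaint that consistency constraints are "Not Enforceable" in a spreadsheet or a set of classes can be contrasted with what the table definition buys. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the DEPARTMENT table, the helper try_insert, and all sample values are illustrative assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INT PRIMARY KEY, DNAME VARCHAR(15) NOT NULL);
CREATE TABLE EMPLOYEE (
  FNAME VARCHAR(15) NOT NULL, MINIT CHAR, LNAME VARCHAR(15) NOT NULL,
  SSN CHAR(9) NOT NULL, BDATE DATE, ADDRESS VARCHAR(30), SEX CHAR,
  SALARY DECIMAL(10,2), SUPERSSN CHAR(9), DNO INT NOT NULL,
  PRIMARY KEY (SSN),
  FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN),
  FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER));
INSERT INTO DEPARTMENT VALUES (5, 'Research');
""")

def try_insert(ssn, dno):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO) VALUES (?, ?, ?, ?)",
            ("Test", "Employee", ssn, dno))
        return True
    except sqlite3.IntegrityError:
        return False

accepted_valid = try_insert("111111111", 5)      # department 5 exists
accepted_orphan = try_insert("222222222", 9)     # no department 9: foreign key rejects
accepted_duplicate = try_insert("111111111", 5)  # SSN already used: primary key rejects
print(accepted_valid, accepted_orphan, accepted_duplicate)
```

The DeptNum vs. Department Number mismatch from the earlier slide is exactly the case the DNO foreign key rules out.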


A Schema Contains Multiple Tables


…filled with Rows of Tuples/Instances

Which Represent Tuples/Instances of Each Relation


…filled with Rows of Tuples/Instances

SQL Searches the Tables

Retrieve the Birthdate and Address of the Employee whose Name is 'John B. Smith'.

SELECT BDATE, ADDRESS

FROM EMPLOYEE

WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith'

Which Row(s) are Selected?
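One way to answer the question is to run the query. This is a small sketch using Python's sqlite3 module with a cut-down EMPLOYEE table; the three rows, their SSNs, and their addresses are made-up sample data chosen so that exactly one row matches all three conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE EMPLOYEE (
    FNAME VARCHAR(15) NOT NULL, MINIT CHAR, LNAME VARCHAR(15) NOT NULL,
    SSN CHAR(9) PRIMARY KEY, BDATE DATE, ADDRESS VARCHAR(30))""")
# Hypothetical rows: only the first satisfies FNAME, MINIT, and LNAME together.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?, ?)", [
    ("John", "B", "Smith", "100000001", "1965-01-09", "1 Example St"),
    ("John", "Q", "Smith", "100000002", "1970-02-02", "2 Example St"),
    ("Jane", "B", "Smith", "100000003", "1980-03-03", "3 Example St"),
])
rows = conn.execute("""SELECT BDATE, ADDRESS
                       FROM EMPLOYEE
                       WHERE FNAME = 'John' AND MINIT = 'B' AND LNAME = 'Smith'""").fetchall()
print(rows)
```

The WHERE clause is a conjunction, so a row that matches only two of the three conditions is not selected.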

Motivating Database Management

 Manual Database Management

 Data are Not Stored

 Programmer Defines Both Logical Data Structure and Physical Structure (Storage)

Motivating Database Management

 File Processing

Data are Stored in Files with Interface Between

Programs and Files.

Various Access Methods Exist (e.g., Sequential, Indexed, Random)

 One File Corresponds to One or Several Programs.

Problems with File Systems

Data are Still Highly Redundant

 Sharing Limited and at the File Level

Data is Unstructured

“Flat” Files

High Maintenance Costs

Data Dependence

Ensuring Data Consistency and Controlling Access to Data (Concurrent Access Problematic)

 Difficult to Understand by New Developers

Difficulties in Developing New Applications

Almost Impossible to Evolve with New Capabilities

Risk of Inefficient Applications

How have File Systems Changed today?

Collaborative Editing

Database vs. File System

Database

 Coordinates Both Physical and Logical Access to the Data
 Data are Shared by All Programs Authorized to Have Access to It
 Flexible Access to Data (i.e., Queries)
 Multiple Users Accessing the Same Data at Same Time

File System

 Coordinates Only the Physical Access to the Data
 Data Written by One Program May Not Be Readable by Another Program
 Pre-determined Access to Data (i.e., Compiled Programs)
 No Two Programs Can Concurrently Access the Same File


The Role of DBMS in Computing


What is a Database System?

[Figure labels: Web or PC app, Mobile app]

What is the Role of Database System?

Pervasive in Almost All Applications and Every

Application Domain

Norm rather than Exception

Difficult to Imagine Application without Persistent

Store

Remember – Database is a Repository at Minimum

Database Management for Mobile Computing

Myriad of Architectures and Approaches:

From: http://java.sun.com/javaone/javaone98/sessions/T400/index.html

Santa Cruz Widgets (Two Tier)

Small Manufacturer Previously on C++

New Order Entry, Inventory, and

Invoicing Applications in Java

Existing Customer and Order Database

Most of Business Logic in Stored Procedures

Tool-generated GUI Forms for Java Objects

Located Company on Web Using Widgets and Tcl, but

Not Widgets and Java

Client Application can be:

 Java, C++, VB, etc.

 Web Client

Nocturnal Aviation, Inc. (Three Tier)

Passenger Check-in for Regional Airline

Local Database for Seating on Today's Flights

Clients Invoke EJBs at Local Site Through RMI

EJBs Update Database and Queue Updates

JMS Queues Updates to Legacy System

JDBC API Used to Access Local Database

JTS Synchs Remote Queue With Local Updates

Wombat Securities (Four Tier)

Web Access to Brokerage Accounts

Only HTML Browser Required on Front End

"Brokerbean" EJB Provides Business Logic

Login, Query, Trade Servlets Call Brokerbean

Use JNDI to Find EJBs, RMI to Invoke Them

Order and History Records from Java Blend Product

Records Mapped to Oracle Tables, JDBC Calls

Database Concepts - Summary

Schema vs. Data

 Database: Structured Collection of Data Describing Objects of the Universe of Discourse being Modeled.

A Database Consists of Schema and Data

 Schema: Describes the Intension (Type) of Objects

Entity/Table/Relation: A portion of a Schema

Data: Describes the Extension (Tuples) of Objects

Data Definition vs. Data Manipulation Languages

What is Metadata?

 DDL: Defines the Schema (Metadata) of a Table

 DML: Operates on the Data According to the Schema
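The DDL/DML/metadata split can be sketched in a few lines. This assumes Python's sqlite3 module; the DEPARTMENT table and its values are illustrative. sqlite_master and PRAGMA table_info are SQLite's own catalog facilities, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema (the metadata).
conn.execute("CREATE TABLE DEPARTMENT (DNUMBER INT PRIMARY KEY, DNAME VARCHAR(15) NOT NULL)")

# DML: operates on the data according to that schema.
conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research')")
conn.execute("UPDATE DEPARTMENT SET DNAME = 'R&D' WHERE DNUMBER = 5")
data = conn.execute("SELECT DNUMBER, DNAME FROM DEPARTMENT").fetchall()

# Metadata: the catalog stores the schema itself as queryable data.
table_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
column_names = [r[1] for r in conn.execute("PRAGMA table_info(DEPARTMENT)")]

print(data, table_names, column_names)
```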

What are Programming Analogs?

Schema is Equivalent to a Class Library

 All of Different Types of Information

Entity/Table/Relation

Data Attributes and Types

Akin to a Class

Tuples Akin to Creating an Instance from Class

Key Difference - Entity/Table is Two Abstractions

Structure like a Class

 Also Represents a Set of all Tuples

Meta-Data

 Akin to Java Reflection and Introspection

 Access to the Runtime Features of Objects

Let’s See Example
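Before the medical example, the analogy itself can be sketched in code. The slides draw it with Java; this sketch uses Python dataclasses instead, and the Employee fields and sample values are illustrative assumptions. It shows the "two abstractions" point: the class gives only the structure, so the set of all tuples has to be held in a separate collection, while introspection plays the role of metadata.

```python
from dataclasses import dataclass, fields

@dataclass
class Employee:
    # The class plays the role of the table's structure (attributes and types).
    name: str
    ssn: str
    dept_num: int

# A table is also the set of all its tuples; a class is not,
# so an explicit collection is needed to hold the instances.
employee_table = [
    Employee("John Smith", "100000001", 5),
    Employee("Jane Doe", "100000002", 4),
]

# Introspection recovers the "schema" at run time, much like reading a catalog.
schema = [f.name for f in fields(Employee)]
print(schema, len(employee_table))
```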

Classes for a Medical Application

Data Types, Methods

Patient Inherits from Person and Creates a Single Instance “John”

Class diagram (attributes and methods as listed on the slide):

Substance: Id: Integer, name: String, statusCode: String, effectiveTime: Date, repeatNumber: Int

Observation: Id: Integer, statusCode: String, name: String, value: String

Person: Id: Integer, name: Name, address: Address, bday: String, tel: String

Patient: Ethnicity: String, prefLang: String, race: String, Email: String, gender: String; getAllergies(), get_clinical_notes(), get_demographics(), get_medications(), get_immunizations()

Provider: deaNumber: String, npiNumber: String, Ethnicity: String, race: String, Email: String, gender: String

Name: family-name: String, given-name: String, prefix: String, suffix: String

Address: street: String, locality: String, region: String, country: String

Associations: takesPrescribedMedication (Substance), hasMedicalObservations (Observation)

Database Entity Relationship Diagram

Patient Entity represents Attributes of a set of Patients

Defines Type and the Collection

Patient Entity is a Database Table with Structure

Like a Class

However, Contains many Instances, e.g., Patients

“John”, “George”, “Jane”, etc.

[ER diagram: Patient, Observation, and Substance entities with their attributes (id, name, statusCode, value, effectiveTime, repeatNumber, address, tel, bday, Ethnicity, prefLang, race)]


Tables

Patient(pid, name, address, tel, bday, etc.)

Substance(sid, name, statusCode, etc.)

Observation(oid, value, statusCode, etc.)

PatientObservations(pid, oid)

PatientMedications(pid, sid)
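PatientObservations(pid, oid) is a linking table: it records which observations belong to which patient, so reading one patient's observations takes a join through it. A minimal sketch, assuming Python's sqlite3 module; the column types and every sample value are assumptions, and only the columns named on the slide are modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patient (pid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Observation (oid INTEGER PRIMARY KEY, value TEXT, statusCode TEXT);
CREATE TABLE PatientObservations (
  pid INTEGER REFERENCES Patient(pid),
  oid INTEGER REFERENCES Observation(oid),
  PRIMARY KEY (pid, oid));
INSERT INTO Patient VALUES (1, 'John'), (2, 'Jane');
INSERT INTO Observation VALUES
  (10, '120/80', 'final'), (11, '98.6F', 'final'), (12, '72bpm', 'final');
INSERT INTO PatientObservations VALUES (1, 10), (1, 11), (2, 12);
""")

# Follow Patient -> PatientObservations -> Observation for one patient.
johns_values = [r[0] for r in conn.execute("""
    SELECT o.value
    FROM Patient p
    JOIN PatientObservations po ON po.pid = p.pid
    JOIN Observation o ON o.oid = po.oid
    WHERE p.name = 'John'
    ORDER BY o.oid""")]
print(johns_values)
```

PatientMedications(pid, sid) would link Patient and Substance in the same way.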

When is a Database System Needed?

 Traditional Examples

 Typical Environment

Corporate Enterprise (Business Data vs.

Bibliographies)

Data With Large Homogenous Parts (e.g.,

Formatted Data)

Data Relevant Over a Long Time

Data Used by Many Simultaneous Users (Batch and On-line Users) for Retrieval & Update

When is a Database System Needed?

Emerging Examples

 Mobile Devices

Fitness Devices

Genetic, Genomic, and Phenotypic for Medical

Research and Treatment

Emerging Platforms

Mobile: Store Locally in DB Format and

Synchronize to Remote Location

Fitness: Store on Fitness Device and Sync to both

Mobile Device and Remote Location

Genetic/Genomic:

Data for a single Individual

 Database for “Targeted” Population for Medical, Drug, etc., Research

An Example Database System

 An Integrated Telephone Customer Information

System (Circa early 1980s)

 What are Examples Today? Has Scale Increased?

The OpenMRS Sample Database Schema

 99 Tables, Sample Database with 5000 patients and

500,000 observations


What are the World's Largest DBs? (2010) *

*http://www.comparebusinessproducts.com/fyi/10-largest-databases-in-the-world

What is a DBMS?

A Database Management System (DBMS) is the

Generalized Tool that Facilitates the

Management of and Access to the Database

Main Functions:

 Defining a Database : Specifying Data Types,

Structures, and Constraints

 Constructing a Database : the Process of

Storing the Data Itself on Some Storage

Medium

 Manipulating a Database : Function for

Querying Specific Data in the Database and

Updating the Database

What is a DBMS?

 Additional Functions:

 Interaction with File Manager

Data Storage and Access

Integrity Enforcement

 Guarantee Correctness, Validity, Consistency

 Security Enforcement

Prevent Data From Illegal Uses

Concurrency Control

Control the Interference Between Concurrent Programs

 Prevent “Lost Updates”

 Don’t Give Away Last Seat of CSE Class to 4 Students

Recovery from Failure

Query Processing and Optimization
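The "last seat" problem can be sketched as follows. Assumptions: Python's sqlite3 module, a hypothetical course table with one seat left, a hypothetical enrollment table, and four made-up students. The requests run one after another here, so this does not exercise real concurrency; it only shows why the availability check and the decrement should be a single atomic step inside a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (id TEXT PRIMARY KEY,
                     seats_left INTEGER NOT NULL CHECK (seats_left >= 0));
CREATE TABLE enrollment (student TEXT PRIMARY KEY,
                         course_id TEXT REFERENCES course(id));
INSERT INTO course VALUES ('CSE4701', 1);
""")

def enroll(student):
    """Claim a seat: the check and the decrement happen in one UPDATE statement."""
    with conn:  # commits on success, rolls back if a statement raises
        cur = conn.execute(
            "UPDATE course SET seats_left = seats_left - 1 "
            "WHERE id = 'CSE4701' AND seats_left > 0")
        if cur.rowcount == 1:  # the seat was actually claimed
            conn.execute("INSERT INTO enrollment VALUES (?, 'CSE4701')", (student,))
            return True
        return False

results = {s: enroll(s) for s in ["ann", "bob", "cy", "dee"]}
seats_left = conn.execute("SELECT seats_left FROM course").fetchone()[0]
enrolled = [r[0] for r in conn.execute("SELECT student FROM enrollment")]
print(results, seats_left, enrolled)
```

A read-then-write sequence in application code (SELECT the count, then UPDATE it) is the pattern that allows lost updates when several programs interleave.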


Components of a DBMS

DBMS Architecture – High Level

DBMS Languages

 Data Definition Language (DDL)

 Data Manipulation Language (DML)

From Embedded Queries or DB Commands Within a

Program

 “Stand-alone” Query Language

Host Language:

 DML Specification (e.g., SQL) is Embedded in a

“Host” Programming Language (e.g., Java, C++)

DBMS Interfaces

Menu-Based Interface

Graphical Interface

Forms-Based Interface

Interface for DBA (DB Administrator)

DDL Defining Database Tables

CREATE TABLE EMPLOYEE
( FNAME    VARCHAR(15)   NOT NULL ,
  MINIT    CHAR ,
  LNAME    VARCHAR(15)   NOT NULL ,
  SSN      CHAR(9)       NOT NULL ,
  BDATE    DATE ,
  ADDRESS  VARCHAR(30) ,
  SEX      CHAR ,
  SALARY   DECIMAL(10,2) ,
  SUPERSSN CHAR(9) ,
  DNO      INT           NOT NULL ,
  PRIMARY KEY (SSN) ,
  FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN) ,
  FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER) ) ;


From Tables – Define Schema


…and Corresponding DB Tables

Which Represent Tuples/Instances of Each Relation


…and Corresponding DB Tables

Data Manipulation via SQL

Retrieve the Birthdate and Address of the Employee whose Name is 'John B. Smith'.

SELECT BDATE, ADDRESS

FROM EMPLOYEE

WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith'

Which Row(s) are Selected?

Data Manipulation via SQL

Retrieve Name and Address of all Employees who work for the 'Research' Department

SELECT FNAME, MINIT, LNAME, ADDRESS, DNAME

FROM EMPLOYEE, DEPARTMENT

WHERE DNAME='Research' AND DNUMBER=DNO

What Action is Being Performed?


Simple SQL Queries - Result

Called a Join on DNO=DNUMBER
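The join can be tried on a small data set. This sketch assumes Python's sqlite3 module; the two departments and three employees are made-up rows, and the ORDER BY clause is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INT PRIMARY KEY, DNAME VARCHAR(15));
CREATE TABLE EMPLOYEE (FNAME VARCHAR(15), MINIT CHAR, LNAME VARCHAR(15),
                       ADDRESS VARCHAR(30),
                       DNO INT REFERENCES DEPARTMENT(DNUMBER));
INSERT INTO DEPARTMENT VALUES (5, 'Research'), (4, 'Administration');
INSERT INTO EMPLOYEE VALUES
  ('John', 'B', 'Smith', '1 Example St', 5),
  ('Ada',  'L', 'Jones', '2 Example St', 4),
  ('Raj',  'K', 'Patel', '3 Example St', 5);
""")

# DNUMBER = DNO pairs each employee with its department (the join condition);
# DNAME = 'Research' then keeps only the Research pairs (the selection).
rows = conn.execute("""
    SELECT FNAME, MINIT, LNAME, ADDRESS, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME = 'Research' AND DNUMBER = DNO
    ORDER BY LNAME""").fetchall()
for row in rows:
    print(row)
```

Without the DNUMBER = DNO condition, the FROM clause alone would pair every employee with every department.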


DBMS Architecture - Components

Main DBMS Modules

 DDL Compiler

DML Compiler

Ad-hoc (Interactive) Query Compiler

Run-time Database Processor

Stored Data Manager

 Concurrency/Back-Up/Recovery Subsystem

DBMS Utility Modules

Loading Routines

Backup Utility

System Catalog/Data Dictionary


DBMS Architecture


ANSI/SPARC - Three Schema Architecture

External Data Schema (Users’ view)

Conceptual Data Schema (Logical Schema)

Internal Data Schema (Physical Schema)


Another View of Three Schema Architecture

Where Mobile, Web Apps Live

Conceptual Schema

Describes the Meaning of Data in the Universe of Discourse

Emphasizes General, Conceptually Relevant, and Often Time-Invariant Structural Aspects of the Universe of Discourse

Excludes the Physical Organization and Access Aspects of the Data

Conceptual Schema

 Another Example

External Schema

Describes Parts of the Information in the Conceptual

Schema in a form Convenient to a Particular User

Group’s View

Derived from the Conceptual Schema

A REST API limits the View/Access of DB

External Schema

 Another Example

Internal Schema

 Describes How the Information Described in the

Conceptual Schema is Physically Represented in a

Database to Provide the Overall Best Performance

Internal Schema

 Another Example


Unified Example of Three Schemas


Let’s See Example via Medical Domain

Patient can read Demographics, Substances; Observations prohibited

Physicians can read or write all data

Office Staff can read or write name, addr, tel

[ER diagram of the Patient, Observation, and Substance entities and their attributes]

Patient(pid, name, address, tel, bday, etc.)

Substance(sid, name, statusCode, etc.)

Observation(oid, value, statusCode, etc.)

PatientObservations(pid, oid)

PatientMedications(pid, sid)
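An external schema for one user group is often realized as a view over the conceptual tables. The sketch below, assuming Python's sqlite3 module, builds a hypothetical OfficeStaffPatient view that exposes only name, address, and tel; the view name, column types, and sample row are assumptions. SQLite has no GRANT statement and its plain views are read-only, so this shows only the restricted read side, not the write access or the per-role permissions the slide describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patient (pid INTEGER PRIMARY KEY, name TEXT, address TEXT,
                      tel TEXT, bday TEXT, ethnicity TEXT);
INSERT INTO Patient VALUES
  (1, 'John', '1 Example St', '555-0100', '1970-01-01', 'undisclosed');
-- External schema for office staff: bday and ethnicity are not exposed.
CREATE VIEW OfficeStaffPatient AS
  SELECT pid, name, address, tel FROM Patient;
""")

cur = conn.execute("SELECT * FROM OfficeStaffPatient")
visible_columns = [d[0] for d in cur.description]
staff_rows = cur.fetchall()
print(visible_columns, staff_rows)
```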


Database Access Process

Database Access Process

1 -- User Program A Sends the DBMS an Invoke Command to Retrieve a Record (or Set of Records)

2 -- DBMS Analyzes the External Schema of User Program A and Finds the Database Description of the Record

3 -- DBMS Checks the Schema to Get the Data Types and Location Information of the Record

4 -- DBMS Checks the Physical Schema to Find Out Which Device the Record is on and What Access Methods Can Be Used

5 -- According to Step 4, DBMS Sends the OS a Read Command to Execute the Search

Database Access Process

6 -- OS Issues the Page Invoke Command to the Corresponding Device, and Then Puts the Fetched Page Into the System Buffer

7 -- DBMS Uses the Schema and the External Schema to Infer the Logical Structure of the Retrieved Record

8 -- DBMS Places the Relevant Data in the User Work Area (UWA), and

9 -- Provides the Status Information at the Program Invocation Exit


What is Metadata?

Google/Search Engines Live on Meta-Data


Metadata vs. Data


ANSI/SPARC - Three Schema Architecture

Conceptual Schema

Emphasizes General, Conceptually Relevant, and Often Time-Invariant Structural Aspects of the Universe of Discourse

Data Independence

The Ability of Application Programs to Remain Unaffected by Changes in Irrelevant Parts of the Conceptual Data Representation, Data Storage Structure, and Data Access Methods

Invisibility (Transparency) of the Details of the Entire Database Organization, Storage Structure, and Access Strategy to the Users

 Both Logical and Physical

Recall Software Engineering Concepts:

 Abstraction: the Details of an Application's Components Can Be Hidden, Providing a Broad Perspective on the Design

 Representation Independence: Changes Can Be Made to the Implementation that have No Impact on the Interface and Its Users

Physical Data Independence

 Physical Data Independence is a Measure of

How Much the Internal Schema Can Change

Without Affecting the Application Programs

Physical: Change DB From MySQL To Oracle

Physical Data Independence

The Ability to Modify the Physical Data

Representation Without Causing Application

Programs to Be Rewritten

Examples:

Transparency of the Physical Storage Organization

Transparency of Physical Access Paths

Numeric Data Representation and Units

Character Data Representation

Data Coding

Physical Data Structure

Specific Time Stamp Formats
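Transparency of physical access paths can be observed directly: adding an index changes how the DBMS finds rows, while the application's query text and its result stay the same. A minimal sketch, assuming Python's sqlite3 module; the table contents, the index name idx_employee_lname, and the helper run are illustrative assumptions, and the wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, LNAME VARCHAR(15))")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(f"{i:09d}", f"Name{i % 50}") for i in range(1000)])

QUERY = "SELECT SSN FROM EMPLOYEE WHERE LNAME = 'Name7'"  # the application's query

def run():
    """Return the sorted result rows and the access plan the DBMS chose."""
    rows = sorted(conn.execute(QUERY).fetchall())
    plan = " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + QUERY))
    return rows, plan

rows_before, plan_before = run()
# Physical change only: a new access path, no change to the table's logical definition.
conn.execute("CREATE INDEX idx_employee_lname ON EMPLOYEE (LNAME)")
rows_after, plan_after = run()

print(plan_before)
print(plan_after)
print(rows_before == rows_after)
```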

Logical Data Independence

 Logical Data Independence is a Measure of How

Much the Conceptual Schema Can Change Without

Affecting the Application Programs

Logical: Add to APIs for New Apps/Users

Logical Data Independence

Transparency of the Entire Database Conceptual

Organization

As a Result:

 Transparency of Logical Access Strategy

 Addition of New Entities

Removal of Entities

Virtual (Derived) Data Items

 Union of Records

Views

 Common Mechanism for Logical Data Independence

Provide Different Logical Data Contexts to

Different Users Based on Their Needs

Update Views vs. Read-Only Views
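Views support logical data independence because an application can keep querying the view while the tables underneath are reorganized. The sketch below assumes Python's sqlite3 module; the EmployeeDirectory view, the PERSON and EMPLOYEE_V2 tables, and the sample row are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, FNAME TEXT, LNAME TEXT, ADDRESS TEXT);
INSERT INTO EMPLOYEE VALUES ('100000001', 'John', 'Smith', '1 Example St');
CREATE VIEW EmployeeDirectory AS SELECT SSN, FNAME, LNAME FROM EMPLOYEE;
""")

# The application only ever uses this query against the view.
APP_QUERY = "SELECT FNAME, LNAME FROM EmployeeDirectory WHERE SSN = '100000001'"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual schema change: names move to a new PERSON table.
conn.executescript("""
DROP VIEW EmployeeDirectory;
CREATE TABLE PERSON (SSN TEXT PRIMARY KEY, FNAME TEXT, LNAME TEXT);
INSERT INTO PERSON SELECT SSN, FNAME, LNAME FROM EMPLOYEE;
CREATE TABLE EMPLOYEE_V2 (SSN TEXT PRIMARY KEY REFERENCES PERSON(SSN), ADDRESS TEXT);
INSERT INTO EMPLOYEE_V2 SELECT SSN, ADDRESS FROM EMPLOYEE;
DROP TABLE EMPLOYEE;
CREATE VIEW EmployeeDirectory AS
  SELECT p.SSN, p.FNAME, p.LNAME
  FROM PERSON p JOIN EMPLOYEE_V2 e ON e.SSN = p.SSN;
""")

after = conn.execute(APP_QUERY).fetchall()  # same query text, new underlying tables
print(before, after)
```

Only the view definition had to change; the application query did not.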

Data Independence: Summary

The Ability of Application Programs to Remain Unaffected by Changes in Irrelevant Parts of the Conceptual Data Representation, Data Storage Structure, and Data Access Methods.

Invisibility (Transparency) of the Details of the Entire Database Organization, Storage Structure, and Access Strategy to the Users

Logical Data Independence:

Transparency of Entire DB Conceptual Organization

Views: Common Mechanism for Logical Data Independence

Physical Data Independence:

The Ability to Modify the Physical Data Representation

Without Causing Application Programs to Be Rewritten

Data Models and Database Systems

Who are Database Users?

What are Database System Features?

Hierarchical Model and IMS System

 Data in Hierarchies in terms of Interdependencies and Connections Among Data Items

 Connected Graphs with Cycles Not Allowed

Network Model - CODASYL/COBOL

 Data in a Network in terms of Interdependencies and Connections Among Data Items

 Graphs Allowed

Relational Model and Systems

Entity Relationship Data Model

Functional Data Models

Object-Oriented Database Systems


Who are Database Stakeholders?


What are System Components?

ACID: Atomicity, Consistency, Isolation, Durability
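Atomicity, the first ACID property, can be sketched with a transfer that either happens completely or not at all. Assumptions: Python's sqlite3 module, a hypothetical account table with a CHECK constraint, two made-up accounts, and a helper named transfer. The credit is deliberately applied before the debit so that a failed debit leaves something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money in one transaction; return True if it committed."""
    try:
        with conn:  # commit on success, roll back if any statement fails
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("alice", "bob", 30)         # fits within alice's balance
rejected = transfer("alice", "bob", 500)  # debit violates CHECK, so the credit is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, rejected, balances)
```

The CHECK constraint also illustrates Consistency: the database never commits a state with a negative balance.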

Hierarchical Database Definition

DBD @NAME = University

SEGM @NAME = Courses

FIELD @NAME = (Course#, SEQ), TYPE = CHAR, BYTES = 6

FIELD @NAME = Title, TYPE = CHAR, BYTES = 20

FIELD @NAME = Descrip, TYPE = CHAR, BYTES = 100

SEGM @NAME = Prereq, PARENT = Courses

FIELD @NAME = (PCourse#, SEQ), TYPE = CHAR, BYTES = 6

FIELD @NAME = Title, TYPE = CHAR, BYTES = 20

SEGM @NAME = Formats, PARENT = Courses

FIELD @NAME = (Section#, SEQ, M), TYPE = INT, BYTES = 2

FIELD @NAME = Quarter, TYPE = CHAR, BYTES = 10

FIELD @NAME = Campus, TYPE = CHAR, BYTES = 15

SEGM @NAME = Faculty, PARENT = Formats

FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9

FIELD @NAME = Name, TYPE = CHAR, BYTES = 30

FIELD @NAME = Ophone, TYPE = CHAR, BYTES = 7

SEGM @NAME = Student, PARENT = Formats

FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9

FIELD @NAME = Name, TYPE = CHAR, BYTES = 30

FIELD @NAME = Gpa, TYPE = FLOAT, BYTES = 4

Hierarchical Graphical Representation

[Hierarchy diagram; each parent-child link is 1:n, * marks the sequence field]

Courses (Course#*, Title, Descrip)
   Prereq (PCourse#*, Title)
   Formats (Section#*, Quarter, Campus)
      Faculty (SSN#*, Name, Phone)
      Student (SSN#*, Name, GPA)


Network Database Definition

SCHEMA NAME IS University.

RECORD NAME IS Student;
   DUPLICATES ARE NOT ALLOWED FOR SSN.
   Name ; CHARACTER 30.
   SSN ; CHARACTER 9.
   Gpa ; FLOAT.

RECORD NAME IS Faculty;
   DUPLICATES ARE NOT ALLOWED FOR SSN.
   Name ; CHARACTER 30.
   SSN ; CHARACTER 9.
   Ophone ; CHARACTER 7.

RECORD NAME IS Courses;
   DUPLICATES ARE NOT ALLOWED FOR Course#.
   Course# ; CHARACTER 6.
   Title ; CHARACTER 20.
   Descrip ; CHARACTER 100.

RECORD NAME IS Formats;
   DUPLICATES ARE NOT ALLOWED FOR Section#.
   Section#; FIXED 3.
   Quarter ; CHARACTER 10.
   Campus ; CHARACTER 15.

RECORD NAME IS Prereq;
   PCourse#; CHARACTER 6.
   Title ; CHARACTER 20.

SET NAME IS Requirements;
   OWNER IS Courses;
   MEMBER IS Prereq;

SET NAME IS COfferings;
   OWNER IS Courses;
   MEMBER IS Formats;

SET NAME IS QtrOfferings;
   OWNER IS Formats;
   MEMBER IS Courses;

SET NAME IS Takes;
   OWNER IS Formats;
   MEMBER IS Student;

SET NAME IS Teaches;
   OWNER IS Formats;
   MEMBER IS Faculty;

Network Graphical Representation

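[Network diagram; * marks the key field]

Record Types:
   Courses (Course#*, Title, Descrip)
   Prereq (PCourse#*, Title)
   Formats (Section#*, Quarter, Campus)
   Student (SSN#*, Name, GPA)
   Faculty (SSN#*, Name, Phone)

Sets (Owner -> Member):
   Requirements: Courses -> Prereq
   COfferings: Courses -> Formats
   QtrOfferings: Formats -> Courses
   Takes: Formats -> Student
   Teaches: Formats -> Faculty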

Relational Model

Relational Model of Data Based on the Concept of a Relation

Relation - a Mathematical Concept Based on Sets

Strength of the Relational Approach to Data Management Comes From the Formal Foundation Provided by the Theory of Relations

RELATION: A Table of Values

 A Relation May Be Thought of as a Set of Rows

 A Relation May Alternately be Thought of as a Set of Columns

 Each Row of the Relation May Be Given an Identifier

 Each Column Typically is Called by its Column Name or Column Header or Attribute Name
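The "set of rows" reading can be made concrete with ordinary sets. This is a sketch in plain Python, not how a DBMS stores relations; the Employee attribute names, the sample tuples, and the helpers select and project are illustrative assumptions.

```python
from collections import namedtuple

# Attribute names play the role of column headers.
Employee = namedtuple("Employee", ["fname", "lname", "dno"])

# A relation is a set of tuples: no duplicates and no inherent row order.
employee = {
    Employee("John", "Smith", 5),
    Employee("Ada", "Jones", 4),
    Employee("John", "Smith", 5),  # duplicate tuple, absorbed by the set
}

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *attrs):
    """Projection: keep only the named attributes; duplicates collapse."""
    return {tuple(getattr(t, a) for a in attrs) for t in relation}

research = select(employee, lambda t: t.dno == 5)
departments = project(employee, "dno")
print(len(employee), research, departments)
```

SQL tables relax this definition, since they may hold duplicate rows unless a key forbids them.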


Relational Tables - Rows/Columns/Tuples


Relational Database Definition

CREATE TABLE Student:

Name(CHAR(30)), SSN(CHAR(9)), Gpa(FLOAT(2))

CREATE TABLE Faculty:

Name(CHAR(30)), SSN(CHAR(9)), Ophone(CHAR(7))

CREATE TABLE Courses:

Course#(CHAR(6)), Title(CHAR(20)), Descrip(CHAR(100)),

PCourse#(CHAR(6))

CREATE TABLE Formats:

Section#(INTEGER(3)), Quarter(CHAR(10)), Campus(CHAR(15))

CREATE TABLE TakeorTeach:

SSN(CHAR(9)), Course#(CHAR(6)), Section#(INTEGER(3))

CREATE TABLE COfferings:

Course#(CHAR(6)), Section#(INTEGER(3))

Student(Name*, SSN, Gpa)

Faculty(Name*, SSN, Ophone)

Courses(Course#*, Title, Descrip, PCourse#*)

Formats(Section#*, Quarter, Campus)

TakeorTeach(SSN, Course#, Section#)

COfferings(Course#, Section#)

Relational Views

 Two Views Derived From Prior Tables

 Student Transcript View

 Course Prerequisite View

Entity Relationship (ER) Data Model

Originally Proposed by P. Chen, ACM TODS, Vol. 1, No. 1, March 1976

Conceptual Modeling of Database Requirements

Allows an Application's Information to be

Characterized

Basic Building Blocks are Entities and Relationships

 Entities Model Static Information Aggregations

 Relationships Model Static Information

Associations

Well-Understood and Studied Technique

Well-Suited for Relational Database Development

Did Not Originally Include Inheritance!!


ER Diagram

Functional Database Model

Functional Data Models were Proposed in the Early-to-Mid 1980s

Intended to Exploit Data Abstraction and Abstract

Data Type Concepts

Generalization and Specialization (Inheritance)

Types like Programming Language Structures (No

Operations)

Concepts Include:

 Entity (Like ER Entity)

Inheritance and Relationships Among Entities

Object (Instance of Entity)

 Function (Functional Statements - Operations) to

Access Objects

ER Successor with Programming Language Features


Functional Database Definition

DATABASE University IS

TYPE Person;

SUBTYPE Student;

SUBTYPE Faculty;

TYPE Courses;

TYPE Formats;

TYPE Person IS

Name : STRING(1..30);

SSN : STRING(1..9);

END ENTITY;

TYPE Courses IS

Course# : STRING(1..6);

Title : STRING(1..20);

Descrip : STRING(1..100);

COfferings: SET OF Formats;

Requirements: SET OF Courses;

END ENTITY;

TYPE Formats IS

Section# : INTEGER;

Quarter : STRING(1..10);

Campus : STRING(1..15);

QtrOfferings: SET OF Courses;

END ENTITY;

SUBTYPE Student IS Person

Takes : SET OF Courses;

Gpa : FLOAT;

END ENTITY;

SUBTYPE Faculty IS Person

Teaches : SET OF Courses;

Ophone : STRING(1..7);

END ENTITY;

UNIQUE Course# WITHIN Courses;

UNIQUE Section# WITHIN Formats;

UNIQUE SSN WITHIN Person;

END UNIVERSITY;

Available Database Systems/Platforms

Ranging from Relational to Object-Oriented to Real-Time to Embedded to Mobile

Long History of Database Systems

First Database Journal – 1976

 ACM Transactions on Database Systems

 Founded by David K. Hsiao (my doctoral advisor)

1st Issue – P. Chen on the Entity Relationship Model

2nd Issue

 System R – IBM’s First Mainframe DBMS

 Abstraction by S. Navathe (our textbook author)

3rd Issue – The INGRES DBMS – DEC (Berkeley)

4th Issue – Functional Dependencies/Normal Forms

6th Issue – Abstraction and Generalization

Available Database Systems

Microsoft SQL Server

IBM DB2

Oracle

MySQL

Emerging Mobile Platforms

Berkeley DB

Couchbase Lite

LevelDB

SQLite

UnQLite

Microsoft SQL Server

 http://www.microsoft.com/en-us/servercloud/products/sql-server-editions/sql-serverexpress.aspx

Express, Enterprise, Standard Editions

Offers Typical OO-to-Relational Capabilities

Synthesize Objects from Relational Data

Allows Application to Use OO and Database to

Use Relational

 Visual C++ and Visual Basic Access

Connectivity and Distributed/Cloud Computing

 ODBC - Access to Other DBSs/DBs

ANSI-SQL

Java

IBM DB2 Universal Database

 http://www-01.ibm.com/software/data/db2/

Enterprise Extended Edition for NT

Integrated Object-Relational Data Server

Offers Typical OO-to-Relational Capabilities

Synthesize Objects from Relational Data

Allows Application to Use OO and Database to

Use Relational

Connectivity and Distributed Computing

 JDBC Drivers

 ODBC - Access to Other DBSs/DBs

 Support for Spatial Data Management

Many Different Editions for HW/OS Platforms

Oracle Database 12c

 https://www.oracle.com/database/index.html

Ability to Store Data as Business Objects

Integrated Object-Relational Data Server

Offers Typical OO-to-Relational Capabilities

Synthesize Objects from Relational Data

Allows Application to Use OO and Database to

Use Relational

Connectivity and Distributed Computing

 JDBC Drivers

 SQLJ (Embedded SQL in Java)

Oracle Objects for OLE

Big Data, Warehousing

Multiple Editions

MySQL

Extensively Used Open Source Platform

Leverage MySQL Workbench


MySQL Workbench

Databases for Mobile Platforms

A wide Range of Emerging Products

 SQL Anywhere (Sybase)

DB2 Everyplace (IBM)

SQL Server Compact/Express (Microsoft)

Oracle Lite

MySQLMobile, Android PHP/MySQL Mobile

Features

 Embedded in the Mobile Device

Offers DB Query Capabilities

Synchronizes with Server Side

 Allows Local Storage on Mobile Device

Potential Topic for Project this Semester!

Databases for Mobile Platforms

Oracle Berkeley DB

 Via SQL, Java Objects, or XML Documents

Couchbase Lite

 NoSQL – storing/retrieving data in format that is not relational/SQL-based

LevelDB (written at Google)

 Open Source Library for Key/Value Pair Storage and Retrieval

SQLite

 Manage in Memory and on Disk

UnQLite

 NoSQL Counterpart of SQLite

Object-Oriented Database Models/Systems

Reasons for Creation of Object Oriented Databases

 Need for More Complex Applications

Need for Additional Data Modeling Features

Increased Use of Object-oriented

Programming Languages

Historical Systems: Orion at MCC, IRIS at H-P Labs,

Open-oodb at T.I., ODE at ATT Bell Labs, Postgres -

Montage - Illustra at UC/B, Encore/observer at Brown

Early Commercial OO Database Products: Ontos,

Gemstone ( -> Ardent), Objectivity, Objectstore ( ->

Excelon), Versant, Poet, Jasmine (Fujitsu – GM)

Also - Relational Products with Object Capabilities

Object-Oriented Database Models/Systems

OO Databases Try to Maintain a Direct

Correspondence Between Real-world and DB Objects

Objects Have State (Value) and Behavior (Operations)

 In OO Databases

 Objects May Have an Object Structure of Arbitrary

Complexity in Order to Contain All of the Necessary

Information That Describes the Object

 In Traditional Database Systems

Information About a Complex Object is Often Scattered

Over Many Relations or Records

Leads to Loss of Direct Correspondence Between a

Real-world Object and Its Database Representation

Supports all OO Programming Concepts: Dispatching,

Inheritance, Polymorphism, Overloading, etc.

Object-Oriented Database Declarations

 Specifying the Object Types Employee, Date, and

Department Using Type Constructors

Object-Oriented Database Declarations

 Adding Operations to Definitions of Employee and

Department:


Object Oriented DB Vendors/Products

Cache (http://www.intersystems.com)

CommonSQL / UncommonSQL

db4o (DeeBeeFourOh) http://www.db4o.com (open source)

GOODS (http://www.garret.ru/~knizhnik/goods.html)

Objectivity/DB (http://www.objectivity.com/objectdatabase.shtml)

ObjectDesignInc

OzoneDb (http://ozone-db.org)

PLOB! (acronym for Persistent Lisp OBjects; see http://plob.sourceforge.net/)

XL2 (http://www.xl2.net)


Summary: Database Classification

Market: Prerelational vs. Relational 1999

Prerelational Revenue Shrinking about 9% Per Year - Currently 1.8 Billion/year

Relational Revenue Growing about 30% Per Year - Currently 11.5 Billion/year

Object-oriented Revenue about 150 Million/year

Database Market Share 1995

Today’s market Share – the Top 3:

 Oracle: 44.4%

 IBM: 21.2%

Microsoft: 18.6%

http://datadoghouse.typepad.com/data_doghouse/2007/05/database_market.html

What will be the Role of Open Source?

 MySQL (MS) and Innobase (Oracle on top of MySQL)

 Evans Data Corporation (http://www.evansdata.com/) http://news.taume.com/Technology/Tech-Deals/Report-MySQL-Gains-25-percent-Market-Share-729


Database Market Share 2007


Database Market Share in 2013

Relational Database Products

Server Based
 ORACLE
 Sybase SQL Server
 Informix
 Microsoft SQL Server
 IBM DB2
 CA-OpenIngres
 MySQL

PC Based
 MS Access
 MySQL
 Many Other Server Based have Standalone Versions

Mobile
 Berkeley DB
 Couchbase Lite
 LevelDB
 SQLite
 UnQLite

Concluding Remarks

Emerging DB Technologies

 Web Databases

Multimedia Databases

Mobile Databases

Data Warehousing Systems

Temporal & Spatial Databases

Real-Time Databases

Embedded Databases

Bio/Genome/Genetic

 Statistical/Population DB

When Not to Use a DBMS:

When Database and Applications are Simple, Well

Defined and Do Not Expect to Change Over Time

Access to Data by Multi-users is Not Required
