CSE 4701
Chapters 1 & 2 (6e/5e): Introduction to DB
Prof. Steven A. Demurjian, Sr.
Computer Science & Engineering Department
The University of Connecticut
191 Auditorium Road, Box U-155
Storrs, CT 06269-3155
steve@engr.uconn.edu
http://www.engr.uconn.edu/~steve
(860) 486 - 4818
The majority of these slides are being used with the permission of Dr. Ling
Liu, Associate Professor, College of Computing, Georgia Tech.
Some slides have been adapted from the AWL web site for the textbook.
What is a Database?
Database (DB) is a Structured Collection of Data about the Entities that Exist in an Enterprise and that are Used (Shared) by Applications of an Enterprise
The Structure of the Database is Dependent Upon the Data Model that is Used
Understanding the Terms:
Database (DB)
Database System (DBS)
Database Management Systems (DBMS)
What can be a Database?
Address Book, Contact List, Calendar, Bookmarks,
Flat File Against Which a Program is Executed,
Real-Time Data Sensor Feed into Embedded
Program, Set of Word or Excel Documents, Fitness
Data from App or Device, Recipes, etc.
What Does a Database Contain?
Conceptually – One View
Consider Excel Spreadsheet with Multiple Tabs
Tabs: Employee, Department, Projects,
Each Tab has Columns that Each have Own Types
Each Tab has Data for All Columns/Multiple Rows
Data is Static, Human Readable/Updateable
Excel Spreadsheet Example
What Does a Database Contain?
Conceptually – Another View
Consider a set of Java Classes
Employee, Manager ISA Employee, Department,
Projects, etc.
Each Class has Single Instance
Need to Use Collection in Java for Set of Instances
import java.util.Date;

class Employee {
    // Attributes of a single employee instance
    private String Name, SSN, Address;
    private Date BirthDate;
    private double Salary;
    private int DeptNum;

    public Employee() { ... }
}
What are Problems with each?
Both Cases:
Data is Not Stored in a Permanent Repository
Data Not Easily Sharable/Modified by Multiple Users Simultaneously
Data Consistency Constraints Not Enforceable
DeptNum in Employee/Project vs. Number in Department
Same Issue across Classes
Need
Way to Persistently Store Data
Accessible by Multiple Users
Defining a Database Table
CREATE TABLE EMPLOYEE
( FNAME    VARCHAR(15)   NOT NULL ,
  MINIT    CHAR ,
  LNAME    VARCHAR(15)   NOT NULL ,
  SSN      CHAR(9)       NOT NULL ,
  BDATE    DATE ,
  ADDRESS  VARCHAR(30) ,
  SEX      CHAR ,
  SALARY   DECIMAL(10,2) ,
  SUPERSSN CHAR(9) ,
  DNO      INT           NOT NULL ,
  PRIMARY KEY (SSN) ,
  FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN) ,
  FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER) ) ;
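The following is a minimal sketch, not part of the original slides, of how a DBMS enforces the constraints declared in this table definition. It uses Python's standard sqlite3 module as a stand-in DBMS. The two-column DEPARTMENT table, the helper try_insert, and all sample rows are assumptions made only so that the example runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Assumed minimal DEPARTMENT table so that DNO has something to reference.
conn.execute("CREATE TABLE DEPARTMENT (DNUMBER INT PRIMARY KEY, DNAME VARCHAR(15))")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        FNAME    VARCHAR(15)   NOT NULL,
        MINIT    CHAR,
        LNAME    VARCHAR(15)   NOT NULL,
        SSN      CHAR(9)       NOT NULL,
        BDATE    DATE,
        ADDRESS  VARCHAR(30),
        SEX      CHAR,
        SALARY   DECIMAL(10,2),
        SUPERSSN CHAR(9),
        DNO      INT           NOT NULL,
        PRIMARY KEY (SSN),
        FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN),
        FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER)
    )
""")
conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research')")


def try_insert(row):
    """Hypothetical helper: insert one EMPLOYEE row and report the outcome."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO EMPLOYEE VALUES (?,?,?,?,?,?,?,?,?,?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Illustrative rows only.
first = try_insert(("John", "B", "Smith", "123456789", "1965-01-09",
                    "731 Fondren, Houston, TX", "M", 30000, None, 5))
duplicate = try_insert(("Jane", "A", "Doe", "123456789", None,
                        None, "F", 25000, None, 5))    # same SSN: primary key violated
no_dept = try_insert(("Ann", "C", "Lee", "987654321", None,
                      None, "F", 25000, None, 99))     # department 99 does not exist

print(first, duplicate, no_dept, sep="\n")
```

Only the first row is stored. The other two are refused by the DBMS itself, which is the consistency enforcement that the spreadsheet and Java class views lack.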
A Schema Contains Multiple Tables
…filled with Rows of Tuples/Instances
Which Represent Tuples/Instances of Each Relation
…filled with Rows of Tuples/Instances
SQL Searches the Tables
Retrieve the Birthdate and Address of the Employee whose Name is 'John B. Smith'.
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith'
Which Row(s) are Selected?
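To make the question concrete, here is a small sketch that runs the query with Python's standard sqlite3 module. The reduced EMPLOYEE table (only the columns the query touches) and the three sample rows are assumptions for illustration, not data from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced table: only the columns used by the query.
conn.execute(
    "CREATE TABLE EMPLOYEE (FNAME TEXT, MINIT TEXT, LNAME TEXT, BDATE TEXT, ADDRESS TEXT)"
)
# Illustrative rows; two employees share first and last name.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("John", "B", "Smith", "1965-01-09", "731 Fondren, Houston, TX"),
        ("Franklin", "T", "Wong", "1955-12-08", "638 Voss, Houston, TX"),
        ("John", "A", "Smith", "1970-03-15", "12 Elm, Storrs, CT"),
    ],
)

rows = conn.execute(
    "SELECT BDATE, ADDRESS "
    "FROM EMPLOYEE "
    "WHERE FNAME='John' AND MINIT='B' AND LNAME='Smith'"
).fetchall()
print(rows)
```

All three conditions in the WHERE clause must hold, so only the row for John B. Smith is selected, and only its BDATE and ADDRESS columns are returned.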
Motivating Database Management
Manual Database Management
Data are Not Stored
Programmer Defines Both Logical Data Structure and Physical Structure (Storage)
Motivating Database Management
File Processing
Data are Stored in Files with Interface Between
Programs and Files.
Various Access Methods Exist (e.g., Sequential,
Indexed, Random)
One File Corresponds to One or Several Programs.
Problems with File Systems
Data are Still Highly Redundant
Sharing Limited and at the File Level
Data is Unstructured
“Flat” Files
High Maintenance Costs
Data Dependence
Ensuring Data Consistency and Controlling Access to Data (Concurrent Access Problematic)
Difficult to Understand by New Developers
Difficulties in Developing New Applications
Almost Impossible to Evolve with New Capabilities
Risk of Inefficient Applications
How have File Systems Changed today?
Collaborative Editing
Database vs. File System
Database:
Coordinates Both Physical and Logical Access to the Data
Data are Shared by All Programs Authorized to Have Access to It
Flexible Access to Data (i.e., Queries)
Multiple Users Accessing the Same Data at Same Time
File System:
Coordinates Only the Physical Access to the Data
Data Written by One Program May Not Be Readable by Another Program
Pre-determined Access to Data (i.e., Compiled Programs)
No Two Programs Can Concurrently Access the Same File
The Role of DBMS in Computing
What is a Database System?
Figure Labels: Web or PC App, Mobile App
What is the Role of Database System?
Pervasive in Almost All Applications and Every
Application Domain
Norm rather than Exception
Difficult to Imagine Application without Persistent
Store
Remember – Database is a Repository at Minimum
Database Management for Mobile Computing
Myriad of Architectures and Approaches:
From: http://java.sun.com/javaone/javaone98/sessions/T400/index.html
Santa Cruz Widgets (Two Tier)
Small Manufacturer Previously on C++
New Order Entry, Inventory, and
Invoicing Applications in Java
Existing Customer and Order Database
Most of Business Logic in Stored Procedures
Tool-generated GUI Forms for Java Objects
Located Company on Web Using Widgets and Tcl, but
Not Widgets and Java
Client Application can be:
Java, C++, VB, etc.
Web Client
Nocturnal Aviation, Inc. (Three Tier)
Passenger Check-in for Regional Airline
Local Database for Seating on Today's Flights
Clients Invoke EJBs at Local Site Through RMI
EJBs Update Database and Queue Updates
JMS Queues Updates to Legacy System
JDBC API Used to Access Local Database
JTS Synchs Remote Queue With Local Updates
Wombat Securities (Four Tier)
Web Access to Brokerage Accounts
Only HTML Browser Required on Front End
"Brokerbean" EJB Provides Business Logic
Login, Query, Trade Servlets Call Brokerbean
Use JNDI to Find EJBs, RMI to Invoke Them
Order and History Records from Java Blend Product
Records Mapped to Oracle Tables, JDBC Calls
Database Concepts - Summary
Schema vs. Data
Database: Structured Collection of Data Describing
Objects of the Universe of Discourse being Modeled.
A Database Consists of Schema and Data
Schema: Describes the Intension (Type) of Objects
Entity/Table/Relation: A portion of a Schema
Data: Describes the Extension (Tuples) of Objects
Data Definition vs. Data Manipulation Languages
What is Metadata?
DDL Defines the Schema (Metadata) of a Table
DML Operates on the Data According to the Schema
What are Programming Analogs?
Schema is Equivalent to a Class Library
All of Different Types of Information
Entity/Table/Relation
Data Attributes and Types
Akin to a Class
Tuples Akin to Creating an Instance from Class
Key Difference - Entity/Table is Two Abstractions
Structure like a Class
Also Represents a Set of all Tuples
Meta-Data
Akin to Java Reflection and Introspection
Access to the Runtime Features of Objects
Let’s See Example
Classes for a Medical Application
Data Types, Methods
Patient Inherits from Person and Creates a Single
Instance “John”
Substance: Id: Integer, name: String, statusCode: String, effectiveTime: Date, repeatNumber: Int
Observation: Id: Integer, statusCode: String, name: String, value: String
Person: Id: Integer, name: Name, address: Address, bday: String, tel: String
Patient: Ethnicity: String, prefLang: String, race: String, Email: String, gender: String
Patient Methods: getAllergies(), get_clinical_notes(), get_demographics(), get_medications(), get_immunizations()
Provider: deaNumber: String, npiNumber: String, Ethnicity: String, race: String, Email: String, gender: String
Name: family-name: String, given-name: String, prefix: String, suffix: String
Address: street: String, locality: String, region: String, country: String
Associations: takesPrescribedMedication, hasMedicalObservations
Database Entity Relationship Diagram
Patient Entity represents Attributes of a set of Patients
Defines Type and the Collection
Patient Entity is a Database Table with Structure
Like a Class
However, Contains many Instances, e.g., Patients
“John”, “George”, “Jane”, etc.
Observation: id, value, statusCode, effectiveTime
Patient: id, name, address, tel, bday, Ethnicity, prefLang, race
Substance: id, name, statusCode, effectiveTime, repeatNumber
Tables
Patient(pid, name, address, tel, bday, etc.)
Substance(sid, name, statusCode, etc.)
Observation(oid, value, statusCode, etc.)
PatientObservations(pid, oid)
PatientMedications(pid, sid)
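As a rough sketch of how the linking table PatientObservations(pid, oid) ties patients to their observations, the example below builds three of the tables with Python's standard sqlite3 module and follows the links with a join. The reduced column lists, the helper observations_for, and every sample value are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (pid INTEGER PRIMARY KEY, name TEXT, tel TEXT);
    CREATE TABLE Observation (oid INTEGER PRIMARY KEY, value TEXT, statusCode TEXT);
    -- One row per (patient, observation) pair links the two tables.
    CREATE TABLE PatientObservations (
        pid INTEGER REFERENCES Patient(pid),
        oid INTEGER REFERENCES Observation(oid),
        PRIMARY KEY (pid, oid)
    );
    INSERT INTO Patient VALUES (1, 'John', '555-0101'), (2, 'Jane', '555-0102');
    INSERT INTO Observation VALUES
        (10, '120/80', 'final'), (11, '98.6F', 'final'), (12, '72 bpm', 'final');
    INSERT INTO PatientObservations VALUES (1, 10), (1, 11), (2, 12);
""")


def observations_for(name):
    """Hypothetical helper: observation values recorded for the named patient."""
    cursor = conn.execute(
        """
        SELECT o.value
        FROM Patient p
        JOIN PatientObservations po ON po.pid = p.pid
        JOIN Observation o ON o.oid = po.oid
        WHERE p.name = ?
        ORDER BY o.oid
        """,
        (name,),
    )
    return [value for (value,) in cursor]


print(observations_for("John"))
print(observations_for("Jane"))
```

The class diagram's hasMedicalObservations association becomes rows in PatientObservations, so one Patient table holds every patient and the relationship is data rather than an object reference.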
When is a Database System Needed?
Traditional Examples
Typical Environment
Corporate Enterprise (Business Data vs.
Bibliographies)
Data With Large Homogeneous Parts (e.g.,
Formatted Data)
Data Relevant Over a Long Time
Data Used by Many Simultaneous Users (Batch and On-line Users) for Retrieval & Update
When is a Database System Needed?
Emerging Examples
Mobile Devices
Fitness Devices
Genetic, Genomic, and Phenotypic for Medical
Research and Treatment
Emerging Platforms
Mobile: Store Locally in DB Format and
Synchronize to Remote Location
Fitness: Store on Fitness Device and Sync to both
Mobile Device and Remote Location
Genetic/Genomic:
Data for a single Individual
Database for “Targeted” Population for Medical, Drug, etc., Research
An Example Database System
An Integrated Telephone Customer Information
System (Circa early 1980s)
What are Examples Today? Has Scale Increased?
The OpenMRS Sample Database Schema
99 Tables, Sample Database with 5000 patients and
500,000 observations
What are the World's Largest DBs? (2010) *
*http://www.comparebusinessproducts.com/fyi/10-largest-databases-in-the-world
What is a DBMS?
A Database Management System (DBMS) is the
Generalized Tool that Facilitates the
Management of and Access to the Database
Main Functions:
Defining a Database : Specifying Data Types,
Structures, and Constraints
Constructing a Database : the Process of
Storing the Data Itself on Some Storage
Medium
Manipulating a Database : Function for
Querying Specific Data in the Database and
Updating the Database
What is a DBMS?
Additional Functions:
Interaction with File Manager
Data Storage and Access
Integrity Enforcement
Guarantee Correctness, Validity, Consistency
Security Enforcement
Prevent Data From Illegal Uses
Concurrency Control
Control the Interference Between Concurrent Programs
Prevent “Lost Updates”
Don’t Give Away Last Seat of CSE Class to 4 Students
Recovery from Failure
Query Processing and Optimization
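The "last seat" scenario under Concurrency Control can be sketched as follows with Python's standard sqlite3 module. The course and enrollment tables, the enroll function, and the student names are hypothetical. The sketch runs the four requests one after another on a single connection, so it only illustrates the idea: the availability check and the decrement happen in one atomic UPDATE, rather than a separate read followed by a write that another program could interleave with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    " cid TEXT PRIMARY KEY,"
    " seats_left INTEGER NOT NULL CHECK (seats_left >= 0))"
)
conn.execute("CREATE TABLE enrollment (cid TEXT, student TEXT)")
conn.execute("INSERT INTO course VALUES ('CSE4701', 1)")  # exactly one seat left
conn.commit()


def enroll(student, cid):
    """Try to take a seat; return True only if one was actually available."""
    with conn:  # one transaction: commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE course SET seats_left = seats_left - 1 "
            "WHERE cid = ? AND seats_left > 0",
            (cid,),
        )
        if cursor.rowcount == 0:  # no row matched: the class is full
            return False
        conn.execute("INSERT INTO enrollment VALUES (?, ?)", (cid, student))
        return True


results = {s: enroll(s, "CSE4701") for s in ["Ann", "Bob", "Cy", "Dee"]}
seats_left = conn.execute("SELECT seats_left FROM course").fetchone()[0]
print(results, seats_left)
```

Only one of the four students gets the seat and the seat count never goes negative. A real multi-user DBMS achieves the same guarantee across concurrent connections through its locking or versioning mechanisms.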
Components of a DBMS
DBMS Architecture – High Level
DBMS Languages
Data Definition Language (DDL)
Data Manipulation Language (DML)
From Embedded Queries or DB Commands Within a
Program
“Stand-alone” Query Language
Host Language:
DML Specification (e.g., SQL) is Embedded in a
“Host” Programming Language (e.g., Java, C++)
DBMS Interfaces
Menu-Based Interface
Graphical Interface
Forms-Based Interface
Interface for DBA (DB Administrator)
DDL Defining Database Tables
CREATE TABLE EMPLOYEE
( FNAME    VARCHAR(15)   NOT NULL ,
  MINIT    CHAR ,
  LNAME    VARCHAR(15)   NOT NULL ,
  SSN      CHAR(9)       NOT NULL ,
  BDATE    DATE ,
  ADDRESS  VARCHAR(30) ,
  SEX      CHAR ,
  SALARY   DECIMAL(10,2) ,
  SUPERSSN CHAR(9) ,
  DNO      INT           NOT NULL ,
  PRIMARY KEY (SSN) ,
  FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN) ,
  FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER) ) ;
From Tables – Define Schema
…and Corresponding DB Tables
Which Represent Tuples/Instances of Each Relation
…and Corresponding DB Tables
Data Manipulation via SQL
Retrieve the Birthdate and Address of the Employee whose Name is 'John B. Smith'.
SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John' AND MINIT='B'
AND LNAME='Smith'
Which Row(s) are Selected?
Data Manipulation via SQL
Retrieve Name and Address of all Employees who work for the 'Research' Department
SELECT FNAME, MINIT, LNAME, ADDRESS, DNAME
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND DNUMBER=DNO
What Action is Being Performed?
Simple SQL Queries - Result
Called a Join on DNO=DNUMBER
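A small sketch of the join, using Python's standard sqlite3 module. The reduced tables and the sample rows are assumptions for illustration, and an ORDER BY clause has been added to both queries only to make the output order deterministic. The sketch runs the slide's query and an equivalent form with an explicit JOIN ... ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT);
    CREATE TABLE EMPLOYEE (FNAME TEXT, MINIT TEXT, LNAME TEXT, ADDRESS TEXT, DNO INTEGER);
    INSERT INTO DEPARTMENT VALUES (5, 'Research'), (4, 'Administration');
    INSERT INTO EMPLOYEE VALUES
        ('John',   'B', 'Smith',   '731 Fondren, Houston, TX', 5),
        ('Alicia', 'J', 'Zelaya',  '3321 Castle, Spring, TX',  4),
        ('Ramesh', 'K', 'Narayan', '975 Fire Oak, Humble, TX', 5);
""")

# The query from the slide: the join condition sits in the WHERE clause.
slide_query = """
    SELECT FNAME, MINIT, LNAME, ADDRESS, DNAME
    FROM EMPLOYEE, DEPARTMENT
    WHERE DNAME='Research' AND DNUMBER=DNO
    ORDER BY LNAME
"""
# Equivalent form that names the join condition explicitly.
explicit_join = """
    SELECT FNAME, MINIT, LNAME, ADDRESS, DNAME
    FROM EMPLOYEE JOIN DEPARTMENT ON DNUMBER = DNO
    WHERE DNAME = 'Research'
    ORDER BY LNAME
"""

rows = conn.execute(slide_query).fetchall()
rows_explicit = conn.execute(explicit_join).fetchall()
for row in rows:
    print(row)
```

Each EMPLOYEE row is paired with the DEPARTMENT row whose DNUMBER equals its DNO, and the DNAME='Research' condition then keeps only the employees of that department.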
DBMS Architecture - Components
Main DBMS Modules
DDL Compiler
DML Compiler
Ad-hoc (Interactive) Query Compiler
Run-time Database Processor
Stored Data Manager
Concurrency/Back-Up/Recovery Subsystem
DBMS Utility Modules
Loading Routines
Backup Utility
System Catalog/Data Dictionary
DBMS Architecture
ANSI/SPARC - Three Schema Architecture
External Data Schema (Users’ view)
Conceptual Data Schema (Logical Schema)
Internal Data Schema (Physical Schema)
Another View of Three Schema Architecture
Where Mobile, Web Apps Live
Conceptual Schema
Describes the Meaning of Data in the Universe of
Discourse
Emphasizes General, Conceptually Relevant, and Often Time-Invariant Structural Aspects of the
Universe of Discourse
Excludes the Physical Organization and Access
Aspects of the Data
Conceptual Schema
Another Example
External Schema
Describes Parts of the Information in the Conceptual
Schema in a form Convenient to a Particular User
Group’s View
Derived from the Conceptual Schema
A REST API limits the View/Access of DB
External Schema
Another Example
Internal Schema
Describes How the Information Described in the
Conceptual Schema is Physically Represented in a
Database to Provide the Overall Best Performance
Internal Schema
Another Example
Unified Example of Three Schemas
Let’s See Example via Medical Domain
Patient can read Demographics, Substances; Observations Prohibited
Physicians can read or write all data
Office Staff can read or write name, addr, tel
Observation: id, value, statusCode, effectiveTime
Substance: id, name, statusCode, effectiveTime, repeatNumber
Patient: id, name, address, tel, bday, Ethnicity, prefLang, race
Patient(pid, name, address, tel, bday, etc.)
Substance(sid, name, statusCode, etc.)
Observation(oid, value, statusCode, etc.)
PatientObservations(pid, oid)
PatientMedications(pid, sid)
Database Access Process
Database Access Process
1 -- User Program A Sends to DBMS an Invoke
Command to Retrieve a Record (or Set of Records)
2 -- DBMS Analyzes the External Schema of the User
Program A and Finds the Database Description of the
Record
3 -- DBMS Checks With the Schema to Get the Data
Types and Location Information of Record
4 -- DBMS Checks With the Physical Schema to Find
Out Which Device the Record is in and What Access
Methods Can Be Used
5 -- According to 4, DBMS Sends OS a Read
Command to Execute the Search
Database Access Process
6 -- OS Issues the Page Invoke Command to the
Corresponding Device, and Then Puts the Page Fetched
Into the System Buffer
7 -- DBMS Uses the Schema and the External Schema to Infer the Logical Structure of the Retrieved Record
8 -- DBMS Places the Relevant Data in the UWA (User Work Area), and
9 -- Provides the Status Information at the Program
Invocation Exit
What is Metadata?
Google/Search Engines Live on Meta-Data
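One way to see the difference between metadata and data is to ask a DBMS to describe its own tables. The sketch below uses Python's standard sqlite3 module. The catalog table sqlite_master and the PRAGMA table_info command are specific to SQLite (other systems expose their catalogs differently), and the reduced EMPLOYEE table and its single row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " SSN CHAR(9) PRIMARY KEY,"
    " LNAME VARCHAR(15) NOT NULL,"
    " SALARY DECIMAL(10,2))"
)
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith', 30000)")

# Metadata: which tables exist, according to SQLite's catalog.
tables = [name for (name,) in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# Metadata: column names, declared types, and NOT NULL flags of one table.
columns = [
    (name, declared_type, bool(notnull))
    for (_cid, name, declared_type, notnull, _default, _pk)
    in conn.execute("PRAGMA table_info(EMPLOYEE)")
]

# Data: the rows stored under that schema.
data = conn.execute("SELECT * FROM EMPLOYEE").fetchall()

print(tables)
print(columns)
print(data)
```

The first two results describe the structure (the intension) and would be the same for an empty table. Only the last result changes as rows are inserted, which is the same split as Java reflection describing a class versus the values held by its instances.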
Metadata vs. Data
ANSI/SPARC - Three Schema Architecture
Conceptual Schema
Emphasizes General, Conceptually Relevant, and
Often Time-Invariant Structural Aspects of the
Universe of Discourse
Data Independence
Ability that Allows Application Programs to Remain
Unaffected by Changes in Irrelevant Parts of the
Conceptual Data Representation, Data Storage
Structure and Data Access Methods
Invisibility (Transparency) of the Details of Entire
Database Organization, Storage Structure and Access
Strategy to the Users
Both Logical and Physical
Recall Software Engineering Concepts:
Abstraction: the Details of an Application's
Components Can Be Hidden, Providing a Broad
Perspective on the Design
Representation Independence : Changes Can Be
Made to the Implementation that have No Impact on the Interface and Its Users
Physical Data Independence
Physical Data Independence is a Measure of
How Much the Internal Schema Can Change
Without Affecting the Application Programs
Physical Example: Change DB From MySQL To Oracle
Physical Data Independence
The Ability to Modify the Physical Data
Representation Without Causing Application
Programs to Be Rewritten
Examples:
Transparency of the Physical Storage Organization
Transparency of Physical Access Paths
Numeric Data Representation and Units
Character Data Representation
Data Coding
Physical Data Structure
Specific Time Stamp Formats
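Transparency of physical access paths can be illustrated with a small sketch using Python's standard sqlite3 module. The reduced EMPLOYEE table, the generated rows, and the index name idx_employee_dno are assumptions for illustration. The application's query text stays the same while an index (a change to the internal schema only) is added underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, LNAME TEXT, DNO INTEGER)")
# Generated sample rows: 1000 employees spread over departments 0..4.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(f"{i:09d}", f"Name{i}", i % 5) for i in range(1000)],
)

# The application's query; it never mentions how the data is stored or accessed.
QUERY = "SELECT SSN, LNAME FROM EMPLOYEE WHERE DNO = 3 ORDER BY SSN"

before = conn.execute(QUERY).fetchall()

# Physical change: add an access path on DNO. The table definition is untouched.
conn.execute("CREATE INDEX idx_employee_dno ON EMPLOYEE (DNO)")

after = conn.execute(QUERY).fetchall()
print(len(before), before == after)
```

The query returns the same rows before and after the index is created. The DBMS is free to use the new access path, but the program did not have to be rewritten to benefit from it.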
Logical Data Independence
Logical Data Independence is a Measure of How
Much the Conceptual Schema Can Change Without
Affecting the Application Programs
Logical Example: Add to APIs For New Apps/Users
Logical Data Independence
Transparency of the Entire Database Conceptual
Organization
As a Result:
Transparency of Logical Access Strategy
Addition of New Entities
Removal of Entities
Virtual (Derived) Data Items
Union of Records
Views
Common Mechanism for Logical Data
Independence
Provide Different Logical Data Contexts to
Different Users Based on Their Needs
Update Views vs. Read-Only Views
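A view can shield one user group from changes to the conceptual schema. The sketch below uses Python's standard sqlite3 module. The view name OfficeStaffView, the helper staff_rows, and the sample row are hypothetical, and the view's column list is borrowed from the office staff permissions (name, address, tel) in the medical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (pid INTEGER PRIMARY KEY, name TEXT, address TEXT,
                          tel TEXT, bday TEXT);
    INSERT INTO Patient VALUES (1, 'John', '1 Main St', '555-0101', '1980-02-03');
    -- External schema for office staff: contact details only.
    CREATE VIEW OfficeStaffView AS
        SELECT pid, name, address, tel FROM Patient;
""")


def staff_rows():
    """What the office staff application sees: column names and rows."""
    cursor = conn.execute("SELECT * FROM OfficeStaffView")
    return [d[0] for d in cursor.description], cursor.fetchall()


before = staff_rows()

# Conceptual schema change: a new attribute is added for some other application.
conn.execute("ALTER TABLE Patient ADD COLUMN prefLang TEXT")

after = staff_rows()
print(before)
print(before == after)
```

The office staff application sees the same columns and rows before and after the new attribute is added, so it does not need to change. Note that bday was never visible through the view at all, which is the access-limiting role of an external schema.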
Data Independence: Summary
Ability That Allows Application Programs to Remain
Unaffected by Changes in Irrelevant Parts of the
Conceptual Data Representation, Data Storage
Structure and Data Access Methods.
Invisibility (Transparency) of the Details of Entire
Database Organization, Storage Structure and Access
Strategy to the Users
Logical Data Independence:
Transparency of Entire DB Conceptual Organization
Views: Common Mechanism for Logical Data
Independence
Physical Data Independence:
The Ability to Modify the Physical Data Representation
Without Causing Application Programs to Be Rewritten
Data Models and Database Systems
Who are Database Users?
What are Database System Features?
Hierarchical Model and IMS System
Data in Hierarchies in terms of Interdependencies and Connections Among Data Items
Connected Graphs with Cycles Not Allowed
Network Model - CODASYL/COBOL
Data in a Network in terms of Interdependencies and Connections Among Data Items
Graphs Allowed
Relational Model and Systems
Entity Relationship Data Model
Functional Data Models
Object-Oriented Database Systems
Who are Database Stakeholders?
What are System Components?
ACID: Atomicity, Consistency, Isolation, Durability
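Atomicity and consistency, two of the ACID properties, can be sketched with Python's standard sqlite3 module. The account table, the transfer function, and the balances are hypothetical. A transfer is two updates that must both happen or both be undone, and a CHECK constraint states the consistency rule that no balance may go negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected the debit
        return False


ok = transfer("A", "B", 30)        # succeeds
failed = transfer("A", "B", 500)   # debit would make A negative
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B is executed first and then undone when the debit violates the constraint, so the database never exposes a state in which money was created.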
Hierarchical Database Definition
DBD @NAME = University
SEGM @NAME = Courses
FIELD @NAME = (Course#, SEQ), TYPE = CHAR, BYTES = 6
FIELD @NAME = Title, TYPE = CHAR, BYTES = 20
FIELD @NAME = Descrip, TYPE = CHAR, BYTES = 100
SEGM @NAME = Prereq, PARENT = Courses
FIELD @NAME = (PCourse#, SEQ), TYPE = CHAR, BYTES = 6
FIELD @NAME = Title, TYPE = CHAR, BYTES = 20
SEGM @NAME = Formats, PARENT = Courses
FIELD @NAME = (Section#, SEQ, M), TYPE = INT, BYTES = 2
FIELD @NAME = Quarter, TYPE = CHAR, BYTES = 10
FIELD @NAME = Campus, TYPE = CHAR, BYTES = 15
SEGM @NAME = Faculty, PARENT = Formats
FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9
FIELD @NAME = Name, TYPE = CHAR, BYTES = 30
FIELD @NAME = Ophone, TYPE = CHAR, BYTES = 7
SEGM @NAME = Student, PARENT = Formats
FIELD @NAME = (SSN, SEQ), TYPE = CHAR, BYTES = 9
FIELD @NAME = Name, TYPE = CHAR, BYTES = 30
FIELD @NAME = Gpa, TYPE = FLOAT, BYTES = 4
Hierarchical Graphical Representation
Courses (Course#*, Title, Descrip)
  Prereq (PCourse#*, Title)
  Formats (Section#*, Quarter, Campus)
    Faculty (SSN#*, Name, Phone)
    Student (SSN#*, Name, GPA)
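The PARENT clauses in the hierarchical database definition determine a tree of segment types. As a rough sketch (plain Python, not IMS syntax; the Segment class and the build and paths helpers are hypothetical), the tree can be rebuilt from (segment, parent) pairs and every segment reached by exactly one path from the root:

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One segment type in the hierarchy, with its child segment types."""
    name: str
    children: list = field(default_factory=list)


def build(dbd):
    """Build the tree from (segment name, parent name or None) pairs.

    Parents must be listed before their children, as in the DBD.
    """
    segments = {}
    root = None
    for name, parent in dbd:
        seg = segments[name] = Segment(name)
        if parent is None:
            root = seg
        else:
            segments[parent].children.append(seg)
    return root


def paths(seg, prefix=()):
    """Yield the access path from the root to every segment type."""
    here = prefix + (seg.name,)
    yield here
    for child in seg.children:
        yield from paths(child, here)


# Segment names and parents taken from the University DBD.
UNIVERSITY = [
    ("Courses", None),
    ("Prereq", "Courses"),
    ("Formats", "Courses"),
    ("Faculty", "Formats"),
    ("Student", "Formats"),
]

hierarchy = list(paths(build(UNIVERSITY)))
for path in hierarchy:
    print(" / ".join(path))
```

Because each segment names a single parent, the structure cannot contain a cycle, and reaching Student data always means navigating through Courses and then Formats. The network model relaxes exactly this restriction.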
Network Database Definition
SCHEMA NAME IS University.
RECORD NAME IS Student;
DUPLICATES ARE NOT ALLOWED FOR SSN.
Name ; CHARACTER 30.
SSN ; CHARACTER 9.
Gpa ; FLOAT.
RECORD NAME IS Faculty;
DUPLICATES ARE NOT ALLOWED FOR SSN.
Name ; CHARACTER 30.
SSN ; CHARACTER 9.
Ophone ; CHARACTER 7.
RECORD NAME IS Courses;
DUPLICATES ARE NOT ALLOWED FOR Course#.
Course# ; CHARACTER 6.
Title ; CHARACTER 20.
Descrip ; CHARACTER 100.
RECORD NAME IS Formats;
DUPLICATES ARE NOT ALLOWED FOR Section#.
Section#; FIXED 3.
Quarter ; CHARACTER 10.
Campus ; CHARACTER 15.
RECORD NAME IS Prereq;
PCourse#; CHARACTER 6.
Title ; CHARACTER 20.
SET NAME IS Requirements;
OWNER IS Courses;
MEMBER IS Prereq;
SET NAME IS COfferings;
OWNER IS Courses;
MEMBER IS Formats;
SET NAME IS QtrOfferings;
OWNER IS Formats;
MEMBER IS Courses;
SET NAME IS Takes;
OWNER IS Formats;
MEMBER IS Student;
SET NAME IS Teaches;
OWNER IS Formats;
MEMBER IS Faculty;
Network Graphical Representation
Record Types:
Courses (Course#*, Title, Descrip)
Prereq (PCourse#*, Title)
Formats (Section#*, Quarter, Campus)
Student (SSN#*, Name, GPA)
Faculty (SSN#*, Name, Phone)
Set Types: Requirements, COfferings, QtrOfferings, Takes, Teaches
Relational Model
Relational Model of Data Based on the Concept of a
Relation
Relation - a Mathematical Concept Based on Sets
Strength of the Relational Approach to Data
Management Comes From the Formal Foundation
Provided by the Theory of Relations
RELATION: A Table of Values
A Relation May Be Thought of as a Set of Rows
A Relation May Alternately be Thought of as a Set of Columns
Each Row of the Relation May Be Given an
Identifier
Each Column Typically is Called by its Column
Name or Column Header or Attribute Name
Relational Tables - Rows/Columns/Tuples
Relational Database Definition
CREATE TABLE Student:
Name(CHAR(30)), SSN(CHAR(9)), Gpa(FLOAT(2))
CREATE TABLE Faculty:
Name(CHAR(30)), SSN(CHAR(9)), Ophone(CHAR(7))
CREATE TABLE Courses:
Course#(CHAR(6)), Title(CHAR(20)), Descrip(CHAR(100)),
PCourse#(CHAR(6))
CREATE TABLE Formats:
Section#(INTEGER(3)), Quarter(CHAR(10)), Campus(CHAR(15))
CREATE TABLE TakeorTeach:
SSN(CHAR(9)), Course#(CHAR(6)), Section#(INTEGER(3))
CREATE TABLE COfferings:
Course#(CHAR(6)), Section#(INTEGER(3))
Student(Name*, SSN, Gpa)
Faculty(Name*, SSN, Ophone)
Courses(Course#*, Title, Descrip, PCourse#*)
Formats(Section#*, Quarter, Campus)
TakeorTeach(SSN, Course#, Section#)
COfferings(Course#, Section#)
Relational Views
Two Views Derived From Prior Tables
Student Transcript View
Course Prerequisite View
Entity Relationship (ER) Data Model
Originally Proposed by P. Chen, ACM TODS, Vol. 1,
No. 1, March 1976
Conceptual Modeling of Database Requirements
Allows an Application's Information to be
Characterized
Basic Building Blocks are Entities and Relationships
Entities Model Static Information Aggregations
Relationships Model Static Information
Associations
Well-Understood and Studied Technique
Well-Suited for Relational Database Development
Did Not Originally Include Inheritance!!
ER Diagram
Functional Database Model
Functional Data Models were Proposed in Early-to-Mid 1980s
Intended to Exploit Data Abstraction and Abstract
Data Type Concepts
Generalization and Specialization (Inheritance)
Types like Programming Language Structures (No
Operations)
Concepts Include:
Entity (Like ER Entity)
Inheritance and Relationships Among Entities
Object (Instance of Entity)
Function (Functional Statements - Operations) to
Access Objects
ER Successor with Programming Language Features
Functional Database Definition
DATABASE University IS
TYPE Person;
SUBTYPE Student;
SUBTYPE Faculty;
TYPE Courses;
TYPE Formats;
TYPE Person IS
Name : STRING(1..30);
SSN : STRING(1..9);
END ENTITY;
TYPE Courses IS
Course# : STRING(1..6);
Title : STRING(1..20);
Descrip : STRING(1..100);
COfferings: SET OF Formats;
Requirements: SET OF Courses;
END ENTITY;
TYPE Formats IS
Section# : INTEGER;
Quarter : STRING(1..10);
Campus : STRING(1..15);
QtrOfferings: SET OF Courses;
END ENTITY;
SUBTYPE Student IS Person
Takes : SET OF Courses;
Gpa : FLOAT;
END ENTITY;
SUBTYPE Faculty IS Person
Teaches : SET OF Courses;
Ophone : STRING(1..7);
END ENTITY;
UNIQUE Course# WITHIN Courses;
UNIQUE Section# WITHIN Formats;
UNIQUE SSN WITHIN Person;
END UNIVERSITY;
Available Database Systems/Platforms
Ranging from Relational to Object-Oriented to Real-Time
to Embedded to Mobile
Long History of Database Systems
First Database Journal – 1976
ACM Transactions on Database Systems
Founded by David K. Hsiao (my doctoral advisor)
1st Issue – P. Chen on the Entity Relationship Model
2nd Issue
System R – IBM’s First Mainframe DBMS
Abstraction by S. Navathe (our textbook author)
3rd Issue – The INGRES DBMS – DEC (Berkeley)
4th Issue – Functional Dependencies/Normal Forms
6th Issue – Abstraction and Generalization
Available Database Systems
Microsoft SQL Server
IBM DB2
Oracle
MySQL
Emerging Mobile Platforms
Berkeley DB
Couchbase Lite
LevelDB
SQLite
UnQLite
Microsoft SQL Server
http://www.microsoft.com/en-us/servercloud/products/sql-server-editions/sql-serverexpress.aspx
Express, Enterprise, Standard Editions
Offers Typical OO-to-Relational Capabilities
Synthesize Objects from Relational Data
Allows Application to Use OO and Database to
Use Relational
Visual C++ and Visual Basic Access
Connectivity and Distributed/Cloud Computing
ODBC - Access to Other DBSs/DBs
ANSI-SQL
Java
IBM DB2 Universal Database
http://www-01.ibm.com/software/data/db2/
Enterprise Extended Edition for NT
Integrated Object-Relational Data Server
Offers Typical OO-to-Relational Capabilities
Synthesize Objects from Relational Data
Allows Application to Use OO and Database to
Use Relational
Connectivity and Distributed Computing
JDBC Drivers
ODBC - Access to Other DBSs/DBs
Support for Spatial Data Management
Many Different Editions for HW/OS Platforms
Oracle Database 12c
https://www.oracle.com/database/index.html
Ability to Store Data as Business Objects
Integrated Object-Relational Data Server
Offers Typical OO-to-Relational Capabilities
Synthesize Objects from Relational Data
Allows Application to Use OO and Database to
Use Relational
Connectivity and Distributed Computing
JDBC Drivers
SQLJ (Embedded SQL in Java)
Oracle Objects for OLE
Big Data, Warehousing
Multiple Editions
MySQL
Extensively Used Open Source Platform
Leverage MySQL Workbench
MySQL Workbench
Databases for Mobile Platforms
A wide Range of Emerging Products
SQL Anywhere (Sybase)
DB2 Everyplace (IBM)
SQL Server Compact/Express (Microsoft)
Oracle Lite
MySQLMobile, Android PHP/MySQL Mobile
Features
Embedded in the Mobile Device
Offers DB Query Capabilities
Synchronizes with Server Side
Allows Local Storage on Mobile Device
Potential Topic for Project this Semester!
Databases for Mobile Platforms
Oracle Berkeley DB
Via SQL, Java Objects, or XML Documents
Couchbase Lite
NoSQL – storing/retrieving data in format that is not relational/SQL-based
LevelDB (written at Google)
Open Source Library for Key/Value Pair Storage and Retrieval
SQLite
Manage in Memory and on Disk
UnQLite
NoSQL Counterpart of SQLite
Object-Oriented Database Models/Systems
Reasons for Creation of Object Oriented Databases
Need for More Complex Applications
Need for Additional Data Modeling Features
Increased Use of Object-oriented
Programming Languages
Historical Systems: Orion at MCC, IRIS at H-P Labs,
Open-oodb at T.I., ODE at ATT Bell Labs, Postgres -
Montage - Illustra at UC/B, Encore/observer at Brown
Early Commercial OO Database Products: Ontos,
Gemstone ( -> Ardent), Objectivity, Objectstore ( ->
Excelon), Versant, Poet, Jasmine (Fujitsu – GM)
Also - Relational Products with Object Capabilities
Object-Oriented Database Models/Systems
OO Databases Try to Maintain a Direct
Correspondence Between Real-world and DB Objects
Objects have State (Value) and Behavior (Operations)
In OO Databases
Objects May Have an Object Structure of Arbitrary
Complexity in Order to Contain All of the Necessary
Information That Describes the Object
In Traditional Database Systems
Information About a Complex Object is Often Scattered
Over Many Relations or Records
Leads to Loss of Direct Correspondence Between a
Real-world Object and Its Database Representation
Supports all OO Programming Concepts: Dispatching,
Inheritance, Polymorphism, Overloading, etc.
Object-Oriented Database Declarations
Specifying the Object Types Employee, Date, and
Department Using Type Constructors
Object-Oriented Database Declarations
Adding Operations to Definitions of Employee and
Department:
Object Oriented DB Vendors/Products
Cache (http://www.intersystems.com)
CommonSQL / UncommonSQL
db4o (DeeBeeFourOh) (http://www.db4o.com) (open source)
GOODS (http://www.garret.ru/~knizhnik/goods.html)
Objectivity/DB (http://www.objectivity.com/objectdatabase.shtml)
ObjectDesignInc
OzoneDb (http://ozone-db.org)
PLOB! (acronym for Persistent Lisp OBjects; see http://plob.sourceforge.net/)
XL2 (http://www.xl2.net)
Summary: Database Classification
Market: Prerelational vs. Relational 1999
Prerelational Revenue Shrinking about 9% Per
Year - Currently 1.8 Billion/year
Relational Revenue Growing about 30% Per Year -
Currently 11.5 Billion/year
Object-oriented Revenue about 150 Million/year
Database Market Share 1995
Today’s market Share – the Top 3:
Oracle: 44.4%
IBM: 21.2%
Microsoft: 18.6%
http://datadoghouse.typepad.com/data_doghouse/2007/05/database_market.html
What will be the Role of Open Source?
MySQL (MS) and Innobase (Oracle on top of MySQL)
Evans Data Corporation (http://www.evansdata.com/)
http://news.taume.com/Technology/Tech-Deals/Report-MySQL-Gains-25-percent-Market-Share-729
Database Market Share 2007
Database Market Share in 2013
Relational Database Products
Server Based
ORACLE
Sybase SQL Server
Informix
Microsoft SQL Server
IBM DB2
CA-OpenIngres
MySQL
PC Based
MS Access
MySQL
Many Other Server Based have Standalone Versions
Mobile
Berkeley DB
Couchbase Lite
LevelDB
SQLite
UnQLite
Concluding Remarks
Emerging DB Technologies
Web Databases
Multimedia Databases
Mobile Databases
Data Warehousing Systems
Temporal & Spatial Databases
Real-Time Databases
Embedded Databases
Bio/Genome/Genetic
Statistical/Population DB
When Not to Use a DBMS:
When Database and Applications are Simple, Well
Defined and Do Not Expect to Change Over Time
Access to Data by Multi-users is Not Required