Data Modeling using XML Schemas

CS3431 –
Database Systems I
Introduction
Murali Mani
What is a Database System?
• Database:
a large collection of related data.
usually too large to fit in computer memory at once
Focus: information, rather than computation

Database Management System (DBMS)

Software that allows us to create, use and
maintain a database.
Database Applications







E-commerce: inventory of books, CDs etc at
Amazon, B&N etc.
Banks, Airlines
Universities
GIS (Maps) – find restaurants closest to WPI
WWW (World Wide Web)
Bio-informatics (genome data)
Digital Libraries
Focus of this course
Tabular View of Data: Airline System
Flight
flightNo  start  destination  miles
101       BOS    LAX          3000
102       PVD    LAX          2900

Passenger
pName  ffNumber  DoB   milesEarned
Joe    1001      1980  12000
Mary   1002      1981  11000

FlewIn
flightNo  ffNumber  date
101       1001      Jan 4
102       1002      Jan 5
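
The three tables above can be tried out in any relational DBMS. Below is a minimal sketch using Python's built-in sqlite3 module. The table and column names and the rows come from the slide; the column types, the choice of primary keys and the use of SQLite are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database. Table and column names follow the slide;
# the column types and keys are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Flight (
    flightNo INTEGER PRIMARY KEY,
    start TEXT,
    destination TEXT,
    miles INTEGER
);
CREATE TABLE Passenger (
    pName TEXT,
    ffNumber INTEGER PRIMARY KEY,
    DoB INTEGER,
    milesEarned INTEGER
);
CREATE TABLE FlewIn (
    flightNo INTEGER REFERENCES Flight(flightNo),
    ffNumber INTEGER REFERENCES Passenger(ffNumber),
    date TEXT
);
""")

# Load the sample rows shown on the slide.
conn.executemany("INSERT INTO Flight VALUES (?, ?, ?, ?)",
                 [(101, "BOS", "LAX", 3000), (102, "PVD", "LAX", 2900)])
conn.executemany("INSERT INTO Passenger VALUES (?, ?, ?, ?)",
                 [("Joe", 1001, 1980, 12000), ("Mary", 1002, 1981, 11000)])
conn.executemany("INSERT INTO FlewIn VALUES (?, ?, ?)",
                 [(101, 1001, "Jan 4"), (102, 1002, "Jan 5")])
conn.commit()

flights = conn.execute("SELECT * FROM Flight ORDER BY flightNo").fetchall()
print(flights)
```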
Focus of this Course: RDBMS


Tabular view of data: Relational Model
Data Model: A collection of concepts used for
describing data



Structures, Constraints, Operations
Schema: Describes structures and
constraints for a given application.
RDBMS: Relational Database Management
Systems

Software that allows us to create, use and
maintain a relational database.
Levels of Abstraction
• External schema (view) describes how users see the data
• Logical schema describes the logical structures used
• Physical schema describes files and indexes
[Figure: View1, View2 and View3 sit on top of the Logical Schema, which sits on top of the Physical Schema, which is stored on disk]
Levels of Abstraction:
Example


Logical Schema: Flight, Passenger, FlewIn
Physical Schema
    Index on flightNo for Flight
    Index on flightNo for FlewIn
Views
    NoOfPassengers (flightNo, date, numPassengers)
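
One possible way to express these three levels in SQL is sketched below with Python's sqlite3 module. The index names, the column types and the definition of the view (counting FlewIn rows per flight and date) are assumptions; the slide only names the view and its columns. The third FlewIn row is hypothetical data added so that one count is larger than 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical schema (only the tables needed here; types are assumptions)
CREATE TABLE Flight (flightNo INTEGER, start TEXT, destination TEXT, miles INTEGER);
CREATE TABLE FlewIn (flightNo INTEGER, ffNumber INTEGER, date TEXT);

-- Physical schema: indexes on flightNo (index names are hypothetical)
CREATE INDEX idx_flight_flightNo ON Flight(flightNo);
CREATE INDEX idx_flewin_flightNo ON FlewIn(flightNo);

-- External schema: a view over the logical schema (definition is an assumption)
CREATE VIEW NoOfPassengers AS
    SELECT flightNo, date, COUNT(*) AS numPassengers
    FROM FlewIn
    GROUP BY flightNo, date;
""")

conn.executemany("INSERT INTO FlewIn VALUES (?, ?, ?)", [
    (101, 1001, "Jan 4"),
    (102, 1002, "Jan 5"),
    (101, 1002, "Jan 4"),  # hypothetical extra row, not on the slide
])
conn.commit()

view_rows = conn.execute(
    "SELECT flightNo, date, numPassengers FROM NoOfPassengers ORDER BY flightNo"
).fetchall()
print(view_rows)
```

Users of the view never mention the indexes, and dropping or adding an index does not change the view definition.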
Why use DBMS, and not files?




Data independence and efficient access
Reduced application development time
Data integrity: Ensure consistency of data
even with multiple users
Recovery from crashes, security etc.
Data independence and
efficient access

Data independence



Logical Data Independence: Logical schema can
change, but views need not change
Physical Data Independence: Physical schema
such as indexes can change, but logical schema
need not change.
Efficient Access

Indexes allow you to see only the “necessary”
portion of data, as opposed to sequential access
in files.
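
The difference between sequential access and indexed access can be illustrated with a toy sketch in Python. The dictionary below only stands in for an index; a real DBMS typically uses disk-based structures such as B-trees. The function and variable names are hypothetical.

```python
# Rows of the Flight table from the airline example:
# (flightNo, start, destination, miles)
flights = [
    (101, "BOS", "LAX", 3000),
    (102, "PVD", "LAX", 2900),
]

def scan_lookup(rows, flight_no):
    """Sequential access: examine rows one by one until a match is found."""
    examined = 0
    for row in rows:
        examined += 1
        if row[0] == flight_no:
            return row, examined
    return None, examined

# A very simple stand-in for an index: a dictionary from flightNo to the row.
index_on_flight_no = {row[0]: row for row in flights}

row_by_scan, rows_examined = scan_lookup(flights, 102)
row_by_index = index_on_flight_no.get(102)  # goes straight to the row

print(row_by_scan, rows_examined)
print(row_by_index)
```

With two rows the saving is trivial, but the scan grows with the size of the table while the index lookup touches only the needed entry.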
Reduced application
development time


Higher level of data abstraction
Queries are written in a high level language
tailored for database applications.
SELECT P.pName
FROM Passenger P JOIN FlewIn F ON P.ffNumber = F.ffNumber
WHERE F.flightNo = 101
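
A query of this kind can be run from a program as sketched below, using Python's sqlite3 module and the Passenger and FlewIn tables of the airline example. In that example Passenger has no flightNo column, so the sketch finds the passengers of flight 101 by joining through FlewIn. Column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Passenger (pName TEXT, ffNumber INTEGER, DoB INTEGER, milesEarned INTEGER);
CREATE TABLE FlewIn (flightNo INTEGER, ffNumber INTEGER, date TEXT);
""")
conn.executemany("INSERT INTO Passenger VALUES (?, ?, ?, ?)",
                 [("Joe", 1001, 1980, 12000), ("Mary", 1002, 1981, 11000)])
conn.executemany("INSERT INTO FlewIn VALUES (?, ?, ?)",
                 [(101, 1001, "Jan 4"), (102, 1002, "Jan 5")])
conn.commit()

# The query states WHAT is wanted; the DBMS decides HOW to find it.
rows = conn.execute("""
    SELECT P.pName
    FROM Passenger P JOIN FlewIn F ON P.ffNumber = F.ffNumber
    WHERE F.flightNo = ?
""", (101,)).fetchall()

names = [row[0] for row in rows]
print(names)
```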
Data Integrity

Under concurrent access, the DBMS ensures that data
remains consistent


eg: multiple airline staff trying to reserve a seat for
different customers.
Ideas:


Transactions – grouping multiple instructions
(reads/writes) into one atomic unit
Locks – locking of resources (tables)
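
The idea of a transaction as one atomic unit can be sketched with Python's sqlite3 module. The Seat and Booking tables and the reserve function are hypothetical, invented for this illustration. The sketch uses a single connection, so it shows atomicity (all writes or none) rather than real concurrency or locking between users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for the seat reservation example.
conn.execute("CREATE TABLE Seat (flightNo INTEGER, seatNo TEXT, ffNumber INTEGER)")
conn.execute("CREATE TABLE Booking (flightNo INTEGER, seatNo TEXT, ffNumber INTEGER)")
conn.execute("INSERT INTO Seat VALUES (101, '12A', NULL)")  # one free seat
conn.commit()

def reserve(conn, flight_no, seat_no, ff_number):
    """Record a booking and claim the seat as ONE atomic unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute("INSERT INTO Booking VALUES (?, ?, ?)",
                         (flight_no, seat_no, ff_number))
            cur = conn.execute(
                "UPDATE Seat SET ffNumber = ? "
                "WHERE flightNo = ? AND seatNo = ? AND ffNumber IS NULL",
                (ff_number, flight_no, seat_no))
            if cur.rowcount != 1:
                # Seat missing or already taken: undo the Booking insert too.
                raise RuntimeError("seat is not available")
        return True
    except RuntimeError:
        return False

first = reserve(conn, 101, "12A", 1001)   # seat is free
second = reserve(conn, 101, "12A", 1002)  # same seat, already taken
bookings = conn.execute("SELECT ffNumber FROM Booking").fetchall()
print(first, second, bookings)
```

The failed second attempt leaves no trace in Booking, because its insert was rolled back together with the failed seat update.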
Recovery from Crashes,
Security etc

If the system crashes in the middle of a
transaction, recovery should be possible.


Ideas: logging, commit/rollback of transactions
Also other features such as security, access
control, privileges etc to facilitate
administration.
Who uses databases?



End users
DB application programmers
Database Administrators




Database design
Security, Authorization
Data availability, crash recovery
Database tuning (for performance)
Why study DBMS?

The need to process large amounts of data keeps
increasing


Video, WWW, geographic information systems
(GIS), genome data, digital libraries
DBMS research is one of the most exciting
areas in Computer Science !!
What will we learn in this
course?

Database Design
    Representing the application requirements formally in a
    conceptual model (ER, Entity Relationship Model)
    Translate an ER schema to relational schema
    Analyze the goodness of relational schema designed using
    normalization theory.
Operations for Relational Model: Relational Algebra
SQL:
    DDL (Data Definition Language)
    DML (Data Manipulation Language)
Briefly study indexes, transactions, logging, security
Course Logistics


Web Page: http://www.cs.wpi.edu/~cs3431/b05
Lectures



M, T, R – regular lectures
F – discussion on project, H/Ws
Grading





H/W assignments (mostly 5): 10%
Projects (in 3 phases): 25%
Midterm (Nov 18): 30%
Final (Dec 15): 30%
Class participation: 5%
H/Ws and Projects


H/Ws will be due Friday before class.
Project will be done in 3 phases




Phase 0: Due Nov 11, 4:00 pm by email
Phase 1: Due Nov 29, 4:00 pm (via turnin)
Phase 2: Due Dec 13, 4:00 pm (via turnin)
Late submissions

Marks for late submissions will not count.
However we will be happy to grade them.
Tips for doing well

Exams



Master the topics
Master the topics on time
Project


Ensure that you are on schedule
Additional investigations can get up to 6 additional
points.
Office Hours


Will be posted on the web.
Make use of the office hours to ensure you
are mastering the materials.
Introductory Material
Sets, Relations and Functions
Sets


Unordered collection of objects
Characteristics




Unordered
No duplicates (no object appears more than once
in a set)
Eg: Set of passengers, set of flights
Recall the main set operations


Union, intersection, complement
Check subset
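
The set operations listed above map directly onto Python's built-in set type. In this small sketch the passenger Ann and the assignment of passengers to flights are hypothetical; the complement is taken relative to an assumed universe of all passengers.

```python
# Universe of passengers (Ann is a hypothetical addition to Joe and Mary).
all_passengers = {"Joe", "Mary", "Ann"}
flew_101 = {"Joe", "Ann"}
flew_102 = {"Mary", "Ann"}

union = flew_101 | flew_102               # flew on at least one of the flights
intersection = flew_101 & flew_102        # flew on both flights
complement_101 = all_passengers - flew_101  # did not fly on flight 101
is_subset = flew_101 <= all_passengers    # subset check

# No duplicates: repeating an element does not change the set.
no_duplicates = {"Joe", "Joe", "Mary"}

# Unordered: the order in which elements are written does not matter.
same_set = {"Joe", "Mary"} == {"Mary", "Joe"}

print(union, intersection, complement_101, is_subset, len(no_duplicates), same_set)
```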
Relations


Given multiple sets A1, A2, …, An, a relation
is a set of n-tuples of the form (a1, a2, …, an),
where a1 is an element of A1, a2 is an element
of A2, and so on.
Eg: suppose the set of courses = {DB1, DB2},
the set of TAs = {Hong, Song}, then a relation
between these two sets could be
{(DB1, Hong), (DB1, Song), (DB2, Hong)}
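
Put differently, a relation is a subset of the Cartesian product of the sets involved. The following short Python sketch checks this for the course and TA example; the variable names are chosen for illustration.

```python
from itertools import product

courses = {"DB1", "DB2"}
tas = {"Hong", "Song"}

# All possible (course, TA) pairs: the Cartesian product courses x tas.
all_pairs = set(product(courses, tas))

# The relation from the example: a set of 2-tuples.
assists = {("DB1", "Hong"), ("DB1", "Song"), ("DB2", "Hong")}

is_relation = assists <= all_pairs  # every tuple is drawn from courses x tas
missing = all_pairs - assists       # pairs that are not in the relation

print(len(all_pairs), is_relation, missing)
```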
Functions

Given two sets A, B, a function f from A to B is
denoted as f: A → B. This maps any value of A to
one value of B.

Eg: consider function from faculty members to depts
{(Mike Gennert → CS), (Peter Hansen → Humanities)}
Characteristics



A is called the domain
B is called the range (more precisely, the codomain)
No value of A can map to multiple B’s.
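
A function can be viewed as a relation in which every value of A appears in exactly one pair. The sketch below tests this condition in Python; is_function is a hypothetical helper name, and the second example reuses a course and TA relation of the kind shown earlier, which is not a function.

```python
def is_function(pairs, domain, codomain):
    """Return True if `pairs` maps every value of `domain`
    to exactly one value of `codomain`."""
    images = {}
    for a, b in pairs:
        if a not in domain or b not in codomain:
            return False  # pair lies outside A x B
        images.setdefault(a, set()).add(b)
    return all(len(images.get(a, ())) == 1 for a in domain)

faculty = {"Mike Gennert", "Peter Hansen"}
depts = {"CS", "Humanities"}
works_in = {("Mike Gennert", "CS"), ("Peter Hansen", "Humanities")}

courses = {"DB1", "DB2"}
tas = {"Hong", "Song"}
assists = {("DB1", "Hong"), ("DB1", "Song"), ("DB2", "Hong")}

faculty_is_function = is_function(works_in, faculty, depts)
assists_is_function = is_function(assists, courses, tas)  # DB1 maps to two TAs

print(faculty_is_function, assists_is_function)
```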
Functions

Injection (one to one)
    No 2 values in A map to the same B
    Eg: set of Husbands → set of wives
Surjection (onto)
    Every value in B has at least 1 value in A that
    maps to it
Bijection
    One to one and onto
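
These three properties can be checked mechanically when a function is stored as a Python dictionary from values of A to values of B. In this sketch the helper names, the people in the husbands example and the extra department ECE are hypothetical.

```python
def is_injective(f):
    """No 2 values in A map to the same B."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    """Every value in B has at least 1 value in A that maps to it."""
    return set(codomain) <= set(f.values())

def is_bijective(f, codomain):
    """One to one and onto."""
    return is_injective(f) and is_surjective(f, codomain)

# Husbands -> wives (hypothetical names): one to one and onto.
wife_of = {"Bob": "Alice", "Tom": "Eve"}
wives = {"Alice", "Eve"}

# Faculty -> depts, with a hypothetical extra department nobody maps to.
dept_of = {"Mike Gennert": "CS", "Peter Hansen": "Humanities"}
depts = {"CS", "Humanities", "ECE"}

# Two faculty members in the same department: not one to one.
same_dept = {"Mike Gennert": "CS", "Another Professor": "CS"}

print(is_bijective(wife_of, wives))
print(is_injective(dept_of), is_surjective(dept_of, depts))
print(is_injective(same_dept))
```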