
L1-Introduction-to-Database-Systems-1

Lecture 1:
Introduction to databases
Dr. M. A. Rouf
Professor
Dept. of Computer Science and Engineering
Dhaka University of Engineering and Technology
(DUET), Gazipur-1700
Bangladesh
Email: marouf.cse@duet.ac.bd,
rouf7606@gmail.com
Cell phone: 01711-780541
Dr. M. A. Rouf, Dept. of CSE, DUET
Database Prehistory
Data entry
Query processing
Storage and retrieval
Sorting
Our Hero --- E. F. Codd
Edgar F. "Ted" Codd (August 19, 1923 - April 18, 2003) was a British computer scientist who invented relational databases while working for IBM.
He was born in Portland, Dorset, and studied maths and chemistry at Oxford. He was a pilot in the Royal Air Force during WWII. In 1948 he joined IBM in New York as a mathematical programmer. He fled the USA for Canada during the McCarthy period. Later, he returned to the USA to earn a doctorate in CS from the University of Michigan in Ann Arbor. He then joined IBM research in San Jose.
His 1970 paper “A Relational Model of Data for Large Shared Data Banks” changed everything.
In 1993 he coined the term OLAP.
Database Management Systems (DBMSs)
Layers, top to bottom:
Your Applications Go Here
DBMS
Raw Resources (bare metal)
Database abstractions allow this interface to be cleanly defined, and this allows applications and data management systems to be implemented separately.
Today, Database Systems are Ubiquitous
[Figure: Database system design from the European Bioinformatics Institute (Hinxton, UK). Diagram labels: Submitters, Submission tools, Development DB, Production DB, Q/C etc., Add value (computation), Add value (review etc.), Service DB, Service Tools, Releases & Updates, Other archives, End Users.]
What is a database system?
• A database is a large, integrated
collection of data
• A database contains a model of
something!
• A database management system
(DBMS) is a software system designed to
store, manage and facilitate access to the
database
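A minimal sketch of the three roles named above (store, manage, facilitate access), using Python's built-in sqlite3 module as a stand-in DBMS. The `employee` table, its rows and the helper names are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_database() -> sqlite3.Connection:
    """Create a tiny in-memory database that models a company's staff."""
    conn = sqlite3.connect(":memory:")  # in-memory: nothing is written to disk
    # Store: the DBMS, not the application, decides how rows are laid out.
    conn.execute(
        "CREATE TABLE employee ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO employee (id, name, dept) VALUES (?, ?, ?)",
        [(1, "Asha", "Sales"), (2, "Biplob", "Accounts"), (3, "Chitra", "Sales")],
    )
    conn.commit()
    return conn


def names_in_department(conn: sqlite3.Connection, dept: str) -> list:
    """Access: say WHAT is wanted; the DBMS works out how to find it."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE dept = ? ORDER BY name", (dept,)
    )
    return [name for (name,) in rows]


conn = build_demo_database()
sales_staff = names_in_department(conn, "Sales")
print(sales_staff)
```

The application never opens a data file or parses a record format; it only talks to the DBMS through the query interface.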
What does a database system
do?
• Manages Very Large Amounts of Data
• Supports efficient access to Very Large
Amounts of Data
• Supports concurrent access to Very
Large Amounts of Data
• Supports secure, atomic access to Very
Large Amounts of Data
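The "atomic" part of the last bullet can be sketched on a very small scale with sqlite3. This is only an illustration under stated assumptions: the `account` table, the names and the `transfer` helper are hypothetical, and concurrency and security are not shown.

```python
import sqlite3


def make_accounts() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The CHECK constraint forbids negative balances.
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst. Either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = make_accounts()
ok_small = transfer(conn, "alice", "bob", 30)   # succeeds
ok_large = transfer(conn, "alice", "bob", 500)  # fails and is rolled back
print(ok_small, ok_large, balances(conn))
```

In the failed transfer the credit to `bob` had already been executed when the debit was rejected; the rollback is what keeps the database from ending up in a half-updated state.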
File System Vs DBMS
• A company has 500 GB of data
– Employee info
– Departments
– Sales
– Products
– Raw materials
– Shipment
– Accounts
• A 32-bit machine can address up to 4 GB of main memory
– How can we query this 500 GB of data?
– We must protect data from inconsistent updates.
– We must ensure that data is restored to a consistent state if the system crashes.
– We must secure data against viewing and updating by unauthorized users.
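The 4 GB figure follows directly from the address width. A short worked calculation, assuming one address per byte and binary gigabytes (2^30 bytes), neither of which the slide states explicitly:

```python
ADDRESS_BITS = 32
BYTES_PER_GB = 2 ** 30  # assumption: 1 GB = 2^30 bytes

# A 32-bit address can name 2^32 distinct bytes.
addressable_bytes = 2 ** ADDRESS_BITS
addressable_gb = addressable_bytes // BYTES_PER_GB

# The company's data set from the slide.
data_gb = 500
times_larger = data_gb / addressable_gb

print(f"addressable: {addressable_gb} GB, data is {times_larger:.0f}x larger")
```

The data is two orders of magnitude larger than anything the machine can hold in memory at once, so it has to live on disk and be brought in piece by piece, which is one of the jobs a DBMS takes over from the application.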
Databases are a Rich Area for
Computer Science
• Programming languages and software
engineering (obviously)
• Data structures and algorithms (obviously)
• Logic, discrete maths, computation theory
– Some of today’s most beautiful theoretical results are
in “finite model theory” --- an area derived directly
from database theory
• Systems problems: concurrency, operating
systems, file organisation, networks, distributed
systems…
Many of the concepts covered in this course are “classical” --- they form
the heart of the subject. But the field of databases is still evolving and
producing new and interesting research (hinted at in lectures 11 & 12).
What this course is about
• According to Ullman, there are three
aspects to studying databases:
1. Modelling and design of databases
2. Programming
3. DBMS implementation
• This course addresses 1 and 2
Course Outline
Lecture  Title
1        Introduction to databases
2        Entity-relationship model
3        The relational model
4        Relational algebra
5        Relational calculus
6        Schema refinement: functional dependencies
7        Schema refinement: normalization
8        Online analytical processing
9        Basic SQL and integrity constraints
10       Transactions, recovery, concurrency
11       Database storage, indexes, query execution
Recommended Reading
• Ramakrishnan & Gehrke, “Database Management Systems”.
• Elmasri & Navathe, “Fundamentals of database systems”, 4th ed.
• Silberschatz, Korth & Sudarshan, “Database system concepts”, 4th ed. (Text Book)
• Ullman & Widom, “A first course in database systems”.
• Date, “An introduction to database systems”, 8th ed.
• OLAP
– DB2/400: Mastering Data Warehousing Functions (IBM Redbook), Chapters 1 & 2 only. http://www.redbooks.ibm.com/abstracts/sg245184.html
– Data Warehousing and OLAP, Hector Garcia-Molina (Stanford University). http://www.cs.uh.edu/~ceick/6340/dw-olap.ppt
– Data Warehousing and OLAP Technology for Data Mining, Department of Computing, London Metropolitan University. http://learning.unl.ac.uk/csp002n/CSP002N_wk2.ppt
Some systems to play with
1. MySQL:
• www.mysql.org
• Open source, quite powerful
2. PostgreSQL:
• www.postgresql.org
• Open source, powerful
3. Microsoft Access:
• Simple system, lots of nice GUI wrappers
4. Commercial systems:
• Oracle 10g (www.oracle.com)
• SQL Server 2000 (www.microsoft.com/sql)
• DB2 (www.ibm.com/db2)
Database system architecture
• It is common to describe databases in two ways
– The logical level:
• What users see: the program or query language interface, which describes the stored data in terms of the company’s data model.
– The physical level:
• How files are organised and what indexing mechanisms are used.
• It is traditional to split the logical level into two: the overall database design (conceptual) and the views that various users get to see
• A schema is a description of a database
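To make "a schema is a description of a database" concrete, the sketch below asks SQLite for the description of a table that holds no data at all. The `product` table and the `describe_table` helper are hypothetical; `PRAGMA table_info` is SQLite-specific, and other systems expose their catalog differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)


def describe_table(conn: sqlite3.Connection, table: str) -> list:
    """Return (column name, declared type) pairs from SQLite's catalog."""
    # PRAGMA arguments cannot be bound as parameters, so the table name is
    # interpolated; only do this with trusted names.
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


product_schema = describe_table(conn, "product")
row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

print(product_schema)  # the description exists ...
print(row_count)       # ... even though the table is empty
```

The schema describes what the data may look like; the rows that currently satisfy it are a separate matter.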
Three-level architecture
External level:    External Schema 1, External Schema 2, …, External Schema n
Conceptual level:  Conceptual Schema
Physical level:    Internal Schema
• Physical level: describes physical storage structure.
• Conceptual level: describes the structure for the company users.
• External level: describes the view for external users.
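One common way to realise external schemas in a relational system is with views. The sketch below assumes a hypothetical `employee` table as the conceptual schema and two hypothetical views, `staff_directory` and `payroll_summary`, as external schemas for two different groups of users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual schema: the full logical description.
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER
    );
    INSERT INTO employee VALUES
        (1, 'Asha', 'Sales', 52000),
        (2, 'Biplob', 'Accounts', 61000);

    -- External schema 1: a directory that hides salaries.
    CREATE VIEW staff_directory AS
        SELECT name, dept FROM employee;

    -- External schema 2: totals per department, no individual rows.
    CREATE VIEW payroll_summary AS
        SELECT dept, SUM(salary) AS total_salary FROM employee GROUP BY dept;
    """
)

directory = conn.execute("SELECT * FROM staff_directory ORDER BY name").fetchall()
payroll = conn.execute("SELECT * FROM payroll_summary ORDER BY dept").fetchall()

# Column names visible through the first external schema.
cursor = conn.execute("SELECT * FROM staff_directory")
directory_columns = [column[0] for column in cursor.description]

print(directory_columns, directory, payroll)
```

Both views are computed from the same stored table; each user group sees only the shape of data that its external schema defines.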
Logical and physical data
independence
• Data independence is the ability to change the
schema at one level of the database system
without changing the schema at the next higher
level
• Logical data independence is the capacity to
change the conceptual schema without
changing the user views
• Physical data independence is the capacity to
change the internal schema without having to
change the conceptual schema or user views
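Both kinds of independence can be sketched with one unchanged user query. Everything named here (`employee`, `department`, `employee_v2`, `staff_directory`, the index name) is hypothetical, and the view has to be redefined by the database designer after the conceptual change; the point is only that the user's query text does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES
        (1, 'Asha', 'Sales'), (2, 'Biplob', 'Accounts'), (3, 'Chitra', 'Sales');
    CREATE VIEW staff_directory AS SELECT name, dept FROM employee;
    """
)

# The user's query is written once against the external schema.
USER_QUERY = "SELECT name FROM staff_directory WHERE dept = 'Sales' ORDER BY name"


def run_user_query() -> list:
    return [name for (name,) in conn.execute(USER_QUERY)]


before = run_user_query()

# Physical change: add an index. No schema above the internal level changes.
conn.execute("CREATE INDEX employee_dept_idx ON employee (dept)")
after_physical_change = run_user_query()

# Conceptual change: move department names into their own table, then
# redefine the view so that the external schema looks exactly as before.
conn.executescript(
    """
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT UNIQUE);
    INSERT INTO department (dept_name) SELECT DISTINCT dept FROM employee;
    CREATE TABLE employee_v2 (
        id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES department (dept_id)
    );
    INSERT INTO employee_v2
        SELECT e.id, e.name, d.dept_id
        FROM employee e JOIN department d ON d.dept_name = e.dept;
    DROP VIEW staff_directory;
    DROP TABLE employee;
    CREATE VIEW staff_directory AS
        SELECT e.name, d.dept_name AS dept
        FROM employee_v2 e JOIN department d ON d.dept_id = e.dept_id;
    """
)
after_conceptual_change = run_user_query()

print(before, after_physical_change, after_conceptual_change)
```

The same query string returns the same rows at all three points, which is what the two definitions above promise.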
Database design process
• Requirements analysis
– User needs; what must the database do?
• Conceptual design (next lecture)
– High-level description; often using E/R model
• Logical design
– Translate E/R model into (typically) relational schema
• Schema refinement
– Check schema for redundancies and anomalies
• Physical design/tuning
– Consider typical workloads, and further optimise
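As a small illustration of the logical design step, assume a hypothetical E/R model with entity sets Employee and Department and a many-to-one relationship "works in". One possible relational translation gives each entity set a table and represents the relationship as a foreign key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Entity set Department
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Entity set Employee; the many-to-one "works in" relationship
    -- becomes the dept_id foreign key.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department (dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales');
    INSERT INTO employee VALUES (1, 'Asha', 10);
    """
)


def try_insert_employee(emp_id: int, name: str, dept_id: int) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, dept_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid = try_insert_employee(2, "Biplob", 10)     # department 10 exists
dangling = try_insert_employee(3, "Chitra", 99)  # department 99 does not
print(valid, dangling)
```

The foreign key lets the DBMS itself reject an employee who works in a department that does not exist, so the relationship from the E/R model survives as an enforced rule rather than a convention.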
The Fundamental Tradeoff of Database
Performance Tuning
• De-normalized data can often result in faster
query response
• Normalized data leads to better transaction
throughput, and avoids “update anomalies”
(corruption of data integrity)
Yes, indexing data can speed up transactions, but this just proves
the point --- an index IS redundant data. General rule of thumb:
indexing will slow down transactions!
What is more important in your database --- query response
or transaction throughput? The answer will vary.
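The tradeoff can be seen even on three rows. The sketch below keeps the same facts twice, once normalized (`employee` plus `department`) and once de-normalized (`employee_flat`); all table names and data are hypothetical, and row counts stand in for real performance measurements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalized: each department name is stored once.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    -- De-normalized: the department name is repeated on every employee row.
    CREATE TABLE employee_flat (emp_id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT);

    INSERT INTO department VALUES (10, 'Sales');
    INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Biplob', 10), (3, 'Chitra', 10);
    INSERT INTO employee_flat VALUES
        (1, 'Asha', 'Sales'), (2, 'Biplob', 'Sales'), (3, 'Chitra', 'Sales');
    """
)

# Query side: the flat table answers with a single-table scan,
# the normalized design needs a join to produce the same answer.
flat_query = "SELECT name, dept_name FROM employee_flat ORDER BY emp_id"
join_query = (
    "SELECT e.name, d.dept_name FROM employee e "
    "JOIN department d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
)
same_answer = conn.execute(flat_query).fetchall() == conn.execute(join_query).fetchall()

# Update side: renaming the department touches one row when normalized,
# but every copy of the name when de-normalized.
rows_touched_normalized = conn.execute(
    "UPDATE department SET dept_name = 'Retail' WHERE dept_id = 10"
).rowcount
rows_touched_flat = conn.execute(
    "UPDATE employee_flat SET dept_name = 'Retail' WHERE dept_name = 'Sales'"
).rowcount
conn.commit()

print(same_answer, rows_touched_normalized, rows_touched_flat)
```

If an update to the flat table misses one of the copies, the table then disagrees with itself about the department's name, which is the kind of update anomaly the slide refers to.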
What do the extreme ends of the spectrum look like?
A Theme of this Course:
OLTP vs. OLAP
• OLTP = Online Transaction Processing
– Need to support many concurrent transactions
(updates and queries)
– Normally associated with the “operational database”
that supports day-to-day activities of an organization.
• OLAP = Online Analytical Processing
– Often based on data extracted from operational
database, as well as other sources
– Used in long-term analysis, business trends.
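A toy contrast between the two workload styles, run here against a single hypothetical `sale` table for brevity. In practice, as noted above, OLAP queries often run on data extracted from the operational database rather than on the operational database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY, product TEXT, region TEXT, amount INTEGER
    );
    INSERT INTO sale (product, region, amount) VALUES
        ('widget', 'north', 100),
        ('widget', 'south', 150),
        ('gadget', 'north', 200);
    """
)


def record_sale(product: str, region: str, amount: int) -> int:
    """OLTP style: a short transaction that writes a single row."""
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "INSERT INTO sale (product, region, amount) VALUES (?, ?, ?)",
            (product, region, amount),
        )
    return cursor.lastrowid


def revenue_by_region() -> list:
    """OLAP style: a read-only query that aggregates over the whole table."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
    ).fetchall()


new_sale_id = record_sale("gadget", "south", 50)
report = revenue_by_region()
print(new_sale_id, report)
```

An operational system runs many calls like `record_sale` concurrently, while a report like `revenue_by_region` reads a large share of the data at once; that difference is why the two workloads are often tuned, and stored, differently.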