Database Systems

advertisement
Database Systems
Exam and Syllabus
Spring 2014
1. Exam and syllabus
2. DBS courses at IfI
3. BSc theses, etc.
4. Beyond the DBS course
DBS14, SL10
1/27
M. Böhlen, ifi@uzh
The Exam
I
I
2/27
M. Böhlen, ifi@uzh
What is Important
The final exam is written and takes place
Tuesday, June 17, 10:15 - 12:00 in BIN 1.B.01
(see VVZ web page for details).
Auxiliary material during exam: 1 A4 sheet with notes.
I
Course web page: http://www.ifi.uzh.ch/dbtg/
I
The textbook is Database Systems by Elmasri and Navathe, 6th
edition.
DBS14, SL10
DBS14, SL10
3/27
M. Böhlen, ifi@uzh
I
Being precise is important.
I
Solving relevant examples is important.
I
Understanding material in detail; apply to new examples.
I
It is not sufficient to “know about it” or to “reproduce”.
I
You must understand and apply techniques learned during the
course.
I
Exercises are representative for exam and are the best preparation.
I
Simple and readable solutions are important (this has an impact on
your grade).
DBS14, SL10
4/27
M. Böhlen, ifi@uzh
Preparation Material
I
Syllabus/1
Exercise with solutions (practice to do exercises yourself before you
look at solution)
I
Lectures with reviews
I
Slides
I
I
I
Textbook
I
Exam 2013, 2012, 2011 and 2010 (note that the material has
evolved; e.g., definition of B+ tree; mapping from ER to relational
DB)
I
I
I
I
Elmasri and Navathe: chapters 3 and 6
relational model
relational algebra (RA):
σ, π, ∪, −, ×, ρ, 1, 1θ , ÷, ←, ϑ, d|><| , |><|d ,
domain relational calculus (DRC), FOPL
d|><|d
practice Cartesian product
practice quantifiers
move between RA, DRC, natural language
be precise (e.g., qualified names do not exist in RA; use renaming)
Open door policy in the database group
DBS14, SL10
5/27
M. Böhlen, ifi@uzh
Syllabus/2
I
Relational model, algebra, and calculus
I
I
I
I
SQL
I
I
I
I
I
I
I
I
I
I
I
DBS14, SL10
I
M. Böhlen, ifi@uzh
Constraints, triggers, views, DB programming
I
I
I
I
I
I
practice formulation of declarative queries
consider effects of duplicates and NULL values
solve and try out SQL solutions with PostgreSQL
I
6/27
Syllabus/3
Elmasri and Navathe: chapters 4 and 5
data definition language, data manipulation language
query expressions, query specifications, orthogonality
subqueries
duplicates
null values
logical update semantics
I
DBS14, SL10
I
important is systematic plan: input, output, modular SQL code
avoid trial and error
I
Elmasri and Navathe: chapters 5, 12 and 25
views, with clause
column constraints, table constraints, assertions, referential integrity
functions, triggers, stored procedures
expressiveness, recursion
know key concepts and their properties
know when and how to use these concepts; use as appropriate; helps
to solve specific problems or break down solutions into smaller parts
details of extended SQL syntax are not the crucial part
be conservative with SQL features
7/27
M. Böhlen, ifi@uzh
DBS14, SL10
8/27
M. Böhlen, ifi@uzh
Syllabus/4
I
Syllabus/5
Relational database design
I
I
I
I
I
I
I
I
I
I
I
Elmasri and Navathe: chapters 14 and 15
design goals, redundancy, keys
functional dependencies, Armstrong’s inference rules
1NF, 2NF, 3NF, BCNF, 4NF
normalization algorithm
dependency preservation, lossless join decompositions
closure, equivalence, minimal cover
I
I
I
I
I
9/27
M. Böhlen, ifi@uzh
Syllabus/6
I
I
I
I
I
I
I
I
I
I
DBS14, SL10
I
I
I
Elmasri and Navathe: chapters 7 and 8
conceptual design process: ER model, entities, attributes,
relationships
weak entities, specializations
8 step ER-to-relational mapping
ER diagrams: construct ER diagrams from real world descriptions
no unique solution; make clarifying assumptions during design
analyze strengths and weaknesses of an ER diagram
extend and modify ER diagrams
use and apply 8 step mapping algorithm
DBS14, SL10
10/27
M. Böhlen, ifi@uzh
Syllabus/7
Physical database design
I
I
definition of FDs and MVDs
definition of normal forms, dependency preservation and losless join
decomposition
application of inference rules and normalization algorithm
DBS14, SL10
Conceptual database design
I
Elmasri and Navathe: chapters 16 and 17
seek time, latency, block read time
file and buffer manager
indexing: secondary and primary index
B+ tree
extendable hashing
I
I
I
I
I
I
compute basic characteristics (nr of blocks, nr of IOs, etc)
know definitions of B+ tree and extendable hashing
apply algorithms on concrete examples
use B+ tree definitions and algorithms from slides (other solutions
are not valid)
11/27
Query processing and optimization
M. Böhlen, ifi@uzh
I
I
I
I
I
DBS14, SL10
Elmasri and Navathe: chapters 18
measures of query cost
sorting (external sort merge)
selection (scan, binary search, index)
join (nested loop, block nested loop, sort merge, hash join)
algebra trees, evaluation plans
heuristic and cost-based query optimization
compute cost of operations
algorithms for sorting, selection and join
transformation (rewriting) of relational algebra expressions
interpretation of query plans
12/27
M. Böhlen, ifi@uzh
Syllabus/7
I
Transaction processing
I
I
I
I
I
I
I
I
I
DBS14, SL10
schedules, deadlocks, recoverability
conflict, conflict serializability, conflict graphs
lock-based protocols, 2PL and variants
immediate and deferred database, redo, undo
13/27
M. Böhlen, ifi@uzh
Database Systems Courses @IfI
Database Systems, Spring (sem4)
I
Praktikum Datenbanksysteme, Fall (sem5)
I
Distributed Databases, Fall (sem5)
I
Database Management and Performance Tuning, Fall (sem5,
sem7)
I
Seminar in Database Systems, Spring (sem6, sem8, sem12)
I
XML and Databases, Spring (sem8, sem6)
I
Data Warehousing, Spring (sem8, sem6; even years only)
I
Nonstandard Databases, Fall (sem9, sem7)
I
BSc thesis, MSc thesis, independent studies (Vertiefung,
Facharbeit, etc)
15/27
DBS14, SL10
14/27
M. Böhlen, ifi@uzh
Andreas Geppert: Praktikum Datenbanksysteme
I
DBS14, SL10
Database System Courses at IfI
Elmasri and Navathe: chapters 20, 21, 22
transactions, schedules, serializability, recoverability, ACID properties
concurrency control, locking, protocols, deadlocks
transactions in SQL
recovery, database log
I
Since 2001 Andreas Geppert has been with
Credit Suisse (as database architect).
I
Before this he was a senior researcher in the
Database Technology Research Group.
I
He received his diploma in Computer Science from the University of
Karlsruhe (Germany) in 1989 and his PhD in computer science from
the University of Zurich (1994).
I
From August 1998 to August 1999 he was a visiting scientist at the
IBM Almaden Research Center.
Praktikum Datenbanksysteme
I
M. Böhlen, ifi@uzh
I
I
DBS14, SL10
Apply/practice your SQL knowledge on a case study
Dienstag 16:00 - 18:00
16/27
M. Böhlen, ifi@uzh
Andreas Geppert: Praktikum Datenbanksysteme
I
Syllabus
I
I
I
I
I
I
relational database systems
conceptual and logical design
query languages
triggers, stored procedures
application development in Java (JDBC)
Characteristic
I
I
I
Application development with PostgreSQL
Independent work with guidance
I
17/27
M. Böhlen, ifi@uzh
Pei Li: Database Management and Performance
Tuning
I
Lecture type: Lecture with exercises
I
Course content: The course aim to give an in-depth understanding
of the features that off-the-shelf database management systems
offer, in particular with respect to system performance. This
knowledge is used to tune the database system and its environment:
dimension the hardware for the database system, write efficient
queries, set effective indexes, communicate with the database
efficiently, and diagnose performance problems.
DBS14, SL10
I
I
I
I
Basic principles of how to tune database applications
Performance criteria for choosing a database management system
Testing particular aspects of systems with experimental data and
scripts
Dr. Can Türker heads the Data Integration Group
of the Functional Genomics Center Zurich (FGCZ).
He holds a Ph.D. degree in computer science (1999)
and is (co-)author of the several lecture books in the area of
databases, among others ’Object-Relational Databases’ and
’SQL:1999 & SQL:2003’.
I
Lecture Hours: Thursday 8-10am
I
Course content: XML is introduced with related technologies and it
is shown how XML can be used for storing, accessing, querying, and
updating data. The mapping between XML and databases as well as
specific requirements arising from the usage of XML for data
management are elaborated not only conceptually but are also
demonstrated practically using todays major database systems.
I
Prerequisite: The course expects background knowledge in database
systems (especially in SQL).
You are expected to:
I
DBS14, SL10
Have background knowledge in programming, database system, data
structures and algorithms
19/27
M. Böhlen, ifi@uzh
M. Böhlen, ifi@uzh
I
Fall semester 2013
Friday 14:00 - 16:00 (tentative)
What you will learn?
18/27
Can Türker: XML and Databases
When?
I
I
Pei Li is a senior researcher in Database Technology
Group at IfI. She got her PhD degree from the
University of Milano - Bicocca in 2013.
reservation system for car sharing (from the database to the web)
management of the car park, members and stations
DBS14, SL10
I
I
Application (use case)
I
I
Pei Li: Database Management and Performance
Tuning
DBS14, SL10
20/27
M. Böhlen, ifi@uzh
Can Türker: XML and Databases
Goal: This lecture deals with the interplay of two essential technologies,
namely XML and databases.
DBS14, SL10
21/27
M. Böhlen, ifi@uzh
Topics for BSc Theses, MSc
Theses, etc in the DBS Area
DBS14, SL10
22/27
M. Böhlen, ifi@uzh
BSc and MSc Theses in the DBS Area
I
I
I
I
We are happy to supervise BSc and MSc theses in the DBS area
Relevant for all specializations and degrees
Typically close interaction with an assistant with weekly meetings
Algorithms, software development, scientific writing
I
I
I
I
I
I
Beyond the DBS Course
Extension of PostgreSQL (parser, analyzer, query optimizer)
Incorporation of a load balancing algorithm into Apache Hadoop
Work with real world data (e.g., feedbase.ch)
Matrix factorization
Work goes into details
Contact assistant by email and ask for a meeting to discuss theses
DBS14, SL10
23/27
M. Böhlen, ifi@uzh
DBS14, SL10
24/27
M. Böhlen, ifi@uzh
Beyond the DBS Course
I
Beyond the DBS Course
Jennifer Widom’s (Stanford) recipe for database research:
1. pick a fundamental assumption underlying database systems
2. drop it
3. solve resulting problem
I
I
I
leads to: semistructured databases, XML databases
I
I
drop: tuple presence is certain
I
DBS14, SL10
I
M. Böhlen, ifi@uzh
Viel Erfolg!
Danke.
DBS14, SL10
27/27
M. Böhlen, ifi@uzh
leads to: MapReduce, Hadoop
drop: data is stored row by row
leads to: probabilistic databases
25/27
leads to: key values stores, NoSQL
drop: centralized processing of entire database
I
leads to: stream processing systems
leads to: data warehouses, OLAP
drop: sophisticated SQL queries
I
drop: data sets are persistent and disk-resident
I
drop: banking applications with short update transactions
I
I
drop: fixed structure (schema) of the data
I
I
DBS14, SL10
leads to: column stores
26/27
M. Böhlen, ifi@uzh
Download