4190.301; Spring 2009 Prof. Sang-goo Lee ((13:00pm: p Mon & Wed: Room 302-208)) INTRODUCTION TO DATABASE SYSTEMS Syllabus Text Book Exams (tentative dates) Midterm Exam: 4/22 (Wed) Final Exam: 6/10 (Wed) Database System Concepts, 5th Edition, A. Silberschatz, H. F. Korth, and S. Sudarshan, McGraw Hill, 2006. à Rf Reference à à 2~3 programming assignments To be announced later Database Systems, Atzeni, et al, McGraw Hill, 2000. Lecture Notes à T Term Project P j Grades will be posted before class at http://europa.snu.ac.kr username & password required Please use only for personal use Exams: 30% each Term Projects: 30% total Quizzes, Assignments, etc: 10% A score of 0 in • any one of the exams, or • more than 50% of your reports, projects, or assignments/quizzes will result in F. Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 2 CHAPTER 1. INTRODUCTION Contents Data & Database Database Management Systems View of Data Data Models Data Languages Database Users Transaction Management S Storage M Management Data Mining & Analysis Overall System Structure Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 4 Data & Database Data à A fformall description d i ti off an entity, event, phenomena, or idea that is worth recording Database An integrated collection of persistent i d data representing the information of interest for various programs that compose the computerized information system of an organization. organization à Data are separated from the programs that use them à à à à Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 5 What is a Database? Examples! Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 6 What do we do with it? Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 7 What do we do with it? (cont.) Most (all?) computing applications use some type of a database CRM ERP MIS Data Mining E-Commerce ERP Database Database b Original Slides: © Silberschatz, Korth and Sudarshan Database Database Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 8 Database Management System (DBMS) Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Warehouse Chap 1 - 9 Database Management System (DBMS) WH manager WH Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 10 Database Management System (DBMS) Database Order processing Employee Products Customer Users DBMS Payroll Analytic Reports Inventory Sales Applications Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 11 Database Management System (DBMS) Set of programs to access the data DBMS provides id an environment i t that th t is i both b th convenient i t andd efficient ffi i t to use. Database Applications (Information Systems): à à à à à à Banking: all transactions Airlines: reservations, schedules U i Universities: iti registration, i t ti grades d Sales: customers, products, purchases Manufacturing: production, inventory, orders, supply chain Human resources: employee l records, d salaries, l i tax deductions d d i Databases touch all aspects of our lives Commercial Systems à à DB2, Oracle, Informix, BADA, MS SQL Server, Sybase, dBase, FoxPro, Access Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 12 File Systems File System à à à Part of OS Stores programs, data, documents, or anything (in disk) In the early days, days database applications were built on top of file systems Drawbacks of using file systems to store data: Data redundancy and inconsistency Multiple file formats, duplication of information in different files à Difficulty in accessing data Need to write a new p program g to carryy out each new task à Data isolation — multiple files and formats à Integrity problems Integrity constraints (e.g. account balance > 0) become part of program code Hard to add new constraints or change existing ones à Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 13 File Systems (cont.) Drawbacks of using file systems (cont.) Atomicity At i it off updates d t Failures may leave database in an inconsistent state with partial updates carried out E.g. transfer of funds from one account to another should either complete or not happen at all à Concurrent access by multiple users Concurrent accessed needed for performance Uncontrolled concurrent accesses can lead to inconsistencies E g two people reading a balance and updating it at the same time E.g. à Security problems à Database systems y offer solutions to all the above problems p Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 14 Levels of Abstraction Physical level describes how a record (e.g., customer) is stored in a ph physical sical de device. ice Logical level: describes data stored in database, and the relationships among the data. data type customer = record name : string; street : string; city : integer; end; View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes. Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 15 Data Independence ability to modify a schema in one level without affecting a schema definition in the next ne t higher level le el physical data independence: à physical level - conceptual level logical data independence: à conceptual p level - view level View 1 Customer ‥‥ View n Account Conceptual Level pointer or table? Physical Level Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 16 Instances and Schemas Similar to types and variables in programming languages Schema – the logical structure of the database e.g., the database consists of information about a set of customers and accounts and the relationship p between them)) à Analogous to type information of a variable in a program à Physical schema: database design at the physical level à Logical L i l schema: h d t b design database d i att the th logical l i l level l l à Instance – the actual content of the database at a particular point in time à Analogous to the value of a variable Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 17 Scheme (schema) à th skeletal the k l t l structure t t off the th data d t content t t Customer Name Account Address Telephone No No. Type Balance Instance the actual content of the data at a given time à database status à 2009/3/3/12:00 Customer 2009/2/20/12:00 Customer HS Kim Suwon 323-3232 HS Kim Seoul 323-3232 KS Lee Busan 323-5454 KS Lee Busan 323-5454 MH Choi Seoul 553-3235 PL Park Seoul 553-3235 KH Na Yongin 545-5488 ⋯ Original Slides: © Silberschatz, Korth and Sudarshan ⋯ Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 18 Data Models A collection of tools for describing ddata t à data relationships à data semantics à data constraints à Entity-Relationship model Relational model Other models: à à à object-oriented model semi-structured data models Older models: network model and hierarchical model Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 19 Entity-Relationship Model Example of schema in the entity-relationship model Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 20 Entity Relationship Model (cont.) E-R model of real world Entities E titi (objects) ( bj t ) E.g. customers, accounts, bank branch à Relationships between entities E.g. Account A-101 is held by customer Johnson Relationship set depositor associates customers with accounts à Widely Wid l used d ffor ddatabase t b design d i à Database design in E-R model usually converted to design in the relational model (coming up next) which is used for storage and processing Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 21 Relational Model columns Represent data in a tabular form rows Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 22 Database Languages Data Definition Language (DDL) à Used U d ffor ddefining fi i DB Schema S h create table drop column Data Manipulation Language (DML) à Used for operating the data in the DB (DB instance) Retrieve Insert Delete Change Query a statement requesting the retrieval of information à query language: part of DML à sometimes “query language = DML” à Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 23 SQL The most widely used language à à E g find the name of the customer with customer-id 192-83-7465 E.g. select customer.customer-name from customer where customer.customer-id = ‘192-83-7465’ à E.g. g find the balances of all accounts held byy the customer with customer-id 19283-7465 à select account.balance from depositor, account where depositor.customer-id = ‘192-83-7465’ and depositor.account-number = account.account-number Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 24 Database Users Users are differentiated by the way they expect to interact with the system s stem Application programmers – interact with system through DML calls S hi i d users – form Sophisticated f requests in i a database d b query llanguage Specialized users – write specialized database applications that do nott fit into i t the th traditional t diti l data d t processing i framework f k Naïve users – invoke one of the permanent application programs that have been written previously à E.g. people accessing database over the web, bank tellers, clerical staff Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 25 Database Administrator Coordinates all the activities of the database system; the database administrator has a good understanding nderstanding of the enterprise’s information resources and needs. Database administrator's administrator s duties include: à à à à à à à Schema definition Storage structure and access method definition Schema and physical organization modification Granting user authority to access the database Specifying integrity constraints Acting as liaison with users Monitoring performance and responding to changes in requirements Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 26 Transaction Management Transaction à a collection ll i off operations i that h performs f a single i l logical l i l function f i in i a database application à p programmer g is responsible p for writingg “correct” transactions DBMS must ensure the atomicity and durability of each transaction à atomicity : all-or-nothing à durability : effect should be persistent Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 27 Storage Management DBMS must effectively and efficiently manage storage (disk) space Storage S manager à a program module à that provides the interface à between the low-level data stored in the database à and the application pp p programs g and qqueries submitted to the system y Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 28 Data Mining & Analysis The challenge is “getting information out”! Knowledge Discovery in Databases à Extract information from the database à Information retrieval Textual data files (documents) à Find the most relevant document(s) for the given information need (query) à Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 29 Overall System Structure Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 30 Application Architecture Two-tier Two tier architecture: E.g. E g client programs using ODBC/JDBC to communicate with a database Three-tier architecture: E.g. web-based applications, and applications li i built b il using i “middleware” “ iddl ” Original Slides: © Silberschatz, Korth and Sudarshan Intro to DB (2009-1) Copyright © 2008 - 2009 by S.-g. Lee Chap 1 - 31 END OF CHAPTER 1