Database Management Systems
CSE3DMS
Lecture 1: Review SQL & Database Storage
Copyright © La Trobe University

Topics
• Subject information
  – Subject objectives
  – Subject organization
  – Class materials
  – Assessment & requirements for passing CSE3DMS
  – Lab allocation
  – Staff
• Lecture 1: Review SQL programming

Objectives
• Building on the grounding gained in CSE2DBF (Database Foundations), to develop a greater understanding of database programming in Oracle SQL and of database management system implementation:
  – to extend skills in sound database development practices;
  – to gain knowledge of the principles and techniques for implementing database management systems;
  – to understand index structures, query optimization and transaction management in the design of database management systems.

Subject Organization
• Credit points: 15 for 12 weeks
• Class requirements
  – One two-hour lecture per week
    • Wed 2-4:00pm in room Hooper LT
  – 2 hours of tutorials/lab work per week
• Oracle11g programming - review
  – Lab & assignment
• Database management systems - 11 lectures
  – Lectures, tutorials, assignment & exam

Class Materials
Theoretical:
1. Textbook: Database systems: Models, languages, design, and application programming, Elmasri & Navathe, 6th Edition, Addison Wesley, http://search.lib.latrobe.edu.au/LATROBE:All:Almalu21169496160002146
2. Reference book: Database System Concepts, 4th-6th Ed., Silberschatz, Korth and Sudarshan, http://search.lib.latrobe.edu.au/LATROBE:All:Almalu21161516840002146
3. Programming
   • Guide to Oracle 10g, Morrison & Morrison, Thomson Course Technology, 2006, ISBN 0619216298, http://search.lib.latrobe.edu.au/LATROBE:In_the_Library:Almalu21159075050002146

Class Materials
• LMS: https://lms.latrobe.edu.au/login/index.php
  – Subject Learning Guide
  – News
  – Lecture notes
  – Lab instructions are available on the subject web site
• Some supplementary lab data files are available from the LMS
• 11 laboratory sessions.
• Oracle documents related to lab questions

Assessment
1. Weekly lab work: 20% over 10 weeks (each week earns 2 marks, weeks 2-11)
2. Assignment: 30% (50%)
   – Assignment 30% - general for all
   – Group assignment (2)
   – TBA (Mon 10:00am)
   – Design and implementation in SQL
3. Exam: 50% (50% exam hurdle)
   – CSE3DMS 2-hour examination (120)
   – SAH-E (hurdle supplementary in July)
   – Set time aside for this supplementary exam if you fail the exam hurdle

Assessment
• Laboratory attendance for supplementary hurdle
• To pass, a student must receive at least 50% overall and pass the hurdle requirement: 50% for the final examination
• Policy on plagiarism: http://latrobe.libguides.com/academic-integrity/ethical_use

Laboratory allocation
• You need to attend the lab class, and you may attend only your allocated laboratory session.
• Please go to the following link to enrol in one lab class. (This year, we can have only one lab.)

Laboratory   Monday           Tuesday
Time         11.00 - 1.00pm   11.00 - 1.00pm
Room         BG115            BG114

Staff
• Lecturer: Dr Jinli Cao, Room: BG217, Phone: 3035, Email: j.cao@latrobe.edu.au
  Consultation time: Tues 4-6pm
• Tutor: Christopher Tao

Lecture 1
• Client/Server Databases & the Oracle11g Relational Database
• Review SQL Programming
• Disk Storage, Basic File Structures, and Hashing
• Reading: Chapters 1-3 of the textbook by Morrison
• Refer to Fundamentals of Database Systems by Elmasri & Navathe, Chapter 16

Types of Database Systems
• Personal databases (DBMS)
  – DBMS and database applications run on the same workstation
  – Used primarily for creating single-user database applications
  – Can support small multiuser database applications by storing the database application files on a file server instead of on a single user’s workstation
• Client/server database
  – DBMS server process runs on one workstation, and the database applications run on separate client workstations across the network

Client/Server Database Management Systems
• Client/server database
  – The client computer runs user
    interface programs and database applications
    • that retrieve and manipulate small amounts of data from the database through the server
  – The database system on the server side stores and manages large numbers of records, serves them to clients, and maintains the consistency of the databases.
• A DBMS can use a 2-tier or 3-tier client/server architecture
• Organizations generally use a client/server database if the database will have more than 10 simultaneous users

[Figure: Client/Server Database Architecture]

The Oracle11g Client/Server Database
• Oracle11g is the latest release of Oracle Corporation’s relational database
• All Oracle server- and client-side programs use Oracle Net, a utility that enables network communication between the client and the server

Client-Side Utilities
• SQL*Plus
  – for creating and testing command-line SQL queries and executing PL/SQL procedural programs
• Oracle SQL Developer
  – free integrated development environment that simplifies the development and management of Oracle Database.
    It offers complete end-to-end development of your PL/SQL applications, a worksheet for running queries and scripts, a DBA console for managing the database, a reports interface, a complete data modelling solution, and a migration platform for moving your third-party databases to Oracle
• Oracle10g Developer Suite
  – for developing database applications; it includes the following developer tools:
    • Forms Builder: for creating custom user applications
    • Reports Builder: for creating reports for displaying, printing, and distributing summary data
• Enterprise Manager
  – for performing database administration tasks such as creating new user accounts and configuring how the DBMS stores and manages data

What we have learned -- SQL
• Structured Query Language (SQL)
• Define Oracle11g database schemas
• Create database tables using SQL*Plus
• Modify and delete database tables using SQL*Plus
• Debug Oracle11g SQL commands and use Oracle Corporation online help resources
• View information about your database tables using Oracle11g data dictionary views

Structured Query Language
• The standard query language for relational databases
• Data definition language (DDL)
  – Create new database objects
  – Modify or delete existing objects
• Data manipulation language (DML)
  – Insert, update, delete, and view database data

Data dictionary
• Data dictionary: tables that contain information about the structure of the database.
  – USER_ views: show the objects in the current user’s schema
  – ALL_ views: show both the objects in the current user’s schema and the objects that the user has privileges to manipulate
E.g.
user_tables, all_tables, user_constraints, etc.

Viewing Tables in the Database
• To see the structure of the database, query the user_tables and all_tables views

Viewing Constraints on One Table
• View the constraints on the Faculty table by querying the user_constraints view

Create Synonyms & Copy Tables
• A synonym can provide an alias for any object, in any schema in a database, assuming that the user has privileges to view the underlying object
  – It makes an object appear as if you own it, because you do not have to use a schema prefix when querying or performing other tasks with the object

CREATE [ OR REPLACE ] [ PUBLIC ] SYNONYM [<schema>.]<synonymname>
  FOR [<schema>.]<objectname>[@<databaselinkname>];
DROP [ PUBLIC ] SYNONYM [<schema>.]<synonymname> [ FORCE ];

• Example: a DBA can do this:
CREATE PUBLIC SYNONYM customers FOR jinli.customer;
  Every user in the same database can then use customers to view the table.
• To copy a table:
SQL> CREATE TABLE orders AS SELECT * FROM jinli.orders;

Data Dictionary Information on Views, Sequences, and Synonyms
• Three views are of interest:
  – ALL_VIEWS
    • VIEW_NAME
    • TEXT
  – ALL_SEQUENCES
  – ALL_SYNONYMS
• More on the user's data dictionary
  – user_objects
  – user_source
  – user_errors

Deleting Table Rows
• ERROR:
  – Cannot delete a row if it has a child row
  – Another table uses this record

Deleting Table Rows (continued)
• Child row
  – A row whose foreign key value references the row being deleted
  – Cannot delete a row if it has a child row
    • Unless you first delete the row in which the foreign key value exists
• TRUNCATE syntax
  – TRUNCATE TABLE tablename;
• Cannot truncate a table that is referenced by enabled foreign key constraints
  – Must disable the constraints first
Guide to Oracle 10g

Clearwater Traders Table Relationships
• Understand the constraints of the database.
• Can you drop the table inventory?

Truncate tables and Alter Constraints
• To delete a table: DROP TABLE (removes the data and the table)
• To keep the table but delete the data use the TRUNCATE TABLE command.
  Note: before you truncate a table, you have to disable the foreign key constraints that reference it

Truncate tables and Alter Constraints
• For example: to delete all data from the table Inventory, you first have to disable the foreign key constraints on inv_id in the tables order_line and shipment_line

ALTER TABLE order_line DISABLE CONSTRAINT order_line_inv_id_fk;
ALTER TABLE shipment_line DISABLE CONSTRAINT shipment_line_inv_id_fk;
TRUNCATE TABLE Inventory;

  After you have repopulated the table inventory, re-enable each constraint (shown here for order_line; shipment_line_inv_id_fk is re-enabled the same way):

ALTER TABLE order_line ENABLE CONSTRAINT order_line_inv_id_fk;
  or
ALTER TABLE order_line MODIFY CONSTRAINT order_line_inv_id_fk ENABLE VALIDATE;

Creating Transactions and Committing New Data
• Transaction: a series of action queries that may change the database; it is a logical unit of work
• User can commit (save) changes
• User can roll back (discard) changes
• Pending transaction: a transaction waiting to be committed or rolled back
• Oracle DBMS locks records associated with pending transactions
• Other users cannot modify locked records, and they see only the last committed values until the transaction is committed

Commit and Roll Back in SQL*Plus
• Transactions begin automatically with the first command
• Type COMMIT to commit changes
• Type ROLLBACK to roll back changes

Savepoints
• A bookmark that designates the beginning of an individual section of a transaction (SAVEPOINT savepoint_name;)
• Changes are rolled back to the savepoint (ROLLBACK TO savepoint_name;)

Creating New Sequences
• CREATE SEQUENCE command
  – DDL command
  – No need to issue COMMIT command
  – Sequences hold automated counter values, most commonly for integer-valued primary keys

Sequences
• Syntax:
CREATE SEQUENCE [<schema>.]<sequencename>
  [ START WITH <n> ]
  [ INCREMENT BY <n> ]
  [ MINVALUE <n> | NOMINVALUE ]
  [ MAXVALUE <n> | NOMAXVALUE ]
  [ CYCLE | NOCYCLE ]
  [ CACHE <n> | NOCACHE ]
  [ ORDER | NOORDER ];
DROP SEQUENCE [<schema>.]<sequencename>;
• Example:
CREATE SEQUENCE loc_id START WITH 1;
• A sequence allows for generation of unique, sequential values.

Using Sequences
• Pseudocolumn
  – Acts like a column in a database table (for example, when selected from DUAL)
  – Is actually a command that returns a specific value
• CURRVAL
  – Returns the most recent sequence value retrieved in the current session
• NEXTVAL
  – Returns the next available sequence value
  – sequence_name.NEXTVAL

Using Sequences (continued)
• DUAL
  – Simple one-row table owned by SYS and accessible to all users
  – More efficient to retrieve pseudocolumns from DUAL
SELECT sequence_nm.NEXTVAL FROM DUAL;
• Example
SELECT loc_id.NEXTVAL FROM DUAL;   -- each read advances loc_id to its next value
SELECT loc_id.CURRVAL FROM DUAL;

Using Sequences
• Example:
SELECT loc_id.NEXTVAL FROM DUAL;
INSERT INTO location (LOC_ID) VALUES (loc_id.NEXTVAL);
• DUAL is a table in the SYS schema.
• You can delete a sequence with the DROP SEQUENCE command

Viewing Sequence Information
• Query the USER_SEQUENCES data dictionary view
  – The sequence_name column displays sequence names
SELECT sequence_name FROM user_sequences;

Grant Sequences
• Grant all privileges on the sequence to all database users
GRANT ALL ON loc_id_sequence TO PUBLIC;

Modifying the SQL*Plus Display Environment
• SQL*Plus page consists of:
  – Specific number of characters per line
  – Specific number of lines per page
• linesize property
  – Specifies how many characters appear on a line
• pagesize property
  – Specifies how many lines appear on a page
• Example:
SET linesize 200
SET pagesize 100

Modifying the SQL*Plus Display Environment
• You can change the size of the SQL*Plus window
• SQL*Plus command: SET WRAP OFF truncates rows that are longer than the line width

What is AFIEDT.BUF?
• It is the SQL*Plus default edit save file.
• When you issue the command "ed" or "edit" on its own, the last SQL or PL/SQL command is saved to a file called AFIEDT.BUF and opened in the default editor.
• Select an editor before using this command:
DEFINE _EDITOR=notepad
• The buffer file "afiedt.buf" appears when you type "ed":
SQL> ed
  -- enter "/" to run the command.

What is AFIEDT.BUF?
• You can override the default edit save file's name like this:
SQL> SET EDITFILE "c:\sqlspool\buffer.sql"
SQL> ed "c:\sqlspool\buffer.sql"
• The system opens the file in your default editor (notepad).
• You can work on your code in the editor and paste it into the SQL*Plus window to run.
• Close the editor and click "Save" in the pop-up to keep the commands in the editor.
• Return to SQL*Plus and rerun the query with the slash (/) command.

Formatting SQL*Plus Reports
• You can define a more useful column heading with the HEADING clause of the COLUMN command
• Syntax:
COLUMN column_name HEADING column_heading
• Examples
COLUMN S_LNAME HEADING 'LAST NAME Melbourne'
COLUMN S_DOB HEADING 'BIRTHDAY'
SELECT S_LNAME, S_DOB FROM STUDENT WHERE CITY = 'Melbourne';
• Splitting a column heading
COLUMN S_LNAME HEADING 'LAST NAME |Melbourne'

Formatting SQL*Plus Reports
[Screenshots: original column headings, changed column headings, split column headings]
• You can change the underline character to + using SET UNDERLINE +

Formatting SQL*Plus Reports
• Changing the default display
COLUMN s_LAST FORMAT A8
  – Column s_LAST is cut to 8 characters
  – Column s_LAST is split into 2 rows

Content
• Storage of databases
  – Disk storage devices
• Records, Blocks, Files, Files of Records
• Unordered Files
• Ordered Files
Refer to the textbook: Database systems: Models, languages, design, and application programming, Elmasri & Navathe, 6th Edition, Addison Wesley

Storage of Databases
• The collection of data must be stored physically on some computer storage medium.
• Computer storage media form a storage hierarchy that includes two main categories:
  1. Primary storage. Storage that can be operated on directly by the computer’s central processing unit (CPU), such as the computer’s main memory and smaller but faster cache memories.
  2. Secondary and tertiary storage.
     Magnetic disks, optical disks (CD-ROMs, DVDs, and other similar storage media), and tapes. Hard-disk drives are classified as secondary storage, whereas removable media such as optical disks and tapes are considered tertiary storage. Data in secondary or tertiary storage cannot be processed directly by the CPU; first it must be copied into primary storage and then processed by the CPU.

Storage of Databases
• Magnetic disks are the preferred secondary storage device for high storage capacity and low cost.
• Data is stored as magnetized areas on magnetic disk surfaces.
• A disk pack contains several magnetic disks connected to a rotating spindle.
• Each disk surface is divided into concentric circles; each circle is called a track.
• Tracks with the same diameter on the various surfaces in a disk pack are called a cylinder.
• The number of tracks on a disk ranges from a few hundred to a few thousand, and the capacity of each track typically ranges from tens of Kbytes to 150 Kbytes.

Disk Storage Devices (cont.)
• A track is divided into smaller blocks or sectors.
• The division of a track into sectors is hard-coded on the disk surface and cannot be changed.
  – One type of sector organization, shown in Fig 16.1(a), calls a portion of a track that subtends a fixed angle at the center a sector. Fig 16.1(b) shows another type of organization.
• A track is divided into blocks.
  – The block size B is fixed for each system during initialization.
    • Typical block sizes range from B=512 bytes to B=4096 bytes.
  – Whole blocks are transferred between disk and main memory for processing.

Disk Storage Devices (cont.)
• A read-write head moves to the track that contains the block to be transferred.
  – Disk rotation moves the block under the read-write head for reading or writing.
• A physical disk block (hardware) address consists of:
  1. a cylinder number (imaginary collection of tracks of same radius from all recorded surfaces)
  2. the track number or surface number (within the cylinder)
  3.
     and block number (within track).
• Reading or writing a disk block is time consuming because of the seek time s and rotational delay (latency) rd.
• Double buffering can be used to speed up the transfer of contiguous disk blocks.

Records
• Fixed and variable length records
• Records contain fields which have values of a particular type
  – E.g., amount, date, time, age
• Fields themselves may be fixed length or variable length
• Variable length fields can be mixed into one record:
  – Separator characters or length fields are needed so that the record can be “parsed.”

Blocking
• Blocking:
  – Refers to storing a number of records in one block on the disk.
• Blocking factor (bfr) refers to the number of records per block.
• There may be empty space in a block if an integral number of records do not fit in one block.
• Spanned records:
  – Refers to records that exceed the size of one or more blocks and hence span a number of blocks.

Files of Records
• A file is a sequence of records, where each record is a collection of data values (or data items).
• A file descriptor (or file header) includes information that describes the file, such as the field names and their data types, and the addresses of the file blocks on disk.
• Records are stored on disk blocks.
• The blocking factor bfr for a file is the (average) number of file records stored in a disk block.
• A file can have fixed-length records or variable-length records.

Files of Records (cont.)
• File records can be unspanned or spanned
  – Unspanned: no record can span two blocks
  – Spanned: a record can be stored in more than one block
• The physical disk blocks that are allocated to hold the records of a file can be contiguous, linked, or indexed.
• In a file of fixed-length records, all records have the same format. Usually, unspanned blocking is used with such files.
• Files of variable-length records require additional information to be stored in each record, such as separator characters and field types.
  – Usually spanned blocking is used with such files.

Unordered Files
• Also called a heap or a pile file.
• New records are inserted at the end of the file.
• A linear search through the file records is necessary to search for a record.
  – This requires reading and searching half the file blocks on average, and is hence quite expensive.
• Record insertion is quite efficient.
• Reading the records in order of a particular field requires sorting the file records.

Ordered Files
• Also called a sequential file.
• File records are kept sorted by the values of an ordering field.
• Insertion is expensive: records must be inserted in the correct order.
  – It is common to keep a separate unordered overflow (or transaction) file for new records to improve insertion efficiency; this is periodically merged with the main ordered file.
• A binary search can be used to search for a record on its ordering field value.
  – This requires reading and searching about log2(b) blocks on average, where b is the number of file blocks, an improvement over linear search.
• Reading the records in order of the ordering field is quite efficient.

Average Access Times
• The following table shows the average access time to access a specific record for a given type of file (b is the number of file blocks):

Type of file        Access method     Average blocks read
Unordered (heap)    Linear search     b/2
Ordered             Binary search     log2(b)

Summary
• Disk Storage Devices
• Files of Records
• Operations on Files
• Unordered Files
• Ordered Files

Next Lecture
• Next lecture: Indexes.
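
Worked example: blocking factor and search cost
The blocking factor and the search costs quoted on the Unordered Files and Ordered Files slides can be checked with a little arithmetic. The sketch below is only an illustration: the file size (30,000 records), record size (100 bytes) and block size (1024 bytes) are assumed figures, not values from the lecture, and the function names are made up for this example. It assumes fixed-length, unspanned records and counts block reads only, ignoring seek time and rotational delay.

```python
import math


def blocking_factor(block_size: int, record_size: int) -> int:
    """Records per block for fixed-length, unspanned records: floor(B / R)."""
    if record_size <= 0 or record_size > block_size:
        raise ValueError("record size must be positive and fit in one block")
    return block_size // record_size


def blocks_needed(num_records: int, bfr: int) -> int:
    """Number of file blocks b = ceil(r / bfr), using integer arithmetic."""
    return -(-num_records // bfr)


def avg_linear_search_blocks(num_blocks: int) -> float:
    """Unordered file: on average half of the blocks are read (b / 2)."""
    return num_blocks / 2


def binary_search_blocks(num_blocks: int) -> int:
    """Ordered file: about log2(b) block reads, rounded up, at least one."""
    return max(1, math.ceil(math.log2(num_blocks)))


# Assumed example figures (hypothetical, for illustration only).
B = 1024      # block size in bytes
R = 100       # record size in bytes
r = 30_000    # number of records in the file

bfr = blocking_factor(B, R)
b = blocks_needed(r, bfr)

print("blocking factor bfr =", bfr)                              # 10
print("unused bytes per block =", B - bfr * R)                   # 24
print("file blocks b =", b)                                      # 3000
print("linear search, avg blocks =", avg_linear_search_blocks(b))  # 1500.0
print("binary search, blocks =", binary_search_blocks(b))        # 12
```

With these assumed figures, a lookup on the ordering field of an ordered file reads about 12 blocks, against about 1,500 for a linear search of the same file stored as a heap. The 24 unused bytes per block are the empty space mentioned on the Blocking slide.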