Chapter 11: Physical Database Design Object-Oriented Systems Analysis and Design Joey F. George, Dinesh Batra, Joseph S. Valacich, Jeffrey A. Hoffer © Prentice Hall, 2007 11-1 Chapter Objectives After studying this chapter you should be able to: – Design database fields. – Evaluate denormalization situations. – Design file organization structures. – Design object-relational features. Chapter 11 © Prentice Hall, 2007 11-2 Chapter 11 © Prentice Hall, 2007 11-3 What Is Physical Database Design? The part of a database design that deals with efficiency considerations for access of data Processing speed, storage space, and data manipulation are key issues in physical database design Chapter 11 © Prentice Hall, 2007 11-4 Sometimes, the analyst and the designer are the same person, Deliverables Chapter 11 © Prentice Hall, 2007 11-5 Chapter 11 © Prentice Hall, 2007 11-6 What Is SQL? Structured Query Language Often pronounced “sequel” The standard language for creating and using relational databases ANSI Standards – SQL-92 – most commonly available – SQL-99 – included object-relational features Chapter 11 © Prentice Hall, 2007 11-7 Common SQL Commands CREATE TABLE – used to define table structures and link tables together SELECT – used to retrieve data using specified formats and selection criteria INSERT – used to add new rows to a table UPDATE – used to modify data in existing table rows DELETE – used to remove rows from tables Chapter 11 © Prentice Hall, 2007 11-8 Example CREATE TABLE Statement Here, a table called DEPT is created, with one numeric and two text fields. The numeric field is the primary key. Chapter 11 © Prentice Hall, 2007 11-9 Example INSERT Statement This statement inserts a new row into the DEPT table DEPTNO’s value is 50 DNAME’s value is “DESIGN” LOC’s value is “MIAMI” Chapter 11 © Prentice Hall, 2007 11-10 SELECT The SELECT, and FROM clauses are required. All others are optional. WHERE is used very commonly. Chapter 11 © Prentice Hall, 2007 11-11 Chapter 11 © Prentice Hall, 2007 11-12 SELECT Statement: Example 1 Select * from DEPT; Result: all fields of all rows in the DEPT table Chapter 11 © Prentice Hall, 2007 11-13 SELECT Statement: Example 2 Select * from EMP where ENAME = ‘SMITH’; Result: all fields for employee “Smith” Chapter 11 © Prentice Hall, 2007 11-14 SELECT Statement: Example 3 Select EMPNO, ENAME From EMP where JOB = ‘SALESMAN’ order by ENAME; Result: employee number, name and job for only salesmen from the EMP table, sorted by name Chapter 11 © Prentice Hall, 2007 11-15 What Is a Join Query? A query in which the WHERE clause includes a match of primary key and foreign key values between tables that share a relationship Chapter 11 © Prentice Hall, 2007 11-16 SELECT Statement: Example 4 Select EMPNO, ENAME, DNAME from EMP, DEPT where EMP.DEPT_NO = DEPT.DEPT_NO and DEPT.LOC = ‘CHICAGO’; Result: all employees’ number and name (from the EMP table, and their associated department names, obtained by joining the tables based on DEPT_NO. Only employees housed in department located in Chicago will be included Chapter 11 © Prentice Hall, 2007 11-17 SELECT Statement: Example 4 (cont.) Join queries almost always involve matching the primary key of the dominant table with the foreign key of the dependent table. Chapter 11 © Prentice Hall, 2007 11-18 What Is an Aggregation Query? A query results in summary information about a group of records, such as sums, counts, or averages These involve aggregate functions in the SELECT clause (SUM, AVG, COUNT) Aggregations can be filtered using the HAVING clause and/or grouped using the GROUP BY clause Chapter 11 © Prentice Hall, 2007 11-19 SELECT Statement: Example 5 Select JOB, Avg(SALARY) from EMP Group by JOB Having Avg(SALARY) >= 3000; The job name and average salary for each job of employees in the EMP table. Only jobs with average salaries exceeding $3000 will be included Chapter 11 © Prentice Hall, 2007 11-20 SELECT Statement: Example 5 (cont.) Note that clerks and salesmen are not included, because the average salaries for these jobs are below $3000. Chapter 11 © Prentice Hall, 2007 11-21 Example Data Manipulation Update EMP set SAL = 3000 where EMPNO = 7698; – Modifies the existing employee’s (7698) salary Delete from EMP where EMPNO = 7844 – Removes employee 7844 from the EMP table Chapter 11 © Prentice Hall, 2007 11-22 1. Designing Fields Field – the smallest unit of named application data recognized by system software such as a DBMS Fields map roughly onto attributes in conceptual data models Field design involves consideration of identity, data types, sizes, and constraints Chapter 11 © Prentice Hall, 2007 11-23 Data type – A coding scheme recognized by system software for representing organizational data Chapter 11 © Prentice Hall, 2007 11-24 Considerations for Choosing Data Types Balance these four objectives: 1. Minimize storage space Represent all possible values of the field Improve data integrity for the field Support all data manipulations desired for the field 2. 3. 4. Chapter 11 © Prentice Hall, 2007 11-25 Mapping a composite attribute onto multiple fields with various data types Chapter 11 © Prentice Hall, 2007 11-26 Creating and Using Composite Attribute Types Chapter 11 © Prentice Hall, 2007 11-27 Data Integrity Controls Default Values – used if no explicit value is entered Format Controls – restricts data entry values in specific character positions Range Controls – forces values to be among an acceptable set of values Referential Integrity – forces foreign keys to align with primary keys Null Value Controls – determines whether fields can be empty of value Chapter 11 © Prentice Hall, 2007 11-28 Referential integrity is important for ensuring that data relationships are accurate and consistent Chapter 11 © Prentice Hall, 2007 11-29 2. What Is Denormalization? The process of combining normalized relations into physical tables based on affinity of use of rows and fields, and on retrieval and update frequencies on the tables Results in better speed of access, but reduces data integrity and increases data redundancy Chapter 11 © Prentice Hall, 2007 11-30 This will result in null values in several rows’ application data. Chapter 11 © Prentice Hall, 2007 11-31 Chapter 11 © Prentice Hall, 2007 11-32 This will result in duplications of item descriptions in several rows of the CanSupplyDR table. Chapter 11 © Prentice Hall, 2007 11-33 Duplicate regionManager data Chapter 11 © Prentice Hall, 2007 11-34 3. What Is a File Organization? A technique for physically arranging the row objects of a file Main purpose of file organization is to optimize speed of data access and modification Chapter 11 © Prentice Hall, 2007 11-35 Chapter 11 © Prentice Hall, 2007 11-36 Secondary Storage Concepts Block – a unit of data retrieval from secondary storage Extent – a set of contiguous blocks Scan – a complete read of a file block by block Blocking factor – the number of row objects that fit in one block Chapter 11 © Prentice Hall, 2007 11-37 Determining Table Scan Time Block read time is determined by seek, rotation and transfer. Average table scan time equals #rows in table divided by blocking factor multiplied by block read time Chapter 11 © Prentice Hall, 2007 11-38 What Is a Heap? A file with no organization Requires full table scan for data retrieval Only use this for small, cacheable tables Chapter 11 © Prentice Hall, 2007 11-39 What Is Hashing? A technique that uses an algorithm to convert a key value to a row address Useful for random access, but not for sequential access Chapter 11 © Prentice Hall, 2007 11-40 What Is an Indexed File Organization? A storage structure involving indexes, which are key values and pointers to row addresses Indexed file organizations are structured to enable fast random and sequential access Index files are fast for queries, but require additional overhead for inserts, deletes, and updates Chapter 11 © Prentice Hall, 2007 11-41 Random Access Processing Using B+ Tree Indexes Indexes are usually implemented as B+ trees These are balanced trees, which preserve a sequential ascending order of items as they are added. Chapter 11 © Prentice Hall, 2007 11-42 Issues to Consider When Selecting a File Organization File size Frequency of data retrievals Frequency of updates Factors related to primary and foreign keys Factors related to non-key attributes Chapter 11 © Prentice Hall, 2007 11-43 Which Fields should be Indexed? Chapter 11 © Prentice Hall, 2007 11-44 4. Design of Object Relational Features Object-relatonal databases support: – Generalization and inheritance – Aggregation – Mulivalued attributes – Object identifiers – Relationships by reference (pointers) Chapter 11 © Prentice Hall, 2007 11-45 Generalization in Oracle 9i/10g Chapter 11 © Prentice Hall, 2007 11-46 Aggregation in Oracle 9i/10g Chapter 11 © Prentice Hall, 2007 11-47 Multivalued Attributes in Oracle 9i/10g Chapter 11 © Prentice Hall, 2007 11-48 Object Identifiers in Oracle 9i/10g Chapter 11 © Prentice Hall, 2007 11-49