Section 08 - REVIEW E-R Diagrams • The Entity-Relationship Approach – Represents reality using welldefined graphics and rules • Basic building blocks are “things” (entities) and relationships Member M 1 Adopts Animal E-R Diagrams • Advantages – Theoretical foundation (Set Theory) – Good for communication – Build E-R Model, then translate to any type of RDBMS • Disadvantages – Different (yet another new thing to learn) – Must translate to the relational model E-R Diagrams • Entity-Relationship Model: Basic Concepts – Entity • Thing, Object, Concept of interest to the enterprise • Each occurrence can be uniquely identified E-R Diagrams • Entity-Relationship Model: Basic Concepts – Attribute • Property of an entity • Column E-R Diagrams • Entity-Relationship Model: Basic Concepts – Relationship • Association between two (or more) entities E-R Diagrams • Entity-Relationship Model: Basic Concepts – Entity Identifier • Attribute(s) whose value uniquely identifies an entity • Primary Key E-R Diagrams • What is an Entity? – Physical entity types • Person • Building • Machine • Book • Usually Singular E-R Diagrams • What is an Entity? – Conceptual entity types • Contract • Account • Order • Course E-R Diagrams • What is an Entity? – Event entity types • Transaction • Shipment • Reservation • Phone Call • Seminar Offering E-R Diagrams • Entity-Relationship Model: Diagrams – Example: Member Adopts Animal – Soft Rectangle represents entities • Noun • Singular – Connecting Line represents relationships • Verb E-R Diagrams • Relationships have Characteristics – A relationship has Cardinality (Degree) One-to-One One-to-Many Many-to-Many E-R Diagrams • Each entity’s participation is Mandatory or Optional • Cardinality & Optionality are based on business rules Mandatory Optional E-R Diagrams • Mandatory – Every instance of the entity MUST participate in the relationship – Example: • Every animal is cared for by at least one employee E-R Diagrams • Optional – An instance of the entity CAN participate in the relationship – Example: • Some employees do not take care of animals E-R Diagrams • Determining Optionality & Cardinality – Optionality & Cardinality • Specify lower and upper bounds of each entity’s participation in the relationship • Use one of the following templates E-R Diagrams • Template 1 One ________(can/must) ________ (one and only one/one or more) __________ • Template 2 One ________ ________a minimum of (0/1) and a maximum of (1/many) __________ E-R Diagrams • Use either template – Read each relationship twice • Left to Right • Right to Left E-R Diagrams • Guidelines to Develop an E-R Diagram – Identify the Major Entities – Identify the Attributes for each entity – Determine the Unique Identifier(s) – Identify the Relationships – Assign Cardinality – Determine Optionality – Resolve M:N Relationships E-R Diagrams • Mapping the E-R Diagram to the Relational Database – Each entity becomes a Table – Each attribute becomes a Column – Unique Identifier becomes the PK – Each 1:M becomes a FK on the Many Side E-R Diagrams • Practice 01 – A company has ten departments – A company has five divisions – A company has one hundred employees – Each employee must work for one department – Each division has two departments E-R Diagrams • Practice 02 – A company has twenty employees – Each employee works for a department – There are two departments in the company E-R Diagrams • Practice 03 – A company has three divisions – A company has one manager per division – Each manager is in charge of one committee E-R Diagrams • Practice 04 – A company has a sales department with fifteen salespersons – Each salesperson works for the sales department – Each salesperson is supervised by one manager – The managers may not have an employee to supervise E-R Diagrams • Practice 05 – A piece of equipment is built with ten parts – The parts come from suppliers – All parts are held in inventory until needed to build a piece of equipment E-R Diagrams • Practice 06 – There are two hundred students – Each student must attend an orientation – An orientation is held at the beginning of each semester – Students attend the orientation in their first or second semester E-R Diagrams • Practice 07 – There are forty rooms in a dorm – Each room in the dorm holds two students – There are five dorms on campus – Each dorm has four floors E-R Diagrams • Practice 08 – Each faculty member advises fifty students – Each student has an advisor – There are faculty that do not advise students E-R Diagrams • Practice 09 – Students enroll in courses – Courses are taught each semester – Students receive a final grade for each course – Each course has a maximum number of students enrolled – Each course has a minimum number of students enrolled E-R Diagrams • Practice 10 – Basketball players sign contracts – Contracts are good for one to three years – Some players play in a game – Not all players may play in a game – Some players may be injured – Some injuries require a hospital visit – Hospitals take care of patients – Some hospital patients are basketball players End 10-21-05 Normalization • Normalization using Codd’s Rules – Codd and contemporaries developed rules for “Normal Forms” • 1NF • 2NF • 3NF – Normal levels to do in database design • Boyce/Codd NF – 3.5NF • 4NF • 5NF Normalization Class Enrolment Class Code Class Description Student Number Name 503 Mgt Info Systems 00001 00003 00005 Masters, Rick Smith, Steve Jones, Terry 540 Quant Methods Wallace, Fred Smith, Steve Nurk, Sterling 00002 00003 00004 Normalization 1NF Class Enrolment Class Code Class Description Student Number Name 503 Mgt Info Systems 00001 Masters, Rick 503 Mgt Info Systems 00003 Smith, Steve 503 Mgt Info Systems 00005 Jones, Terry 540 Quant Methods 00002 Wallace, Rusty 540 Quant Methods 00003 Smith, Steve 540 Quant Methods 00004 Nurk, Sterling Normalization 2NF Class Enrolment Class Code Class Description Student Number Name 503 Mgt Info Systems 00001 Masters, Rick 503 Mgt Info Systems 00006 Smith, Steve 503 Mgt Info Systems 00005 Jones, Terry 540 Quant Methods 00002 Wallace, Rusty 540 Quant Methods 00003 Smith, Steve 540 Quant Methods 00004 Nurk, Sterling Normalization 2NF & 3NF CLASS STUDENT Class Code Class Description Student Number Name 503 Mgt Info Systems 00001 Masters, Rick 540 Quant Methods 00006 Smith, Steve 00005 Jones, Terry 00002 Wallace, Rusty 00003 Smith, Steve 00004 Nurk, Sterling Normalization • Rules for 1NF, 2NF, & 3NF – 1NF • Break out repeating groups into a separate entity – 2NF • Break out attributes that are dependent on part of the primary key into a separate entity • Called Partial Dependency – 3NF • Break out attributes that are wholly dependent on another key (not PK) into a separate entity • Called Transitive Dependency Normalization • Normalization Cont’d – A relation is in 3NF if all the attributes are functionally dependent • On the Key • On the Whole Key, and • On Nothing but the Key –(So Help Me Codd) Functional Dependency & Normalization • How to Normalize Data using Functional Dependencies – Definition of Functional Dependency • Given a relation R, attribute Y of R is functionally dependent on attribute X of R, if and only if each X value in R has associated with it precisely one Y-value in R (at any one time) Functional Dependency & Normalization • Y of R is Dependent on X of R • X functionally determines Y X Y WARD NAME WARD TYPE NO. OF BEDS SENIOR NURSE PATINET NO PATIENT NAME DATE OF BIRTH Liston Orthopedic 6 J Bryan 45812 D Carter 21/02/65 Liston Orthopedic 6 J Bryan 71384 R Willis 08/10/46 Liston Orthopedic 6 J Bryan 69355 G Barnes 17/06/41 Godlee General 10 V Fox 52217 M Brown 21/02/35 Godlee General 10 V Fox 10823 R Willis 12/03/54 Functional Dependency & Normalization • Example 01 Subscriber Name Magazine Number Code Magazine Start Date End Date 101 Jones TIM Time 3/1993 2/1999 110 Allen NEW Newsweek 2/1996 1/1999 SCI Science 6/1994 5/2000 NEW Newsweek 2/1994 1/1999 TIM Time 5/1994 4/2001 TIM Time 5/1994 4/2001 202 Smith Functional Dependency & Normalization • Example 01 • Normalization begins with the arrangement of information into tables with rows and columns such that repeating groups of information have been eliminated, that is, the "cells" have data with atomic values. In addition, normalized tables should have some data field(s) which is unique for all rows. • In this case, because SMITH has two identical subscriptions, we need to invent a new field, namely SUBSCRIPTION NUMBER, in order to insure uniqueness, i.e. no duplicate rows Functional Dependency & Normalization • Example 01 • 1NF – Resulting Table Subscriber Name Subscription Magazine Number Number Code Magazine Start Date End Date 101 Jones 001 TIM Time 3/1993 2/1999 110 Allen 002 NEW Newsweek 2/1996 1/1999 110 Allen 003 SCI Science 6/1994 5/2000 202 Smith 004 NEW Newsweek 2/1994 1/1999 202 Smith 005 TIM Time 5/1994 4/2001 202 Smith 006 TIM Time 5/1994 4/2001 Functional Dependency & Normalization • Example 01 • Functional Dependency A central concept of the normalization process is the functional dependency. Simply put, a functional dependency exists between two data fields when for each distinct value of one field, there is only one possible value for the other field. Functional Dependency & Normalization • Example 01 • For example, if we assume that SUBSCRIBER NUMBER is a uniquely assigned number for each subscriber, then there is a functional dependency between SUBSCRIBER NUMBER and NAME. We could say that SUBSCRIBER NUMBER functionally determines NAME or, conversely,that NAME is functionally dependent upon SUBSCRIBER NUMBER. Functional Dependency & Normalization • Example 01 • This functional dependency and others are shown below, using a convenient notation, i.e. " A-->B.“ • SUBSCRIBER NUMBER-->NAME • MAGAZINE CODE-->MAGAZINE • SUBSCRIPTION NUMBER-->SUBSCRIBER NUMBER, NAME, MAGAZINE CODE,MAGAZINE, START DATE, END DATE Functional Dependency & Normalization • Example 01 • It is critical to this process to fully understand the underlying assumptions about the information that is being normalized. Successful normalization is, for all practical purposes, impossible without understanding the meaning and usage of information. Functional Dependency & Normalization • Example 01 • In our example, our assumptions are: • (1) SUBSCRIBER NUMBER is uniquely assigned to each subscriber. • (2) MAGAZINE CODE is a convenient unique code for each magazine name. • (3) SUBSCRIPTION NUMBER is uniquely assigned to each subscription and therefore functionally determines all fields. Functional Dependency & Normalization • Example 01 • 2NF • Information which is in Second Normal Form has the quality that some field (or fields) functionally determines all of the others. This field(s) is called a primary key. Building Second Normal Form tables is simply the mechanical process of making tables out of the functional dependencies and noting which field(s) is the primary key. The following notation shows our new tables in Second Normal Form. Functional Dependency & Normalization • Example 01 • 2NF Subscriber Number KEY Magazine Code KEY Subscriber Name Magazine Name Subscriber Table Magazine Table Subscription Subscriber Name Magazine Magazine Start End Number Number Code Name Date Date KEY Subscription Table Functional Dependency & Normalization • Example 01 • At this point in the normalization process, the task is to determine if the Second Normal Form tables are also in Third Normal Form. Third Normal Form tables are defined as tables where there is no functional dependency between non-key fields. Functional Dependency & Normalization • Example 01 • The SUBSCRIBER and MAGAZINE tables are therefore in Third Normal Form since each has only one non-key field. The SUBSCRIPTION table, however, has several functional dependencies between non-key fields. They are: • SUBSCRIBER NUMBER --> NAME • MAGAZINE CODE --> MAGAZINE. • This table can be made into Third Normal Form by very simply eliminating the dependent field(s). Functional Dependency & Normalization • Example 01 • 3NF Subscriber Number KEY Magazine Code KEY Subscriber Name Magazine Name Subscription Start End Number Date Date KEY Subscriber Table Magazine Table Subscription Table Functional Dependency & Normalization • Example 01 • Having reached Third Normal Form, it should be possible to identify foreign keys. Foreign keys are some field(s) in one table which is(are) the primary key in another. Functional Dependency & Normalization • Example 01 • In this case, SUBSCRIBER NUMBER in the SUBSCRIPTION table is a foreign key referencing the primary key of the SUBSCRIBER table. And MAGAZINE CODE in the SUBSCRIPTION table is a foreign key referencing the primary key of the MAGAZINE table. Subscription Number Subscriber Number Magazine Code Primary Key Foreign Key Foreign Key Start End Date Date Normalization • Practice 01 Student# Advisor Adv-Room Class1 Class2 Class3 1022 Jones 412 101-07 143-01 159-02 4123 Smith 216 201-01 211-02 214-01 Normalization • Practice 01 • 1NF Student# Advisor Adv-Room Class# 1022 Jones 412 101-07 1022 Jones 412 143-01 1022 Jones 412 159-02 4123 Smith 216 201-01 4123 Smith 216 211-02 4123 Smith 216 214-01 Normalization • Practice 01 • 2NF Student: Student# Advisor Adv-Room 1022 Jones 412 4123 Smith 216 Registration: Student# Class# 1022 101-07 1022 143-01 1022 159-02 4123 201-01 4123 211-02 4123 214-01 Normalization • Practice 01 • 3NF Students: Student# Advisor 1022 Jones 4123 Smith Faculty Advisor Room Jones 412 Smith 216 Normalization • Practice 01 • Final Tables Student: Registration: Student# Advisor Student# Class# 1022 Jones 1022 101-07 4123 Smith 1022 143-01 1022 159-02 4123 201-01 4123 211-02 Advisor Room 4123 214-01 Jones 412 Smith 216 Faculty Normalization • Practice 02 Class Enrollment Class Code Class Description Student Number 503 540 Mgt Info Systems Quant Methods Name 00001 Masters, Rick 00003 Smith, Steve 00005 Jones, Terry 00002 Wallace, Fred 00003 Smith, Steve 00004 Nurk, Sterling Normalization • Practice 02 • 1NF Class Enrollment Class Code Class Description Student Number Name 503 Mgt Info Systems 00001 Masters, Rick 503 Mgt Info Systems 00003 Smith, Steve 503 Mgt Info Systems 00005 Jones, Terry 540 Quant Methods 00002 Wallace, Rusty 540 Quant Methods 00003 Smith, Steve 540 Quant Methods 00004 Nurk, Sterling Normalization • Practice 02 • 2NF • 3NF CLASS Class Code Class Description 503 Mgt Info Systems 540 Quant Methods STUDENT Student Number Name 00001 Masters, Rick 00002 Wallace, Rusty 00003 Smith, Steve 00004 Nurk, Sterling 00005 Jones, Terry Normalization • Practice 03 Project number Project name Employee number Employee name Rate Category Hourly rate 1023 Madagascar travel site 11 Vincent Radebe A $60 12 Pauline James B $50 16 Charles Ramoraz C $40 11 Vincent Radebe A $60 17 Monique Williams B $50 1056 Online Estate agency Normalization • Practice 03 • 1NF Project number Project name Employee Employee Rate number name category Hourly rate 1023 Madagascar travel site 11 Vincent Radebe A $60 1023 Madagascar travel site 12 Pauline James B $50 1023 Madagascar travel site 16 Charles Ramoraz C $40 1056 Online estate agency 11 Vincent Radebe A $60 1056 Online estate agency 17 Monique Williams B $50 Normalization • Practice 03 • 2NF Cont’d PROJECT Project number Project name 1023 Madagascar travel site 1056 Online estate agency Normalization • Practice 03 • 2NF EMPLOYEE Employee Employee Rate number name category Hourly rate 11 Vincent Radebe A $60 12 Pauline James B $50 16 Charles Ramoraz C $40 17 Monique Williams B $50 Normalization • Practice 03 • 3NF EMPLOYEE RATE Employee number Employee name 11 Rate category Hourly rate Vincent Radebe A $60 12 Pauline James B $50 16 Charles Ramoraz C $40 17 Monique Williams De-Normalization • De-normalization – De-normalization means combining two (or more) tables • Usually done when tables are frequently joined – De-normalization (joining two tables) depends on usage • Depends on how applications and users access the data De-Normalization • De-normalization is done to improve performance – Tailors data structures for one specific application’s use – Improves performance of one type of access at expense of others De-Normalization • De-normalization Trade-Offs Normalization De-normalization Eliminates update anomalies Improves performance for specific application(s) Minimizes data redundancy Supports simpler logic Provides applicationindependent database design Encourages sharing of data De-Normalization • When to De-Normalize – This is EVIL, Do Not Do… – When does de-normalization have minimal impact? • Data is accessed primarily on a read-only basis • Data is accessed primarily by one application De-Normalization • When to de-normalize – After database design is done and tables are normalized to 3NF – After clustering related tables in the same logical container – After considering trade-offs and usage of data De-Normalization • Alternatives to de-normalization – Physical placement of data • Use of container • Can improve performance without impacting logical design – Selective hardware upgrades • More main memory, expanded storage, cache storage devices Fragmentation • Fragmentation – Better alternative to denormalization – Means breaking one table into two (or more) tables • Usually done when one table is very large • Or groups of user almost exclusively access a subset of data in a table Fragmentation • Fragmentation can be based on selection or projection – Must be able to reconstruct the original table – by union or join – Primary key column(s) must be included in all vertical fragments • Disadvantage is that the user must be aware of all the fragmented tables Physical Design • Physical Database Design – Goals • Improve performance –By minimizing disk I/O • Improving management of the data –By grouping tables that can be managed as a group Physical Design • Steps in Physical Design Process – Determine which tables can be managed as a group • Many RDBMSs support the concept of a Container (Oracle Tablespace, db space, Access uses the .mdb) –A collection of tables, and indexes Physical Design – Develop a plan for allocating tables to disk devices • Consider parallel disk controllers • Group tables together that are frequently joined • Distribute heavily accessed table to different disk devices –To avoid excessive head movement on one disk Physical Design – Build indexes on table columns, based on frequency of use – Restructure tables if necessary • Fragment large tables into multiple smaller ones • De-normalize tables if appropriate Physical Design • Indexes – Index is a separate structure (table) • Points into the data table • Built on one or more columns in the data table Physical Design • Comments on Indexing – An index can be built on any column or combination of columns – An index can be unique or non-unique – An index on the primary key is called the primary index – Most RDBMSs use an internal row id as the pointer to the row – Use of the index is transparent to the user Physical Design • Use of an index – Provides access to a row based on data value(s) – Avoids duplicates – only way – Supports sequential processing on the indexed field – Improves performance Physical Design • Use of an index improves performance on Retrieval – Processing an index is more efficient than processing a table – for reads • Index is usually small, relative to the table –Can be held entirely in memory • The smaller the index value, the more entries per block the more likely the index will be in memory Physical Design • Use of index degrades performance on Updates – Inserting a row is the source of much disk I/O (overhead) • Every index on the table must be searched and updated also Physical Design • Data Types – Depends on the conventions used by a particular database – ORACLE uses: • NUMBER • CHAR - Characters • VARCHAR2 - Characters • DATE/TIME • LOB Physical Design • NUMBER – Numerical data – Guaranteed to 38 digits accuracy – NUMBER(10) – 10 digits allowed • CHAR – Character data – Fixed-Length up to 2,000 bytes – Good for 2 or 3 characters – Y/N, T/F, USA/CAN – CHAR(2) – 2 characters allowed Physical Design • VARCHAR2 – Character data – Variable-Length up to 4,000 bytes – VARCHAR2(15) – up to 15 characters • DATE/TIME – Date & Time data – DATE – DD-MON-YY (Default) – TIME – HH:MN:SE (Default) Physical Design • LOB – Large OBject data type • CLOB –Long variable length characters • BLOB –Binary objects – Video, Sound, Graphics • BFILE –Reference to an OS file – Up to 4GB of data per file SQL • Structured Query Language – DDL – Data Definition Language • CREATE • DROP • ALTER SQL • Structured Query Language – DML – Data Manipulation Language • SELECT • INSERT • DELETE • UPDATE SQL • Structured Query Language – DCL – Data Control Language • GRANT • REVOKE SQL • Primary Key SQL – PRIMARY KEY (student_ID) • Foreign Key SQL – CONSTRAINT student_ID FOREIGN KEY (student_id) REFERENCES Student(student_ID) SQL • Practice 01 • Create these Tables with SQL Student: Registration: Student# Advisor Student# Class# 1022 Jones 1022 101-07 4123 Smith 1022 143-01 1022 159-02 4123 201-01 4123 211-02 Advisor Room Dept 4123 214-01 Jones 412 42 Smith 216 42 Faculty SQL • Practice 01 • Registration Table – CREATE TABLE registration • (Student# NUMBER(5), • Class# NUMBER(5)); SQL • Practice 01 • Student Table – CREATE TABLE student • (Student# NUMBER(5), • Advisor VARCHAR2(12)); SQL • Practice 01 • Faculty Table – CREATE TABLE faculty • (Advisor VARCHAR2(12), • Room NUMBER(4), • Dept NUMBER(2)); SQL • Practice 02 • Create using SQL SUPPLIER SUPPLIER_ID NAME LOCATION ZIPCODE 10024 Best Buy OH 45502 13467 Circuit City WV 36709 45001 Staples KY 20065 SQL • Practice 02 • Supplier Table – CREATE TABLE supplier • (Supplier_ID Number(5), • Name VARCHAR2(20), • Location CHAR(2), • Zipcode Number(5)); SQL • Practice 03 • Create using SQL PARTS Part# Part_Name Part_Loc Part_Price Z143028G Widget 114 100 G45610B Thingy 232 500 WAREHOUSE WHSE# WHSE_Size WHSE_City WHSE_State WHSE_Status 114 24000 Rio Grande OH Full 232 3200 Charleston WV Empty SQL • Practice 03 • Parts Table – CREATE TABLE Parts • (Part# VARCHAR2(12), • Part_Name VARCHAR2(20), • Part_Loc NUMBER(3), • Part_Price NUMBER(4)); SQL • Practice 03 • Warehouse Table – CREATE TABLE warehouse • (WHSE# NUMBER(3), • WHSE_Size NUMBER(6), • WHSE_City VARCHAR2(15), • WHSE_State CHAR(2), • WHSE_Status VARCHAR2(10)); SQL • Practice 04 • Use SQL to find the following: – How many people make $10,000 per month EMPLOYEE Employee# Employee_Name Salary 1001 Smith 10000 1002 Jones 12000 1003 Thomas 8000 1004 Harrison 9500 SQL • Practice 04 – SQL • SELECT employee#, employee_name, salary • FROM employee • WHERE salary = 10000; Employee# Employee_Name Salary 1001 Smith 10000 SQL • Practice 05 • Use SQL to find the following: – Who makes more than $8000 but less then $10000 EMPLOYEE Employee# Employee_Name Salary 1001 Smith 10000 1002 Jones 12000 1003 Thomas 8000 1004 Harrison 9500 SQL • Practice 05 – SQL • SELECT employee#, employee_name, salary • FROM employee • WHERE salary > 8000 AND salary < 10000; Employee# Employee_Name Salary 1004 Harrison 9500 SQL • Practice 06 • Use SQL to find the following: – Give me a list of employees and their salaries in alphabetical order? EMPLOYEE Employee# Employee_Name Salary 1001 Smith 10000 1002 Jones 12000 1003 Thomas 8000 1004 Harrison 9500 SQL • Practice 06 – SQL • SELECT employee_name Name, salary • FROM employee • ORDER BY employee_name; Name Salary Harrison 9500 Jones 12000 Smith 10000 Thomas 8000 END REVIEW