Logical Data Modeling Review Lecture for University of Agder, Grimstad DAT202 Databaser (5.5.11) Judith Molka-Danielsen j.molka-danielsen@himolde.no Overview Real world problem: Represent order taking activity in logical relational database that will allow quick and accurate retrieval of data. Relational models are dimensional – all pieces are not in one physical file. We need to convert a physical database to a logical relational data model, using relations between customers and orders of products. To convert the physical world to a logical data model we must: 1. Identify candidate keys 2. Select primary keys (PK) 3. Apply normalization Goal of the prior steps is to produce an Entity Relationship Diagram (ERD). Data elements • Order Clump- related data file – each field is a data element – OrderNr – OrderDate – CustomerNr – CustomerName – CustomerAddress – ItemNr – ItemName – ItemQuantity – ItemPrice – ItemAmount CustNr CustName Quantity Price ItemAmount 1 2 …many Line items are occurrences of every item associated with an order. Keys Candidate keys – We look for attributes that uniquely define the data element. For example, a personal number can be used to uniquely define an employee. • There can be more than one candidate key. The employee can be found by an EmployeeID also. But the winner should be the one that is used the most to find the employee, and this becomes the Primary Key (PK). • In our example we pick OrderNr as the PK. No other order has the same order number. • Other keys can be used to find the record, but are not included in the logical data model. (secondary or alternative keys) • Foreign key (FK) – a common data element between two groups of associated items. This can be between two or more primary keys. EmployeeID Primary Key (PK) PersonalNr Secondary or alternative key Some ERD defintions • • • • Entity – object, thing. A file in logical modeling is called an entity. Entity type – a class of entity, e.g. customer. Entity instance (occurrence) – a specific entity, ”customer number 2000” Attribute – properties of an object. Elements become attributes. Attributes specify properties of an entity, e.g. CustName • Relationship – association among entities. Describes the connection between two entities. Relationships between entities of the same type have the same attributes. • Degree – number of occurrences of each entity (1:m), (1:1), or (m:m). • Dependency – means changes in an element (OrderNr) will change what you are looking at in the data. Entities and relationships in the ER can map to relationships in the relations database (RDB). Not all relationships in the ER map to relations in the RDB. Normalization • Normalization has two purposes: – Removal of data redundancy – Data integrity is maintained • Integrity – ”Completeness: unity. Soundness. Complete and not corrupted. ” • Normalizaton protects integrity of data, because it provides referential integrity between data fields. • If you violate Normalization, the integrity of the data is at risk. Normalization – The rules of one • • • • One type of items = one entity (table) One Item = one tuple (row) One descriptive fact = one attribute (column) Every attribute must describe the entity Normalization • 1st Normal Form (1.NF) – There must be no repeating elements (no repeating groups). • 2nd Normal Form (2.NF) – There can be no partial dependencies on part of the PK. • 3rd Normal Form (3.NF) – There can be no non-key attributes dependent on another non-key attribute (that is we cannot have 2 elements dependent on each other). This is called ”decomposition” – breaking the data down into dimensions. How do we achieve Normal Form? 1st Normal Form - example Test for 1NF: are there repeating groups of elements? If so, you must create a new entity. The new element will have a concatenated PK. For example: (OrderNr, ItemNr) (The first grouping is Not in 1NF) ItemNr ItemName ItemPrice ItemQuantity ItemAmount ItemDescrip Some elements will be left in the old grouping, and groups may be further decomposed. Order Clump OrderNr CustNr CustName CustAddress Order TotalAmount Moving over elements to a new entity. This is 1NF. OrderNr ItemNr ItemName ItemPrice ItemQuantity ItemAmount ItemDescrip OrderNr ItemNr 100 a 100 b 100 200 200 Item ItemNr 2nd Normal Form - example • Test 2NF – do we have a partial dependency? Is it c possible a non-key entity (ItemName) is dependent on one portion of the concatenated key (OrderNr, b ItemNr), instead of on the whole? r • Example – if no one already ordered item ”z” then you cannot add the item. Some try to create a fake order to accommodate this, but can cause problems later. If 2NF test fails -> create a new entity • Take a portion of the key that causes the violation to bring over to the new entity. (ItemName was only dependent on ItemNR.) – If an entity is in 1NF and it does not have a ItemName concatenated key, then it is already in 2NF. ItemPrice ItemDescrip – If an entity fails 1NF, then you must test for 2NF. – ItemQuanity and ItemAmount are dependent on the whole order, so they should stay in the old entity. 3rd Normal Form - example Customer CustNr • Test 3NF: No non-key attributes can be dependent on another. Are there any? • CustName is dependent on CustNr and is a CustName violation of 3NF. CustAddress • Create a new entity. • The PK of the new entity will be the non-key attribute of the old one that is really not nonkey. CustNr must be a key. • But now there is no longer a relationship between other groups (entities). So you need a foreign key (FK). • CustNr becomes the PK in the new entity, and becomes the FK in the Orders attribute. The FK stores a pointer to where the data resides in the Customer table. 3NF (side issues) • Another violation, is if you have a derived data element. If you look at ItemAmount it is derived from ItemQuantity. (Q x P = A) • So we take out ItemAmount. • You can take out the derived entity, and then you have to calculate them each time you do a query. But, if you change price or change quantity, then you have to remember to change amount, which could cause integrity problems. • We then have to add an element, OrderItemPrice that is diffferent from the ItemPrice, in case the quantity changes over time. It will then change the derived amount. ERD Order (OrderNr, ItemNR) (PK) CustNr (FK) ERD Items (ItemNr) (PK) -this is a simple PK (ItemNr, OrderNr) -are a complex FK ERD Customer CustNr (PK) - this is a simple PK • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • • // SCRIPT FOR Standard SQL // Generated from Modelator 03.05.2011 17:57:42 Model : Orders-Manage-simple-v2.mr4 // DROP DROP TABLE Customer; DROP TABLE Order; DROP TABLE Items; // TABLES CREATE TABLE Customer ( CustNr INTEGER NOT NULL, CustName VARCHAR(25), CustAddress VARCHAR(75), PRIMARY KEY ( CustNr ) ); CREATE TABLE Order ( ItemNr INTEGER NOT NULL, OrderNr INTEGER NOT NULL, CustNr INTEGER, ItemQuantity INTEGER, OrderItemPrice DECIMAL(4,1), PRIMARY KEY ( ItemNr, OrderNr ) ); CREATE TABLE Items ( ItemNr INTEGER NOT NULL, OrderNr INTEGER NOT NULL, ItemName VARCHAR(25), ItemDescrip VARCHAR(75), ItemPrice DECIMAL(4,1), PRIMARY KEY ( ItemNr, OrderNr ) Modellator 4.0 allows exporting a model to a database. Here only the script is exported. • • • • • • • • • // FOREIGN KEYS ALTER TABLE Order ADD FOREIGN KEY ( CustNr ) REFERENCES Customer ( CustNr ); • • • • • • • • • • • • • • • • • • • ALTER TABLE Items ADD FOREIGN KEY ( ItemNr, OrderNr ) REFERENCES Order ( ItemNr, OrderNr ); // INDEXES CREATE INDEX ItemNr ON Items ( ItemNr ASC ); // INDEXES ON FOREIGN KEYS CREATE INDEX FK1_Customer_Order ON Order ( CustNr ); // COMMENTS ON TABLES AND COLUM NS // GRANT ACCESS • • • • GRANT SELECT, INSERT, UPDATE, DELETE ON Customer TO PUBLIC; GRANT SELECT, INSERT, UPDATE, DELETE ON Order TO PUBLIC; GRANT SELECT, INSERT, UPDATE, DELETE ON Items TO PUBLIC; You should be able to change the logical model when the physical world changes: Can add relations and entities/attributes later.. For example • Sellers – add sellers information • ItemCategory – add category codes • Order –add order date • May choose to remove derived values (e.g. Price * Quantity = Amount)