Database Design Examples-1 22/03/2004 3 step design Conceptual Design Highest level design Issues: data types, relationships, constraints Uses ER model Logical Design Implementation of conceptual model 3 ways: hierarchical, network, relational Apply relational model Uses RA (relational algebra) as a formal query language Physical Design Actual computer implementation Issues: mem. manag., storage, indexing ER model The most popular conceptual data modeling technique. (Give an example of other conceptual design tool?) Integrates with “relational model”. The scenario is partitioned into “entities” which are characterized with “attributes” and interrelated via “relationships”. “Entity set” is a set of entities of the same type. Entity is an independent conceptual existence in the scenario. An attribute (or a set of attributes) that uniquely identifies instance of an entity is called the key. 1-) Super key 2-) candidate key (minimal superkey) 3-) primary key (candidate key chosen by DBA) An attribute can be single-valued or multi-valued. cont.. At least 2 entities participate in a relationship. (give an example of recursive relationship) Binary relationship, ternary relationship… (give an example of ternary relationship) Cardinality constraints: 1-1, 1-N, N-M Relationships may have their own attributes Example of an ER diagram: m1 E1 a1 1 R N E2 a3 Can we migrate a3? IF we can, to where? a2 d cont.. Weak Entity set: An entity set that does not have enough attributes to form a primary key. Think of Transaction entity set(transaction#, date, amount) assuming that different accounts might have similar transactions. We need a strong entity set (owner set) in order to distinguish the entities of weak entity set. Question: Is there a way to represent this kind of scenario without using another entity set? account Acc# 1 R date N transaction amount Tran# Relational Model Data representation model introduced by Codd, 1970. E-R RM Relation Table Attributes Columns Table is an unordered collection of tuples(rows). Degree of a relation is the # of columns. Data types of attributes: DOMAINS int, float, character,date, large_object (lob), user-defined data types(only for ORDBs) cont.. Logical consistency of data is ensured by certain constraints: key :: every relation must have PK key entity integrity :: no PK can be NULL referential integrity :: value of attributes of the foreign key either must appear as a value in PK of another table (or the same table, give an example) or must be null. Definitions: PrimaryKey is chosen among the candidate key by DBA. ForeignKey is set of attributes in a relation which is duplicated in another relation. ERRM Rules such as Erwin, Oracle Designer 2000, Rational Rose can translate ER to RM. 4 steps for transformation: S/w packages (CASE tools) 1.) Map each entity set in ER into a separate table in RM (Also, map the attributes, and PK) 2.) Weak entity set with attributes (a1,..an) and owner set attributes (b1,b2,..bm): MAP it to a table with {(a1,..an) U (b1,b2,..bm)} attributes. (b1,b2,..bm) becomes the foreign key. {(a1,..an) U discriminator} becomes the PK. cont.. 3.) Binary Relationship S between R1 and R2 entity sets. Assume (a1,a2,…an) is the attributes of S. If cardinality is 1-1: Chose either relations( say S) and extend it with {PK(T) U (a1,a2,…an)} If cardinality is 1-N: Chose N-side relation( say S) and extend it with {PK(T) U (a1,a2,…an)} If cardinality is N-M: Represent it with a new relation with PK(T) U PK(S) U {(a1,a2,…an)} 4.) Multivalued Attribute ‘A’ of entity set R is represented with a new relation with {A U PK(S)}. What is the PK of new table? Example 1- (3-step design,SQL) DB of a “Managing customer orders” Scenario: a customer has a unique customer number and contact information a customer can place many orders, but a given purchase order is placed by one customer a purchase order has a many-to-many relationship with a stock item. Here is the ER diagram. Example-(Relational model) CUSTOMER PURCHASE_ORDER Cust Cust Street City Zip # Name Cust# Order Ship ToStre ToCity ToZip Order# Date Date et Purchase PK is (PurchaseOrder#) FK is(Cust#) CUST_PHONES Cust # Phones STOCK_ITEMS CONTAINS Purchase Order# Stock# quantity Corresponds to 1-N relationship discount PK is (Cust#, Phones) PK is (PurchaseOrder#, Stock#) Stock# Price TaxRate Corresponds to NM relationship Physical Design-DDL CREATE TABLE Customer ( CustNo NUMBER NOT NULL, CustName VARCHAR2(200) NOT NULL, Street VARCHAR2(200) NOT NULL, City VARCHAR2(200) NOT NULL, State CHAR(2) NOT NULL, Zip VARCHAR2(20) NOT NULL, PRIMARY KEY (CustNo) ); CREATE TABLE Contains ( PONo NUMBER REFERENCES PurchaseOrder, StockNo NUMBER REFERENCES Stock, Quantity NUMBER, Discount NUMBER, PRIMARY KEY (PONo, StockNo) ); CREATE TABLE PurchaseOrder ( PONo NUMBER, Custno NUMBER REFERENCES Customer, /* purchase order no */ /* Foreign KEY referencing customer */ OrderDate DATE, ShipDate ToStreet DATE, /* date of order */ /* date to be shipped */ VARCHAR2(200), /* shipto address */ ToCity VARCHAR2(200), ToState CHAR(2), ToZip VARCHAR2(20), PRIMARY KEY(PONo) ); cont.. CREATE TABLE Cust_Phones ( CustNo NUMBER REFERENCES Customer, Phones VARCHAR2(20), PRIMARY KEY (CustNo, Phones) ); CREATE TABLE Stock ( StockNo NUMBER PRIMARY KEY, Price NUMBER, TaxRate NUMBER ); DML (data manipulation language) INSERT INTO Stock VALUES(1004, 6750.00, 2) ; INSERT INTO Stock VALUES(1011, 4500.23, 2) ; INSERT INTO Stock VALUES(1534, 2234.00, 2) ; INSERT INTO Stock VALUES(1535, 3456.23, 2) ; ****************************************** INSERT INTO Customer VALUES (1, 'Jean Nance', '2 Avocet Drive', 'Redwood Shores', 'CA', '95054') ; INSERT INTO Customer VALUES (2, 'John Nike', '323 College Drive', 'Edison', 'NJ', '08820') ; ****************************************** INSERT INTO Cust_Phones (1, '415-555-1212‘); INSERT INTO Cust_Phones (2, '609-555-1212'); INSERT INTO Cust_Phones (2, '201-555-1212'); cont.. INSERT INTO PurchaseOrder VALUES (1001, 1, SYSDATE, '10-MAY1997',NULL, NULL, NULL, NULL) ; INSERT INTO PurchaseOrder VALUES (2001, 2, SYSDATE, '20 MAY1997', '55 Madison Ave', 'Madison', 'WI', '53715') ; ********************************************** INSERT INTO Contains VALUES( 1001, 1534, 12, 0) ; INSERT INTO Contains VALUES(1001, 1535, 10, 10) ; INSERT INTO Contains VALUES(2001, 1004, 1, 0) ; INSERT INTO Contains VALUES(2001, 1011, 2, 1) ; ********************************************** NOTE: You can use bulk loading if the DB has this functionality. Example: Oracle has SQL*Loader, sqlldr command for bulk loading.. SQL Q1: Get Customer and Data Item Information for a Specific Purchase Order SELECT C.CustNo, C.CustName, C.Street, C.City, C.State, C.Zip, P.PONo, P.OrderDate, CO.StockNo, CO.Quantity, CO.Discount FROM Customer C, PurchaseOrder P, Contains CO WHERE C.CustNo = P.CustNo AND P.PONo = CO.PONo AND P.PONo = 1001 ; Q2: Get the Total Value of Purchase Orders SELECT FROM WHERE P.PONo, SUM(S.Price * CO.Quantity) PurchaseOrder P, Contains CO, Stock S P.PONo = CO.PONo AND CO.StockNo = S.StockNo GROUP BY P.PONo ; SQL Q3: List the Purchase Orders whose total value is greater than that of a specific Purchase Order. SELECT FROM WHERE P.PONo PurchaseOrder P, Contains CO, Stock S P.PONo = CO.PONo AND CO.StockNo = S.StockNo AND SUM(S.Price * CO.Quantity) > SELECT SUM(S.Price * CO.Quantity) FROM Contains CO, Stock S WHERE CO.PONo = 1001 AND CO.StockNo = S.StockNo NOTE: What if “>1” customers can have the same PurchaseOrderNumber? Use “ANY” for a general solution.. SELECT FROM WHERE P.PONo PurchaseOrder P, Contains CO, Stock S P.PONo = CO.PONo AND CO.StockNo = S.StockNo GROUP BY P.PONo ; HAVING SUM(S.Price * CO.Quantity) > ALL(SELECT FROM WHERE SUM(S.Price * CO.Quantity) Contains CO, Stock S CO.PONo = 1001 AND CO.StockNo = S.StockNo) SQL Q4: Find the Purchase Order that has the maximum total value. CREATE VIEW X(Purchase,Total) AS SELECT P.PONo, SUM(S.Price * CO.Quantity) FROM PurchaseOrder P, Contains CO, Stock S WHERE P.PONo = CO.PONo AND CO.StockNo = S.StockNo GROUP BY P.PONO --------------------------- SELECT P.PONo FROM X GROUP BY P.PONo ; HAVING Total =( SELECT max(Total) FROM X) DML Delete Purchase Order 1001 DELETE FROM Contains WHERE PONo = 1001 ; DELETE FROM PurchaseOrder WHERE PONo = 1001 ; (Important: The order of commands is important..!!!!) Delete the database. drop drop drop drop drop table Cust_Phones; table Contains; table Stock; table PurchaseOrder; table Customer; (Important: The order of commands is important..!!!!) Example-2 (ER, relational algebra) Scenario for Parking database: We want to develop a database that has parks and lakes that are overlapping with each other. Overlapping area is also stored. Parks have their name, area, distance and a unique id. Lakes have name, depth, catch and a unique id. area distance Pname Park Pid catch M Overlap area N Lake Lid depth Lname Relations for Parking DB PARK Pid Pname area distance LAKE Lid Lname L catch depth w 20 s1 I 150 52 100 200 C t 30 s2 V 255 75 300 S v 45 400 T b 28 s3 B 175 PARK_LAKE Lid 100 200 300 Pid s1 s1 s3 area 100 255 175 300 RA (relational algebra operations) RA is a formal query language and the core of SQL. Not implemented in commercial DBs. RA consists of a set of operands (tables) and operations (select, project, union,cross-product, difference, intersection) RA operations select: <selection operation>(relation R) retrieves the subset of rows. project: <list of attributes> (relation R) retrieves the subset of columns. Assume R and S are tables: union: R S, all tuples that are R OR S intersect: R S, all tuples that are both R AND S Requires compatibility difference: R - S, all tuples that are in R but not in S cross-product: R x S, all attributes of R followed by those of S Join and natural join operations A derived operation. Is the cross-product followed by a select. R ¤c S : c (R x S) c: the condition that usually refers to the attributes of both R and S. If c is an equality condition and consists of one column (the common column), then it is called natural join. (R ¤ S) For complex queries, use the renaming operation, (newname(1attr1), oldname) means that the relation “oldname” becomes the “newname”. Also the first attribute of “newname” table is called the “attr1” Relational Algebra on Parking DB Find the name of the Park which contains Lake with Lid=100. 1. solution: Pname (Park ¤ Lid=100(ParkLake)) 2. solution: Pname ( Lid=100(ParkLake ¤ Park)) 3. solution: (t1, Lid=100(ParkLake)) (t2, t1 ¤ Park) Pname (t2) cont.. Find the names of Parks with Lakes which have a depth of above 25. Pname (Park ¤ (ParkLake ¤ ( depth > 25 (Lake))) Find the depth of lakes that overlap with ‘I’. depth (Lake ¤ (ParkLake ¤ ( Pname=I (Park))) Find the names of Parks with at least 1 lake. Pname (Park ¤ (ParkLake)) Find the names of Parks with lakes whose catch is either ‘b’ or ‘w’. (t1, catch=’b’(Lake) catch=’w’(Lake) ) Pname (Park ¤ ParkLake ¤ t1) cont.. Find the names of Parks that have ‘b’ and ‘w’ as the catch in their lakes. (t1, Pname ( catch=’b’ (Lake) ¤ ParkLake ¤ Park)) (t2, Pname ( catch=’w’ (Lake) ¤ ParkLake ¤ Park)) t1 t2 cont.. Find Pid of Parks that are 50 km away from the city where catch is not ‘t’. Pid ( distance>50 (Park)) - Pid ( catch=’t’ (Lake) ¤ ParkLake ¤ Park) Find the names pf Parks that have at least 2 lakes. (t1(1Pid1,2Lid1), Pid,Lid ( ParkLake ¤ Park)) (t2(1Pid2,2Lid2), Pid,Lid ( ParkLake ¤ Park)) (t, t1 x t2) Pname (Pid1=Pid2) (Lid1 Lid2) (t) NEXT WEEK, 29/04/2004 MORE DB DESIGN EXAMPLES (weak entity set, N-ary relationships, EER model) SQL examples (set operations-union,intersect,minus set comparison operations- contains, some, all aggregate functions-count, some, avg,max,min group by, having,order by delete, update operations…) 1.vize: 5 nisan 2004