Appendix A Data Modeling Tools and Notation

Chapters 2 and 3 present several common notations for representing conceptual data models. Depending on the software tool available for depicting a data model, your ability to replicate these notations will vary. Just as business rules and policies are not universal, neither are the symbols and notation used in the various data modeling tools. Each uses different graphical constructs and methodologies that may or may not be able to convey the meaning of a particular business rule.

This appendix is intended to help you compare the book’s notations with your modeling tool’s notation. Four commonly used tools are covered: CA ERwin Data Modeler 9.5, Oracle SQL Developer Data Modeler 4.0, SAP Sybase PowerDesigner 16.5, and Microsoft Visio Professional 2013. Table A-1a and Table A-1b chart samples of the notation used in each tool for entities, relationships, attributes, rules, constraints, and so forth. Another drawing tool often used for creating ERDs is SmartDraw. SmartDraw is illustrated in the videos associated with Chapters 2 and 3; visit www.pearsonhighered.com/hoffer to view these videos.

Figure 2-22, a data modeling diagram for Pine Valley Furniture Company (PVFC), is the basis for the examples pictured in this appendix. That figure shows the data model drawn from the narrative of PVFC business rules included in Chapter 2, using the Visio notation system, which is very similar to the notation used in this textbook. Figure A-1, included here, is this same figure. Table A-1 shows a comparison of the textbook notation with that available in the four software tools.

Comparing E-R Modeling Conventions

As can be seen from Table A-1, modeling tools can differ significantly in the notation available to create a data model. While not intended as an in-depth comparison of the various tools, the following explanation provides a means to analyze the tools’ differences, using the PVFC data model depicted in Figures 2-22 and A-1.
Pay particular attention to differences in depicting many-to-many relationships, cardinalities and/or optionalities, foreign keys, and supertype/subtype relationships. Each tool offers multiple sets of notation. We have chosen entity/relationship sets of symbols for each tool. Note, in particular, how associative entities are drawn; the foreign key relationships are included.

[Table A-1 A Comparison of Hoffer, Ramesh, and Topi Modeling Notation with Four Software Tools. (a) Common modeling tools’ notations. Columns: Hoffer-Ramesh-Topi Notation; Visio Professional 2013; CA ERwin Data Modeler 9.5; SAP Sybase PowerDesigner 16.5; Oracle SQL Developer Data Modeler 4.0. Rows: Basic Entity; Associative Entity; Subtypes; Recursive Relationship; Attributes. One cell carries the note “(No special symbol. Uses regular Entity symbol.)”]

[Table A-1 (continued). (b) Common modeling tools’ cardinality/optionality notations. Same columns as part (a). Rows: 1:1; Mandatory 1:1; 1:M; Mandatory 1:M; Optional 1:M; M:N. Several cells carry the note “(Not available without cardinality)” and one M:N cell carries the note “(Not allowed)”.]

Visio Professional 2013 Notation

The Professional version of Visio includes a database diagramming tool for modeling a conceptual or physical diagram. Visio provides two database modeling templates. Selecting Database Model Diagram for a new data model allows a further choice of relational or IDEF1X symbols. Both of these choices allow reverse engineering of existing physical databases. The other template choice is UML Model Diagram, which allows you to use the class diagram notation from object-oriented data modeling. We illustrate only the Database Model Diagram template. This template may be customized to indicate primary key (PK), foreign keys (FK), secondary indexes, nonkey fields, data types, the format of cardinalities, and so on. You can also elect to display the primary key fields at the top of each entity or in their actual physical order. This text uses the relational template.

[Figure A-1 Visio Professional 2013 model. Entities: SALESPERSON, TERRITORY, CUSTOMER, PRODUCT LINE, ORDER, PRODUCT, VENDOR, RAW MATERIAL, SKILL, EMPLOYEE, WORK CENTER. Associative entities: DOES BUSINESS IN, ORDER LINE, USES, PRODUCED IN, SUPPLIES, HAS SKILL, WORKS IN. Labeled relationships: Serves, Submits, Includes, Supervises / Is Supervised By.]
Entities

All entities are depicted as square-cornered rectangles, with optional horizontal and vertical lines used to partition entity information. Keys (primary, alternate, foreign), nonkey attributes, referential integrity, and so on can be optionally displayed within the entity box. Subtype/supertype connectors are available.

Relationships

Both binary and unary relationships can be shown, but not ternary. Lines can be labeled in one direction, both directions, or neither, and the relationship types are either identifying (solid line) or nonidentifying (dashed line). Cardinality and optionality notation differ according to the symbol set chosen, relational or IDEF1X. Notation samples for the relational symbol set chosen for our diagram can be seen in Table A-1b. To make the “parent” entity optional, you have to uncheck the Required box for the foreign key attribute, which then also makes the relationship nonidentifying. This tool provides a helpful “range” option, where a minimum and a maximum value can also be set for cardinality. When identifying or nonidentifying relationships are established, keys are automatically migrated above or below, respectively, the entity’s horizontal separator line.

The recursive Supervises relationship captures the business rule that a supervisor may supervise any number of employees, including none. It cannot show that the president has no supervisor, only that each employee has exactly one supervisor.

A many-to-many relationship between two entities cannot be established; a new (associative) entity must be added to resolve it. The many and varied line connectors provided by the tool can be used to draw a many-to-many relationship, but these connector objects do not establish the functional relationship within the tool.
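The key migration and many-to-many resolution rules described above can be illustrated with a short Python sketch. This is only an illustration of the modeling rules, not a description of how Visio works internally; the `Entity` class and the `relate` and `resolve_many_to_many` helpers are hypothetical names introduced for this example, and the sample entities are taken from the PVFC model.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Entity:
    """A box on the diagram: key attributes sit above the separator line,
    nonkey attributes below it. (Hypothetical class for illustration.)"""
    name: str
    primary_key: List[str] = field(default_factory=list)  # above the line
    nonkey: List[str] = field(default_factory=list)       # below the line


def relate(parent: Entity, child: Entity, identifying: bool) -> None:
    """Migrate the parent's primary key into the child as a foreign key."""
    for key in parent.primary_key:
        fk = f"{key} (FK)"
        if identifying:
            child.primary_key.append(fk)  # FK becomes part of the child's PK
        else:
            child.nonkey.append(fk)       # FK is an ordinary nonkey attribute


def resolve_many_to_many(left: Entity, right: Entity, name: str,
                         attributes: Sequence[str] = ()) -> Entity:
    """Replace an M:N relationship with an associative entity that has an
    identifying relationship to each of the two original entities."""
    associative = Entity(name, nonkey=list(attributes))
    relate(left, associative, identifying=True)
    relate(right, associative, identifying=True)
    return associative


vendor = Entity("VENDOR", ["Vendor ID"], ["Vendor Name", "Vendor Address"])
material = Entity("RAW MATERIAL", ["Material ID"], ["Material Name"])
supplies = resolve_many_to_many(vendor, material, "SUPPLIES",
                                ["Supply Unit Price"])
print(supplies.primary_key)  # ['Vendor ID (FK)', 'Material ID (FK)']
print(supplies.nonkey)       # ['Supply Unit Price']

# A nonidentifying relationship: the FK lands below the line in ORDER.
customer = Entity("CUSTOMER", ["Customer ID"], ["Customer Name"])
order = Entity("ORDER", ["Order ID"], ["Order Date"])
relate(customer, order, identifying=False)
print(order.nonkey)          # ['Order Date', 'Customer ID (FK)']
```

In the sketch, the only difference between the two relationship types is where the migrated key lands, which mirrors the above-the-line and below-the-line placement described in the text.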
CA ERwin Data Modeler 9.5 Notation

For physical or logical modeling, ERwin offers a choice among IDEF1X, IE (Information Engineering), and DM (Dimensional Modeling) notation. The examples used here demonstrate IE. ERwin has very robust capabilities for adding many types of metadata to the entities, attributes, and relationships. The user can choose to display the model at several Display Levels, including only entities and relationships, entities with key attributes, and fully attributed entities. As with many of the other tools, both logical and physical data models can be developed and displayed. The key difference between most conceptual and logical data models is that the tools want to resolve all primary keys in a logical data model, which is necessary to migrate to a physical data model. Thus, many tools, like ERwin, do not support development of what is purely a conceptual data model. ERwin does support versioning of a data model.

Entities

An independent entity is represented as a box with a horizontal line and square corners. If an entity is a child (weak) entity in an identifying relationship, it appears as a dependent entity, a box with rounded corners. Associative entities are also represented this way. ERwin determines the entity type based on the relationship in which it is involved. For example, when you initially place an entity in a model, it displays as an independent entity. When you connect it to another entity using a relationship, ERwin determines whether the entity is independent or dependent, based on the relationship type selected.

Relationships

ERwin represents a relationship as a solid or dashed line connecting two entities. Depending on the notation you choose, the symbols at either end of the line may change. Cardinality options are flexible and may be specified unambiguously.
A parent may be connected to “Zero, One, or More,” signified by a blank space; “One or More,” signified by a P; “Zero or One,” signified by a Z; or “Exactly” some number of instances. P or Z may optionally appear on the ERD.

Many-to-many relationships can be depicted, or the user may opt to resolve them automatically or manually. Figure 2-22 (A-1) does not have any many-to-many relationships because it already shows all possible ones as associative entities (e.g., DOES BUSINESS IN). (Visio does not support M:N relationships.) In Figure A-2 we show what would result from manually telling ERwin to resolve each M:N by creating an associative entity. For example, consider the many-to-many SUPPLIES relationship between VENDOR and RAW MATERIAL. The user selects a “Show Association Entity” option on the relationship line, which then automatically eliminates the many-to-many relationship, establishes new ones with cardinality and optionality notations, creates the associative entity, and allows the “Supply Unit Price” attribute for the SUPPLIES relationship to be displayed in the diagram. SUPPLIES would not be the name automatically given to this associative entity, so we have renamed it. ORDER LINE is also shown as an associative entity by ERwin.

[Figure A-2 CA ERwin Data Modeler 9.5 model. Entities: SALES PERSON, SALES TERRITORY, CUSTOMER, PRODUCT LINE, ORDER, PRODUCT, VENDOR, RAW MATERIAL, WORK CENTER, SKILL, EMPLOYEE. Associative entities: DOES BUSINESS IN, ORDER LINE, PRODUCED IN, USES, SUPPLIES, HAS SKILL, WORKS IN. Migrated foreign keys are marked (FK); EMPLOYEE carries the role-named Supervisor ID (FK) for the Supervises relationship.]

The recursive nonidentifying Supervises relationship, where parent and child are shown as the same entity, shows that an Employee (a Supervisor) may supervise many employees, but not all employees are supervisors. The notation also indicates that nulls are allowed, which shows that a supervisor may have no employees and an employee (the president) may have no supervisor. The diagram introduces a Role Name (Supervisor ID) for the PK attribute in its role as a nonkey FK attribute for the Supervises relationship.

Keys migrate automatically when relationships are established, and foreign keys are notated “FK.” In an identifying relationship, the FK migrates above the horizontal line in the entity and becomes part of the primary key of the child entity. In a nonidentifying relationship, the foreign key migrates below the line and becomes a nonkey attribute in the child entity. In ERwin, a dashed line represents a nonidentifying relationship. The chart captured from ERwin’s online help and shown in Figure A-3 depicts the range of cardinality symbols for the different E-R notation sets that may be used in this product.

SAP Sybase PowerDesigner 16.5 Notation

PowerDesigner projects are contained within a workspace that can be customized and includes a hierarchy of folders and models. Links to model files, report files, and external files are also stored in the workspace.
When a data modeler is working on multiple projects or on a part of a project with different requirements, multiple workspaces may be defined as needed. Each is kept locally and is reusable. It is possible to work in only one workspace at a time. PowerDesigner 16.5 includes various integrated modeling tools besides data modeling, including XML modeling, data movement modeling, and various enterprise information architecture tools.

[Figure A-3 ERwin cardinality/optionality symbols. For each of the IDEF1X, IE, and DM notations, the chart shows the identifying, nonidentifying (nulls), and nonidentifying (no nulls) line symbols for four cardinalities: one to zero, one, or more; one to one or more (P); one to zero or one (Z); and one to exactly N (labeled 4 in the chart).]

The examples in this appendix use the Conceptual Data Model graphics with the Information Engineering notation. Other conceptual modeling notations supported are Barker and IDEF1X. Conceptual designs can be used to generate first logical and then physical data models. Further, PowerDesigner 16.5 has data warehouse design capabilities, including the ability to identify dimension and fact tables and to generate cubes.

Entities

The amount of detail that is displayed in the data model is selected by the modeler and may include primary identifiers, a predetermined number of attributes, data type, optionality, and/or domain. Double-clicking an entity opens the entity’s property sheet. Properties shown include name, technical code name, a comment field that contains a descriptive label if desired, stereotype (subclassification of entity), estimated number of occurrences, and the possibility of generating a table in the physical data model. Additional entity properties include attributes, identifiers, and rules. Each of these properties has its own property sheet.

Relationships

PowerDesigner uses a solid line between entities to establish any relationship. Crow’s foot notation is used to establish cardinality, and the circle and line establish optionality, similar to the Hoffer notation. Relationship properties include name, technical code name, comment, stereotype, the related pair of entities (only binary and unary relationships are supported), and a generation capability. It is possible to model a many-to-many relationship without breaking it down to include the associative entity. If desired, however, an associative entity may be modeled and displayed. Recursive (reflexive) relationships may be modeled easily, and subtypes may also be presented.

Oracle SQL Developer Data Modeler 4.0 Notation

Entities

Diagrams drawn using the Oracle SQL Developer Data Modeler tool can be set to show only the entity names, the entity names and the primary key, or the entity names and all of the attribute labels. No specific symbols exist for the different entity types, including associative entities and supertypes or subtypes. All entities are depicted as rounded rectangles, and attributes can be displayed within the box. Unique identifiers are preceded by a # sign and must be mandatory, mandatory attributes are tagged with *, and optional attributes are tagged with °.

Relationships

Lines must be labeled in both directions, not just one direction, and are challenging to manipulate and align. Cardinality is read by picking up the cardinality sign attached to the other entity. Thus, a Customer may place an order or not, but when an order is placed, it must be related to a particular customer. Looking at the EMPLOYEE entity, the recursive supervisory relationship is depicted by the “pig’s ear” attached to the entity. It shows that an Employee may supervise one or more employees and that an employee must be supervised by one employee or supervisor.
It is ambiguous whether the multiple cardinality is zero, one, or many.

When working with Oracle SQL Developer Data Modeler, it is important to sketch your data model carefully and completely before attempting to use the tool. Editing the model can be challenging, and deleting an object from the diagram does not automatically delete it from the Repository.

Comparison of Tool Interfaces and E-R Diagrams

For each of the software modeling tools included in Table A-1, the data model for Figure 2-22 (A-1) is included here. These figures should give you a better idea of what the symbol notation looks like in actual use.

[Figure A-4 SAP Sybase PowerDesigner 16.5 model. Entities: SALESPERSON, TERRITORY, CUSTOMER, PRODUCT LINE, ORDER, PRODUCT, VENDOR, WORK CENTER, RAW MATERIAL, EMPLOYEE, SKILL. Associative entities: DOES BUSINESS IN, ORDER LINE, PRODUCED IN, USES, SUPPLIES, HAS SKILL, WORKS IN. Attribute names use underscores between words (e.g., Salesperson_ID, Supply_Unit_Price).]

[Figure A-5 Oracle SQL Developer Data Modeler 4.0 model. Entities: SALES TERRITORY, SALESPERSON, CUSTOMER, PRODUCT LINE, ORDER, PRODUCT, VENDOR, ORDER LINE, WORK CENTER, RAW MATERIAL, SKILL, HAS_SKILL, EMPLOYEE. Attributes are marked # (unique identifier), * (mandatory), or O (optional), for example # TERRITORY_ID and * TERRITORY_NAME; every relationship line is labeled in both directions (e.g., submits / is submitted by).]

Note that we use uppercase for all data names and include an underscore between words in Figure A-5, which is different from other E-R diagrams in this book. We do this for two reasons: (1) This is what many Oracle practitioners do, and (2) Oracle, like many other RDBMSs, always displays data names in SQL and repository query results in all-capital letters, so creating data names in this format may be easier for some people to read.
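The naming convention used in Figure A-5 can be applied mechanically. The following is a minimal Python sketch, assuming names are written as words separated by spaces or underscores; the function name `to_oracle_style` is hypothetical and is not part of any Oracle tool.

```python
import re


def to_oracle_style(name: str) -> str:
    """Convert a mixed-case data name to uppercase words joined by
    underscores, as in Figure A-5. (Hypothetical helper for illustration.)"""
    words = re.split(r"[\s_]+", name.strip())
    return "_".join(word.upper() for word in words if word)


for label in ["Product Line ID", "Customer Postal Code", "Unit Of Measure"]:
    print(label, "->", to_oracle_style(label))
# Product Line ID -> PRODUCT_LINE_ID
# Customer Postal Code -> CUSTOMER_POSTAL_CODE
# Unit Of Measure -> UNIT_OF_MEASURE
```

A helper like this is convenient when attribute names from one of the other diagrams, such as Figure A-1, need to be re-entered in the Figure A-5 style.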