UML & Data Modeling: A Reconciliation
The Handbook

DAMA Northeast, March 2012

David C. Hay
Essential Strategies, Inc.
13 Hilshire Grove Lane, Houston, TX 77055
(713) 464-8316
dch@essentialstrategies.com
www.essentialstrategies.com

Copyright © 2011, Essential Strategies, Inc.

Today’s theme . . .
"A man may be a topologist or an acoustician or a coleopterist. He will be filled with the jargon of his field, and will know all its literature and all its ramifications . . . but, more frequently than not, he will regard the next subject as something belonging to his colleague three doors down the corridor, and will consider any interest in it on his own part as an unwarrantable breach of privacy.
These specialized fields are continually growing and invading new territory. The result is like what occurred when the Oregon country was being invaded simultaneously by the United States settlers, the British, the Mexicans, and the Russians - an inextricable tangle of exploration, nomenclature, and laws."
Norbert Wiener, Cybernetics, 1948. [1]
[1] Norbert Wiener. 1948, 1961. Cybernetics: or Control and Communication in the Animal and the Machine, second edition. (Cambridge, MA: The MIT Press). p. 2.

In our industry, we have two factions:
Data Modelers
  Database designers
  Conceptual data modelers
UML Modelers
  Object-oriented designers
  Other UML modelers

The Data Architecture world . . .
Early 1960s  Early database design
Late 1960s  Bachman models
1970  Relational theory
1976  Peter Chen – Data models
1978  Data flow diagrams
1978  Relational Databases
1981  Information Engineering, Barker/Ellis
1987  Zachman Framework
1990s  Data Management
1990s  Business Rules
1995  Data Model Patterns
[Figure: sample Barker/Ellis diagram showing MODEL ELEMENT (# Name, o Description) with sub-types including ENTITY CLASS, ROLE and OTHER MODEL ELEMENT, and CONSTRAINT with sub-types CARDINALITY CONSTRAINT and EXCLUSIVITY CONSTRAINT; each Model Element may be "constrainer in" or "constrained by" a Constraint.]

The object-oriented (UML) world . . .
1950s  Fortran / COBOL
1967  Simula 67
1980  Smalltalk / C++
1988  Object-oriented Analysis
1992  Use Cases
1995  Design Patterns
1996  Analysis Patterns
1997  UML

Does UML supersede data modeling?
Some would say no…
Since it is about object-oriented design…
… it is not suitable for business analysis.

Problem: UML is . . . HERE
Despite its flaws, the Unified Modeling Language has been recognized as a standard in many quarters.
Clients and hiring managers keep asking if you have experience with UML.
!!!
How should we entity/relationship dudes deal with this?

It’s easy . . .
Just build your entity/relationship models in UML!
So I did . . .

Which meant that . . .
My data modeling colleagues were convinced that I had completely sold out and gone over to the dark side . . .
. . . and my UML/object modeling colleagues accused me of bastardizing their sacred notation.
So, I wrote another book in response . . .

A companion volume . . .
Two audiences:
  Data modelers convinced that UML has nothing to do with them.
  UML modelers who don’t realize that architectural data modeling really is different …
  … and the differences are important.
This is a handbook on how to use the UML class notation to produce an Architectural Entity/Relationship diagram.

Today's Program . . .
  Objectives
  Kinds of Data Models
  The Architecture Framework
  Introduction to UML
  How to Use UML–the nitty gritty

Objective . . .
This presentation will not only describe how to build architectural entity/relationship models in UML . . .
… it will describe how to build good architectural entity/relationship models . . .
… no matter what notation you choose to use.

Today's Program . . . Kinds of Data Models

Kinds of data models . . .
We modeling types are quick to criticize our clients for getting their vocabularies confused.
But what about us? What do we mean by . . .
  “Conceptual” data model?
  “Logical” data model?
  “Physical” data model?
  “Semantic” data model?
  And now you’re adding “Architectural” data model?
For the purposes of this presentation, here are the definitions:
  After all, it is my presentation…
  Please hear me out…

Kinds of Models . . .
  Corporate Overview: Context
  Conceptual
    Semantic
    Architectural
  Logical
  Physical

Corporate Overview . . .
A place to start the conversation with top management.
What are the most important kinds of data required to manage this business?
Global things of significance, entirely connected with “many-to-many” relationships.
Probably no more than a dozen boxes.
Some call this the “Conceptual Model”.

Conceptual Data Model . . .
Reasonably complete, detailed description of a domain. (Not an overview.)
In terms of the business, not technology.
Classes of things of significance to the enterprise. These things of significance are the concepts.
One-to-many relationships represent assertions about the enterprise.
Moderate attribution.

Conceptual Data Model – Two Flavors . . .
Semantic model – In terms of the business language, as used.
  The vehicle for identifying semantic conflicts.
  Also called the “divergent” model.
  “Environment model”, according to the Object Management Group.
Architectural model – In more abstract terms: general categories that cross departmental boundaries.
  Also called the “convergent” model.
Some call one or both of these the “logical” model.

Logical data model . . .
In terms of a particular data management technology:
  Relational tables and columns
  Hierarchical legs
  Network edges and nodes
  Object-oriented classes
  XML tags
  Etc.
Describing implementation of business objects plus some technological objects.
Some call this the “Physical Model”.

Physical Data Model . . .
How data are physically stored on a device.
In terms of the mechanics of a particular vendor’s data management technology:
  Discrete physical databases
  Partitions
  Table spaces
  Etc.
There is no industry-standard technique for this kind of modeling.
But you can be sure that the technicians are drawing pictures.

Today's Program . . . The Architecture Framework

Context: ANSI’s Three . . . Four ways to look at data (1975)
[Figure: External Schemas 1, 2 and 3 feeding a Conceptual Schema; the original Internal Schema shown as Logical Schemas (relational, XML), each with its own Physical Schema.]

The Architecture Framework . . .
View | Data (What) | Activities (How) | Network (Where) | People (Who) | Timing (When) | Motivation (Why)
Objectives/Scope | List of Important Things | List of Processes | Business Locations | Organizational Units | Business Events, Cycles | Business Vision and Mission
Business Owner’s View | Terms, Definitions | Business Process Model | Operations by Business Location | Org. Chart, Roles | Master Business Schedule | Business Policies and Rules
Architect’s View | Entity/Relationship Diagram | Essential Functions | Data Links, Processing Locations | Roles+Data (Use Cases) | State/transactions, ELH | Business Rule Model
Designer’s View | Tables, Classes | System Design | Network Architecture (h/w, s/w types) | User Interface, Security | “Control Flow” diagrams | Rule Design
Builder’s View | Data, physical storage design | Detailed Program Design | Network Construction | Screens, Security Design | Timing Definitions | Rule Specification
Functioning System | Working System (all columns)

The Architecture Framework . . .
View | Data (What) | Activities (How) | Network (Where) | People (Who) | Timing (When) | Motivation (Why)
Executive’s View | List of Important Things | List of Processes | Business Locations | Organizational Units | Business Events, Cycles | Business Vision and Mission
Business Owner’s View | Terms, Definitions | Business Processes | Operations by Business Location | Org. Chart, Roles | Master Business Schedule | Business Policies and Rules
Architect’s View | Entity Classes, Relationships | Essential Functions | Data Links, Processing Locations | Roles+Data (Use Cases) | State/transactions, ELH | Business Rule Definitions
Designer’s View | Tables, OO Classes, XML tags | System Design | Network Architecture (h/w, s/w types) | User Interface, Security | “Control Flow” diagrams | Rule Design
Builder’s View | Physical Storage, Programs | Detailed Program Design | Network Construction | Screens, Security Design | Timing Definitions | Rule Implementations
Functioning System | Working System (all columns)

In terms of the Architecture Framework . . .
[Figure: the four-schema diagram annotated with framework rows: Semantic Model (Row 2) at External Schemas 1, 2 and 3; Architectural Model (Row 3) at the Conceptual Schema; Logical Model (Row 4) at the Logical Schemas (relational, XML); Physical Model (Row 5) at the Physical Schemas.]

Ok, let’s look into the data column more deeply…

Business Owners’ Views (Semantics)
  Semantic Data Model (E/R, SBVR, OWL)
  Terms, concepts, definitions
Architect’s View (Integration of Business Owners’ Views)
  Architectural Entity/Relationship Model (the “Conceptual” or Architectural Data Model)
  Entity classes, attributes, relationships
Designer’s View (Technology)
  Database Design Model (relational databases): tables, columns, keys
  Object-oriented Design Model (UML): object-oriented classes, attributes, associations
  XML Schemas: tags

Today's Program . . . Introduction to UML

UML was originally designed to support object-oriented design…
… not architectural business modeling.
But do I have a deal for you . . .

We can use UML for a data model?
Yes… but with restrictions:
  Restrict the definition of Entity Class.
  Use a subset of the notation.
  Recognize that E/R relationships are not the same as OO associations.
  Pay attention to layout aesthetics.
  Add unique identifiers.

The Domain of Discourse . . .
Only elements in the domain of discourse are of interest to us.
(“Domain of discourse?” I hear you ask…)
The “problem domain”:
  An enterprise
  A government agency
  An area of study
No technical object classes.

Entity Class . . .?
Originally, Dr.
Peter Chen invented entity/relationship modeling to represent “entity types”.
  That is, an “entity” is a thing of interest.
  An “entity type” is a categorization of those things.
  On the model, a box represents an entity type. “Entities” are instances of same.
Over the years, we data modelers have gotten lazy.
  We call the entity types on our drawings “entities”. This is wrong.
  Object-oriented people sneered at us because they knew the difference between a class and an object. Apparently we did not.

So . . .
Henceforth, what Dr. Chen called “entity type”, we will call “entity class”.
  This links us to the classes known to object-oriented designers.
  It clarifies that the classes of interest to architectural data modelers are a sub-set of all classes.
  It adds precision to our language.
(And remember: you heard it here first…)

E/R relationships are not the same as OO associations . . .
E/R relationship (role) names are predicates representing assertions about the domain.
  Structured to form a sentence in each direction.
  For example, “each Order may be composed of one or more Line Items”.
UML roleName supports navigation from one class to another.
  A label to identify the object class for a program process.
  For example, Line Item is labeled by “Order line item”.
  (Adds a different symbol for “composed of” next to Order.)
This has a profound effect on naming conventions.

Subsets of the Notation
Only parts of the UML notation that are relevant to data structure may be included.
  No behavior
  No design decorations:
    Composition
    Association direction
    Visibility (for example, the “+” in “+ standardCost”)
    Etc.

Aesthetics . . .
An architectural entity/relationship model is to be presented to non-technical business subject-matter experts.
Layout is important.
  No bent lines
  Relationship orientation
  Each drawing limited to 8 ½ by 11.
  Simple English with spaces – no camelCase.
  Sub-types are shown within super-types.
Of course these considerations are often given short shrift by E/R modelers as well.
To their peril…

Today's Program . . . How to Use UML–the nitty gritty

The Steps . . .
  Notations
  About Classes
  Use of language
  Domains
  Unnecessary in UML
  Unique identifiers
  Representation of sub-types
  Presentation of models

The Steps . . . Notations
  E/R (Information Engineering)
  E/R (Barker-Ellis)
  Unified Modeling Language (UML)

E/R Notation (Information Engineering) . . .
[Figure: Order and Line Item in Information Engineering notation, with callouts for entity class, attribute, identifiers, role name, minimum cardinality and maximum cardinality. Order (Order Number; Order Date) is "composed of" Line Item (Line Number, Order Number (FK); Quantity, Price, (Extended Value), Delivery Date); Line Item is "part of" Order.]

E/R Notation (Information Engineering) . . .
Shows cardinality as graphics. Observer sees it.
Shows identifying attributes and relationships.
  Identifying attributes in a separate section of the entity class box.
  Identifying relationship through a combination of symbols.
NOTE: Each relationship direction is structural, representing an assertion about the nature of the domain.
Minimal references to technology…
… but there is a relational design bias:
  Foreign keys implementing relationships
  Complexity of identifying relationships.

E/R Notation (Barker-Ellis) . . .
[Figure: the same model in Barker-Ellis notation, with callouts for entity class, attributes, identifiers, relationship, role names, minimum cardinality and maximum cardinality.]

E/R Notation (Barker-Ellis)
Shows cardinality as graphics. Observer sees it.
Shows identifying attributes and relationships with a simple symbol.
NOTE: Each relationship direction is structural, representing an assertion about the nature of the domain.
No references to database or any technology.

UML Notation . . .
[Figure: the same model in UML notation, with callouts for class, attributes, relationship (association), role names, minimum cardinality, maximum cardinality, and identifiers (none).]

UML Notation . . .
Systematic cardinality notation (attributes and associations).
Cardinality textual, not graphic. Viewer must read and understand it.
MAJOR ISSUE: In UML, an association is a navigation path, not a structure.
Identifier notation added in version 2.2. (Can also be added via “stereotypes”.)
No database connection.
Full notation has object-oriented design symbols…
… that we can ignore.

About Notations . . .
Different notations (as implemented via different tools) make it easier or more difficult to do certain things.
The important dimension is good practices.
  The notation that best supports the practices here is Barker/Ellis.
  Second best is the revised version of UML.
  Information Engineering’s bias toward relational database design is hard to thwart.
But it is the best practices, not the notation, that are most important.

The Steps . . . About Classes

According to the “Three Amigos” . . .
An object is a “discrete entity with a well-defined boundary and identity that encapsulates state and behavior; an instance of a class”.
A class, in turn, is “the descriptor for a set of objects that share the same attributes, operations, methods, relationships, and behavior.” [1]
Note: No constraints as to what kinds of objects or classes were of interest.
[1] Rumbaugh, J., Ivar Jacobson, Grady Booch. 1999. The Unified Modeling Language Reference Manual. p. 360.

According to James Martin and James Odell, “anything is an object”. [2]
[2] Martin, J., and James Odell. 1995. Object-Oriented Methods. (Englewood Cliffs, NJ: Prentice Hall). p. 34.

An “Entity” on the other hand . . .
… is not just any “discrete entity with a well-defined boundary and identity”.
… is limited to what Richard Barker calls things or objects “of significance, whether real or imagined, about which an organization needs information.” [3]
An “entity class”, unlike other “classes”, is not concerned with operations, methods, or behavior.
  Those belong to the world of “process modeling.”
An entity/relationship model is only concerned with the structure of business data.
[3] Barker, Richard. 1990. CASE*Method: Entity Relationship Modeling. (Wokingham, England: Addison-Wesley).

The Steps . . . Use of language
  Entity classes
  Attributes
  Relationships and Roles

About language in the model . . .
An architectural entity/relationship diagram is essentially a graphic portrayal of English language assertions about an organization. *
Therefore, the only language to appear on a diagram must be in terms relevant to the domain of interest.
Only business terms (and conventional English) may be used as the names of entity classes, attributes, and roles.
That is, no abbreviations, computer terms, or acronyms.
Words are not concatenated together. Spaces between words are shown (“Line Item”, not “lineItem”).
* … or assertions in any other natural language, such as Polish, French, Chinese, or what have you.

Entity Class names . . .
The name of an entity class is in the singular, and refers to an instance of that class.
  Hence, Order and Line Item are acceptable.
  The name “Project history” is not.
  An entity class called Project, on the other hand, could contain instances over time, so it may in fact be a project “history”.
Database table names are not allowed, nor are abbreviations or acronyms.
Classes that are computer artifacts (“window”, “cursor”, and the like) are not allowed.

Again, because the model will be presented publicly, spaces between words are required.

Naming Attributes . . .
In both E/R and UML an attribute is a characteristic of an entity class.
It “serves to qualify, identify, classify, quantify, or express the state of an entity”. [4]
In the previous example:
  Order: “Order number” and “Order date”.
  Line Item: “Line number”, “Quantity”, “Price”, “Delivery date”, and “/Extended value”.
  “/” means a derived attribute. **
    /Extended value = Quantity * Price
Again, spaces are required (where appropriate). (“Delivery Date”, not “deliveryDate”.)
[4] Barker, op. cit., p. 5-6.
** This is something UML has over E/R notations.

Cardinality of attributes . . .
In UML, cardinality is represented the same way for attributes as for roles.
Minimum cardinality:
  [1..1] – Mandatory: must be at least one value; may be no more than one value. Usually abbreviated “[1]”.
  [0..1] – Optional: may or may not have a value; may have no more than one value.

About maximum cardinality . . .
Note that, according to relational theory, multi-valued attributes are universally prohibited.
The second element can only be “..1”.
Of course new versions of the SQL standard are lifting that requirement…
Pity, that . . .

Associations / Relationships . . .
Each E/R relationship is a structure composed of two roles.
Each role is an English language assertion * about the domain of discourse:
  Each – (The assertion is about each instance of the first entity class.)
  Subject – (The first entity class)
  Minimum cardinality (“must be” or “may be”)
  Predicate – (The role name)
  Maximum cardinality (“one or more” or “one and only one”)
  Object – (The second entity class)
* … or Spanish or French or Polish or whatever. The point is that it must be in a natural language, not in computer jargon.

For example (E/R in UML notation) . . .
1. Each Order must be from one and only one Party.
1a. Each Party may be a customer in one or more Orders.
2. Each Order must be to one and only one Party.
2a. Each Party may be a vendor in one or more Orders.

UML looks at it differently . . .
An association is a path, not a structure.
Because the 2nd class is not in the 1st class’s namespace, it cannot be part of a property of the 1st class.
Hence roleName is simply a label for the second class (a noun).

UML looks at it differently . . .
Role name often simply copies the 2nd class name.
(In this case, the role name does distinguish two roles.)
Role name is not part of a structural statement.

Changes to the “standard” UML approach . . .
Role names are prepositions.
  Preposition is the part of speech that describes relationships.
  Nouns describe things. The entity classes are already the things. (… and they are already labeled.)
No duplication of the entity class name in the role name.
  To duplicate the class name is a serious redundancy in UML.
  The practice comes from requirements of Java programming. (The object class is not part of the subject class’s “namespace”.)

About reading the role names . . .
  Each
  <entity class 1>
  must be (1..) or may be (0..)
  <role name>
  one and only one (..1) or one or more (..*)
  <entity class 2>
For example . . .
[Figure: Book (0..*, role “of”) associated with Topic (1..1, role “primarily about”).]
Each Book must be primarily about one and only one Topic.
Each Topic may be of one or more Books.

Role names are important . . .
“Ravenous Bugblatter Beasts often make a very good meal for visiting tourists”

This should have read . . .
“Ravenous Bugblatter Beasts often make a very good meal of visiting tourists”
Douglas Adams. 1982. The Restaurant at the End of the Universe. New York: Pocket Books, pp. 37–38.

About conversion . . .
Because it is a “platform-independent” model, any E/R model must be converted to a database design (“platform-specific”) model.
  Entity classes become tables.
  Attributes become columns.
  Unique identifiers become primary keys.
  Many-to-one relationships become foreign keys.

For example, conversion to a Database Design . . .
[Figure: <<table>> ORDERS and <<table>> PARTIES, linked by the foreign keys FK_ORDERS_PARTIES_FROM and FK_ORDERS_PARTIES_TO.]

The Object-oriented Version . . .
Because it is a “platform-independent” model, a UML E/R model must also be converted to an object-oriented program model:
E/R role names are converted to OO roleNames as “predicate|object class name”.

Thus, conversion to an Object-oriented Design . . .

From the Zachman Framework . . .
“Conversion”, not simply “more detail”.
- John Z.

The Steps . . . Domains

Domains . . .
In E/R modeling, a domain is “a set of business validation rules, format constraints, and other properties that apply to a group of attributes”.
For example:
  a list of values
  a range
  a qualified list or range
  any combination of these.
“Note that attributes and columns in the same domain are subject to the same validation checks.” [5]
[5] Barker, op. cit., p. G1-3.

Code lists . . .
In database design, a code list is a set of valid values for a column.
  For example, the column “STATE_ABBR” may be controlled by the code list “State abbreviations”.
  This would have the values “AL”, “AK”, “AZ”, etc.
  This is one code list that implements the domain “State”.
  Others might be “State official name”, “State code”, etc.
In database design, a validation rule may control the legal values for a column.
  For example, the column SALARY may be constrained by the validation rule “Positive number”.
  That is, the value must be greater than zero.

Data type . . .
Each E/R domain must also in turn specify the data type of the values for a referenced attribute. These include:
  String
  Number
  Date
  Boolean
  Etc.
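
The three domain slides above (domain, code list and validation rule, data type) can be pulled together in a small sketch. This is only an illustration under stated assumptions: the `Domain` class, its field names, and the `accepts` method are hypothetical and do not come from the talk or from any modeling tool; the code list holds only the three values named on the slide.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple, Union


@dataclass(frozen=True)
class Domain:
    """Hypothetical E/R domain: a data type plus optional code list and rule."""
    name: str
    data_type: Union[type, Tuple[type, ...]]
    code_list: Optional[FrozenSet[str]] = None            # list of valid values
    rule: Optional[Callable[[object], bool]] = None       # validation rule

    def accepts(self, value: object) -> bool:
        # Every attribute or column in the domain gets the same checks.
        if not isinstance(value, self.data_type):
            return False
        if self.code_list is not None and value not in self.code_list:
            return False
        if self.rule is not None and not self.rule(value):
            return False
        return True


# Code list "State abbreviations" (only the values shown on the slide).
STATE_ABBREVIATIONS = Domain(
    "State abbreviations", str, code_list=frozenset({"AL", "AK", "AZ"})
)

# Validation rule "Positive number": the value must be greater than zero.
POSITIVE_NUMBER = Domain("Positive number", (int, float), rule=lambda v: v > 0)

# Columns share a domain by referring to it, not by repeating its checks.
COLUMN_DOMAINS = {"STATE_ABBR": STATE_ABBREVIATIONS, "SALARY": POSITIVE_NUMBER}

if __name__ == "__main__":
    print(COLUMN_DOMAINS["STATE_ABBR"].accepts("AK"))
    print(COLUMN_DOMAINS["SALARY"].accepts(0))
```

Keeping the checks on the domain rather than on each column mirrors the quoted point that attributes and columns in the same domain are subject to the same validation checks.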

Enumeration in UML
UML takes a different approach to both code lists and domains.
A code list may be described explicitly as an enumeration.
This looks like an “entity class”, but instead of showing the attributes “Code” and “Definition”, it shows the list of values.

Data Types as Domains . . .
In addition to the standard data types that come with UML (“number”, “string”, etc.), it is possible to define new data types to address any validation criterion desired.
  “Social security number”
  “Telephone number”
  Etc.

The Steps . . . Unnecessary in UML
  Navigation
  Visibility
  Composition

Unnecessary UML features . . .
UML was developed to support object-oriented design.
Some of its features are not meaningful in an entity/relationship diagram.
  Navigation
  Visibility
  Composition

Navigation
In an Entity/Relationship diagram, a relationship describes structure.
  By definition both ends and both roles must exist. (You cannot build half a bridge.)
In an object-oriented program, program code must be written to get from one class to another.
  If the application calls for navigating in one direction only, it is useful (for the developer) if the designer indicates that.
This is not part of an Entity/Relationship diagram.

Visibility . . .
In an object-oriented program, attributes of a class may be “visible” only to that class, or to sub-types of that class, or to the entire application.
This is shown by:
  A “+” sign for universally visible.
  A “-” sign for restricted visibility.
  A “#” sign for protected visibility.
  A “~” for visibility within a package.
This is not part of an Entity/Relationship diagram.

Composition . . .
Within object-oriented programs, composition structure is very common and very important.
So a symbol (the filled diamond) is equivalent to the role name “composed of”.
  This includes the referential integrity constraint “cascade delete”.
Another symbol (the open diamond) is also “composed of”, but this enforces the referential integrity constraint “nullify”.

Composition . . .
Entity/Relationship modeling addresses the semantics of the business with language.
Another symbol for the words “composed of” is redundant.
Can’t do referential integrity anyway. (There is no symbol for “Restricted Delete”.)

The Steps . . . Unique Identifiers

Unique identifiers . . .
In relational theory, the primary key is a fundamental concept:
  A set of columns whose values uniquely identify each row.
In practice, it is often hard to find uniquely identifying columns, so sometimes a surrogate key is added:
  A sequence number, which is always unique, but has no meaning.
The object-oriented community has decided to assume that all classes are identified by a surrogate key, called an object identifier.
Until now, UML has had no inherent facility for representing natural unique identifiers.

The previous solution! (Sort of)
UML has the ability to define stereotypes, or added features.
It is possible to define the stereotype <<ID>>.
  This labels components of a unique identifier.
  Both attributes and roles can be labeled as being part of an identifier.
Note that this is actually much tidier than the Information Engineering approach.

The OMG got the message . . .
With version 2.2, there is now a “property” called “isID”.
It is displayed on the drawing as {id}.
This is still much simpler than the Information Engineering approach.

The Steps . . . Representation of sub-types

Sub-types: The UML (and IE) approach . . .

The Barker-Ellis approach . . .
[Figure: PARTY super-type drawn as a box enclosing its sub-types: PERSON (# Person ID, First Name, Middle Initial, Surname) and ORGANIZATION (# Organization Name), with ORGANIZATION in turn enclosing COMPANY, INTERNAL ORGANIZATION (* Internal Org Type), GOVERNMENT, GOVERNMENT AGENCY, POLITICAL ORGANIZATION and HOUSEHOLD. ORDER (# Order Number, * Order Date) is "from" and "to" PARTY; PARTY is "the source of" and "the destination of" ORDER.]
More compact.
Makes it clear that attributes and relationships of the super-type also apply to the sub-type.
  “Each Company may be the source of one or more Orders.”
  “Each Household may be the source of one or more Orders.”

The E/R UML Approach . . .

The Steps . . . Presentation of models
  Aesthetics
  Presentation approach

A word about presentation . . .
The first objective of a data model is presentation to a non-technical audience. This requires:
Effective use of language
  Business terms for entity classes.
  Business assertions for relationships.
Good aesthetics
  No more than 10-12 boxes per page.
  Straight lines.
  “Dead crows” positioning. (OK, “starry skies”…)
Effective presentation
  A succession of diagrams
  Each adding 2-4 entity classes.

These principles are independent of notation.

About the drawings . . .
No bent lines.
Orient boxes so the “many” side of relationships is up or to the left. (“Starry skies” approach)
Each subject area must fit on one page.
  No more than 12-15 boxes
  Fewer than 10 is better

Before . . .

After . . .
“Ok, let’s talk about Tests and Measurements…”

About the Presentation . . .
Build up the presentation a few entity classes at a time.
  Start with one or two entity classes.
  Add one or two.
  And so forth.
For each slide, highlight what is new on that slide.

Conclusions . . .
UML can be used to represent architectural entity/relationship diagrams, with constraints:
  Orientation toward the domain of discourse (problem domain).
  Addressing only classes of significance to the business.
  Changing the syntax of role names.
  Addressing the aesthetics of the models.
Data model quality is a function of:
  Clarity of thought
  Clarity of presentation
Data model quality is not a function of the notation selected.

Questions . . . ?
And now for an example . . .
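
As a closing illustration of the role-name convention (this is not the example presented at the session), the reading rule "Each <subject> must be / may be <role name> one and only one / one or more <object>" can be sketched in a few lines of Python. The `Role` class and its field names are hypothetical, introduced only for this sketch; plural forms are supplied by hand rather than derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical model of one direction of an E/R relationship."""
    subject: str         # first entity class
    predicate: str       # role name, a prepositional phrase
    minimum: int         # minimum cardinality: 0 or 1
    maximum: str         # maximum cardinality: "1" or "*"
    target: str          # second entity class, singular
    target_plural: str   # second entity class, plural

    def sentence(self) -> str:
        # Minimum cardinality chooses "must be" (1..) or "may be" (0..).
        verb = "must be" if self.minimum >= 1 else "may be"
        # Maximum cardinality chooses "one and only one" (..1)
        # or "one or more" (..*).
        if self.maximum == "1":
            return (f"Each {self.subject} {verb} {self.predicate} "
                    f"one and only one {self.target}.")
        return (f"Each {self.subject} {verb} {self.predicate} "
                f"one or more {self.target_plural}.")


ROLES = [
    Role("Order", "from", 1, "1", "Party", "Parties"),
    Role("Party", "a customer in", 0, "*", "Order", "Orders"),
    Role("Book", "primarily about", 1, "1", "Topic", "Topics"),
    Role("Topic", "of", 0, "*", "Book", "Books"),
]

if __name__ == "__main__":
    for role in ROLES:
        print(role.sentence())
```

Because the sentence is assembled mechanically, a role name that is a noun or that repeats the class name produces an obviously awkward sentence, which is one quick way to check a model against the naming rule.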