MAKING THE TRANSITION FROM ERD’S TO UML AND ORACLE’S OBJECT DATABASE DESIGNER

Dr. Paul Dorsey, Dulcian, Inc.

Overview

UML is not simply a replacement for Entity Relationship Diagramming. It is a complete, integrated object-modeling environment with many parts. Of those parts, the class diagrams alone are contenders for replacing entity relationship models.

Make no mistake: UML is the emerging standard. We no longer operate in a development environment where there will be many competing modeling standards. Many of the leading names in the object-oriented software community, including Oracle, combined their talents to form a consortium and developed a unified modeling standard. The other companies that participated were HP, I-Logix, ICON Computing, IBM, MCI Systemhouse, ObjecTime, IntelliCorp and James Martin & Co., Platinum Technology, Rational Software, Taskon, Sterling Software, and Unisys.

Despite its advantages over ER diagramming, UML is not without its weaknesses. In some ways, UML diagrams have more symbols, which makes them more cluttered and therefore less easily understood by users. This may be because UML was not explicitly created to support database design. However, class diagrams in UML are a superset of entity relationship modeling. There is nothing that can be expressed with ERDs that cannot be expressed in UML notation, and there are many structures and relationships that can be expressed in UML but not in ERD notation.

A major strength of UML is that it is explicitly extensible. If there are things we need UML to do that are not inherent in the UML standard, vendors such as Oracle are free to create additional objects to support the specific needs of their environments. This openness of the UML architecture raises the specter of deviation from the standard. However, UML is rich enough that such deviation should be minimal and easily controlled.

UML is here to stay.
One could argue that it is already the standard for all object-oriented development. UML can and should be used now for both logical and physical relational database modeling. The only reason not to make the shift now is that we do not currently have the products to support it. As the relational environment migrates to an object-relational environment, UML will also become the standard for database development.

Shifting from ERD to UML

It is not difficult to shift from basic ERD to UML terminology, as shown in Table 1.

ERD Term                              UML Term
entity                                class
instance of an entity                 object
relationship                          association
supertype/subtype                     generalization
relationship dependency (UID bar)     composition (usually, but not always)
attributes                            attributes

Table 1: ERD – UML Terminology

Any ER diagram can be easily mapped to a UML diagram. For example, arcs (exclusive ORs) are represented by a dotted line in UML diagrams. In addition, many structures that cannot be represented in an ERD can be represented in UML, as shown in Figure 1.

http://www.odtug.com/members/tool_corner/c_papers/083.PDF

[Figure 1: ERD relationships translated into UML notation. Two columns, ER Modeling and UML Modeling, pair each ERD relationship with UML multiplicities drawn from 1, 0..1, * and 1..*.]

Classes also have operations: you can declare what sort of behavior can be associated with a particular class.

UML Naming Conventions

In ERDs, there are two sides of a relationship to name. In UML, there are not only role names attached to each side of the relationship, but we also have the option of placing a name and a direction on the relationship itself. The ends of the relationship are called “association roles.” This concept of a role works very well in many cases. For example, the employment relationship between Department and Person, drawn using the full UML notation, would be as shown in Figure 2.

[Figure 2: Roles in UML. Person (role Employee, multiplicity *) and Dept. (role Employer, multiplicity 0..1) are joined by the association “Works for.”]

This diagram reads as follows: “The department acting as employer may employ any number of persons as employees.”

Linguistically, the UML notation expresses the role that each object in the class assumes in the relationship with a noun on each side. The relationship itself is expressed with a verb and preposition. This notational scheme holds up fairly well. However, sometimes it is difficult to determine the appropriate word to describe the roles of the objects. Also, if you are creating abstract models where the same relationship might represent more than one type of association, the naming becomes somewhat more complex.

There are times when the same ERD verb phrases that we used on relationships still work well with UML models. We always use a “relationship phrase” consisting of a noun or verb followed by a preposition. Additionally, if required to enhance the clarity of the diagram, we will use roles on the relationship ends. Note that when using this “relationship phrase” for a 1:many relationship, it is usually more descriptive to name the relationship going from the child (many) side to the parent (one) side. For example, in the Dept/Emp example, “works for” usually seems easier to understand than “employer for.”

Extensibility of UML

The UML designers allowed for the possibility that UML might not be able to satisfy the modeling needs of everyone. Because UML was designed to support object-oriented programmers, it sometimes falls short in supporting object-relational database modeling. Using UML, you can either extend the existing functionality or create new types. UML provides three mechanisms to extend the UML notation: stereotypes, constraints and comments.

Stereotypes and constraints are keywords that can be attached to any UML element to alter its meaning or functionality. For example, this is very useful if you have a many-to-many association between two classes.
This can be drawn with an association diagram as shown in Figure 3.

[Figure 3: “ERD”-style representation of many-to-many relationship. Enrollment sits between Student and Course Offering, with multiplicity 1 at Student and at Course Offering and * at each Enrollment end.]

In this UML diagram, an association means that a student can take a given class only one time. This may be satisfactory for this example. However, it would not be true for the relationship between a Department and an Employee. The Employment History of an Employee may include being employed by the same Department more than once. In this case, the stated UML conventions are too restrictive.

Stereotypes

It is often difficult to determine whether a keyword should be designated as a stereotype or a constraint. Keep in mind that if you are limiting existing behavior or functions, a constraint should be used. If you are redefining or extending behavior, then a stereotype is appropriate.

Stereotypes, designated by «guillemets», extend or redefine an element. There are pre-defined stereotypes such as “Abstract” for a class. However, users can also create their own stereotypes to alter or extend the semantics of UML. We need to extend the functionality of the relationship between Department and Employee. This can be done with the stereotype «duplicates allowed» as shown in Figure 4.

[Figure 4: UML diagram showing stereotype. Employee (*) and Department (*) are joined by the association Employment History, stereotyped «duplicates allowed».]

Constraints

UML can also be extended through the notion of constraints. This is one of the great strengths of UML in comparison to ER diagramming. Constraints, designated with curly brackets { }, limit the functionality of the UML object. There are native constraints defined by UML, such as the “Or” constraint between associations. However, additional constraints may be defined by the user.
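The difference between the default association and the «duplicates allowed» stereotype described under Stereotypes can be sketched as set versus multiset (bag) semantics on the links. The following Python sketch is only an illustration under that assumption: the student, course and department values are made up, and nothing here reflects how ODD or Oracle8 stores associations.

```python
from collections import Counter

# Hypothetical link data, invented for illustration only.
enrollment_links = [
    ("Ann", "DB101"),
    ("Ann", "DB101"),  # attempt to link the same student and offering twice
    ("Bob", "DB101"),
]

# Default association: the links form a set, so a repeated pair
# collapses into a single link.
default_association = set(enrollment_links)

employment_links = [
    ("Ann", "Sales"),
    ("Ann", "Research"),
    ("Ann", "Sales"),  # Ann is employed by Sales a second time
]

# «duplicates allowed»: the links form a multiset (bag), so repeated
# pairs are kept and counted.
duplicates_allowed = Counter(employment_links)

print(len(default_association))              # 2
print(duplicates_allowed[("Ann", "Sales")])  # 2
```

Under the default reading the second Ann/DB101 link disappears, while under «duplicates allowed» both periods of employment in Sales are retained.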
As an example, a recursive relationship on an entity gives no indication whether the real situation being modeled is a regular tree structure, a tree of finite height, a linked list or some other similar structure; all of these would be modeled in the same way using entity relationship constructs. In UML, you can explicitly constrain the relationship in a very precise way. For example, a recursive relationship on the class Employee can be marked with the constraint {tree}, as shown in Figure 5.

[Figure 5: Comparison of ERD and UML diagram of recursive tree relationship. Labels in the figure: Person, Boss (0..1), Employee (*), “Managed by” and the constraint {tree}.]

Comments

Another useful example of the extensibility of UML is that we can create a syntax for attaching comments to a diagram. In the class diagram, comments are graphically represented text objects that can be attached to any UML object. Comments are useful for declaring data-related business rules that could not be represented otherwise. We can create a new type of comment called “Business Rule,” which can be placed as appropriate in our UML diagrams in order to make them cleaner, as shown in Figure 6.

[Figure 6: Example of diagram with comment attached. PO (1) to PO Dtl (*), with the comment: Each PO must have 1 detail before status = “APPROVED”.]

Associations

Naming associations is done differently in UML than in entity relationship diagramming. Rather than naming both sides with verb phrases as in ERDs, in UML the relationship is named once and a directionality arrow is added, usually pointing from the “many” side to the “one” side of the relationship. In addition, in UML notation, you can also declare the role that the objects in each class play in the relationship, as shown in Figure 7.
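A named, directed association with role names and multiplicities can also be written down as plain data, which makes the single-name convention concrete. The sketch below is a minimal Python illustration; the names parse_multiplicity, AssociationEnd and Association are hypothetical and belong to neither UML nor any Oracle tool. The EMP/DEPT values, the roles worker and employer, and the multiplicities * and 0..1 follow the Figure 7 example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def parse_multiplicity(text: str) -> Tuple[int, Optional[int]]:
    """Turn a multiplicity such as '0..1', '1..*', '1' or '*' into (min, max).

    max is None when the upper bound is unlimited ('*').
    """
    text = text.strip()
    if text == "*":
        return (0, None)
    if ".." in text:
        low, high = text.split("..", 1)
        return (int(low), None if high == "*" else int(high))
    return (int(text), int(text))


@dataclass(frozen=True)
class AssociationEnd:
    class_name: str
    role: str
    multiplicity: str

    def allows(self, count: int) -> bool:
        """True if `count` linked objects satisfies this end's multiplicity."""
        low, high = parse_multiplicity(self.multiplicity)
        return count >= low and (high is None or count <= high)


@dataclass(frozen=True)
class Association:
    name: str               # named once, read from source to target
    source: AssociationEnd  # usually the "many" side
    target: AssociationEnd  # usually the "one" side

    def read(self) -> str:
        return (f"{self.source.class_name} (as {self.source.role}) {self.name} "
                f"{self.target.class_name} (as {self.target.role}) "
                f"[{self.target.multiplicity}]")


works_for = Association(
    name="works for",
    source=AssociationEnd("EMP", "worker", "*"),
    target=AssociationEnd("DEPT", "employer", "0..1"),
)

print(works_for.read())  # EMP (as worker) works for DEPT (as employer) [0..1]
```

Here works_for.target.allows(2) is False, because an employee may work for at most one department, while works_for.source.allows(50) is True.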
[Figure 7: Comparison of naming associations in ERD and UML. ERD: EMP and DEPT with the verb phrases “Works for” and “Employer of.” UML: EMP (*) and DEPT (0..1) with the single association name “works for.”]

In Figure 7, the Employee, acting in the role of worker, “works for” the Department, which acts in the role of employer. This ability to declare what role the object plays in the relationship provides greater flexibility in accurately naming associations.

Generalizations

Generalizations in UML are somewhat more flexible than supertype/subtype relationships in ERDs. There need not be only one generalization structure from a given class. We are able to have multiple generalizations operating on the same class, as shown in Figure 8, an example of a consulting contract, which can be Time and Materials or Fixed Price and independently classified as Government or Private.

[Figure 8: Examples of multiple generalization relationships. ERD: Contract subtyped by combinations of Fixed Price or T&M with Private or Government. UML: Contract with two independent generalizations, Billing type (Fixed Price, Time & Materials) and Org. type (Private, Government).]

Eventually, we hope that UML will evolve so that fewer relationships remain untyped. This notion of “typing” a relationship will be somewhat difficult for most ER modelers, since the types of relationships in ER modeling are quite limited. ER modelers should be very careful not to fall into the trap of using UML to simply draw their ERDs with a different notation.

Using Oracle’s Object Database Designer (ODD)

Oracle8 is not the first Oracle product to include object-oriented thinking. In Forms 4.5 we had our first taste of object orientation through property classes. We could set and enforce standard sets of properties for any object in our applications. PL/SQL used with libraries and object groups was a rich enough set of features to provide us with some ability to create reusable program components. A few of us even went so far as to encapsulate whole parts of our applications into reusable structures.
Now the products are starting to better support this evolution to object-oriented thinking. The introduction of UML as a data modeling language has significantly improved our ability to capture complex business rules in a data model. Oracle has now released a new modeling tool, the Object Database Designer (ODD). ODD has been integrated into Oracle Designer so that both modeling techniques are based on the same set of tables and views.

UML provides far more flexibility in the definition of our data models. Oracle’s ODD allows us to develop detailed data models that serve us throughout the system development life cycle and to generate DDL from them. However, in Oracle Designer’s ODD, many of the important features of UML have not yet been implemented. This does not mean that object-relational databases are a bad idea, just that they are still evolving. In the next few years, object-relational databases will completely supplant traditional relational databases as the standard for new systems.

In Oracle Designer we need a full implementation of the UML class diagrams in the Object Database Designer, and we need to see some other parts of UML make their way into the product. We also need to be able to generate modules to the object structures.

Conclusions

It is clear that object-relational databases are the wave of the future. However, we are not ready to build production object-relational systems. First, the Oracle8 object extensions need to go through another iteration. At a minimum, we need inheritance to be able to say we have an object-oriented system. We waited ten years to get our relational databases to support objects; let us hope we can get inheritance more quickly. We also need improved performance and some improved ability to modify object structures once they are built.

About the Author

Dr. Paul Dorsey is the President of Dulcian, Inc., an Oracle consulting firm that specializes in data warehousing, systems development and products that support the Oracle environment. Paul is co-author with Peter Koletzke of Oracle Press’ Oracle Designer Handbook and with Joseph Hudicka of Oracle Press’ Oracle8 Design Using UML Object Modeling. Paul is an Associate Editor of SELECT Magazine. Paul is also collaborating on a book about Oracle Developer. He is President of the NY Oracle Users’ Group and very active in the Oracle user community. Paul has won best presentation at both ECO and IOUW and was a finalist for best presentation at ODTUG. He and Peter Koletzke shared the Pinnacle Publishing Technical Achievement Award at ECO for their work on an Oracle Forms template. Paul can be contacted at pdorsey@dulcian.com or through Dulcian’s Website at http://www.dulcian.com/.