Integration Definition (IDEF1X)

We’ll begin with IDEF1X because that’s the notation we’ll use throughout this course. It’s found in most CASE and modeling tools and is the language used in U.S. government projects, having been developed in the 1970s by the U.S. Air Force and revised by D. Appleton in 1993. It has gone through its testing period and is able to document most discoveries about how data relates. But unlike Latin, it isn’t a dead language: it’s growing to encompass some of the newer Unified Modeling Language (UML) and object modeling needs in its latest incarnation, IDEF1X97.

Note: Using IDEF1X is a personal preference; you can generally draw the same analysis in many different notations. Some notations have symbols that cover concepts others don’t. For instance, Barker notation can denote either/or relationships, but IDEF1X can’t. Whatever notation you use, you should practice it until you’re comfortable using it.

The symbols used in IDEF1X are close enough to the symbols used by other modeling styles to be recognizable, and they’re different enough to be confusing if you haven’t used other modeling styles. The following list describes the elements you need to master in order to document data characteristics in this graphical language:

- Entities, tables, and views are drawn as either square- or round-cornered boxes. Square corners denote a lack of dependence for identification on another entity. Round corners denote the opposite.
- Relationships are solid lines if the foreign key is identifying and dashed if it’s nonidentifying. Solid and dashed lines are completed with a set of terminating symbols that can be combined to cover a variety of cardinality and optionality rules.
- Entities that are categories have symbols signifying whether the subtypes represent a complete or incomplete membership.
- Attributes and columns are displayed, depending on the view, with an optional set of characteristics.
- Primary keys are segregated from other attributes above a dividing line in an entity box.
- Foreign, alternate, and inversion entry keys/indexes are optionally displayed with text notations.

Information support for control systems Lesson 12 / Student Page 1/12

No matter the choice of drawing style, what you’re capturing is essentially the same. You may just draw it a little differently depending upon the syntax you use. You’ll now take a closer look at this notation.

Boxes

Boxes denote three different concepts. Logical models contain entities, and Physical models contain tables and views. Because all three are drawn the same way, it’s often hard to tell at first glance whether you’re looking at a Logical or a Physical model (see Figure 4-1).

Figure 4-1: The box shapes in IDEF1X

The following are some pretty basic rules about boxes on a model.

Dual Syntax

All the boxes on a model represent either entities or tables/views. They can’t be mixed. Although they may look similar, logical and physical objects are never used in the same model. The only boxes used in both logical and physical diagrams are cosmetic drawing objects used to frame and highlight text.

Singular Purpose

Boxes have one purpose. They’re visual boundaries surrounding a data set. On a relational model, this basically means the following:

- Boxes don’t represent anything other than entities (in a Logical model) or tables or views (in a Physical model).
- All entities or tables/views are represented as a box. They aren’t represented any other way. Even discoveries from other logical analyses, or actual objects in other databases that won’t be deployed, are still boxes on the model. They’re usually noted with a distinctive color, text format, or naming convention to identify them appropriately.

Location Indifference

In IDEF1X, boxes aren’t in any order. They aren’t arranged in any special way, and their placement doesn’t mean anything.
You get the same logical or physical design no matter how the boxes are arranged on the model. They should be arranged to best communicate data discoveries. You’ll find that people get used to your placements and react negatively when you move things around. Some argue that the physical creation order should order the boxes on the model, so that everything that needs to be built first appears above the objects that need to be built later. Others claim all relationship lines should either be “in the top and out the bottom” or on the sides noting the migrating key (but not all directions at once). But these are conventions, not hard-and-fast rules.

Concentrate on arranging the objects on your model so that the result is a useful product for your customers. We have a tendency to try to use as little paper as possible so that we can print our models more easily. This won’t be useful if you’re trying to work with someone who needs the creation precedence clearly noted and wants the independent entities placed above the dependent entities. Use your best judgment, and be sensitive to your clients’ needs.

Corner Meaning: Dependent/Independent

The corners of the boxes denote their degree of dependence. Square-cornered boxes denote independent entities or tables, and rounded-cornered boxes denote dependent entities or tables. You can determine this dependency by whether the primary key attributes needed to identify an instance in the member set also identify an instance in another entity’s member set. If you look at Figure 4-2, you can see that Entity2 and Table_2 are dependent on their parents. On the other hand, Entity4 and Table_4 are independent of their parents. The same occurs with tables in the Physical model. Database views are always dependent because they can’t exist without the tables from which they’re derived.
Figure 4-2: Corner shape meaning

For example, Figure 4-3 shows the little example model described earlier.

Figure 4-3: Entity dependence

You can see from this example that a Car Order depends on Order for a portion of its identity. You need the Order Number identifier to be able to complete the identifier (or primary key) of the Car Order. Remember that dependence results in having a portion of the parent’s complete identifier created in the child entity or table and is important to the sequencing of some data creation and deletion. We’d like to say that it will be your visual clue to all sequential dependence, but that isn’t true. Certain mandatory nonidentifying relationships will also impact sequential tasks of creation and deletion of objects, constraints, and data, yet in IDEF1X they aren’t shown with soft corners.

Lines

Line styles are the same on both the Logical and Physical model. The only exception is the line noting the relationship of a table to a view, which has no counterpart in Logical modeling. Lines in the model are used to segregate, connect, and note membership.

Lines Within Boxes

Lines in boxes segregate the data elements into two types: those that act as a primary key or identifier and those that don’t. Every text label above the line notes one separate data element that has been chosen to either partially (in combination with other data elements), or uniquely, identify each member of the entity. That separator line divides the data elements into two groups: the group of primary identifying attributes and the group of attributes not used as part of a primary identifier. Those data elements under the line could still be defined as candidate keys and could therefore be identifying attributes; they just aren’t part of the primary key. You should only ever see one line in a box, as shown in Figure 4-4.
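The corner and key-line rules can be sketched as a small check. This is only an illustration, not part of IDEF1X or of any modeling tool: the `Entity` class and `corner_shape` function are hypothetical, and the attributes "Vehicle Serial Number" and "Employee Id" are invented for the example. It assumes an entity is dependent when a parent’s whole key sits above the line in its own primary key.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    primary_key: list[str]  # attributes drawn above the line
    other_attributes: list[str] = field(default_factory=list)  # below the line
    # Parent entity name -> key attributes migrated from that parent.
    migrated_keys: dict[str, list[str]] = field(default_factory=dict)


def corner_shape(entity: Entity) -> str:
    """Return "rounded" for a dependent entity and "square" for an independent one."""
    for attrs in entity.migrated_keys.values():
        # Dependent: a parent's whole key sits above the line in this entity.
        if attrs and all(a in entity.primary_key for a in attrs):
            return "rounded"
    return "square"


# Order has a parent (Employee), but the migrated key stays below the line,
# so Order is still independent. "Employee Id" is a hypothetical attribute.
order = Entity(
    "Order",
    ["Order Number"],
    other_attributes=["Employee Id"],
    migrated_keys={"Employee": ["Employee Id"]},
)
# Car Order needs Order Number to complete its own identifier.
# "Vehicle Serial Number" is a hypothetical second key attribute.
car_order = Entity(
    "Car Order",
    ["Order Number", "Vehicle Serial Number"],
    migrated_keys={"Order": ["Order Number"]},
)

for e in (order, car_order):
    print(e.name, "->", corner_shape(e))
```

Under these assumptions Order comes out square and Car Order rounded, which matches the reading of the example model given above.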
Figure 4-4: Primary key boundary line

Relationships Between Boxes

Relationships, or constraints (as you saw earlier), tie entities and tables together and are also depicted using lines. These lines often have a verb phrase explaining why the tie is there or showing the name of the constraint. These lines also terminate using a specific symbol set to depict quite a bit of information about the relationship. But focus for a moment on the simple line styles shown in Figure 4-5: solid, dashed, and dotted.

Figure 4-5: Logical and physical relationship line graphics

You can read these examples to mean the following:

- Solid lines denote that the entire complement of data elements above the line (the primary key of the parent entity or table) is migrating to the primary key position of the entity (or table) to which it’s being related. This is called an identifying relationship. You can look at the one between Order and Car Order in Figure 4-6. A Car Order can’t exist without an Order.
- Long, dashed lines denote that the entire complement of data elements above the line of the parent entity or table is migrating to a position below the line of the entity or table to which it’s being related. This is called a nonidentifying relationship. The recursive relationship from Order back to Order, which denotes a Giveaway Order, is nonidentifying. Specifically, an Order could exist without a Giveaway Order, and one doesn’t need the Order that caused the Giveaway Order creation to find the Giveaway Order.
- The last line style is a series of dots or short dashes. It’s a physical notation only, since it notes that a view uses a table for source data. Views can use any or all of the columns in the tables they reference. The GiveawayOrderView in Figure 4-6 is an example.

Figure 4-6: Various line types in a small Physical model example
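To make the three line styles concrete, here is a minimal sketch of how they might translate into a physical schema, using Python’s standard `sqlite3` module. The table and view names follow the example model, but every column name, and the view’s filter, are assumptions made for illustration; the source doesn’t show the actual definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE "Order" (
    order_number INTEGER PRIMARY KEY,
    -- Nonidentifying (dashed line): the migrated key sits below the line.
    -- It is nullable because most orders are not giveaways.
    generating_order_number INTEGER REFERENCES "Order"(order_number)
);
CREATE TABLE Car_Order (
    -- Identifying (solid line): the parent's key is part of this primary key.
    order_number INTEGER NOT NULL REFERENCES "Order"(order_number),
    vehicle_serial_number TEXT NOT NULL,
    PRIMARY KEY (order_number, vehicle_serial_number)
);
-- Dotted line: the view only draws on its source table.
CREATE VIEW GiveawayOrderView AS
    SELECT order_number, generating_order_number
    FROM "Order"
    WHERE generating_order_number IS NOT NULL;
""")

conn.execute('INSERT INTO "Order" VALUES (1, NULL)')  # an ordinary order
conn.execute('INSERT INTO "Order" VALUES (2, 1)')     # a giveaway caused by order 1
conn.execute("INSERT INTO Car_Order VALUES (1, 'SERIAL-001')")

try:
    # A Car Order can't exist without an Order, so this insert is rejected.
    conn.execute("INSERT INTO Car_Order VALUES (99, 'SERIAL-002')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

giveaways = conn.execute("SELECT * FROM GiveawayOrderView").fetchall()
print(orphan_rejected, giveaways)
```

The parent key appears inside the child’s primary key for the solid line, as an ordinary nullable column for the dashed line, and only as a query source for the dotted line.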
Another type of line notation that’s used solely in Logical modeling denotes a category grouping, as shown in Figure 4-7. As a reminder, when using a category structure, you have one parent (supertype) entity with as many subtype entities as required to document your analysis. The solid lines tell you that the relationship is identifying or, in other words, that all the category entities share the same primary key. The single or double lines at the intersection tell you whether the entire set of category subtype divisions is noted. A complete category notes all the subtypes, and an incomplete category notes only some of them.

Figure 4-7: Lines representing category relationships

The physical notation for categories is the default notation used on the Physical model. For Physical models, the same notation is used for both complete and incomplete categories.

Terminators

Terminators show up at the end of lines. They tell you how many instances of an entity or table are subject to the relationship.

Cardinality Terminator

Cardinality terminators cover the “how many?” part of the question. These symbols give you lots of flexibility in defining relationships. They also help database administrators (DBAs) figure out ratio and sizing estimates in the physical world. The meaning is conveyed through a simple terminated line or a dot, which can be enhanced with a letter or number notation, as shown in Figure 4-8.

Figure 4-8: IDEF1X cardinality graphics

Table 4-1 shows how you read these rules.

Table 4-1: IDEF1X Cardinality Graphics Meanings

Relationship line terminator symbol: Plain line
    Meaning: One and only one
    Infers: Generally the parent or source.
    Example: This is the parent terminator. It’s used for child to parent business rules, such as “Each order line makes up one and only one order.”

Relationship line terminator symbol: Z plus dot
    Meaning: Zero or one
    Infers: Boolean decision.
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each order generates zero or one order.”

Relationship line terminator symbol: P plus dot
    Meaning: One or many
    Infers: Mandatory requirement of at least one.
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each order is made up of one or many order lines.”

Relationship line terminator symbol: Dot
    Meaning: Zero, one, or many
    Infers: Most flexible rule.
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each order is made up of zero, one, or many order lines.”

Relationship line terminator symbol: <Number> plus dot
    Meaning: Specifically <N>
    Infers: The most restrictive. That quantity only (always).
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each calendar year is made up of 12 months.”

Relationship line terminator symbol: <N–N> plus dot
    Meaning: The range Number–Number
    Infers: A range rule. Must be at least the lower but may be as high as the higher of the two values.
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each calendar month is made up of 28 to 31 days.”

Relationship line terminator symbol: <note> plus dot
    Meaning: Whatever it says about a multiple relationship
    Infers: A very complicated “how many?” that can’t be stated any other way.
    Example: This is a child terminator. It’s used for parent to child business rules, such as “Each customer younger than 12 or older than 65 is offered a ticket discount.” (This is a good example of a logical modeling notation that doesn’t translate simply into a physical implementation.)

Note: This table includes two notations (<N–N> plus dot and <note> plus dot) that aren’t standard IDEF1X.

Nullability Terminator

The nullability terminator symbol denotes whether a relationship is required from all the child instances. This is read from the child to the parent, and in the case of a mandatory relationship it reads as follows:
Every child <Entity/Table> instance must be related to a parent <Entity/Table> instance.

In the case of a null-allowed relationship, it reads as follows:

Every child <Entity/Table> instance is allowed not to be related to a parent <Entity/Table> instance.

The symbol to denote this latter relationship type is an open diamond at the end of the line closest to the parent. For example, in the little Order model you can see in Figure 4-9, the diamond is at the end of the relationship noting the connection of Order to Order and Employee to Order. You’ll see this symbol only at the end of a nonidentifying relationship (the dashed line). No nulls are allowed in an identifying relationship, and the symbol isn’t used in view notation.

Figure 4-9: Notation for nullability

Notice in Figure 4-9 how we’ve used the nullability symbols to say that every Employee must have a Building_Location. However, not every Order generates a Giveaway Order; an Order may have a Giveaway_Order, but this relationship isn’t required.

View Terminator

The small dashed line and a hollow dot terminator denote a view (see Figure 4-10). As a reminder, views are found only on Physical models, and they contain prepackaged or filtered data from source tables. Everything in them depends for existence on the original sources. We tend to look at that empty dot as a ghost terminator: you model views for many different reasons, but they have no substance of their own. It shows up as an empty terminator since the notation shows a dependency only between the view and the source tables. You may find views on Physical models that have no visible relationships if these views are referencing tables that aren’t on the model.

Figure 4-10: Terminator for a view
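The cardinality and nullability terminators described in this section can be read as simple rules on counts and on missing parents. The sketch below is one possible way to write those rules down, not a standard encoding: the `CARDINALITY` mapping, the function names, and the string labels for each terminator are all hypothetical. It covers only the dot, P, Z, and exact-number terminators.

```python
# (minimum, maximum) children per parent; None means "many" (no upper bound).
CARDINALITY = {
    "dot": (0, None),  # zero, one, or many
    "P": (1, None),    # one or many
    "Z": (0, 1),       # zero or one
}


def child_count_allowed(terminator: str, count: int) -> bool:
    """Check a child count against a dot, P, Z, or exact-number terminator."""
    if terminator.isdigit():  # "<Number> plus dot": exactly N children
        low = high = int(terminator)
    else:
        low, high = CARDINALITY[terminator]
    return count >= low and (high is None or count <= high)


def parent_key_allowed(parent_key, diamond: bool) -> bool:
    """A missing parent (None) is allowed only when the open diamond is drawn."""
    return parent_key is not None or diamond


# "Each order is made up of one or many order lines": zero lines breaks the rule.
print(child_count_allowed("P", 0))
# "Each calendar year is made up of 12 months."
print(child_count_allowed("12", 12))
# An Order with no generating Order is fine because that line carries a diamond.
print(parent_key_allowed(None, diamond=True))
```

Keeping the minimum and maximum as a pair mirrors how the table reads each symbol: the letter or number changes only the bounds, while the dot itself always marks the child end.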