Table of contents

Classes & attributes
    What is a (data) model?
    Why does data modelling matter?
    Modelling languages
    A class is an abstraction
    Attributes and Data types
    Representing Instances in MSAccess or MSExcel
    Why are class definitions important?
Associations
    Basics
    Unary association
    Ternary association
    Aggregation

Classes & attributes

What is a (data) model?

"Dictionary.com" defines a model as "A systematic description of an object or phenomenon that shares important characteristics with the object or phenomenon."
So, models present a systematic, and most often simplified, description of what they represent. Such a description is a helpful instrument for studying the characteristics of what the model represents. UML offers a notation for visual, symbolic models that assist a software engineer in understanding the requirements and design of a software system. The notations offered by UML are grouped into modelling techniques. One of these techniques is called "class diagrams". This technique is used to map out the structure of a business domain. It visualizes the business objects, the information we need about these objects, and how the objects are related to each other.

The description of a business domain is unique for every business, even when on the surface two businesses seem similar. Let me give you an example. Let's say you own a web shop that sells coffee mugs. You need an information system that keeps track of the available mugs and of the orders placed by your customers. Different customers might decide to buy the same mug. So one product can end up in several shopping baskets. It's different when you are an artist selling your unique paintings online. Each painting can be bought only once. So in this case, each product can be contained in one basket at most. This example shows how each business is unique and will therefore yield a different domain model, adapted to its specific needs.

UML class diagrams are the prime modelling instrument used to map out specific business domains. Creating a UML class diagram for a business will enable you to pinpoint what makes this business unique and will provide a good starting point for developing an information system.

Why does data modelling matter?

Why is domain modelling important? Creating a domain model allows a deeper understanding of that domain. And this deeper understanding is crucial for creating software systems that can adapt to the ever-changing requirements of their users.
Let me illustrate this with a little story about a boat rental business. Once upon a time, an entrepreneurial man lived along a lake. He witnessed the tourists having fun on the water and he realized he also had a boat in his shed. He put out the boat with a "FOR RENT" sign. The tourists were attracted and soon the boatman made enough money to buy extra boats.

As his business was growing, the boatman wanted to collect more information about his business. He contacted a consultancy company to help him out. A business analyst came over to find out what he needed to know. The boatman told the business analyst that each day he wanted to know the average duration of the boat trips during that day. The business analyst wrote this requirement as a mathematical model. She used the following formula: the average duration of a boat trip equals the arrival time minus the departure time per trip, summed over all trips of the day and divided by the number of trips for that day.

The business analyst passed this requirement on to the company's programmer and asked him to develop a system that would answer the boatman's needs. The programmer looked at the formula and figured out it would be very difficult to implement. In the formula, arrival times and departure times are paired per trip. However, in practice, arrival times and departure times are mixed. If the boats depart in the order 1-2-3-4, they might well return in a different order. Fortunately, the programmer was really good at maths. He figured out the formula could be changed as follows. He would sum the arrival times and the departure times separately, and then subtract and divide by the number of boat trips. This adapted formula was a lot easier to implement. The programmer developed a system with a departure and an arrival gate. Each gate was equipped with an electronic eye to register the boats departing from and returning to the harbour.
At the end of the day those times were summed, subtracted, and divided by the number of trips. The boatman was delighted with the solution.

After a while, the boatman again needed more information about his business and went back to the business analyst. This time, he needed to know the duration of the longest trip at the end of the day. The duration of the longest trip equals the maximum, over all trips registered that day, of the arrival time minus the departure time. The business analyst handed over the formula to the programmer and asked him to add this requirement to the system. The programmer looked at the formula and saw immediately that he could not apply the same simplifying trick to it. The maximum duration of a trip does not equal the difference between the maximal arrival time and the maximal departure time. The programmer explained to the business analyst that this new requirement could not just be added to the old system. The old system did not pair the arrival and departure times of a single trip. This new requirement would necessitate a completely new system.

When the business analyst informed the boatman, he was, of course, not very happy with this news. He did not understand why a simple extra requirement resulted in breaking the existing system and, subsequently, in high costs for a new system. It will not surprise you that the boatman never contacted the consultancy company again.

What went wrong? How can a simple extra requirement break an entire system? The answer lies in the identification of the core concept of the business. What is this business about? What is the core concept of the boatman's business? The core concept in this case is the boat trip. If you look at the business analyst's specification, you can see the concept of the boat trip implicitly identified in this specification.
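The difference between the two designs can be sketched in a few lines of Python. This is only an illustration: the `Trip` class, the function names, and the sample times (minutes after opening) are assumptions made for the sketch, not part of the system in the story. The gate design sees only two unpaired lists of times, which is enough for the average, while the longest trip needs the pairing that a `Trip` object preserves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    """One boat trip: the core concept of the rental business."""
    departure: int  # minutes after opening (hypothetical unit)
    arrival: int

    @property
    def duration(self) -> int:
        return self.arrival - self.departure


def average_from_gates(departures: list[int], arrivals: list[int]) -> float:
    """Gate design: sum arrivals and departures separately, no pairing needed."""
    return (sum(arrivals) - sum(departures)) / len(departures)


def average_from_trips(trips: list[Trip]) -> float:
    """Trip design: average of the per-trip durations."""
    return sum(t.duration for t in trips) / len(trips)


def longest_from_trips(trips: list[Trip]) -> int:
    """Only possible when each arrival is paired with its departure."""
    return max(t.duration for t in trips)


# Hypothetical day: boats leave in the order 1-2-3 but return in the order 2-3-1.
trips = [Trip(0, 100), Trip(10, 40), Trip(20, 50)]
departures = sorted(t.departure for t in trips)  # what the departure gate records
arrivals = sorted(t.arrival for t in trips)      # what the arrival gate records

print(average_from_gates(departures, arrivals) == average_from_trips(trips))  # True
print(longest_from_trips(trips))                 # 100
print(max(arrivals) - max(departures))           # 80, which is not the longest trip
```

With the unpaired gate data alone, the value 100 cannot be recovered, because the lists no longer say which arrival belongs to which departure.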
The programmer, however, by reworking the specification, removed the concept of the boat trip entirely. The system's design did not embody the core concept of the business. And that's why the system was broken by adding a new requirement. The lesson to remember is that you need to understand the core concept of a business. The domain model that captures these core concepts needs to be at the heart of your information system to ensure adaptability.

Modelling languages

In this presentation I will briefly explain what we mean when we talk about modelling languages and modelling techniques. A modelling language is a collection of modelling techniques. In particular, the Unified Modelling Language is a collection of over 9 different modelling techniques, one of which is class diagrams, which you will learn about in this course. A modelling technique is what you use to write down a model. It's a collection of symbols, with a set of rules that define how these symbols can be composed in a valid way to form a model. An example is the technique of process diagrams from the language BPMN, which consists, among others, of these symbols.

Modelling techniques range from purely textual to purely formal. More textual techniques, such as use case diagrams, have the main advantage that they are very easy to understand. A disadvantage, however, is that they tend to be imprecise and ambiguous, and do not offer intelligent verification and quality control. Purely formal techniques, such as process algebra, offer many formal verification possibilities, but this comes at the price of being more difficult to understand and use. UML class diagrams are somewhere in the middle. They have the advantage of being reasonably easy to understand while, at the same time, being sufficiently formal to achieve a good level of precision and verification possibilities.

There is a plenitude of modelling techniques for information systems design and development.
A few examples are Entity Relationship Modelling, UML Class diagrams, UML State charts, Flow Charts, Process Diagrams, Petri Nets, Event-driven Process Chains, and so on. In software development, different models are used for different phases and different viewpoints. In terms of the phase, models will range from abstract or conceptual and implementation agnostic to detailed and implementation oriented. Typical viewpoints in IS design are the data perspective, the behavioural perspective and the interaction perspective. Some techniques address the same viewpoint and phase, but use different symbols. For example, business process modelling can be done with BPMN Process diagrams but can also be done with Petri Nets. And conceptual data modelling can be done with entity relationship modelling but also with UML Class diagrams. Sometimes the same technique is used through different phases. For example, UML class diagrams can be used for conceptual modelling but also for program design. In this course, we will focus on the use of UML class diagrams for conceptual modelling.

A frequently used technique for conceptual data modelling is, for example, the extended entity relationship modelling technique. This technique was invented by Peter Chen for the conceptual design of databases and it was later extended with insights from semantic modelling, and so became the "Extended Entity Relationship modelling" technique. Another technique that is still used a lot is data models using the crow's foot notation. And, in this course we'll focus on UML class diagrams. This technique is part of the Unified Modelling Language, a language that is rooted in object-oriented programming, and the technique was created to represent the data aspects of classes.

While different data modelling techniques exist, they all rely on the same concept of classes, represented as boxes, and associations, represented as lines that connect those boxes.
Those associations are adorned at their ends with cardinalities, and the most prominent difference between the different techniques is, as you can see, the way those cardinalities are represented. We'll focus on the notations of UML, but in practice, at the end of this course, you will also be able to understand entity relationship diagrams and data models based on the crow's foot notation.

A class is an abstraction

In this presentation we'll explain the concept of a class. The notion of a class relies on the principle of abstraction. The principle of abstraction is something very natural that we rely on every day. Already as toddlers, we learn how to distinguish between non-living and living things, between cutlery and toys, between people and animals. Every day, we classify objects around us into concepts such as 'fork', 'spoon', 'block', 'wheel', 'pen', 'ball'. When we categorize objects, we are able to abstract from individual properties and focus on essential common properties. For example, this Collie looks very different from this Fox Terrier, yet we understand that both are dogs. And, while this animal also has a furry skin, four legs, a tail, ears, etc., we know it's a cat and not a dog. In computer terms, a concept that groups objects with similar characteristics is called a 'Class'.

Likewise, in order to be able to handle the objects in the universe of an organisation, these objects need to be categorized into concepts. So let's look at an example. At our university there are lots of people around and we are able to distinguish between employees, professors and students. The class 'student' will capture the concept of a student and define the characteristics of a student that are relevant for the university, and will abstract away the characteristics that are irrelevant.
More importantly, in defining the class student, we will define what distinguishes a student from any other person within or outside the university: namely that a student is registered for a study program at the university. People that are not registered will not belong to the class 'student'. Another example is the concept of the study program. A university offers many study programs. The concept of study program refers to a collection of courses you can follow in order to obtain a degree, and when defining the class 'study program' we will define the notion of 'study program' such that we understand what constitutes the difference between a study program and a random collection of courses.

In UML a class is represented as a rectangle with the name of the class inside it. Here you see two classes: the class STUDENT and the class STUDY PROGRAM. Each class represents a set of objects, as explained before. The class STUDENT represents a set that contains all the individual students. And likewise the class STUDY PROGRAM represents the set that contains all the different instances of study programs.

When we are modelling, we can actually distinguish two different levels. Level 0 is the instance or object level. There we find individual objects. For the class STUDENT these are the individual students like Peter, Helen, Lisa, John, ... For the class STUDY PROGRAM we find different study programs like the bachelor of law, the bachelor of business economics and so on. The upper level of abstraction, level 1, is the model or diagram level. There we find the definition of the classes, like the class STUDENT or the class STUDY PROGRAM.

Modelling is abstracting: it means that you make the transition from level 0 to level 1. You go from the instances to the definition of the concepts. On the other hand, when you want to validate the model you have to do the reverse. So you will take the model and reason on an example.
You will start from level 1 and you will try to illustrate this level 1 by finding a level 0 example that proves or disproves that your model is right.

A 'Class' has two functions. On the one hand, the class is the template or model for a group of real-world objects that are similar. It defines a type of instances and therefore we also call it an object type. This is called the "intent" of the class; it is the definition of the concept that defines class membership. In our example, a person is a student only if that person is registered for at least one study program at KU Leuven. The template will capture the characteristics that are relevant about the objects in the class. In the case of students at the university, their name, their birthdate, their home address, their email address, ... are examples of features that are relevant for the concept of the student in the context of the university. The class definition will omit the irrelevant aspects. Irrelevant characteristics of students from the perspective of the university are: the color of their hair, the color of their eyes, their height, their weight ...

At the same time, the class represents a collection of objects that conform to its intent. This is called the 'extent' of the class.

Notice that classes can represent both tangible and intangible objects. Students are, for example, tangible objects while study programs are intangible objects. Both kinds of objects can be represented as classes in the UML class diagram.

Attributes and Data types

In this presentation we'll explain the concepts of attributes and attribute types for UML class diagrams. In the previous presentation we presented the class STUDENT. We have a lot of students at the university and, in order to be able to distinguish one student from another, we will make use of certain characteristics of the students, like for example their name or their student number.
The characteristics that we want to keep information about for every individual object in a class are called the 'attributes' of that class. In our student example, the class student has attributes such as Last Name, First Name, Student Number. The list of attributes for a class is obviously specific for a particular domain. For a university, the mentioned attributes are relevant, and so would be the birth date of the student and the home address. But the blood group and weight are examples of irrelevant characteristics that would therefore not be defined as attributes.

Each object has its own values for the attributes: attribute values are specific for the instance they belong to. So, each individual student will have a value for Last Name, like Jansens and Martens for these two students. And the same holds for the First Name and the Student Number.

Let's illustrate this with a second example, namely the class PRODUCT. The class has an attribute 'number' and an individual instance of the class PRODUCT will have a value for that number, like 123456. The class has an attribute 'name' and the instance has a value for that name, like MilkyWay. And the same goes for the short description, the long description and the amount in stock.

Moreover, each attribute has a data type. Examples of data types are integer, float, text, string, and boolean. The data type constrains the values that can be given for a particular attribute. When an instance is given a value for an attribute, this value has to follow the constraints implied by the data type. In the class PRODUCT, the attribute 'number' has been defined with the type integer. So a product's number, that is, the value that you give to the attribute 'number', needs to be a whole number. On the other hand, the long description has the data type 'long text', which means that you can give a whole text as value for the long description.
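As a rough illustration of how a data type constrains attribute values, the class PRODUCT could be sketched in Python as follows. This is a sketch under assumptions: the attribute names in snake_case, the mapping of 'long text' to `str`, and the sample descriptions and stock amount are all made up for the example; only the number 123456 and the name MilkyWay come from the text above.

```python
from dataclasses import dataclass, fields


@dataclass
class Product:
    """Level 1: the class PRODUCT with its attributes and their data types."""
    number: int
    name: str
    short_description: str
    long_description: str
    amount_in_stock: int

    def __post_init__(self) -> None:
        # Python does not enforce annotations on its own, so each value is
        # checked here against the declared data type of its attribute.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, got {value!r}"
                )


# Level 0: one instance with its own attribute values (hypothetical texts).
milky_way = Product(123456, "MilkyWay", "short text", "a much longer text", 40)

try:
    # A product number that is not a whole number violates the data type.
    Product("12-34", "MilkyWay", "short text", "a much longer text", 40)
except TypeError as err:
    print(err)  # number must be int, got '12-34'
```

The class definition plays the role of the box in the class diagram, and `milky_way` plays the role of the list of values shown for one instance.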
Because the data types constrain the permissible values for an attribute, they also define the permissible operations. In a UML class diagram the attributes and their data types are written in a box below the class's name, as you see here on the left side. For the instance on the right side we can simply see a list of the values that this instance has for the different attributes.

Representing Instances in MSAccess or MSExcel

In this presentation, we briefly explain how instances of a class are represented in personal office software. It is always interesting to be able to represent the 'extent', that is, the list of instances that belong to a class. Those lists of instances can be represented as tables in a database, either in information systems or in software for personal use. An example is to represent lists of instances in programs like Microsoft Access or Microsoft Excel. Microsoft Access is an example of database software for personal use. It provides most database functionality, like for example the query language SQL. Microsoft Excel, on the other hand, is not database software, as its main focus is calculations. But it does provide some database-like functionality because it has a number of operations that are equivalent to SQL operations.

Suppose that at the model level, that is, at level 1 or class level, you have a UML class STUDENT. Then at level 0, the instance level, in Microsoft Access it will look like this. Your database would contain a table STUDENT and if you open that table you will find the list of students. In Microsoft Excel it would look like this. You would typically have a sheet that contains all the student data, and all the info of one student will be on one row in your Excel sheet.

As explained before, objects in a class are defined by means of the values they have for the class's attributes.
In Microsoft Access, you will find the attributes when you look at the description of the different tables that you have in your database. If you open the design tab for a particular table, then you will see a list of attributes. They are called 'fields' in Microsoft Access. And for each of those fields or attributes you will find a data type. In the example you see that 'Number' is of data type 'Number', that 'Last Name' and 'First Name' are of the type 'short text' and that 'Birth Date' is of type 'Date/Time'.

In Microsoft Excel you will find the attributes as the headings of the different columns, and the attribute values are then the values you find in the different cells of these columns. Each row represents an instance with all the values this instance has for certain attributes. The data type is what you find as the format of a particular cell. In the example you see how the cell E3 has been formatted as a percentage, which means the attribute has been assigned the data type 'percentage'.

A main difference with Microsoft Access is that in Microsoft Access, and in any other database, an attribute has a data type and this data type holds for all the values of that attribute. So, in other words, a complete column always has the same data type for all its cells. In Microsoft Excel, different cells of a single column can have a different data type or format. This is obviously not desirable when the data in your sheet represents the instances of a single class.

Here you can see why data types are important. The data type determines the permissible operations on the values of an attribute. Let's look at an example. In column C you see that a number of available amounts have been recorded for budgets. In row number 7 we attempt to make a sum of those available amounts by summing over the cells C2 to C6. But the result is a 0, which is obviously not correct.
The reason why you get a 0 rather than the correct available amount is that the cells C3 to C6 have been formatted as text and not as a number. Their data type is text, which means they are not available for arithmetic operations like a sum. If you set the data type to number, then arithmetic operations will be possible.

Why are class definitions important?

In this presentation, we'll explain why class definitions are important. For each class we will give a class definition. This is the description of when an object is considered to be a member of that class. Now why are those class definitions so important? A class diagram as a conceptual domain model is always made for a specific domain, namely the organization for which we create the domain model. What a concept represents, and what is relevant or irrelevant, is different and specific for each organisation. Establishing class definitions allows us to create a vocabulary for the organisation, so that everybody knows what is meant by a certain concept and has the same idea of when an object is or is not a member of a particular class.

This kind of vocabulary is called an ontology. In information science an ontology formally represents knowledge as a set of concepts within a domain. It means that for that domain you create a shared vocabulary that you use to denote the types, the properties and the interrelationships of the concepts. So, you can easily see that the UML class diagram defines an ontology. It does this for a particular domain. In this case it will be the organization for which you are defining your conceptual model. So, the conceptual domain model defines a shared vocabulary for your organization. And it is very important that an organization has a shared understanding of all its core concepts, in other words, of the concepts behind the classes that are defined in the UML class diagram.
So, let's illustrate how the definition of an object type or class is domain specific and how it helps to correctly understand what a particular domain or organization is about. Let's look at the concept of a class PRODUCT in different contexts.

First, in the context of a supermarket. In that context, the instances associated with the concept of product could look like this. There are four instances in the PRODUCT class: one represents skimmed milk, the second represents a pink lady apple, the third represents cabbage, and the fourth one kitchen towel. Now, each of those instances actually represents a number of items that are on the shelf in your supermarket. For example, skimmed milk represents the different bottles that you find on the shelf. There are many bottles linked to the instance skimmed milk. This product will have something like a stock level, and when the stock is too low you can replenish this product.

Now we will look at the concept of a class PRODUCT in the context of a car retailer. In this context the instances of the class PRODUCT will represent each individual car with its specific chassis number. The first instance here is a Citroën C1 with chassis number 12345. And the second instance is another Citroën C1 but with another chassis number, 78976, and so on. These products do not have something like a stock level and you cannot replenish them. Rather, each product now represents a specific unique item with a unique serial number. It's a totally different concept of a product, because now you cannot replenish this product and it doesn't have a stock level, like in the supermarket.

If you go to a pharmacy then the situation is different again. In a pharmacy you'll find both types of products. You can have something like a sunscreen, for which you typically don't keep track of the individual serial numbers of the different bottles of sunscreen.
But for some regulated medicines the pharmacist does have to keep track of the individual serial numbers of the individual boxes. So you see how the notion of product differs from organization to organization, and that's the reason why each class within a class diagram needs a good definition, so that all people in the organization have a shared understanding of the meaning of each individual class or concept.

Associations

Basics

In this presentation we'll explain the basics of the concept of association. When we look at the world around us, we see that individual objects are related to each other. Consider for example the world of our university, with its students and study programs. Because students register for study programs, there are links between certain objects in the class student and objects in the class study program. For example, we see here that the student Peter Martens is registered for the study program Master of Information Management. So there is a relationship (also called a link) between Peter Martens and the Master of Information Management.

Following the principle of abstraction means that similar relationships or links between instances at level 0 need to be abstracted into a higher-level concept at level 1. Such an abstract concept is called an association. An association is therefore a level 1 concept that relates classes and that represents a collection of links between the objects of those classes. It represents a type of link with specific characteristics. The first of these characteristics is whether it is optional or mandatory to have a link with other objects. So, for example, is it optional or mandatory for a student to be registered for a program? The other important characteristic is the maximum number of objects we will find at the other side of the association. In this example: what is the maximum number of programs a student can be registered for at any single point in time?
The combination of the minimum and maximum is represented as an interval: minimum .. maximum. The minimum can be 0 or 1, and the maximum can be 1 or many, where many is represented as a star.

Let's illustrate this with our example of students and study programs. In the class diagram we see that a student can be registered for minimum 1, maximum many programs. And that a study program has 0 to many students that are registered for that program. Let's look at what this means at the instance level. Looking at the example, we see that the 0 on the side of students (the red zero) indicates that we can have programs with 0 students. Here, the bachelor of Canonical law is an example of a program that has 0 registered students. But we also see that each student must be registered for a program. That's the blue 1. If we look at the class student, we see that indeed each student has at least one 'registered for' link to a study program. On the other hand, we also see that a student can be registered for many programs. This has given rise to the purple star. For example, Helen Jansens is registered both for the bachelor business economics and for the master business economics.

The way we write down an association in UML can vary. It's always a line that connects the two classes, and the line can be annotated with extra information. An association can be read in two directions. Here: from the student to the program, and from the program to the student. These individual directions are called the 'roles' of the association. In this example, the role from the student to the program has been named 'Registered for' and the role name is put next to the destination class of the role, so next to the class Program. In the other direction a program has registered students, so we put the role name 'registered students' next to the class Student. The other notation is to put the role names in the middle. In this example, an Employee works in a department, and a department has employees.
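Returning to the multiplicities of the student and study program example, the following Python sketch shows one way to check level 0 links against the intervals 1..* and 0..*. It is a minimal illustration, not a UML tool: the function names and the representation of links as pairs are assumptions, and `None` is used here to stand for the star.

```python
from typing import Optional

students = {"Peter Martens", "Helen Jansens"}
programs = {
    "Master of Information Management",
    "Bachelor of Business Economics",
    "Master of Business Economics",
    "Bachelor of Canonical Law",
}
# Level 0: each pair is one 'registered for' link.
registered_for = {
    ("Peter Martens", "Master of Information Management"),
    ("Helen Jansens", "Bachelor of Business Economics"),
    ("Helen Jansens", "Master of Business Economics"),
}


def within(count: int, minimum: int, maximum: Optional[int]) -> bool:
    """True if count lies in minimum..maximum; a maximum of None means '*'."""
    return count >= minimum and (maximum is None or count <= maximum)


def programs_of(student: str) -> set[str]:
    return {p for s, p in registered_for if s == student}


def students_of(program: str) -> set[str]:
    return {s for s, p in registered_for if p == program}


# Each student must be registered for 1..* programs.
students_ok = all(within(len(programs_of(s)), 1, None) for s in students)
# Each program may have 0..* registered students.
programs_ok = all(within(len(students_of(p)), 0, None) for p in programs)

print(students_ok, programs_ok)                      # True True
print(len(students_of("Bachelor of Canonical Law")))  # 0, allowed by 0..*
print(within(0, 1, None))                            # False: a student without a program
```

A student with no 'registered for' link would fail the first check, while a program without students passes the second.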
We put those two role names in the middle but we add two little arrows to indicate in what direction you have to use these role names. It's not mandatory to put role names, so the third example is just a straight line with no role names. And the last example is where you use only one role name and the other role name is implicit. A program consists of program years, but the relationship from program year to program has no name.

When you have two classes in a UML class diagram, you can have two different associations between those same two classes. So here's an example where a person can on the one hand be the owner of a car and on the other hand be the driver of the car. We record this information separately as blue and green links. In the bottom scheme you see that Pete owns a Mini but that this Mini has two drivers: both Pete and John can drive the Mini. The same holds for the BMW. Ann owns the BMW but both Pete and Ann can drive the BMW.

Unary association

In the previous examples, the association connected two different classes. But an association can also connect a different number of classes. In this presentation we'll explain unary associations. In some cases an association connects a class to itself. This is called a unary association. An example is when people are related to each other. In this example a person is supervised by someone, or someone is the supervisor of other people. Then you have an association with the roles "supervises" and "supervised by", depending on the direction in which you read the association. This expresses the fact that people can be related to each other. So in the instance-level example you see that Erik supervises Bart and that Bart is supervised by Erik, and on the other hand Erik supervises Jan and Jan is supervised by Erik.

Another typical example is the modelling of a hierarchy, like the departmental structure of an organization.
Departments can have departments: each organizational unit has zero to many other organizational units as subdepartments, zero if it is at the bottom of the hierarchy and one or more if it is higher in the hierarchy. Conversely, each department is part of zero to one other organizational unit: zero if it is at the top of the hierarchy, and one otherwise.

Similarly, a bill of material can also be modelled by means of a unary association. In this case, we typically don't have a strict hierarchy, as certain components, such as screws and bolts, may be used in several components. So, a component may use zero to many other components, and each component may be used in zero to many other components. Notice that no constraint is enforced on the two participants of a link. So according to this model, a component could use itself, and a component that is part of another component could at the same time contain that component. If such links should be prohibited, then additional constraints need to be specified, for example in OCL.

Ternary association

In this presentation, we'll explain ternary associations. Sometimes binary and unary associations are not able to correctly capture the relationships between objects. Consider the following example. Assume we have Suppliers, Products and Projects, and we want to capture which supplier supplied what construction material for which construction project. The model that you see attempts to capture this information through three binary many-to-many associations:

A supplier may have supplied zero to many products, and a single product may have been sourced from different suppliers.
A supplier may have supplied to zero to many projects, and a project may have sourced its materials from many suppliers.
A project may have used zero to many products, and a product may have been used by several projects.

A sample set of instances and their links is shown in these tables.
The upper left table shows which supplier supplied what product. The table on the right shows which supplier supplied to what project. And the bottom left table shows which project used what product.

Now try to answer the question who supplied the concrete for project P2. In the first table, you can see that both Peters and MaxConstruct supplied concrete. And in the second table you can see that they both supplied to project P2. So, actually, you cannot be sure. Looking again at the third table, you can see that project P2 also used wooden beams. And in the first table you can see that both Peters and MaxConstruct can supply wooden beams. So each of them may have supplied wooden beams rather than concrete to P2. Or maybe they supplied both. So the combination of the three tables just doesn't give you enough information to answer that question.

Similarly, it's not possible to answer the question which project used the wooden beams of MaxConstruct. MaxConstruct supplied to both P2 and P3, which both used wooden beams. But since Peters also supplied wooden beams and also supplied to project P2, the wooden beams of MaxConstruct could have been used for both projects or maybe only for P3.

The only way to correctly capture which supplier supplied which product for which project is to use a three-way association like this one, also called a ternary association.

The interpretation of cardinality constraints for a ternary association does, however, require some care. If you put a zero to many cardinality next to supplier, what does it mean? Does it mean that each product and each project can each have zero to many suppliers? This is what the UML manual says about the interpretation of multiplicities: For an Association with N memberEnds, choose any N-1 ends and associate specific instances with those ends. Then the collection of links of the Association that refer to these specific instances will identify a collection of instances at the other end.
The multiplicity of the other end constrains the size of this collection.

So, to determine the multiplicity on the side of supplier, you take a pair of one project and one product, and then you look at the set of suppliers that match this (project, product) pair. The product in such a pair could have been sourced from no, one or several suppliers for that project. If the product has not been used in that project, you won't find any supplier. If it has been supplied by only one supplier, then you'll find one supplier. If the product has been sourced from two or more suppliers, then you'll find many suppliers. You have to do this for all (project, product) pairs to determine the general rule that is applicable. If your model allows you to distinguish, for example, different orders at the same supplier, then a (project, product) pair may match several times with the same supplier. In such a case additional identifying attributes will be required to distinguish the different links from each other. We'll come back to this when discussing the concept of AssociationClass.

So for this example, in order to determine what to write next to the "supplied_by" role, we need to consider all (project, product) pairs and see how many suppliers we find. The role will most likely not be mandatory: not every product will be used by every project. For those pairs, we won't find any suppliers. So the minimum is 0. For the maximum, it depends on the rules. If within a project a product always has to be supplied by the same supplier, then the maximum is one. If you can source a product from many suppliers within the context of a single project, it may be many.

Aggregation

In this presentation we'll explain the concept of aggregation. Some associations convey the meaning of a "part of" relation or composition. For example, this course is "composed of" a number of modules, an order line is "part of" an order, a parcel is "composed of" items, and so on.
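Looking back at the ternary association between Supplier, Product and Project, the argument can be made concrete with a small Python sketch. It is only an illustration: the set of ternary links is hypothetical (in particular project P1 and its links are made up so that the three binary tables come out as described), and the function names are our own. The three binary associations are computed as projections of the ternary links, which shows that they cannot tell who supplied the concrete for P2, whereas the ternary links can. The last function applies the N-1 ends rule to the supplier end.

```python
# Hypothetical ternary links: (supplier, product, project).
# Only the names Peters, MaxConstruct, Concrete, Wooden beams, P2 and P3
# come from the text; P1 and the exact combinations are assumptions.
supplied = {
    ("Peters", "Concrete", "P2"),
    ("MaxConstruct", "Wooden beams", "P2"),
    ("MaxConstruct", "Wooden beams", "P3"),
    ("MaxConstruct", "Concrete", "P1"),
    ("Peters", "Wooden beams", "P1"),
}

# The three binary many-to-many associations, as projections of the links.
supplier_product = {(s, p) for s, p, _ in supplied}
supplier_project = {(s, j) for s, _, j in supplied}
project_product = {(j, p) for _, p, j in supplied}


def candidates_from_binary(product, project):
    """Suppliers that the three binary tables cannot rule out."""
    if (project, product) not in project_product:
        return set()
    return {
        s
        for s, p in supplier_product
        if p == product and (s, project) in supplier_project
    }


def suppliers_from_ternary(product, project):
    """Suppliers actually linked to this (project, product) pair."""
    return {s for s, p, j in supplied if p == product and j == project}


def supplier_multiplicity():
    """Observed (min, max) number of suppliers over all (project, product) pairs."""
    products = {p for _, p, _ in supplied}
    projects = {j for _, _, j in supplied}
    counts = [
        len(suppliers_from_ternary(p, j)) for p in products for j in projects
    ]
    return min(counts), max(counts)


print(candidates_from_binary("Concrete", "P2"))
print(suppliers_from_ternary("Concrete", "P2"))
print(supplier_multiplicity())
```

With this sample data the binary tables leave two candidate suppliers for the concrete of P2, while the ternary links identify exactly one. The observed minimum and maximum only describe this data set; the multiplicity written in the model still has to follow from the business rules.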
In the UML, you can adorn the association end at the side of the "whole" with a diamond to clarify that the association has a "whole-part" meaning. The diamond comes in two flavours: a white diamond representing a "shared" aggregation and a black diamond representing a "composite" aggregation. However, such a diamond is purely "syntactic sugar": the symbol could be removed from the UML notation without any effect on what the language can do. Functionality and expressive power would remain the same.

The white diamond represents a shared aggregation. Intuitively, this means that the parts can be shared by different wholes. If course modules can be shared across courses, then this would be an example of a shared aggregation. The UML definition of shared aggregation is rather vague and states that the precise semantics varies by application area and modeler. In other words, adding the diamond doesn't really add any particular meaning to the UML diagram. You could even argue that it rather adds confusion, since different modelers may attribute a different meaning to this symbol.

The black diamond represents a composite aggregation, which is intended as a stronger form of ownership, meaning that the parts are owned by only one whole. The order lines being part of an order and the items being part of a parcel are examples of a composite aggregation. The UML definition of composite aggregation states that in this case the composite object has responsibility for the existence and storage of the composed objects. As you can see, this is quite an implementation-oriented definition. To better understand the meaning of this construct, let's take a further look at UML's definition of the semantics of composite aggregation. The further explanation in the UML manual states that a part can be included in at most one composite object at a time.
This means that when using the black diamond, the multiplicity on the side of the whole should be 1..1 or 0..1. Furthermore, composite aggregation is associated with a cascading delete: if the composite object is deleted, all of its part objects are deleted with it. This makes sense for an order: if you delete the order, you likely also delete all the order lines that are part of that order. But for a parcel it may be different: if a parcel is destroyed, maybe you would like to take out the items first. Fortunately, this alternative behaviour is also allowed by UML: in a note, UML states that a part object may be removed from a composite object before the composite object is deleted, and thus not be deleted as part of the composite object. So, two different life cycle semantics can be associated with the black diamond.

Looking at the further explanations in the UML manual, it becomes clear that the concept of composite aggregation also lacks clear semantics. The manual explicitly states that the precise life cycle semantics of composite aggregation is intentionally not specified. The order and way in which composed objects are created is intentionally not defined either.

What is the conclusion then? How should we use this concept well? In fact, the concept of aggregation and its associated diamond symbols are purely "syntactic sugar". The symbols could be removed from the UML notation without any effect on what the language can do. If you stick to normal binary associations, the functionality and expressive power of your model will remain the same. So the advice for good modelling is to use conventional binary associations with appropriate role names, to avoid confusion in a model reader's mind.

Understanding a larger UML class diagram

Derived/implicit association

In this presentation we take a look at implicit associations in a model. Associations can be used to navigate from one object to the next along consecutive paths.
In our example here, we can navigate from a student to the programs the student is registered for, and then from these programs to the faculty the programs belong to. So, for example, from Elisa Smith you can navigate first to the programs she is registered for, that is, the master of information management and the bachelor computer science. Subsequently, you can navigate to the Faculty of Economics and Business and to the Faculty of Sciences respectively.

The fact that you can navigate over several consecutive associations implies that there are implicit associations between classes. In this example, there is an implicit association between student and faculty that represents to which faculty a student belongs and which students a faculty has. Through their subscription to the bachelor Business Economics, we can see that Karen Dieltjens and Helen Jansens belong to the Faculty of Economics and Business. Through her subscription to the master Business Economics, we see again that Helen Jansens belongs to the Faculty of Economics and Business. Through his subscription to the bachelor of Business Engineering, we can see that Sam Johnsons belongs to the Faculty of Economics and Business. Through their subscription to the Master of Information Management, we can see that Peter Martens and Elisa Smith also belong to the Faculty of Economics and Business. And then, finally, through her subscription to the bachelor computer science, we see that Elisa Smith also belongs to the Faculty of Sciences.
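The navigation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries and the function names belongs_to and has_students are our own, and the links are the ones mentioned in the walkthrough above. The implicit association is not stored anywhere; it is computed by following the two base associations one after the other.

```python
# Student -> programs the student is subscribed to (links from the example).
subscribed_for = {
    "Karen Dieltjens": {"Bachelor Business Economics"},
    "Helen Jansens": {"Bachelor Business Economics", "Master Business Economics"},
    "Sam Johnsons": {"Bachelor Business Engineering"},
    "Peter Martens": {"Master of Information Management"},
    "Elisa Smith": {"Master of Information Management", "Bachelor Computer Science"},
}

# Program -> faculty that offers it (one faculty per program in this example).
offered_by = {
    "Bachelor Business Economics": "Faculty of Economics and Business",
    "Master Business Economics": "Faculty of Economics and Business",
    "Bachelor Business Engineering": "Faculty of Economics and Business",
    "Master of Information Management": "Faculty of Economics and Business",
    "Bachelor Computer Science": "Faculty of Sciences",
}


def belongs_to(student):
    """Implicit role: student -> program -> faculty."""
    return {offered_by[program] for program in subscribed_for[student]}


def has_students(faculty):
    """Implicit role in the other direction: faculty -> program -> student."""
    return {
        student
        for student, programs in subscribed_for.items()
        if any(offered_by[program] == faculty for program in programs)
    }


print(belongs_to("Elisa Smith"))
print(has_students("Faculty of Sciences"))
```

Because the result is a set, Helen Jansens ends up with a single faculty even though two of her programs lead to it, while Elisa Smith ends up with two faculties.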
So, we have an association between students and faculty that represents which faculty a student belongs to and which students a faculty has.

The cardinality of an implicit association can be derived from the cardinalities of the base associations. In this example, the role 'belongs_to' of the association between student and faculty can be derived by combining the role 'subscribed for' of the association between student and program with the role 'offered by' of the association between program and faculty. Looking at the minimum cardinality, we see that a student has to be subscribed to at least one program, and that a program belongs to, or is offered by, at least one faculty. So, it means that each student will belong to at least one faculty. Looking at the maximum cardinality, we see that a student can be subscribed for many programs, and each of those programs will be offered by one faculty. But they can each be offered by a different faculty. So, therefore, a student can also belong to many faculties.

Conversely, the role 'has' results from the combination of the role 'offers' of the association between faculty and program and the role 'subscribed students' of the association between program and student. Looking again at the minimum cardinality, we see that a faculty has at least one program, but a program can have zero students. So, a faculty can have zero students. And a faculty can offer many programs, which in turn can have many subscribed students. So, a faculty can have many students.

Note that such an implicit association is not drawn on the diagram. But you have to be aware of its existence while reading a diagram.

Navigation & information need satisfaction

In this presentation we take a look at how associations determine navigation in a model. The way we define associations in a UML class diagram is pretty important. As we have seen, an association allows you to navigate from one class to another.
By connecting multiple associations in a chain, we can navigate from one end of the model to another. This navigability of associations determines the navigability of information in an information system: when applications are built based on a UML class diagram, the associations in that diagram will define how a user of the application can navigate from one piece of information to another. As a result, the navigation paths will also impact the possibilities to satisfy information needs.

Let's look at an example. I have logged into the ERP system of my university, and this means that in the class Professor one instance has been selected, namely me. Now what I want to do in the ERP application is to find the students I'm teaching. In order to navigate to my students, the first thing I will have to do is to navigate to my courses. In the ERP system I have a button 'Search via courses'. This corresponds to navigating in the UML diagram from the class Professor to the class Course, and as you can see in the UML class diagram, I can have 0 to many courses to teach. So indeed, when I click on the button 'Search via courses', I see the list of courses that I am teaching. Now, I will select one course: I choose the first one, the "business information systems" course. By clicking the 'show' button, I will navigate to the students who follow that course. And what do I see? The application offers me a list of students, but not just a flat list. It groups the students according to the program these students are subscribed to. You can see how the report shows the list of programs and then, for each of those programs, the possibility to show the list of students of this course that are subscribed to that program. This illustrates how you can navigate from a professor to a course and then to all the students that are in that course. And, since students are also related to programs, the report can additionally show to which program those students are subscribed.
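As a rough sketch of this navigation, the chain professor, course, student, program can be written as a few dictionary lookups in Python. Everything here is an assumption made for illustration: the professor name, the function names and the instance data are hypothetical and do not come from the ERP system shown in the presentation.

```python
from collections import defaultdict

# Hypothetical instance data, one dictionary per association.
teaches = {"Professor A": ["Business Information Systems"]}
follows = {  # course -> students who follow it
    "Business Information Systems": ["Peter Martens", "Elisa Smith", "Sam Johnsons"],
}
subscribed_for = {  # student -> programs
    "Peter Martens": ["Master of Information Management"],
    "Elisa Smith": ["Master of Information Management", "Bachelor Computer Science"],
    "Sam Johnsons": ["Bachelor Business Engineering"],
}


def courses_of(professor):
    """First step: navigate from the professor to his or her courses."""
    return teaches.get(professor, [])


def students_by_program(course):
    """Second step: navigate to the students of a course, grouped by program."""
    groups = defaultdict(list)
    for student in follows.get(course, []):
        for program in subscribed_for.get(student, []):
            groups[program].append(student)
    return dict(groups)


for course in courses_of("Professor A"):
    for program, students in students_by_program(course).items():
        print(course, "|", program, "|", ", ".join(students))
```

The grouping is only possible because the students are reached through the direct association between course and student. The link from student to program is used afterwards, to organise the list.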
As we explained through the example of the ERP application, the associations of a UML class diagram define how a user can navigate from one class to another. This navigability of information determines the possibility to satisfy information needs.

Let's look at the same example, but now assume that the direct link from course to student hasn't been captured by the business analyst, so the gray association isn't available. On the other hand, I do have the red link from course to program, so I can see which program a course is part of, and I can also see which students have been subscribed to a program. This means I can navigate from professor to course, then, using the red link, from course to program, and then from program to student, like this.

What is the problem then? Well, suppose I want to have a list of the students that are subscribed to the course business information systems. I can still navigate from myself to my courses. That's the purple segment of the path. But then, navigating along the red association, or this blue arrow, returns information on all the programs in which the course business information systems appears. Following the green segment of the path then returns the list of all the students in each of those programs. This is much more than the students that are subscribed to the course business information systems, as not all students of a program follow all the courses of that program.

As an example: the course business information systems is part of 21 programs in total, the first three of which are shown here. Of the bachelor Business Engineering, 168 students out of the 621 follow this course. Why? Because the program has 3 years and only the students of the 2nd year follow the course business information systems. The course is also part of the program master of information management, but there only 20 out of the 49 students follow this course. The reason is that some students have an exemption and some students study part-time.
The same holds for the bachelor computer science. Only approximately 40 out of more than 200 students follow the class. And the same holds for the master management, where only 2 students chose this as an optional course. So, we clearly see that the non-availability of the gray link going directly from course to students hampers the satisfaction of the need to know how many students are subscribed to the course business information systems. So, wrong associations and missing associations are detrimental to information need satisfaction.

Parallel paths

In this presentation, we'll dig a little deeper into the issues one can encounter when one can navigate between two classes using two different parallel paths.

An important thing we have to take into account when reading a UML class diagram is that, based on our domain knowledge, we may assume that certain 'intuitive' constraints are imposed which in fact are not part of the UML class diagram. This is particularly the case when a UML class diagram contains parallel paths to navigate from one class to another. For example, here we can see that we can navigate along the blue path from course to students. This gives us the students that are subscribed to a course. We can also navigate along the green path. In the first leg of this path, we navigate from course to program and we find all the programs the course is a part of. In the second leg, we navigate from program to student, and so we find all the students of that program. So, this green path gives us the students that are subscribed to a program the course is part of. Clearly, navigating along the blue path will give you another set of students than navigating along the green path.

Let's illustrate this with a concrete Level 0 example. Here you find 3 students, 3 programs and 6 courses.
The orange lines indicate the relationships between students and courses: which student has taken which course in his or her individual study program. The purple lines show the relationships between courses and programs: which course is part of what program. And the black lines tell which student is registered for what program.

Let us now start from the course Object Oriented Programming. Navigating along the blue path means navigating from the course to the students taking that course. We find Maria Havena. Navigating along the green path means first navigating to the programs this course is part of. We find the bachelor of Computer Science. We then navigate to the students registered for this program, and we find Peter Martens and Elisa Smith. So the two paths yield a different set of students.

The same holds in the other direction. If you navigate from student to course via the blue path, then you will find all the courses a student has registered for in his or her 'individual study program'. If you navigate along the green path, then starting from a student you will first find all the programs the student is registered for, and then all the courses of those programs. So, navigating along the blue path will give you a limited set of courses, whereas navigating along the green path will give you a potentially very large set of courses.

The general rule is that when you have two paths to navigate from one class to another, the sets of objects you find will in general be different. A 'natural' business rule would be that students can only subscribe to a course that is part of the program they are registered for. This would mean that when you start from a student, the courses you find when navigating along the blue path must be a subset of the courses you find when navigating along the green path. Although this seems kind of obvious, one must be aware that the UML diagram doesn't impose any constraint on the relationship between the two sets of objects.
The only thing you know is that the set of courses found through the blue path is potentially different from the set of courses found through the green path. If you want to impose a constraint, for example that one is a subset of the other, then additional constraints need to be defined on top of the UML class diagram. There is even a specific language for doing that, namely the Object Constraint Language, or OCL for short.
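To make the difference between the two paths tangible, here is a minimal Python sketch of the Object Oriented Programming example. It is not OCL and not part of the UML model; it only shows what checking the 'natural' subset rule on instance data could look like. The function names are our own, and the program assigned to Maria Havena is a hypothetical value, since the text does not say which program she is registered for.

```python
# Student-course links (blue path).
takes = {("Maria Havena", "Object Oriented Programming")}

# Course-program links (first leg of the green path).
part_of = {("Object Oriented Programming", "Bachelor of Computer Science")}

# Student-program links (second leg of the green path).
registered_for = {
    ("Peter Martens", "Bachelor of Computer Science"),
    ("Elisa Smith", "Bachelor of Computer Science"),
    ("Maria Havena", "Master of Information Management"),  # hypothetical
}


def students_via_blue(course):
    """Students who take the course."""
    return {s for s, c in takes if c == course}


def students_via_green(course):
    """Students registered for a program that the course is part of."""
    programs = {p for c, p in part_of if c == course}
    return {s for s, p in registered_for if p in programs}


def subset_rule_violations():
    """Student-course links that break the 'natural' rule: the course is
    not part of any program the student is registered for."""
    return {(s, c) for s, c in takes if s not in students_via_green(c)}


print(students_via_blue("Object Oriented Programming"))
print(students_via_green("Object Oriented Programming"))
print(subset_rule_violations())
```

In this data the two paths return disjoint sets of students, and the link between Maria Havena and Object Oriented Programming is reported as a violation of the subset rule. The class diagram alone accepts this situation; only an additional constraint would forbid it.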