Goal-Oriented Conceptual Database Design

Hareune Asfour
Faculty of Science - Utrecht University
February 2013

Author Note: A Paper Presented in Partial Fulfillment of the Requirements for the Master Business Informatics course Method Engineering

Notice of Originality

I declare that this paper is my own work and that information derived from published or unpublished work of others has been acknowledged in the text and has been explicitly referred to in the list of references. All citations are in the text between quotation marks (“ ”). I am fully aware that violation of these rules can have severe consequences for my study at Utrecht University.

Name: Hareune Asfour
Signature:
Date:

Abstract. This paper is part of the requirements for passing the Master Business Informatics course “Method Engineering” (Brinkkemper, 1996). In this course, students acquire the knowledge and skills to describe and evaluate various ICT systems methodologies. Papers on different methodologies are presented to the students so that they can analyze and describe them efficiently. This paper describes one of the methodologies used in the IT development sector, commonly known as the Goal-Oriented Methodology. To describe this methodology rationally, an early paper on Goal-Oriented Conceptual Database Design (Jiang, 2007) is used as the core reference.

Keywords: Method Engineering, Methodology, Goal-Oriented, Database Design

1. Introduction

Any standard project starts with a requirements analysis phase regardless of the project field (IT, business, biology, etc.). This phase is so important that many methodologies have been developed to define and structure the steps taken in it. Database development is a very active IT sector in which requirements analysis is given a lot of consideration, especially when sensitive data is involved.
This paper presents a description of a goal-oriented methodology for database requirements analysis based on the paper Goal-Oriented Conceptual Database Design (Jiang, 2007). The Goal-Oriented methodology describes the process of transforming stakeholders’ goals and wishes into a detailed conceptual schema of the data to be stored. In contrast to software development methodologies, the goal-oriented methodology starts with an early step in the requirements analysis process in which the focus is on modeling stakeholders’ goals. The goal model then undergoes a set of predefined, systematic operations from which the conceptual schema is derived. Such an approach captures not only what the data means, but also who wants it and for what purpose. The focus of this paper is therefore not on how to design a conceptual database in general, but on how to use the Goal-Oriented methodology to design one.

An overview of the methodology is shown in Figure 1. Goal-oriented requirements analysis starts with a list of stakeholders and their high-level goals; these goals are in turn refined and interrelated to generate a goal model. The goal model yields several alternative sets of data requirements, from which one is chosen to generate the conceptual schema for the database-to-be.

Figure 1 Overview of Goal-Oriented methodology (diagram elements: Stakeholders, Goals, App. Goals, Quality goals, Goal Analysis, Goal Model, Domain Modeling, Domain Model, Schema Design, Conceptual DB schema)

This methodology is the result of a Ph.D. project at the Department of Computer Science at the University of Toronto. The project was led by Lei Jiang, who worked on a novel method for conceptual database design. He developed this approach by drawing on ideas from Goal-Oriented Requirements Engineering research and from its application to Data Quality issues.
A number of professors in the computer science field were involved in this project, namely Thodoros Topaloglou (associate professor at the University of Toronto), Alex Borgida (professor at Rutgers University) and John Mylopoulos (professor at the University of Toronto). Although no resources indicate exactly how these professors were involved in the project, the approach would presumably not have been developed without their supervision and advice. The approach was described in a paper presented at the 15th IEEE International Requirements Engineering Conference, held in New Delhi, India, on 15–19 October 2007 (there is no indication of the exact date on which the paper was presented).

This methodology is influenced by a number of other methodologies in the software development domain as well as in the Requirements Engineering domain. The next section shows where this influence comes from and how it is exerted. The paper is organized as follows: a brief review of related work is provided in section 2, followed by a detailed description of the methodology in section 3; finally, a conclusion is drawn in section 4.

2. Related literature

The process of modeling goals and deriving the conceptual database schema is highly influenced by the TROPOS goal-oriented software development methodology (Bresciani, 2004), whose meta-model (Susi, 2005) is based on the i* framework (Yu, 1997). Although there is no Goal-Oriented Requirements Engineering framework devoted specifically to database development, some early goal-oriented methods were developed to model requirements. For instance, the EKD methodology (Bubenko, 1998) provides a model for enterprises in terms of analyzing, planning, designing and changing the business. One of the sub-models in the EKD methodology is the Goals Model, where the focus is on describing the goals of the enterprise (what the enterprise wants to achieve or avoid, and when).
Similarly, the KAOS framework (Dardenne, 1993) starts with a method called Goal-Oriented Requirements Specification; this method transforms goals into objects using a simple rule: “an Object is concerned by a goal”. The guidelines and techniques used in this method were therefore considered a suitable starting point for Goal-Oriented Conceptual Database Design. However, many important issues related to data storage, management and access are not addressed at all in these previous approaches.

Many papers and books on database design stress the importance of identifying the high-level purpose of the database to be developed (Connolly, 2003); this resembles, to some extent, identifying the goals in the Goal-Oriented Conceptual Database Design approach. The core difference is that in the approach at hand the goals are not only identified but also fully analyzed and formally recorded.

Unfortunately, the Goal-Oriented Conceptual Database Design paper did not receive much attention, which is due to the number of books and papers that cover goal-oriented approaches in some way. In fact, the paper has no more than 16 citations (based on Google Scholar on the 14th of February 2013), many of which are self-citations. The citing works mostly relate to subjects such as data quality techniques, requirements analysis and schematic designs.

3. Description and example

In order to describe the process of Goal-Oriented Conceptual Database Design, the following case description is used as a practical case:

“Bob is a student who would like to take part in the master program of ICT at Nador University. Nador University is popular in its region thanks to its facilities and courses. Moreover, Nador University holds the brightest professors, who provide exceptional publications along with the set of courses they teach.
The university provides scholarships to students who excel in their studies. It also provides professors with the necessary funds to conduct their research (the amount usually depends on the research plan provided by the professor). Being the top university in the region is very rewarding, for the government encourages such institutions through funds and honors. In order for Bob to graduate, he needs to pass all courses (either Business Informatics courses or Computer Science courses) and to get his thesis done. This might be difficult for Bob since he is working full-time, so Bob has to make a choice: either keep working and enroll as a part-time student, or quit his job and enroll as a full-time student.”

The process of Goal-Oriented Conceptual Database Design is divided into eight steps. Each step is illustrated with an example from the case above.

Step 1: Identify stakeholder goals, including their quality requirements.
Input: A list of stakeholders.
Output: A list of high-level goals of the stakeholders.
Task(s): Goal identification.
Description: In this step, the objective is to pinpoint the goals of each stakeholder and to categorize them into hard goals (goals whose accomplishment can be clearly established) and soft goals (goals, such as quality requirements, that are very hard to accomplish in a clear-cut way).
Example: Based on the case description we have three stakeholders: the student, the professor and the institution. The student takes courses, and the professor provides the courses. The institution provides the environment in which the courses take place. The ultimate goal of the student is to graduate (G1) and to get a scholarship from the institution (S1). The ultimate goal of the professor is to get his research published (G2) and to get funds from the institution for new research (S2). The ultimate goal of the institution is to be ranked number 1 in the region (G3) and to get funds from the government (S3). These goals are modeled and illustrated in Figure 3.

Step 2: Generate a goal model.
Input: A list of goals produced in Step 1.
Output: A goal model.
Task(s): Goal analysis.
Description: The list of goals alone does not provide enough detail, which is why AND/OR goal decomposition is used to refine these goals into sub-goals based on the case addressed. Systematic methods for decomposing goals have been proposed in a variety of approaches (for example, goal-based variability acquisition and analysis (Liaskos, 2006)). For soft goals, the NFR framework (Chung, 2000) offers a catalogue of refinement methods. Moreover, contribution analysis (Bresciani, 2004) can be used to identify positive and negative contributions to the fulfillment of goals.
Example: Based on the practical case at hand, in order for the student to graduate, he has to pass all courses (G1.1) (either Business Informatics courses, G1.1.1, or Computer Science courses, G1.1.2). Furthermore, the student has to pass his thesis (G1.2) as well. In order for the professor to start his research and publish it, he needs to provide a research plan (G2.2) and get funds for his research (G2.1) (either institution funds, G2.1.1, or own funds, G2.1.2). The institution can only be the best in the region if it has the best facilities (G3.1), the best researchers (G3.2) and the best courses (G3.3). Being ranked number 1 in the region contributes positively to getting the necessary funds and allowances from the government, which in turn contributes positively to the professor getting funds for his research. Figure 3 shows the goal model for our practical case.

Figure 3 The goal model drawn for the practical case at hand

Step 3: Select a design alternative.
Input: The goal model obtained in Step 2.
Output: A set of the leaf-level goals in the goal model.
Task(s): Goal evaluation.
Description: The goal model shows the possible paths to achieve the ultimate goal; these possible paths are called design alternatives.
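As an illustration only, the way design alternatives arise from AND/OR decomposition can be sketched in Python. The `Goal` class and the `alternatives` function are hypothetical names introduced here and are not part of the method described by Jiang (2007); the tree encodes the hard goals of the practical case from Step 2 and leaves out the soft goals and the contribution links.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import product


@dataclass
class Goal:
    """A node in an AND/OR goal tree (hypothetical structure for illustration)."""
    name: str
    decomposition: str = "AND"  # "AND" or "OR"; ignored for leaf goals
    children: list[Goal] = field(default_factory=list)


def alternatives(goal: Goal) -> list[frozenset[str]]:
    """Return every set of goals that, taken together, fulfils `goal`."""
    if not goal.children:
        return [frozenset({goal.name})]
    child_alts = [alternatives(child) for child in goal.children]
    if goal.decomposition == "OR":
        # OR: exactly one sub-goal is chosen.
        return [alt | {goal.name} for alts in child_alts for alt in alts]
    # AND: one alternative of every sub-goal is combined.
    return [frozenset({goal.name}).union(*combo) for combo in product(*child_alts)]


# Hard goals of the practical case, as refined in Step 2.
g1 = Goal("G1", "AND", [Goal("G1.1", "OR", [Goal("G1.1.1"), Goal("G1.1.2")]), Goal("G1.2")])
g2 = Goal("G2", "AND", [Goal("G2.1", "OR", [Goal("G2.1.1"), Goal("G2.1.2")]), Goal("G2.2")])
g3 = Goal("G3", "AND", [Goal("G3.1"), Goal("G3.2"), Goal("G3.3")])

# A design alternative fulfils all three ultimate goals at once.
design_alternatives = [
    a1 | a2 | a3
    for a1, a2, a3 in product(alternatives(g1), alternatives(g2), alternatives(g3))
]
print(len(design_alternatives))
```

Under these assumptions, each OR decomposition multiplies the number of alternatives, while an AND decomposition only combines the alternatives of its sub-goals.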
In this step we choose the most suitable design alternative according to the preferences of the stakeholders.
Example: Based on the goal model, we can deduce that there are 2 (G1.1.1 or G1.1.2) × 1 (no alternatives to achieve G3) × 2 (G2.1.1 or G2.1.2) = 4 alternative ways of achieving the ultimate goals: G1 can ultimately be achieved either by passing BI courses or by passing CS courses. The same holds for G2, which can ultimately be achieved either with institution funds or with own funds. So a design alternative DA1 in which all ultimate goals are fulfilled can be {G1, G1.1, G1.2, G1.1.1} ∪ {G2, G2.1, G2.2, G2.1.1} ∪ {G3, G3.1, G3.2, G3.3} ∪ {S1, S2, S3}.

Step 4: Identify initial set of domain notions from goals.
Input: The design alternative chosen in Step 3.
Output: A list of domain notions extracted from the chosen design alternative.
Task(s): Domain knowledge extraction.
Description: This is the step where the data requirements are extracted from the goals stated in the chosen design alternative. The focus in this step is on deriving the domain notions which will later form the database schema. Domain notions are potential notions about which data is to be stored.
Example: The domain notions DN1 which can be extracted from DA1 for G1 are listed in Table 1.

Goal | Domain notions
G1 | Graduation
G1.1 | Course
G1.2 | Thesis
G1.1.1 | BI course

Table 1 Domain notions extracted from DA1 for fulfilling G1

Step 5: Identify and select plans.
Input: The design alternative chosen in Step 3.
Output: A set of plans that collectively fulfill these goals.
Task(s): Plan evaluation.
Description: The focus in this step is on generating SMART plans to fulfill the goals in the chosen design alternative.
Example: In order to achieve G1 in DA1, we can draw up several plans. These plans are illustrated in Figure 4. (Please note that these plans are drawn based on assumptions; the purpose of this paper is just to provide an example.
However, for live practical cases, these plans are drawn based on the requirements analysis.)

Figure 4 A portion of the goal model with the fulfilling plans

Means-end analysis introduces additional design alternatives. The total number of alternatives we have identified so far is 4 × 2 (for the type of study) × 2 (for the choice of thesis) = 16. Assuming that the chosen plans are P1.1 and P2.1, then DA1 = DA1 ∪ {P1.1, P2.1}.

Step 6: Expand the set of domain notions using plans.
Input: The plans in the chosen design alternative.
Output: A list of domain notions extracted from these plans.
Task(s): Process modeling, domain knowledge extraction.
Description: In this step we extend the list of domain notions from Step 4 by extracting domain notions from the plans drawn in Step 5, in addition to other possible sub-plans.
Example: In order for the student to study full-time (P1.1), the student needs to pay the tuition fees (P1.1.1) and to enroll in the program (P1.1.2). Thus, we can extract the domain notions listed in Table 2.

Plan | Domain notions
P1.1 | Study type, period
P1.1.1 | Payment status, amount, payment date
P1.1.2 | Student number, enrollment number, enrollment date, program number

Table 2 The domain notions extracted from the plans drawn in Step 5

Step 7: Construct the domain model.
Input: The expanded list of domain notions from Step 6.
Output: A domain model.
Task(s): Domain analysis.
Description: This is the step where the domain notions are transformed into a domain model (the initial database schema) using a diagrammatic notation such as UML. Note that this step can be carried out in parallel with Step 4 and Step 6: once domain notions are extracted from goals or plans, they can be further analyzed and added to the domain model.
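To make the transformation from domain notions to a domain model more concrete, the following Python sketch groups the notions extracted in Step 6 into classes. The grouping into `Student`, `Program`, `Enrollment` and `Payment`, the attribute types and the helper `uncovered_notions` are assumptions made here for illustration; they are not taken from the original method or from the UML diagram of the practical case.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal

# Domain notions per plan, as extracted in Step 6 (names written as identifiers).
NOTIONS_BY_PLAN = {
    "P1.1": ["study_type", "period"],
    "P1.1.1": ["payment_status", "amount", "payment_date"],
    "P1.1.2": ["student_number", "enrollment_number", "enrollment_date", "program_number"],
}


@dataclass
class Student:
    student_number: str


@dataclass
class Program:
    program_number: str


@dataclass
class Enrollment:
    enrollment_number: str
    enrollment_date: date
    study_type: str  # for example "full-time" or "part-time"
    period: str
    student: Student  # association: an enrollment belongs to one student
    program: Program  # association: an enrollment is for one program


@dataclass
class Payment:
    payment_status: str
    amount: Decimal
    payment_date: date
    enrollment: Enrollment  # association: a payment settles one enrollment


def uncovered_notions(classes) -> set[str]:
    """Return the domain notions that no class in the model stores yet."""
    covered = {f.name for cls in classes for f in fields(cls)}
    wanted = {n for notions in NOTIONS_BY_PLAN.values() for n in notions}
    return wanted - covered


# Illustrative values only; none of them come from the case description.
bob = Student(student_number="S-001")
ict = Program(program_number="ICT-MSC")
enrollment = Enrollment("E-001", date(2013, 2, 1), "full-time", "2013", bob, ict)
payment = Payment("paid", Decimal("1500.00"), date(2013, 2, 5), enrollment)

# An empty set means every notion extracted in Step 6 is covered by the model.
print(uncovered_notions([Student, Program, Enrollment, Payment]))
```

Keeping the plan identifiers next to the notions preserves the link between each stored attribute and the plan that motivated it, which is the kind of traceability the goal-oriented approach aims for.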
Example: Figure 5 shows a portion of the domain model as a UML class diagram; this portion corresponds to the expanded list of domain notions from Step 6.

Figure 5 A portion of the domain model for the domain notions extracted in Step 6

Step 8: Construct the conceptual schema.
Input: The domain model from Step 7.
Output: A conceptual schema.
Task(s): Schema transformation.
Description: In this step, we expand the domain model through a series of predefined design operations. These design operations are part of the Goal-Oriented design strategy template of this methodology. This template provides guidelines for tackling certain design issues by applying the design operations it contains.
Example: One of the common design issues is security. Figure 6 shows part of the conceptual schema where the security issue has been addressed.

Figure 6 Part of the conceptual schema where the security issue is tackled

Below you can find a Process Deliverable Diagram which summarizes the deliverables and activities carried out in this method:

Figure 7 Goal-Oriented Conceptual Database Design Process Deliverable Diagram (activities, all performed by the Software Developer, on the left; deliverables GOAL, SUB-GOAL, CONTRIBUTION, GOAL MODEL, DESIGN ALTERNATIVE, DOMAIN NOTION, PLAN, DOMAIN MODEL, DESIGN ISSUE, DESIGN OPERATION and CONCEPTUAL SCHEMA on the right)

Below you can find the tables which describe the activities and the deliverables stated in the process deliverable diagram:

Activity (phase) | Sub-activity | Description | Role
Goal identification | Identify goals | A list of goals is identified based on the wishes of the stakeholders. | Software Developer
Goal identification | Categorize goals | The goals identified in activity “Identify goals” are categorized into hard goals (goals whose accomplishment can be clearly established) and soft goals (goals, such as quality requirements, that are very hard to accomplish in a clear-cut way). | Software Developer
Goal model generation | Refine goals | The goals identified in activity “Identify goals” are refined into sub-goals using AND/OR decomposition. | Software Developer
Goal model generation | Identify contributions | A set of contributions is identified for the goals refined in activity “Refine goals”. | Software Developer
Goal model generation | Build goal model | A goal model is built based on the refined goals (activity “Refine goals”) and the contributions (activity “Identify contributions”). | Software Developer
Design alternative selection | Select a design alternative | The goal model shows the possible paths to achieve the ultimate goal; these possible paths are called design alternatives. In this activity a design alternative is selected. | Software Developer
Design alternative selection | Check stakeholder preference | The design alternative selected in activity “Select a design alternative” is checked against the preferences of the stakeholders. If the selected design alternative does not have a high preference, a new design alternative must be selected; otherwise it is chosen. | Software Developer
Domain notion identification | Extract domain notions | Domain notions are potential notions about which data is to be stored. In this activity, a set of domain notions is extracted from the design alternative chosen in activity “Check stakeholder preference”. | Software Developer
Plans identification | Identify plans | A set of plans is identified to fulfill the goals stated in the design alternative chosen in activity “Check stakeholder preference”. | Software Developer
Domain notion expansion | Identify contributions | A set of contributions is identified, showing how a plan can contribute to fulfilling a goal. | Software Developer
Domain notion expansion | Expand domain notions | The list of domain notions from activity “Domain notion identification” is expanded with domain notions extracted from the plans identified in activity “Identify plans”. | Software Developer
Domain model construction | Construct domain model | The domain notions identified in the activities “Extract domain notions” and “Expand domain notions” are used to construct a domain model. | Software Developer
Conceptual schema | Identify design issues | A set of design issues is identified. | Software Developer
Conceptual schema | Implement design operations | A set of design operations is implemented for the design issues identified in the previous sub-activity. | Software Developer
Conceptual schema | Construct conceptual schema | The conceptual schema is built based on the domain model built in activity “Construct domain model”. | Software Developer

Table 3 Activity table

Concept | Description
GOAL | A GOAL has a “category” (hard or soft) and a “preference”. A GOAL is part of the GOAL MODEL. (Jiang, 2007)
SUB-GOAL | A SUB-GOAL is identified based on a GOAL. (Jiang, 2007)
CONTRIBUTION | A set of CONTRIBUTIONs is identified and used in the GOAL MODEL. (Jiang, 2007)
GOAL MODEL | A GOAL MODEL consists of goals from GOAL, contributions from CONTRIBUTION and plans from PLAN. A GOAL MODEL has DESIGN ALTERNATIVEs. (Jiang, 2007)
DESIGN ALTERNATIVE | A DESIGN ALTERNATIVE is extracted from the GOAL MODEL. A DESIGN ALTERNATIVE has DOMAIN NOTIONs. (Jiang, 2007)
DOMAIN NOTION | A DOMAIN NOTION is extracted from a DESIGN ALTERNATIVE. A DOMAIN NOTION is part of the DOMAIN MODEL. (Jiang, 2007)
PLAN | A PLAN is identified to fulfill a GOAL in the GOAL MODEL. (Jiang, 2007)
DOMAIN MODEL | A DOMAIN MODEL consists of DOMAIN NOTION(s). (Jiang, 2007)
DESIGN ISSUE | A set of DESIGN ISSUEs is identified. (Jiang, 2007)
DESIGN OPERATION | A DESIGN OPERATION is implemented to counter a DESIGN ISSUE. (Jiang, 2007)
CONCEPTUAL SCHEMA | A CONCEPTUAL SCHEMA consists of DOMAIN MODEL(s) in which DESIGN OPERATIONs are implemented. (Jiang, 2007)

Table 4 Deliverable table

4. Conclusion

This paper presented a description of the Goal-Oriented approach to conceptual database design. The process concentrates on converting the stakeholders’ wishes into a goal model. The goal model is then used to derive a set of design alternatives that define the conceptual schema of the database-to-be. In order to transform the goal model into a conceptual schema, a group of predefined design strategies is provided, based on a set of design issues. These design issues are thoroughly analyzed and modeled.

It is only fair to point out that the Goal-Oriented approach to conceptual database design makes novel contributions that show the potential benefits of applying Goal-Oriented techniques to conceptual database design. These benefits include:
i. Consideration of the stakeholders’ goals.
ii. Distinction between the domain model (the line of communication between stakeholders and developers) and the conceptual schema.
iii. Provision of a predefined template of design strategies for transforming a goal model into a database schema.
iv. Provision of design operations to counter the design issues.

References

Bresciani, P., Perini, A., Giorgini, P., Giunchiglia, F. and Mylopoulos, J. (2004). Tropos: An agent-oriented software development methodology. Autonomous Agents and Multi-Agent Systems, 8(3):203–236.
Brinkkemper, S. (1996).
Method Engineering: Engineering of Information Systems Development Methods and Tools. Information and Software Technology, Department of Computer Science, University of Twente, Netherlands.
Bubenko, J. A., Brash, D. and Stirna, J. (1998). EKD user guide. Technical report, Kista, Dept. of Computer and Systems Science, Royal Institute of Technology (KTH) and Stockholm University, Stockholm, Sweden.
Chung, L., Nixon, B. A., Yu, E. and Mylopoulos, J. (2000). Non-Functional Requirements in Software Engineering. Kluwer Publishing.
Connolly, T. M. and Begg, C. E. (2003). Database Solutions: A step by step guide to building databases. Addison Wesley.
Dardenne, A., Lamsweerde, A. van and Fickas, S. (1993). Goal-directed requirements acquisition. Sci. Comput. Program., 20(1-2):3–50.
Greenwood, M., Goble, C., Stevens, R., Zhao, J., Addis, M., Marvin, D., Moreau, L. and Oinn, T. (2003). Provenance of e-science experiments: experience from bioinformatics. In The UK OST e-Science second All Hands Meeting 2003 (AHM’03), Nottingham, UK, pages 223–226.
Jiang, L., Topaloglou, T., Borgida, A. and Mylopoulos, J. (2007). Goal-Oriented Conceptual Database Design. In Proceedings of the 15th IEEE International Requirements Engineering Conference, New Delhi, India.
Lamsweerde, A. van and Letier, E. (2003). From object orientation to goal orientation: A paradigm shift for requirements engineering. In the Monterey’02 Workshop, pages 4–8. Springer-Verlag.
Liaskos, S., Lapouchnian, A., Yu, Y., Yu, E. and Mylopoulos, J. (2006). On goal-based variability acquisition and analysis. In Proceedings of the 14th IEEE International Conference on Requirements Engineering.
Susi, A., Perini, A., Mylopoulos, J. and Giorgini, P. (2005). The Tropos metamodel and its use. Informatica, 29(4):401–408.
Yu, E. S. K. (1997). Towards modeling and reasoning support for early-phase requirements engineering. In Proceedings of the 3rd IEEE International Symposium on Requirements Engineering, page 226. IEEE Computer Society.