ReportFinal

advertisement
UNIVERSITY OF YORK
DEPARTMENT OF COMPUTER SCIENCE
Scalable persistence of EMF models
Author
AVRAAM – LEONIDAS DRAKOPOULOS
Supervisor
DR. DIMITRIS KOLOVOS
SUBMITTED IN SUPPORT OF THE DEGREE OF
MASTER OF SCIENCE IN INFORMATION TECHNOLOGY
Academic Year
2009-2010
Word count: 17,530 as counted from MS Word. (All the body is included in the word count)
Scalable persistence of EMF models
Abstract
In the software engineering world, the notion of modelling has been established as a good practice
towards designing and documenting a solution. In the modern world though, due to the increasing
complexity of software systems, new technologies were in need to cope with the new emerging
demands. Model Driven Engineering is such a methodology that is introduced in order to meet the
emerging software demands and provide a higher level of abstraction in the software development
process. In MDE, models are used as the central artefact of the development process that in most
cases are used for automatic code generation. Generally this new approach to software development
has many benefits such as increased productivity and improved software quality. As almost all new
technologies though, along with all the benefits, some challenges emerge too. These challenges have
to be addressed in order for MDE to become fully adopted by the industry. One of the major
challenges faced by MDE is this of scalability and more specifically the scalable persistence of
models in the MDE context. This project will attempt to partly address the problem of scalability by
proposing a relational backed database persistence solution for storing EMF based models. The
project is developed in the context of Eclipse Modelling Framework, which is a framework for MDE
development that is intergraded with Eclipse IDE platform.
2
Scalable persistence of EMF models
Acknowledgement
Throughout this project, I would firstly like to thank my supervisor Dr. Dimitiris Kolovos who has
provided me with solid support, precise guidance and valuable advice which has helped me both
personally and professionally.
In addition I would like to thank my family for believing in me and who provided me with moral
support and encouragement.
Finally I would like to thank my friends for hearing patiently my frustrations.
3
Scalable persistence of EMF models
Ethical Statement
For this project there were no immediate ethical issues that needed to be considered.
4
Scalable persistence of EMF models
Table of figures
FIGURE 2.1: MDA SOFTWARE DEVELOPMENT LIFE CYCLE (ADOPTED FROM [1]) ............................... 12
FIGURE 2.2: METAMODELLING ARCHITECTURE (ADOPTED FROM [35]) ......................................... 13
FIGURE 2.3: BASIC STRUCTURE OF AN EMF MODEL (ADOPTED FORM [10]) ....................................... 16
FIGURE 2.4: EXAMPLE OF EMFATIC SYNTAX ....................................................................................... 17
FIGURE 2.5: EMF RESOURCE DIAGRAM (ADOPTED FROM [9]) ............................................................. 18
FIGURE 2.6: EPSILON ARCHITECTURE (ADOPTED FROM [31]).............................................................. 19
FIGURE 2.7: CDO ARCHITECTURE (ADOPTED FROM [15]) ................................................................... 21
FIGURE 2.8: TENEO - HIBERNATE ARCHITECTURE (ADOPTED FROM [19] ).......................................... 21
FIGURE 3.1: WATERFALL DEVELOPMENT PROCESS (ADOPTED FROM [22]) ......................................... 24
FIGURE 3.2: ITERATIVE WATERFALL DEVELOPMENT PROCESS (ADOPTED FROM [21]) ....................... 25
FIGURE 4.1: PROJECT BUILDING BLOCKS ............................................................................................. 27
FIGURE 5.1: PROJECT’S USE CASE DIAGRAM....................................................................................... 31
FIGURE 5.2: DATABASE ER DIAGRAM ................................................................................................. 33
FIGURE 5.3: INJECTION ALGORITHM FLOWCHART .............................................................................. 38
FIGURE 5.4: PROJECT'S CLASS DIAGRAM ............................................................................................ 37
FIGURE 7.1: CASE STUDY ECORE MODEL............................................................................................. 48
FIGURE 7.2: REGISTER EPACKAGES MENU .......................................................................................... 49
FIGURE 7.3: INSTANTIATED METAMODEL ............................................................................................ 49
FIGURE 7.4: REFLECTIVE EDITOR PROPERTY VIEW .............................................................................. 50
FIGURE 7.5: OBJECTS TABLE ................................................................................................................ 50
FIGURE 7.6: ATTRIBUTES TABLE .......................................................................................................... 51
FIGURE 7.7: REFERENCES TABLE ......................................................................................................... 51
FIGURE 7.8: BLACK BOX TESTING SCHEMA ......................................................................................... 52
FIGURE 7.9: UI EXTENSION .................................................................................................................. 57
FIGURE 7.10: TIME THROUGH EMF MODEL ......................................................................................... 57
FIGURE 7.11: DATABASE BOOT TIME ................................................................................................ 58
FIGURE 7.12: EMF BOOT TIME .............................................................................................................. 58
5
Scalable persistence of EMF models
Table of Contents
CHAPTER 1 .......................................................................................................................................... 9
INTRODUCTION................................................................................................................................. 9
1.1 PROJECT MOTIVATION ................................................................................................................... 9
1.2 OUTLINE OF THE REPORT............................................................................................................... 9
CHAPTER 2 ........................................................................................................................................ 11
LITERATURE REVIEW .................................................................................................................. 11
2.1 MODEL DRIVEN ENGINEERING ................................................................................................... 11
2.2.1 Model Driven Architecture Development Life Cycle ........................................................... 11
2.2.2 Model Driven Architecture Benefits .................................................................................... 12
2.2.3 Model Driven Engineering Challenges................................................................................ 13
2.3 METAMODELLING ARCHITECTURE ............................................................................................. 14
2.4 DEFINING THE BASIC CONCEPTS OF ECLIPSE MODELLING FRAMEWORK. .................................. 14
2.4.1 A first glance ........................................................................................................................ 15
2.4.3 What is an EMF model?....................................................................................................... 15
2.4.4 What is the basic concept of the generated code? ............................................................... 17
2.4.5 How are models persisted in EMF? ..................................................................................... 18
2.5 OBJECT PERSISTENCE API ........................................................................................................... 18
2.6 REFLECTIVE API ......................................................................................................................... 18
2.7 EPSILON ....................................................................................................................................... 19
2.8 DATABASE PERSISTENCE FOR EMF............................................................................................. 20
2.8.1 Connected Data Objects (CDO) .......................................................................................... 20
2.8.2 Teneo .................................................................................................................................... 21
2.9 CONCLUSION ............................................................................................................................... 21
CHAPTER 3 ........................................................................................................................................ 23
METHODOLOGY AND ANALYSIS ............................................................................................... 23
3.1 INTRODUCTION ............................................................................................................................ 23
3.2 WATERFALL MODEL .................................................................................................................... 23
3.3 WATERFALL MODEL ANALYSIS ................................................................................................... 24
3.3.1 Project analysis .................................................................................................................... 24
CHAPTER 4 ........................................................................................................................................ 26
REQUIREMENTS .............................................................................................................................. 26
4.1 INTRODUCTION ............................................................................................................................ 26
4.2 PROJECT OBJECTIVES .................................................................................................................. 26
4.3 PROJECT BUILDING BLOCKS ....................................................................................................... 27
3.3.1 Database Injection ............................................................................................................... 27
3.3.2 Database querying ............................................................................................................... 29
3.4 SUMMARY ................................................................................................................................... 29
CHAPTER 5 ........................................................................................................................................ 30
DESIGN ............................................................................................................................................... 30
5.1 APPROACHES CONSIDERED ......................................................................................................... 30
6
Scalable persistence of EMF models
5.1.1 CDO approach ..................................................................................................................... 30
5.1.2 Teneo approach ................................................................................................................... 30
5.1.3 Summary .............................................................................................................................. 31
5.2 USE CASE DIAGRAM ................................................................................................................... 31
5.3 GENERIC DATABASE DESIGN ...................................................................................................... 32
5.3.1 Conceptual design ................................................................................................................ 32
5.3.2 Logical Design ..................................................................................................................... 33
4.3.3 Physical design .................................................................................................................... 35
5.4 EMF INJECTION INTO THE DATABASE ......................................................................................... 37
5.5 GENERAL UML DESIGN .............................................................................................................. 39
5.4.1 Class diagram ...................................................................................................................... 39
5.4.2 Class diagram summary....................................................................................................... 39
CHAPTER 6 ........................................................................................................................................ 41
IMPLEMENTATION ........................................................................................................................ 41
6.1 INJECTING INTO THE RELATIONAL DATABASE ............................................................................ 41
6.2 QUERYING METHODS EXPLAINED ............................................................................................... 42
6.3 ECLIPSE PLUG – IN ....................................................................................................................... 45
6.3.1 Plug – in architecture .......................................................................................................... 46
6.3.2 Plug – in roles explained ..................................................................................................... 46
CHAPTER 7 ........................................................................................................................................ 48
EVALUATION ................................................................................................................................... 48
7.1 CASE STUDY ................................................................................................................................ 48
7.1.1 Constructing an EMF model ................................................................................................ 48
7.1.2 Instantiating the model......................................................................................................... 49
7.1.3 Objects injected in the database .......................................................................................... 50
7.2 TESTING....................................................................................................................................... 52
7.2.1 Black Box testing.................................................................................................................. 52
7.2.1 Database injection testing .................................................................................................... 52
7.2.2 Database querying testing ................................................................................................... 53
7.3 REQUIREMENTS EVALUATION..................................................................................................... 53
7.3.1 Database injection functional requirements evaluation ...................................................... 54
7.3.2 Database querying functional requirements evaluation ...................................................... 54
7.3.1 Database injection non functional requirements evaluation ............................................... 54
7.3.2 Database querying non functional requirements evaluation ............................................... 55
7.4 GENERAL PERFORMANCE EVALUATION ..................................................................................... 55
7.4.1 EOL Scripts explained ......................................................................................................... 56
7.4.2 Extender plug – in ................................................................................................................ 57
7.4.3 EMF run time vs. database run time .................................................................................... 57
7.4.4 EMF boot time vs. database boot time ................................................................................. 58
7.4.5 Comparison and outcomes ................................................................................................... 58
CHAPTER 8 ........................................................................................................................................ 59
CONCLUSION ................................................................................................................................... 59
8.1 PROJECT OVERVIEW .................................................................................................................... 59
8.2 PERSONAL DEVELOPMENT .......................................................................................................... 60
8.3 FUTURE WORK ............................................................................................................................. 60
7
Scalable persistence of EMF models
BIBLIOGRAPHY ............................................................................................................................... 62
8
Scalable persistence of EMF models
Chapter 1
Introduction
1.1 Project motivation
More and more, the concept of modelling is embedded in the area of computer software development.
The concept of modelling has a big range of interpretations even inside the context of software
engineering. Unified Modelling Language (UML) so far has been the dominant technology that was
related to modelling. Increasingly though the concept of Model Driven Engineering (MDE) is gaining
momentum and popularity.
Eclipse Modelling Framework (EMF) is a subproject of the open source and well known Integration
Development Environment (IDE) Eclipse and provides a solid base to the use of modelling and code
generation in terms of MDE.
EMF is a powerful framework that is designed in such a way in order to make modelling useful and
practical to the Java Developer. EMF can be considered as the bridge between modellers and
programmers unifying three widely used technologies: Java, UML and XML. EMF integrates with
many platforms. More specifically for the purposes of this project Epsilon will be used, which is a
platform that integrates with EMF and supports task specific languages for model management.
EMF is a flexible, efficient framework that encapsulates a complete and useful API, allowing the
principles of MDE to be successfully applied. EMF community supports this technology and
continues to add new features to the framework making it an even more appealing solution to the
potential adopters.
EMF technology as a cutting edge technology, even though it is extremely useful and promising, it is
not without important issues that have to be addressed. One important challenge that EMF faces is the
scalable persistence of EMF models. The goal of this project is to partly address the scalability issues
faced by EMF through the development of a database backed solution that integrates with Epsilon
platform. In this context all the technologies that were aforementioned are going to be explained in the
Literature Review (Chapter 2) section of the report.
1.2 Outline of the report
The project has 7 Chapters excluding the Introduction chapter.
Chapter 2 Literature Review: This chapter investigates some of the literature associated with EMF
and generally MDE. A focus on EMF persistence is given.
Chapter 3 Methodology: This chapter describes the chosen software development process that was
followed during the implementation of this project.
Chapter 4 Requirements: This chapter identifies the clearly the project’s objective. In addition the
requirements of the project are identified.
9
Scalable persistence of EMF models
Chapter 5 Design: This chapter outlines the design of the basic building blocks of the project. ER
modelling, UML and flowcharts are used.
Chapter 6 Implementation: This chapter explains the code of the basic parts of the implementation.
In addition important information about Eclipse plug – ins is provided.
Chapter 7 Evaluation: This chapter investigates a case study that makes use of the code developed
as well as some of the most important tools of EMF. Basic system cases are outlined and a discussion
takes place about performance through instantiating a big EMF model.
Chapter 8 Conclusion: This chapter summarizes the work and outcomes of the project. In addition
approaches are proposed for the future continuation of the project.
10
Scalable persistence of EMF models
Chapter 2
Literature Review
The topic of interest in this dissertation is the scalable persistence of EMF models. To ease through
the reader, this report will provide a solid background on the Eclipse Modelling Framework and its
key features. The purpose of the chapter is to give a comprehensive view of the purpose of EMF,
identify the existing literature on that matter and give a basic perspective of the benefits and
challenges faced by this cutting edge technology.
2.1 Model Driven Engineering
Software complexity is increasing rapidly. Developers and software engineers in their effort to keep
up with this boost of complexity, often seek solutions in the Model Driven Engineering approach
which provides a higher level of abstraction when compared with traditional methods of coding [32].
The notion of the MDE approach alters the primary objective of the developer. The developer focuses
on modelling a solution for the problem at hand rather than developing code [32].
A reasonable question a software engineer or a developer would ask at this point is: Isn’t that what
UML does?
The answer to this question is that the MDE approach is differentiated from other modelling
techniques because the model is used as a basic artefact to the process of code development rather
than a tool of documentation as UML and other similar approaches do.
The basic idea of the MDE approach is that the developer focuses on producing models that are going
to be used as templates for automatic code generation which can be executed, tailored and customized
[2].
Eclipse Modelling Project conforms to the principles of MDE approach and is operating under the
umbrella of the widely used open source Eclipse IDE. The basis of Eclipse Modelling Project is the
Eclipse Modelling Framework which provides the tools for modelling and code generation. Models
inside EMF can be developed by its own tree based editor but can also support the import of UML by
other famous IDE’ s like IBM Rational Rose [3].
2.2.1 Model Driven Architecture Development Life Cycle
MDA is a specific notion that is encapsulated in the context of MDE [33]. In many ways the MDA
lifecycle is similar to the traditional lifecycle of software development. Though many similarities
exist, there are also some key differences. The differences lie mainly to the different artefacts that are
created during the development process [1]. Below in Figure 2.1 the MDA lifecycle is presented:
11
Scalable persistence of EMF models
Figure 2.1: MDA software development life cycle (adopted from [1])
In Figure 2.1 an iterative process is presented which includes all the traditional steps of software
development like requirement analysis, coding and testing. The Figure 2.1 seems quite familiar
except from PIM and PSM components which stand for Platform Independent Model and Platform
Specific Model respectively [1].
PIM and PSM are both models with the main difference that PIM has a higher level of abstraction
than PSM. A PIM can be transformed into many PSMs. A PSM is focused in the platform that is
implemented. The final objective is code to be generated from PIM model [1].
2.2.2 Model Driven Architecture Benefits
The above architecture encapsulates many benefits. There is a shift of focus of the developer, to
designing and generating PIM models instead of coding. The development process in this way is more
productive since there is an abstraction of technical details. In addition, the amount of code to be
developed is less since if the transformation from a PIM to PSM is successful then in most cases the
code is automatically generated [1].
In addition there are portability benefits because of PIM’s platform independency. A PIM can be
deployed to any platform and then be transformed to PSM models [1].
Also there are important benefits in Maintenance and Documentation. The PIM by being a vital part
of the development process is not abandoned. Once the first code generation cycle is complete, it is
actively used and updated when changes take place. This way PIM has a dual role that serves both as
a high level documentation model as well as an input template for automatic code generation [1].
12
Scalable persistence of EMF models
2.2.3 Model Driven Engineering Challenges
Model Driven Engineering and more specifically the Eclipse Modelling Framework starts to emerge
as a very promising technology. As happens with every new technology though, new challenges
emerge. This is the case too with MDE. Many challenges have arisen that have to be dealt in order for
this technology to be fully adopted by the software industry [4].
Some of the most significant MDE challenges that are faced will be enumerated. Note that these are
not the only challenges to be faced by MDE but the ones that are more in accordance with the topic of
this dissertation will be presented [5]:
Traceability:
In traditional practices of programming when an error occurs in the code, the compiler or interpreter
points out the exact line of the error in the application. When MDE is used to develop an application
the model is the central artefact of the process. The model is used as a template for automatic code
generation. If though a problem in the generated code occurs the process to correct it is not as
straightforward as in the traditional practices of programming. Fixing the mistake in the code will not
fix the problem in the long-term. The problem has to be traced back to the model level so that the
generated code that it produces is always correct. Also if the code is derived by a set of models the
traceability of a possible mistake is even harder since all the models have to be examined [4, 5, 30].
Verification and validation:
In all software development approaches the verification and validation processes are essential. As
such in MDE these processes are necessary too. Along with verification and validation in MDE
though many more challenges are emerging in this context. A very important challenge is to find
approaches to verify test and validate models as well as the code that is automatically generated by
them. Also there is a huge necessity for formal verification of models. Many existing formalisms
already exist like graph transformation theory, category theory, model checking and others but the
challenge that underlies is to define, structure and standardize the relationships between them [4, 5].
Industrial adoption:
MDE has successfully been introduced to some areas of software industry. Despite this fact many
companies are still reluctant to use MDE technology. The most common reason is that a significant
extra amount of training is needed in order for the MDE technology to successfully be applied in
industry. Also many companies have already invested in other technologies so the transition from
previous technology to MDE could become a very costly operation [4, 5].
Scalability in MDE:
Often in the industry engineers have to cope with very large models of thousands of elements which
take too much time to load causing a substantial overhead to the development process. There are
many issues that need to be dealt that underlie the scalability of MDE [4, 5].
1. Model transformations should not take place after a small change in the original model.
2. Also in the case of code generation the entire code should not be regenerated after a small change
to my source model.
13
Scalable persistence of EMF models
3. In addition there should be a suitable framework for collaboration between the developers working
on a project. A change to the model from one developer should not be in conflict with the changes of
another.
4. There is a necessity for efficient manipulation of the model in parts instead of having to load in
memory all at once.
This dissertation will attempt to address some of the problems of scalability and especially No 4
problem. More specifically the projects objective is the development of an Eclipse IDE plug-in which
primarily will allow the injection of an EMF model into a database as well as the querying of parts of
the EMF model. The goal of the project will be discussed in detail in the chapters that follow.
2.3 Metamodelling Architecture
The notion of modelling is very important for this project. The objective of this section is to provide a
basic framework that will allow the reader to be familiar with terms like metametamodel, metamodel
and model in the context of this report.
Figure 2.2 describes the hierarchy of metamodeling architecture which is organized into 3 levels: M1
– M3. Each level describes the rules that the lower levels must conform to. Thus a higher level can be
considered as the definition of the modelling language that its lower levels must comply to. [35].
For example in the first branch of Figure 2.2, in M2 level there is the UML metamodel. The UML
metamodel complies with the modelling language as defined from the M3 level which is the
Metametamodel. Now the UML M2 level could describe Classes and Relationships. The M1 UML
model is like an instance of M2 metamodel and conforms to the modelling language defined by it.
Figure 2.2: Metamodelling Architecture (adopted by [35])
In the following sections these terms will be used in extent. Eclipse Modelling Framework is based on
the 3 level Metamodelling Architecture that was analysed [35]. In the following sections these terms
that were explained will be presented again in the EMF context.
2.4 Defining the basic concepts of Eclipse Modelling Framework.
This section’s purpose is to explain how an EMF model can be represented as well as answer some of
the very basics questions regarding EMF’s core structure.
What is an EMF model?
14
Scalable persistence of EMF models
What is the basic concept of the generated code?
How are models persisted in EMF?
2.4.1 A first glance
EMF is the technology that glues together the modelling and the programming worlds and conforms
to the principles of MDE. What really makes this technology stand out is that after an EMF model is
defined, efficient, fine tuned and fully customizable code can be generated with just the clicks of
some buttons. The structure of EMF models are constructed in a way that makes reuse of the
developers previous experience of Java, XML and UML [6].
EMF is a modelling framework that is integrated with Eclipse IDE and glues together three very
important technologies: XML, UML and Java. An EMF model can be well thought out as the
representation that summarizes all the three of the above technologies. A transformation or in simpler
words a change to an EMF model would also be propagated to the other technologies too [7].
A starting point to define an EMF model as said before could be either XML or UML. Defining an
EMF model with UML or XML would require to sync and integrate Eclipse with other modelling
tools, like for example IBM Rational Rose [8]. Vlad Varnica, the OMONDO business developer
director said for EMF on 2002 [36]:
“EMF represents the core subset that’s left when the non-essentials are eliminated.
It represents a rock solid foundation upon which more ambitious extensions of
UML and MDA can be built.”
Vlad Varnica
2.4.3 What is an EMF model?
All EMF models are represented by the Ecore, which is one of the major milestones of Eclipse
Modelling Framework [9]. The Ecore is an EMF model itself [9]. In the context of the modelling
hierarchy that was presented in section 2.3, the Ecore is the Metametamodel M3 can be considered the
modelling language in order for a specific metamodel M2 to be defined. Then based on the M2
metamodel another model can be defined M1, that conforms to the M2 metamodel.
In Figure 2.3 the basic structure of the Ecore is presented. In a closer look its structure is quite
straightforward and resembles a lot like a UML class diagram. This fact is not surprising since the
Ecore can be considered to be in a sense a subset of a UML Class diagram [9].
In order for the Ecore to be represented we need 4 basic Ecore classes [9]:
EClass: Represents the modelled Class. As seen from Figure 2.3 an EClass as expected can have 0 or
more EAttributes and 0 or more EReferences.
EAttribute: Represents an attribute in the EClass. Each EAttribute can have one EDataType.
EDataType: Represents the type of EAttribute.
EReference: Represents an EClass modelled in another EClass. As stated before an EClass can have
0 or more EReferences.
EStructuralFeature: It is the super Class that EReferences and EAttributes inherit from.
15
Scalable persistence of EMF models
The simplified basic structure of the Ecore metametamodel as described above is [10]:
Figure 2.3: Basic structure of Ecore (adopted form [10])
There are four different ways to represent and construct an EMF model that conforms to the
modelling language defined by Ecore. Anyone who wants to build an EMF model can have as an
input any of the four technologies cited below.




XML,
UML,
Java interfaces and
Eclipse own tree based editor
The easiest way to start is of course is Eclipse’s own tree based editor which is a tool that is fairly
easy to use and does not require an integration with other editors.[9]
Eclipse Modelling Framework also provides a textual editor called Emfatic which can automatically
be transformed to an EMF model and provides a quicker and more convenient approach for building
EMF models [11].
In order to make the picture complete, a simple EMF metamodel will be provided and its equivalent
syntax in Emfatic as implemented in the Eclipse IDE modelling environment. This model can be
considered to be a metamodel since other models can be instantiated that conform to it. Extending this
logic the Library metamodel that is presented in Figure 2.4 can be considered to be in the level M2 of
the metamodelling hierarchy and the models that conform to it in the level M1.The Library
metamodel conforms to the modelling language of Ecore Metametamodel.
16
Scalable persistence of EMF models
Figure 2.4: Example of Emfatic syntax and equivalent EMF model
In this very simple example the first EClass that is shown in Figure 2.3 is the Library. The EAtribute
is the name of the Library which is of type EString. The EReferences in the EClass Library are the
EClasses Writer and Book respectively which are also represented in the model. A Library object can
have zero to many Writer objects as well as zero to many Book objects. The EClasses Book and
Writer that are defined in the model follow exactly the same pattern as the EClass Library that was
just explained.
The Emfatic syntax resembles a lot like the Java syntax. With just the click of a button any change to
the Emphatic syntax is propagated to the tree based Ecore editor and vice versa.
2.4.4 What is the basic concept of the generated code?
This project objective is not to use the generated code that derives from an EMF model so only the
very basic features of code generation will be described at this point.
As discussed above the primary objective when designing EMF models in the Eclipse Modelling
Environment is ultimately the automatic code generation.
With just some clicks away, from the EMF model the Generator model is derived which is an EMF
model too. The majority of information needed from the Generator model in order to produce the code
exists in the initial EMF model. The separation of the initial model and Generator model introduces
some extra complexity to the whole process of modelling and generating the code. It is necessary
though in order to keep the initial EMF model independent and pure of the extra information needed
for the code generation [8, 12].
In any project when we want to generate code there should be two models in our project: the initial
EMF model and the Generator model with extensions .ecore and .genmodel respectively. Any
changes to the .ecore model are automatically propagated to the .genmodel model in order to be
always in synch. Both .ecore and .genmodel EMF models are highly dependent on each other. [8, 12]
17
Scalable persistence of EMF models
An important observation at this point is that EMF uses the previous experience of the user in Java. In
Java all classes inherit from the java.lang.Object. Extending the same logic all interfaces in EMF
inherit from the interface EObject [13].
2.4.5 How are models persisted in EMF?
Eclipse Modelling Framework provides a built – in model persistence mechanism. The technology
used is called XML Metadata Interchange (XMI). This being the default way of persistence in EMF,
no additional code or effort is required to store any objects that conform to the Ecore in XMI form [9].
The EMF persistence framework though provides a powerful API that supports persistence in other
forms other than XMI like for example databases. The downside of this approach though is that the
serialization code has to be developed from scratch using the API provided by EMF [9].
2.5 Object persistence API
The most basic building block in the EMF persistence framework is called a resource. Any Object in
order to be persisted, a resource is required.
As discussed previously an EClass includes references and attributes. When we decide to save an
object of an EClass in a resource then generally all the structural features, which are the attributes and
the references that are included in this object are saved to the same resource by default.
What if it’s required though to create more that one resource? It does not make sense to have dangling
resources without some sort of grouping. For this reason EMF persistence framework introduces
another interface called ResourceSet. The ResourceSet contains all the resources grouping them
altogether and thus making them more accessible for manipulation [9].
Below in Figure 2.4 we can observe two resources loaded in a ResourceSet. The URI ‘s next to each
resource are used as unique identifiers between resources. A resource can be loaded to a ResourceSet.
Resources in a ResourceSet can also have cross references with each other as shown in Figure 2.5.
Figure 2.5: EMF resource diagram (adopted from [9])
2.6 Reflective API
The base interface of EMF is EObject, which provides an API for managing model elements
reflectively. The reflective API provides an alternative way to manage objects in EMF that differs
from the mainstream approach which is to use the automatically generated code. A developer has the
18
Scalable persistence of EMF models
choice either to use the reflective EObject API or the code generation facility provided by EMF in
order to implement the desirable runtime behaviour of his system. [9]
The reflective API provides accessor and mutator methods like for example eGet, eSet. A simple
example is provided below:
instance.eSet(…….) to set the value of an attribute
String example = (String)instance.eGet(…) to read the value of an attribute
Through the reflective API any resource that is loaded in a ResourceSet can be accessed. Any object
of an EClass can be interrogated and in parallel access all the object’s structural features. There is the
option to navigate among objects and their references, retrieve any information about their attributes
and manipulate the data at will. The reflective API enables a more flexible approach regarding the
manipulation of objects since there is no need to initiate the process of the automatic code generation
that EMF provides. Unfortunately the trade off is that the data takes longer to be processed when
using the reflective API compared to the auto generated code.
2.7 Epsilon
Epsilon stands for “Extensible Platform of Integrated Languages for mOdel maNagement”. It is
a platform that was developed at the University of York and supports the construction and utilization
of purpose – specific languages for model management tasks such as model to model transformation,
code generation and model comparison merging validation and refactoring [29]. As it can be derived
from Figure 2.6, Epsilon is a platform that operates under the Eclipse project and the task specific
languages that operate in this context can support model management activities for EMF models.
Figure 2.6: Epsilon architecture (adopted from [31]).
The task - specific languages that so far are supported by Epsilon are:
19
Scalable persistence of EMF models
1. Epsilon Object Language (EOL): It is the base language that all the others task specific
languages that are described below extend to. EOL can be used as a standalone language that
can provide model management properties.
2. Epsilon Validation Language (EVL): The objective of the use of this language is for
validation reasons as well as for evaluation of constraints.
3. Epsilon Transformation Language (ETL): The objective of this language is to transform a
number of input models to a number of output models.
4. Epsilon Comparison Language (ECL): The objective of this language is to identify
similarities and matching patterns between models that are possibly derived from different
modelling technologies.
5. Epsilon Merging Language (EML): The objective of this language is to merge models.
6. Epsilon Wizard Language (EWL): The objective of this language is to support the updating of
models that possibly derive from different metamodels and modelling technologies.
7. Epsilon Generation Language (EGL): The objective of this language is code generation and it
is build on top of EOL.
The interest of this project is to integrate the persistency database backed solution for EMF models,
with the interface IModel which is provided by Epsilon. Only through the implementation of the
interface IModel a driver on the Epsilon platform can be developed. Through the integration with the
IModel interface the task specific languages provided by Epsilon can be used.
2.8 Database persistence for EMF
As aforementioned in previous sections, XML Metadata Interchange is the default technology that
EMF uses in order to describe EMF models into a persistent form. Along with XMI, Teneo and
Connected Data Objects (CDO) are both technologies that support a different kind of persistence.
Both technologies were developed under the Eclipse community umbrella and provide a relational
database persistence solution for EMF models [14, 15, 16].
XMI is the build in persistence solution for EMF models. When large EMF models need to be
persisted though, XMI does not scale well. The main reason is that the entire model needs to be
loaded into memory at all times even if a small part of the model needs to be interrogated. This is
inefficient from the boot time and memory footprint perspective. As thus database persistency
solutions like CDO and Teneo were investigated in the context of improving the scalable persistence
of EMF models. These solutions could have served as the backbone technologies for this project.
CDO and Teneo technologies are revisited again though on the Design part (Chapter 5) of the report
where the reasons for which these technologies were not used for this project are analyzed.
2.8.1 Connected Data Objects (CDO)
CDO technology is a framework that acts like a plug-in and is integrated with the Eclipse Modelling
Framework.[16] CDO is a runtime environment operating on a 3 tier client server architecture as
shown in Figure 2.7 that supports data persistence on the back end.
20
Scalable persistence of EMF models
Figure 2.7: CDO architecture (adopted from [15])
Examples of pluggable data storage technologies that can be integrated with CDO is relational
databases like MySql and Oracle as well as object databases and file systems. Using the EMF API it is
possible to save your models into relational databases and thus making your application more
scalable. CDO through its architecture also supports collaboration and the existence of concurrent
users [17].
2.8.2 Teneo:
Teneo like CDO is a database persistence solution for EMF models into databases. The mapping
between the EMF model and the relational database is created automatically. Teneo supports
integration with both Hibernate and EclipseLink. Hibernate is the technology that actually provides
the API that allows the EMF model to be injected into the database as well as the API for further
database manipulation and querying [18,19].
Figure 2.8: Teneo - Hibernate architecture (adopted from [19])
In Figure 2.7 it is illustrated how an EMF model is automatically mapped into a Hibernate mapping
and then stored into a relational database.
2.9 Conclusion
So far an overview of all the technologies that were used for the project implementation was
presented. The basic concepts around MDE, EMF and modelling were introduced and explained. Also
the concepts of EMF reflective API and generated code were discussed. A focus was given to the
EMF object persistence. Also the existent technologies that so far support database persistency
solutions for EMF models were introduced. In this context the project objective was briefly discussed.
In addition information was presented about Epsilon platform as well as the reasons for its integration
with this project.
In the next Chapters the software development methodology as well as the requirements analysis will
be presented. Also the design as well as the implementation of the persistence database backed
scalability solution that was developed in this project will be thoroughly analysed. Moreover a
21
Scalable persistence of EMF models
detailed evaluation will be provided that will focus on the performance stats of the developed
solution.
22
Scalable persistence of EMF models
Chapter 3
Methodology and Analysis
3.1 Introduction
The project objective is to provide a scalable and memory efficient persistence mechanism for storing
- loading EMF models and integrate it with the Epsilon platform.
In order for this objective to be accomplished the selection of a suitable software development process
was necessary.
This section includes an outline of the software development process that was selected for the
development lifecycle of this project. In addition a short but to the point analysis is presented
regarding the reasons behind this selection as well as the modifications made to tailor the
development process according to the goals of this project.
The three candidate software development processes examined were the Waterfall model, the
Evolutionary model and the Component Based Software Engineering model. Out of the three
processes the Waterfall model was chosen as the most suitable to be used for this project.
3.2 Waterfall model
Every software process has some basic activities that are common. The most important of them is
1. Software specification
2. Software design and implementation
3. Software validation
4. Software evolution
The Waterfall model is no exception including these fundamental activities. The basic characteristic
of this model though is that it treats these basic functions them with a sequential approach. More
specifically each lifecycle phase has to be complete before the other can begin. In Figure the phases of
the Waterfall model are presented [22]:
23
Scalable persistence of EMF models
Figure 3.1: Waterfall development process (adopted from [22])
Requirements and analysis definition: It is the phase where the system specifications and
constraints are defined.
System and software design: It is the phase where the building blocks of a system are decided as
well as the relationships between them.
Implementation and unit testing: It is the phase where every part of the system is implemented
using a set of programs and tested separately.
Integration and system testing: It is the phase where the separate parts of the system
implemented in the previous phase are integrated and are tested as a single complete unit.
Operation and maintenance: Further optimization of the system according to new requirements and
potential improvements
3.3 Waterfall model analysis
In real life it is rarely the case that a software engineering project is distinctly divided in 5 phases.
These phases usually overlap with each other.
In addition during the course of the development process the requirements change in a significant
proportion of projects. Moreover it is rarely the case that the design and implementation phase of a
project goes as planned.
Taking into consideration the above observations it is evident that almost in every development
process, there is a need for iterations.
3.3.1 Project analysis
The requirements of this project where stable and well defined from the beginning. Each phase in the
project lifecycle was clear and had to be completed before the other phase could start. Based on these
facts it was decided that the Waterfall model was the suitable choice for the completion of this
project.
After researching though a hybrid model of the Waterfall Software process was decided to get
adopted. The approach for the design phase of the project was not clear yet. Many were the candidate
technologies that were investigated at the time for the backbone of the application. As thus there
24
Scalable persistence of EMF models
would definitely be a need for iteration during the second phase of the development process,
something that was not supported by the Waterfall model as described above. In addition it was not
known in advance the programming obstacles that could arise during the implementation phase of the
project. There was a need for a model that could provide an alternative plan of action and that
supported iteration in case the approaches that were decided during every phase were inefficient.
Figure 3.2: Iterative Waterfall development process (adopted from [21])
So the solution was to adopt an enhanced Waterfall model which would keep the lifecycle phases
separated as in Figure 2.2 but at the same time support iterations during every phase of the
development process [21].
This hybrid model would give additional flexibility to the project but also provide alternatives in the
remote case that an obstacle could not be bypassed.
25
Scalable persistence of EMF models
Chapter 4
Requirements
4.1 Introduction
In this section the requirement analysis of the project will be provided. The project will be
broken down into two basic milestones. For each milestone a requirements analysis will be
provided. The requirements analysis will be based on the two most important requirements
classifications: functional requirements and non-functional requirements [22].
The functional requirements state the services and the functionality that a system should
deliver. Moreover functional requirements describe the behaviour of a system under
particular situations and inputs.
The non functional requirements span through many areas in a project. Non functional
requirements state the constraints on the services and functions that the system should
deliver. Some examples of non functional requirements measures are the speed of the system,
size, reliability, portability etc.
As explained in the methodology part of the project (Chapter 3) the definition of the
requirements is the first phase of the development lifecycle process.
4.2 Project Objectives
This project objective is the development of an Eclipse IDE plug – in that aims to improve
the scalable persistence of large EMF models. As discussed in the Literature review section
(Chapter 2), one of the main problems that MDE faces is that it is still unable to cope with
very large models. These models have to be transformed, constructed and merged. In addition
it is time consuming to load these huge models since the whole model has to be loaded in
memory in order to query a part of it using XMI, which is the default persistency solution for
EMF models.
The strategy devised for this project is to map the information contained into EMF models
into a generic relational database. Then through the querying of the generic relational
database it will be possible to load only a selected part of an EMF model. Thus the costly
operation of loading the entire EMF model into memory each time an interaction is required,
is avoided.
More specifically, Epsilon platform implements drivers through the implementation of the
interface IModel which can then be integrated with EMF platform. The default technology for
persistence that EMF provides, as discussed in the literature review section (Chapter 2) is
XMI. The problem is that when IModel interface is implemented with XMI a big EMF model
26
Scalable persistence of EMF models
in the scale of MBytes takes too long to be loaded into memory since the entire XMI file has
to be loaded. Thus the goal is to implement the IModel interface using a database backed
solution in order to solve the loading problem XMI faces.
4.3 Project Building Blocks
EMF Metamodel
instantiated objects
DB module
EMF Reflective API
EMF Persistence API
H2
DBMS
Querying module
EMF Reflective API
Querying Interface
Implementation
Figure 4.1: Project building blocks
Figure 4.1 illustrates the main building blocks of this project. The first part of the project
injects instances of an EMF metamodel into the generic relational database using the EMF
reflective and the EMF persistence API. The second part of the project queries the data that
were injected into the database using the EMF reflective API.
We will analyze the functional and the non – functional requirements for each of the building
blocks that were identified above.
3.3.1 Database Injection
In this building block, the instances that conform to an EMF metamodel are injected into a
generic relational database. Each functional requirement is given an index so that it can be
27
Scalable persistence of EMF models
easier identified in the Evaluation part of the project (Chapter 7). The functional requirements
are presented below:
Functional requirements:
F.R Injection
1. The instances of the EMF model that was given as an input shall be injected into a
database.
F.R Complete
2. The objects, the information that hold as well as the relationships between them shall
be stored in the database.
F.R. Generic
3. The schema of the database shall be generic. The schema shall be compatible with
any EMF model given as an input. All the information of the instances of an EMF
model shall be mapped to the relational language of the database schema.
F.R. Transparency
4. The location of the database shall be transparent. The user should be able to
determine the location of the database which stores the EMF objects.
F.R Reflection_1
5. The code shall use the EMF reflective API to inject the instances into the database.
The code generation facility offered by EMF shall not be used. The reflective API
usage will facilitate the user to use the plug – in since there will be no need for the
actions needed in order for the code to be generated.
Non – functional requirements:
N.F.R Embedded
1. The DBMS used shall be embedded to the application and be as efficient as possible
for the specific purpose needed. The database shall be built-in to the application in
order to be easier for the user to use.
N.F.R Efficiency_1
2. The Java code used for the database injection shall be as efficient as possible.
N.F.R Maintainability
3. The java code shall be easy maintainable.
28
Scalable persistence of EMF models
3.3.2 Database querying
In this building block of the project the instances that were previously stored in the database
have to be queried and loaded into in-memory objects to facilitate the management of the
information they contain.
Functional requirements:
F.R Querying
1. The users shall access the information in the database by loading parts of the stored
model into in-memory objects.
F.R Variety
2. The users shall be provided with a large set of methods that allows them to access the
information stored in the database with many different ways of choice. The methods
provided for database querying are more than 10 and conform to the interface IModel.
The methods provided are for read – only use of the database and do not support
modification.
F.R Reflection_2
3. The code written for the querying of the database shall use the EMF reflective API
again. This fact shall permit the user to use the querying methods provided with less
effort since there will be no need for code generation which would add additional
steps to the whole process.
Non – functional requirements:
N.F.R Efficiency_2
1. The code shall be reliable, robust and efficient.
N.F.R Adaptability
2. The application shall handle potential mistakes by the user while querying the
database. For example the application should return null when the user requests for an
object that does not exist in the database.
3.4 Summary
This chapter identifies the functional and non functional requirements of the main building
blocks of the application in accordance so far with the software methodology chosen for this
project. The next phase in the development lifecycle is the design phase.
29
Scalable persistence of EMF models
Chapter 5
Design
As presented in the requirements section the project consists of two main building blocks. Each
building block design will be presented separately.
5.1 Approaches considered
As discussed in the literature review section (Chapter 2) before the design of the solution could start
we had to examine all existing technologies that could provide the backbone for the Eclipse IDE plug
– in. Three were the approaches that were considered at that point. CDO technology, Teneo
technology or design from scratch a database schema that could store the information contained in
EMF objects.
5.1.1 CDO approach
CDO seemed as the technology that could provide the mapping between EMF and a relational
database for the plug - in. With further research though the following conclusions were made:
1. CDO technology was mostly oriented for Client – Server architecture with also support for changes
made from concurrent users. These features are of course useful but out of focus of the objective of
this dissertation. There was no need for adding more complexity and computational overheads.
2. Judging also from hands on experiments, a conclusion was reached that CDO is under documented
and it was not clear how to connect a relational database on the back end. This fact would also add
time overhead at the implementation of the querying module of the project [20].
Based on these facts the CDO approach was not chosen and Teneo became the next candidate
technology for consideration.
5.1.2 Teneo approach
Teneo is a database persistency solution for EMF. Applying this technology to our project could
provide an automatic database mapping between EMF and a relational database. Also due to its
integration with Hibernate technology a suitable API would be available for the querying part of the
project. After though implementing some applications with Teneo the following conclusions were
reached:
1) The database schema that is created is highly dependent on the model itself. This feature of Teneo
significantly deviates from the requirements of this project. Our goal is to create a generic mapping of
the Ecore model to a database and not a specific mapping for each Ecore model.
2) The mapping of the EMF model in the database was successful but there was not support for
loading only a part of the ECore model to the memory. This fact made this solution very inefficient
since if the ECore model was very large, the time to load it from the database to the memory would be
very long. This fact was not in accordance with the requirements of the project.
30
Scalable persistence of EMF models
3) The saving of the model to the database was not transparent. The location of the database that
contained the EMF models was not clear to the user. This fact again was not in harmony with the
requirements set for this project.
5.1.3 Summary
CDO and Teneo initially appeared as promising solutions to use as the backbones for the
implementation of the plug – in, since they provided the mapping of an EMF model to a database. On
a closer look though both technologies lacked many design features that were described in the
requirements of this application. In addition both technologies added unnecessary complexity to the
project.
Based on the above facts the third alternative approach was selected for the implementation of the
application. First a generic relational database would be designed from scratch that would provide the
persistency solution. Second, the code for the mapping between EMF and database had to be
implemented.
5.2 Use Case Diagram
Below a use case diagram is presented that analyzes the two basic axis of the project. Also a goal
based description will be provided for each use case.
Figure 5.1: Project’s Use Case diagram
Use Case: Inject model into database
Goals
- User must be able to create a database.
- User must be able to specify the location of the database
- User can persist any number of models in the database
Pre-Condition
- The models must be loaded to a resource before they can be persisted in the
database.
31
Scalable persistence of EMF models
Use Case: Query the database
- User can use the query interface in order load data for the database
Goals
- User must be able to load parts of the model into memory.
Pre-Condition
- The models must be saved into a database before they can be queried.
5.3 Generic Database design
As discussed previously a relational database design that can hold EMF objects is a very important
part of this project. In this section the steps of the design of the database schema will be presented.
The design process will be divided into three phases. The Conceptual, the Logical and the Physical
design.
5.3.1 Conceptual design
In the Conceptual design phase the primary objective is to construct an Entity Relationship diagram
by identifying [28]:
1. The data objects or entities,
2. The relationships between the objects,
3. The constraints and the rules that govern the operations on the objects.
The ER diagram will be constructed in three steps:
1st step: Building an entity – attribute table.
The entities that were identified for the database design for this project are in accordance with the
structure of the ECore model. Every entity that is identified has also specific attributes that describes
it. Below we will provide an entity – attribute table.
Entities
Attributes
Object
objectID, eClassName
Attribute
objectID, attributeName, attributeValue
Reference
objectID, referenceName, valueID
Table 5.1: Entity-attribute table
As it is evident from the table 4.1 the entities of the ER diagram are the Object, the Attribute and the
Reference. These entities represent the building blocks of an Ecore Model which are the EClass,
EAttribute and EReference as discussed in the Literature review section.
2nd step: Building an entity-relationship table.
32
Scalable persistence of EMF models
In this step of the Conceptual design the relationships and the multiplicities will be defined and
grouped together in a table.
Entities
Relationship
Multiplicity
Object, Attribute
contains
1..1 : 0..*
Object, Reference
Consists of
1..1 : 0..*
Table 5.2: Entity Relationship table
An object can contain zero or more Attributes and can consist of 0 or more References. A Reference
or an Attribute though cannot exist without a base Object.
3rd step: Building the Entity – Relationship diagram.
The ER diagram will be based on the Entity – Attribute table and Entity – Relationship table. To
represent the ER diagram a hybrid representation will be used that includes UML schematics.
Figure 5.2: Database ER diagram
The next step is to translate the Entity Relationship diagram into a set of tables.
5.3.2 Logical Design
In this phase of the database design the conceptual design will be transformed into a set of tables that
can support the operations needed as defined in the requirements section.
Database Definition Language (DBDL) will be used for the tables description which is a simple way
of representing a table by means of its name, columns and primary key.
For example “Moons (ID, Moon, Planet)” is describing a table that is called Moons and has three
columns ID, Moon and Planet. ID is the primary key of the table [28].
It is a three step process of transforming a conceptual database design into a logical database design.
33
Scalable persistence of EMF models
1st step: map the ER entities into tables.
Each ER entity that was identified in the figure 4.1 will be mapped to a table.
Objects (objectID, eclassName)
Attributes (objectID, attributeName, attributeValue)
References (objectID, referenceName, valueID)
Listing 5.1
2nd step: mapping the relationships between the entities.
Normally the relationships are mapped between entities introducing the notion of the foreign key.
This particular database design though implements the relationships between the entities in a more
abstract level.
As defined in the ER model the Object Entity contains attributes and consists of references. Through
the objectID field which is unique the connection between Objects, Attributes and References is
implemented.
Objects (objectID, eclassName)
Listing 5.2
The objectID is identified as the primary key in the table Objects. To each different EClass in the
Ecore model a unique id is assigned. For the tables Attributes and References there is no point in
identifying primary keys since all the fields of the tables are required in order to distinguish one
record from another.
3rd step: determine the fields’ data types.
For each table that was identified the data types of its fields will be identified.
Table Objects
Column
Data type
ObjectID
INT
eClassName
CLOB
Table 5.3: Table Objects
Table Attributes
Column
Data type
34
Scalable persistence of EMF models
ObjectID
INT
attributeName
CLOB
attributeValue
CLOB
Table 5.4: Table Attributes
As shown in Table 4.3 the attributeValue field type is CLOB which is a data type used to represent
large data types. An attribute of course can be of any type like INT or BOOLEAN. CLOB data type is
used to represent the attribute value in the database. When though this field is loaded into memory it
is then casted to the correct data type as described in the Ecore model.
Table References
Column
Data type
ObjectID
INT
referenceName
CLOB
valueID
INT
Table 5.5: Table References
4.3.3 Physical design
In order to realize and implement the database design there was a need for a suitable selection of a
Database Management System (DBMS). The selected DBMS is H2 Database [34]. In this section the
reasons for the selection of H2 will be provided, the code to connect to H2 Database as well as the
code to create the tables that were described above.
4.3.3.1 H2 Database
A very important part of the design for this application is the selection of the DBMS. The objective
was to find a DBMS that was as efficient as possible when operating in embedded mode since the
application was going to be applied on very large EMF models and thus the speed of the database was
crucial.
After researching and comparing different DBMS the final decision was that H2 Database would be
used. Below a matrix is presented comparing stats between different DBMS with H2 when operating
in embedded mode. H2 is faster and has lower memory footprint than other DBMS [23].
35
Scalable persistence of EMF models
Table 5.6: H2 Performance comparison (adopted from [23])
In addition H2 Database is very easy to ease in embedded mode. The only thing required in order for
the Java code to be connected with the database is the connector JAR file. Other DBMS like MySql
require more information in order for the connection to be established like the port number that
MySql listens at the user’s computer. Additional information like the above would probably add
unnecessary complexity to the application.
For all the aforementioned reasons H2 Database was selected.
5.3.3.2 Database creation – connection code
In this section the code developed for the connection and the creation of the database schema that was
described above, will be presented.
Listing 5.1: Open connection to H2 DB method
private void openConnection(String dbNamePath){
try {
//load software driver
Class.forName("org.h2.Driver");
//open connection
con = DriverManager.getConnection("jdbc:h2:" +
dbNamePath, "sa", "");
} catch (Exception e) {
e.printStackTrace();
}
}
The openConnection(String):void method connects the application with the H2 Database.
For the creation of the tables there is a general method called for all three tables
Listing 5.2: Create table method
private void createTable(String dbNamePath, String sqlDrop, String
sqlCreate) {
//open connection
if(con == null){
openConnection(dbNamePath);
}
try {
//instantiate statement
36
Scalable persistence of EMF models
stmt = con.createStatement();
//try to execute sql
stmt.executeUpdate(sqlDrop);
//try to execute sql
stmt.executeUpdate(sqlCreate);
stmt.close();
} catch(SQLException s) {
JOptionPane.showMessageDialog(null, s.getMessage(),
"CREATE Error", JOptionPane.ERROR_MESSAGE);
} catch(Exception e){
//inform user of error
JOptionPane.showMessageDialog(null, "General
error");
} finally {
//close the connection
closeConnection();
}
}
The create table method executes two Sql strings every time it is called. For every table two Sql
strings are executed. The first drops the table if it exists already and the other creates the table. Below
the Sql strings for the creation of table Objects are presented:
Listing 5.3: SQL Queries
"DROP TABLE IF EXISTS Objects";
"CREATE TABLE Objects (objectID INTEGER PRIMARY KEY AUTO_INCREMENT,
eClassName CLOB)";
5.4 EMF Injection into the database
After the design of the relational database in order for the first milestone of the project to be
completed an algorithm had to be designed and implemented in order for the EMF models to be
injected in the database.
There is no real need to provide at this point a UML diagram since this algorithm does not involve
interactions with other classes. A flow chart will illustrate better the logic that underlies the design of
EMF injection algorithm.
37
Scalable persistence of EMF models
Start
Initialize Iterator
Has the resource more
Eclass objects?
Put the Eclass objects in
the HashMap mapping
them with their id
Yes
No
Initialize Iterator
No
Has the resource more
Eclass objects?
End
No
Yes
Inject the id and
Eclass name into
table Objects
Has the current
Eclass object more
EAttributes?
No
Has the current
Eclass object more
EReferences?
Yes
Yes
Inject the EAttribute name,
EAttribute value and Eclass
id into the Attributes table
Get the Ereference
object id from the
HashMap
Inject the current Eclass object
id, EReference Name and
Ereference object id into the
References table.
Figure 5.3: Injection Algorithm Flowchart
The Figure 5.3 explains the logic of the algorithm that was going to be implemented in order to inject
the EMF models to the generic database. An important part of the algorithm is the HashMap object
creation in the beginning. A HashMap object can be imagined as a table with two columns and
variable number of rows. This specific HashMap that was designed in the first column contains the
Eclass object and in the second column the EClass’s equivalent id. The HashMap creation has to be
the first step in the algorithm in order to be used at the injection of data in the References table. It is
required to create a general map with objects ids since any object can be a reference to another object.
38
Scalable persistence of EMF models
5.5 General UML design
In this section we will analyze the classes of the application design using UML. A Class diagram will
be provided that describes both of the building blocks of the application.
5.4.1 Class diagram
The design of the application is based on three classes: LiveObjectModelBuilder, LiveObjectModel
and LiveObject. The Class LiveObjectModelBuilder is used to for the first building block of the
application and the LiveObjectModel as well as LiveObject classes are used for the second building
block of the application. LiveObject Class is an entity class.
Below is the Class diagram of the application design.
Figure 5.4: Project’s Class diagram
5.4.2 Class diagram summary
The general idea of the design is that an EMF model using the LiveObjectModelBuilder class is
injected into a database. Any LiveObjectModel must directly be associated with a database that was
created by the class LiveObjectModelBuilder since the data manipulated by the LiveObjectModel must
be coming from the database where the model is stored. LiveObjectModel implements the interface
IModel and in this way the application is integrated with the Epsilon platform.
LiveObject is an entity Class. When the user calls the methods provided by LiveObjectModel to query
the EMF models stored in the database, the information has to be stored in memory in a specific form.
39
Scalable persistence of EMF models
The information taken from the database through LiveObjectModel is stored in memory in LiveObject
objects. Each LiveObject object has three basic fields that represent each instantiated EClass. The id
of the instantiated EClass object as described in the database, the EClass name as well as the
LiveObjectModel that is derived from. The last field is very important since many LiveObjectModel
objects can operate on a single database. The basic concept that underlies this design is to fetch data
on – demand. The data that need to be interrogated is either loaded into LiveObject objects or other
suitable objects depending on the data type of the data requested. This way only the part of the model
that was requested needs to be in memory.
40
Scalable persistence of EMF models
Chapter 6
Implementation
The purpose of this section is to describe and explain very important pieces of code that implement
the basic parts of the design as discussed in Chapter 5. In addition basic points will be discussed about
the plug – in architecture which was used for the implementation of this plug-in.
6.1 Injecting into the relational database
In this section it is explained how a model is inserted into the H2 Database.
It has been mentioned in the Literature Review section (Chapter 2) that in order for the model to be
able to be persisted, first it has to be put into a Resource which is contained in a ResourceSet. The
code that implements this functionality is presented in Listings 6.1, 6.2:
Listing 6.1: Resister metamodel
public EPackage registerMetamodel() throws Exception {
String ecore = base + "library.ecore";
ResourceSet resourceSet = new ResourceSetImpl();
resourceSet.getResourceFactoryRegistry().
getExtensionToFactoryMap().put("*", new
XMIResourceFactoryImpl());
Resource resource =
resourceSet.createResource(URI.createFileURI(ecore));
resource.load(null);
return (EPackage) resource.getContents().get(0);
}
Listing 6.2: Load model to resource
public Resource loadModel() throws Exception {
String model = base + "myLibrary.model";
ResourceSet resourceSet = new ResourceSetImpl();
resourceSet.getResourceFactoryRegistry().getExtensionToFactoryM
ap().put("*", new XMIResourceFactoryImpl());
EPackage metamodel = registerMetamodel();
resourceSet.getPackageRegistry().put(metamodel.getNsURI(),
metamodel);
Resource resource =
resourceSet.createResource(URI.createFileURI(model));
resource.load(null);
return resource;
}
41
Scalable persistence of EMF models
To get into the full detail of this code is out of focus. The main concept though is that a ResourceSet is
created in which a resource is loaded. The unique identifier in order to distinguish resources between
themselves in a ResourceSet is the URI.
As seen above, the loadModel() method returns a specific resource. Based on Figure 5.4 which
provides a class diagram now all there is to be done in order to load the model in the database is call
two methods from the LiveObjectModelBuilder Class. First the database must be created calling the
createDatabase() method and then a resource can be added calling the addResource(Resource)
method.
The functionality of adding the information contained in a resource to the database was explained
through a flowchart in the Design phase so the code will not be presented.
6.2 Querying methods explained
In this section the basic methods used to query parts of the database will be explained. The methods
that were implemented conform to the interface IModel which as stated in the requirements section
(Chapter 4) is the interface that has to be implemented in order to implement a driver through Epsilon.
When the IModel method that is implemented is complicated then code will be provided to support
the descriptive explanation.
Methods:
allContents():Collection: The objective of this method is to store all the information in the Objects
table into LiveObject objects. As discussed previously the Objects table stores all the names of the
EClasses of a model. The number of the LiveObjects created will be as much as the number of the
EClasses of the model. As expected this method will return a Collection of LiveObjects objects.
getAllOfType(String):List<LiveObject>: The objective of this method is to return a list of
LiveObjects of a specific type. With simple SQL strings the Objects table is queried and the
information is stored in a List of LiveObject objects.
getAllOfKind(String):List<LiveObject>: The objective of this method is to return all the objects that
inherit from a specific type of object. More specifically the specification of this method is to return all
the types of objects that are of the same kind of a specific EClass. All the children, including the
children of children and so on, are considered to be of the same kind as the root of an inheritance tree.
The root of this tree in this situation is an EClass that is specified in the String parameter of the
method. This method is more complex than the methods presented above so the code will be
presented and explained.
Listing 6.3: Method to find subtypes
protected List<EClass> getAllSubTypes(String clazz) {
//Stores the sub Types of a particular class
ArrayList<EClass> allSubTypes = new ArrayList<EClass>();
//Stores all the classes of a package
ArrayList<EClass> allClasses = new ArrayList<EClass>();
allClasses = loadAllEClasses();
EClass eClass = classForName(clazz);
42
Scalable persistence of EMF models
EClass current = eClass;
//adds the Class itself in the subTypes ArrayList
allSubTypes.add(eClass);
int j = 0;
while(j < allSubTypes.size()){
for(int i = 0; i < allClasses.size(); i++){
//checks if super type and also not to be the same class
if(current.isSuperTypeOf(allClasses.get(i))
&& current != allClasses.get(i)){
allSubTypes.add(allClasses.get(i));
}
}
j = j + 1;
if(j < allSubTypes.size()){
current = allSubTypes.get(j);
}
}
return allSubTypes;
}
Listing 6.4: Method to get all types of subtypes
public List<LiveObject> getAllOfKind(String type) {
List<LiveObject> x = new ArrayList<LiveObject>();
ArrayList<EClass> allSubTypes = new ArrayList<EClass>();
allSubTypes = (ArrayList<EClass>) getAllSubTypes(type);
for (int i = 0; i < allSubTypes.size(); i++){
x.addAll(getAllOfType(allSubTypes.get(i).getName()));
}
return x;
}
The first method that is presented above “getAllSubTypes()” (Listing 6.3), is to get all the sub types
of an EClass. This method returns a List of EClasses that contains the entire inheritance tree when
considering as a root the EClass specified as a parameter. For that to be achieved all EClasses of an
EPackage were loaded into another List first. The algorithm contains two nested loops and also a
basic if statement.
The basic logic of the “getAllSubTypes()” method is to examine the number of children for a current
EClass every time sequentially by investigating all the EClasses of the EPackage. Every time a child
is identified it is added in a specific Collection. This process is repeated for every child that is
identified each time. The base EClass in order for this procedure to begin is given as a parameter.
43
Scalable persistence of EMF models
The challenging part of this algorithm was that no method was provided by the EMF API that
identified the entire children tree of an EClass. The EMF API method that was used is the
“isSuperTypeOf(EClass):Boolean” which only identifies if an EClass is an immediate parent of
another EClass.
After identifying all the sub classes of an EClass, then the procedure to return all the types of each sub
type is easy (Listing 6.4). For each EClass that is a sub type the “getAllOfType(String):
List<LiveObject>” method is called, getting all the types of objects that are of the same kind of an
EClass that is specified as a parameter.
knowsAboutProperty(Object, String):Boolean: This method can be considered as a representative of
many other methods of the interface IModel which examine an aspect of the model and return a
Boolean value. This method takes as a parameter a LiveObject object and checks if there is a
Reference or an Attribute that belongs to an object of an EClass that is described by the LiveObject
object. The logic that underlies this method is to take the object of the EClass that is represented by
the LiveObject object and check if there is any EStructuralFeature that is described by the String type
at the parameters of the method.
get(String):Object: This method as seen from the Class diagram in the design section, is in the
LiveObject Class. Its objective is to get a structural feature that is either a reference or an attribute and
load it to a suitable object. This object can be a Collection of objects as well. The EStructuralFeature
is described by the String parameter of the method.
Because the method is complicated, some parts of the code will be presented and explained.
First it is examined if the String parameter is actually an EStructuralFeature. If it is not, a null value
is returned. If it is, then it is either an EReference or an EAttribute.
So for an EAttribute the code is (Listing 6.5):
Listing 6.5: Part of method to get attribute values
if (sf instanceof EAttribute) {
if (sf.isMany()) {
List<Object> castedValues = newArrayList<Object>();
Collection<String> values = model.getAttributeValues(id,
property);
for (String value : values) {
castedValues.add(cast(value, sf.getEType()));
}
return castedValues;
}else {
List<String> values = model.getAttributeValues(id,
property);
if (values.size() > 0) {
return cast(values.get(0), sf.getEType());
}else {
return sf.getDefaultValue();
}
}
44
Scalable persistence of EMF models
}
The castedValues List is used because the EAttribute values should not be loaded in the String form
that are saved into the database. The EAttribute values should be loaded in the type that is specified in
the model.
The EAttributes values are taken from the database and loaded into Collection objects. If the
EAttribute is “not many”, meaning that only one value corresponds to the EAtrribute, then the first
element of the collection is returned. If though the List that holds the Attributes values is empty then it
means that the Attribute has a default value that is the same for all the objects of the same type and so
the default value is returned.
A similar logic is applied when the EStructuralFeature is an EReference. The first else in the code
section implies that the EStructuralFeature is an EReference.
Listing 6.6: Part of method to get reference values
else {
if(sf.isMany()){
Collection<LiveObject> values = model.getLiveObject(id, property);
return values;
}else{
Collection<LiveObject> values = model.getLiveObject(id, property);
if(values.size()>0){
LiveObject x = model.getLiveObject(id, property).get(0);
return x;
}else{
return sf.getDefaultValue();
}
}
The basic logic that underlies the algorithm is presented below
The objective of this part of the program is to get the LiveObject objects that correspond to a specific
object id and to a specific reference name. The reference name is given to the method as a String
parameter. In order for this objective to be accomplished the algorithm has to visit two tables: the
References table and the Objects table. In order for the procedure to be more efficient an SQL query
was used in order to join the two tables and extract the desired information more efficiently instead of
visiting separately the two tables.
6.3 Eclipse plug – in
As stated in the Requirements section (Chapter 4) the objective of this project is to develop a plug – in
which will improve the scalable persistence of EMF models. In this section the architecture of a plug
– in component in Eclipse will be discussed briefly. In addition the role of the code developed will be
explained in this context.
45
Scalable persistence of EMF models
6.3.1 Plug – in architecture
The basic unit or component that provides functionality in Eclipse is called a plug-in. While there is a
running instance of Eclipse, a plug – in is attached to a plug – in Class which is responsible for
managing and configuring the plug – in instance.
Every plug – in resides in its own folder. In the folder there is the manifest file which gives the
Eclipse runtime the necessary information to activate the plug – in.
Each time Eclipse is running there are some core plug –ins that are automatically activated by
providing some default plug – in classes and functionality. Other plug – ins though, noncore, are
activated only when needed by other plug - ins. There are two kinds of relationship that describe this
situation:
1. Dependency: Some plug – ins in order to run need some other plug – ins as prerequisites
2. Extension: The procedure of adding elements to a plug – in is called an extension.
The manifest file includes the information that describe the kind of relationship that a plug – in is
involved.
Dependency:
There are two ways to make use of a dependency plug – in. It can be used at runtime, but it must be
made available to the dependant plug – in or at compile time where Eclipse’s classpath has to be
augmented including the dependency plug – in [24].
Extension:
In this situation there exists the host plug – in and the extender plug – in. The second extends the
functions of the first. As stated before an extension plug – in can extend the behaviour of the host plug
– in. Any plug – in component in Eclipse can allow its extension in many different ways. A simple
example would be that a workbench UI allows its menus to be augmented.
There many different of extensions allowed by a plug – in. All these extensions are attached to the
host plug - in from specific slots that are called extension points [24].
Between the relationship of a host plug – in and an extender plug – in, a callback object can play a
significant role in the communication of the two. A callback object is implemented through plain Java
Objects and can add additional functionality to an extender plug – in, which is attached to an
extension point on a host plug – in.
At this point a basic context for the plug – in architecture was provided. Of course there are a lot more
to explain around Eclipse plug – ins but an extensive analysis would be out of scope for this project.
6.3.2 Plug – in roles explained
Now that a better framework was provided around Eclipse plug – ins the building blocks of the
application regarding the implementation can fit into place.
Dependencies:
46
Scalable persistence of EMF models
The H2 DBMS plays an important role in this project. Each time it is needed though to connect to the
DBMS, the classpath of Eclipse has to be augmented since the H2 .jar connector file is necessary. In
other words H2 connector file was a prerequisite for the code of this project to work. Eclipse provides
the tools to convert a file into a plug – in dependency. In this case the H2 connector file was converted
into a dependency plug – in and added to the project’s workbench. By doing that potential users of the
plug – in developed will not have to add the H2 connector file manually.
Extensions:
For the extension part, the Eclipse Modelling Framework can be imagined as the host plug – in. The
code described in the Design (Chapter 5) and Implementation part (Chapter 6) of the project can be
considered to be the callback object. So what was needed to implement apart from the callback object
would be the extender plug – in so that the callback object could operate in this context. To handle the
Extension and Extension points requires experience and deep knowledge of Eclipse’s architecture.
The steps to develop the extender plug – in so that the callback object could operate were
implemented by Dr. Dimitris Kolovos who is the supervisor of this degree thesis.
The actions above that included the creation of a dependency plug – in, the implementation of a
callback object as well as the extender plug – in allowed the creation of an operational Eclipse
extension plug – in.
47
Scalable persistence of EMF models
Chapter 7
Evaluation
In this section a case study will be presented as well as test cases. In addition the requirements
evaluation will be presented as well as some improvements evaluation.
7.1 Case study
At this point the EMF model will be kept simple and small in order for the case study to be easy for
comprehending and due to practical space purposes. The fact that a relatively small EMF model is
presented does not mean that the application can be applied only to simple EMF models. The EMF
model that is going to be presented in the case study is structured in a way so that the functionality of
the application can be examined in full extent.
7.1.1 Constructing an EMF model
The EMF model that our case study will be based is presented below. It is an extended model from
the one that was presented in the Literature Review section (Chapter 2). This model includes also
inheritance relationships and more classes. The extLibrary.ecore file is presented in Figure 7.1.
Figure 7.1: Case study Ecore model
The extended library metamodel consists of six EClasses. Each EClass may contain EAttributes,
EReferences or both. An inheritance relationship in this model is for example between the EClasses
Writer and Person. EClass Writer inherits from the EClass Person.
48
Scalable persistence of EMF models
It is out of scope at this point to present the equivalent Emfatic code that was implemented in order to
generate this model. The thing to remember is that any change to the model with the clicks of some
buttons can automatically be propagated to the Emfatic code and vice versa.
Before going to the database injection part there is the necessity to stress again that it is not the Ecore
model itself that it is injected to the database. The database stores a model with object instances that
conform to the extLibrary.ecore metamodel.
The extLibrary.ecore metamodel can be imagined as the blueprint of the objects that are going to be
instantiated based on it.
Having that in mind the next step is to create objects that conform to the “extlibrary” metamodel of
Figure 7.1.
7.1.2 Instantiating the model
As defined in requirement section (Chapter 4) the models should be instantiated without the help of
the automatic code generation facility EMF provides. The objects should be instantiated reflectively.
In order to achieve that, the built-in EMF reflective editor will be used. The way to achieve that is
really straightforward and does not need additional plug-ins in order to edit the model. After creating
an ECore model that the objects need to conform to it, all is needed is to let know the EMF of the new
ECore created. To update what EMF knows, the newly created model has to be registered by right –
clicking and selecting register EPackages like in Figure 7.2.
Figure 7.2: Register EPackages menu
Now it is easy to create models that conform to the newly created EMF metamodel.
The model that is created using the reflective editor [25] and is presented in the Figure 7.3
Figure 7.3: Instantiated metamodel
This model in Figure 7.3 conforms to the EMF metamodel in Figure 7.1. What is the information that
is included in this model though?
As described in the model in Figure 7.3 there is the root object of type Library that is called
“myExtLib” and has many objects as references:
1. Writer object called Dan
49
Scalable persistence of EMF models
2. Writer object called Jack.
3. Book object called Code Da Vinci
4. Enc object called Britanika
5. Employee object called John
Each of these objects has several fields that need to be instantiated. In the model though as seen in
Figure 7.3 not all the information that is encapsulated in each object is visible. In order to instantiate
and also to view all the information of an object EMF provides a property view. Below in Figure 7.4
the property view of object 1 is presented:
Figure 7.4: Reflective editor property view
Figure 7.4 demonstrates all the information encapsulated in the Writer object with First Name: “Dan”,
Last Name: “Brown” and also a reference to an object Book with Title “Code Da Davinci”.
Similarly all the other objects in Figure 7.3 have been instantiated using the build – in reflective
editor and the EMF property view. There is also the option to instantiate the objects programmatically
but since the instantiated model was small, using the built – in reflective editor is quite convenient.
The next step is to inject the model described in Figure 7.3 in the database.
7.1.3 Objects injected in the database
In this section the case study will be continued. The model that contains instantiated objects in Figure
7.3 is injected into the database. The tables that were created during that process are presented below
in Figure 7.5, 7.6 and 7.7. The information in each table will be analyzed separately.
Figure 7.5: Objects table
50
Scalable persistence of EMF models
Figure 7.6: Attributes table
Figure 7.7: References table
Objects table
Each EClass name is inserted in the Objects table and a unique id is assigned to each object. Despite
the fact that we have two Writer objects, each of them is assigned a different id. The Objects table
contains six objects which is the right number since we had six instantiated objects in Figure
6.3.Continuing this logic it is obvious that it is of no importance the size of the model. The table will
continue growing assigning each new object a unique id number.
Attributes table
In the Attributes table each attribute of an object is mapped with its id and its value. In this way all the
information stored in attributes, it is associated with the suitable object id. In addition there is enough
information in order to determine which attribute belongs to which object. In this case also if the
model was of very large size, it would not affect at all the functionality of this table.
For example if the information of the second row of the table is examined the following conclusions
would be drawn: The object with id = 2 has an attribute which name is firstName and has the value
Dan.
51
Scalable persistence of EMF models
References table
The same logic is followed also in the references table. To analyze the functionality of this table the
first row of the table will be explained. The information illustrated is that in the object with id = 1
there is a reference that is called writers and is of type object with id = 2. This means that the
reference is of type Writer because in the objects table this is the eClassName with id 2. Respectively
in the second row, the object with id 1 has a reference of type object with id = 3 which is of type
Writers again.
7.2 Testing
In this section the testing analysis will take place through black box testing the two building blocks of
the project. Based on the results of the testing the functional requirement analysis will be presented in
the nest section. The case study presented above will be used for the testing process.
7.2.1 Black Box testing
Black box testing is the most suitable process to use at this point since it focuses on the system
functional requirements. The idea of black box testing is to test a system without knowing the internal
structure. An input is given to the system and the output that the system gives is compared with the
expected output as in Figure 7.8.
System input
Black Box
System output
Figure 7.8: Black Box testing schema
Each test case will have a purpose, specific inputs and comparison between the outputs and expected
outputs [26].
7.2.1 Database injection testing
The evaluation of this building block of the project will be based on the case study that was presented
in the 7.1 section of the report.
Purpose of the test: Test the database module functionality
Input values
Expected / Actual Results
Pass/Fail
Input model in Figure 6.3
Expected results:
Pass.
The tables in Figure 6.5
Destination path for the database file New db file in the projects folder.
Actual Results as expected
The database module of the application passed the black box test. All the information of the model
was successfully injected in the database. Also a database file was created in the project’s folder
which contains all the model data.
52
Scalable persistence of EMF models
7.2.2 Database querying testing
The tests will be based again on the case study. The basic methods of the IModel interface that were
implemented will be tested.
Purpose of the test: Test the querying module functionality
Method
Input values
Expected / Actual Results
getAllOfKind(String):
“Book”
Expected results:
List<LiveObject>
The List to have two
Objects, one Object Book
and one Object Enc.
getAllOfType(String):
List<LiveObject>
allContents():Collection<?>
getElementById(String): Object
Actual Results as expected
Expected results:
The List to have two Writer
Objects.
“Writer”
The LiveObject Model
that corresponds to the
Case study.
Actual Results as expected
Expected results:
A Collection with all the
Objects in the Objects table
in Figure 6.5.
Actual Results as expected.
Expected results:
A LiveObject with id=4 and
className=”Book”
“4”
getElementId(Object): String
EClass Library
isOfKind(Object, String):
Boolean
The LiveObject with
className = Enc and
the String “Book”.
knowsAboutProperty(Object,
String): Boolean
The LiveObject with
className = Writer
and String value
firstName
Pass/Fail
Pass.
Pass.
Pass.
Pass.
Actual Results as expected.
Expected results:
A String with value=“1”
Pass.
Actual Results as expected.
Expected results:
A Boolean value true.
Pass.
Actual Results as expected.
Expected results:
A Boolean value true.
Pass.
Actual Results as expected.
The querying module of the project passed the black box test. Many more methods were implemented
in this module but an exhaustive testing of every method would be out of scope for this project.
7.3 Requirements Evaluation
The functional requirements evaluation will take place separately for each building block of the
project. The evaluation will be based on information that was presented so far as well as the black box
testing of the previous section, which is a process that focuses on the functional requirements of a
system.
53
Scalable persistence of EMF models
7.3.1 Database injection functional requirements evaluation
Each of the functional requirements is tagged with a special index (Section 4.3) so that it is easier to
refer to.
Functional requirements:
F.R Injection: It is satisfied since through the case study it is obvious that the EMF model is
successfully injected into the database.
F.R Complete: It is satisfied because through the description and explanation of each table in the
Case Study it is evident that all the information represented in the model is successfully mapped into
the database.
F.R. Generic: It is satisfied since the database design is operational for any EMF model. The
database design is not dependant on the structure of the EMF model.
F.R. Transparency: It is satisfied since the user through the LiveObjectModelBuilder Class has the
option to choose the location of the created database. In addition a separate transparent file of the
database is created to the location specified by the user.
F.R Reflection 1: It is satisfied since the EMF automatic code generation facility is not used to
manipulate the objects.
7.3.2 Database querying functional requirements evaluation
Each of the functional requirements is tagged with a special index (Section 4.3) so that it is easier to
refer to.
Functional requirements:
F.R Querying: It is satisfied since as explained in the implementation part of the project parts of the
model can be loaded into LiveObject objects. Also there is the option to load to memory specific
EAttributes or EReferences. In this way selective parts of the model can be loaded into memory.
F.R Variety: It is satisfied since as explained in the implementation part (Chapter 6) of the project as
well as at the black box testing part a large set of methods have been implemented that allows the user
to query the data in the database in many different ways of choice.
F.R Reflection 2: It is satisfied since the EMF automatic code generation facility was not used in
order to implement the querying methods.
7.3.1 Database injection non functional requirements evaluation
Non – functional requirements:
The non – functional requirements are satisfied too.
N.F.R Embedded: The H2 DBMS system was used in embedded mode. The .jar file needed for the
database connection was transformed into an Eclipse plug – in. This way the user can use it as a
dependency plug – in to the plug – in developed in this project.
54
Scalable persistence of EMF models
N.F.R Efficiency_1: In addition an effort was made to make the Java code efficient in order for the
injection of the model to happen as fast as possible. The time consuming activity of opening a
connection to the database was handled with caution. The connections to the database were
implemented only when needed.
N.F.R Maintainability: The Java code developed is well documented through UML diagrams
in order to be maintainable.
7.3.2 Database querying non functional requirements evaluation
In this section using the tools that Eclipse platform provides the non functional requirements of the
querying building block of the project will be evaluated.
Non – functional requirements:
N.F.R Efficiency_2: An effort was made to make the code as efficient as possible. More
specifically when consecutive querying methods are executed, they all use the same
connection to the database. This technique is applied because opening a connection to a
database is a time consuming activity and performance plays a big part in this project’s
objective. In addition by joining tables when needed instead of visiting the database multiple
times in order to extract data the code is more efficient.
N.F.R Adaptability: The code prevents the alienation of the user by having if clauses as well
as try & catch clauses. This way even if the user inserts invalid parameters when trying to
execute methods of the application the return values are indicative of the inconsistencies.
7.4 General Performance Evaluation
The objective of this section is to test the performance of the application that was developed. Using
Eclipse IDE tools as well as the Epsilon plug – in, the time needed for specific functions to be
executed will be measured. A comparison will take place between
1. Executing these functions through extracting the data from the database.
2. Executing these functions using the data provided from the EMF model itself.
All the test functions that will be timed and measured are implemented in the form of Epsilon Object
Language (EOL) scripts. In addition a very big EMF model will be instantiated using EOL language.
The application’s performance will be measured while handling this big EMF model. The metrics that
are of interest are three at this point:
1. Runtime behaviour: For this metric all the objects of the model will be interrogated. In this case
the database backed solution is expected to run significantly slower since all the data have to be
extracted from the database. With the XMI persistency solution all the model is already loaded into
memory so the interrogation of all the objects of the model is expected to be done much faster.
2. Boot time: For this metric the time for one element of the model to be accessed will be measured.
The database backed solution is expected to perform faster since there is no need for the whole model
to be loaded in memory in order to access the model. The loading time will be considered.
55
Scalable persistence of EMF models
3. Memory footprint: When only a small part of the model is accessed it is obvious that the memory
footprint of the database solution is much more efficient. Only the elements that are on demand
accessed need to be in memory in contrast with the XMI persistence solution that the whole model
needs to be loaded into memory at all times.
The process of realizing these tests will be explained thoroughly in the following sections.
7.4.1 EOL Scripts explained
The EOL program that will be provided in Listing 7.1 is used for the instantiation of a quite big EMF
model. The syntax is similar to that of Java. First a library object is instantiated that will be the root
object. After there is a for loop with 30,000 iterations that instantiates 30,000 Writer objects and
30000 Book objects. The Writer and the Book objects are also added as references to the root object
Library. In addition the attributes of the objects are instantiated too. The model can be considered
quite big since it consists of 60,001 objects. In the hard disc the model occupies almost 2.6 MB
Listing 7.1: EOL instantiating code
var library : Library := new Library;
library.name = 'myLib2';
for (i in Sequence {1..30000}){
var w : Writer := new Writer;
w.firstName := 'w' + i;
w.lastName := 'l' + i;
library.writers.add(w);
var b : Book := new Book;
b.title := 't' + i;
b.pages := 200;
library.books.add(b);
}
In the first line of the EOL script in Listing 7.2 the Profiler object is instantiated. The Profiler object is
used for doing time measurements when EOL scripts are executed. First the name of the library object
is printed. Then the entire list of the Writers and all the titles of the books in the Library are printed.
Listing 7.2: EOL script1
var profiler = new Native('org.eclipse.epsilon.eol.tools.ProfilerTool');
profiler.start("main");
var l = Library.all.first();
l.name.println();
for(writer in Writer.allOfType){
writer.firstName.println();
}
for (book in Book.all) {
book.title.println();
}
profiler.stop();
56
Scalable persistence of EMF models
The EOL script in Listing 7.3 is printing the name on the root object which is of type Library which
contains all the references of the Writer and Book objects.
Listing 7.3: EOL script2
var profiler = new Native('org.eclipse.epsilon.eol.tools.ProfilerTool');
profiler.start("main");
var l = Library.all.first();
l.name.println();
profiler.stop();
7.4.2 Extender plug – in
Below in Figure 7.9 a part of role of the extender plug – in is illustrated. The UI of Eclipse menus
were augmented by the adding an additional functionality. A file that is of type “.model” can be
injected into an H2 database. So the actions that were taken so far were to instantiate a model using
the EOL scripts and then by using the extended UI menu the instantiated EMF model was injected
into an H2 database. The database that was created contains all the information included in the 20,001
objects.
Figure 7.9: UI extension
7.4.3 EMF run time vs. database run time
Below in Figure 7.10, the time to run the EOL script of Listing 7.2. The time presented is in ms units.
The time for the tens of thousands of names and titles to be printed is 2.075 sec. This result is very
fast and also anticipated. The EMF model is already loaded in memory so the amount time needed to
manipulate it, is very little.
Figure 7.10: Time through EMF model
The amount of time to run the EOL script through the database is significantly more. The time needed
is more that 35 minutes. The difference in executions times was of course anticipated since in this
case all the data that are printed are extracted from the database and is not already in memory.
57
Scalable persistence of EMF models
7.4.4 EMF boot time vs. database boot time
The time needed to access one object of the big EMF model from the database is presented in Figure
7.11. It is the time needed for the EOL script in Listing 7.3 to be executed which is 1.172
Figure 7.11: Database Boot time
The time needed to access one object of the big EMF model directly from the model is presented in
Figure 7.12. It is the time needed for the EOL script in Listing 7.3 to be executed which is only 0.015
sec.
Figure 7.12: EMF Boot time
The database solution is slower but its big advantage is that only a very small part of the model to be
loaded in memory. So to make the comparison complete the time for the big EMF model to be loaded
into memory has to be measured. This loading time was measured using Java. The time to load this
big EMF model of approximately 2.6 MB was 1.689 sec. So finally is the EMF boot time is bigger
than the database boot time since 1.689 sec + 0.015 sec > 1.172 sec.
7.4.5 Comparison and outcomes
As analyzed above the run time to execute the EOL script that interrogates all the objects of the huge
EMF model that was instantiated is significantly bigger when it is handled through the database than
when it is handled directly through the EMF model. This was a result of course that was anticipated.
On the other hand though, a model that contains tens of thousands of objects as the above is mapped
to a huge XML file. This XML file takes much time to be loaded into memory. As it was presented in
section 7.4.4 the boot time through the database backed solution that was developed is less than the
boot time through EMF. This is because it is time consuming for the whole EMF model to be loaded
into memory. The model of the example is 2.6 MB. For bigger models this difference would be even
more obvious since a bigger model would take even more time to load and all the other metrics would
remain stable. The objective of the database backed solution was accomplished since it is more
efficient from the boot time perspective
In addition the database backed solution is much more efficient from the memory footprint
perspective. This fact is obvious since the data that is loaded into memory is done on demand. So if a
small part of the model needs to be interrogated only this small part will be loaded into memory. On
the other hand with XMI which is the default persistency solution of EMF the entire model needs to
be loaded in memory even if a small part needs to be interrogated.
58
Scalable persistence of EMF models
Chapter 8
Conclusion
8.1 Project overview
The purpose of the project was to implement a persistency solution using databases for EMF models
to address the scalability issues faced by EMF technology. To achieve the objectives of the project, a
generic database design was implemented that is capable to store any kind of EMF model. In addition
a set of querying methods that complied to the interface IModel were implemented in order to
manipulate the data in the database with the ultimate goal of bypassing the loading problems EMF
faces when persisted in XMI form. Moreover the application developed was transformed into an
Eclipse plug – in. The plug – in is that was developed can be used to extend the functionality of the
EMF platform by injecting an EMF model into a database. During the requirements evaluation all
requirements were satisfied so the aims of the project can be considered that they were achieved.
Chapter 2 of the project, the literature review identified the basic concepts of MDE as well as some
basic features of the EMF platform. In addition two technologies that provided persistency solutions
for EMF models were identified: Teneo and CDO. Epsilon platform was explained briefly too.
In Chapter 3 the methodology of the project was explained. The reasons for the need of iterations
throughout the project were discussed.
In Chapter 4 the project motivation was analyzed in more detail. In addition the requirements analysis
was presented. The analysis was done separately for each building block of the project and was
divided into functional and non – functional requirements analysis.
Chapter 5 of the project provided a detailed design of how the application for this project was
designed. In addition a very important analysis was provided regarding the reasons for which the
technologies of CDO and Teneo were not used. The analysis was cross-referenced with Chapter 4 of
the project, the requirement analysis. After hands - on experiments and implementing examples using
CDO and Teneo the two technologies would not be able to satisfy the requirements that were set for
this project.
In Chapter 6 the implementation details of the project were presented. A focus was given to the
querying methods that were implemented. From the set of the querying methods the most important
were analysed and the algorithms used to implement them were explained thoroughly. Also basic
information was provided about Eclipse plug – ins in order to explain the implementation of the
project in this context.
In Chapter 7 the evaluation of the project took place. A case study was presented that demonstrated
the functionality of the application developed. In addition testing cases, using black box testing, of the
important querying methods were presented. The application passed all the testing cases. Moreover
the requirements evaluation took place. More specifically for the evaluation of the performance part
of the project the Epsilon Object Language (EOL) was used. Through EOL scripts the querying
methods were double tested and their performance was measured. In addition a big EMF model was
instantiated programmatically using EOL in order for the performance of the application to be tested
against it.
59
Scalable persistence of EMF models
8.2 Personal development
The project of Scalable Persistence of EMF models that was developed during the summer term of
this year required studying and contextualizing many different technologies which have a steep
learning curve. Due to that fact though, through the project development I had the amazing
opportunity to acquire valuable knowledge of many different technologies.
First of all through the extensive literature research around the subject I learned a lot about the basic
principles of MDE as well as the benefits and the challenges faced by this technology. In addition I set
firm basis on the fundamentals of EMF which include concepts as: EMF modelling and structure,
automatic code generation, EMF reflective API, EMF persistence API and the XMI serialization. In
addition I learned a lot about subprojects of EMF such as CDO and Teneo that gave me a holistic
view regarding the persistence of EMF.
Through the design phase of the project I had the opportunity to strengthen my UML skills as well as
my knowledge about database design.
While implementing this project I set a solid basis on the reflective API that EMF provides and how
to manipulate EMF objects without the use of the automatic code generation facility. In addition I
learnt a lot through hands on experience about the persistence framework of EMF. In addition I had
the opportunity to exercise my Java skills, SQL development skills and contextualize to a better
degree the powerful possibilities of Eclipse and Eclipse plug - ins.
Moreover while implementing the evaluation part of the project I learnt how to set up configuration
launches on Epsilon platform when using EOL language for model management activities.
Overall it was a remarkable experience that helped me extend to a significant extent my computer
science skills and introduce me to new exciting technologies.
8.3 Future work
The objective to ultimately address the scalability issues that EMF faces is not possible to be
addressed through an MSc project due to timing constraints. As thus there is room for improvements
that will be discussed briefly below.
Write property
As discussed in the requirements section (Chapter 4) the purpose of this application was to query the
models from the database for only read only purposes. A more difficult task and future improvement
would be to expand the functionality of this application by adding methods that will be able to modify
the model. This task will require much additional coding. Most importantly though, a mechanism
should be provided that will propagate the changes made to the data in the database to the model
itself.
Additional case studies
Due to timing constraints the application was not tested exhaustively. A good practice for future
improvements would be to test the application against many big EMF models in order to expose
possible holes from the performance perspective and address the grand challenge of scalable
persistence of EMF models even more effectively.
60
Scalable persistence of EMF models
Improve database design
Another consideration for improvements could be a more efficient database design. More specifically
at the references table another column could be added in order to map the object id with the EClass
name. This information already exists in the Objects table. This means that in the references table
there would be redundant data but this could improve the performance of the application since only
one visit to the References table would be sufficient for extracting the needed data.
61
Scalable persistence of EMF models
Bibliography
[1] Anneke Kleppe, Jos Warmer, and Wim Bast, MDA Explained: The Model Driven Architecture™:
Practice and Promise. Boston, United States of America: Addison-Wesley Professional, 2003.
[2] K. Žagar A. Vodovnik, "MODEL DRIVEN ARCHITECTURE, CONTROL SYSTEMS AND
ECLIPSE," in Accelerator & Large Expt. Physics Control Systems Conference, Geneva, 2005, p.
6.
[3] Elena Tibrea, "Enriching EMF models," Hamburg University of Science and Technology,
Hamburg, Msc 2006.
[4] Ragnhild Van Der Straeten,Tom Mens Stefan Van Baelen, "First International Workshop on
Challenges in Model Driven Software Engineering ," in Model Driven Engineering Languages
and Systems, Toulouse, 2008, p. 61.
[5] Richard F. Paige, and Fiona A.C. Polack Dimitrios S. Kolovos, "The Grand Challenge of
Scalability," in Challenges in Model-Driven Software, 2008, p. 6.
[6] Dave Steinberg Nick Boldt, "Intorduction to Eclipse Modelling Framework," IBM, Toronto, 2006.
[7] Karsten Ehrig, Christian K¨ohler, G¨unter Kuhns, Gabriele Taentzer, Eduard Enrico Biermann,
"EMF Model Refactoring based on Graph Transformation Concepts," in Third Workshop on
Software Evolution Embracing the Change, 2006, p. 16.
[8]
Eclipse (undated): Eclipse
http://www.eclipse.org/emf.
Modelling
Framework
Project
[On-line].
Available
at:
[9] F. Budinsky, M. Paternostro and E. Merks D. Steinberg, EMF Eclipse Modelling Framework, 2nd
ed. Mishigan, United States of America: Addison - Wesley , 2008.
[10] Bacvanski, V. & Graff, P. (2005). Mastering Eclipse Modelling Framework. Available at:
http://www.eclipsecon.org/2005/presentations/EclipseCon2005_Tutorial28.pdf
[11] Eclipse (undated). Emfatic [On-line]. Available at: http://wiki.eclipse.org/Emfatic.
[12] Claudia Ermel, and Stefan Jurack Enrico Biermann, "Modeling the "Ecore to GenModel"
Transformation," , 2010, p. 13.
[13] Eclipse (undated). Eclipse Documentation – Archived release [On - line]. Available at:
org.eclipse.emf.doc/references/javadoc/org/eclipse/emf/ecore/EObject.html
[14] Eclipse Foundation. Teneo, 2008. Available at:
http://www.eclipse.org/modeling/emft/?project=teneo.
[15] Eclipse Foundation. CDO, 2008. Available at: http://www.eclipse.org/modeling/emft/
62
Scalable persistence of EMF models
[16] Eclipse Foundation (undated) CDO [On-line]. Available at: http://wiki.eclipse.org/CDO
[17] Thomas Krafft (2010, June 21) Introducing the Objectivity Eclipse CDO [On-line]. Available at:
http://eclipse.sys-con.com/node/1439578
[18] Dave Minter and Jeff Linwood, Beginning Hibernate From Novice to Professional, Steve
Anglin, Ed. New York, United States of America: Apress Inc, 2006.
[19] P. Charlan, D. Ouellet and M. Salois, “System architecture recovery for open source software
integration.” Defence R&D Canada, 2009.
[20]
Stefan Winkler (2008, Jan. 29). The EclipseCorner [On-line].
http://dev.eclipse.org/newslists/news.eclipse.technology.emft/msg04425.html
Available
at:
[21] Christian Fuchs (2002). Software Engineering and the Production of Surplus Value [On - line].
Available at: http://clogic.eserver.org/2002/fuchs.html
[22] Ian Sommerville, Software Engineering. United States of America: Adison Wesley, 2007.
[23]
H2
(undated).
Performance
comparison
http://www.h2database.com/html/performance.html
[On
-
line].
Available
at:
[24] A. Bolour (2003, July 3). Notes on Eclipse plug- in architecture. Eclipse Corner [On - line]
Available at: http://www.eclipse.org/articles/Article-Plug-in
architecture/plugin_architecture.html
[25]
Eclipse (undated). Reflective EMF tutorial [On http://www.eclipse.org/gmt/epsilon/doc/articles/reflective-emf-tutorial
line].
Available
at:
[26] RedStone Software (undated). Black-box vs. White-box Testing: Choosing the Right Approach
to
Deliver
Quality
Applications
[On
line].
Available
at:
http://www.testplant.com/download_files/BB_vs_WB_Testing.pdf.
[27] Elmasri, R. & Navathe, S. (2007). Fundamentals of Database Systems. Boston,Mass; London.
Pearson/AddisonWesley.
[28] DIS lecture notes. (undated) ER modelling [On - line]. Available at: http://wwwcourse.cs.york.ac.uk/dis/
[29] Kolovos, D. & Paige, R. & Rose, L. & Polack, F. (undated). THE Epsilon BOOK.
[30] B. Selic, "The Pragmatics of Model-Driven Development," IEEE Software, 2003.
[31] L. M. Rose, R. F. Paige, D. S. Kolovos and F. A.C. Polack., "The Epsilon Generation
Language," in European Conference in Model Driven Architecture (ECMDA), 2008, p.16.
[32] D. C Schmidt, “Model-Driven Engineering”. Vanderbilt University, 2006
[33] S. Kent, “Model Driven Engineering”. University of Kent, 2002.
[34] H2 (undated). H2 database engine [On - line]. Available at :
http://www.h2database.com/html/main.html
[35] D. Kolovos, "An Extensible Platform for Specification of Integrated Languages for
Model Management," PhD Thesis, The University of York, 2008.
63
Scalable persistence of EMF models
[36] N. Boldt, D. Steinberg (2006, March 20), Introduction to Eclipse Modelling Framework [On line], Available at: http://www.eclipse.org/modeling/emf/docs/presentations/EclipseCon
64
Download