TOWARDS AN EVOLUTIONARY MODEL GENERATION
FOR ERP PERFORMANCE SIMULATION
Daniel Tertilt, Stefanie Leimeister
Fortiss – Research Institute at Technische Universitaet Muenchen
Guerickestrasse 25, 80805 Muenchen, Germany
Stephan Gradl, Manuel Mayer, Helmut Krcmar
Chair for Information Systems, Technische Universitaet Muenchen
Boltzmannstraße 3, 85748 Garching, Germany
ABSTRACT
The performance of ERP systems is a critical success factor for the reliable operation of a business. A promising
approach to cope with the complexity of today's ERP systems and to predict their actual behavior is simulation.
Commercial ERP systems, however, provide only limited insight, so several components have to be handled as
black boxes and require a modeling approach. In this paper we present an approach to increase the accuracy of ERP
system performance simulation by using an evolutionary algorithm to model the black boxes' performance behavior.
We show that evolutionary algorithms are able to generate performance models for ERP components, based on
measured performance data, that accurately describe the performance behavior of these components. Furthermore, we
point out the characteristics of the algorithm, as well as its advantages and disadvantages, and give an outlook on
future research.
KEYWORDS
Performance modeling, performance simulation, ERP, evolutionary algorithm
1. INTRODUCTION
The performance of an enterprise resource planning (ERP) system is a business critical non-functional
requirement (Schneider, 2006), and strongly dependent on the infrastructure it is hosted on. Bögelsack et al.
(2010) showed that changes in the infrastructure can significantly influence the performance and
consequently the usability of an ERP system. Performance predictability for ERP systems is thus very
desirable, as it allows the handling of performance problems before they occur (Balsamo et al., 2004),
thereby reducing the risk of infrastructure changes significantly.
At the same time, performance prediction is hard to achieve due to the complexity of modern ERP
systems (Anderson and Mißbach, 2005). Often this complexity is managed by analyzing the internal structure
of the ERP system (white box approach), identifying the correlations between the internal components, and
performing performance prediction by simulation. Simulation, however, involves certain "smallest elements":
components whose internals cannot be inspected (e.g. to protect intellectual property) and that therefore have
to be handled as black boxes. The performance behavior of these black boxes is often modeled using simple
mathematical functions, such as the mean of measured performance data (see e.g. (Woodside, 2002)), thereby
ignoring the characteristics of the underlying infrastructure.
As Noblet et al. (2004) state, simulation results can only be valid and trusted if the simulation is as close to
reality as possible. Our aim is to optimize the simulation results by increasing the accuracy of the black box
performance models. For this, we develop an approach to model the performance behavior of these black
boxes using an evolutionary algorithm (Zitzler and Thiele, 1999) performing a multi-objective optimization
(Zitzler and Thiele, 1998) on measured performance data. In contrast to exact mathematical or algorithmic
modeling, the evolutionary approach promises usable approximations even of multidimensional models in an
acceptable time span (Gwozdz and Szlachcic, 2009), providing the possibility to consider multiple factors for
the response time prediction of a black box.
2. RELATED WORK
The approach presented in this paper is intended to enhance the interface between ERP performance
measurement and ERP performance simulation. The most closely related work is shown in Table 1, where we
also indicate which subjects each document deals with. The two documents that deal with both modeling and
simulation describe a hybrid approach, similar to the one developed in this paper.
Table 1. Related Work concerning performance measurement, modeling and simulation

Document                   Measurement   Modeling   Simulation
Bögelsack et al. (2008)    No            No         Yes
Pllana et al. (2008)       No            Yes        Yes
Jehle (2009)               Yes           No         No
Gradl et al. (2009)        No            No         Yes
Rolia et al. (2009)        No            Yes        No
Kraft et al. (2009)        No            Yes        Yes
Bögelsack et al. (2010)    Yes           No         No
3. ERP COMPONENT PERFORMANCE MODELING
3.1 Structure of the Algorithm
Modeled on natural evolution, the evolutionary algorithm sets up a population of competing threads, each
trying to generate a model that best matches the given set of measured performance data of the ERP black
box component. Selection of the fittest is done by competition, and the population evolves by constantly
passing better models to threads that lost competitions, and by mutation of these models. Competition in this
context means that the fitness values of two threads (defined by the fitness function described later) are
compared, with the thread that has the better fitness value winning the competition.
Fig. 1. Schematic representation of the functionality of the evolutionary algorithm: a central component creates the population, fetches opponents for competing threads, and receives the reports of their competitions.
In order to create the performance model, several elements have to be introduced. First, a central component is
created at startup. This component manages the population of threads that generate the performance model, by
creating them at startup and by selecting an available opponent whenever a thread is ready to compete
against another thread. Furthermore, the central component monitors the results of each competition to check
whether the end criterion is reached.
The population size is restricted by the resources of the system the algorithm is executed on. A larger
population results in a higher degree of parallelization of the evolution. Figure 1 is a schematic sketch of the
functionality of the algorithm.
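The competition loop described above can be sketched as follows. This is an illustrative, single-threaded Python sketch under our own naming: each member of the population here holds a single constant as its "model" so that the loop can be demonstrated end to end, whereas the actual implementation runs one model-generating thread per member and works on object trees.

```python
import random

class Member:
    """One member of the population. The model is a single constant here,
    purely for illustration; in the paper each member is a thread holding
    an object-tree model."""
    def __init__(self, target):
        self.target = target
        self.model = random.uniform(-10.0, 10.0)

    def error(self):
        # Fitness: distance between the model and the measured value.
        return abs(self.model - self.target)

    def clone_model(self):
        return self.model

    def receive(self, model):
        self.model = model

    def mutate(self):
        # Analogous to fixed-value mutation: add a small random value.
        self.model += random.gauss(0.0, 0.5)

def run_evolution(population, end_error=0.01, max_rounds=100_000):
    """Central component: fetch two opponents, compare their error
    indices, pass the winner's model to the loser, mutate the loser."""
    for _ in range(max_rounds):
        a, b = random.sample(population, 2)            # fetch an opponent
        winner, loser = (a, b) if a.error() <= b.error() else (b, a)
        if winner.error() < end_error:                 # end criterion
            return winner
        loser.receive(winner.clone_model())            # model passing
        loser.mutate()                                 # mutation
    return min(population, key=lambda m: m.error())

random.seed(1)
best = run_evolution([Member(target=3.7) for _ in range(8)])
```

Because losers always inherit a mutated copy of a better model, the population's best error index can only improve over time until the end criterion is met.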
3.2 Model Representation
Each thread stores a representation of its current model in memory. This model is represented by an object tree
that has method objects as nodes and fixed value or variable objects as leaves.
Method objects represent a mathematical operator. Mathematical operators can be binary operators like
addition, subtraction, multiplication, division and power, but also unary operators like sine or cosine. They
are nodes in the tree, as they require parameters: either further method objects, or fixed value or variable
objects. Fixed value objects are leaves that are set to a fixed numeric value. Variable objects are leaves that,
at evaluation time, are set to the parameter values of the measured performance data used as the basis for
modeling. They represent the variables in the model.
Figure 2 shows the representation of an exemplary model p(X1, X2) = X1/a + X2 to illustrate the structure
of the model as an object tree.

        add
       /   \
     div    X2
    /   \
  X1     a

Fig. 2. Example of the representation of a mathematical model as an object tree.
The nodes add and div are method objects, X1 and X2 are leaves and variable objects, and a is also a leaf
and a fixed value object (a represents a number in this example).
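Under our own naming (not the authors' class names), such an object tree could be sketched in Python as follows; the example rebuilds the model of Fig. 2 with a = 2.0:

```python
import math
import operator

class Method:
    """Node: a mathematical operator applied to child objects."""
    OPS = {"add": operator.add, "sub": operator.sub,
           "mul": operator.mul, "div": operator.truediv,
           "pow": operator.pow, "sin": math.sin, "cos": math.cos}

    def __init__(self, name, *children):
        self.name, self.children = name, children

    def evaluate(self, params):
        args = [c.evaluate(params) for c in self.children]
        return self.OPS[self.name](*args)

class Fixed:
    """Leaf: a fixed numeric value."""
    def __init__(self, value):
        self.value = value

    def evaluate(self, params):
        return self.value

class Variable:
    """Leaf: bound to a measured performance parameter at evaluation time."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, params):
        return params[self.name]

# The model of Fig. 2: p(X1, X2) = X1 / a + X2, with a = 2.0
model = Method("add", Method("div", Variable("X1"), Fixed(2.0)), Variable("X2"))
print(model.evaluate({"X1": 8.0, "X2": 3.0}))  # → 7.0
```

Evaluation is a simple recursive traversal: each method object evaluates its children first and then applies its operator, so any tree of these three object types yields a computable model.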
3.3 Fitness Function
During the competition the models of the two opponent threads are evaluated. For every set of available
performance data the performance parameters (like number of parallel users, size or type of request) are set
for the variable objects, and the modeled response time is calculated.
Based on the relative deviation for every given performance data entry, an error index sErr is calculated as
the sum of the relative deviations between every available measured value and the corresponding value
calculated by the mathematical model, divided by the number of available performance data entries.
Formula 1 shows how sErr is calculated.
sErr = (1/n) * Σ_{i=1}^{n} |r_measured,i − r_modeled,i| / r_measured,i .    (1)
In this formula, n is the number of available performance data entries, r_measured,i the measured response time
for data entry i, and r_modeled,i the modeled response time for that data entry. The thread with the lower error
index sErr wins the competition and passes its model to the loser thread.
We chose this fitness function because it describes the distance between the model and the measured values
accurately, and it can be calculated efficiently. The drawback of this function is that it allows large deviations
at some points when others are modeled very close to the measured values. The evaluation of other fitness
functions will be part of the algorithm optimization process.
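Formula 1 translates directly into code. In this sketch (function and parameter names are our own), the model is any callable mapping a set of performance parameters to a modeled response time:

```python
def error_index(data, model):
    """Formula 1: the mean relative deviation between measured and
    modeled response times. `data` is a list of (params, measured)
    pairs; `model` maps params to a modeled response time."""
    deviations = [abs(measured - model(params)) / measured
                  for params, measured in data]
    return sum(deviations) / len(data)

# A model that is 10% off on both entries yields an error index of 0.1.
data = [({"users": 10}, 2.0), ({"users": 20}, 4.0)]
print(round(error_index(data, lambda p: 0.22 * p["users"]), 6))  # → 0.1
```

The example also illustrates the drawback noted above: one entry modeled exactly and one 20% off would produce the same error index of 0.1.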
3.4 Model Passing and Mutation
Besides competition, inheritance is the second important factor of evolution. After a fitter thread has
been identified by competition, it has to pass its model to the loser thread in a way that the chance for
keeping the positive characteristics of the model is high, but at the same time there is a considerable chance
for optimizing the model by mutation.
Passing a model from the winner thread to the loser thread is done by deep cloning the object tree
representing the model and replacing the model stored by the loser. After the model is passed, the loser
thread mutates its new model. In our approach, with a given probability either a fixed value object, a
variable object or a method object is chosen for mutation. If a fixed value object is selected, a random value
is added to its value. The selection of a variable object results in the allocation of a random performance
parameter to this object. If a method object is chosen, the mathematical operator is replaced by a
randomly selected one with the same number of parameters; in this case the parameters of the method
stay the same.
The optimal probabilities for mutating a fixed value, variable or method object will be analyzed by
experiments and future case studies.
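The three mutation cases can be sketched as follows. The dict-based node records and the parameter names are our own simplification of the object tree described in Section 3.2:

```python
import random

UNARY_OPS = ["sin", "cos"]
BINARY_OPS = ["add", "sub", "mul", "div", "pow"]
PARAMETERS = ["users", "request_size", "request_type"]  # illustrative names

def mutate(node):
    """Apply the mutation case matching the node's kind."""
    if node["kind"] == "fixed":
        node["value"] += random.uniform(-1.0, 1.0)   # add a random value
    elif node["kind"] == "variable":
        node["name"] = random.choice(PARAMETERS)     # bind a random parameter
    else:  # method: swap the operator for one of the same arity
        ops = BINARY_OPS if len(node["children"]) == 2 else UNARY_OPS
        node["op"] = random.choice(ops)              # the children stay the same

node = {"kind": "method", "op": "add",
        "children": [{"kind": "fixed", "value": 1.0},
                     {"kind": "variable", "name": "users"}]}
mutate(node)  # the operator may change, but the subtree keeps its two children
```

Restricting operator swaps to the same arity keeps the tree structurally valid, so a mutated model can always be evaluated.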
By performing continuous optimization, there is a risk of getting stuck in a local optimum (Rocha and
Neves, 1999). We mitigate this problem by "rebirthing" every thread that has lost 10,000 competitions in a
row, resulting in a re-initialization of that thread's model. The threads themselves keep track of the
number of failed competitions. When the limit is reached, they trigger re-initialization, drop the existing
model, and create a new one from scratch. Even when optimization is advanced, rebirth opens a way to
leave local optima in search of the global one.
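The rebirth rule itself needs little more than a loss counter per thread; a minimal sketch (class and method names are ours, not the authors'):

```python
class RebirthTracker:
    """Tracks lost competitions; after LIMIT consecutive losses the
    thread drops its model and creates a new one from scratch."""
    LIMIT = 10_000

    def __init__(self, new_random_model):
        self.new_random_model = new_random_model  # factory for fresh models
        self.model = new_random_model()
        self.losses_in_a_row = 0

    def report(self, won):
        if won:
            self.losses_in_a_row = 0
            return
        self.losses_in_a_row += 1
        if self.losses_in_a_row >= self.LIMIT:    # rebirth
            self.model = self.new_random_model()
            self.losses_in_a_row = 0
```

Resetting the counter on every win ensures that only a thread that is persistently uncompetitive, and thus likely trapped near a local optimum, is re-initialized.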
3.5 End Criterion
As the end criterion, an error index limit has to be defined. The error index is a significant indicator of a
model’s distance to the measured data, and as it is calculated in every competition, the end criterion is
checked without additional effort.
Choosing the error index as the end criterion, however, involves the aforementioned disadvantage of
allowing large deviations for some data entries if the majority of the data entries are modeled very exactly.
This might lead to unacceptable prediction errors if the measured performance data is not equally distributed,
causing the algorithm to stop on a model that is not usable. This disadvantage, however, can be resolved by
improving the fitness function.
4. PRELIMINARY RESULTS
A prototype of the evolutionary algorithm has been implemented, and applied to the measured data of an
SAP benchmark as sample data (Jehle, 2009). A subset of the measured data was used for modeling, while
the rest served for validating the model.
First results show that the approach delivers usable results (average error of less than 3%) when the data
used for modeling is equally distributed, while the error becomes large if the data is unbalanced. Future
improvements, such as weighting the input data, will be necessary to remove the requirement of equally
distributed input data.
Another important factor for the efficiency of the presented approach is the configuration of the evolutionary
algorithm, especially the mutation. First experiments we conducted on the SAP benchmark data showed that
selecting a fixed value or variable object in 95% of all mutations, and a method object in the remaining 5% of
cases, results in fast evolution with reliable convergence to an optimum.
For the relatively small set of input data with around 200 entries, the prototype returned a usable model
after around two to three minutes when hosted on two Intel Core2 Duo machines (1.6 and 3 GHz, both with
4 GB RAM). The scalability of the evolutionary algorithm itself will have to be tested in future case studies.
5. CONCLUSION
The prototypical implementation of the evolutionary algorithm showed the feasibility of the depicted
approach. Further, the first case study pointed out the efficiency, but also difficulties of the evolutionary
algorithm.
As next steps, further, more complex case studies will be performed. For this, we will develop an
interface for integrating the generated models into an LQN simulation of an SAP system (Gradl et al., 2009).
Executing the simulation first with the traditional black box modeling and afterwards with the generated
models will deliver comparable results.
In parallel, the prototype will be extended and optimized. A literature review on the vehicle routing
problem, a field where evolutionary algorithms have been applied for many years, revealed the complexity of
possible configurations and modifications of the algorithm. Using the LQN simulation, we will analyze
different configurations and develop an optimal algorithm for the field of ERP performance simulation.
REFERENCES
Anderson, G. W. and Mißbach, M., 2005. Last-Testing und Performance-Tuning. SAP Press, Bonn, Germany.
Balsamo, S. et al., 2004. Model-based performance prediction in software development: A survey. IEEE Transactions on
Software Engineering, Vol. 30, No. 5, pp. 295-310.
Bögelsack, A. et al. 2008. An Approach to Simulate Enterprise Resource Planning Systems. In: ULTES-NITSCHE, U.,
MOLDT, D. & AUGUSTO, J. C. (eds.) 6th International Workshop on Modelling, Simulation, Verification and
Validation of Enterprise Information Systems, MSVVEIS-2008, In conjunction with ICEIS 2008. Barcelona, Spain:
INSTICC PRESS.
Bögelsack, A. et al. 2010. Performance Overhead of Paravirtualization on an Exemplary ERP System. 12th International
Conference on Enterprise Information Systems. Funchal, Madeira, Portugal.
Gradl, S. et al., 2009. Layered Queuing Networks for Simulating Enterprise Resource Planning Systems. In: MOLDT,
D., AUGUSTO, J. C. & ULTES-NITSCHE, U., eds. 7th International Workshop on Modelling, Simulation,
Verification and Validation of Enterprise Information Systems, MSVVEIS-2009, In conjunction with ICEIS 2009,
May 2009, Milan, Italy. INSTICC PRESS, pp. 85-92.
Gwozdz, P. and Szlachcic, E., 2009. An Adaptive Selection Evolutionary Algorithm for the Capacitated Vehicle Routing
Problem. In: Logistics and Industrial Informatics, 2009. LINDI 2009. 2nd International, 10-12 Sept. 2009. pp.
1-6.
Jehle, H. 2009. Performance-Messung eines Portalsystems in virtualisierter Umgebung am Fallbeispiel SAP. CVLBA
Workshop 2009. 3. Workshop des Centers for Very Large Business Applications (CVLBA). Magdeburg, Deutschland:
Arndt, H.-K.; Krcmar, H.
Kraft, S. et al. 2009. Estimating service resource consumption from response time measurements. Proceedings of the
Fourth International ICST Conference on Performance Evaluation Methodologies and Tools. Pisa, Italy: ICST
(Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering).
Noblet, C. M. H. et al., 2004. Enabling UMTS end-to-end performance analysis. In: 3G Mobile Communication
Technologies, 2004. 3G 2004. Fifth IEE International Conference on, 2004. pp. 29-33.
Pllana, S. et al., 2008. Hybrid Performance Modeling and Prediction of Large-Scale Computing Systems. In: Complex,
Intelligent and Software Intensive Systems, 2008. CISIS 2008. International Conference on, 4-7 March 2008.
pp. 132-138.
Rocha, M. and Neves, J. 1999. Preventing premature convergence to local optima in genetic algorithms via random
offspring generation. Proceedings of the 12th international conference on Industrial and engineering applications of
artificial intelligence and expert systems: multiple approaches to intelligent systems. Cairo, Egypt: Springer-Verlag
New York, Inc.
Rolia, J. et al. 2009. Predictive modelling of SAP ERP applications: challenges and solutions. Proceedings of the Fourth
International ICST Conference on Performance Evaluation Methodologies and Tools. Pisa, Italy: ICST (Institute for
Computer Sciences, Social-Informatics and Telecommunications Engineering).
Schneider, T., 2006. SAP Performance Optimization Guide. Galileo Press, Bonn, Boston.
Woodside, M. 2002. Tutorial Introduction to Layered Modeling of Software Performance. Available:
http://www.sce.carleton.ca/rads/lqns/lqn-documentation/tutorialg.pdf.
Zitzler, E. and Thiele, L. 1998. Multiobjective Optimization Using Evolutionary Algorithms - A Comparative Case
Study. In: EIBEN, A. E., BÄCK, T., SCHOENAUER, M. & SCHWEFEL, H.-P. (eds.) Parallel Problem Solving
from Nature - PPSN V, 5th International Conference, Amsterdam, The Netherlands, September 27-30, 1998,
Proceedings. Springer.
Zitzler, E. and Thiele, L., 1999. Multiobjective Evolutionary Algorithms: A Comparative Case Study and the Strength
Pareto Approach. IEEE Transactions on Evolutionary Computation, Vol. 3, No. 4, pp. 257 - 271.