Case for Support - University of Warwick

Case for Support
Title: Automating Simulation Output Analysis (AutoSimOA):
Selection of Warm-up, Replications and Run-Length
Part 1: Previous Research Track Record
Professor Stewart Robinson (Operational Research and Information Systems Group, Warwick Business
School, Coventry) has spent nearly 20 years working in the field of computer simulation, initially as a consultant.
His research focuses on the practice of simulation [Robinson, 2004] with particular projects investigating:
- The verification and validation of simulation models and the quality of simulation projects [Robinson and Pidd, 1998; Robinson, 1999; Robinson, 2002a].
- Modes of simulation practice: how simulation modellers approach simulation studies [Robinson, 2001; Robinson, 2002b].
- Modelling human decision-making with simulation: using simulation to elicit knowledge from decision-makers and using artificial intelligence to represent their decision-making strategies in a simulation [Edwards et al, 2004; Robinson et al, 2005]. This includes two projects funded through the EPSRC Innovative Manufacturing Initiative and Innovative Manufacturing Research Centres.
Professor Robinson is co-chair of the Operational Research Society Simulation Study Group and founder and co-chair of the biennial Operational Research Society Simulation Workshop. These have become a focus for the discrete-event simulation community in the UK and further afield. At Warwick he leads the simulation research group, which currently consists of 4 members of staff and 4 PhD students.
Professor Robinson has made specific contributions to the field of simulation output analysis:
1. Development of a heuristic technique for determining the run-length of a non-terminating steady-state simulation [Robinson, 1995].
2. Application of a heuristic method for detecting shifts in the mean of a time-series of simulation output. This enables a modeller to identify shifts in the steady-state of a system caused by changes in the input data or random shifts in model performance [Robinson et al, 2002].
3. Development of a method for detecting initialisation bias and selecting a warm-up period for a simulation. The method is based on the principles of statistical process control [Robinson, 2002c].
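To give a flavour of contribution 3, a statistical-process-control style warm-up test can be sketched as follows. This is an illustration of the general idea only, not the published method [Robinson, 2002c]; the batching scheme, the use of the second half of the run to set control limits, and all parameter values are assumptions made for the sketch:

```python
import statistics

def spc_warmup(series, batch_size=10, sigma=3):
    """Illustrative SPC-style warm-up test: batch the output, set
    control limits from the second half of the run (assumed to be
    in steady state), and take the warm-up period as ending at the
    last batch that falls outside those limits."""
    # Batch means reduce noise and autocorrelation in the raw output.
    batches = [sum(series[i:i + batch_size]) / batch_size
               for i in range(0, len(series) - batch_size + 1, batch_size)]
    stable = batches[len(batches) // 2:]      # assumed in-control region
    centre = statistics.mean(stable)
    limit = sigma * statistics.stdev(stable)  # control-limit half-width
    out = [i for i, b in enumerate(batches) if abs(b - centre) > limit]
    last_out = max(out) + 1 if out else 0     # first in-control batch
    return last_out * batch_size              # warm-up in observations
```

On a series with an obvious initial transient, the batches covering the transient fall outside the control limits and the remainder fall inside, so the returned warm-up marks the end of the transient.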
In the past year, three masters projects have been supervised, performing preliminary investigations into the automation of simulation warm-up, replications and run-length decisions respectively. These projects demonstrated the possibility of automating simulation output analysis, although they also showed the need for testing and further development of the output analysis methods used.
Professor Ruth Davies (Operational Research and Information Systems Group, Warwick Business School,
Coventry) has worked with simulation for many years, particularly with applications in the health service. She
has recently moved from the Management School at Southampton University and is head of the Operational
Research and Information Systems group in Warwick Business School.
Discrete-Event Simulation and POST: Professor Davies developed a discrete-event simulation approach
[Davies et al, 1993] providing a powerful and highly flexible simulation shell, enabling modelled patients to
participate in simultaneous activities and queues and to be retrieved efficiently. It was subsequently called POST
(patient-oriented simulation technique) which she used in a number of simulation projects funded by the
Department of Health, to help determine policy and set budgets. The projects included a projection of the
resource requirements of end-stage renal failure [Roderick et al, 2004], the evaluation of potential screening
programmes for diabetic retinopathy [Davies et al, 2002a], and the determination of the cost-effectiveness of
screening for helicobacter pylori [Davies et al, 2002b]. The most recent funded project was a simulation of the
treatment and prevention of heart disease [Cooper et al, 2002]. Currently, together with her PhD student, Suchi Patel, she is investigating the allocation, matching and survival of patients in liver transplantation.
Modelling Methods: Professor Davies has looked at methods of reducing the simulation run times, including
queue management and the use of common random numbers [Davies and Cooper, 2002]. She has published
several papers which evaluate various modelling methods for the purpose of health services policy making, the
most recent one with Roderick and Raftery [Davies et al, 2003]. In October 2004 she was privileged to be a participant in a CHEBS (Centre for Health Economics and Bayesian Statistics) focus fortnight on "Patient simulation modelling in health care".
Dr Mark Elder FORS (CEO SIMUL8 Corporation, Glasgow) has been working in the simulation field for
more than 25 years. He was part of the pioneering team in the UK based academic/industry collaboration that
created the first visual interactive simulation methods [Fiddy et al, 1982]. In the 1990s Dr Elder spent some years teaching and researching simulation at Strathclyde University before returning to industry to found SIMUL8 Corporation. SIMUL8 now has bases in both the US and UK and provides simulation technology to some of the world's largest companies. Its products range from low-level development tools right up to very specific, high-value simulation-based solutions.
A key distinguishing feature of SIMUL8 Corporation's work is that all its products are aimed at use by non-statistically-trained users. These users are typically seeking to improve the processes in which they work. Their work is therefore important to the efficiency of UK PLC, but they do not have the necessary statistical knowledge to ensure that it is valid. For this reason SIMUL8 Corporation is very keen to work with Professor Robinson and Professor Davies on this project.
References
Cooper, K., Davies, R., Roderick, P. and Chase, D. (2002). The Development of a Simulation for the Treatment
of Coronary Artery Disease. Health Care Management Science, 5, pp. 259-267.
Davies, R. and Cooper, K. (2002). Reducing the Computation Time of Discrete Event Simulation Models of
Population Flows. Proceedings of the OR Society Simulation Workshop, 20-21 March 2002, The Operational
Research Society, pp. 63-68.
Davies, R., Cooper, K., Roderick, P. and Raftery J. (2003). The Evaluation of Disease Prevention and Treatment
using Simulation Models. European Journal of OR., 150 (1), pp. 53-66.
Davies, R., O'Keefe, R. and Davies, H. (1993). Simplifying the Modelling of Multiple Queuing, Multiple Activities and Interruptions, a Low Level Approach. ACM Transactions on Modeling and Computer Simulation, 3 (4), pp. 332-346.
Davies, R., Roderick, P., Brailsford, S.C. and Canning, C. (2002a). The Use of Simulation to Evaluate Screening
Policies for Diabetic Retinopathy. Diabetic Medicine, 19 (9), pp. 763-771.
Davies, R., Roderick, P., Crabbe, D. and Raftery J. (2002b). Economic Evaluation of Screening for Helicobacter
Pylori for the Prevention of Peptic Ulcers and Gastric Cancers, using Simulation. Health Care Management
Science, 5, pp. 249-258.
Edwards, J.S., Alifantis, A., Hurrion, R.D., Ladbrook, J., Robinson, S. and Waller, T. (2004). Using a
Simulation Model for Knowledge Elicitation and Knowledge Management. Simulation Modelling Practice
and Theory, 12 (7-8), pp. 527-540.
Fiddy, E., Bright, J.G. and Elder, M.D. (1982). Problem Solving by Pictures. Proceedings of the Institute of
Mechanical Engineers, pp. 125-138.
Robinson, S. (1995). An Heuristic Technique for Selecting the Run-Length of Non-Terminating Steady-State
Simulations. Simulation 65 (3), pp. 170-179.
Robinson, S. (1999). Simulation Verification, Validation and Confidence: A Tutorial. Transactions of the
Society for Computer Simulation International, 16 (2), pp. 63-69.
Robinson, S. (2001). Soft with a Hard Centre: Discrete-Event Simulation in Facilitation. Journal of the
Operational Research Society, 52 (8), pp. 905-915.
Robinson, S. (2002a). General Concepts of Quality for Discrete-Event Simulation. European Journal of
Operational Research, 138 (1), pp. 103-117.
Robinson, S. (2002b). Modes of Simulation Practice: Approaches to Business and Military Simulation.
Simulation Practice and Theory, 10, pp. 513-523.
Robinson, S. (2002c). A Statistical Process Control Approach for Estimating the Warm-up Period. Proceedings
of the 2002 Winter Simulation Conference (Yücesan, E., Chen, C-H., Snowden, S.L. and Charnes, J.M., eds.).
IEEE, Piscataway, NJ, pp. 439-446.
Robinson, S. (2004). Simulation: The Practice of Model Development and Use. Wiley, Chichester, UK.
Robinson, S., Alifantis, T., Edwards, J.S., Ladbrook, J. and Waller, T. (2005). Knowledge Based Improvement:
Simulation and Artificial Intelligence for Identifying and Improving Human Decision-Making in an
Operations System. Journal of the Operational Research Society. Forthcoming.
Robinson, S., Brooks, R.J. and Lewis, C.D. (2002). Detecting Shifts in the Mean of a Simulation Output Process. Journal of the Operational Research Society, 53 (5), pp. 559-573.
Robinson, S. and Pidd, M. (1998). Provider and Customer Expectations of Successful Simulation Projects.
Journal of the Operational Research Society, 49 (3), pp. 200-209.
Roderick P., Davies, R., Jones, C., Feest, T., Smith, S. and Farrington, K. (2004). Simulation Model of Renal
Replacement Therapy: Predicting Future Demand in England. Nephrology, Dialysis and Transplantation, 19,
pp. 692-701.
Part 2: Description of the Research and its Context
Background
Discrete-event simulation has been used for commercial applications since the early 1960s [Tocher, 1963]. In
the early years models were developed using programming languages, but during the 1960s specialist simulation
software started to become available (e.g. GPSS [Schriber, 1974]). The late 1970s saw the introduction of
commercial visual interactive simulation software [Fiddy et al, 1982] making simulation more accessible to the
end customer through a visual and interactive interface. Meanwhile, the visual interactive modelling systems
[Pidd, 2004], first seen in the late 1980s, placed simulation model development into the hands of non-experts by
removing the need for a detailed knowledge of programming code. Today discrete-event simulation is in widespread use, being applied in areas such as manufacturing design and control, service system management
(e.g. call centres), business process design and management, and health applications. Organisations benefit from
improved performance, cost reduction, reduced risk of investment and greater understanding of their operations.
While a welcome development, the prevalence of simulation software and its adoption by non-experts has almost certainly led to a significant problem with the use of the simulation models being developed. The
appropriate analysis of simulation output requires specific skills in statistics that many non-experts do not
possess. Decisions need to be made about initial transient problems, the length of a simulation run, the number
of independent replications that need to be performed and the selection of scenarios [Law and Kelton, 2000;
Robinson, 2004]. Appropriate methods also need to be adopted for reporting, comparing and ranking results.
The majority of simulation packages only provide guidance over the selection of scenarios through simulation
'optimisers' [Law and McComas, 2002]. Other decisions are left to the user with little or no help from the
software. As a result, it is likely that many simulation models are being used poorly. Indeed, Hollocks [2001] in
a survey of simulation users provides evidence to support the view that simulations are not well used. The
consequences are that incorrect conclusions might be drawn, at best causing organisations to forfeit the benefits
that could be obtained and at worst leading to significant losses with decisions being made based on faulty
information.
Alongside developments in simulation software and simulation practice, theoretical developments in the field of
simulation output analysis have continued. Many of these developments are reported at the annual Winter
Simulation Conference, which has a stream dedicated to the subject [e.g. Chick et al, 2003]. The focus of the
work reported, however, is largely on theoretical developments rather than practical application. For instance, a
survey of research into the initial transient problem and methods for selecting a warm-up period found some 26
methods [Robinson, 2002]. None of the methods, with the possible exception of Welch's method [Welch, 1983], appears to be in common use.
Three problems seem to inhibit the use of output analysis methods:
- Most methods have been subject to only limited testing, giving little certainty as to their generality and effectiveness.
- Many of the methods require a detailed knowledge of statistics and so are difficult to use, especially for non-expert simulation users.
- Simulation software does not generally provide implementations of the methods.
One solution to these problems is to implement an automated output analysis procedure in the simulation
software. This would overcome the problem of the need for statistical skills. However, before this is possible
more rigorous testing of the output analysis methods is required so the most effective methods can be selected
for implementation. Automation might be full, giving the user the 'answer', or partial, providing the user with guidance on interpretation. To achieve this, candidate methods may need
some refinement to assure their effectiveness across a wide range of applications. In other cases, the need to
automate a method will require some revisions to that method. For instance, Welch's warm-up period approach
requires a smooth and flat line for a moving average. At present this is left to a user's interpretation. Measures
of smoothness and flatness could be added to the procedure in order to automate the approach.
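As an illustration of how the user's visual judgement might be replaced, the sketch below applies Welch's moving-average procedure and substitutes an explicit flatness test for the usual inspection of the plot. The tolerance band and the particular flatness measure are assumptions made for this sketch, not part of Welch's method:

```python
import numpy as np

def welch_warmup(series, window=5, flatness_tol=0.01):
    """Hypothetical automation of Welch's method.

    series: 2-D array, shape (replications, observations).
    Returns an index into the original series, or None if no flat
    region is found (i.e. more output data are needed)."""
    # Step 1 of Welch's method: mean across replications at each point.
    mean_series = series.mean(axis=0)
    # Step 2: moving average over a window of 2*window + 1 points.
    kernel = np.ones(2 * window + 1) / (2 * window + 1)
    smoothed = np.convolve(mean_series, kernel, mode="valid")
    # Automation step (our assumption, replacing visual judgement):
    # the warm-up ends at the first index from which all later smoothed
    # values stay within a relative tolerance band around the long-run
    # mean, estimated from the second half of the smoothed series.
    target = smoothed[len(smoothed) // 2:].mean()
    band = flatness_tol * abs(target)
    for i in range(len(smoothed)):
        if np.all(np.abs(smoothed[i:] - target) <= band):
            return i + window   # map back to the original series index
    return None
```

On output with an exponentially decaying initial bias, the routine returns an index close to where the bias has decayed into the tolerance band, which is the kind of behaviour an automated analyser would need to exhibit.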
Programme and Methodology
Objectives
The purpose of this research is to explore the potential to automate the analysis of simulation output. The work
focuses on three specific areas: selecting a warm-up period, determining the number of replications and selecting
an appropriate run-length. The aim is to improve the use of simulation models, particularly by non-experts, by
ensuring that they obtain accurate estimates of model performance. It is envisaged that experts will also benefit from the work, which will provide them with rapid access to the selected output analysis methods.
The specific objectives of the project are:
- To determine the most appropriate methods for automating simulation output analysis
- To determine the effectiveness of the analysis methods
- To revise the methods where necessary in order to improve their effectiveness and capacity for automation
- To propose a procedure for automated output analysis of warm-up, replications and run-length
As such the research focuses on automating the analysis of output from a single scenario. Automating the
analysis of the output from multiple scenarios is outside the scope of this project.
Figure 1: Automated Simulation Output Analyser. (Flowchart: the simulation model produces output data, which the analyser subjects to warm-up analysis and then, depending on whether replications or a long run is to be used, replications analysis or run-length analysis; if no recommendation is possible, more output data are obtained from the model, otherwise a recommendation is made.)
Methodology
Figure 1 outlines the nature of the automated analysis procedure. The 'analyser' consists of three main
components: warm-up analysis, replications analysis and run-length analysis. The choice of the latter two
depends on the nature of the model and the requirements of the user for a long run or multiple replications. A
simulation model is developed using a commercial simulation software package such as SIMUL8. The analyser
automatically runs and takes output data from the simulation model. It then performs an analysis of the data with
a view to recommending a suitable warm-up period, number of replications and run-length. If insufficient data
are available to make a recommendation (e.g. the model has not reached a steady-state) more model runs are
performed until sufficient data are available.
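The control loop just described can be sketched in outline as follows. All four callables are hypothetical place-holders for components the project would develop or interface with (they are not part of SIMUL8 or any existing package), and the doubling rule for obtaining more data is an arbitrary assumption:

```python
def auto_analyse(run_model, analyse_warmup, analyse_runlength,
                 initial_obs=1000, max_obs=100_000):
    """Sketch of the Figure 1 control loop.

    run_model(n)             -> list of n output observations
    analyse_warmup(data)     -> warm-up length, or None if undecidable
    analyse_runlength(data)  -> run-length, or None if more data needed
    """
    n = initial_obs
    while n <= max_obs:
        data = run_model(n)
        warmup = analyse_warmup(data)
        if warmup is not None:
            # Analyse only the post-warm-up (steady-state) observations.
            run_length = analyse_runlength(data[warmup:])
            if run_length is not None:
                return {"warmup": warmup, "run_length": run_length}
        n *= 2   # insufficient data: obtain more output data and retry
    return None  # no recommendation possible within the budget
```

The replications branch of Figure 1 would slot in alongside the run-length call; it is omitted here to keep the loop structure visible.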
The research project focuses on the analysis methods required for use within such an analyser. Many methods
have been proposed for determining the warm-up period [Robinson, 2002]. Similarly, a range of methods exist
for selecting the run-length of a simulation model. Most of these aim to provide a confidence interval estimate
of specific precision using, for instance, the batch-means approach (see Alexopoulos and Seila [1998] for a
useful review), although some heuristic methods exist (e.g. Robinson [1995]). The one area where little investigation is required is the selection of the number of replications, where standard confidence interval methods can be applied. In this case, there may be some benefit in adopting variance reduction approaches,
particularly antithetic variates [Law and Kelton, 2000].
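The standard confidence-interval approach amounts to a simple stopping rule: keep adding replications until the interval half-width falls within a target fraction of the mean. The sketch below assumes a hypothetical hook `run_replication` returning one replication's mean output; the default settings and the use of a normal rather than a t critical value (the more usual choice at small sample sizes) are simplifications for illustration:

```python
from statistics import NormalDist, mean, stdev

def replications_needed(run_replication, precision=0.05,
                        confidence=0.95, min_reps=5, max_reps=1000):
    """Stop when the confidence-interval half-width on the mean output
    is within `precision` (e.g. 5%) of the mean itself."""
    # Normal critical value, e.g. 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    results = [run_replication(i) for i in range(min_reps)]
    while len(results) < max_reps:
        half_width = z * stdev(results) / len(results) ** 0.5
        if half_width <= precision * abs(mean(results)):
            return len(results)          # precision achieved
        results.append(run_replication(len(results)))
    return max_reps                      # budget exhausted
```

Antithetic variates would enter this scheme by pairing replications driven by complementary random-number streams, reducing the variance of the estimated mean and hence the number of replications the rule demands.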
The research project will consist of three stages. In the first stage, the various methods proposed to date for warm-up and run-length selection will be tested. This will involve the use of artificial data sets
such as those proposed by Cash et al [1992] and Goldsman et al [1994]. These have the advantage of being able
to generate quite different output data time-series for which the results are known. Further to this, the methods
will be tested on output data from real models provided by SIMUL8 Corporation and their users. The tests will
determine the effectiveness of the methods with respect to criteria such as their accuracy, simplicity, ease of
implementation, assumptions and generality in their use, and estimation of the parameters. Based on the findings
of these tests, candidate methods for automation will be selected.
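Artificial data sets of the kind used in this first stage can be mimicked by adding a deterministic initial bias to a stationary process, so that the 'right answer' is known in advance. The sketch below is merely in the spirit of the cited data sets; the AR(1) form and all parameter values are illustrative assumptions, not taken from Cash et al [1992] or Goldsman et al [1994]:

```python
import math
import random

def biased_ar1(n, phi=0.8, bias=10.0, decay=0.05, seed=1):
    """Artificial test series with a known transient: a stationary
    AR(1) process plus an exponentially decaying initial bias."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for t in range(n):
        x = phi * x + rng.gauss(0, 1)                    # steady-state part
        series.append(x + bias * math.exp(-decay * t))   # known initial bias
    return series

# The bias falls below 1% of its initial size at t = ln(100)/decay,
# i.e. roughly observation 92 with these parameter values, so a
# warm-up method can be scored against this known truth.
```

Because the transient length is known by construction, any candidate warm-up method can be scored on how closely its estimate matches it, which is exactly the accuracy criterion described above.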
The second stage of the research will involve some development and revision of the candidate methods. It is
envisaged that the need for revision will come from two sources:
- During testing, ideas for improving the methods will be identified. These improvements will be framed in relation to the criteria used for testing a method's effectiveness.
- Because some of the methods require user intervention, for instance the inspection of a time-series, it is not possible to fully automate them. Methods for replacing the user intervention, for example measures of smoothness for a time-series, will be sought.
As part of this stage users of the SIMUL8 software will test the proposed methods using their own models. This
will help to evaluate the simplicity and generality in use of the methods.
In the final stage of the project an automated procedure will be proposed. It is intended that this will be a
revision of the procedure outlined in figure 1, with specific methods identified for warm-up, replications and
run-length. Prototype software that interfaces with SIMUL8 will be developed and tested in order to validate the
automated procedure. Again, the procedure will be tested with users of the SIMUL8 software.
Justification of the Methodology
The methodology provides a basis for identifying output analysis methods that are suitable for automation and
for improving those methods where necessary. It achieves this by:
- Providing a thorough investigation of current output analysis methods for warm-up, replications and run-length; something that does not at present exist.
- Using a set of criteria for investigating and comparing the methods.
- Identifying improvements, particularly in relation to automation, for selected methods.
- Proposing, developing and testing (with real simulation users) a prototype automated warm-up, replications and run-length procedure that works with a commercial simulation software package.
Timeliness and Novelty
In surveys of simulation users both Hlupic [1999] and Hollocks [2001] identify better experimental support as
being much needed. Hollocks notes that ‘simulation is increasingly in the hands of non-specialists, so the quality
of experimentation is at higher risk’. Despite many advances in the simulation field, support for simulation users
during experimentation remains largely underdeveloped. At the same time many methods for warm-up,
replications and run-length have been proposed. These need to be tested more fully to determine which have
greatest potential for helping simulation users. Once appropriate methods have been identified, the capability
and speed of modern hardware and software make automated analysis a realistic possibility, even with a high
demand on computing power.
The novelty of the work lies primarily in addressing the need to test warm-up, replications and run-length methods; in adapting methods where necessary; and in specifying an automated procedure that does not currently exist.
Programme of Work
An outline of the programme of work is provided in the appendix. The key milestones in the project are as
follows:
Milestone                                                                Timescale (months)
Literature review of warm-up, replications and run-length methods *                       3
Development of artificial data sets and collection of simulation models                   2
Testing of warm-up methods                                                                6
Testing of replications methods                                                           2
Testing of run-length methods                                                             6
Development, testing and revision of candidate methods                                    6
Develop and test automated procedure (including prototype software)                       5
Dissemination                                                                             6
Total                                                                                    36
* Largely complete at the time of writing the proposal
The testing, development and revision stages will be performed iteratively.
Responsibilities for the work:
- Project Manager (Dr Mark Elder): management of the project; provision of software (SIMUL8), training and support; provision of real simulation models; access to SIMUL8 users.
- Principal Investigator (Professor Stewart Robinson): management of the academic content of the project and the day-to-day work of the Research Assistant.
- Co-Investigator (Professor Ruth Davies): academic support to the project and the Research Assistant.
- Research Assistant: performing the project work.
The Project Manager and Research Assistant will meet on a regular basis (once a month) to discuss progress on
the project and requirements for continued work. A full team meeting will take place four times a year.
Relevance to Beneficiaries
SIMUL8 Corporation:
- Generation of ideas to improve the experimental process for SIMUL8 users and consultants
- Incorporation of some ideas into the SIMUL8 software as they are generated
- Potential development of commercial software to support the experimental process
Other beneficiaries:
- Simulation users, who need better support for experimentation
- Simulation academics, who will benefit from the results of the testing of the warm-up, replications and run-length methods, and from the development and revision of the methods
- Simulation software vendors, who will obtain ideas on how to implement experimental support within their software
- Business, which will benefit from the improved use of simulation models in decision-making, realising more fully the benefits of simulation
Dissemination and Exploitation
Conferences and Journals:
- National conferences, e.g. the Operational Research Society Simulation Workshop
- International conferences, e.g. EURO, the Winter Simulation Conference
- Journals, e.g. Journal of the Operational Research Society, European Journal of Operational Research, ACM Transactions on Modeling and Computer Simulation
Web Site: reporting on progress of the research, the automated procedure and conclusions of the work. Software
routines developed as part of this work will be made available via this web site.
SIMUL8 Newsletter: an e-newsletter that is circulated to all SIMUL8 users
Software Development: should the project be successful it is envisaged that an analyser will be developed by
SIMUL8 Corporation for their software. This work is not part of the project proposal.
References
Alexopoulos, C. and Seila, A.F. (1998). Output Data Analysis. Handbook of Simulation, (Banks, J., ed.). Wiley,
New York, pp. 225-272.
Cash, C.R., Nelson, B.L., Dippold, D.G., Long, J.M. and Pollard, W.P. (1992). Evaluation of Tests for Initial-Condition Bias. Proceedings of the 1992 Winter Simulation Conference (Swain, J.J., Goldsman, D., Crain, R.C. and Wilson, J.R., eds.). IEEE, Piscataway, NJ, pp. 577-585.
Chick, S., Sanchez, P.J., Ferrin, D. and Morrice, D.J. (2003). Proceedings of the 2003 Winter Simulation Conference. IEEE, Piscataway, NJ.
Fiddy, E., Bright, J.G. and Elder, M.D. (1982). Problem Solving by Pictures. Proceedings of the Institute of
Mechanical Engineers, pp. 125-138.
Goldsman, D., Schruben, L.W. and Swain, J.J. (1994). Tests for Transient Means in Simulated Time Series.
Naval Research Logistics, 41, pp. 171-187.
Hlupic, V. (1999). Discrete-Event Simulation Software: What the Users Want. Simulation, 73 (6), pp. 362-370.
Hollocks, B.W. (2001). Discrete-Event Simulation: An Inquiry into User Practice. Simulation Practice and
Theory, 8, pp. 451-471.
Law, A.M. and Kelton, W.D. (2000). Simulation Modeling and Analysis, 3rd ed. McGraw-Hill, New York.
Law, A.M. and McComas, M.G. (2002). Simulation-Based Optimization. Proceedings of the 2002 Winter
Simulation Conference (Yücesan, E., Chen, C-H., Snowden, S.L. and Charnes, J.M., eds.). IEEE, Piscataway,
NJ, pp. 41-44.
Pidd, M. (2004). Computer Simulation in Management Science, 5th ed. Wiley, Chichester, UK.
Robinson, S. (1995). An Heuristic Technique for Selecting the Run-Length of Non-Terminating Steady-State
Simulations. Simulation 65 (3), pp. 170-179.
Robinson, S. (2002). A Statistical Process Control Approach for Estimating the Warm-up Period. Proceedings of
the 2002 Winter Simulation Conference (Yücesan, E., Chen, C-H., Snowden, S.L. and Charnes, J.M., eds.).
IEEE, Piscataway, NJ, pp. 439-446.
Robinson, S. (2004). Simulation: The Practice of Model Development and Use. Wiley, Chichester, UK.
Schriber, T. (1974). Simulation Using GPSS. Wiley, New York.
Tocher, K.D. (1963). The Art of Simulation. The English Universities Press, London.
Welch, P. (1983). The Statistical Analysis of Simulation Results. The Computer Performance Modeling
Handbook (Lavenberg, S., ed.). Academic Press, New York, pp. 268-328.