CLASSIFICATION OF DISCRETE EVENT SIMULATION MODELS AND OUTPUT: CREATING A SUFFICIENT MODEL SET

Kathryn Hoad
Warwick Business School
The University of Warwick
Coventry, CV4 7AL, UK

Stewart Robinson
Warwick Business School
The University of Warwick
Coventry, CV4 7AL, UK

Ruth Davies
Warwick Business School
The University of Warwick
Coventry, CV4 7AL, UK

ABSTRACT

This paper describes the creation of a representative and sufficient set of models and output data that can be used in discrete event simulation (DES) research. The motivation is to provide researchers with a representative set of models or output upon which to test their research ideas. The identification of certain DES model and output characteristics is described, as is the creation of a classification system for each general type of model and output encountered in 'real life' DES modelling. The processes and decisions involved in setting up this classification system are explained. The classification tables are outlined, including a collection of real and artificial models as examples of some combinations of the chosen characteristics.

1 INTRODUCTION

There is much published literature on output analysis of a single scenario, for example research into warm-up methods and the construction of confidence intervals. Each paper may use one or more artificial models (or possibly a real model) to test or illustrate its research findings. However, there does not seem to be a general set of models and output types in the public domain that sufficiently covers the many different types of possible models and output. The authors of this paper required such a set for use in their own output analysis research. As none was readily available in the literature, we created a representative and sufficient set of models and output data that could be used in discrete event simulation research by the authors and other researchers. A set of artificial data sets has been developed and a range of 'real' simulation models gathered together.

Certain attributes of both the models and the outputs were considered for their possible importance to the performance of output analysis methods. Appropriate categories were selected and the models and outputs classified accordingly. Our aim was to categorise and collect a wide range of real models, artificial models and their associated outputs so that the collection would cover each general type of model and output encountered in 'real life' modelling.

2 CATEGORISING A SUFFICIENT SET OF MODELS / OUTPUT

Model output falls into two main categories or groups: Transient and Steady-State. There is also output with a trend, produced by out-of-control models in which the traffic intensity, ρ, is greater than or equal to one; this can be described as Out-of-Control Trend. Cyclic output is characterised as steady-state with a cycle pattern. Apart from identifying these main categories, nine other characteristics of models and output data sets were chosen to categorise the models/output within these two main groups. These were divided into characteristics of the simulation model itself, characteristics of the output data that can be seen by eye (non-statistical), and statistical characteristics of the output that were determined by statistical analysis of the data sets, as follows:

Model Characteristics
1. Deterministic or stochastic (random) model
2. Significant pre-determined model changes (by time), e.g. arrival patterns
3. Dynamic internal changes (i.e. 'feed-back'), e.g. activating additional resources in response to demand

Output Data Characteristics
Non-statistical:
4. Empty-to-empty pattern
5. Initial transient (warm-up)
6. Out-of-control trend (ρ ≥ 1)
7. Cycle
Statistical:
8. Auto-correlation
9. Statistical distribution

Hoad, Robinson and Davies

3 CREATION AND COLLECTION OF MODELS/OUTPUT DATA SETS

In order to find out whether the chosen classification scheme was adequate, a collection of both artificial and 'real' models and their associated outputs was gathered and categorised by model and/or output characteristics (as set out in Section 2).

3.1 COLLECTION OF ARTIFICIAL MODELS / OUTPUT

A search of existing and cited literature produced 24 artificial models. It was found that authors borrow models from each other, with or without amendments. All the models produce steady-state output, with or without a warm-up period. These models, which have all been recreated for this research, are as follows:

Cash et al. (1992): AR(1); M/M/1; Markov Chain.
Robinson (2007): AR(1); M/M/1.
Goldsman et al. (1994): AR(1); M/M/1.
White, Cobb & Spratt (2000): AR(2).
Ockerman & Goldsman (1997): Random Walk; AR(1); MA(1).
Kelton & Law (1983): M/M/1 (FIFO); M/M/1 (LIFO); M/M/1 (SIRO); M/M/1 (initialised with 10 customers); E4/M/1; M/H2/1; M/M/2; M/M/4; M/M/1/M/1/M/1.
Hsieh et al. (2004): M/M/1/199; M/G/1/199; M/M/1/19; number-in-stock process of a single-item inventory management system.

There are three main methods for creating artificial models and output data sets:

Create simple simulation models where the theoretical value of some attribute is known, e.g. Model: M/M/1; Attribute: mean waiting time.
Create simple simulation models where the value of some attribute is estimated but the model characteristics can be controlled, e.g. Model: single-item inventory management system; Attribute: number-in-stock.
Create data sets from known equations, which closely resemble real model output, with a known value for some specific attribute, e.g. AR(1) with Normal(0,1) errors.

3.2 COLLECTION OF REAL MODELS

Real models are defined as discrete event simulation models of real existing systems, created in "real circumstances" (i.e. in business, academia, etc.).
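As a brief illustration of the first and third construction methods listed in Section 3.1, the following Python sketch generates an AR(1) series with Normal(0,1) errors and the queueing delays of an M/M/1 queue, together with the theoretical mean delay against which the simulated output can be compared. The function names, the parameters (phi, arrival_rate, service_rate, seed) and the use of Lindley's recursion for the queue are assumptions made for this illustration only; they are not taken from the models recreated for this research.

```python
import random


def ar1_series(n, phi=0.5, seed=None):
    """Generate n observations of X[t] = phi * X[t-1] + e[t], e[t] ~ Normal(0, 1).

    The series starts from 0; its steady-state mean is 0 when |phi| < 1.
    """
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def mm1_waiting_times(n, arrival_rate, service_rate, seed=None):
    """Delays in queue of the first n customers of an M/M/1 FIFO queue.

    Assumption: Lindley's recursion W[k+1] = max(0, W[k] + S[k] - A[k+1]),
    starting from an empty system, so the first customer never waits.
    """
    rng = random.Random(seed)
    w, out = 0.0, []
    for _ in range(n):
        out.append(w)
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        w = max(0.0, w + service - interarrival)
    return out


def mm1_mean_wait(arrival_rate, service_rate):
    """Theoretical steady-state mean delay in queue for M/M/1 (needs rho < 1)."""
    rho = arrival_rate / service_rate  # traffic intensity
    if rho >= 1.0:
        raise ValueError("traffic intensity rho must be below 1")
    return rho / (service_rate - arrival_rate)


if __name__ == "__main__":
    series = ar1_series(1000, phi=0.8, seed=1)
    waits = mm1_waiting_times(1000, arrival_rate=0.8, service_rate=1.0, seed=1)
    print(sum(series) / len(series))             # sample mean of the AR(1) output
    print(sum(waits) / len(waits))               # simulated mean delay
    print(mm1_mean_wait(0.8, 1.0))               # known value for comparison
```

Because the true mean is known in both cases, output of this kind can be used to check whether an output analysis method recovers it. The error raised for ρ ≥ 1 reflects the out-of-control case described in Section 2, for which no steady-state value exists.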
Examples of real models and the output results chosen for analysis:

Model: Call Centre; Production Line; Manufacturing Plant; Fast Food Store; Hospital
Output Result: Percentage of calls answered within 30 seconds; Throughput; Average queuing time; Average number in system

For each model, the output result chosen for analysis was the one deemed most likely to be of interest to a practitioner for that type of model. Where a model came with results collection already programmed, this was used if feasible.

3.3 CLASSIFICATION OF COLLECTED MODELS / OUTPUT

After collection or creation of each model/output, the output data sets were identified as one of four sub-types: Steady-State, Steady-State Cycle, Transient, or Out-of-Control Trend. Each type was statistically analysed as follows:

Steady-State
i. Subtract the mean of each replication from the output data for that replication to create time-series residuals (do this for 3 or more replications).
ii. Test the residuals for autocorrelation using the autocorrelation and partial autocorrelation functions (ACF and PACF).
iii. Test the residuals for normality.

Steady-State Cycle
i. If output is collected per customer/item, use the time the customer/item leaves the system as the x axis instead of the customer/item index number.
ii. Run the model for many cycles in each replication carried out (3 or more cycles).
iii. Take the mean of each cycle to create a new time series (for each replication).
iv. Subtract the mean from this new output data for each replication carried out (3 or more replications).
v. Test the residuals for ACF/PACF and normality (Normal or not Normal).

Transient
i. Test for ACF/PACF on the raw data from each replication carried out (3 or more replications).
ii. Run many replications (1000).
iii. Take the mean of each replication to create a new (non-auto-correlated) data series.
iv. Test what type of statistical distribution this data series follows: is it normal, highly skewed, etc.? Find the 'best' fitting distributions to the data using maximum likelihood estimates of the parameters and the Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests.

Out-of-Control Trend
i. Plot data.

4 CREATING CLASSIFICATION TABLES

Two separate classification tables were drawn up, one showing the model characteristics only and one displaying the output characteristics. Each table was then split in two, producing a table for transient models and a table for steady-state models. This was because these two model types had been analysed slightly differently, as thought appropriate. Each of these four tables contains all combinations of important model and output characteristics that are logically possible. Each table also indicates which combinations have examples of real or artificial models in our collection. Table 1 shows the complete table for the transient model output characteristics. Tables 2-4 show a sample from the other three categories. Full tables can be viewed at the project website <http://www.wbs.ac.uk/go/autosimoa>.

Table 1: Transient Output Data Characteristics

Non-statistical columns: Empty to empty; Warm-up?; ρ≥1
Statistical column: Distribution of replication means

Distribution of replication means    Model exists in collection
Left Skewed                          No
Normal                               Real
Other                                No
Right Skewed                         Real

Left Skewed                          Real
Normal                               Real
Other                                No
Right Skewed                         Real

Left Skewed                          No
Normal                               No
Other                                No
Right Skewed                         No

N/A                                  Real

Entries of the non-statistical columns: No; Yes, None, No, None; Yes, Yes, No, None

Table 2: Steady State Output Data Characteristics (Sample)

Non-Statistical          Statistical                        Model exists in collection
Warm-up    Cycle         Distribution    Auto-correlation
None       Yes           Normal          AR(1)              Real

Table 3: Steady State Model Characteristics (Sample)

Deterministic / Stochastic    Pre-determined model changes    Dynamic model changes    Model exists in collection
Stochastic                    None                            None                     Artificial and Real

Table 4: Transient Model Characteristics (Sample)

Deterministic / Stochastic    Pre-determined model changes    Dynamic model changes    Model exists in collection
Deterministic                 Yes                             None                     No models

5 CONCLUSION

This research has produced a classification of model and output types for the purpose of aiding research into simulation output analysis. A series of artificial and real models have been identified and classified. Three specific issues warrant further attention.

First, are the criteria used to categorise the models/output sufficient? Are there other criteria that should be incorporated but have so far been missed?

Second, there are not yet examples in our model collection for every category. In particular, there are no transient model outputs with a warm-up period, no deterministic transient models and no cycle output with a warm-up period. Is this because these types of models/outputs do not exist, or simply because we have failed to find such models? It is our belief that these model types do exist; for example, Beck (2004) tackles the subject of consistency of warm-up periods in cyclic models.

Finally, we note that the extant artificial models fall within a very limited set of categories. This suggests the need to devise a wider set of artificial models/outputs. It also raises the question of the generality of the tests previously performed on proposed output analysis methods using these models.

In terms of current and future work on this project, a group of artificially created data sets that mimic the types of transient output observed in our collection of models is currently being used, along with some real models from our collection, to test and develop our output analysis algorithms. It is our intention to continue using the classification set in our research and to create artificial data sets for each category combination that is missing an example.

ACKNOWLEDGEMENTS

This work is part of the Automating Simulation Output Analysis (AutoSimOA) project, which is funded by the UK Engineering and Physical Sciences Research Council (EP/D033640/1). The work is being carried out in collaboration with SIMUL8 Corporation, who are also providing sponsorship for the project.

REFERENCES

Beck, A. D. 2004. Consistency of warm up periods for a simulation model that is cyclic in nature. Proceedings of the Simulation Study Group (The OR Society) 2004.
Cash, C. R., B. L. Nelson, D. G. Dippold, J. M. Long, and W. P. Pollard. 1992. Evaluation of tests for initial-condition bias. Proceedings of the Winter Simulation Conference 1992.
Goldsman, D., L. W. Schruben, and J. J. Swain. 1994. Tests for transient means in simulated time series. Naval Research Logistics. Vol. 41. pp. 171-187.
Hsieh, M-H., D. L. Iglehart, and P. W. Glynn. 2004. Empirical performance of bias-reducing estimators for regenerative steady-state simulations. ACM Transactions on Modeling and Computer Simulation. Vol. 14. pp. 325-343.
Kelton, W. D., and A. M. Law. 1983. A new approach for dealing with the startup problem in discrete event simulation. Naval Research Logistics. Vol. 30. pp. 641-658.
Ockerman, D. H., and D. Goldsman. 1997. The impact of transients on simulation variance estimators. Proceedings of the Winter Simulation Conference 1997.
Robinson, S. 2007. A statistical process control approach to selecting a warm-up period for a discrete-event simulation. European Journal of Operational Research. Vol. 176. pp. 332-346.
White, K. P., M. J. Cobb, and S. C. Spratt. 2000. A comparison of five steady-state truncation heuristics for simulation. Proceedings of the Winter Simulation Conference 2000.

AUTHOR BIOGRAPHIES

KATHRYN A. HOAD is a research fellow in the Operational Research and Information Systems Group at Warwick Business School. She holds a BSc in Mathematics and its Applications from Portsmouth University, and an MSc in Statistics and a PhD in Operational Research from Southampton University. Her e-mail address is <kathryn.hoad@wbs.ac.uk>.

STEWART ROBINSON is a Professor of Operational Research at Warwick Business School. He holds a BSc and PhD in Management Science from Lancaster University. Previously employed in simulation consultancy, he supported the use of simulation in companies throughout Europe and the rest of the world. He is author/co-author of three books on simulation. His research focuses on the practice of simulation model development and use. Key areas of interest are conceptual modelling, model validation, output analysis and modelling human factors in simulation models. His email address is <stewart.robinson@warwick.ac.uk> and his Web address is <www.btinternet.com/~stewart.robinson1/sr.htm>.

RUTH DAVIES is a Professor of Operational Research in Warwick Business School, University of Warwick, and is head of the Operational Research and Information Systems Group. She was previously at the University of Southampton. Her expertise is in modelling health systems, using simulation to describe the interaction between the parts in order to evaluate current and potential future policies. Over the past few years she has run several substantial projects funded by the Department of Health in order to advise on policy on the prevention, treatment and need for resources for coronary heart disease, gastric cancer, end-stage renal failure and diabetes. Her email address is <ruth.davies@wbs.ac.uk>.