SMARTSTAT SIMULATION BASED COST / BENEFIT ESTIMATING

Ozan Talu
Knowledge Based Systems, Inc.
One KBSI Place
1408 University Drive East
College Station, TX 77840
409.260.5274
http://www.kbsi.com
http://www.idef.com
products@kbsi.com

Using SmartStat White Paper

SmartStat White Paper
I) Background:
    1.1 What is simulation?
    1.2 What is simulation good for?
    1.3 How can the results be used for decision-making?
II) Using SmartStat:
    2.1 Opening SmartStat
    2.2 Opening the Workbook to apply SmartStat
    2.3 Defining Statistical Characteristics (Activating Cells)
    2.4 Identifying Target Cells
    2.5 Deleting Statistical Characteristic
    2.6 Running SmartStat Simulation
    2.7 Statistical Output
    2.8 Exiting SmartStat
    2.9 Limitations

© 1999 KBSI. All rights reserved.

SmartStat White Paper

SmartStat is a Monte Carlo simulation tool. The tool allows users to make informed quantitative assessments based on risk factors that may affect decision-making. Both the assessment of risk factors and the output that aids the decision-making process are supported by sound statistical methodologies. The tool runs inside Microsoft Excel and extends it with simulation capabilities.

I) Background:

1.1 What is simulation?

Simulation begins by defining the behavior of a parameter: the possible patterns (scenarios) the parameter may follow and the boundaries within which it may vary. Once the behavior is defined, the parameter is set free within the defined environment, and the scenario is repeated many times. Each repetition of the scenario is called an "iteration". The more iterations are run, the more of the possible scenarios are encountered.

1.2 What is simulation good for?

Repeating a scenario many times gathers important information: how often certain patterns occur and how often maximums and minimums are reached. This information is used to create histograms, which are graphical representations of pattern frequencies. The power of simulation becomes more apparent when other parameters depend on the simulated parameter.
For example, in an Excel spreadsheet, if the first cell contains the value 5, the second cell contains the value 6, and the third cell contains the formula "Cell1+Cell2", then cell 3 depends on both cell 1 and cell 2. Changing cell 1 or cell 2 therefore changes the value of cell 3. Hence, allowing cell 1 to take different values over several iterations causes cell 3 to take a different value in each iteration. The values of the dependent cell can be tracked to determine the scenarios it may follow, given the behavior of its precedent parameters.

1.3 How can the results be used for decision-making?

The results of a simulation reveal the possible values a parameter (dependent or independent) can take, given the behavior of the independent variables. In addition, the results yield the probability distribution of those values. The decision-maker thus gains a better idea of the most likely outcome, the variability around the mean, and the limits of the quantitative measures.

Suppose the MIS manager of XYZ Company is trying to determine the number of e-mails to expect when the company launches its web site. After evaluating several factors, from the number of site hits to the total number of customers, the manager can define the statistical behavior of these assumptions and run a simulation on the model. The results may show that, on average, there will be 534 incoming e-mails per day, that there is a 100% probability that the number of incoming e-mails will not exceed 745, and that there is a 0% probability that it will be less than 356. The MIS manager can now allocate resources according to these values, making sure that enough capacity is in place to accommodate the maximum level of incoming mail traffic.

II) Using SmartStat:

SmartStat is a very easy application to use.
To run the application:

The first thing the user needs to do after creating a spreadsheet is to assign statistical characteristics to independent cells. Independent cells are those that contain a numerical value rather than a formula. The user defines the statistical characteristic by assigning a distribution type (normal, uniform, exponential, triangular, or discrete) to the independent cell. Each distribution type requires certain inputs from the user, which establish the basis for defining the possible patterns each cell can take.

As a second step, the user identifies target cells. Target cells are the dependent cells that the user wants to track. Target cell values change as a result of changes in the independent cells.

Once the target cells are identified, the user can run the simulation for a chosen number of iterations. SmartStat generates random numbers that follow the user-defined statistical characteristics and, at the same time, tracks the changes in the values of the selected target cells.

Finally, the application generates a report with statistical information on the target cells and the independent cells. The report includes a histogram chart, confidence intervals, average (mean), median, standard deviation, minimum value, maximum value, range, skewness, and kurtosis for all the parameters selected.

2.1 Opening SmartStat

To open SmartStat, double-click its icon, which resides in the folder where the application was installed. This automatically opens Microsoft Excel. SmartStat is written in Visual Basic within Microsoft Excel.
When the application starts in Excel, the user is asked whether the Visual Basic code should run and is prompted to click either "Disable Macros" or "Enable Macros". For the application to run properly, macros must be enabled.

The introduction window appears when the application starts. Clicking the ( ? ) button in the upper right corner of the window displays operating details of the tool. Clicking anywhere else in the window lets the user proceed to SmartStat.

The simulation toolbar appears in Excel with the following buttons:

    Open Workbook
    Activate Cell
    Set Target Cell
    Delete Statistical Character
    Run Simulation
    Help
    Exit out of SmartStat

2.2 Opening the Workbook to apply SmartStat

The SmartStat application contains the engine that defines statistical characteristics, runs the simulation, and creates the statistical output. This engine can be applied to any Microsoft Excel document.

[Figure: Excel Workbook and SmartStat]

SmartStat and the spreadsheet can be opened in any order. When SmartStat is opened, the application first guides the user to open a workbook, whether or not a workbook is already open. The user can click the “Cancel” button if the desired workbook is already open.

2.3 Defining Statistical Characteristics (Activating Cells)

Once the desired workbook and SmartStat are open, the user can begin assigning statistical characteristics to the necessary parameters.

Example: The Shot Peening Cost Analysis Model depicted below helps to calculate the annual cost of the shot peening process by manipulating the variables that have a direct or indirect impact on it.

In a standard spreadsheet environment, a cell can contain only one value. This becomes a constraint when one needs to capture all the values a parameter can take and reach conclusions depending on the result. In the above Shot Peening Cost Analysis spreadsheet, "Anticipated Flow Rate" is set to 0.3. In actuality, this number can be any value between 0.3 and 1.5. Anticipated Flow Rate can therefore be described by a uniform distribution between 0.3 and 1.5.

With the support of the SmartStat engine, the user can assign a statistical characteristic to the cell by pressing the Activate Cell button. This brings up the Statistical Analyzer form (see below). The form allows the user to select the appropriate distribution type using the tabs at the top. Once the uniform distribution is selected, the user can define the desired boundaries. In this case, the upper boundary is 1.5 and the lower boundary is 0.3.

The Cell Name box can be used to assign a name to the selected cell for future reference. If the user leaves the Cell Name box blank, the application uses the cell address as the cell name ($J$4 in this case). Once the information is complete, it is submitted by clicking the "Submit" button. When the information is submitted, the cell is automatically highlighted and the statistical characteristic is attached to the cell as a note. The user builds up the cost model by assigning statistical characteristics to as many cells as necessary.

2.4 Identifying Target Cells

After identifying the independent variables and their statistical characteristics, the user can select target cells. The values of these cells change in accordance with their precedent cells. In the above-mentioned example, "Anticipated Peening Flow Rate" is not a target cell, since no other cell affects its value. In all cases, a target cell contains a formula, which indicates that its value depends on at least one other parameter.
The purpose of identifying target cells is to track the changes in their values as the independent cell values change.

Example: In the Shot Peening Cost Analysis spreadsheet, one of the most important outputs is the annual cost of batch peening. The values this parameter can take vary with changes in its precedent parameters, one of which is the Anticipated Flow Rate. To track the changes in the Annual Batch Peening Cost, the user sets cell M2 as the target cell by pressing the Set Target Cell button. This brings up the "Set Target Cell" form, which allows the user to assign a label to the target cell. In this example, the target cell is named “Annual Batch Processing Cost.” When this information is submitted, the referenced cell is highlighted in red (see below). The user can select any number of target cells, as long as the cells contain formulas.

2.5 Deleting Statistical Characteristic

The application allows the user to delete the statistical content of activated cells as well as target cells by pressing the Delete Statistical Character button. This function deletes one cell at a time: when a range of cells is selected, clicking the button deletes only the first cell in the range.

2.6 Running SmartStat Simulation

Once the independent parameters are activated and the target cells are identified, the simulation can begin. The Run Simulation button initiates the simulation. The number of times (iterations) the simulation is repeated is at the user's discretion and can be changed in the "Run Simulation" form. The “Report Name” is also the name of the worksheet on which the output graph is presented. To compare several simulations on a given workbook, a different Report Name can be given for each run, which results in multiple results sheets.
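The run described above can be sketched outside Excel. The following Python fragment is only a minimal illustration of a generic Monte Carlo loop, not SmartStat's Visual Basic code: the function name, the `reports` dictionary, and above all the cost formula are hypothetical stand-ins. Only the uniform range of 0.3 to 1.5 for the flow rate is taken from the Shot Peening example.

```python
import random

def run_simulation(inputs, target, iterations, seed=None):
    """Draw every activated input once per iteration and record the target value."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        # One iteration: sample each independent "cell", then recompute the formula.
        sampled = {name: draw(rng) for name, draw in inputs.items()}
        results.append(target(sampled))
    return results

# Activated cell: uniform between 0.3 and 1.5 (the flow-rate example).
inputs = {"flow_rate": lambda rng: rng.uniform(0.3, 1.5)}

# Hypothetical target formula, for illustration only; the real cost model
# lives in the workbook and is not reproduced here.
def annual_cost(cells):
    return 20000.0 + 30000.0 * cells["flow_rate"]

# Each run is stored under its own report name so that runs can be compared.
reports = {}
reports["Run A"] = run_simulation(inputs, annual_cost, iterations=5000, seed=1)
reports["Run B"] = run_simulation(inputs, annual_cost, iterations=500, seed=1)
```

Keeping one entry per report name mirrors the idea of one results sheet per run; the number of iterations is simply a parameter of the call.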
In addition to generating graphical reports, the user may choose to generate a summary table report at a selected location. The Summary Table Report contains the important parameters of each active and target cell. It enables the user to reference the results of the simulation in further calculations. For example, the user can build a formula that contains the calculated average of an activated cell.

Example: In the Shot Peening Cost Analysis example, the simulation is run 5,000 times.

The “Run Simulation” form also contains the “Expanded Report” option, which causes the behavior of the independent parameters to be reported in the output.

When the “Run” button is clicked, the SmartStat engine starts assigning random numbers to the activated independent variables in line with their assigned statistical characteristics. Meanwhile, the application tracks the behavior of the target cells. The simulation may take several minutes, depending on the number of iterations, the number of independent variables and targets selected, and the speed of the CPU.

2.7 Statistical Output

After the simulation has run, the application creates a "Statistical Results" sheet. The Statistical Results sheet includes the histogram charts for the target cells. When the expanded report is enabled, the sheet also contains histogram charts for the independent variables. Besides the charts, the output includes confidence intervals and statistical detail for each independent cell and target parameter. The statistical detail consists of Average, Median, Standard Deviation, Sample Variance, Kurtosis, Skewness, Range, Minimum Value, and Maximum Value.
Statistical Detail:

    Average: The arithmetic mean of all the values the parameter recorded over the specified number of iterations.
    Median: The middle value of the set of values the parameter took over the specified number of iterations; half the values are greater than the median, and half are less.
    Standard Deviation: An indicator of how widely the values are spread around the arithmetic mean.
    Sample Variance: The Standard Deviation squared.
    Kurtosis: A measure of deviation from a normal distribution. It indicates how flat or peaked a distribution is compared to a normal distribution curve. Positive kurtosis indicates a relatively peaked distribution, while negative kurtosis indicates a flatter distribution.
    Skewness: A measure of the asymmetry of a distribution around its mean. Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values. Negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.

Example: Running the Shot Peening Cost Analysis simulation 5,000 times generates the statistical information below.

[Figure: Statistical Results report, with callouts for Date and Time Created, Initial Value of the Parameter, and Number of Iterations Applied]

The report starts with the date and time it was created. It then presents the statistical information for the target parameters. The histogram of each selected target parameter carries the label of the target cell. In the example, Annual Batch Peening Cost is the first (and only) target cell selected. The initial value of the target is shown at the top left corner of the chart.
The number of iterations performed is shown to the right of the initial cell value.

As the histogram chart shows, the annual batch peening cost takes values between roughly 26 thousand and 69 thousand. This is confirmed by the minimum and maximum values in the "Statistical Detail," which are 26,506.70 and 69,356.39 respectively. The "Range" parameter is the difference between the maximum and the minimum.

The histogram also shows that the peening cost leans toward the left half of the curve. This is confirmed by the average and median values, 43,775.73 and 42,722.10 respectively, both of which lie in the left portion of the horizontal axis. The negative kurtosis of -0.54 indicates a relatively flat peak compared to a normal distribution. The skewness of 0.41 indicates that the asymmetric tail extends toward the positive side, which can also be seen in the longer tail to the right of the average.

The Confidence Interval is a numeric depiction of the histogram. According to the values in the above chart, there is a 10% probability that the peening cost falls between the minimum value (26,506.70) and 30,402. Similarly, there is a 50% probability that the cost falls between the minimum value and 47,715.13, and a 100% probability that it falls between the minimum value and 69,356.39.

Besides the statistical information for the selected target cells, the report also includes the statistical information for the selected independent cells. For these, the target cell value and the number of iterations are replaced by the distribution details. For the New Part Lot Arrivals independent variable, the chosen distribution was a normal distribution with a mean of 8 and a standard deviation of 2.5. Because the normal distribution was selected, the skewness and kurtosis parameters show minimal deviation from those of a normal distribution.
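The quantities listed under Statistical Detail can be computed from any list of simulated values. The sketch below is an illustration in Python under stated assumptions, not SmartStat's own code. The skewness and kurtosis use the sample formulas of Excel's SKEW and KURT functions, which is an assumption, since the paper does not say which formulas SmartStat applies. The paper also does not state how the confidence interval limits are derived, so nearest-rank percentiles are used here as one simple possibility. The function names and the sample data are hypothetical.

```python
import statistics

def statistical_detail(values):
    """Summary statistics for one tracked cell (needs at least 4 values)."""
    n = len(values)
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    z3 = sum(((x - mean) / stdev) ** 3 for x in values)
    z4 = sum(((x - mean) / stdev) ** 4 for x in values)
    # Sample skewness and excess kurtosis (assumed: Excel SKEW / KURT style).
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "Average": mean,
        "Median": statistics.median(values),
        "Standard Deviation": stdev,
        "Sample Variance": stdev ** 2,
        "Kurtosis": kurt,
        "Skewness": skew,
        "Range": max(values) - min(values),
        "Minimum Value": min(values),
        "Maximum Value": max(values),
    }

def cumulative_limits(values, percents=(10, 50, 100)):
    """Value at or below which each percentage of the results falls (nearest rank)."""
    ordered = sorted(values)
    n = len(ordered)
    limits = {}
    for p in percents:
        rank = max(1, -(-p * n // 100))  # ceil(p * n / 100) in integer arithmetic
        limits[p] = ordered[rank - 1]
    return limits

# Hypothetical data standing in for the recorded values of a target cell.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0, 14.0]
detail = statistical_detail(sample)
limits = cumulative_limits(sample)
```

In this sample the average lies above the median, so the computed skewness is positive, matching the reading of a longer right tail given in the example.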
[Figure: Distribution Information]

2.8 Exiting SmartStat

To exit SmartStat, use the Exit out of SmartStat button. This button deletes all the statistical content created by SmartStat, including the statistical report.

2.9 Limitations

One of the most important things to remember when using SmartStat is that SmartStat is the engine and the Excel spreadsheet is the static environment in which the engine runs. Information flows one way only: from SmartStat to the Excel spreadsheet. Consequently, when the user selects a cell that has a value of 3 and activates it using the Activate Cell button, the SmartStat engine captures that value and stores it. If the user then changes the cell's value from 3 to 5, the SmartStat engine does not recognize the change, because information flows from the engine to the spreadsheet and not back. SmartStat still assumes that the cell's value is 3, regardless of its current value, and generates an error indicating that the initially entered value (3) has changed. Hence, once changes are made in the spreadsheet, they have to be recaptured by SmartStat. In this case, the user has to select the cell and use the Activate Cell button once again to upload the new information.

Similarly, the following changes will affect the operation of SmartStat:

    Changing the location of a previously activated cell (or target cell) by inserting cells.
    Changing the name of a worksheet that contains previously activated cells (or target cells).
    Changing the value of a previously activated independent cell or the formula of a previously selected target cell.

Suggestion: The best way to cope with this limitation is to delete the statistical content using the Delete Statistical Character button before making any of the above changes.
Then, use the Activate Cell button again to reassign the statistical information (or the Set Target Cell button to reselect the target). When a worksheet is to be renamed, the statistical information of all cells in the sheet should first be deleted using the Delete Statistical Character button.
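The one-way information flow described in this section amounts to comparing each cell against a stored snapshot. The following Python sketch illustrates that idea only; it is not SmartStat's Visual Basic implementation. The class, method, and exception names are hypothetical, and a plain dictionary stands in for the worksheet.

```python
class StaleCellError(Exception):
    """Raised when a cell no longer matches the value captured at activation."""

class Engine:
    def __init__(self):
        self.captured = {}  # cell address -> value stored when the cell was activated

    def activate(self, sheet, address):
        # Capture (or recapture) the current value of the cell.
        self.captured[address] = sheet[address]

    def delete(self, address):
        # Remove the statistical content attached to one cell.
        self.captured.pop(address, None)

    def check(self, sheet):
        # The engine never learns about later edits, so it compares snapshots.
        for address, original in self.captured.items():
            if sheet.get(address) != original:
                raise StaleCellError(
                    f"{address}: initially entered value {original!r} has changed"
                )

sheet = {"$J$4": 3}
engine = Engine()
engine.activate(sheet, "$J$4")

sheet["$J$4"] = 5  # edited after activation
try:
    engine.check(sheet)
    stale = False
except StaleCellError:
    stale = True  # the engine still holds the old value 3

engine.activate(sheet, "$J$4")  # reactivating uploads the new value
engine.check(sheet)             # no error is raised now
```

Deleting the statistical content before editing, as suggested above, corresponds to calling `delete` first and `activate` again once the change has been made.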