Simulation
Chapter 9: Discrete Variables

Course Overview: Modeling Techniques
• Deterministic Models
   • Basic Profit Models: Breakeven, Pricing for Max Profit, Crossover
   • Time Series Forecasting: Naive / Exp. Smoothing, Regression (trend), Classical Decomposition (trend + seasonality)
• Probabilistic Models
   • Decision Analysis: Alternatives, States, Payoffs; Decision Criteria; Decision Trees; Bayes Theorem
   • Simulation: Random numbers, Distributions, Discrete Variables, Continuous Variables

What is Simulation?
• A computer-based model used to run experiments on a real system
• The basic idea is to build an experimental device, or simulator, that will “act like” the system of interest … in a quick, cost-effective manner.
• Representation of the operation or features of one process or system through the use of another: computer simulation of an in-flight emergency.

Simulation in Business Analysis
• Uses mathematical models
• Probabilistic (as opposed to deterministic)
• Uses the entire range of possible values of a variable in the model
• Imitates a system or situation (like a coin flip, or how long a person may have to wait in line at a restaurant)

Why Simulate?
• Safety: flight simulator (emergency)
• Cost: it is easier to simulate adding a new runway and find out the effects than to build it and then find out
• Time: Boeing uses simulated manufacturing before the real thing, with tremendous savings in time and money; parts that do not fit can be discovered and fixed before actual production

Types of Simulation
• Discrete
   • Used for simulating specific values or specific points
   • Example: Number of people waiting in line (queue)
• Continuous
   • Based on mathematical equations
   • Used for simulating continuous values for all points in time
   • Example: The amount of time a person spends in a queue

Simulation Methodology
• Estimate probabilities of future events
• Assign random number ranges to percentages (probabilities)
• Obtain random numbers
• Use random numbers to “simulate” events

Data Collection and Random Number Interval Example
Suppose you timed 20 athletes running the 100-yard dash and tallied the information into the four time intervals below. You then count the tallies and make a frequency distribution. Then convert the frequencies into percentages. Finally, use the percentages to develop the random number intervals.

Seconds      Frequency   %    RN Intervals
0-5.99       4           20   01-20
6-6.99       10          50   21-70
7-7.99       4           20   71-90
8 or more    2           10   91-100

Sources of Event Probabilities and Random Numbers
• Event Probabilities
   • From historical data (assuming the future will be like the past)
   • From expert opinion (if the future is unlike the past or no data are available)
• Random Numbers
   • From probability distributions that “fit” the historical data or can be assumed (Excel functions)
   • From manual random number tables
   • From the instructor (for homework or tests, so that everyone gets the same numbers)

Discrete Example – Coin Toss
• Variable to be simulated is “Outcome of a coin toss”. It takes on values “Heads” and “Tails”, each with 0.5 probability.
• Generate 100 random numbers (100 tosses of the coin).
• Make a rule like: if random number > 0.5, then “Heads”, else “Tails”.
• This will create the right distribution of outcomes.

Coin Toss Random Number Mapping
The random numbers are now mapped to coin toss results as follows: if random number > 0.5, then “Heads”, else “Tails”.

Random #   Coin Toss Result
0.345      Tails
0.008      Tails
0.985      Heads
0.878      Heads

Discrete Example: Machine Failures
Simulate machine failures based on this historical data.

Number of failures per month   Frequency (# of months this occurred)
0                              36
1                              20
2                              3
3                              1
Total                          60

Simulating Machine Failures, contd.
Create the following cumulative probability table.

Number of failures per month   Frequency (# of months)   Probability   Cumulative Probability
0                              36                        0.600         0.600
1                              20                        0.333         0.933
2                              3                         0.050         0.983
3                              1                         0.017         1.000
Total                          60                        1.000

Simulating Machine Failures, contd.
Now map the random numbers between 0 and 1 using the cumulative probability column as the cutoffs.

Random number range   # of Failures
0 to 0.600            0
0.600 to 0.933        1
0.933 to 0.983        2
0.983 to 1.000        3

Random numbers between 0 and 0.6 represent 0 failures, between 0.6 and 0.933 represent 1 failure, and so on.

Solution – Random Number Mapping
The random numbers are now mapped to number of failures as follows.

Random Number   # of Failures
0.345           0
0.008           0
0.985           3
0.878           1

Continuous Example – Arrival Time
• Variable to be simulated is arrival time at a restaurant, which can take on infinitely many individual values
• For example, someone could arrive at 12:09:37, 12:09:37:52, 12:09:37:52:14, etc.

Continuous Example – Arrival Time
• To simulate this situation, we must specify intervals
• At the restaurant the intervals could be all people arriving between 11am and 12pm, 12pm and 1pm, or 1pm and 2pm.
• As with the coin toss, generate random numbers (in Excel, =RAND())
• Make a rule. If random number:
   • <= .333, then 11am-12pm
   • > .333 up to .666, then 12pm-1pm
   • > .666 up to 1, then 1pm-2pm
• Each category is equally likely

Continuous Example – Arrival Time
• If the random number is .47, then this would fall in the 12pm to 1pm category. If the random number is .88, then this would fall in the 1pm to 2pm category, etc.
• Because each category is equally likely, if we run enough trials, each category will contain about the same number of random numbers, which will tell the restaurant owner that a person is equally likely to arrive in any of the three intervals.

Continuous Example – Arrival Time
If random number:
• <= .333, then 11am-12pm
• > .333 up to .666, then 12pm-1pm
• > .666 up to 1, then 1pm-2pm
Each category is equally likely.

Random Number   Result 1
0.47            12p - 1p
0.88            1p - 2p
0.36            12p - 1p
0.27            11a - 12p
0.21            11a - 12p
0.25            11a - 12p
0.36            12p - 1p
0.41            12p - 1p
0.85            1p - 2p

Continuous Example – Arrival Time
The owner looks at historical information and says that on an average day, 225 people eat lunch at his restaurant, and that typically:
• 47 people arrive between 11am and 12pm
• 112 people arrive between 12pm and 1pm
• 66 people arrive between 1pm and 2pm
How do we map these numbers?
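One way to answer this question can be sketched in Python: convert the counts to percentages, accumulate them into cutoffs, and look up each random number against the cutoffs. This is a minimal sketch, not the course's Excel method. The names `counts`, `cutoffs`, and `arrival_slot` are hypothetical, and rounding to whole percentages is an assumption made so that the cutoffs are easy to read (rounded percentages do not always sum to 100 for other data).

```python
from bisect import bisect_left
from itertools import accumulate

# Historical lunch arrivals from the slide: interval -> number of people
counts = {"11a - 12p": 47, "12p - 1p": 112, "1p - 2p": 66}
total = sum(counts.values())  # 225

# Assumption: round each share to a whole percentage before accumulating
percents = [round(100 * c / total) for c in counts.values()]
cutoffs = [p / 100 for p in accumulate(percents)]
labels = list(counts)


def arrival_slot(r):
    """Return the first interval whose cumulative cutoff is >= r."""
    return labels[bisect_left(cutoffs, r)]


# The same nine random numbers used in the equal-likelihood table
random_numbers = [0.47, 0.88, 0.36, 0.27, 0.21, 0.25, 0.36, 0.41, 0.85]
results = [arrival_slot(r) for r in random_numbers]

for r, slot in zip(random_numbers, results):
    print(f"{r:.2f}  {slot}")
```

Using `bisect_left` means a random number exactly equal to a cutoff falls into the lower interval, which matches a rule of the form "less than or equal to the cutoff".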
Continuous Example – Arrival Time

               count   percent
11am to 12pm   47      0.21
12pm to 1pm    112     0.50
1pm to 2pm     66      0.29
total          225     1.00

To complete the mapping, we need to make a cumulative distribution function (CDF).

Continuous Example – Arrival Time

               count   percent   CDF
11am to 12pm   47      0.21      0.21
12pm to 1pm    112     0.50      0.71
1pm to 2pm     66      0.29      1.00
total          225     1.00

Make a new rule. If random number:
• <= .21, then 11am-12pm
• > .21 up to .71, then 12pm-1pm
• > .71 up to 1, then 1pm-2pm

Continuous Example – Arrival Time
If random number:
• <= .21, then 11am-12pm
• > .21 up to .71, then 12pm-1pm
• > .71 up to 1, then 1pm-2pm

Result 1 uses the equal-likelihood rule; Result 2 uses the new rule.

Random Number   Result 1    Result 2
0.47            12p - 1p    12p - 1p
0.88            1p - 2p     1p - 2p
0.36            12p - 1p    12p - 1p
0.27            11a - 12p   12p - 1p
0.21            11a - 12p   11a - 12p
0.25            11a - 12p   12p - 1p
0.36            12p - 1p    12p - 1p
0.41            12p - 1p    12p - 1p
0.85            1p - 2p     1p - 2p

Note on Random Numbers in Excel Spreadsheets
• Once entered in a spreadsheet, a random number function remains “live.” A new random number is created whenever the spreadsheet is re-calculated.
• To re-calculate the spreadsheet, use the F9 key. Note, almost any change in the spreadsheet causes the spreadsheet to be recalculated!
• If you do not want the random number to change, you can freeze it by selecting: tools, options, calculations, and checking “manual.”

Evaluating Results
• Simulation measures the quality of a solution because it gives the probability of a certain event occurring
• Simulation also shows the variability
• Simulation does not necessarily give the best possible answer. It gives the most likely answer.
• Optimization gives the best possible answer

Evaluating Results
• Conclusions depend on the degree to which the model reflects the real system
• The only true test of a simulation is how well the real system performs after the results of the study have been implemented

Advantages of Simulation
• Simulation often leads to a better understanding of the real system.
• Years of experience in the real system can be compressed into seconds or minutes.
• Simulation does not disrupt ongoing activities of the real system.
• Simulation is far more general than mathematical models.
• Simulation can be used as a game for training experience (safety!).

Advantages of Simulation (cont)
• Simulation can be used when data is hard to come by.
• Simulation can provide a more realistic replication of a system than mathematical analysis.
• Many standard simulation software packages are available commercially (and Excel works fine too!).

Disadvantages of Simulation
• There is no guarantee that the model will, in fact, provide good answers.
• There is no way to prove reliability.
• Simulation may be less accurate than mathematical analysis because it is randomly based.
• Building a simulation model can take a great deal of time (but if the payoff is great, then it is worth it!).
• A significant amount of computer time may be needed to run complex models (old concern - no longer an issue!).
• The technique of simulation still lacks a standardized approach.
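The cumulative-probability mapping from the machine-failure example can be sketched in Python as a recap of the discrete method. This is a minimal sketch under stated assumptions: the names `frequencies`, `cutoffs`, and `failures_for` are hypothetical, the seed and the 10,000-month run length are arbitrary choices, and a random number exactly on a cutoff is assigned to the higher category because the slides leave that boundary unspecified.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Historical data from the slides: failures per month -> months observed
frequencies = {0: 36, 1: 20, 2: 3, 3: 1}
total_months = sum(frequencies.values())  # 60

outcomes = list(frequencies)
# Cumulative probabilities serve as the cutoffs: 36/60, 56/60, 59/60, 60/60
cutoffs = [c / total_months for c in accumulate(frequencies.values())]


def failures_for(r):
    """Map a random number r in [0, 1) to a number of failures."""
    return outcomes[bisect_right(cutoffs, r)]


# The four random numbers used in the solution slide
slide_numbers = [0.345, 0.008, 0.985, 0.878]
slide_results = [failures_for(r) for r in slide_numbers]
print(slide_results)

# A longer run: the simulated mean should approach the historical mean
rng = random.Random(42)  # arbitrary seed, fixed only for repeatability
simulated = [failures_for(rng.random()) for _ in range(10_000)]
simulated_mean = sum(simulated) / len(simulated)
expected_mean = sum(k * n for k, n in frequencies.items()) / total_months
print(f"simulated mean {simulated_mean:.3f}, historical mean {expected_mean:.3f}")
```

Comparing the simulated mean with the historical mean is one simple check that the random number ranges were assigned correctly.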
Appendix: Useful Information on Probability Distributions

Probability Distributions
• A probability distribution defines the behavior of a variable by defining its limits, central tendency and nature
   • Mean
   • Standard Deviation
   • Upper and Lower Limits
   • Continuous or Discrete
• Examples are:
   • Normal Distribution (continuous)
   • Binomial (discrete)
   • Poisson (discrete)
   • Uniform (continuous or discrete)
   • Custom (created to suit a specific purpose)

Uniform Distribution
• All values between minimum and maximum occur with equal likelihood
• Conditions
   • Minimum Value is Fixed
   • Maximum Value is Fixed
   • All values occur with equal likelihood
• Excel function: RAND() returns a uniformly distributed random number that is at least 0 and less than 1

Normal Distribution
• Conditions. Use when:
   • Uncertain variable is symmetric about the mean
   • Uncertain variable is more likely to be in the vicinity of the mean than far away
• Distribution of x is normal (for any sample size)
• Distribution of x is not normal, but the distribution of sample means (x-bar) will be approximately normal with samples of size 30 or more (Central Limit Theorem)
• Excel function: NORMSDIST(z) returns the cumulative probability of the standard normal distribution (mean of zero, standard deviation of one) at z [e.g., NORMSDIST(1) = .84]. It does not generate a random number; to draw a standard normal random number, pass a uniform random number to the inverse function: NORMSINV(RAND())

Simulation: Continuous Variables
Dr. Satish Nargundkar

Distributions
• Variables to be simulated may be normal (e.g. height) or exponential (e.g. service time) or various other distributions.
• Task is to convert the uniform distribution to the required distribution.
[Figure: two frequency plots, one over the range 0 to 1 and one over the range 0 to infinity]

Application - Queuing Systems
• A queuing system is any system where entities (people, trucks, jobs, etc.) wait in line for service (processing of some sort)
• Examples: retail checkout lines, jobs on a network server, phone switchboard, airport runways, etc.

Queuing System Inputs
Queuing (waiting line) systems are characterized by:
• Number of servers / number of queues
   • SSSQ: Single Server Single Queue
   • SSMQ: Single Server Multiple Queue
   • MSSQ: Multiple Server Single Queue
   • MSMQ: Multiple Server Multiple Queue
• Arrival Rate (Arrival Intervals)
• Service Rate (Service Times)

Performance Variables (outcome)
Performance of a queuing system is measured by:
• Average number of entities in queue/system
• Average time waiting in queue/system
Timeline of one entity: Arrival time, then Service Begins, then Service Ends
• Time in Queue: from arrival until service begins
• Service Time: from service begins until service ends
• Time in System: from arrival until service ends

Distributions in Queuing
• Arrival Intervals (time between two consecutive arrivals) and Service Time (time to serve one customer) are typically exponentially distributed.
• Confirm it yourself by watching cars on a street!

Sample Problem
• A loading dock (SSSQ) has trucks arriving every 36 minutes (0.6 hrs) on average, and the average service (loading / unloading) time is 30 minutes (0.5 hrs).
• A new conveyor belt system can reduce that to 15 minutes (0.25 hours).
• Simulate the arrival of 200 trucks to see how performance would be affected by the new system.

Simulating Exponential Distributions
• To convert the uniform distribution of the random numbers to an exponential distribution, take the negative natural log of the random numbers. This creates an exponential distribution with an average of 1.00.
• To get an average of 0.6 (to represent the average arrival interval in hours), simply multiply the result by 0.6.
• Thus, the conversion formula is: -ln(rand())*µ where µ is the mean of the exponential distribution desired.

Sample Conversion

           Random     -ln(Random)
           0.449796   0.798962
           0.858464   0.15261
           0.828061   0.188668
           0.938751   0.063206
           0.84637    0.166798
           0.428408   0.847678
           0.357574   1.028412
           0.63932    0.447351
           …..        …..
Average:   0.500645   1.040982

[Figure: two plots, one over the range 0 to 1 and one over the range 0 to infinity]

Problem - Discrete
• Jimmy prints a neighborhood newspaper. He has 10 subscribers. He also sells it to whoever comes by, from his front lawn on Friday afternoons. His mother has kept track of his demand (including requests made after he had sold out) for the past 100 weeks, and observed the pattern shown:

Papers Demanded   Number of weeks
13                1
14                2
15                4
16                9
17                10
18                15
19                16
20                15
21                12
22                9
23                4
24                2
25                1
26                0
Total             100

• The papers cost 30 cents to print and Jimmy sells them for 50 cents. Assume that he prints 20 copies a week. Mom makes him throw away unsold copies.
• Simulate his sales for a year and determine his earnings. What if the # printed were adjusted?

Problem - Continuous
• Trucks arrive at a loading dock on average every 0.6 hours. It takes on average 0.5 hours to unload a truck. Assuming that arrival intervals and service times are exponentially distributed, simulate the queuing system to find the average waiting times and the average number of trucks in the queue/system.
• Suppose you could install a new conveyor belt system to unload trucks faster, so that the average unloading time is cut to 0.25 hours. How much improvement will there be in the average waiting time and average number in queue/system?
• Queuing systems consist of multiple inputs:
   • Number of servers, Number of Queues
   • Arrival rate/Arrival interval
   • Service rate/Service time
• And multiple outputs:
   • Average wait time
   • Average number in queue / system
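The continuous problem can be approached with a short Python sketch of a single server, single queue (SSSQ) system. This is a minimal sketch under stated assumptions, not the course's spreadsheet solution: the names `exponential` and `simulate_dock` are hypothetical, the seed is arbitrary, trucks are served first come, first served, and the average numbers in queue and in system are taken as time averages over the period from time zero until the last truck leaves.

```python
import math
import random


def exponential(rng, mean):
    """Convert a uniform random number to an exponential one: -ln(u) * mean."""
    u = 1.0 - rng.random()  # lies in (0, 1], so the log is always defined
    return -math.log(u) * mean


def simulate_dock(n_trucks, mean_interarrival, mean_service, seed=1):
    """Simulate an SSSQ loading dock; times are in hours."""
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current truck
    server_free_at = 0.0   # time the dock finishes the previous truck
    total_wait = 0.0       # summed time in queue
    total_system = 0.0     # summed time in system
    for _ in range(n_trucks):
        arrival += exponential(rng, mean_interarrival)
        service_begins = max(arrival, server_free_at)
        service_ends = service_begins + exponential(rng, mean_service)
        total_wait += service_begins - arrival
        total_system += service_ends - arrival
        server_free_at = service_ends
    horizon = server_free_at  # time at which the last truck leaves
    return {
        "avg_wait": total_wait / n_trucks,
        "avg_system": total_system / n_trucks,
        "avg_in_queue": total_wait / horizon,
        "avg_in_system": total_system / horizon,
    }


# Same seed for both runs, so both see the same arrivals and the new system's
# service times are exactly half of the current system's
current = simulate_dock(200, mean_interarrival=0.6, mean_service=0.5)
improved = simulate_dock(200, mean_interarrival=0.6, mean_service=0.25)

for name, stats in (("current", current), ("conveyor", improved)):
    print(name, {k: round(v, 3) for k, v in stats.items()})
```

A run of 200 trucks is a small sample for a dock this busy, so the averages can shift noticeably from one seed to another; repeating the run with several seeds gives a sense of that variability.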