EMGT 269 - Elements of Problem Solving and Decision Making
Lecture notes by Dr. J. Rene van Dorp, Session 9
Source: Making Hard Decisions, An Introduction to Decision Analysis by R.T. Clemen

10. USING DATA

1. Constructing Probability Distributions with Data

A. Discrete Case

The empirical probability mass function is constructed using relative frequencies of events:

Relative Frequency = (# of occurrences of a particular outcome) / (total number of occurrences)

Estimation of a Discrete Empirical Probability Mass Function

# Accidents       0       1       2       3       4       Total
# Occurrences     N0      N1      N2      N3      N4      M = N0 + N1 + N2 + N3 + N4
Pr(# Accidents)   N0/M    N1/M    N2/M    N3/M    N4/M    1.0

Maintenance Example

You own a manufacturing plant and are trying to design a maintenance policy. You would like to choose the maintenance interval so that you balance the failure time of a machine against the maintenance interval:

- Maintenance interval too short: costly due to frequent maintenance, and the machines never fail.
- Maintenance interval too long: the machines fail, interrupting production and resulting in high cost.

Suppose you suggest an interval of 260 days. You want to estimate the probability of the number of machine failures per day in this period and use it in your interval selection.

               # Days    Pr(# Failures in a Day)
No Failures    217       217/260 = 0.835
One Failure    32        32/260  = 0.123
Two Failures   11        11/260  = 0.042
Total          260       1.000

Also important in selecting a maintenance interval is whether these "two-failure days" happen towards the end of the 260-day period.

[Figure: Graph of the probability mass function Pr(# Failures in a Day): No Failures (0.835), One Failure (0.123), Two Failures (0.042).]

Notes:
- Make sure you have enough data for accuracy (at least 5 observations in each category).
- Always ask: does past data represent future uncertainty?

B. Continuous Case

Estimation of the Empirical Continuous Distribution Function

Y = failure time of a machine.

1. Given data: $y_i$, $i = 1, \ldots, n$.
2. Order the data such that $y_{(1)} \le y_{(2)} \le \ldots \le y_{(n-1)} \le y_{(n)}$.
3. Estimate $y_{\min}$. In this case, we may set $y_{\min} = 0$.
4. Set:
   $F(y_{\min}) = \Pr(Y \le y_{\min}) = 0$
   $F(y_{(1)}) = \Pr(Y \le y_{(1)}) = \frac{1}{n}$
   $F(y_{(2)}) = \Pr(Y \le y_{(2)}) = \frac{2}{n}$
   $\ldots$
   $F(y_{(n-1)}) = \Pr(Y \le y_{(n-1)}) = \frac{n-1}{n}$
   $F(y_{(n)}) = \Pr(Y \le y_{(n)}) = \frac{n}{n} = 1$
5. Plot the points $(y_{\min}, F(y_{\min})), (y_{(1)}, F(y_{(1)})), \ldots, (y_{(n)}, F(y_{(n)}))$ in a graph.
6. Connect consecutive points by straight lines.

The above procedure may be referred to as the STRAIGHT-LINE APPROXIMATION.
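To make the two constructions concrete, here is a minimal Python sketch (not part of the original notes; the function names are illustrative) that builds a relative-frequency mass function and the straight-line empirical CDF points:

```python
from collections import Counter

def empirical_pmf(outcomes):
    """Relative-frequency estimate of a discrete probability mass function."""
    counts = Counter(outcomes)          # occurrences of each outcome
    total = sum(counts.values())        # total number of occurrences M
    return {k: n / total for k, n in sorted(counts.items())}

def empirical_cdf_points(data, y_min=0.0):
    """Points (y, F(y)) of the straight-line empirical CDF: F(y_min) = 0, F(y_(i)) = i/n."""
    ordered = sorted(data)              # y_(1) <= ... <= y_(n)
    n = len(ordered)
    points = [(y_min, 0.0)]
    points += [(y, (i + 1) / n) for i, y in enumerate(ordered)]
    return points                       # connect consecutive points by straight lines

# Maintenance example: 217 zero-failure days, 32 one-failure days, 11 two-failure days
failures_per_day = [0] * 217 + [1] * 32 + [2] * 11
print(empirical_pmf(failures_per_day))  # approximately {0: 0.835, 1: 0.123, 2: 0.042}
```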
EXAMPLE: HALFWAY HOUSE

Y = yearly bed cost.

Y(i)   # Observations <= Y(i)   Pr(Yearly Bed Cost <= Y(i))
0      Estimated Minimum        0.00
52     1                        0.03
76     2                        0.06
100    3                        0.09
136    4                        0.11
137    5                        0.14
186    6                        0.17
196    7                        0.20
205    8                        0.23
250    9                        0.26
257    10                       0.29
264    11                       0.31
280    12                       0.34
282    13                       0.37
283    14                       0.40
303    15                       0.43
313    16                       0.46
317    17                       0.49
325    18                       0.51
345    19                       0.54
373    20                       0.57
384    21                       0.60
400    22                       0.63
402    23                       0.66
408    24                       0.69
417    25                       0.71
422    26                       0.74
472    27                       0.77
480    28                       0.80
643    29                       0.83
693    30                       0.86
732    31                       0.89
749    32                       0.91
750    33                       0.94
791    34                       0.97
891    35                       1.00

[Figure: Straight-line empirical CDF, Pr(Yearly Bed Cost <= X) versus X, for X from 0 to 900.]

What if we observe ties in the data?

Observation         Y(i)   # Observations <= Y(i)   Pr(Y <= Y(i))
Estimated Minimum   0      0                        0.00
1                   1      1                        1/7
2                   3      2                        2/7
3                   7      skip                     skip
4                   7      skip                     skip
5                   7      5                        5/7
6                   9      6                        6/7
7                   11     7                        7/7

The resulting CDF points are:

y           0      1      3      7      9      11
Pr(Y<=y)    0.00   0.14   0.29   0.71   0.86   1.00

[Figure: Straight-line empirical CDF Pr(Y <= y) for the tied data, y from 0 to 12.]

How do we use the Empirical CDF in Decision Trees?

As before, use a discrete approximation, e.g. the Extended Pearson-Tukey method or the Bracket Median method. Reading the 5%, 50% and 95% fractiles from the empirical CDF of the yearly bed cost gives approximately 75, 320 and 750.

[Figure: Empirical CDF Pr(Yearly Bed Cost <= X) with the 5%, 50% and 95% fractiles marked at 75, 320 and 750.]

Discrete Approximation (Extended Pearson-Tukey method):
Cost = 75   with probability 0.185
Cost = 320  with probability 0.630
Cost = 750  with probability 0.185

C. Using Data to Fit Theoretical Probability Models

Method of Moments

Let Y be a random variable, e.g. the failure time of a machine.

1. Given data: $y_i$, $i = 1, \ldots, n$.
2. Calculate the sample mean (= first moment):
   $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
3. Calculate the sample variance (= second moment):
   $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$
4. Select a theoretical probability model with CDF $F(y \mid \theta_1, \theta_2)$, where $(\theta_1, \theta_2)$ are the parameters.
5. Calculate the theoretical expressions $E[Y] = g(\theta_1, \theta_2)$ and $Var(Y) = h(\theta_1, \theta_2)$.
6. Solve for the parameters $(\theta_1, \theta_2)$ by setting
   $g(\theta_1, \theta_2) = \bar{y}$   (i.e. $E[Y] = \bar{y}$)
   $h(\theta_1, \theta_2) = s^2$   (i.e. $Var(Y) = s^2$)
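A minimal Python sketch of steps 2-6 (illustrative, not from the original notes); the gamma and exponential parameterizations are assumed to match the tables given two pages further on:

```python
import statistics

def method_of_moments(data, model="normal"):
    """Steps 2-6 of the method of moments for a chosen theoretical model."""
    y_bar = statistics.mean(data)        # step 2: sample mean (first moment)
    s2 = statistics.variance(data)       # step 3: sample variance (divides by n - 1)
    if model == "normal":                # E[Y] = mu, Var(Y) = sigma^2
        return {"mu": y_bar, "sigma": s2 ** 0.5}
    if model == "gamma":                 # E[Y] = alpha/beta, Var(Y) = alpha/beta^2
        return {"alpha": y_bar ** 2 / s2, "beta": y_bar / s2}
    if model == "exponential":           # E[Y] = 1/lambda
        return {"lambda": 1 / y_bar}
    raise ValueError("unknown model")
```

Applied to the 35 Halfway House yearly bed costs worked out by hand below, the normal fit returns approximately mu = 380.4 and sigma = 217.6.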
HALFWAY HOUSE EXAMPLE CONTINUED

Observation   X      X - Sample Mean   (X - Sample Mean)^2
1             52     -328.4            107846.56
2             76     -304.4            92659.36
3             100    -280.4            78624.16
4             136    -244.4            59731.36
5             137    -243.4            59243.56
6             186    -194.4            37791.36
7             196    -184.4            34003.36
8             205    -175.4            30765.16
9             250    -130.4            17004.16
10            257    -123.4            15227.56
11            264    -116.4            13548.96
12            280    -100.4            10080.16
13            282    -98.4             9682.56
14            283    -97.4             9486.76
15            303    -77.4             5990.76
16            313    -67.4             4542.76
17            317    -63.4             4019.56
18            325    -55.4             3069.16
19            345    -35.4             1253.16
20            373    -7.4              54.76
21            384    3.6               12.96
22            400    19.6              384.16
23            402    21.6              466.56
24            408    27.6              761.76
25            417    36.6              1339.56
26            422    41.6              1730.56
27            472    91.6              8390.56
28            480    99.6              9920.16
29            643    262.6             68958.76
30            693    312.6             97718.76
31            732    351.6             123622.56
32            749    368.6             135865.96
33            750    369.6             136604.16
34            791    410.6             168592.36
35            891    510.6             260712.36
Total         13314                    1609706.40

Sample Mean (divide total by n = 35):            380.4
Sample Variance (divide total by n - 1 = 34):    47344.31
Sample St. Dev.:                                 217.59

Random Variable Y = Yearly Bed Cost in the Halfway House.

4. Propose the Normal probability model, i.e. $Y \sim N(\mu, \sigma^2)$.
5. $E[Y] = g(\mu, \sigma) = \mu$, $Var(Y) = h(\mu, \sigma) = \sigma^2$.
6. $E[Y] = \mu = 380.4 \Rightarrow \hat{\mu} = 380.4$;
   $Var(Y) = \sigma^2 = 47344.31 \Rightarrow \hat{\sigma} = \sqrt{47344.31} = 217.59$.

[Figure: Empirical Pr(Yearly Bed Cost <= x) together with the fitted theoretical Normal approximation, for x from 0 to 900.]

EXAMPLES "METHOD OF MOMENTS" FOR CONTINUOUS DISTRIBUTIONS

Theoretical Distribution      Theoretical Expressions                                     Parameter Solutions
Normal: $N(\mu, \sigma)$      $E[Y] = \mu$, $Var(Y) = \sigma^2$                            $\hat{\mu} = \bar{y}$, $\hat{\sigma}^2 = s^2$
Gamma: $G(\alpha, \beta)$     $E[Y] = \alpha/\beta$, $Var(Y) = \alpha/\beta^2$              $\hat{\alpha} = \bar{y}^2/s^2$, $\hat{\beta} = \bar{y}/s^2$
Exponential: $Exp(\lambda)$   $E[Y] = 1/\lambda$, $Var(Y) = 1/\lambda^2$                    $\hat{\lambda} = 1/\bar{y}$
Beta: $Beta(r, n)$            $E[Y] = r/n$, $Var(Y) = \frac{r(n-r)}{n^2(n+1)}$              $\hat{n} = \frac{\bar{y}(1-\bar{y})}{s^2} - 1$, $\hat{r} = \bar{y}\,\hat{n}$

EXAMPLES "METHOD OF MOMENTS" FOR DISCRETE DISTRIBUTIONS

Theoretical Distribution      Theoretical Expressions                                     Parameter Solutions
Binomial: $Bin(n, p)$         $E[Y] = np$, $Var(Y) = np(1-p)$                               $\hat{p} = \bar{y}/n$
Poisson: $Poisson(m)$         $E[Y] = m$, $Var(Y) = m$                                      $\hat{m} = \bar{y}$
Geometric: $Geo(p)$           $E[Y] = \frac{1-p}{p}$, $Var(Y) = \frac{1-p}{p^2}$            $\hat{p} = \frac{1}{\bar{y}+1}$

Fitting Theoretical Distributions Using Quantile Estimates

Y = yearly bed cost in the Halfway House.

1. Given data: $y_i$, $i = 1, \ldots, n$.
2. Order the data such that $y_{(1)} \le y_{(2)} \le \ldots \le y_{(n-1)} \le y_{(n)}$.
3. Set:
   $p_1 = \Pr(Y \le y_{(1)}) = \frac{1}{n}$, $p_2 = \Pr(Y \le y_{(2)}) = \frac{2}{n}$, $\ldots$, $p_{n-1} = \Pr(Y \le y_{(n-1)}) = \frac{n-1}{n}$, $p_n = \Pr(Y \le y_{(n)}) = \frac{n}{n}$.
4. Fit a theoretical probability model with CDF $F(y \mid \theta_1, \theta_2)$ by selecting the parameters $(\theta_1, \theta_2)$ such that
   $\sum_{i=1}^{n} \left[ F(y_{(i)} \mid \theta_1, \theta_2) - p_i \right]^2$
   is minimized.

Note: The above procedure requires numerical algorithms to calculate the parameters $(\theta_1, \theta_2)$; a sketch is given below. The software BESTFIT not only determines optimal parameters but also tests multiple theoretical distributions.
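A minimal sketch of this least-squares quantile fit for a normal model, using scipy (an assumed tool choice; BESTFIT is what the notes actually use, and its output for these data is shown on the next page):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def quantile_fit_normal(data):
    """Least-squares fit of a Normal CDF to the empirical quantile points p_i = i/n."""
    y = np.sort(np.asarray(data, dtype=float))         # y_(1) <= ... <= y_(n)
    n = len(y)
    p = np.arange(1, n + 1) / n                         # p_i = i/n

    def sum_sq(theta):                                  # theta = (mu, sigma)
        mu, sigma = theta
        if sigma <= 0:
            return np.inf
        return np.sum((norm.cdf(y, loc=mu, scale=sigma) - p) ** 2)

    start = np.array([y.mean(), y.std(ddof=1)])         # method-of-moments starting point
    result = minimize(sum_sq, start, method="Nelder-Mead")
    return result.x                                     # fitted (mu, sigma)
```

The fitted parameters can be compared with the BESTFIT normal fit Normal(3.80e+2, 2.18e+2) reported on the next page.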
HALFWAY HOUSE EXAMPLE USING BESTFIT

[Figure: Comparison of the input (empirical) distribution and the fitted Normal(3.80e+2, 2.18e+2) distribution.]

[Figure: Comparison of the input (empirical) distribution and the fitted Gamma(2.89, 1.31e+2) distribution.]

Uncertainty about Parameters and Bayesian Updating

A. Discrete Case

B = {Killer in a Murder Case}, B in {B1, B2, B3}, where:
B1 = Hunter, B2 = Near-Sighted Man, B3 = Sharp Shooter.

After interrogations and interviews with witnesses, we are able to establish the following prior distribution:
Pr(B = B1) = 0.2, Pr(B = B2) = 0.7, Pr(B = B3) = 0.1.

Evidence A becomes available: the victim was shot from 2000 ft. We establish the following probability model:
Pr(A | B1) = 0.7, Pr(A | B2) = 0.1, Pr(A | B3) = 0.9.

We update our prior distribution with the evidence into a posterior distribution using Bayes' Theorem:

Pr(A) = Pr(A | B1)Pr(B1) + Pr(A | B2)Pr(B2) + Pr(A | B3)Pr(B3)
      = 0.7(0.2) + 0.1(0.7) + 0.9(0.1) = 0.30

$\Pr(B_1 \mid A) = \frac{\Pr(A \mid B_1)\Pr(B_1)}{\Pr(A)} = \frac{0.7 \cdot 0.2}{0.3} \approx 0.47$
$\Pr(B_2 \mid A) = \frac{\Pr(A \mid B_2)\Pr(B_2)}{\Pr(A)} = \frac{0.1 \cdot 0.7}{0.3} \approx 0.23$
$\Pr(B_3 \mid A) = \frac{\Pr(A \mid B_3)\Pr(B_3)}{\Pr(A)} = \frac{0.9 \cdot 0.1}{0.3} = 0.30$

Conclusion: Refocus the investigation on the Hunter and the Sharp Shooter.
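A minimal Python sketch of this discrete Bayesian update with the murder-case numbers (illustrative, not part of the original notes):

```python
def discrete_bayes(prior, likelihood):
    """Posterior Pr(B_i | A) from prior Pr(B_i) and likelihoods Pr(A | B_i)."""
    joint = {b: likelihood[b] * prior[b] for b in prior}   # Pr(A | B_i) Pr(B_i)
    pr_a = sum(joint.values())                             # Pr(A), law of total probability
    return {b: joint[b] / pr_a for b in joint}

prior = {"Hunter": 0.2, "Near-Sighted Man": 0.7, "Sharp Shooter": 0.1}
likelihood = {"Hunter": 0.7, "Near-Sighted Man": 0.1, "Sharp Shooter": 0.9}
print(discrete_bayes(prior, likelihood))
# Hunter = 0.47, Near-Sighted Man = 0.23, Sharp Shooter = 0.30 (approximately)
```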
B. Continuous Case

The general updating scheme is as follows:

- Choose a theoretical probability model $P(X = k \mid \theta)$ for the physical process of interest.
- Assess your uncertainty about the parameter $\theta$ by specifying a prior distribution $f(\theta)$.
- Uncertainty about X now has two parts: uncertainty due to the process itself, $P(X = k \mid \theta)$, and uncertainty about the parameter $\theta$, through $f(\theta)$.
- These two sources can be collapsed into one by applying the Law of Total Probability to calculate the prior predictive distribution $P(X = k)$.
  (MODELING + PAST DATA + EXPERT JUDGEMENT)
- Observe data $D_1$. Reassess the uncertainty about the parameter $\theta$ by using Bayes' Theorem to calculate the posterior distribution $f(\theta \mid D_1)$.
- Uncertainty about X again has two parts: uncertainty due to the process itself, $P(X = k \mid \theta)$, and uncertainty about the parameter $\theta$, now through $f(\theta \mid D_1)$.
- These can again be collapsed into one source by applying the Law of Total Probability to calculate the posterior predictive distribution $P(X = k \mid D_1)$.
  (FUTURE DATA + ANALYSIS)

Two calculations in the above scheme have not yet been specified.

1. Calculating the Predictive Distribution

Probability model: $\Pr(X = x \mid \theta)$, e.g. $X \sim Bin(N, p)$.
Prior distribution on $\theta$: $f(\theta)$, e.g. $f(p) = Beta(n_0, r_0)$.

To calculate the predictive distribution, apply the Law of Total Probability for the continuous case:

$\Pr(X = x) = \int \Pr(X = x \mid \theta)\, f(\theta)\, d\theta$

SOFT PRETZEL EXAMPLE CONTINUED

Y = # customers out of N that buy your pretzel, $Y \sim Bin(N, p)$, where p is your market percentage. You are uncertain about p and you decide to model your uncertainty using a Beta distribution, $p \sim Beta(n_0, r_0)$, i.e.

$f(p) = \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)}\, p^{r_0 - 1} (1-p)^{n_0 - r_0 - 1}$

Then

$\Pr(Y = k \mid N) = \int_0^1 \binom{N}{k} p^k (1-p)^{N-k}\, \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)}\, p^{r_0 - 1}(1-p)^{n_0 - r_0 - 1}\, dp$

$= \binom{N}{k} \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)} \int_0^1 p^{r_0 + k - 1}(1-p)^{n_0 + N - r_0 - k - 1}\, dp$

But $p^{r_0 + k - 1}(1-p)^{n_0 + N - r_0 - k - 1}$ looks like a $Beta(n_0 + N,\, r_0 + k)$ density without the term

$\frac{\Gamma(n_0 + N)}{\Gamma(r_0 + k)\Gamma(n_0 + N - r_0 - k)}$

Thus:

$\int_0^1 p^{r_0 + k - 1}(1-p)^{n_0 + N - r_0 - k - 1}\, dp = \frac{\Gamma(r_0 + k)\Gamma(n_0 + N - r_0 - k)}{\Gamma(n_0 + N)}$

Finally:

$\Pr(Y = k \mid N) = \binom{N}{k}\, \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)} \cdot \frac{\Gamma(r_0 + k)\Gamma(n_0 + N - r_0 - k)}{\Gamma(n_0 + N)}$

Note: $\Gamma(n) = (n-1)!$ for $n = 1, 2, 3, \ldots$

In the Soft Pretzel example you decide to set $f(p) = Beta(4, 1)$, in other words $n_0 = 4$, $r_0 = 1$. Thus,

$\Pr(Y = k \mid N) = \binom{N}{k} \frac{\Gamma(4)\,\Gamma(1 + k)\,\Gamma(4 + N - 1 - k)}{\Gamma(4 + N)\,\Gamma(1)\,\Gamma(3)} = \binom{N}{k} \frac{3!\; k!\,(N - k + 2)!}{2!\; (N + 3)!} = 3 \binom{N}{k} \frac{k!\,(N - k + 2)!}{(N + 3)!}$

Use an Excel spreadsheet (or the sketch below) to perform the calculations for $n_0 = 4$, $r_0 = 1$, $N = 20$:

k    Pr(Y=k|N)   Pr(Y<=k|N)
0    0.1304      0.1304
1    0.1186      0.2490
2    0.1073      0.3563
3    0.0966      0.4529
4    0.0864      0.5392
5    0.0768      0.6160
6    0.0678      0.6838
7    0.0593      0.7431
8    0.0514      0.7945
9    0.0440      0.8385
10   0.0373      0.8758
11   0.0311      0.9068
12   0.0254      0.9322
13   0.0203      0.9526
14   0.0158      0.9684
15   0.0119      0.9802
16   0.0085      0.9887
17   0.0056      0.9944
18   0.0034      0.9977
19   0.0017      0.9994
20   0.0006      1.0000

Conclusion: e.g. Pr(Y > 10 | N) = 1 - 0.8758 = 0.1242, i.e. you believe there is approximately a 12.5% chance that you will sell more than 10 pretzels.

2. Calculating the Posterior Distribution

Probability model: $\Pr(X = x \mid \theta)$, e.g. $X \sim Bin(N, p)$.
Prior distribution on $\theta$: $f(\theta)$, e.g. $f(p) = Beta(n_0, r_0)$.
Observed data: D.

To calculate the posterior distribution, apply Bayes' Theorem for the continuous case:

$f(\theta \mid D) = \frac{\Pr(D \mid \theta)\, f(\theta)}{\int \Pr(D \mid \theta)\, f(\theta)\, d\theta} = \frac{\Pr(D \mid \theta)\, f(\theta)}{\Pr(D)}$

SOFT PRETZEL EXAMPLE CONTINUED

Y = # customers out of N that buy your pretzel, $Y \sim Bin(N, p)$, where p is your market percentage. You model your uncertainty about p with $p \sim Beta(n_0, r_0)$:

$f(p) = \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)}\, p^{r_0 - 1}(1-p)^{n_0 - r_0 - 1}$

Suppose you observe the following data: D = (N, k), i.e. k out of N customers bought your pretzel. Then

$f(p \mid D) = \frac{\Pr(D \mid p)\, f(p)}{\int_0^1 \Pr(D \mid p)\, f(p)\, dp} = \frac{\Pr(D \mid p)\, f(p)}{\Pr(D)}$

where

$\Pr(D \mid p) = \Pr(Y = k \mid N, p) = \binom{N}{k} p^k (1-p)^{N-k}$

and $\Pr(D) = \Pr(Y = k \mid N)$ is the predictive distribution that we just calculated. Hence

$f(p \mid D) = \frac{\binom{N}{k} p^k (1-p)^{N-k}\; \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)}\, p^{r_0 - 1}(1-p)^{n_0 - r_0 - 1}}{\binom{N}{k}\, \frac{\Gamma(n_0)}{\Gamma(r_0)\Gamma(n_0 - r_0)} \cdot \frac{\Gamma(r_0 + k)\Gamma(n_0 + N - r_0 - k)}{\Gamma(n_0 + N)}}$

$= \frac{\Gamma(n_0 + N)}{\Gamma(r_0 + k)\Gamma(n_0 + N - r_0 - k)}\, p^{r_0 + k - 1}(1-p)^{n_0 + N - r_0 - k - 1}$

Conclusion: The posterior distribution is ALSO a Beta distribution, but with parameters $(n_0 + N,\, r_0 + k)$.
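Both calculations can be checked numerically. Below is a minimal Python sketch (illustrative, not part of the original notes) of the closed-form predictive pmf and the conjugate posterior update, using Clemen's Beta(n0, r0) parameterization with density proportional to p^(r0-1) (1-p)^(n0-r0-1):

```python
from math import comb, gamma

def predictive_pmf(k, N, n0, r0):
    """Prior predictive Pr(Y = k | N) for Y ~ Bin(N, p), p ~ Beta(n0, r0)."""
    return (comb(N, k) * gamma(n0) / (gamma(r0) * gamma(n0 - r0))
            * gamma(r0 + k) * gamma(n0 + N - r0 - k) / gamma(n0 + N))

def posterior_params(n0, r0, N, k):
    """Conjugate update: the posterior of p is Beta(n0 + N, r0 + k)."""
    return n0 + N, r0 + k

n0, r0, N = 4, 1, 20
pmf = [predictive_pmf(k, N, n0, r0) for k in range(N + 1)]
print(round(pmf[0], 4))                  # 0.1304, matching the table above
print(round(1 - sum(pmf[:11]), 4))       # Pr(Y > 10 | N), approximately 0.1242
print(posterior_params(n0, r0, N, 7))    # (24, 8) after observing D = (20, 7)
```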
Definition: If the prior distribution and the theoretical probability model are such that the prior distribution and the posterior distribution belong to the same family of distributions, then the prior distribution and the theoretical probability model are called CONJUGATE distributions.

SOFT PRETZEL EXAMPLE CONTINUED

In the Soft Pretzel example you decide to set $f(p) = Beta(4, 1)$ as your prior distribution on the market percentage p, in other words $n_0 = 4$, $r_0 = 1$. You observed that 7 out of 20 potential customers bought your pretzel, thus D = (20, 7). The posterior distribution of the market percentage p is therefore

$f(p \mid D) = Beta(4 + 20,\, 1 + 7) = Beta(24, 8)$

[Figure: Prior Beta(4, 1) and posterior Beta(24, 8) densities of the market percentage p on [0, 1].]

Conclusion: The good news is that with the observed data D = (20, 7) I am becoming more certain about my market percentage p. However, what does this mean with respect to my investment in a soft pretzel stand? Is there some bad news too? What about the posterior predictive distribution?

Recall: with the prior $f(p) = Beta(4, 1)$, i.e. $n_0 = 4$, $r_0 = 1$,

$\Pr(Y = k \mid N) = \binom{N}{k} \frac{\Gamma(4)\,\Gamma(1 + k)\,\Gamma(4 + N - 1 - k)}{\Gamma(4 + N)\,\Gamma(1)\,\Gamma(3)}$

After observing the data D: $f(p \mid D) = Beta(24, 8)$, in other words

$\Pr(Y = k \mid N, D) = \binom{N}{k} \frac{\Gamma(24)\,\Gamma(8 + k)\,\Gamma(24 + N - 8 - k)}{\Gamma(24 + N)\,\Gamma(8)\,\Gamma(16)} = \binom{N}{k} \frac{23!\,(k + 7)!\,(N - k + 15)!}{(N + 23)!\; 7!\; 15!}$

Posterior predictive calculations for $n_0 + 20 = 24$, $r_0 + 7 = 8$, $N = 20$:

k    Pr(Y=k|N,D)   Pr(Y<=k|N,D)
0    0.0034        0.0034
1    0.0155        0.0188
2    0.0389        0.0577
3    0.0707        0.1284
4    0.1032        0.2316
5    0.1279        0.3595
6    0.1386        0.4981
7    0.1338        0.6319
8    0.1165        0.7483
9    0.0920        0.8403
10   0.0662        0.9065
11   0.0433        0.9498
12   0.0257        0.9756
13   0.0138        0.9893
14   0.0066        0.9959
15   0.0028        0.9986
16   0.0010        0.9996
17   0.0003        0.9999
18   0.0001        1.0000
19   0.0000        1.0000
20   0.0000        1.0000

Conclusion: After observing the data, Pr(Y > 10 | N, D) = 1 - 0.9065 = 0.0935. Thus your updated belief says there is approximately a 9.3% chance that you will sell more than 10 pretzels. In addition, you observe that your posterior has less uncertainty than the prior. Hence you are becoming more certain that selling soft pretzels may not be a good investment.

Conjugate Analysis for Normal Distributions

Predictive Analysis Table

Probability Model $(Y \mid \mu, \sigma)$   Prior on $\mu$             Prior Predictive $(Y \mid \sigma)$
$Normal(\mu, \sigma)$                      $Normal(m_0, \sigma_0)$    $Normal\!\left(m_0,\, \sqrt{\sigma^2 + \sigma_0^2}\right)$

Posterior Analysis Table

Probability Model $(Y \mid \mu, \sigma)$   Prior on $\mu$             Posterior given $D = (n_1, \bar{y})$: $(\mu \mid D, \sigma)$
$Normal(\mu, \sigma)$                      $Normal(m_0, \sigma_0)$    $Normal(m_*, \sigma_*)$

where

$m_* = \frac{(\sigma^2 / n_1)\, m_0 + \sigma_0^2\, \bar{y}}{\sigma^2 / n_1 + \sigma_0^2}$,
$\sigma_* = \sqrt{\frac{(\sigma^2 / n_1)\, \sigma_0^2}{\sigma^2 / n_1 + \sigma_0^2}}$

Assignment: Study the conjugate analysis for the Halfway House example on pages 392-396.
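A minimal Python sketch of these normal conjugate formulas (illustrative, not part of the original notes), applied to the comptroller's numbers from Question 10.19 below:

```python
from math import sqrt

def normal_posterior(m0, s0, sigma, n1, y_bar):
    """Posterior mean m* and st. dev. sigma* of mu when (Y | mu, sigma) ~ Normal(mu, sigma),
    the prior on mu is Normal(m0, s0), and the data are D = (n1, y_bar)."""
    v = sigma ** 2 / n1                       # sigma^2 / n1, the variance of the sample mean
    m_star = (v * m0 + s0 ** 2 * y_bar) / (v + s0 ** 2)
    s_star = sqrt(v * s0 ** 2 / (v + s0 ** 2))
    return m_star, s_star

def normal_prior_predictive(m0, s0, sigma):
    """Prior predictive of a single Y: Normal(m0, sqrt(sigma^2 + s0^2))."""
    return m0, sqrt(sigma ** 2 + s0 ** 2)

# Comptroller's prior: m0 = 10,000, sigma0 = 800, sigma = 1,500; nine accounts averaging 11,003
print(normal_posterior(10_000, 800, 1_500, 9, 11_003))   # approximately (10721, 424)
```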
Question 10.19

A comptroller was preparing to analyze the distribution of balances in the various accounts receivable for her firm. She knew from studies in previous years that the distribution would be normal with a standard deviation of $1500, but she was unsure of the mean value $\mu$. She thought carefully about her uncertainty about this parameter and assessed a normal distribution for $\mu$ with mean $m_0$ = $10,000 and $\sigma_0$ = $800. Over lunch, she discussed this problem with her friend, who also worked in the accounting division. Her friend commented that she also was unsure of $\mu$ but would have placed it somewhat higher. The friend said that "better" estimates for $m_0$ and $\sigma_0$ would have been $12,000 and $750, respectively.

Define: Y := balance in accounts receivable. Then $(Y \mid \mu, \sigma) \sim Normal(\mu, \sigma)$ with $\sigma$ = $1500.

a. Find $\Pr(\mu \ge \$11{,}000)$ for both prior distributions.

For the comptroller:
$\Pr(\mu > 11{,}000 \mid m_0 = 10{,}000, \sigma_0 = 800) = \Pr\!\left(Z > \frac{11{,}000 - 10{,}000}{800}\right) = \Pr(Z > 1.25) = 0.1056$

For her friend:
$\Pr(\mu > 11{,}000 \mid m_0 = 12{,}000, \sigma_0 = 750) = \Pr\!\left(Z > \frac{11{,}000 - 12{,}000}{750}\right) = \Pr(Z > -1.33) = 0.9082$

b. That afternoon, the comptroller randomly chose nine accounts and calculated $\bar{x} = \$11{,}003$. Find the posterior distributions of $\mu$ and $\Pr(\mu \ge \$11{,}000)$ for both posterior distributions.

For the comptroller:
$m_* = \frac{(1500^2/9)\, 10{,}000 + 800^2 \cdot 11{,}003}{1500^2/9 + 800^2} = 10{,}721$, $\quad \sigma_* = \sqrt{\frac{(1500^2/9)\, 800^2}{1500^2/9 + 800^2}} = 424$

Thus $\Pr(\mu > 11{,}000 \mid m_* = 10{,}721, \sigma_* = 424) = \Pr\!\left(Z > \frac{11{,}000 - 10{,}721}{424}\right) = \Pr(Z > 0.66) = 0.2546$

For the friend:
$m_* = \frac{(1500^2/9)\, 12{,}000 + 750^2 \cdot 11{,}003}{1500^2/9 + 750^2} = 11{,}310$, $\quad \sigma_* = \sqrt{\frac{(1500^2/9)\, 750^2}{1500^2/9 + 750^2}} = 416$

Thus $\Pr(\mu > 11{,}000 \mid m_* = 11{,}310, \sigma_* = 416) = \Pr\!\left(Z > \frac{11{,}000 - 11{,}310}{416}\right) = \Pr(Z > -0.75) = 0.7734$

c. A week later the analysis had been completed. Of a total of 144 accounts (including the nine reported in part b), the average was $\bar{x} = \$11{,}254$. Find the posterior distributions of $\mu$ and $\Pr(\mu \ge \$11{,}000)$ for both posterior distributions.

For the comptroller:
$m_{**} = \frac{(1500^2/144)\, 10{,}000 + 800^2 \cdot 11{,}254}{1500^2/144 + 800^2} = 11{,}224$, $\quad \sigma_{**} = \sqrt{\frac{(1500^2/144)\, 800^2}{1500^2/144 + 800^2}} = 123.5$

Thus $\Pr(\mu > 11{,}000 \mid m_{**} = 11{,}224, \sigma_{**} = 123.5) = \Pr\!\left(Z > \frac{11{,}000 - 11{,}224}{123.5}\right) = \Pr(Z > -1.81) = 0.9649$

For the friend:
$m_{**} = \frac{(1500^2/144)\, 12{,}000 + 750^2 \cdot 11{,}254}{1500^2/144 + 750^2} = 11{,}274$, $\quad \sigma_{**} = \sqrt{\frac{(1500^2/144)\, 750^2}{1500^2/144 + 750^2}} = 123.3$

Thus $\Pr(\mu > 11{,}000 \mid m_{**} = 11{,}274, \sigma_{**} = 123.3) = \Pr\!\left(Z > \frac{11{,}000 - 11{,}274}{123.3}\right) = \Pr(Z > -2.22) = 0.9868$

d. Discuss your answers to parts a, b, and c. What can you conclude?

Eventually the data overwhelm any prior information. In the limit, as more data are collected, the comptroller and her friend will end up with the same posterior distribution.
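A minimal Python sketch (illustrative, not part of the original notes) that reproduces parts a-c with scipy; small differences from the worked answers come from rounding the z-values there:

```python
from math import sqrt
from scipy.stats import norm

def posterior(m0, s0, sigma, n1, y_bar):
    """Normal conjugate update: returns (m*, sigma*) for the prior Normal(m0, s0) on mu."""
    v = sigma ** 2 / n1
    return (v * m0 + s0 ** 2 * y_bar) / (v + s0 ** 2), sqrt(v * s0 ** 2 / (v + s0 ** 2))

sigma = 1_500
priors = {"comptroller": (10_000, 800), "friend": (12_000, 750)}

for who, (m0, s0) in priors.items():
    p_a = norm.sf(11_000, loc=m0, scale=s0)            # part a: P(mu > 11,000) under the prior
    m1, s1 = posterior(m0, s0, sigma, 9, 11_003)       # part b: nine accounts, mean 11,003
    m2, s2 = posterior(m0, s0, sigma, 144, 11_254)     # part c: 144 accounts, mean 11,254
    print(who, round(p_a, 4), round(norm.sf(11_000, m1, s1), 4), round(norm.sf(11_000, m2, s2), 4))

# comptroller: ~0.106, ~0.255, ~0.965 (worked answers: 0.1056, 0.2546, 0.9649)
# friend:      ~0.909, ~0.772, ~0.987 (worked answers: 0.9082, 0.7734, 0.9868)
```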