OPRE504 Chapter Study Guide
Chapter 8 Probability Models
Chaodong Han, OPRE504 Data Analysis and Decisions, Class Handout

I Properties of Random Variables

Examples of Probability Models:

1. The probability model of the number showing on the face when tossing a fair die:

\[ P(X = x) = \begin{cases} 1/6 & \text{if } x = 1, 2, 3, 4, 5, \text{ or } 6 \\ 0 & \text{otherwise} \end{cases} \]

(where X denotes the random variable and x denotes a particular value of X)

2. Probability Model for an Insurance Policy

\[ P(X = x) = \begin{cases} 1/1000 & \text{if payout cost is } \$100{,}000 \\ 2/1000 & \text{if payout cost is } \$50{,}000 \\ 997/1000 & \text{if payout cost is } \$0 \end{cases} \]

Expected Value of a Discrete Random Variable with n Outcomes:

\[ \mu = E(X) = x_1 P(x_1) + x_2 P(x_2) + \dots + x_n P(x_n) \]

\[ E(\text{payout cost}) = \$100{,}000 \times \tfrac{1}{1000} + \$50{,}000 \times \tfrac{2}{1000} + \$0 \times \tfrac{997}{1000} = \$200 \]

Variance of a Discrete Random Variable:

\[ \sigma^2 = \mathrm{Var}(X) = \sum (x - \mu)^2 P(x) \]

For example, the variance of the insurance payout cost:

\[ \mathrm{Var}(\text{payout cost}) = (100{,}000 - 200)^2 \times \tfrac{1}{1000} + (50{,}000 - 200)^2 \times \tfrac{2}{1000} + (0 - 200)^2 \times \tfrac{997}{1000} = 14{,}960{,}000 \]

Standard Deviation of a Discrete Random Variable:

\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{14{,}960{,}000} \approx \$3{,}867.82 \]

Properties of Random Variables (e.g., X and Y)

\( E(X \pm c) = E(X) \pm c \) (c is a constant)
\( E(aX) = a\,E(X) \) (a is a constant)
\( E(X + Y) = E(X) + E(Y) \)
\( E(X - Y) = E(X) - E(Y) \)
\( \mathrm{Var}(X \pm c) = \mathrm{Var}(X) \) (c is a constant; shifting every value by c does not change the spread)
\( \mathrm{Var}(aX) = a^2\,\mathrm{Var}(X) \) (a is a constant, squared when taken outside the variance)
\( \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \) (assuming X and Y are independent)
\( \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \) (Note: + is not a typo! Assuming X and Y are independent)
\( SD(X \pm c) = SD(X) \)
\( SD(aX) = |a|\,SD(X) \) (standard deviation is never negative)
\( SD(X + Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)} \) (X and Y are assumed to be independent)
\( SD(X - Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)} \) (+ is not a typo! X and Y are assumed to be independent)

II Probability Models for Discrete Random Variables

1. Uniform Distribution

When each of n outcomes has an equal probability of occurring: \( P(X = i) = 1/n \) for i = 1, 2, 3, …, n.
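The expected value, variance, and standard deviation formulas from Section I can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the helper name `mean_var_sd` and the dictionary layout are illustrative assumptions. It applies the formulas to the insurance policy model and to the fair die, which is a discrete uniform model with n = 6.

```python
import math


def mean_var_sd(model):
    """Return (mean, variance, standard deviation) of a discrete model.

    `model` maps each outcome x to its probability P(X = x).
    """
    mu = sum(x * p for x, p in model.items())
    var = sum((x - mu) ** 2 * p for x, p in model.items())
    return mu, var, math.sqrt(var)


# Insurance policy model from the handout: payout cost -> probability
insurance = {100_000: 1 / 1000, 50_000: 2 / 1000, 0: 997 / 1000}

# Fair die: discrete uniform model, each face has probability 1/6
die = {face: 1 / 6 for face in range(1, 7)}

ins_mu, ins_var, ins_sd = mean_var_sd(insurance)
die_mu, die_var, die_sd = mean_var_sd(die)

print(f"insurance: mean={ins_mu:.2f}, variance={ins_var:.0f}, sd={ins_sd:.2f}")
print(f"die:       mean={die_mu:.2f}, variance={die_var:.4f}, sd={die_sd:.4f}")
```

The insurance figures should agree with the hand calculation ($200, 14,960,000, and about $3,867.82), up to floating-point rounding.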
[Figure: bar chart of probability (1/6 for each outcome) against the number showing on a face of a die, 1 through 6]

2. Geometric Distribution for Bernoulli Trials

Bernoulli trials are trials for which: there are only two possible outcomes (customarily named success and failure) for each trial (for example, tossing a coin has only two outcomes: head and tail); the probability of success, denoted p, is the same for each trial (probability of failure, q = 1 - p); and all trials are independent (the outcome of a trial does not affect the outcomes of any other trials).

The expected number of trials needed for the first success to occur (e.g., for a head to occur when tossing a coin):

\( \mu = E(X) = \dfrac{1}{p} \), where p = probability of success

\( \sigma = \sqrt{\dfrac{q}{p^2}} \) (standard deviation of the number of trials needed)

\( P(X = x) = q^{x-1} p \) (the probability that the first success occurs on trial x)

Example: for tossing a fair coin, the probability of seeing a head (success) is 0.5, while the probability of seeing a tail (failure) is 1 - 0.5 = 0.5. So the number of tosses needed for the first head to occur can be modelled as follows:

x          1      2            3              4              5              ...
P(X = x)   p      qp           q^2 p          q^3 p          q^4 p          ...
           0.5    0.5 x 0.5    0.5^2 (0.5)    0.5^3 (0.5)    0.5^4 (0.5)    ...
           0.5    0.25         0.125          0.0625         0.03125        ...

More exercises: Sharpe 2011, Chapter 8, Exercises 29, 30 (p.244)

Question 8.1 [Sharpe 2011, Exercise 31] A salesman normally makes a sale (closes) on 80% of his presentations.
Assuming the presentations are independent, find the probability of each of the following:

a) He fails to close the sale for the first time on his fifth attempt (presentation)
Four closes followed by a failure. Here the event being waited for is a failure, so it plays the role of "success" in the geometric formula, with probability 0.2:
\( P(X = 5) = 0.8^{5-1} \times 0.2 = 0.0819 \)

b) He closes his first presentation on his fourth attempt
Three failures followed by a success:
\( P(X = 4) = 0.2^{4-1} \times 0.8 = 0.0064 \)

c) The first presentation he closes will be on his second attempt
One failure followed by a success:
\( P(X = 2) = 0.2^{2-1} \times 0.8 = 0.16 \)

d) The first presentation he closes will be on one of his first three attempts
First close on the first attempt: 0.8
First close on the second attempt: 0.2 x 0.8 = 0.16
First close on the third attempt: 0.2 x 0.2 x 0.8 = 0.032
Because these three events are mutually exclusive, P = 0.8 + 0.16 + 0.032 = 0.992

More exercises: Sharpe 2011, Chapter 8: Exercises 32, 33, and 34 (p.244)

3. Binomial Distribution

When the random variable of interest is the number of successes in a series of Bernoulli trials (e.g., the number of heads when tossing a fair coin five times in a row), the distribution of the number of successes can be modeled by the Binomial Distribution.

n = number of trials, p = probability of success (q = 1 - p = probability of failure)
X = number of successes in n trials

\[ P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad \text{where } \binom{n}{x} = \frac{n!}{x!\,(n-x)!} \]

is the number of ways of choosing which x of the n trials are successes.

Expected number of successes in n trials: \( \mu = E(X) = np \)
Standard deviation of the number of successes in n trials: \( \sigma = SD(X) = \sqrt{npq} \)

More exercises: Sharpe 2011, Chapter 8, Guided Example - ARS pp.218-219; Exercises 39 and 40 (p.245)

4. Poisson Distribution

The Poisson Distribution is used to model the number of occurrences of an event over a given time interval.

\[ P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!} \]
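The three discrete formulas stated in this section (geometric, binomial, and Poisson) can be evaluated directly. The following is a minimal Python sketch, not part of the original handout; the function names are illustrative assumptions. It reproduces the Question 8.1 answers, evaluates the binomial model for five tosses of a fair coin, and evaluates the Poisson formula for an assumed mean of 4 occurrences per interval.

```python
import math


def geometric_pmf(x, p):
    """P(first success occurs on trial x) = q^(x-1) * p, with q = 1 - p."""
    return (1 - p) ** (x - 1) * p


def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials) = C(n, x) * p^x * q^(n-x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(exactly x occurrences) = e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Question 8.1: the salesman closes 80% of his presentations.
first_fail_on_5th = geometric_pmf(5, p=0.2)   # waiting for a failure
first_close_on_4th = geometric_pmf(4, p=0.8)
close_within_3 = sum(geometric_pmf(x, p=0.8) for x in (1, 2, 3))

# Binomial: number of heads in five tosses of a fair coin.
n, p = 5, 0.5
two_heads = binomial_pmf(2, n, p)
binom_mean = n * p
binom_sd = math.sqrt(n * p * (1 - p))

# Poisson with an assumed mean of 4 occurrences per interval.
no_occurrence = poisson_pmf(0, 4)
one_occurrence = poisson_pmf(1, 4)

print(f"geometric: {first_fail_on_5th:.4f}, {first_close_on_4th:.4f}, {close_within_3:.3f}")
print(f"binomial:  P(2 heads)={two_heads:.4f}, mean={binom_mean}, sd={binom_sd:.4f}")
print(f"poisson:   P(0)={no_occurrence:.4f}, P(1)={one_occurrence:.4f}")
```

`math.comb` requires Python 3.8 or later.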
In the Poisson formula, x is a particular number of occurrences over a future time interval; λ is the average number of occurrences for the same time interval, which can be obtained from historical data; and e ≈ 2.71828 is the base of the natural logarithm.

\( E(X) = \lambda \) and \( SD(X) = \sqrt{\lambda} \)

For example, data show that there are 4 visits per minute on average to a small business website over the period 1 pm to 5 pm.

We can estimate the probability of no visits to the website in the next minute: λ = 4 visits per minute, x = 0;

\[ P(X = 0) = \frac{e^{-4}\, 4^0}{0!} = e^{-4} \approx 0.0183 \]  [Excel: EXP(-4)]

We can estimate the probability of exactly 1 visit to the website in the next minute: λ = 4 visits per minute, x = 1;

\[ P(X = 1) = \frac{e^{-4}\, 4^1}{1!} = 4e^{-4} \approx 0.0733 \]  [Excel: EXP(-4)*4]

We can also estimate the probability of no visits to the website in the next 30 seconds: given 4 visits per minute, λ = 2 visits per 30 seconds, x = 0;

\[ P(X = 0) = \frac{e^{-2}\, 2^0}{0!} = e^{-2} \approx 0.1353 \]

Question 8.2 [Sharpe 2011, Exercise 13] A sporting goods manufacturer was asked to sponsor a local boy in two fishing tournaments. They claim the probability that he will win the first tournament is 0.4. If he wins the first tournament, they estimate the probability that he will also win the second is 0.2. They guess that if he loses the first tournament, the probability that he will win the second is 0.3.

a) Are the two tournaments independent? Explain.

b) What is the probability that he loses both tournaments?

c) What is the probability that he wins both tournaments?

d) Find the probability model for the number of tournaments the boy wins:

e) What are the expected value and standard deviation of the number of tournaments he wins?

More Exercises: Sharpe 2011, Chapter 8, Exercises 35, 36, 37, and 38 (pp.244-245)

III 1.
Probability Models for Continuous Random Variables

Uniform Distribution for Continuous Variables

For a discrete variable, the probability that the random variable takes a particular value is given directly (for example, the probability is 1/6 that a fair die shows a 2 after a toss). For a continuous variable, the probability of any one specific value is always zero. Instead, we can only find the probability that the random variable falls within an interval (e.g., greater than a value, less than a value, or between two values). Under the uniform model, the variable is assumed to be equally likely to fall anywhere in the interval from a to b.

The density function of a continuous uniform random variable is:

\[ f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]

[Figure: uniform density f(x), a horizontal line at height 1/(b - a) between x = a and x = b]

Mean: \( \mu = E(X) = \dfrac{a+b}{2} \)

Variance: \( \mathrm{Var}(X) = \dfrac{(b-a)^2}{12} \); Standard Deviation: \( SD(X) = \sqrt{\dfrac{(b-a)^2}{12}} \)

Given the distribution, if a ≤ c ≤ d ≤ b, the probability of c ≤ X ≤ d is:

\[ P(c \le X \le d) = \frac{d-c}{b-a} \]

Example

The sign at a bus stop indicates that buses arrive about every 20 minutes. A passenger who has just arrived at the stop, and has no other information, can model the wait time before the next bus arrives as uniform between 0 and 20 minutes. For example, what is the expected wait time? What is the probability that the bus arrives within the next 5 minutes?

[Figure: uniform density f(x) at height 1/20 between x = 0 and x = 20]

Expected wait time: \( E(X) = (0 + 20)/2 = 10 \) minutes

\( SD(X) = \sqrt{\dfrac{(20-0)^2}{12}} \approx 5.77 \) minutes

\( P(0 \le X \le 5) = \dfrac{5-0}{20-0} = 0.25 \)

More exercises: Sharpe 2011, Chapter 8, Exercises 75 and 76 (p.248)

2. Normal Distribution

Again, the probability that a continuous random variable equals exactly a particular value is zero (read Sharpe 2011, pp.223-224).
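The interval-based view of probability can be sketched with the bus-stop model from the uniform distribution section. The following minimal Python sketch is not part of the original handout, and its function names are illustrative assumptions. It computes the mean, the standard deviation, and an interval probability, and shows that a zero-width interval (a single exact value) has probability zero.

```python
import math


def uniform_mean(a, b):
    """Mean of a continuous uniform variable on [a, b]."""
    return (a + b) / 2


def uniform_sd(a, b):
    """Standard deviation of a continuous uniform variable on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)


def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b]; requires a <= c <= d <= b."""
    if not (a <= c <= d <= b):
        raise ValueError("requires a <= c <= d <= b")
    return (d - c) / (b - a)


# Bus-stop model: wait time uniform between 0 and 20 minutes.
a, b = 0, 20
wait_mean = uniform_mean(a, b)
wait_sd = uniform_sd(a, b)
within_5 = uniform_prob(0, 5, a, b)
exactly_5 = uniform_prob(5, 5, a, b)  # zero-width interval

print(f"mean={wait_mean} min, sd={wait_sd:.2f} min")
print(f"P(0 <= X <= 5)={within_5}, P(X = 5)={exactly_5}")
```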
When the distribution of a random variable is a bell-shaped curve, it is called a normal distribution, centered at the mean μ with a standard deviation of σ. This distribution is denoted N(μ, σ).

[Figure: normal curve with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ]

Roughly, the area under the curve between μ - σ and μ + σ is 68%; the area under the curve between μ - 2σ and μ + 2σ is 95%; the area under the curve between μ - 3σ and μ + 3σ is 99.7%.

Standard Normal Distribution

We can transform any normal distribution into a standard normal distribution, N(0, 1), by finding the z-score of each value y:

\[ z = \frac{y - \mu}{\sigma} \]

[Figure: standard normal curve with the horizontal axis marked from -3 to +3]

Question 8.3 [Sharpe 2011, Exercise 46] For the 900 trading days from January 2003 to July 2006, the daily closing price of IBM stock (in $) is well modeled by a Normal model with a mean of $85.60 and a standard deviation of $6.20. According to this model, what is the probability that on a randomly selected day in this period the stock price closed

a) Above $91.80?

b) Below $98?

c) Between $73.20 and $98?

d) Which price is more unusual: $93 or $70?

Question 8.4 [Sharpe 2011, Exercise 50, p.246] Based on the Normal model N(100, 16) describing IQ scores, what percent of applicants would you expect to have scores

a) Over 80:

b) Under 90:

c) Between 112 and 132:

d) Above 125:

Question 8.5 For MBA admissions, a business school only considers applicants with GMAT scores among the top 5%. Assuming GMAT scores are approximately distributed as N(600, 100), how high a GMAT score does it take to be eligible for admission?

More exercises: Sharpe 2011, Chapter 8: Guided Example - Cereal Company, pp.228-229; Packaging Stereos - pp.231-234. Chapter 8 Exercises: 41-45, 47, 48, and 51-74.
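Normal-model calculations like those in Questions 8.3 to 8.5 can be carried out with Python's standard-library `statistics.NormalDist`. The sketch below is not part of the original handout. It deliberately uses a hypothetical model N(50, 10), with a hypothetical value of 65 and a hypothetical top-10% cutoff, rather than the numbers from the questions. It checks the 68-95-99.7 rule, standardizes a value, and finds a tail probability and a cutoff.

```python
from statistics import NormalDist

# Standard normal model N(0, 1)
z = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

# Hypothetical model N(50, 10), chosen only for illustration
model = NormalDist(mu=50, sigma=10)
y = 65

# Standardize: z = (y - mu) / sigma
z_score = (y - model.mean) / model.stdev

# P(Y > 65), computed on the original scale and via the z-score
p_above = 1 - model.cdf(y)
p_above_via_z = 1 - z.cdf(z_score)

# Value that separates the top 10% from the rest (90th percentile)
cutoff_top_10 = model.inv_cdf(0.90)

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
print(f"z-score of {y}: {z_score}")
print(f"P(Y > {y}) = {p_above:.4f} (via z: {p_above_via_z:.4f})")
print(f"top-10% cutoff: {cutoff_top_10:.2f}")
```

Both routes to the tail probability should give the same result, which is the point of standardizing: once a value is converted to a z-score, a single standard normal table or function covers every normal model. `NormalDist` requires Python 3.8 or later.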