OPRE504 Chapter 8 Study Guide
I. Probability Models and Properties of Random Variables
Examples of Probability Models:
1. The probability model of the number showing on the face when tossing a fair die:
\[ P(X = x) = \begin{cases} 1/6 & \text{if } x = 1, 2, 3, 4, 5, \text{ or } 6 \\ 0 & \text{otherwise} \end{cases} \]
(where X denotes the random variable and x denotes a particular value of X)
2. Probability Model for an Insurance Policy
\[ P(X = x) = \begin{cases} 1/1000 & \text{if payout cost is } \$100{,}000 \\ 2/1000 & \text{if payout cost is } \$50{,}000 \\ 997/1000 & \text{if payout cost is } \$0 \end{cases} \]
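Expected Value of a Discrete Random Variable with n Outcomes:
\[ \mu = E(X) = x_1 P(x_1) + x_2 P(x_2) + \dots + x_n P(x_n) \]
For example, Expected Insurance Payout Cost:
\[ E(\text{payout cost}) = \$100{,}000 \times \frac{1}{1000} + \$50{,}000 \times \frac{2}{1000} + \$0 \times \frac{997}{1000} = \$200 \]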
Variance of a Discrete Random Variable:
\[ \sigma^2 = \operatorname{Var}(X) = \sum (x - \mu)^2 P(x) \]
For example, Variance of Insurance Payout Cost:
\[ \operatorname{Var}(\text{payout cost}) = (100{,}000 - 200)^2 \times \frac{1}{1000} + (50{,}000 - 200)^2 \times \frac{2}{1000} + (0 - 200)^2 \times \frac{997}{1000} = 14{,}960{,}000 \]
Standard Deviation of a Discrete Random Variable:
\[ \sigma = \sqrt{\operatorname{Var}(X)} = \sqrt{14{,}960{,}000} \approx \$3{,}867.82 \]
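The insurance calculations above can be checked with a short script. This is only a sketch: the dictionary layout and the function names `expected_value` and `variance` are illustrative choices, not part of the handout, and exact fractions are used so that rounding does not blur the comparison.

```python
from fractions import Fraction
from math import sqrt

# Insurance payout model from the handout: payout cost -> probability.
payout_model = {
    100_000: Fraction(1, 1000),
    50_000: Fraction(2, 1000),
    0: Fraction(997, 1000),
}

def expected_value(model):
    """mu = sum of x * P(x) over all outcomes."""
    return sum(x * p for x, p in model.items())

def variance(model):
    """sigma^2 = sum of (x - mu)^2 * P(x) over all outcomes."""
    mu = expected_value(model)
    return sum((x - mu) ** 2 * p for x, p in model.items())

mu = expected_value(payout_model)
var = variance(payout_model)
sd = sqrt(var)  # standard deviation is the square root of the variance

print(mu, var, round(sd, 2))
```

The same two functions work for any discrete model written as an outcome-to-probability mapping, such as the fair-die model.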
Properties of Random Variables (e.g., X and Y)
\( E(X \pm c) = E(X) \pm c \) (c is a constant)
\( E(aX) = aE(X) \) (a is a constant)
\( E(X + Y) = E(X) + E(Y) \)
\( E(X - Y) = E(X) - E(Y) \)
\( \operatorname{Var}(X \pm c) = \operatorname{Var}(X) \) (c is a constant; adding or subtracting it does not change the variance)
\( \operatorname{Var}(aX) = a^2 \operatorname{Var}(X) \) (a is a constant, squared when taken out)
\( \operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) \) (assuming X and Y are independent)
Chaodong Han
OPRE504 Data Analysis and Decisions
Class Handout
Page 1 of 8
\( \operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) \) (Note: + is not a typo! Assuming X and Y are independent)
\( SD(X \pm c) = SD(X) \)
\( SD(aX) = |a| \, SD(X) \) (standard deviation is never negative)
\( SD(X + Y) = \sqrt{\operatorname{Var}(X) + \operatorname{Var}(Y)} \) (X and Y are assumed to be independent)
\( SD(X - Y) = \sqrt{\operatorname{Var}(X) + \operatorname{Var}(Y)} \) (+ is not a typo! X and Y are assumed to be independent)
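As a sanity check on the variance rules, the sketch below enumerates all 36 outcomes of two independent fair dice and computes each variance exactly. The two-dice setup and the helper name `var_of` are assumptions made for illustration; the handout states the rules without this example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# Independence means P(X = x, Y = y) = (1/6) * (1/6) for every pair.
p_joint = Fraction(1, 36)

def var_of(g):
    """Exact variance of g(X, Y) for two independent fair dice."""
    values = [g(x, y) for x, y in product(faces, faces)]
    mean = sum(v * p_joint for v in values)
    return sum((v - mean) ** 2 * p_joint for v in values)

var_x = var_of(lambda x, y: x)             # Var(X) for one die
var_sum = var_of(lambda x, y: x + y)       # Var(X + Y)
var_diff = var_of(lambda x, y: x - y)      # Var(X - Y)
var_scaled = var_of(lambda x, y: 3 * x)    # Var(3X)
var_shifted = var_of(lambda x, y: x + 10)  # Var(X + 10)

print(var_x, var_sum, var_diff, var_scaled, var_shifted)
```

The sum and the difference have the same variance because subtracting Y still adds Y's variability to the result.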
II. Probability Models for Discrete Random Variables
1. Uniform Distribution
When each of n outcomes has an equal probability of occurring:
\( P(X = i) = 1/n \) for i = 1, 2, 3, ..., n.
[Figure: probability (1/6 for each outcome) plotted against the number showing on a face of a die, 1 through 6]
2. Geometric Distribution for Bernoulli Trials
Bernoulli trials satisfy three conditions: there are only two possible outcomes for each trial, customarily named success and failure (for example, tossing a coin has only two outcomes: head and tail); the probability of success, denoted p, is the same for each trial (the probability of failure is q = 1 - p); and all trials are independent (the outcome of one trial does not affect the outcome of any other trial).
The expected number of trials needed for the first success to occur (e.g., for a head to occur when tossing a coin):
\[ \mu = E(X) = \frac{1}{p} \] (where p = probability of success)
\[ \sigma = \sqrt{\frac{q}{p^2}} \] (standard deviation of the number of trials)
\[ P(X = x) = q^{x-1} p \] (the probability that x trials are needed to see a first success)
Example: for tossing a fair coin, the probability of seeing a head (success) is 0.5 and the probability of seeing a tail (failure) is 1 - 0.5 = 0.5. So the number of tosses needed for the first head to occur can be modelled as follows:
x          1      2          3             4             5             ...
P(X = x)   p      qp         q^2 p         q^3 p         q^4 p         ...
           0.5    0.5(0.5)   0.5^2 (0.5)   0.5^3 (0.5)   0.5^4 (0.5)   ...
           0.5    0.25       0.125         0.0625        0.03125       ...
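The coin-toss table can be regenerated directly from the formula for P(X = x). The following is a minimal sketch; the variable names `table`, `mean_tosses`, and `sd_tosses` are illustrative and not part of the handout.

```python
from math import sqrt

p = 0.5    # probability of a head (success) on each toss
q = 1 - p  # probability of a tail (failure)

# P(X = x) = q^(x-1) * p for x = 1..5, the columns shown in the table.
table = {x: q ** (x - 1) * p for x in range(1, 6)}
for x, prob in table.items():
    print(x, prob)

mean_tosses = 1 / p           # expected number of tosses until the first head
sd_tosses = sqrt(q / p ** 2)  # standard deviation of that number
print(mean_tosses, round(sd_tosses, 4))
```

With a fair coin, the first head takes two tosses on average, with a standard deviation of about 1.41 tosses.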
More exercises:
Sharpe 2011, Chapter 8, Exercises 29, 30 (p.244)
Question 8.1 [Sharpe 2011, Exercise 31] A salesman normally makes a sale (closes) on 80% of
his presentations. Assuming the presentations are independent, find the probability of each of the
following:
a) He fails to close the sale for the first time on his fifth attempt (presentation)
Four closes followed by a failure (here the event being waited for is a failure, so its probability is 0.2):
\( P(X = 5) = 0.8^{5-1} \times 0.2 = 0.0819 \)
b) He closes his first presentation on his fourth attempt
Three failures followed by a success:
\( P(X = 4) = 0.2^{4-1} \times 0.8 = 0.0064 \)
c) The first presentation he closes will be on his second attempt
One failure followed by a success:
\( P(X = 2) = 0.2^{2-1} \times 0.8 = 0.16 \)
d) The first presentation he closes will be on one of his first three attempts
First close on the first attempt: 0.8
First close on the second attempt: 0.2 x 0.8 = 0.16
First close on the third attempt: 0.2 x 0.2 x 0.8 = 0.032
Because these three events are mutually exclusive, P = 0.8 + 0.16 + 0.032 = 0.992
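The four answers to Question 8.1 can be reproduced with one small helper. This is a sketch only; the function name `geometric_pmf` and the variable names are hypothetical, chosen for this illustration.

```python
def geometric_pmf(x, p):
    """P(X = x): the first 'success' occurs on trial x, where each trial
    succeeds independently with probability p."""
    return (1 - p) ** (x - 1) * p

p_close = 0.8         # probability the salesman closes a sale
p_fail = 1 - p_close  # probability a presentation fails

# a) first failure on the fifth attempt: the awaited event is a failure
answer_a = geometric_pmf(5, p_fail)
# b) first close on the fourth attempt
answer_b = geometric_pmf(4, p_close)
# c) first close on the second attempt
answer_c = geometric_pmf(2, p_close)
# d) first close on attempt 1, 2, or 3 (mutually exclusive, so add)
answer_d = sum(geometric_pmf(x, p_close) for x in (1, 2, 3))

print(round(answer_a, 4), round(answer_b, 4),
      round(answer_c, 4), round(answer_d, 4))
```

Part d) can also be found as 1 minus the probability of three failures in a row, 1 - 0.2^3 = 0.992.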
More exercises:
Sharpe 2011, Chapter 8: Exercises 32, 33, and 34 (p.244)
3. Binomial Distribution
When the random variable of interest is the number of successes in a series of Bernoulli trials (e.g., the number of heads that occur when tossing a fair coin five times in a row), the distribution of the number of successes can be modeled by the Binomial Distribution.
n = number of trials, p = probability of success (q = 1 - p = probability of failure)
X = number of successes in n trials
\[ P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad \text{where } \binom{n}{x} = \frac{n!}{x!\,(n-x)!} \]
is the number of ways to choose which x of the n trials are successes.
Expected number of successes in n trials: \( \mu = E(X) = np \)
Standard deviation of the number of successes in n trials: \( \sigma = SD(X) = \sqrt{npq} \)
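To make the binomial formulas concrete, the sketch below applies them to the five-coin-toss example mentioned above. The function name `binomial_pmf` is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x): exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 5, 0.5  # five tosses of a fair coin, success = head

# Full probability model for the number of heads.
model = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, prob in model.items():
    print(x, prob)

mean_heads = n * p                 # mu = np
sd_heads = sqrt(n * p * (1 - p))   # sigma = sqrt(npq)
print(mean_heads, round(sd_heads, 3))
```

The probabilities are symmetric around 2.5 heads because p = q = 0.5; with any other p the model is skewed.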
More exercises:
Sharpe 2011, Chapter 8, Guided Example: ARS, pp.218-219; Exercises 39 and 40 (p.245)
4. Poisson Distribution
The Poisson Distribution is used to predict the number of occurrences of an event over a given time interval.
\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \]
Here x is a particular number of occurrences over a future time interval; \( \lambda \) is the average number of occurrences for the same time interval, which can be obtained from historical data; and e = 2.71828 is the base of the natural logarithm. \( E(X) = \lambda \) and \( SD(X) = \sqrt{\lambda} \).
For example, data show that there are 4 visits per minute on average to a small business website over the period 1 pm to 5 pm.
We can estimate the probability of no visits to the website in the next minute:
\( \lambda \) = 4 hits per minute, x = 0:
\[ P(X = 0) = \frac{e^{-4}\, 4^{0}}{0!} = e^{-4} = 2.71828^{-4} = 0.0183 \] [Excel: exp(-4)]
We can estimate the probability of exactly 1 visit to the website in the next minute:
\( \lambda \) = 4 hits per minute, x = 1:
\[ P(X = 1) = \frac{e^{-4}\, 4^{1}}{1!} = 4e^{-4} = 4 \times 2.71828^{-4} = 0.0733 \] [Excel: exp(-4)*4]
We can also estimate the probability of no visits to the website in the next 30 seconds:
given 4 hits per minute, \( \lambda \) = 2 hits per 30 seconds, x = 0:
\[ P(X = 0) = \frac{e^{-2}\, 2^{0}}{0!} = e^{-2} = 0.1353 \]
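The website-visit calculations can be reproduced outside Excel with a few lines of Python. This is a sketch under the same assumptions as the example (an average of 4 visits per minute); the function name `poisson_pmf` is hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) when the average number of occurrences per interval is lam."""
    return exp(-lam) * lam ** x / factorial(x)

visits_per_minute = 4

p_none_next_minute = poisson_pmf(0, visits_per_minute)
p_one_next_minute = poisson_pmf(1, visits_per_minute)
# For a 30-second interval the average is scaled to half a minute.
p_none_next_30s = poisson_pmf(0, visits_per_minute * 0.5)

print(round(p_none_next_minute, 4),
      round(p_one_next_minute, 4),
      round(p_none_next_30s, 4))
```

Note that 4e^-4 is about 0.07326, so to four decimal places the one-visit probability is 0.0733.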
Question 8.2 [Sharpe 2011, Exercise 13] A sporting goods manufacturer was asked to sponsor a
local boy in two fishing tournaments. They claim the probability that he will win the first
tournament is 0.4. If he wins the first tournament, they estimate the probability that he will also
win the second is 0.2. They guess that if he loses the first tournament, the probability that he will
win the second is 0.3.
a) Are the two tournaments independent? Explain.
b) What is the probability that he loses both tournaments?
c) What is the probability that he wins both tournaments?
d) Find the probability model for the number of tournaments the boy wins.
e) What are the expected value and standard deviation of the number of tournaments he wins?
More Exercises:
Sharpe 2011, Chapter 8, Exercises 35, 36, 37, and 38 (pp.244-245)
III. Probability Models for Continuous Random Variables
1. Uniform Distribution for Continuous Variables
Unlike the case of discrete variables, where the probability that the random variable takes a particular value is given (for example, the probability is 1/6 that a fair die shows a 2 after a toss), the probability that a continuous variable takes any specific value is always zero. Instead, we can only find the probability that the random variable falls within an interval (e.g., greater than a value, less than a value, or between two values).
For a continuous uniform random variable on the interval from a to b, the variable is assumed to be equally likely to fall anywhere in the interval. Its density function looks like:
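[Figure: density f(x) of a continuous uniform random variable, constant at height 1/(b - a) between x = a and x = b, and 0 elsewhere]
\[ f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]
\[ \mu = E(X) = \frac{a+b}{2} \]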
Variance: \( \operatorname{Var}(X) = \dfrac{(b-a)^2}{12} \); Standard Deviation: \( SD(X) = \sqrt{\dfrac{(b-a)^2}{12}} \)
Given the distribution, if a ≤ c ≤ d ≤ b, the probability of c ≤ X ≤ d is:
\[ P(c \le X \le d) = \frac{d-c}{b-a} \]
Example
The sign at a bus stop indicates that buses arrive about every 20 minutes. A passenger who has just arrived at the stop, with no other information, can model the wait time before the bus arrives as uniform between 0 and 20 minutes. For example, what is the expected arrival time? What is the probability that the bus arrives within the next 5 minutes?
[Figure: uniform density f(x), constant at height 1/20 for x between 0 and 20 minutes]
Expected arrival time: \( E(X) = (0 + 20)/2 = 10 \) minutes
\( SD(X) = \sqrt{\dfrac{(20-0)^2}{12}} = 5.77 \) minutes
\( P(0 \le X \le 5) = \dfrac{5-0}{20-0} = 0.25 \)
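The bus-stop numbers follow directly from the continuous uniform formulas. The sketch below wraps those formulas in three small functions; the names `uniform_mean`, `uniform_sd`, and `uniform_prob` are illustrative, not from the handout.

```python
from math import sqrt

def uniform_mean(a, b):
    """E(X) = (a + b) / 2 for X uniform on [a, b]."""
    return (a + b) / 2

def uniform_sd(a, b):
    """SD(X) = sqrt((b - a)^2 / 12) for X uniform on [a, b]."""
    return sqrt((b - a) ** 2 / 12)

def uniform_prob(c, d, a, b):
    """P(c <= X <= d) = (d - c) / (b - a), valid when a <= c <= d <= b."""
    if not a <= c <= d <= b:
        raise ValueError("need a <= c <= d <= b")
    return (d - c) / (b - a)

a, b = 0, 20  # wait time is uniform between 0 and 20 minutes

expected_wait = uniform_mean(a, b)
sd_wait = uniform_sd(a, b)
p_within_5 = uniform_prob(0, 5, a, b)

print(expected_wait, round(sd_wait, 2), p_within_5)
```

Because the density is flat, the probability of any interval depends only on its length, so the bus is just as likely to arrive between minutes 10 and 15 as between minutes 0 and 5.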
More exercises:
Sharpe 2011, Chapter 8, Exercise 75 and 76 (p.248)
2. Normal Distribution
Again, the probability that a continuous random variable takes exactly a particular value is zero (read Sharpe 2011, pp.223-224).
When the distribution of a random variable is a bell-shaped curve centered at a mean of μ with a standard deviation of σ, it is called a normal distribution. This distribution is denoted N(μ, σ).
[Figure: bell-shaped Normal curve with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ]
Roughly, the area under the curve between μ - σ and μ + σ is 68%; the area under the curve between μ - 2σ and μ + 2σ is 95%; and the area under the curve between μ - 3σ and μ + 3σ is 99.7%.
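The 68%, 95%, and 99.7% figures can be checked numerically with Python's standard library (`statistics.NormalDist`, available from Python 3.8). This is a sketch: the mean of 50 and standard deviation of 8 are made-up values, chosen only to show that the areas do not depend on μ or σ.

```python
from statistics import NormalDist

def area_within(k, mu, sigma):
    """Area under N(mu, sigma) between mu - k*sigma and mu + k*sigma."""
    model = NormalDist(mu=mu, sigma=sigma)
    return model.cdf(mu + k * sigma) - model.cdf(mu - k * sigma)

# Hypothetical parameters; any mu and any positive sigma give the same areas.
mu, sigma = 50, 8

within_1 = area_within(1, mu, sigma)
within_2 = area_within(2, mu, sigma)
within_3 = area_within(3, mu, sigma)

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

The exact areas are slightly different from the rounded rule of thumb: about 68.27%, 95.45%, and 99.73%.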
Standard Normal Distribution
We can transform any normal distribution into a standard normal distribution, N(0, 1), by finding the z-score of each value y:
\[ z = \frac{y - \mu}{\sigma} \]
[Figure: standard Normal curve with the horizontal axis marked at -3, -2, -1, 0, +1, +2, +3]
Question 8.3 [Sharpe 2011, Exercise 46] For the 900 trading days from January 2003 to July
2006, the daily closing price of IBM stock (in $) is well modeled by a Normal model with mean
of $85.60 and standard deviation of $6.20. According to this model, what is the probability that
on a randomly selected day in this period the stock price closed
a) Above $91.80?
b) Below $98?
c) Between $73.20 and $98?
d) Which price is more unusual: $93 or $70?
Question 8.4 [Sharpe 2011, Exercise 50, p.246] Based on the Normal Model N(100, 16)
describing IQ scores, what percent of applicants would you expect to have scores
a) Over 80:
b) Under 90:
c) Between 112 and 132:
d) Above 125:
Question 8.5 For MBA admissions, a business school only considers applicants with GMAT
scores among the top 5%. Assuming GMAT scores are approximately distributed as N (600,
100), how high a GMAT score does it take to be eligible for admission?
More exercises:
Sharpe 2011, Chapter 8: Guided Example: Cereal Company, pp.228-229; Packaging Stereos, pp.231-234.
Chapter 8 Exercises: 41-45, 47, 48, and 51-74.