STAT 401 Week 3 Lab 1
Geoffrey Thompson
6/3/2013
1 Example Problems
The first problem is about binomial and hypergeometric distributions, the second problem is about calculating the expectation and variance of random variables, and the third problem is about continuous uniform and normal distributions.
1.1 Zombie Outbreak
Unfortunately, there has been an outbreak of zombie-ism or something like it in
the country. Everybody in the city has an independent probability p of being
infected. Everybody who is infected becomes a zombie very quickly, so we will
simply refer to infected people as zombies even if they are not yet presenting
with symptoms.
This is horribly inconvenient, but we will have lab today anyway.
1. There are 50000 people in Ames and p = 0.04. If we are interested in the
number of zombies in town, what probability distribution is this? If you
are only interested in the probability that one specific person is a zombie,
what probability distribution is this?
The number of zombies in town follows a binomial distribution with n = 50000 and p = 0.04. If you are only interested in one specific person, this is a Bernoulli distribution with p = 0.04.
2. Given the numbers in (1), what is the expected number of zombies in
Ames?
The expectation for a binomial is np, so this is 50000 · 0.04 = 2000.
3. What is the variance of the number of zombies in Ames?
The variance of a binomial is np(1 − p), so this is 50000 · 0.04 · 0.96 = 1920
4. There are 40 students in this class. What is the expected number of
zombies in this class? What is the variance of the number of zombies in
this class?
This is still a binomial problem because we have a number of students,
n = 40, a probability that a given student is a zombie, p = 0.04, and we
are interested in counting zombies.
The expectation of a binomial is np, so here it is np = 40 · 0.04 = 1.6.
The variance of a binomial is np(1 − p), so here it is np(1 − p) = 40 · 0.04 ·
.96 = 1.536
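The two binomial formulas used so far, np and np(1 − p), can be checked with a short Python sketch. The function name `binomial_mean_var` is illustrative and not part of the lab materials.

```python
def binomial_mean_var(n, p):
    """Return (mean, variance) of a Binomial(n, p) count."""
    mean = n * p                 # E(X) = np
    variance = n * p * (1 - p)   # Var(X) = np(1 - p)
    return mean, variance

# Whole town (n = 50000) and one class (n = 40), both with p = 0.04.
town_mean, town_var = binomial_mean_var(50000, 0.04)
class_mean, class_var = binomial_mean_var(40, 0.04)

print("Ames: ", town_mean, town_var)
print("Class:", class_mean, class_var)
```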
5. Suppose the class breaks into small groups to survive the zombie apocalypse. Suppose, for these questions, p = 0.1.
(a) In a group of two people, what is the probability that none are zombies? What is the probability that 1 is a zombie?
This is still a binomial. Since each student is independently considered, the size of the class is irrelevant. p = 0.1 no matter what.
$$P(X = 0) = \binom{2}{0} p^0 (1 - p)^2 = (.9)^2 = .81$$
$$P(X = 1) = \binom{2}{1} p (1 - p) = 2(.1)(.9) = .18$$
(b) What about in a group of 5 people?
$$P(X = 0) = \binom{5}{0} p^0 (1 - p)^5 = (.9)^5 = 0.5905$$
$$P(X = 1) = \binom{5}{1} p (1 - p)^4 = 5(.1)(.9)^4 = 0.3281$$
(c) What about in a group of 10 people?
$$P(X = 0) = \binom{10}{0} p^0 (1 - p)^{10} = (.9)^{10} = 0.3487$$
$$P(X = 1) = \binom{10}{1} p (1 - p)^9 = 10(.1)(.9)^9 = 0.3874$$
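The binomial pmf for the small groups can be evaluated directly rather than by hand. This is a minimal sketch using only the standard library; `binomial_pmf` is an illustrative helper name, not something from the course notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p = 0.1
for n in (2, 5, 10):
    p_none = binomial_pmf(0, n, p)  # nobody in the group is a zombie
    p_one = binomial_pmf(1, n, p)   # exactly one zombie in the group
    print(f"n = {n:2d}: P(X=0) = {p_none:.4f}, P(X=1) = {p_one:.4f}")
```

Note that the binomial coefficient for exactly one zombie is n itself, so it changes with the group size.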
6. Suppose we know that 5 people in class are zombies, but we do not know
which.
(a) In a group of two people, what is the probability that 1 is a zombie?
HINT: hypergeometric.
This is hypergeometric because we have a population (the class), we
know exactly how many zombies are in the class, and we’re taking
groups out of the class and trying to figure out how many zombies are
in those groups.
From the definition of a hypergeometric in our notes, we have the
following:
N = 40; M = 5; n = 2; x = 1
From the definition, then:
$$P(X = 1) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{5}{1}\binom{35}{1}}{\binom{40}{2}} = 0.2244$$
(b) In a group of 5 people, what is the probability that all 5 are zombies?
$$P(X = 5) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{5}{5}\binom{35}{0}}{\binom{40}{5}} = 1.5197 \times 10^{-6}$$
(c) In a group of 5 people, what is the probability that exactly 2 people
are zombies?
$$P(X = 2) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{5}{2}\binom{35}{3}}{\binom{40}{5}} = 0.0995$$
(d) In a group of 5 people, what is the probability that none are zombies?
$$P(X = 0) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{5}{0}\binom{35}{5}}{\binom{40}{5}} = 0.4934$$
(e) In a group of 10 people, what is the probability that all 10 are zombies?
Since there are only 5 zombies, there cannot be 10 zombies in a group.
The probability of any impossible event is 0.
Of course, zombies themselves are impossible, but let’s ignore that
for the sake of argument.
(f) In a group of 10 people, what is the probability that exactly 2 are
zombies?
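Here N − M = 40 − 5 = 35 and n − x = 10 − 2 = 8:
$$P(X = 2) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{5}{2}\binom{35}{8}}{\binom{40}{10}} = 0.2777$$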
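The hypergeometric probabilities in this problem all follow the same pattern, so they can be computed with one helper. This is a sketch under the notation of the notes (N population size, M zombies in the population, n group size, x zombies in the group); the function name `hypergeom_pmf` is an assumption for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x): x zombies in a group of n drawn from N people, M of them zombies."""
    # Impossible outcomes (more zombies than exist, or more non-zombies
    # than exist) have probability 0.
    if x < 0 or x > n or x > M or n - x > N - M:
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

N, M = 40, 5
cases = [(2, 1), (5, 5), (5, 2), (5, 0), (10, 10), (10, 2)]  # (n, x) pairs
for n, x in cases:
    print(f"n = {n:2d}, x = {x:2d}: {hypergeom_pmf(x, N, M, n):.4g}")
```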
1.2 Rental Property
Mr. Brocklehurst owns three rental properties. All three have leases expiring in
July and he still has not found new tenants. However, he knows that, for each
property, he has a probability p = 0.8 of finding a new tenant for July.
For each property, he has a fixed cost of $475.
For each property that he rents out, he receives a rent of $750.
He receives $0 for each property that he does not rent out.
For each property he does not rent out, he has an additional maintenance
expense of $50.
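| x | P(X = x) | Profit(x) | P(x) · Profit(x) | P(x) · Profit(x)² |
|---|---|---|---|---|
| 0 | 0.008 | -1575 | -12.6 | $1.9845 \times 10^4$ |
| 1 | 0.096 | -775 | -74.4 | $5.766 \times 10^4$ |
| 2 | 0.384 | 25 | 9.6 | 240 |
| 3 | 0.512 | 825 | 422.4 | $3.4848 \times 10^5$ |
| Total | 1 | n/a | 345 | $4.2622 \times 10^5$ |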
1. What is the expected number of rented properties?
This is a binomial distribution, so the expected number of rented properties is np = 3 · 0.8 = 2.4.
2. In the table above, X is the number of rented properties. Fill out P (x).
The pmf is:
$$P(X = x) = \binom{3}{x} p^x (1 - p)^{3 - x}$$
3. Profit is a random variable that is a function of X. Profit(x) denotes
the profit at a particular value of x of X. Calculate the profit for each
scenario.
$$\mathrm{Profit}(x) = -475 \cdot 3 + 750x - 50(3 - x) = -1575 + 800x$$
4. After calculating the profit for each scenario, multiply the profits by the
probability of that outcome. Sum them up to get the expected profit in
the last column.
You can either solve this by that method or note that $E(\mathrm{Profit}) = -1575 + 800E(X) = 345$.
5. Fill in the bottom row and calculate V ar(P rof it).
$$\mathrm{Var}(\mathrm{Profit}) = E(\mathrm{Profit}^2) - E(\mathrm{Profit})^2 = 4.2622 \times 10^5 - 345^2 = 3.072 \times 10^5$$
Alternatively, note that profit is a linear function of X and therefore
$$\mathrm{Var}(\mathrm{Profit}) = 800^2 \, \mathrm{Var}(X) = 3.072 \times 10^5$$
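The table method and the linear shortcut can be cross-checked numerically. The sketch below rebuilds each column of the table from the pmf and the profit function given in the problem; the variable names are illustrative.

```python
from math import comb

n, p = 3, 0.8  # three properties, each rented with probability 0.8

def pmf(x):
    """P(X = x) for the number of rented properties."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def profit(x):
    """Rent received minus fixed costs and vacancy maintenance."""
    return -475 * n + 750 * x - 50 * (n - x)

# Table method: sum the P(x) * Profit(x) and P(x) * Profit(x)^2 columns.
expected_profit = sum(pmf(x) * profit(x) for x in range(n + 1))
expected_sq = sum(pmf(x) * profit(x) ** 2 for x in range(n + 1))
var_profit = expected_sq - expected_profit**2

# Linear shortcut: Profit = -1575 + 800 X with X ~ Binomial(3, 0.8).
shortcut_mean = -1575 + 800 * (n * p)
shortcut_var = 800**2 * (n * p * (1 - p))

print(expected_profit, var_profit)
print(shortcut_mean, shortcut_var)
```

Both routes should agree up to floating-point rounding.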
1.3 Continuous Zombie Problems
More bad news: there has been another zombie outbreak and it somehow involves continuous probability distributions.
1. The number of zombies in Ames is uniformly distributed between 1000
and 9000.
(a) What is the expected number of zombies in Ames?
The endpoints of the distribution are A = 1000 and B = 9000. The expectation of a uniform distribution is the midpoint of the interval:
$$E(X) = \frac{9000 + 1000}{2} = 5000$$
(b) What is the variance in the number of zombies in Ames?
From the formula for uniform distributions:
$$\mathrm{Var}(X) = \frac{1}{12}(B - A)^2 = \frac{8000^2}{12} = 5.3333 \times 10^6$$
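As a quick check of the uniform mean and variance formulas, here is a small sketch; `uniform_mean_var` is an illustrative name and not part of the lab.

```python
def uniform_mean_var(A, B):
    """Mean and variance of a continuous Uniform(A, B) random variable."""
    mean = (A + B) / 2
    variance = (B - A) ** 2 / 12
    return mean, variance

unif_mean, unif_var = uniform_mean_var(1000, 9000)
print(unif_mean, unif_var)
```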
(c) What is the probability that between 3000 and 4000 zombies are in
Ames?
There are two ways of doing this: either an integral of the pdf or
using what we know about the uniform distribution.
Integral:
The pdf is $f_X(x) = \frac{1}{8000}$ for $x \in (1000, 9000)$ and 0 otherwise (from
the definitions).
$$\int_{3000}^{4000} f_X(x)\,dx = \int_{3000}^{4000} \frac{dx}{8000} = \frac{4000 - 3000}{8000} = 0.125$$
Quick way:
If you are trying to find P(a ≤ X ≤ b) for a uniform r.v. on the
interval (A, B) with A ≤ a < b ≤ B, then:
$$P(a \le X \le b) = \frac{b - a}{B - A}$$
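(d) What is the probability that fewer than 6000 zombies are in Ames?
Using the shortcut above with a = A = 1000 and b = 6000, we have:
$$P(X < 6000) = \frac{b - a}{B - A} = \frac{6000 - 1000}{9000 - 1000} = \frac{5}{8} = 0.625$$
Note that the numerator is 6000 − 1000 rather than 6000, because the distribution starts at 1000, not at 0.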
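The interval shortcut for a uniform distribution can be written as a small function. This sketch clips the requested interval to the support (A, B), which handles the "fewer than 6000" case automatically; the name `uniform_prob` is an assumption for illustration.

```python
def uniform_prob(a, b, A, B):
    """P(a <= X <= b) for X ~ Uniform(A, B), clipping [a, b] to the support."""
    lo = max(a, A)
    hi = min(b, B)
    if hi <= lo:
        return 0.0  # the interval misses the support entirely
    return (hi - lo) / (B - A)

p_between = uniform_prob(3000, 4000, 1000, 9000)           # part (c)
p_below = uniform_prob(float("-inf"), 6000, 1000, 9000)    # part (d)
print(p_between, p_below)
```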
(e) The zombie outbreak will cost the city $1,000,000 plus an additional
$17,000 per zombie. What is the expected cost of the zombie outbreak? What is the standard deviation of the cost of the zombie
outbreak?
This is a linear function of X. Therefore, we can use the tools we
already know.
$$Y = 1000000 + 17000X$$
$$E(Y) = E(1000000 + 17000X) = 1000000 + 17000E(X) = 8.6 \times 10^7$$
$$\mathrm{Var}(Y) = \mathrm{Var}(1000000 + 17000X) = 17000^2 \, \mathrm{Var}(X) = \sigma_Y^2 = 1.5413 \times 10^{15}$$
$$\sigma_Y = \sqrt{\sigma_Y^2} = 3.926 \times 10^7$$
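The rules E(a + bX) = a + bE(X) and Var(a + bX) = b² Var(X) can be applied mechanically. The sketch below does so for the uniform case; `linear_transform_moments` is a hypothetical helper name.

```python
from math import sqrt

def linear_transform_moments(a, b, mean_x, var_x):
    """Mean and standard deviation of Y = a + bX."""
    mean_y = a + b * mean_x
    sd_y = abs(b) * sqrt(var_x)  # the constant a does not affect spread
    return mean_y, sd_y

# X ~ Uniform(1000, 9000): mean (A + B)/2 and variance (B - A)^2 / 12.
mean_x = (1000 + 9000) / 2
var_x = (9000 - 1000) ** 2 / 12

mean_cost, sd_cost = linear_transform_moments(1_000_000, 17_000, mean_x, var_x)
print(f"{mean_cost:.4g} {sd_cost:.4g}")
```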
(f) In the file http://gzt.public.iastate.edu/stat401/data/unifzombie.txt, I have simulated the draws from this distribution. In JMP, load
this data set and calculate the mean and variance. Plot a histogram.
Sorry, this isn’t JMP, but it’s the easiest way to show a histogram.
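For readers without JMP, a rough equivalent can be sketched in Python. This example does not read the data file; it simulates its own Uniform(1000, 9000) draws, and the seed, the sample size of 20,000, and the text histogram are all assumptions made for illustration.

```python
import random
from statistics import mean, variance

random.seed(401)  # arbitrary seed so the run is repeatable
draws = [random.uniform(1000, 9000) for _ in range(20_000)]

sample_mean = mean(draws)
sample_var = variance(draws)  # sample variance (n - 1 denominator)
print(f"mean = {sample_mean:.1f}, variance = {sample_var:.4g}")

# Crude text histogram with eight bins, each 1000 wide.
bins = [0] * 8
for d in draws:
    index = min(int((d - 1000) // 1000), 7)  # guard the right endpoint
    bins[index] += 1
for i, count in enumerate(bins):
    left = 1000 + 1000 * i
    print(f"{left:5d}-{left + 1000:5d} {'#' * (count // 100)}")
```

The sample mean and variance should land near the theoretical values from parts (a) and (b), and the bars should be roughly equal in length.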
2. (only if we’ve gotten to the normal distribution)
The number of zombies in Ames is normally distributed with mean µ =
5000 and standard deviation σ = 2000.
(a) Write out the formula for the pdf for the number of zombies in Ames.
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) = \frac{1}{2000\sqrt{2\pi}} \exp\left(-\frac{(x - 5000)^2}{2 \cdot 2000^2}\right)$$
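The pdf formula can be typed in directly and compared against the standard library's `statistics.NormalDist`. This is a minimal sketch; `normal_pdf` is an illustrative name.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density written out exactly as in the formula."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 5000, 2000
for x in (3000, 5000, 7000):
    by_formula = normal_pdf(x, mu, sigma)
    by_library = NormalDist(mu, sigma).pdf(x)
    print(x, by_formula, by_library)
```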
(b) What is the probability fewer than 5000 zombies are in Ames?
Here is a standard method for calculating probabilities in the normal
distribution. The idea is that you transform it to a standard normal.
$$P(X < 5000) = P(X - \mu < 0) = P\left(\frac{X - \mu}{\sigma} < 0\right) = P(Z < 0) = \Phi(0) = .5$$
It is helpful when doing this to keep the X side of the inequality in symbols (e.g., µ and σ) while plugging in numbers on the other side. This lets you know when you have gotten to Z.
(c) What is the probability fewer than 7000 zombies are in Ames?
We use the standard method above:
$$\begin{aligned}
P(X < 7000) &= P(X - \mu < 7000 - 5000) = P(X - \mu < 2000) \\
&= P\left(\frac{X - \mu}{\sigma} < \frac{2000}{2000}\right) = P(Z < 1) \\
&= \Phi(1) = 0.8413
\end{aligned}$$
The idea is to subtract off the mean and then divide by the standard
deviation to get a standard normal. Then look up the answer from
a table.
(d) What is the probability fewer than 3000 zombies are in Ames?
$$\begin{aligned}
P(X < 3000) &= P(X - \mu < 3000 - 5000) = P(X - \mu < -2000) \\
&= P\left(\frac{X - \mu}{\sigma} < \frac{-2000}{2000}\right) = P(Z < -1) \\
&= \Phi(-1) = 0.1587
\end{aligned}$$
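The standardize-then-look-up method can be mirrored in code, with the error function standing in for the table. This is a sketch; the helper names `phi` and `normal_cdf` are illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ Normal(mu, sigma): standardize, then apply phi."""
    z = (x - mu) / sigma  # subtract the mean, divide by the standard deviation
    return phi(z)

mu, sigma = 5000, 2000
for x in (5000, 7000, 3000):
    z = (x - mu) / sigma
    print(f"x = {x}: z = {z:+.1f}, P(X < x) = {normal_cdf(x, mu, sigma):.4f}")
```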
(e) How would we calculate the probability that between 3256 and 8821 zombies are in Ames? Set up the equations; we do not need to evaluate
them.
I would have demonstrated this in lab if lecture had covered normal
distributions.
This one is tricky! I don’t know if you’ve done this in class. There’s
a hard way: doing an integral. We do not want to do that.
There are a couple of easier ways. One is to calculate $F_X(8821)$ and
$F_X(3256)$ and then find $F_X(8821) - F_X(3256)$, where $F_X$ is the cdf
of X.
A better way is to translate the first easier way into a problem involving standard normals. Here is how that works:
$$\begin{aligned}
P(3256 < X < 8821) &= P(3256 - 5000 < X - \mu < 8821 - 5000) \\
&= P(-1744 < X - \mu < 3821) \\
&= P\left(\frac{-1744}{2000} < \frac{X - \mu}{\sigma} < \frac{3821}{2000}\right) \\
&= P\left(\frac{-1744}{2000} < Z < \frac{3821}{2000}\right) \\
&= \Phi\left(\frac{3821}{2000}\right) - \Phi\left(\frac{-1744}{2000}\right) = 0.7804
\end{aligned}$$
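Both of the easier routes (the cdf of X directly, and the standardized version) can be evaluated with `statistics.NormalDist` from the Python standard library. This is a sketch for checking the setup, not a replacement for the table lookup expected in class.

```python
from statistics import NormalDist

mu, sigma = 5000, 2000
low, high = 3256, 8821

# Route 1: difference of the cdf of X at the two endpoints.
X = NormalDist(mu, sigma)
direct = X.cdf(high) - X.cdf(low)

# Route 2: standardize the endpoints, then use the standard normal cdf.
z_low = (low - mu) / sigma
z_high = (high - mu) / sigma
Z = NormalDist()  # mean 0, standard deviation 1
standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(z_low, z_high)
print(round(direct, 4), round(standardized, 4))
```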
(f) The zombie outbreak will cost the city $1,000,000 plus an additional
$17,000 per zombie. What is the expected cost of the zombie outbreak? What is the standard deviation of the cost of the zombie
outbreak?
The idea here is that this is a linear function of X, so the usual tricks
apply.
$$E(Y) = E(1000000 + 17000X) = 1000000 + 17000E(X) = 8.6 \times 10^7$$
$$\mathrm{Var}(Y) = \mathrm{Var}(1000000 + 17000X) = 17000^2 \, \mathrm{Var}(X) = 1.156 \times 10^{15}$$
$$\mathrm{Var}(Y) = \sigma_Y^2; \quad \sigma_Y = \sqrt{\sigma_Y^2} = 3.4 \times 10^7$$
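Python's `statistics.NormalDist` supports multiplying by and adding a constant, which applies the same scaling and shifting rules used here. The sketch below relies on that documented behavior; the variable names are illustrative.

```python
from statistics import NormalDist

zombies = NormalDist(mu=5000, sigma=2000)

# Cost = 1,000,000 + 17,000 per zombie: scale first, then shift.
cost = zombies * 17_000 + 1_000_000

print(cost.mean)      # expected cost
print(cost.stdev)     # standard deviation of the cost
print(cost.variance)  # variance of the cost
```

Because a linear function of a normal random variable is itself normal, the result is again a `NormalDist`, so probabilities about the cost could be read off its `cdf` in the same way.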
(g) In the file http://gzt.public.iastate.edu/stat401/data/normalzombie.txt, I have simulated the draws from this distribution. In JMP, load
this data set and calculate the mean and variance. Plot a histogram.
Make a normal quantile plot. There is something wrong with this
data. What is it? Look at the histogram or a scatter plot to see.
Sorry once again for using something besides JMP. To do the same
from JMP, look under Analyze > Distribution.
The normal quantile plot looks fine: it’s mostly along a straight line.
The JMP output is more helpful, actually.
However, looking at the histogram, there is an obvious problem: there
are data points less than 0. This is bad! You can have 0 zombies,
but you can’t have fewer than 0 zombies. So this is a bad simulation.
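How often should a Normal(5000, 2000) simulation produce a negative count? The sketch below computes that probability from the model. The sample size of 1000 is a made-up value for illustration, since the size of the data file is not stated here.

```python
from statistics import NormalDist

zombies = NormalDist(mu=5000, sigma=2000)

# P(X < 0) = Phi((0 - 5000) / 2000) = Phi(-2.5)
p_negative = zombies.cdf(0)

n_draws = 1000  # hypothetical sample size, not taken from the data file
expected_negative = n_draws * p_negative

print(f"P(X < 0) = {p_negative:.4f}")
print(f"Expected negative draws in {n_draws}: {expected_negative:.1f}")
```

So an unrestricted normal model puts a small but nonzero probability on impossible negative counts, which is why they show up in a large enough simulated sample.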