M442, Fall 2013 Practice Problems for the Final

My office hours during exam week will be 1:30-2:30 on the following days: Tuesday Dec. 3,
Wednesday Dec. 4, Thursday Dec. 5, Tuesday Dec. 10.
The final exam for M442 will be Wednesday, Dec. 11, 10:30 a.m. – 12:30 p.m., in Blocker 123
(the usual classroom). The exam will consist of two parts: Part 1 will not require MATLAB,
while Part 2 will require MATLAB. Students will turn in Part 1 before beginning Part 2,
but for Part 2 students will have access to all M-files we’ve used this semester, from both
lecture and homework.
The final exam will cover all course material following the midterm, including: probability
definitions and axioms; permutations and combinations (with and without repeats); conditional probability, including Bayes’ Lemma, and computing probabilities by conditioning;
independent events; random variables and expected value, including properties of expected
value; conditional expected value and computing expected value by conditioning; simulating
uniform random variables, Gaussian random variables, and discrete random variables; variance and standard deviation; probability density functions; Markov’s Lemma; Chebyshev’s
Lemma; the Weak Law of Large Numbers, including inequality *; the Strong Law of Large
Numbers; the Central Limit Theorem.
This set of practice problems is in the general format of the exam, but is at least twice the
length of the actual exam.
Part 1 Problems
1. Suppose A and B are independent events. Use the axioms of probability and the definition
of independence to show that A and $B^c$ are independent events.
2. Use the axioms of probability theory to show the following.
a. For any two events A and B
\[ P(A \cap B) \le P(A). \]
b. For any two events A and B
\[ P(A \cup B) \le P(A) + P(B). \]
3. Answer the following.
a. Compute the probability of obtaining a quad (four of a kind) if you are dealt five cards
from a standard deck of 52 cards.
b. Compute the probability of obtaining a quad if, as in Texas Hold’em, you are dealt seven
cards (two down plus five community).
4. Suppose there are 15 students in a certain class, and they are to be arranged into 5 groups
of 3 students each. In how many ways can this arrangement be made?
5. Suppose a certain M442 student can solve a regression problem with probability .8, a
dimensional analysis problem with probability .7, and a phase plane problem with probability
.25. If a pop quiz is given and there is a 10% chance the problem will be on regression, a
50% chance the problem will be on dimensional analysis, and a 40% chance the problem will
be on phase planes, what is the probability the student will be able to solve the problem?
6. Suppose we have three cards identical in form except that both sides of the first card
are colored red, both sides of the second card are colored black, and one side of the third
card is colored red, the other black. The three cards are mixed up in a hat, and one card is
randomly selected and put down on the ground. If the up side of the chosen card is colored
red, what is the probability that the other side is colored black?
7. Suppose machines M1 and M2 turn out, respectively, 10 and 90 percent of the total
production of a certain type of article. Suppose the probability that machine M1 turns out
a defective article is .01, while the probability that machine M2 turns out a defective article
is .05. What is the probability that an article taken at random from a day’s production was
made by machine M1 , given that it is found to be defective?
8. Suppose sixty percent of cars in a certain town are made by Company A. Thirty percent of
all cars in this town are SUVs, and forty percent of SUVs in the town are made by Company
A. Given that a particular car in the town is not an SUV, what is the probability that it
was made by Company A?
9. In the game of American roulette, there are 38 equally probable slots: 18 red, 18 black,
and two house. When one dollar is bet on red, the player wins one dollar if the ball lands
red, and he loses his bet otherwise.
a. Compute the expected amount of gain or loss from betting one dollar on red.
b. One famous roulette strategy is the “martingale” or “doubling up” strategy. Each time
the player loses, he doubles his bet. For example, he might bet 1 dollar, lose, bet 2 dollars,
lose, bet 4 dollars, win. In total, he has lost 3 dollars and won 4, so he comes out ahead
by one dollar. Suppose there is a betting maximum of four dollars at a certain table, and
determine the expected value of this strategy.
c. Determine the expected amount bet during the martingale strategy with a maximum of
four dollars, and divide your expected value from (b) by this amount to find the average
loss per dollar bet.
10. A bin of 5 electrical components is known to contain exactly 2 that are defective. If the
components are tested one at a time in random order, until both defectives are discovered,
find the expected number of tests that are made. Keep in mind that you don’t necessarily
have to test the defectives. For example, if you draw three working components in a row,
you can conclude that the remaining two are defective. Also, compute the variance of the
number of tests.
11. The covariance of two random variables X and Y is defined to be
Cov(X, Y ) := E[(X − E[X])(Y − E[Y ])].
Show that
Cov(X, Y ) = E[XY ] − E[X]E[Y ].
12. Show that for any two discrete random variables X and Y for which E[|X|] and E[|Y |]
are finite
E[XY ] = E[Y E[X|Y ]].
Explain why you need to assume E[|X|] and E[|Y |] are finite.
13. Suppose a fair coin is flipped until it lands heads on two consecutive flips. What is the
expected number of flips required?
14. Answer the following.
a. A miner is trapped in a mine containing three doors. The first door leads to a tunnel
that will take him to safety after three hours of travel. The second door leads to a tunnel
that will return him to the mine after five hours of travel. The third door leads to a tunnel
that will return him to the mine after seven hours of travel. If we assume that the miner
is at all times equally likely to choose any one of the doors, what is the expected length of
time until he reaches safety?
b. Compute the variance of the length of time until the miner reaches safety.
15. Show that for any random variable X and any value a ∈ R
\[ P(X \ge a) \le e^{-a} E[e^{X}]. \]
16. Let A denote an event on a probability space S, and suppose we would like to use the
method of simulation to approximate a value for p = P (A). Let X denote a random variable
\[ X = \begin{cases} 1 & \text{if event } A \text{ occurs} \\ 0 & \text{otherwise.} \end{cases} \]
a. Explain how the random variable X can be used to approximate a value for p = P (A).
b. Find a number of simulations n required to ensure that the probability is less than 1%
that your error on P (A) is greater than .02.
17. Suppose U1 and U2 are both uniform random variables on [0, 1]. Determine whether or
not the random variable X = U1 + U2 is a uniform random variable on [0, 2].
Part 2 Problems
1. Consider the polygon described by the following inequalities:
\[
\begin{aligned}
x + y &\ge 1 \\
y &\le 4 \\
y &\ge 1 \\
x - y &\le 0 \\
y + \tfrac{2}{3} x &\le 5
\end{aligned}
\]
(See Figure 1.)
Figure 1: Polygon for Problem 1 (plot omitted; x-axis from −4 to 4, y-axis from 0 to 5).
a. Write a MATLAB M-file that uses simulation to determine the area of this polygon.
b. Determine the number of times n you should run your simulation to ensure there is a
90% chance that the error on your probability from (a) will be smaller than .005.
c. Run your simulation the number of times determined in (b) and turn in your result.
2. The modern study of probability theory grew out of a correspondence between Blaise
Pascal and Pierre de Fermat that was initiated by the question of whether or not it’s advantageous to bet even money that double sixes will turn up at least once in 24 throws of a
pair of fair dice.
a. Write a MATLAB M-file that simulates this game n times, where n is to be determined
below.
b. Determine a number of simulations n that will ensure that the error on your probability
is smaller than .001 at least 95% of the time.
c. Run your simulation the number of times determined in (b) and turn in your result.
3. Suppose a six-sided die is rolled five times, and the numbers obtained are added together
to give a value between 5 and 30.
a. Write a MATLAB M-file that simulates this game n times, where n will be determined
below, and determines the probability of getting the values 13, 17, and 23.
b. Determine the number of times n you should run your simulation to ensure there is a
95% chance that the error on all three of your probabilities will be smaller than .01. You
may assume independence of the three observations.
c. Run your simulation the number of times determined in (b) and turn in your result.
4. Consider a process called a random walk in which motion proceeds as follows: Starting
at position 0, an object moves either one space to the right with probability p or one space
to the left with probability 1 − p.
a. Take p = .5 and write a MATLAB M-file that simulates the process for 10 steps, and
determines the probability that the object finishes on each of the following positions: 0, 1,
2.
b. Determine the number of times n you should run your simulation to ensure there is a
95% chance that the error on all three of your probabilities will be smaller than .01. You
may assume independence of the three observations.
c. Run your simulation the number of times determined in (b) and turn in your result.
Solutions to Part 1
1. We start by writing
\[ A = (A \cap B) \cup (A \cap B^c), \]
which is a union of disjoint sets. Then by Axiom 3 and the independence of A and B
\[ P(A) = P(A \cap B) + P(A \cap B^c) = P(A)P(B) + P(A \cap B^c). \]
We see that
\[ P(A \cap B^c) = P(A)(1 - P(B)) = P(A)P(B^c). \]
2. For (a), we write
\[ A = (A \cap B) \cup (A \cap B^c), \]
and use Axiom 3 to see
\[ P(A) = P(A \cap B) + P(A \cap B^c). \]
By Axiom 1, $P(A \cap B^c) \ge 0$, which gives the conclusion.
For (b), we observe that
\[ A \cup B = B \cup (B^c \cap A), \]
which is again a union of disjoint sets, so
\[ P(A \cup B) = P(B) + P(B^c \cap A) \le P(B) + P(A), \]
where in getting this last inequality we used (a) with $B^c$ replacing B.
3. Since there are 13 ways to rank the quad, and 48 ways to choose the fifth card,
\[ P(\text{quad}) = \frac{13 \cdot 48}{\binom{52}{5}} = .000240. \]
For (b), there are still 13 ways to rank the quad, and then there are $\binom{48}{3}$ ways to select the
other three cards. We have
\[ P(\text{quad}) = \frac{13 \cdot \binom{48}{3}}{\binom{52}{7}} = .001681. \]
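The two counts in Problem 3 can be checked numerically. The following is a minimal sketch using MATLAB's built-in nchoosek; the variable names p5 and p7 are illustrative and do not come from the course M-files.

```matlab
% Exact check of the four-of-a-kind probabilities from Problem 3.
p5 = 13*48/nchoosek(52,5);               % five-card hand
p7 = 13*nchoosek(48,3)/nchoosek(52,7);   % seven-card hand
fprintf('P(quad, 5 cards) = %8.6f\n', p5)
fprintf('P(quad, 7 cards) = %8.6f\n', p7)
```

The printed values should agree with .000240 and .001681 above.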
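4. There are $\binom{15}{3}$ ways to choose the first group, $\binom{12}{3}$ ways to choose the second group, etc.
Since the order in which the five groups are chosen does not matter, we then divide by the 5! ways to arrange the five groups. In total,
\[ \frac{\binom{15}{3}\binom{12}{3}\binom{9}{3}\binom{6}{3}\binom{3}{3}}{5!} = 1401400. \]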
5. We define the partition
A1 = event a regression problem is given
A2 = event a dimensional analysis problem is given
A3 = event a phase plane problem is given,
and let B denote the event the given problem is solved. Then
\[ P(B) = \sum_{j=1}^{3} P(B|A_j)P(A_j) = .8 \cdot .1 + .7 \cdot .5 + .25 \cdot .4 = .53. \]
6. One approach to this problem is to proceed almost precisely as for the Monty Hall
problem. Begin with the partition
A1 = event mixed card is drawn
A2 = event red-red card is drawn
A3 = event black-black card is drawn.
The conditioning event is
B = event up side of chosen card is red.
According to Bayes’ Lemma, we have
\[ P(A_1|B) = \frac{P(B|A_1)P(A_1)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + P(B|A_3)P(A_3)} = \frac{(\frac{1}{2})\frac{1}{3}}{(\frac{1}{2})\frac{1}{3} + (1)\frac{1}{3} + (0)\frac{1}{3}} = \frac{1}{3}. \]
Alternatively, you can get by with only defining two events,
A = event mixed card is drawn
B = event up side of chosen card is red.
Here, by the definition of conditional probability
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}. \]
In this case, the probabilities are slightly more subtle. For P (A ∩ B), keep in mind that
there is 1/3 chance of getting the mixed card, and then once you have it, a 1/2 chance that
the red side will be up. For P (B), consider the three cards as six sides, each equally likely
to turn up.
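The answer 1/3 is easy to test by simulation. The sketch below is one possible way to do it, not one of the course M-files; the card numbering, the variable names, and the choice n = 100000 are assumptions made for illustration.

```matlab
% Simulate the three-card experiment from Problem 6.
n = 100000;          % number of simulated draws (arbitrary choice)
redUp = 0;           % draws with a red side showing
blackDown = 0;       % of those, draws whose hidden side is black
for k = 1:n
    card = randi(3); % 1 = red/red, 2 = black/black, 3 = red/black
    side = randi(2); % which side of the chosen card lands face up
    if card == 1
        up = 'R'; down = 'R';
    elseif card == 2
        up = 'B'; down = 'B';
    elseif side == 1
        up = 'R'; down = 'B';
    else
        up = 'B'; down = 'R';
    end
    if up == 'R'
        redUp = redUp + 1;
        if down == 'B'
            blackDown = blackDown + 1;
        end
    end
end
pEst = blackDown/redUp;  % estimate of P(other side black | red side up)
fprintf('Estimated conditional probability: %5.3f\n', pEst)
```

The estimate should land near 1/3 rather than the tempting answer 1/2.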
7. We set
D = event the article is defective
Mj = event the article was made by machine j.
According to Bayes’ Lemma,
\[ P(M_1|D) = \frac{P(D|M_1)P(M_1)}{P(D|M_1)P(M_1) + P(D|M_2)P(M_2)} = \frac{.01 \cdot .1}{.01 \cdot .1 + .05 \cdot .9} = .021739. \]
8. Set
A = event car was made by company A
S = event car is an SUV,
and note that our goal is to compute
\[ P(A|S^c). \]
We have
\[ P(A) = P(A|S)P(S) + P(A|S^c)P(S^c), \]
and we see
\[ .6 = .4 \cdot .3 + P(A|S^c) \cdot .7. \]
Solving for $P(A|S^c)$ we conclude
\[ P(A|S^c) = .685714. \]
9. For (a), we have
\[ E[\text{One dollar on red}] = 1 \cdot \frac{18}{38} - 1 \cdot \frac{20}{38} = -.0526 \text{ dollars}, \]
that is, an expected loss of about 5.26 cents.
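b. If we let X be the player’s final payout from one series with the strategy, we have
\[ E[X] = 1 \cdot \Big[ \frac{18}{38} + \frac{20}{38} \cdot \frac{18}{38} + \Big(\frac{20}{38}\Big)^2 \frac{18}{38} \Big] - 7 \Big(\frac{20}{38}\Big)^3 = -.1664. \]
c. The player’s expected bet under the assumption of a four-dollar cap is given by
\[ E[B] = 1 \cdot \frac{18}{38} + 3 \cdot \frac{20}{38} \cdot \frac{18}{38} + 7 \cdot \Big(1 - \frac{18}{38} - \frac{20}{38} \cdot \frac{18}{38}\Big) = 3.1607. \]
Notice that
\[ \frac{-.1664}{3.1607} = -.0526. \]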
If you play roulette you should get used to this number.
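The arithmetic in Problem 9 can be reproduced in a few lines. This is a sketch only; the variable names w, l, EX, EB, and ratio are illustrative.

```matlab
% Exact expected values for the capped martingale strategy in Problem 9.
w = 18/38;                            % probability a bet on red wins
l = 20/38;                            % probability a bet on red loses
EX = 1*(w + l*w + l^2*w) - 7*l^3;     % expected payout: win 1, or lose 1+2+4
EB = 1*w + 3*l*w + 7*(1 - w - l*w);   % expected total amount bet
ratio = EX/EB;                        % average loss per dollar bet
fprintf('E[X] = %6.4f, E[B] = %6.4f, ratio = %6.4f\n', EX, EB, ratio)
```

The ratio equals −2/38 = −1/19, the same per-dollar loss as the single bet in (a), which is the point of the closing remark.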
10. Let the random variable N denote the number of tests that have to be made. Then the
expected value is computed from the relation from class $E[N] = \sum_{n=2}^{4} n P(N = n)$, where
a single test cannot possibly be conclusive, and we would never require 5. The probability
P (N = 2), for example, is the probability that one of two defectives is drawn out of five
components (2/5) times the probability that a single defective is drawn out of the four
remaining components (1/4). I.e.,
\[ P(N = 2) = \frac{2}{5} \cdot \frac{1}{4} = \frac{1}{10}. \]
For P (N = 3), we note that there are three types of draws that will be definitive. If we let G
denote good and D defective these draws are DGD, GDD, GGG, with respective probabilities
(as summands)
\[ \frac{2}{5} \cdot \frac{3}{4} \cdot \frac{1}{3} + \frac{3}{5} \cdot \frac{2}{4} \cdot \frac{1}{3} + \frac{3}{5} \cdot \frac{2}{4} \cdot \frac{1}{3} = \frac{3}{10}. \]
Finally,
\[ P(N = 4) = 1 - P(N = 2) - P(N = 3) = \frac{6}{10}. \]
We have
\[ E[N] = 2 \cdot \frac{1}{10} + 3 \cdot \frac{3}{10} + 4 \cdot \frac{6}{10} = 3.5. \]
The variance is now straightforward:
\[ \mathrm{Var}[N] = 1.5^2 \cdot \frac{1}{10} + .5^2 \cdot \frac{3}{10} + .5^2 \cdot \frac{3}{5} = \frac{9}{20} = .45. \]
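A simulation gives a quick check on both numbers in Problem 10. The following sketch assumes components 1 and 2 are the defective ones and stops as soon as either both defectives or all three good components have been seen; the names and the value n = 100000 are illustrative choices.

```matlab
% Simulate the testing procedure from Problem 10.
n = 100000;                      % number of simulated bins (arbitrary choice)
tests = zeros(1,n);              % number of tests needed in each simulation
for k = 1:n
    order = randperm(5);         % random testing order
    isDef = order <= 2;          % true where the tested component is defective
    for j = 2:4
        nd = sum(isDef(1:j));    % defectives found in the first j tests
        if nd == 2 || (j - nd) == 3
            tests(k) = j;        % both defectives found, or all three good found
            break
        end
    end
end
fprintf('Mean %5.3f, variance %5.3f\n', mean(tests), var(tests))
```

The sample mean and variance should be close to 3.5 and .45.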
11. We compute
\[
\begin{aligned}
\mathrm{Cov}(X, Y) &= E[(X - E[X])(Y - E[Y])] \\
&= E[XY - E[X]Y - XE[Y] + E[X]E[Y]] \\
&= E[XY] - E[X]E[Y] - E[X]E[Y] + E[X]E[Y] \\
&= E[XY] - E[X]E[Y].
\end{aligned}
\]
12. My recommendation on problems like this is to start with the more complicated expression, which in this case is E[Y E[X|Y ]]. We compute
\[
\begin{aligned}
E[Y E[X|Y]] &= \sum_{y} y E[X|Y = y] P(Y = y) \\
&= \sum_{y} y \sum_{x} x P(\{X = x\}|\{Y = y\}) P(Y = y) \\
&= \sum_{y} y \sum_{x} x P(\{X = x\} \cap \{Y = y\}) \\
&= \sum_{x,y} xy P(\{X = x\} \cap \{Y = y\}) \\
&= E[XY].
\end{aligned}
\]
It is in the technical step in which y is distributed through the sum over x that we use our
assumption that E[|X|] and E[|Y |] are finite.
13. Let N denote the number of flips required until two consecutive heads occur, and let $T_k$
denote the event that the first tail occurs on flip k, k = 1, 2, and let $T_3$ denote the event that
the first two flips are heads. (Note that the $\{T_k\}_{k=1}^{3}$ partition our sample space.) Proceed
by conditioning,
\[ E[N] = \sum_{k=1}^{3} E[N|T_k]P(T_k) = \frac{1}{2}(1 + E[N]) + \frac{1}{4}(2 + E[N]) + \frac{1}{4} \cdot 2. \]
Solving for E[N ], we find
\[ E[N] = 6. \]
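The value E[N] = 6 can be checked by simulating the coin flips directly. This is a sketch with illustrative names (flips, run, count) and an arbitrary number of simulations.

```matlab
% Simulate flipping a fair coin until two consecutive heads occur.
n = 100000;                  % number of simulated experiments (arbitrary choice)
flips = zeros(1,n);          % flips required in each experiment
for k = 1:n
    run = 0;                 % current number of consecutive heads
    count = 0;               % flips so far
    while run < 2
        count = count + 1;
        if rand < 0.5
            run = run + 1;   % heads extends the run
        else
            run = 0;         % tails resets the run
        end
    end
    flips(k) = count;
end
fprintf('Average number of flips: %5.3f\n', mean(flips))
```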
14. Letting
Dk = event miner goes through door k,
and letting T denote the time until the miner reaches safety, we have
\[ E[T] = \sum_{k=1}^{3} E[T|D_k]P(D_k) = 3 \cdot \frac{1}{3} + (5 + E[T]) \frac{1}{3} + (7 + E[T]) \frac{1}{3}. \]
Solving this algebraic equation for E[T ], we conclude
\[ \frac{1}{3} E[T] = 5 \Rightarrow E[T] = 15. \]
b. We have
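\[ \mathrm{Var}(T) = E[T^2] - E[T]^2, \]
where we already know that
\[ E[T] = 15. \]
For $E[T^2]$, we let $D_k$ denote the event that the miner goes through door k and compute
\[ E[T^2] = \sum_{k=1}^{3} E[T^2|D_k]P(D_k) = 9 \cdot \frac{1}{3} + E[(5 + T)^2] \frac{1}{3} + E[(7 + T)^2] \frac{1}{3}. \]
Expanding the squares and using E[T] = 15 gives $E[(5 + T)^2] = 25 + 150 + E[T^2]$ and $E[(7 + T)^2] = 49 + 210 + E[T^2]$,
which we can solve for
\[ \frac{1}{3} E[T^2] = \frac{9 + 25 + 150 + 49 + 210}{3} \Rightarrow E[T^2] = 443. \]
Finally,
\[ \mathrm{Var}(T) = E[T^2] - E[T]^2 = 443 - 225 = 218. \]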
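Both the mean and the variance for the miner can be checked by simulation. The sketch below is not one of the course M-files; the variable names and n = 100000 are illustrative choices.

```matlab
% Simulate the miner from Problem 14.
n = 100000;                  % number of simulated escapes (arbitrary choice)
T = zeros(1,n);              % time to safety in each simulation
for k = 1:n
    t = 0;
    safe = false;
    while ~safe
        door = randi(3);     % each door is equally likely at every attempt
        if door == 1
            t = t + 3;       % three hours, then safety
            safe = true;
        elseif door == 2
            t = t + 5;       % five hours, back in the mine
        else
            t = t + 7;       % seven hours, back in the mine
        end
    end
    T(k) = t;
end
fprintf('Mean %6.2f, variance %7.2f\n', mean(T), var(T))
```

The sample mean should be near 15 and the sample variance near 218, although the variance estimate converges more slowly.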
15. Set $Y = e^{X}$ and observe that Y ≥ 0. By Markov’s inequality
\[ P(Y \ge e^{a}) \le \frac{E[Y]}{e^{a}} = e^{-a} E[e^{X}]. \]
Finally, note that $e^{X} \ge e^{a}$ if and only if X ≥ a.
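16. For (a) we use
\[ \bar{p} = \frac{1}{n} \sum_{j=1}^{n} X_j, \]
where the $\{X_j\}_{j=1}^{n}$ are instances of X, since E[X] = p.
For (b), we have
\[ P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - p\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2} \le \frac{1}{4n\epsilon^2}, \]
so if we want $\epsilon = .02$ we find n so that
\[ \frac{1}{4n(.02)^2} \le .01 \Rightarrow n \ge \frac{1}{4 \cdot .01 \cdot (.02)^2} = 62500. \]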
17. Since U1 and U2 are both uniform on [0, 1], we know that for all a, b ∈ [0, 1], a ≤ b, there
holds
P (a ≤ Uk ≤ b) = b − a, k = 1, 2.
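Now let c, d ∈ [0, 2], c ≤ d. The question becomes, is it true that
\[ P(c \le X \le d) = \frac{d - c}{2}. \tag{1} \]
(Keep in mind that if c = 0 and d = 2 the probability must be 1.) Intuitively, we think that
X should not be uniform, because there are many more ways to get a sum such as 1 than a
sum such as 0. In order to find a counterexample to (1), consider a particular pair of values,
say $c = \frac{1}{2}$ and $d = \frac{3}{2}$, for which (1) would give probability $\frac{1}{2}$. Taking $U_1$ and $U_2$ to be independent, we have
\[
\begin{aligned}
P(\tfrac{1}{2} \le X \le \tfrac{3}{2}) &= P(\tfrac{1}{2} \le U_1 + U_2 \le \tfrac{3}{2}) \\
&\ge P(\tfrac{1}{2} \le U_1 \le 1)P(U_2 \le \tfrac{1}{2}) + P(U_1 \le \tfrac{1}{2})P(\tfrac{1}{2} \le U_2 \le 1) \\
&\quad + P(\tfrac{1}{4} \le U_1 \le \tfrac{1}{2})P(\tfrac{1}{4} \le U_2 \le \tfrac{1}{2}) \\
&= \frac{1}{2} + \frac{1}{16} > \frac{1}{2}.
\end{aligned}
\]
So (1) fails, and X is not uniform on [0, 2].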
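The counterexample in Problem 17 can also be seen numerically. The sketch below assumes, as the solution does, that U1 and U2 are independent; the names X and frac and the sample size are illustrative.

```matlab
% Estimate P(1/2 <= U1 + U2 <= 3/2) for independent uniforms on [0,1].
n = 100000;                         % number of samples (arbitrary choice)
X = rand(1,n) + rand(1,n);          % samples of X = U1 + U2
frac = mean(X >= 0.5 & X <= 1.5);   % fraction of samples in [1/2, 3/2]
fprintf('Estimated probability: %5.3f\n', frac)
```

A uniform variable on [0, 2] would give 1/2 here. The estimate should instead be near 3/4, since each of the two excluded corner triangles has area 1/8.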
Solutions to Part 2
1. For (a) the main idea is to generate two uniform random variables $U_1 \in [-3, 3]$ and $U_2 \in [1, 4]$,
and to set
\[ X = \begin{cases} 1 & (U_1, U_2) \in P \\ 0 & \text{otherwise,} \end{cases} \]
where P denotes the polygon. If we let A denote the area of the polygon, then
\[ \frac{A}{18} \approx \frac{1}{n} \sum_{j=1}^{n} X_j. \]
(Note that 18 is the area of the rectangle R containing the polygon with (U1 , U2 ) ∈ R.) The
approximation is carried out with polyarea1.m.
function polyarea1(n)
%POLYAREA1: MATLAB function M-file that takes a
%number of simulations as input and uses simulation
%to compute the area of the polygon defined by
%the following inequalities:
%x+y>1
%y<4
%y>1
%x-y<0
%y+(2/3)x<5
%
np=0; %Number of sample points that land inside the polygon
rng('shuffle')
for k=1:n
    x = 6*rand-3; %Uniform on [-3,3]
    y = 3*rand+1; %Uniform on [1,4]
    if x+y>1 && y<4 && y>1 && x-y<0 && y+(2/3)*x<5
        np = np+1;
    end
end
%The printed value is the area estimate: the fraction of hits times 18
fprintf('The probability is approximately %5.3f\n',(np/n)*18)
b. Using the inequality from the Weak Law of Large Numbers, we have
\[ P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - \frac{A}{18}\Big| \ge k\Big) \le \frac{\sigma^2}{nk^2}, \]
which we re-express as
\[ P\Big(\Big|\frac{18}{n} \sum_{j=1}^{n} X_j - A\Big| \ge 18k\Big) \le \frac{\sigma^2}{nk^2}. \]
(Notice particularly that the appearance of k on the right hand side does not change, because
all we’ve done is rewritten the inequality.) In this case, we take 18k = .005 so that $k = \frac{.005}{18}$,
and we must find n sufficiently large so that (using $\sigma^2 \le \frac{1}{4}$)
\[ \frac{1}{4n(.005/18)^2} \le .1 \Rightarrow n \ge \frac{1}{4(.005/18)^2(.1)} = 32400000. \]
We implement this below.
>>polyarea1(32400000)
The probability is approximately 10.750
It’s easy to check that the exact area is 10.75.
2. For (a) the simulation is given in fermat24.
function fermat24(n)
%FERMAT24: simulates a game in which a pair of
%dice are rolled twenty-four times, and determines
%through Monte Carlo simulation the probability
%that a pair of sixes will turn up.
sixes = 0; %Initialize number of games with a pair of sixes
for k = 1:n
    for j=1:24
        if rand <= 1/36 %Two sixes
            sixes = sixes + 1;
            break %Terminate experiment so multiple pairs not counted
        end
    end
end
disp(['Probability for pair of sixes: ' num2str(sixes/n)])
For (b), we consider a sequence of random variables $\{X_j\}_{j=1}^{n}$ so that $X_j$ is 1 if double sixes
occur at least once in the trial and 0 otherwise. Then our inequality from the Weak Law of
Large Numbers gives
\[ P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - p\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2}, \]
where we take $\epsilon = .001$. Using $\sigma^2 \le \frac{1}{4}$, we find n so that
\[ \frac{1}{4n(.001)^2} \le .05 \Rightarrow n \ge \frac{1}{4(.05)(.001)^2} = 5000000. \]
A MATLAB implementation is given below:
>>fermat24(5000000)
Probability for pair of sixes: 0.49188
We showed in class that the exact probability is .4914.
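For comparison with the simulated value, the exact probability follows from the complement rule: no double six in 24 throws has probability (35/36)^24. A one-line sketch (the name pExact is illustrative):

```matlab
% Exact probability of at least one double six in 24 throws of two dice.
pExact = 1 - (35/36)^24;
fprintf('Exact probability: %6.4f\n', pExact)
```

The simulated value 0.49188 differs from this by less than the tolerance .001 chosen in (b).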
3. For (a) we use rolls5.m.
function rolls5(n)
%ROLLS5: MATLAB function M-file that takes a number of
%simulations n as input and simulates the following
%experiment: a fair die is rolled five times and
%the numbers are added together. The file approximates
%the probabilities of getting values 13, 17, and 23.
%
n13=0; n17=0; n23=0; %Counts for totals of 13, 17, and 23
for k=1:n
    rolls = sum(randi(6,[1 5])); %Total of five die rolls
    if rolls == 13
        n13=n13+1;
    elseif rolls == 17
        n17=n17+1;
    elseif rolls == 23
        n23=n23+1;
    end
end
fprintf('The probability of getting 13 is %5.4f\n',n13/n)
fprintf('The probability of getting 17 is %5.4f\n',n17/n)
fprintf('The probability of getting 23 is %5.4f\n',n23/n)
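For (b) we consider three random variables $X_{13}$, $X_{17}$, and $X_{23}$ with
\[ X_{13} = \begin{cases} 1 & \text{if the total is 13} \\ 0 & \text{otherwise,} \end{cases} \]
and similarly for the other two. In each case, if $\{X_j\}_{j=1}^{n}$ is a sequence of observations then
from the Weak Law of Large Numbers we have the inequality
\[ P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - p_{13}\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2} \le \frac{1}{4n\epsilon^2}, \]
using our usual estimate $\sigma^2 \le \frac{1}{4}$. Clearly then
\[ P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - p_{13}\Big| < \epsilon\Big) = 1 - P\Big(\Big|\frac{1}{n} \sum_{j=1}^{n} X_j - p_{13}\Big| \ge \epsilon\Big) \ge 1 - \frac{1}{4n\epsilon^2}. \]
The probability that all three errors are small at once is then at least (assuming independence)
\[ \Big(1 - \frac{1}{4n\epsilon^2}\Big)^3, \]
so we will determine n from the inequality
\[ \Big(1 - \frac{1}{4n(.01)^2}\Big)^3 > .95. \]
We find
\[ n > \frac{1}{4(.01)^2(1 - (.95)^{1/3})} = 147471.51, \]
so that we use n = 147472.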
A MATLAB implementation is given below.
>>rolls5(147472)
The probability of getting 13 is 0.0537
The probability of getting 17 is 0.1001
The probability of getting 23 is 0.0388
4. For (a) we use the M-file walk1.m.
function walk1(p,n)
%WALK1: MATLAB function M-file that takes as input
%a probability p and a number of simulations n and
%simulates a random walk on the integers, starting
%at position 0 and moving to the right 1 unit with
%probability p and to the left one unit with
%probability 1-p.
%
%The walk takes 10 steps, and probabilities are
%computed for ending on 0, 1, or 2.
%
n0=0;n1=0;n2=0; %Counts for ending on 0, 1, and 2
for k=1:n
    X=0; %Starting position
    for j=1:10
        if rand < p
            X=X+1;
        else
            X=X-1;
        end
    end
    if X==0
        n0=n0+1;
    elseif X==1
        n1=n1+1;
    elseif X==2
        n2=n2+1;
    end
end
fprintf('The object ended on 0 with probability %5.4f.\n',n0/n)
fprintf('The object ended on 1 with probability %5.4f.\n',n1/n)
fprintf('The object ended on 2 with probability %5.4f.\n',n2/n)
For (b), we begin by observing that the object will always be on an even number after 10
steps, so the probability of getting 1 is 0. In this way, we only have two probabilities to
determine, and the probability that the errors on these are both small at once is at least
\[ \Big(1 - \frac{1}{4n\epsilon^2}\Big)^2, \]
so we will determine n from the inequality
\[ \Big(1 - \frac{1}{4n(.01)^2}\Big)^2 > .95. \]
We find
\[ n > \frac{1}{4(.01)^2(1 - (.95)^{1/2})} = 98733.97, \]
so that we use n = 98734.
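The sample sizes for Problems 3 and 4 of Part 2 come from the same bound, with the exponent equal to the number of probabilities being estimated. A short sketch of that calculation follows; the names epsilon, c, m, nBound, and nUse are illustrative.

```matlab
% Sample size from the bound (1 - 1/(4*n*epsilon^2))^m > c.
epsilon = .01;     % allowed error on each probability
c = .95;           % required probability that all errors are small
m = [3 2];         % number of probabilities estimated in Problems 3 and 4
nBound = 1./(4*epsilon^2*(1 - c.^(1./m)));   % lower bound on n
nUse = ceil(nBound);                         % smallest integer n to use
fprintf('m = %d: n > %.2f, so use n = %d\n', [m; nBound; nUse])
```

This should reproduce n = 147472 for three probabilities and n = 98734 for two.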
A diary session of the implementation is given below.
>>walk1(.5,98734)
The object ended on 0 with probability 0.2460.
The object ended on 1 with probability 0.0000.
The object ended on 2 with probability 0.2056.
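For p = .5 the simulated values can be compared with exact binomial probabilities: ending on 0 requires five steps right out of ten, and ending on 2 requires six. A minimal sketch (the names steps, p0, and p2 are illustrative):

```matlab
% Exact probabilities for the 10-step random walk with p = .5.
steps = 10;
p0 = nchoosek(steps,5)/2^steps;   % five right, five left: ends on 0
p2 = nchoosek(steps,6)/2^steps;   % six right, four left: ends on 2
fprintf('Exact P(0) = %5.4f, exact P(2) = %5.4f\n', p0, p2)
```

These are 252/1024 and 210/1024, both within .01 of the simulated values above.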