M442, Fall 2013
Practice Problems for the Final

My office hours during exam week will be 1:30-2:30 on the following days: Tuesday Dec. 3, Wednesday Dec. 4, Thursday Dec. 5, Tuesday Dec. 10.

The final exam for M442 will be Wednesday, Dec. 11, 10:30 a.m. – 12:30 p.m., in Blocker 123 (the usual classroom). The exam will consist of two parts: Part 1 will not require MATLAB, while Part 2 will require MATLAB. Students will turn in Part 1 before beginning Part 2, but for Part 2 students will have access to all M-files we’ve used this semester, from both lecture and homework.

The final exam will cover all course material following the midterm, including: probability definitions and axioms; permutations and combinations (with and without repeats); conditional probability, including Bayes’ Lemma, and computing probabilities by conditioning; independent events; random variables and expected value, including properties of expected value; conditional expected value and computing expected value by conditioning; simulating uniform random variables, Gaussian random variables, and discrete random variables; variance and standard deviation; probability density functions; Markov’s Lemma; Chebyshev’s Lemma; the Weak Law of Large Numbers, including inequality *; the Strong Law of Large Numbers; the Central Limit Theorem.

This set of practice problems is in the general format of the exam, but is at least twice the length of the actual exam.

Part 1 Problems

1. Suppose $A$ and $B$ are independent events. Use the axioms of probability and the definition of independence to show that $A$ and $B^c$ are independent events.

2. Use the axioms of probability theory to show the following.
a. For any two events $A$ and $B$, $P(A \cap B) \le P(A)$.
b. For any two events $A$ and $B$, $P(A \cup B) \le P(A) + P(B)$.

3. Answer the following.
a. Compute the probability of obtaining a quad (four of a kind) if you are dealt five cards from a standard deck of 52 cards.
b.
Compute the probability of obtaining a quad if, as in Texas Hold’em, you are dealt seven cards (two down plus five community).

4. Suppose there are 15 students in a certain class, and they are to be arranged into 5 groups of 3 students each. In how many ways can this arrangement be made?

5. Suppose a certain M442 student can solve a regression problem with probability .8, a dimensional analysis problem with probability .7, and a phase plane problem with probability .25. If a pop quiz is given and there is a 10% chance the problem will be on regression, a 50% chance the problem will be on dimensional analysis, and a 40% chance the problem will be on phase planes, what is the probability the student will be able to solve the problem?

6. Suppose we have three cards identical in form except that both sides of the first card are colored red, both sides of the second card are colored black, and one side of the third card is colored red, the other black. The three cards are mixed up in a hat, and one card is randomly selected and put down on the ground. If the up side of the chosen card is colored red, what is the probability that the other side is colored black?

7. Suppose machines $M_1$ and $M_2$ turn out, respectively, 10 and 90 percent of the total production of a certain type of article. Suppose the probability that machine $M_1$ turns out a defective article is .01, while the probability that machine $M_2$ turns out a defective article is .05. What is the probability that an article taken at random from a day’s production was made by machine $M_1$, given that it is found to be defective?

8. Suppose sixty percent of cars in a certain town are made by Company A. Thirty percent of all cars in this town are SUVs, and forty percent of SUVs in the town are made by Company A. Given that a particular car in the town is not an SUV, what is the probability that it was made by Company A?

9.
In the game of American roulette, there are 38 equally probable slots: 18 red, 18 black, and two house. When one dollar is bet on red, the player wins one dollar if the ball lands red, and he loses his bet otherwise.
a. Compute the expected amount of gain or loss from betting one dollar on red.
b. One famous roulette strategy is the “martingale” or “doubling up” strategy. Each time the player loses, he doubles his bet. For example, he might bet 1 dollar, lose, bet 2 dollars, lose, bet 4 dollars, win. In total, he has lost 3 dollars and won 4, so he comes out ahead by one dollar. Suppose there is a betting maximum of four dollars at a certain table, and determine the expected value of this strategy.
c. Determine the expected amount bet during the martingale strategy with a maximum of four dollars, and divide your expected value from (b) by this amount to find the average loss per dollar bet.

10. A bin of 5 electrical components is known to contain exactly 2 that are defective. If the components are tested one at a time in random order, until both defectives are discovered, find the expected number of tests that are made. Keep in mind that you don’t necessarily have to test the defectives. For example, if you draw three working components in a row, you can conclude that the remaining two are defective. Also, compute the variance of the number of tests.

11. The covariance of two random variables $X$ and $Y$ is defined to be
\[ \mathrm{Cov}(X, Y) := E[(X - E[X])(Y - E[Y])]. \]
Show that
\[ \mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y]. \]

12. Show that for any two discrete random variables $X$ and $Y$ for which $E[|X|]$ and $E[|Y|]$ are finite
\[ E[XY] = E[Y E[X|Y]]. \]
Explain why you need to assume $E[|X|]$ and $E[|Y|]$ are finite.

13. Suppose a fair coin is flipped until it lands heads on two consecutive flips. What is the expected number of flips required?

14. Answer the following.
a. A miner is trapped in a mine containing three doors.
The first door leads to a tunnel that will take him to safety after three hours of travel. The second door leads to a tunnel that will return him to the mine after five hours of travel. The third door leads to a tunnel that will return him to the mine after seven hours of travel. If we assume that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until he reaches safety?
b. Compute the variance of the length of time until the miner reaches safety.

15. Show that for any random variable $X$ and any value $a \in \mathbb{R}$
\[ P(X \ge a) \le e^{-a} E[e^X]. \]

16. Let $A$ denote an event on a probability space $S$, and suppose we would like to use the method of simulation to approximate a value for $p = P(A)$. Let $X$ denote a random variable
\[ X = \begin{cases} 1 & \text{if event } A \text{ occurs} \\ 0 & \text{otherwise.} \end{cases} \]
a. Explain how the random variable $X$ can be used to approximate a value for $p = P(A)$.
b. Find a number of simulations $n$ required to ensure that the probability is less than 1% that your error on $P(A)$ is greater than .02.

17. Suppose $U_1$ and $U_2$ are both uniform random variables on $[0, 1]$. Determine whether or not the random variable $X = U_1 + U_2$ is a uniform random variable on $[0, 2]$.

Part 2 Problems

1. Consider the polygon described by the following inequalities:
\[ x + y \ge 1, \quad y \le 4, \quad y \ge 1, \quad x - y \le 0, \quad y + \tfrac{2}{3}x \le 5. \]
(See Figure 1.)

[Figure 1: Polygon for Problem 1. Horizontal axis: x; vertical axis: y.]

a. Write a MATLAB M-file that uses simulation to determine the area of this polygon.
b. Determine the number of times $n$ you should run your simulation to ensure there is a 90% chance that the error on your probability from (a) will be smaller than .005.
c. Run your simulation the number of times determined in (b) and turn in your result.

2.
The modern study of probability theory grew out of a correspondence between Blaise Pascal and Pierre de Fermat that was initiated by the question of whether or not it’s advantageous to bet even money that double sixes will turn up at least once in 24 throws of a pair of fair dice.
a. Write a MATLAB M-file that simulates this game $n$ times, where $n$ is to be determined below.
b. Determine a number of simulations $n$ that will ensure that the error on your probability is smaller than .001 at least 95% of the time.
c. Run your simulation the number of times determined in (b) and turn in your result.

3. Suppose a six-sided die is rolled five times, and the numbers obtained are added together to give a value between 5 and 30.
a. Write a MATLAB M-file that simulates this game $n$ times, where $n$ will be determined below, and determines the probability of getting the values 13, 17, and 23.
b. Determine the number of times $n$ you should run your simulation to ensure there is a 95% chance that the error on all three of your probabilities will be smaller than .01. You may assume independence of the three observations.
c. Run your simulation the number of times determined in (b) and turn in your result.

4. Consider a process called a random walk in which motion proceeds as follows: Starting at position 0, an object moves either one space to the right with probability $p$ or one space to the left with probability $1 - p$.
a. Take $p = .5$ and write a MATLAB M-file that simulates the process for 10 steps, and determines the probability that the object finishes on each of the following positions: 0, 1, 2.
b. Determine the number of times $n$ you should run your simulation to ensure there is a 95% chance that the error on all three of your probabilities will be smaller than .01. You may assume independence of the three observations.
c. Run your simulation the number of times determined in (b) and turn in your result.

Solutions to Part 1

1.
We start by writing $A = (A \cap B) \cup (A \cap B^c)$, which is a union of disjoint sets. Then by Axiom 3 and the independence of $A$ and $B$
\[ P(A) = P(A \cap B) + P(A \cap B^c) = P(A)P(B) + P(A \cap B^c). \]
We see that
\[ P(A \cap B^c) = P(A)(1 - P(B)) = P(A)P(B^c). \]

2. For (a), we write $A = (A \cap B) \cup (A \cap B^c)$, and use Axiom 3 to see
\[ P(A) = P(A \cap B) + P(A \cap B^c). \]
Since $P(A \cap B^c) \ge 0$ by Axiom 1, we get the conclusion. For (b), we observe that $A \cup B = B \cup (B^c \cap A)$, a union of disjoint sets, so
\[ P(A \cup B) = P(B) + P(B^c \cap A) \le P(B) + P(A), \]
where in getting this last inequality we used (a) with $B^c$ replacing $B$.

3. Since there are 13 ways to rank the quad, and 48 ways to choose the fifth card,
\[ P(\text{quad}) = \frac{13 \cdot 48}{\binom{52}{5}} = .000240. \]
For (b), there are still 13 ways to rank the quad, and then there are $\binom{48}{3}$ ways to select the other three cards. We have
\[ P(\text{quad}) = \frac{13 \cdot \binom{48}{3}}{\binom{52}{7}} = .001681. \]

4. There are $\binom{15}{3}$ ways to choose the first group, $\binom{12}{3}$ ways to choose the second group, etc., and then 5! ways to arrange the five groups, which we divide out because the order of the groups does not matter. In total,
\[ \frac{\binom{15}{3}\binom{12}{3}\binom{9}{3}\binom{6}{3}\binom{3}{3}}{5!} = 1401400. \]

5. We define the partition
$A_1$ = event a regression problem is given
$A_2$ = event a dimensional analysis problem is given
$A_3$ = event a phase plane problem is given,
and let $B$ denote the event the given problem is solved. Then
\[ P(B) = \sum_{j=1}^{3} P(B|A_j)P(A_j) = .8 \cdot .1 + .7 \cdot .5 + .25 \cdot .4 = .53. \]

6. One approach to this problem is to proceed almost precisely as for the Monty Hall problem. Begin with the partition
$A_1$ = event mixed card is drawn
$A_2$ = event red-red card is drawn
$A_3$ = event black-black card is drawn.
The conditioning event is
$B$ = event up side of chosen card is red.
According to Bayes’ Lemma, we have
\[ P(A_1|B) = \frac{P(B|A_1)P(A_1)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + P(B|A_3)P(A_3)} = \frac{(\frac{1}{2})\frac{1}{3}}{(\frac{1}{2})\frac{1}{3} + (1)\frac{1}{3} + (0)\frac{1}{3}} = \frac{1}{3}. \]
Alternatively, you can get by with only defining two events,
$A$ = event mixed card is drawn
$B$ = event up side of chosen card is red.
Here, by the definition of conditional probability
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}. \]
In this case, the probabilities are slightly more subtle. For $P(A \cap B)$, keep in mind that there is a 1/3 chance of getting the mixed card, and then once you have it, a 1/2 chance that the red side will be up. For $P(B)$, consider the three cards as six sides, each equally likely to turn up.

7. We set
$D$ = event the article is defective
$M_j$ = event the article was made by machine $j$.
According to Bayes’ Lemma,
\[ P(M_1|D) = \frac{P(D|M_1)P(M_1)}{P(D|M_1)P(M_1) + P(D|M_2)P(M_2)} = \frac{.01 \cdot .1}{.01 \cdot .1 + .05 \cdot .9} = .021739. \]

8. Set
$A$ = event car was made by company A
$S$ = event car is an SUV,
and note that our goal is to compute $P(A|S^c)$. We have
\[ P(A) = P(A|S)P(S) + P(A|S^c)P(S^c), \]
and we see
\[ .6 = .4 \cdot .3 + P(A|S^c) \cdot .7. \]
Solving for $P(A|S^c)$ we conclude
\[ P(A|S^c) = .685714. \]

9. For (a), we have
\[ E[\text{One dollar on red}] = 1 \cdot \frac{18}{38} - 1 \cdot \frac{20}{38} = -.0526 \text{ dollars}, \]
that is, a loss of about 5.26 cents.
b. If we let $X$ be the player’s final payout from one series with the strategy, we have
\[ E[X] = 1 \cdot \left[ \frac{18}{38} + \frac{20}{38} \cdot \frac{18}{38} + \left(\frac{20}{38}\right)^2 \frac{18}{38} \right] - 7\left(\frac{20}{38}\right)^3 = -.1664. \]
c. The player’s expected bet under the assumption of a $4 cap is given by
\[ E[B] = 1 \cdot \frac{18}{38} + 3 \cdot \frac{20}{38} \cdot \frac{18}{38} + 7 \cdot \left(1 - \frac{18}{38} - \frac{20}{38} \cdot \frac{18}{38}\right) = 3.1607. \]
Notice that
\[ \frac{-.1664}{3.1607} = -.0526. \]
If you play roulette you should get used to this number.

10. Let the random variable $N$ denote the number of tests that have to be made. Then the expected value is computed from the relation from class
\[ E[N] = \sum_{n=2}^{4} n P(N = n), \]
where a single test cannot possibly be conclusive, and we would never require 5. The probability $P(N = 2)$, for example, is the probability that one of two defectives is drawn out of five components (2/5) times the probability that a single defective is drawn out of the four remaining components (1/4). I.e.,
\[ P(N = 2) = \frac{2}{5} \cdot \frac{1}{4} = \frac{1}{10}. \]
For $P(N = 3)$, we note that there are three types of draws that will be definitive.
If we let G denote good and D defective these draws are DGD, GDD, GGG, with respective probabilities (as summands)
\[ P(N = 3) = \frac{2}{5} \cdot \frac{3}{4} \cdot \frac{1}{3} + \frac{3}{5} \cdot \frac{2}{4} \cdot \frac{1}{3} + \frac{3}{5} \cdot \frac{2}{4} \cdot \frac{1}{3} = \frac{3}{10}. \]
Finally,
\[ P(N = 4) = 1 - P(N = 2) - P(N = 3) = \frac{6}{10}. \]
We have
\[ E[N] = 2 \cdot \frac{1}{10} + 3 \cdot \frac{3}{10} + 4 \cdot \frac{6}{10} = 3.5. \]
The variance is now straightforward:
\[ \mathrm{Var}[N] = 1.5^2 \cdot \frac{1}{10} + .5^2 \cdot \frac{3}{10} + .5^2 \cdot \frac{3}{5} = \frac{9}{20} = .45. \]

11. We compute, using linearity of expected value and the fact that $E[X]$ and $E[Y]$ are constants,
\[ \mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY - E[X]Y - XE[Y] + E[X]E[Y]] = E[XY] - E[X]E[Y]. \]

12. My recommendation on problems like this is to start with the more complicated expression, which in this case is $E[Y E[X|Y]]$. We compute
\[ E[Y E[X|Y]] = \sum_y y E[X|Y = y] P(Y = y) \]
\[ = \sum_y y \sum_x x P(\{X = x\}|\{Y = y\}) P(Y = y) \]
\[ = \sum_y y \sum_x x P(\{X = x\} \cap \{Y = y\}) \]
\[ = \sum_{x,y} xy P(\{X = x\} \cap \{Y = y\}) \]
\[ = E[XY]. \]
It is in the technical step in which $y$ is distributed through the sum over $x$ that we use our assumption that $E[|X|]$ and $E[|Y|]$ are finite.

13. Let $N$ denote the number of flips required until two consecutive heads occur, and let $T_k$ denote the event that the first tail occurs on flip $k$, $k = 1, 2$, and let $T_3$ denote the event that the first two flips are heads. (Note that the $\{T_k\}_{k=1}^{3}$ partition our sample space.) Proceed by conditioning,
\[ E[N] = \sum_{k=1}^{3} E[N|T_k]P(T_k) = (1 + E[N])\frac{1}{2} + (2 + E[N])\frac{1}{4} + 2 \cdot \frac{1}{4}. \]
Solving for $E[N]$, we find $E[N] = 6$.

14. Letting
$D_k$ = event miner goes through door $k$,
and letting $T$ denote the time until the miner reaches safety, we have
\[ E[T] = \sum_{k=1}^{3} E[T|D_k]P(D_k) = 3 \cdot \frac{1}{3} + (5 + E[T])\frac{1}{3} + (7 + E[T])\frac{1}{3}. \]
Solving this algebraic equation for $E[T]$, we conclude
\[ \frac{1}{3}E[T] = 5 \Rightarrow E[T] = 15. \]
b. We have $\mathrm{Var}(T) = E[T^2] - E[T]^2$, where we already know that $E[T] = 15$. For $E[T^2]$, we let $D_k$ denote the event that the miner goes through door $k$ and compute
\[ E[T^2] = \sum_{k=1}^{3} E[T^2|D_k]P(D_k) = 9 \cdot \frac{1}{3} + E[(5 + T)^2]\frac{1}{3} + E[(7 + T)^2]\frac{1}{3}, \]
which, after expanding the squares and using $E[T] = 15$, we can solve for
\[ \frac{1}{3}E[T^2] = \frac{9 + 25 + 150 + 49 + 210}{3} \Rightarrow E[T^2] = 443. \]
Finally,
\[ \mathrm{Var}(T) = E[T^2] - E[T]^2 = 443 - 225 = 218. \]

15. Set $Y = e^X$ and observe that $Y \ge 0$. By Markov’s inequality
\[ P(Y \ge e^a) \le \frac{E[Y]}{e^a} = e^{-a}E[e^X]. \]
Finally, note that $e^X \ge e^a$ if and only if $X \ge a$.

16. For (a) we use
\[ \bar{p} = \frac{1}{n}\sum_{j=1}^{n} X_j, \]
where the $\{X_j\}_{j=1}^{n}$ are instances of $X$, since $E[X] = p$. For (b), we have
\[ P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - p\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2} \le \frac{1}{4n\epsilon^2}, \]
so if we want $\epsilon = .02$ we find $n$ so that
\[ \frac{1}{4n(.02)^2} \le .01 \Rightarrow n \ge \frac{1}{4 \cdot .01 \cdot (.02)^2} = 62500. \]

17. Since $U_1$ and $U_2$ are both uniform on $[0, 1]$, we know that for all $a, b \in [0, 1]$, $a \le b$, there holds
\[ P(a \le U_k \le b) = b - a, \quad k = 1, 2. \]
Now let $c, d \in [0, 2]$, $c \le d$. The question becomes, is it true that
\[ P(c \le X \le d) = \frac{d - c}{2}. \qquad (1) \]
(Keep in mind that if $c = 0$ and $d = 2$ the probability must be 1.) Intuitively, we think that $X$ should not be uniform, because there are many more ways to get a sum such as 1 than a sum such as 0. In order to find a counterexample to (1), consider a particular pair of values, say $c = \frac{1}{2}$ and $d = \frac{3}{2}$. Assuming $U_1$ and $U_2$ are independent, we have
\[ P(\tfrac{1}{2} \le X \le \tfrac{3}{2}) = P(\tfrac{1}{2} \le U_1 + U_2 \le \tfrac{3}{2}) \]
\[ \ge P(\tfrac{1}{2} \le U_1 \le 1)P(U_2 \le \tfrac{1}{2}) + P(U_1 \le \tfrac{1}{2})P(\tfrac{1}{2} \le U_2 \le 1) + P(\tfrac{1}{4} \le U_1 \le \tfrac{1}{2})P(\tfrac{1}{4} \le U_2 \le \tfrac{1}{2}) \]
\[ = \frac{1}{2} + \frac{1}{16} > \frac{1}{2}. \]

Solutions to Part 2

1. For (a) the main idea is to generate two random variables $U_1 \in [-3, 3]$ and $U_2 \in [1, 4]$, and to set
\[ X = \begin{cases} 1 & (U_1, U_2) \in P \\ 0 & \text{otherwise,} \end{cases} \]
where $P$ denotes the polygon. If we let $A$ denote the area of the polygon, then
\[ \frac{A}{18} \approx \frac{1}{n}\sum_{j=1}^{n} X_j. \]
(Note that 18 is the area of the rectangle $R$ containing the polygon with $(U_1, U_2) \in R$.) The approximation is carried out with polyarea1.m.
function polyarea1(n)
%POLYAREA1: MATLAB function M-file that takes a
%number of simulations as input and uses simulation
%to compute the area of the polygon defined by
%the following inequalities:
%x+y>1
%y<4
%y>1
%x-y<0
%y+(2/3)x<5
%
np=0; %Number of sample points that land in the polygon
rng('shuffle')
for k=1:n
    x = 6*rand-3; %Uniform on [-3,3]
    y = 3*rand+1; %Uniform on [1,4]
    if x+y>1 && y<4 && y>1 && x-y<0 && y+(2/3)*x<5
        np = np+1;
    end
end
%The value printed is the area estimate, 18*(np/n)
fprintf('The probability is approximately %5.3f\n',(np/n)*18)

b. Using the inequality from the Weak Law of Large Numbers, we have
\[ P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - \frac{A}{18}\Big| \ge k\Big) \le \frac{\sigma^2}{nk^2}, \]
which we re-express as
\[ P\Big(\Big|\frac{18}{n}\sum_{j=1}^{n} X_j - A\Big| \ge 18k\Big) \le \frac{\sigma^2}{nk^2}. \]
(Notice particularly that the appearance of $k$ on the right hand side does not change, because all we’ve done is re-written the inequality.) In this case, we take $18k = .005$ so that $k = \frac{.005}{18}$, and we must find $n$ sufficiently large so that (using $\sigma^2 \le \frac{1}{4}$)
\[ \frac{1}{4n(.005/18)^2} \le .1 \Rightarrow n \ge \frac{1}{4(.005/18)^2(.1)} = 32400000. \]
We implement this below.

>>polyarea1(32400000)
The probability is approximately 10.750

It’s easy to check that the exact area is 10.75.

2. For (a) the simulation is given in fermat24.

function fermat24(n)
%FERMAT24: simulates a game in which a pair of
%dice are rolled twenty-four times, and determines
%through Monte Carlo simulation the probability
%that a pair of sixes will turn up.
sixes = 0; %Initialize number of sixes
for k = 1:n
    for j=1:24
        if rand <= 1/36 %Two sixes
            sixes = sixes + 1;
            break %Terminate experiment so multiple pairs not counted
        end
    end
end
disp(['Probability for pair of sixes: ' num2str(sixes/n)])

For (b), we consider a sequence of random variables $\{X_j\}_{j=1}^{n}$ so that $X_j$ is 1 if double sixes occurs at least once in the trial and 0 otherwise. Then our inequality from the Weak Law of Large Numbers gives
\[ P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - p\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2}, \]
where we take $\epsilon = .001$. Using $\sigma^2 \le \frac{1}{4}$, we find $n$ so that
\[ \frac{1}{4n(.001)^2} \le .05 \Rightarrow n \ge \frac{1}{4(.05)(.001)^2} = 5000000. \]
A MATLAB implementation is given below:

>>fermat24(5000000)
Probability for pair of sixes: 0.49188

We showed in class that the exact probability is .4914.

3. For (a) we use rolls5.m.

function rolls5(n)
%ROLLS5: MATLAB function M-file that takes a number of
%simulations n as input and simulates the following
%experiment: a fair die is rolled five times and
%the numbers are added together. The file approximates
%the probabilities of getting values 13, 17, and 23.
%
n13=0; n17=0; n23=0;
for k=1:n
    rolls = sum(randi(6,[1 5])); %Total of five die rolls
    if rolls == 13
        n13=n13+1;
    elseif rolls == 17
        n17=n17+1;
    elseif rolls == 23
        n23=n23+1;
    end
end
fprintf('The probability of getting 13 is %5.4f\n',n13/n)
fprintf('The probability of getting 17 is %5.4f\n',n17/n)
fprintf('The probability of getting 23 is %5.4f\n',n23/n)

For (b) we consider three random variables $X_{13}$, $X_{17}$, and $X_{23}$ with
\[ X_{13} = \begin{cases} 1 & \text{if the total is 13} \\ 0 & \text{otherwise,} \end{cases} \]
and similarly for the other two. In each case, if $\{X_j\}_{j=1}^{n}$ is a sequence of observations then from the Weak Law of Large Numbers we have the inequality
\[ P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - p_{13}\Big| \ge \epsilon\Big) \le \frac{\sigma^2}{n\epsilon^2} \le \frac{1}{4n\epsilon^2}, \]
using our usual estimate $\sigma^2 \le \frac{1}{4}$. Clearly then
\[ P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - p_{13}\Big| < \epsilon\Big) = 1 - P\Big(\Big|\frac{1}{n}\sum_{j=1}^{n} X_j - p_{13}\Big| \ge \epsilon\Big) \ge 1 - \frac{1}{4n\epsilon^2}. \]
The probability that all three errors are small at once is then at least (assuming independence)
\[ \Big(1 - \frac{1}{4n\epsilon^2}\Big)^3, \]
so we will determine $n$ from the inequality
\[ \Big(1 - \frac{1}{4n(.01)^2}\Big)^3 > .95. \]
We find
\[ n > \frac{1}{4(.01)^2(1 - (.95)^{1/3})} = 147471.51, \]
so that we use $n = 147472$. A MATLAB implementation is given below.

>>rolls5(147472)
The probability of getting 13 is 0.0537
The probability of getting 17 is 0.1001
The probability of getting 23 is 0.0388

4. For (a) we use the M-file walk1.m.
function walk1(p,n)
%WALK1: MATLAB function M-file that takes as input
%a probability p and a number of simulations n and
%simulates a random walk on the integers, starting
%at position 0 and moving to the right 1 unit with
%probability p and to the left one unit with
%probability 1-p.
%
%The walk takes 10 steps, and probabilities are
%computed for ending on 0, 1, or 2.
%
n0=0;n1=0;n2=0;
for k=1:n
    X=0; %Each walk starts at position 0
    for j=1:10
        if rand < p
            X=X+1;
        else
            X=X-1;
        end
    end
    if X==0
        n0=n0+1;
    elseif X==1
        n1=n1+1;
    elseif X==2
        n2=n2+1;
    end
end
fprintf('The object ended on 0 with probability %5.4f.\n',n0/n)
fprintf('The object ended on 1 with probability %5.4f.\n',n1/n)
fprintf('The object ended on 2 with probability %5.4f.\n',n2/n)

For (b), we begin by observing that the object will always be on an even number after 10 steps, so the probability of getting 1 is 0. In this way, we only have two probabilities to determine, and the probability that the errors on these are both small at once is at least
\[ \Big(1 - \frac{1}{4n\epsilon^2}\Big)^2, \]
so we will determine $n$ from the inequality
\[ \Big(1 - \frac{1}{4n(.01)^2}\Big)^2 > .95. \]
We find
\[ n > \frac{1}{4(.01)^2(1 - (.95)^{1/2})} = 98733.97, \]
so that we use $n = 98734$. A diary session of the implementation is given below.

>>walk1(.5,98734)
The object ended on 0 with probability 0.2460.
The object ended on 1 with probability 0.0000.
The object ended on 2 with probability 0.2056.
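
As a rough cross-check on the simulated values for the random walk, the end position after 10 steps can also be computed exactly: the object ends on position 2R − 10, where R, the number of steps to the right, is binomial with parameters 10 and p. The sketch below is not one of the course M-files; the name walkcheck and its layout are assumptions made here for illustration only. It also recomputes the sample size n from the inequality used in (b).

```matlab
function walkcheck
%WALKCHECK: hypothetical helper (not a course M-file) that
%computes the exact probabilities of ending on 0, 1, or 2
%after a 10-step random walk with p=.5, and recomputes the
%number of simulations n used in part (b).
p = .5; steps = 10;
for pos = [0 1 2]
    r = (pos + steps)/2; %Number of right steps needed to end on pos
    if r == floor(r)
        %Binomial probability of exactly r right steps
        prob = nchoosek(steps,r)*p^r*(1-p)^(steps-r);
    else
        prob = 0; %Odd positions cannot be reached in 10 steps
    end
    fprintf('Exact probability of ending on %d: %6.4f\n',pos,prob)
end
%Smallest integer n with (1 - 1/(4*n*ep^2))^2 > .95 for ep = .01
ep = .01;
n = ceil(1/(4*ep^2*(1-.95^(1/2))));
fprintf('Required number of simulations: %d\n',n)
```

The exact values are 252/1024 for position 0 and 210/1024 for position 2, so the simulated probabilities 0.2460 and 0.2056 in the diary session both fall well within the .01 error tolerance.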