Instructor's Solutions Manual, version October 14, 2017
Probability: A Lively Introduction
Henk Tijms, Cambridge University Press, 2017

Despite careful checking, there is a non-negligible probability that there are errors in the answers. If anything should be corrected, please let me know by sending an email to h.c.tijms@gmail.com.

Chapter 1

1.1 Imagine a blue and a red die. Take as sample space the set of the ordered pairs (b, r), where b is the number shown on the blue die and r is the number shown on the red die. Each of the 36 elements (b, r) is equally likely. There are 2 × 3 × 3 = 18 elements for which exactly one component is odd. Thus the probability that the sum of the two dice is odd equals 18/36 = 1/2. There are 3 × 3 = 9 elements for which both components are odd. Thus the probability that the product of the two numbers rolled is odd equals 9/36 = 1/4. Alternatively, you can obtain the probabilities by using as sample space the set consisting of the four equiprobable elements (odd, odd), (odd, even), (even, even), and (even, odd).

1.2 Label the plumbers as 1 and 2. Take as sample space the set of all possible sequences of ones and twos of length 3, where a one stands for plumber 1 and a two for plumber 2. The sample space has 2^3 = 8 equiprobable outcomes. There are 2 outcomes with three ones or three twos. The sought probability is 2/8 = 1/4.

1.3 Label the 10 letters of "randomness" as 1 to 10. Take as sample space the set of all permutations of the numbers 1 to 10. All 10! outcomes are equally likely. There are 3 × 2 × 8! outcomes that begin and end with a vowel and there are 8 × 3! × 7! outcomes in which the three vowels are adjacent to each other. The probabilities are 3 × 2 × 8!/10! = 1/15 and 8 × 3! × 7!/10! = 1/15.

1.4 Take as sample space the set of all possible sequences of zeros and ones of length 4, where a zero stands for male gender and a one for female gender. The sample space has 2^4 = 16 equiprobable outcomes.
There are 8 outcomes with exactly three zeros or exactly three ones and 6 outcomes with exactly two zeros. Hence the probability of three puppies of one gender and one of the other is 8/16. The probability of two puppies of each gender is 6/16.

1.5 Take as sample space the set of all unordered samples of m different numbers. The sample space has $\binom{n}{m}$ equiprobable elements. There are $\binom{n-1}{m-1}$ samples that contain the largest number. The probability of getting the largest number is $\binom{n-1}{m-1}/\binom{n}{m} = m/n$. Alternatively, you can take as sample space the set of all n! permutations of the integers 1 to n. There are m × (n − 1)! permutations for which the number n is in one of the first m positions.
Note: More generally, the probability that the largest r numbers are among the m numbers picked is given by both $\binom{n-r}{m-r}/\binom{n}{m}$ and $\binom{m}{r}\,r!\,(n-r)!/n!$.

1.6 Take as sample space the set of all possible combinations of two persons who do the dishes. The sample space has $\binom{6}{2} = 15$ equally likely outcomes. The number of outcomes consisting of two boys is $\binom{3}{2} = 3$. The sought probability is 3/15 = 1/5. Alternatively, using an ordered sample space consisting of all 6! possible orderings of the six people and imagining that the first two people in the ordering have to do the dishes, the sought probability can be calculated as 3 × 2 × 4!/6! = 1/5.

1.7 Imagine that the balls are labeled as 1, . . . , n. It is no restriction to assume that the two winning balls have the labels 1 and 2. Take as sample space the set of all n! permutations of 1, . . . , n. For any k, the number of permutations having either 1 or 2 in the kth place is (n − 1)! + (n − 1)!. Thus, the probability that the kth person picks a winning ball is [(n − 1)! + (n − 1)!]/n! = 2/n for each k.

1.8 Take as sample space the set of all ordered pairs (i, j) with i, j = 1, . . . , 6, where i is the number rolled by player A and j is the number rolled by player B. The sample space has 36 equally likely outcomes.
The number of winning outcomes for player B is 9 + 11 = 20. The probability of player A winning is 16/36 = 4/9.

1.9 Take as sample space the set of all unordered samples of six different numbers from the numbers 1 to 42. The sample space has $\binom{42}{6}$ equiprobable outcomes. There are $\binom{41}{5}$ outcomes with the number 10. Thus the probability of getting the number 10 is $\binom{41}{5}/\binom{42}{6} = 6/42$. The probability that each of the six numbers picked is 20 or more is equal to $\binom{23}{6}/\binom{42}{6} = 0.0192$. Alternatively, the probabilities can be calculated by using the sample space consisting of all ordered arrangements of the numbers 1 to 42, where the numbers in the first six positions are the lotto numbers. This leads to the calculations (6 × 41!)/42! = 6/42 and $\binom{23}{6} \times 6! \times 36!/42! = 0.0192$ for the sought probabilities.

1.10 Take as (unordered) sample space all possible combinations of two candidates to receive a cup of tea from the waiter. The sample space has $\binom{5}{2} = 10$ equally likely outcomes. The number of combinations of two people each getting the cup of tea they ordered is 1. The sought probability is 1/10. Alternatively, using an ordered sample space consisting of all possible orderings of the five people and imagining that the first two people in the ordering get a cup of tea from the waiter, the probability can be calculated as 2 × 1 × 3!/5! = 1/10.

1.11 Label the nine socks as s1, . . . , s9. The probability model in which the order of selection of the socks is considered relevant has a sample space with 9 × 8 = 72 equiprobable outcomes (si, sj). There are 4 × 5 = 20 outcomes for which the first sock chosen is black and the second is white, and there are 5 × 4 = 20 outcomes for which the first sock is white and the second is black. The sought probability is 40/72 = 5/9. The probability model in which the order of selection of the socks is not considered relevant has a sample space with $\binom{9}{2} = 36$ equiprobable outcomes.
The number of outcomes for which the socks have different colors is $\binom{5}{1} \times \binom{4}{1} = 20$, yielding the same value 20/36 = 5/9 for the sought probability.

1.12 This problem can be solved by using either an ordered sample space or an unordered sample space. Label the ten letters of the word Cincinnati as 1, 2, . . . , 10. As ordered sample space, take the set of all ordered pairs (i1, i2), where i1 is the label of the first letter dropped and i2 is the label of the second letter dropped. This sample space has 10 × 9 = 90 equally likely outcomes. Let A be the event that the two letters dropped are the same. Noting that in the word Cincinnati the letter c occurs two times and the letters i and n each occur three times, it follows that there are $\binom{2}{2} \times 2! + \binom{3}{2} \times 2! + \binom{3}{2} \times 2! = 14$ outcomes leading to the event A. Hence P(A) = 14/90 = 7/45. An unordered sample space can also be used. This sample space consists of all possible sets of two differently labeled letters from the ten letters of Cincinnati. This sample space has $\binom{10}{2} = 45$ equally likely outcomes. The number of outcomes for which the two labeled letters in the set represent the same letter is $\binom{2}{2} + \binom{3}{2} + \binom{3}{2} = 7$. This gives the same value 7/45 for the probability that the two letters dropped are the same.

1.13 Take as sample space the set of all unordered pairs of two distinct cards. The sample space has $\binom{52}{2}$ equally likely outcomes. There are $\binom{1}{1} \times \binom{51}{1} = 51$ outcomes with the ten of hearts, and $\binom{12}{1} \times \binom{3}{1} = 36$ outcomes with a heart and a ten but not the ten of hearts. The sought probability is $(51 + 36)/\binom{52}{2} = 0.0656$.

1.14 Represent the words chance and choice by chanCe and choiCe. Take as sample space the set of all possible pairs (l1, l2), where l1 is an element from the word chanCe and l2 is an element from the word choiCe. By distinguishing between c and C, the sample space has 6 × 6 = 36 equally likely outcomes. The number of outcomes for which the two chosen letters represent the same letter is 4 + 1 + 1 = 6.
The sought probability is 6/36 = 1/6.

1.15 Take as sample space the set of all sequences (i1, . . . , i10), where ik is the number shown on the kth roll of the die. Each element of the sample space is equally likely. The explanation is that there is a one-to-one correspondence between the elements (i1, . . . , i10) with $\sum_{k=1}^{10} i_k = s$ and the elements (7 − i1, . . . , 7 − i10) with $\sum_{k=1}^{10} (7 - i_k) = 70 - s$.

1.16 Take as ordered sample space the set of all sequences (i1, . . . , i12), where ik is the number rolled by the kth die. The sample space has 6^12 equally likely outcomes. The number of outcomes in which each number appears exactly two times is $\binom{12}{2}\binom{10}{2}\binom{8}{2}\binom{6}{2}\binom{4}{2}\binom{2}{2} = 12!/2^6$. The sought probability is $\frac{12!}{2^6 \times 6^{12}} = 0.0034$.

1.17 Take as sample space the set of all possible samples of three residents. This leads to the value $\binom{4}{1}\binom{4}{1}\binom{4}{1}/\binom{12}{3} = 16/55$ for the sought probability.

1.18 Take as sample space the set of all ordered arrangements of 10 people, where the people in the first five positions form group 1 and the other five people form group 2. The sample space has 10! equally likely elements. The number of elements for which your two friends and you together are in the same group is 5 × 4 × 3 × 7! + 5 × 4 × 3 × 7!. The sought probability is 120 × 7!/10! = 1/6. Alternatively, the probability can be calculated as $[\binom{3}{3}\binom{7}{2} + \binom{3}{0}\binom{7}{5}]/\binom{10}{5} = 1/6$, using as sample space the set of all possible combinations of five people for the first group. A third way to calculate the probability is $[\binom{5}{3} + \binom{5}{3}]/\binom{10}{3} = 1/6$, using as sample space the set of all possible combinations of three positions for the three friends.

1.19 Take as sample space the set of the 9! possible orderings of the nine books. The subjects mathematics, physics and chemistry can be ordered in 3 × 2 × 1 = 6 ways and so the number of favorable orderings is 6 × 4! × 3! × 2!. The sought probability is (6 × 4! × 3! × 2!)/9! = 1/210.
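The counting result of Problem 1.19 is easy to check by simulation. The following Python sketch (an illustration, not part of the original solution; the subject labels and trial count are arbitrary choices) estimates the probability that the books of each subject end up adjacent:

```python
import random
from itertools import groupby

def grouped_by_subject(shelf):
    """True if each subject forms exactly one run of adjacent books."""
    return len([key for key, _ in groupby(shelf)]) == len(set(shelf))

# 4 mathematics, 3 physics, 2 chemistry books, as in Problem 1.19.
books = ["M"] * 4 + ["P"] * 3 + ["C"] * 2
random.seed(0)
trials = 200_000
hits = sum(grouped_by_subject(random.sample(books, len(books)))
           for _ in range(trials))
print(hits / trials)   # should be close to 1/210 ≈ 0.00476
```

With 200,000 trials the estimate typically agrees with 1/210 to about three decimal places.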
1.20 The sample space is Ω = {(i, j, k) : i, j, k = 0, 1}, where the three components correspond to the outcomes of the three individual tosses of the three friends. Here 0 means heads and 1 means tails. Each element of the sample space gets assigned a probability of 1/8. Let A denote the event that one of the three friends pays for all the three tickets. The set A is given by A = Ω \ {(0, 0, 0), (1, 1, 1)} and consists of six elements. The sought probability is P(A) = 6/8.

1.21 Label the eleven letters of the word Mississippi as 1, 2, . . . , 11 and take as sample space the set of the 11^11 possible ordered sequences of eleven numbers from 1, . . . , 11. The four positions for a number representing i, the four positions for a number representing s, the two positions for a number representing p, and the one position for the number representing m can be chosen in $\binom{11}{4} \times \binom{7}{4} \times \binom{3}{2}$ ways. Therefore the number of outcomes in which all letters of the word Mississippi are represented is $\binom{11}{4}\binom{7}{4}\binom{3}{2} \times 4^4 \times 4^4 \times 2^2$. Dividing this number by 11^11 gives the value 0.0318 for the sought probability.

1.22 One pair is a hand with the pattern aabcd, where a, b, c and d are from distinct kinds of cards. There are 13 kinds and four of each kind in a standard deck of 52 cards. The probability of getting one pair is $\binom{13}{1}\binom{4}{2}\binom{12}{3}\binom{4}{1}^3 / \binom{52}{5} = 0.4226$. Two pair is a hand with the pattern aabbc, where a, b and c are from distinct kinds of cards. The probability of getting two pair is $\binom{13}{2}\binom{4}{2}\binom{4}{2}\binom{11}{1}\binom{4}{1} / \binom{52}{5} = 0.0475$.

1.23 Take as sample space the set of all possible combinations of two apartments from the 56 apartments. These two apartments represent the vacant apartments. The sample space has $\binom{56}{2}$ equiprobable elements. The number of elements with no vacant apartment on the top floor is $\binom{48}{2}$. Thus the sought probability is $[\binom{56}{2} - \binom{48}{2}]/\binom{56}{2} = 0.2675$.
Alternatively, using a sample space made up of all permutations of the 56 apartments, the probability can be calculated as 1 − (48 × 47 × 54!)/56! = 0.2675.

1.24 Imagine that the balls are labeled as 1 to 11, where the white balls get the labels 1 to 7 and the red balls the labels 8 to 11. Take as sample space the set of all possible permutations of 1, 2, . . . , 11. The number of outcomes in which a red ball appears for the first time at the ith drawing is $\binom{7}{i-1} \times (i-1)! \times 4 \times (7-(i-1)+3)!$ for 1 ≤ i ≤ 8. The sought probability is
$$\frac{1}{11!} \sum_{k=1}^{4} \binom{7}{2k-1} \times (2k-1)! \times 4 \times (7-(2k-1)+3)! = \frac{13}{33}.$$

1.25 Take as sample space the set of all ordered pairs (i, j), where i is the first number picked and j is the second number picked. There are n^2 equiprobable outcomes. For r ≤ n + 1, the r − 1 outcomes (1, r − 1), (2, r − 2), . . . , (r − 1, 1) are the only outcomes (i, j) for which i + j = r. Thus the probability that the sum of the two numbers picked is r is (r − 1)/n^2 for 2 ≤ r ≤ n + 1. Therefore the probability of getting a sum s when rolling two dice is (s − 1)/36 for 2 ≤ s ≤ 7. By a symmetry argument, the probability of getting a sum s is the same as the probability of getting a sum 14 − s for 7 ≤ s ≤ 12 (opposite faces of a die always total 7). Thus the probability of rolling a sum s has the value (13 − s)/36 for 7 ≤ s ≤ 12.

1.26 Take as sample space the interval (0, 1). The outcome x means that the stick is broken at the point x. The length of the longer piece is at least three times the length of the shorter piece if x ∈ (0, 1/4) or x ∈ (3/4, 1). The sought probability is 1/4 + 1/4 = 1/2.

1.27 Take as sample space the square {(x, y) : 0 ≤ x, y ≤ a}. The outcome (x, y) refers to the position of the middle point of the coin. The sought probability is given by the probability that a randomly chosen point in the square falls in the subset {(x, y) : d/2 ≤ x, y ≤ a − d/2} and is equal to (a − d)^2/a^2.

1.28 Take as sample space the interval (0, 1).
The outcome x means that a randomly chosen point in (0, 1) is equal to x. The sought probability is the probability that a randomly chosen point in (0, 1) falls into one of the intervals (0, 1/12) or (1/2, 7/12). The sought probability is 1/12 + 1/12 = 1/6.

1.29 This problem can be solved with the model of picking at random a point inside a rectangle. The rectangle R = {(x, y) : 0 ≤ x ≤ 1, 1/2 ≤ y ≤ 1} is taken as sample space, where the outcome (x, y) means that you arrive 60x minutes past 7 a.m. and your friend arrives 60y minutes past 7 a.m. The probability assigned to each subset of the sample space is the area of the subset divided by the area of the rectangle R. The sought probability is P(A), where the set A is the union of the three disjoint subsets {(x, y) : 1/2 < x, y < 1/2 + 1/12}, {(x, y) : 1/2 + 1/12 < x, y < 3/4} and {(x, y) : 3/4 < x, y < 1}. This gives
$$P(A) = 2\left(\frac{1}{12} \times \frac{1}{12} + \frac{1}{6} \times \frac{1}{6} + \frac{1}{4} \times \frac{1}{4}\right) = \frac{7}{36}.$$

1.30 Translate the problem into choosing a point at random inside the unit square {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. The probability that the two persons will meet within 10 minutes of each other is given by the probability that a point chosen at random in the unit square will fall inside the shaded region in the figure, the band between the lines y = x − 1/6 and y = x + 1/6. The area of the shaded region is calculated as 1 − (5/6) × (5/6) = 0.3056. This gives the desired probability.

1.31 For q = 1, take the square {(x, y) : −1 < x, y < 1} as sample space. The sought probability is the probability that a point (x, y) chosen at random in the square satisfies y ≤ (1/4)x^2 and is equal to $\frac{1}{4}\left(2 + \int_{-1}^{1} \frac{1}{4}x^2\,dx\right) = 0.5417$. For the general case, the sought probability is (make a picture!):
$$\frac{1}{4q^2}\left(2q^2 + \int_{-q}^{q} \frac{1}{4}x^2\,dx\right) = \frac{1}{2} + \frac{q}{24} \quad \text{for } 0 < q < 4,$$
$$\frac{1}{4q^2}\left(2q^2 + \int_{-2\sqrt{q}}^{2\sqrt{q}} \frac{1}{4}x^2\,dx + 2(q - 2\sqrt{q})q\right) = 1 - \frac{2}{3\sqrt{q}} \quad \text{for } q \ge 4.$$

1.32 Take as sample space the set {(x, y) : 0 ≤ x, y ≤ 1}. The outcome (x, y) means that a randomly chosen point in the unit square is equal to (x, y).
The probability that the Manhattan distance from a randomly chosen point to the point (0, 0) is no more than a is given by the probability that the randomly chosen point (x, y) satisfies x + y ≤ a. The area of the region {(x, y) : 0 ≤ x, y ≤ 1 and x + y ≤ a} is (1/2)a^2. This gives the sought probability. By a symmetry argument, this probability also applies to the case that the point is randomly chosen in the square {(x, y) : −1 ≤ x, y ≤ 1}.

1.33 Take as sample space the set S = {(y, θ) : 0 ≤ y ≤ D/2, 0 ≤ θ ≤ π/2}, where y is the distance from the midpoint of the diagonal of the rectangular card to the closest line on the floor and the angle θ is as described in the figure. It is no restriction to assume that b ≥ a. Using the figure, it is seen that the card will intersect one of the lines on the floor if and only if the distance y is less than $y_\theta$, where $y_\theta$ is determined by $\sin(\alpha + \theta) = y_\theta/(\frac{1}{2}\sqrt{a^2 + b^2})$. Since sin(α + θ) = sin(α)cos(θ) + cos(α)sin(θ) with $\sin(\alpha) = a/\sqrt{a^2+b^2}$ and $\cos(\alpha) = b/\sqrt{a^2+b^2}$, it follows that $y_\theta = \frac{1}{2}(a\cos(\theta) + b\sin(\theta))$. The sought probability is the area under the curve $y = \frac{1}{2}(a\cos(\theta) + b\sin(\theta))$ divided by the area of the set S and so it is equal to
$$\frac{1}{(1/4)\pi D} \int_0^{\pi/2} \frac{1}{2}\big(a\cos(\theta) + b\sin(\theta)\big)\,d\theta = \frac{2(a+b)}{\pi D}.$$

1.34 The perpendicular distance from the randomly chosen point to the base is larger than d if and only if the point falls inside the shaded region of the triangle in the left figure. Using the fact that the base of the shaded triangle is ((h − d)/h)b (the ratio of this base and b equals (h − d)/h), it follows that the first probability is given by
$$\frac{\frac{1}{2}(h-d) \times (h-d)b/h}{\frac{1}{2}h \times b} = \frac{(h-d)^2}{h^2}.$$
The randomly chosen point and the base of the triangle form an obtuse triangle if and only if the randomly chosen point falls inside the shaded region in the right figure. The area of the shaded region is the sum of the areas of two equilateral triangles with side lengths b/2 plus the area of one sixth of a circle with radius b/2.
The area of an equilateral triangle with sides of length a is given by $\frac{1}{4}\sqrt{3}a^2$. It now follows that the second probability is given by
$$\frac{(1/4)\sqrt{3}(b/2)^2 + (1/6)\pi(b/2)^2 + (1/4)\sqrt{3}(b/2)^2}{(1/4)\sqrt{3}b^2} = \frac{1}{2} + \frac{\pi}{6\sqrt{3}} = 0.8023.$$

1.35 Take as sample space the unit square {(x, y) : 0 ≤ x, y ≤ 1}. The side lengths v = x, w = y × (1 − v) and 1 − v − w should satisfy the conditions v + w > 1 − v − w, v + 1 − v − w > w and w + 1 − v − w > v. These conditions can be translated into y > (1 − 2x)/(2 − 2x), y < 1/(2 − 2x) and x < 1/2. The sought probability is given by the area of the shaded region in the first part of the figure and is equal to $\int_0^{0.5} \frac{1}{2-2x}\,dx - \int_0^{0.5} \frac{1-2x}{2-2x}\,dx = \ln(2) - 0.5$. To find the second probability, let v be the first random breakpoint chosen on the stick and w be the other breakpoint. The point (v, w) can be represented by v = x and w = y × (1 − v) if v < 1/2 and by v = x and w = y × v if v > 1/2, where (x, y) is a randomly chosen point in the unit square. The second probability is the area of the shaded region in the second part of the figure and is equal to 2(ln(2) − 0.5).

1.36 The problem can be translated into choosing a random point in the unit square. The sample space is {(x, y) : 0 ≤ x, y ≤ 1}. For any point (x, y) in the sample space, distinguish between the cases x > y and x < y (the probability that a randomly chosen point (x, y) satisfies x = y is zero). Consider first the case of x > y. Then the three side lengths are y, x − y and 1 − x. Three side lengths a, b and c form a triangle if and only if a + b > c, a + c > b and b + c > a. Hence the lengths y, x − y and 1 − x must satisfy the three conditions y + x − y > 1 − x, y + 1 − x > x − y and x − y + 1 − x > y. These three conditions can be rewritten as x > 1/2, y > x − 1/2 and y < 1/2. The set of all points (x, y) satisfying these three conditions is the lower-right shaded region in the figure. Next consider the case of x < y.
Then the three side lengths are x, y − x, and 1 − y. Then (x, y) must satisfy the three conditions y > 1/2, x > y − 1/2 and x < 1/2. The set of all points (x, y) satisfying these three conditions is the top-left shaded region in the figure. The area of the two shaded regions is 1/8 + 1/8. Hence the desired probability is 1/4.

1.37 The unique chord having the randomly chosen point P as its midpoint is the chord that is perpendicular to the line connecting the point P to the center O of the circle, see the figure. A little geometry tells us that the chord is longer than the side of the equilateral triangle if and only if the point P falls inside the shaded inner circle in the figure. Thus the sought probability is $\frac{\pi(r/2)^2}{\pi r^2} = \frac{1}{4}$.

1.38 This is a compound experiment that consists of three subexperiments. Take as sample space the set Ω = {(i, j, k) : i, j, k = 0, 1}, where the component i = 1 (0) if player A beats (is beaten by) player B, the component j = 1 (0) if player A beats (is beaten by) player C, and the component k = 1 (0) if player B beats (is beaten by) player C. The probabilities $p_{i,j,k}$ assigned to the eight outcomes (i, j, k) are $p_{0,0,0}$ = 0.5 × 0.3 × 0.6 = 0.09, $p_{1,0,0}$ = 0.5 × 0.3 × 0.6 = 0.09, $p_{0,1,0}$ = 0.5 × 0.7 × 0.6 = 0.21, $p_{0,0,1}$ = 0.5 × 0.3 × 0.4 = 0.06, $p_{1,1,0}$ = 0.5 × 0.7 × 0.6 = 0.21, $p_{1,0,1}$ = 0.5 × 0.3 × 0.4 = 0.06, $p_{0,1,1}$ = 0.5 × 0.7 × 0.4 = 0.14, and $p_{1,1,1}$ = 0.5 × 0.7 × 0.4 = 0.14. Denote by E the event that player A wins at least as many games as any other player; then E = {(0, 1, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)}. Thus the desired probability is P(E) = 0.21 + 0.21 + 0.06 + 0.14 = 0.62.

1.39 Take as sample space the set of all four-tuples (δ1, δ2, δ3, δ4), where δi = 0 if component i has failed and δi = 1 otherwise. The probability r1 r2 r3 r4 is assigned to (δ1, δ2, δ3, δ4), where ri = fi if δi = 0 and ri = 1 − fi if δi = 1.
Let A0 be the event that the system fails, A1 be the event that none of the four components fails and Ai be the event that only component i fails for i = 2, 3 and 4. Then P(A1) = (1 − f1)(1 − f2)(1 − f3)(1 − f4), P(A2) = (1 − f1)f2(1 − f3)(1 − f4), P(A3) = (1 − f1)(1 − f2)f3(1 − f4), and P(A4) = (1 − f1)(1 − f2)(1 − f3)f4. The events A0, A1, . . . , A4 are mutually exclusive and their union is the sample space. Hence the sought probability is $P(A_0) = 1 - \sum_{i=1}^{4} P(A_i)$.

1.40 Proceeding along the same lines as in Example 1.12, the probability that Bill is the first person to pick a red ball is $\sum_{k=1}^{\infty} \left(\frac{7}{11}\right)^{2k-1} \frac{4}{11} = \frac{7}{18}$.

1.41 Take as sample space the set {(1, s), (2, s), . . .} ∪ {(1, e), (2, e), . . .} and assign to the outcomes (i, s) and (i, e) the probabilities (1 − a7 − a8)^{i−1} a7 and (1 − a7 − a8)^{i−1} a8, where a7 = 6/36 and a8 = 5/36. The probability of getting a total of 8 before a total of 7 is
$$\sum_{i=1}^{\infty} (1 - a_7 - a_8)^{i-1} a_8 = \frac{a_8}{a_7 + a_8} = \frac{5}{11}.$$

1.42 Using the same reasoning as in Example 1.12, the probability that desperado A will be the one to shoot himself dead is
$$\sum_{n=0}^{\infty} \left(\frac{5}{6}\right)^{3n} \frac{1}{6} = \frac{36}{91}.$$
The probabilities are 30/91 for desperado B and 25/91 for desperado C.

1.43 Take as sample space the set {(s1, s2) : 2 ≤ s1, s2 ≤ 12}, where s1 and s2 are the sums rolled by the two persons. The probability p(s1, s2) = p(s1) × p(s2) is assigned to the outcome (s1, s2), where p(s) is the probability of getting the sum s in a roll of two dice.
The probabilities p(s) are given by p(2) = p(12) = 1/36, p(3) = p(11) = 2/36, p(4) = p(10) = 3/36, p(5) = p(9) = 4/36, p(6) = p(8) = 5/36, and p(7) = 6/36. The probability that the sums rolled are different is
$$\sum_{s_1 \ne s_2} p(s_1, s_2) = 1 - \sum_{s=2}^{12} p(s)^2 = \frac{575}{648}.$$

1.44 (a) Let the set C = B\A consist of those outcomes that belong to B but do not belong to A. The sets A and C are disjoint and B = A ∪ C. Then, by Axiom 1.3 (in fact, Rule 1.1 in Section 1.3 should be used), P(B) = P(A ∪ C) = P(A) + P(C). By Axiom 1.1, P(C) ≥ 0 and so we get the desired result P(B) ≥ P(A).
(b) We can define pairwise disjoint sets B1, B2, . . . such that $\bigcup_{k=1}^{\infty} A_k$ is equal to $\bigcup_{k=1}^{\infty} B_k$. Let B1 = A1 and let B2 = A2\A1. In general, let Bk = Ak\(A1 ∪ · · · ∪ Ak−1) for k = 2, 3, . . . . By induction, B1 ∪ · · · ∪ Bk = A1 ∪ · · · ∪ Ak for any k ≥ 1. Also, the sets B1, . . . , Bk are pairwise disjoint. Hence $\bigcup_{k=1}^{\infty} B_k = \bigcup_{k=1}^{\infty} A_k$ and the sets B1, B2, . . . are pairwise disjoint. Using Axiom 1.3, it now follows that
$$P\Big(\bigcup_{k=1}^{\infty} A_k\Big) = P\Big(\bigcup_{k=1}^{\infty} B_k\Big) = \sum_{k=1}^{\infty} P(B_k).$$
Since Bk ⊆ Ak, we have P(Bk) ≤ P(Ak) and so the desired result follows.

1.45 The sought probability is at least as large as $1 - P(\bigcap_{n=1}^{\infty} B_n)$. We have
$$P(B_n) = \left(1 - \left(\frac{1}{2}\right)^r\right)^n \quad \text{for any } n \ge 1.$$
By the continuity property of probability, $P(\bigcap_{n=1}^{\infty} B_n) = \lim_{n\to\infty} P(B_n) = 0$ and so $1 - P(\bigcap_{n=1}^{\infty} B_n) = 1$.

1.46 It is intuitively clear that the probability is equal to 0.5. This can be proved as follows. Define A (B) as the event that you see at least 10 consecutive tails (heads) before you see 10 consecutive heads (tails) for the first time if you toss a fair coin indefinitely often. Using the result of Problem 1.45, it follows that P(A ∪ B) = 1. The events A and B are mutually exclusive and satisfy P(A) = P(B). This proves that P(A) = P(B) = 0.5.

1.47 Let A be the event that a second-hand car is bought and B be the event that a Japanese car is bought.
Noting that P(A ∪ B) = 1 − 0.55, it follows from P(A ∪ B) = P(A) + P(B) − P(AB) that P(AB) = 0.25 + 0.30 − 0.45 = 0.10.

1.48 Let A be the event that a randomly chosen household is subscribed to the morning newspaper and B be the event that a randomly chosen household is subscribed to the afternoon newspaper. The sought probability P(A ∩ B) = P(A) + P(B) − P(A ∪ B) is 0.5 + 0.7 − 0.8 = 0.4.

1.49 Let A be the event that the truck is used on a given day and B be the event that the van is used on a given day. Then P(A) = 0.75, P(AB) = 0.30 and P(A^c B^c) = 0.10. By De Morgan's first law, P(A ∪ B) = 1 − P(A^c B^c). By P(A ∪ B) = P(A) + P(B) − P(AB), the probability that the van is used on a given day is P(B) = 0.90 − 0.75 + 0.30 = 0.45. Since P(A^c B) + P(AB) = P(B), the probability that only the van is used on a given day is P(A^c B) = 0.45 − 0.30 = 0.15.

1.50 Since B ⊆ A ∪ B, it follows that P(B) ≤ P(A ∪ B) = 3/4. Using the relation P(A ∪ B) = P(A) + P(B) − P(AB), it follows that P(B) ≥ P(A ∪ B) − P(A) = 3/4 − 2/3 = 1/12. Hence 1/12 ≤ P(B) ≤ 3/4.

1.51 The probability that exactly one of the events A and B will occur is given by P((A ∩ B^c) ∪ (B ∩ A^c)) = P(A ∩ B^c) + P(B ∩ A^c). Next note that P(A ∩ B^c) = P(A) − P(A ∩ B) and P(B ∩ A^c) = P(B) − P(A ∩ B). Thus the probability of exactly one of the events A and B occurring is P(A) + P(B) − 2P(AB).
Note: Similarly, we find a formula for the probability of exactly one of the events A, B, and C occurring. This probability is equal to P(A ∩ B^c ∩ C^c) + P(B ∩ A^c ∩ C^c) + P(C ∩ A^c ∩ B^c). The first term P(A ∩ B^c ∩ C^c) can be evaluated as P(A) − [P(A ∩ B) + P(A ∩ C)] + P(A ∩ B ∩ C). In the same way, the other two terms can be evaluated. Thus the formula for the probability of exactly one of the events A, B, and C occurring is P(A) + P(B) + P(C) − 2P(AB) − 2P(AC) − 2P(BC) + 3P(ABC). A general formula for the probability that exactly r of the events A1, . . .
, An will occur is
$$\sum_{k=0}^{n-r} (-1)^k \binom{r+k}{r} \sum_{j_1 < \cdots < j_{r+k}} P(A_{j_1} \cdots A_{j_{r+k}}).$$
As an illustration, let us determine the probability of getting exactly d different face values when rolling a fair die n times, see also Example 10.6 in the book for another approach. Defining Ai as the event that face value i does not appear in n rolls of the die, we get that the desired probability is given by
$$\sum_{k=0}^{\min(n-(6-d),\,d)} (-1)^k \binom{6-d+k}{6-d} \binom{6}{6-d+k} \frac{(d-k)^n}{6^n} \quad \text{for } d = 1, \ldots, 6.$$
If n = 6, this probability has the values 1.28 × 10^{−4}, 0.0199, 0.2315, 0.5015, 0.2315, and 0.0154 for d = 1, . . . , 6.

1.52 In Problem 1.44, the upper bound has already been established. By P(A ∪ B) = P(A) + P(B) − P(AB), the lower bound is true for n = 2. Suppose the lower bound is verified for n = 2, . . . , k. Then, for n = k + 1, let $A = \bigcup_{i=1}^{k} A_i$ and B = A_{k+1}. Using the induction hypothesis, we get
$$P\Big(\bigcup_{i=1}^{k+1} A_i\Big) = P(A) + P(B) - P(AB) \ge \sum_{i=1}^{k} P(A_i) - \sum_{i=1}^{k-1}\sum_{j=i+1}^{k} P(A_i A_j) + P(A_{k+1}) - P\Big(\bigcup_{i=1}^{k} A_i A_{k+1}\Big)$$
$$\ge \sum_{i=1}^{k} P(A_i) - \sum_{i=1}^{k-1}\sum_{j=i+1}^{k} P(A_i A_j) + P(A_{k+1}) - \sum_{i=1}^{k} P(A_i A_{k+1}) = \sum_{i=1}^{k+1} P(A_i) - \sum_{i=1}^{k}\sum_{j=i+1}^{k+1} P(A_i A_j),$$
as was to be verified.

1.53 In this "birthday" problem, the sought probability is
$$1 - \frac{250 \times 249 \times \cdots \times 221}{250^{30}} = 0.8368.$$

1.54 This problem is the birthday problem with m equally likely birthdays and n people. Using the complement rule, the probability that at least one of the outcomes O1, . . . , Om will occur two or more times in n trials is
$$1 - \left(1 - \frac{1}{m}\right)\left(1 - \frac{2}{m}\right) \cdots \left(1 - \frac{n-1}{m}\right) = 1 - \frac{(m-1) \cdots (m-n+1)}{m^{n-1}}.$$
This probability can be approximated by $1 - e^{-1/m} \cdots e^{-(n-1)/m} = 1 - e^{-\frac{1}{2}n(n-1)/m}$ for m large. Solving n from $1 - e^{-\frac{1}{2}n(n-1)/m} = 0.5$ is equivalent to solving n from the quadratic equation $-\frac{1}{2}n(n-1) = m\ln(0.5)$. This yields $n \approx 1.177\sqrt{m} + 0.5$. Using the complement rule, the probability that the outcome O1 occurs at least once is $1 - \frac{(m-1)^n}{m^n} \approx 1 - e^{-n/m}$ for m large. Solving n from $1 - e^{-n/m} = 0.5$ gives n ≈ 0.6931m.
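The exact product formula and the exponential approximation of Problem 1.54 are easy to compare numerically. The following Python sketch (an illustration, not part of the original text; the function names are my own) evaluates both for the data of Problem 1.53 (m = 250, n = 30), together with the crossover rule n ≈ 1.177√m + 0.5:

```python
import math

def p_collision(n, m):
    """Exact probability that at least two of n items share one of m
    equally likely 'birthdays' (Problem 1.54)."""
    p_distinct = 1.0
    for k in range(1, n):
        p_distinct *= (m - k) / m
    return 1.0 - p_distinct

def p_collision_approx(n, m):
    """Approximation 1 - exp(-n(n-1)/(2m)) for large m."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * m))

print(round(p_collision(30, 250), 4))          # exact value 0.8368 (Problem 1.53)
print(round(p_collision_approx(30, 250), 4))   # approximation, about 0.824
print(1.177 * math.sqrt(250) + 0.5)            # n at which the probability reaches 0.5
```

For m as small as 250 the approximation is off in the second decimal; for the lottery-sized m of Problem 1.56 it agrees with the exact value to four decimals, as the text notes.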
1.55 This problem is a variant of the birthday problem. One can choose two distinct numbers from the numbers 1, 2, . . . , 25 in $\binom{25}{2} = 300$ ways. The desired probability is given by
$$1 - \frac{300 \times 299 \times \cdots \times 276}{300^{25}} = 0.6424.$$

1.56 This problem is a birthday problem with $m = \binom{49}{6}$ equally likely birthdays and n = 3,016 people. Using the solution of Problem 1.54, the probability that in 3,016 drawings some combination of six numbers will appear more than once is about $1 - e^{-\frac{1}{2}n(n-1)/m} = 0.2776$. This approximate value agrees with the exact value in all four decimals.

1.57 The translation step to the birthday problem is to imagine that each of the n = 500 Oldsmobile cars gets assigned a "birthday" chosen at random from m = 2,400,000 possible "birthdays". Using the approximate formula in Problem 1.54, the probability that at least one subscriber gets two or more cars can be calculated as $1 - e^{-\frac{1}{2}n(n-1)/m} = 0.051$.

1.58 Imagine that the balls are numbered from 1 to 20. Using the complement rule, we find that the sought probability is
$$1 - \frac{20 \times 10 \times 18 \times 9 \times 16 \times 8}{20 \times 19 \times 18 \times 17 \times 16 \times 15} = 0.8514.$$

1.59 (a) The sought probability is
$$1 - \frac{10{,}000 \times 9{,}999 \times \cdots \times 9{,}990}{10{,}000^{11}} = 0.005487.$$
(b) The sought probability is 1 − (1 − 0.005487)^300 = 0.8081.

1.60 Using the complement rule, this variant of the birthday problem has as solution
$$1 - \frac{450 \times 435 \times 420 \times 405 \times 390 \times 375 \times 360 \times 345 \times 330 \times 315}{450^{10}} = 0.8154.$$

1.61 Let A1 be the event that the card of your first favorite team is not obtained and A2 be the event that the card of your second favorite team is not obtained. The sought probability is 1 − P(A1 ∪ A2). Since P(A1 ∪ A2) = (9/10)^5 + (9/10)^5 − (8/10)^5 = 0.8533, the sought probability is 1 − 0.8533 = 0.1467.

1.62 (a) Let A be the event that you get at least one ace. It is easier to compute the probability of the complementary event A^c that you get no ace in a poker hand of five cards.
For the sample space of the chance experiment, we take all ordered five-tuples (x1, x2, x3, x4, x5), where xi corresponds to the suit and value of the ith card you get dealt. The total number of possible outcomes equals 52 × 51 × 50 × 49 × 48. The number of outcomes without an ace equals 48 × 47 × 46 × 45 × 44. Assuming that the cards are randomly dealt, all possible outcomes are equally likely. Then the event A^c has the probability
$$P(A^c) = \frac{48 \times 47 \times 46 \times 45 \times 44}{52 \times 51 \times 50 \times 49 \times 48} = 0.6588.$$
Hence the probability of getting at least one ace in a poker hand of five cards is 1 − P(A^c) = 0.3412. Another possible choice for the sample space consists of the collection of all unordered sets of five distinct cards, resulting in the probability $1 - \binom{48}{5}/\binom{52}{5} = 0.3412$.
(b) It is easiest to take as sample space the collection of all unordered sets of five distinct cards. The sample space has $\binom{52}{5}$ equally likely elements. Let Ai be the event that the five cards of the poker hand are from suit i for i = 1, . . . , 4. Each set Ai has $\binom{13}{5}$ elements. The events A1, . . . , A4 are mutually exclusive and so the desired probability is $P(A_1 \cup \cdots \cup A_4) = \sum_{i=1}^{4} P(A_i)$. For each i, $P(A_i) = \binom{13}{5}/\binom{52}{5} = 4.95 \times 10^{-4}$. Hence the desired probability is 0.00198.

1.63 It is easiest to compute the complementary probability that more than 5 rolls are needed to obtain at least one five and at least one six. This probability is given by P(A ∪ B), where A is the event that no five is obtained in 5 rolls and B is the event that no six is obtained in 5 rolls. We have P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = (5/6)^5 + (5/6)^5 − (4/6)^5. Therefore the sought probability is given by 1 − 2(5/6)^5 + (4/6)^5 = 0.3279.
Note: The probability that exactly r rolls are needed to obtain at least one five and at least one six is given by Q_{r−1} − Q_r, where Q_n is defined as the probability that more than n rolls are needed to obtain at least one five and at least one six.
We have $Q_n = \left(\frac{5}{6}\right)^n + \left(\frac{5}{6}\right)^n - \left(\frac{4}{6}\right)^n$.

1.64 The sample space is given by the set $\{(i, j, k) : i, j, k = 1, 2, \ldots, 6\}$. Each element gets assigned a probability of $\frac{1}{216}$. It is easiest to compute the complementary probability $P(A)$, where $A$ is the event that none of the three rolls gives your chosen number. The set $A$ contains $5 \times 5 \times 5 = 125$ elements and so $P(A) = \frac{125}{216}$. Hence the desired probability is $1 - \frac{125}{216} = 0.4213$.

1.65 The sought probability is given by $P\left(\bigcup_{i=1}^{253} A_i\right)$. For any $i$, the probability $P(A_i)$ is equal to $365 \times 1 \times 364 \times 363 \times \cdots \times 344/365^{23} = 0.0014365$. The events $A_i$ are mutually exclusive and so $P\left(\bigcup_{i=1}^{253} A_i\right) = \sum_{i=1}^{253} P(A_i)$. Therefore the probability that in a class of 23 children exactly two children have the same birthday is equal to $253 \times 0.0014365 = 0.3634$.

1.66 Take as sample space the set of all sequences $(i_1, \ldots, i_6)$, where $i_k$ is the number shown by the $k$th roll. The sample space has $6^6$ equally likely elements. Let $A$ be the event that all six face values appear and $B$ be the event that the largest number rolled is $r$. The set $A$ has $6!$ elements. Noting that the number of elements for which the largest number rolled does not exceed $k$ equals $k^6$, we have that the set $B$ has $r^6 - (r-1)^6$ elements. The sought probabilities are $P(A) = \frac{6!}{6^6} = 0.0154$ and $P(B) = \frac{r^6 - (r-1)^6}{6^6}$.

1.67 Take as sample space the set of all ordered sequences of the possible destinations of the five winners. The probability that at least one of the destinations A and B will be chosen is $1 - \left(\frac{1}{3}\right)^5$. Let $A_k$ be the event that none of the winners has chosen the $k$th destination. The probability $P(A_1 \cup A_2 \cup A_3)$ equals $\left(\frac{2}{3}\right)^5 + \left(\frac{2}{3}\right)^5 + \left(\frac{2}{3}\right)^5 - \left(\frac{1}{3}\right)^5 - \left(\frac{1}{3}\right)^5 - \left(\frac{1}{3}\right)^5 = \frac{31}{81}$.

1.68 Take as sample space the set of all possible combinations of six different numbers from the numbers 1 to 42. The sample space has $\binom{42}{6}$ equally likely elements.
Let $A$ be the event that none of the numbers 7, 14, 21, 28, 35, and 42 is drawn and $B$ be the event that exactly one of these numbers is drawn. Then, $P(A) = \binom{36}{6}/\binom{42}{6}$ and $P(B) = \binom{6}{1}\binom{36}{5}/\binom{42}{6}$. The events $A$ and $B$ are disjoint. Using the complement rule, the sought probability is $1 - P(A) - P(B) = 0.1975$.

1.69 Let $A$ be the event that there is no ace among the five cards and $B$ be the event that there is neither a king nor a queen among the five cards. The sought probability is
$$1 - P(A \cup B) = 1 - \frac{\binom{48}{5} + \binom{44}{5} - \binom{40}{5}}{\binom{52}{5}} = 0.1765.$$

1.70 Take as sample space the set $\{(i, j) : 1 \le i, j \le 6\}$, where $i$ is the number rolled by John and $j$ is the number rolled by Paul. Each element $(i, j)$ of the sample space gets assigned the probability $p(i, j) = \frac{1}{36}$. Let $A$ be the event that John gets a larger number than Paul and $B$ the event that the product of the numbers rolled by John and Paul is odd. The desired probability is $P(A \cup B) = P(A) + P(B) - P(AB)$. This gives that the probability of John winning is $P(A \cup B) = \frac{15}{36} + \frac{9}{36} - \frac{3}{36} = \frac{21}{36}$, using the fact that $P(A) = \sum_{i=2}^{6}\sum_{j=1}^{i-1} p(i, j) = \frac{15}{36}$, $P(B) = \sum_{i\ \mathrm{odd}}\sum_{j\ \mathrm{odd}} p(i, j) = \frac{9}{36}$ and $P(AB) = \frac{3}{36}$.

1.71 Take as sample space the set of all $40!$ possible orderings of the aces and the cards 2 through 10 (the other cards are not relevant). The first probability is $\frac{4 \times 39!}{40!} = \frac{1}{10}$ and the second probability is $\frac{4 \times 3 \times 38!}{40!} = \frac{1}{130}$.

1.72 The sample space consists of the elements $O_1,\ \bar{O}O_1,\ \bar{O}\bar{O}O_1, \ldots$, where $O_1$ occurs if the first trial gives outcome $O_1$, $\bar{O}O_1$ occurs if the first trial gives neither of the outcomes $O_1$ and $O_2$ and the second trial gives the outcome $O_1$, etc. The probabilities $p_1, (1 - p_1 - p_2)p_1, (1 - p_1 - p_2)^2 p_1, \ldots$ are assigned to the elements $O_1,\ \bar{O}O_1,\ \bar{O}\bar{O}O_1, \ldots$. The first probability is
$$\sum_{n=1}^{\infty} (1 - p_1 - p_2)^{n-1} p_1 = \frac{p_1}{p_1 + p_2}.$$
To get the second probability, note that the probability of the event $A_{n,k}$ is given by $\binom{n-1}{r-1}\binom{n-r}{k}\, p_1^{r-1} p_2^k (1 - p_1 - p_2)^{n-r-k}\, p_1$. Since the events $A_{n,k}$ are mutually exclusive, it now follows that the second probability is given by
$$\sum_{k=0}^{s-1}\ \sum_{n=r+k}^{\infty} \binom{n-1}{r-1}\binom{n-r}{k}\, p_1^r p_2^k (1 - p_1 - p_2)^{n-r-k}.$$
This sum can be rewritten as
$$\sum_{k=0}^{s-1} \frac{p_1^r p_2^k}{(r-1)!\,k!} \sum_{l=0}^{\infty} (l+1)\cdots(l+r+k-1)\,(1 - p_1 - p_2)^l.$$
Using the identity $\sum_{l=0}^{\infty} (l+1)\cdots(l+m)\,x^l = m!/(1-x)^{m+1}$, the desired result next follows.

1.73 (a) The number of permutations in which the particular number $r$ belongs to a cycle of length $k$ is $\binom{n-1}{k-1}(k-1)!\,(n-k)!$. The sought probability is
$$\binom{n-1}{k-1}(k-1)!\,(n-k)!\Big/n! = \frac{1}{n}.$$
(b) For fixed $r, s$ with $r \neq s$, let $A_k$ be the event that $r$ and $s$ belong to a same cycle of length $k$. The sought probability is
$$P(A_2 \cup \cdots \cup A_n) = \frac{1}{n!}\sum_{k=2}^{n} \binom{n-2}{k-2}(k-1)!\,(n-k)! = \frac{1}{2}.$$

1.74 It is no restriction to assume that the $2n$ prisoners have agreed that the $i$th prisoner goes to the $i$th box for $i = 1, 2, \ldots, 2n$. The order in which the names of the prisoners show up in the boxes is a random permutation of the integers $1, 2, \ldots, 2n$. Each prisoner finds his own name after inspecting up to a maximum of $n$ boxes if and only if each cycle of the random permutation has length $n$ or less. In other words, the probability that all prisoners will be released is equal to $1 - P(A_{n+1} \cup \cdots \cup A_{2n})$, where $A_k$ is the event that a random permutation of $1, 2, \ldots, 2n$ contains a cycle of length $k$. A crucial observation is that, for any $k > n$, any random permutation of the integers $1, 2, \ldots, 2n$ has at most one cycle of length $k$. Hence $P(A_k) = \binom{2n}{k}(k-1)!\,(2n-k)!/(2n)! = \frac{1}{k}$. Further, the events $A_{n+1}, \ldots, A_{2n}$ are mutually exclusive and so $P(A_{n+1} \cup \cdots \cup A_{2n}) = \sum_{k=n+1}^{2n} P(A_k)$. Hence the sought probability is
$$1 - \frac{1}{n+1} - \frac{1}{n+2} - \cdots - \frac{1}{2n}.$$
This probability is about $1 - \ln(2) = 0.3069$ for $n$ large enough.
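The harmonic tail in Problem 1.74 is easy to evaluate numerically. A quick verification sketch in Python (the function name is ours, not the manual's):

```python
import math

def release_probability(n):
    """P(all 2n prisoners are released) = 1 - sum_{k=n+1}^{2n} 1/k."""
    return 1 - sum(1.0 / k for k in range(n + 1, 2 * n + 1))

print(round(release_probability(50), 4))  # exact value for n = 50: 0.3118
print(round(1 - math.log(2), 4))          # large-n limit: 0.3069
```

For $n = 50$ this returns 0.3118, close to the limiting value $1 - \ln 2 = 0.3069$.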
The exact value is 0.3118 for $n = 50$.

1.75 In line with the strategy outlined in Problem 1.74, the person with the task of finding the car first opens door 1. This person next opens door 2 if the car key is behind door 1 and next opens door 3 if the goat is behind door 1. The person with the task of finding the car key first opens door 2. This person next opens door 1 if the car is behind door 2 and next opens door 3 if the goat is behind door 2. Under this strategy the probability of winning the car is $\frac{2}{3}$: the four arrangements (car, key, goat), (car, goat, key), (key, car, goat) and (goat, key, car) are winning, whereas the two arrangements (key, goat, car) and (goat, car, key) are losing.

1.76 Some reflection shows that the game cannot take more than 15 spins (it takes 15 spins if and only if the spins $1, \ldots, 4$, $6, \ldots, 9$ and $11, \ldots, 14$ result in "odds," while the spins 5, 10 and 15 result in "heads"). Let $A_i$ be the event that the spinner wins on the $i$th toss for $i = 3, \ldots, 15$. The events $A_i$ are mutually exclusive. The win probability is given by
$$\sum_{i=3}^{15} P(A_i) = \frac{1}{2^6} + \frac{3}{2^7} + \frac{3}{2^7} + \frac{5}{2^8} + \frac{15}{2^{10}} + \frac{18}{2^{11}} + \frac{19}{2^{12}} + \frac{9}{2^{12}} + \frac{15}{2^{14}} + \frac{5}{2^{14}} + \frac{3}{2^{15}} + \frac{3}{2^{17}} + \frac{1}{2^{18}} = 0.11364.$$
The game is unfavorable to the bettor.

1.77 The probability of correctly identifying five or more wines can be calculated as $\sum_{k=5}^{10} e^{-1}/k! = 0.00366$ when the person is not a connoisseur and just guesses the names of the wines. This small probability is a strong indication that the person is a connoisseur.

1.78 This problem is a variant of the hat-check problem. Let $A$ be the event that exactly one student receives his own paper and $B_i$ be the event that the $i$th student is the only student who gets back his own paper. Then, by results in Example 1.19, $P(A) = \sum_{k=0}^{14} \frac{(-1)^k}{k!} \approx e^{-1}$. Since $A = \bigcup_{i=1}^{15} B_i$ and the events $B_i$ are disjoint, $P(A) = \sum_{i=1}^{15} P(B_i)$. For reasons of symmetry $P(B_i) = P(B_1)$ for all $i$. Thus the sought probability is about $\frac{1}{15}e^{-1}$.

1.79 Let $A_i$ be the event that the $i$th person gets both the correct coat and the correct umbrella. The sought probability is
$$P(A_1 \cup A_2 \cup A_3 \cup A_4 \cup A_5) = \sum_{k=1}^{5} (-1)^{k+1}\binom{5}{k}\frac{(5-k)!\,(5-k)!}{5!\,5!} = 0.1775.$$

1.80 Label the three Italian wines as 1, 2, and 3. Let $A_i$ be the event that the Italian wine with label $i$ is correctly guessed. The sought probability is $1 - P(A_1 \cup A_2 \cup A_3) = 1 - \left[\binom{3}{1}P(A_1) - \binom{3}{2}P(A_1A_2) + P(A_1A_2A_3)\right]$. We have $P(A_1) = \frac{9!}{10!}$, $P(A_1A_2) = \frac{8!}{10!}$, and $P(A_1A_2A_3) = \frac{7!}{10!}$. The probability that none of the three Italian wines is correctly guessed is
$$1 - 3 \times \frac{9!}{10!} + 3 \times \frac{8!}{10!} - \frac{7!}{10!} = 0.7319.$$

1.81 Let $A_i$ be the event that the three choices of five distinct numbers have number $i$ in common. The sought probability is
$$P\Big(\bigcup_{k=1}^{25} A_k\Big) = \sum_{k=1}^{5} (-1)^{k+1}\binom{25}{k}P(A_1 \cdots A_k) = \sum_{k=1}^{5} (-1)^{k+1}\binom{25}{k}\left[\frac{\binom{25-k}{5-k}}{\binom{25}{5}}\right]^3 = 0.1891.$$

1.82 Let $A_i$ be the event that all four cards of kind $i$ are contained in the hand of 13 cards. The desired probability is
$$P(A_1 \cup \cdots \cup A_{13}) = \sum_{k=1}^{3} (-1)^{k+1}\binom{13}{k}P(A_1 \cdots A_k).$$
This probability can be evaluated as
$$\frac{\binom{13}{1}\binom{48}{9} - \binom{13}{2}\binom{44}{5} + \binom{13}{3}\binom{40}{1}}{\binom{52}{13}} = 0.0342.$$

1.83 Let $A_i$ be the event that the player's hand does not contain any card of suit $i$. The desired probability is
$$P\Big(\bigcup_{i=1}^{4} A_i\Big) = \sum_{k=1}^{3} (-1)^{k+1}\binom{4}{k}\frac{\binom{52-13k}{13}}{\binom{52}{13}} = 0.0511$$
for the bridge hand. For the poker hand, the desired probability is
$$\sum_{k=1}^{3} (-1)^{k+1}\binom{4}{k}\frac{\binom{52-13k}{5}}{\binom{52}{5}} = 0.7363.$$

1.84 Take as sample space the set of all possible ordered arrangements of the 12 people. Label the six rooms as $i = 1, \ldots, 6$. The first two people in an arrangement are assigned to room 1, the next two people in the arrangement are assigned to room 2, etc. Let $A_i$ be the event that room $i$ has two people of different nationalities. The sought probability is $1 - P(A_1 \cup \cdots \cup A_6) = 1 - \sum_{k=1}^{4} (-1)^{k+1}\binom{6}{k}P(A_1 \cdots A_k)$. We have
$$P(A_1) = \frac{8 \times 4 \times 2 \times 10!}{12!}, \qquad P(A_1A_2) = \frac{8 \times 4 \times 7 \times 3 \times 2^2 \times 8!}{12!},$$
$$P(A_1A_2A_3) = \frac{8 \times 4 \times 7 \times 3 \times 6 \times 2 \times 2^3 \times 6!}{12!}, \qquad P(A_1A_2A_3A_4) = \frac{8 \times 4 \times 7 \times 3 \times 6 \times 2 \times 5 \times 2^4 \times 4!}{12!}.$$
This leads to the value $\frac{1}{33}$ for the sought probability.

1.85 The possible paths from node $n_1$ to node $n_4$ are the four paths $(l_1, l_5)$, $(l_2, l_6)$, $(l_1, l_3, l_6)$ and $(l_2, l_4, l_5)$. Let $A_i$ be the event that the $i$th path is functioning. The probability $P(A_1 \cup A_2 \cup A_3 \cup A_4)$ can be evaluated as
$$p_1p_5 + p_2p_6 + p_1p_3p_6 + p_2p_4p_5 - p_1p_2p_5p_6 - p_1p_3p_5p_6 - p_1p_2p_4p_5 - p_1p_2p_3p_6 - p_2p_4p_5p_6 + p_1p_2p_3p_5p_6 + p_1p_2p_4p_5p_6.$$
This probability reduces to $2p^2(1 + p + p^3) - 5p^4$ when $p_i = p$ for all $i$.

1.86 Let $A_i$ be the event that number $i$ has not appeared in 30 draws of the Lotto 6/45. The desired probability is given by $P(A_1 \cup \cdots \cup A_{45})$. Noting that $P(A_1 \cdots A_k) = \left[\binom{45-k}{6}/\binom{45}{6}\right]^{30}$, it follows that the desired probability is equal to
$$\sum_{k=1}^{39} (-1)^{k+1}\binom{45}{k}\left[\frac{\binom{45-k}{6}}{\binom{45}{6}}\right]^{30} = 0.4722.$$

1.87 Let $A_i$ be the event that it takes more than $r$ purchases to get the $i$th coupon. The probability that more than $r$ purchases are needed in order to get a complete set of coupons is
$$P\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}\frac{(n-k)^r}{n^r}.$$
Using this expression for $n = 6$, 365, and 100, we find that the required number of rolls is $r = 13$, the required number of people is $r = 2{,}287$, and the required number of balls is $r = 497$.

1.88 Let $A_i$ be the event that the $i$th boy becomes part of a couple. The desired probability is $1 - P(A_1 \cup \cdots \cup A_n)$. For any fixed $i$, $P(A_i) = \frac{n \times n^{2n-2}}{n^{2n}} = \frac{n}{n^2}$. For any fixed $i$ and $j$ with $i \neq j$, $P(A_iA_j) = \frac{n(n-1) \times n^{2n-4}}{n^{2n}} = \frac{n(n-1)}{n^4}$. Continuing in this way, we find
$$P(A_1 \cup \cdots \cup A_n) = \sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}\frac{n(n-1)\cdots(n-k+1)}{n^{2k}}.$$

1.89 Let $A_i$ be the event that the $i$th person does not share his or her birthday with someone else. The sought probability is given by $1 - P(A_1 \cup \cdots \cup A_n)$ and is equal to
$$1 - \frac{1}{365^n}\sum_{k=1}^{\min(n,365)} (-1)^{k+1}\binom{n}{k}\,365 \times \cdots \times (365-k+1) \times (365-k)^{n-k}.$$
This probability is 0.5008 for $n = 3{,}064$.
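The inclusion-exclusion sums in Problems 1.79 and 1.80 are small enough to verify by direct computation. A Python sketch of our own (the variable names are not from the manual):

```python
from math import comb, factorial

# Problem 1.79: at least one person gets both the correct coat and umbrella.
p79 = sum((-1) ** (k + 1) * comb(5, k) * factorial(5 - k) ** 2
          for k in range(1, 6)) / factorial(5) ** 2

# Problem 1.80: none of the three Italian wines is guessed correctly.
p80 = 1 - 3 * factorial(9) / factorial(10) \
        + 3 * factorial(8) / factorial(10) - factorial(7) / factorial(10)

print(round(p79, 4), round(p80, 4))  # 0.1775 0.7319
```

Both values agree with the answers obtained above.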
1.90 Let $A_i$ be the event that all of the three random permutations have the same number in the $i$th position. The desired probability $P(A_1 \cup A_2 \cup \cdots \cup A_{10})$ is given by
$$\sum_{k=1}^{10} (-1)^{k+1}\binom{10}{k}\frac{10 \times \cdots \times (10-k+1) \times [(10-k)!]^3}{[10!]^3} = 0.0947.$$

1.91 There are $\binom{10}{2} = 45$ possible combinations of two persons. Let $A_k$ be the event that the two persons from the $k$th combination have chosen each other's name. Using the fact that $P(A_iA_j) = 0$ for $i \neq j$ when $A_i$ and $A_j$ have a person in common, we find that the sought probability is given by
$$P(A_1 \cup \cdots \cup A_{45}) = \binom{10}{2}\Big(\frac{1}{9}\Big)^2 - \frac{1}{2!}\binom{10}{2}\binom{8}{2}\Big(\frac{1}{9}\Big)^4 + \cdots + \frac{1}{5!}\binom{10}{2}\binom{8}{2}\binom{6}{2}\binom{4}{2}\binom{2}{2}\Big(\frac{1}{9}\Big)^{10} = 0.4654.$$

1.92 Let $A_i$ be the event that the $i$th person is a survivor. The desired probability is given by $1 - P(A_1 \cup \cdots \cup A_n)$. For any $i$,
$$P(A_i) = \frac{(n-1)(n-2)^{n-1}}{(n-1)^n}.$$
For any $i, j$ with $i \neq j$,
$$P(A_iA_j) = \frac{(n-2)^2(n-3)^{n-2}}{(n-1)^n}.$$
Continuing in this way, it follows that
$$P(A_1 \cup \cdots \cup A_n) = \sum_{k=1}^{n-2} (-1)^{k+1}\binom{n}{k}\frac{(n-k)^k\,\big(n-(k+1)\big)^{n-k}}{(n-1)^n}.$$
Note: Suppose that there is a second round with the survivors if there are two or more survivors after the first round, the second round is followed by a third round if there are two or more survivors after the second round, and so on, until there is one survivor or no survivor at all. What is the probability that the game ends with one survivor? This probability is given for several values of $n$ in the table below. These probabilities can be computed by using the powerful method of an absorbing Markov chain, see Chapter 10 of the book. In this method we need the one-step transition probabilities $p_{n,m}$, being the probability of going in one step from state $n$ to state $m$. In this particular example, the state is the number of survivors and the probability $p_{n,m}$ is the probability of $m$ survivors after a round starting with $n$ survivors. To give these probabilities, we need the generalized inclusion-exclusion formula
$$P(\text{exactly } m \text{ of the events } A_1, \ldots, A_n \text{ will occur}) = \sum_{k=0}^{n-m} (-1)^k\binom{m+k}{m}\sum_{i_1 < i_2 < \cdots < i_{m+k}} P(A_{i_1}A_{i_2}\cdots A_{i_{m+k}}).$$

  n   P(one survivor)      n    P(one survivor)
  2       0.0000           20       0.4693
  3       0.7500           30       0.5374
  4       0.5926           40       0.4996
  5       0.4688           50       0.4720
  6       0.4161           60       0.4879
  7       0.4389           70       0.5155
  8       0.4890           80       0.5309
  9       0.5323           90       0.5291
 10       0.5547          100       0.5160

1.93 Let $A_i$ be the event that no ball of color $i$ is picked for $i = 1, 2, 3$. The probability of picking at least one ball of each color is
$$1 - P(A_1 \cup A_2 \cup A_3) = 1 - \sum_{i=1}^{3} P(A_i) + \sum_{i=1}^{2}\sum_{j=i+1}^{3} P(A_iA_j) - P(A_1A_2A_3).$$
If the balls are picked with replacement, then
$$P(A_1) = \frac{12^5}{15^5}, \quad P(A_2) = \frac{10^5}{15^5}, \quad P(A_3) = \frac{8^5}{15^5},$$
$$P(A_1A_2) = \frac{7^5}{15^5}, \quad P(A_1A_3) = \frac{5^5}{15^5}, \quad P(A_2A_3) = \frac{3^5}{15^5}, \quad P(A_1A_2A_3) = 0.$$
The sought probability is $1 - 0.4760 = 0.5240$. If the balls are picked without replacement, then
$$P(A_1) = \frac{\binom{12}{5}}{\binom{15}{5}}, \quad P(A_2) = \frac{\binom{10}{5}}{\binom{15}{5}}, \quad P(A_3) = \frac{\binom{8}{5}}{\binom{15}{5}}, \quad P(A_1A_2) = \frac{\binom{7}{5}}{\binom{15}{5}}, \quad P(A_1A_3) = \frac{\binom{5}{5}}{\binom{15}{5}}, \quad P(A_2A_3) = 0.$$
The probability of picking at least one ball of each color is then $1 - 0.3590 = 0.6410$.

1.94 Take as sample space the set of all possible ordered arrangements of the $2n$ people. The sample space has $(2n)!$ equally likely elements. Imagine that the two people in the positions $2k-1$ and $2k$ of the ordered arrangement are paired as bridge partners for $k = 1, \ldots, n$. Let $A_i$ be the event that couple $i$ is paired as bridge partners. The sought probability is $1 - P(A_1 \cup \cdots \cup A_n)$. The number of elements in the set $A_1 \cap \cdots \cap A_k$ is $n \times (n-1) \times \cdots \times (n-k+1) \times 2^k \times (2n-2k)!$. There are $n(n-1)\cdots(n-k+1)$ possible choices for the couples in the first $2k$ positions of the arrangement and two partners from a couple can be ordered in two ways. The remaining $2n-2k$ people can be ordered in $(2n-2k)!$ ways. Thus we find
$$P(A_1 \cup \cdots \cup A_n) = \sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}P(A_1 \cdots A_k) = \sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}\frac{n \times (n-1) \times \cdots \times (n-k+1) \times 2^k \times (2n-2k)!}{(2n)!}.$$
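The formula for Problem 1.94 can be checked numerically. The following Python sketch is our own illustration (the function name `no_couple_paired` is ours); for small $n$ the value can be verified by hand, and for large $n$ it approaches $e^{-1/2} \approx 0.6065$:

```python
from math import comb, exp, factorial, perm

def no_couple_paired(n):
    """Problem 1.94: probability that none of n couples is paired together
    when 2n people are split at random into n bridge pairs."""
    union = sum((-1) ** (k + 1) * comb(n, k) * perm(n, k) * 2 ** k
                * factorial(2 * n - 2 * k) / factorial(2 * n)
                for k in range(1, n + 1))
    return 1 - union

# For n = 2 the exact value is 2/3; for larger n it approaches exp(-1/2).
print(no_couple_paired(2), round(no_couple_paired(20), 4))
```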
Note: the probability that one or more couples will be paired as bridge partners tends to $1 - e^{-\frac{1}{2}} = 0.3935$ as the number of couples gets large, see also Problem 3.85.

1.95 Let $A_i$ be the event that the four cards of rank $i$ are matched. The sought probability is
$$P\Big(\bigcup_{i=1}^{13} A_i\Big) = \sum_{k=1}^{13} (-1)^{k+1}\binom{13}{k}\frac{(4!)^k\,(52-4k)!}{52!} = 4.80 \times 10^{-5}.$$

Chapter 2

2.1 Take as sample space the set of all ordered pairs $(i, j)$, where the outcome $(i, j)$ represents the two numbers shown on the dice. Each of the 36 possible outcomes is equally likely. Let $A$ be the event that the sum of the two dice is 8 and $B$ be the event that the two numbers shown on the dice are different. There are 30 outcomes $(i, j)$ with $i \neq j$. In four of those outcomes $i$ and $j$ sum to 8. Therefore $P(AB) = \frac{4}{36}$ and $P(B) = \frac{30}{36}$. The sought probability $P(A \mid B)$ is $\frac{4/36}{30/36} = \frac{2}{15}$.

2.2 The ordered sample space consists of the eight equally likely elements $(H,H,H)$, $(H,H,T)$, $(H,T,H)$, $(H,T,T)$, $(T,T,T)$, $(T,T,H)$, $(T,H,T)$, and $(T,H,H)$, where the first component refers to the nickel, the second to the dime and the third to the quarter. Let $A$ be the event that the quarter shows up heads and $B$ be the event that the coins showing up heads represent an amount of at least 15 cents. To find $P(A \mid B) = P(AB)/P(B)$, note that the set $AB$ consists of the four elements $(H,H,H)$, $(H,T,H)$, $(T,T,H)$ and $(T,H,H)$, while the set $B$ consists of the five elements $(H,H,H)$, $(H,H,T)$, $(H,T,H)$, $(T,T,H)$, and $(T,H,H)$. This gives $P(AB) = \frac{4}{8}$ and $P(B) = \frac{5}{8}$. Hence the desired probability $P(A \mid B)$ is $\frac{4}{5}$.

2.3 Take as sample space the set of the ordered pairs $(G,G)$, $(G,F)$, $(F,G)$, and $(F,F)$, where $G$ stands for a "correct prediction" and $F$ stands for a "false prediction," and the first and second components of each outcome refer to the predictions of weather station 1 and weather station 2. The probabilities $0.9 \times 0.8 = 0.72$, $0.9 \times 0.2 = 0.18$, $0.1 \times 0.8 = 0.08$, and $0.1 \times 0.2 = 0.02$ are assigned to these elements.
Let the event $A = \{(G,F)\}$ and the event $B = \{(G,F), (F,G)\}$. The sought probability is $P(A \mid B) = \frac{0.18}{0.26} = \frac{9}{13}$.

2.4 Let $A$ be the event that a randomly chosen student passes the first test and $B$ be the event that this student also passes the second test. Then $P(B \mid A) = \frac{0.50}{0.80} = 0.625$. The answer is 62.5%.

2.5 Let $A$ be the event that a randomly chosen household has a cat and $B$ be the event that the household has a dog. Then, $P(A) = 0.3$, $P(B) = 0.25$, and $P(B \mid A) = 0.2$. The sought probability $P(A \mid B)$ satisfies
$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(A)P(B \mid A)}{P(B)}$$
and thus is equal to $0.3 \times 0.2/0.25 = 0.24$.

2.6 The ordered sample space is the set $\{(H,1), \ldots, (H,6), (T,1), \ldots, (T,6)\}$. Each outcome is equally likely to occur. Let $A$ be the event that the coin lands heads and $B$ the event that the die lands six. The set $A$ consists of six elements, the set $AB$ consists of a single element and the set $A \cup B$ consists of seven elements. Hence the desired probabilities are given by
$$P(AB \mid A \cup B) = \frac{P(AB)}{P(A \cup B)} = \frac{1}{7} \quad\text{and}\quad P(A \mid A \cup B) = \frac{P(A)}{P(A \cup B)} = \frac{6}{7}.$$

2.7 Label the two red balls as $R_1$ and $R_2$, the blue ball as $B$ and the green ball as $G$. Take as unordered sample space the set consisting of the six equally likely combinations $\{R_1,R_2\}$, $\{R_1,B\}$, $\{R_2,B\}$, $\{R_1,G\}$, $\{R_2,G\}$, and $\{B,G\}$ of two balls. Let $C$ be the event that two non-red balls have been grabbed, $D$ be the event that at least one non-red ball has been grabbed, and $E$ be the event that the green ball has been grabbed. Then, $P(CD) = \frac{1}{6}$, $P(D) = \frac{5}{6}$, $P(CE) = \frac{1}{6}$ and $P(E) = \frac{3}{6}$. The sought probabilities are $P(C \mid D) = \frac{1}{5}$ and $P(C \mid E) = \frac{1}{3}$. In the second situation you have more information.

2.8 The ordered sample space is the set $\{(i,j) : i, j = 1, 2, \ldots, 6\}$. Each element is equally likely to occur. Let $A$ be the event that both dice show up an even number and let $B$ be the event that at least one of the two dice shows up an even number.
The set $AB$ is equal to the set $A$ consisting of 9 elements and the set $B$ consists of 27 elements. The probability $P(A \mid B)$ of your winning the bet is equal to $\frac{9/36}{27/36} = \frac{1}{3}$. The bet is not fair to you.

2.9 Take as unordered sample space the set of all possible combinations of 13 distinct cards. Let $A$ be the event that the hand contains exactly one ace, $B$ be the event that the hand contains at least one ace, and $C$ be the event that the hand contains the ace of hearts. Then
$$P(A \mid B) = \frac{\binom{4}{1}\binom{48}{12}/\binom{52}{13}}{1 - \binom{48}{13}/\binom{52}{13}} = 0.6304 \quad\text{and}\quad P(A \mid C) = \frac{\binom{48}{12}}{\binom{51}{12}} = 0.4388.$$
The desired probabilities are 0.3696 and 0.5612. The second case involves more information.

2.10 The probability that the number of tens in the hand is the same as the number of aces in the hand is given by
$$\sum_{k=0}^{4} \frac{\binom{4}{k}\binom{4}{k}\binom{44}{13-2k}}{\binom{52}{13}} = 0.3162.$$
Hence, using a symmetry argument, the probability that the hand contains more aces than tens is $\frac{1}{2}(1 - 0.3162) = 0.3419$. Letting $A$ be the event that the hand contains more aces than tens and $B$ the event that the hand contains at least one ace, then $P(A \mid B) = P(AB)/P(B) = P(A)/P(B)$. Therefore
$$P(A \mid B) = \frac{0.3419}{\sum_{k=1}^{4}\binom{4}{k}\binom{48}{13-k}/\binom{52}{13}} = 0.4911.$$

2.11 Let $A$ be the event that each number rolled is higher than all those that were rolled earlier and $B$ be the event that three different numbers are rolled. Then $P(A) = P(AB)$ and so $P(A) = P(B)P(A \mid B)$. We have $P(B) = \frac{6 \times 5 \times 4}{6^3} = \frac{5}{9}$ and $P(A \mid B) = \frac{1}{3!}$. Thus
$$P(A) = \frac{20}{36} \times \frac{1}{3!} = \frac{5}{54}.$$

2.12 (a) Since $P(A \mid B) > P(B \mid A)$ is the same as $\frac{P(AB)}{P(B)} > \frac{P(AB)}{P(A)}$, it follows that $P(A) > P(B)$.
(b) Since $P(B \mid A) = P(AB)/P(A) = P(A \mid B)P(B)/P(A)$, we get $P(B \mid A) > P(B)$. Also, by $P(B^c \mid A) + P(B \mid A) = [P(B^cA) + P(BA)]/P(A) = P(A)/P(A) = 1$, we get $P(B^c \mid A) = 1 - P(B \mid A) \le 1 - P(B) = P(B^c)$.
(c) If $A$ and $B$ are disjoint, then $P(AB) = 0$ and so $P(A \mid B) = 0$. If $B$ is a subset of $A$, then $P(AB) = P(B)$ and so $P(A \mid B) = 1$.
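The two conditional probabilities in Problem 2.9 can be confirmed with a short computation. A Python sketch of our own (not part of the manual):

```python
from math import comb

# P(exactly one ace) and P(at least one ace) in a bridge hand of 13 cards.
p_one_ace = comb(4, 1) * comb(48, 12) / comb(52, 13)
p_any_ace = 1 - comb(48, 13) / comb(52, 13)

p_given_at_least_one = p_one_ace / p_any_ace     # condition: at least one ace
p_given_heart_ace = comb(48, 12) / comb(51, 12)  # condition: ace of hearts held

print(round(p_given_at_least_one, 4), round(p_given_heart_ace, 4))  # 0.6304 0.4388
```

The second conditional probability simplifies because, given the ace of hearts, the remaining 12 cards are a random subset of the other 51 cards.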
2.13 Let $A$ be the event that a randomly chosen student takes Spanish and $B$ be the event that the student takes French. Then, $P(A) = 0.35$, $P(B) = 0.15$, and $P(A \cup B) = 0.40$. Thus $P(AB) = 0.35 + 0.15 - 0.40 = 0.10$ and so $P(B \mid A) = \frac{0.10}{0.35} = \frac{2}{7}$.

2.14 Let $A$ be the event that a randomly chosen child is enrolled in swimming and $B$ be the event that the child is enrolled in tennis. The sought probability $P(A \mid B)$ follows from $P(A \mid B) = P(AB)/P(B) = P(A)P(B \mid A)/P(B)$ and is equal to $(1/3) \times 0.48/0.40 = 0.40$.

2.15 Let $A$ be the event that a randomly chosen voter is a Democrat, $B$ be the event that the voter is a Republican, and $C$ be the event that the voter is in favor of the election issue.
(a) Since $P(A) = 0.45$, $P(B) = 0.55$, $P(C \mid A) = 0.7$ and $P(C \mid B) = 0.5$, it follows from $P(AC) = P(C \mid A)P(A)$ and $P(BC) = P(C \mid B)P(B)$ that $P(AC) = 0.7 \times 0.45 = 0.315$ and $P(BC) = 0.5 \times 0.55 = 0.275$.
(b) Since $P(C) = P(AC) + P(BC)$, we get $P(C) = 0.59$.
(c) $P(A \mid C) = \frac{0.315}{0.59} = 0.5339$.

2.16 Let $A$ be the event that a randomly selected household subscribes to the morning newspaper and $B$ be the event that the household subscribes to the afternoon newspaper. To find the sought probability $P(A^c \mid B)$, use the relation $P(B) = P(AB) + P(A^cB)$. Thus
$$P(A^c \mid B) = \frac{P(B) - P(AB)}{P(B)} = \frac{0.70 - 0.40}{0.70} = \frac{3}{7}.$$

2.17 Let $A_1$ ($A_2$) be the event that the first (second) card picked belongs to one of the three business partners. Then $P(A_1A_2) = \frac{3}{5} \times \frac{2}{4} = \frac{3}{10}$.

2.18 Let $A_i$ be the event that the $i$th card you receive is a picture card that you have not received before. Then, by $P(A_1A_2A_3A_4) = P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1A_2)P(A_4 \mid A_1A_2A_3)$, the sought probability can be computed as
$$P(A_1A_2A_3A_4) = \frac{16}{52} \times \frac{12}{51} \times \frac{8}{50} \times \frac{4}{49} = 9.46 \times 10^{-4}.$$

2.19 Let $A$ be the event that one or more sixes are rolled and $B$ the event that no one is rolled. Then, by $P(AB) = P(B)P(A \mid B)$, we have that the sought probability is
$$P(AB) = \Big(\frac{5}{6}\Big)^6\left[1 - \Big(\frac{4}{5}\Big)^6\right] = 0.2471.$$

2.20 It is no restriction to assume that the drawing of lots begins with the Spanish teams. Let $A_0$ be the event that the two Spanish teams are paired and $A_i$ be the event that the $i$th Spanish team is not paired to the other Spanish team or to a German team. The events $A_0$ and $A_1A_2$ are disjoint. The sought probability is
$$P(A_0) + P(A_1A_2) = \frac{1}{7} + P(A_1)P(A_2 \mid A_1) = \frac{1}{7} + \frac{4}{7} \times \frac{3}{5} = \frac{17}{35}.$$

2.21 Let $A_i$ be the event that you get a white ball on the $i$th pick. The probability that you need three picks is $P(A_1A_2A_3) = \frac{3}{5} \times \frac{2}{5} \times \frac{1}{5} = \frac{6}{125}$. Five picks require that two black picks occur among the first four picks. By the chain rule, summing the probabilities of the six possible orders of two white picks and two black picks followed by the third white pick, the probability that five picks are needed is $\frac{330}{3125} = \frac{66}{625}$.

2.22 (a) The sought probability is the same as the probability of getting two red balls when two balls are drawn at random from a bowl with three red balls and three blue balls. Let $A_i$ be the event that the $i$th ball drawn is red. The sought probability is $P(A_1A_2)$. This probability is evaluated as $P(A_1)P(A_2 \mid A_1) = \frac{3}{6} \times \frac{2}{5} = \frac{1}{5}$.
(b) Let $A_i$ be the event that the $i$th number drawn is not 10 and $E_i$ be the event that the $i$th number drawn is more than 19. The first probability is
$$1 - P(A_1 \cdots A_6) = 1 - \frac{41}{42} \times \frac{40}{41} \times \cdots \times \frac{36}{37} = \frac{6}{42}.$$
The second probability is $P(E_1 \cdots E_6) = \frac{23}{42} \times \frac{22}{41} \times \cdots \times \frac{18}{37} = 0.0192$.
(c) Suppose that first the two cups of coffee are put on the table. Let $A_i$ be the event that the $i$th cup of coffee is given to a person who ordered coffee. The sought probability is
$$P(A_1A_2) = P(A_1)P(A_2 \mid A_1) = \frac{2}{5} \times \frac{1}{4} = \frac{1}{10}.$$
(d) Suppose that the two socks are chosen one by one. Let $A_i$ be the event that the $i$th sock chosen is black for $i = 1, 2$. The sought probability is $2P(A_1A_2)$. We have $P(A_1A_2) = P(A_1)P(A_2 \mid A_1) = \frac{1}{5} \times \frac{1}{4} = \frac{1}{20}$. Hence the sought probability is $2 \times \frac{1}{20} = \frac{1}{10}$.
(e) Imagine that two apartments become vacant one after the other. Let $A_1$ be the event that the first vacant apartment is not on the top floor and $A_2$ be the event that the second vacant apartment is not on the top floor. The sought probability is $1 - P(A_1A_2)$. The probability $P(A_1A_2)$ is evaluated as $P(A_1)P(A_2 \mid A_1) = \frac{48}{56} \times \frac{47}{55} = 0.7325$.

2.23 Let $A_i$ be the event that the $i$th person in line is the first person matching a birthday with one of the persons in front of him. Then $P(A_2) = \frac{1}{365}$ and
$$P(A_i) = \frac{364}{365} \times \cdots \times \frac{364-i+3}{365} \times \frac{i-1}{365} \quad\text{for } i \ge 3.$$
The probability $P(A_i)$ is maximal for $i = 20$ and then has the value 0.0323.

2.24 Let $A_1$ be the event that the luggage is not lost in Amsterdam, $A_2$ the event that the luggage is not lost in Dubai and $A_3$ the event that the luggage is not lost in Singapore. Then,
$$P(\text{the luggage is lost}) = 1 - P(A_1A_2A_3) = 1 - P(A_1)P(A_2 \mid A_1)P(A_3 \mid A_1A_2) = 1 - 0.95 \times 0.97 \times 0.98 = 0.09693.$$
Letting $A_i^c$ be the complementary event of the event $A_i$, we have
$$P(\text{lost in Dubai} \mid \text{lost}) = \frac{P(A_1A_2^c)}{P(\text{the luggage is lost})} = \frac{P(A_1)P(A_2^c \mid A_1)}{P(\text{the luggage is lost})} = \frac{0.95 \times 0.03}{0.09693} = 0.2940.$$

2.25 Let $A_i$ be the event that the $i$th leaving person does not have to squeeze past a still seated person. The sought probability is the same as $P(A_1A_2A_3A_4A_5) = \frac{2}{7} \times \frac{2}{6} \times \frac{2}{5} \times \frac{2}{4} \times \frac{2}{3} = 0.0127$.

2.26 Let $A_k$ be the event that the first ace appears at the $k$th card, and let $p_k = P(A_k)$. Then, by $P(A_1A_2 \cdots A_n) = P(A_1)P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \ldots A_{n-1})$, it follows that $p_1 = \frac{4}{52}$, $p_2 = \frac{48}{52} \times \frac{4}{51}$, and
$$p_k = \frac{48}{52} \times \frac{47}{51} \times \cdots \times \frac{48-k+2}{52-k+2} \times \frac{4}{52-k+1}, \quad k = 3, \ldots, 49.$$
The three players do not have the same chance to become the dealer. For $P = A$, $B$, and $C$, let $r_P$ be the probability that player $P$ becomes the dealer. Then $r_A > r_B > r_C$, because the probability $p_k$ is decreasing in $k$.
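The dealer probabilities in Problem 2.26 are straightforward to evaluate numerically. A Python sketch of our own (the function name is hypothetical):

```python
def first_ace_distribution():
    """p[k] = P(first ace appears at the kth card), k = 1, ..., 49."""
    p = {}
    for k in range(1, 50):
        prob = 4 / (52 - (k - 1))    # kth card is an ace ...
        for j in range(k - 1):       # ... after k-1 non-aces
            prob *= (48 - j) / (52 - j)
        p[k] = prob
    return p

p = first_ace_distribution()
r_A = sum(p[k] for k in range(1, 50, 3))  # player A receives cards 1, 4, 7, ...
r_B = sum(p[k] for k in range(2, 50, 3))
r_C = sum(p[k] for k in range(3, 50, 3))
print(round(r_A, 4), round(r_B, 4), round(r_C, 4))
```

The three values sum to 1, since the first ace must appear somewhere among the 49 relevant positions.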
The probabilities can be calculated as $r_A = \sum_{n=0}^{16} p_{1+3n} = 0.3600$, $r_B = \sum_{n=0}^{15} p_{2+3n} = 0.3328$, and $r_C = \sum_{n=0}^{15} p_{3+3n} = 0.3072$.

2.27 Under the condition that the events $A_1, \ldots, A_{i-1}$ have occurred, the $i$th couple can match the birthdays of at most one of the couples 1 to $i-1$. Thus $P(A_i^c \mid A_1 \cdots A_{i-1}) = \frac{i-1}{365^2}$ and so $P(A_i \mid A_1 \cdots A_{i-1}) = 1 - \frac{i-1}{365^2}$. The sought probability is $1 - P(A_2A_3 \cdots A_n)$ and equals $1 - \prod_{i=2}^{n}\left(1 - \frac{i-1}{365^2}\right)$, by the chain rule.

2.28 The desired probability is $1 - P(A_1A_2 \cdots A_{m-2})$. We have
$$P(A_1) = \frac{\binom{39}{5} - 1}{\binom{39}{5}}, \qquad P(A_i \mid A_1 \cdots A_{i-1}) = \frac{\binom{39}{5} - 2}{\binom{39}{5}}$$
for $i \ge 2$. The desired probability now follows by applying the chain rule $P(A_1A_2 \cdots A_{m-2}) = P(A_1)P(A_2 \mid A_1) \cdots P(A_{m-2} \mid A_1 \cdots A_{m-3})$ and is equal to
$$1 - \frac{\left[\binom{39}{5} - 1\right]\left[\binom{39}{5} - 2\right]^{m-2}}{\binom{39}{5}^{m-1}}.$$

2.29 Using the chain rule for conditional probabilities, the sought probability is $\frac{r}{r+b}$ for $k = 1$, $\frac{b}{r+b} \times \frac{r}{r+b-1}$ for $k = 2$ and
$$\frac{b}{r+b} \times \frac{b-1}{r+b-1} \times \cdots \times \frac{b-(k-2)}{r+b-(k-2)} \times \frac{r}{r+b-(k-1)} \quad\text{for } 3 \le k \le b+1.$$
The sought probability can be written as
$$\frac{\binom{b}{k-1}}{\binom{r+b}{k-1}} \times \frac{r}{r+b-(k-1)}.$$
This representation can be explained as the probability that the first $k-1$ picks are blue balls multiplied by the conditional probability that the $k$th pick is a red ball given that the first $k-1$ picks are blue balls. The answer to the last question is $\frac{b}{r+b}$, as can be directly seen by a symmetry argument. The probability that the last ball picked is blue is the same as the probability that the first ball picked is blue.

2.30 The probability that the rumor will not be repeated to any one person once more is $P(A_1A_2 \cdots A_{10})$, where $A_i$ is the event that the rumor reaches only different persons during the first $i$ times that the rumor is told.
Noting that $P(A_1) = 1$, $P(A_2 \mid A_1) = 1$ and $P(A_i \mid A_1 \cdots A_{i-1}) = \frac{25-i}{23}$ for $i \ge 3$, it follows that the probability that the rumor will not be repeated to any one person once more is
$$\frac{22}{23} \times \frac{21}{23} \times \cdots \times \frac{15}{23} = 0.1646.$$
The probability that the rumor will not return to the originator is $\left(\frac{22}{23}\right)^8 = 0.7007$.

2.31 Since $P(A) = \frac{18}{36}$, $P(B) = \frac{18}{36}$, and $P(AB) = \frac{9}{36}$, we get $P(AB) = P(A)P(B)$. This shows that the events $A$ and $B$ are independent.

2.32 The number is randomly chosen from the matrix and so $P(A) = \frac{30}{50}$, $P(B) = \frac{25}{50}$ and $P(AB) = \frac{15}{50}$. Since $P(AB) = P(A)P(B)$, the events $A$ and $B$ are independent. This result can also be explained by noting that you obtain a random number from the matrix by choosing first a row at random and choosing next a column at random.

2.33 Since $A$ is the union of the disjoint sets $AB$ and $AB^c$, we have $P(A) = P(AB) + P(AB^c)$. This gives $P(AB^c) = P(A) - P(A)P(B) = P(A)[1 - P(B)]$ and so $P(AB^c) = P(A)P(B^c)$, showing that $A$ and $B^c$ are independent events. Applying this result with $A$ replaced by $B^c$ and $B$ by $A$, we next get that $B^c$ and $A^c$ are independent events.

2.34 Since $A = AB \cup AB^c$ and the events $AB$ and $AB^c$ are disjoint, it follows that $P(A) = P(AB) + P(AB^c) = P(A \mid B)P(B) + P(A \mid B^c)P(B^c)$. This gives $P(A) = P(A \mid B)P(B) + P(A \mid B)P(B^c) = P(A \mid B)$. Thus $P(A) = \frac{P(AB)}{P(B)}$ and so $P(AB) = P(A)P(B)$.

2.35 The result follows directly from $P(A_1 \cup \cdots \cup A_n) = 1 - P(A_1^c \cdots A_n^c)$ and the independence of the $A_i^c$, using $P(A_1^c \cdots A_n^c) = P(A_1^c) \cdots P(A_n^c)$ and $P(A_i^c) = 1 - P(A_i)$.

2.36 Using Problem 2.35, the probability is $1 - \frac{1}{2} \times \frac{2}{3} \times \frac{3}{4} = \frac{3}{4}$.

2.37 The set $A$ can be represented as $A = \bigcap_{n=1}^{\infty}\bigcup_{k=n}^{\infty} A_k$. Since the sequence of sets $\bigcup_{k=n}^{\infty} A_k$ is nonincreasing, we have $P(A) = \lim_{n\to\infty} P\left(\bigcup_{k=n}^{\infty} A_k\right)$, by the continuity property of probability. Next use the fact that $P\left(\bigcup_{k=n}^{\infty} A_k\right) = 1 - P\left(\bigcap_{k=n}^{\infty} A_k^c\right)$.
Using the independence of the events $A_n$ and the continuity property of probability measure, it is readily verified that $P\left(\bigcap_{k=n}^{\infty} A_k^c\right) = \prod_{k=n}^{\infty} P(A_k^c)$. By $P(A_k^c) = 1 - P(A_k)$ and the inequality $1 - x \le e^{-x}$, we get
$$P\Big(\bigcap_{k=n}^{\infty} A_k^c\Big) \le \prod_{k=n}^{\infty} e^{-P(A_k)} = e^{-\sum_{k=n}^{\infty} P(A_k)} = 0 \quad\text{for } n \ge 1,$$
where the last equality uses the assumption $\sum_{n=1}^{\infty} P(A_n) = \infty$. This verifies that $P\left(\bigcup_{k=n}^{\infty} A_k\right) = 1$ for all $n \ge 1$ and so $P(A) = 1$.

2.38 Let $A$ be the event that you have picked the ball with number 7 written on it and $B_i$ the event that you have chosen box $i$ for $i = 1, 2$. By the law of conditional probability, $P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2)$. Therefore
$$P(A) = \frac{1}{10} \times \frac{1}{2} + \frac{1}{25} \times \frac{1}{2} = 0.07.$$

2.39 Let $A$ be the event that HAPPY HOUR appears again, $B_1$ be the event that either the two letters H or the two letters P have been removed, and $B_2$ be the event that two different letters have been removed. Then $P(B_1) = \frac{2}{9} \times \frac{1}{8} + \frac{2}{9} \times \frac{1}{8}$ and $P(B_2) = 1 - P(B_1)$. Obviously, $P(A \mid B_1) = 1$ and $P(A \mid B_2) = \frac{1}{2}$. By the law of conditional probability,
$$P(A) = \sum_{i=1}^{2} P(A \mid B_i)P(B_i) = 1 \times \frac{1}{18} + \frac{1}{2} \times \frac{17}{18} = \frac{19}{36}.$$

2.40 Let $A$ be the event that the cases with \$1,000,000 and \$750,000 are still in the game when you have opened 20 cases. Also, let $B_0$ be the event that your chosen case does not contain either of the amounts \$1,000,000 and \$750,000 and $B_1$ be the complementary event of $B_0$. Then $P(A) = P(A \mid B_0)P(B_0) + P(A \mid B_1)P(B_1)$, which gives
$$P(A) = \frac{\binom{23}{20}}{\binom{25}{20}} \times \frac{24}{26} + \frac{\binom{24}{20}}{\binom{25}{20}} \times \frac{2}{26} = \frac{3}{65}.$$

2.41 Let $A$ be the event that you ever win the jackpot when buying a single ticket only once. Also, let $B$ be the event that you match the six numbers drawn and $C$ be the event that you match exactly two of these numbers. It follows from $P(A) = P(A \mid B)P(B) + P(A \mid C)P(C)$ that $P(A) = P(B) + P(A)P(C)$. Since $P(B) = 1/\binom{59}{6}$ and $P(C) = \binom{6}{2}\binom{53}{4}/\binom{59}{6}$, we get $P(A) = 1/40{,}665{,}099$.
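The count behind Problem 2.41 can be verified directly: solving $P(A) = P(B) + P(A)P(C)$ gives $P(A) = P(B)/(1 - P(C)) = 1/\big[\binom{59}{6} - \binom{6}{2}\binom{53}{4}\big]$. A Python check of our own:

```python
from math import comb

n_total = comb(59, 6)                   # number of six-number combinations
n_two_match = comb(6, 2) * comb(53, 4)  # combinations matching exactly two numbers
p_jackpot = (1 / n_total) / (1 - n_two_match / n_total)

print(n_total - n_two_match)  # 40665099, so P(A) = 1/40,665,099
```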
2.42 Let A be the event that Joe's dinner is burnt, B_0 be the event that he did not arrive home on time, and B_1 be the event that he arrived home on time. The probability P(A) = P(A | B_0)P(B_0) + P(A | B_1)P(B_1) is equal to 0.5 × 0.2 + 0.15 × 0.8 = 0.22. The inverse probability P(B_1 | A) is given by

P(B_1 | A) = P(B_1)P(A | B_1)/P(A) = (0.8 × 0.15)/0.22 = 6/11.

2.43 Let A be the event of reaching your goal, B_1 be the event of winning the first bet and B_2 be the event of losing the first bet. Then, by P(A) = P(A | B_1)P(B_1) + P(A | B_2)P(B_2), we get P(A) = 1 × 12/37 + 9/37 × 25/37. Thus the probability of reaching your goal is 0.4887. Note: This probability is slightly more than the probability 0.4865 of reaching your goal when you use bold play and stake the whole $10,000 on an 18-numbers bet.

2.44 Let A be the event that the player wins and B_i be the conditioning event that the first roll of the two dice gives a dice sum of i points for i = 2, ..., 12. Then, P(A) = ∑_{k=2}^{12} P(A | B_k)P(B_k). We have P(A | B_i) = 1 for i = 7, 11, and P(A | B_i) = 0 for i = 2, 3, 12. Put for abbreviation p_k = P(B_k); then p_k = (k − 1)/36 for k = 2, ..., 7 and p_k = p_{14−k} for k = 8, ..., 12. The other conditional probabilities P(A | B_i) can be given in terms of the p_k. For example, the conditional probability P(A | B_4) is no other than the unconditional probability that the total of 4 will appear before the total of 7 does in the (compound) experiment of repetitive dice rolling. The total of 4 will appear before the total of 7 if and only if one of the events E_1, E_2, ... occurs, where E_k is the event that the first k − 1 rolls give neither the total of 4 nor the total of 7 and the kth roll gives a total of 4. The events E_1, E_2, ... are mutually exclusive and so

P(4 before 7) = P(∪_{i=1}^∞ E_i) = ∑_{i=1}^∞ P(E_i).
Any event E_k is generated by physically independent subexperiments and thus the probabilities of the individual outcomes in the subexperiments are multiplied by each other in order to obtain P(E_k) = (1 − p_4 − p_7)^{k−1} p_4 for any k ≥ 1. This leads to the formula

P(4 before 7) = ∑_{k=1}^∞ (1 − p_4 − p_7)^{k−1} p_4 = p_4/(p_4 + p_7).

In this way, we find that P(A | B_i) = p_i/(p_i + p_7) for i = 4, 5, 6, 8, 9, 10. Putting all the pieces together, we get

P(A) = ∑_{k=2}^{12} P(A | B_k) p_k = 0.4929.

2.45 Apply the gambler's ruin formula with p = 0.3, a = 3 and b = 7. The sought probability is 0.0025.

2.46 For fixed integer r, let A_r be the event that there are exactly r winning tickets among the fifty thousand tickets sold. Let B_k be the event that there are exactly k winning tickets among the one hundred thousand tickets printed. Then, by the law of conditional probability,

P(A_r) = ∑_{k=0}^∞ P(A_r | B_k)P(B_k).

Obviously, P(A_r | B_k) = 0 for k < r. For all practical purposes the so-called Poisson probability e^{−1}/k! can be taken for the probability of the event B_k for k = 0, 1, ..., see Example 1.19. This gives

P(A_r) = ∑_{k=r}^∞ C(k, r)(1/2)^k e^{−1}/k! = e^{−1} [(1/2)^r/r!] ∑_{j=0}^∞ (1/2)^j/j! = e^{−1/2} (1/2)^r/r!.

Hence the probability of exactly r winning tickets among the fifty thousand tickets sold is given by the Poisson probability e^{−1/2} (1/2)^r/r! for r = 0, 1, ....

2.47 It is no restriction to assume that the starting point is 1 and the first transition is from point 1 to point 2 (otherwise, renumber the points). Some reflection shows that the probability of visiting all points before returning to the starting point is nothing else than the probability 1/(1 + 10) = 1/11 from the gambler's ruin model.

2.48 Let A be the event that the card picked is a red card, B_1 be the event that the removed top card is red and B_2 be the event that the removed top card is black. The sought probability P(A) is given by P(A | B_1)P(B_1) + P(A | B_2)P(B_2). Therefore

P(A) = (r − 1)/(r + b − 1) × r/(r + b) + r/(r + b − 1) × b/(r + b) = r/(r + b).
2.49 Let A be the event that John needs more tosses than Pete and B_j be the event that Pete needs j tosses to obtain three heads. Then P(B_j) = C(j − 1, 2)(1/2)^j and P(A | B_j) = C(j, 0)(1/2)^j + C(j, 1)(1/2)^j. By the law of conditional probability, the sought probability P(A) is

P(A) = ∑_{j=3}^∞ P(A | B_j)P(B_j) = 0.1852.

2.50 Take any of the twenty balls and mark this ball. Let A be the event that this ball is the last ball picked for the situation that three balls were overlooked and were added to the bin at the end. If we can show that P(A) = 1/20, the raffle is still fair. Let B_1 (B_2) be the event that the marked ball is (is not) one of the three balls that were unintentionally overlooked. Then, by the law of conditional probability,

P(A) = P(A | B_1)P(B_1) + P(A | B_2)P(B_2) = 1/3 × 3/20 + 0 × 17/20.

Hence P(A) = 1/20, the same win probability as for the case in which no balls would have been overlooked.

2.51 Let state i mean that player A's bankroll is i. Also, let E be the event of reaching state k without having reached state a + b when starting in state a, and F be the event of reaching state a + b without having reached state k − 1 when starting in state k. Then the unconditional probability of player A winning and having k as the lowest value of his bankroll during the game is given by P(EF) = P(E)P(F | E). Using the gambler's ruin formula, P(E) = b/(a + b − k) and P(F | E) = 1/(a + b − k + 1). Thus the sought conditional probability is

b(a + b)/[a(a + b − k)(a + b − k + 1)] for k = 1, ..., a.

This probability has the values 0.1563, 0.2009, 0.2679, and 0.3750 for k = 1, 2, 3, and 4 when a = 4 and b = 5.

2.52 Let A be the event that two or more participating cyclists will have birthdays on the same day during the tournament and B_i be the event that exactly i participating cyclists have their birthdays during the tournament. The conditional probability P(A | B_i) is easy to calculate. It is the standard birthday problem.
We have

P(A | B_i) = 1 − [23 × 22 × ⋯ × (23 − i + 1)]/23^i for 2 ≤ i ≤ 23.

Further, P(A | B_i) = 1 for i ≥ 24. Also, P(A | B_0) = P(A | B_1) = 0. The probability P(B_i) is given by

P(B_i) = C(180, i) (23/365)^i (1 − 23/365)^{180−i} for 0 ≤ i ≤ 180.

Putting the pieces together and using P(A) = ∑_{i=2}^{180} P(A | B_i)P(B_i), we get

P(A) = 1 − P(B_0) − P(B_1) − ∑_{i=2}^{23} [23 × 22 × ⋯ × (23 − i + 1)/23^i] P(B_i).

This yields the value 0.8841 for the probability P(A).

2.53 Let p_n(i) be the probability of reaching his home no later than midnight without having first reached the police station, given that he is i steps away from his home and still has time to make n steps before it is midnight. The sought probability is p_180(10). By the law of conditional probability, the p_n(i) satisfy the recursion

p_n(i) = (1/2) p_{n−1}(i + 1) + (1/2) p_{n−1}(i − 1).

The boundary conditions are p_k(30) = 0 and p_k(0) = 1 for k ≥ 0, and p_0(i) = 0 for 1 ≤ i ≤ 29. Applying the recursion, we find p_180(10) = 0.4572. In the same way, the value 0.1341 can be calculated for the probability of reaching the police station before midnight. Note: As a sanity check, we verified that p_n(10) tends to 2/3 as n gets large, in agreement with the gambler's ruin formula. The probability p_n(10) has the values 0.5905, 0.6659, and 0.6665 for n = 360, 1,200 and 1,440.

2.54 Let A be the event that John and Pete meet each other in the semifinals. To find P(A), let B_1 be the event that John and Pete are allocated to either group 1 or group 2 but not to the same group, and B_2 be the event that John and Pete are allocated to either group 3 or group 4 but not to the same group. Then P(B_1) = P(B_2) = 1/2 × 2/7 = 1/7. By the law of conditional probability,

P(A) = P(A | B_1) × 1/7 + P(A | B_2) × 1/7 = 1/2 × 1/2 × 1/7 + 1/2 × 1/2 × 1/7 = 1/14.

Let C be the event that John and Pete meet each other in the final.
To find P(C), let D_1 be the event that John is allocated to either group 1 or group 2 and Pete to either group 3 or group 4, and D_2 be the event that John is allocated to either group 3 or group 4 and Pete to either group 1 or group 2. Then P(D_1) = P(D_2) = 1/2 × 4/7 = 2/7. By the law of conditional probability,

P(C) = P(C | D_1) × 2/7 + P(C | D_2) × 2/7 = 1/2 × 1/2 × 1/2 × 1/2 × 2/7 + 1/2 × 1/2 × 1/2 × 1/2 × 2/7 = 1/28.

The latter result can also be directly seen by a symmetry argument. The probability that any one pair contests the final is the same as that for any other pair. There are C(8, 2) different pairs and so the probability that John and Pete meet each other in the final is 1/C(8, 2) = 1/28.

2.55 Let A be the event that you have chosen the bag with one red ball and B be the event that you have the other bag. Also, let E be the event that the first ball picked is red. The sought probability that the second ball picked is red is

(1/4) P(A | E) + (3/4) P(B | E),

by the law of conditional probability. We have

P(A | E) = P(AE)/P(E) = P(A)P(E | A)/P(E).

Further, P(B | E) = 1 − P(A | E). Since P(A) = P(B) = 1/2, P(E) = P(A) × 1/4 + P(B) × 3/4 = 1/2 and P(E | A) = 1/4, we get P(A | E) = 1/4 and P(B | E) = 3/4. Thus the sought probability is 1/4 × 1/4 + 3/4 × 3/4 = 5/8.

2.56 The key idea for the solution approach is to parameterize the starting state. Define D_s as the event that Dave wins the game when the game begins with Dave rolling the dice and Dave has to roll more than s points in his first roll. Similarly, the event E_s is defined for Eric. The goal is to find P(D_1). This probability can be found from a recursion scheme for the P(D_s). The recursion scheme follows by conditioning on the events B_j, where B_j is the event that a roll of the two dice results in a sum of j points. The probabilities p_j = P(B_j) are given by p_j = (j − 1)/36 for 2 ≤ j ≤ 7 and p_j = p_{14−j} for 8 ≤ j ≤ 12.
By the law of conditional probability,

P(D_s) = ∑_{j=s+1}^{12} P(D_s | B_j) p_j for s = 1, 2, ..., 11.

Obviously, P(D_12) = 0. Since P(D_s | B_j) = 1 − P(E_j) for j > s and P(E_k) = P(D_k) for all k, we get the recursion scheme

P(D_s) = ∑_{j=s+1}^{12} [1 − P(D_j)] p_j for s = 1, 2, ..., 11.

Starting with P(D_12) = 0, we recursively compute P(D_11), ..., P(D_1). This gives the value P(D_1) = 0.6541 for the probability of Dave winning the game.

2.57 Fix j. Label the c = C(7, j) possible combinations of j stops as l = 1, ..., c. Let A be the event that there will be exactly j stops at which nobody gets off and B_l be the event that nobody gets off at the j stops from combination l. Then, P(A) = ∑_{l=1}^c P(A | B_l)P(B_l). We have that P(B_l) = (7 − j)^25/7^25 for all l, and P(A | B_l) is the unconditional probability that at least one person gets off at each stop when there are 7 − j stops and 25 persons. Thus P(A | B_l) = 1 − ∑_{k=1}^{7−j} (−1)^{k+1} C(7 − j, k)(7 − j − k)^25/(7 − j)^25, using the result of Example 1.18. Next we get after some algebra the desired result

P(A) = ∑_{k=0}^{7−j} (−1)^k C(7, j + k) C(j + k, j) (7 − j − k)^25/7^25.

Note: More generally, the probability of exactly j empty bins when m ≥ b balls are sequentially placed at random into b bins is given by

∑_{k=0}^{b−j} (−1)^k C(b, j + k) C(j + k, j) (b − j − k)^m/b^m.

2.58 Let A be the event that all of the balls drawn are blue and B_i be the event that the number of points shown by the die is i for i = 1, ..., 6. By the law of conditional probability, the probability that all of the balls drawn are blue is given by

P(A) = ∑_{i=1}^6 P(A | B_i)P(B_i) = ∑_{i=1}^5 [C(5, i)/C(10, i)] × 1/6 = 5/36.

The probability that the number of points shown by the die is r given that all of the balls drawn are blue is equal to

P(B_r | A) = P(B_r A)/P(A) = (1/6)[C(5, r)/C(10, r)]/(5/36).

This probability has the values 3/5, 4/15, 1/10, 1/35, 1/210, and 0 for r = 1, ..., 6.
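The recursion of Problem 2.56 and the exact fractions of Problem 2.58 are both easy to check numerically; a small sketch (my own, with exact rational arithmetic):

```python
from fractions import Fraction
from math import comb

# Problem 2.56: P(D_s) = sum_{j=s+1}^{12} [1 - P(D_j)] p_j, P(D_12) = 0,
# where p_j is the probability of rolling a dice sum of j.
p = {j: Fraction(6 - abs(j - 7), 36) for j in range(2, 13)}
D = {12: Fraction(0)}
for s in range(11, 0, -1):
    D[s] = sum((1 - D[j]) * p[j] for j in range(s + 1, 13))
print(round(float(D[1]), 4))  # 0.6541

# Problem 2.58: P(all drawn balls blue) and the posterior of the die roll.
# P(A | B_i) = C(5, i)/C(10, i); note comb(5, 6) == 0, covering r = 6.
p_a = sum(Fraction(comb(5, i), comb(10, i)) * Fraction(1, 6) for i in range(1, 6))
posterior = [Fraction(comb(5, r), comb(10, r)) * Fraction(1, 6) / p_a for r in range(1, 7)]
print(p_a)  # p_a == 5/36; posterior == 3/5, 4/15, 1/10, 1/35, 1/210, 0
```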
2.59 Let A be the event that both rolls of the two dice show the same combination of two numbers. Also, let B_1 be the event that the first roll of the two dice shows two equal numbers and B_2 be the event that the first roll shows two different numbers. Then P(B_1) = 6/36 and P(B_2) = 30/36. Further, P(A | B_1) = 1/36 and P(A | B_2) = 2/36. By the law of conditional probability,

P(A) = ∑_{i=1}^2 P(A | B_i)P(B_i) = 1/36 × 6/36 + 2/36 × 30/36 = 11/216.

2.60 Let A_j be the event that the team placed jth in the competition wins the first place in the draft and B_j be the event that this team wins the second place in the draft, for 7 ≤ j ≤ 14. Obviously,

P(A_j) = (15 − j)/36 for j = 7, ..., 14.

By the law of conditional probability, P(B_j) = ∑_{k≠j} P(B_j | A_k)P(A_k). We have P(B_j | A_k) = (15 − j)/(36 − 15 + k) for k ≠ j. Therefore

P(B_j) = ∑_{k≠j} (15 − j)/(36 − 15 + k) × (15 − k)/36 for j = 7, ..., 14.

The probability P(B_j) has the numerical values 0.2013, 0.1848, 0.1653, 0.1431, 0.1185, 0.0917, 0.0629, and 0.0323 for j = 7, 8, ..., 14.

2.61 This problem can be seen as a random walk on the integers, where the random walk starts at zero. In the first step the random walk moves from zero to 1 with probability p and to −1 with probability q = 1 − p. Take p < 1/2. Starting from 1, the random walk will ever return to zero with probability 1 − lim_{b→∞} [1 − (q/p)^a]/[1 − (q/p)^{a+b}] with a = 1. This probability is 1. Starting from −1, the random walk will ever return to zero with probability 1 − lim_{b→∞} [1 − (p/q)^a]/[1 − (p/q)^{a+b}] with a = 1. This probability is p/q. The sought probability is p × 1 + (1 − p) × p/(1 − p) = 2p (this result is also valid for p = 1/2).

2.62 Let A be the event that the drunkard will ever visit the point which is one unit distance south from his starting point. Let B_1 be the event that the first step is one unit distance to the south and B_2 be the event that the first step is two units distance to the north.
By the law of conditional probability, P(A) = P(A | B_1)P(B_1) + P(A | B_2)P(B_2). Obviously, P(B_1) = P(B_2) = 1/2 and P(A | B_1) = 1. Noting that the probability of ever going three units distance to the south from any starting point is P(A) × P(A) × P(A), it follows that

P(A) = 0.5 + 0.5 [P(A)]^3.

The cubic equation x^3 − 2x + 1 = 0 has the root x = 1 and so the equation can be factorized as (x − 1)(x^2 + x − 1) = 0. The only positive root of x^2 + x − 1 = 0 is x = (1/2)(√5 − 1). This is the desired value for the sought probability P(A). Next some reflection shows that (1/2)(√5 − 1) also gives the probability of the number of heads ever exceeding twice the number of tails if a fair coin is tossed over and over.

2.63 It does not matter what question you ask. To see this, let A be the event that your guess is correct, B_1 be the event that the answer of your friend is yes, and B_2 be the event that the answer is no. For the question whether the card is red, we have P(A) = 1/26 × 1/2 + 1/26 × 1/2 = 1/26, by the law of conditional probability. For the other question, P(A) = 1 × 1/52 + 1/51 × 51/52 = 1/26. The same probability.

2.64 Let A be the event that player 1 wins the game. We have P(A) = 0.5, regardless of the value of m. The simplest way to see this is to define E_1 as the event that player 1 has more heads than player 2 after m tosses, E_2 as the event that player 1 has fewer heads than player 2 after m tosses, and E_3 as the event that player 1 has the same number of heads as player 2 after m tosses. Then P(A) = ∑_{i=1}^3 P(A | E_i)P(E_i), by the law of conditional probability. To evaluate this, it is not necessary to know the P(E_i). Since P(E_2) = P(E_1) and P(E_3) = 1 − 2P(E_1), it follows that

P(A) = 1 × P(E_1) + 0 × P(E_2) + 1/2 × P(E_3) = P(E_1) + 1/2 × [1 − 2P(E_1)] = 0.5.

2.65 Let A be the event that you roll two consecutive totals of 7 before a total of 12.
Let B_1 be the event that each of the first two rolls results in a total of 7, B_2 be the event that the first roll gives a total of 7 and the second roll a total different from 7 and 12, B_3 be the event that the first roll gives a total different from 7 and 12, B_4 be the event that the first roll gives a total of 7 and the second roll a total of 12, and B_5 be the event that the first roll gives a total of 12. Then,

P(A) = 1 × 6/36 × 6/36 + P(A) × 6/36 × 29/36 + P(A) × 29/36 + 0 × 6/36 × 1/36 + 0 × 1/36

and so P(A) = 6/13.

2.66 A minor modification of the analysis of Example 2.11 shows that the optimal stopping level for player A remains the same, but the win probability of player A changes to 0.458.

2.67 The recursion is

p(i, t) = 1/(6 − i + 1) ∑_{j=0}^{6−i} p(i + 1, t − j),

as follows by conditioning upon the number of tokens you lose at the ith cup. This leads to p(1, 6) = 169/720.

2.68 For fixed n, let A be the event that the total score ever reaches the value n. To find p_n = P(A), condition on the outcome of the first roll of the die. Let B_j be the event that the outcome of this roll is j. Then, P(A | B_j) = p_{n−j} and so, by the law of conditional probability,

p_n = ∑_{k=1}^6 p_{n−k} × 1/6 for all n ≥ 1

with the convention p_0 = 1 and p_j = 0 for j < 0. The result that p_n tends to 1/3.5 = 2/7 as n gets large can be intuitively explained from the fact that after each roll of the die the expected increase in the total score is equal to (1/6)(1 + 2 + ⋯ + 6) = 3.5.

2.69 (a) Define r_n as the probability of getting a run of either r successes or r failures in n trials. Also, define s_n as the probability of getting a run of either r successes or r failures in n trials given that the first trial results in a success, and f_n as the probability of getting a run of either r successes or r failures in n trials given that the first trial results in a failure. Then r_n = p s_n + (1 − p) f_n.
The s_n and f_n satisfy the recursive schemes

s_n = p^{r−1} + ∑_{k=1}^{r−1} p^{k−1}(1 − p) f_{n−k},
f_n = (1 − p)^{r−1} + ∑_{k=1}^{r−1} (1 − p)^{k−1} p s_{n−k}

for n ≥ r, where s_j = f_j = 0 for j < r.

(b) Parameterize the starting state and let p(r, b, L) be the probability that the longest run of red balls will be L or more when the bowl initially contains r red and b blue balls. Fix r > L and b ≥ 1. Let A be the event that a run of L red balls will occur. To find P(A) = p(r, b, L), let B_L be the conditioning event that the first L balls picked are red and B_{j−1} be the conditioning event that each of the first j − 1 balls picked is red but the jth ball picked is blue, where 1 ≤ j ≤ L. Then

P(B_L) = r/(r + b) × ⋯ × (r − (L − 1))/(r + b − (L − 1)),
P(B_{j−1}) = r/(r + b) × ⋯ × (r − (j − 2))/(r + b − (j − 2)) × b/(r + b − (j − 1)), 1 ≤ j ≤ L.

Note that P(A | B_L) = 1 and P(A | B_{j−1}) = p(r − (j − 1), b − 1, L) for 1 ≤ j ≤ L. Then

P(A) = P(B_L) + ∑_{j=1}^L P(A | B_{j−1})P(B_{j−1})

gives a recursion scheme for the calculation of the probability p(r, b, L).

2.70 For the case of n dwarfs, p(k, n) is defined as the probability that the kth dwarf will not sleep in his own bed when the first dwarf chooses randomly one of the n beds (the dwarfs 1, 2, ..., n go to bed in this order and dwarf j has bed j). Let us first note that the dwarfs 2, ..., j − 1 sleep in their own beds if the first dwarf chooses bed j. The first dwarf chooses each of the n beds with the same probability 1/n. Fix k ≥ 2. Under the condition that the first dwarf chooses bed j with 2 ≤ j ≤ k, the conditional probability that the kth dwarf will not sleep in his own bed is equal to 1 for j = k and is equal to the unconditional probability p(k − (j − 1), n − (j − 1)) for 2 ≤ j ≤ k − 1 (when dwarf j goes to bed, we face the situation of n − (j − 1) dwarfs where bed 1 is now the bed of dwarf j and dwarf k is in the (k − (j − 1))th position).
Hence, by the law of conditional probability, we find the recursion

p(k, n) = 1/n + (1/n) ∑_{j=2}^{k−1} p(k − (j − 1), n − (j − 1))

for k = 2, ..., n and all n ≥ 2. Noting that p(1, n) = (n − 1)/n for all n ≥ 1, we get p(2, n) = 1/n and p(3, n) = 1/(n − 1). Next, by induction, we obtain

p(k, n) = 1/(n − k + 2) for 2 ≤ k ≤ n.

Hence the probability that the kth dwarf can sleep in his own bed is equal to 1 − (n − 1)/n = 1/n for k = 1 and 1 − 1/(n − k + 2) = (n − k + 1)/(n − k + 2) for 2 ≤ k ≤ n. A remarkable result is that p(n, n) = 1/2 for all n ≥ 2. A simple intuitive explanation can be given for the result that the last dwarf will sleep in his own bed with probability 1/2, regardless of the number of dwarfs. The key observation is that the last free bed is either the bed of the youngest dwarf or the bed of the oldest dwarf. This is an immediate consequence of the fact that any of the other dwarfs always chooses his own bed when it is free. Each time a dwarf finds his bed occupied, the dwarf chooses at random a free bed and then the probability of the youngest dwarf's bed being chosen is equal to the probability of the oldest dwarf's bed being chosen. Thus the last free bed is equally likely to be the bed of the youngest dwarf or the bed of the oldest dwarf. Note: Consider the following variant of the problem with seven dwarfs. The jolly youngest dwarf decides not to choose his own bed but rather to choose at random one of the other six beds. Then the probability that the oldest dwarf can sleep in his own bed is 5/6 × 1/2 = 5/12, as can be seen by using the intuitive reasoning above.

2.71 Let's assume that the numbers 1, 2, ..., R are on the wheel. It is obvious that the optimal strategy of the second player B is to stop after the first spin if and only if the score is larger than the final score of player A and is larger than a_2, where a_2 is the optimal switching point for the first player in the two-player game (a_2 = 53 for R = 100).
Denote by S_3(a) [C_3(a)] the probability that the first player A will beat both player B and player C if player A obtains a score of a points in the first spin and stops [continues] after the first spin. Let the switching point a_3 be the largest value of a for which C_3(a) is still larger than S_3(a). Then, in the three-player game it is optimal for player A to stop after the first spin if the score of this spin is more than a_3 points. Denote by P_3(A) the overall win probability of player A. Then, by the law of conditional probability,

P_3(A) = (1/R) ∑_{a=1}^{a_3} C_3(a) + (1/R) ∑_{a=a_3+1}^{R} S_3(a).

To obtain S_3(a), we first determine the conditional probability that player A will beat player B when player A stops after the first spin with a points. Denote this conditional probability by P_a. To find it, we first note that the probability S(a) = a^2/R^2 from the two-player game represents the probability that the second player in this game scores no more than a points in the first spin and has not beaten the first player after the second spin. Thus, taking into account the form of the optimal strategy of player B, we find for a ≥ a_2,

P_a = P(B gets no more than a in the first spin and A beats B) = S(a) = a^2/R^2

and for 1 ≤ a < a_2,

P_a = P(B gets no more than a in the first spin and A beats B) + P(B gets between a and a_2 + 1 in the first spin and A beats B)
    = a^2/R^2 + ∑_{j=a+1}^{a_2} (1/R)[1 − (R − j)/R] = S(a) + [a_2(a_2 + 1) − a(a + 1)]/(2R^2),

where 1 − (R − j)/R denotes the probability that B's total score after the second spin exceeds R when B's score in the first spin is j. Obviously, for the case that player A stops after the first spin with a points, the conditional probability of player A beating player C given that player A has already beaten player B is equal to a^2/R^2. Thus, the function S_3(a) is given by

S_3(a) = (a^2/R^2) × (a^2/R^2) for a_2 ≤ a ≤ R,
S_3(a) = {a^2/R^2 + [a_2(a_2 + 1) − a(a + 1)]/(2R^2)} × a^2/R^2 for 1 ≤ a < a_2.

Further,

C_3(a) = (1/R) ∑_{k=1}^{R−a} S_3(a + k) for 1 ≤ a ≤ R.
Noting that S_3(a) = a^4/R^4 and C_3(a) = (1/R) ∑_{k=1}^{R−a} (a + k)^4/R^4 for a ≥ a_2, and taking for granted that a_3 ≥ a_2, the switching point a_3 is nothing else than the largest integer a ≥ a_2 for which

(1/R) ∑_{k=1}^{R−a} (a + k)^4/R^4 > a^4/R^4.

The probability P_3(B) of the second player B being the overall winner can be calculated as follows. For the situation of optimal play by the players, let p_3(a) denote the probability that the final score of player A will be a points and, for b > a, let p_3(b | a) denote the probability that the final score of player B will be b points given that player A's final score is a. Then,

P_3(B) = ∑_{a=0}^{R−1} p_3(a) ∑_{b=a+1}^{R} p_3(b | a) b^2/R^2.

It easily follows that p_3(0) = ∑_{k=1}^{a_3} (1/R) × (k/R) = (1/2) a_3(a_3 + 1)/R^2, p_3(a) = ∑_{k=1}^{a−1} (1/R) × (1/R) = (a − 1)/R^2 for 1 ≤ a ≤ a_3, and p_3(a) = 1/R + a_3/R^2 for a_3 < a ≤ R. Then, for 0 ≤ a < a_2 and b > a,

p_3(b | a) = (b − 1)/R^2 for b ≤ a_2,  p_3(b | a) = 1/R + a_2/R^2 for b > a_2,

and p_3(b | a) = 1/R + a/R^2 for a_2 ≤ a < R and b > a.

Numerical results: For R = 20, we find a_3 = 13 (a_2 = 10), P_3(A) = 0.3414, P_3(B) = 0.3307, and P_3(C) = 0.3279 with P_3(C) = 1 − P_3(A) − P_3(B). For R = 100, the results are a_3 = 65 (a_2 = 53), P_3(A) = 0.3123, P_3(B) = 0.3300, and P_3(C) = 0.3577, while for R = 1,000 the results are a_3 = 648 (a_2 = 532), P_3(A) = 0.3059, P_3(B) = 0.3296, and P_3(C) = 0.3645 (see also the solution of Problem 7.35).

Note: The following result can be given for the s-player game with s > 3. Denoting by a_s the optimal switching point for the first player A in the s-player game, the value of a_s can be calculated as the largest integer a ≥ a_{s−1} for which

(1/R) ∑_{k=1}^{R−a} [(a + k)/R]^{2(s−1)} > (a/R)^{2(s−1)}.

For R = 20, a_s has the values 14, 15, 16, and 17 for s = 4, 5, 7, and 10. These values are 71, 75, 80, and 85 when R = 100 and are 711, 752, 803, and 847 when R = 1,000.

2.72 Let the hypothesis H be the event that a 1 is sent and the evidence E be the event that a 1 is received.
The posterior odds are

P(H | E)/P(H^c | E) = [P(H)/P(H^c)] × [P(E | H)/P(E | H^c)] = (0.8/0.2) × (0.95/0.01) = 380.

Hence the posterior probability P(H | E) that a 1 has been sent is 380/381 = 0.9974.

2.73 Let the hypothesis H be the event that oil is present and the evidence E be the event that the test is positive. Then P(H) = 0.4, P(H^c) = 0.6, P(E | H) = 0.90, and P(E | H^c) = 0.15. Thus the posterior odds are

P(H | E)/P(H^c | E) = (0.4/0.6) × (0.90/0.15) = 4.

The posterior probability is P(H | E) = 4/5 = 0.8.

2.74 Let the hypothesis H be the event that it rains tomorrow and E be the event that rain is predicted for tomorrow. The prior odds of the event H are P(H)/P(H^c) = 0.1/0.9. The likelihood ratio is given by P(E | H)/P(E | H^c) = 0.85/0.25. Then, by Bayes' rule in odds form, the posterior odds are

P(H | E)/P(H^c | E) = (0.1/0.9) × (0.85/0.25) = 17/45.

It next follows that the posterior probability P(H | E) of rainfall tomorrow given the information that rain is predicted for tomorrow is equal to (17/45)/(1 + 17/45) = 0.2742.

2.75 Let the hypothesis H be the event that the blue coin is unfair and the evidence E be the event that all three tosses of the blue coin show a head. The posterior odds are (0.2/0.8) × (0.75^3/0.5^3) = 27/32. The posterior probability is P(H | E) = 27/59 = 0.4576.

2.76 Let the hypothesis H be the event that Dennis Nightmare played the final and the evidence E be the event that the Dutch team won the final. Then P(H) = 0.75, P(H^c) = 0.25, P(E | H) = 0.5, and P(E | H^c) = 0.3. Therefore the posterior odds are

P(H | E)/P(H^c | E) = (0.75/0.25) × (0.5/0.3) = 5.

Thus the sought posterior probability is P(H | E) = 5/6.

2.77 Let the hypothesis H be the event that both children are boys.
(a) If the evidence E is the event that at least one child is a boy, then the posterior odds are

(1/4)/(3/4) × 1/(2/3) = 1/2.

The posterior probability is P(H | E) = 1/3.
(b) If the evidence E is the event that at least one child is a boy born on a Tuesday, then the posterior odds are

(1/4)/(3/4) × [1 − (6/7)^2] / [1/3 × 1/7 + 1/3 × 1/7 + 1/3 × 0] = 13/14.

The posterior probability is P(H | E) = 13/27.
(c) If the evidence E is the event that at least one child is a boy born on one of the first k days of the week, then the posterior odds are

(1/4)/(3/4) × [1 − (1 − k/7)^2] / [1/3 × k/7 + 1/3 × k/7 + 1/3 × 0] = (14 − k)/14.

The posterior probability is P(H | E) = (14 − k)/(28 − k) for k = 1, 2, ..., 7.

2.78 Let the hypothesis H be the event that the inhabitant you overheard spoke truthfully and the evidence E be the event that the other inhabitant says that the inhabitant you overheard spoke the truth. The posterior odds are

P(H | E)/P(H^c | E) = (1/3)/(2/3) × (1/3)/(2/3) = 1/4.

Hence the posterior probability P(H | E) that the inhabitant you overheard spoke the truth is (1/4)/(1 + 1/4) = 1/5.

2.79 Let the hypothesis H be the event that the suspect is guilty and the evidence E be the event that the suspect makes a confession. To verify that P(H | E) > P(H) if and only if P(E | H) > P(E | H^c), we use the fact that a/(1 − a) > b/(1 − b) for 0 < a, b < 1 if and only if a > b. Bayes' rule in odds form states that

P(H | E)/P(H^c | E) = [P(H)/P(H^c)] × [P(E | H)/P(E | H^c)].

If P(E | H) > P(E | H^c), then it follows from Bayes' rule in odds form that P(H | E)/[1 − P(H | E)] > P(H)/[1 − P(H)] and so P(H | E) > P(H). Next suppose that P(H | E) > P(H). Then P(H | E)/[1 − P(H | E)] > P(H)/[1 − P(H)] and thus, by Bayes' rule in odds form, P(E | H) > P(E | H^c).

2.80 Let the hypothesis H be the event that the bowl originally contained a red ball and the evidence E be the event that a red ball is picked from the bowl after a red ball was added. Then P(H) = 0.5, P(H^c) = 0.5, P(E | H) = 1, and P(E | H^c) = 0.5. Therefore P(H | E)/P(H^c | E) = (1/2)/(1/2) × 1/(1/2) = 2. Thus the posterior probability is P(H | E) = 2/3.
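The odds calculations in Problems 2.72–2.80 all follow the same pattern, so a single helper reproduces them; a minimal sketch in exact rational arithmetic (the helper name `post_prob` is mine, not the book's):

```python
from fractions import Fraction

def post_prob(prior, lik_h, lik_hc):
    # Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio;
    # then convert the odds back to a probability.
    odds = (prior / (1 - prior)) * Fraction(lik_h) / Fraction(lik_hc)
    return odds / (1 + odds)

F = Fraction
print(post_prob(F(8, 10), F(95, 100), F(1, 100)))   # 2.72: 380/381
print(post_prob(F(4, 10), F(90, 100), F(15, 100)))  # 2.73: 4/5
print(post_prob(F(1, 10), F(85, 100), F(25, 100)))  # 2.74: 17/62
print(post_prob(F(1, 4), 1, F(2, 3)))               # 2.77(a): 1/3
```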
2.81 Let the hypothesis H be the event that the woman has breast cancer and the evidence E be the event that the test result is positive. Since P(H) = 0.01, P(H^c) = 0.99, P(E | H) = 0.9, and P(E | H^c) = 0.1, the posterior odds are (0.01/0.99) × (0.9/0.1) = 1/11. Therefore the posterior probability is P(H | E) = 1/12.
Note: As a sanity check, the posterior probability can also be obtained by a heuristic but insightful approach. This approach presents the relevant information in terms of frequencies instead of probabilities. Imagine 10,000 (say) women who undergo the test. On average, there are 90 positive tests for the 100 women having the malicious disease, whereas there are 990 false positives for the 9,900 healthy women. Thus, based on the information presented in this way, we find that the sought probability is 90/(90 + 990) = 1/12.

2.82 Let the hypothesis H be the event that Elvis was an identical twin and the evidence E be the event that Elvis's twin was male. Then P(H) = 125/425 = 5/17, P(H^c) = 300/425 = 12/17, P(E | H) = 1, and P(E | H^c) = 0.5. Then, by Bayes' rule in odds form, P(H | E)/P(H^c | E) = (5/17)/(12/17) × 1/0.5 = 5/6. This gives P(H | E) = 5/11.
Note: A heuristic way to get the answer is as follows. In 3000 births (say), we would expect 3000/300 = 10 sets of identical twins. Roughly half of those we would expect to be boy-boy. That's 5 sets of boy-boy identical twins. In 3000 births, we would expect 3000/125 = 24 sets of fraternal twins. One fourth would be boy-boy, one fourth would be girl-girl, one fourth would be boy-girl, and one fourth girl-boy. Therefore six sets would be boy-boy. So, out of 3000 births, five out of eleven sets of boy-boy twins would be identical. Therefore the chance that Elvis was an identical twin is about 5/11.

2.83 Let the hypothesis H be the event that you have chosen the two-headed coin and the evidence E be the event that all n tosses result in heads. The posterior odds are

P(H | E)/P(H^c | E) = (1/10,000)/(9,999/10,000) × 1/0.5^n = 2^n/9,999.
This gives P(H | E) = 2^n/(2^n + 9,999). The probability P(H | E) has the values 0.0929, 0.7662, and 0.9997 for n = 10, 15, and 25.

2.84 Let the random variable Θ represent the unknown probability that a single toss of the die results in the outcome 6. The prior distribution of Θ is given by p_0(θ) = 0.25 for θ = 0.1, 0.2, 0.3 and 0.4. The posterior probability p(θ | data) = P(Θ = θ | data) is proportional to L(data | θ) p_0(θ), where L(data | θ) = C(300, 75) θ^75 (1 − θ)^225. Hence the posterior probability p(θ | data) is given by

p(θ | data) = L(data | θ) p_0(θ) / ∑_{k=1}^4 L(data | k/10) p_0(k/10) = θ^75 (1 − θ)^225 / ∑_{k=1}^4 (k/10)^75 (1 − k/10)^225, θ = 0.1, 0.2, 0.3, 0.4.

The posterior probability p(θ | data) has the values 3.5 × 10^{−12}, 0.4097, 0.5903, and 1.2 × 10^{−6} for θ = 0.1, 0.2, 0.3, and 0.4.

2.85 Let the random variable Θ represent the unknown win probability of Alassi. The prior of Θ is p_0(0.4) = p_0(0.5) = p_0(0.6) = 1/3. Let E be the event that Alassi wins the best-of-five contest. The likelihood function L(E | θ) is θ^3 + C(3, 2) θ^2 (1 − θ) θ + C(4, 2) θ^2 (1 − θ)^2 θ. The posterior probability p(θ | E) is proportional to p_0(θ) L(E | θ) and has the values 0.2116, 0.3333, and 0.4550 for θ = 0.4, 0.5, and 0.6.

2.86 The prior density of the unknown success probability is p_0(θ) = 1/101 for θ = 0, 0.01, ..., 0.99, 1. For a single observation, the prior is updated with the likelihood factor θ if the observation corresponds to a success of the new treatment and with 1 − θ otherwise. The first observation S leads to an update that is proportional to θ p_0(θ), the second observation S to an update that is proportional to θ^2 p_0(θ), the third observation F to an update that is proportional to θ^2 (1 − θ) p_0(θ), and so on; the tenth observation F leads to an update that is proportional to θ^2 (1 − θ) θ^2 (1 − θ) θ^3 (1 − θ) p_0(θ) = θ^7 (1 − θ)^3 p_0(θ). This is the same posterior as we found in Example 2.17, where we simultaneously used all observations.
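The discrete posterior of 2.85 can be reproduced exactly with rational arithmetic; a small sketch (my own, assuming the three-set/four-set/five-set decomposition of the best-of-five likelihood given above):

```python
from fractions import Fraction
from math import comb

def lik(th):
    # P(Alassi wins a best-of-five): win in 3, 4, or 5 sets
    return th**3 + comb(3, 2) * th**2 * (1 - th) * th + comb(4, 2) * th**2 * (1 - th)**2 * th

thetas = [Fraction(4, 10), Fraction(5, 10), Fraction(6, 10)]
weights = [lik(t) for t in thetas]          # the uniform prior 1/3 cancels out
total = sum(weights)
post = [w / total for w in weights]
print([round(float(p), 4) for p in post])   # [0.2116, 0.3333, 0.455]
```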
2.87 Let the random variable Θ be 1 if the student is unprepared for the exam, 2 if the student is half prepared, and 3 if the student is well prepared. The prior of Θ is p0(1) = 0.2, p0(2) = 0.3, and p0(3) = 0.5. Let E be the event that the student has answered 26 out of 50 questions correctly. The likelihood function is L(E | θ) = C(50, 26) a_θ^26 (1 − a_θ)^24, where a_1 = 1/3, a_2 = 0.45, and a_3 = 0.8. The posterior probability p(θ | E) is proportional to p0(θ) L(E | θ) and has the values 0.0268, 0.9730, and 0.0001 for θ = 1, 2, and 3.
2.88 Let the random variable Θ represent the unknown probability that a free throw of your friend will be successful. The prior probability p0(θ) = P(Θ = θ) has the values 0.2, 0.6, and 0.2 for θ = 0.25, 0.50, and 0.75. The posterior probability p(θ | data) = P(Θ = θ | data) is proportional to L(data | θ) p0(θ), where L(data | θ) = C(10, 7) θ^7 (1 − θ)^3. Hence the posterior probability is given by
p(θ | data) = θ^7 (1 − θ)^3 p0(θ) / [0.25^7 × 0.75^3 × 0.2 + 0.50^7 × 0.50^3 × 0.6 + 0.75^7 × 0.25^3 × 0.2]
for θ = 0.25, 0.50, and 0.75. The possible values 0.25, 0.50, and 0.75 for the success probability of the free throws of your friend have the posterior probabilities 0.0051, 0.5812, and 0.4137.
Chapter 3
3.1 Take as sample space the set consisting of the four outcomes (0, 0), (1, 0), (0, 1), and (1, 1), where the first (second) component is 0 if the first (second) student picked has not done homework and is 1 otherwise. The probabilities 5/20 × 4/19 = 1/19, 5/20 × 15/19 = 15/76, 15/20 × 5/19 = 15/76, and 15/20 × 14/19 = 21/38 are assigned to the outcomes (0, 0), (0, 1), (1, 0), and (1, 1). The random variable X takes on the value 0 for the outcome (1, 1), the value 1 for the outcomes (0, 1) and (1, 0), and the value 2 for the outcome (0, 0). Thus P(X = 0) = 21/38, P(X = 1) = 15/76 + 15/76 = 15/38, and P(X = 2) = 1/19.
3.2 The probability mass function of X can be calculated by using conditional probabilities.
Let Ai be the event that the ith person entering the room is the first person matching a birthday. Then P(X = i) = P(Ai) for i = 2, 3, . . . , 366. Using the chain rule for conditional probabilities, it follows that P(X = 2) = 1/365 and
P(X = i) = (364/365) × (363/365) × ··· × ((365 − i + 2)/365) × ((i − 1)/365) for i ≥ 3.
3.3 The random variable X can take on the values 0, 1, and 2. By the law of conditional probability, P(X = 0) = 1/3 × 0 + 2/3 × 1/4 = 1/6, P(X = 1) = 1/3 × 0 + 2/3 × 1/2 = 1/3, and P(X = 2) = 1/3 × 1 + 2/3 × 1/4 = 1/2.
3.4 Denote by the random variable X the number of prize winners. The random variable X takes on the value 1 if all three digits of the lottery number drawn are the same, the value 3 if exactly two of the three digits are the same, and the value 6 if all three digits are different. There are 10 lottery numbers for which all three digits are the same, and so P(X = 1) = 10/1,000. There are 10 × 9 × 8 = 720 lottery numbers with three different digits, and so P(X = 6) = 720/1,000. The probability P(X = 3) = 1 − P(X = 1) − P(X = 6) = 270/1,000.
3.5 The random variable X can take on the values 2, 3, and 4. Two tests are needed if the first two tests give the depleted batteries, while three tests are needed if the first three batteries tested are not depleted or if a second depleted battery is found at the third test. Thus, by the chain rule for conditional probabilities, P(X = 2) = 2/5 × 1/4 = 1/10 and P(X = 3) = 3/5 × 2/4 × 1/3 + 2/5 × 3/4 × 1/3 + 3/5 × 2/4 × 1/3 = 3/10. The probability P(X = 4) is calculated as 1 − P(X = 2) − P(X = 3) = 6/10.
3.6 For 1 ≤ k ≤ 4, you get 2^{k−1} × 10 dollars if the first k tosses are heads and the (k + 1)th toss is tails. You get 160 dollars if the first five tosses are heads. The probability mass function of X is P(X = 0) = 1/2, P(X = 10) = 1/4, P(X = 20) = 1/8, P(X = 40) = 1/16, P(X = 80) = 1/32, and P(X = 160) = 1/32.
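The chain-rule pmf of Problem 3.2 can be evaluated iteratively; the helper name below is mine, not the book's:

```python
# Iterative evaluation of the pmf in Problem 3.2: P(X = i) is the probability
# that the first i-1 birthdays are all distinct times the probability that the
# ith person matches one of them. (Helper name is mine.)
def first_match_pmf():
    pmf = {}
    no_match = 1.0          # P(first i-1 birthdays are all distinct)
    for i in range(2, 367):
        pmf[i] = no_match * (i - 1) / 365
        no_match *= (365 - (i - 1)) / 365
    return pmf

pmf = first_match_pmf()
```

Summing the pmf over i = 2, . . . , 366 gives 1, as it must, since a match is certain once 366 people have entered the room.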
3.7 The random variable X can take on the values 0, 5, 10, 15, 20, and 25. Using the chain rule for conditional probabilities, P(X = 0) = 4/7, P(X = 5) = 1/7 × 4/6, P(X = 10) = 2/7 × 4/6, P(X = 15) = 2/7 × 1/6 × 4/5 + 1/7 × 2/6 × 4/5, P(X = 20) = 2/7 × 1/6 × 4/5, and P(X = 25) = 2/7 × 1/6 × 1/5 + 1/7 × 2/6 × 1/5 + 2/7 × 1/6 × 1/5.
3.8 The sample space is the set {1, . . . , 10}, where the outcome i means that the card with number i is picked. Let X be your payoff. The random variable X can take on the values 0.5, 5, 6, 7, 8, 9, and 10. We have P(X = 0.5) = 4/10 and P(X = k) = 1/10 for 5 ≤ k ≤ 10. Therefore E(X) = 0.5 × 4/10 + sum_{k=5}^{10} k × 1/10 = 4.70 dollars.
3.9 The probability that a randomly chosen student belongs to a particular class gets larger when the class has more students. Therefore E(Y) > E(X). We have E(X) = 15 × 1/4 + 20 × 1/4 + 70 × 1/4 + 125 × 1/4 = 57.5 and E(Y) = 15 × 15/230 + 20 × 20/230 + 70 × 70/230 + 125 × 125/230 = 91.96.
3.10 Let the random variable X be the net cost of the risk reduction in dollars. The random variable X takes on the values 2,000, 2,000 − 5,000, and 2,000 − 10,000 with respective probabilities 0.75, 0.15, and 0.10. We have E(X) = 2,000 × 0.75 − 3,000 × 0.15 − 8,000 × 0.10 = 250 dollars.
3.11 Let the random variable X be the amount of money you win. Then P(X = m) = C(10, m)/C(12, m) and P(X = 0) = 1 − P(X = m). This gives E(X) = m × C(10, m)/C(12, m). This expression is maximal for m = 4, with E(X) = 56/33.
3.12 Using conditional probabilities, your probability of winning is 4/6 × 3/5 = 2/5. Hence your expected winnings are 2/5 × 1.25 − 3/5 × 1 = −0.10 dollars. This is not a fair bet.
3.13 Put for abbreviation c_k = C(52, k − 1). Using the second argument from Example 3.2, we get P(X2 = k) = (1/c_k) C(4, 1) C(48, k − 2) × 3/(52 − (k − 1)), P(X3 = k) = (1/c_k) C(4, 2) C(48, k − 3) × 2/(52 − (k − 1)), and P(X4 = k) = (1/c_k) C(4, 3) C(48, k − 4) × 1/(52 − (k − 1)). This leads to E(X2) = 21.2, E(X3) = 31.8, and E(X4) = 42.4.
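The expected values in Problem 3.13 can be checked exactly from the pmf; the sketch below (function name is mine) computes E(Xj) for the position of the jth ace:

```python
from math import comb

# Exact E(Xj) for Problem 3.13: the jth ace is at position k when exactly j-1
# aces are among the first k-1 cards and an ace is then drawn at card k.
def expected_position(j):
    total = 0.0
    for k in range(j, 48 + j + 1):          # feasible positions of the jth ace
        p = (comb(4, j - 1) * comb(48, k - j) / comb(52, k - 1)
             * (4 - (j - 1)) / (52 - (k - 1)))
        total += k * p
    return total
```

This reproduces E(X2) = 21.2, E(X3) = 31.8, and E(X4) = 42.4, in agreement with the symmetry result E(Xj) = (53/5) × j.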
Intuitively, the result E(Xj) = ((52 + 1)/5) × j for 1 ≤ j ≤ 4 can be explained by a symmetry argument.
Note: More generally, suppose that balls are removed from a bowl containing r red and b blue balls, one at a time and at random. Then the expected number of picks until a red ball is removed for the jth time is ((r + b + 1)/(r + 1)) × j for 1 ≤ j ≤ r.
3.14 Let the random variable X denote the number of chips you get back in any given round. The possible values of X are 0, 2, and 5. The random variable X is defined on the sample space consisting of the 36 equiprobable outcomes (1, 1), (1, 2), . . . , (6, 6). Outcome (i, j) means that i points turn up on the first die and j points on the second die. The total of the two dice is 7 for the six outcomes (1, 6), (6, 1), (2, 5), (5, 2), (3, 4), and (4, 3). Thus P(X = 5) = 6/36. Similarly, P(X = 0) = 15/36 and P(X = 2) = 15/36. This gives
E(X) = 0 × 15/36 + 2 × 15/36 + 5 × 6/36 = 5/3.
You bet two chips each round. Thus, your average loss is 2 − 5/3 = 1/3 chip per round when you play the game over and over.
3.15 Let the random variable X be the total amount staked and the random variable Y be the amount won. The probability p_k that k bets will be placed is p_k = (19/37)^{k−1} × 18/37 for k = 1, . . . , 10 and p_11 = (19/37)^{10}. Thus
E(X) = sum_{k=1}^{10} (1 + 2 + ··· + 2^{k−1}) p_k + (1 + 2 + ··· + 2^9 + 1,000) p_11 = 12.583 dollars.
If the round goes to 11 bets, the player's loss is $23 if the 11th bet is won and $2,023 if the 11th bet is lost. Thus
E(Y) = 1 × (1 − p_11) − 23 × p_11 × 18/37 − 2,023 × p_11 × 19/37 = −0.3401 dollars.
The ratio of 0.3401 and 12.583 is in line with the house advantage of 2.70% of the casino.
3.16 Let the random variable X be the payoff of the game. Then P(X = 0) = (1/2)^m and P(X = 2^k) = (1/2)^{k−1} × 1/2 for k = 1, . . . , m. Therefore E(X) = sum_{k=1}^{m} 2^k × (1/2)^k = m.
3.17 Take as sample space the set {0} ∪ {(x, y) : x^2 + y^2 ≤ 25}, where 0 means that the dart has missed the target.
The score is a random variable X with P(X = 0) = 0.25, P(X = 5) = 0.75 × (25 − 9)/25 = 0.48, P(X = 8) = 0.75 × (9 − 1)/25 = 0.24, and P(X = 15) = 0.75 × 1/25 = 0.03. The expected value of the score is E(X) = 0.48 × 5 + 0.24 × 8 + 0.03 × 15 = 4.77.
3.18 Let the random variable X be your net winnings. The random variable X takes on the values −1, 0, and 10. We have P(X = 0) = C(4, 2) C(6, 1)/C(10, 3) = 3/10, P(X = 10) = C(4, 3)/C(10, 3) = 1/30, and P(X = −1) = 1 − P(X = 0) − P(X = 10) = 2/3. Thus
E(X) = −1 × 2/3 + 0 × 3/10 + 10 × 1/30 = −1/3.
3.19 Let the random variable X be the payoff when you go for a second spin given that the first spin showed a score of a points. Then P(X = a + k) = 1/1,000 for 1 ≤ k ≤ 1,000 − a and P(X = 0) = a/1,000. Thus
E(X) = (1/1,000) sum_{k=1}^{1,000−a} (a + k) = (1/2,000)(1,000 − a)(1,000 + a + 1).
The largest value of a for which E(X) > a is a* = 414. The optimal strategy is to stop after the first spin if this spin gives a score of more than 414 points.
3.20 The expected value of the number of chips you leave the casino with is U(a, b) = b × P(a, b) + 0 × (1 − P(a, b)). Since P(a, b) ≈ (q/p)^{−b} for a large, we get U(a, b) ≈ b × (q/p)^{−b}. It is a matter of simple algebra to verify that b × (q/p)^{−b} is maximal for b ≈ 1/ln(q/p). This gives
P(a, 1/ln(q/p)) ≈ 1/e and U(a, 1/ln(q/p)) ≈ 1/(e ln(q/p)).
3.21 Let the random variable X be the payoff of the game. Then, using conditional probabilities, P(X = 0) = (5/6)^3, P(X = 2) = C(3, 1) × 1/6 × (5/6)^2 × 4/5, P(X = 2.5) = C(3, 1) × 1/6 × (5/6)^2 × 1/5, P(X = 3) = C(3, 2) × (1/6)^2 × 5/6, and P(X = 4) = (1/6)^3. This gives E(X) = 0 × 125/216 + 2 × 60/216 + 2.5 × 15/216 + 3 × 15/216 + 4 × 1/216 = 0.956.
3.22 Let the random variable X be the payoff in the dice game. The random variable X can take on the values 0, 10, and 100. We have P(X = 100) = 6 × (1/6)^4 = 1/216, P(X = 10) = 3 × 3 × C(4, 2) × (1/6)^4 = 1/24, and P(X = 0) = 1 − P(X = 100) − P(X = 10) = 206/216. Therefore E(X) = 100/216 + 10/24 = 0.8796. The dice game is unfavorable to you.
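The expected payoff of the dice game in Problem 3.22 can be double-checked in exact rational arithmetic, using the outcome counts from the solution above (the variable names are mine):

```python
from fractions import Fraction

# Expected payoff of the dice game in Problem 3.22, computed exactly from the
# pmf derived above. Outcome counts are taken from the solution, out of 6^4
# equally likely rolls of four dice.
p100 = Fraction(6, 6**4)          # 6 favorable outcomes pay 100
p10 = Fraction(3 * 3 * 6, 6**4)   # 3 x 3 x C(4,2) favorable outcomes pay 10
p0 = 1 - p100 - p10
expected = 100 * p100 + 10 * p10 + 0 * p0
```

The result, 95/108 = 0.8796..., matches the value stated above.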
Let Y be the payoff in the coin-tossing game. Then P(Y = 0) = 1/2, P(Y = i) = (1/2)^{i+1} for 1 ≤ i ≤ 4, and P(Y = 30) = (1/2)^5. Therefore E(Y) = sum_{i=1}^{4} i × (1/2)^{i+1} + 30 × (1/2)^5 = 1.75. The coin-tossing game is also unfavorable to you.
3.23 Let the random variable X be the payoff of the game. The probability that X takes on the value k with k < m is the same as the probability that a randomly chosen number from the interval (0, 1) falls either into the subinterval (1/(k+2), 1/(k+1)) or into the subinterval (1 − 1/(k+1), 1 − 1/(k+2)). Thus P(X = k) = 2 × (1/(k+1) − 1/(k+2)) for 1 ≤ k ≤ m − 1 and P(X = m) = 2/(m+1). The stake should be E(X) = 2 × (1/2 + ··· + 1/(m+1)).
3.24 Suppose that your current total is i points. If you decide to roll the die once more and then stop, the expected value of the change of your total is
(1/6) sum_{k=2}^{6} k − (1/6) × i = 20/6 − i/6.
Therefore the one-stage-look-ahead rule prescribes to stop as soon as your total is 20 points or more. This stopping rule is optimal.
3.25 Suppose you have rolled a total of i < 10 points so far. The expected value of the change of your current total is
sum_{k=1}^{10−i} k × 1/6 − i × (i − 4)/6 = (1/12)(10 − i)(10 − i + 1) − (1/6) i (i − 4)
if you decide to continue for one more roll. This expression is positive for i ≤ 5 and negative for i ≥ 6. Thus, the one-stage-look-ahead rule prescribes to stop as soon as the total number of points rolled is 6 or more. This rule maximizes the expected value of your reward.
3.26 Suppose that at a given moment there are i0 empty bins and i1 bins with exactly one ball (and 25 − i0 − i1 bins with two or more balls). If you decide to drop one more ball before stopping rather than stopping immediately, the expected value of the change of your net winnings is (i0/25) × 1 − (i1/25) × 1.50. The one-stage-look-ahead rule prescribes to stop in the states (i0, i1) with i0 − 1.5 i1 ≤ 0 and to continue otherwise.
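The sign pattern that determines the stopping threshold in Problem 3.25 above is easy to tabulate numerically (names below are mine, not the book's):

```python
# One-stage-look-ahead check for Problem 3.25: with a current total of i points
# (4 <= i <= 9), rolling once more gains k for k = 1, ..., 10-i, while each of
# the i-4 remaining faces would push the total past 10 and forfeit all i points.
def expected_change(i):
    gain = sum(k / 6 for k in range(1, 10 - i + 1))
    loss = i * (i - 4) / 6
    return gain - loss

changes = {i: expected_change(i) for i in range(4, 10)}
```

The change is positive for i = 4, 5 and negative for i = 6, . . . , 9, so stopping at a total of 6 or more is indeed the one-stage-look-ahead rule.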
For the case that you lose ½k dollars for every bin containing k ≥ 2 balls, the expected value of the change of your net winnings is
(i0/25) × 1 − (i1/25) × 2 − ((25 − i0 − i1)/25) × 0.50
when you decide to drop one more ball before stopping rather than stopping immediately. Therefore, the one-stage-look-ahead rule prescribes to stop in the states (i0, i1) with 3 i0 − 3 i1 − 25 ≤ 0 and to continue otherwise.
3.27 Suppose that you have gathered so far a dollars. Then there are still w − a white balls in the bowl. If you decide to pick one more ball, the expected change of your bankroll is
(w − a)/(r + w − a) × 1 − r/(r + w − a) × a.
This expression is less than or equal to zero for a ≥ w/(r + 1). It is optimal to stop as soon as you have gathered at least w/(r + 1) dollars.
3.28 Suppose your current total is i points. If you decide to continue for one more roll, the expected value of the change of your dollar amount is
3 × 2/36 + 4 × 2/36 + 5 × 4/36 + 6 × 4/36 + 7 × 6/36 + 8 × 4/36 + 9 × 4/36 + 10 × 2/36 + 11 × 2/36 − (1/6) × i = 210/36 − i/6.
The smallest value of i for which the expected change is less than or equal to 0 is i = 35. The one-stage-look-ahead rule prescribes to stop as soon as you have gathered 35 or more points. This rule is optimal among all conceivable stopping rules.
Note: The maximal expected reward is 14.22 dollars, see Problem 7.52.
3.29 (a) Writing sum_{k=0}^{∞} P(X > k) = sum_{k=0}^{∞} sum_{j=k+1}^{∞} P(X = j) and interchanging the order of summation give
sum_{k=0}^{∞} P(X > k) = sum_{j=0}^{∞} sum_{k=0}^{j−1} P(X = j) = sum_{j=0}^{∞} j P(X = j),
and so sum_{k=0}^{∞} P(X > k) = E(X). The interchange of the order of summation is justified by the nonnegativity of the terms involved.
(b) Let X be the largest among the 10 random numbers. Then P(X > 0) = 1 and P(X > k) = 1 − (k/100)^{10} for 1 ≤ k ≤ 99. Thus E(X) = sum_{k=0}^{99} P(X > k) = 91.4008.
3.30 Define the random variable X as the number of floors on which the elevator will stop. Let the random variable Xj = 1 if the elevator does not stop on floor j and Xj = 0 otherwise.
Then X = r − sum_{j=1}^{r} Xj. We have P(Xj = 1) = ((r − 1)/r)^m and so E(Xj) = ((r − 1)/r)^m for all j = 1, 2, . . . , r. Hence
E(X) = r − sum_{j=1}^{r} E(Xj) = r × (1 − ((r − 1)/r)^m).
Note: This problem is an instance of the balls-and-bins model from Example 3.8.
3.31 Let Ik = I(Ak). Then P(A1^c ∩ ··· ∩ An^c) = E[(1 − I1) ··· (1 − In)]. We have that
(1 − I1) ··· (1 − In) = 1 − sum_{j=1}^{n} Ij + sum_{j<k} Ij Ik − ··· + (−1)^n I1 ··· In.
Taking the expected value of both sides, noting that E(I_{i1} ··· I_{ir}) is given by P(A_{i1} ∩ ··· ∩ A_{ir}), and using P(∩_{k=1}^{n} Ak^c) = 1 − P(∪_{k=1}^{n} Ak), we get the inclusion-exclusion formula.
3.32 Let the random variable X be the number of times that two adjacent letters are the same in a random permutation of the word Mississippi. Then X can be represented as X = sum_{j=1}^{10} Xj, where the random variable Xj equals 1 if the letters in positions j and j + 1 are the same in the random permutation and equals 0 otherwise. By numbering the eleven letters of the word Mississippi as 1, 2, . . . , 11, it is easily seen that
P(Xj = 1) = (4 × 3 × 9! + 4 × 3 × 9! + 2 × 1 × 9!)/11! = 26/110
for all j (alternatively, using conditional probabilities, P(Xj = 1) can be calculated as 4/11 × 3/10 + 4/11 × 3/10 + 2/11 × 1/10 = 26/110). Hence, using the linearity of the expectation operator,
E(X) = sum_{j=1}^{10} E(Xj) = 10 × 26/110 = 2.364.
3.33 Let the indicator variable Ik be equal to 1 if the kth team has a married couple and zero otherwise. Then P(Ik = 1) = (12 × 22)/C(24, 3) = 3/23 for any k. The expected number of teams with a married couple is sum_{k=1}^{8} E(Ik) = 24/23.
Note: As a sanity check, the probability that a given team has no married couple is 22/23 × 20/22 = 20/23, by using the chain rule for conditional probabilities.
3.34 It is no restriction to assume that n is even. Let the indicator variable I_{2j} be 1 if the random walk returns to the zero level at time 2j and 0 otherwise.
Then, by the linearity of the expectation operator, the expected number of returns to the zero level during the first n time units is sum_{j=1}^{n/2} E(I_{2j}). Let p_j = P(I_{2j} = 1); then E(I_{2j}) = p_j. In Example 1.4, it is shown that p_j ≈ 1/sqrt(πj) for j large. Hence sum_{j=1}^{n/2} E(I_{2j}) ≈ sqrt(2n/π) for n large.
3.35 Let Ik be 1 if two balls of the same color are removed on the kth pick and 0 otherwise. The expected number of times that you pick two balls of the same color is sum_{k=1}^{r+b} E(Ik). By a symmetry argument, each Ik has the same distribution as I1 (the order in which you draw the pairs of balls does not matter; for all practical purposes the kth pair can be considered as the first pair). Since
P(I1 = 1) = (2r/(2r + 2b)) × ((2r − 1)/(2r + 2b − 1)) + (2b/(2r + 2b)) × ((2b − 1)/(2r + 2b − 1)),
the expected number of times that you pick two balls of the same color is [r(2r − 1) + b(2b − 1)]/(2r + 2b − 1).
3.36 (a) Let Xi be equal to 1 if there is a birthday on day i and 0 otherwise. For each i, P(Xi = 0) = (364/365)^100 and P(Xi = 1) = 1 − P(Xi = 0). The expected number of distinct birthdays is
E(sum_{i=1}^{365} Xi) = 365 × [1 − (364/365)^100] = 87.6.
(b) Let Xi be equal to 1 if some child in the second class shares the birthday of the ith child in the first class and Xi be zero otherwise. Then P(Xi = 1) = 1 − (364/365)^s for all i. The expected number of children in the first class sharing a birthday with some child in the other class is
E(sum_{i=1}^{r} Xi) = r × [1 − (364/365)^s].
3.37 Let the indicator variable Is be 1 if item s belongs to T after n steps and 0 otherwise. Then P(Is = 0) = ((n − 1)/n)^n and so E(Is) = 1 − (1 − 1/n)^n. Thus the expected value of the number of distinct items in T after n steps is n × (1 − (1 − 1/n)^n). Note that the expected value is about n(1 − 1/e) for n large.
3.38 Let the random variable Xi be 1 if the ith person survives the first round and Xi be zero otherwise.
The random variable Xi takes on the value 1 if and only if nobody of the other n − 1 persons shoots at person i. Hence
P(Xi = 1) = ((n − 2)/(n − 1)) × ··· × ((n − 2)/(n − 1)) = ((n − 2)/(n − 1))^{n−1}
for all 1 ≤ i ≤ n. The expected value of the number of people who survive the first round is
E(sum_{i=1}^{n} Xi) = n × ((n − 2)/(n − 1))^{n−1} = n × (1 − 1/(n − 1))^{n−1}.
This expected value can be approximated by n/e for n large, where e = 2.71828 . . ..
3.39 Label the white balls as 1, . . . , w. Let the indicator variable Ik be equal to 1 if the white ball with label k remains in the bag when you stop and 0 otherwise. To find P(Ik = 1), you can discard the other white balls. Therefore P(Ik = 1) = 1/(r + 1). The expected number of remaining white balls is sum_{k=1}^{w} E(Ik) = w/(r + 1).
3.40 Number the 25 persons as 1, 2, . . . , 25 and let person 1 be the originator of the rumor. Let the random variable Xk be equal to 1 if person k hears about the rumor, for 2 ≤ k ≤ 25. For fixed k with 2 ≤ k ≤ 25, let Aj be the event that person k hears about the rumor for the first time when the rumor is told for the jth time. Then E(Xk) = P(Xk = 1) = sum_{j=1}^{10} P(Aj), where P(A1) = 1/24 and
P(Aj) = (1 − 1/24) × (1/23) × (1 − 1/23)^{j−2} for 2 ≤ j ≤ 10.
This gives
E(Xk) = 1/24 + sum_{j=2}^{10} (23/24) × (1/23) × (22/23)^{j−2} = 0.35765
for 2 ≤ k ≤ 25. Hence the expected value of the number of persons who know about the rumor is
1 + sum_{k=2}^{25} E(Xk) = 1 + 24 × 0.35765 = 9.584.
3.41 Let the indicator variable Ik be equal to 1 if the numbers k and k + 1 both appear in the lotto drawing and 0 otherwise. Then P(Ik = 1) = C(43, 4)/C(45, 6). The expected number of pairs of consecutive numbers is sum_{k=1}^{44} E(Ik) = 2/3.
3.42 For any k ≥ 2, let Xk be the amount you get at the kth game. Then the total amount you will get is sum_{k=2}^{s} Xk. By the linearity of the expectation operator, the expected value of the total amount you will get is sum_{k=2}^{s} E(Xk). Since P(Xk = 1) = 1/(k(k − 1)) and P(Xk = 0) = 1 − P(Xk = 1), it follows that E(Xk) = 1/(k(k − 1)) for k = 2, . . . , s.
Hence the expected value of the total amount you will get is equal to
sum_{k=2}^{s} 1/(k(k − 1)) = (s − 1)/s.
The fact that the sum equals (s − 1)/s is easily verified by induction.
3.43 Let the random variable Xi be equal to 1 if box i contains more than 3 apples and Xi be equal to 0 otherwise. Then
P(Xi = 1) = sum_{k=4}^{25} C(25, k) (1/10)^k (9/10)^{25−k} = 0.2364,
and so E(Xi) = 0.2364. Thus the expected value of the number of boxes containing more than 3 apples is given by E(sum_{k=1}^{10} Xk) = 10 × 0.2364 = 2.364.
3.44 In Problem 1.73 the reader was asked to prove that the probability that a particular number r belongs to a cycle of length k is 1/n, regardless of the value of k (this result can also be proved by using conditional probabilities: ((n − 1)/n) × ((n − 2)/(n − 1)) × ··· × ((n − k + 1)/(n − k + 2)) × (1/(n − k + 1)) = 1/n). Thus, for fixed k, E(Xr) = 1/n for r = 1, . . . , n. Therefore the expected number of cycles of length k is
E((1/k) sum_{r=1}^{n} Xr) = (1/k) sum_{r=1}^{n} E(Xr) = 1/k
for any 1 ≤ k ≤ n. This shows that the expected value of the total number of cycles is
sum_{k=1}^{n} 1/k.
This sum can be approximated by ln(n) + γ for n sufficiently large, where γ = 0.57722 . . . is Euler's constant.
3.45 For any i ≠ j, let Xij = 1 if the integers i and j are switched in the random permutation and Xij = 0 otherwise. Then P(Xij = 1) = (n − 2)!/n!. The expected number of interchanges is
sum_{i=1}^{n−1} sum_{j=i+1}^{n} E(Xij) = C(n, 2) × 1/(n(n − 1)) = 1/2.
3.46 Let X be the payoff in dollars for investment A and Y be the payoff in dollars for investment B. Then E(X) = 0.20 × 1,000 + 0.40 × 2,000 + 0.40 × 3,000 = 2,200. Also, we have E(X^2) = 0.20 × 1,000^2 + 0.40 × 2,000^2 + 0.40 × 3,000^2 = 5,400,000. Thus σ^2(X) = 5,400,000 − 2,200^2 = 560,000 and so σ(X) = 748.33. In the same way, we get E(Y) = 2,200, E(Y^2) = 5,065,000, σ^2(Y) = 225,000, and σ(Y) = 474.34.
3.47 Using the basic sums sum_{k=1}^{n} k = n(n + 1)/2 and sum_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6, it follows that E(X) and E(X^2) are (a + b)/2 and (2a^2 + 2ab − a + 2b^2 + b)/6, which leads to var(X) = (a^2 − 2ab − 2a + b^2 + 2b)/12.
3.48 Since E(X) = pa + (1 − p)b and E(X^2) = pa^2 + (1 − p)b^2, we have
var(X) = pa^2 + (1 − p)b^2 − (pa + (1 − p)b)^2.
It is a matter of simple algebra to get var(X) = p(1 − p)(a − b)^2. Then, by noting that p(1 − p) is maximal for p = 1/2, the desired result follows.
Note: We have var(X) ≤ (1/4)(a − b)^2 for any discrete random variable X that is concentrated on the integers a, a + 1, . . . , b.
3.49 Let Ik be 1 if the kth team has a married couple and 0 otherwise. Let X = sum_{k=1}^{8} Ik. Then E(X^2) = sum_{k=1}^{8} E(Ik^2) + 2 sum_{j=1}^{7} sum_{k=j+1}^{8} E(Ij Ik). We have
E(Ik^2) = P(Ik = 1) = (12 × 22)/C(24, 3) = 3/23 for all k
and
E(Ij Ik) = P(Ij = 1, Ik = 1) = (12 × 22)/C(24, 3) × (10 × 19)/C(21, 3) = 3/161 for all j ≠ k.
This gives E(X) = 24/23, E(X^2) = 48/23, and σ(X) = 0.999.
3.50 Define X as the number of integers that do not show up in the 15 lotto drawings. Let Xi = 1 if the number i does not show up in the 15 lotto drawings and Xi = 0 otherwise. Then X = sum_{i=1}^{45} Xi. The probability that a specified number does not show up in any given drawing is C(44, 6)/C(45, 6) = 39/45. Hence E(Xi) = P(Xi = 1) = (39/45)^15 and so
E(X) = 45 × (39/45)^15 = 5.2601.
The probability that two specified numbers i and j with i ≠ j do not show up in any given drawing is C(43, 6)/C(45, 6) = (39/45) × (38/44). Hence E(Xi Xj) = P(Xi = 1, Xj = 1) = [(39 × 38)/(45 × 44)]^15 and so
E(X^2) = 45 × (39/45)^15 + 2 × C(45, 2) × [(39 × 38)/(45 × 44)]^15 = 30.9292.
This leads to σ(X) = 1.8057.
3.51 By the substitution rule, we have
E(X) = sum_{k=1}^{10} k (11 − k)/55 = 4 and E(X^2) = sum_{k=1}^{10} k^2 (11 − k)/55 = 22,
and so σ(X) = sqrt(22 − 16) = 2.449. The number of reimbursed treatments is Y = g(X), where g(x) = min(x, 5). Then, by applying the substitution rule,
E(Y) = sum_{k=1}^{4} k (11 − k)/55 + 5 sum_{k=5}^{10} (11 − k)/55 = 37/11
and
E(Y^2) = sum_{k=1}^{4} k^2 (11 − k)/55 + 25 sum_{k=5}^{10} (11 − k)/55 = 151/11.
The standard deviation is σ(Y) = sqrt(151/11 − (37/11)^2) = 1.553.
3.52 Let V be the stock left over and W be the amount of unsatisfied demand.
We have V = g1(X) and W = g2(X), where the functions g1(x) and g2(x) are given by g1(x) = Q − x for x < Q and g1(x) = 0 otherwise, and g2(x) = x − Q for x > Q and g2(x) = 0 otherwise. By the substitution rule,
E(V) = sum_{k=0}^{Q−1} (Q − k) p_k and E(W) = sum_{k=Q+1}^{∞} (k − Q) p_k.
Note: Writing sum_{k=Q+1}^{∞} (k − Q) p_k as sum_{k=0}^{∞} (k − Q) p_k − sum_{k=0}^{Q} (k − Q) p_k, it follows that E(W) = μ − Q + E(V), where μ is the expected demand.
3.53 Let the random variable X be the number of repairs that will be necessary in the coming year and Y be the maintenance costs in excess of the prepaid costs. Then Y = g(X), where the function g(x) = 100(x − 155) for x > 155 and g(x) = 0 otherwise. By the substitution rule,
E(Y) = sum_{x=156}^{∞} 100(x − 155) e^{−150} 150^x/x! = 280.995
and
E(Y^2) = sum_{x=156}^{∞} 100^2 (x − 155)^2 e^{−150} 150^x/x! = 387,929.
The standard deviation of Y is sqrt(E(Y^2) − E^2(Y)) = 555.85.
3.54 Let the random variable X be the monthly demand for the medicine and Y be the net profit in any given month. Then Y = g(X), where the function g(x) is given by g(x) = 400x − 800 for 3 ≤ x ≤ 8 and g(x) = 400x − 800 − 350(x − 8) for x > 8. By the substitution rule,
E(Y) = sum_{x=3}^{10} g(x) P(X = x) and E(Y^2) = sum_{x=3}^{10} [g(x)]^2 P(X = x).
The standard deviation of Y is sqrt(E(Y^2) − E^2(Y)). Substituting the values of g(x) and P(X = x), it follows after some calculations that the expected value and the standard deviation of the monthly net profit Y = g(X) are given by $1,227.50 and $711.24.
3.55 Your probability of winning the contest is
sum_{k=0}^{n} (1/(k + 1)) P(X = k) = E[1/(1 + X)],
by the law of conditional probability. The random variable X can be written as X = X1 + ··· + Xn, where Xi is equal to 1 if the ith person in the first round survives this round and 0 otherwise. Since E(Xi) = 1/n for all i, we have E(X) = 1. Thus, by Jensen's inequality,
E[1/(1 + X)] ≥ 1/(1 + E(X)) = 1/2.
3.56 By the linearity of the expectation operator, E[(X1 + ··· + Xn)^2 + (Y1 + ··· + Yn)^2] is equal to E[(X1 + ··· + Xn)^2] + E[(Y1 + ··· + Yn)^2]. Using the algebraic formula (a1 + ··· + an)^2 = sum_{i=1}^{n} ai^2 + 2 sum_{i=1}^{n−1} sum_{j=i+1}^{n} ai aj and again the linearity of the expectation operator, it follows that
E[(X1 + ··· + Xn)^2] = sum_{i=1}^{n} E(Xi^2) + 2 sum_{i=1}^{n−1} sum_{j=i+1}^{n} E(Xi Xj) = sum_{i=1}^{n} E(Xi^2) + 2 sum_{i=1}^{n−1} sum_{j=i+1}^{n} E(Xi)E(Xj),
where the last equality uses the fact that E(Xi Xj) = E(Xi)E(Xj) by the independence of Xi and Xj for i ≠ j. For each i, E(Xi) = 1 × 1/4 + (−1) × 1/4 = 0 and E(Xi^2) = 1^2 × 1/4 + (−1)^2 × 1/4 = 1/2. This gives
E[(X1 + ··· + Xn)^2] = n/2.
In the same way, E[(Y1 + ··· + Yn)^2] = n/2. Hence we find the interesting result that the expected value of the squared distance between the drunkard's position after n steps and his starting position is equal to n for any value of n.
Note: It is not true that the expected value of the distance between the drunkard's position after n steps and his starting position equals sqrt(n). Otherwise, the variance of the distance would be zero and so the distance would exhibit no variability, but this cannot be true.
3.57 For x, y ∈ {−1, 1}, we have P(X = x, Y = y) = P(X = x | Y = y)P(Y = y) and P(X = x | Y = y) = P(Z = x/y | Y = y). Since Y and Z are independent, P(Z = x/y | Y = y) = P(Z = x/y) = 0.5. This gives P(X = x, Y = y) = 0.5 × P(Y = y) for all x, y ∈ {−1, 1}. Also, P(X = 1) = P(Y = 1, Z = 1) + P(Y = −1, Z = −1). Thus, by the independence of Y and Z, we get P(X = 1) = 0.25 + 0.25 = 0.5 and so P(X = −1) = 0.5. Therefore, the result P(X = x, Y = y) = 0.5 × P(Y = y) implies P(X = x, Y = y) = P(X = x)P(Y = y), proving that X and Y are independent. However, X is not independent of Y + Z. To see this, note that P(X = 1, Y + Z = 0) = 0 and P(X = 1)P(Y + Z = 0) > 0.
3.59 Noting that Xi = X_{i−1} + Ri, we get Xi = R2 + ··· + Ri for 2 ≤ i ≤ 10. This implies sum_{i=2}^{10} Xi = sum_{k=2}^{10} (11 − k) Rk. Since P(Rk = 0) = P(Rk = 1) = 1/2, we have E(Rk) = 1/2 and σ^2(Rk) = 1/4.
The random variables Rk are independent and so, by Rules 3.1 and 3.9,
E(sum_{i=2}^{10} Xi) = sum_{k=2}^{10} (11 − k) E(Rk) = 22.5
and
σ^2(sum_{i=2}^{10} Xi) = sum_{k=2}^{10} (11 − k)^2 σ^2(Rk) = 71.25.
3.60 By the convolution formula, P(X + Y = k) = sum_{j=1}^{k−1} (1 − p)^{j−1} p (1 − p)^{k−j−1} p for k = 2, 3, . . .. This leads to
P(X + Y = k) = (k − 1) p^2 (1 − p)^{k−2} for k = 2, 3, . . . .
3.61 We have E(sum_{k=1}^{∞} Xk Ik) = sum_{k=1}^{∞} E(Xk Ik), since it is always allowed to interchange expectation and summation for nonnegative random variables. Since Xk and Ik are independent, E(Xk Ik) = E(Xk)E(Ik) for any k ≥ 1. Also, E(Ik) = P(N ≥ k). Thus,
E(sum_{k=1}^{∞} Xk Ik) = E(X1) sum_{k=1}^{∞} P(N ≥ k) = E(X1)E(N),
using the fact that E(N) = sum_{n=0}^{∞} P(N > n), see Problem 3.29.
Note: If the Xk are not nonnegative, the proof still applies in view of the fact that E(sum_{k=1}^{∞} Yk) = sum_{k=1}^{∞} E(Yk) when sum_{k=1}^{∞} E(|Yk|) < ∞. The proof of this so-called dominated convergence result can be found in advanced texts.
3.62 Let the random variable X be the number of passengers showing up. Then X is binomially distributed with parameters n = 160 and p = 0.9. The overbooking probability is
sum_{k=151}^{160} C(160, k) 0.9^k (0.1)^{160−k} = 0.0359.
Denote by the random variable R the daily return. Using the substitution rule, the expected value of R is calculated as
E(R) = sum_{k=0}^{150} [75k + 37.5(160 − k)] C(160, k) 0.9^k (0.1)^{160−k} + sum_{k=151}^{160} [75 × 150 + 37.5(160 − k) − 425(k − 150)] C(160, k) 0.9^k (0.1)^{160−k}.
This gives E(R) = $11,367.63. Also, by the substitution rule, E(R^2) is calculated as
E(R^2) = sum_{k=0}^{150} [75k + 37.5(160 − k)]^2 C(160, k) 0.9^k (0.1)^{160−k} + sum_{k=151}^{160} [75 × 150 + 37.5(160 − k) − 425(k − 150)]^2 C(160, k) 0.9^k (0.1)^{160−k}.
Hence the standard deviation of R equals sqrt(E(R^2) − E^2(R)) = $194.71.
3.63 The probability
P_r = sum_{k=r}^{6r} C(6r, k) (1/6)^k (5/6)^{6r−k}
of getting at least r sixes in one throw of 6r dice has the values 0.6651, 0.6187, and 0.5973 for r = 1, 2, and 3. Thus it is best to throw 6 dice. Pepys believed that it was best to throw 18 dice.
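The Newton-Pepys probabilities in Problem 3.63 are cheap to compute directly (the helper name below is mine):

```python
from math import comb

# P_r = P(at least r sixes in one throw of 6r fair dice), as in Problem 3.63.
def pepys(r):
    n = 6 * r
    return sum(comb(n, k) * (1 / 6) ** k * (5 / 6) ** (n - k)
               for k in range(r, n + 1))
```

This gives P_1 = 0.6651, P_2 = 0.6187, and P_3 = 0.5973, confirming that throwing 6 dice is best.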
Note: The probability P_r is decreasing in r and tends to 1/2 as r → ∞.
3.64 If the competition had continued, the Yankees would have won with probability sum_{k=2}^{4} C(4, k) (0.5)^k (0.5)^{4−k} = 11/16. The prize money should be divided between the Yankees and the Mets in the proportion 11:5.
3.65 This question can be translated into the question of what the probability is of getting 57 or fewer heads in 199 tosses of a fair coin. A binomial random variable with parameters n = 199 and p = 0.5 has expected value 199 × 0.5 = 99.5 and standard deviation 0.5 × sqrt(199) = 7.053. Thus the observed number of polio cases in the treatment group is more than six standard deviations below the expected number. Without doing any further calculations, we can say that the probability of this occurring is extremely small (the precise value of the probability is 7.4 × 10^{−10}). This makes clear that there is overwhelming evidence that the vaccine does work.
3.66 Let X be the number of beans that will come up white and Y be the number of points gained by the bean thrower. Then P(X = k) = C(8, k) (0.5)^8 for k = 0, 1, . . . , 8. We have P(Y = 1) = sum_{k=0}^{3} P(X = 2k + 1) = 0.5, P(Y = 2) = P(X = 0) + P(X = 8) = 1/128, and P(Y = −1) = 1 − P(Y = 1) − P(Y = 2) = 63/128. This gives E(Y) = 3/128. Thus the bean thrower has a slight advantage.
3.67 Using the law of conditional probability, we have that the sought probability is given by
sum_{k=0}^{n} (k/n) C(n, k) p^k (1 − p)^{n−k}.
This probability is nothing else than (1/n)E(X), where X is binomially distributed with parameters n and p. Thus the probability that you will be admitted to the program is np/n = p.
3.68 Let X and Y be the numbers of successful penalty kicks of the two teams. The independent random variables X and Y are binomially distributed with parameters n = 5 and p = 0.7. The probability of a tie is sum_{k=0}^{5} P(X = k, Y = k).
Using the independence of X and Y , this probability can be evaluated as 5 X 5 k=0 k k 5−k 0.7 0.3 5 × 0.7k 0.35−k = 0.2716. k 3.69 Let the random variable X be the number of coins that will be set aside. Then the random variable 100 − X is binomially distributed with parameters n = 100 and p = 81 . Therefore P (X = k) = 1 100−k 7 k 100 ( 8 ) for k = 0, 1, . . . , 100. 100−k ( 8 ) 3.70 Using only the expected value and the standard deviation of the binomial distribution, you can see that is highly unlikely that the medium 71 has to be paid out. The expected value and the standard deviation of the binomial distribution with parameters n = 250 and p = 15 are q 50 and 250 × 51 × 54 = 6.32. Thus the requirement of 82 or more correct answers means more than five standard deviations above the expected value, which has a negligible probability. Note: The psychic who took the challenge of the famous scientific skeptic James Randi was able to get only fifty predictions correct. 3.71 Let the random variable X be the number of rounds you get cards with an ace. If the cards are well-shuffled each time, then X is binomially 52 48 distributed with parameters n = 10 and p = 1 − 13 / 13 . The answer to the question should be based on P (X ≤ 2) = 2 X 10 k=0 k pk (1 − p)10−k = 0.0017. This small probability is a strong indication that the cards were not well-shuffled. 3.72 By the same argument as used for the binomial distribution, P (X1 = x1 , X2 = x2 , . . . , Xr = xr ) is given by n n − x1 n − x1 − · · · − xr−1 x1 ··· p1 · · · pxr r . x1 x2 xr Using this expression, the result follows. 3.73 Let Xi be the number of times that image i shows up in the roll of the five poker dice. Then (X1 , X2 , . . . , X6 ) has a multinomial distribution with parameters n = 5 and p1 = p2 = · · · = p6 = 61 . Let X be the payoff to the player for each unit staked. Then E(X) = 3 × P (X1 ≥ 1, X2 ≥ 1). This gives E(X) = 3 4 5−x X1 X x1 =1 x2 =1 5! px1 px2 (1 − p1 − p2 )5−x1 −x2 x1 !x2 !(5 − x1 − x2 )! 
= 0.9838.$$

3.74 The number of winners in any month has a binomial distribution with parameters $n = 200$ and $p = \frac{50}{450{,}000}$. This distribution can be very well approximated by a Poisson distribution with an expected value of $\lambda = 200 \times \frac{50}{450{,}000} = \frac{1}{45}$. The monthly amount the corporation will have to give away is \$0 with probability $e^{-\lambda} = 0.9780$, \$25,000 with probability $e^{-\lambda}\lambda = 0.0217$, and \$50,000 with probability $e^{-\lambda}\frac{\lambda^2}{2!} = 2.4 \times 10^{-4}$.

3.75 Let the random variable $X$ be the number of king's rolls. Then $X$ has a binomial distribution with parameters $n = 4 \times 6^{r-1}$ and $p = \frac{1}{6^r}$. This distribution converges to a Poisson distribution with expected value $np = \frac{2}{3}$ as $r \to \infty$. The binomial probability $1 - (1-p)^n$ tends very fast to the Poisson probability $1 - e^{-2/3} = 0.48658$ as $r \to \infty$. The binomial probability has the values 0.51775, 0.49140, 0.48738, 0.48660, and 0.48658 for $r = 1, 2, 3, 5$, and 7.

3.76 An appropriate model is the Poisson model. We have the situation of 500 independent trials, each having a success probability of $\frac{1}{365}$. The number of marriages having the feature that both partners are born on the same day is approximately distributed as a Poisson random variable with expected value $\frac{500}{365} = 1.3699$. The approximate Poisson probabilities are 0.2541, 0.3481, 0.2385, 0.1089, 0.0373, and 0.0102 for 0, 1, 2, 3, 4, and 5 matches, while the exact binomial probabilities are 0.2537, 0.3484, 0.2388, 0.1089, 0.0372, and 0.0101.

3.77 The Poisson model is an appropriate model. Using the fact that the sum of two independent Poisson random variables is again Poisson distributed, the sought probability is $1 - \sum_{k=0}^{10} e^{-6.2}\, 6.2^k/k! = 0.0514$.

3.78 It is reasonable to model the number of goals scored per team per game by a Poisson random variable $X$ with expected value $\frac{126}{2 \times 48} = 1.3125$. To see how good this fit is, we calculate $P(X = k)$ for $k = 0, 1, 2, 3$ and $P(X > 3)$. These probabilities have the values 0.2691, 0.3533, 0.2318, 0.1014, and 0.0417.
These values are close to the empirical probabilities 0.2708, 0.3542, 0.2500, 0.0833, and 0.0444.

3.79 Using the Poisson model, an estimate is $1 - \sum_{k=0}^{7} e^{-4.2}\, 4.2^k/k! = 0.064$.

3.80 The probability that you will win the jackpot in any given week by submitting 5 six-number sequences in the lottery 6/42 is $5/\binom{42}{6} = 9.531 \times 10^{-7}$. The number of times that you will win the jackpot in the next 312 drawings of the lottery can be modeled by a Poisson distribution with expected value $\lambda_0 = 312 \times 5/\binom{42}{6} = 2.9738 \times 10^{-4}$. Therefore
$$P(\text{you win the jackpot two or more times in the next 312 drawings}) = 1 - e^{-\lambda_0} - \lambda_0 e^{-\lambda_0} = 4.421 \times 10^{-8}.$$
Thus the number of people among the 100 million players who will win the jackpot two or more times in the coming three years can be modeled by a Poisson distribution with expected value $\lambda = 100{,}000{,}000 \times 4.421 \times 10^{-8} = 4.421$. The sought probability is $1 - e^{-\lambda} = 0.9880$.

3.81 Let $X$ be the number of weekly winners. An appropriate model for $X$ is the Poisson distribution with expected value 0.25. The standard deviation of this distribution is $\sqrt{0.25} = 0.5$. The observed number of winners lies $\frac{3 - 0.25}{0.5} = 5.5$ standard deviations above the expected value. Without doing any further calculations, we can say that the probability of three or more winners is quite small ($P(X \ge 3) = 2.2 \times 10^{-3}$).

3.82 (a) Suppose that $X$ has a Poisson distribution. By the substitution rule,
$$E[\lambda g(X+1) - X g(X)] = \sum_{k=0}^{\infty} \lambda g(k+1) e^{-\lambda} \frac{\lambda^k}{k!} - \sum_{k=0}^{\infty} k g(k) e^{-\lambda} \frac{\lambda^k}{k!} = \sum_{k=0}^{\infty} \lambda g(k+1) e^{-\lambda} \frac{\lambda^k}{k!} - \lambda \sum_{l=0}^{\infty} g(l+1) e^{-\lambda} \frac{\lambda^l}{l!} = 0,$$
where the second equality results from substituting $l = k - 1$ in the second sum.
(b) Let $p_j = P(X = j)$ for $j = 0, 1, \ldots$. For fixed $i \ge 1$, define the indicator function $g(x)$ by $g(k) = 1$ for $k = i$ and $g(k) = 0$ for $k \ne i$. Then the relation $E[\lambda g(X+1) - X g(X)] = 0$ reduces to $\lambda p_{i-1} - i p_i = 0$. This gives $p_i = \frac{\lambda}{i} p_{i-1}$ for $i \ge 1$. By repeated application of this equation, it next follows that $p_i = \frac{\lambda^i}{i!} p_0$ for $i \ge 0$. Using the fact that $\sum_{i=0}^{\infty} p_i = 1$, we get $p_0 = e^{-\lambda}$. This gives $P(X = i) = e^{-\lambda} \frac{\lambda^i}{i!}$
for $i = 0, 1, \ldots$, proving the desired result.

3.83 Translate the problem into a chance experiment with $\binom{25}{3} = 2{,}300$ trials. There is a trial for each possible combination of three people. Three given people have the same birthday with probability $(\frac{1}{365})^2$ and have birthdays falling within one day of each other with probability $7 \times (\frac{1}{365})^2$. The Poisson heuristic gives the approximations $1 - e^{-2{,}300 \times (1/365)^2} = 0.0171$ and $1 - e^{-2{,}300 \times 7/365^2} = 0.1138$ for the sought probabilities. Simulation shows that the approximate values are close to the exact values. In a simulation study we found the values 0.016 and 0.103.

3.84 Label the four suits as $i = 1, \ldots, 4$. Translate the problem into a chance experiment with 4 trials. The $i$th trial is said to be successful if suit $i$ is missing in the bridge hand. The success probability of each trial is $p = \binom{39}{13}/\binom{52}{13}$. The Poisson heuristic gives the approximation $1 - e^{-4p}$ to the probability that some suit will be missing in the bridge hand. The approximate value is 0.0499. The exact value can be obtained by the inclusion-exclusion formula and is 0.0511.

3.85 Translate the problem into a chance experiment with $n$ trials. The $i$th trial is said to be successful if couple $i$ is paired as bridge partners. The success probability of each trial is $p = \frac{1}{2n-1}$. Letting $\lambda = np = \frac{n}{2n-1}$, the Poisson heuristic gives the approximation $1 - e^{-n/(2n-1)}$ to the probability that no couple will be paired as bridge partners. For $n = 10$, the approximate value is 0.4092. This approximate value is quite close to the exact value 0.4088, which is obtained from the inclusion-exclusion method.

3.86 Imagine a chance experiment with 51 trials. In the $i$th trial the face values of the cards in positions $i$ and $i+1$ are compared. The trial is said to be successful if the face values are the same. The success probability is $\frac{3}{51}$. The sought probability is equal to the probability of no successful trial.
The latter probability can be approximated by the Poisson probability $e^{-51 \times (3/51)} = e^{-3} = 0.0498$. In a simulation study we found the value 0.045.

3.87 Translate the problem into a chance experiment with 365 trials. The $i$th trial is said to be successful if two or more of the 75 people have their birthdays on day $i$. The success probability is $p = 1 - (\frac{364}{365})^{75} - 75 \times \frac{1}{365} \times (\frac{364}{365})^{74}$. Letting $\lambda = 365p = 6.6603$, the Poisson heuristic gives the approximation $1 - \sum_{k=0}^{6} e^{-\lambda} \lambda^k/k! = 0.499$ to the probability that there are seven or more days such that on each of these days two or more people have their birthday. In a simulation study we found the value 0.516.

3.88 There are $\binom{n}{2}$ combinations of two different integers from 1 to $n$. Take a random permutation of the integers 1 to $n$. Imagine a chance experiment with $\binom{n}{2}$ trials, where the $i$th trial is said to be successful if the two integers involved in the trial have interchanged positions in the random permutation. The success probability of each trial is $\frac{(n-2)!}{n!} = \frac{1}{n(n-1)}$. The number of successful trials can be approximated by a Poisson distribution with expected value $\binom{n}{2}\frac{1}{n(n-1)} = \frac{1}{2}$. In particular, the probability of no successful trial is approximated by $e^{-\frac{1}{2}}$.

3.89 Think of a sequence of $2n$ trials with $n = 5$, where in each trial a person draws at random a card from the hat. The trial is said to be successful if the person draws the card with their own number or the card with the number of their spouse. The success probability of each trial is $p = \frac{2}{2n}$. The number of successes can be approximated by a Poisson distribution with expected value $\lambda = 2n \times p = 2$. In particular, the probability of no success can be approximated by $e^{-2} = 0.1353$. The exact value is 0.1213 when $n = 5$. This value is obtained from the exact formula
$$\frac{1}{(2n)!} \int_0^{\infty} (x^2 - 4x + 2)^n e^{-x}\, dx,$$
which is stated without proof.
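As a quick numerical cross-check of this exact formula (a sketch added here, not part of the original solution), one can expand $(x^2 - 4x + 2)^n$ and use $\int_0^{\infty} x^k e^{-x}\, dx = k!$ to evaluate the integral exactly:

```python
from math import factorial

# Sketch: evaluate the exact no-success probability of Problem 3.89,
#   (1/(2n)!) * integral_0^inf (x^2 - 4x + 2)^n e^{-x} dx,
# by expanding the polynomial and using integral_0^inf x^k e^{-x} dx = k!.
def no_match_prob(n):
    coeffs = [1]                       # coefficients of the expanded polynomial
    for _ in range(n):                 # multiply by (2 - 4x + x^2) n times
        new = [0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            for j, b in enumerate((2, -4, 1)):
                new[i + j] += c * b
        coeffs = new
    integral = sum(c * factorial(k) for k, c in enumerate(coeffs))
    return integral / factorial(2 * n)

print(round(no_match_prob(5), 4))  # 0.1213, matching the stated exact value
```

For $n = 1$ the formula gives 0, as it must: with one couple and two cards, every card drawn is either one's own or the spouse's.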
Note: Using the exact formula, it can be experimentally verified that $e^{-2}(1 - \frac{1}{2n})$ is a better approximation than $e^{-2}$. In a generalization of the Las Vegas card game, you have two thoroughly shuffled decks of cards, where each deck has $r\,(= 13)$ types of cards and $s\,(= 4)$ cards of each type. A match occurs when two cards of the same type occupy the same position in their respective decks. Then the probability of no match can be approximated by $e^{-s}$ for $r$ large, while the exact value of this probability can be calculated from
$$(-1)^{rs}\, \frac{(s!)^r}{(rs)!} \int_0^{\infty} [L_s(x)]^r e^{-x}\, dx,$$
where $L_s(x) = \sum_{j=0}^{s} (-1)^j \binom{s}{j} \frac{x^j}{j!}$ is the Laguerre polynomial of degree $s$.

3.90 Imagine a chance experiment with $b$ trials, where the $i$th trial is said to be successful if the $i$th bin receives no ball. The success probability of each trial is $p = (\frac{b-1}{b})^m$. The trials are weakly dependent when $b$ is large. Then the probability mass function of the number of empty bins can be approximated by a Poisson distribution with expected value $b(\frac{b-1}{b})^m$.

3.91 Imagine a trial for each person. The trial is said to be successful if the person involved has a lone birthday. The success probability is $p = (\frac{364}{365})^{m-1}$. The probability that nobody in the group has a lone birthday is the same as the probability of having no successful trial. Thus, by the Poisson heuristic, the probability that nobody in the group has a lone birthday is approximately equal to $e^{-m(364/365)^{m-1}}$. This leads to the approximate value 3,061 for the minimum number of people required in order to have a fifty-fifty probability of no lone birthday. The exact value is 3,064; see Problem 1.89.

3.92 Translate the problem into a chance experiment with 8 trials. The $i$th trial is said to be successful if you have predicted correctly the two teams for the $i$th match. The success probability is $p = \frac{8 \times 2 \times 14!}{16!} = \frac{1}{15}$.
The Poisson distribution with expected value $\lambda = 8 \times \frac{1}{15} = \frac{8}{15}$ provides a remarkably accurate approximation for the distribution of the number of correctly predicted matches. In a simulation study we found the values 0.587, 0.312, 0.083, and 0.015 for the probability that $k$ matches are correctly predicted for $k = 0, 1, 2$, and 3, while the approximate values are 0.5866, 0.3129, 0.0834, and 0.0148.

3.93 To approximate the probability of drawing two consecutive numbers, translate the problem into a chance experiment with 44 trials, where there is a trial for any two consecutive numbers from 1 to 45. The probability of drawing two specific consecutive numbers is $\binom{43}{4}/\binom{45}{6}$. Thus, letting $\lambda_1 = 44 \times \binom{43}{4}/\binom{45}{6}$, the Poisson heuristic gives the approximation $1 - e^{-\lambda_1} = 0.487$ for the probability of drawing two consecutive numbers. In the same way, letting $\lambda_2 = 43 \times \binom{42}{3}/\binom{45}{6}$, we get the approximation $1 - e^{-\lambda_2} = 0.059$ for the probability of drawing three consecutive numbers. In a simulation study we found the values 0.529 and 0.056 for the two probabilities.

Note: An exact expression for the probability of two or more consecutive numbers in a draw of the lottery is given by
$$1 - \binom{40}{6} \Big/ \binom{45}{6}.$$
The trick to get this result is as follows. There is a one-to-one correspondence between the non-adjacent ways of choosing six distinct numbers from 1 to 45 and all ways of choosing six distinct numbers from 1 to 40. To explain this, take a particular non-adjacent draw of 6 from 45, say 3-12-18-27-35-44. This non-adjacent draw can be converted to a draw of 6 from 40 by subtracting respectively 0, 1, 2, 3, 4, and 5 from the ordered six numbers. This gives the draw 3-11-16-24-31-39 for the Lotto 6/40. Conversely, take any set of 6 from 40 and add respectively 0, 1, 2, 3, 4, and 5.

3.94 Imagine that the twenty numbers drawn from the numbers 1, . . .
, 80 are identified as $R = 20$ red balls in an urn and that the remaining sixty non-chosen numbers are identified as $W = 60$ white balls in the urn. You have ticked ten numbers on your game form. The probability that you have chosen $r$ numbers from the red group is simply the probability that $r$ red balls will come up in the random drawing of $n = 10$ balls from the urn when no balls are replaced. Thus
$$P(r \text{ numbers correct out of 10 ticked numbers}) = \frac{\binom{20}{r}\binom{60}{10-r}}{\binom{80}{10}}.$$
This probability has the values $4.58 \times 10^{-2}$, $1.80 \times 10^{-1}$, $2.95 \times 10^{-1}$, $2.67 \times 10^{-1}$, $1.47 \times 10^{-1}$, $5.14 \times 10^{-2}$, $1.15 \times 10^{-2}$, $1.61 \times 10^{-3}$, $1.35 \times 10^{-4}$, $6.12 \times 10^{-6}$, and $1.12 \times 10^{-7}$ for $r = 0, 1, \ldots, 10$.

3.95 Let $X$ denote how many numbers you will correctly guess. Then $X$ has a hypergeometric distribution with parameters $R = 5$, $W = 34$, and $n = 5$. Therefore
$$P(X = k) = \frac{\binom{5}{k}\binom{34}{5-k}}{\binom{39}{5}} \quad \text{for } k = 0, \ldots, 5.$$
Let $E$ be the expected payoff per dollar staked. Then $E = 100{,}000 \times P(X = 5) + 500 \times P(X = 4) + 25 \times P(X = 3) + E \times P(X = 2)$. This gives $E = 0.631$. The house percentage is 36.9%.

3.96 Let the random variable $X$ be the number of left shoes among the four shoes you have chosen. The random variable $X$ has a hypergeometric distribution with parameters $R = 10$, $W = 10$, and $n = 4$. The desired probability is
$$1 - P(X = 0) - P(X = 4) = 1 - \frac{\binom{10}{4}}{\binom{20}{4}} - \frac{\binom{10}{4}}{\binom{20}{4}} = 0.9133.$$

3.97 The hypergeometric model with $R = W = 25$ and $n = 25$ is applicable under the hypothesis that the psychologist blindly guesses which 25 persons are left-handed. Then the probability of identifying correctly 18 or more of the 25 left-handers is
$$\sum_{k=18}^{25} \frac{\binom{25}{k}\binom{25}{25-k}}{\binom{50}{25}} = 2.1 \times 10^{-3}.$$
This small probability provides evidence against the hypothesis.

3.98 Let the random variables $X_1$, $X_2$, and $X_3$ indicate how many syndicate tickets match five, four, or three of the winning numbers. Then the expected amount of money won by the syndicate is $\$25{,}000 \times E(X_1) + \$925 \times E(X_2) + \$27.50 \times E(X_3)$.
To work out this expression, we need the probability that a particular ticket matches 5, 4, or 3 numbers given that none of the 200,000 tickets matches all six winning numbers. Denote by $A_k$ the event that a particular ticket matches exactly $k$ of the six winning numbers and by $B$ the event that none of the 200,000 tickets matches all six winning numbers. Then, by the hypergeometric model,
$$P(A_k) = \frac{\binom{6}{k}\binom{40}{6-k}}{\binom{46}{6}}.$$
The sought probability $P(A_k \mid B)$ satisfies
$$P(A_k \mid B) = \frac{P(A_k B)}{P(B)} = \frac{P(B \mid A_k) P(A_k)}{P(B)}.$$
The probability $P(B)$ can be calculated as the Poisson probability of zero successes in 200,000 independent trials each having success probability $1/\binom{46}{6}$, and $P(B \mid A_k)$ can be calculated as the Poisson probability of zero successes in 199,999 of such trials. Noting that $P(B \mid A_k)/P(B)$ is 1 for all practical purposes, we get $P(A_k \mid B) = P(A_k)$. We now find that
$$E(X_k) = 200{,}000 \times \frac{\binom{6}{6-k}\binom{40}{k}}{\binom{46}{6}} \quad \text{for } k = 1, 2, 3.$$
This gives the values $E(X_1) = 5.12447$, $E(X_2) = 249.8180$, and $E(X_3) = 4219.149$. Therefore the expected amount of money won by the syndicate is $25{,}000 \times 5.12447 + 925 \times 249.8180 + 27.50 \times 4219.149 = 475{,}220$ dollars. The expected profit is \$75,220. To conclude, we remark that the random variables $X_1$, $X_2$, and $X_3$ are Poisson distributed. These random variables are practically independent of each other, and so the standard deviation of the random variable $25{,}000 X_1 + 925 X_2 + 27.50 X_3$ can be approximated by
$$\sqrt{25{,}000^2\, \sigma^2(X_1) + 925^2\, \sigma^2(X_2) + 27.50^2\, \sigma^2(X_3)} = 58{,}478.50.$$
Note: The probability distribution of the random variable $25{,}000 X_1 + 925 X_2 + 27.50 X_3$ can be approximated by a normal distribution with expected value \$475,220 and standard deviation \$58,478.50 (see Chapter 4). This leads to the approximation of 9.9% for the probability that the syndicate will lose money on its investment of \$400,000.

3.99 Use the hypergeometric model with $R = 8$, $W = 7$, and $n = 10$.
The sought probability is equal to the probability of picking 5 or more red balls, and this probability is $\sum_{k=5}^{8} \binom{8}{k}\binom{7}{10-k}/\binom{15}{10} = \frac{9}{11}$.

3.100 Let the random variable $X$ be the largest number drawn and $Y$ the smallest number drawn. Then
$$P(X = k) = \frac{\binom{k-1}{5}}{\binom{45}{6}} \text{ for } 6 \le k \le 45 \qquad \text{and} \qquad P(Y = k) = \frac{\binom{45-k}{5}}{\binom{45}{6}} \text{ for } 1 \le k \le 40.$$

3.101 For a single player, the problem can be translated into the urn model with 24 red and 56 white balls. This leads to $Q_k = 1 - \binom{24}{24}\binom{56}{k-24}/\binom{80}{k}$ for $24 \le k \le 79$, where $Q_{23} = 1$ and $Q_{80} = 0$. The probability that more than 70 numbers must be called out before one of the players has achieved a full card is given by $Q_{70}^{36} = 0.4552$. The probability that you will be the first player to achieve a full card while no other player has a full card at the same time as you is equal to
$$\sum_{k=24}^{79} (Q_{k-1} - Q_k)\, Q_k^{35} = 0.0228.$$
The probability that you will be among the first players achieving a full card is
$$\sum_{k=24}^{79} \sum_{a=0}^{35} \binom{35}{a} (Q_{k-1} - Q_k)^{a+1}\, Q_k^{35-a} = 0.0342.$$

3.102 Write $X = X_1 + \cdots + X_r$, where $X_i$ is the $i$th number picked. Then $E(X) = \sum_{i=1}^{r} E(X_i)$ and $E(X^2) = \sum_{i=1}^{r} E(X_i^2) + 2\sum_{i=1}^{r-1}\sum_{j=i+1}^{r} E(X_i X_j)$. Since the $X_i$ are interchangeable random variables, $E(X_i) = E(X_1)$ for all $i$ and $E(X_i X_j) = E(X_1 X_2)$ for all $i \ne j$. Obviously, $E(X_1) = \frac{1}{2}(s+1)$ and so
$$E(X) = \frac{1}{2} r(s+1).$$
Since $P(X_1 = k, X_2 = l) = \frac{1}{s} \times \frac{1}{s-1}$ for any $k \ne l$, we have
$$E(X_1 X_2) = \frac{1}{s(s-1)} \sum_{k,l:\, l \ne k} kl.$$
Using the formulas $\sum_{l=1}^{s} l = \frac{1}{2}s(s+1)$ and $\sum_{k=1}^{s} k^2 = \frac{1}{6}s(s+1)(2s+1)$, it follows after some algebra that $E(X_1 X_2) = \frac{1}{12}(s+1)(3s+2)$. Also, we have $E(X_1^2) = \frac{1}{s}\sum_{k=1}^{s} k^2 = \frac{1}{6}(s+1)(2s+1)$. Putting the pieces together, we get
$$E(X^2) = \frac{1}{6} r(s+1)(2s+1) + \frac{1}{12} r(r-1)(s+1)(3s+2).$$
Next, it is a matter of simple algebra to get the formula for $\sigma^2(X)$.

3.103 Fix $1 \le r \le a$. Let the random variable $X$ be the number of picks needed to obtain $r$ red balls.
Then $X$ takes on the value $k$ if and only if $r-1$ red balls are obtained in the first $k-1$ picks and another red ball at the $k$th pick. Thus, for $k = r, \ldots, b+r$,
$$P(X = k) = \frac{\binom{a}{r-1}\binom{b}{k-1-(r-1)}}{\binom{a+b}{k-1}} \times \frac{a - (r-1)}{a + b - (k-1)}.$$
Alternatively, the probability mass function of $X$ can be obtained from the tail probability $P(X > k) = \sum_{j=0}^{r-1} \binom{a}{j}\binom{b}{k-j}/\binom{a+b}{k}$.

3.104 Call your opponents East and West. The probability that East has two spades and West has three spades is
$$\frac{\binom{5}{2}\binom{21}{11}}{\binom{26}{13}} = \frac{39}{115}.$$
Hence the desired probability is $2 \times \frac{39}{115} = 0.6783$.

3.105 For $0 \le k \le 4$, let $E_k$ be the event that a diamond has appeared 4 times and a spade $k$ times in the first $4+k$ cards, and let $F_k$ be the event that the $(4+k+1)$th card is a diamond. The sought probability is $\sum_{k=0}^{4} P(E_k F_k) = \sum_{k=0}^{4} P(E_k) P(F_k \mid E_k)$. Thus the win probability of player A is
$$\sum_{k=0}^{4} \frac{\binom{8}{4}\binom{7}{k}}{\binom{15}{4+k}} \times \frac{4}{11-k} = 0.6224.$$

3.106 Let $A_k$ be the event that the last drawing has exactly $k$ numbers in common with the second last drawing. Seeing the six numbers from the second last drawing as red balls and the other numbers as white balls, it follows that
$$P(A_k) = \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}} \quad \text{for } k = 0, 1, \ldots, 6.$$
Let $E$ be the event that the next drawing will have no numbers in common with the last two drawings. Then, by the law of conditional probability, $P(E) = \sum_{k=0}^{6} P(E \mid A_k) P(A_k)$ and so
$$P(E) = \sum_{k=0}^{6} \frac{\binom{49-(6+6-k)}{6}}{\binom{49}{6}} \times \frac{\binom{6}{k}\binom{43}{6-k}}{\binom{49}{6}} = 0.1901.$$

3.107 Let the random variable $X$ be the number of tickets you will win. Then $X$ has a hypergeometric distribution with parameters $R = 100$, $W = 124{,}900$, and $n = 2{,}500$. The sought probability is $1 - P(X = 0) = 0.8675$. Since $R + W \gg n$, the hypergeometric distribution can be approximated by the binomial distribution with parameters $n = 2{,}500$ and $p = \frac{R}{R+W} = 0.0008$. The binomial distribution in turn can be approximated by a Poisson distribution with expected value $\lambda = np = 2$.
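The chain of approximations in Problem 3.107 is easy to check numerically. The following sketch (added here, not part of the original solution) computes the exact hypergeometric $P(X = 0)$ as a running product and compares the resulting probability with the Poisson value $1 - e^{-2}$:

```python
from math import exp

# Sketch (Problem 3.107): exact P(X = 0) for the hypergeometric model with
# R = 100 winning tickets among R + W = 125,000, holding n = 2,500 tickets.
p0 = 1.0
for i in range(2_500):
    # probability that the (i+1)th ticket drawn is a non-winner,
    # given the first i drawn were non-winners
    p0 *= (124_900 - i) / (125_000 - i)

print(round(1 - p0, 4))       # 0.8675, the exact value stated in the text
print(round(1 - exp(-2), 4))  # 0.8647, the Poisson approximation
```

The two values agree to about three decimal places, which is the accuracy one expects from the binomial and Poisson approximation steps.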
3.108 The probability of the weaker team winning the final is
$$\sum_{k=4}^{7} \binom{k-1}{3} (0.45)^4 (0.55)^{k-4} = 0.3917.$$
Let the random variable $X$ be the number of games the final will take. Then
$$P(X = k) = \binom{k-1}{3} (0.45)^4 (0.55)^{k-4} + \binom{k-1}{3} (0.55)^4 (0.45)^{k-4}.$$
This probability has the numerical values 0.1325, 0.2549, 0.3093, and 0.3032 for $k = 4, 5, 6$, and 7. The expected value and the standard deviation of the random variable $X$ are given by 5.783 and 1.020.

3.109 The probability of fifteen successes before four failures is
$$\sum_{n=15}^{19} \binom{n-1}{14} \Big(\frac{3}{4}\Big)^{15} \Big(\frac{1}{4}\Big)^{n-15} = 0.4654,$$
using the negative binomial distribution. As a sanity check, imagine that 19 trials are done. Then the sought probability is the probability of 15 or more successes in 19 trials and is equal to the binomial probability
$$\sum_{k=15}^{19} \binom{19}{k} \Big(\frac{3}{4}\Big)^{k} \Big(\frac{1}{4}\Big)^{19-k} = 0.4654.$$

3.110 Let the random variable $X$ have a negative binomial distribution with parameters $r = 15$ and $p = \frac{3}{4}$. The probability that the red bag will be emptied first is
$$\sum_{k=15}^{19} P(X = k) = \sum_{k=15}^{19} \binom{k-1}{14} \Big(\frac{3}{4}\Big)^{15} \Big(\frac{1}{4}\Big)^{k-15} = 0.4654.$$
The probability that there are still $k \ge 1$ balls in the blue bag when the red bag gets empty is
$$P(X = 20 - k) = \binom{19-k}{14} \Big(\frac{3}{4}\Big)^{15} \Big(\frac{1}{4}\Big)^{5-k} \quad \text{for } k = 1, \ldots, 5.$$

3.111 Imagine that each player continues rolling the die until one of the assigned numbers of that player appears. Let $X_1$ be the number of rolls player A needs to get a 1 or 2 and $X_2$ be the number of rolls player B needs to get a 4, 5, or 6. Then $X_1$ and $X_2$ are independent and geometrically distributed with parameters $p_1 = \frac{1}{3}$ and $p_2 = \frac{1}{2}$. The probability of player A winning is
$$P(X_1 \le X_2) = \sum_{j=1}^{\infty} p_1 (1-p_1)^{j-1} (1-p_2)^{j-1} = \frac{p_1}{p_1 + p_2 - p_1 p_2} = \frac{1}{2}.$$
The length of the game is $X = \min(X_1, X_2)$. Thus $P(X > l) = (1-p_1)^l (1-p_2)^l = (1-p)^l$ for $l = 0, 1, \ldots$, where $p = p_1 + p_2 - p_1 p_2$. Therefore the length of the game is geometrically distributed with parameter $p = \frac{2}{3}$.

3.112 The geometric distribution with success probability $p = \frac{1}{37}$ applies to this situation.
The probability that the house number 0 will come up at least once in 25 spins of the roulette wheel is $1 - (1-p)^{25} = 0.495897$. The expected value of the gambler's net profit per dollar bet is \$0.0082.

3.113 Let $P_A$ be the probability of player A winning and $P_d$ the probability of a draw. By the law of conditional probability, $P_A = a(1-b) + (1-a)(1-b)P_A$ and $P_d = ab + (1-a)(1-b)P_d$. This gives
$$P_A = \frac{a(1-b)}{a + b - ab} \quad \text{and} \quad P_d = \frac{ab}{a + b - ab}.$$
The length of the game is geometrically distributed with parameter $p = 1 - (1-a)(1-b) = a + b - ab$.

3.114 The random variable $X$ is given by $Y - 3$, where the random variable $Y$ has a negative binomial distribution with parameters $r = 3$ and $p = \frac{1}{2}$. Hence
$$P(X = x) = \binom{x+2}{2} \Big(\frac{1}{2}\Big)^{x+3} \quad \text{for } x = 0, 1, \ldots.$$
The expected value and the standard deviation of $X$ are given by $6 - 3 = 3$ and $\sqrt{6} = 2.449$.

3.115 Suppose the strategy is to stop as soon as you have picked a number larger than or equal to $r$. The number of trials needed is geometrically distributed with parameter $\frac{25-r+1}{25}$, and the amount you get paid has a discrete uniform distribution on $r, \ldots, 25$. The expected net payoff is given by
$$\frac{1}{25-r+1} \sum_{k=r}^{25} k - \frac{25}{25-r+1} = \frac{1}{2}(25 + r) - \frac{25}{25-r+1}.$$
This expression takes on the maximal value \$18.4286 for $r = 19$.

3.116 The probability that both coins simultaneously show the same outcome is $p \times \frac{1}{2} + (1-p) \times \frac{1}{2} = \frac{1}{2}$. The desired probability distribution is the geometric distribution with parameter $\frac{1}{2}$.

3.117 Let $X$ be the number of rounds required for the game. The random variable $X$ is geometrically distributed with parameter $p = \sum_{i=2}^{12} a_i^2 = \frac{73}{648}$, where $a_i$ is the probability of rolling a dice total of $i$ and is given by $a_i = \frac{i-1}{36}$ for $2 \le i \le 7$ and $a_{14-i} = a_i$ for $8 \le i \le 12$. The probability of John paying for the beer is
$$\sum_{k=1}^{5} a_{2k+1}^2 \Big/ \sum_{i=2}^{12} a_i^2 = \frac{38}{73}.$$

3.118 Let us say that a success occurs each time an ace is drawn that you have not seen before.
Denote by $X_j$ the number of cards drawn between the occurrences of the $(j-1)$th and $j$th success. The random variable $X_j$ is geometrically distributed with success probability $\frac{4-(j-1)}{52}$. Also, the random variables $X_1, \ldots, X_4$ are independent of each other (the cards are drawn with replacement). A geometrically distributed random variable with parameter $p$ has expected value $1/p$ and variance $(1-p)/p^2$. Hence the expected value and the standard deviation of the number of times you have to draw a card until you have seen all four different aces are
$$E(X_1 + X_2 + X_3 + X_4) = \frac{52}{4} + \frac{52}{3} + \frac{52}{2} + \frac{52}{1} = 108.33$$
and
$$\sigma(X_1 + X_2 + X_3 + X_4) = \sqrt{\sum_{k=1}^{4} \frac{1 - k/52}{(k/52)^2}} = 61.16.$$

Chapter 4

4.1 Since $c \int_0^{10} (10 - x)\, dx$ must be equal to 1, we get $c = \frac{1}{50}$. The probabilities are $P(X \le 5) = \int_0^5 \frac{1}{50}(10-x)\, dx = \frac{3}{4}$ and $P(X > 2) = \int_2^{10} \frac{1}{50}(10-x)\, dx = \frac{16}{25}$.

4.2 The constant $c$ follows from the requirement $c \int_0^1 (3x^2 - 8x - 5)\, dx = 1$. This gives $c = -\frac{1}{8}$. Since $f(x) = (5 + 8x - 3x^2)/8$ is positive for $0 < x < 1$, $f(x)$ indeed represents a probability density function. The cumulative probability distribution function $F(x) = P(X \le x)$ is given by
$$F(x) = \frac{1}{8}\int_0^x (5 + 8y - 3y^2)\, dy = \frac{1}{8}(5x + 4x^2 - x^3) \quad \text{for } 0 \le x \le 1.$$
Further, $F(x) = 0$ for $x < 0$ and $F(x) = 1$ for $x \ge 1$.

4.3 Noting that $\int_0^a 2cx e^{-cx^2}\, dx = \int_0^{ca^2} e^{-u}\, du = 1 - e^{-ca^2}$, we get
$$P(X \le 15) = \int_0^{15} 2cx e^{-cx^2}\, dx = 0.3023, \quad P(X > 30) = \int_{30}^{\infty} 2cx e^{-cx^2}\, dx = 0.2369, \quad P(20 < X \le 25) = \int_{20}^{25} 2cx e^{-cx^2}\, dx = 0.1604.$$

4.4 Let the random variable $X$ be the length of any particular phone call made by the travel agent. Then
$$P(X > 7) = \int_7^{\infty} 0.25 e^{-0.25x}\, dx = -e^{-0.25x}\Big|_7^{\infty} = e^{-1.75} = 0.1738.$$

4.5 The proportion of pumping engines that will not fail before 10,000 hours of use is $P(X > 10) = \int_{10}^{\infty} 0.02x e^{-0.01x^2}\, dx$. Since
$$\int_a^{\infty} 0.02x e^{-0.01x^2}\, dx = \int_{a^2/100}^{\infty} e^{-y}\, dy = e^{-a^2/100},$$
we get $P(X > 10) = e^{-1}$. Also $P(X > 5) = e^{-0.25}$.
Therefore the probability that the engine will survive for another 5,000 hours, given that it has functioned properly during the past 5,000 hours, is
$$P(X > 10 \mid X > 5) = \frac{P(X > 10)}{P(X > 5)} = \frac{e^{-1}}{e^{-0.25}} = 0.4724.$$

4.6 The cumulative distribution function $P(X \le x) = \int_{-\infty}^x f(y)\, dy$ is given by $F(x) = \frac{1}{50}(x - 115)^2$ for $115 \le x \le 120$ and $F(x) = 1 - \frac{1}{50}(125 - x)^2$ for $120 \le x \le 125$. Since $P(117 < X < 123) = F(123) - F(117) = \frac{21}{25}$, the proportion of non-acceptable strain gauges is $\frac{4}{25}$.

4.7 The cumulative distribution function $F(x) = P(X \le x)$ of the random variable $X$ is $F(x) = 105\int_0^x y^4(1-y)^2\, dy = x^5(15x^2 - 35x + 21)$ for $0 \le x \le 1$. The solution of the equation $1 - F(x) = 0.05$ is $x = 0.8712$. Thus the capacity of the storage tank in thousands of gallons should be 0.8712.

4.8 A stockout occurs if and only if the demand $X$ is larger than $Q$. Thus
$$P(\text{stockout}) = \int_Q^{\infty} f(x)\, dx = 1 - \int_0^Q f(x)\, dx.$$

4.9 Let the random variable $Y$ be the area of the circle. Then $Y = \pi X^2$. Since $P(X \le x) = x$ for $0 \le x \le 1$ and $P(Y \le y) = P(X \le \sqrt{y/\pi})$, we get $P(Y \le y) = \sqrt{y/\pi}$ for $0 \le y \le \pi$. Differentiation of $P(Y \le y)$ gives that the density function of $Y$ is $1/(2\sqrt{\pi y})$ for $0 < y < \pi$ and 0 otherwise.

4.10 To find the density function of $Y = \frac{1}{X}$, we determine $P(Y \le y)$. Obviously, $P(Y \le y) = 0$ for $y \le 1$. For $y > 1$,
$$P(Y \le y) = P\Big(X \ge \frac{1}{y}\Big) = 1 - P\Big(X \le \frac{1}{y}\Big) = 1 - F\Big(\frac{1}{y}\Big),$$
where $F(x)$ is the probability distribution function of $X$. By differentiation, it follows that the density function $g(y)$ of $Y$ is given by
$$g(y) = f\Big(\frac{1}{y}\Big) \times \frac{1}{y^2} = \frac{6}{7}\Big(\frac{1}{y^3} + \frac{1}{y^2\sqrt{y}}\Big) \quad \text{for } y > 1$$
and $g(y) = 0$ otherwise.

4.11 The cumulative distribution function of $Y = X^2$ is $P(Y \le y) = P(X \le \sqrt{y}) = F(\sqrt{y})$ for $y \ge 0$, where $F(x) = P(X \le x)$. Differentiation gives that the density function of $Y$ is $\frac{1}{2} f(\sqrt{y})/\sqrt{y}$ for $y > 0$ and 0 otherwise. The cumulative distribution function of $W = V^2$ is
$$P(W \le w) = P(-\sqrt{w} \le V \le \sqrt{w}) = \frac{2\sqrt{w}}{2a} \quad \text{for } 0 \le w \le a^2.$$
The density function of $W$ is $1/(2a\sqrt{w})$ for $0 < w < a^2$ and 0 otherwise.
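Transformation results such as the one in Problem 4.9 are easy to sanity-check by simulation. The following sketch (added here, not part of the original solution) compares the empirical distribution function of $Y = \pi X^2$ with $\sqrt{y/\pi}$ at one test point, using a fixed seed so the check is reproducible:

```python
import random
from math import pi, sqrt

# Sketch (Problem 4.9): Monte Carlo check of P(Y <= y) = sqrt(y/pi)
# for the area Y = pi * X^2 of a circle whose radius X is uniform on (0, 1).
random.seed(42)          # fixed seed so the check is reproducible
y, n = 1.5, 200_000
hits = sum(pi * random.random() ** 2 <= y for _ in range(n))
error = abs(hits / n - sqrt(y / pi))
print(error < 0.01)  # True: empirical CDF agrees within Monte Carlo noise
```

With 200,000 samples the standard error of the empirical frequency is about 0.001, so a tolerance of 0.01 leaves a wide safety margin.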
4.12 (a) Let the random variable $V$ be the sum of the coordinates of the point $Q$. For $0 \le v \le 1$, the random variable $V$ takes on a value smaller than or equal to $v$ if and only if the point $Q$ falls in a right triangle with legs of length $v$ (draw a picture). The area of this triangle is $\frac{1}{2}v^2$. Hence
$$P(V \le v) = \frac{1}{2}v^2 \quad \text{for } 0 \le v \le 1.$$
For $1 \le v \le 2$, the random variable $V$ takes on a value larger than $v$ if and only if the point $Q$ falls in a right triangle with legs of length $1 - (v - 1) = 2 - v$. The area of this triangle is $\frac{1}{2}(2-v)^2$ and so $P(V > v) = \frac{1}{2}(2-v)^2$ for $1 \le v \le 2$. This gives
$$P(V \le v) = 1 - \frac{1}{2}(2-v)^2 \quad \text{for } 1 \le v \le 2.$$
By differentiation, it now follows that the density function $f_V(v)$ of $V$ satisfies $f_V(v) = v$ for $0 < v \le 1$, $f_V(v) = 2 - v$ for $1 < v \le 2$, and $f_V(v) = 0$ otherwise.
(b) Let the random variable $W$ be the product of the coordinates of the randomly chosen point $Q$. A point $(x, y)$ in the unit square satisfies $xy \le w$ for any given $0 \le w \le 1$ if and only if either the point belongs to the set $\{(x, y) : 0 \le x \le w,\ 0 \le y \le 1\}$ or the point satisfies $w \le x \le 1$ and lies below the graph of $y = \frac{w}{x}$ (draw a figure). This gives
$$P(W \le w) = w + \int_w^1 \frac{w}{x}\, dx = w - w\ln(w).$$
The density function of $W$ is $f_W(w) = -\ln(w)$ for $0 < w < 1$ and $f_W(w) = 0$ otherwise.

4.13 The random variable $V = X/(1-X)$ satisfies
$$P(V \le v) = P\Big(X \le \frac{v}{1+v}\Big) = \frac{v}{1+v} \quad \text{for } v \ge 0.$$
Thus the density function of $V$ is $\frac{1}{(1+v)^2}$ for $v > 0$ and 0 otherwise. To get the density of $W = X(1-X)$, note that the function $x(1-x)$ has $\frac{1}{4}$ as its maximal value on $(0,1)$ and that the equation $x(1-x) = w$ has the solutions $x_1 = \frac{1}{2} - \frac{1}{2}\sqrt{1-4w}$ and $x_2 = \frac{1}{2} + \frac{1}{2}\sqrt{1-4w}$ for $0 \le w \le \frac{1}{4}$. Thus
$$P(W > w) = P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} 1\, dx = \sqrt{1-4w} \quad \text{for } 0 \le w \le \frac{1}{4}.$$
Thus the density function of $W$ is $2/\sqrt{1-4w}$ for $0 < w < \frac{1}{4}$ and 0 otherwise.

4.14 Let the random variable $U$ be a number chosen at random from the interval (0,1).
Using the fact that $P(U \le u) = u$ for $0 \le u \le 1$, it follows that $P(X \le x) = P(0 \le U \le x) + P(1-x \le U \le 1) = 2x$ for $0 \le x \le 0.5$. Hence $X$ has the density function $f(x) = 2$ for $0 < x < 0.5$ and $f(x) = 0$ otherwise. Let the random variable $Y = X/(1-X)$. Then
$$P(Y \le y) = P\Big(X \le \frac{y}{1+y}\Big) = \frac{2y}{1+y} \quad \text{for } 0 \le y \le 1.$$
The density function of $Y = X/(1-X)$ is $f_Y(y) = \frac{2}{(1+y)^2}$ for $0 < y < 1$ and $f_Y(y) = 0$ otherwise.

4.15 The sample space of the experiment is $\{(x, y) : 0 \le x, y \le 1\}$. Noting that $\max(x, y) \le v$ if and only if $x \le v$ and $y \le v$, it follows that the random variable $V$ takes on a value smaller than or equal to $v$ if and only if the randomly chosen point falls in the set $A = \{(x, y) : 0 \le x, y \le v\}$. Hence the probability $P(V \le v)$ is equal to the area of the set $A$, and so $P(V \le v) = v^2$ for $0 \le v \le 1$. Hence the density function of $V$ is $f_V(v) = 2v$ for $0 < v < 1$ and $f_V(v) = 0$ otherwise. Noting that $\min(x, y) > w$ if and only if $x > w$ and $y > w$, the probability $P(W > w)$ can be calculated as the area of the set $B = \{(x, y) : w < x, y \le 1\}$. This gives $P(W \le w) = 1 - (1-w)^2$ for $0 \le w \le 1$. Hence the density function of $W$ is given by $f_W(w) = 2(1-w)$ for $0 < w < 1$ and $f_W(w) = 0$ otherwise.

4.16 Drawing a figure and using the symmetry in the model, we can conclude that the height $X$ above the ground can be modeled as $X = 15 + 15\cos(\Theta)$, where $\Theta$ is a randomly chosen angle between 0 and $\pi$. Then
$$P(X \le x) = P(15 + 15\cos(\Theta) \le x) = P\Big(\Theta \ge \arccos\Big(\frac{x}{15} - 1\Big)\Big) = 1 - \frac{1}{\pi}\arccos\Big(\frac{x}{15} - 1\Big) \quad \text{for } 0 \le x \le 30,$$
where the last equality uses the fact that the randomly chosen angle $\Theta$ has density $\frac{1}{\pi}$ on $(0, \pi)$. In particular, $P(X \le 22.5) = 2/3$ and $P(X \le 7.5) = 1/3$. The derivative of $\arccos(z)$ is $-\frac{1}{\sqrt{1-z^2}}$. Hence the density function of $X$ is given by
$$f(x) = \frac{1}{15\pi\sqrt{1 - (x/15 - 1)^2}} \quad \text{for } 0 < x < 30$$
and $f(x) = 0$ otherwise.

4.17 We have, in units of hundred hours,
$$E(X) = \frac{1}{625}\int_{50}^{75} x(x-50)\, dx + \frac{1}{625}\int_{75}^{100} x(100-x)\, dx = 75.$$

4.18 By partial integration,
$$E(X) = \int_0^{0.25} x\pi\sqrt{2}\cos(\pi x)\, dx = \sqrt{2}\, x\sin(\pi x)\Big|_0^{0.25} + \frac{\sqrt{2}}{\pi}\cos(\pi x)\Big|_0^{0.25}.$$
This gives $E(X) = 0.1182$ seconds.

4.19 The density function of the distance $X$ thrown by Big John is $f(x) = \frac{x-50}{600}$ for $50 < x < 80$, $f(x) = \frac{90-x}{200}$ for $80 < x < 90$, and $f(x) = 0$ otherwise. This gives $E(X) = \int_{50}^{90} x f(x)\, dx = 73\frac{1}{3}$ meters.

4.20 The expected values of the random variables in Problems 4.2, 4.4, and 4.6 are $\frac{53}{96}$, 4, and 120.

4.21 Let the random variable $X$ be the distance from the randomly chosen point to the base of the triangle. Using a little geometry, it follows that $P(X > x)$ is equal to the ratio of $\frac{1}{2}[(h-x) \times (h-x)b/h]$ and $\frac{1}{2} h \times b$. Differentiation shows that the density function of $X$ is $2(h-x)/h^2$ for $0 < x < h$ and 0 otherwise. Then
$$E(X) = \int_0^h x\, \frac{2(h-x)}{h^2}\, dx = \frac{1}{3}h.$$

4.22 The expected value of the price paid for the collector's item is $E(X) = \int_0^1 x\, 90(x^8 - x^9)\, dx = \frac{9}{11}$.

4.23 Let $X$ be the distance from the point to the origin. Then $P(X \le a) = \frac{1}{4}\pi a^2$ for $0 \le a \le 1$ and
$$P(X \le a) = \frac{1}{4}\pi a^2 - 2\int_1^a \sqrt{a^2 - x^2}\, dx = \frac{1}{4}\pi a^2 - a^2\arccos\Big(\frac{1}{a}\Big) + \sqrt{a^2 - 1} \quad \text{for } 1 < a \le \sqrt{2}.$$
The density function $f(x)$ of $X$ satisfies $f(x) = \frac{1}{2}\pi x$ for $0 < x < 1$ and $f(x) = \frac{1}{2}\pi x - 2x\arccos(\frac{1}{x})$ for $1 < x < \sqrt{2}$. Numerical integration leads to $E(X) = \int_0^{\sqrt{2}} x f(x)\, dx = 0.765$.

4.24 The range of the random variable $X$ is the interval (0, 0.5). Let $A$ be the subset of points of the unit square for which the distance to the closest side of the square is larger than $x$, where $0 < x < 0.5$. Then $A$ is a square whose sides have length $1 - 2x$, and so the area of $A$ is $(1-2x)^2$. It now follows that $P(X \le x) = 1 - (1-2x)^2$ for $0 \le x \le 0.5$. The probability density $f(x)$ of $X$ is given by $f(x) = 4(1-2x)$ for $0 < x < 0.5$ and $f(x) = 0$ otherwise. The expected value of $X$ is $\int_0^{0.5} x\, 4(1-2x)\, dx = 0.1667$.

4.25 (a) By $P(A \mid B) = P(AB)/P(B)$, we get
$$P(X \le x \mid X > a) = \frac{\int_a^x f(v)\, dv}{P(X > a)}.$$
P (X > a) Thus the conditional density of X given that X > a is f (x)/P (X > a) for x > a and 0 otherwise. In the same way, the conditional density of X given that X ≤Ra is f (x)/P (X ≤ a) for x < a and 0 otherwise. 1 1 1 (b) By E(X) = 1−a a x dx, we get E(X) = 2 (1 + a). (c) E(X | X > a) = a + λ1 and E(X | X ≤ a) = a > 0. 1−e−λa −λae−λa λ(1−e−λa ) for any 91 4.26 (a) By P (X > x) = Z ∞ R∞ x f (y) dy, it follows that P (X > x) dx = 0 Z ∞ dx x=0 Z ∞ f (y) dy. y=x By interchanging the order of integration, the last integral becomes Z ∞ f (y) dy y=0 Z y dx x=0 = Z ∞ yf (y) dy = E(X), 0 proving the desired result. (b) Let the random variable X be the smallest of n independent random numbers from (0, 1). Then P (X > x) = (1 R− x)n for 0 ≤ x ≤ 1 1 1 and P (X > x) = 0 for x > 1. This gives E(X) = 0 (1 − x)n dx = n+1 . 4.27 (a) The function g(x) = x1 is convex for x > 0. Therefore, by5Jensen’s 3 1 1 1 inequality, E X ≥ E(X) . Since E(X) = 5 , we get E X ≥ 3 . 2 R1 R1 (b) E X1 = 0 x1 12x2 (1 − x) dx = 2 and E[ X1 ] = 0 x12 12x2 (1 − √ x) dx = 6, and so σ( X1 ) = 2. 4.28 The random variable U has density function f (u) = 1 for 0 < u √ <1 and f (u) = 0 otherwise. By the substitution rule, we find for V = U and W + U 2 thatthat 1√ Z 1 2 1 2 E(V ) = u du = u du = and E(V ) = 3 2 0 0 Z 1 Z 1 1 1 u4 du = . and E(W 2 ) = u2 du = E(W ) = 3 5 0 0 Z Hence, the expected value and standard deviation of V are given by 23 q and 12 − 94 = 0.2357. The expected value and standard deviation of q W are given by 31 and 15 − 91 = 0.2981. 4.29 Let Y be the amount paid by the supplement policy. Then Y = g(X), where g(x) is min(500, x − 450) for x > 450 and 0 otherwise. By the substitution rule, E(Y ) = Z 950 450 1 (x − 450) dx + 500 1250 Z 1250 950 1 dx = 220 dollars. 1250 92 4.30 Let random variable X be the amount of waste (in thousands of gallons) produced during a week and Y be the total costs incurred during a week. 
Then the random variable Y can be represented as Y = g(X), where the function g(x) is given by g(x) = 1.25 + 0.5x for 0 < x ≤ 0.9, g(x) = 1.25 + 0.5 × 0.9 + 5 + 10(x − 0.9) for 0.9 < x < 1, and g(x) = 0 otherwise. By the substitution rule, the expected value of the weekly costs is given by

E(Y) = 105 ∫_0^1 g(x) x^4 (1 − x)^2 dx = 1.6975.

To find the standard deviation of the weekly costs, we first calculate

E(Y^2) = 105 ∫_0^1 g^2(x) x^4 (1 − x)^2 dx = 3.6204.

Thus the standard deviation of the weekly costs is √(E(Y^2) − E^2(Y)) = 0.8597.

4.31 The net profit is Y = g(X), where g(x) = 2x for 0 ≤ x ≤ 250 and g(x) = 2 × 250 − 0.5(x − 250) for x > 250. By the substitution rule,

E(Y) = ∫_0^{250} 2x f(x) dx + ∫_{250}^∞ [500 − 0.5(x − 250)] f(x) dx = 194.10 dollars.

The probability of a stockout is P(X > 250) = 1 − ∫_0^{250} f(x) dx = 0.0404.

4.32 The insurance payment (in thousands of dollars) is a so-called mixed random variable S, where S = 20 − 1 with probability 0.01, S = max(0, X − 1) with probability 0.02, and S = 0 with probability 0.97, where X represents the cost of a repairable damage. The random variable X has the density function f(x) = (1/200)(20 − x) for 0 < x < 20. Thus,

E(S) = 0.01 × 19 + 0.02 [∫_0^1 0 × f(x) dx + ∫_1^{20} (x − 1) f(x) dx] + 0.97 × 0
= 0.19 + 0.02 ∫_1^{20} (x − 1)(20 − x)/200 dx = 0.19 + 0.11432 = 0.30432.

The expected value of the insurance payment is 304.32 dollars.

4.33 Let U be the random point in (0, 1) and define g(u) = 1 − u if u < s and g(u) = u if u ≥ s. Then L = g(U) is the length of the subinterval covering the point s. By the substitution rule,

E(L) = ∫_0^s (1 − u) du + ∫_s^1 u du = s − s^2 + 1/2.

4.34 In the Problems 4.2, 4.4, and 4.6, E(X) has the values 53/96, 4, and 120 and the second moment E(X^2) has the values 0.3833, 32, and 14404.2. Therefore the standard deviation σ(X) has the values 0.2802, 4, and 2.0412.

4.35 The area of the circle is Y = πX^2, where X has the density function f(x) = 1 for 0 < x < 1.
By the substitution rule,

E(Y) = ∫_0^1 πx^2 dx = π/3 and E(Y^2) = ∫_0^1 π^2 x^4 dx = π^2/5.

The expected value and the standard deviation of Y are π/3 and √(π^2/5 − π^2/9) = 2π/(3√5).

4.36 Let the random variable X be the distance from the center of the sphere to the point Q. Using the fact that the volume of a sphere with radius r is (4/3)πr^3, we get P(X ≤ x) = x^3/r^3 for 0 ≤ x ≤ r. Hence X has the density function f(x) = 3x^2/r^3 for 0 < x < r and f(x) = 0 otherwise. The expected value and the standard deviation of the random variable X are (3/4)r and √(3/80) r.

4.37 (a) E[(X − c)^2] = E(X^2) − 2cE(X) + c^2 and is minimal for c = E(X), as follows by differentiation. The minimal value is the variance of X.
(b) E(|X − c|) = ∫_{−∞}^c (c − x) f(x) dx + ∫_c^∞ (x − c) f(x) dx. The derivative of E(|X − c|) with respect to c is 2P(X ≤ c) − 1. The minimizing value of c satisfies P(X ≤ c) = 1/2 and is the median of X.

4.38 The height above the ground is given by the random variable X = 15 + 15 cos(Θ), where Θ is uniformly distributed on (0, π). Using the substitution rule and the relation cos^2(x) = (1/2)(cos(2x) + 1), we get

E(X) = (1/π) ∫_0^π [15 + 15 cos(x)] dx = 15,
E(X^2) = (1/π) ∫_0^π 225[1 + 2 cos(x) + cos^2(x)] dx = 337.5 + (225/(2π)) ∫_0^π cos(2x) dx = 337.5.

The standard deviation of X is σ(X) = √(337.5 − 15^2) = 10.61 meters.
Note: An alternative method to calculate E(X^2) is to use the density function h(x) of X, see Problem 4.16. However, it seems that numerical integration must be used to obtain the value of ∫_0^{30} x^2 h(x) dx (as a sanity check, the numerical computation of the integral also gives the answer 337.5). It is much simpler to use the substitution rule to get this answer.

4.39 Let Y be the amount of demand that cannot be satisfied from stock on hand and define g(x) = x − s for x > s and g(x) = 0 otherwise. By the substitution rule,

E(Y) = ∫_s^∞ (x − s) λe^{−λx} dx and E(Y^2) = ∫_s^∞ (x − s)^2 λe^{−λx} dx.

These integrals can be evaluated as

E(Y) = (1/λ) e^{−λs} and E(Y^2) = (2/λ^2) e^{−λs}, and so σ(Y) = (1/λ)[e^{−λs}(2 − e^{−λs})]^{1/2}.

4.40 (a) The expected value of X is given by

E(X) = ∫_β^∞ x (α/β)(β/x)^{α+1} dx = αβ^α ∫_β^∞ x^{−α} dx = (αβ^α/(1 − α)) x^{−α+1} |_β^∞ = αβ/(α − 1),

provided that α > 1; otherwise E(X) = ∞. For α > 2,

E(X^2) = ∫_β^∞ x^2 (α/β)(β/x)^{α+1} dx = αβ^α ∫_β^∞ x^{−α+1} dx = (αβ^α/(2 − α)) x^{−α+2} |_β^∞ = αβ^2/(α − 2).

For 0 < α ≤ 2, E(X^2) = ∞. Thus, for α > 2,

var(X) = αβ^2/(α − 2) − (αβ/(α − 1))^2 = αβ^2 / ((α − 1)^2 (α − 2)).

For any α > 0,

P(X ≤ x) = ∫_β^x (α/β)(β/y)^{α+1} dy = 1 − (β/x)^α for x > β.

Putting P(X ≤ x) = 0.5 gives for the median m the value m = 2^{1/α} β.
(b) The mean of the income is 4,500 dollars and the median is 3,402 dollars. The percentage of the population with an income between 25 and 40 thousand dollars is 0.37%, as follows from

P(25 < X ≤ 40) = P(X ≤ 40) − P(X ≤ 25) = (2.5/25)^{2.25} − (2.5/40)^{2.25} = 0.0037.

(c) The Pareto distribution shows rather well the way that a larger portion of the wealth in a country is owned by a smaller percentage of the people in that country. The explanation is that the Pareto density f(x) decreases from x = β onwards and has a long tail. Thus, most realizations of a Pareto distributed random variable tend to be small, but occasionally the realizations will be very large. This is quite typical for income distributions. Also, the Pareto distribution has the property that the mean is always larger than the median.

4.41 The age of the bulb upon replacement is Y = g(X), where g(x) = x for x ≤ 10 and g(x) = 10 for x > 10. Then E(Y) = ∫_2^{10} x (1/10) dx + ∫_{10}^{12} 10 (1/10) dx and E(Y^2) = ∫_2^{10} x^2 (1/10) dx + ∫_{10}^{12} 10^2 (1/10) dx. This leads to E(Y) = 6.8 and σ(Y) = 2.613.

4.42 Let the random variable X be the thickness of a sheet of steel and Y be the thickness of a non-scrapped sheet of steel. Then P(Y > y) = P(X > y | X > 125) for 125 ≤ y ≤ 150. The random variable X is uniformly distributed on (120, 150) and so P(X > x) = (150 − x)/30 for 120 ≤ x ≤ 150. This implies that

P(Y > y) = (150 − y)/25 for 125 ≤ y ≤ 150.

In other words, the random variable Y is uniformly distributed on (125, 150). Hence the expected value and the standard deviation of the thickness of a non-scrapped sheet of steel are given by (125 + 150)/2 = 137.5 millimeters and 25/√12 = 7.217 millimeters.

4.43 Since P(min(X, Y) > t) = P(X > t, Y > t) = P(X > t)P(Y > t), we get P(min(X, Y) > t) = e^{−αt} e^{−βt} = e^{−(α+β)t} for all t > 0, and so min(X, Y) is exponentially distributed. Using this result and the memoryless property of the exponential distribution, we have that the time to failure of the reliability system is distributed as T1 + T2 + T3, where T1, T2 and T3 are independent and exponentially distributed with respective parameters 5λ, 4λ and 3λ. Thus

E(T1 + T2 + T3) = 1/(5λ) + 1/(4λ) + 1/(3λ) = 47/(60λ),
σ(T1 + T2 + T3) = (1/(25λ^2) + 1/(16λ^2) + 1/(9λ^2))^{0.5} = √769/(60λ).

4.44 By the memoryless property of the exponential distribution, the time from three o'clock in the afternoon until the next departure of a limousine has an exponential distribution with an expected value of 20 minutes. Using the fact that the standard deviation of an exponential density is the same as the expected value of the density, the expected value and the standard deviation of your waiting time are both equal to 20 minutes.

4.45 Since the sojourn time of each bus is exactly half an hour, the number of buses on the parking lot at 4 p.m. is the number of buses arriving between 3:30 p.m. and 4 p.m. Taking the hour as unit of time, the buses arrive according to a Poisson process with rate λ = 4/3. Using the memoryless property of the Poisson process, the number of buses arriving between 3:30 p.m. and 4 p.m. is Poisson distributed with expected value λ × 1/2 = 2/3.

4.46 Take the hour as unit of time. The average number of arrivals per hour between 6 p.m. and 10 p.m. is 1.2.
The random variable X measuring the time from 6 p.m. until the first arrival after 6 p.m. is exponentially distributed with parameter λ = 1.2. Hence the expected value of X is 1/1.2 = 5/6 hours, or 50 minutes. The median of X follows by solving 1 − e^{−1.2x} = 0.5 and is equal to −ln(0.5)/1.2 = 0.5776 hours, or 34.66 minutes. The probability that the first call occurs between 6:20 p.m. and 6:45 p.m. is given by

P(1/3 ≤ X ≤ 3/4) = e^{−1.2×1/3} − e^{−1.2×3/4} = 0.2638.

Let the random variable Y be the time measured from 7 p.m. until the first arrival after 7 p.m. The probability of no arrival between 7 p.m. and 7:20 p.m. and at least one arrival between 7:20 p.m. and 7:45 p.m. is P(1/3 < Y ≤ 3/4). By the memoryless property of the exponential distribution, the random variable Y has the same exponential distribution as X. Hence the probability P(1/3 < Y ≤ 3/4) is also equal to e^{−1.2×1/3} − e^{−1.2×3/4} = 0.2638.

4.47 The probability that the time between the passings of two consecutive cars is more than c seconds is given by p = ∫_c^∞ λe^{−λt} dt = e^{−λc}. By the lack of memory of the exponential distribution, p = e^{−λc} also gives the probability that no car comes around the corner during the c seconds measured from the moment you arrive at the road. The number of passing cars before you can cross the road has the shifted geometric distribution {(1 − p)^k p, k = 0, 1, . . .}.

4.48 By the lack of memory of the exponential distribution, the remaining washing time of the car being washed in the station has the same exponential density as a newly started washing time. Hence the probability that the car in the washing station will need no more than five additional minutes is equal to

∫_0^5 (1/15) e^{−t/15} dt = 1 − e^{−5/15} = 0.2835.

The probability that you have to wait more than 20 minutes before your car can be washed is equal to P(X1 + X2 > 20), where X1 is the remaining service time of the car in service when you arrive and X2 is the service time of the other car. The random variables X1 and X2 are independent. By the memoryless property of the exponential distribution, X1 has the same exponential distribution as X2. The random variable X1 + X2 has an Erlang-2 distribution and the sought probability is given by

P(X1 + X2 > 20) = e^{−20/15} + (20/15) e^{−20/15} = 0.6151.

Alternatively, this answer can be seen from Rule 4.3 by noting that P(X1 + X2 > 20) is the probability of at most one service completion in the 20 minutes.

4.49 The probability of having a replacement because of a system failure is given by

Σ_{n=0}^∞ P(nT < X ≤ (n + 1)T − a) = Σ_{n=0}^∞ [e^{−µnT} − e^{−µ[(n+1)T−a]}].

This probability is equal to (1 − e^{−µ(T−a)})/(1 − e^{−µT}). The expected time between two replacements is

Σ_{n=1}^∞ nT P((n − 1)T < X ≤ nT) = T/(1 − e^{−µT}).

4.50 The probability that the closest integer to the random observation is odd is equal to

Σ_{k=0}^∞ P(2k + 1/2 < X < 2k + 1 + 1/2) = Σ_{k=0}^∞ ∫_{2k+1/2}^{2k+3/2} e^{−x} dx
= Σ_{k=0}^∞ [e^{−(2k+1/2)} − e^{−(2k+3/2)}] = e^{−1/2}(1 − e^{−1})/(1 − e^{−2}) = e^{−1/2}/(1 + e^{−1}).

The conditional probability that the closest integer to the random observation is odd given that it is larger than the even integer r is equal to

Σ_{k=r/2}^∞ P(2k + 1/2 < X < 2k + 3/2 | X > r) = (1/P(X > r)) Σ_{k=r/2}^∞ P(2k + 1/2 < X < 2k + 3/2, X > r)
= e^r Σ_{k=r/2}^∞ ∫_{2k+1/2}^{2k+3/2} e^{−x} dx = e^r Σ_{k=r/2}^∞ [e^{−(2k+1/2)} − e^{−(2k+3/2)}].

Since Σ_{k=r/2}^∞ [e^{−(2k+1/2)} − e^{−(2k+3/2)}] = e^{−r} Σ_{l=0}^∞ [e^{−(2l+1/2)} − e^{−(2l+3/2)}], the conditional probability that the closest integer to the random observation is odd given that it is larger than r is equal to

Σ_{l=0}^∞ [e^{−(2l+1/2)} − e^{−(2l+3/2)}] = e^{−1/2}(1 − e^{−1})/(1 − e^{−2}) = e^{−1/2}/(1 + e^{−1}).

The conditional probability is the same as the unconditional probability that the closest integer to the random observation from the exponential density is odd. This result can also be explained from the memoryless property of the exponential distribution.
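The Erlang-2 tail probability in Problem 4.48 is easy to confirm numerically. The following is an illustrative sketch, not part of the original solution: it simulates the sum of two independent exponential washing times with mean 15 minutes and compares the fraction exceeding 20 minutes with the closed-form value e^{−20/15}(1 + 20/15).

```python
import math
import random

# Monte Carlo check of P(X1 + X2 > 20) for two independent exponential
# washing times with mean 15 minutes (Problem 4.48).
random.seed(2024)
n = 200_000
rate = 1 / 15  # exponential rate, so the mean washing time is 15 minutes
hits = sum(
    1 for _ in range(n)
    if random.expovariate(rate) + random.expovariate(rate) > 20
)
# closed-form Erlang-2 tail: e^(-20/15) * (1 + 20/15)
exact = math.exp(-20 / 15) * (1 + 20 / 15)
print(round(exact, 4))    # 0.6151
print(round(hits / n, 4))
```

The simulated fraction agrees with 0.6151 to within the Monte Carlo error of roughly 0.001 for this sample size.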
4.51 Your win probability is the probability of having exactly one signal in (s, T). This probability is e^{−λ(T−s)} λ(T − s), by the memoryless property of the Poisson process. Putting the derivative of this expression equal to zero, we get that the optimal value of s is T − 1/λ. The maximal win probability is e^{−1}.

4.52 Let N(t) be the number of events to occur in (0, t). Then,

P(N(a) = k | N(a + b) = n) = P(N(a) = k, N(a + b) − N(a) = n − k) / P(N(a + b) = n)

for any 0 ≤ k ≤ n. We have P(N(a) = k, N(a + b) − N(a) = n − k) = P(N(a) = k) P(N(a + b) − N(a) = n − k), by the independence of N(a) and N(a + b) − N(a). Thus, for k = 0, 1, . . . , n,

P(N(a) = k | N(a + b) = n) = [e^{−λa}(λa)^k/k! × e^{−λb}(λb)^{n−k}/(n − k)!] / [e^{−λ(a+b)}(λ(a + b))^n/n!]
= (n choose k) (a/(a + b))^k (b/(a + b))^{n−k}.

In view of the characteristic properties of the Poisson process, it is not surprising that the conditional distribution of N(a) is the binomial distribution with parameters n and a/(a + b).

4.53 Take the minute as time unit. Let λ = 8/60 and T = 10. The probability that the ferry will leave with two cars is 1 − e^{−λT} = 0.7364. Let the generic variable X be exponentially distributed with an expected value of 1/λ = 7.5 minutes. The expected value of the time until the ferry leaves is

1/λ + E[min(X, T)] = 1/λ + ∫_0^T t λe^{−λt} dt + ∫_T^∞ T λe^{−λt} dt minutes.

This leads to an expected value of 1/λ + (1/λ)(1 − e^{−λT}) = 13.02 minutes.

4.54 Noting that major cracks on the highway occur according to a Poisson process with rate 1/10 per mile, it follows that the probability that there are no major cracks on a specific 15-mile stretch of the highway is e^{−15/10} = 0.2231 and the probability of two or more major cracks on that part of the highway is 1 − e^{−15/10} − (15/10)e^{−15/10} = 0.4422.

4.55 In view of Rule 4.3, we can think of failures occurring according to a Poisson process with a rate of 4 per 1,000 hours. The probability of no more than five failures during 1,000 hours is given by the Poisson probability

Σ_{k=0}^5 e^{−4} 4^k/k! = 0.7851.

The smallest value of n such that Σ_{k=0}^n e^{−4} 4^k/k! ≥ 0.95 is n = 8.

4.56 The probability of no bus arriving during a wait of t minutes at the bus stop is e^{−t/10}. Putting e^{−t/10} = 0.05 gives t = 29.96. You must leave home no later than 7:10 a.m.

4.57 (a) Since the number of goals in the match is Poisson distributed with an expected value of 90 × (1/30) = 3, the answer is 1 − Σ_{k=0}^2 e^{−3} 3^k/k! = 0.5768.
(b) The numbers of goals in disjoint time intervals are independent of each other and so the answer is [e^{−1.5} 1.5^2/2!] × [e^{−1.5} 1.5] = 0.0840.
(c) Let, for k = 0, 1, . . .,

a_k = e^{−3×(12/25)} (3 × (12/25))^k / k! and b_k = e^{−3×(13/25)} (3 × (13/25))^k / k!.

Then, by the results of Rule 3.12, we get that the probability of a draw is equal to Σ_{k=0}^∞ a_k × b_k = 0.2425 and the probability of a win for team A is equal to Σ_{k=1}^∞ a_k × Σ_{n=0}^{k−1} b_n = 0.3524.

4.58 The probability of having no other emergency unit within a distance r of the incident is given by the probability of no emergency unit in a circle with radius r around the point of the incident. The probability of no Poisson event in a region with area πr^2 is e^{−απr^2} and so the desired probability is 1 − e^{−απr^2}.

4.59 The answer is [1 − Φ(20/16)] × 100% = 10.56%.

4.60 The solution of 1 − Φ(x) = 0.05 is given by the percentile z_{0.95} = 1.6449. Thus the cholesterol level of 5.2 + 1.6449 × 0.65 = 6.27 mmol/L is exceeded by 5% of the population.

4.61 An estimate for the standard deviation σ of the demand follows from the formula 50 + σ z_{0.95} = 75, where z_{0.95} = 1.6449 is the 95% percentile of the standard normal distribution. This gives the estimate σ = 15.2.

4.62 The proportion of euro coins that are not accepted by the vending machine is

Φ((22.90 − 23.25)/0.10) + 1 − Φ((23.60 − 23.25)/0.10) = 2[1 − Φ(3.5)] = 0.0005.

4.63 By P(X < 20) = P(X ≤ 20) = P((X − 25)/2.5 ≤ (20 − 25)/2.5) = Φ(−2), we have P(X < 20) = 0.0228.
Finding the standard deviation σ of the thickness of the coating so that P (X < 20) = 0.01 translates into = 0.01. The 0.01th percentile of the N (0, 1) solving σ from Φ 20−25 σ distribution is −2.3263, and so −5/σ = −2.3263, or σ = 2.149. 4.64 The proportion of the mills output that can be used by the customer is equal to 10.15 − 10 9.85 − 10 Φ −Φ = 0.9839. 0.07 0.07 4.65 We have P (|X − µ| > kσ) = P (|Z| > k) = P (Z ≤ −k) + P (Z ≥ k), where Z is N (0, 1) distributed. Since P (Z ≥ k) = P (Z ≤ −k) and P (Z ≥ k) = 1 − Φ(k), the sought result follows. 4.66 Let the random variable Y = aX + b. To evaluate P (Y ≤ y), distinguish between the two cases a ≥ 0 and a < 0. For the case that a ≥ 0, X −µ y − b − aµ y−b =P ≤ P (Y ≤ y) = P X ≤ a σ aσ y − b − aµ = Φ , aσ showing that Y is N (aµ+b, a2 σ 2 ) distributed. For the case that a < 0, y−b y − b − aµ X −µ P (Y ≤ y) = P X ≥ =P ≥ a σ aσ −y + b + aµ y − b − aµ = 1−Φ =1−Φ . aσ |a|σ Using the fact that Φ(−x) =1 − Φ(x) for any x > 0, it next follows y−b−aµ that P (Y ≤ y) = Φ . In other words, Y is N (aµ + b, a2 σ 2 ) |a|σ distributed. 102 4.67 The number of heads in 10,000 tosses of a fair coin is approximately normally distributed with expected value 5,000 and standard deviation 50. The outcome of 5,250 heads lies five standard deviations above the expected value. Without doing any further calculations we can conclude that the claim is highly implausible (1 − Φ(5) = 2.87 × 10−7 ). 4.68 (a) For any z ≥ 0, we have P (|Z| ≤ z) = P (−z ≤ Z ≤ z) = Φ(z)−Φ(−z). Differentiation gives that |Z| has the probability density function 1 2 2 √ e− 2 z for z > 0. 2π Using the change of variable v = z 2 , we get √ Z ∞ Z ∞ 1 2 2 − 1 z2 1 − v e 2 dv = √ . z √ e 2 dz = √ E(|Z|) = π 2π 2π 0 0 Also, noting that E(|Z|2 ) = E(Z 2 ) and using the fact that E(Z 2 ) = 1 for the N (0, 1) distributed random variableqZ, we get E(|Z|2 ) = 1. 2 π 1 − π2 . R∞ (b) Let V = max(Z − c, 0). 
Since E(V ) = 0 P (V > v) dv (see Problem 4.26), we have Z ∞ Z ∞ [1 − Φ(x)] dx. [1 − Φ(v + c)] dv = E(V ) = This gives σ 2 (|Z|) = 1 − and so σ(|Z|) = c 0 By partial integration, we next get E(V ) = x[1 − Φ(x)] The integral R∞ c 1 2 xe− 2 x dx is ∞ c R∞ 1 2 c 2 1 +√ 2π Z ∞ 1 2 xe− 2 x dx. c 1 2 e−y dy = e− 2 c . Thus we get 1 2 1 E(V ) = −c[1 − Φ(c)] + √ e− 2 c . 2π √ 4.69 Since X − Y is N (0, 2σ 2 ) distributed, (X − Y )/(σ 2)√ispN (0, 1) distributed. Thus, using Problem 4.68, E(|X − Y |) = σ 2 2/π. Also, E(X + Y ) = 2µ. The formulas for E(|X − Y |) and E(X + Y ) give two equations in E[max(X, Y )] and E[min(X, Y )], yielding the sought result. q Note: max(X, Y ) and min(X, Y ) have each standard deviation σ 1 − π1 . 103 4.70 The random variable Dn can be represented as Dn = |X1 + · · · + Xn |, where the random variable Xi is equal to 1 if the ith step of the drunkard goes to the right and is otherwise equal to −1. The random variables X1 , . . . , Xn are independent and have the same distribution with expected value µ = 0 and standard deviation σ = 1. The central limit theorem now tells us that X1 +· · ·+Xn is approximately normally √ distributed with expected value 0 and standard deviation n for n large. Thus, x −x P (Dn ≤ x) ≈ Φ √ −Φ √ for x > 0. n n In Problem 4.68, the expected value and the standard deviation of V = |X| are given for a standard normally distributed random variable X. √ Using this result and the fact that (X1 +· · ·+Xn )/ n is approximately N (0, 1) distributed, the approximations for the expected value and the standard deviation of Dn follow. 4.71 Let X1 and X2 be the two measurement errors. Since X1 and X2 are independent, 21 (X1 + X2 ) is normally distributed with expected value √ √ 0 and standard deviation 21 0.0062 l2 + 0.0042 l2 = l 52/2,000. The sought probability is P 1 |X1 + X2 | ≤ 0.005l = P (−0.01l ≤ X1 + X2 ≤ 0.01l) 2 20 20 =Φ √ −Φ − √ = 0.9945. 52 52 4.72 The desired probability is P (|X1 − X2 | ≤ a). 
Since the random variables X1 and X2 are independent, the random variable X1 − X2 is normal distributed p with expected value µ = µ1 − µ2 and standard deviation σ = σ12 + σ22 . It now follows that P (|X1 − X2 | ≤ a) = P (−a ≤ X1 − X2 ≤ a) −a − µ X1 − X2 − µ a−µ =P ≤ ≤ σ σ σ ! ! a − (µ1 − µ2 ) −a − (µ1 − µ2 ) p p =Φ −Φ . σ12 + σ22 σ12 + σ22 104 P 4.73 (a) The profit of Joe and his brother after 52 weeks is 52 i=1 Xi , where 1 the Xi are independent with P (Xi = 10) = 2 and P (Xi = −5) = 21 . The √ expected value and the standard deviation of the Xi are 2.5 and 62.5 − 2.52 = 7.5 dollars. The sought probability is P 52 X 100 − 52 × 2.5 √ Xi ≥ 100 ≈ 1 − Φ = 0.7105. 7.5 52 i=1 (b) Let Xi be the score of the ith roll. Then P 80 X 300 − 80 × 3.5 √ = 0.9048. Xi ≤ 300 ≈ Φ 1.7078 80 i=1 4.74 The random variable Yn = n1 (X1 + · · · + Xn ) − µ has expected value 0 2 and variance σn . Using the central limit theorem, Yn is approximately 2 N (0, σn distributed for n large. Since P (|Yn | > c) = P (Yn < −c) + P (Yn > c), we have −c√n c√n h c√n i P (|Yn | > c) ≈ Φ +1−Φ =2 1−Φ . σ σ σ 4.75 (a) The number of sixes in one throw of 6r dice is distributed as the P binomial random variable S6r = 6r X k=1 k , where the Xk are independent 0−1 variables with expected value µ = 61 and standard deviation √ σ = 61 5. We have P (S6r ≥ r) = P S − 6rµ √ ≥0 . σ 6r 6r By the central limit theorem, this probability tends to 1 − Φ(0) = 21 as r → ∞. (b) Let X1 , . . . Xn be independent and Poisson distributed random variables with expected value 1. The sum X1 + · · · + Xn is Poisson distributed with expected value n. Therefore, n2 nn n + ··· + P (X1 + · · · + Xn ≤ n) = e−n 1 + + 1! 2! n! Next repeat the arguments in (a). 4.76 The probability is about 1 − Φ(2.828) = 0.0023. 105 4.77 The number of even numbers in any given drawing of the lotto 6/45 has a hypergeometric distribution with expected value µ = 132 45 and √ 1 standard deviation σ = 15 299. 
By the central limit theorem, the total number of even numbers that will be obtained in 52 drawings of the lotto 6/45 is approximately normally distributed with expected √ value 52µ = 152.533 and standard deviation σ 52 = 8.313. The outcome 162 lies (162 − 152.533)/8.313 = 1.14 standard deviations above the expected value. This outcome does not cast doubts on the unbiased nature of the lotto drawings. 4.78 The probability distribution of the total rainfall (in millimeters) next year in Amsterdam can be modeled by a normal distribution with expected value 799.5 and standard deviation 121.39. The sought probability is equal to 1−Φ 1000 − 799.5 = 0.0493. 121.39 4.79 The payoff per game has an expected value of µ = 12 dollars and √ a standard deviation of σ = 6,142 − 144 = 77.447 dollars. By the central limit theorem, the probability of the casino losing money in a given week is approximately Φ − 5, 000 × 3 √ = 1 − Φ(2.739) = 3.1 × 10−3 . 77.447 5, 000 4.80 The total number of bets lost by the casino is X1 + · · · + Xn , where the random variable Xi is equal to 1 if the casino loses the ith p bet and Xi is otherwise equal to 0. We have E(Xi ) = p and σ(Xi ) = p(1 − p). By the central limit theorem, X1 + · · · + Xn has approximately a normal distribution with expected value np and standard deviation 1√ [p(1 − p)] 2 n for large n. The casino loses money to the player if and only if the casino loses 12 n + 1 or more bets (assume that n is even). The probability of this is approximately equal to 1 − Φ (βn ), where βn = 1 2n + 1 − np 1/2 √ . p(1 − p) n The loss probability is about 0.1876, 0.0033, and 6.1 × 10−18 for n = 1,000, 10,000 and 100,000. Assuming one dollar is staked on each 106 bet, then for n plays the profit of the casino over the gamblers equals Wn = n − 2 (X1 + · · · + Xn ). We have 1√ E(Wn ) = n(1 − 2p) and σ(Wn ) = 2[p(1 − p)] 2 n. 
The random variable Wn is approximately normally distributed for large n. The standard normal density has 99% of its probability mass to the right of the point −2.326. This means that, with a probability of approximately 99%, the profit of the casino over the player is greater than n(1 − 2p) − 2.326 × 2[p(1 − p)]^{1/2} √n.

4.81 The premium c should be chosen such that P(rc − (X1 + · · · + Xr) ≥ (1/10) rc) is at least 0.99, where Xi is the amount claimed by the ith policyholder. This probability can be approximated by Φ(((9/10) rc − rµ)/(σ√r)). Thus c should be chosen such that ((9/10) rc − rµ)/(σ√r) equals the 0.99th percentile 2.326 of the standard normal distribution. Therefore,

c ≈ (10/9)(µ + 2.326σ/√r).

4.82 The probability mass function of the number of copies of the appliance to be used when an infinite supply would be available is a Poisson distribution with expected value 150/2 = 75. Suppose that Q copies of the appliance are stored in the space ship. Let the exponentially distributed random variable Xi be the lifetime (in days) of the ith copy used. Then the probability of a shortage during the space mission is P(X1 + · · · + XQ ≤ 150). The random variables X1, . . . , XQ are independent and have an expected value of 1/λ days and a standard deviation of 1/λ days, where λ = 1/2. By the central limit theorem,

P(X1 + · · · + XQ ≤ 150) = P((X1 + · · · + XQ − 2Q)/(2√Q) ≤ (150 − 2Q)/(2√Q)) ≈ Φ((150 − 2Q)/(2√Q)).

The 0.001th percentile of the standard normal distribution is −3.0902. Solving the equation (150 − 2Q)/(2√Q) = −3.0902 gives Q = 106.96 and so the normal approximation suggests to store 107 units.
Note: The exact value of the required stock follows by finding the smallest value of Q for which the Poisson probability Σ_{k>Q} e^{−75} 75^k/k! is smaller than or equal to 10^{−3}. This gives Q = 103.

4.83 Let Xi be the amount of dollars the casino owner loses on the ith bet. Then the Xi are independent random variables with P(Xi = 10) = 18/37 and P(Xi = −5) = 19/37. Then, E(Xi) = 85/37 and σ(Xi) = (45/37)√38. The amount of dollars lost by the casino owner is Σ_{i=1}^{2,500} Xi and is approximately N(µ, σ^2) distributed with µ = 2,500 × 85/37 and σ = 50 × (45/37)√38. The casino owner will lose more than 6,500 dollars with a probability of about

1 − Φ((6,500 − µ)/σ) = 0.0218.

4.84 By a change to polar coordinates x = r cos(θ) and y = r sin(θ) with dx dy = r dr dθ, it follows that

∫_{−∞}^∞ ∫_{−∞}^∞ e^{−(1/2)(x^2+y^2)} dx dy = ∫_0^{2π} ∫_0^∞ e^{−(1/2)r^2} r dr dθ = 2π ∫_0^∞ r e^{−(1/2)r^2} dr = π ∫_0^∞ e^{−(1/2)r^2} dr^2 = 2π,

using the fact that ∫_0^∞ e^{−(1/2)y} dy = 2. This proves the result I = √(2π).
Note: This result implies Γ(1/2) = √π. The change of variable t = (1/2)x^2 in I = 2 ∫_0^∞ e^{−(1/2)x^2} dx leads to √(2π) = (2/√2) ∫_0^∞ e^{−t} t^{−1/2} dt, showing that Γ(1/2) = √π.

4.85 Let Vn be the bankroll (in dollars) of the gambler after the nth bet. Then Vn = (1 − α)V_{n−1} + αV_{n−1}Rn, where α = 0.05, V0 = 1,000 and the Ri are independent random variables with P(Ri = 1/4) = 19/37 and P(Ri = 2) = 18/37. Iterating this equality gives

Vn = (1 − α + αR1) × · · · × (1 − α + αRn) V0.

This leads to ln(Vn/V0) = Σ_{i=1}^n ln(1 − α + αRi). The random variables Xi = ln(1 − α + αRi) are independent. The expected value and the variance of these random variables are

µ = (19/37) ln(0.9625) + (18/37) ln(1.05),
σ^2 = (19/37) ln^2(0.9625) + (18/37) ln^2(1.05) − µ^2.

By the central limit theorem, the random variable ln(V100/V0) is approximately N(100µ, 100σ^2) distributed (the gambler's bankroll after 100 bets is approximately lognormally distributed). The probability that the gambler will take home more than d dollars is P(Vn > V0 + d) = P(ln(Vn/V0) > ln(1 + d/V0)). This probability is approximately equal to

1 − Φ((ln(1 + d/V0) − 100µ)/(10σ))

and has the values 0.8276, 0.5049, 0.2581, and 0.0264 for d = 0, 500, 1,000, and 2,500.
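The lognormal approximation of Problem 4.85 can be evaluated directly. The sketch below is illustrative and not part of the original solution; it computes µ and σ from the two outcomes of ln(1 − α + αRi) and evaluates 1 − Φ((ln(1 + d/V0) − 100µ)/(10σ)) for the four values of d, using the standard normal cdf expressed through the error function.

```python
import math

def phi(x: float) -> float:
    """Standard normal cdf, computed via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p_lose, p_win = 19 / 37, 18 / 37                 # P(Ri = 1/4) and P(Ri = 2)
x_lose, x_win = math.log(0.9625), math.log(1.05)  # the two values of ln(1 - a + a*Ri)
mu = p_lose * x_lose + p_win * x_win
sigma = math.sqrt(p_lose * x_lose**2 + p_win * x_win**2 - mu**2)
# P(V100 > V0 + d) for V0 = 1000 and the four stated values of d
probs = {d: 1 - phi((math.log(1 + d / 1000) - 100 * mu) / (10 * sigma))
         for d in (0, 500, 1000, 2500)}
for d, p in probs.items():
    print(d, round(p, 4))
```

For d = 0, 1,000, and 2,500 this reproduces 0.8276, 0.2581, and 0.0264; the d = 500 case evaluates to about 0.505 under this approximation.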
4.86 Denoting by the random variable Fn the factor at which the size of the population changes in the nth generation, the size Sn of the population after n generations is distributed as (F1 × · · · × Fn )s0 . By the central limit theorem, n X ln(Fi ) + ln(s0 ) ln(Sn ) = i=1 has approximately a normal distribution with expected value nµ1 + √ ln(s0 ) and standard deviation σ1 n for n large, where µ1 and σ1 are the expected value and the standard deviation of the ln(Fi ). The numerical values of µ1 and σ1 are given by µ1 = 0.5ln(1.25) + 0.5ln(0.8) = 0 p σ1 = 0.5[ln(1.25)]2 + 0.5[ln(0.8)]2 = 0.22314. Since ln(Sn ) has approximately a normal distribution with expected √ value ln(s0 ) and standard deviation 0.22314 n, the probability distribution of Sn can be approximated by a lognormal distribution with √ parameters µ = ln(s0 ) and σ = 0.22314 n. √ 4.87 The distance can be modeled as X 2 + Y 2 , where X and Y are independent N (0, 1) random variables. The random variable X 2 + Y 2 has 1 the chi-square density f (v) = 21 e− 2 v . We have P( p X2 2 2 2 2 + Y ≤ r) = P (X + Y ≤ r ) = Z r2 f (x) dx for r > 0. 0 Hence the probability density of the distance from the center of the 1 2 target to the point of impact is 2rf (r2 ) = re− 2 r for r > 0. The √ R ∞ 2 − 1 r2 expected value of the distance is 0 r e 2 dr = 21 2π. The density 109 √ 1 2 re− 2 r assumes its maximum value 1/ e at r = 1.pThus the mode of the distance is 1. The median of the distance is 2 ln(2), as follows Rx 1 2 by solving the equation 0 re− 2 r dr = 0.5. 4.88 If the random variable X is positive, the result follows directly from Rule 4.6 by noting that x1 is strictly decreasing for x > 0 and has − x12 as its derivative. If X can take on both positive and negative values, we use first principles. Then, by P (Y ≤ y) = P X1 ≤ y , we have P (Y ≤ y) = ( P (X ≤ 0) + P 0 < X ≤ P y1 ≤ X ≤ 0 1 y for y > 0 for y ≤ 0. Differentiation gives that Y has the density function y12 f y1 . 
The desired result next follows by noting that $\frac{1}{y^2}\cdot\frac{1}{\pi(1+1/y^2)} = \frac{1}{\pi(1+y^2)}$.

4.89 The inverse of the function $y = \frac{1}{2}ms^2$ is $s = \sqrt{2y/m}$. We have $\frac{ds}{dy} = \frac{1}{\sqrt{2ym}}$. An application of Rule 4.6 gives that the probability density of the kinetic energy $E$ is
$$\frac{2}{c^3}\sqrt{\frac{m^3y}{\pi}}\; e^{-my/c^2} \quad\text{for } y > 0.$$

4.90 The conditions of Rule 4.6 are not satisfied for the random variable $\ln(|X|^a)$. Noting that $0 < |X|^a < 1$ and so $\ln(|X|^a) < 0$, we get by first principles that
$$P\big(\ln(|X|^a) \le x\big) = P\Big(\ln(|X|) \le \frac{x}{a}\Big) = P\big(|X| \le e^{x/a}\big) \quad\text{for } x \le 0.$$
Therefore, using the fact that $X$ is uniformly distributed on $(-1,1)$,
$$P\big(\ln(|X|^a) \le x\big) = P\big(-e^{x/a} \le X \le e^{x/a}\big) = \frac{2e^{x/a}}{2} \quad\text{for } x \le 0.$$
This shows that the probability density of $\ln(|X|^a)$ is $\frac{1}{a}e^{x/a}$ for $x < 0$ and is 0 otherwise. Note: For $a < 0$, the probability density of $\ln(|X|^a)$ is $-\frac{1}{a}e^{x/a}$ for $x > 0$ and 0 otherwise. This result readily follows by noting that $\ln(|X|^a) = -\ln(|X|^{-a})$ for $a < 0$.

4.91 It follows from $P(Y \le y) = P\big(X \le \frac{\ln(y)}{\ln(10)}\big)$ that
$$P(Y \le y) = \frac{\ln(y)}{\ln(10)} \quad\text{for } 1 \le y \le 10.$$
Next, differentiation shows that $Y$ has the density function $\frac{1}{\ln(10)\,y}$ for $1 < y < 10$.

4.92 (a) The Weibull distributed random variable has the cumulative probability distribution function $F(x) = 1 - e^{-(\lambda x)^\alpha}$ for $x \ge 0$. Letting $u$ be a random number between 0 and 1, the solution of the equation $F(x) = u$ gives the random observation $x = \frac{1}{\lambda}[-\ln(1-u)]^{1/\alpha}$ from the Weibull distribution. Since $1-u$ is also a random number between 0 and 1, one can also take
$$x = \frac{1}{\lambda}[-\ln(u)]^{1/\alpha}$$
as a random observation from the Weibull distribution. The Weibull distribution with $\alpha = 1$ is the exponential distribution.
(b) Let $u$ be a random number between 0 and 1. Then solving the equation $F(x) = u$ for $F(x) = e^x/(1+e^x)$ gives $e^x = \frac{u}{1-u}$. Thus $x = \ln\big(\frac{u}{1-u}\big)$ is a random observation from the logistic distribution.

4.93 Generate $n$ random numbers $u_1, \ldots, u_n$ from the interval $(0,1)$. Then the number $-\frac{1}{\lambda}\big(\ln(u_1) + \cdots + \ln(u_n)\big) = -\frac{1}{\lambda}\ln(u_1 \times \cdots \times u_n)$ is a random observation from the gamma distributed random variable.

4.94 It is easiest to consider first the case of a random variable $V$ having a triangular density on $(0,1)$ with $m_0$ as its most likely value. The density function $f(v)$ of $V$ is given by $f(v) = 2v/m_0$ for $0 < v \le m_0$ and $f(v) = 2(1-v)/(1-m_0)$ for $m_0 < v < 1$. The probability distribution function of $V$ is
$$P(V \le v) = \begin{cases} \dfrac{v^2}{m_0} & \text{for } 0 \le v \le m_0\\[6pt] \dfrac{2v - v^2 - m_0}{1-m_0} & \text{for } m_0 \le v \le 1.\end{cases}$$
To obtain a random observation from $V$, generate a random number $u$ between 0 and 1 and solve $v$ from the equation $P(V \le v) = u$. The solution is $v = \sqrt{m_0 u}$ if $0 < u \le m_0$ and $v = 1 - \sqrt{(1-m_0)(1-u)}$ for $m_0 < u < 1$. If $X$ has a triangular density on $(a,b)$ with $m$ as its most likely value, then the random variable $V = (X-a)/(b-a)$ has a triangular density on $(0,1)$ with $m_0 = (m-a)/(b-a)$. Hence a random observation from $X$ is given by $x = a + (b-a)\sqrt{m_0 u}$ if $0 < u \le m_0$ and $x = a + (b-a)\big[1 - \sqrt{(1-m_0)(1-u)}\big]$ if $m_0 < u < 1$.

4.95 (a) The random variable $X$ is with probability $p$ distributed as an exponential random variable with parameter $\lambda_1$ and with probability $1-p$ as an exponential random variable with parameter $\lambda_2$. Hence generate two random numbers $u_1$ and $u_2$ from $(0,1)$. The random observation is $-(1/\lambda_1)\ln(u_2)$ if $u_1 \le p$ and is $-(1/\lambda_2)\ln(u_2)$ otherwise.
(b) Let $V$ be exponentially distributed with parameter $\lambda$. Then the random variable $X$ is with probability $p$ distributed as $V$ and with probability $1-p$ as $-V$. Hence generate two random numbers $u_1$ and $u_2$ from $(0,1)$. The random observation is $-\frac{1}{\lambda}\ln(u_2)$ if $u_1 \le p$ and $\frac{1}{\lambda}\ln(u_2)$ otherwise. This simulation method is called the composition method.

4.96 Apply the definition $r(x) = \frac{f(x)}{1-F(x)}$ to get the expression for $r(x)$. It is a matter of elementary but tedious algebra to show that $r(x)$ is first decreasing and then increasing, with a bathtub shape.
As an illustration of the bathtub shape of $r(x)$, the figure gives the graph of $r(x)$ for $\nu = 0.5$. [Figure omitted: graph of $r(x)$ for $\nu = 0.5$, with $r(x)$ on the vertical axis from 0 to 20 and $x$ on the horizontal axis from 0 to 1.]

4.97 Since $P(V > x) = P(X_1 > x, \ldots, X_n > x)$, it follows from the independence of the $X_k$ and the failure rate representation of the reliability function that
$$P(V > x) = P(X_1 > x)\cdots P(X_n > x) = e^{-\int_0^x r_1(y)\,dy}\cdots e^{-\int_0^x r_n(y)\,dy} \quad\text{for } x \ge 0.$$
This gives $P(V > x) = e^{-\int_0^x [\sum_{k=1}^n r_k(y)]\,dy}$, proving the desired result.

4.98 Noting that $1 - F(x) = e^{-\int_0^x r(t)\,dt}$, we get
$$1 - F(x) = e^{-\ln(1+x)} = \frac{1}{1+x} \quad\text{for } x \ge 0.$$

4.99 Let $F(x) = P(X \le x)$. Since
$$\int_0^x r(t)\,dt = \lambda\int_0^x d\ln(1+t^\alpha) = \lambda\ln(1+x^\alpha),$$
it follows that $F(x) = 1 - e^{-\lambda\ln(1+x^\alpha)}$. Thus the reliability function is given by $1 - F(x) = (1+x^\alpha)^{-\lambda}$ for $x \ge 0$. The failure rate is $r(x) = \lambda\alpha x^{\alpha-1}/(1+x^\alpha)$, with derivative $r'(x) = \lambda\alpha x^{\alpha-2}(\alpha - 1 - x^\alpha)/(1+x^\alpha)^2$. For the case that $\alpha > 1$, the derivative is positive for $x < (\alpha-1)^{1/\alpha}$ and negative for $x > (\alpha-1)^{1/\alpha}$, showing that $r(x)$ first increases and then decreases.

4.100 Let $F_i(x) = P(X_i \le x)$. Then
$$P(X_i > s+t \mid X_i > s) = \frac{1 - F_i(s+t)}{1 - F_i(s)} = \frac{e^{-\int_0^{s+t} r_i(x)\,dx}}{e^{-\int_0^s r_i(x)\,dx}}.$$
Since $r_1(x) = \frac{1}{2}r_2(x)$, we get
$$P(X_1 > s+t \mid X_1 > s) = \frac{e^{-\frac{1}{2}\int_0^{s+t} r_2(x)\,dx}}{e^{-\frac{1}{2}\int_0^s r_2(x)\,dx}} = \Bigg(\frac{e^{-\int_0^{s+t} r_2(x)\,dx}}{e^{-\int_0^s r_2(x)\,dx}}\Bigg)^{1/2},$$
showing that $P(X_1 > s+t \mid X_1 > s) = \sqrt{P(X_2 > s+t \mid X_2 > s)}$.

4.101 The integral $\int_{1{,}000}^\infty e^{-(y/1{,}250)^{2.1}}\,dy \big/ e^{-(1{,}000/1{,}250)^{2.1}}$ gives the mean residual lifetime $m(1{,}000)$. By numerical integration, $m(1{,}000) = 516.70$ hours.

4.102 In order to maximize $-\sum_{i=1}^\infty p_i\log p_i$ subject to $p_i \ge 0$ for all $i$, $\sum_{i=1}^\infty p_i = 1$ and $\sum_{i=1}^\infty i p_i = \mu$, form the Lagrange function
$$F(p_1, p_2, \ldots, \lambda_1, \lambda_2) = -\sum_{i=1}^\infty p_i\log p_i + \lambda_1\Big(\sum_{i=1}^\infty p_i - 1\Big) + \lambda_2\Big(\sum_{i=1}^\infty i p_i - \mu\Big),$$
where $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers. Putting $\partial F/\partial p_i = 0$ gives the equations $-1 - \log p_i + \lambda_1 + \lambda_2 i = 0$ for $i \ge 1$ and so
$$p_i = e^{\lambda_1 - 1 + \lambda_2 i} \quad\text{for } i \ge 1.$$
The condition $\sum_{i=1}^\infty p_i = 1$ implies that $\lambda_2 < 0$ and $e^{\lambda_1 - 1} = (1 - e^{\lambda_2})e^{-\lambda_2}$. Hence we have $p_i = (1 - e^{\lambda_2})e^{\lambda_2(i-1)}$ for $i \ge 1$. Letting $r = 1 - e^{\lambda_2}$, we get the geometric distribution $p_i = r(1-r)^{i-1}$ for $i \ge 1$. The condition $\sum_{i=1}^\infty i p_i = \mu$ implies that $\frac{1}{r} = \mu$ and so $r = \frac{1}{\mu}$.

4.103 Form the Lagrange function
$$F(p_1, \ldots, p_n, \lambda_1, \lambda_2) = -\sum_{i=1}^n p_i\log p_i + \lambda_1\Big(\sum_{i=1}^n p_i - 1\Big) + \lambda_2\Big(\sum_{i=1}^n p_i E_i - E\Big),$$
where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers. Putting $\partial F/\partial p_i = 0$ results in $-1 - \log p_i + \lambda_1 + \lambda_2 E_i = 0$ and so $p_i = e^{\lambda_1 - 1 + \lambda_2 E_i}$ for all $i$. Substituting $p_i$ into the constraint $\sum_{i=1}^n p_i = 1$ gives $e^{\lambda_1 - 1} = 1/\sum_{k=1}^n e^{\lambda_2 E_k}$. Thus
$$p_i = \frac{e^{\lambda_2 E_i}}{\sum_{k=1}^n e^{\lambda_2 E_k}} \quad\text{for all } i.$$
Substituting this into $\sum_{i=1}^n p_i E_i = E$ gives the equation
$$\sum_{i=1}^n E_i e^{\lambda_2 E_i} - E\sum_{k=1}^n e^{\lambda_2 E_k} = 0$$
for $\lambda_2$. Replacing $\lambda_2$ by $-\beta$, we get the desired expression for the $p_i^*$ and the equation for the unknown $\beta$.

4.104 Apply Rule 10.8 with $E_1 = 4.50$, $E_2 = 6.25$, $E_3 = 7.50$, and $E = 5.75$. Using a numerical root-finding method, we obtain $\beta = 0.218406$. The maximum entropy probabilities of a customer ordering a regular cheeseburger, a double cheeseburger and a big cheeseburger are 0.4542, 0.3099 and 0.2359.

Chapter 5

5.1 Let $X$ be the low points rolled and $Y$ be the high points rolled. These random variables are defined on the sample space consisting of the 36 equiprobable outcomes $(i,j)$ with $1 \le i, j \le 6$, where $i$ is the number shown by the first die and $j$ is the number shown by the second die. For $k < l$, the event $\{X = k, Y = l\}$ occurs for the outcomes $(k,l)$ and $(l,k)$. This gives $P(X = k, Y = l) = \frac{2}{36}$ for $1 \le k < l \le 6$. Further, $P(X = k, Y = k) = \frac{1}{36}$ for all $k$.

5.2 Imagine that the 52 cards are numbered as $1, 2, \ldots, 52$. The random variables $X$ and $Y$ are defined on the sample space consisting of all $\binom{52}{13}$ sets of 13 different numbers from $1, 2, \ldots, 52$. Each element of the sample space is equally likely. The number of elements $\omega$ for which $X(\omega) = x$ and $Y(\omega) = y$ is equal to $\binom{13}{x}\binom{13}{y}\binom{26}{13-x-y}$.
Hence the joint probability mass function of $X$ and $Y$ is given by
$$P(X = x, Y = y) = \frac{\binom{13}{x}\binom{13}{y}\binom{26}{13-x-y}}{\binom{52}{13}}$$
for all integers $x, y$ with $x, y \ge 0$ and $x + y \le 13$.

5.3 The joint mass function of $X$ and $Y - X$ is
$$P(X = x, Y - X = z) = P(X = x, Y = z + x) = \frac{e^{-2}}{x!\,z!} \quad\text{for } x, z = 0, 1, \ldots.$$
Since $\sum_{z=0}^\infty \frac{e^{-2}}{x!\,z!} = \frac{e^{-1}}{x!}$ and $\sum_{x=0}^\infty \frac{e^{-2}}{x!\,z!} = \frac{e^{-1}}{z!}$, the marginal distributions of $X$ and $Y - X$ are Poisson distributions with expected value 1. Noting that $P(X = x, Y - X = z) = P(X = x)P(Y - X = z)$ for all $x, z$, we have by Rule 3.7 that $X$ and $Y - X$ are independent. Thus, using Example 3.14, the random variable $Y$ is Poisson distributed with expected value 2.

5.4 The joint probability mass function of $X$ and $Y$ satisfies
$$P(X = i, Y = 4+i) = \binom{3+i}{i}\,0.45^i\,0.55^4 \quad\text{for } i = 0, 1, 2, 3$$
$$P(X = 4, Y = 4+k) = \binom{3+k}{k}\,0.45^4\,0.55^k \quad\text{for } k = 0, 1, 2, 3.$$
The other $P(X = i, Y = j)$ are zero.

5.5 The sample space for $X$ and $Y$ is the set of the $\binom{10}{3} = 120$ combinations of three distinct numbers from 1 to 10. The joint mass function of $X$ and $Y$ is
$$P(X = x, Y = y) = \frac{y - x - 1}{120} \quad\text{for } 1 \le x \le 8,\; x + 2 \le y \le 10.$$
The marginal distributions are
$$P(X = x) = \sum_{y=x+2}^{10}\frac{y-x-1}{120} = \frac{(10-x)(9-x)}{240} \quad\text{for } 1 \le x \le 8$$
$$P(Y = y) = \sum_{x=1}^{y-2}\frac{y-x-1}{120} = \frac{(y-1)(y-2)}{240} \quad\text{for } 3 \le y \le 10.$$
Further, for $2 \le k \le 9$,
$$P(Y - X = k) = \sum_{x=1}^{10-k} P(X = x, Y = x+k) = \frac{(k-1)(10-k)}{120}.$$

5.6 The random variables $X$ and $Y$ are defined on a countably infinite sample space consisting of all pairs $(x,y)$ of positive integers with $x \ne y$. The joint probability mass function of $X$ and $Y$ is given by
$$P(X = x, Y = y) = \Big(\frac{8}{10}\Big)^{x-1}\frac{1}{10}\Big(\frac{9}{10}\Big)^{y-x-1}\frac{1}{10} \quad\text{for } 1 \le x < y$$
$$P(X = x, Y = y) = \Big(\frac{8}{10}\Big)^{y-1}\frac{1}{10}\Big(\frac{9}{10}\Big)^{x-y-1}\frac{1}{10} \quad\text{for } 1 \le y < x.$$
Let the random variables $V$ and $W$ be defined by $V = \min(X,Y)$ and $W = \max(X,Y)$. Then,
$$P(V = v) = \sum_{y=v+1}^\infty P(X = v, Y = y) + \sum_{x=v+1}^\infty P(X = x, Y = v) = 2\Big(\frac{8}{10}\Big)^{v-1}\frac{1}{10} \quad\text{for } v = 1, 2, \ldots.$$
Noting that $\frac{1}{100}\sum_{x=1}^{w-1}\big(\frac{8}{10}\big)^{x-1}\big(\frac{9}{10}\big)^{w-x-1} = \frac{1}{72}\big(\frac{9}{10}\big)^w\sum_{x=1}^{w-1}\big(\frac{8}{9}\big)^x$, we find after some algebra that
$$P(W = w) = \sum_{x=1}^{w-1} P(X = x, Y = w) + \sum_{y=1}^{w-1} P(X = w, Y = y) = \frac{2}{9}\Big(\frac{9}{10}\Big)^w\Big[1 - \Big(\frac{8}{9}\Big)^{w-1}\Big] \quad\text{for } w = 2, 3, \ldots.$$

5.7 Using the formula $P(X = x, Y = y, N = n) = \frac{1}{6}\binom{n}{x}\binom{n}{y}\big(\frac{1}{2}\big)^{2n}$, we find that
$$P(X = x, Y = y) = \frac{1}{6}\sum_{n=1}^6 \binom{n}{x}\binom{n}{y}\Big(\frac{1}{2}\Big)^{2n} \quad\text{for } 0 \le x, y \le 6.$$
Since $P(X = Y) = \frac{1}{6}\sum_{n=1}^6 \big(\frac{1}{2}\big)^{2n}\sum_{x=0}^n \binom{n}{x}^2$, it follows that
$$P(X = Y) = \frac{1}{6}\sum_{n=1}^6 \binom{2n}{n}\Big(\frac{1}{2}\Big)^{2n} = 0.3221.$$

5.8 The random variables $X$, $Y$ and $N$ are defined on a countably infinite sample space. The event $\{X = i, Y = j, N = n\}$ can occur in $\binom{n-1}{i-1}\binom{n-1-(i-1)}{j-1}$ ways. This is the number of ways to choose $i-1$ places for the first $i-1$ heads of coin 1 and to choose $j-1$ non-overlapping places for $j-1$ heads of coin 2 in the first $n-1$ tosses. Thus the joint probability mass function of $X$, $Y$ and $N$ is given by
$$P(X = i, Y = j, N = n) = \binom{n-1}{i-1}\binom{n-i}{j-1}\Big(\frac{1}{4}\Big)^n$$
for $i, j = 1, 2, \ldots$ and $n = i+j-1, i+j, \ldots$. By $P(X = i, Y = j) = \sum_{n=i+j-1}^\infty P(X = i, Y = j, N = n)$, it follows that
$$P(X = i, Y = j) = \sum_{n=i+j-1}^\infty \frac{(n-1)!\,(n-i)!}{(i-1)!\,(n-i)!\,(n-i-j+1)!\,(j-1)!}\Big(\frac{1}{4}\Big)^n = \binom{i+j-2}{i-1}\sum_{n=i+j-1}^\infty \binom{n-1}{i+j-2}\Big(\frac{1}{4}\Big)^n.$$
Using the identity $\sum_{m=r}^\infty \binom{m}{r}a^m = a^r/(1-a)^{r+1}$ for $0 < a < 1$, it follows that
$$P(X = i, Y = j) = \binom{i+j-2}{i-1}\frac{1}{4}\sum_{m=i+j-2}^\infty \binom{m}{i+j-2}\Big(\frac{1}{4}\Big)^m = \binom{i+j-2}{i-1}\Big(\frac{1}{3}\Big)^{i+j-1}.$$
By $P(X = i) = \sum_{j=1}^\infty P(X = i, Y = j)$,
$$P(X = i) = \sum_{j=1}^\infty \binom{i+j-2}{i-1}\Big(\frac{1}{3}\Big)^{i+j-1} = \sum_{n=i-1}^\infty \binom{n}{i-1}\Big(\frac{1}{3}\Big)^{n+1}.$$
Using again the identity $\sum_{m=r}^\infty \binom{m}{r}a^m = a^r/(1-a)^{r+1}$ for $0 < a < 1$, it follows that
$$P(X = i) = \Big(\frac{1}{2}\Big)^i \quad\text{for } i = 1, 2, \ldots.$$
Further, we have
$$P(X = Y) = \sum_{n=1}^\infty \binom{2n-2}{n-1}\Big(\frac{1}{3}\Big)^{2n-1} = \frac{1}{3}\sum_{k=0}^\infty \binom{2k}{k}\Big(\frac{1}{9}\Big)^k.$$
Using the identity $\sum_{k=0}^\infty \binom{2k}{k}x^k = 1/\sqrt{1-4x}$ for $|x| < \frac{1}{4}$, the numerical value 0.4472 is obtained for $P(X = Y)$.

5.9 The constant $c$ must satisfy
$$1 = c\int_0^\infty \int_0^x e^{-2x}\,dy\,dx = c\int_0^\infty x e^{-2x}\,dx.$$
Noting that the Erlang probability density $4xe^{-2x}$ integrates to 1 over $(0,\infty)$, we find $c = 4$. By $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, we have that $Z = X - Y$ satisfies
$$P(Z > z) = \int_0^\infty dy\int_{y+z}^\infty 4e^{-2x}\,dx = \int_0^\infty 2e^{-2(y+z)}\,dy = e^{-2z} \quad\text{for } z > 0.$$
Thus $Z$ has the exponential density $2e^{-2z}$.

5.10 The constant $c$ is determined by $c\int_0^\infty\int_0^\infty xe^{-2x(1+y)}\,dx\,dy = 1$. The gamma density satisfies $\int_0^\infty \frac{\lambda^k u^{k-1}}{(k-1)!}e^{-\lambda u}\,du = 1$ for any integer $k \ge 1$ and any $\lambda > 0$. Using this identity with $k = 1$ and $\lambda = 2x$, we get
$$c\int_0^\infty\int_0^\infty xe^{-2x(1+y)}\,dx\,dy = c\int_0^\infty \frac{1}{2}e^{-2x}\Big(\int_0^\infty 2xe^{-2xy}\,dy\Big)dx = c\int_0^\infty \frac{1}{2}e^{-2x}\,dx = \frac{1}{4}c = 1$$
and so $c = 4$. Let the random variable $Z$ be defined by $Z = XY$. Then, using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, we find
$$P(XY \le z) = 4\int_0^\infty\int_0^{z/x} xe^{-2x(1+y)}\,dy\,dx = 2\int_0^\infty e^{-2x}\Big(\int_0^{z/x} 2xe^{-2xy}\,dy\Big)dx$$
and so
$$P(XY \le z) = 2\int_0^\infty e^{-2x}\big(1 - e^{-2z}\big)\,dx = \big(1 - e^{-2z}\big)\int_0^\infty 2e^{-2x}\,dx = 1 - e^{-2z} \quad\text{for } z > 0.$$

5.11 Since $c\int_0^1 dx\int_0^1 \sqrt{x+y}\,dy = 1$, the constant $c = (15/4)\big(4\sqrt{2}-2\big)^{-1}$. Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, it follows that
$$P(X+Y \le z) = c\int_0^z dx\int_0^{z-x}\sqrt{x+y}\,dy = \frac{2c}{3}\int_0^z\big(z^{3/2} - x^{3/2}\big)\,dx = \frac{2c}{5}z^2\sqrt{z} \quad\text{for } 0 \le z \le 1$$
$$P(X+Y > z) = c\int_{z-1}^1 dx\int_{z-x}^1 \sqrt{x+y}\,dy = \frac{4c}{15}\big(2^{5/2} - z^{5/2}\big) - \frac{2c}{3}(2-z)z^{3/2} \quad\text{for } 1 \le z \le 2.$$
Differentiation gives that the density function of $X+Y$ is $cz\sqrt{z}$ for $0 < z < 1$ and $c(2-z)\sqrt{z}$ for $1 \le z < 2$.

5.12 The random variable $V = 2\pi\sqrt{X^2+Y^2}$ gives the circumference of the circle. Thus $P(V > \pi) = P\big(X^2+Y^2 > \frac{1}{4}\big)$. Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$ with $C = \{(x,y) : x, y \ge 0,\, x^2+y^2 \le \frac{1}{4}\}$, we get
$$P\Big(X^2+Y^2 \le \frac{1}{4}\Big) = \int_0^{0.5}\int_0^{\sqrt{0.25-x^2}}(x+y)\,dy\,dx = \int_0^{0.5} x\sqrt{0.25-x^2}\,dx + \frac{1}{24} = \frac{1}{2}\int_0^{0.25}\sqrt{0.25-u}\,du + \frac{1}{24} = \frac{1}{12}.$$
Therefore $P(V > \pi) = \frac{11}{12}$.

5.13 Let $U_1$ and $U_2$ be independent and uniformly distributed on $(0,1)$.
Then, for $\Delta x$ and $\Delta y$ small,
$$P(x < X \le x+\Delta x,\, y < Y \le y+\Delta y) = P(x < U_1 \le x+\Delta x,\, y < U_2 \le y+\Delta y) + P(x < U_2 \le x+\Delta x,\, y < U_1 \le y+\Delta y) = 2\,\frac{\Delta x}{a}\times\frac{\Delta y}{a}$$
for $0 < x < y < a$. Therefore the joint density function of $X$ and $Y$ is given by $f(x,y) = \frac{2}{a^2}$ for $0 < x < y < a$ and $f(x,y) = 0$ otherwise. Alternatively, the joint density $f(x,y)$ can be obtained from
$$P(X > x, Y \le y) = \Big(\frac{y-x}{a}\Big)^2 \quad\text{for } 0 \le x \le y \le a.$$
Next, use the identity $P(X \le x, Y \le y) + P(X > x, Y \le y) = P(Y \le y)$ and apply $f(x,y) = \frac{\partial}{\partial x}\frac{\partial}{\partial y}P(X \le x, Y \le y)$.

5.14 The joint density function of $(X,Y,Z)$ is given by $f(x,y,z) = 1$ for $0 < x, y, z < 1$ and $f(x,y,z) = 0$ otherwise. Using the representation $P\big((X,Y,Z) \in D\big) = \iiint_D f(x,y,z)\,dx\,dy\,dz$ with $D = \{(x,y,z) : 0 \le x,y,z \le 1,\, x+y < z\}$, we get
$$P(X+Y < Z) = \int_0^1 dz\int_0^z dy\int_0^{z-y} dx.$$
This integral can be evaluated as
$$\int_0^1 dz\int_0^z (z-y)\,dy = \int_0^1 \frac{1}{2}z^2\,dz = \frac{1}{6}.$$
Since $P\big(\max(X,Y) < Z\big)$ is the probability that $Z$ is the largest of the three components $X$, $Y$, and $Z$, we have by a symmetry argument that $P\big(\max(X,Y) < Z\big) = \frac{1}{3}$. Thus, by $P\big(\max(X,Y) > Z\big) = 1 - P\big(\max(X,Y) < Z\big)$,
$$P\big(\max(X,Y) > Z\big) = 1 - \frac{1}{3} = \frac{2}{3}.$$
Alternatively, $P\big(\max(X,Y) > Z\big) = \int_0^1 dx\int_0^1 dy\int_0^{\max(x,y)} dz$. This integral is $\int_0^1 dx\big[\int_0^x x\,dy + \int_x^1 y\,dy\big] = \int_0^1 \big[x^2 + \frac{1}{2}(1-x^2)\big]\,dx = \frac{2}{3}$.

5.15 Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, it follows that
$$P(X < Y) = \frac{1}{10}\int_5^{10} dx\int_x^\infty e^{-\frac{1}{2}(y+3-x)}\,dy = \frac{1}{10}\int_5^{10} e^{-\frac{1}{2}(3-x)}\,2e^{-\frac{1}{2}x}\,dx = e^{-\frac{3}{2}}.$$

5.16 Let $Z = X+Y$. By the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, we get that $P(Z \le z)$ is given by
$$\frac{1}{2}\int_0^z dx\int_0^{z-x}(x+y)e^{-(x+y)}\,dy = \frac{1}{2}\int_0^z dx\int_x^z ue^{-u}\,du = \frac{1}{2}\int_0^z\big(-ze^{-z} + xe^{-x} + e^{-x} - e^{-z}\big)\,dx = 1 - e^{-z}\Big(1 + z + \frac{1}{2}z^2\Big)$$
for $z \ge 0$. Hence the density function of $Z = X+Y$ is $f(z) = \frac{1}{2}z^2e^{-z}$ for $z > 0$ and $f(z) = 0$ otherwise. This is the Erlang density with shape parameter 3 and scale parameter 1.
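Results such as those of Problem 5.14 are easy to sanity-check by simulation. A minimal Python sketch (the seed and sample size are arbitrary choices, not part of the original solution):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible
n = 100_000
count_sum, count_max = 0, 0
for _ in range(n):
    x, y, z = random.random(), random.random(), random.random()
    if x + y < z:      # event {X + Y < Z}, probability 1/6
        count_sum += 1
    if max(x, y) > z:  # event {max(X, Y) > Z}, probability 2/3
        count_max += 1

print(count_sum / n)  # close to 1/6
print(count_max / n)  # close to 2/3
```

With $n = 10^5$ samples, the relative frequencies agree with $\frac{1}{6}$ and $\frac{2}{3}$ to about two decimal places.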
5.17 We have $P\big(\max(X,Y) > a\min(X,Y)\big) = P(X > aY) + P(Y > aX)$. Thus, by a symmetry argument, $P\big(\max(X,Y) > a\min(X,Y)\big) = 2P(X > aY)$. The joint density of $X$ and $Y$ is $f(x,y) = 1$ for $0 < x, y < 1$ and so
$$P(X > aY) = \int_0^1 dx\int_0^{x/a} dy = \int_0^1 \frac{x}{a}\,dx = \frac{1}{2a}.$$
The sought probability is $\frac{1}{a}$.

5.18 The expected value of the time until the electronic device goes down is given by
$$E(X+Y) = \int_1^\infty\int_1^\infty \frac{24(x+y)}{(x+y)^4}\,dx\,dy = \int_1^\infty dx\int_1^\infty \frac{24}{(x+y)^3}\,dy = \int_1^\infty \frac{12}{(x+1)^2}\,dx = 6.$$
To find the density function of $X+Y$, we calculate $P(X+Y > t)$ and distinguish between $0 \le t \le 2$ and $t > 2$. Obviously, $P(X+Y > t) = 1$ for $0 \le t \le 2$. For the case of $t > 2$,
$$P(X+Y > t) = \int_1^{t-1} dx\int_{t-x}^\infty \frac{24}{(x+y)^4}\,dy + \int_{t-1}^\infty dx\int_1^\infty \frac{24}{(x+y)^4}\,dy = \int_1^{t-1}\frac{8}{t^3}\,dx + \int_{t-1}^\infty \frac{8}{(x+1)^3}\,dx = \frac{8(t-2)}{t^3} + \frac{4}{t^2}.$$
By differentiation, the density function $g(t)$ of $X+Y$ is $g(t) = \frac{24(t-2)}{t^4}$ for $t > 2$ and $g(t) = 0$ otherwise.

5.19 The time until both components are down is $T = \max(X,Y)$. Noting that $P(T \le t) = P(X \le t, Y \le t)$, it follows that
$$P(T \le t) = \frac{1}{4}\int_0^t dx\int_0^t (2y+2-x)\,dy = 0.125t^3 + 0.5t^2 \quad\text{for } 0 \le t \le 1$$
$$P(T \le t) = \frac{1}{4}\int_0^t dx\int_0^1 (2y+2-x)\,dy = 0.75t - 0.125t^2 \quad\text{for } 1 \le t \le 2.$$
The density function of $T$ is $0.375t^2 + t$ for $0 < t < 1$ and $0.75 - 0.25t$ for $1 \le t < 2$.

5.20 Let $X$ and $Y$ be the two random points at which the stick is broken, with $X$ being the point that is closest to the left end of the stick. Assume that the stick has length 1. The joint density function of $(X,Y)$ satisfies $f(x,y) = 2$ for $0 < x < y < 1$ and $f(x,y) = 0$ otherwise. To see this, note that $X = \min(U_1,U_2)$ and $Y = \max(U_1,U_2)$, where $U_1$ and $U_2$ are independent and uniformly distributed on $(0,1)$. For any $0 < x < y < 1$ and $dx > 0$, $dy > 0$ sufficiently small, $P(x \le X \le x+dx,\, y \le Y \le y+dy)$ is equal to the sum of $P(x \le U_1 \le x+dx,\, y \le U_2 \le y+dy)$ and $P(x \le U_2 \le x+dx,\, y \le U_1 \le y+dy)$.
By the independence of $U_1$ and $U_2$, this gives $P(x \le X \le x+dx,\, y \le Y \le y+dy) = 2\,dx\,dy$, showing that $f(x,y) = 2$ for $0 < x < y < 1$. All three pieces are no longer than half the length of the stick if and only if $X \le 0.5$, $Y - X \le 0.5$ and $1 - Y \le 0.5$. That is, $(X,Y)$ should satisfy $0 \le X \le 0.5$ and $0.5 \le Y \le 0.5 + X$. It now follows that
$$P(\text{no piece is longer than half the length of the stick}) = \int_0^{0.5} dx\int_{0.5}^{0.5+x} 2\,dy = 2\int_0^{0.5} x\,dx = \frac{1}{4}.$$

5.21 (a) Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, it follows that the sought probability is
$$P(B^2 \ge 4A) = \int_0^1\int_0^1 \chi(a,b)f(a,b)\,da\,db,$$
where $\chi(a,b) = 1$ for $b^2 \ge 4a$ and $\chi(a,b) = 0$ otherwise. This leads to
$$P(B^2 \ge 4A) = \int_0^1 db\int_0^{b^2/4}(a+b)\,da = 0.0688.$$
(b) Similarly,
$$P(B^2 \ge 4AC) = \int_0^1\int_0^1\int_0^1 \chi(a,b,c)f(a,b,c)\,da\,db\,dc,$$
where $\chi(a,b,c) = 1$ for $b^2 \ge 4ac$ and $\chi(a,b,c) = 0$ otherwise. A convenient order of integration for $P(B^2 \ge 4AC)$ is
$$\frac{2}{3}\int_0^1 db\int_0^{b^2/4} da\int_0^1 (a+b+c)\,dc + \frac{2}{3}\int_0^1 db\int_{b^2/4}^1 da\int_0^{b^2/(4a)}(a+b+c)\,dc.$$
This leads to $P(B^2 \ge 4AC) = 0.1960$.

5.22 The marginal density of $X$ is given by
$$f_X(x) = \int_0^\infty 4xe^{-2x(1+y)}\,dy = 2e^{-2x}\int_0^\infty 2xe^{-2xy}\,dy = 2e^{-2x} \quad\text{for } x > 0.$$
The marginal density of $Y$ is given by
$$f_Y(y) = \int_0^\infty 4xe^{-2x(1+y)}\,dx = \frac{1}{(1+y)^2}\int_0^\infty \big[2(1+y)\big]^2 xe^{-2(1+y)x}\,dx = \frac{1}{(1+y)^2} \quad\text{for } y > 0,$$
using the fact that the gamma density $\lambda^2 xe^{-\lambda x}$ for $x > 0$ integrates to 1 over $(0,\infty)$.

5.23 The marginal densities of $X$ and $Y$ are $f_X(x) = \int_0^x 4e^{-2x}\,dy = 4xe^{-2x}$ for $x > 0$ and $f_Y(y) = \int_y^\infty 4e^{-2x}\,dx = 2e^{-2y}$ for $y > 0$.

5.24 The marginal density of $X$ is given by
$$f_X(x) = \int_0^{1-x}(3-2x-y)\,dy = (3-2x)(1-x) - \frac{1}{2}(1-x)^2 = 1.5x^2 - 4x + 2.5 \quad\text{for } 0 < x < 1.$$
The marginal density of $Y$ is given by
$$f_Y(y) = \int_0^{1-y}(3-2x-y)\,dx = 3(1-y) - (1-y)^2 - y(1-y) = 2 - 2y \quad\text{for } 0 < y < 1.$$

5.25 The joint density of $X$ and $Y$ is $f(x,y) = 4/\sqrt{3}$ for $(x,y)$ inside the triangle.
The marginal density of $X$ is
$$f_X(x) = \begin{cases}\int_0^{x\sqrt{3}} f(x,y)\,dy = 4x & \text{for } 0 < x < 0.5\\[4pt] \int_0^{(1-x)\sqrt{3}} f(x,y)\,dy = 4(1-x) & \text{for } 0.5 < x < 1.\end{cases}$$
The marginal density of $Y$ is
$$f_Y(y) = \int_{y/\sqrt{3}}^{1-y/\sqrt{3}} f(x,y)\,dx = \frac{4}{\sqrt{3}} - \frac{8y}{3} \quad\text{for } 0 < y < \frac{1}{2}\sqrt{3}.$$

5.26 Since $f(x,y) = \frac{1}{x}$ for $0 < x < 1$ and $0 < y < x$ and $f(x,y) = 0$ otherwise, we get $f_X(x) = \int_0^x \frac{1}{x}\,dy = 1$ for $0 < x < 1$ and $f_Y(y) = \int_y^1 \frac{1}{x}\,dx = -\ln(y)$ for $0 < y < 1$.

5.27 Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, we have
$$P(X \le x,\, Y - X \le z) = \int_0^x dv\int_v^{v+z} e^{-w}\,dw = (1 - e^{-x})(1 - e^{-z}) \quad\text{for } x, z > 0.$$
By partial differentiation, we get that the joint density of $X$ and $Z = Y - X$ is $f(x,z) = e^{-x}e^{-z}$ for $x, z > 0$. The marginal densities of $X$ and $Z$ are the exponential densities $f_X(x) = e^{-x}$ and $f_Z(z) = e^{-z}$. The time until the system goes down is $Y$. The density function of $Y$ is $\int_0^y f(x,y)\,dx = \int_0^y e^{-y}\,dx = ye^{-y}$ for $y > 0$. This is the Erlang density with shape parameter 2 and scale parameter 1.

5.28 The joint density of $X$ and $Y$ is $f(x,y) = 1$ for $0 < x, y < 1$. The area of the rectangle is $Z = XY$. Using the relation $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, it follows that
$$P(Z \le z) = \int_0^z dx\int_0^1 dy + \int_z^1 dx\int_0^{z/x} dy = z - z\ln(z) \quad\text{for } 0 \le z \le 1.$$
The density function of $Z$ is $f(z) = -\ln(z)$ for $0 < z < 1$. The expected value of $Z$ is $E(Z) = -\int_0^1 z\ln(z)\,dz = \frac{1}{4}$. Note that $E(XY) = E(X)E(Y)$.

5.29 Let $X$ and $Y$ be the packet delays on the two lines. The joint density of $X$ and $Y$ is $f(x,y) = \lambda e^{-\lambda x}\lambda e^{-\lambda y}$ for $x, y > 0$. Using the basic formula $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$, we obtain
$$P(X - Y > v) = \int_0^\infty \lambda e^{-\lambda y}\int_{y+v}^\infty \lambda e^{-\lambda x}\,dx\,dy = \frac{1}{2}e^{-\lambda v} \quad\text{for } v \ge 0.$$
For any $v \le 0$, $P(X - Y \le v) = P(Y - X \ge -v)$. Thus, by symmetry, $P(X - Y \le v) = \frac{1}{2}e^{\lambda v}$ for $v \le 0$. Thus the density of $X - Y$ is $\frac{1}{2}\lambda e^{-\lambda|v|}$ for $-\infty < v < \infty$, which is the so-called Laplace density.

5.30 It is easiest to derive the results by using the basic relation $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$.
The joint density of $X$ and $Y$ is $f(x,y) = 1$ for $0 < x, y < 1$. Let $V = \frac{1}{2}(X+Y)$. Then $P(V \le v) = P(X+Y \le 2v)$. Thus
$$P(V \le v) = \int_0^{2v} dx\int_0^{2v-x} dy = 2v^2 \quad\text{for } 0 \le v \le 0.5$$
$$P(V > v) = \int_{2v-1}^1 dx\int_{2v-x}^1 dy = 2 - 4v + 2v^2 \quad\text{for } 0.5 \le v \le 1.$$
Thus $f_V(v) = 4v$ for $0 < v \le \frac{1}{2}$ and $f_V(v) = 4 - 4v$ for $0.5 < v < 1$. This is the triangular density with $a = 0$, $b = 1$, $m = 0.5$. To get the density of $|X-Y|$, note that $P(|X-Y| \le v) = P(X-Y \le v) - P(X-Y \le -v)$ for $0 \le v \le 1$. Also, $P(X-Y \le -v) = P(Y-X \ge v)$ for $0 \le v \le 1$. Thus $P(|X-Y| \le v) = 2P(X-Y \le v) - 1$ for $0 \le v \le 1$. We have
$$P(X-Y \le v) = P(X \le Y+v) = \int_0^{1-v} dy\int_0^{y+v} dx + \int_{1-v}^1 dy\int_0^1 dx = v - \frac{1}{2}v^2 + \frac{1}{2}$$
for $0 \le v \le 1$. Thus $P(|X-Y| \le v) = 2\big(v - \frac{1}{2}v^2 + \frac{1}{2}\big) - 1 = 2v - v^2$ for $0 \le v \le 1$. Therefore the density of $|X-Y|$ is $2(1-v)$ for $0 < v < 1$. This is the triangular density with $a = 0$, $b = 1$, $m = 0$.

5.31 We have
$$P(F \le c) = P(X+Y \le c) + P(1 \le X+Y \le c+1) \quad\text{for } 0 \le c \le 1.$$
Since $P(X+Y \le c) = \int_0^c dx\int_0^{c-x} dy$ and $P(1 \le X+Y \le c+1) = \int_0^1 dx\int_{1-x}^{\min(c+1-x,\,1)} dy$, we get for any $0 \le c \le 1$ that
$$P(X+Y \le c) = \frac{1}{2}c^2$$
$$P(1 \le X+Y \le c+1) = \int_0^c dx\int_{1-x}^1 dy + \int_c^1 dx\int_{1-x}^{c+1-x} dy = \frac{1}{2}c^2 + c(1-c).$$
This gives $P(F \le c) = c$ for all $0 \le c \le 1$, proving the desired result.

5.32 Let $X$ be uniformly distributed on $(0,24)$ and $Y$ be uniformly distributed on $(0,36)$, where $X$ and $Y$ are independent. The sought probability is given by $P(X < Y < X+7) + P(Y < X < Y+7)$. Since
$$P(X < Y < X+7) = \int_0^{24}\frac{1}{24}\int_x^{x+7}\frac{1}{36}\,dy\,dx = \frac{7}{36}$$
$$P(Y < X < Y+7) = \int_0^{24}\frac{1}{36}\int_y^{\min(24,\,y+7)}\frac{1}{24}\,dx\,dy = \frac{287}{1728},$$
we find that the sought probability is equal to $\frac{7}{36} + \frac{287}{1728} = 0.3605$.

5.33 The joint density function $f(x,y)$ of $X$ and $Y$ satisfies $f(x,y) = f_X(x)f_Y(y)$ and is equal to 1 for all $0 < x, y < 1$ and 0 otherwise.
Using the relation $P((X,Y) \in C) = \iint_C f(x,y)\,dx\,dy$ with $C = \{(x,y) : 0 < x < \min(1, yz),\, 0 < y < 1\}$, we get
$$P(Z \le z) = \int_0^1 dy\int_0^{\min(1,\,zy)} dx \quad\text{for } z > 0.$$
Distinguish between the cases $0 \le z \le 1$ and $z > 1$. For $0 \le z \le 1$,
$$P(Z \le z) = \int_0^1 dy\int_0^{zy} dx = \int_0^1 zy\,dy = \frac{1}{2}z.$$
For $z > 1$,
$$P(Z \le z) = \int_0^{1/z} dy\int_0^{zy} dx + \int_{1/z}^1 dy\int_0^1 dx = 1 - \frac{1}{2z}.$$
Hence the density function of $Z$ is $\frac{1}{2}$ for $0 < z \le 1$ and $\frac{1}{2z^2}$ for $z > 1$. The probability that the first significant digit of $Z$ equals 1 is given by
$$\sum_{n=0}^\infty P(10^n \le Z < 2\times 10^n) + \sum_{n=1}^\infty P(10^{-n} \le Z < 2\times 10^{-n}) = \frac{5}{18} + \frac{1}{18} = \frac{1}{3}.$$
In general, the probability that the first significant digit of $Z$ equals $k$ is
$$\frac{10}{18}\times\frac{1}{k(k+1)} + \frac{1}{18} \quad\text{for } k = 1, \ldots, 9.$$

5.34 We have
$$P(Z \le z) = \int_0^\infty \lambda e^{-\lambda x}\Big(\int_{x/z}^\infty \lambda e^{-\lambda y}\,dy\Big)dx = \int_0^\infty e^{-\lambda x/z}\lambda e^{-\lambda x}\,dx = \frac{z}{1+z}.$$
Thus the density function of $Z$ is $\frac{1}{(1+z)^2}$.

5.35 We have
$$P\big(\max(X,Y) \le t\big) = P(X \le t, Y \le t) = P(X \le t)P(Y \le t) = \big(1 - e^{-\lambda t}\big)^2 \quad\text{for } t > 0.$$
Also,
$$P\Big(X + \frac{1}{2}Y \le t\Big) = \int_0^t \lambda e^{-\lambda x}\Big(\int_0^{2(t-x)}\lambda e^{-\lambda y}\,dy\Big)dx = \big(1 - e^{-\lambda t}\big)^2.$$

5.36 (a) The formula is true for $n = 1$. Suppose that the formula has been verified for $n = 1, \ldots, k$. This means that the density function of $X_1 + \cdots + X_k$ satisfies $\frac{s^{k-1}}{(k-1)!}$ for $0 < s < 1$. Then, by the convolution formula, the density function of $X_1 + \cdots + X_k + X_{k+1}$ is given by
$$\int_0^s \frac{(s-y)^{k-1}}{(k-1)!}\,dy = \frac{s^k}{k!} \quad\text{for } 0 < s < 1.$$
This gives
$$P(X_1 + \cdots + X_{k+1} \le s) = \int_0^s \frac{x^k}{k!}\,dx = \frac{s^{k+1}}{(k+1)!} \quad\text{for } 0 \le s \le 1.$$
(b) Since $P(N > n) = P(X_1 + \cdots + X_n \le 1) = \frac{1}{n!}$, it follows from the formula $E(N) = \sum_{n=0}^\infty P(N > n)$ (see Problem 3.29) that
$$E(N) = \sum_{n=0}^\infty \frac{1}{n!} = e.$$

5.37 Let $X_1, X_2, \ldots$ be a sequence of independent random variables that are uniformly distributed on $(0,1)$, and let $S_n = X_1 + \cdots + X_n$. The sought probability is
$$P(S_1 > a) + \sum_{n=1}^\infty P(S_n \le a,\, a < S_n + X_{n+1} \le 1).$$
Since $S_n$ and $X_{n+1}$ are independent of each other, the joint density $f_n(s,x)$ of $S_n$ and $X_{n+1}$ satisfies $f_n(s,x) = \frac{s^{n-1}}{(n-1)!}$ for $0 < s < 1$ and $0 < x < 1$, using the result (a) of Problem 5.36. Therefore,
$$P(S_n \le a,\, a < S_n + X_{n+1} \le 1) = \int_0^a ds\int_{a-s}^{1-s} f_n(s,x)\,dx = (1-a)\frac{a^n}{n!}.$$
Thus the sought probability is
$$1 - a + \sum_{n=1}^\infty (1-a)\frac{a^n}{n!} = (1-a)e^a.$$

5.38 By the independence of $X_1$, $X_2$, and $X_3$, the joint density function of $X_1$, $X_2$, and $X_3$ is $1\times 1\times 1 = 1$ for $0 < x_1, x_2, x_3 < 1$ and 0 otherwise. Let $C = \{(x_1,x_2,x_3) : 0 < x_1, x_2, x_3 < 1,\; 0 < x_2 + x_3 < x_1\}$. Then
$$P(X_1 > X_2 + X_3) = \iiint_C dx_1\,dx_2\,dx_3 = \int_0^1 dx_1\int_0^{x_1} dx_2\int_0^{x_1-x_2} dx_3.$$
This gives
$$P(X_1 > X_2 + X_3) = \int_0^1 dx_1\int_0^{x_1}(x_1 - x_2)\,dx_2 = \int_0^1 \frac{1}{2}x_1^2\,dx_1 = \frac{1}{2}\times\frac{1}{3} = \frac{1}{6}.$$
Since the events $\{X_1 > X_2 + X_3\}$, $\{X_2 > X_1 + X_3\}$ and $\{X_3 > X_1 + X_2\}$ are mutually exclusive, the probability that the largest of the three random variables is greater than the sum of the other two is $3\times\frac{1}{6} = \frac{1}{2}$. Note: More generally, if $X_1, X_2, \ldots, X_n$ are independent random numbers chosen from $(0,1)$, then $P(X_1 > X_2 + \cdots + X_n) = \frac{1}{n!}$ for any $n \ge 2$.

5.39 By $P(V > v, W \le w) = P(v < X \le w,\, v < Y \le w)$ and the independence of $X$ and $Y$, we have
$$P(V > v, W \le w) = P(v < X \le w)P(v < Y \le w) = \big(e^{-\lambda v} - e^{-\lambda w}\big)^2 \quad\text{for } 0 \le v \le w.$$
Taking partial derivatives, we get that the joint density of $V$ and $W$ is $f(v,w) = 2\lambda^2 e^{-\lambda(v+w)}$ for $0 < v < w$. It follows from
$$P(W - V > z) = \int_0^\infty dv\int_{v+z}^\infty 2\lambda^2 e^{-\lambda(v+w)}\,dw$$
that $P(W - V > z) = e^{-\lambda z}$ for $z > 0$, in agreement with the memoryless property of the exponential distribution.

5.40 By the substitution rule, the expected value of the area of the rectangle is equal to
$$E(XY) = \int_0^1\int_0^1 xy(x+y)\,dx\,dy = \int_0^1 x\Big(\frac{1}{2}x + \frac{1}{3}\Big)dx = \frac{1}{3}.$$

5.41 Define the function $g(x,y)$ as $g(x,y) = T - \max(x,y)$ if $0 \le x, y \le T$ and $g(x,y) = 0$ otherwise. The joint density function of $X$ and $Y$ is $e^{-(x+y)}$ for $x, y > 0$.
Using the memoryless property of the exponential distribution, the expected amount of time the system is down between two inspections is given by
$$E[g(X,Y)] = \int_0^T\int_0^T \big(T - \max(x,y)\big)e^{-(x+y)}\,dx\,dy = 2\int_0^T (T-x)(1-e^{-x})e^{-x}\,dx = T - 1.5 + 2e^{-T} - \frac{1}{2}e^{-2T}.$$

5.42 By the substitution rule, the expected value of the time until the system goes down is
$$E[\max(X,Y)] = \frac{1}{4}\int_0^2 dx\int_0^1 \max(x,y)(2y+2-x)\,dy = \frac{1}{4}\int_0^1\Big[2x^2 + \frac{2}{3}(1-x^3) + \frac{1}{2}(2-x)(1-x^2)\Big]dx + \frac{1}{4}\int_1^2(3x - x^2)\,dx = 0.96875.$$
The expected value of the time between the failures of the two components is $E[\max(X,Y)] - E[\min(X,Y)]$. By the substitution rule,
$$E[\min(X,Y)] = \frac{1}{4}\int_0^2 dx\int_0^1 \min(x,y)(2y+2-x)\,dy = 0.44792$$
and so the expected time between the failures of the two components is 0.52083.

5.43 Using the substitution rule, the expected value of the area of the circle is
$$\int_0^1\int_0^1 \pi(x^2+y^2)(x+y)\,dx\,dy = \pi\int_0^1\Big(x^3 + \frac{1}{2}x^2 + \frac{1}{3}x + \frac{1}{4}\Big)dx = \frac{5}{6}\pi.$$

5.44 Using the substitution rule and writing $x + y = 2x + (y - x)$, we get
$$E(X+Y) = \sum_{x=0}^\infty\sum_{y=x}^\infty (x+y)\frac{e^{-2}}{x!(y-x)!} = \sum_{x=0}^\infty 2x\frac{e^{-1}}{x!} + \sum_{z=0}^\infty z\frac{e^{-1}}{z!} = 3.$$
Also, using the substitution rule and writing $xy = x(y - x) + x^2$,
$$E(XY) = \sum_{x=0}^\infty\sum_{y=x}^\infty xy\frac{e^{-2}}{x!(y-x)!} = \sum_{x=0}^\infty x\frac{e^{-1}}{x!}\sum_{z=0}^\infty z\frac{e^{-1}}{z!} + \sum_{x=0}^\infty x^2\frac{e^{-1}}{x!} = 3.$$

5.45 The inverse functions $x = a(v,w)$ and $y = b(v,w)$ are $a(v,w) = vw$ and $b(v,w) = v(1-w)$. The Jacobian $J(v,w)$ is equal to $-v$. The joint density of $V$ and $W$ is
$$f_{V,W}(v,w) = \mu e^{-\mu vw}\,\mu e^{-\mu v(1-w)}\,|-v| = \mu^2 ve^{-\mu v} \quad\text{for } v > 0,\; 0 < w < 1.$$
The marginal densities of $V$ and $W$ are
$$f_V(v) = \int_0^1 \mu^2 ve^{-\mu v}\,dw = \mu^2 ve^{-\mu v} \quad\text{for } v > 0$$
$$f_W(w) = \int_0^\infty \mu^2 ve^{-\mu v}\,dv = 1 \quad\text{for } 0 < w < 1.$$
Since $f_{V,W}(v,w) = f_V(v)f_W(w)$ for all $v, w$, the random variables $V$ and $W$ are independent.

5.46 To find the joint density of $V$ and $W$, we apply the transformation formula. The inverse functions $x = a(v,w)$ and $y = b(v,w)$ are given by $a(v,w) = vw/(1+w)$ and $b(v,w) = v/(1+w)$.
The Jacobian $J(v,w)$ is equal to $-v/(1+w)^2$ and so the joint density of $V$ and $W$ is given by
$$f_{V,W}(v,w) = \frac{v}{2(1+w)^2} \quad\text{for } 0 < v < 2 \text{ and } w > 0$$
and $f_{V,W}(v,w) = 0$ otherwise. The marginal density of $V$ is
$$f_V(v) = \int_0^\infty \frac{v}{2(1+w)^2}\,dw = \frac{1}{2}v \quad\text{for } 0 < v < 2$$
and $f_V(v) = 0$ otherwise. The marginal density of $W$ is given by
$$f_W(w) = \int_0^2 \frac{v}{2(1+w)^2}\,dv = \frac{1}{(1+w)^2} \quad\text{for } w > 0$$
and $f_W(w) = 0$ otherwise. Since $f_{V,W}(v,w) = f_V(v)f_W(w)$ for all $v, w$, the random variables $V$ and $W$ are independent.

5.47 Since $Z^2$ has the $\chi_1^2$ density $\frac{1}{\sqrt{2\pi}}u^{-\frac{1}{2}}e^{-\frac{1}{2}u}$ when $Z$ is $N(0,1)$ distributed and the random variables $Z_1^2$ and $Z_2^2$ are independent, the joint density of $X = Z_1^2$ and $Y = Z_2^2$ is $\frac{1}{2\pi}(xy)^{-\frac{1}{2}}e^{-\frac{1}{2}(x+y)}$ for $x, y > 0$. For the transformation $V = X+Y$ and $W = X-Y$, the inverse functions $x = a(v,w)$ and $y = b(v,w)$ are $a(v,w) = \frac{1}{2}(v+w)$ and $b(v,w) = \frac{1}{2}(v-w)$. The Jacobian $J(v,w)$ is equal to $-\frac{1}{2}$. The joint density of $V$ and $W$ is
$$f_{V,W}(v,w) = \frac{1}{2\pi}\big(v^2 - w^2\big)^{-\frac{1}{2}}e^{-\frac{1}{2}v} \quad\text{for } v > 0,\; -v < w < v.$$
The random variables $V$ and $W$ are not independent.

5.48 Let $V = Y\sqrt{X}$ and $W = X$. To find the joint density of $V$ and $W$, we apply the transformation formula. The inverse functions $x = a(v,w)$ and $y = b(v,w)$ are $a(v,w) = w$ and $b(v,w) = v/\sqrt{w}$. The Jacobian $J(v,w)$ is equal to $-1/\sqrt{w}$ and so the joint density of $V$ and $W$ is given by
$$f_{V,W}(v,w) = \frac{1}{\pi}\frac{1}{\sqrt{w}}\,we^{-\frac{1}{2}w(1+v^2/w)} = \frac{1}{\pi}\sqrt{w}\,e^{-\frac{1}{2}w}e^{-\frac{1}{2}v^2} \quad\text{for } v, w > 0$$
and $f_{V,W}(v,w) = 0$ otherwise. The densities $f_V(v) = \int_0^\infty f_{V,W}(v,w)\,dw$ and $f_W(w) = \int_0^\infty f_{V,W}(v,w)\,dv$ are given by
$$f_V(v) = \sqrt{\frac{2}{\pi}}\,e^{-\frac{1}{2}v^2} \quad\text{for } v > 0, \qquad f_W(w) = \frac{1}{\sqrt{2\pi}}\,w^{\frac{1}{2}}e^{-\frac{1}{2}w} \quad\text{for } w > 0.$$
The random variable $V$ is distributed as $|Z|$ with $Z$ having the standard normal distribution, and $W$ has a gamma distribution with shape parameter $\frac{3}{2}$ and scale parameter $\frac{1}{2}$. Since $f_{V,W}(v,w) = f_V(v)f_W(w)$ for all $v, w$, the random variables $V$ and $W$ are independent.
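The independence result of Problem 5.45 can be checked by simulation: for independent exponentials $X$ and $Y$, the ratio $W = X/(X+Y)$ should be uniform on $(0,1)$ and have (nearly) zero sample correlation with $V = X+Y$. A rough sketch, taking $\mu = 1$ (the seed and sample size are arbitrary choices):

```python
import math
import random

random.seed(7)  # fixed seed for reproducibility
mu = 1.0
n = 50_000
v_vals, w_vals = [], []
for _ in range(n):
    # exponential(mu) observations via the inversion method
    x = -math.log(random.random()) / mu
    y = -math.log(random.random()) / mu
    v_vals.append(x + y)
    w_vals.append(x / (x + y))

mean_v = sum(v_vals) / n
mean_w = sum(w_vals) / n
cov = sum((v - mean_v) * (w - mean_w) for v, w in zip(v_vals, w_vals)) / n
sd_v = math.sqrt(sum((v - mean_v) ** 2 for v in v_vals) / n)
sd_w = math.sqrt(sum((w - mean_w) ** 2 for w in w_vals) / n)
rho = cov / (sd_v * sd_w)

print(round(mean_w, 3))  # close to 0.5, the mean of the uniform(0,1) density
print(round(rho, 3))     # close to 0, consistent with independence
```

A near-zero correlation does not by itself prove independence, but combined with the analytic factorization above it is a useful consistency check.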
5.49 The inverse functions are $a(v,w) = ve^{-\frac{1}{4}(v^2+w^2)}/\sqrt{v^2+w^2}$ and $b(v,w) = we^{-\frac{1}{4}(v^2+w^2)}/\sqrt{v^2+w^2}$. The Jacobian is $\frac{1}{2}e^{-\frac{1}{2}(v^2+w^2)}$. Since $f_{X,Y}(x,y) = \frac{1}{\pi}$, we get
$$f_{V,W}(v,w) = \frac{1}{\pi}\times\frac{1}{2}e^{-\frac{1}{2}(v^2+w^2)} \quad\text{for } -\infty < v, w < \infty.$$
Noting that $f_{V,W}(v,w) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}v^2}\times\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}w^2}$ for all $v, w$, it follows that $V$ and $W$ are independent and $N(0,1)$ distributed.

5.50 The joint density function $f(r,\theta)$ of $(R,\Theta)$ is given by $1\times\frac{1}{2\pi}$ for $0 < r < 1$ and $0 < \theta < 2\pi$. The inverse functions $r = a(v,w)$ and $\theta = b(v,w)$ are given by $a(v,w) = \sqrt{v^2+w^2}$ and $b(v,w) = \arctan\big(\frac{w}{v}\big)$. Using the fact that $\arctan(x)$ has $\frac{1}{1+x^2}$ as derivative, it follows that the Jacobian is given by $\frac{1}{\sqrt{v^2+w^2}}$. Noting that $f\big(a(v,w), b(v,w)\big)$ is $\frac{1}{2\pi}$ if $v^2+w^2 \le 1$ and 0 otherwise, it follows from the two-dimensional transformation formula that the joint density $f_{V,W}(v,w)$ of the random vector $(V,W)$ is given by
$$f_{V,W}(v,w) = \frac{1}{2\pi}\frac{1}{\sqrt{v^2+w^2}} \quad\text{for } -1 < v, w < 1,\; v^2+w^2 \le 1$$
and $f_{V,W}(v,w) = 0$ otherwise. To get the marginal density $f_V(v) = \int_{-\sqrt{1-v^2}}^{\sqrt{1-v^2}} f_{V,W}(v,w)\,dw$, we use the following result from calculus:
$$\int_0^x \frac{dt}{\sqrt{1+t^2}} = \ln\big(x + \sqrt{1+x^2}\big) \quad\text{for } x > 0.$$
This leads after some algebra to
$$f_V(v) = \frac{1}{\pi}\ln\bigg(\frac{1 + \sqrt{1-v^2}}{|v|}\bigg) \quad\text{for } -1 < v < 1.$$
The marginal density of $W$ is of course the same as that of $V$. The intuitive explanation that $(V,W)$ is not a random point inside the unit circle is as follows: the closer a (small) rectangle within the unit circle is to the center of the circle, the larger the probability of the point $(V,W)$ falling in the rectangle.

5.51 The joint density of $X$ and $Y$ is $\frac{1}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}y^{\beta-1}e^{-(x+y)}$. The inverse functions are $a(v,w) = vw$ and $b(v,w) = w(1-v)$. The Jacobian is $J(v,w) = w$. Thus the joint density of $V$ and $W$ is
$$\frac{1}{\Gamma(\alpha)\Gamma(\beta)}(vw)^{\alpha-1}\big(w(1-v)\big)^{\beta-1}e^{-w}\,w \quad\text{for } 0 < v < 1,\; w > 0.$$
This density can be rewritten as
$$\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,v^{\alpha-1}(1-v)^{\beta-1}\;\frac{w^{\alpha+\beta-1}e^{-w}}{\Gamma(\alpha+\beta)} \quad\text{for } 0 < v < 1 \text{ and } w > 0.$$
132 This shows that V and W are independent, where V has a beta distribution with parameters α and β, and W has a gamma distribution with shape parameter α + β and scale parameter 1. 5.52 Since Z and Y are independent, the joint density of Z and Y is 1 1 1 2 y 2 ν−1 e− 2 y 1 f (z, y) = √ e− 2 z × 1 2π 2 2 ν Γ( 12 ν) for z, y > 0. The = b(v, w) are given by z = p inverse functions z = a(v, w) and y p w v/ν and y = v. The Jacobian is v/ν. Thus, by the twodimensional transformation formula, the joint density function of V and W is that r 1 1 v 1 − 1 w2 v/ν v 2 ν−1 e− 2 v 2 × 1 × fV,W (v, w) = √ e for v, w > 0. ν 1 ν 2π 2 2 Γ( ν) 2 2 Letting λw = 1 + wν for any w, α = 12 (ν + 1) R ∞and using the change v of variable u = 2 , it follows that fW (w) = 0 fV,W (v, w)dv can be written as Z ∞ α α−1 −λu 1 1 λ−α λ−α λw u e w Γ( 2 (ν + 1)) w Γ( 2 (ν + 1)) fW (w) = du = , √ √ Γ(α) πν Γ( 12 ν) πν Γ( 21 ν) 0 showing the desired result. Note: This problem shows that the two-dimensional transformation V = X and W = h(X, Y ) may be useful when you to want to find the density of a function h(x, y) of a random vector (X, Y ) with a given joint density function f (x, y). 5.53 For ∆x, ∆y sufficiently small, 1 1 1 1 P x − ∆x ≤ U(1) ≤ x + ∆x, y − ∆y ≤ U(n) ≤ y + ∆y 2 2 2 2 n n−1 = (y − x)n−2 ∆x∆y for 0 < x < y < 1. 1 1 Therefore the joint density of U(1) and U(n) is f (x, y) = n! (y − x)n−2 (n − 2)! for 0 < x < y < 1. 133 5.54 Let X = U(1) and Y = U(n) . The joint density of X and Y is given by n(n − 1)(y − x)n−2 for 0 < x < y < 1, see Problem 5.53. For the transformation V = Y and W = Y − X, the inverse functions are a(v, w) = v − w and b(v, w) = v. The Jacobian J(v, w) = 1. Thus the joint density of V and W is given by n(n − 1)wn−2 for w < v < 1 and 0 < w < 1. The marginal density of the range W is Z 1 w n(n − 1)wn−2 dv = n(n − 1)wn−2 (1 − w) for 0 < w < 1. Note: an alternative derivation of the results of the Problems 5.53 and 5.54 can be given. This derivation goes as follows. 
It follows from P(X > x, Y \le y) = P(x < U_i \le y for i = 1, \ldots, n) that P(X > x, Y \le y) = (y-x)^n for 0 \le x \le y \le 1. Taking partial derivatives, we get that the joint density function of X and Y is given by f(x, y) = n(n-1)(y-x)^{n-2} for 0 < x < y < 1 and f(x, y) = 0 otherwise. Next, we get from P(Y - X > z) = n(n-1)\int_0^{1-z} dx \int_{z+x}^1 (y-x)^{n-2}\,dy that P(Y - X > z) = n\int_0^{1-z}\big[(1-x)^{n-1} - z^{n-1}\big]\,dx. This gives P(Y - X > z) = 1 + (n-1)z^n - nz^{n-1} for 0 \le z \le 1, and so the density of Y - X is n(n-1)z^{n-2}(1-z) for 0 < z < 1.

5.55 The marginal distributions of X and Y are the Poisson distributions p_X(x) = e^{-1}/x! for x \ge 0 and p_Y(y) = e^{-2}2^y/y! for y \ge 0, with E(X) = \sigma^2(X) = 1 and E(Y) = \sigma^2(Y) = 2. We have

E(XY) = \sum_{x=0}^{\infty}\sum_{y=x}^{\infty} xy\,\frac{e^{-2}}{x!\,(y-x)!}.

Noting that \sum_{y=x}^{\infty} y\,\frac{e^{-1}}{(y-x)!} = \sum_{y=x}^{\infty}(y - x + x)\frac{e^{-1}}{(y-x)!} = 1 + x, we get E(XY) = 3. This gives \rho(X, Y) = 1/\sqrt{2}.

5.56 Using the marginal densities f_X(x) = \frac{4}{3}(1 - x^3) for 0 < x < 1 and f_Y(y) = 4y^3 for 0 < y < 1, we obtain

E(X) = \frac{2}{5}, \quad E(Y) = \frac{4}{5}, \quad \sigma^2(X) = \frac{14}{225}, \quad and \quad \sigma^2(Y) = \frac{2}{75}.

By E(XY) = \int_0^1 dy \int_0^y xy\,4y^2\,dx = \int_0^1 2y^5\,dy, we find E(XY) = \frac{1}{3}. Hence

\rho(X, Y) = \frac{E(XY) - E(X)E(Y)}{\sigma(X)\sigma(Y)} = 0.3273.

5.57 The variance of the portfolio's return is

f^2\sigma_A^2 + (1-f)^2\sigma_B^2 + 2f(1-f)\sigma_A\sigma_B\rho_{AB}.

Setting the derivative of this function equal to zero, it follows that the optimal fraction f is \big(\sigma_B^2 - \sigma_A\sigma_B\rho_{AB}\big)\big/\big(\sigma_A^2 + \sigma_B^2 - 2\sigma_A\sigma_B\rho_{AB}\big).

5.58 Using the linearity of the expectation operator, it is readily verified from the definition of covariance that

cov(X + Z, Y + Z) = cov(X, Y) + cov(X, Z) + cov(Z, Y) + cov(Z, Z).

Since the random variables X, Y, and Z are independent, we have cov(X, Y) = cov(X, Z) = cov(Z, Y) = 0. Further, cov(Z, Z) = \sigma^2(Z), \sigma^2(X + Z) = \sigma^2(X) + \sigma^2(Z) = 2, and \sigma^2(Y + Z) = \sigma^2(Y) + \sigma^2(Z) = 2. Therefore \rho(X + Z, Y + Z) = \frac{1}{2}.

5.59 (a) Let R_A be the rate of return of stock A and R_B be the rate of return of stock B.
Since R_B = -R_A + 14, the correlation coefficient is -1.
(b) Let X = fR_A + (1-f)R_B. Since X = (2f - 1)R_A + 14(1 - f), the variance of X is minimal for f = \frac{1}{2}. Invest half of your capital in stock A and half in stock B. Then the portfolio has a guaranteed rate of return of 7%.

5.60 We have E(XY) = 6\int_0^1 dx \int_0^x xy(x - y)\,dy = \frac{1}{5}. The marginal densities of X and Y are f_X(x) = 3x^2 for 0 < x < 1 and f_Y(y) = 3y^2 - 6y + 3 for 0 < y < 1. Then, E(X) = \frac{3}{4}, E(Y) = \frac{1}{4}, and \sigma(X) = \sigma(Y) = \sqrt{3/80}. This leads to \rho(X, Y) = \frac{1}{3}.

5.61 The joint density of (X, Y) is f(x, y) = \frac{1}{\pi} for (x, y) inside the circle C. Then

E(XY) = \frac{1}{\pi}\iint_C xy\,dx\,dy.

Since the function xy has opposite signs on the quadrants of the circle, a symmetry argument gives E(XY) = 0. By the same argument, E(X) = E(Y) = 0. This gives \rho(X, Y) = 0, although X and Y are dependent.

5.62 The joint density function of X and Y is \frac{1}{2} on the region D. Since the function xy has opposite signs on the four triangles of the region D, we have E(XY) = 0. Also, E(X) = E(Y) = 0. Therefore \rho(X, Y) = 0.

5.63 The joint density function f_{V,W}(v, w) of V and W is most easily obtained from the relation

P(v < V < v + \Delta v, w < W < w + \Delta w) = P(v < X < v + \Delta v, w < Y < w + \Delta w) + P(v < Y < v + \Delta v, w < X < w + \Delta w) = 2\Delta v\,\Delta w

for 0 \le v < w \le 1 when \Delta v, \Delta w are small enough. This shows that f_{V,W}(v, w) = 2 for 0 < v < w < 1. Next it follows that f_V(v) = 2(1 - v) for 0 < v < 1 and f_W(w) = 2w for 0 < w < 1. This leads to E(VW) = \frac{1}{4}, E(V) = \frac{1}{3}, E(W) = \frac{2}{3}, and \sigma(V) = \sigma(W) = \frac{1}{3\sqrt{2}}. Thus \rho(V, W) = \frac{1}{2}.

5.64 Let X denote the low points rolled and Y the high points rolled. We have P(X = i, Y = i) = \frac{1}{36} for 1 \le i \le 6 and P(X = i, Y = j) = \frac{2}{36} for 1 \le i < j \le 6, see also Problem 5.1.
The marginal distribution of X is given by P(X = 1) = \frac{11}{36}, P(X = 2) = \frac{9}{36}, P(X = 3) = \frac{7}{36}, P(X = 4) = \frac{5}{36}, P(X = 5) = \frac{3}{36}, and P(X = 6) = \frac{1}{36}, while the marginal distribution of Y is P(Y = 1) = \frac{1}{36}, P(Y = 2) = \frac{3}{36}, P(Y = 3) = \frac{5}{36}, P(Y = 4) = \frac{7}{36}, P(Y = 5) = \frac{9}{36}, and P(Y = 6) = \frac{11}{36}. Straightforward calculations yield

E(X) = \frac{91}{36}, \quad E(X^2) = \frac{301}{36}, \quad E(Y) = \frac{161}{36}, \quad E(Y^2) = \frac{791}{36}, \quad \sigma(X) = \sigma(Y) = 1.40408,
E(XY) = \sum_{x=1}^{6}\sum_{y=x}^{6} xy\,P(X = x, Y = y) = \frac{441}{36}.

It now follows that

\rho(X, Y) = \frac{E(XY) - E(X)E(Y)}{\sigma(X)\sigma(Y)} = \frac{441/36 - (91/36)(161/36)}{(1.40408)^2} = 0.479.

5.65 The joint probability mass function of X and Y is given by

P(X = x, Y = y) = \frac{1}{100} \times \frac{1}{x} for x = 1, 2, \ldots, 100 and y = 1, \ldots, x.

The marginal distributions of X and Y are given by

P(X = x) = \frac{1}{100} for 1 \le x \le 100 \quad and \quad P(Y = y) = \frac{1}{100}\sum_{x=y}^{100}\frac{1}{x} for 1 \le y \le 100.

Next it follows that

E(XY) = \sum_{x=1}^{100}\sum_{y=1}^{x} xy \times \frac{1}{100x} = \frac{1}{100}\sum_{x=1}^{100}\frac{1}{2}x(x+1) = 1717.

Further,

E(X) = \frac{1}{100}\sum_{x=1}^{100} x = 50.5, \quad E(X^2) = \frac{1}{100}\sum_{x=1}^{100} x^2 = 3383.5,
E(Y) = \sum_{y=1}^{100} y\,\frac{1}{100}\sum_{x=y}^{100}\frac{1}{x} = \frac{1}{100}\sum_{x=1}^{100}\frac{1}{x}\sum_{y=1}^{x} y = \frac{1}{200}\sum_{x=1}^{100}(x+1) = 25.75,
E(Y^2) = \frac{1}{100}\sum_{x=1}^{100}\frac{1}{x}\sum_{y=1}^{x} y^2 = \frac{1}{600}\sum_{x=1}^{100}(x+1)(2x+1) = 1153.25.

Hence the standard deviations of X and Y are \sigma(X) = \sqrt{3383.5 - 50.5^2} = 28.8661 and \sigma(Y) = \sqrt{1153.25 - 25.75^2} = 22.1402, and so

\rho(X, Y) = \frac{1717 - 50.5 \times 25.75}{28.8661 \times 22.1402} = 0.652.

5.66 The joint density of X and Y is f(x, y) = \frac{1}{x} for 0 < y < x < 1 and f(x, y) = 0 otherwise. Thus E(XY) = \int_0^1 dx \int_0^x xy\,\frac{1}{x}\,dy = \frac{1}{6}. The marginal densities of X and Y are f_X(x) = 1 for 0 < x < 1 and f_Y(y) = \int_y^1 \frac{1}{x}\,dx = -\ln(y) for 0 < y < 1. This leads to E(X) = \frac{1}{2}, E(Y) = \frac{1}{4}, \sigma(X) = \sqrt{\frac{1}{12}}, and \sigma(Y) = \sqrt{\frac{7}{144}}. Therefore

\rho(X, Y) = \Big(\frac{1}{6} - \frac{1}{2}\times\frac{1}{4}\Big)\Big/\sqrt{\frac{1}{12}\times\frac{7}{144}} = 0.655.
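Since Problem 5.65 involves only finite sums, the moments and the correlation coefficient can be recomputed exactly; a minimal Python sketch that enumerates the joint mass function directly:

```python
import math

# Exact re-computation of Problem 5.65: X is uniform on 1..100 and, given X = x,
# Y is uniform on 1..x, so P(X = x, Y = y) = 1/(100x).
EX = EX2 = EY = EY2 = EXY = 0.0
for x in range(1, 101):
    for y in range(1, x + 1):
        p = 1.0 / (100 * x)
        EX += x * p
        EX2 += x * x * p
        EY += y * p
        EY2 += y * y * p
        EXY += x * y * p
rho = (EXY - EX * EY) / (math.sqrt(EX2 - EX ** 2) * math.sqrt(EY2 - EY ** 2))
# E(XY) = 1717, E(Y) = 25.75, rho = 0.652, as derived above
print(abs(EXY - 1717) < 1e-6, abs(EY - 25.75) < 1e-6, abs(rho - 0.652) < 0.001)
```

The enumeration confirms E(XY) = 1717, E(Y) = 25.75, and \rho(X, Y) = 0.652 without any closed-form summation.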
5.67 The joint probability mass function p(x, y) = P(X = x, Y = y) is given by p(x, y) = r^{x-1}p\,(r+p)^{y-x-1}q for x < y and p(x, y) = r^{y-1}q\,(r+q)^{x-y-1}p for x > y. It is a matter of some algebra to get

E(XY) = \frac{1}{p}\frac{q}{(1-r)^2} + p\,\frac{1+r}{(1-r)^3} + \frac{1}{q}\frac{p}{(1-r)^2} + q\,\frac{1+r}{(1-r)^3}.

Also, E(X) = \frac{1}{p} and E(Y) = \frac{1}{q}. This leads to cov(X, Y) = -1/(1-r).

5.68 To obtain the joint density function of X and Y, note that for \Delta x and \Delta y small,

P(x < X \le x + \Delta x, y < Y \le y + \Delta y) = 6\Delta x\,(y - x)\Delta y for 0 \le x < y < 1,

see also Example 5.3. Thus the joint density function of X and Y is given by f(x, y) = 6(y - x) for 0 < x < y < 1 and f(x, y) = 0 otherwise. Therefore

E(XY) = 6\int_0^1 \Big(\int_x^1 xy(y - x)\,dy\Big)dx = 6\int_0^1\Big[\frac{1}{3}x(1 - x^3) - \frac{1}{2}x^2(1 - x^2)\Big]dx = \frac{1}{5}.

The marginal density functions of X and Y are given by f_X(x) = 3(1 - x)^2 for 0 < x < 1 and f_Y(y) = 3y^2 for 0 < y < 1. Simple calculations give E(X) = \frac{1}{4}, E(Y) = \frac{3}{4}, \sigma(X) = \sqrt{1/10 - 1/16} = \sqrt{3/80}, and \sigma(Y) = \sqrt{3/5 - 9/16} = \sqrt{3/80}. This leads to

\rho(X, Y) = \frac{1/5 - (1/4)\times(3/4)}{\sqrt{(3/80)\times(3/80)}} = \frac{1}{3}.

5.69 The "if part" follows from the relations cov(X, aX + b) = a\,cov(X, X) = a\sigma_1^2 and \sigma(aX + b) = |a|\sigma_1. Suppose now that |\rho| = 1. Since

var(V) = \frac{1}{\sigma_2^2}\sigma_2^2 + \frac{\rho^2}{\sigma_1^2}\sigma_1^2 - \frac{2\rho}{\sigma_1\sigma_2}cov(X, Y) = 1 - \rho^2,

we have var(V) = 0. This result implies that V is equal to a constant, and this constant is E(V) = \frac{E(Y)}{\sigma_2} - \rho\frac{E(X)}{\sigma_1}. This shows that Y = aX + b, where a = \rho\sigma_2/\sigma_1 and b = E(Y) - aE(X).

5.70 Using the linearity of the expectation operator, it is readily verified from the definition of covariance that cov(aX + b, cY + d) = ac\,cov(X, Y). Also, \sigma(aX + b) = |a|\sigma(X) and \sigma(cY + d) = |c|\sigma(Y). Therefore \rho(aX + b, cY + d) = \rho(X, Y) if a and c have the same sign.

5.71 (a) Suppose that E(Y^2) > 0 (if E(Y^2) = 0, then Y = 0). Let h(t) = E[(X - tY)^2]. Then h(t) = E(X^2) - 2tE(XY) + t^2E(Y^2). The function h(t) is minimal for t = E(XY)/E(Y^2).
Substituting this t-value into h(t) and noting that h(t) \ge 0, the Cauchy-Schwarz inequality follows.
(b) The Cauchy-Schwarz inequality gives [cov(X, Y)]^2 \le var(X)var(Y) or, equivalently, \rho^2(X, Y) \le 1 and so -1 \le \rho(X, Y) \le 1.
(c) Noting that E(XY) = E(X) and E(Y^2) = P(X > 0), the Cauchy-Schwarz inequality gives [E(X)]^2 \le E(X^2)P(X > 0). This shows that P(X > 0) \ge [E(X)]^2/E(X^2) and so P(X = 0) \le var(X)/E(X^2).

5.72 (a) Since var(aX) = a^2 var(X) and cov(aX, bY) = ab\,cov(X, Y), it suffices to verify the assertion for a_i = 1 for all i. We use induction to prove that

var\Big(\sum_{j=1}^{k} X_j\Big) = \sum_{j=1}^{k} var(X_j) + 2\sum_{i=1}^{k-1}\sum_{j=i+1}^{k} cov(X_i, X_j)

for all k \ge 2. For k = 2, the assertion has been proved in Rule 5.11. Suppose the assertion has been proved for k = 2, \ldots, m for some m \ge 2. Then, by the induction hypothesis and Rule 5.11 with X = X_1 + \cdots + X_m and Y = X_{m+1}, it follows that var(\sum_{j=1}^{m+1} X_j) is given by

var\Big(\sum_{j=1}^{m} X_j\Big) + var(X_{m+1}) + 2\,cov\Big(\sum_{j=1}^{m} X_j, X_{m+1}\Big)
= \sum_{j=1}^{m} var(X_j) + 2\sum_{i=1}^{m-1}\sum_{j=i+1}^{m} cov(X_i, X_j) + var(X_{m+1}) + 2\sum_{i=1}^{m} cov(X_i, X_{m+1})
= \sum_{j=1}^{m+1} var(X_j) + 2\sum_{i=1}^{m}\sum_{j=i+1}^{m+1} cov(X_i, X_j).

(b) Using the fact that \sigma^2(aX) = a^2\sigma^2(X) for any constant a, it follows that

\sigma^2(\overline{X}_n) = \frac{1}{n^2}\Big[n\sigma^2 + 2 \times \frac{1}{2}n(n-1)\,cov(X_1, X_2)\Big],

where cov(X_1, X_2) denotes the common covariance of each pair X_i, X_j.
(c) Since cov(aX, bY) = ab\,cov(X, Y), it suffices to verify the assertion for a_i = 1 for all i and b_j = 1 for all j. Using the linearity of the expectation operator, it is immediately verified from the definition of covariance that cov(X, Y + Z) = cov(X, Y) + cov(X, Z). It is readily verified by induction on m that cov(X_1, \sum_{j=1}^{m} Y_j) = \sum_{j=1}^{m} cov(X_1, Y_j) for all m \ge 1. Next, for fixed m, it can be verified by induction on n that cov(\sum_{i=1}^{n} X_i, \sum_{j=1}^{m} Y_j) = \sum_{i=1}^{n}\sum_{j=1}^{m} cov(X_i, Y_j).
(d) Using the result of (c) and the fact that cov(X, Y) = 0 for independent X and Y, we get

cov\big(\overline{X}_n, X_i - \overline{X}_n\big) = \frac{1}{n}\sum_{k=1}^{n} cov(X_k, X_i) - \frac{1}{n^2}\sum_{k=1}^{n}\sum_{j=1}^{n} cov(X_k, X_j)
= σ (Xi ) − 2 n n k=1 (e) Using the result of (c), we have cov(X1 − X2 , X1 + X2 ) = σ 2 (X1 ) − cov(X1 , X2 ) − cov(X2 , X1 ) − σ 2 (X2 ) = σ 2 (X1 ) − σ 2 (X2 ) = 0. 5.73 Since cov(Xi , Xj ) = cov(Xj , Xi ), the matrix C is symmetric. Pn Pn To prove that C is positive semi-definite, we must verify that i=1 j=1 ti tj σij ≥ 0 for all real P numbers t1 , . . . , tn . This property follows from the formula for var( ni=1 ti Xi ) in Problem 5.72 and the fact that the variance is always nonnegative. 5.74 Since X and Y are independent cov(X, Y ) = 0. Therefore, using the result of Problem 5.72(c), cov(X, V ) = cov(X,X)+cov(X,Y) = σ 2 (X) = 1 > 0, cov(V, W ) = cov(X, Y ) − acov(X, X) + cov(Y, Y ) − acov(Y, X) = −aσ 2 (X) + σ 2 (Y ) = 1 − a > 0 for 0 < a < 1, and cov(X, W ) = cov(X, Y ) − acov(X, X) = −a < 0. √ 5.75 Let V = max(X, Y ) and W = min(X, Y ). Then E(V ) = 1/ π and √ E(W ) = −1/ π, see Problem 4.69. Obviously, V W = XY and so E(V W ) = E(X)E(Y ) = 0, by the independence of X and Y . Thus cov(V, W ) = 1 . π We have min(X, Y ) = − max(−X, −Y ). Since the independent random variables −X and −Y are distributed as X and Y , it follows that min(X, Y ) has the same distribution as min(−X, −Y ) = − max(X, Y ). Therefore σ 2 (V ) = σ 2 (W ). Also, by V + W = X + Y , we have 140 σ 2 (V + W ) = σ 2 (X + Y ) = 2. Using the relation σ 2 (V + W ) = σ 2 (V ) + σ 2 (W ) + 2cov(V, W ), we get σ 2 (V ) + σ 2 (W ) = 2 − 2/π and so σ 2 (V ) = σ 2 (W ) = 1 − 1/π. This leads to ρ(V, W ) = 1/π 1 = . 1 − 1/π π−1 Note: the result is also true when X and Y are N (µ, σ 2 ) distributed. To see this, use the relations h X − µ Y − µ i σ 2 hX − µ Y − µi , , , min = cov(V, W ) = σ 2 cov max σ σ σ σ π 1 var(V ) = var(W ) = σ 2 1 − . π 5.76 Let V = max(X, Y ) and W = min(X, Y ). The random variable V is 1 1 exponentially distributed and has E(V ) = 2λ and σ 2 (V ) = (2λ) 2 . The random variable W satisfies P (W ≤ w) = (1 − e−λw ) × (1 − e−λw ) for w ≥ 0. 
It is a matter of some algebra to get E(W) = \frac{3}{2\lambda} and \sigma^2(W) = \frac{5}{4\lambda^2}. Noting that E(VW) = E(XY) = E(X)E(Y) = \frac{1}{\lambda^2}, we find

cov(V, W) = \frac{1}{\lambda^2} - \frac{3}{2\lambda}\times\frac{1}{2\lambda} = \frac{1}{4\lambda^2}.

This leads to \rho(V, W) = \frac{1}{\sqrt{5}}.

5.77 The linear least square estimate of D_1 given that D_1 - D_2 = d is equal to

E(D_1) + \rho(D_1 - D_2, D_1)\frac{\sigma(D_1)}{\sigma(D_1 - D_2)}\big[d - E(D_1 - D_2)\big].

By the independence of D_1 and D_2, E(D_1 - D_2) = \mu_1 - \mu_2, \sigma(D_1 - D_2) = \sqrt{\sigma_1^2 + \sigma_2^2}, and cov(D_1 - D_2, D_1) = \sigma_1^2. The linear least square estimate is

\mu_1 + \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2}(d - \mu_1 + \mu_2).

Chapter 6

6.1 It suffices to prove the result for the standard bivariate normal distribution. Let W = aX + bY. Let us first assume that b > 0. Then

P(W \le w) = \frac{1}{2\pi\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\Big[\int_{-\infty}^{(w-ax)/b} e^{-\frac{1}{2}(x^2 - 2\rho xy + y^2)/(1-\rho^2)}\,dy\Big]dx

for -\infty < w < \infty. Differentiation yields that the density function of W is

f_W(w) = \frac{1}{2\pi b\sqrt{1-\rho^2}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}[x^2 - 2\rho x(w-ax)/b + (w-ax)^2/b^2]/(1-\rho^2)}\,dx.

It is a matter of some algebra to obtain

f_W(w) = \frac{1}{\eta\sqrt{2\pi}}\exp\Big(-\frac{1}{2}\frac{w^2}{\eta^2}\Big) for -\infty < w < \infty,

where \eta = \sqrt{a^2 + b^2 + 2ab\rho}. This result also applies when b \le 0. To see this, write W = aX + (-b)(-Y) and note that (X, -Y) has the standard bivariate normal density with correlation coefficient -\rho. We can now conclude that aX + bY is normally distributed for all a, b if (X, Y) has a bivariate normal distribution. To find P(X > Y) for (X, Y) having a bivariate normal distribution with parameters (\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho), note that X - Y is N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2) distributed. Therefore

P(X > Y) = 1 - \Phi\Big(\frac{-(\mu_1 - \mu_2)}{(\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)^{1/2}}\Big).

6.2 Let the random variables X and Y denote the rates of return on the stocks A and B. Define the random variable V by V = \frac{1}{2}X + \frac{1}{2}Y. Since (X, Y) is assumed to have a bivariate normal distribution, any linear combination of X and Y is normally distributed.
Hence the random variable V is normally distributed with expected value E(V) = \frac{1}{2}\mu_1 + \frac{1}{2}\mu_2 = 0.10 and variance \sigma^2(V) = \frac{1}{4}\sigma_1^2 + \frac{1}{4}\sigma_2^2 + 2\times\frac{1}{4}\rho\sigma_1\sigma_2 = 0.004375. Thus

P(V > 0.11) = 1 - \Phi\Big(\frac{0.11 - 0.10}{\sqrt{0.004375}}\Big) = 0.4399.

6.3 Using the basic formula P((X, Y) \in C) = \iint_C f(x, y)\,dx\,dy, we have

P(Z \le z) = \int_0^{\infty} dy\int_{-\infty}^{yz} f(x, y)\,dx + \int_{-\infty}^{0} dy\int_{yz}^{\infty} f(x, y)\,dx.

Taking the derivative, we get that the density of Z is

f_Z(z) = \int_0^{\infty} y f(yz, y)\,dy - \int_{-\infty}^{0} y f(yz, y)\,dy = \int_{-\infty}^{\infty} |y| f(yz, y)\,dy.

Inserting the standard bivariate normal density for f(x, y), the desired result follows after some algebra.

6.4 Using the decomposition formula for the standard bivariate normal density,

P(X \le a, Y \le b) = P\Big(\frac{X - \mu_1}{\sigma_1} \le \frac{a - \mu_1}{\sigma_1}, \frac{Y - \mu_2}{\sigma_2} \le \frac{b - \mu_2}{\sigma_2}\Big)
= \int_{-\infty}^{(a-\mu_1)/\sigma_1}\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\,dx\int_{-\infty}^{(b-\mu_2)/\sigma_2}\frac{1}{\tau\sqrt{2\pi}} e^{-\frac{1}{2}(y-\rho x)^2/\tau^2}\,dy,

where \tau^2 = 1 - \rho^2. Since the second integral represents the probability that an N(\rho x, \tau^2) distributed random variable is smaller than or equal to (b - \mu_2)/\sigma_2, we obtain that P(X \le a, Y \le b) can be calculated as

\int_{-\infty}^{(a-\mu_1)/\sigma_1}\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\,\Phi\Big(\frac{(b - \mu_2)/\sigma_2 - \rho x}{\sqrt{1 - \rho^2}}\Big)dx.

This one-dimensional integral is well suited for numerical integration. For the special case of a = \mu_1 and b = \mu_2, we can give an explicit expression for P(X \le a, Y \le b). Letting c = \rho/\sqrt{1 - \rho^2} and using the standard method of changing to polar coordinates, it follows that P(X \le \mu_1, Y \le \mu_2) can be evaluated as

\int_{-\infty}^{0}\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}\,\Phi\Big(\frac{-\rho x}{\sqrt{1 - \rho^2}}\Big)dx = \frac{1}{2\pi}\int_{-\infty}^{0} dx\int_{-\infty}^{-cx} e^{-\frac{1}{2}(x^2 + y^2)}\,dy = \frac{1}{2\pi}\int_{\pi - \arctan(c)}^{\frac{3}{2}\pi} d\phi\int_0^{\infty} e^{-\frac{1}{2}r^2} r\,dr.

Noting that \int_0^{\infty} e^{-\frac{1}{2}r^2} r\,dr = 1, we obtain

P(X \le \mu_1, Y \le \mu_2) = \frac{1}{2\pi}\int_{\pi - \arctan(c)}^{\frac{3}{2}\pi} d\phi = \frac{1}{4} + \frac{1}{2\pi}\arctan(c).

6.5 Any linear combination of V and W is a linear combination of X and Y and thus is normally distributed.

6.6 The vector (X, X + Y) has a bivariate normal distribution with parameters \mu_1^* = \mu_2^* = 0, \sigma_1^* = 1, \sigma_2^* = \sqrt{2 + 2\rho}, and \rho^* = \sqrt{\frac{1}{2}(1 + \rho)}.
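The one-dimensional integral in Problem 6.4 is indeed easy to handle numerically; a sketch (midpoint rule with the lower limit truncated at -10, and the standard-normal cdf written via math.erf) that checks it against the closed form for a = \mu_1, b = \mu_2:

```python
import math

# Numerical check of Problem 6.4: for a = mu1, b = mu2 the integral
# int_{-inf}^{0} phi(x) * Phi(-c*x) dx must equal 1/4 + arctan(c)/(2*pi),
# where c = rho/sqrt(1 - rho^2). The value rho = 0.6 is an arbitrary test case.
rho = 0.6
c = rho / math.sqrt(1 - rho * rho)

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

m, lo, hi = 40_000, -10.0, 0.0
h = (hi - lo) / m
integral = sum(phi(lo + (k + 0.5) * h) * Phi(-c * (lo + (k + 0.5) * h))
               for k in range(m)) * h
closed = 0.25 + math.atan(c) / (2 * math.pi)
print(abs(integral - closed) < 1e-6)
```

The midpoint rule already reproduces the closed-form answer to six decimals; truncating at -10 is harmless since the standard normal density there is of order 10^{-23}.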
6.7 Since X and V are linear combinations of X and Y, any linear combination of X and V is normally distributed, and so (X, V) has a bivariate normal distribution. Noting that E(V) = 0 and \sigma^2(V) = (1 + \rho^2 - 2\rho^2)/(1 - \rho^2) = 1, the random variable V is N(0, 1) distributed, like the random variable X. To prove the independence of X and V, it suffices to verify that cov(X, V) = 0. This is immediate from

cov(X, V) = \frac{cov(X, Y) - \rho\sigma^2(X)}{\sqrt{1 - \rho^2}} = \frac{\rho - \rho}{\sqrt{1 - \rho^2}} = 0.

6.8 The solution to this problem requires the fact that any linear combination of two independent normally distributed random variables is again normally distributed (see Rule 8.6 in Chapter 8). Any linear combination of X_1 and X_2 is a linear combination of the independent and normally distributed random variables Z_1 and Z_2 and is thus normally distributed, showing that (X_1, X_2) has a bivariate normal distribution. Using Rule 5.11 and the relations cov(aX + b, cY + d) = ac\,cov(X, Y) for any constants a, b, c, d and cov(X, V + W) = cov(X, V) + cov(X, W), we obtain

E(X_1) = \mu_1, \quad E(X_2) = \mu_2, \quad \sigma^2(X_1) = \sigma_1^2, \quad \sigma^2(X_2) = \sigma_2^2(\rho^2 + 1 - \rho^2) = \sigma_2^2,
\rho(X_1, X_2) = \frac{cov(X_1, X_2)}{\sigma(X_1)\sigma(X_2)} = \frac{\sigma_1\sigma_2\rho\,cov(Z_1, Z_1)}{\sigma_1\sigma_2} = \rho,

where the expressions for \sigma^2(X_2) and cov(X_1, X_2) use the independence of Z_1 and Z_2. Hence the parameters of the bivariate normal distribution of (X_1, X_2) are given by (\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho).

6.9 Any linear combination of X + Y and X - Y is a linear combination of X and Y and thus is normally distributed. This shows that the random vector (X + Y, X - Y) has a bivariate normal distribution. The components X + Y and X - Y are independent if cov(X + Y, X - Y) = 0. Since cov(X + Y, X - Y) = cov(X, X) - cov(X, Y) + cov(X, Y) - cov(Y, Y), it follows that cov(X + Y, X - Y) = \sigma^2(X) - \sigma^2(Y) = 0.

6.10 Go through the path of length n in the opposite direction and next continue this path with m steps.
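The construction in Problem 6.8 is also the standard recipe for simulating a bivariate normal pair; a Monte Carlo sketch (the parameter values are arbitrary test values) checking that the sample correlation comes out at \rho:

```python
import math
import random

# Monte Carlo check of Problem 6.8: with Z1, Z2 independent N(0,1),
# X1 = mu1 + sigma1*Z1 and X2 = mu2 + sigma2*(rho*Z1 + sqrt(1-rho^2)*Z2)
# have correlation coefficient rho.
random.seed(1)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.7
trials = 100_000
xs, ys = [], []
for _ in range(trials):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(mu1 + s1 * z1)
    ys.append(mu2 + s2 * (rho * z1 + math.sqrt(1 - rho * rho) * z2))
mx, my = sum(xs) / trials, sum(ys) / trials
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / trials
sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / trials)
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / trials)
corr = cov / (sx * sy)
print(abs(corr - rho) < 0.01)
```

With 100,000 pairs the sampling error of the correlation estimate is of order (1-\rho^2)/\sqrt{n} \approx 0.002, so agreement to within 0.01 is expected.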
6.11 Since S_{n1} and S_{n2} are approximately N(0, \frac{1}{2}n) distributed for n large, it follows from the results in Problem 4.68 that both |S_{n1}| and |S_{n2}| have the approximate density \frac{2}{\sqrt{\pi n}} e^{-u^2/n} with E(|S_{n1}|) = E(|S_{n2}|) \approx \sqrt{n/\pi}. This gives

E(R_n) \approx 2\sqrt{n/\pi} for large n.

Also, |S_{n1}| and |S_{n2}| are nearly independent for n large. Using the convolution formula in Section 5.2, the density of R_n is approximately

\int_0^r \frac{2}{\sqrt{\pi n}} e^{-u^2/n}\,\frac{2}{\sqrt{\pi n}} e^{-(r-u)^2/n}\,du.

This integral can be rewritten as \frac{4}{\sqrt{2\pi n}} e^{-\frac{1}{2}r^2/n}\,\frac{1}{\sqrt{2\pi}}\int_{-r/\sqrt{n}}^{r/\sqrt{n}} e^{-\frac{1}{2}z^2}\,dz, showing that the density of R_n is approximately

\frac{4}{\sqrt{2\pi n}} e^{-\frac{1}{2}r^2/n}\Big[\Phi\Big(\frac{r}{\sqrt{n}}\Big) - \Phi\Big(\frac{-r}{\sqrt{n}}\Big)\Big] for large n.

6.12 Noting that 70,000 km is equal to 7 \times 10^{10} millimeters, the equality

\sqrt{\frac{8m}{3\pi}} = \frac{7 \times 10^{10}}{10^{-1}}

shows that the average number of collisions that a photon undergoes before reaching the sun's surface is approximately equal to m = 5.773 \times 10^{23}. A photon travels at a speed of 300,000 km per second, and thus the travel time of a photon between two collisions is equal to 10^{-1}/(3 \times 10^{11}) = 3.333 \times 10^{-13} seconds. The average travel time of a photon from the sun's core to its surface is thus approximately equal to (5.773 \times 10^{23}) \times (3.333 \times 10^{-13}) = 1.924 \times 10^{11} seconds. If you divide this by 365.25 \times 24 \times 3{,}600, then you find that the average travel time is approximately 6,000 years. A random walk is not a very fast way to get anywhere! Once it reaches the surface of the sun, it takes a photon only 8 minutes to travel from the surface of the sun to the earth (the distance from the sun to the earth is 149,600,000 km).

6.13 The random variables a_1X_1 + \cdots + a_nX_n and b_1Y_1 + \cdots + b_mY_m are normally distributed for any constants a_1, \ldots, a_n and b_1, \ldots, b_m. Moreover, these random variables are independent (any functions f and g result in independent random variables f(\mathbf{X}) and g(\mathbf{Y}) if \mathbf{X} and \mathbf{Y} are independent).
Since the sum of two independent normally distributed random variables is again normally distributed (see Rule 8.6), a_1X_1 + \cdots + a_nX_n + b_1Y_1 + \cdots + b_mY_m is normally distributed, showing that (\mathbf{X}, \mathbf{Y}) has a multivariate normal distribution.

6.14 Let the random variables X_A, X_B, and X_C denote the annual rates of return on the stocks A, B, and C. Also, let's express all amounts in units of $1,000.
(a) Let the random variable W denote the portfolio's value after one year. Then the random variable W can be written as

W = (1 + X_A) \times 20 + (1 + X_B) \times 20 + (1 + X_C) \times 40 + 1.05 \times 20.

Since the random vector (X_A, X_B, X_C) is assumed to have a trivariate normal distribution, the random variable W is normally distributed with expected value

E(W) = 1.075 \times 20 + 1.1 \times 20 + 1.2 \times 40 + 1.05 \times 20 = 112.5 thousand dollars.

The variance of W is computed as

400 \times \sigma^2(X_A) + 400 \times \sigma^2(X_B) + 1{,}600 \times \sigma^2(X_C) + 2 \times 400 \times 0.7 \times \sigma(X_A)\sigma(X_B) + 2 \times 800 \times (-0.5) \times \sigma(X_A)\sigma(X_C) + 2 \times 800 \times (-0.3) \times \sigma(X_B)\sigma(X_C).

This leads to the standard deviation \sigma(W) = 7.660 thousand dollars.
(b) Suppose that the fractions f_1, f_2, f_3, and f_4 of the investor's capital are invested in the stocks A, B, C, and the riskless asset, respectively. Then the expected value of the portfolio's value (in units of $1,000) after one year is given by

(1.075f_1 + 1.1f_2 + 1.2f_3 + 1.05f_4) \times 100

and the variance is given by

(0.0049f_1^2 + 0.01f_2^2 + 0.04f_3^2 + 0.0098f_1f_2 - 0.014f_1f_3 - 0.012f_2f_3) \times 10^4.

There are many choices for the values of the f_i such that the resulting portfolio has an expected return that is not less than that of the portfolio from question (a) but whose risk is smaller than the risk of the portfolio of question (a).
The optimal values of the f_i are determined by the optimization problem

Minimize 0.0049f_1^2 + 0.01f_2^2 + 0.04f_3^2 + 0.0098f_1f_2 - 0.014f_1f_3 - 0.012f_2f_3

subject to the constraints

1.075f_1 + 1.1f_2 + 1.2f_3 + 1.05f_4 \ge 1.125, \quad f_1 + f_2 + f_3 + f_4 = 1, \quad and f_i \ge 0 for i = 1, \ldots, 4.

This optimization problem is a so-called quadratic programming problem that can be solved numerically by existing codes. The solution is f_1 = 0.381, f_2 = 0.274, f_3 = 0.345, and f_4 = 0. The expected return is again 112.5 thousand dollars, but the standard deviation is now 6.538 thousand dollars.
(c) Since W is normally distributed, the probability that the portfolio's value next year will be less than $112,500 is \Phi(0) = 0.5, and the probability that the portfolio's value next year will be more than $125,000 is

1 - \Phi\Big(\frac{125 - 112.5}{6.619}\Big) = 0.0295.

6.15 The value of the chi-square statistic is

\frac{(60{,}179 - 61{,}419.5)^2}{61{,}419.5} + \frac{(60{,}745 - 59{,}438.2)^2}{59{,}438.2} + \frac{(55{,}551 - 55{,}475.7)^2}{55{,}475.7} + \cdots + \frac{(61{,}334 - 61{,}419.5)^2}{61{,}419.5} = 642.46.

The probability that a \chi^2_{11} distributed random variable takes on a value larger than 642.46 is practically zero. This leaves no room at all for doubt: birth dates are not uniformly distributed over the year.

6.16 The observed value of the test statistic D is 0.470. The probability P(\chi^2_3 \ge 0.470) = 0.925. The agreement with the theory is very good.

6.17 The parameter of the hypothesized Poisson distribution is estimated as \lambda = \frac{37}{78} vacancies per year. The data are divided into three groups: years with 0 vacancies, with 1 vacancy, and with \ge 2 vacancies. Letting p_i = e^{-\lambda}\lambda^i/i!, the expected number of years with 0 vacancies is 78p_0 = 48.5381, with 1 vacancy 78p_1 = 23.0245, and with \ge 2 vacancies 78(1 - p_0 - p_1) = 6.4374. The chi-square test statistic with 3 - 1 - 1 = 1 degree of freedom has the value 0.055. Since P(\chi^2_1 > 0.055) = 0.8145, the Poisson distribution gives an excellent fit.
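The expected counts in Problem 6.17 and the tail probability of a chi-square variable with one degree of freedom can be reproduced with a few lines of Python; a sketch, using the identity P(\chi^2_1 > x) = 2(1 - \Phi(\sqrt{x})):

```python
import math

# Re-computation of the expected counts in Problem 6.17, with lambda = 37/78.
lam = 37 / 78
p0 = math.exp(-lam)
p1 = lam * math.exp(-lam)
e0, e1, e2 = 78 * p0, 78 * p1, 78 * (1 - p0 - p1)

# For 1 degree of freedom, P(chi2_1 > x) = 2*(1 - Phi(sqrt(x))).
pvalue = 2 * (1 - 0.5 * (1 + math.erf(math.sqrt(0.055) / math.sqrt(2))))
print(abs(e0 - 48.5381) < 0.01, abs(e1 - 23.0245) < 0.01,
      abs(e2 - 6.4374) < 0.01, abs(pvalue - 0.8145) < 0.001)
```

This confirms the three expected counts 48.5381, 23.0245, and 6.4374 and the quoted p-value 0.8145.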
6.18 The parameter of the hypothesized Poisson distribution for the number of deaths per corps-year is estimated as

\lambda = \frac{196}{14 \times 20} = 0.7.

The corps-years with 3 or more deaths are aggregated, and so four possible data groups are considered. In the table below we give the observed number and the expected number of corps-years with 0, 1, 2 and \ge 3 deaths, where the expected number of corps-years with exactly k deaths is computed as 280 \times e^{-\lambda}\lambda^k/k!. The value of the test statistic is calculated as

\frac{(144 - 139.0439)^2}{139.0439} + \frac{(91 - 97.3307)^2}{97.3307} + \frac{(32 - 34.0658)^2}{34.0658} + \frac{(13 - 9.5596)^2}{9.5596} = 1.952.

The test statistic has approximately a chi-square distribution with 4 - 1 - 1 = 2 degrees of freedom. The probability P(\chi^2_2 \ge 1.952) = 0.3768. The Poisson distribution gives a good fit.

number of deaths                      0         1        2        \ge 3
observed number of corps-years        144       91       32       13
expected number of corps-years        139.0439  97.3307  34.0658  9.5596

6.19 The parameter of the hypothesized Poisson distribution is estimated as \lambda = 2.25. The matches with 5 or more goals are aggregated, and so six data groups are considered. The test statistic has approximately a chi-square distribution with 6 - 1 - 1 = 4 degrees of freedom and its value is 1.521. The probability P(\chi^2_4 > 1.521) = 0.8229. The Poisson distribution gives an excellent fit.

6.20 The parameter of the hypothesized Poisson distribution is estimated as \lambda = 10097/2608 = 3.8715. The intervals with 11 or more \alpha-particles are aggregated, and so 12 data groups are considered. The expected number of time intervals with exactly k particles is computed as 2608 \times e^{-\lambda}\lambda^k/k! for k = 0, 1, \ldots, 10, while 2608 \times \sum_{j=11}^{\infty} e^{-\lambda}\lambda^j/j! is computed as the expected number of time intervals with 11 or more particles. The value of the test statistic is calculated as 12.364. The test statistic has approximately a chi-square distribution with 12 - 1 = 11 degrees of freedom. The probability P(\chi^2_{11} \ge 12.364) = 0.3369. Thus the Poisson distribution gives a good fit.
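The chi-square statistic of Problem 6.18 can be recomputed directly from the table above; a minimal sketch:

```python
import math

# Re-computation of the chi-square statistic in Problem 6.18: observed corps-year
# counts for 0, 1, 2 and >= 3 deaths versus Poisson(0.7) expectations out of 280.
lam = 196 / 280
observed = [144, 91, 32, 13]
expected = [280 * math.exp(-lam) * lam ** k / math.factorial(k) for k in range(3)]
expected.append(280 - sum(expected))  # aggregate the tail P(>= 3)
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 3))  # 1.952
```

The four expected counts come out as 139.0439, 97.3307, 34.0658, and 9.5596, reproducing both the table and the statistic 1.952.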
6.21 Think of n = 98,364,597 independent repetitions of a chance experiment with seven possible outcomes, where the outcomes 1, 2, \ldots, 6 correspond to the prizes 1, 2, \ldots, 6 and the outcome 7 means that none of these six prizes was won. Denote by p_j the probability of outcome j for j = 1, \ldots, 7. The probabilities p_j are easiest obtained by imagining a vase with 6 red balls (the regular numbers from the lotto drawing), 1 blue ball (the bonus number from the lotto drawing) and 38 black balls (the other numbers). Each of p_1, \ldots, p_6 is then a hypergeometric probability of the form \binom{6}{r}\binom{1}{b}\binom{38}{6-r-b}\big/\binom{45}{6}, with the numbers r and b of red and blue balls matched determined by prize j, and p_7 = 1 - \sum_{j=1}^{6} p_j. Letting N_1 = 2, N_2 = 6, N_3 = 9, N_4 = 35, N_5 = 411, N_6 = 2{,}374, and N_7 = 98{,}361{,}760, and assuming that the tickets are randomly filled in, the value of the test statistic D is calculated as

\sum_{j=1}^{7}\frac{(N_j - np_j)^2}{np_j} = 20.848.

Next we calculate the probability P(\chi^2_6 > 20.848) = 0.00195. This small value is a strong indication that the tickets are not randomly filled in. People do not choose their numbers randomly, but often use birth dates, lucky numbers, arithmetical sequences, etc., to choose lottery numbers. The same conclusion was reached in a similar study of D. Kadell and D. Ylvisaker entitled "Lotto play: the good, the fair and the truly awful," Chance 4 (1991): 22-25.

Chapter 7

7.1 To specify P(X = x | Y = 2) for x = 0, 1, we use the results from Table 5.1. From this table, we get P(X = 0, Y = 2) = (1-p)^2, P(X = 1, Y = 2) = 2p(1-p)^2 and P(Y = 2) = (1-p)^2 + 2p(1-p)^2. Therefore

P(X = 0 \mid Y = 2) = \frac{1}{1 + 2p} and P(X = 1 \mid Y = 2) = \frac{2p}{1 + 2p}.

7.2 First the marginal mass function of Y must be determined. Using Newton's binomial formula, it follows that

P(Y = y) = \sum_{x=0}^{y}\binom{y}{x}\Big(\frac{1}{6}\Big)^x\Big(\frac{1}{3}\Big)^{y-x} = \Big(\frac{1}{3}\Big)^y\sum_{x=0}^{y}\binom{y}{x}\Big(\frac{1}{2}\Big)^x = \Big(\frac{1}{3}\Big)^y\Big(\frac{1}{2} + 1\Big)^y = \Big(\frac{1}{2}\Big)^y for y = 1, 2, \ldots.
That is, the random variable Y is geometrically distributed with parameter \frac{1}{2}. Hence the conditional mass function of X given that Y = y is given by

P(X = x \mid Y = y) = \frac{\binom{y}{x}(1/6)^x(1/3)^{y-x}}{(1/2)^y} = \binom{y}{x}\Big(\frac{2}{3}\Big)^y\Big(\frac{1}{2}\Big)^x for x = 0, 1, \ldots, y.

Note: Using the identity \sum_{y=x}^{\infty}\binom{y}{x}a^{y-x} = (1-a)^{-x-1} for |a| < 1, it is readily verified that the marginal mass function of X is given by P(X = 0) = \frac{1}{2} and P(X = x) = \frac{3}{2}\big(\frac{1}{4}\big)^x for x = 1, 2, \ldots. The conditional mass function of Y given that X = 0 is

P(Y = y \mid X = 0) = 2\Big(\frac{1}{3}\Big)^y for y \ge 1.

For x \ge 1 the conditional mass function of Y given that X = x is

P(Y = y \mid X = x) = \binom{y}{x}\Big(\frac{2}{3}\Big)^{x+1}\Big(\frac{1}{3}\Big)^{y-x} for y \ge x.

7.3 Since P(X = 1, Y = 2) = \frac{1}{6}\times\frac{1}{6}, P(X = x, Y = 2) = \frac{4}{6}\times\frac{1}{6}\times\big(\frac{5}{6}\big)^{x-3}\times\frac{1}{6} for x \ge 3, and P(Y = 2) = \frac{5}{6}\times\frac{1}{6}, it follows that

P(X = x \mid Y = 2) = \frac{1}{5} for x = 1 and P(X = x \mid Y = 2) = \frac{4}{30}\Big(\frac{5}{6}\Big)^{x-3} for x \ge 3.

In the same way,

P(X = x \mid Y = 20) = \frac{1}{5}\Big(\frac{4}{5}\Big)^{x-1} for 1 \le x \le 19 and P(X = x \mid Y = 20) = \frac{1}{6}\Big(\frac{4}{5}\Big)^{19}\Big(\frac{5}{6}\Big)^{x-21} for x \ge 21.

7.4 In Problem 5.5 it was shown that the joint probability mass function of X and Y is

P(X = x, Y = y) = \frac{y - x - 1}{120} for 1 \le x \le 8 and x + 2 \le y \le 10,

and that the marginal distributions of X and Y are

P(X = x) = \frac{(10 - x)(9 - x)}{240} for 1 \le x \le 8 and P(Y = y) = \frac{(y - 1)(y - 2)}{240} for 3 \le y \le 10.

It now follows that, for fixed y, the conditional probability mass function of X given that Y = y is

P(X = x \mid Y = y) = \frac{2(y - x - 1)}{(y - 1)(y - 2)} for x = 1, \ldots, y - 2.

For fixed x, the conditional probability mass function of Y given that X = x is

P(Y = y \mid X = x) = \frac{2(y - x - 1)}{(10 - x)(9 - x)} for y = x + 2, \ldots, 10.

7.5 The joint mass function of X and Y is

P(X = x, Y = y) = \Big(\frac{5}{6}\Big)^{x-1}\frac{1}{6}\binom{x}{y}\Big(\frac{1}{2}\Big)^x for 0 \le y \le x and x \ge 1.

The marginal mass function of Y is

P(Y = y) = \frac{1}{5}\sum_{x=y}^{\infty}\binom{x}{y}\Big(\frac{5}{12}\Big)^x = \frac{12}{35}\Big(\frac{5}{7}\Big)^y for y \ge 1

and

P(Y = 0) = \sum_{x=1}^{\infty}\Big(\frac{5}{6}\Big)^{x-1}\frac{1}{6}\Big(\frac{1}{2}\Big)^x = \frac{1}{7}.

For fixed y \ge 1,

P(X = x \mid Y = y) = \frac{7}{12}\binom{x}{y}\Big(\frac{5}{12}\Big)^x\Big(\frac{7}{5}\Big)^y for x \ge y.

Further, P(X = x \mid Y = 0) = \frac{7}{5}\big(\frac{5}{12}\big)^x for x \ge 1.
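The marginal of Y found in Problem 7.5 is easy to check by simulation; a sketch (reading the joint mass function as its exact factorization: X is geometric with success probability 1/6, i.e. the number of rolls of a fair die up to and including the first six, and given X = x, Y is binomial(x, 1/2)):

```python
import random

# Monte Carlo check of Problem 7.5: the joint mass function factorizes as
# geometric(1/6) for X times binomial(x, 1/2) for Y given X = x, so the
# marginal should satisfy P(Y = 0) = 1/7 and P(Y = y) = (12/35)(5/7)^y, y >= 1.
random.seed(1)
trials = 200_000
counts = {}
for _ in range(trials):
    x = 1
    while random.randrange(6) != 0:   # roll until the first six
        x += 1
    y = sum(random.random() < 0.5 for _ in range(x))
    counts[y] = counts.get(y, 0) + 1
p0 = counts.get(0, 0) / trials
p1 = counts.get(1, 0) / trials
print(abs(p0 - 1 / 7) < 0.005, abs(p1 - (12 / 35) * (5 / 7)) < 0.005)
```

The empirical frequencies of Y = 0 and Y = 1 match 1/7 = 0.1429 and (12/35)(5/7) = 0.2449 to within Monte Carlo error.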
7.6 The joint probability mass function of X and Y is given by P(X = i, Y = i) = \frac{1}{36} for 1 \le i \le 6 and P(X = i, Y = j) = \frac{2}{36} for 1 \le i < j \le 6. The marginal mass functions of X and Y are

P(X = i) = \frac{2(6 - i)}{36} + \frac{1}{36} and P(Y = j) = \frac{2(j - 1)}{36} + \frac{1}{36}

for 1 \le i \le 6 and 1 \le j \le 6. This leads to

P(X = i \mid Y = j) = \frac{1}{1 + 2(j - 1)} for i = j and P(X = i \mid Y = j) = \frac{2}{1 + 2(j - 1)} for i < j,
P(Y = j \mid X = i) = \frac{1}{1 + 2(6 - i)} for j = i and P(Y = j \mid X = i) = \frac{2}{1 + 2(6 - i)} for j > i.

7.7 The joint mass function of X and Y is

P(X = x, Y = y) = \binom{24}{x}\Big(\frac{1}{6}\Big)^x\Big(\frac{5}{6}\Big)^{24-x}\binom{x}{y}\Big(\frac{1}{6}\Big)^y\Big(\frac{5}{6}\Big)^{x-y}

for 0 \le x \le 24 and 0 \le y \le x. Thus the marginal mass function of Y is

P(Y = y) = \Big(\frac{1}{6}\Big)^y\Big(\frac{5}{6}\Big)^{24-y}\sum_{x=y}^{24}\binom{24}{x}\binom{x}{y}\Big(\frac{1}{6}\Big)^x for 0 \le y \le 24.

For fixed y, the conditional mass function of X is

P(X = x \mid Y = y) = \frac{\binom{24}{x}\binom{x}{y}(1/6)^x}{\sum_{k=y}^{24}\binom{24}{k}\binom{k}{y}(1/6)^k} for y \le x \le 24.

7.8 The joint probability mass function of X and Y is given by

P(X = x, Y = y) = \frac{\binom{13}{x}\binom{13}{y}\binom{26}{13-x-y}}{\binom{52}{13}} for x + y \le 13.

The marginal mass functions of X and Y are

P(X = x) = \frac{\binom{13}{x}\binom{39}{13-x}}{\binom{52}{13}} and P(Y = y) = \frac{\binom{13}{y}\binom{39}{13-y}}{\binom{52}{13}}

for x = 0, 1, \ldots, 13 and y = 0, 1, \ldots, 13. Thus the conditional mass functions are given by

P(X = x \mid Y = y) = \frac{\binom{13}{x}\binom{26}{13-x-y}}{\binom{39}{13-y}} and P(Y = y \mid X = x) = \frac{\binom{13}{y}\binom{26}{13-y-x}}{\binom{39}{13-x}}.

7.9 The marginal densities of X and Y are given by

f_X(x) = \int_0^{\infty} xe^{-x(y+1)}\,dy = e^{-x} for x > 0,
f_Y(y) = \int_0^{\infty} xe^{-x(y+1)}\,dx = \frac{1}{(1+y)^2} for y > 0.

Therefore the conditional density functions of X and Y are

f_X(x \mid y) = (y + 1)^2 xe^{-x(y+1)} for x > 0 and f_Y(y \mid x) = xe^{-xy} for y > 0.

Further, P(Y > 1 \mid X = 1) = \int_1^{\infty} f_Y(y \mid 1)\,dy = \int_1^{\infty} e^{-y}\,dy = e^{-1}.

7.10 The marginal densities are f_X(x) = \int_0^1 (x - y + 1)\,dy = x + 0.5 for 0 < x < 1 and f_Y(y) = \int_0^1 (x - y + 1)\,dx = 1.5 - y for 0 < y < 1. Thus the conditional density functions of X and Y are

f_X(x \mid y) = \frac{x - y + 1}{1.5 - y} and f_Y(y \mid x) = \frac{x - y + 1}{x + 0.5}

for 0 < x < 1 and 0 < y < 1. This gives

P(X > 0.5 \mid Y = 0.25) = \int_{0.5}^{1}\frac{4}{5}(x + 0.75)\,dx = 0.6,
P(Y > 0.5 \mid X = 0.25) = \int_{0.5}^{1}\frac{4}{3}(1.25 - y)\,dy = \frac{1}{3}.
7.11 The marginal probability densities of X and Y are

f_X(x) = \int_0^x \frac{1}{x}\,dy = 1 for 0 < x < 1 and f_Y(y) = \int_y^1 \frac{1}{x}\,dx = -\ln(y) for 0 < y < 1.

Therefore, for any given y with 0 < y < 1, the conditional density is f_X(x \mid y) = -\frac{1}{x\ln(y)} for y \le x < 1. For any given x with 0 < x < 1, the conditional density is f_Y(y \mid x) = \frac{1}{x} for 0 < y \le x.

7.12 The marginal density functions of X and Y are

f_X(x) = \int_x^{\infty} e^{-y}\,dy = e^{-x} for x > 0 and f_Y(y) = \int_0^y e^{-y}\,dx = ye^{-y} for y > 0.

Thus the conditional density functions of X and Y are f_X(x \mid y) = \frac{1}{y} for 0 < x < y and f_Y(y \mid x) = e^{-(y-x)} for y > x.

7.13 We have f_X(x) = 1 for 0 < x < 1 and f_Y(y \mid x) = \frac{1}{x} for 1 - x < y < 1. By f(x, y) = f_X(x)f_Y(y \mid x), the joint density of X and Y is f(x, y) = \frac{1}{x} for 0 < x < 1 and 1 - x < y < 1. Hence, using the basic formula P((X, Y) \in C) = \iint_C f(x, y)\,dx\,dy, we get

P(X + Y > 1.5) = \int_{0.5}^{1} dx\int_{1.5-x}^{1}\frac{1}{x}\,dy = 0.5\ln(0.5) + 0.5 = 0.1534,
P(Y > 0.5) = \int_0^{0.5} dx\int_{1-x}^{1}\frac{1}{x}\,dy + \int_{0.5}^{1} dx\int_{0.5}^{1}\frac{1}{x}\,dy = 0.5 - 0.5\ln(0.5) = 0.8466.

7.14 Since f_Y(y) = \int_0^{1-y} 3(x + y)\,dx = \frac{3}{2}(1 - y)^2 + 3y(1 - y) for 0 < y < 1, we have for fixed y that

f_X(x \mid y) = \frac{x + y}{\frac{1}{2}(1 - y)^2 + y(1 - y)} for 0 < x < 1 - y.

7.15 By f(x, y) = f_X(x)f_Y(y \mid x), the joint density of X and Y is f(x, y) = 2x \times \frac{1}{x} = 2 for 0 < y \le x < 1. The marginal density of Y is f_Y(y) = 2(1 - y) for 0 < y < 1. Using again f(x, y) = f_Y(y)f_X(x \mid y), we get

f_X(x \mid y) = \frac{1}{1 - y} for y \le x < 1,
7.17 Put for abbreviation
$$P_{\Delta y}(x \mid y) = P\Big(X=x \,\Big|\, y-\tfrac{1}{2}\Delta y \le Y \le y+\tfrac{1}{2}\Delta y\Big).$$
Then
$$P_{\Delta y}(x \mid y) = \frac{P\big(y-\tfrac{1}{2}\Delta y \le Y \le y+\tfrac{1}{2}\Delta y \mid X=x\big)\,P(X=x)}{P\big(y-\tfrac{1}{2}\Delta y \le Y \le y+\tfrac{1}{2}\Delta y\big)}.$$
Thus, for continuity points $y$, we have
$$P_{\Delta y}(x \mid y) \approx \frac{p_X(x)f_Y(y \mid x)\Delta y}{f_Y(y)\Delta y} = \frac{p_X(x)f_Y(y \mid x)}{f_Y(y)}.$$
Define $p_X(x \mid y)$ as $\lim_{\Delta y \to 0}P_{\Delta y}(x \mid y)$. Then, for fixed $y$, $p_X(x \mid y)$ as a function of $x$ is proportional to $p_X(x)f_Y(y \mid x)$. The proportionality constant is the reciprocal of $\sum_x p_X(x)f_Y(y \mid x)$. This explains the definition of the conditional mass function of $X$.

7.18 Assuming that the random noise $N$ is independent of $X$,
$$P(Y \le y \mid X=1) = P(N \le y-1) = P\Big(\frac{N-0}{\sigma} \le \frac{y-1}{\sigma}\Big) = \Phi\Big(\frac{y-1}{\sigma}\Big) \quad\text{for } -\infty<y<\infty.$$
In the same way, $P(Y \le y \mid X=-1) = \Phi\big(\frac{y+1}{\sigma}\big)$ for $-\infty<y<\infty$. Differentiation gives
$$f_Y(y \mid x) = \begin{cases} \big(1/\sigma\sqrt{2\pi}\big)\,e^{-\frac{1}{2}(y-1)^2/\sigma^2} & \text{for } x=1,\\ \big(1/\sigma\sqrt{2\pi}\big)\,e^{-\frac{1}{2}(y+1)^2/\sigma^2} & \text{for } x=-1.\end{cases}$$
Next apply the general formula
$$p_X(x \mid y) = \frac{p_X(x)f_Y(y \mid x)}{\sum_u p_X(u)f_Y(y \mid u)}.$$
This formula was derived in Problem 7.17. Hence we find
$$P(X=1 \mid Y=y) = \frac{pe^{-\frac{1}{2}(y-1)^2/\sigma^2}}{pe^{-\frac{1}{2}(y-1)^2/\sigma^2}+(1-p)e^{-\frac{1}{2}(y+1)^2/\sigma^2}}.$$

7.19 Let $Y$ be the time needed to process a randomly chosen claim. For fixed $0<x<1$, we have $f_Y(y \mid x) = \frac{1}{x}$ for $x<y<2x$. By the relation $f(x,y) = f_Y(y \mid x)f_X(x)$, the joint density function of $X$ and $Y$ satisfies $f(x,y) = 1.5(2-x)$ for $0<x<1$ and $x<y<2x$, and $f(x,y)=0$ otherwise. The sought probability is $P((X,Y) \in C)$ with $C = \{(x,y): 0 \le x \le 1,\ x \le y \le \min(2x,1)\}$ and is evaluated as
$$\int_0^1\Big(\int_x^{\min(2x,1)} 1.5(2-x)\,dy\Big)dx = \int_0^{0.5}1.5(2-x)x\,dx + \int_{0.5}^1 1.5(2-x)(1-x)\,dx = 0.5625.$$

7.20 By the independence of $X$ and $Y$, the joint density function $f(x,y)$ of $X$ and $Y$ is given by $f(x,y) = \lambda e^{-\lambda x}\lambda e^{-\lambda y}$ for $x,y>0$. Let $V=X$ and $W=X+Y$. To obtain $f_{V,W}(v,w)$, we use transformation rule 5.7. The functions $a(v,w)$ and $b(v,w)$ are $a(v,w)=v$ and $b(v,w)=w-v$. The Jacobian is equal to 1. Hence the joint density of $V$ and $W$ is
$$f_{V,W}(v,w) = \lambda e^{-\lambda v}\lambda e^{-\lambda(w-v)} = \lambda^2 e^{-\lambda w} \quad\text{for } 0<v<w<\infty.$$
The marginal density of $W$ is given by $f_W(w) = \int_0^w \lambda^2 e^{-\lambda w}\,dv = w\lambda^2 e^{-\lambda w}$ for $w>0$. Hence, for any fixed $w$, the conditional density of $V$ given that $W=w$ is
$$f_V(v \mid w) = \frac{\lambda^2 e^{-\lambda w}}{w\lambda^2 e^{-\lambda w}} = \frac{1}{w} \quad\text{for } 0<v<w.$$
This verifies that the conditional density of $X$ given that $X+Y=w$ is the uniform density on $(0,w)$.

7.21 We have $P(N=k) = \int_0^1 P(N=k \mid X_1=u)\,du$, by the law of conditional probability. Thus
$$P(N=k) = \int_0^1 u^{k-2}(1-u)\,du = \frac{1}{k(k-1)} \quad\text{for } k=2,3,\ldots.$$
The expected value of $N$ is equal to $\sum_{k=2}^\infty \frac{1}{k-1} = \infty$.

7.22 The number $p$ is a random observation from a random variable $U$ that is uniformly distributed on $(0,1)$. By the law of conditional probability,
$$P(X=k) = \int_0^1 P(X=k \mid U=p)\,dp = \int_0^1 \binom{n}{k}p^k(1-p)^{n-k}\,dp$$
for $k=0,1,\ldots,n$. Using the fact that the beta integral $\int_0^1 x^{r-1}(1-x)^{s-1}\,dx$ is equal to $(r-1)!\,(s-1)!/(r+s-1)!$ for positive integers $r$ and $s$, we next obtain
$$P(X=k) = \binom{n}{k}\frac{k!\,(n-k)!}{(n+1)!} = \frac{1}{n+1} \quad\text{for } k=0,1,\ldots,n.$$

7.23 Condition on the unloading time. By the law of conditional probability, the probability of no breakdown is given by
$$\int_{-\infty}^\infty e^{-\lambda y}\,\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(y-\mu)^2/\sigma^2}\,dy = e^{-\mu\lambda+\frac{1}{2}\sigma^2\lambda^2}.$$

7.24 Let the random variable $R$ denote the number of passengers who make a reservation for a given trip. Then $P(R=r) = \frac{1}{6}$ for $r=5,\ldots,10$. By the law of conditional probability,
$$P(V=j) = \sum_{r=j}^{10}\binom{r}{j}0.8^j\,0.2^{r-j}\,P(R=r) \quad\text{for } j=0,1,\ldots,10,$$
$$P(W=k) = \sum_{j=0}^{10}P(W=k \mid V=j)P(V=j) \quad\text{for } k=0,1,2,3.$$
The probability $P(V=j)$ has the values 0.0001, 0.0014, 0.0121, 0.0547, 0.1397, 0.2059, 0.1978, 0.1748, 0.1286, 0.0671, and 0.0179 for $j=0,1,\ldots,10$. The probability mass function of $W$ is
$$P(W=0) = 0.25\big[P(V=0)+\cdots+P(V=9)\big]+P(V=10) = 0.2634,$$
$$P(W=1) = 0.45\big[P(V=0)+\cdots+P(V=8)\big]+0.75\,P(V=9) = 0.4621,$$
$$P(W=2) = 0.20\big[P(V=0)+\cdots+P(V=7)\big]+0.30\,P(V=8) = 0.1959,$$
$$P(W=3) = 0.10\big[P(V=0)+\cdots+P(V=7)\big] = 0.0786.$$
7.25 Let $Y$ be the outcome of the first roll of the die. Then, by $P(X=k) = \sum_{i=1}^6 P(X=k \mid Y=i)P(Y=i)$, we get
$$P(X=k) = \sum_{i=1}^{5}\frac{1}{6}\binom{i}{k}\Big(\frac{1}{6}\Big)^k\Big(\frac{5}{6}\Big)^{i-k} + \frac{1}{6}\binom{6}{k-1}\Big(\frac{1}{6}\Big)^{k-1}\Big(\frac{5}{6}\Big)^{6-k+1}$$
for $k=0,1,\ldots,7$, with the convention $\binom{i}{k}=0$ for $k>i$ and $\binom{6}{-1}=0$. This probability has the numerical values 0.4984, 0.3190, 0.1293, 0.0422, 0.0096, 0.0014, $1.1\times10^{-4}$, and $3.6\times10^{-6}$.

7.26 Let $f(x)$ be the gamma density with shape parameter $r$ and scale parameter $(1-p)/p$. Then $P(N=j) = \int_0^\infty P(N=j \mid X=x)f(x)\,dx$, by the law of conditional probability. Thus
$$P(N=j) = \int_0^\infty e^{-x}\frac{x^j}{j!}\Big(\frac{1-p}{p}\Big)^r\frac{x^{r-1}}{(r-1)!}e^{-x(1-p)/p}\,dx = \frac{(r+j-1)!}{j!\,(r-1)!}\,p^j(1-p)^r\int_0^\infty\Big(\frac{1}{p}\Big)^{r+j}\frac{x^{r+j-1}}{(r+j-1)!}e^{-x/p}\,dx.$$
Since the gamma density $(1/p)^{r+j}x^{r+j-1}e^{-x/p}/(r+j-1)!$ integrates to 1 over $(0,\infty)$, it next follows that
$$P(N=j) = \binom{r+j-1}{r-1}p^j(1-p)^r \quad\text{for } j=0,1,\ldots.$$
This can be written as $P(N=k-r) = \binom{k-1}{r-1}p^{k-r}(1-p)^r$ for $k=r,r+1,\ldots$. In other words, the random variable $N+r$ has a negative binomial distribution with parameters $r$ and $p$ (the random variable $N$ gives the number of failures before the $r$th success occurs).

7.27 By the law of conditional probability, the probability of having $k$ red balls among the $r$ selected balls is
$$\sum_{n=0}^{B}\frac{\binom{n}{k}\binom{B-n}{r-k}}{\binom{B}{r}}\binom{B}{n}p^n(1-p)^{B-n}.$$
This probability can be simplified to $\binom{r}{k}p^k(1-p)^{r-k}$. This result can be directly seen by assuming that the $B$ balls are originally non-colored and giving each of the $r$ balls chosen the color red with probability $p$.

7.28 Denote by $f_1(x)$ and $f_2(x)$ the probability densities of the random variables $X_1$ and $X_2$. Let us first point out that $pf_1(x)+(1-p)f_2(x)$ is not the probability density of $W = pX_1+(1-p)X_2$, as many students erroneously believe. As a counterexample, take $p=\frac{1}{2}$ and assume that $X_1$ and $X_2$ are independent random variables having the uniform distribution on $(0,1)$. Then $pf_1(x)+(1-p)f_2(x)$ is the uniform density on $(0,1)$, but $\frac{1}{2}X_1+\frac{1}{2}X_2$ has a triangular density rather than a uniform density. The random variable $V$ is distributed as $X_1$ with probability $p$ and as $X_2$ with probability $1-p$. Then, by the law of conditional probability, $P(V \le x) = pP(X_1 \le x)+(1-p)P(X_2 \le x)$, and so $pf_1(x)+(1-p)f_2(x)$ is the probability density of $V$. In contrast, when the $N(\mu_1,\sigma_1^2)$ distributed $X_1$ and the $N(\mu_2,\sigma_2^2)$ distributed $X_2$ are independent of each other, $W = pX_1+(1-p)X_2$ has the $N\big(p\mu_1+(1-p)\mu_2,\ p^2\sigma_1^2+(1-p)^2\sigma_2^2\big)$ density. This result uses the fact that the sum of two independent normal random variables is normally distributed, see Rule 8.6.

7.29 By the law of conditional probability,
$$P(B^2 \ge 4AC) = \int_0^1 P\Big(AC \le \frac{b^2}{4}\Big)db \quad\text{with}\quad P\Big(AC \le \frac{b^2}{4}\Big) = \int_0^1 P\Big(C \le \frac{b^2}{4a}\Big)da.$$
Thus
$$P(B^2 \ge 4AC) = \int_0^1\Big[\int_0^{b^2/4}da+\int_{b^2/4}^1\frac{b^2}{4a}\,da\Big]db = \int_0^1\Big[\frac{b^2}{4}-\frac{b^2}{4}\ln\Big(\frac{b^2}{4}\Big)\Big]db = \frac{5}{36}+\frac{1}{6}\ln(2) = 0.2544.$$

7.30 Let the random variable $X_1$ be the first number picked. Also, let the random variable $\chi$ be 1 if you have to pick exactly two numbers and be 0 otherwise. By the law of conditional probability, the probability that you have to pick exactly two numbers is
$$P(\chi=1) = \int_{0.5}^1 P(\chi=1 \mid X_1=x_1)\,dx_1 = \int_{0.5}^1 \frac{0.5}{x_1}\,dx_1 = -\frac{1}{2}\ln\Big(\frac{1}{2}\Big).$$
Similarly, we get that the probability that you have to pick exactly three numbers is equal to
$$\int_{0.5}^1\frac{1}{x_1}\Big(\int_{0.5}^{x_1}\frac{0.5}{x_2}\,dx_2\Big)dx_1 = \frac{1}{4}\Big[\ln\Big(\frac{1}{2}\Big)\Big]^2.$$
In general, for any $0<a<1$, the probability that exactly $n$ numbers must be picked in order to obtain a number less than $a$ is $a\big(-\ln(a)\big)^{n-1}/(n-1)!$ for $n=1,2,\ldots$ (by writing $a$ as $e^{-(-\ln(a))}$, this probability mass function can be seen as a shifted Poisson distribution). The expected value of the number of picks is $1-\ln(a)$.
Note: A funny illustration of the discrete version of the problem is as follows. It is your birthday. You are asked to blow out all of the $c$ burning candles on your birthday cake. The number of burning candles that expire when you blow while there are still $d$ burning candles has the discrete uniform distribution on $0,1,\ldots,d$. Then, using the law of conditional expectation, it is readily verified that the expected value of the number of attempts to blow out all $c$ candles is $1+\sum_{k=1}^{c}\frac{1}{k}$. However, the probability mass function of the number of attempts is rather difficult to calculate.
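The closed-form answer to Problem 7.29 can be checked by a quick Monte Carlo experiment; the seed and the sample size below are arbitrary choices:

```python
import random
from math import log

random.seed(2017)
n = 10**6
# B^2 >= 4AC with A, B, C independent uniform(0, 1) random variables.
hits = sum(random.random()**2 >= 4 * random.random() * random.random()
           for _ in range(n))
exact = 5 / 36 + log(2) / 6
print(round(hits / n, 3), round(exact, 4))  # exact ≈ 0.2544
```

With $10^6$ samples the standard error is about 0.0004, so the simulated frequency should agree with 0.2544 to roughly three decimals.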
7.31 The expected number of crossings of the zero level during the first $n$ jumps is $\sum_{k=1}^{n-1}E(I_k)$, where $E(I_k) = P(I_k=1)$. Denote by $S_k$ the position of the particle just before the $(k+1)$th jump. Then $S_k$ is the sum of $k$ independent standard normally distributed random variables and is thus normally distributed (see also Rule 8.6). The random variable $S_k$ has expected value 0 and variance $k$. Thus $S_k$ has the density function $\frac{1}{\sqrt{2\pi k}}e^{-\frac{1}{2}x^2/k}$. By conditioning on $S_k$, we get that $P(I_k=1)$ is equal to
$$\int_0^\infty\frac{1}{\sqrt{2\pi k}}e^{-\frac{1}{2}x^2/k}\Big(\int_{-\infty}^{-x}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}y^2}\,dy\Big)dx + \int_{-\infty}^0\frac{1}{\sqrt{2\pi k}}e^{-\frac{1}{2}x^2/k}\Big(\int_{-x}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}y^2}\,dy\Big)dx.$$
Using polar coordinates in order to evaluate these integrals, it next follows that $P(I_k=1)$ is equal to $\frac{1}{\pi}\big(\frac{\pi}{2}-\arctan(\sqrt{k})\big) = \frac{1}{\pi}\arctan\big(\frac{1}{\sqrt{k}}\big)$. Therefore
$$\sum_{k=1}^{n-1}E(I_k) = \frac{1}{\pi}\sum_{k=1}^{n-1}\arctan\Big(\frac{1}{\sqrt{k}}\Big).$$
Note: An asymptotic expansion for this sum is $\frac{2}{\pi}\sqrt{n}+c+\frac{1}{6\pi\sqrt{n}}$, where $c = -0.68683\ldots$.

7.32 Take the minute as time unit and represent the period between 5.45 and 6 p.m. by the interval $(0,15)$. If you arrive at time point $x$ at the bus stop, you will take bus number 1 home if no bus number 3 arrives in the next $15-x$ time units. The probability of this happening is $e^{-\lambda(15-x)}$ with $\lambda = \frac{1}{15}$. This follows from the fact that the exponential distribution is memoryless. By conditioning on your arrival epoch, which has the uniform distribution on $(0,15)$ with density $f(x) = \frac{1}{15}$ for $0<x<15$, and using the law of conditional probability, it now follows that
$$P(\text{you take bus 1 home}) = \int_0^{15}e^{-\frac{1}{15}(15-x)}f(x)\,dx = \frac{1}{15}\int_0^{15}e^{-\frac{1}{15}(15-x)}\,dx = 1-\frac{1}{e}.$$
An intuitive explanation of why this probability is larger than $\frac{1}{2}$ is as follows. If you arrive at a random point in time at the bus stop, your average waiting time for bus number 3 is 15 minutes (by the memoryless property of the exponential distribution), while your average waiting time for bus number 1 is 7.5 minutes.

7.33 It suffices to find $P(a,b)$ for $a \ge b$; by symmetry, $P(a,b) = 1-P(b,a)$ for $a \le b$. For fixed $a$ and $b$ with $a \ge b$, let the random variables $S_A$ and $S_B$ be the total scores of the players A and B. Let $f_A(s)$ be the probability density of $S_A$. Then, by the law of conditional probability,
$$P(a,b) = \int_0^1 P(\text{A beats B} \mid S_A=s)f_A(s)\,ds.$$
By conditioning on the outcome of the first draw of player A, it follows that
$$P(S_A \le s) = \int_0^s(s-u)\,du \quad\text{for } 0<s \le a,$$
$$P(S_A>s) = 1-s+\int_0^a\big(1-(s-u)\big)du \ \text{ for } a<s \le 1, \qquad P(S_A>s) = \int_{s-1}^a\big(1-(s-u)\big)du \ \text{ for } 1<s<1+a.$$
Differentiation gives that the density function $f_A(s)$ of $S_A$ is $s$ for $0<s \le a$, $1+a$ for $a<s \le 1$, and $1+a-s$ for $1<s<1+a$. The distribution of $S_B$ follows by replacing $a$ by $b$ in the distribution of $S_A$. Next it is a matter of tedious algebra to obtain
$$P(a,b) = \frac{1}{2}-\frac{1}{6}(a-b)\big(a^2b+a^2+ab^2+b^2+ab+3a-3\big) \quad\text{for } a \ge b.$$
Also, by a symmetry argument,
$$P(a,b) = \frac{1}{2}+\frac{1}{6}(b-a)\big(b^2a+b^2+ba^2+a^2+ba+3b-3\big) \quad\text{for } a \le b,$$
using the fact that $P(a,b) = 1-P(b,a)$ for $a \le b$. Let $a_0$ be the optimal threshold value of player A. Then $P(a_0,b) \ge 0.5$ for all $b$, with $P(a_0,b) = 0.5$ for $b=a_0$. This leads to the equation $2a_0^3+3a_0^2+3a_0-3 = 0$. The solution of this equation is $a_0 = 0.5634$. If player A uses this threshold value, his win probability is at least 50%, whatever threshold player B uses.
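The asymptotic expansion quoted in the note to Problem 7.31 can be compared with the exact sum; a small check ($n = 10^4$ is an arbitrary choice):

```python
from math import atan, pi, sqrt

def crossings(n):
    # exact expected number of zero crossings in the first n jumps:
    # (1/pi) * sum_{k=1}^{n-1} arctan(1/sqrt(k))
    return sum(atan(1 / sqrt(k)) for k in range(1, n)) / pi

n = 10**4
asym = (2 / pi) * sqrt(n) - 0.68683 + 1 / (6 * pi * sqrt(n))
print(round(crossings(n), 3), round(asym, 3))
```

For $n$ of this size the two values agree to better than three decimal places.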
7.34 It should be clear that each player uses a strategy of the following form: choose a new number if the original number is below a given threshold, otherwise keep the original number. Let $P(a,b)$ denote the winning probability of player A when player A uses a threshold $a$ and player B uses a threshold $b$. Player A wants to use the threshold $a = a^*$, where $a^*$ attains the maximum in $\max_a\min_b P(a,b)$. By the law of conditional probability, we have for any given $a,b$ that
$$P(a,b) = \int_0^1 dx\int_0^1 a(x,y)\,dy,$$
where $a(x,y)$ is player A's winning probability if the original number of player A is $x$ and the original number of player B is $y$. Consider first the case of $a \ge b$. Then, for $x \le a$ we have that $a(x,y) = 1-y$ for $y>b$ and $a(x,y) = \frac{1}{2}$ for $y<b$, while for $x>a$ we have that $a(x,y) = 1$ for $b<y<x$, $a(x,y) = 0$ for $y>x$, and $a(x,y) = x$ for $y<b$. This leads to
$$P(a,b) = \frac{1}{2}+\frac{1}{2}\big(a-b-a^2+ab+ab^2-a^2b\big) \quad\text{for } a \ge b.$$
By a symmetry argument, we have $P(a,b) = 1-P(b,a)$ for $a \le b$. This gives
$$P(a,b) = \frac{1}{2}+\frac{1}{2}\big(a-b+b^2-ab+ab^2-a^2b\big) \quad\text{for } a \le b.$$
It is not necessary to invoke a numerical method for getting the number $a$ that attains the maximum in $\max_a\min_b P(a,b)$. It is not difficult to verify by analytical means that this number is given by $a^* = \frac{1}{2}(\sqrt{5}-1)$. To prove this, we write $P(a,b)$ as $P(a,b) = \frac{1}{2}+\frac{1}{2}(a-b)(1-a-ab)$ for $a \ge b$ and $P(a,b) = \frac{1}{2}+\frac{1}{2}(a-b)(1-b-ab)$ for $a \le b$. Using these expressions and the fact that $a^* = \frac{1}{2}(\sqrt{5}-1)$ satisfies $a^* \times a^*+a^* = 1$, it is directly verified that $P(a^*,b)>\frac{1}{2}$ both for $b>a^*$ and for $b<a^*$ (of course, $P(a^*,b) = \frac{1}{2}$ for $b=a^*$). Hence, if player A chooses his threshold as $\frac{1}{2}(\sqrt{5}-1)$ he will win with a probability of more than 50% unless player B also uses the threshold $\frac{1}{2}(\sqrt{5}-1)$, in which case player A wins with a probability of exactly 50%.

7.35 Denote by $S_3(a)$ [$C_3(a)$] the probability of player A being overall winner if player A gets the score $a$ at the first draw and stops [continues] after the first draw. By conditioning on the outcome of the second draw of player A,
$$C_3(a) = \int_0^{1-a}S_3(a+v)\,dv \quad\text{for } 0<a<1.$$
The function $S_3(a)$ is increasing with $S_3(0)=0$ and $S_3(1)=1$, whereas the function $C_3(a)$ is decreasing with $C_3(0)>0$ and $C_3(1)=0$. Let $a_3$ be defined as the solution to the equation $S_3(a) = C_3(a)$; then $a_3$ is the optimal stopping point for the first player A in the three-player game. It will be shown that
$$S_3(a) = a^4 \quad\text{and}\quad C_3(a) = \frac{1}{5}(1-a^5) \quad\text{for } a \ge a_2,$$
where $a_2$ $(=0.53209)$ is the optimal stopping point for the first player in the two-player game. Taking for granted that $a_3 \ge a_2$, it follows that the optimal stopping point $a_3$ is the solution to the equation $a^4 = \frac{1}{5}(1-a^5)$ on the interval $(a_2,1)$. This solution is given by $a_3 = 0.64865$.
The calculation of the overall winning probability of player A is less simple and requires $S_3(a)$ for all $0<a<1$. To derive $S_3(a)$ for all $0<a<1$, we first observe that in the three-player game the optimal strategy of the second player B is to stop after the first draw if and only if the score of this draw exceeds both the final score of player A and $a_2 = 0.53209$. Thus, given that player A's final score is $a$ with $a>a_2$, the probability of player B getting a score below $a$ in the first draw and next losing from player A in the second draw is equal to $\int_0^a\big[(a-x)+\big(1-(1-x)\big)\big]dx = a^2$. Hence
$$P(\text{A will beat B} \mid \text{A's final score is } a) = a^2 \quad\text{for } a>a_2.$$
To obtain $P(\text{A will beat B} \mid \text{A's final score is } a)$ for $0<a<a_2$, we have to add the probability that B's score is between $a$ and $a_2$ in the first draw and exceeds 1 after the second draw. This probability is given by $\int_a^{a_2}x\,dx = \frac{1}{2}a_2^2-\frac{1}{2}a^2$. Thus, for $0<a<a_2$,
$$P(\text{A will beat B} \mid \text{A's final score is } a) = a^2+\frac{1}{2}a_2^2-\frac{1}{2}a^2.$$
Obviously, the conditional probability that player A will beat player C given that player A's final score is $a$ and player A has already beaten player B is equal to $a^2$ for all $0<a<1$. This gives
$$S_3(a) = \begin{cases} a^2 \times a^2 & \text{for } a>a_2,\\ \big(a^2+\frac{1}{2}a_2^2-\frac{1}{2}a^2\big)\times a^2 & \text{for } 0<a<a_2.\end{cases}$$
Next we evaluate $C_3(a)$. By $C_3(a) = \int_0^{1-a}S_3(a+v)\,dv$ for $0<a<1$, we obtain
$$C_3(a) = \begin{cases} \frac{1}{5}(1-a^5) & \text{for } a \ge a_2,\\ \frac{1}{10}(a_2^5-a^5)+\frac{1}{6}(a_2^5-a_2^2a^3)+\frac{1}{5}(1-a_2^5) & \text{for } 0<a<a_2.\end{cases}$$
This result completes the derivation of the critical level $a_3$, but also enables us to calculate $P_3(A)$, which is defined as the probability of player A winning under optimal play of each of the players. By the law of conditional probability,
$$P_3(A) = \int_0^{a_3}C_3(a)\,da+\int_{a_3}^1 S_3(a)\,da = 0.3052.$$
Let $P_3(B)$ be the probability of player B being the overall winner when all players act optimally. To calculate $P_3(B)$, it is convenient to define $F(a)$ as the probability that the final score of player A will be no more than $a$ for $0 \le a \le 1$. Then, by conditioning on the result of the first draw of player A,
$$F(0) = \int_0^{a_3}\big(1-(1-x)\big)dx = \frac{1}{2}a_3^2, \qquad F(a) = F(0)+\int_0^a dx\int_0^{a-x}dy = F(0)+\frac{1}{2}a^2 \ \text{ for } 0<a<a_3.$$
For $a>a_3$,
$$F(a) = F(0)+\int_{a_3}^a dx+\int_0^{a_3}dx\int_0^{a-x}dy = F(0)-\frac{1}{2}a_3^2-a_3+(1+a_3)a.$$
The cumulative distribution function $F(a)$ has the mass $\frac{1}{2}a_3^2$ at $a=0$, the density $a$ for $0<a<a_3$, and the density $1+a_3$ for $a_3<a<1$. Next we calculate $P_3(B)$ by conditioning on the final score of player A. Using the fact that $P_2(A) = 0.453802$ is the overall winning probability of the first player in the two-player game and noting that in the two-player game the first player wins with probability $v^2$ when the final score of the first player is $v$, it follows that $P_3(B)$ is given by
$$\frac{1}{2}a_3^2P_2(A)+(1+a_3)\int_{a_3}^1 dx\Big[\int_0^x dv\int_{x-v}^{1-v}(v+w)^2\,dw+\int_x^1 v^2\,dv\Big]+\int_{a_2}^{a_3}x\,dx\Big[\int_0^x dv\int_{x-v}^{1-v}(v+w)^2\,dw+\int_x^1 v^2\,dv\Big]$$
$$+\int_0^{a_2}x\,dx\Big[\int_0^x dv\int_{x-v}^{1-v}(v+w)^2\,dw+\int_x^{a_2}dv\int_0^{1-v}(v+w)^2\,dw+\int_{a_2}^1 v^2\,dv\Big].$$
After some algebra, this leads to
$$P_3(B) = \frac{1}{2}a_3^2P_2(A)+(1+a_3)\Big[\frac{1}{3}-\frac{1}{3}a_3-\frac{1}{12}+\frac{1}{12}a_3^4\Big]+(1+a_3)\Big[\frac{1}{6}-\frac{1}{6}a_3^2-\frac{1}{15}+\frac{1}{15}a_3^5\Big]$$
$$+\frac{1}{6}(a_3^2-a_2^2)-\frac{1}{15}a_3^5+\frac{1}{15}a_2^5+\frac{1}{9}a_3^3-\frac{1}{18}a_3^6+\frac{1}{6}a_2^3-\frac{1}{9}a_2^3-\frac{1}{24}a_2^6+\frac{1}{72}a_2^6+\frac{1}{3}(1-a_2^3)\times\frac{1}{2}a_2^2.$$
This gives $P_3(B) = 0.3295$. Finally, the probability of player C being the overall winner is $1-P_3(A)-P_3(B) = 0.3653$. By simulation, we found that the final score of the winning player in the three-player game has the expected value 0.836 and the standard deviation 0.149 (in the two-player game the final score of the winning player has the expected value 0.753 and the standard deviation 0.209).
Note: For the $s$-player game the optimal strategy for the players is easy to characterize: the first player A stops after the first draw if and only if this draw gives a score that exceeds $a_s$; the second player stops after the first draw if and only if this draw gives a score that exceeds both $a_{s-1}$ and the final score of the first player; generally, the $i$th player stops after the first draw only if this draw gives a score that exceeds both $a_{s-i+1}$ and the largest of the final scores obtained so far. For any $s \ge 2$, the critical level $a_s$ is the solution of
$$a^{2(s-1)} = \frac{1}{2s-1}\big(1-a^{2s-1}\big)$$
on the interval $(a_{s-1},1)$, where $a_1 = 0$. For the general $s$-player game, the calculation of the overall win probability of each of the players is rather cumbersome. We have used computer simulation to obtain the overall win probabilities for the cases of $s=4$ and $s=5$. The overall win probabilities of the players are 0.231, 0.242, 0.255, and 0.271 when $s=4$ and are 0.186, 0.192, 0.199, 0.207, and 0.215 when $s=5$. The optimal stopping point $a_s$ has the values 0.71145 and 0.75225 for $s=4$ and $s=5$.

7.36 The conditional expected value of the number of consolation prizes given that no main prize has been won is given by
$$E(Y \mid X=0) = \sum_{y=0}^3 yP(Y=y \mid X=0) = \sum_{y=0}^3 y\,\frac{\binom{15}{y}\binom{30}{3-y}}{\binom{45}{3}} = 1.$$

7.37 Since $f_Y(y \mid s) = \frac{3(s+1)^3}{(s+y)^4}$ for $y>1$, we have
$$E(Y \mid X=s) = \int_1^\infty y\,\frac{3(s+1)^3}{(s+y)^4}\,dy = 1+(s+1)^3\int_1^\infty(s+y)^{-3}\,dy = 1+\frac{1}{2}(s+1).$$

7.38 The joint probability mass function of $X$ and $Y$ is given by
$$P(X=x,\,Y=y) = \frac{y-x-1}{\binom{100}{3}} \quad\text{for } 1 \le x \le 98,\ x+2 \le y \le 100.$$
The marginal distributions of $X$ and $Y$ are given by
$$P(X=x) = \frac{(100-x)(99-x)/2}{\binom{100}{3}}, \qquad P(Y=y) = \frac{(y-1)(y-2)/2}{\binom{100}{3}}$$
for $x=1,2,\ldots,98$ and $y=3,\ldots,100$. Next it follows that
$$P(X=x \mid Y=y) = \frac{2(y-x-1)}{(y-1)(y-2)}, \qquad P(Y=y \mid X=x) = \frac{2(y-x-1)}{(100-x)(99-x)}.$$
Hence
$$E(X \mid Y=y) = \frac{2}{(y-1)(y-2)}\sum_{x=1}^{y-2}x(y-x-1) = \frac{1}{3}y, \qquad E(Y \mid X=x) = \frac{2}{(100-x)(99-x)}\sum_{y=x+2}^{100}y(y-x-1) = \frac{1}{3}(x+202).$$

7.39 The joint density of $X$ and $Y$ is $f(x,y) = 6(y-x)$ for $0<x<y<1$, as follows from $P(x<X \le x+\Delta x,\ y<Y \le y+\Delta y) = 6\Delta x(y-x)\Delta y$ for $\Delta x,\Delta y$ small; see also Example 5.3. This gives $f_X(x) = 3(1-x)^2$ for $0<x<1$ and $f_Y(y) = 3y^2$ for $0<y<1$. Thus
$$f_X(x \mid y) = \frac{6(y-x)}{3y^2} \ \text{ for } 0<x<y, \qquad f_Y(y \mid x) = \frac{6(y-x)}{3(1-x)^2} \ \text{ for } x<y<1.$$
This gives
$$E(X \mid Y=y) = \int_0^y x\,\frac{6(y-x)}{3y^2}\,dx = \frac{1}{3}y, \qquad E(Y \mid X=x) = \int_x^1 y\,\frac{6(y-x)}{3(1-x)^2}\,dy = \frac{2+x}{3}.$$

7.40 For ease, consider the case that $X$ and $Y$ are continuously distributed. If $X$ and $Y$ are independent, then their joint density function $f(x,y)$ satisfies $f(x,y) = f_X(x)f_Y(y)$. Then, by $f_X(x \mid y) = f(x,y)/f_Y(y)$, it follows that $f_X(x \mid y) = f_X(x)$ and so
$$E(X \mid Y=y) = \int xf_X(x \mid y)\,dx = \int xf_X(x)\,dx = E(X).$$
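The critical levels quoted in Problem 7.35 ($a_2 = 0.53209$, $a_3 = 0.64865$, and the note's values for $s = 4$ and $s = 5$) all solve the note's equation $a^{2(s-1)} = (1-a^{2s-1})/(2s-1)$ and can be recovered with a simple bisection; a minimal sketch:

```python
def bisect(f, lo, hi, tol=1e-12):
    # plain bisection; assumes f changes sign exactly once on (lo, hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# a_s solves a^(2(s-1)) = (1 - a^(2s-1)) / (2s - 1) on (a_{s-1}, 1)
a = [0.0]                                              # a_1 = 0
for s in range(2, 6):
    g = lambda x, s=s: x**(2 * (s - 1)) - (1 - x**(2 * s - 1)) / (2 * s - 1)
    a.append(bisect(g, a[-1] + 1e-9, 1 - 1e-9))
print([round(v, 5) for v in a[1:]])  # → [0.53209, 0.64865, 0.71145, 0.75225]
```

The bracket $(a_{s-1}, 1)$ works because the left-hand side minus the right-hand side is negative at $a_{s-1}$ and positive at 1.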
7.41 Noting that $X$ can be written as $X = \frac{1}{2}(X+Y)+\frac{1}{2}(X-Y)$, it follows that
$$E(X \mid X+Y=v) = \frac{1}{2}v+\frac{1}{2}E(X-Y \mid X+Y=v).$$
By Problem 6.9, $X+Y$ and $X-Y$ are independent and so $E(X-Y \mid X+Y=v) = E(X-Y)$. Also, $E(X-Y) = \mu_1-\mu_2$. Thus
$$E(X \mid X+Y=v) = \frac{1}{2}v+\frac{1}{2}(\mu_1-\mu_2).$$
Note: The conditional distribution of $X$ given that $X+Y=v$ is the normal distribution with mean $\frac{1}{2}(\mu_1-\mu_2+v)$ and variance $\frac{1}{2}\sigma^2(1-\rho)$. This result follows from the relation $P(X \le x \mid X+Y=v) = P\big(\frac{1}{2}(X-Y)+\frac{1}{2}v \le x\big)$ and the fact that $X-Y$ is $N\big(\mu_1-\mu_2,\ 2\sigma^2(1-\rho)\big)$ distributed.

7.42 The marginal density of $X$ is $f_X(x) = \int_x^1 dy = 1-x$ for $0<x<1$ and $f_X(x) = \int_{-x}^1 dy = 1+x$ for $-1<x<0$. The marginal density of $Y$ is $f_Y(y) = \int_{-y}^y dx = 2y$ for $0<y<1$. Therefore, for any $0<y<1$, $f_X(x \mid y) = \frac{1}{2y}$ for $-y<x<y$ and $f_X(x \mid y) = 0$ otherwise. Thus
$$E(X \mid Y=y) = \int_{-y}^y x\,\frac{1}{2y}\,dx = 0.$$
For any $0<x<1$, we have $f_Y(y \mid x) = \frac{1}{1-x}$ for $x<y<1$ and $f_Y(y \mid x) = 0$ otherwise. For any $-1<x<0$, we have $f_Y(y \mid x) = \frac{1}{1+x}$ for $-x<y<1$ and $f_Y(y \mid x) = 0$ otherwise. Thus
$$E(Y \mid X=x) = \int_x^1 y\,\frac{1}{1-x}\,dy = \frac{1}{2}(1+x) \ \text{ for } 0<x<1, \qquad E(Y \mid X=x) = \int_{-x}^1 y\,\frac{1}{1+x}\,dy = \frac{1}{2}(1-x) \ \text{ for } -1<x<0.$$

7.43 Let $X$ be the number of trials until the first success in a sequence of Bernoulli trials and $N$ be the number of successes in the first $n$ trials. Then
$$P(X=j,\,N=r) = (1-p)^{j-1}p\binom{n-j}{r-1}p^{r-1}(1-p)^{n-j-(r-1)} \quad\text{for } 1 \le r \le n,\ 1 \le j \le n-r+1.$$
Since $P(N=r) = \binom{n}{r}p^r(1-p)^{n-r}$, we get
$$P(X=j \mid N=r) = \frac{\binom{n-j}{r-1}}{\binom{n}{r}}.$$
Thus, by $E(X \mid N=r) = \sum_{j=1}^{n-r+1}jP(X=j \mid N=r)$, we find
$$E(X \mid N=r) = \frac{n+1}{r+1} \quad\text{for } 1 \le r \le n.$$

7.44 Suppose that $r$ dice are rolled. Define the random variable $X$ as the total number of points gained. Let the random variable $I=1$ if none of the $r$ dice shows a 1 and $I=0$ otherwise. Then
$$E(X) = E(X \mid I=0)P(I=0)+E(X \mid I=1)P(I=1) = E(X \mid I=1)\Big(\frac{5}{6}\Big)^r.$$
Under the condition that $I=1$ the random variable $X$ is distributed as the sum of $r$ independent random variables $X_k$, each having the discrete uniform distribution on $2,\ldots,6$. Each of the $X_k$ has expected value 4. Thus
$$E(X) = 4r\Big(\frac{5}{6}\Big)^r.$$
The function $4r\big(\frac{5}{6}\big)^r$ is maximal for both $r=5$ and $r=6$. The maximal value is 8.0376. To find $\sigma(X)$, use $E(X^2) = E(X^2 \mid I=1)P(I=1)$ together with
$$E(X^2 \mid I=1) = E\big[(X_1+\cdots+X_r)^2\big] = rE(X_1^2)+r(r-1)\big[E(X_1)\big]^2.$$
We have $E(X_1) = 4$ and $E(X_1^2) = \sum_{k=2}^6\frac{k^2}{5} = 18$. This leads to $E(X) = 8.0376$ and $\sigma(X) = 10.008$ for $r=5$, and $E(X) = 8.0376$ and $\sigma(X) = 11.503$ for $r=6$.

7.45 Denote by the random variables $X$ and $Y$ the zinc content and the iron content. The marginal density of $Y$ is
$$f_Y(y) = \int_2^3\frac{1}{75}(5x+y-30)\,dx = \frac{1}{75}(y-17.5) \quad\text{for } 20<y<30,$$
and so, for any $2<x<3$, we have $f_X(x \mid y) = \frac{5x+y-30}{y-17.5}$. Thus
$$E(X \mid Y=y) = \int_2^3 xf_X(x \mid y)\,dx = \frac{15y-260}{6(y-17.5)} \quad\text{for } 20<y<30.$$

7.46 The insurance payout is a mixed random variable: it takes on one of the discrete values 0 and $2\times10^6$ or a value in the continuous interval $(0,\,2\times10^6)$. To calculate its expected value we condition on the outcome of the random variable $I$, where $I=0$ if no claim is made and $I=1$ otherwise. The insurance payout is 0 if $I$ takes on the value 0, and otherwise the insurance payout is distributed as $\min(2\times10^6,D)$, where the random variable $D$ has an exponential distribution with parameter $\lambda = 1/10^6$. Thus, by conditioning,
$$E(\text{insurance payout}) = 0.9\times0+0.1\times E\big[\min(2\times10^6,D)\big].$$
Using the substitution rule, it follows that
$$E\big[\min(2\times10^6,D)\big] = \int_0^\infty\min(2\times10^6,x)\lambda e^{-\lambda x}\,dx = \int_0^{2\times10^6}x\lambda e^{-\lambda x}\,dx+\int_{2\times10^6}^\infty(2\times10^6)\lambda e^{-\lambda x}\,dx.$$
This leads after some calculations to $E[\min(2\times10^6,D)] = 10^6(1-e^{-2}) = 864{,}665$ dollars. Hence, we can conclude that $E(\text{insurance payout}) = \$86{,}466.50$.
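The tie between $r=5$ and $r=6$ in Problem 7.44 is exact, which is easy to see with exact rational arithmetic; a small sketch:

```python
from fractions import Fraction

def mean_score(r):
    # E(X) = 4 r (5/6)^r from Problem 7.44
    return 4 * r * Fraction(5, 6) ** r

assert mean_score(5) == mean_score(6)          # r = 5 and r = 6 tie exactly
assert all(mean_score(r) < mean_score(5)
           for r in range(1, 21) if r not in (5, 6))
print(float(mean_score(5)))  # ≈ 8.0376
```

The tie also follows from the ratio $\frac{E_{r+1}}{E_r} = \frac{r+1}{r}\cdot\frac{5}{6}$, which equals 1 precisely at $r=5$.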
7.47 (a) We have
$$P(X \le x \mid a<Y<b) = \frac{1}{P(a<Y<b)}\int_{-\infty}^x dv\int_a^b f(v,w)\,dw.$$
Differentiation yields that $\int_a^b f(x,w)\,dw/P(a<Y<b)$ is the conditional probability density of $X$ given that $a<Y<b$. In the same way, we get that
$$\frac{\int_{-\infty}^x f(x,w)\,dw}{P(X>Y)}$$
is the conditional density of $X$ given that $X>Y$.
(b) For $(X,Y)$ having a standard bivariate normal distribution with correlation coefficient $\rho$, the formula for $E(X \mid a<Y<b)$ is obvious from (a). To get $E(X \mid X>Y)$, note that $X = \frac{1}{2}(X+Y)+\frac{1}{2}(X-Y)$. Therefore
$$E(X \mid X>Y) = \frac{1}{2}E(X+Y \mid X-Y>0)+\frac{1}{2}E(X-Y \mid X-Y>0).$$
By the independence of $X+Y$ and $X-Y$ (see Problem 6.9), it follows that $E(X+Y \mid X-Y>0) = E(X+Y) = 0$. Since $X-Y$ is $N(0,\sigma^2)$ distributed with $\sigma^2 = 2(1-\rho)$, we have
$$E(X-Y \mid X-Y>0) = \frac{1}{P(X-Y>0)}\,\frac{1}{\sigma\sqrt{2\pi}}\int_0^\infty ve^{-\frac{1}{2}v^2/\sigma^2}\,dv = \sigma\sqrt{\frac{2}{\pi}},$$
which yields $E(X \mid X>Y) = \sqrt{(1-\rho)/\pi}$.

7.48 For any $0 \le x \le 1$,
$$P(U_1 \le x \mid U_1>U_2) = \frac{P(U_1 \le x,\ U_1>U_2)}{P(U_1>U_2)} = \frac{\int_0^x du_1\int_0^{u_1}du_2}{1/2} = x^2,$$
$$P(U_2 \le x \mid U_1>U_2) = \frac{P(U_2 \le x,\ U_1>U_2)}{P(U_1>U_2)} = \frac{\int_0^x du_2\int_{u_2}^1 du_1}{1/2} = 2\Big(x-\frac{1}{2}x^2\Big).$$
Thus the conditional densities of $U_1$ and $U_2$ given that $U_1>U_2$ are $2x$ and $2(1-x)$ for $0<x<1$ and zero otherwise. This gives
$$E(U_1 \mid U_1>U_2) = \int_0^1 x\,2x\,dx = \frac{2}{3}, \qquad E(U_2 \mid U_1>U_2) = \int_0^1 x\,2(1-x)\,dx = \frac{1}{3}.$$

7.49 By the law of conditional probability, the probability of running out of oil is given by $\frac{2}{3}P(X_1>Q)+\frac{1}{3}P(X_2>Q)$, where $X_i$ is $N(\mu_i,\sigma_i^2)$ distributed. The stockout probability can be evaluated as
$$\frac{2}{3}\Big[1-\Phi\Big(\frac{Q-\mu_1}{\sigma_1}\Big)\Big]+\frac{1}{3}\Big[1-\Phi\Big(\frac{Q-\mu_2}{\sigma_2}\Big)\Big].$$
By the law of conditional expectation, the expected value of the shortage is
$$\frac{2}{3}E\big[(X_1-Q)^+\big]+\frac{1}{3}E\big[(X_2-Q)^+\big],$$
where $x^+ = \max(x,0)$. The expected value of the shortage can be evaluated as
$$\frac{2}{3}\sigma_1 I\Big(\frac{Q-\mu_1}{\sigma_1}\Big)+\frac{1}{3}\sigma_2 I\Big(\frac{Q-\mu_2}{\sigma_2}\Big),$$
where $I(k)$ is the so-called normal loss integral
$$I(k) = \frac{1}{\sqrt{2\pi}}\int_k^\infty(x-k)e^{-\frac{1}{2}x^2}\,dx.$$
The normal loss integral can be evaluated as
$$I(k) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}k^2}-k\big[1-\Phi(k)\big].$$
The expected value of the number of gallons left over equals the expected value of the shortage minus $\frac{2}{3}\mu_1+\frac{1}{3}\mu_2-Q$.

7.50 Denote by the random variable $R$ the number of people who wish to make a reservation. The random variable $R$ is Poisson distributed with expected value $\lambda = 170$. Let the random variable $S$ be the number of people who show up for a given flight and the random variable $D$ be the number of people who show up for a given flight but cannot be seated. By the law of conditional expectation,
$$E(S) = \sum_{r=0}^\infty E(S \mid R=r)e^{-\lambda}\frac{\lambda^r}{r!}, \qquad E(D) = \sum_{r=0}^\infty E(D \mid R=r)e^{-\lambda}\frac{\lambda^r}{r!}.$$
We have
$$E(S \mid R=r) = \sum_{k=0}^{\min(r,Q)}k\binom{\min(r,Q)}{k}(1-q)^k q^{\min(r,Q)-k}, \qquad E(D \mid R=r) = \sum_{k=N+1}^{\min(r,Q)}(k-N)\binom{\min(r,Q)}{k}(1-q)^k q^{\min(r,Q)-k}.$$
For the numerical data $Q=165$, $N=150$ and $q=0.07$, we get $E(S) = 150.61$ and $E(D) = 2.71$.

7.51 Let the geometrically distributed random variable $Y$ be the number of messages waiting in the buffer. Under the condition that $Y=y$ the random variable $X$ is uniformly distributed on $0,1,\ldots,y-1$. Therefore $E(X \mid Y=y) = \frac{1}{2}(y-1)$ and $E(X^2 \mid Y=y) = \frac{1}{6}(2y^2-3y+1)$; see the answer to Problem 3.47. By the law of conditional expectation,
$$E(X^k) = \sum_{y=1}^\infty E(X^k \mid Y=y)\,p(1-p)^{y-1} \quad\text{for } k=1,2.$$
Using the relations $\sum_{k=1}^\infty ka^{k-1} = \frac{1}{(1-a)^2}$ and $\sum_{k=1}^\infty k^2a^{k-1} = \frac{1+a}{(1-a)^3}$ for $0<a<1$, we find after some algebra that
$$E(X) = \frac{1-p}{2p} \quad\text{and}\quad E(X^2) = \frac{p^2-5p+4}{6p^2}.$$
This gives $\sigma(X) = \frac{1}{2p\sqrt{3}}\sqrt{(1-p)(p+5)}$.

7.52 Let the random variable $X$ be the number of newly arriving messages during the transmission time $T$ of a message. The conditional distribution of $X$ given that $T=n$ is the binomial distribution with parameters $n$ and $p$. Thus, by the law of conditional expectation,
$$E(X) = \sum_{n=1}^\infty E(X \mid T=n)P(T=n) = \sum_{n=1}^\infty npa(1-a)^{n-1} = \frac{pa}{a^2} = \frac{p}{a},$$
$$E(X^2) = \sum_{n=1}^\infty E(X^2 \mid T=n)P(T=n) = \sum_{n=1}^\infty\big[np(1-p)+n^2p^2\big]a(1-a)^{n-1} = \frac{p(1-p)}{a}+\frac{p^2(2-a)}{a^2}.$$
The standard deviation of $X$ is $\sigma(X) = \frac{1}{a}\sqrt{ap(1-p)+p^2(1-a)}$.
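The closed form of the normal loss integral in Problem 7.49 can be verified against a direct numerical evaluation of its defining integral; a sketch using `math.erf` for $\Phi$:

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    # standard normal cdf via the error function
    return 0.5 * (1 + erf(x / sqrt(2)))

def loss(k):
    # I(k) = (1/sqrt(2 pi)) exp(-k^2/2) - k (1 - Phi(k))
    return exp(-0.5 * k * k) / sqrt(2 * pi) - k * (1 - Phi(k))

# midpoint-rule evaluation of I(k) = int_k^inf (x - k) phi(x) dx
k, dx = 0.7, 1e-4
num = dx * sum((x - k) * exp(-0.5 * x * x) / sqrt(2 * pi)
               for x in (k + (i + 0.5) * dx for i in range(80000)))
print(round(loss(k), 6), round(num, 6))  # both ≈ 0.142879
```

Truncating the integral at $k+8$ loses only a negligible tail contribution.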
7.53 For fixed $n$, let $u_k(i) = E[X_k(i)]$. The goal is to find $u_n(0)$. Apply the recursion
$$u_k(i) = \frac{1}{2}u_{k-1}(i+1)+\frac{1}{2}u_{k-1}(i)$$
for $i$ satisfying $\frac{i}{n-k} \le \frac{1}{2}$. The boundary conditions are
$$u_0(i) = \frac{i}{n} \quad\text{and}\quad u_k(i) = \frac{i}{n-k} \ \text{ for } i>\frac{1}{2}(n-k),\ 1 \le k \le n.$$
The sought probability $u_n(0)$ has the values 0.7083, 0.7437, 0.7675, and 0.7761 for $n=5$, 10, 25, and 50. Note: $u_n(0)$ tends to $\frac{\pi}{4}$ as $n$ increases without bound; see also Example 8.4.

7.54 If you arrive at time point $x$ at the bus stop, then your waiting time until the next bus arrival is distributed as $W(x) = \min(15-x,\,T)$, where the random variable $T$ is the time from your arrival epoch until the next bus number 3 arrives. By the memoryless property of the exponential distribution, the random variable $T$ has the exponential density $\lambda e^{-\lambda t}$ with $\lambda = \frac{1}{15}$. By conditioning on the random variable $T$, the expected value of $W(x)$ is calculated as
$$E\big[W(x)\big] = \int_0^{15-x}t\lambda e^{-\lambda t}\,dt+\int_{15-x}^\infty(15-x)\lambda e^{-\lambda t}\,dt,$$
which leads after some algebra to $E[W(x)] = \frac{1}{\lambda}\big(1-e^{-\lambda(15-x)}\big)$. Your arrival time $X$ at the bus stop is uniformly distributed over $(0,15)$ and thus has density $f(x) = \frac{1}{15}$ for $0<x<15$. By conditioning on your arrival time $X$ and applying again the law of conditional expectation, we find that the expected value of your waiting time until the next bus arrival is given by
$$\int_0^{15}E\big[W(x)\big]f(x)\,dx = \int_0^{15}\Big(1-e^{-\frac{1}{15}(15-x)}\Big)dx = \frac{15}{e}.$$

7.55 Let $X_a$ be your end score when you continue for a second spin after having obtained a score of $a$ in the first spin. Then, by the law of conditional expectation,
$$E(X_a) = \int_0^{1-a}(a+x)\,dx+\int_{1-a}^1 0\,dx = a(1-a)+\frac{1}{2}(1-a)^2.$$
The solution of $a(1-a)+\frac{1}{2}(1-a)^2 = a$ is $a^* = \sqrt{2}-1$. The optimal strategy is to stop after the first spin if this spin gives a score larger than $\sqrt{2}-1$. Your expected payoff is \$609.48.

7.56 Given that the carnival master tells you that the ball picked from the red beaker has value $r$, let $L(r)$ be your expected payoff when you guess a larger value and let $S(r)$ be your expected payoff when you guess a smaller value. Then
$$L(r) = \frac{1}{10}\sum_{k=r+1}^{10}k+\frac{1}{10}\cdot\frac{r}{2} = \frac{1}{20}(10-r)(r+11)+\frac{r}{20} = \frac{110-r^2}{20},$$
$$S(r) = \frac{1}{10}\sum_{k=1}^{r-1}k+\frac{1}{10}\cdot\frac{r}{2} = \frac{1}{20}(r-1)r+\frac{r}{20} = \frac{r^2}{20}.$$
We have $L(r)>S(r)$ for $1 \le r \le 7$ and $L(r)<S(r)$ for $8 \le r \le 10$, as can be seen by noting that $110-x^2 = x^2$ has $x^* = \sqrt{55} \approx 7.4$ as solution. Thus, given that the carnival master tells you that the ball picked from the red beaker has value $r$, your expected payoff is maximal by guessing a larger value if $r \le 7$ and guessing a smaller value otherwise. Applying the law of conditional expectation, it now follows that your expected payoff is
$$\sum_{r=1}^{7}\frac{110-r^2}{20}\times\frac{1}{10}+\sum_{r=8}^{10}\frac{r^2}{20}\times\frac{1}{10} = 4.375 \ \text{dollars}$$
if you use the decision rule with critical level 7. The game is not fair, but the odds are only slightly in favor of the carnival master if you play optimally. Then the house edge is 2.8% (for critical levels 5 and 6 the house edge has the values 8.3% and 4.1%).

7.57 In each of the two problems, define $v_i$ as the expected reward that can be achieved when your current total is $i$ points. A recursion scheme for the $v_i$ is obtained by applying the law of conditional expectation.
(a) For Problem 3.24, use the recursion
$$v_i = \frac{1}{6}\sum_{k=2}^6 v_{i+k} \quad\text{for } 0 \le i \le 19,$$
where $v_j = j$ for $j \ge 20$. The maximal expected reward is $v_0 = 8.5290$.
(b) For Problem 3.25, use the recursion
$$v_i = \frac{1}{6}\sum_{k=1}^6 v_{i+k} \quad\text{for } 0 \le i \le 5,$$
where $v_j = j$ for $6 \le j \le 10$ and $v_{11} = 0$. The maximal expected reward is $v_0 = 6.9988$.
7.58 Let $p_r$ be the probability of rolling a dice total of r with two different numbers. Then $p_r = \frac{r-1}{36}$ for 2 ≤ r ≤ 7 and $p_r = p_{14-r}$ for 8 ≤ r ≤ 12. To find the expected reward under the stopping rule, apply the recursion
$$v_i = \sum_{r=3}^{11} v_{i+r}\,p_r \quad\text{for } i = 0, 1, \ldots, 34,$$
where $v_s = s$ for s ≥ 35. The expected reward under the stopping rule is $v_0 = 14.215$. Note: The stopping rule is the one-stage-look-ahead rule; see also Problem 3.28.

7.59 Define $E_i$ as the expected value of the remaining duration of the game when the current capital of John is i dollars. Then, by conditioning, $E_i = 1 + pE_{i+1} + qE_{i-1}$ for 1 ≤ i ≤ a + b − 1, where $E_0 = E_{a+b} = 0$. The solution of this standard linear difference equation is
$$E_i = \frac{i}{q-p} - \frac{a+b}{q-p}\cdot\frac{1-(q/p)^i}{1-(q/p)^{a+b}} \ \text{ if } p \ne q, \qquad E_i = i(a+b-i) \ \text{ if } p = q.$$
Substituting $p = \frac{18}{37}$, $q = \frac{19}{37}$, a = 2 and b = 8 into the formula for the expected duration of the game of gambler's ruin, we get that the expected value of the number of bets is 15.083. Note: The expected value of the number of dollars you will stake in the game is 25 × 15.083 = 377.07, and the expected value of the number of dollars you will lose is (1 − 0.1592) × 50 − 0.1592 × 200 = 10.2 (the probability that you will reach a bankroll of $250 is 0.1592). The ratio of 10.2 and 377.07 is 0.027, in agreement with the fact that in the long run you will lose on average 2.7 cents on every dollar you bet in European roulette, regardless of what roulette system you play.

7.60 Let $\mu_n$ be the expected number of clumps of cars when there are n cars on the road. Then, by conditioning on the position of the slowest car, we get the recursion
$$\mu_n = \sum_{i=1}^{n}(1+\mu_{n-i})\frac{1}{n} \quad\text{for } n = 1, 2, \ldots,$$
where $\mu_0 = 0$. This gives that the expected number of clumps of cars is $\sum_{k=1}^{c}\frac{1}{k}$.

7.61 For fixed n, let F(i, k) be the maximal expected payoff that can be achieved when still k tosses can be done and heads turned up i times so far.
The recursion is
$$F(i,k) = \max\Big[\frac12 F(i+1,k-1) + \frac12 F(i,k-1),\ \frac{i}{n-k}\Big] \quad\text{for } k = 1, \ldots, n,$$
with $F(i,0) = \frac{i}{n}$. The maximal expected payoff F(0, n) has the values 0.7679, 0.7780, 0.7834, and 0.7912 for n = 25, 50, 100, and 1,000.

7.62 Define the value function $f_k(i)$ as the expected value of the maximal score you can still reach when k rolls are still possible and the last roll of the two dice gave a score of i points. You want to find $f_6(0)$ and the optimal strategy. Let $a_j$ denote the probability of getting a score of j in a single roll of two dice. The $a_j$ are given by $a_j = \frac{j-1}{36}$ for 2 ≤ j ≤ 7 and $a_j = a_{14-j}$ for 8 ≤ j ≤ 12. The recursion
$$f_k(i) = \max\Big[i,\ \sum_{j=2}^{12} f_{k-1}(j)\,a_j\Big] \quad\text{for } i = 0, 1, \ldots, 12$$
applies for k = 1, ..., 6, with the boundary condition $f_0(i) = i$ for all i. The recursion leads to $f_6(0) = 9.474$. The numerical calculations reveal the optimal strategy as well: if still k rolls are possible, you stop if the last roll gave $s_k$ or more points and otherwise you continue, where $s_1 = s_2 = 8$, $s_3 = s_4 = 9$, and $s_5 = 10$.

7.63 Let state (l, r, 1) (respectively (l, r, 0)) mean that r numbers have been taken out of the hat, l is the largest number seen so far, and l was obtained (respectively not obtained) at the last pick. For k = 0, 1, define $F_r(l,k)$ as the maximal probability of obtaining the largest number starting from state (l, r, k) when r numbers have been taken out of the hat. The maximal success probability is $\sum_{l=1}^{N} F_1(l,1)\frac{1}{N}$. The optimality equations are
$$F_r(l,0) = \frac{l-r}{N-r}F_{r+1}(l,0) + \sum_{j=l+1}^{N}\frac{1}{N-r}F_{r+1}(j,1),$$
$$F_r(l,1) = \max\Big[\frac{\binom{l-r}{n-r}}{\binom{N-r}{n-r}},\ \frac{l-r}{N-r}F_{r+1}(l,0) + \sum_{j=l+1}^{N}\frac{1}{N-r}F_{r+1}(j,1)\Big]$$
for l = r, ..., N, where $\binom{l-r}{n-r} = 0$ for l < n, and the boundary conditions are $F_n(l,0) = 0$ and $F_n(l,1) = 1$ for l = n, ..., N. For n = 10 and N = 100, the maximal success probability is 0.6219 and the optimal stopping rule is characterized by $l_1 = 93$, $l_2 = 92$, $l_3 = 91$, $l_4 = 89$, $l_5 = 87$, $l_6 = 84$, $l_7 = 80$, $l_8 = 72$, and $l_9 = 55$.
This rule prescribes to stop in state (l, r, 1) if l ≥ lr and to continue otherwise. Note: For the case of n = 10 and N = 100, we verified experimentally that lr is the smallest value of l such that Qs (l, r) ≥ Qc (l, r), where Qs (l, r) = l−r 10−r 100−r 10−r and Qc (l, r) = 10−r X k=1 1 k 100−l l−r k 10−r−k 100−r 10−r We have that Qs (l, r) is the probability of having obtained the overall largest number when stopping in state (l, r, 1), and Qc (l, r) is the probability of getting the overall largest number when continuing in state (l, r, 1) and stopping as soon as you pick a number larger than l. 7.64 For k = 0, 1, let state (l, r, k) and the value-function Fr (l, k) be defined in the same way as in Problem 7.63. Then N Fr (l, 0) = Fr+1 (l, 0) l−1 X 1 Fr+1 (j, 1) + N N j=l N h l n−r 1i l−1 X Fr+1 (j, 1) , Fr+1 (l, 0) + Fr (l, 1) = max N N N j=l for l = r, . . . , N , where the boundary conditions are Fn (l, 0) = 0 and F (l, 1) = 1 for l = n, . . . , N . The maximal success probability is PnN 1 l=1 F1 (l, 1) N . 7.65 Define the value function v(i0 , i1 ) as the maximal expected net winnings you can still achieve starting from state (i0 , i1 ), where state (i0 , i1 ) means that there are i0 empty bins and i1 bins with exactly one ball. The desired expected value v(b, 0) can be obtained from the optimality equation h 1 i0 i1 v(i0 , i1 ) = max i1 − (b − i0 − i1 ), v(i0 − 1, i1 + 1) + v(i0 , i1 − 1) 2 b b i b − i0 − i1 v(i0 , i1 ) + b with the boundary condition v(0, i1 ) = i1 − 21 (b − i1 ). This equation can be solved by backwards calculations. First calculate v(1, i1 ) for 179 i1 = 0, . . . , b−1. Next calculate v(2, i1 ) for i1 = 0, . . . , b−2. Continuing in this way, the desired v(b, 0) is obtained. Numerical investigations lead to the conjecture that the optimal stopping rule has the following simple form: you stop only in the states (i0 , i1 ) with i1 ≤ a, where a is the smallest integer larger than or equal to 2i0 /3. 
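The backwards calculation for 7.65 can be sketched as follows. The reward structure assumed here ($1 per bin with exactly one ball, minus $0.50 per bin containing two or more balls) is read off from the stopping value $i_1 - \frac12(b-i_0-i_1)$ above, and the self-referencing term $\frac{b-i_0-i_1}{b}v(i_0,i_1)$ is eliminated by solving the optimality equation for $v(i_0,i_1)$:

```python
# Backward calculation of v(i0, i1) for the stopping problem of 7.65, b = 25.
# State (i0, i1): i0 empty bins, i1 bins with exactly one ball.
b = 25
v = {}
for i1 in range(b + 1):                      # boundary: no empty bins left
    v[(0, i1)] = i1 - 0.5 * (b - i1)
for i0 in range(1, b + 1):
    for i1 in range(b - i0 + 1):
        stop = i1 - 0.5 * (b - i0 - i1)
        # continuation value: the term ((b-i0-i1)/b) * v(i0, i1) is moved to
        # the left-hand side, which leaves a division by (i0 + i1)/b
        cont = (i0 * v[(i0 - 1, i1 + 1)]
                + i1 * v.get((i0, i1 - 1), 0.0)) / (i0 + i1)
        v[(i0, i1)] = max(stop, cont)
print(round(v[(b, 0)], 3))
```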
For b = 25, we find that the maximal expected net winnings is $7.566. The one-stage-look-ahead rule prescribes to stop in the states $(i_0,i_1)$ with $i_0 \le 1.5\,i_1$ and to continue otherwise. This stopping rule has an expected net winnings of $7.509. Note: The standard deviation of the net winnings is $2.566 for the optimal stopping rule and $2.229 for the one-stage-look-ahead rule.

7.66 Let state (i, s) correspond to the situation that your accumulated reward is i dollars and the dice total of the last roll is s. Define the value function V(i, s) as the maximal achievable reward starting from state (i, s). The goal is to find $\sum_{s=2}^{12} V(s,s)\,p_s$, where $p_s$ is the probability of getting a dice total of s in a single roll of the two dice. The $p_k$ are given by $p_j = \frac{j-1}{36}$ for 2 ≤ j ≤ 7 and $p_j = p_{14-j}$ for 8 ≤ j ≤ 12. The optimality equation is
$$V(i,s) = \max\Big[i,\ \sum_{k=2,\,k\ne s}^{12} V(i+k,k)\,p_k\Big]$$
with the boundary condition V(j, k) = j for j ≥ M. Using backward calculations, the values V(s, s) and the optimal stopping rule are found. Note: A heuristic rule, which is very close in performance to the optimal stopping rule, is the one-stage-look-ahead rule. The heuristic rule prescribes to stop in the states (i, s) with $i \ge N_s$ and to continue otherwise, where the threshold values $N_s$ are given by $N_2 = 250$, $N_3 = 123$, $N_4 = 80$, $N_5 = 58$, $N_6 = 45$, $N_7 = 42$, $N_8 = 43$, $N_9 = 54$, $N_{10} = 74$, $N_{11} = 115$, and $N_{12} = 240$. The critical level $N_s$ is the smallest integer i such that $\sum_{k=2,\,k\ne s}^{12} k\,p_k - i\,p_s \le 0$ or, equivalently, $7 - (i+s)p_s \le 0$.

7.67 Imagine that the balls are placed into the bins at times generated by a Poisson process with rate 1. Then a Poisson process with rate $\frac1b$ generates the times at which the ith bin receives a ball. Using the independence of the Poissonian subprocesses and conditioning upon the time that the ith bin receives its first ball, it follows that
$$P(A_i) = \int_0^{\infty}\Big(\sum_{k=1}^{m} e^{-t/b}\frac{(t/b)^k}{k!}\Big)^{b-1}\frac1b\,e^{-t/b}\,dt.$$
The sought probability is $\sum_{i=1}^{b} P(A_i)$. As a sanity check, this probability is $b!/b^b$ for m = 1.

7.68 Imagine that the bottles are bought at epochs generated by a Poisson process with rate 1. Let T be the first time at which the required numbers of the letters have been collected and let N be the number of bottles needed. Then E(N) = E(T). The letters A, B, R, and S are obtained at epochs generated by Poisson processes with respective rates 0.15, 0.10, 0.40, and 0.35. Moreover, these Poisson processes are independent of each other. Using the relation $E(T) = \int_0^{\infty}[1 - P(T \le t)]\,dt$, we find that the expected number of bottles needed to form the payoff word is
$$\int_0^{\infty}\Big[1 - \big(1-(1+0.15t+0.15^2t^2/2!)e^{-0.15t}\big)\big(1-(1+0.1t)e^{-0.1t}\big)\big(1-(1+0.4t)e^{-0.4t}\big)\big(1-e^{-0.35t}\big)\Big]dt = 26.9796.$$

7.69 Imagine that rolls of the two dice occur at epochs generated by a Poisson process with rate 1. Let N be the number of rolls needed to remove all tokens and T the first epoch at which all tokens have been removed. Then E(N) = E(T) and $E(T) = \int_0^{\infty}P(T>t)\,dt$. Also, $T = \max_{2\le j\le 12} T_j$, where $T_j$ is the first epoch at which all tokens in section j have been removed. The rolls resulting in a dice total of k occur according to a Poisson process with rate $p_k$, and these Poissonian subprocesses are independent of each other. The $p_k$ are given by $p_k = \frac{k-1}{36}$ for 2 ≤ k ≤ 7 and $p_k = p_{14-k}$ for 8 ≤ k ≤ 12. By the independence of the $T_k$, $P(T\le t) = P(T_2\le t)\cdots P(T_{12}\le t)$. Also,
$$P(T_k > t) = \sum_{j=0}^{a_k-1} e^{-p_k t}\frac{(p_k t)^j}{j!}.$$
Putting the pieces together and using numerical integration, we find E(N) = 31.922.

7.70 Imagine that purchases are made at epochs generated by a Poisson process with rate 1. For any i = 1, ..., n, a Poisson subprocess with rate $\frac1n$ generates the epochs at which a coupon of type i is obtained. The Poisson subprocesses are independent of each other.
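The integral in 7.68 is easy to check numerically; a sketch with the trapezoidal rule (the step size and the cutoff at t = 400 are ad hoc — the integrand decays exponentially, so the truncated tail is negligible):

```python
import math

# Numerical check of the 7.68 integral: E(N) = ∫ [1 - P(T <= t)] dt, where the
# four factors are the probabilities that, by time t, at least 3 A's, 2 B's,
# 2 R's and 1 S have been collected (rates 0.15, 0.10, 0.40, 0.35).
def p_le(t):
    a = 1 - (1 + 0.15*t + (0.15*t)**2 / 2) * math.exp(-0.15*t)
    b = 1 - (1 + 0.10*t) * math.exp(-0.10*t)
    r = 1 - (1 + 0.40*t) * math.exp(-0.40*t)
    s = 1 - math.exp(-0.35*t)
    return a * b * r * s

h, T = 0.005, 400.0
n = int(T / h)
# trapezoidal rule for ∫ (1 - P(T <= t)) dt over [0, T]
total = (0.5*h*((1 - p_le(0)) + (1 - p_le(T)))
         + h*sum(1 - p_le(i*h) for i in range(1, n)))
print(round(total, 4))  # ≈ 26.98
```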
Let T be the first epoch at which two complete sets of coupons are obtained and let N be the number of purchases needed to get two complete sets of coupons. Then E(N) = E(T). Using the relation $E(T) = \int_0^{\infty}[1-P(T\le t)]\,dt$, we find that the expected number of purchases needed to get two complete sets of coupons is equal to
$$\int_0^{\infty}\Big[1 - \Big(1 - e^{-t/n} - \frac{t}{n}e^{-t/n}\Big)^n\Big]dt.$$
This integral has the value 24.134 when n = 6.

7.71 Imagine that the rolls of the die occur at epochs generated by a Poisson process with rate 1. Then the times at which an odd number is rolled are generated by a Poisson process with rate $\frac12$, and the times at which the even number k is rolled are generated by a Poisson process with rate $\frac16$ for k = 2, 4, and 6. These Poisson processes are independent of each other. By conditioning on the first epoch at which an odd number is rolled, we find that the sought probability is
$$\int_0^{\infty}\big(1-e^{-t/6}\big)^3\,\frac12 e^{-t/2}\,dt = 0.05.$$

7.72 Taking the model with replacement and mimicking the arguments used in the solution of Example 7.13, we get that the sought probability is
$$\int_0^{\infty}\big(1-e^{-\frac{10}{35}t}\big)\big(1-e^{-\frac{20}{35}t}\big)\,\frac{5}{35}e^{-\frac{5}{35}t}\,dt = 0.6095.$$

7.73 Imagine that the rolls of the die occur at epochs generated by a Poisson process with rate 1. Then independent Poisson processes, each having rate $\mu = \frac16$, describe the moves of the horses. The density of the sum of $r = 6 - s_1$ independent exponentially distributed interoccurrence times, each having expected value $1/\mu$, is the Erlang density $\mu^r\frac{t^{r-1}}{(r-1)!}e^{-\mu t}$; see Section 4.5.1. Thus the win probability of horse 1 with starting position $s_1 = 0$ is
$$\int_0^{\infty}\Big(\sum_{k=0}^{5} e^{-t/6}\frac{(t/6)^k}{k!}\Big)\Big(\sum_{k=0}^{4} e^{-t/6}\frac{(t/6)^k}{k!}\Big)^2\Big(\sum_{k=0}^{3} e^{-t/6}\frac{(t/6)^k}{k!}\Big)^2\,\frac{1}{6^6}\frac{t^5}{5!}e^{-t/6}\,dt.$$
This gives the win probability 0.06280 for the horses 1 and 6. In the same way, we get the win probability 0.13991 for the horses 2 and
5, and the win probability 0.29729 for the horses 3 and 4. To find the expected duration of the game, let $T_i$ be the time at which horse i would reach the finish when the game would be continued until each horse has finished. The expected duration of the game is E(T) with $T = \min(T_1,\ldots,T_6)$. Noting that $E(T) = \int_0^{\infty}P(T>t)\,dt = \int_0^{\infty}P(T_1>t)\cdots P(T_6>t)\,dt$, it follows that the expected duration of the game is
$$\int_0^{\infty}\Big(\sum_{k=0}^{5} e^{-t/6}\frac{(t/6)^k}{k!}\Big)^6 dt = 19.737$$
when each horse starts at panel 0.

7.74 The analysis is along the same lines as the analysis of Problem 7.73. The probability of player A winning is
$$\int_0^{\infty}\Big(\sum_{k=0}^{3} e^{-3t/9}\frac{(3t/9)^k}{k!}\Big)\Big(\sum_{l=0}^{2} e^{-2t/9}\frac{(2t/9)^l}{l!}\Big)\Big(\frac49\Big)^5\frac{t^4}{4!}e^{-4t/9}\,dt.$$
The other win probabilities follow similarly. This leads to P(A) = 0.3631, P(B) = 0.3364, and P(C) = 0.3005. The expected number of games is given by
$$\int_0^{\infty}\Big(\sum_{j=0}^{4} e^{-4t/9}\frac{(4t/9)^j}{j!}\Big)\Big(\sum_{k=0}^{3} e^{-3t/9}\frac{(3t/9)^k}{k!}\Big)\Big(\sum_{l=0}^{2} e^{-2t/9}\frac{(2t/9)^l}{l!}\Big)dt.$$
This integral can be evaluated as 7.3644.

7.75 Imagine that cards are picked at epochs generated by a Poisson process with rate 1. Let N be the number of picks until each card of some of the suits has been obtained and let T be the epoch at which this occurs. Then E(T) = E(N). Any specific card is picked at epochs generated by a Poisson process with rate $\frac{1}{20}$. These Poisson processes are independent of each other. Let $T_i$ be the time until all cards of the ith suit have been picked. The $T_i$ are independent random variables and $T = \min(T_1,T_2,T_3,T_4)$. Since $P(T_i > t) = 1 - (1-e^{-t/20})^5$ and $P(T>t) = P(T_1>t)\cdots P(T_4>t)$, we get
$$E(T) = \int_0^{\infty}\Big[1 - \big(1-e^{-t/20}\big)^5\Big]^4\,dt = 24.694.$$
Therefore E(N) = 24.694.

7.76 Imagine that a Poisson process with rate 1 generates the epochs at which a ball is placed into a randomly chosen bin. Then, for any i = 1, ..., b, the epochs at which the ith bin receives a ball are generated by a Poisson subprocess with rate $\frac1b$. These Poisson processes are independent of each other.
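The integral in 7.75 also admits a closed form, which is a convenient cross-check (this reduction is our own and is not part of the solution above): substituting $q = e^{-t/20}$ and using $\int_0^1\frac{1-q^k}{1-q}\,dq = H_k$, the kth harmonic number, gives $E(T) = 20\,(4H_5 - 6H_{10} + 4H_{15} - H_{20})$.

```python
import math

# Cross-check of E(T) in 7.75 both in closed form via harmonic numbers and
# by direct numerical integration of ∫ [1 - (1 - e^{-t/20})^5]^4 dt.
def H(n):
    return sum(1.0 / k for k in range(1, n + 1))

closed_form = 20 * (4*H(5) - 6*H(10) + 4*H(15) - H(20))

f = lambda t: (1 - (1 - math.exp(-t/20))**5)**4
h, T = 0.005, 600.0
n = int(T / h)
numeric = (0.5*(f(0) + f(T)) + sum(f(i*h) for i in range(1, n))) * h
print(round(closed_form, 3), round(numeric, 3))  # both ≈ 24.694
```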
Let T be the first time at which each bin contains m or more balls and let N be the number of balls needed until each bin contains at least m balls. Then E(N) = E(T). Using the relation $E(T) = \int_0^{\infty}[1-P(T\le t)]\,dt$, we get
$$E(N) = \int_0^{\infty}\Big[1-\Big(\sum_{k=m}^{\infty} e^{-t/b}\frac{(t/b)^k}{k!}\Big)^b\Big]dt = b\int_0^{\infty}\Big[1-\Big(1-\sum_{k=0}^{m-1} e^{-u}\frac{u^k}{k!}\Big)^b\Big]du.$$

7.77 Writing $x_i - \theta = (x_i - \bar x) + (\bar x - \theta)$, it follows that
$$\sum_{i=1}^{n}(x_i-\theta)^2 = n(\bar x-\theta)^2 + \sum_{i=1}^{n}(x_i-\bar x)^2 + 2(\bar x-\theta)\sum_{i=1}^{n}(x_i-\bar x).$$
Noting that $\sum_{i=1}^n(x_i-\bar x) = 0$, it follows that
$$L(x\mid\theta) = (\sigma\sqrt{2\pi})^{-n}e^{-\frac12\sum_{i=1}^n(x_i-\theta)^2/\sigma^2}$$
is proportional to $e^{-\frac12 n(\theta-\bar x)^2/\sigma^2}$. Thus the posterior density $f(\theta\mid x)$ is proportional to $e^{-\frac12 n(\theta-\bar x)^2/\sigma^2}f_0(\theta)$, where $f_0(\theta) = \frac{1}{\sigma_0\sqrt{2\pi}}e^{-\frac12(\theta-\mu_0)^2/\sigma_0^2}$. Next it is a matter of some algebra to find that the posterior density is proportional to $e^{-\frac12(\theta-\mu_p)^2/\sigma_p^2}$, where
$$\mu_p = \frac{\sigma_0^2\,\bar x + (\sigma^2/n)\mu_0}{\sigma_0^2 + \sigma^2/n} \quad\text{and}\quad \sigma_p^2 = \frac{\sigma_0^2(\sigma^2/n)}{\sigma_0^2+\sigma^2/n}.$$
In other words, the posterior density is the $N(\mu_p,\sigma_p^2)$ density. Inserting the data n = 10, $\sigma = \sqrt2$, $\mu_0 = 73$ and $\sigma_0 = 0.7$, it follows that the posterior density is maximal at $\theta^* = \mu_p = 73.356$. Using the 0.025 and 0.975 percentiles of the standard normal density, a 95% Bayesian credible interval for θ is $(\mu_p - 1.960\sigma_p,\ \mu_p + 1.960\sigma_p) = (72.617, 74.095)$.

7.78 The IQ of the test person is modeled by the random variable Θ. The posterior density of Θ is proportional to
$$e^{-\frac12(123-\theta)^2/56.25} \times e^{-\frac12(\theta-100)^2/125}.$$
Using a little algebra to rewrite this expression, we get that the posterior density is a normal density with expected value $\mu_p = 115.862$ and standard deviation $\sigma_p = 6.228$ (the normal distribution is a conjugate prior for a likelihood function of the form of a normal density; see also Problem 7.77). The posterior density is maximal at $\theta^* = 115.862$. A 95% Bayesian credible interval for θ is $(\mu_p - 1.960\sigma_p,\ \mu_p + 1.960\sigma_p) = (103.65, 128.07)$.
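The normal–normal conjugate update used in 7.77 and 7.78 is one line of arithmetic; a quick check with the numbers of 7.78:

```python
# Normal-normal update of 7.78: prior N(100, 125) (variance 125) combined
# with a normal likelihood centred at 123 with variance 56.25.
prior_mean, prior_var = 100.0, 125.0
obs, obs_var = 123.0, 56.25
post_var = prior_var * obs_var / (prior_var + obs_var)
post_mean = (prior_var * obs + obs_var * prior_mean) / (prior_var + obs_var)
print(round(post_mean, 3), round(post_var ** 0.5, 3))  # 115.862 6.228
```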
7.79 The posterior density is proportional to
$$e^{-\frac12(t_1-\theta)^2/\sigma^2} \times e^{-\frac12(\theta-\mu_0)^2/\sigma_0^2},$$
where $t_1 = 140$, $\sigma = 20$, $\mu_0 = 150$, and $\sigma_0 = 25$ light years. Next a little algebra shows that the posterior density is proportional to $e^{-\frac12(\theta-\mu_p)^2/\sigma_p^2}$, where
$$\mu_p = \frac{\sigma_0^2 t_1 + \sigma^2\mu_0}{\sigma_0^2+\sigma^2} \quad\text{and}\quad \sigma_p^2 = \frac{\sigma_0^2\sigma^2}{\sigma_0^2+\sigma^2}.$$
This gives that the posterior density is a normal density with an expected value of $\mu_p = 143.902$ light years and a standard deviation of $\sigma_p = 15.617$ light years. The posterior density is maximal at θ = 143.902. A 95% Bayesian credible interval for the distance is $(\mu_p - 1.960\sigma_p,\ \mu_p + 1.960\sigma_p) = (113.293, 174.512)$.

7.80 The prior density $f_0(\theta)$ of the proportion of Liberal voters is $f_0(\theta) = c\,\theta^{474-1}(1-\theta)^{501-1}$, where c is a normalization constant. The likelihood function L(E | θ) is given by
$$L(E\mid\theta) = \binom{1100}{527}\theta^{527}(1-\theta)^{573}.$$
The posterior density f(θ | E) is proportional to $L(E\mid\theta)f_0(\theta)$ and so it is proportional to
$$\theta^{1000}(1-\theta)^{1073}.$$
In other words, the posterior density f(θ | E) is the beta(1001, 1074) density. The posterior probability that the Liberal party will win the election is
$$\int_{0.5}^{1} f(\theta\mid E)\,d\theta = 0.0546.$$
A Bayesian 95% confidence interval for the proportion of Liberal voters can be calculated as (0.4609, 0.5039).

7.81 The prior density of the parameter of the exponential lifetime of a light bulb is $f_0(\theta) = c\,\theta^{\alpha-1}e^{-\lambda\theta}$, where c is a normalization constant. Let E be the event that light bulbs have failed at times $t_1 < \cdots < t_r$ and that m − r light bulbs are still functioning at time T. The likelihood function L(E | θ) is defined as
$$\binom{m}{r}r!\,\theta^r e^{-[t_1+\cdots+t_r+(m-r)T]\theta}$$
(the rationale of this definition is the probability that one light bulb fails in each of the infinitesimal intervals $(t_i-\frac12\Delta,\ t_i+\frac12\Delta)$ and m − r light bulbs are still functioning at time T). The posterior density f(θ | E) is proportional to
$$\theta^{\alpha+r-1}e^{-[\lambda+t_1+\cdots+t_r+(m-r)T]\theta}.$$
In other words, the posterior density f(θ | E) is a gamma density with shape parameter α + r and scale parameter $\lambda + \sum_{i=1}^{r} t_i + (m-r)T$.

Chapter 8

8.1 (a) The binomial random variable X can be represented as $X = X_1 + \cdots + X_n$, where the $X_i$ are independent with $P(X_i = 0) = 1-p$ and $P(X_i = 1) = p$. Since $E(z^{X_i}) = 1-p+pz$, Rule 8.2 gives that $G_X(z) = (1-p+pz)^n$. The negative binomial random variable Y can be represented as $Y = Y_1+\cdots+Y_r$, where the $Y_i$ are independent with $P(Y_i = k) = p(1-p)^{k-1}$ for k ≥ 1. Since $E(z^{Y_i}) = pz/(1-(1-p)z)$, Rule 8.2 gives that
$$G_Y(z) = \Big(\frac{pz}{1-(1-p)z}\Big)^r.$$
(b) For the binomial distribution, $G_X'(1) = np$ and $G_X''(1) = n(n-1)p^2$, implying that E(X) = np and $\sigma^2(X) = np(1-p)$. For the negative binomial distribution, $G_Y'(1) = r/p$ and $G_Y''(1) = (r(1-2p)+r^2)/p^2$, and thus
$$E(Y) = \frac{r}{p} \quad\text{and}\quad \sigma^2(Y) = \frac{r(1-p)}{p^2}.$$

8.2 Put for abbreviation $p_n = P(X=n)$. By the definition of $G_X(z)$, we have
$$G_X(-1) = \sum_{n=0}^{\infty} p_{2n} - \sum_{n=0}^{\infty} p_{2n+1}.$$
Also, $\sum_{n=0}^{\infty} p_{2n} + \sum_{n=0}^{\infty} p_{2n+1} = 1$. Adding these two equations, we get $G_X(-1) + 1 = 2\sum_{n=0}^{\infty} p_{2n}$, showing the desired result.

8.3 By Rule 8.2, the generating function of the total score S is given by
$$G_S(z) = \Big(\frac13 z + \frac16 z^2 + \frac16 z^3 + \frac16 z^4 + \frac16 z^5\Big)^n.$$
Since $G_S(-1) = (-\frac13)^n$, the sought probability is $\frac12[(-\frac13)^n + 1]$.

8.4 (a) Denote by $X_k$ the kth sample from the continuous probability distribution. Obviously, $P(I_k = 1) = P(X_k = \max(X_1,\ldots,X_k))$ and so, by a symmetry argument, $P(I_k = 1) = \frac1k$ for k = 1, ..., n. By the chain rule for conditional probabilities, $P(I_{i_1}=1,\ldots,I_{i_r}=1)$ is equal to
$$P(I_{i_r}=1)\times P(I_{i_{r-1}}=1\mid I_{i_r}=1)\times\cdots\times P(I_{i_1}=1\mid I_{i_r}=1,\ldots,I_{i_2}=1).$$
Also, we have that
$$P(I_{i_k}=1\mid I_{i_r}=1,\ldots,I_{i_{k+1}}=1) = P\big(X_{i_k}=\max(X_1,\ldots,X_{i_k})\big) = \frac{1}{i_k}.$$
Hence $P(I_{i_1}=1,\ldots,I_{i_r}=1) = \frac{1}{i_r}\times\cdots\times\frac{1}{i_1}$. This proves that, for all $1\le i_1 < \cdots < i_r \le n$ and $1 \le r \le n$,
$$P(I_{i_1}=1,\ldots,I_{i_r}=1) = P(I_{i_1}=1)\times\cdots\times P(I_{i_r}=1).$$
Next it will be shown that $P(I_{i_1}=\delta_{i_1},\ldots,I_{i_r}=\delta_{i_r}) = P(I_{i_1}=\delta_{i_1})\times\cdots\times P(I_{i_r}=\delta_{i_r})$ for all $1\le i_1 < \cdots < i_r \le n$ and $\delta_{i_1},\ldots,\delta_{i_r}\in\{0,1\}$. The proof is by induction on l, where l is the number of the $\delta_{i_k}$ with the value zero. Take l = 1 and suppose for ease that $\delta_{i_1} = 0$. Then $P(I_{i_1}=0, I_{i_2}=1,\ldots,I_{i_r}=1)$ is equal to
$$P(I_{i_2}=1,\ldots,I_{i_r}=1) - P(I_{i_1}=1,I_{i_2}=1,\ldots,I_{i_r}=1)$$
$$= P(I_{i_2}=1)\times\cdots\times P(I_{i_r}=1) - P(I_{i_1}=1)\times\cdots\times P(I_{i_r}=1)$$
$$= \big(1-P(I_{i_1}=1)\big)\times P(I_{i_2}=1)\times\cdots\times P(I_{i_r}=1) = P(I_{i_1}=0)\times P(I_{i_2}=1)\times\cdots\times P(I_{i_r}=1),$$
as was to be verified. Continuing in this way, we finally get that $P(I_1=\delta_1,\ldots,I_n=\delta_n) = P(I_1=\delta_1)\times\cdots\times P(I_n=\delta_n)$ for all $\delta_1,\ldots,\delta_n\in\{0,1\}$, showing that $I_1, I_2, \ldots, I_n$ are independent.
(b) The number of record draws is distributed as $R = I_1 + \cdots + I_r$. For each k, we have $P(I_k=1) = \frac1k$ and $P(I_k=0) = 1-\frac1k$. The generating function of the random variable $I_k$ is given by $1-\frac1k+\frac1k z$ for k = 1, ..., r. The random variables $I_1,\ldots,I_r$ are independent. Hence, by the convolution rule, the generating function of R is given by
$$G_R(z) = z\Big(1-\frac12+\frac12 z\Big)\cdots\Big(1-\frac1r+\frac1r z\Big).$$
Using mathematical software for the multiplication of polynomials, we find that for r = 10 the generating function $G_R(z)$ is given by
$$G_R(z) = \frac{1}{10}z + \frac{7\,129}{25\,200}z^2 + \frac{1\,303}{4\,032}z^3 + \frac{4\,523}{22\,680}z^4 + \frac{19}{256}z^5 + \frac{3\,013}{172\,800}z^6 + \frac{1}{384}z^7 + \frac{29}{120\,960}z^8 + \frac{1}{80\,640}z^9 + \frac{1}{3\,628\,800}z^{10}.$$
Denoting by $p_k$ the probability of exactly k records in 10 samples, we have $p_1 = \frac{1}{10}$, $p_2 = \frac{7\,129}{25\,200}$, ..., $p_9 = \frac{1}{80\,640}$, and $p_{10} = \frac{1}{3\,628\,800}$.

8.5 We have $G_{X+Y}(z) = e^{-\mu(1-z)}$, where µ = E(X + Y). By the independence of X and Y, $G_{X+Y}(z) = G_X(z)G_Y(z)$. Since X and Y have the same distribution, $G_X(z) = G_Y(z)$. Thus $[G_X(z)]^2 = e^{-\mu(1-z)}$ and so $G_X(z) = e^{-\frac12\mu(1-z)}$.
The generating function uniquely determines the probability mass function. Thus X and Y are Poisson distributed with mean $\frac12\mu$. Note: The assumption that X and Y are identically distributed can be dropped, but this requires deep analysis using characteristic functions.

8.6 By conditioning on N, we have
$$E(z^S) = \sum_{n=0}^{\infty} E(z^S\mid N=n)P(N=n) = z^0 P(N=0) + \sum_{n=1}^{\infty} E(z^{X_1+\cdots+X_n})P(N=n).$$
Using the convolution rule and the assumption that $X_1, X_2, \ldots$ are independent random variables with generating function A(z), it follows that
$$G_S(z) = \sum_{n=0}^{\infty}[A(z)]^n\frac{\mu^n}{n!}e^{-\mu} = e^{-\mu[1-A(z)]}.$$
Taking the first two derivatives of $G_S(z)$ at z = 1 gives the expressions for E(S) and var(S).

8.7 Since $E(z^{N(t+h)}) = E(z^{N(t+h)-N(t)}z^{N(t)})$ and $N(t+h)-N(t)$ is independent of N(t), we have
$$g_z(t+h) = E\big(z^{N(t+h)-N(t)}\big)g_z(t) = \big[(1-\lambda h) + \lambda h z + o(h)\big]g_z(t) \quad\text{for } h \to 0.$$
This leads to $[g_z(t+h)-g_z(t)]/h = -\lambda(1-z)g_z(t) + o(h)/h$ as h → 0, and so
$$\frac{\partial}{\partial t}g_z(t) = -\lambda(1-z)g_z(t) \quad\text{for } t > 0.$$
Together with $g_z(0) = 1$, this gives $g_z(t) = e^{-\lambda t(1-z)}$, showing that the generating function of N(t) is given by the generating function of a Poisson distributed random variable with expected value λt. By the uniqueness property of the generating function, it follows that N(t) is Poisson distributed with expected value λt.

8.8 Let N be the outcome of the first roll of the die. Also, let $X_1, X_2, \ldots$ be independent random variables each having the uniform distribution on 1, 2, ..., 6. The sum S of the face values of the simultaneous roll of the dice is distributed as $X_1+\cdots+X_N$. The conditional distribution of $X_1+\cdots+X_N$ given that N = k is the same as the unconditional distribution of $X_1+\cdots+X_k$. Applying the law of conditional expectation, we get that the generating function of S is given by
$$G_S(z) = E(z^{X_1+\cdots+X_N}) = \sum_{k=1}^{6} E(z^{X_1+\cdots+X_N}\mid N=k)P(N=k) = \frac16\sum_{k=1}^{6}\Big(\frac16 z + \frac16 z^2 + \cdots + \frac16 z^6\Big)^k.$$
This generating function can be expanded as a polynomial by using standard mathematical software. Letting $p_k = P(S=k)$, we have
$$p_1 = \frac{1}{36},\ p_2 = \frac{7}{216},\ p_3 = \frac{49}{1\,296},\ p_4 = \frac{343}{7\,776},\ p_5 = \frac{2\,401}{46\,656},\ p_6 = \frac{16\,807}{279\,936},$$
$$p_7 = \frac{493}{11\,664},\ p_8 = \frac{4\,169}{93\,312},\ p_9 = \frac{3\,269}{69\,984},\ p_{10} = \frac{749}{15\,552},\ p_{11} = \frac{2\,275}{46\,656},\ p_{12} = \frac{749}{15\,552}.$$
The sought probability is $\sum_{k=1}^{12} p_k = 0.5323$.

8.9 Using first-step analysis, we get $E(z^X) = pzE(z^X) + q + rE(z^X)$. This leads to
$$\sum_{k=0}^{\infty}P(X=k)z^k = \frac{q}{1-pz-r}.$$
Writing $q/(1-pz-r)$ as $\big(q/(1-r)\big)\big/\big(1-pz/(1-r)\big)$ and using the expansion
$$\frac{q/(1-r)}{1-pz/(1-r)} = \frac{q}{1-r}\sum_{k=0}^{\infty}\Big(\frac{p}{1-r}\Big)^k z^k \quad\text{for } |z| < (1-r)/p,$$
we obtain by equating terms that
$$P(X=k) = \frac{q}{1-r}\Big(\frac{p}{1-r}\Big)^k \quad\text{for } k = 0, 1, \ldots.$$

8.10 Define the random variable X as the sum of the integers that are generated. Let the random variable I denote the first integer generated. Then the conditional distribution of X given that I = i with i ≠ 0 is the same as the unconditional distribution of i + X. Thus, by the law of conditional expectation,
$$E(z^X) = \sum_{i=0}^{9} E(z^X\mid I=i)P(I=i) = \frac{1}{10} + \frac{1}{10}\sum_{i=1}^{9}E(z^{i+X}).$$
It can now be seen that $G_X(z) = E(z^X)$ is given by
$$G_X(z) = \frac{1/10}{1-(1/10)\sum_{i=1}^{9}z^i}.$$
An alternative derivation of this formula is as follows. The random variable X can be represented as the random sum $X = \sum_{i=1}^{N-1} Y_i$, where N has a geometric distribution with probability $p = \frac{1}{10}$ and $Y_1, Y_2, \ldots$ are independent random variables that have a discrete uniform distribution on 1, ..., 9. Also, the $Y_i$ are independent of N. Next, by repeating the analysis for Problem 8.6, we get the result. Note: The first and second derivatives of $G_X(z)$ at z = 1 have the values $G_X'(1) = 45$ and $G_X''(1) = 4{,}290$. This gives E(X) = 45 and σ(X) = 48.06.
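A quick sanity check on E(X) = 45: conditioning on the first digit gives the linear equation $E(X) = \sum_{i=1}^{9}\frac{1}{10}\big(i + E(X)\big)$, which can be solved directly:

```python
# E(X) satisfies E = (1/10)*0 + sum_{i=1}^{9} (1/10)*(i + E)  =>  E = 4.5 + 0.9 E
mean_digit_part = sum(range(1, 10)) / 10      # 4.5
e_x = mean_digit_part / (1 - 0.9)             # solve E = 4.5 + 0.9 E
print(round(e_x, 6))  # 45.0
```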
By numerical inversion of $G_X(z)$ (using the fast Fourier transform), we find that P(X > 10k) has the numerical values 0.9000, 0.7506, 0.6116, 0.4967, 0.4035, 0.3278, 0.2663, 0.2163, 0.1757, 0.1427, and 0.1159 for k = 0, 1, ..., 10.

8.11 Conditioning on the outcome of the first toss, we get
$$E(z^X) = \frac12 zE(z^{X_1}) + \frac12 zE(z^{X_2}).$$
The random variable $X_1$ is equal to r − 1 if the next r − 1 tosses give heads, and $X_1$ is distributed as $k + X_2$ for 1 ≤ k ≤ r − 1 if the next k − 1 tosses give heads and are followed by tails. Therefore
$$E(z^{X_1}) = \Big(\frac12\Big)^{r-1}z^{r-1} + \sum_{k=1}^{r-1}\Big(\frac12\Big)^k z^k E(z^{X_2}).$$
Since $E(z^{X_2}) = E(z^{X_1})$, we have $E(z^{X_1}) = (\frac12 z)^{r-1}\big/\big[1-\sum_{k=1}^{r-1}(\frac12 z)^k\big]$. This leads to
$$E(z^X) = \frac{2(\frac12 z)^r}{1-\sum_{k=1}^{r-1}(\frac12 z)^k}.$$
Taking the derivatives of $E(z^X)$ and putting z = 1, we get $E(X) = 2^r - 1$ and $\mathrm{var}(X) = 2^r(2^r - 2r + 1) - 2$.

8.12 Since $\lim_{z\to1}G_X'(z) = \infty$, we have E(X) = ∞. The probability mass function of X has a long tail. It is interesting to give some numerical values for the tail probability P(X > n). The tail probability has the values 0.3125, 0.2460, 0.1550, 0.1123, 0.0796, 0.0504, 0.0357, 0.0252, 0.0160, 0.0113, 0.0092, and 0.0080 for n = 5, 10, 25, 50, 100, 250, 500, 1,000, 2,500, 5,000, 7,500, and 10,000.

8.13 The extinction probability is the smallest root of the equation u = P(u), where the generating function P(u) is given by
$$P(u) = \frac{p}{1-(1-p)u} \quad\text{for } |u| \le 1.$$
The equation u = P(u) has the two roots $u = \frac{p}{1-p}$ and u = 1. The extinction probability is $\frac{p}{1-p}$ if $p < \frac12$ and is 1 otherwise.

8.14 The offspring distribution is given by $p_0 = \frac15 + \frac45(\frac13\times\frac14 + \frac23\times\frac18) = \frac{10}{30}$, $p_1 = \frac45(\frac13\times\frac12 + \frac23\times\frac38) = \frac{10}{30}$, $p_2 = \frac45(\frac13\times\frac14 + \frac23\times\frac38) = \frac{8}{30}$, and $p_3 = \frac45\times\frac23\times\frac18 = \frac{2}{30}$. The generating function of the offspring distribution is
$$P(u) = \frac13 + \frac13 u + \frac{4}{15}u^2 + \frac{1}{15}u^3.$$
The equation P(u) = u has the roots $u_1 = 1$, $u_2 = -\frac12(5+\sqrt{45})$, and $u_3 = \frac12(\sqrt{45}-5)$ (the equation P(u) = u can be factorized as $(u-1)(\frac{1}{15}u^2 + \frac13 u - \frac13) = 0$). The desired probability is $u_\infty = \frac12(\sqrt{45}-5) = 0.8541$.

8.15 The generating function of the offspring distribution is
$$P(u) = \frac13 + \frac23 u^2.$$
(a) To find $u_3$, iterate $u_n = P(u_{n-1})$ starting with $u_0 = 0$. This gives $u_1 = P(0) = \frac13$, $u_2 = P(\frac13) = \frac13 + \frac23\cdot\frac19 = \frac{11}{27}$, and $u_3 = P(\frac{11}{27}) = \frac13 + \frac23\big(\frac{11}{27}\big)^2 = 0.4440$.
(b) The equation $u = \frac13 + \frac23 u^2$ has the roots u = 1 and $u = \frac12$. The probability $u_\infty = \frac12$.
(c) The probabilities are $u_3^2 = 0.1971$ and $u_\infty^2 = 0.25$.

8.16 We have $M_X(t) = \frac{e^{bt}-e^{at}}{(b-a)t}$ for all t ≠ 0, with $M_X(0) = 1$.

8.17 Since the uniform random variable on (0, 1) has the moment-generating function
$$\int_0^1 e^{tu}\,du = \frac{e^t-1}{t},$$
it is plausible that X takes on the value 0 with probability $\frac12$ and is uniformly distributed on (0, 1) with probability $\frac12$. Indeed, for such a mixed random variable X, we have $E(e^{tX}) = \frac12 + \frac12\frac{e^t-1}{t}$.

8.18 We have
$$M_X(t) = \int_{-\infty}^{0} e^{tx}\,\frac12 ae^{ax}\,dx + \int_0^{\infty} e^{tx}\,\frac12 ae^{-ax}\,dx = \frac{a}{2}\Big(\int_0^{\infty} e^{-(a+t)y}\,dy + \int_0^{\infty} e^{-(a-t)x}\,dx\Big).$$
Therefore $M_X(t)$ is only defined for −a < t < a and is given by
$$M_X(t) = \frac{a}{2}\Big(\frac{1}{a+t} + \frac{1}{a-t}\Big).$$
By $M_X'(0) = 0$ and $M_X''(0) = \frac{2}{a^2}$, we have E(X) = 0 and $\mathrm{var}(X) = \frac{2}{a^2}$.

8.19 Since the random variables $X_i$ are independent, the moment-generating function of $\sum_{i=1}^{n}X_i$ is
$$\Big(\frac{\lambda}{\lambda-t}\Big)^{\alpha_1}\cdots\Big(\frac{\lambda}{\lambda-t}\Big)^{\alpha_n} = \Big(\frac{\lambda}{\lambda-t}\Big)^{\alpha_1+\cdots+\alpha_n}.$$
This proves the desired result by the uniqueness property of the moment-generating function.

8.20 By the independence of X and Y, we have $M_{X+Y}(t) = M_X(t)M_Y(t)$. If the random variable X + Y is $N(\mu,\sigma^2)$ distributed, then $M_{X+Y}(t) = e^{\mu t+\frac12\sigma^2t^2}$. Since X and Y are identically distributed, we have $M_X(t) = M_Y(t)$ and so
$$[M_X(t)]^2 = e^{\mu t+\frac12\sigma^2t^2}.$$
Hence we find that
$$M_X(t) = e^{(\mu/2)t+\frac12(\sigma^2/2)t^2}.$$
This is the moment-generating function of an $N(\frac12\mu,\frac12\sigma^2)$ distributed random variable.
The moment-generating function uniquely determines the underlying probability distribution. Thus, both X and Y are $N(\frac12\mu,\frac12\sigma^2)$ distributed. Note: The assumption that X and Y are identically distributed can be dropped, but this requires deep analysis.

8.21 The definition $M_X(t) = \int_{-\infty}^{\infty} e^{tx}e^x/(1+e^x)^2\,dx$ reveals that $M_X(t)$ is finite only for −1 < t < 1. Using the change of variable $u = 1/(1+e^x)$ and noting the relations $\frac{du}{dx} = -e^x/(1+e^x)^2$ and $e^x = \frac{1-u}{u}$, it readily follows that
$$M_X(t) = \int_0^1\Big(\frac{1-u}{u}\Big)^t du \quad\text{for } -1 < t < 1.$$
This integral is the beta integral; see Section 4.6.3. Thus $M_X(t) = \Gamma(1-t)\Gamma(1+t)$ for −1 < t < 1, where Γ(x) is the gamma function. Since $a^t = e^{t\ln(a)}$ has as derivative $\ln(a)\,a^t$, we get from the integral representation of $M_X(t)$ that
$$M_X'(t) = \int_0^1 \ln\Big(\frac{1-u}{u}\Big)\Big(\frac{1-u}{u}\Big)^t du.$$
Thus
$$M_X'(0) = \int_0^1[\ln(1-u) - \ln(u)]\,du = 0.$$
In the same way, we find
$$M_X''(0) = \int_0^1[\ln(1-u)-\ln(u)]^2\,du = \frac{\pi^2}{3},$$
showing that E(X) = 0 and $\sigma^2(X) = \frac{\pi^2}{3}$.

8.22 Let Y = 1 − X. Then, by the assumption $M_X(t) = e^tM_X(-t)$, we get $M_Y(t) = E(e^{t(1-X)}) = e^tM_X(-t) = M_X(t)$. Thus 1 − X has the same distribution as X. This implies E(1 − X) = E(X) and so E(X) = 0.5. The distribution of X is not uniquely determined: 1 − X has the same distribution as X both for a continuous random variable X that is uniformly distributed on (0, 1) and for a discrete random variable X with P(X = 0) = P(X = 1) = 0.5.

8.23 (a) Using the decomposition formula for the standard bivariate normal density function f(x, y) in Section 6.1 and the basic formula $P((X,Y)\in C) = \iint_C f(x,y)\,dx\,dy$, the moment-generating function $E(e^{sX+tY})$ can be evaluated as
$$\int_{-\infty}^{\infty} e^{sx}\frac{1}{\sqrt{2\pi}}e^{-\frac12 x^2}\Big(\int_{-\infty}^{\infty} e^{ty}\frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}}e^{-\frac12(y-\rho x)^2/(1-\rho^2)}\,dy\Big)dx.$$
The inner integral can be interpreted as $E(e^{tW})$ with $N(\rho x, 1-\rho^2)$-distributed W and so, using Example 8.5, the inner integral reduces to $e^{\rho xt+\frac12(1-\rho^2)t^2}$.
Thus we get
$$E(e^{sX+tY}) = e^{\frac12(1-\rho^2)t^2}\int_{-\infty}^{\infty} e^{(s+\rho t)x}\frac{1}{\sqrt{2\pi}}e^{-\frac12 x^2}\,dx.$$
The latter integral can be interpreted as $E(e^{(s+\rho t)Z})$ with N(0, 1)-distributed Z and is thus equal to $e^{\frac12(s+\rho t)^2}$. Putting the pieces together, we have
$$E(e^{sX+tY}) = e^{\frac12(s^2+2\rho st+t^2)},$$
as was to be verified.
(b) It suffices to verify the assertion for N(0, 1) distributed X and Y. Let ρ = ρ(X, Y) (= cov(X, Y)). Using the assumption and Rule 5.12, the random variable aX + bY is $N(0, a^2+2ab\rho+b^2)$ distributed for any constants a, b. Then, by Example 8.5 with t = 1,
$$E(e^{aX+bY}) = e^{\frac12(a^2+2ab\rho+b^2)} \quad\text{for all } a, b.$$
This proves the desired result with an appeal to the result of part (a) and the uniqueness property of the moment-generating function.

Chapter 9

9.1 Let X be the total time needed for both tasks. Then E(X) = 45 and $\sigma^2(X) = 65$. Write the probability P(X < 60) as 1 − P(X ≥ 45 + 15). The one-sided Chebyshev inequality gives
$$P(X < 60) \ge 1 - \frac{65}{65+225} = 0.7759.$$

9.2 By the two-sided Chebyshev inequality,
$$P(|X-\mu| \le k\sigma) \ge 1 - \frac{\sigma^2}{k^2\sigma^2} = 1 - \frac{1}{k^2}.$$
Thus choose $k \ge 1/\sqrt{1-\beta}$.

9.3 The moment-generating function of a random variable X that is uniformly distributed on (−1, 1) is $M_X(t) = \frac{1}{2t}(e^t-e^{-t})$ for all t ≠ 0. Put for abbreviation $\overline X_n = \frac1n(X_1+\cdots+X_n)$. Since the $X_i$ are independent,
$$M_{\overline X_n}(t) = \Big[\frac{e^{t/n}-e^{-t/n}}{2t/n}\Big]^n.$$
By Chernoff's bound, $P(\overline X_n \ge c) \le \min_{t>0} e^{-ct}M_{\overline X_n}(t)$. Using the inequality $\frac12(e^u-e^{-u}) \le ue^{u^2/6}$ for u > 0, we get
$$P(\overline X_n \ge c) \le \min_{t>0} e^{-ct}e^{t^2/6n}.$$
The function $e^{-(ct-t^2/6n)}$ is minimal for t = 3cn, which gives the desired bound.

9.4 (a) We have
$$E(e^{tX_i}) = E\Big(\sum_{n=0}^{\infty}\frac{t^nX_i^n}{n!}\Big) = \sum_{n=0}^{\infty}\frac{t^n}{n!}E(X_i^n).$$
The interchange of the order of expectation and summation is justified by the absolute convergence of $\sum_{n=1}^{\infty}E\big(\frac{t^nX_i^n}{n!}\big)$. Since $E(X_i) = 0$ and $E(X_i^n) \le B^{n-2}E(X_i^2) = B^{n-2}\sigma_i^2$ for n ≥ 2, we get
$$E(e^{tX_i}) \le 1 + \frac{\sigma_i^2}{B^2}\sum_{n=2}^{\infty}\frac{(tB)^n}{n!} = 1 + \frac{\sigma_i^2}{B^2}\big(e^{tB}-1-tB\big) \le e^{\sigma_i^2(e^{tB}-1-tB)/B^2},$$
where the last inequality uses the fact that $1+x \le e^x$ for x > 0.
(b) Using Rule 9.3 and Rule 8.5, we have
$$P\Big(\frac1n\sum_{i=1}^{n}X_i \ge c\Big) \le \min_{t>0}\big[e^{-nct}M_{X_1}(t)\cdots M_{X_n}(t)\big] \le \min_{t>0}\big[e^{-nct+n\sigma^2(e^{tB}-1-tB)/B^2}\big],$$
where $\sigma^2 = \frac1n\sum_{i=1}^{n}\sigma_i^2$. The minimizing value of t is $t = \frac1B\ln\big(1+\frac{cB}{\sigma^2}\big)$, as follows by putting the derivative of $-nct+n\sigma^2(e^{tB}-1-tB)/B^2$ equal to zero. Next it is a matter of some algebra to get the desired result
$$P\Big(\frac1n\sum_{i=1}^{n}X_i \ge c\Big) \le e^{-\frac{nc^2}{2\sigma^2+2Bc/3}} \quad\text{for } c > 0.$$

9.5 The random variable X is distributed as $\sum_{i=1}^{n}X_i$, where the $X_i$ are independent with $P(X_i=1) = p$ and $P(X_i=0) = 1-p$. This gives $M_X(t) = (pe^t+1-p)^n$. By Chernoff's bound,
$$P\big(X \ge np(1+\delta)\big) \le \min_{t>0}\big[e^{-np(1+\delta)t}(pe^t+1-p)^n\big].$$
Let $g(t) = e^{-p(1+\delta)t}(pe^t+1-p)$. Putting the derivative of g(t) equal to zero, it follows that the function g(t) takes on its absolute minimum for t = ln(γ) with
$$\gamma = \frac{(1-p)(1+\delta)}{1-p(1+\delta)}.$$
This leads to the upper bound
$$P\big(X \ge np(1+\delta)\big) \le \Big(\frac{p\gamma+1-p}{\gamma^{p(1+\delta)}}\Big)^n.$$
Next it is a matter of some algebra to obtain the first bound. To get the other bound, note that $f(a) = a\ln(\frac{a}{p}) + (1-a)\ln(\frac{1-a}{1-p})$ has the derivatives
$$f'(a) = \ln\Big(\frac{a}{p}\Big) - \ln\Big(\frac{1-a}{1-p}\Big), \qquad f''(a) = \frac{1}{a(1-a)} \quad\text{for } 0 < a < 1.$$
Next use Taylor's formula $f(a) = f(p) + (a-p)f'(p) + \frac{1}{2!}(a-p)^2f''(\eta_a)$ for some $\eta_a$ with $p < \eta_a < a$. Since $f(p) = f'(p) = 0$ and $\eta(1-\eta) \le \frac14$ for 0 < η < 1, we obtain $f(a) \ge 2(a-p)^2 = 2\delta^2p^2$ for p < a < 1, which gives the second bound.

9.6 Using Rule 8.5 and the arithmetic–geometric mean inequality, we have
$$E(e^{tX}) = (1-p_1+p_1e^t)\times\cdots\times(1-p_n+p_ne^t) \le \Big[\frac1n\sum_{i=1}^{n}(1-p_i+p_ie^t)\Big]^n = (pe^t+1-p)^n,$$
where $p = \frac1n\sum_{i=1}^{n}p_i$. The rest of the proof is identical to the proof for Problem 9.5.

9.7 Since $\ln(P_n) = \frac1n\sum_{k=1}^{n}\ln(X_k)$ and $E(\ln(X_k)) = \int_0^1\ln(x)\,dx = -1$, it follows from the strong law of large numbers that $P(\{\omega : \lim_{n\to\infty}\ln(P_n(\omega)) = -1\}) = 1$.
This implies that P({ω : lim_{n→∞} P_n(ω) = e^{−1}}) = 1.

9.8 For fixed ε > 0, let A_n = {ω : |X_n(ω) − X(ω)| > ε}. Then, by ∑_{n=1}^∞ P(A_n) < ∞, it follows from the first Borel-Cantelli lemma that
   P({ω : |X_n(ω) − X(ω)| > ε for infinitely many n}) = 0.
This holds for any ε > 0. Thus P({ω : lim_{n→∞} X_n(ω) = X(ω)}) = 1.

9.9 Fix ε > 0. Let A_n = {ω : |X_n(ω)| > ε}. Suppose to the contrary that ∑_{n=1}^∞ P(A_n) = ∞. Then, by the second Borel-Cantelli lemma,
   P({ω : ω ∈ A_n for infinitely many n}) = 1.
This contradicts the assumption that P({ω : lim_{n→∞} X_n(ω) = 0}) = 1. Therefore ∑_{n=1}^∞ P(A_n) < ∞.

9.10 By the definition of X_k, we have P({ω : lim_{k→∞} X_k(ω) = 0}) = 1. Therefore X_k converges almost surely to 0. However,
   E(X_k) = ∑_{l=k+1}^∞ l² (c/l³) = c ∑_{l=k+1}^∞ 1/l = ∞
for any k ≥ 1.

9.11 By Markov's inequality,
   P(|X_n − X| > ε) = P(|X_n − X|² > ε²) ≤ E(|X_n − X|²)/ε²
for each ε > 0. Therefore lim_{n→∞} P(|X_n − X| > ε) = 0 for each ε > 0, showing convergence in probability. A counterexample showing that convergence in probability does not imply convergence in mean square is provided by Problem 9.10. Note: Mean-square convergence is neither stronger nor weaker than almost sure convergence.

9.12 Since the X_k are uncorrelated, we have E[(X_i − µ)(X_j − µ)] = 0 for j ≠ i. Therefore E[(M_n − µ)²] can be evaluated as
   (1/n²) E[ (∑_{i=1}^n (X_i − µ))² ] = (1/n²) ∑_{i=1}^n ∑_{j=1}^n E[(X_i − µ)(X_j − µ)]
      = (1/n²) ∑_{i=1}^n E[(X_i − µ)²] = nσ²/n² = σ²/n,
which shows that M_n converges in mean square to µ.

9.13 Since the X_i are independent, E(Y_k) = µ² and so E[(1/n)∑_{k=1}^n Y_k] = µ². By Chebyshev's inequality,
   P( |(1/n)∑_{k=1}^n Y_k − µ²| > ε ) ≤ σ²(∑_{k=1}^n Y_k)/(n²ε²)
for each ε > 0. By the independence of the X_k, we have cov(Y_i, Y_j) = µ⁴ − µ⁴ = 0 for j > i + 1. Thus
   σ²( ∑_{k=1}^n Y_k ) = ∑_{k=1}^n σ²(Y_k) + 2 ∑_{i=1}^{n−1} cov(Y_i, Y_{i+1}).
We have σ²(Y_k) = E(X_k²)E(X_{k+1}²) − µ⁴ and cov(Y_i, Y_{i+1}) = µ²E(X_{i+1}²) − µ⁴.
Thus, by the boundedness of the σ²(X_i), there is a constant c > 0 such that
   σ²( ∑_{k=1}^n Y_k ) ≤ nc   for each n ≥ 1.
Then we get that P( |(1/n)∑_{k=1}^n Y_k − µ²| > ε ) tends to 0 as n → ∞, as was to be proved.

9.14 Since E(∑_{n=1}^∞ Y_n) < ∞, we have
   P({ω : ∑_{n=1}^∞ Y_n(ω) < ∞}) = 1.
If ∑_{n=1}^∞ Y_n(ω) < ∞, then lim_{n→∞} Y_n(ω) = 0. Therefore
   P({ω : lim_{n→∞} Y_n(ω) = 0}) = 1.

9.15 We have
   P(X_n ≤ x) = P(X_n ≤ x, |X_n − X| ≤ ε) + P(X_n ≤ x, |X_n − X| > ε)
      ≤ P(X ≤ x + ε) + P(|X_n − X| > ε).
Letting n → ∞ and using the assumption that X_n converges in probability to X, we get
   lim_{n→∞} P(X_n ≤ x) ≤ P(X ≤ x + ε)
for any ε > 0. Next, letting ε → 0, we obtain lim_{n→∞} P(X_n ≤ x) ≤ P(X ≤ x) when x is a continuity point of P(X ≤ x). Next we will verify that lim_{n→∞} P(X_n ≤ x) ≥ P(X ≤ x) when x is a continuity point of P(X ≤ x). Interchanging the roles of X_n and X in the above argument, we get
   P(X ≤ x) = P(X ≤ x, |X_n − X| ≤ ε) + P(X ≤ x, |X_n − X| > ε)
      ≤ P(X_n ≤ x + ε) + P(|X_n − X| > ε).
Thus, by the assumption that X_n converges in probability to X,
   P(X ≤ x) ≤ lim_{n→∞} P(X_n ≤ x + ε).
Replacing x by x − ε, we obtain
   lim_{n→∞} P(X_n ≤ x) ≥ P(X ≤ x − ε)
for any ε > 0. Therefore lim_{n→∞} P(X_n ≤ x) ≥ P(X ≤ x) when x is a continuity point of P(X ≤ x). This completes the proof.

9.16 Since ∑_{k=1}^n I_k ≤ 1, we have S_n² ≥ S_n² ∑_{k=1}^n I_k and so
   E(S_n²) ≥ E( S_n² ∑_{k=1}^n I_k ) = ∑_{k=1}^n E(S_n² I_k).
Thus, writing S_n² as S_k² + 2S_k(S_n − S_k) + (S_n − S_k)², we get
   E(S_n²) ≥ ∑_{k=1}^n E[ (S_k² + 2S_k(S_n − S_k) + (S_n − S_k)²) I_k ]
      = ∑_{k=1}^n E(S_k² I_k) + ∑_{k=1}^n E[2S_k(S_n − S_k)I_k] + ∑_{k=1}^n E[(S_n − S_k)² I_k].
Note that E[S_k(S_n − S_k)I_k] = E(S_k I_k)E(S_n − S_k), by the independence of the X_j. Since E(X_j) = 0 for all j, we have E(S_n − S_k) = 0. Also, E[(S_n − S_k)² I_k] ≥ 0.
Next, using the fact that the events A_k are disjoint, we find
   E(S_n²) ≥ ∑_{k=1}^n E(S_k² I_k) ≥ ∑_{k=1}^n c² P(I_k = 1) = c² ∑_{k=1}^n P(A_k) = c² P( ∪_{k=1}^n A_k ) = c² P( max_{1≤k≤n} |S_k| ≥ c ),
where the second inequality uses the fact that
   E(S_k² I_k) = E(S_k² | I_k = 1)P(I_k = 1) + E(S_k² | I_k = 0)P(I_k = 0) ≥ E(S_k² | I_k = 1)P(I_k = 1) ≥ c² P(I_k = 1).
This completes the proof of Kolmogorov's inequality. The result can be rewritten as
   P( max_{1≤k≤n} |S_k| ≥ c ) ≤ (1/c²) ∑_{k=1}^n var(X_k).
This follows from the assumption that the X_k are independent and satisfy E(X_k) = 0 for all k, implying that E(X_iX_j) = E(X_i)E(X_j) = 0 for i ≠ j and E(X_k²) = var(X_k). Therefore
   E(S_n²) = ∑_{k=1}^n E(X_k²) + 2 ∑_{i=1}^{n−1} ∑_{j=i+1}^n E(X_iX_j) = ∑_{k=1}^n var(X_k).

9.17 Since ∫_0^x 1/(1+y²) dy = arctg(x), we have P(X_i ≤ x) = (2/π) arctg(x) for x ≥ 0. Since the X_i are independent, P(M_n/n ≤ x) = P(X_1 ≤ nx) ··· P(X_n ≤ nx). Therefore
   P( M_n/n ≤ x ) = [ (2/π) arctg(nx) ]^n = [ 1 − (2/π) arctg(1/(nx)) ]^n   for x > 0.
For any fixed x, we have that |1/(nx)| < 1 for n large enough. Using the power series expansion of arctg(y) for |y| < 1, it follows that
   lim_{n→∞} P( M_n/n ≤ x ) = lim_{n→∞} [ 1 − 2/(πxn) ]^n = e^{−2/(πx)}   for x > 0.

9.18 The assumption E(X_i⁴) < ∞ implies that E(|X_i|^k) < ∞ for k = 1, 2, and 3. This follows from the inequality |x|^k ≤ 1 + x⁴ for all x and k = 1, 2, and 3. It suffices to prove the strong law of large numbers under the assumption that E(X_i) = 0; otherwise, replace X_i by X_i − E(X_i). We first verify that, for some constant c,
   E[ (X_1 + ··· + X_n)⁴ ] ≤ cn²   for all n ≥ 1.
To verify this, note that (X_1 + ··· + X_n)⁴ is the sum of terms X_i⁴, X_i²X_j², X_i³X_j, X_i²X_jX_k, and X_iX_jX_kX_l, where i, j, k, and l are different. By the independence of X_1, ..., X_n and the assumption E(X_j) = 0, we have E(X_i³X_j) = E(X_i³)E(X_j) = 0. Similarly, E(X_i²X_jX_k) = 0 and E(X_iX_jX_kX_l) = 0. There are n terms E(X_i⁴) and (n(n−1)/2) × (4!/(2!2!)) = 3n(n−1) terms E(X_i²X_j²).
Therefore
   E[ (X_1 + ··· + X_n)⁴ ] = nE(X_1⁴) + 3n(n−1)E²(X_1²),
showing that E[(X_1 + ··· + X_n)⁴] is bounded by cn² for some constant c. This implies
   ∑_{n=1}^∞ E[ (X_1 + ··· + X_n)⁴ ]/n⁴ ≤ c ∑_{n=1}^∞ 1/n² < ∞.
Next, by the result of Problem 9.14, (1/n⁴)(X_1 + ··· + X_n)⁴ converges almost surely to 0, which implies that (1/n)(X_1 + ··· + X_n) converges almost surely to 0, as was to be proved.

9.19 The Kelly betting fraction suggests to stake 2.7% of your current bankroll each time.

9.20 Using the relation V_n = (1 − α + αR_1) ··· (1 − α + αR_n)V_0, the asymptotic growth rate is now given by
   E[ln(1 − α + αR_1)] = p ln(1 − α + αf_1) + (1 − p) ln(1 − α + αf_2).
Putting the derivative of this expression with respect to α equal to zero leads to the formula for α*. Next the formula for the asymptotic rate of return follows. Note: In the case that there is a rate of return r on the non-invested part of your bankroll, the expression for α* becomes
   α* = min( [ (pf_1 + (1−p)f_2)(1 + r) − (1 + r)² ] / [ (f_1 + f_2)(1 + r) − f_1f_2 − (1 + r)² ], 1 ).

9.21 Denote by V_k your bankroll after k bets. Then, by the same arguments as used in the derivation of the Kelly betting fraction, ln(V_n/V_0) = ∑_{i=1}^n ln(1 − α + αR_i), where the R_i are independent random variables with P(R_i = f_1) = p and P(R_i = f_2) = 1 − p. By the central limit theorem, ln(V_n/V_0) is approximately N(nµ_α, nσ_α²) distributed for n large enough, where
   µ_α = p ln(1 − α + f_1α) + (1 − p) ln(1 − α + f_2α),
   σ_α² = p ln²(1 − α + f_1α) + (1 − p) ln²(1 − α + f_2α) − µ_α².
Next it readily follows that, for large n,
   P(V_n > x) ≈ 1 − Φ( (ln(x/V_0) − nµ_α) / (σ_α√n) ).
For p = 0.5, f_1 = 1.8, f_2 = 0.4, n = 52, and α = 5/24, the normal approximation yields the values 0.697, 0.440, and 0.150 for x/V_0 = 1, 2, and 5. A simulation study with 1 million runs yields the values 0.660, 0.446, and 0.167. For α = 1, the probability is about 0.5 that your bankroll will be no more than 1.8²⁶ × 0.4²⁶ × 10,000 = 1.95 dollars after 52 weeks.
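The normal-approximation values 0.697, 0.440, and 0.150 quoted in Problem 9.21 are easy to reproduce with a short script (a sketch in Python; the variable names are mine, and Φ is evaluated via the standard-library error function):

```python
from math import log, sqrt, erf

def norm_cdf(z):
    # standard normal distribution function Phi(z)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p, f1, f2, n, alpha = 0.5, 1.8, 0.4, 52, 5 / 24

# per-bet mean and variance of ln(1 - alpha + alpha*R)
mu = p * log(1 - alpha + f1 * alpha) + (1 - p) * log(1 - alpha + f2 * alpha)
var = (p * log(1 - alpha + f1 * alpha) ** 2
       + (1 - p) * log(1 - alpha + f2 * alpha) ** 2 - mu ** 2)
sigma = sqrt(var)

# P(V_n > x*V_0) ~= 1 - Phi((ln x - n*mu) / (sigma*sqrt(n)))
probs = [1 - norm_cdf((log(x) - n * mu) / (sigma * sqrt(n))) for x in (1, 2, 5)]
print([round(q, 3) for q in probs])  # [0.697, 0.44, 0.15]
```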
The intuitive explanation is that in the most likely scenario a path will unfold in which the stock price rises during half of the time and falls during the other half of the time.

9.22 Let the random variable S_t be equal to 1 if the shuttle is on the way from the hotel to the airport and be equal to 2 if the shuttle is on the way from the airport to the hotel. The stochastic process {S_t, t ≥ 0} is regenerative. The epochs at which the shuttle departs from the hotel can be taken as the regeneration epochs. A cycle consists of a trip from the hotel to the airport directly followed by a trip from the airport to the hotel. Imagine that a reward at rate 1 is earned when the shuttle is on the way from the hotel to the airport. If d is the distance in miles between the airport and the hotel, then
   E(reward in one cycle) = (1/2) × d/30 + (1/2) × d/50,
   E(length of a cycle) = (1/2) × d/30 + (1/2) × d/50 + (1/20) ∫_{30}^{50} (d/s) ds.
Thus, with probability 1, the long-run proportion of the shuttle's operating time (excluding any time needed to pick up passengers) that is spent going to the airport is
   [ (1/2)(d/30) + (1/2)(d/50) ] / [ (1/2)(d/30) + (1/2)(d/50) + (1/20)∫_{30}^{50} (d/s) ds ] = 0.5108.

9.23 Let the random variable S_t be the age of the bulb in use at time t. The stochastic process {S_t} describing the age of the light bulb in use is regenerative. It regenerates itself each time a new bulb is installed. Let the generic random variable X be distributed as the lifetime of a bulb. Let F(x) = P(X ≤ x) be the probability distribution function of the lifetime X of a bulb and f(x) be its probability density. Then,
   E(length of a cycle) = ∫_0^T t f(t) dt + T ∫_T^∞ f(t) dt = ∫_0^T (1 − F(t)) dt,
   E(cost incurred in one cycle) = c_2 P(X ≤ T) + c_1 P(X > T) = c_1 + (c_2 − c_1)F(T).
Thus, with probability 1, the long-run average cost per unit time is
   [ c_1 + (c_2 − c_1)F(T) ] / ∫_0^T (1 − F(t)) dt.

9.24 Let the random variable S_t be the number of orders awaiting processing at time t.
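The ratio 0.5108 in Problem 9.22 can be cross-checked numerically; the integral evaluates in closed form to d·ln(50/30), and the distance d cancels (a small sketch):

```python
from math import log

d = 1.0  # the distance d cancels out of the ratio, so any value works
reward = 0.5 * d / 30 + 0.5 * d / 50           # expected hotel -> airport travel time
cycle = reward + (1 / 20) * d * log(50 / 30)   # plus expected return time, speed uniform on (30, 50)
print(round(reward / cycle, 4))  # 0.5108
```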
The stochastic process {S_t, t ≥ 0} is regenerative. It regenerates itself each time N orders have accumulated. The expected length of one cycle is the expected value of the sum of N interarrival times and thus is equal to Nη. The expected amount of time the first order arriving in a cycle has to wait until processing begins is (N−1)η, the expected waiting time of the second order arriving in a cycle is (N−2)η, and so on. Hence the total expected cost in one cycle is
   K + h[(N−1)η + (N−2)η + ··· + η] = K + (1/2)hN(N−1)η.
Hence, with probability 1, the long-run average cost per unit time is
   E(cost in one cycle) / E(length of one cycle) = K/(Nη) + (1/2)h(N−1).
This cost function is minimal for one of the two integers nearest to √(2K/(hη)).

9.25 Let S_t be equal to 1 if the channel is on at time t and be equal to 0 otherwise. The stochastic process {S_t} is regenerative. Take the epochs at which an on-time starts as the regeneration epochs. Let µ_on be the expected length of the on-time X. Then,
   µ_on = ∫_0^1 x · 6x(1−x) dx = 0.5.
By the law of conditional expectation,
   E(length of a cycle) = ∫_0^1 E(L | X = x) f(x) dx = ∫_0^1 (x + x²√x) 6x(1−x) dx = µ_on + µ_off,
where µ_off = ∫_0^1 x²√x · 6x(1−x) dx = 0.2424. By the same arguments as in Example 9.9, it now follows that the long-run fraction of time the system is on equals
   µ_on / (µ_on + µ_off) = 0.5 / (0.5 + 0.2424) = 0.673.

9.26 Let the random variable S_t be equal to 0 if the system is out of stock at time t and be equal to 1 otherwise. The continuous-time stochastic process {S_t, t ≥ 0} is regenerative. The epochs at which the stock drops to zero can be taken as regeneration epochs. A cycle starts each time the stock on hand drops to zero. The system is out of stock during the time elapsed from the beginning of a cycle until the next inventory replenishment. This amount of time is exponentially distributed with mean 1/µ. The expected amount of time it takes to go from stock level Q to 0 equals Q/λ.
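The two integrals in Problem 9.25 have the closed form ∫_0^1 x^a · 6x(1−x) dx = 6(1/(a+2) − 1/(a+3)), which gives a quick check of the numbers 0.5, 0.2424, and 0.673 (the helper name is mine):

```python
def moment(a):
    # integral over (0,1) of x**a * 6x(1-x) dx, for the density 6x(1-x)
    return 6.0 * (1.0 / (a + 2) - 1.0 / (a + 3))

mu_on = moment(1)      # expected on-time
mu_off = moment(2.5)   # expected off-time, since x**2 * sqrt(x) = x**2.5
print(round(mu_on, 4), round(mu_off, 4), round(mu_on / (mu_on + mu_off), 3))  # 0.5 0.2424 0.673
```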
Hence, with probability 1,
   the long-run fraction of time the system is out of stock = (1/µ) / (1/µ + Q/λ).
To find the fraction of demand that is lost, let the random variable I_n be equal to 0 if the system runs out of stock at the nth demand epoch and be equal to 1 otherwise. The discrete-time stochastic process {I_n, n = 1, 2, ...} is regenerative. It regenerates itself each time a demand occurs and the stock drops to zero. In the discrete case a cycle is to be interpreted as the number of demand epochs between two demand epochs at which the stock drops to 0. The expected value of the number of demands lost in one cycle equals
   λ × E(amount of time the system is out of stock during one cycle) = λ/µ.
The expected number of demands occurring in one cycle is λ/µ + Q. Hence, with probability 1,
   the long-run fraction of demand that is lost = (λ/µ) / (λ/µ + Q).
It now follows that the long-run fraction of customers finding the system out of stock is equal to the long-run fraction of time the system is out of stock. This is a particular instance of the property “Poisson arrivals see time averages.”

9.27 The process describing the status of the processor is regenerative. Take as cycle the time interval between two successive epochs at which an arriving job finds the processor idle. Using the memoryless property of the Poisson process, the expected length of a cycle is µ + 1/λ and the expected amount of idle time in one cycle is 1/λ. Thus the long-run fraction of time the server is idle is
   (1/λ) / (µ + 1/λ) = 1 / (1 + λµ).
Let N be the number of jobs arriving during the processing time X. Then E(N | X = x) = λx and so E(N) = λµ. Thus the expected number of arrivals during one cycle is 1 + λµ. The number of jobs accepted in one cycle is 1, and so the long-run fraction of jobs that are accepted is
   1 / (1 + λµ).

9.28 Let the random variable S_t denote the number of non-failed units at time t. The stochastic process {S_t, t ≥ 0} is regenerative. The regeneration epochs are the inspection epochs.
A cycle is the time interval between two inspections. The expected length of a cycle is T. The expected amount of time the system is down during one cycle is E[max(T − X_1 − X_2, 0)], where X_1 and X_2 are independent random variables having an exponential distribution with parameter α. The density of X_1 + X_2 is the Erlang-2 density α²te^{−αt}. Thus
   E[max(T − X_1 − X_2, 0)] = ∫_0^T (T − t) α²te^{−αt} dt
      = T(1 − e^{−αT} − αTe^{−αT}) − (2/α)[1 − e^{−αT} − αTe^{−αT} − ((αT)²/2)e^{−αT}]
      = T − 2/α + e^{−αT}(T + 2/α).
Hence, with probability 1, the long-run fraction of time the system is down equals
   [ T − 2/α + e^{−αT}(T + 2/α) ] / T.

9.29 Let a cycle be the time interval between two successive replacements of the bulb. Denote by the generic variable X the length of a cycle. Imagine that a cost at rate 1 is incurred if the age of the bulb is larger than c and a cost at rate 0 is incurred otherwise. Then the cost incurred during one cycle is max(X − c, 0). The long-run average cost per unit time is
   E[max(X − c, 0)] / E(X).
To evaluate E[max(X − c, 0)], we use the result that E(V) = ∫_0^∞ P(V > v) dv for any nonnegative random variable V, see Problem 4.26. This gives
   E[max(X − c, 0)] = ∫_0^∞ P(max(X − c, 0) > v) dv = ∫_0^∞ (1 − F(v + c)) dv = ∫_c^∞ (1 − F(x)) dx,
which verifies the desired result.

9.30 Let a cycle be the time interval between two successive replacements of the unit. Denote by the generic variable X the length of a cycle. The expected length of one cycle is E(X). Under the condition that X = x the cost incurred in the cycle is x²/2, and so the expected cost incurred in one cycle is E(X²)/2. Thus the long-run average cost per unit time is given by
   E(X²) / (2E(X)).

9.31 Define a cycle as the time elapsed between two consecutive replacements of the item.
By conditioning on the lifetime X of the item, we have that the expected length of a cycle is
   ∫_0^T x f(x) dx + ∫_T^∞ (T + a(x)) f(x) dx,
where a(x) = E[min(x − T, V)] for x > T and V is the length of the interval between T and the first preventive replacement opportunity after time T. Since V is exponentially distributed with parameter λ, we have
   a(x) = ∫_0^{x−T} vλe^{−λv} dv + (x − T)e^{−λ(x−T)} = (1/λ)[1 − e^{−λ(x−T)}].
Noting that P(V ≤ x − T) = 1 − e^{−λ(x−T)} for x > T, we have that the expected cost incurred in one cycle is
   c_0 F(T) + ∫_T^∞ [ c_1(1 − e^{−λ(x−T)}) + c_0 e^{−λ(x−T)} ] f(x) dx,
where F(x) = P(X ≤ x). The long-run average cost per unit time is
   { c_0 F(T) + ∫_T^∞ [ c_1(1 − e^{−λ(x−T)}) + c_0 e^{−λ(x−T)} ] f(x) dx } / { ∫_0^T x f(x) dx + ∫_T^∞ [ T + (1/λ)(1 − e^{−λ(x−T)}) ] f(x) dx }.

9.32 Substituting M(t) = t/µ − ((1−p)/(2−p))²(1 − e^{−λ(2−p)t}) into the right-hand side of the renewal equation for M(t), we get that the right-hand side is given by
   1 − e^{−λt} − (1−p)λte^{−λt} + ∫_0^t [ (t−x)/µ − ((1−p)/(2−p))²(1 − e^{−λ(2−p)(t−x)}) ] [ pλe^{−λx} + (1−p)λ²xe^{−λx} ] dx.
It is a matter of some algebra to show that this expression is equal to
   t/µ − ((1−p)/(2−p))²(1 − e^{−λ(2−p)t}).

9.33 For fixed clearing time T > 0, the stochastic process describing the number of messages in the buffer regenerates itself each time the buffer is cleared. The clearing epochs are regeneration epochs because of the memoryless property of the Poisson process. The expected length of a cycle is T. The expected cost incurred in the first cycle is
   K + E[ ∑_{n=1}^∞ h(S_n) ] = K + ∑_{n=1}^∞ E[h(S_n)],
where h(S_n) denotes the holding cost h(T − S_n) incurred for the message arriving at epoch S_n ≤ T. Since E[h(S_n)] = h ∫_0^T (T − x) (λⁿxⁿ⁻¹/(n−1)!) e^{−λx} dx, we get
   ∑_{n=1}^∞ E[h(S_n)] = h ∫_0^T (T − x) λ dx = (1/2)hλT².
Thus the average cost per unit time is (K + ½hλT²)/T. This expression is minimal for T = √(2K/(hλ)).

9.34 Using Rule 5.10, we have that
   E(time that i messages are present during one cycle)
      = ∑_{k=i}^∞ (T/(k+1)) e^{−λT}(λT)^k/k! = (1/λ) ∑_{j=i+1}^∞ e^{−λT}(λT)^j/j! = (1/λ)[ 1 − ∑_{j=0}^{i} e^{−λT}(λT)^j/j! ].
The ratio of this expression and the cycle length T gives the sought result.

9.35 In view of the interpretation of an Erlang distributed interoccurrence time as the sum of r independent phases, imagine that phases are completed according to a Poisson process with rate α, where the completion of each rth phase marks the occurrence of an event. Then P(N(t) ≤ k) can be interpreted as the probability that at most (k+1)r − 1 phases are completed up to time t.

9.36 By conditioning on the epoch X_1 of the first renewal and using the law of conditional expectation, we have that P(Y(t) > u) is equal to
   ∫_0^t P(Y(t) > u | X_1 = x) f(x) dx + ∫_t^{t+u} P(Y(t) > u | X_1 = x) f(x) dx + ∫_{t+u}^∞ P(Y(t) > u | X_1 = x) f(x) dx.
Next note that the conditional distribution of Y(t) given that X_1 = x with 0 ≤ x ≤ t is the same as the unconditional distribution of Y(t−x). Also, we have that Y(t) > u if X_1 > t + u and Y(t) ≤ u if t < X_1 ≤ t + u. Thus we get
   P(Y(t) > u) = 1 − F(t + u) + ∫_0^t P(Y(t − x) > u) f(x) dx,
which verifies the equation for Q_t(u). If f(x) = λe^{−λx}, then substitution of P(Y(t) > u) = e^{−λu} in 1 − F(t + u) + ∫_0^t P(Y(t − x) > u) f(x) dx gives
   1 − F(t + u) + ∫_0^t e^{−λu} λe^{−λx} dx = e^{−λ(t+u)} + e^{−λu}(1 − e^{−λt}) = e^{−λu}.
Since the renewal equation for Q_t(u) has a unique solution, we have verified that, for any t > 0, P(Y(t) > u) = e^{−λu} for u > 0 if the interoccurrence times are exponentially distributed with parameter λ. This is the memoryless property of the Poisson process.

9.37 Fix s ≥ 1. Imagine that a reward of 1 is earned for each customer belonging to a batch of size s. Then the expected reward earned for a batch is sp_s. The expected number of customers in a batch is µ. Thus the average reward per customer is sp_s/µ, which intuitively explains the result.

Chapter 10

10.1 Let X_n be the number of type-1 particles in compartment A after the nth transfer.
The process {X_n} is a Markov chain with state space I = {0, 1, ..., r}. The one-step transition probabilities are
   p_{i,i−1} = i²/r²,   p_{ii} = 2i(r−i)/r²,   p_{i,i+1} = (r−i)²/r²   for i = 0, 1, ..., r,
and p_{ij} = 0 otherwise.

10.2 Let state 1 correspond to the situation that the professor is driving to the office and has his driver's license with him, state 2 to the situation that the professor is driving to his office and has his driver's license at home, state 3 to the situation that the professor is driving to the office and has his driver's license at the office, state 4 to the situation that the professor is driving to home and has his driver's license with him, state 5 to the situation that the professor is driving to his home and has his driver's license at the office, and state 6 to the situation that the professor is driving to his home and has his driver's license at home. Denoting by X_n the state at the nth drive of the professor, the process {X_n} is a Markov chain with state space I = {1, 2, ..., 6}. The matrix of one-step transition probabilities of the Markov chain is given by

   from\to    1     2     3     4     5     6
      1       0     0     0    0.5   0.5    0
      2       0     0     0     0     0     1
      3       0     0     0    0.5   0.5    0
      4      0.75  0.25   0     0     0     0
      5       0     0     1     0     0     0
      6      0.75  0.25   0     0     0     0

10.3 Take as state the largest outcome in the last roll. Let X_n be the state after the nth roll with the convention X_0 = 0. The process {X_n} is a Markov chain with state space I = {0, 1, ..., 6}. The one-step transition probabilities are p_{0k} = 1/6 for 1 ≤ k ≤ 6 and
   p_{jk} = (k/6)^j − ((k−1)/6)^j   for j, k = 1, ..., 6,
using the relation P(Y = k) = P(Y ≤ k) − P(Y ≤ k−1).

10.4 Let state XY correspond to the situation that yesterday's weather is of type X and today's weather is of type Y, where X and Y can take on the values S (sunny) and R (rainy). Denote by X_n the state of the weather at the nth day; then {X_n} is a Markov chain with state space I = {SS, SR, RS, RR}.
The one-step transition probabilities of the Markov chain are given by

   from\to   SS    SR    RS    RR
     SS      0.9   0.1    0     0
     SR       0     0    0.5   0.5
     RS      0.7   0.3    0     0
     RR       0     0    0.45  0.55

10.5 Let's say that the system is in state (0,0) if both machines are good, in state (0,k) if one of the machines is good and the other one is in revision with a remaining repair time of k days for k = 1, 2, and in state (1,2) if both machines are in revision with remaining repair times of one day and two days. Defining X_n as the state of the system at the end of the nth day, the process {X_n} is a Markov chain. The one-step transition probabilities are given by
   p_{(0,0)(0,0)} = 9/10,  p_{(0,0)(0,2)} = 1/10,  p_{(0,1)(0,0)} = 9/10,  p_{(0,1)(0,2)} = 1/10,
   p_{(0,2)(0,1)} = 9/10,  p_{(0,2)(1,2)} = 1/10,  p_{(1,2)(0,1)} = 1,
and p_{vw} = 0 otherwise.

10.6 A circuit board is said to have status 0 if it has failed and is said to have status i if it functions and has the age of i weeks. Let's say that the system is in state (i,j) with 0 ≤ i ≤ j ≤ 6 if one of the circuit boards has status i and the other one has status j just before any replacement. This state description requires (7 choose 2) + 7 = 28 states rather than 7² = 49 states when a separate state variable would have been used for each circuit board. Denote by X_n the state of the system at the end of the nth week. Then the process {X_n} is a Markov chain. The one-step transition probabilities can be expressed in terms of the failure probabilities r_i. For states (i,j) with i ≠ j and 0 ≤ i < j ≤ 5, we have
   p_{(i,j),(i+1,j+1)} = (1 − r_i)(1 − r_j),   p_{(i,j),(0,i+1)} = (1 − r_i)r_j,
   p_{(i,j),(0,j+1)} = r_i(1 − r_j),   p_{(i,j),(0,0)} = r_ir_j,
and p_{(i,j),(v,w)} = 0 otherwise. For states (i,i) with 0 ≤ i ≤ 5, we have p_{(i,i),(i+1,i+1)} = (1 − r_i)², p_{(i,i),(0,i+1)} = 2r_i(1 − r_i), and p_{(i,i),(0,0)} = r_i². Further, p_{(i,6),(i+1,1)} = 1 − r_i and p_{(i,6),(0,1)} = r_i for 0 ≤ i ≤ 5. Further, p_{(6,6),(1,1)} = 1.
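The state-space count claimed in Problem 10.6 — 28 unordered pairs versus 49 ordered pairs of statuses — can be checked directly with the standard library (a small sketch):

```python
from itertools import combinations_with_replacement, product

statuses = range(7)  # statuses 0 (failed) through 6
unordered = list(combinations_with_replacement(statuses, 2))  # states (i, j) with i <= j
ordered = list(product(statuses, repeat=2))                   # one state variable per board
print(len(unordered), len(ordered))  # 28 49
```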
10.7 Let's say that the system is in state i if the channel holds i messages (including any message in transmission). If the system is in state i at the beginning of a time slot, then the buffer contains max(i − 1, 0) messages. Define X_n as the state of the system at the beginning of the nth time slot. The process {X_n} is a Markov chain with state space I = {0, 1, ..., K+1}. In a similar way as in Example 10.4, the one-step transition probabilities are obtained. Letting a_k = e^{−λ}λ^k/k! for k = 0, 1, ..., the one-step transition probabilities can be expressed as
   p_{0j} = a_j for 0 ≤ j ≤ K−1,   p_{0,K} = ∑_{k=K}^∞ a_k,   p_{K+1,K} = 1 − f,   p_{K+1,K+1} = f,
   p_{ij} = (1 − f)a_{j−i+1} + f a_{j−i} for 1 ≤ i ≤ j ≤ K,   p_{i,i−1} = (1 − f)a_0 for 1 ≤ i ≤ K,
   p_{i,K+1} = 1 − ∑_{j=i−1}^{K} p_{ij} for 1 ≤ i ≤ K,
and p_{ij} = 0 otherwise.

10.8 Let's follow the location of a particular car. Let X_n denote the location of the car after its nth return; then the process {X_n} is a Markov chain whose matrix P of one-step transition probabilities is given by

   from\to    1     2     3     4
      1      0.8   0.1    0    0.1
      2      0.1   0.7   0.2    0
      3      0.2   0.1   0.5   0.2
      4       0    0.2   0.1   0.7

If the car is currently at location 3, it will be back at location 3 after being rented out five times with probability p_{33}^{(5)} = 0.1677. This probability is obtained from the matrix product

   P⁵ = [ 0.4185  0.2677  0.1151  0.1987
          0.3089  0.3305  0.1904  0.1702
          0.3008  0.2860  0.1677  0.2455
          0.1890  0.3328  0.1985  0.2798 ]

It appears experimentally that the matrix product Pⁿ converges to a matrix with identical rows as n gets large (the Markov chain is aperiodic and has no two disjoint closed sets). We find that

   P²⁵ = [ 0.3165  0.3038  0.1645  0.2152
           0.3165  0.3038  0.1645  0.2152
           0.3165  0.3038  0.1645  0.2152
           0.3165  0.3038  0.1645  0.2152 ] = P²⁶ = P²⁷ = ···.

The long-run frequency at which the car is returned to location i has the values 0.3165, 0.3038, 0.1645, and 0.2152 for i = 1, 2, 3, and 4.

10.9 Use a Markov chain with four states SS, SR, RS, and RR.
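The matrix powers in Problem 10.8 are easy to reproduce; the following sketch uses plain Python lists (no linear-algebra library assumed), with the transition matrix from that solution:

```python
def mat_mul(A, B):
    # product of two square matrices given as lists of rows
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_pow(A, m):
    R = A
    for _ in range(m - 1):
        R = mat_mul(R, A)
    return R

P = [[0.8, 0.1, 0.0, 0.1],
     [0.1, 0.7, 0.2, 0.0],
     [0.2, 0.1, 0.5, 0.2],
     [0.0, 0.2, 0.1, 0.7]]

P5 = mat_pow(P, 5)
print(round(P5[2][2], 4))  # probability of being back at location 3 after 5 rentals: 0.1677
print([round(x, 4) for x in mat_pow(P, 25)[0]])  # limiting row: [0.3165, 0.3038, 0.1645, 0.2152]
```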
The matrix P of one-step transition probabilities is the one given in the solution of Problem 10.4. Computing matrix products gives

   P⁵ = [ 0.71780  0.09767  0.08787  0.09666
          0.61508  0.10236  0.13250  0.15007
          0.68371  0.09972  0.10236  0.11422
          0.60893  0.10280  0.13506  0.15321 ]

and

   P³⁰ = [ 0.69231  0.09890  0.09890  0.10989
           0.69231  0.09890  0.09890  0.10989
           0.69231  0.09890  0.09890  0.10989
           0.69231  0.09890  0.09890  0.10989 ] = P³¹ = ···.

The matrix P⁵ gives p_{RR,RS}^{(5)} + p_{RR,SS}^{(5)} = 0.13506 + 0.60893 = 0.7440. The matrix P³⁰ gives that the long-run probability is 0.6923 + 0.0989 = 0.7912. The expected value of the number of sunny days in the coming 14 days is
   ∑_{t=1}^{14} ( p_{RR,RS}^{(t)} + p_{RR,SS}^{(t)} ) = 10.18.

10.10 The formula for p_{00}^{(k)} is correct for k = 1:
   p_{00}^{(1)} = β/(α+β) + α(1−α−β)/(α+β) = 1 − α(α+β)/(α+β) = 1 − α = p_{00}.
Similarly, the formula for p_{11}^{(k)} is correct for k = 1. Suppose the formulas for p_{00}^{(k)} and p_{11}^{(k)} have been verified for k = 1, ..., n−1. Then, using the formula for p_{00}^{(n−1)} together with
   p_{10}^{(n−1)} = 1 − p_{11}^{(n−1)} = β/(α+β) − β(1−α−β)^{n−1}/(α+β),
we get
   p_{00}^{(n)} = (1−α)p_{00}^{(n−1)} + αp_{10}^{(n−1)}
      = (1−α)[ β/(α+β) + α(1−α−β)^{n−1}/(α+β) ] + α[ β/(α+β) − β(1−α−β)^{n−1}/(α+β) ]
      = β/(α+β) + [α(1−α) − αβ](1−α−β)^{n−1}/(α+β) = β/(α+β) + α(1−α−β)^n/(α+β).
Similarly, the formula for p_{11}^{(n)} is verified.

10.11 Consider a two-state Markov chain with states 0 and 1, where state 0 means that the last bit was received incorrectly and state 1 means that the last bit was received correctly. The one-step transition probabilities are given by p_{00} = 0.9, p_{01} = 0.1, p_{10} = 0.001, and p_{11} = 0.999. The expected number of incorrectly received bits is
   ∑_{n=1}^{5,000} p_{10}^{(n)} = 49.417.

10.12 It is sufficient to analyze the evolution of a single tree growing at a particular spot in the forest. The state of the system is 0 if the tree is a baby tree, is 1 if the tree is a young tree, is 2 if the tree is a middle-aged tree, and is 3 if the tree is an old tree. Let X_n be the state after 50n years. Then the process {X_n} is a Markov chain with state space I = {0, 1, 2, 3}.
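The sum in Problem 10.11 has a closed form via the result of Problem 10.10: with α = p_{01} = 0.1 and β = p_{10} = 0.001, one has p_{10}^{(n)} = (β/(α+β))(1 − (1−α−β)^n). A quick check of the value 49.417 (variable names are mine):

```python
alpha, beta = 0.1, 0.001  # P(incorrect -> correct) and P(correct -> incorrect)

# expected number of incorrect bits among the next 5,000, starting from a correct bit
total = sum(beta / (alpha + beta) * (1 - (1 - alpha - beta) ** n)
            for n in range(1, 5001))
print(round(total, 3))  # 49.417
```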
The matrix P of one-step transition probabilities is given by

   from\to    0     1     2     3
      0      0.2   0.8    0     0
      1      0.05   0    0.95   0
      2      0.1    0     0    0.9
      3      0.25   0     0    0.75

From the matrix product P⁵ the probabilities a_j = p_{0j}^{(5)} are obtained for j = 0, 1, 2, 3. The values of the probabilities a_j are a_0 = 0.2144, a_1 = 0.1675, a_2 = 0.0760, and a_3 = 0.5421. It appears experimentally that the matrix product Pⁿ converges to a matrix with identical rows as n gets large (the Markov chain is aperiodic and has no two disjoint closed sets). Thus we calculate the probabilities π_j for j = 0, 1, 2, 3 as the row elements of the matrix product Pⁿ for sufficiently large values of n (n = 20 is large enough for convergence in four decimals). The values of the probabilities π_j are π_0 = 0.1888, π_1 = 0.1511, π_2 = 0.1435, and π_3 = 0.5166. The age distribution of the forest after 50 years is a multinomial distribution with parameters (10,000, a_0, a_1, a_2, a_3) and the age distribution of the forest in equilibrium is a multinomial distribution with parameters (10,000, π_0, π_1, π_2, π_3).

10.13 Using
   var( ∑_{t=1}^n I_t ) = ∑_{t=1}^n var(I_t) + 2 ∑_{t=1}^{n−1} ∑_{u=t+1}^n cov(I_t, I_u),
we get the result for σ²[V_{ij}(n)]. The normal approximation to the sought probability is
   1 − Φ( (240.5 − 217.294)/12.101 ) = 0.0276.
By simulation, we found the value 0.0267.

10.14 Use a Markov chain with states 1, 2, and 3, where the three states correspond to the situation that the last match was a win, a loss, and a draw for England. The matrix of one-step transition probabilities is

   from\to    1     2     3
      1      0.44  0.37  0.19
      2      0.28  0.43  0.29
      3      0.27  0.30  0.43

The expected number of wins for England in the next three matches given that the last match was a draw is
   ∑_{k=1}^{3} p_{31}^{(k)} = 0.9167.

10.15 Take a Markov chain with the states s = (i,k), where i = 1 if England has won the last match, i = 2 if England has lost the last match, i = 3 if the last match was a draw, and k ∈ {0, 1, 2, 3} denotes the number of matches England has won so far.
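The value 0.9167 in Problem 10.14 can be verified by pushing the distribution vector forward three steps and accumulating the probability of a win at each step (a sketch):

```python
P = [[0.44, 0.37, 0.19],
     [0.28, 0.43, 0.29],
     [0.27, 0.30, 0.43]]

dist = [0.0, 0.0, 1.0]  # start in state 3: the last match was a draw
expected_wins = 0.0
for _ in range(3):
    dist = [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]
    expected_wins += dist[0]  # probability that this match is a win
print(round(expected_wins, 4))  # 0.9167
```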
For states s = (i,k) with k = 0, the one-step transition probabilities are p_{(1,0)(1,1)} = 0.44, p_{(1,0)(2,0)} = 0.37, p_{(1,0)(3,0)} = 0.19, p_{(2,0)(1,1)} = 0.28, p_{(2,0)(2,0)} = 0.43, p_{(2,0)(3,0)} = 0.29, p_{(3,0)(1,1)} = 0.27, p_{(3,0)(2,0)} = 0.30, and p_{(3,0)(3,0)} = 0.43. Similarly for the one-step transition probabilities for the states (i,1) and (i,2). The one-step transition probabilities for the states (i,3) are not relevant and may be taken as p_{(i,3)(i,3)} = 1. Let p_k denote the probability that England will win k matches of the next three matches when the last match was a draw. Then,
   p_k = p_{(3,0)(1,k)}^{(3)} + p_{(3,0)(2,k)}^{(3)} + p_{(3,0)(3,k)}^{(3)}   for 0 ≤ k ≤ 3
(verify that this formula uses the fact that the second component of state s = (i,k) cannot decrease). This leads to p_0 = 0.3842, p_1 = 0.3671, p_2 = 0.1964, and p_3 = 0.0523. As a sanity check, ∑_{k=0}^3 kp_k = 0.9167, in agreement with the answer to Problem 10.14.

10.16 Use a Markov chain with five states, where state 1 means deuce, state 2 means Bill has advantage, state 3 means Mark has advantage, state 4 means Bill is winner, and state 5 means Mark is winner. The states 4 and 5 are absorbing. The one-step transition probabilities are p_{12} = 0.55, p_{13} = 0.45, p_{21} = 0.40, p_{24} = 0.60, and p_{31} = p_{35} = 0.50. For starting state i, let f_i be the probability that Bill will be the winner of the game and µ_i be the expected duration of the game. The sought probability f_1 = 0.5946 follows by solving the linear equations
   f_1 = 0.55f_2 + 0.45f_3,   f_2 = 0.4f_1 + 0.6,   f_3 = 0.5f_1.
The sought expected value µ_1 = 3.604 follows by solving the linear equations
   µ_1 = 1 + 0.55µ_2 + 0.45µ_3,   µ_2 = 1 + 0.4µ_1,   µ_3 = 1 + 0.5µ_1.

10.17 Use a Markov chain with the 11 states 1, 2, ..., 10 and 10+, where state i means that the particle is in position i and state 10+ means that the particle is in a position beyond position 10. The state 10+ is taken to be absorbing. The one-step transition probabilities are p_{ij} = 0.5 for j = i − ⌊i/2⌋ and for j = 2i − ⌊i/2⌋.
The other p_{ij} are zero. The sought probability is p_{1,10+}^(25) = 0.4880.

10.18 Use a Markov chain with six states, where state i means that the jar contains i red balls. Take state 5 as an absorbing state, so p_{55} = 1. The other one-step transition probabilities are p_{01} = 1, p_{i,i−1} = i/5 and p_{i,i+1} = (5 − i)/5 for 1 ≤ i ≤ 4. The probability that more than n picks are needed is

    1 − p_{2,5}^(n).

This probability has the values 0.9040, 0.8139, 0.5341, 0.2840, and 0.0761 for n = 5, 10, 25, 50, and 100. The expected number of picks is µ_2 = 40.167. To obtain this result, define µ_i as the expected number of picks needed to reach state 5 from state i and solve the linear equations

    µ_i = 1 + (i/5) µ_{i−1} + ((5 − i)/5) µ_{i+1} for i = 0, 1, . . . , 4,

where µ_{−1} = µ_5 = 0.

10.19 Use a Markov chain with four states 0, 1, 2, and 3, where state 0 means neither a total of 7 nor a total of 12 for the last roll, state 1 means a total of 7 for the last roll but not for the roll before, state 2 means a total of 7 for the last two rolls, and state 3 means a total of 12 for the last roll. The states 2 and 3 are absorbing with p_{22} = p_{33} = 1. Further, p_{00} = p_{10} = 29/36, p_{01} = p_{12} = 6/36, p_{03} = p_{13} = 1/36, and p_{ij} = 0 otherwise. Let f_i be the probability of absorption in state 2 when the initial state is i. The sought probability f_0 is 6/13, as follows by solving the two linear equations

    f_0 = (29/36) f_0 + (6/36) f_1 and f_1 = (29/36) f_0 + 6/36.

10.20 (a) Consider a Markov chain with six states i = 0, 1, 2, 3, 4, and 5, where state i now corresponds to the situation that the last i tosses resulted in heads but not the toss preceding those i tosses. State 5 is absorbing. For i = 0, 1, 2, 3, 4, the one-step transition probabilities are p_{i0} = p_{i,i+1} = 0.5. The desired probability is p_{05}^(20) = 0.2499.
(b) Consider again a Markov chain with six states i = 0, 1, 2, 3, 4, 5, where state i corresponds to the situation that the last i tosses resulted in the same outcome but not the toss preceding those i tosses.
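The system of linear equations for the expected number of picks in Problem 10.18 can be solved mechanically (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Problem 10.18: solve mu_i = 1 + (i/5) mu_{i-1} + ((5-i)/5) mu_{i+1}
# for i = 0..4, with mu_5 = 0 (state 5 absorbing); unknowns mu_0..mu_4.
A = np.eye(5)
for i in range(5):
    if i >= 1:
        A[i, i - 1] -= i / 5   # coefficient of mu_{i-1}
    if i + 1 <= 4:
        A[i, i + 1] -= (5 - i) / 5   # coefficient of mu_{i+1}
mu = np.linalg.solve(A, np.ones(5))
print(round(mu[2], 3))  # 40.167
```

Note that the i = 0 equation reduces to mu_0 = 1 + mu_1, matching p_01 = 1.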
State 5 is absorbing, that is, p_{55} = 1. Then p_{01} = 1. For i = 1, 2, 3, 4, the one-step transition probabilities are p_{i1} = p_{i,i+1} = 0.5. The desired probability is p_{05}^(20) = 0.4584.
(c) Use a Markov chain with the six states (i, j) for i = 0, 1 and j = 0, 1, 2 and the three absorbing states a_k for k = 1, 2, 3, where state (i, j) means that player A has tossed i heads in a row and player B has tossed j heads in a row, state a_1 means that player A has won, state a_2 means that player B has won, and state a_3 means that there is a tie. The matrix P of one-step transition probabilities is easily determined. For example, p_{(1,0)(0,0)} = p_{(1,0)(0,1)} = 0.25 and p_{(1,0)a_1} = 0.5. The sought probabilities can be obtained by solving twice a system of six linear equations. However, we obtained these probabilities by calculating P^n for n sufficiently large. Player A wins with probability 0.7398, player B with probability 0.2125, and a tie occurs with probability 0.0477.

10.21 Take a specific city (say, Venice) and a particular number (say, 53). Consider a Markov chain with state space I = {0, 1, . . . , 182}, where state i indicates the number of draws since the particular number 53 appeared for the last time in the Venice lottery. The state 182 is taken as an absorbing state. The one-step transition probabilities for the other states of the Markov chain are

    p_{i0} = 5/90 and p_{i,i+1} = 85/90 for i = 0, 1, . . . , 181.

The probability that in the next 1,040 draws of the Venice lottery there is some window of 182 consecutive draws in which the number 53 does not appear can be calculated as p_{0,182}^(1,040). This probability has the value p = 0.00077541. The five winning numbers in a draw of the lottery are not independent of each other, but the dependence is weak enough to give a good approximation for the probability that in the next 1,040 drawings of the Venice lottery there is not some number that stays away during some window of 182 consecutive draws.
This probability is approximated by (1 − p)^90. The lottery takes place in 10 cities. Thus the sought probability is approximately equal to 1 − (1 − p)^900 = 0.5025. This problem is another illustration of the fact that coincidences can nearly always be explained by probabilistic arguments!

10.22 Let's say that the system is in state i if i different numbers have been drawn so far. Define the random variable X_n as the state of the system after the nth drawing. The process {X_n} is a Markov chain with state space I = {0, 1, . . . , 45}. State 45 is an absorbing state. The one-step transition probabilities are given by p_{06} = 1,

    p_{i,i+k} = C(45 − i, k) C(i, 6 − k) / C(45, 6)

for i = 0, 1, . . . , 44 and k = 0, 1, . . . , min(45 − i, 6), p_{45,45} = 1, and p_{ij} = 0 otherwise, with the convention C(m, n) = 0 for n > m. The probability that more than r drawings are needed to obtain all of the numbers 1, 2, . . . , 45 is equal to

    1 − p_{0,45}^(r).

This probability has the values 0.9989, 0.7409, 0.2643, and 0.035 for r = 15, 25, 35, and 50.

10.23 Take a Markov chain with state space I = {(i, j) : i, j ≥ 0, i + j ≤ 25}, where state (i, j) means that i pictures are in the pool exactly once and j pictures are in the pool twice or more. State (0, 25) is absorbing. The other one-step transition probabilities are

    p_{(i,j),(i,j)} = (j/25) × (j/25),
    p_{(i,j),(i+1,j)} = 2 × ((25 − i − j)/25) × (j/25),
    p_{(i,j),(i+2,j)} = ((25 − i − j)/25) × ((24 − i − j)/25),
    p_{(i,j),(i,j+1)} = (i/25) × ((25 − i − j)/25) + ((25 − i − j)/25) × ((i + 1)/25),
    p_{(i,j),(i−1,j+1)} = (i/25) × ((j + 1)/25) + (j/25) × (i/25),
    p_{(i,j),(i−2,j+2)} = (i/25) × ((i − 1)/25).

Then

    P(N > n) = 1 − p_{(0,0),(0,25)}^(n).

We have E(N) = 71.4 weeks. This value can be calculated from E(N) = Σ_{n=0}^{∞} P(N > n) or by solving a set of linear equations.

10.24 Let us analyze the game for the case that player A chooses HHT and player B responds with THH. We use a Markov chain with the 7 states labeled as 0, H, T, HH, TH, HHT, and THH. State 0 corresponds to the beginning of the game.
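The sticker-pool chain of Problem 10.23 is small enough to evaluate directly. Below is a numpy sketch (not part of the original solution) that builds the transition matrix from the probabilities listed above and accumulates E(N) = Σ P(N > n):

```python
import numpy as np

# Problem 10.23: states (i, j) with i + j <= 25; state (0, 25) absorbing.
states = [(i, j) for i in range(26) for j in range(26 - i)]
idx = {s: n for n, s in enumerate(states)}
P = np.zeros((len(states), len(states)))
for (i, j) in states:
    n0 = 25 - i - j  # pictures not yet in the pool
    moves = {
        (i, j): (j / 25) ** 2,
        (i + 1, j): 2 * (n0 / 25) * (j / 25),
        (i + 2, j): (n0 / 25) * ((n0 - 1) / 25),
        (i, j + 1): (i / 25) * (n0 / 25) + (n0 / 25) * ((i + 1) / 25),
        (i - 1, j + 1): (i / 25) * ((j + 1) / 25) + (j / 25) * (i / 25),
        (i - 2, j + 2): (i / 25) * ((i - 1) / 25),
    }
    for target, prob in moves.items():
        if prob > 0:
            P[idx[(i, j)], idx[target]] += prob

# E(N) = sum_{n>=0} P(N > n), iterating the distribution from state (0, 0).
v = np.zeros(len(states))
v[idx[(0, 0)]] = 1.0
EN = 0.0
for _ in range(3000):
    tail = 1.0 - v[idx[(0, 25)]]
    if tail < 1e-12:
        break
    EN += tail
    v = v @ P
print(round(EN, 1))  # ≈ 71.4 weeks
```

The row sums of P equal 1, which is a useful check that the six transition probabilities above exhaust all cases.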
State HH means (heads, heads) for the last two tosses and state TH means (tails, heads) for the last two tosses. State H means heads for the first toss of the game. The meaning of state T is more subtle: state T means that the last toss is tails (it is not necessary to use separate states TT and HT). The states HHT (win for player A) and THH (win for player B) are absorbing states. Let X_n be the state after the nth toss. Then the process {X_n} is a Markov chain. The matrix P of one-step transition probabilities is given by

    from\to    0     H     T     HH    TH   HHT   THH
       0       0    0.5   0.5    0     0     0     0
       H       0     0    0.5   0.5    0     0     0
       T       0     0    0.5    0    0.5    0     0
       HH      0     0     0    0.5    0    0.5    0
       TH      0     0    0.5    0     0     0    0.5
      HHT      0     0     0     0     0     1     0
      THH      0     0     0     0     0     0     1

Calculating P^n for several large values of n (or solving a system of linear equations), we get the value 0.75 for the win probability of player B. Similarly, the other win probabilities of player B can be calculated (the calculation of the win probability 0.875 for player B when A chooses HHH does not require a Markov chain: if player A chooses HHH, then player A can only win if the first three tosses are heads).

10.25 Take a Markov chain with state space I = {0, 1, . . . , 5}, where state i means that Joe's bankroll is i × 200 dollars. The states 0 and 5 are absorbing. The other one-step transition probabilities are p_{10} = p_{20} = p_{31} = p_{43} = 19/37, p_{12} = p_{24} = p_{35} = p_{45} = 18/37, and p_{ij} = 0 otherwise.
(a) The probability that Joe will place more than n bets is

    1 − p_{4,0}^(n) − p_{4,5}^(n).

This probability has the values 0.2637, 0.1283, 0.0320, and 0.0080 for n = 2, 3, 5, and 7.
(b) To find the probability of Joe reaching his goal, solve the four linear equations f_1 = (18/37) f_2, f_2 = (18/37) f_4, f_3 = 18/37 + (19/37) f_1, and f_4 = 18/37 + (19/37) f_3. The probability of Joe reaching his goal is f_4 = 0.78531.
(c) Let s_i be the expected value of the total amount you will stake in the remaining part of the game when the current state is i.
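Returning to Problem 10.24, the win probability 0.75 for player B can be checked by raising the matrix above to a large power (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Penney's game, A = HHT versus B = THH: states 0, H, T, HH, TH, HHT, THH.
labels = ["0", "H", "T", "HH", "TH", "HHT", "THH"]
P = np.array([
    # 0    H    T    HH   TH   HHT  THH
    [0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0],  # 0
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0],  # H
    [0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0],  # T
    [0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.0],  # HH
    [0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.5],  # TH
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # HHT (absorbing, A wins)
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # THH (absorbing, B wins)
])
pB = np.linalg.matrix_power(P, 200)[0, labels.index("THH")]
print(round(pB, 4))  # 0.75
```

After 200 steps the probability mass left in the transient states is negligible, so the entry read off is the absorption probability.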
To find s_4, solve the linear equations

    s_1 = 200 + (18/37) s_2,  s_2 = 400 + (18/37) s_4,
    s_3 = 400 + (19/37) s_1,  s_4 = 200 + (19/37) s_3.

This gives s_4 = 543.37 dollars. As a sanity check, the ratio of your expected loss and the expected amount staked during the game is indeed equal to the house advantage of 0.0270 dollar per dollar staked (the expected loss is 0.21469 × 800 − 0.78531 × 200 = 14.69 dollars).
Note: The maximal value of the probability of Joe reaching his goal is not achieved by the strategy of bold play; see also Problem 2.43. Using the optimization method of dynamic programming, the maximal value can be calculated as 0.7900 when no combined bets are made on the outcome of the wheel. Bold play is optimal in a primitive casino with only one type of roulette bet having wf ≤ 1, where w is the probability of getting back f times the stake and 1 − w is the probability of losing the stake (payoff odds f for 1). The proof of this result requires deep mathematics.

10.26 To answer the question about European roulette, we define the following Markov chain. Let's say that the system is in state 0 if the game begins or if the last spin showed a zero, and in state i if the last i spins of the wheel showed the same color but not the spin before those i spins, for i = 1, 2, . . . , 26. State 26 is taken as an absorbing state. Let X_n be the state of the system after the nth spin of the wheel. The process {X_n} is a Markov chain with one-step transition probabilities

    p_{00} = 1/37, p_{01} = 36/37,
    p_{i0} = 1/37, p_{i1} = p_{i,i+1} = 18/37 for i = 1, 2, . . . , 25,

p_{26,26} = 1, and p_{ij} = 0 otherwise. The probability that in the next n spins of the wheel the same color will come up 26 or more times in a row is given by p_{0,26}^(n). This probability has the value 0.0368 for n = 5,000,000 (and 0.0723 for n = 10,000,000). A minor adjustment of the one-step transition probabilities of the Markov chain is required for the question about American roulette.
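The very large matrix power in Problem 10.26 is cheap to compute, because repeated squaring needs only about log2(n) matrix multiplications of a 27 × 27 matrix. A numpy sketch (not part of the original solution):

```python
import numpy as np

# Problem 10.26, European roulette: states 0..26, state 26 absorbing.
P = np.zeros((27, 27))
P[0, 0], P[0, 1] = 1 / 37, 36 / 37
for i in range(1, 26):
    P[i, 0] = 1 / 37        # zero comes up
    P[i, 1] += 18 / 37      # the other color starts a new run of length 1
    P[i, i + 1] += 18 / 37  # the same color extends the run
P[26, 26] = 1.0

# matrix_power uses repeated squaring, so n = 5,000,000 is no problem.
p = np.linalg.matrix_power(P, 5_000_000)[0, 26]
print(round(p, 4))  # ≈ 0.0368
```

Replacing the exponent by 10,000,000 reproduces the second value quoted above.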
Then, the probability that in the next n spins of the wheel one of the numbers 1 to 36 will come up 6 or more times in a row is 0.0565 for n = 5,000,000 (and 0.1099 for n = 10,000,000).

10.27 Use a Markov chain with the states i = 0, 1, . . . , 8, where state i means that the dragon has i heads. The states 0, 7, and 8 are absorbing. The one-step transition probabilities are p_{i,i−1} = 0.7, p_{i,i+1} = 0.3p, and p_{i,i+2} = 0.3(1 − p) for 1 ≤ i ≤ 6. The win probabilities for p = 0, 0.5, and 1 are the probabilities 0.6748, 0.8255, and 0.9688 of absorption in state 0 starting from state 3.

10.28 (a) For i = 1, 2, . . . , 6, let state i mean that the outcomes of the last i rolls are different but not the outcomes of the last i + 1 rolls. The auxiliary state 0 means that the first roll is still to occur. State 6 is taken as an absorbing state. Let X_n be the state after the nth roll. The one-step transition probabilities of the absorbing Markov chain {X_n} are p_{01} = 1 and, for 1 ≤ i ≤ 5, p_{ik} = 1/6 for k = 1, . . . , i and p_{i,i+1} = (6 − i)/6. Raising the matrix P of one-step transition probabilities to the power 100, it follows that the probability of getting a run of six different outcomes within 100 rolls is given by p_{0,6}^(100) = 0.7054. To find the expected value of the number of rolls until a run of six different outcomes occurs, define µ_i as the expected number of rolls until such a run occurs given that the process starts in state i. Then

    µ_0 = 1 + µ_1,  µ_i = 1 + (1/6) Σ_{k=1}^{i} µ_k + ((6 − i)/6) µ_{i+1} for i = 1, 2, . . . , 5,

where µ_6 = 0. The solution of this system of six linear equations gives the desired value µ_0 = 83.20.
(b) Let state 1 mean that the last roll gave the outcome 1, state 2 that the last two rolls gave the run 12, state 3 that the last three rolls gave the run 123, state 4 that the last four rolls gave the run 1234, state 5 that the last five rolls gave the run 12345, and state 6 that the last six rolls gave the run 123456. The auxiliary state 0 refers to the situation that none of these six states applies.
State 6 is taken as an absorbing state. Let X_n be the state after the nth roll. The one-step transition probabilities of the absorbing Markov chain {X_n} are

    p_{00} = 5/6, p_{01} = 1/6, and p_{i0} = 4/6, p_{i1} = p_{i,i+1} = 1/6 for 1 ≤ i ≤ 5.

The probability of getting the run 123456 within 100 rolls of the die is obtained by raising the matrix P of one-step transition probabilities to the power 100. This gives p_{0,6}^(100) = 0.00203. By setting up a similar system of linear equations as in (a), it follows that the expected number of rolls until such a run occurs is 46,656.
Note: The value 0.00203 for the probability of getting the run 123456 within 100 rolls of the die can also be obtained by the Poisson heuristic discussed in Section 3.7.1. Consider 95 trials i = 1, 2, . . . , 95, where trial i is said to be successful if the rolls i, i + 1, . . . , i + 5 give the successive outcomes 1, 2, . . . , 6. The success probability of each trial is (1/6)^6. The trials are not independent, but the dependence is quite weak. This justifies approximating the number of successes by a Poisson distribution with expected value λ = 95 × (1/6)^6. Thus the probability of getting the run 123456 within 100 rolls of the die is approximately equal to 1 − e^{−λ} = 0.00203.

10.29 Take a Markov chain with six states 0, 1, . . . , 5, where state 0 corresponds to the start of the game, state 1 means that all five dice show a different value, and state i with i ≥ 2 means that you have i dice of a kind. State 5 is an absorbing state. In state 1 you re-roll all five dice and in state i with i ≥ 2 you leave the i dice of a kind and re-roll the other 5 − i dice. The matrix of one-step transition probabilities is

    from\to    0      1          2          3          4         5
       0       0   120/1296   900/1296   250/1296   25/1296   1/1296
       1       0   120/1296   900/1296   250/1296   25/1296   1/1296
       2       0      0        120/216     80/216    15/216    1/216
       3       0      0           0         25/36     10/36     1/36
       4       0      0           0           0         5/6      1/6
       5       0      0           0           0          0        1

The probability of getting Yahtzee within three rolls is p_{05}^(3) = 0.04603.
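The Yahtzee probability just stated can be checked in a few lines (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Problem 10.29: one-step matrix of the Yahtzee chain (states 0..5).
P = np.array([
    [0, 120 / 1296, 900 / 1296, 250 / 1296, 25 / 1296, 1 / 1296],
    [0, 120 / 1296, 900 / 1296, 250 / 1296, 25 / 1296, 1 / 1296],
    [0, 0, 120 / 216, 80 / 216, 15 / 216, 1 / 216],
    [0, 0, 0, 25 / 36, 10 / 36, 1 / 36],
    [0, 0, 0, 0, 5 / 6, 1 / 6],
    [0, 0, 0, 0, 0, 1],
])
# Probability of reaching the absorbing state 5 within three rolls.
p = np.linalg.matrix_power(P, 3)[0, 5]
print(round(p, 5))  # 0.04603
```

Raising P to higher powers gives the probability of a Yahtzee within more than three rolls under the same keep-the-largest-group strategy.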
10.30 Number the three players as 1, 2, and 3, where the players 1 and 2 play the first game. Let state 0 correspond to the start of the tournament, state (i, j) to the situation that players i and j are playing against each other with player i being the winner of the previous game, and state a_i to the situation that player i has won two games in a row. Let X_n be the state after the nth game. Then {X_n} is a Markov chain with three absorbing states a_1, a_2, and a_3. The one-step transition probabilities are given by

    from\to     0   (1,2) (1,3) (2,1) (2,3) (3,1) (3,2)  a1   a2   a3
       0        0     0    0.5    0    0.5    0     0     0    0    0
     (1,2)      0     0     0     0    0.5    0     0    0.5   0    0
     (1,3)      0     0     0     0     0     0    0.5   0.5   0    0
     (2,1)      0     0    0.5    0     0     0     0     0   0.5   0
     (2,3)      0     0     0     0     0    0.5    0     0   0.5   0
     (3,1)      0    0.5    0     0     0     0     0     0    0   0.5
     (3,2)      0     0     0    0.5    0     0     0     0    0   0.5
      a1        0     0     0     0     0     0     0     1    0    0
      a2        0     0     0     0     0     0     0     0    1    0
      a3        0     0     0     0     0     0     0     0    0    1

The probability of player 3 being the ultimate winner can be computed as lim_{n→∞} p_{0,a3}^(n) or by solving a system of 7 linear equations in 7 unknowns. The probability has the value 0.2857. The expected duration of the tournament can be computed from

    Σ_{n=0}^{∞} (1 − p_{0,a1}^(n) − p_{0,a2}^(n) − p_{0,a3}^(n))

or by solving a system of 7 linear equations in 7 unknowns. The expected value is 3 games. This result can also be seen directly by noting that the tournament takes r games with probability 2(1/2)^r for r = 2, 3, . . . .

10.31 You may use a Markov chain with eight states (0, 0, 0), . . . , (1, 1, 1), where 0 means an empty glass and 1 means a filled glass. However, for reasons of symmetry, a Markov chain with four states i = 0, 1, 2, and 3 suffices, where state i means that there are i filled glasses. State 0 is absorbing with p_{00} = 1. The other one-step transition probabilities are p_{10} = 1/3, p_{12} = 2/3, p_{21} = 2/3, p_{23} = 1/3, and p_{32} = 1. By solving the linear equations

    µ_3 = 1 + µ_2,  µ_2 = 1 + (2/3) µ_1 + (1/3) µ_3,  µ_1 = 1 + (2/3) µ_2,

we find E(N) = 10.
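The three linear equations just displayed can be solved mechanically (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Problem 10.31: mu_3 = 1 + mu_2, mu_2 = 1 + (2/3) mu_1 + (1/3) mu_3,
# mu_1 = 1 + (2/3) mu_2.  Unknowns (mu_1, mu_2, mu_3).
A = np.array([
    [1.0, -2 / 3, 0.0],      # mu_1 - (2/3) mu_2               = 1
    [-2 / 3, 1.0, -1 / 3],   # -(2/3) mu_1 + mu_2 - (1/3) mu_3 = 1
    [0.0, -1.0, 1.0],        # -mu_2 + mu_3                    = 1
])
mu1, mu2, mu3 = np.linalg.solve(A, np.ones(3))
print(round(mu3, 6))  # 10.0, confirming E(N) = 10
```

The intermediate values mu_1 = 7 and mu_2 = 9 also drop out of the solve.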
The probability P(N > n) = 1 − p_{30}^(n) has the values 0.6049, 0.3660, 0.1722, 0.1042, and 0.0490 for n = 5, 10, 15, 20, and 25.

10.32 For i = 0, 1, . . . , 25, let state i correspond to the situation that the number of five-dollar notes in the till is equal to i and the next person wants to buy a ticket. Also, define an auxiliary state −1 for the situation that there is no change for a person wishing to buy a ticket with a ten-dollar note. The states −1 and 25 are taken as absorbing states. For the states i = 0, 1, . . . , 24, the one-step transition probabilities are given by p_{i,i−1} = p_{i,i+1} = 0.5 and p_{ij} = 0 otherwise. The probability that none of the fifty persons will have to wait for change is 1 − p_{0,−1}^(50) = 0.1123.

10.33 Use a Markov chain with the 8 states 0, 1, . . . , 6 and −1, where state 0 is the starting state, state i with 1 ≤ i ≤ 6 means that the outcome of the last roll is i and at least as large as the outcome of the roll preceding it, and state −1 means that the outcome of the last roll is less than the outcome of the roll preceding it. State −1 is absorbing with p_{−1,−1} = 1. To answer the first question, take the one-step transition probabilities p_{0j} = 1/6 for 1 ≤ j ≤ 6, p_{ii} = · · · = p_{i6} = 1/6 and p_{i,−1} = (i − 1)/6 for 1 ≤ i ≤ 6. The probability that each of the last three rolls is at least as large as the roll preceding it equals 1 − p_{0,−1}^(4) = 0.0972. In a similar way, the probability that each of the last three rolls is larger than the roll preceding it is 0.0116.

10.34 For the first probability, we adjust the Markov matrix P in Problem 10.4 to the matrix Q by replacing the last row of P by (0, 0, 0, 1), that is, state RR is made absorbing. The first probability is calculated as 1 − q_{SS,RR}^(7) = 0.7281.
To calculate the second probability, we introduce two auxiliary states R1 and R2 and consider the following Markov matrix M = (m_{ij}):

    from\to    SS    SR    RS    RR    R1    R2
      SS      0.9   0.1    0     0     0     0
      SR       0     0    0.5    0    0.5    0
      RS      0.7   0.3    0     0     0     0
      RR       0     0    0.45   0    0.55   0
      R1       0     0    0.45   0     0    0.55
      R2       0     0     0     0     0     1

This leads to the value 1 − m_{RR,R2}^(7) = 0.4859 for the second probability. Note: The second probability can also be calculated as 1 minus the element in the intersection of the last row and last column of the matrix product P^2 Q^5.

10.35 We adjust the Markov matrix P in Problem 10.8 to the matrix Q by replacing the first row of P by (1, 0, 0, 0), that is, state 1 is made absorbing. The matrix products Q^4 and Q^5 are given by

    Q^4 =  1        0        0        0
           0.39310  0.31010  0.19840  0.09840
           0.44310  0.19760  0.16090  0.19840
           0.17090  0.32140  0.19760  0.31010

and

    Q^5 =  1        0        0        0
           0.46379  0.25659  0.17106  0.10856
           0.49504  0.19409  0.13981  0.17106
           0.24256  0.30676  0.19409  0.25659

The probability that the car will be rented out more than five times before it returns to location 1 is equal to 1 − q_{41}^(5) = 0.7574 if the car is currently at location 4, and is equal to

    p_{12}(1 − q_{21}^(4)) + p_{13}(1 − q_{31}^(4)) + p_{14}(1 − q_{41}^(4)) = 0.1436

if the car is currently at location 1.

10.36 Let the random variable X_n be equal to 1 if the nth letter is a vowel and equal to 2 if the nth letter is a consonant. The process {X_n} is a Markov chain with state space I = {1, 2}. The matrix of one-step transition probabilities is given by

    from\to     1       2
       1      0.128   0.872
       2      0.663   0.337

The equilibrium equations are

    π_1 = 0.128 π_1 + 0.663 π_2,  π_2 = 0.872 π_1 + 0.337 π_2.

Solving the equation π_1 = 0.128 π_1 + 0.663 π_2 together with π_1 + π_2 = 1 gives π_1 = 0.4319 and π_2 = 0.5681. The theoretical equilibrium probabilities are in perfect agreement with the empirical findings of Andrey Markov.

10.37 In Example 10.3, the equilibrium equations are π_1 = 0.50 π_1 + 0.50 π_3, π_2 = 0.50 π_1 + 0.50 π_3, and π_3 = π_2.
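The three equilibrium equations just displayed can be solved numerically by replacing one of them with the normalization π_1 + π_2 + π_3 = 1 (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Example 10.3: solve pi = pi P together with sum(pi) = 1.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
])
# (P^T - I) pi = 0; drop one redundant equation and append the normalization.
A = np.vstack([(P.T - np.eye(3))[:-1], np.ones(3)])
pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
print(np.round(pi, 4))  # each component ≈ 0.3333
```

The same pattern (drop one equilibrium equation, append the normalization) solves every equilibrium system in this chapter.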
Solving two of these equations together with π_1 + π_2 + π_3 = 1 gives π_1 = π_2 = π_3 = 1/3. The long-run proportion of time the professor has his license with him is π_1 = 1/3. In Problem 10.2 the Markov chain has six states 1, 2, . . . , 6, where state 1/2/3 means that the professor is driving to the office and has his driver's license with him/at home/at the office, and state 4/5/6 means that the professor is driving home and has his driver's license with him/at the office/at home. The equilibrium equations are

    π_1 = 0.75 π_4 + 0.75 π_6,  π_2 = 0.25 π_4 + 0.25 π_6,  π_3 = π_5,
    π_4 = 0.50 π_1 + 0.50 π_3,  π_5 = 0.50 π_1 + 0.50 π_3,  π_6 = π_2.

Solving five of these equations together with π_1 + · · · + π_6 = 1 gives π_1 = π_3 = π_4 = π_5 = 0.2143 and π_2 = π_6 = 0.0714. The long-run proportion of time the professor has his license with him is π_1 + π_4 = 0.4286.

10.38 Let state 1 mean that the student is eating in the Italian restaurant, state 2 mean that the student is eating in the Mexican restaurant, state 3 mean that the student is eating in the Thai restaurant, and state 4 mean that the student is eating at home. Let X_n be the state at the nth evening. The process {X_n} is a Markov chain with the one-step transition probabilities

    from\to     1      2      3      4
       1      0.10   0.35   0.25   0.30
       2      0.40   0.15   0.25   0.20
       3      0.50   0.15   0.05   0.30
       4      0.40   0.35   0.25   0

The equilibrium equations are

    π_1 = 0.10 π_1 + 0.40 π_2 + 0.50 π_3 + 0.40 π_4,
    π_2 = 0.35 π_1 + 0.15 π_2 + 0.15 π_3 + 0.35 π_4,
    π_3 = 0.25 π_1 + 0.25 π_2 + 0.05 π_3 + 0.25 π_4,
    π_4 = 0.30 π_1 + 0.20 π_2 + 0.30 π_3.

Solving three of these equations together with π_1 + π_2 + π_3 + π_4 = 1 gives π_1 = 0.3237, π_2 = 0.2569, π_3 = 0.2083, and π_4 = 0.2110. The proportion of time the student is eating at home is 0.2110.

10.39 Use a four-state Markov chain with the one-step transition probabilities p_{ii} = 1 − r_i and p_{ij} = r_i/3 for j ≠ i. The Markov chain is aperiodic. Therefore the limiting probabilities exist and are given by the equilibrium probabilities.
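The restaurant chain of Problem 10.38 above gives a slightly larger instance of the same equilibrium solve (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Problem 10.38: equilibrium of the restaurant chain (states 1..4).
P = np.array([
    [0.10, 0.35, 0.25, 0.30],
    [0.40, 0.15, 0.25, 0.20],
    [0.50, 0.15, 0.05, 0.30],
    [0.40, 0.35, 0.25, 0.00],
])
# Drop one redundant equilibrium equation, append the normalization.
A = np.vstack([(P.T - np.eye(4))[:-1], np.ones(4)])
pi = np.linalg.solve(A, np.array([0.0, 0.0, 0.0, 1.0]))
print(np.round(pi, 4))  # ≈ [0.3237 0.2569 0.2083 0.2110]
```

The last component is the long-run proportion of evenings the student eats at home.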
The equilibrium equations are

    π_j = (1 − r_j) π_j + Σ_{i=1, i≠j}^{4} (r_i/3) π_i for 1 ≤ j ≤ 4.

Solving three of these equations together with Σ_{j=1}^{4} π_j = 1, we get π_1 = 0.2817, π_2 = 0.1690, π_3 = 0.2113, and π_4 = 0.3380.

10.40 The equilibrium equations are

    π_SS = 0.90 π_SS + 0.70 π_RS,  π_SR = 0.10 π_SS + 0.30 π_RS,
    π_RS = 0.50 π_SR + 0.45 π_RR,  π_RR = 0.50 π_SR + 0.55 π_RR.

Solving three of these equations together with π_SS + π_SR + π_RS + π_RR = 1 yields π_SS = 0.6923, π_SR = 0.0989, π_RS = 0.0989, and π_RR = 0.1099. The long-run proportion of days it will be sunny is π_SS + π_RS = 0.7912. The limiting probability of a rainy Sunday is π_RR + π_SR = 0.2088 (the Markov chain is aperiodic).

10.41 State j means that compartment A contains j particles of type 1. An intuitive guess is the hypergeometric distribution

    π_j = C(r, j) C(r, r − j) / C(2r, r) for j = 0, 1, . . . , r.

These π_j satisfy the equilibrium equations

    π_j = ((r − j + 1)^2 / r^2) π_{j−1} + (2j(r − j) / r^2) π_j + ((j + 1)^2 / r^2) π_{j+1},

as can be verified by substitution. Since the Markov chain has no two or more disjoint closed sets, its equilibrium distribution is uniquely determined.

10.42 Solving the equilibrium equations

    π_A = 0.340 π_A + 0.193 π_B + 0.200 π_C + 0.240 π_D,
    π_B = 0.214 π_A + 0.230 π_B + 0.248 π_C + 0.243 π_D,
    π_C = 0.296 π_A + 0.345 π_B + 0.271 π_C + 0.215 π_D

together with π_A + π_B + π_C + π_D = 1 gives π_A = 0.2419, π_B = 0.2343, π_C = 0.2808, and π_D = 0.2429. The long-run frequency of base A appearing is π_A = 0.2419 and the long-run frequency of observing base A followed by another A is π_A p_AA = 0.0822.

10.43 The first thought might be to use a Markov chain with 16 states. However, a Markov chain with two states 0 and 1 suffices, where state 0 means that Linda and Bob are in different venues and state 1 means that they are in the same venue.
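The substitution check claimed in Problem 10.41 above can be automated; the sketch below (not part of the original solution) verifies the equilibrium equations for r = 5, with the convention π_{−1} = π_{r+1} = 0:

```python
from math import comb

# Problem 10.41: the hypergeometric distribution satisfies the
# equilibrium equations of the two-compartment chain (checked for r = 5).
r = 5
pi = [comb(r, j) * comb(r, r - j) / comb(2 * r, r) for j in range(r + 1)]

def get(j):
    # pi_j with the convention pi_{-1} = pi_{r+1} = 0
    return pi[j] if 0 <= j <= r else 0.0

for j in range(r + 1):
    rhs = ((r - j + 1) ** 2 / r**2) * get(j - 1) \
        + (2 * j * (r - j) / r**2) * get(j) \
        + ((j + 1) ** 2 / r**2) * get(j + 1)
    assert abs(rhs - pi[j]) < 1e-12
print("equilibrium equations verified for r =", r)
```

Any other value of r can be substituted in the same way.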
The one-step transition probability p_01 is

    p_01 = 2 × 0.4 × 0.6 × (1/3) + 0.6 × 0.6 × (2/3) × (1/3),

where the first term refers to the probability that exactly one of the two persons does not change venue and the other person goes to the venue of that person, and the second term refers to the probability that both persons change venue and go to the same venue. By a similar argument,

    p_11 = 0.4 × 0.4 + 0.6 × 0.6 × (1/3).

This gives p_01 = 0.24 and p_11 = 0.28. Further, p_00 = 1 − p_01 and p_10 = 1 − p_11. Solving the two equations π_0 = p_00 π_0 + p_10 π_1 and π_0 + π_1 = 1 gives π_0 = 18/19 and π_1 = 1/19. The long-run fraction of weekends that Linda and Bob visit the same venue is π_1 = 1/19. The limiting probability that they visit the same venue two weekends in a row is π_1 × p_11 = 0.0147 (the Markov chain is aperiodic).

10.44 In Problem 10.3 the state of the Markov chain and the one-step transition probabilities are given (in the equilibrium analysis the auxiliary state 0 is not needed). The equilibrium equations are given by

    π_k = Σ_{j=1}^{6} [(k/6)^j − ((k − 1)/6)^j] π_j for k = 1, . . . , 6.

Solving the equilibrium equations, we find that the long-run frequency at which k dice are rolled has the values 0.0004, 0.0040, 0.0230, 0.0878, 0.2571, and 0.6277 for k = 1, 2, . . . , 6.

10.45 Since Σ_{k=1}^{N} p_{kj} = 1 for 1 ≤ j ≤ N, it follows that π_j = 1/N for all j is a solution of the equilibrium equations

    π_j = Σ_{k=1}^{N} π_k p_{kj} for 1 ≤ j ≤ N.

The Markov chain has no two or more disjoint closed sets and so the discrete uniform distribution is its unique equilibrium distribution.

10.46 The process {X_n} is a Markov chain with state space {0, 1, . . . , r − 1}. To give the matrix of one-step transition probabilities, we distinguish between the cases 2 ≤ r ≤ 5 and r ≥ 6.
For r = 2, 3, 4, and 5 the respective matrices of one-step transition probabilities of the Markov chain are given by

    from\to    0     1
       0      1/2   1/2
       1      1/2   1/2

    from\to    0     1     2
       0      2/6   2/6   2/6
       1      2/6   2/6   2/6
       2      2/6   2/6   2/6

    from\to    0     1     2     3
       0      1/6   2/6   2/6   1/6
       1      1/6   1/6   2/6   2/6
       2      2/6   1/6   1/6   2/6
       3      2/6   2/6   1/6   1/6

    from\to    0     1     2     3     4
       0      1/6   2/6   1/6   1/6   1/6
       1      1/6   1/6   2/6   1/6   1/6
       2      1/6   1/6   1/6   2/6   1/6
       3      1/6   1/6   1/6   1/6   2/6
       4      2/6   1/6   1/6   1/6   1/6

Each of the transition matrices corresponding to r = 2, 3, 4, and 5 has the property that Σ_{i=0}^{r−1} p_{ij} = 1 for all j. This property is also satisfied by the matrix of one-step transition probabilities corresponding to the case of r ≥ 6. For any r ≥ 6, it is readily verified that for each fixed j the probability p_{ij} = 1/6 for six different values of i and p_{ij} = 0 for the other i. Thus the Markov chain is doubly stochastic for any r ≥ 2. Moreover, the Markov chain has no two or more disjoint closed sets and is aperiodic. Invoking the result of Problem 10.45, we get that lim_{n→∞} P(X_n = 0) = 1/r.

10.47 The Markov chain is a regenerative stochastic process. The times at which the process visits state r are taken as regeneration epochs, and so a cycle is the time interval between two successive visits to state r. The expected length of one cycle is the mean recurrence time µ_rr = 1/π_r. Fix state j ≠ r. Assume that a reward of 1 is earned each time the Markov chain visits state j. Then, by the renewal-reward theorem, the long-run average reward per unit time is γ_jr/µ_rr, where γ_jr is the expected number of visits to state j between two successive returns of the Markov chain to state r. Further, the long-run average reward per unit time is π_j, and π_r = 1/µ_rr, showing that π_j = γ_jr π_r for any state j.

10.48 The perturbed Markov chain {X̄_n} also has no two or more disjoint closed sets. Denote by {π̄_i, i ∈ I} its unique equilibrium distribution.
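The doubly stochastic structure in Problem 10.46 above comes from the die outcomes modulo r; building the matrix from that description makes the column-sum property and the uniform limit easy to confirm (a numpy sketch for r = 4, not part of the original solution):

```python
import numpy as np

# Problem 10.46: chain X_n = (total score after n rolls) mod r, here r = 4.
r = 4
P = np.zeros((r, r))
for i in range(r):
    for d in range(1, 7):          # die outcomes 1..6
        P[i, (i + d) % r] += 1 / 6

print(P.sum(axis=0))               # every column sums to 1 (doubly stochastic)
lim = np.linalg.matrix_power(P, 50)
print(np.round(lim[0], 4))         # each entry ≈ 0.25 = 1/r
```

Changing r reproduces each of the matrices displayed above and the limit 1/r in every case.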
Then, for all j ∈ I,

    π̄_j = Σ_{k∈I} p̄_{kj} π̄_k = p̄_{jj} π̄_j + Σ_{k∈I, k≠j} p̄_{kj} π̄_k
        = (1 − τ + τ p_{jj}) π̄_j + τ Σ_{k∈I, k≠j} p_{kj} π̄_k
        = (1 − τ) π̄_j + τ Σ_{k∈I} p_{kj} π̄_k.

This gives

    π̄_j = Σ_{k∈I} p_{kj} π̄_k for all j ∈ I.

In other words, {π̄_j, j ∈ I} is an equilibrium distribution of the Markov chain {X_n}. Since this Markov chain has a unique equilibrium distribution {π_j, j ∈ I}, it follows that π̄_j = π_j for all j ∈ I. Note: Since p̄_{ii} > 0 for all i, the perturbed Markov chain {X̄_n} is aperiodic and so lim_{n→∞} p̄_{ij}^(n) = π_j for all i, j ∈ I.

10.49 (a) This result follows from the inequality

    p_{ab}^(n+m) ≥ p_{ac}^(n) p_{cb}^(m) for all states a, b, and c.

(b) To prove the "if" part, assume to the contrary that C is not irreducible. Then there is a closed set S ⊂ C with S ≠ C. Choose any i ∈ C. Since S is closed, C(i) ⊆ S and so C(i) ≠ C, contradicting that i communicates with all states in C. The "only if" part follows by showing that C(i) = C for any i ∈ C. To show this, it suffices to prove that C(i) is closed. Assume to the contrary that C(i) is not closed. Then there are states a ∈ C(i) and b ∉ C(i) with p_{ab} > 0. Since a ∈ C(i), there is an integer n ≥ 1 such that p_{ia}^(n) > 0 and so

    p_{ib}^(n+1) ≥ p_{ia}^(n) p_{ab} > 0,

contradicting that b ∉ C(i).
(c) Since C is finite and closed, there must be some state j ∈ C such that P(X_n = j for infinitely many n | X_0 = j) = 1. Such a state is recurrent.
(d) Assume to the contrary that some transient state t can be reached from some recurrent state r, that is, p_{rt}^(m) > 0 for some m ≥ 1. Then P(X_n = r for infinitely many n | X_0 = r) = 1 implies that this relation also holds with r replaced by t, which would mean that state t is recurrent.
(e) By (c), there is some state j ∈ C that is recurrent. Take any other state i ∈ C. By (b), the states i and j communicate and so there are integers r, s ≥ 1 such that p_{ij}^(r) > 0 and p_{ji}^(s) > 0. It now follows from

    p_{ii}^(r+n+s) ≥ p_{ij}^(r) p_{jj}^(n) p_{ji}^(s)

that

    Σ_{n=1}^{∞} p_{ii}^(n) ≥ p_{ij}^(r) p_{ji}^(s) Σ_{n=1}^{∞} p_{jj}^(n).
Since Σ_{n=1}^{∞} p_{jj}^(n) = ∞ for the recurrent state j, we find Σ_{n=1}^{∞} p_{ii}^(n) = ∞, showing that state i is recurrent.
(f) Since P(N_{ij} > r) ≤ P(N_{ij} > r_i) = 1 − p_{ij}^(r_i) ≤ 1 − ρ for all i ∈ C, we have P(N_{ij} > kr) ≤ (1 − ρ)^k for k ≥ 1. Then, by E(N_{ij}) = Σ_{n=0}^{∞} P(N_{ij} > n), we get

    E(N_{ij}) = 1 + Σ_{k=1}^{∞} Σ_{l=(k−1)r+1}^{kr} P(N_{ij} > l) ≤ 1 + Σ_{k=1}^{∞} r (1 − ρ)^{k−1},

which proves that E(N_{ij}) < ∞. Note: Let T be the set of transient states of a Markov chain having a single irreducible set C of states. Mimicking the foregoing proof shows that the expected number of transitions until reaching the set C is finite for any starting state i ∈ I.
(g) Denote by d_k the period of any state k ∈ C. Choose i, j ∈ C with i ≠ j. By (b), there are integers v, w ≥ 1 such that p_{ij}^(v) > 0 and p_{ji}^(w) > 0. Then p_{ii}^(v+w) > 0, and so v + w is divisible by d_i. Let n ≥ 1 be any integer with p_{jj}^(n) > 0. Then p_{ii}^(v+n+w) > 0 and so v + n + w is divisible by d_i. Thus, n is divisible by d_i and so d_i ≤ d_j. For reasons of symmetry, d_j ≤ d_i, showing that d_i = d_j.

10.50 For fixed 1 ≤ k ≤ d, let q_{ij} = p_{ij}^(d) for i, j ∈ R_k. Then, the matrix Q = (q_{ij}), i, j ∈ R_k, is a Markov matrix and has the property q_{ij}^(n) = p_{ij}^(nd). The Markov matrix Q is aperiodic. Therefore lim_{n→∞} q_{ij}^(n) exists and so lim_{n→∞} p_{ij}^(nd) exists for all i, j ∈ R_k. To show that the limit is d π_j for all i, j ∈ R_k, we reason as follows. Imagine that the state of the Markov process {X_n} is only observed at the times 0, d, 2d, . . . and suppose that the starting state belongs to R_k. Then the long-run frequency at which any state j ∈ R_k will be observed is d times the long-run frequency π_j at which state j will be observed when considering the process over all times 0, 1, 2, . . . .

10.51 The state of the Markov chain is the inventory position just before review. The equilibrium equations for the π_i with i ≠ 0 are

    π_i = Σ_{j=0}^{s−1} π_j e^{−λ} λ^{S−i}/(S − i)! + Σ_{j=max(s,i)}^{S} π_j e^{−λ} λ^{j−i}/(j − i)! for 0 < i ≤ S.
(a) The long-run average stock on hand at the end of the week is given by Σ_{j=0}^{S} j π_j = 4.387.
(b) The long-run average ordering frequency is Σ_{j=0}^{s−1} π_j = 0.5005.
(c) Let L(j) = Σ_{k=j+1}^{∞} (k − j) e^{−λ} λ^k/k! denote the expected amount of demand lost in the coming week if the current stock on hand just after review is j. By Rule 10.7, the long-run average amount of demand lost per week is L(S) Σ_{j=0}^{s−1} π_j + Σ_{j=s}^{S} L(j) π_j = 0.0938.

10.52 In Problem 10.40, we computed the equilibrium probabilities π_SS = 0.6923, π_SR = 0.0989, π_RS = 0.0989, and π_RR = 0.1099. Hence the long-run average sales per day is 1,000 × (π_SS + π_RS) + 500 × (π_SR + π_RR) = 895.60 dollars.

10.53 In the answer to Problem 10.6, the state of the Markov chain and the one-step transition probabilities are given.
(a) The long-run proportion of time the device operates properly is 1 − π_(0,0) = 0.9814.
(b) By Rule 10.7, the long-run average weekly cost is 750 π_(0,0) + 200[π_(0,0) + π_(6,6) + π_(0,6)] + 100 Σ_{j=1}^{5} [π_(0,j) + π_(j,6)] = 52.46 dollars.

10.54 Let state 0 mean that both stations are idle, state 1 mean that only station 1 is occupied, state 2 mean that only station 2 is occupied, and state 3 mean that both stations are occupied. Let X_n be the state just before an item arrives at t = n. Then, by the memoryless property of the exponentially distributed processing times, {X_n} is a discrete-time Markov chain with one-step transition probabilities p_00 = 1 − e^{−µ1}, p_01 = e^{−µ1}, p_10 = p_20 = p_30 = (1 − e^{−µ1})(1 − e^{−µ2}), p_11 = p_21 = p_31 = e^{−µ1}(1 − e^{−µ2}), p_12 = p_22 = p_32 = (1 − e^{−µ1}) e^{−µ2}, and p_13 = p_23 = p_33 = e^{−µ1} e^{−µ2}. Putting for abbreviation a_i = e^{−µi} for i = 1, 2, the equilibrium equations are

    π_0 = (1 − a_1) π_0 + Σ_{i=1}^{3} (1 − a_1)(1 − a_2) π_i,
    π_1 = a_1 π_0 + Σ_{i=1}^{3} a_1 (1 − a_2) π_i,
    π_2 = Σ_{i=1}^{3} (1 − a_1) a_2 π_i,
    π_3 = Σ_{i=1}^{3} a_1 a_2 π_i,

where µ_1 = 4/3 and µ_2 = 4/5. The equilibrium probabilities are π_0 = 0.6061, π_1 = 0.2169, π_2 = 0.1304, and π_3 = 0.0467.
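The equilibrium probabilities of Problem 10.54 just stated can be reproduced directly from the transition structure (a numpy sketch, not part of the original solution):

```python
import numpy as np

# Problem 10.54: two-station chain with mu_1 = 4/3 and mu_2 = 4/5.
a1, a2 = np.exp(-4 / 3), np.exp(-4 / 5)
# From states 1, 2, 3 the transition probabilities are identical.
row = np.array([(1 - a1) * (1 - a2), a1 * (1 - a2), (1 - a1) * a2, a1 * a2])
P = np.vstack([[1 - a1, a1, 0.0, 0.0], row, row, row])

# Solve pi = pi P with the normalization sum(pi) = 1.
A = np.vstack([(P.T - np.eye(4))[:-1], np.ones(4)])
pi = np.linalg.solve(A, np.array([0.0, 0.0, 0.0, 1.0]))
print(np.round(pi, 4))  # ≈ [0.6061 0.2169 0.1304 0.0467]
```

The last component is the loss probability discussed next.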
The loss probability is equal to $\pi_3 = 0.0467$. Note: The long-run fraction of time that both stations are occupied is $(1 - \pi_0)\,E[\min(U, 1)] = 0.1628$, where $U$ is the minimum of two independent exponentially distributed random variables with rates $\mu_1$ and $\mu_2$. Thus $U$ has an exponential distribution with rate $\mu_1 + \mu_2$, which gives $E[\min(U, 1)] = 0.41323$.

10.55 Let $X_n$ be the number of tokens in the buffer at the beginning of the $n$th time slot just before a new token arrives. Then $\{X_n\}$ is a Markov chain with state space $I = \{0, 1, \ldots, M\}$. Put for abbreviation $a_k = e^{-\lambda}\lambda^k/k!$. The one-step transition probabilities are $p_{jk} = a_{j+1-k}$ for $0 \le j < M$ and $1 \le k \le j+1$, $p_{j0} = 1 - \sum_{k=1}^{j+1} p_{jk}$ for $0 \le j < M$, $p_{Mk} = a_{M-k}$ for $1 \le k \le M$, and $p_{M0} = 1 - \sum_{k=1}^{M} p_{Mk}$. Let $\{\pi_j\}$ be the equilibrium distribution of the Markov chain. By Rule 10.7, the long-run average number of packets admitted in one time slot is equal to $\sum_{j=0}^{M} c(j)\pi_j$, where $c(j) = \sum_{k=0}^{j} k a_k + (j+1)\big(1 - \sum_{k=0}^{j} a_k\big)$ for $0 \le j < M$ and $c(M) = \sum_{k=0}^{M-1} k a_k + M\big(1 - \sum_{k=0}^{M-1} a_k\big)$.

10.56 Define the random variable $X_n$ as the premium class for the transport firm at the beginning of the $n$th year. Then the stochastic process $\{X_n\}$ is a Markov chain with four possible states $i = 1, \ldots, 4$. The one-step transition probabilities $p_{ij}$ are easily found. Denote by the random variable $S$ the total damage in the coming year and let $G(s)$ denote the cumulative probability distribution function of $S$. A one-step transition from state $i$ to state 1 occurs if and only if at the end of the year a damage is claimed; otherwise a transition from state $i$ to state $i+1$ occurs (state 5 = state 4). Since for premium class $i$ only a cumulative damage $S$ larger than $\alpha_i$ will be claimed, it follows that $p_{i1} = 1 - G(\alpha_i)$ for $i = 1, \ldots, 4$, $p_{i,i+1} = G(\alpha_i)$ for $i = 1, 2, 3$, and $p_{44} = G(\alpha_4)$. The other one-step transition probabilities $p_{ij}$ are equal to zero. The Markov chain has no two disjoint closed sets.
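The one-step transition structure of Problem 10.56 just described can be collected into a matrix; a short Python sketch, in which the damage distribution $G$ and the claim limits $\alpha_i$ are hypothetical placeholders, not data from the problem:

```python
import math

def transition_matrix(alphas, G):
    """Four-state bonus-malus chain of Problem 10.56:
    p_i1 = 1 - G(alpha_i), p_{i,i+1} = G(alpha_i) for i = 1, 2, 3,
    and p_44 = G(alpha_4). States are 0-indexed here."""
    P = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        P[i][0] = 1 - G(alphas[i])     # a claim sends the chain back to class 1
    for i in range(3):
        P[i][i + 1] = G(alphas[i])     # no claim: move up one class
    P[3][3] = G(alphas[3])             # state 5 = state 4
    return P

# hypothetical example: exponential damage with mean 2 and some claim limits
G = lambda s: 1 - math.exp(-0.5 * s)
P = transition_matrix([0.5, 0.8, 1.0, 1.2], G)
```

Each row sums to 1 by construction, which is a quick consistency check on the transition probabilities.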
Hence the equilibrium probabilities $\pi_j$, $1 \le j \le 4$, are the unique solution to the equilibrium equations
$$\pi_1 = \big(1 - G(\alpha_1)\big)\pi_1 + \big(1 - G(\alpha_2)\big)\pi_2 + \big(1 - G(\alpha_3)\big)\pi_3 + \big(1 - G(\alpha_4)\big)\pi_4,$$
$$\pi_2 = G(\alpha_1)\pi_1, \qquad \pi_3 = G(\alpha_2)\pi_2, \qquad \pi_4 = G(\alpha_3)\pi_3 + G(\alpha_4)\pi_4,$$
together with the normalizing equation $\pi_1 + \pi_2 + \pi_3 + \pi_4 = 1$. Denote by $c(j)$ the expected costs incurred during the coming year when at the beginning of the year premium $P_j$ is paid. Then, by Rule 10.7, the long-run average cost per year is $g(\alpha_1, \ldots, \alpha_4) = \sum_{j=1}^{4} c(j)\pi_j$ with probability 1. The one-year cost $c(j)$ consists of the premium $P_j$ and any damages not compensated that year by the insurance company. By conditioning on the cumulative damage $S$ in the coming year, it follows that
$$c(j) = P_j + \int_0^{\alpha_j} s\, g(s)\, ds + r_j\big[1 - G(\alpha_j)\big],$$
where $g(s)$ is the probability density of $S$. In case $S$ has an exponential distribution with mean $1/\eta$, the expression for $c(j)$ can be simplified to
$$c(j) = P_j + \frac{1}{\eta}\big(1 - e^{-\eta\alpha_j} - \eta\alpha_j e^{-\eta\alpha_j}\big) + r_j e^{-\eta\alpha_j};$$
otherwise, numerical integration must be used to obtain the $c(j)$.

10.57 The Markov matrix is doubly stochastic and irreducible. Therefore its unique equilibrium distribution is the uniform distribution, that is, $\pi_i = 0.25$ for all $i$ (see Problem 10.45). Also, the Markov matrix has the property $p_{jk} = p_{kj}$ for all $j, k$. This gives $\pi_j p_{jk} = \pi_k p_{kj}$ for all $j, k$, showing that the Markov chain is reversible.

10.58 The two equilibrium equations boil down to the single equation $a\pi_1 = b\pi_2$. This equation states that $\pi_1 p_{12} = \pi_2 p_{21}$, showing that the Markov chain is reversible for all $a, b$ with $0 < a, b < 1$. This result is obvious by noting that in any two-state Markov chain with no absorbing states the long-run average number of transitions from state 1 to state 2 per unit time must be equal to the long-run average number of transitions from state 2 to state 1 per unit time.

10.59 The detailed balance equation $\pi_i p_{ij} = \pi_j p_{ji}$ boils down to
$$\frac{\pi_i}{\sum_{k=1}^{N} w_{ik}} = \frac{\pi_j}{\sum_{k=1}^{N} w_{jk}} \quad \text{for } i, j = 1, \ldots, N.$$
Therefore the detailed balance equations are satisfied by
$$\pi_i = \frac{\sum_{k=1}^{N} w_{ik}}{\sum_{j=1}^{N}\sum_{k=1}^{N} w_{jk}} \quad \text{for } i = 1, \ldots, N.$$
Thus the Markov chain is reversible and has the $\pi_i$ as its unique equilibrium distribution. The equilibrium probabilities for the mouse problem are $\pi_1 = \pi_5 = \pi_{11} = \pi_{15} = \frac{2}{44}$, $\pi_2 = \pi_3 = \pi_4 = \pi_6 = \pi_{10} = \pi_{12} = \pi_{13} = \pi_{14} = \frac{3}{44}$, and $\pi_7 = \pi_8 = \pi_9 = \frac{4}{44}$ (take $w_{ij} = 1$ if the rooms $i$ and $j$ are connected by a door). The mean recurrence time from state $i$ to itself is $\mu_{ii} = 1/\pi_i$.

10.60 Let $K$ be the common number of points in each of the sets $N(i)$. Fix $j, k \in I$ with $j \ne k$. If $k \notin N(j)$, then $p_{jk} = p_{kj} = 0$ and so $e^{-c(j)/T} p_{jk} = e^{-c(k)/T} p_{kj}$. For $k \in N(j)$,
$$p_{jk} = \frac{1}{K}\min\!\Big(1, \frac{e^{-c(k)/T}}{e^{-c(j)/T}}\Big).$$
Therefore
$$e^{-c(j)/T} p_{jk} = e^{-c(j)/T}\,\frac{1}{K}\min\!\Big(1, \frac{e^{-c(k)/T}}{e^{-c(j)/T}}\Big) = \frac{1}{K}\min\!\big(e^{-c(j)/T}, e^{-c(k)/T}\big) = e^{-c(k)/T}\,\frac{1}{K}\min\!\Big(1, \frac{e^{-c(j)/T}}{e^{-c(k)/T}}\Big) = e^{-c(k)/T} p_{kj}.$$
Letting $\alpha_i = e^{-c(i)/T}$ for all $i \in I$, we have now verified that $\alpha_j p_{jk} = \alpha_k p_{kj}$ for all $j, k \in I$ (trivially, $\alpha_j p_{jk} = \alpha_k p_{kj}$ for $j = k$). Converting the $\alpha_j$ into the probability mass function $\{\alpha_j^*\}$ with $\alpha_j^* = \alpha_j/\sum_{i \in I}\alpha_i$, we have $\alpha_j^* p_{jk} = \alpha_k^* p_{kj}$ for all $j, k \in I$. In other words, the Markov chain with the one-step transition probabilities $p_{jk}$ is reversible. Summing over $k$ in the latter equation, we get that the $\alpha_j^*$ satisfy the equilibrium equations of the Markov chain. Since the Markov chain has no two disjoint closed sets, the $\alpha_j^*$ form the unique equilibrium distribution of the Markov chain.

Note 1: The assumption that each set $N(i)$ has the same number of elements is not essential. If this assumption does not hold, the Markov chain is defined as follows. The Markov chain moves from any state $i$ to state $j \in N(i)$ with probability $p_{ij} = \frac{1}{K}\min\big(1, e^{-c(j)/T}/e^{-c(i)/T}\big)$ or stays in state $i$ with probability $1 - \sum_{j \in N(i)} p_{ij}$, where $K = \max_i K_i$ with $K_i$ denoting the number of elements in $N(i)$ (note the assumption that $i \notin N(i)$).
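The random-walk equilibrium formula of Problem 10.59 can be checked for the mouse problem with a short Python sketch; it assumes (as in the problem) a $3 \times 5$ grid of rooms with doors between horizontally and vertically adjacent rooms:

```python
# Equilibrium probabilities pi_i = (sum_k w_ik) / (sum_j sum_k w_jk) for a
# random walk on a graph, applied to the 15-room mouse maze: a 3 x 5 grid,
# rooms numbered 1..15 row by row, w_ij = 1 when rooms share a door.
rows, cols = 3, 5
room = lambda r, c: r * cols + c + 1

degree = {}
for r in range(rows):
    for c in range(cols):
        nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        degree[room(r, c)] = sum(1 for rr, cc in nbrs
                                 if 0 <= rr < rows and 0 <= cc < cols)

total = sum(degree.values())                 # sum of all w_jk; 44 for this grid
pi = {i: degree[i] / total for i in degree}
mean_recurrence = {i: 1 / pi[i] for i in pi}  # mu_ii = 1 / pi_i
```

The corner rooms get probability $2/44$, the middle interior rooms $4/44$, and the remaining rooms $3/44$, matching the values quoted above.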
Note 2: In the case that the function $c(i)$ assumes its absolute minimum in a unique point $m$,
$$\pi_m = \frac{1}{1 + \sum_{k \ne m} e^{-[c(k)-c(m)]/T}}$$
with $c(k) - c(m) > 0$ for all $k \ne m$, implying that $\pi_m$ tends to 1 as $T \to 0$. More generally, if $\mathcal{M}$ is the set of points at which the function $c(i)$ takes on its absolute minimum, then $\pi_j$ for $j \in \mathcal{M}$ tends to $\frac{1}{M}$ as $T \to 0$ (and so $\sum_{j \in \mathcal{M}} \pi_j$ tends to 1 as $T \to 0$), where $M$ denotes the number of elements in the set $\mathcal{M}$.

10.61 Applying 100,000 iterations of the Metropolis-Hastings algorithm with random-walk sampling, we found for $a = 0.02$, 0.2, 1, and 5 the average values 97.3%, 70.9%, 34%, and 4.5% for the acceptance probability. Further, the simulation results clearly showed that a high acceptance probability does not necessarily guarantee a good mixing of the state. We experimentally found that $a = 0.6$ gives an excellent mixing with an average acceptance probability of about 49.9% (based on 100,000 iterations). For the choice $a = 0.6$, the estimates 1.652 and 1.428 for the expected value and the standard deviation of $X_1$ were obtained after 100,000 iterations (the exact values are 1.6488 and 1.4294).

10.62 The following algorithm can be used to see how quickly the probability mass function $\pi(1) = 0.2$, $\pi(2) = 0.8$ can be recovered when applying the Metropolis-Hastings algorithm with $q(t \mid s) = 0.5$ for $s, t = 1, 2$.

Algorithm: Step 0. Choose a (large) number $M$ (the number of iteration steps). Take a starting state $s_0 \in \{1, 2\}$ and let $n := 1$.
Step 1. Generate a random number $u_1$ from $(0, 1)$. Let the candidate state $t_n := 1$ if $u_1 \le 0.5$ and $t_n := 2$ otherwise. Calculate the acceptance probability
$$\alpha = \min\Big(\frac{\pi(t_n)}{\pi(s_{n-1})}, 1\Big).$$
Step 2. Generate a random number $u_2$ from $(0, 1)$. If $u_2 \le \alpha$, accept $t_n$ and let $s_n := t_n$; otherwise, $s_n := s_{n-1}$.
Step 3. $n := n + 1$. If $n < M$, repeat Step 1 with $s_{n-1}$ replaced by $s_n$; otherwise, stop.

The probability $\pi(1)$ is estimated by $\frac{1}{M}\sum_{n=1}^{M} I_n$, where $I_n = 1$ if $s_n = 1$ and $I_n = 0$ otherwise.
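The algorithm of Problem 10.62 can be sketched directly in Python (the function name and seed are mine, added for reproducibility):

```python
import random

def estimate_pi1(M, seed=1):
    """Metropolis-Hastings sketch for pi(1) = 0.2, pi(2) = 0.8 with the
    symmetric proposal q(t | s) = 0.5 for s, t in {1, 2}."""
    random.seed(seed)
    pi = {1: 0.2, 2: 0.8}
    s = 1                                        # arbitrary starting state
    hits = 0
    for _ in range(M):
        t = 1 if random.random() <= 0.5 else 2   # candidate state
        alpha = min(pi[t] / pi[s], 1.0)          # acceptance probability
        if random.random() <= alpha:
            s = t
        hits += (s == 1)
    return hits / M

est = estimate_pi1(100000)
```

With $M = 100{,}000$ iterations the estimate of $\pi(1)$ settles close to 0.2, illustrating how quickly the target mass function is recovered.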
10.63 Using the definition
$$\pi_1(x_1 \mid x_2) = \frac{\pi(x_1, x_2)}{\int_{-\infty}^{\infty} \pi(u_1, x_2)\, du_1},$$
it follows that $\pi_1(x_1 \mid x_2)$ as a function of $x_1$ is proportional to $e^{-\frac{1}{2}[(1+x_2^2)x_1^2 - 7x_1]}$. Hence, for fixed $x_2$, $\pi_1(x_1 \mid x_2)$ as a function of $x_1$ is proportional to
$$e^{-\frac{1}{2}\big[x_1 - \frac{7}{2}(1+x_2^2)^{-1}\big]^2 / (1+x_2^2)^{-1}}.$$
In other words, the univariate conditional density $\pi_1(x_1 \mid x_2)$ is given by the $N\big(\frac{7}{2}(1+x_2^2)^{-1}, (1+x_2^2)^{-1}\big)$ density. Similarly, the univariate conditional density $\pi_2(x_2 \mid x_1)$ is the $N\big(\frac{7}{2}(1+x_1^2)^{-1}, (1+x_1^2)^{-1}\big)$ density. Next it is straightforward to apply the Gibbs sampler. Using the standard Gibbs sampler, the estimates 1.6495 and 1.4285 are found for $E(X_1)$ and $\sigma(X_1)$ after one million runs (the exact values are 1.6488 and 1.4294). The simulated probability histogram of the marginal density of $X_1$ is given in the accompanying figure (omitted here).

10.64 The univariate conditional densities follow from the definitions
$$\pi_1(x \mid y, n) = \frac{\pi(x, y, n)}{\sum_{u=0}^{n} \pi(u, y, n)}, \qquad \pi_2(y \mid x, n) = \frac{\pi(x, y, n)}{\int_0^1 \pi(x, u, n)\, du}, \qquad \pi_3(n \mid x, y) = \frac{\pi(x, y, n)}{\sum_{u=x}^{\infty} \pi(x, y, u)}.$$
This gives that $\pi_1(x \mid y, n)$ is proportional to $\binom{n}{x} y^x(1-y)^{n-x}$ as a function of $x$ with $0 \le x \le n$, $\pi_2(y \mid x, n)$ is proportional to $y^{x+\alpha-1}(1-y)^{n-x+\beta-1}$ as a function of $y$ with $0 < y < 1$, and $\pi_3(n \mid x, y)$ is proportional to
$$\frac{\lambda^{n-x}(1-y)^{n-x}}{(n-x)!}$$
as a function of $n$ with $n \ge x$. This shows that the conditional distribution of $X$ is the binomial distribution with parameters $n$ and $y$, the conditional distribution of $Y$ is the beta distribution with parameters $x + \alpha$ and $n - x + \beta$, and the conditional distribution of $N$ is a Poisson distribution shifted to $x$ and having parameter $\lambda(1-y)$. Next it is straightforward to apply the Gibbs sampler. Using the standard Gibbs sampler, the estimates 9.992 and 6.836 are found for $E(X)$ and $\sigma(X)$ after 100,000 iterations (the exact values are 10 and 6.809).
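A minimal sketch of this Gibbs sampler, assuming the parameter values $\alpha = 2$, $\beta = 8$, and $\lambda = 50$ (consistent with the exact values 10 and 6.809 quoted above); the beta draw uses the standard library, and the shifted Poisson draw uses a simple multiplication method:

```python
import math
import random

def sample_poisson(mean, rng):
    # multiplication (Knuth) method; adequate for moderate means
    L = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def gibbs_mean_x(n_iter, alpha=2, beta=8, lam=50, seed=3):
    """Cycle through the three univariate conditionals:
    Y | x, n ~ Beta(x + alpha, n - x + beta),
    N | x, y ~ x + Poisson(lam * (1 - y)),
    X | y, n ~ Binomial(n, y)."""
    rng = random.Random(seed)
    x, n = 0, 10
    total = 0.0
    for _ in range(n_iter):
        y = rng.betavariate(x + alpha, n - x + beta)
        n = x + sample_poisson(lam * (1 - y), rng)
        x = sum(1 for _ in range(n) if rng.random() < y)  # Binomial(n, y)
        total += x
    return total / n_iter

est = gibbs_mean_x(50000)
```

The running average of the sampled $x$-values should be close to the exact value $E(X) = 10$.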
The simulated probability histogram for the marginal density of $X$ is omitted here. Note: The values in the simulated probability histogram can be compared with the exact values of the probabilities. It can be analytically verified that the marginal distribution of $X$ is given by
$$P(X = x) = 72(x+1)\sum_{n=x}^{\infty} \frac{e^{-50}\, 50^n}{(n+9)!}\,\frac{(n-x+7)!}{(n-x)!}, \qquad x = 0, 1, \ldots.$$
The exact values of $P(X = x)$ are 0.0220, 0.0500, 0.0646, 0.0541, 0.0332, 0.0167, 0.0071, 0.0026, and 0.0008 for $x = 0, 2, 5, 10, 15, 20, 25, 30$, and 35.

Chapter 11

11.1 Let state $i$ mean that $i$ units are in working condition, where $i = 0, 1, 2$. Let $X(t)$ be the state at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain. If an item has an exponentially distributed lifetime with failure rate $\alpha$, then an item of age $t$ will fail in the next $\Delta t$ time units with probability $\alpha\Delta t + o(\Delta t)$ as $\Delta t \to 0$. Thus, for $\Delta t \to 0$,
$$P\big(X(t+\Delta t) = j \mid X(t) = i\big) = \begin{cases} \mu\Delta t + o(\Delta t), & i = 0,\ j = 1,\\ \lambda\Delta t(1 - \mu\Delta t) + o(\Delta t), & i = 1,\ j = 0,\\ \mu\Delta t(1 - \lambda\Delta t), & i = 1,\ j = 2,\\ (\lambda + \eta)\Delta t + o(\Delta t), & i = 2,\ j = 1, \end{cases}$$
where $P(X(t+\Delta t) = 1 \mid X(t) = 2)$ uses the fact that the minimum of two independent exponentially distributed lifetimes is again exponentially distributed. Therefore the transition rates of the Markov chain are given by $q_{01} = \mu$, $q_{10} = \lambda$, $q_{12} = \mu$, and $q_{21} = \lambda + \eta$.

11.2 Let the random variable $X(t)$ be the number of cars present in the gasoline station at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with state space $I = \{0, 1, 2, 3, 4\}$. Noting that the probability of two or more state changes in a very small time interval of length $\Delta t$ is $o(\Delta t)$ as $\Delta t \to 0$, we have
$$P\big(X(t+\Delta t) = i+1 \mid X(t) = i\big) = \lambda\Delta t + o(\Delta t) \ \text{ for } 0 \le i \le 3,$$
$$P\big(X(t+\Delta t) = i-1 \mid X(t) = i\big) = \mu\Delta t + o(\Delta t) \ \text{ for } 1 \le i \le 4.$$
Therefore the transition rates of the Markov chain are given by $q_{i,i+1} = \lambda$ for $0 \le i \le 3$ and $q_{i,i-1} = \mu$ for $1 \le i \le 4$. The other $q_{ij}$ are zero.
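The transition rates of Problem 11.2 can be collected into a generator matrix whose rows sum to zero; a small sketch (the helper name and the illustrative rates are mine):

```python
# Generator matrix of the birth-death chain in Problem 11.2:
# q_{i,i+1} = lam for 0 <= i <= 3 and q_{i,i-1} = mu for 1 <= i <= 4,
# with diagonal entries chosen so that each row sums to zero.
def generator(lam, mu, N=4):
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N):
        Q[i][i + 1] = lam
    for i in range(1, N + 1):
        Q[i][i - 1] = mu
    for i in range(N + 1):
        Q[i][i] = -sum(Q[i][j] for j in range(N + 1) if j != i)
    return Q

Q = generator(lam=1 / 6, mu=1 / 4)   # illustrative rates only
```

Zero row sums and nonnegative off-diagonal entries are the defining properties of a generator, so they make a convenient sanity check for any transition-rate bookkeeping in this chapter.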
11.3 Let $X(t)$ denote the number of customers in the barbershop at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with state space $I = \{0, 1, \ldots, 7\}$. For $0 \le i \le 6$, we have $P(X(t+\Delta t) = i+1 \mid X(t) = i) = \lambda\Delta t \times (1 - b(i)) + o(\Delta t)$. Thus $q_{i,i+1} = \lambda(1 - b(i))$ for $0 \le i \le 6$. Also, $q_{10} = \mu$, $q_{i,i-1} = 2\mu$ for $2 \le i \le 7$, and the other $q_{ij} = 0$.

11.4 Let the random variable $X(t)$ be equal to $i$ if the ferry is at point $A$ at time $t$ and $i$ cars are on the ferry for $i = 0, 1, \ldots, 6$, be equal to 7 if the ferry is traveling to point $B$ at time $t$, and be equal to 8 if the ferry is traveling to point $A$ at time $t$. The stochastic process $\{X(t), t \ge 0\}$ is a continuous-time Markov chain. Its transition rates $q_{ij}$ are given by $q_{i,i+1} = \lambda$ for $i = 0, 1, \ldots, 6$, $q_{78} = \mu_1$, $q_{80} = \mu_2$, and the other $q_{ij} = 0$.

11.5 Let state $(i, 0)$ mean that $i$ passengers are waiting at the stand and no sheroot is present ($0 \le i \le 7$), and let state $(i, 1)$ mean that $i$ passengers are waiting at the stand and a sheroot is present ($0 \le i \le 6$). Let $X(t)$ be the state at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with transition rates $q_{(i,0),(i+1,0)} = \lambda$ and $q_{(i,0),(i,1)} = \mu$ for $0 \le i \le 6$, $q_{(7,0),(0,0)} = \mu$, $q_{(i,1),(i+1,1)} = \lambda$ for $0 \le i \le 5$, and $q_{(6,1),(0,0)} = \lambda$.

11.6 For any $t \ge 0$, define the random variable $X_1(t)$ as the number of ships present at time $t$. Let $X_2(t)$ be equal to 1 if the unloader is available at time $t$ and be equal to 0 otherwise. The process $\{(X_1(t), X_2(t))\}$ is a continuous-time Markov chain with state space $I = \{(i, 0) \mid i = 1, \ldots, 4\} \cup \{(i, 1) \mid i = 0, 1, \ldots, 4\}$. Noting that the probability of two or more state transitions in a very small time interval of length $\Delta t$ is $o(\Delta t)$ as $\Delta t \to 0$, we get the transition rates $q_{(i,0)(i,1)} = \beta$ for $1 \le i \le 4$, $q_{(i,0)(i+1,0)} = \lambda$ for $1 \le i \le 3$, $q_{(i,1)(i+1,1)} = \lambda$ for $0 \le i \le 3$, $q_{(i,1)(i,0)} = \delta$, and $q_{(i,1)(i-1,1)} = \mu$ for $1 \le i \le 4$.

11.7 Let $X_1(t)$ be the number of messages in the system at time $t$.
Also, let $X_2(t)$ be 1 if the gate is not closed at time $t$ and be 0 otherwise. The process $\{(X_1(t), X_2(t))\}$ is a continuous-time Markov chain with state space $I = \{(0,1), \ldots, (R-1,1)\} \cup \{(r+1,0), \ldots, (R,0)\}$. The transition rates are $q_{(i,1),(i+1,1)} = \lambda$ for $0 \le i \le R-2$, $q_{(R-1,1),(R,0)} = \lambda$, $q_{(i,1),(i-1,1)} = \mu$ for $1 \le i \le R-1$, $q_{(i,0),(i-1,0)} = \mu$ for $r+2 \le i \le R$, and $q_{(r+1,0),(r,1)} = \mu$.

11.8 For $i = 1, 2$, let the random variable $X_i(t)$ be equal to 0 if station $i$ is free and be equal to 1 if station $i$ is occupied. The process $\{(X_1(t), X_2(t))\}$ is a continuous-time Markov chain. The transition rates are $q_{(0,0)(1,0)} = q_{(1,0)(1,1)} = q_{(0,1)(1,1)} = \lambda$, $q_{(1,0)(0,0)} = q_{(1,1)(0,1)} = \sigma_1\mu$, $q_{(0,1)(0,0)} = q_{(1,1)(1,0)} = \sigma_2\mu$, and the other $q_{st} = 0$.

11.9 Define the state as the number of machines in repair. In the states $i = 0$ and 1 there are 10 machines in operation and $2 - i$ machines on standby, while in state $i \ge 2$ there are $12 - i$ machines in operation and no machines on standby. Let $X(t)$ be the state at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with state space $I = \{0, 1, \ldots, 12\}$. The transition rates are $q_{i,i+1} = 10\lambda$ for $i = 0, 1$, $q_{i,i+1} = (12-i)\lambda$ for $2 \le i \le 11$, and $q_{i,i-1} = i\mu$ for $1 \le i \le 12$.

11.10 The expected amount of time that the process is in state $j$ during $(0, T)$ when the initial state is $i$ is given by
$$E\Big(\int_0^T I(t)\, dt \,\Big|\, X(0) = i\Big) = \int_0^T E\big(I(t) \mid X(0) = i\big)\, dt = \int_0^T P\big(I(t) = 1 \mid X(0) = i\big)\, dt = \int_0^T p_{ij}(t)\, dt.$$

11.11 Using Problem 11.10 and Example 11.1 (continued) in Section 11.2, the answer is
$$\int_0^T \Big[\frac{\mu_1}{\mu_1+\mu_2} - \frac{\mu_1}{\mu_1+\mu_2}\, e^{-(\mu_1+\mu_2)t}\Big] dt = \frac{\mu_1 T}{\mu_1+\mu_2} - \frac{\mu_1}{(\mu_1+\mu_2)^2}\big(1 - e^{-(\mu_1+\mu_2)T}\big).$$

11.12 It is a matter of straightforward algebra to verify the result by substitution into the Kolmogorov forward differential equations.

11.13 (a) Let $N(t)$ be the number of state changes in $(0, t)$. Then
$$P\big(N(T) = 1 \mid X(0) = a, X(T) = b\big) = \frac{P\big(X(T) = b, N(T) = 1 \mid X(0) = a\big)}{P\big(X(T) = b \mid X(0) = a\big)}.$$
Obviously, $P(X(T) = b \mid X(0) = a) = p_{ab}(T)$. Let $\nu_i = \sum_{j \ne i} q_{ij}$ and $p_{ij} = q_{ij}/\nu_i$. Then, by the law of conditional probability,
$$P\big(X(T) = b, N(T) = 1 \mid X(0) = a\big) = \int_0^T \nu_a e^{-\nu_a t}\, p_{ab}\, e^{-\nu_b(T-t)}\, dt.$$
Thus the sought probability is
$$\frac{q_{ab}\big(e^{-\nu_b T} - e^{-\nu_a T}\big)}{(\nu_a - \nu_b)\, p_{ab}(T)} \ \text{ if } \nu_a \ne \nu_b \qquad\text{and}\qquad \frac{q_{ab}\, T e^{-\nu_a T}}{p_{ab}(T)} \ \text{ if } \nu_a = \nu_b.$$

(b) For fixed $j$, let $I(t) = 1$ if $X(t) = j$ and $I(t) = 0$ otherwise. Then
$$P\big(I(t) = 1 \mid X(0) = a, X(T) = b\big) = \frac{P\big(I(t) = 1, X(T) = b \mid X(0) = a\big)}{P\big(X(T) = b \mid X(0) = a\big)} = \frac{p_{aj}(t)\, p_{jb}(T-t)}{p_{ab}(T)}.$$
Thus the desired expected value is
$$E\Big(\int_0^T I(t)\, dt \,\Big|\, X(0) = a, X(T) = b\Big) = \frac{1}{p_{ab}(T)}\int_0^T p_{aj}(t)\, p_{jb}(T-t)\, dt.$$

11.14 Using the result of Problem 11.10, the expected amount of time that the process is in state $j$ during $(0, t)$ when the starting state is 1 is given by $\int_0^t p_{1j}(s)\, ds$ for $j = 1, 2$. Therefore the expected number of Poisson events during $(0, t)$ when the initial state is 1 is
$$\lambda_1\int_0^t p_{11}(s)\, ds + \lambda_2\int_0^t p_{12}(s)\, ds.$$
By the result of Example 11.1 (continued), we have for any $s > 0$ that
$$p_{11}(s) = \frac{\beta}{\alpha+\beta} + \frac{\alpha}{\alpha+\beta}\, e^{-(\alpha+\beta)s}, \qquad p_{12}(s) = \frac{\alpha}{\alpha+\beta} - \frac{\alpha}{\alpha+\beta}\, e^{-(\alpha+\beta)s}.$$
Next we find after some algebra that the expected number of Poisson events during $(0, t)$ when the initial state is 1 is given by
$$\frac{\lambda_1\beta + \lambda_2\alpha}{\alpha+\beta}\, t + \frac{\alpha(\lambda_1 - \lambda_2)}{(\alpha+\beta)^2}\big(1 - e^{-(\alpha+\beta)t}\big).$$

11.15 The number of messages in the buffer is described by a continuous-time Markov chain with state space $I = \{0, 1, \ldots, 10\}$ and transition rates $q_{i,i-1} = i\mu$ for $1 \le i \le 10$ and $q_{i,i+1} = \lambda$ for $0 \le i \le 9$. Using the alternative construction of a continuous-time Markov chain,
$$f_i = \frac{i\mu}{\lambda + i\mu}\, f_{i-1} + \frac{\lambda}{\lambda + i\mu}\, f_{i+1} \ \text{ for } 1 \le i \le 9,$$
where $f_0 = 0$ and $f_{10} = 1$.

11.16 By the alternative construction of a continuous-time Markov chain, it takes an exponentially distributed time with expected value $\frac{1}{\nu_i}$ to make a transition from state $i$, and such a transition will be to state $j$ with probability $p_{ij} = \frac{q_{ij}}{\nu_i}$.
By the law of conditional expectation,
$$\mu_i = \frac{1}{\nu_i} + \sum_{j \ne i} \mu_j p_{ij} = \frac{1}{\nu_i} + \sum_{j \ne i} \mu_j\,\frac{q_{ij}}{\nu_i} \qquad \text{for } i \ne r.$$

11.17 (a) Let $N(t)$ be the number of state transitions in $(0, t)$ and $X_n$ be the state after the $n$th state transition. The random variable $N(t)$ is Poisson distributed with expected value $\nu t$ and the embedded process $\{X_n\}$ is a discrete-time Markov chain. By the law of conditional probability, $P(J(t) = j \mid J(0) = i)$ is given by
$$\sum_{n=0}^{\infty} P\big(J(t) = j \mid J(0) = i, N(t) = n\big)\, P\big(N(t) = n\big) = \sum_{n=0}^{\infty} P(X_n = j \mid X_0 = i)\, e^{-\nu t}\frac{(\nu t)^n}{n!},$$
which gives the desired result.
(b) The process $\{J(t)\}$ is a continuous-time Markov chain. Since a finite-state continuous-time Markov chain is uniquely determined by its transition rates, it suffices to verify that $\{J(t)\}$ has the $q_{ij}$ as its transition rates. Using the expression for $p_{ij}(t) = P(J(t) = j \mid J(0) = i)$ in part (a) and noting that $r_{ij}^{(0)} = 0$ for $j \ne i$, it follows from
$$\lim_{t \to 0} \frac{p_{ij}(t)}{t} = \lim_{t \to 0} \frac{1}{t}\sum_{n=0}^{\infty} r_{ij}^{(n)}\, e^{-\nu t}\frac{(\nu t)^n}{n!}$$
that $\lim_{t\to 0} p_{ij}(t)/t = r_{ij}\nu = q_{ij}$ for any $j \ne i$, as was to be proved.

11.18 (a) Consider first the case of a single standby unit. In the solution of Problem 11.1, the state is defined as the number of units in working condition and the transition rates are given. To find $E(T)$, define $\mu_i$ as the expected amount of time until the first visit to state 0 starting from state $i$ for $i = 1, 2$. Then $E(T) = \mu_2$. The $\mu_i$ can be found by solving two linear equations, see Problem 11.1. Noting that the rate $\nu_1$ out of state 1 and the rate $\nu_2$ out of state 2 are given by $\nu_1 = \lambda + \mu$ and $\nu_2 = \lambda + \eta$, we get
$$\mu_1 = \frac{1}{\lambda+\mu} + \frac{\mu}{\lambda+\mu}\,\mu_2 + \frac{\lambda}{\lambda+\mu}\times 0 \qquad\text{and}\qquad \mu_2 = \frac{1}{\lambda+\eta} + \mu_1.$$
For the numerical data $\lambda = 1$, $\eta = 0$, and $\mu = 50$, the solution of the two equations is $\mu_1 = 51$ and $\mu_2 = 52$. Hence the expected time until the system goes down for the first time is $E(T) = 52$.
To find $P(T > t)$, the two linear differential equations
$$Q_1'(t) = -(\lambda+\mu)Q_1(t) + \mu Q_2(t), \qquad Q_2'(t) = -(\lambda+\eta)Q_2(t) + (\lambda+\eta)Q_1(t)$$
are solved by using a numerical code. The probability $P(T > t)$ is given by $Q_2(t)$. This probability has the numerical values 0.8253, 0.6184, 0.3823, 0.1461, and 0.0213 for $t = 10$, 25, 50, 100, and 200.

(b) For the case of two standby units, use a continuous-time Markov chain with the four states $i = 0, 1, 2$, and 3. State $i$ means that the number of units in working condition is $i$. The transition rates are $q_{01} = \mu$, $q_{10} = \lambda$, $q_{12} = \mu$, $q_{21} = \lambda + \eta$, $q_{23} = \mu$, $q_{32} = \lambda + 2\eta$. The other $q_{ij} = 0$. The rate $\nu_i$ at which the process leaves state $i$ is $\nu_1 = \mu + \lambda$, $\nu_2 = \mu + \lambda + \eta$, and $\nu_3 = \lambda + 2\eta$. This leads to the following linear equations for the $\mu_i$:
$$\mu_1 = \frac{1}{\mu+\lambda} + \frac{\mu}{\mu+\lambda}\,\mu_2,$$
$$\mu_2 = \frac{1}{\mu+\lambda+\eta} + \frac{\lambda+\eta}{\mu+\lambda+\eta}\,\mu_1 + \frac{\mu}{\mu+\lambda+\eta}\,\mu_3,$$
$$\mu_3 = \frac{1}{\lambda+2\eta} + \mu_2.$$
The solution of these three linear equations leads to $E(T) = \mu_3 = 2{,}603$. To find $P(T > t)$, the three linear differential equations
$$Q_1'(t) = -(\mu+\lambda)Q_1(t) + \mu Q_2(t),$$
$$Q_2'(t) = -(\mu+\lambda+\eta)Q_2(t) + (\lambda+\eta)Q_1(t) + \mu Q_3(t),$$
$$Q_3'(t) = -(\lambda+2\eta)Q_3(t) + (\lambda+2\eta)Q_2(t)$$
are solved by using a numerical code. The probability $P(T > t)$ is given by $Q_3(t)$. This probability has the numerical values 0.9962, 0.9905, 0.9810, 0.9623, and 0.9261 for $t = 10$, 25, 50, 100, and 200.
Note: In both cases the occurrence of a system failure is a rare event, because $\lambda \ll \mu$. This implies that the time until the first system failure is approximately exponentially distributed (see Section 4.5). That is, $P(T > t) \approx e^{-t/E(T)}$ for $t > 0$. For case (a), $e^{-t/E(T)}$ gives the approximate values 0.8251, 0.6183, 0.3823, 0.1462, and 0.0214 for $t = 10$, 25, 50, 100, and 200. Indeed this is an excellent approximation for $P(T > t)$. It is also interesting to compare the solution for case (a) with the solution of Example 4.9 in which the replacement time is a constant rather than an exponentially distributed random variable.
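The linear equations for the $\mu_i$ in both cases can be solved with a few lines of code; a sketch using a small Gaussian-elimination helper (the helper itself is mine, not part of the original solution):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

lam, eta, mu = 1.0, 0.0, 50.0

# Case (a): mu1 = 1/(lam+mu) + (mu/(lam+mu)) mu2,  mu2 = 1/(lam+eta) + mu1
A = [[1.0, -mu / (lam + mu)], [-1.0, 1.0]]
b = [1 / (lam + mu), 1 / (lam + eta)]
mu1, mu2 = solve(A, b)          # E(T) = mu2 = 52

# Case (b): three equations in mu1, mu2, mu3 with E(T) = mu3
nu1, nu2, nu3 = mu + lam, mu + lam + eta, lam + 2 * eta
A = [[1.0, -mu / nu1, 0.0],
     [-(lam + eta) / nu2, 1.0, -mu / nu2],
     [0.0, -1.0, 1.0]]
b = [1 / nu1, 1 / nu2, 1 / nu3]
m1, m2, m3 = solve(A, b)        # E(T) = m3 = 2603
```

For the data $\lambda = 1$, $\eta = 0$, $\mu = 50$ this reproduces $E(T) = 52$ for one standby unit and $E(T) = 2{,}603$ for two standby units.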
For case (b), $e^{-t/E(T)}$ gives the approximate values 0.9962, 0.9904, 0.9810, 0.9623, and 0.9260 for $t = 10$, 25, 50, 100, and 200.

11.19 This problem can be analyzed by a continuous-time Markov chain with nine states. The states and the transition rates are given in the transition rate diagram (omitted here). The sleep mode requires two states, because it takes an exponentially distributed time until the system is converted into the sleep mode and another gyroscope may fail during this time. The state labeled as the crash state is taken as an absorbing state. Using a numerical code for linear differential equations, the values 0.000504 and 0.3901 are found for the sought probabilities.

11.20 Let state $i$ mean that the drunkard is $i$ steps away from his home. Denote by $X(t)$ the state at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with state space $I = \{0, 1, \ldots, 30\}$, where the states 0 and 30 are absorbing ($\nu_0 = \nu_{30} = 0$). Taking the minute as time unit, the other transition rates are $q_{i,i-1} = q_{i,i+1} = 1$ for $1 \le i \le 29$. Using the uniformization method from Problem 11.17, the sought probability $p_{21,0}(90)$ is calculated as 0.4558 (the probability $p_{21,30}(90) = 0.1332$). As a sanity check, we verified that $p_{21,0}(t)$ tends to $\frac{2}{3}$ as $t$ gets large, in agreement with the gambler's ruin formula (see also Problem 2.53). The probability $p_{21,0}(t)$ has the values 0.5899, 0.6659, and 0.6665 for $t = 180$, 600, and 720.

11.21 By "rate out of state $i$ = rate into state $i$", we get the balance equations $\mu p_0 = \lambda p_1$, $(\lambda + \mu)p_1 = \mu p_0 + (\lambda + \eta)p_2$, and $(\lambda + \eta)p_2 = \mu p_1$. Together with the normalization equation $p_0 + p_1 + p_2 = 1$, these equations can be solved (one of the balance equations can be omitted). The long-run fraction of time the system is down is given by
$$p_0 = \frac{\lambda}{\lambda + \mu + \mu^2/(\lambda+\eta)}.$$
The probability $p_0 = 0.0129$ when $\lambda = 0.1$, $\eta = 0.05$, and $\mu = 1$.
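The closed-form expression for $p_0$ in Problem 11.21 can be cross-checked against a direct solution of the balance equations; a short sketch:

```python
# Problem 11.21: closed-form p0 versus solving the balance equations
# mu*p0 = lam*p1 and (lam + eta)*p2 = mu*p1 by substitution.
lam, eta, mu = 0.1, 0.05, 1.0

p0_formula = lam / (lam + mu + mu * mu / (lam + eta))

p0, p1 = 1.0, mu / lam                 # unnormalized, from mu*p0 = lam*p1
p2 = mu * p1 / (lam + eta)             # from (lam + eta)*p2 = mu*p1
total = p0 + p1 + p2
p0, p1, p2 = p0 / total, p1 / total, p2 / total
```

Both routes give $p_0 \approx 0.0129$ for the stated data.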
11.22 Using the transition rates given in the solution of Problem 11.2 and equating the rate out of state $i$ to the rate into state $i$, we get the balance equations $\lambda p_0 = \mu p_1$, $(\lambda + \mu)p_i = \lambda p_{i-1} + \mu p_{i+1}$ for $i = 1, 2, 3$, and $\mu p_4 = \lambda p_3$. Together with the normalization equation $p_0 + \cdots + p_4 = 1$, the equilibrium probabilities $p_i$ can be computed. It is noted that the balance equations can be recursively solved: they can be rewritten as $\mu p_i = \lambda p_{i-1}$ for $i = 1, \ldots, 4$, as can be seen by equating the rate out of the set $\{i, \ldots, 4\}$ to the rate into this set for $i = 1, \ldots, 4$. For $\lambda = \frac16$ and $\mu = \frac14$, the equilibrium probabilities are $p_0 = 0.3839$, $p_1 = 0.2559$, $p_2 = 0.1706$, $p_3 = 0.1137$, and $p_4 = 0.0758$. The long-run fraction of time the pump is occupied is equal to $p_1 + p_2 + p_3 + p_4 = 0.6162$. The average number of cars waiting in queue is
$$L_q = \sum_{i=1}^{4}(i-1)p_i = 0.6256.$$
By the property Poisson arrivals see time averages, the long-run fraction of cars that enter the station is equal to $1 - p_4$, being the long-run fraction of time that there is room for other cars in the station. Let $W_q$ be the average waiting time per car entering the station. To find $W_q$, note that the conditional probability of a car finding upon arrival $j$ other cars present, given that the car can enter the station, is $\frac{p_j}{1-p_4}$ for $j = 0, 1, 2$, and 3. Therefore
$$W_q = \sum_{j=1}^{3}\frac{j}{\mu}\,\frac{p_j}{1-p_4} = 4.0615.$$
Note: Letting the rejection probability $P_{rej} = p_4$ (the long-run fraction of cars that cannot enter the station), note that $L_q = \lambda(1 - P_{rej})W_q$. This relation is generally valid for finite-capacity queueing systems and is known as Little's formula.

11.23 There are five states. Let state $(0,0)$ mean that both stations are free, state $(0,1)$ that station 1 is free and station 2 is busy, state $(1,0)$ that station 1 is busy and station 2 is free, state $(1,1)$ that both stations are busy, and state $(b,1)$ that station 1 is blocked and station 2 is busy.
Using the transition rate diagram (omitted here) and equating the rate out of state $(i,j)$ to the rate into state $(i,j)$, we get the balance equations
$$\lambda p(0,0) = \mu_2 p(0,1), \qquad (\mu_2 + \lambda)p(0,1) = \mu_1 p(1,0) + \mu_2 p(b,1), \qquad \mu_1 p(1,0) = \lambda p(0,0) + \mu_2 p(1,1),$$
$$(\mu_1 + \mu_2)p(1,1) = \lambda p(0,1), \qquad \mu_2 p(b,1) = \mu_1 p(1,1).$$
The long-run fraction of time station 1 is blocked is equal to $p(b,1)$. The long-run fraction of items that are rejected is equal to $p(1,0) + p(1,1) + p(b,1)$, by the property Poisson arrivals see time averages.

11.24 Let state $(i,j)$ mean that $i$ customers for gas and $j$ customers for LPG are at the station. Denote by $X(t)$ the state at time $t$. Then $\{X(t)\}$ is a continuous-time Markov chain with the thirteen states $(0,0)$, $(1,0)$, $(2,0)$, $(3,0)$, $(0,1)$, $(1,1)$, $(2,1)$, $(3,1)$, $(0,2)$, $(1,2)$, $(2,2)$, $(0,3)$, and $(1,3)$. Take the minute as time unit and let $\lambda_1 = \frac13$, $\lambda_2 = \frac{1}{12}$, $\mu_1 = \frac12$, and $\mu_2 = \frac13$. The transition rates are easily expressed in the $\lambda_i$ and the $\mu_i$. Denote by $p(i,j)$ the limiting probability of state $(i,j)$. It is helpful to draw a transition diagram in order to get the balance equations
$$(\lambda_1 + \lambda_2)p(0,0) = \mu_1 p(1,0) + \mu_2 p(0,1),$$
$$(\lambda_1 + \lambda_2 + \mu_1)p(i,0) = \mu_1 p(i+1,0) + \mu_2 p(i,1) + \lambda_1 p(i-1,0) \quad \text{for } i = 1, 2, 3,$$
$$(\lambda_1 + \lambda_2 + \mu_2)p(0,j) = \mu_1 p(1,j) + \mu_2 p(0,j+1) + \lambda_2 p(0,j-1) \quad \text{for } j = 1, 2, 3,$$
$$(\lambda_1 + \lambda_2 + \mu_1 + \mu_2)p(1,j) = \mu_1 p(2,j) + \mu_2 p(1,j+1) + \lambda_1 p(0,j) + \lambda_2 p(1,j-1) \quad \text{for } j = 1, 2, 3,$$
$$(\lambda_1 + \lambda_2 + \mu_1 + \mu_2)p(2,j) = \mu_1 p(3,j) + \mu_2 p(2,j+1) + \lambda_1 p(1,j) + \lambda_2 p(2,j-1) \quad \text{for } j = 1, 2,$$
$$(\mu_1 + \mu_2)p(3,1) = \lambda_1 p(2,1) + \lambda_2 p(3,0),$$
where $p(4,0) = p(0,4) = p(1,4) = p(2,3) = p(3,2) = 0$. Denote by $p_k$ the limiting probability of having a total of $k$ cars in the station. Then $p_0 = p(0,0)$, $p_1 = p(1,0) + p(0,1)$, $p_2 = p(2,0) + p(0,2) + p(1,1)$, $p_3 = p(3,0) + p(2,1) + p(1,2) + p(0,3)$, and $p_4 = p(3,1) + p(2,2) + p(1,3)$. Also, the long-run average number of cars served per unit time is
$$\mu_1\Big[\sum_{i=1}^{3}\big(p(i,0) + p(i,1)\big) + p(1,2) + p(2,2) + p(1,3)\Big] + \mu_2\Big[\sum_{j=1}^{3}\big(p(0,j) + p(1,j)\big) + p(2,1) + p(2,2) + p(3,1)\Big].$$
The long-run fraction of LPG-cars that cannot enter is $p(3,1) + p(2,2) + p(1,3) + p(0,3)$, by the property Poisson arrivals see time averages. For the numerical data $\lambda_1 = \frac13$, $\lambda_2 = \frac{1}{12}$, $\mu_1 = \frac12$, and $\mu_2 = \frac13$, the limiting probabilities $p_j$ are $p_0 = 0.3157$, $p_1 = 0.2894$, $p_2 = 0.2127$, $p_3 = 0.1467$, and $p_4 = 0.0354$. Also, the long-run fraction of LPG-cars that cannot enter is 0.0404 (and the long-run fraction of gas-cars that cannot enter is 0.1290). The long-run average number of cars served per hour is $60 \times (0.2904 + 0.0800) = 22.2$.

11.25 Let state $(i,0)$ mean that $i$ passengers are waiting at the stand and no sheroot is present, and let state $(i,1)$ mean that $i$ passengers are waiting at the stand and a sheroot is present. Using the transition rate diagram (omitted here) and equating the rate out of state $(i,j)$ to the rate into state $(i,j)$, we get the balance equations
$$(\lambda + \mu)p(0,0) = \mu p(7,0) + \lambda p(6,1), \qquad (\lambda + \mu)p(i,0) = \lambda p(i-1,0) \ \text{ for } 1 \le i \le 6, \qquad \mu p(7,0) = \lambda p(6,0),$$
$$\lambda p(0,1) = \mu p(0,0), \qquad \lambda p(i,1) = \mu p(i,0) + \lambda p(i-1,1) \ \text{ for } 1 \le i \le 5.$$
The long-run average number of waiting passengers is $\sum_{i=1}^{6} i\,[p(i,0) + p(i,1)] + 7p(7,0)$. The long-run fraction of potential passengers who go elsewhere is $p(7,0)$.

11.26 Let the random variable $X(t)$ be the number of units in stock at time $t$. The process $\{X(t)\}$ is a continuous-time Markov chain with state space $I = \{1, 2, \ldots, Q\}$ (note that state 0 should not be included: if only one unit is in stock and a demand occurs or the unit deteriorates, the stock is immediately replenished to $Q$). The transition rates are given by $q_{1Q} = \lambda + \mu$ and $q_{i,i-1} = \lambda + i\mu$ for $2 \le i \le Q$. The other $q_{ij}$ are zero. In the figure the transition rate diagram is given.
This diagram is helpful in writing down the balance equations. By equating the rate out of state $i$ to the rate into state $i$, we get
$$(\lambda + i\mu)p_i = \big[\lambda + (i+1)\mu\big]p_{i+1} \ \text{ for } 1 \le i \le Q-1, \qquad (\lambda + Q\mu)p_Q = (\lambda + \mu)p_1.$$
The equilibrium probabilities can be recursively computed. Starting with $p_1 = 1$, the numbers $p_2, \ldots, p_Q$ are recursively obtained. Next the desired $p_i$ follow by normalization. The long-run average stock equals $\sum_{i=1}^{Q} i p_i$. The long-run average number of replenishment orders placed per unit time equals the long-run average number of transitions from state 1 to state $Q$ per unit time and thus is equal to $(\lambda + \mu)p_1$.

11.27 The state of the system is defined as $(i, k)$ when $i \ge 1$ cars are in the gasoline station and the service time of the car in service is in phase $k$ for $k = 1, 2$. State 0 means that the gasoline station is empty. Then the state of the system is described by a continuous-time Markov chain. The arrival rate of cars is $\lambda = \frac16$ and the rate at which a service phase is completed is $\beta = \frac12$. Denoting the limiting probability of state $s$ by $p(s)$, the balance equations are
$$(\lambda + \beta)p(i,1) = \lambda p(i-1,1) + \beta p(i+1,2) \ \text{ for } 1 \le i \le 3,$$
$$(\lambda + \beta)p(i,2) = \lambda p(i-1,2) + \beta p(i,1) \ \text{ for } 1 \le i \le 3,$$
$$\lambda p(0) = \beta p(1,2), \qquad \beta p(4,1) = \lambda p(3,1), \qquad \beta p(4,2) = \lambda p(3,2) + \beta p(4,1),$$
where $p(0,1) = p(0)$ and $p(0,2) = 0$. The numerical solution of the balance equations leads to the following answers:
(a) The fraction of time the pump is occupied is $1 - p(0) = 0.6138$.
(b) The average number of cars waiting in queue is $L_q = \sum_{i=1}^{4}(i-1)\big[p(i,1) + p(i,2)\big] = 0.5603$.
(c) The fraction of cars not entering the station is $P_{rej} = p(4,1) + p(4,2) = 0.0529$, by the property Poisson arrivals see time averages.
(d) The average waiting time in queue of a car entering the station is
$$W_q = \sum_{i=1}^{3} i \times \frac{2}{\beta}\,\frac{p(i,1)}{1 - P_{rej}} + \sum_{i=1}^{3}\Big[(i-1)\times\frac{2}{\beta} + \frac{1}{\beta}\Big]\frac{p(i,2)}{1 - P_{rej}} = 3.5494$$
minutes, using the fact that the conditional probability that a car entering the station finds upon arrival state $(i,k)$ is $p(i,k)/(1 - P_{rej})$. Note: As a sanity check, $L_q = \lambda(1 - P_{rej})W_q$. This relation is generally valid for finite-capacity queues and is known as Little's formula.

11.28 The Erlang-distributed repair time with shape parameter 3 and scale parameter $\beta$ can be seen as the sum of three independent phases each having an exponential distribution with mean $\frac{1}{\beta}$. The system is said to be in state $(0, k)$ if both units have failed and the repair phase of the unit in repair is phase $k$ for $k = 1, 2$, and 3, and the system is said to be in state $(1, k)$ if one unit is working and the repair phase of the unit in repair is phase $k$ for $k = 1, 2$, and 3. The state of the system is 2 if both units are in working condition. Let the random variable $X(t)$ be the state of the system at time $t$. Then the process $\{X(t)\}$ is a continuous-time Markov chain. The transition rates are $q_{2,(1,1)} = \lambda + \eta$, $q_{(1,1)(1,2)} = \beta$, $q_{(1,1)(0,1)} = \lambda$, $q_{(1,2)(1,3)} = \beta$, $q_{(1,2)(0,2)} = \lambda$, $q_{(1,3)2} = \beta$, $q_{(1,3)(0,3)} = \lambda$, and $q_{(0,1)(0,2)} = q_{(0,2)(0,3)} = q_{(0,3)(1,1)} = \beta$. Denoting the limiting probability of state $s$ by $p(s)$, we have the balance equations
$$(\lambda + \eta)p(2) = \beta p(1,3), \qquad (\lambda + \beta)p(1,1) = (\lambda + \eta)p(2) + \beta p(0,3),$$
$$(\lambda + \beta)p(1,2) = \beta p(1,1), \qquad (\lambda + \beta)p(1,3) = \beta p(1,2),$$
$$\beta p(0,k) = \lambda p(1,k) + \beta p(0,k-1) \ \text{ for } k = 1, 2, 3, \ \text{ where } p(0,0) = 0.$$
The long-run fraction of time the system is down is $p(0,1) + p(0,2) + p(0,3)$. By solving the equilibrium equations for $\lambda = 0.1$, $\eta = 0.05$, and $\beta = 3$, we find $p(0,1) = 1.527 \times 10^{-3}$, $p(0,2) = 3.005 \times 10^{-3}$, and $p(0,3) = 4.435 \times 10^{-3}$. Therefore the long-run fraction of time the system is down is equal to $8.97 \times 10^{-3}$. Also, the long-run fraction of time that a repair is going on is $1 - p(2) = 0.1420$.
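The balance equations of Problem 11.28 can be solved by substitution without any linear-algebra machinery, since each unnormalized probability follows from the previous ones; a short sketch:

```python
# Problem 11.28: Erlang-3 repair phases; solve the balance equations by
# substitution starting from p(1,1) = 1, then normalize.
lam, eta, beta = 0.1, 0.05, 3.0

p11 = 1.0
p12 = beta * p11 / (lam + beta)         # (lam+beta) p(1,2) = beta p(1,1)
p13 = beta * p12 / (lam + beta)         # (lam+beta) p(1,3) = beta p(1,2)
p2 = beta * p13 / (lam + eta)           # (lam+eta) p(2) = beta p(1,3)
p01 = lam * p11 / beta                  # beta p(0,1) = lam p(1,1)
p02 = (lam * p12 + beta * p01) / beta
p03 = (lam * p13 + beta * p02) / beta

total = p2 + p11 + p12 + p13 + p01 + p02 + p03
down = (p01 + p02 + p03) / total        # fraction of time the system is down
repair = 1 - p2 / total                 # fraction of time a repair is going on
```

This reproduces the quoted values $8.97 \times 10^{-3}$ and $0.1420$; the remaining balance equation serves as a redundancy check.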
11.29 (a) In each of the problems the continuous-time Markov chain has a state space of the form {0, 1, ..., N} and transition rates q_{jk} that are zero if |j − k| > 1. Equating the rate at which the system leaves the set of states {j, j + 1, ..., N} to the rate at which the system enters this set gives that

q_{j,j−1} p_j = q_{j−1,j} p_{j−1} for all j = 1, ..., N,

showing that the Markov chain is reversible.
(b) By the reversibility of the Markov process {X(t)}, p_j q_{jk} = p_k q_{kj} for all j, k ∈ A with j ≠ k. This gives that p̄_j q_{jk} = p̄_k q_{kj} for all j, k ∈ A with j ≠ k. Summing this equality over k ∈ A, we get

Σ_{k≠j} p̄_j q_{jk} = Σ_{k≠j} p̄_k q_{kj} for all j ∈ A.

In other words, the p̄_j satisfy the balance equations of the continuous-time Markov chain {X̄(t)} and the normalization equation Σ_{j∈A} p̄_j = 1. The solution of these equations is unique. This proves that the p̄_j are the limiting probabilities of the Markov process {X̄(t)}.

11.30 Let state 1 correspond to the perfect state, state 2 to the good state, and state 3 to the acceptable state. Let the random variable X(t) be the state at time t. The process {X(t)} is a continuous-time Markov chain with transition rates q_{12} = q_{13} = (1/2)µ_1, q_{23} = µ_2, q_{31} = µ_3, and q_{ij} = 0 otherwise. The equilibrium equations are µ_1 p_1 = µ_3 p_3, µ_2 p_2 = (1/2)µ_1 p_1, and µ_3 p_3 = (1/2)µ_1 p_1 + µ_2 p_2. This gives p_1 = (µ_3/µ_1) p_3 and p_2 = (µ_3/(2µ_2)) p_3. Thus [µ_3/µ_1 + µ_3/(2µ_2) + 1] p_3 = 1. It now follows that the average number of replacements per unit time is

µ_3 p_3 = µ_1 µ_2 µ_3 / (µ_2 µ_3 + µ_1 µ_2 + (1/2) µ_1 µ_3).

11.31 Let T_i be the sojourn time in state i. Then, P(T_i ≤ t + ∆t | T_i > t) = (µ_i ∆t)(1 − a_i) + o(∆t) for ∆t small, showing that T_i is exponentially distributed with mean 1/[(1 − a_i)µ_i]. Let X(t) = i if a type-i part is processed at time t. Then {X(t)} is a continuous-time Markov chain with state space I = {1, 2} and transition rates q_{12} = µ_1(1 − a_1) and q_{21} = µ_2(1 − a_2).
The limiting probabilities of a two-state Markov chain can be explicitly given (see also Example 11.1 (continued)). The probabilities are

p_1 = µ_2(1 − a_2) / [µ_1(1 − a_1) + µ_2(1 − a_2)] and p_2 = µ_1(1 − a_1) / [µ_1(1 − a_1) + µ_2(1 − a_2)].

The average number of type-i parts processed per unit time is µ_i p_i for i = 1, 2.

11.32 (a) The epochs at which a transition into state 0 occurs are regeneration epochs of the continuous-time Markov chain. Imagine that a reward at rate 1 is earned when the system is in state 0. Then, by the renewal-reward theorem from Section 9.4, the long-run average reward per unit time is (1/ν_0)/µ_00. The long-run average reward per unit time is nothing else than the long-run fraction of time the process is in state 0. Therefore

(1/ν_0)/µ_00 = p_0,

showing that µ_00 = 1/(ν_0 p_0).
(b) Conditioning upon the time the process leaves state 0 and using the law of conditional expectation, we get

E(T) = ∫_τ^∞ τ ν_0 e^{−ν_0 x} dx + ∫_0^τ [x + Σ_{j≠0} (q_{0j}/ν_0) µ_{j0} + E(T)] ν_0 e^{−ν_0 x} dx
     = τ e^{−ν_0 τ} + (1/ν_0)(1 − e^{−ν_0 τ} − ν_0 τ e^{−ν_0 τ}) + [1/(ν_0 p_0) − 1/ν_0 + E(T)](1 − e^{−ν_0 τ}),

where we use that Σ_{j≠0} (q_{0j}/ν_0) µ_{j0} = µ_00 − 1/ν_0 = 1/(ν_0 p_0) − 1/ν_0. This yields the desired expression for E(T) after some simplification.

11.33 Let state 0 mean that both stations are idle, state 1 that only station 1 is occupied, state 2 that only station 2 is occupied, and state 3 that both stations are occupied. Let X(t) be the state at time t. Then the process {X(t)} is a continuous-time Markov chain with transition rates q_{01} = λ, q_{10} = µ_1, q_{13} = λ, q_{20} = µ_2, q_{23} = λ, q_{31} = µ_2, and q_{32} = µ_1. The balance equations are

λ p_0 = µ_1 p_1 + µ_2 p_2, (λ + µ_1) p_1 = λ p_0 + µ_2 p_3,
(λ + µ_2) p_2 = µ_1 p_3, (µ_1 + µ_2) p_3 = λ p_1 + λ p_2,

where λ = 1, µ_1 = 4/3, and µ_2 = 4/5. The solution of the balance equations is p_0 = 0.4387, p_1 = 0.2494, p_2 = 0.1327, and p_3 = 0.1791. The long-run fraction of time both stations are occupied is p_3 = 0.1791. By the property Poisson arrivals see time averages, this probability also gives the long-run fraction of items that are rejected.
Note: The loss probability is 0.0467 in Problem 10.52, showing that the loss probability is very sensitive to the distributional form of the arrival process.

11.34 In the solution of Problem 11.9, the state is defined as the number of machines in repair and the transition rates are given. By equating the rate out of the set {i, i + 1, ..., 12} to the rate into this set, we get the following recursion scheme for the limiting probabilities:

iµ p_i = 10λ p_{i−1} for i = 1, 2,
iµ p_i = (12 − i + 1)λ p_{i−1} for i = 3, ..., 12.

Starting with p̄_0 = 1, we recursively compute p̄_1, p̄_2, ..., p̄_12. Next we calculate p_j as p̄_j / Σ_{k=0}^{12} p̄_k for j = 0, 1, ..., 12, using the fact that any solution to the balance equations is uniquely determined up to a multiplicative constant.

11.35 This problem can be solved through the Erlang loss model. The customers are the containers and the lots on the yard are the servers. A capacity for 18 containers is required. Then the loss probability is 0.0071. By the insensitivity property of the Erlang loss model, the answer is the same when the holding time of a customer has a uniform distribution with the same expected value of 10 hours.

11.36 This problem is another application of the Erlang loss model, and the solution is based on the insensitivity property of the Erlang loss model. Since the sum of two independent Poisson processes is again a Poisson process, cars for the parking place arrive according to a Poisson process with rate λ = 4 + 6 = 10 cars per hour. Also, any arriving car is a short-term parker with probability λ_1/(λ_1 + λ_2) = 4/10 and a long-term parker with probability λ_2/(λ_1 + λ_2) = 6/10, see Rule 5.4. Thus the expected parking time of a car is (4/10) × (2/3) + (6/10) × (3/2) = 35/30 hours. Hence the parameters of the Erlang loss model are given by λ = 10, µ = 30/35, and s = 10. Thus the limiting probability of an arriving car finding all parking places occupied is equal to

[(λ/µ)^s / s!] / [Σ_{k=0}^{s} (λ/µ)^k / k!] = 0.4275.
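Problems 11.35–11.37 all rest on the Erlang loss probability B(s, a) with offered load a = λ/µ. As a sketch (not part of the manual), it can be evaluated either from the closed-form expression above or with the numerically stable recursion B(j) = a B(j−1)/(j + a B(j−1)); the two must agree. The parameter values at the bottom are illustrative assumptions, not data from a specific problem:

```python
from math import factorial

def erlang_loss_direct(s, a):
    # B(s, a) = (a^s / s!) / sum_{k=0}^{s} a^k / k!, with a = lambda/mu
    return (a**s / factorial(s)) / sum(a**k / factorial(k) for k in range(s + 1))

def erlang_loss_recursive(s, a):
    # stable recursion: B(0) = 1, B(j) = a*B(j-1) / (j + a*B(j-1))
    B = 1.0
    for j in range(1, s + 1):
        B = a * B / (j + a * B)
    return B

s, a = 10, 8.0   # illustrative server count and offered load (assumed)
print(abs(erlang_loss_direct(s, a) - erlang_loss_recursive(s, a)) < 1e-12)
```

The recursion is preferable for large s, since it avoids the large factorials and powers of the direct formula.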
11.37 This inventory model is a special case of the Erlang loss model. Identify the number of outstanding orders with the number of busy servers. The limiting distribution of the stock on hand is given by

r_j = γ (λL)^{S−j} / (S − j)! for 0 ≤ j ≤ S,

where γ = 1 / Σ_{k=0}^{S} (λL)^k / k!. The average stock on hand is Σ_{j=0}^{S} j r_j. The fraction of lost demand is r_0.

11.38 Use the infinite-server queueing model from Example 11.5 to conclude that the limiting distribution of the number of outstanding replenishment orders is a Poisson distribution with expected value λL. Defining the net stock as the stock on hand minus the amount of backordered demand, the sum of the net stock and the number of outstanding replenishment orders is always S. This gives that the long-run average stock on hand is Σ_{k=0}^{S−1} (S − k) e^{−λL} (λL)^k / k!. The long-run fraction of demand that is backordered is equal to the long-run fraction of time that S or more orders are outstanding and is thus given by

Σ_{j=S}^{∞} e^{−λL} (λL)^j / j!.

11.39 The process describing the inventory position is a continuous-time Markov chain with state space I = {s + 1, ..., s + Q} and transition rates q_{i,i−1} = λ for s + 2 ≤ i ≤ s + Q and q_{s+1,s+Q} = λ. The limiting probabilities are p_i = 1/Q for s + 1 ≤ i ≤ s + Q. Let p_i(t) be the probability that the inventory position at time t is i and r_k(t) be the probability that the stock on hand at time t is k for 0 ≤ k ≤ s + Q. For any t > L, the stock on hand minus the amount backordered at time t + L equals the inventory position at time t minus the total demand in (t, t + L]. Then

r_k(t + L) = Σ_{i=k}^{s+Q} [e^{−λL} (λL)^{i−k} / (i − k)!] p_i(t) for 1 ≤ k ≤ s + Q

and r_0(t + L) = Σ_{i=s+1}^{s+Q} p_i(t) Σ_{l=i}^{∞} e^{−λL} (λL)^l / l!. Noting that lim_{t→∞} p_i(t) = 1/Q, the limiting distribution of the stock on hand follows.

11.40 Let the random variable X_1(t) be the number of tables occupied by one person and X_2(t) be the number of tables occupied by two persons.
The process {(X_1(t), X_2(t))} is a continuous-time Markov chain with state space {(i, j) : i + j ≤ 20, 0 ≤ i ≤ 16, and 0 ≤ j ≤ 20}. In state (i, j) there are 20 − i − j free tables. Take the hour as time unit and let λ_1 = 10, λ_2 = 12, µ_1 = 2, and µ_2 = 1.5. The transition rates are

q_{(i,j)(i+1,j)} = λ_1 for i + j ≤ 15, q_{(i,j)(i,j+1)} = λ_2 for i + j ≤ 19,
q_{(i,j)(i−1,j)} = iµ_1 for 1 ≤ i ≤ 16, and q_{(i,j)(i,j−1)} = jµ_2 for 1 ≤ j ≤ 20.

Denoting by p(i, j) the limiting probabilities, we have the balance equations

(λ_1 + λ_2 + iµ_1 + jµ_2) p(i, j) = λ_1 p(i − 1, j) + λ_2 p(i, j − 1) + (i + 1)µ_1 p(i + 1, j) + (j + 1)µ_2 p(i, j + 1) for 0 ≤ i + j ≤ 15,
(λ_2 + iµ_1 + jµ_2) p(i, j) = λ_1 p(i − 1, j) + λ_2 p(i, j − 1) + (i + 1)µ_1 p(i + 1, j) + (j + 1)µ_2 p(i, j + 1) for 16 ≤ i + j ≤ 20,

where p(i, j) = 0 for i > 16 and p(i, j) = 0 for i + j > 20. The performance measures are

the average number of occupied tables = Σ_{i=0}^{16} Σ_{j=0}^{20−i} (i + j) p(i, j),
the average number of singles served per hour = Σ_{i=1}^{16} Σ_{j=0}^{20−i} iµ_1 p(i, j),
the average number of pairs served per hour = Σ_{i=0}^{16} Σ_{j=1}^{20−i} jµ_2 p(i, j).

Note: By the property Poisson arrivals see time averages, the fraction of singles who cannot get a table is Σ_{i=0}^{16} Σ_{j=16−i}^{20} p(i, j) and the fraction of pairs who cannot get a table is Σ_{i=0}^{16} p(i, 20 − i).

11.41 Use the M/G/∞ model. The probability that there are more than 15 parts on the conveyor is

Σ_{k=16}^{∞} e^{−3×3} (3 × 3)^k / k! = 0.0220.

11.42 Define X(t) as the number of customers in the system at time t. The process {X(t), t ≥ 0} is a continuous-time Markov chain with state space I = {0, 1, ...}. In the figure below the transition rate diagram is given. Since the transition rate q_{ij} = 0 for j ≤ i − 2, the equilibrium probabilities p_j can be recursively computed. By equating the rate out of the set {i, i + 1, ...} to the rate into this set, we find

σ_1 µ p_i = λ p_{i−1} for 1 ≤ i ≤ R − 1,
σ_2 µ p_i = λ p_{i−1} for i ≥ R.

Starting with p̄_0 = 1, we can recursively compute p̄_1, p̄_2, ...
and next obtain the desired p_i's by normalization.

[Transition rate diagram for Problem 11.42 omitted.]

11.43 Define the state of the system as the number of taxis waiting at the stand. It is immediate from the transition rate diagram that the M/M/1 queueing model with arrival rate λ = 7/60 and service rate µ = 10/60 applies to the number of taxis present at the stand. The service requests in the model are the taxis arriving at a rate of λ = 7/60 per minute, and the service rate of µ = 10/60 per minute is the rate at which potential passengers come to the stand. The limiting probability of having no taxi present is p_0 = 1 − λ/µ = 0.3. Hence

the long-run average number of waiting taxis = (λ/µ)/(1 − λ/µ) = 7/3,
the long-run proportion of passengers who get a taxi = 1 − p_0 = 0.7,

where the last result uses the property Poisson arrivals see time averages.

11.44 Let the random variable X(t) be the number of customers present at time t. The process {X(t)} is a birth-and-death process with state space I = {0, 1, ...} and transition rates λ_i = λ/(i + 1) for i ≥ 0 and µ_i = µ for i ≥ 1. Note that the probability of going from state i to state i + 1 in a very small time interval of length h is λh × 1/(i + 1) + o(h). By Rule 11.5, we find

p_j = p_0 (λ/µ)^j / j! for j = 0, 1, ....

Using the normalization equation Σ_{j=0}^{∞} p_j = 1, it next follows that p_0 = e^{−λ/µ} and so

p_j = e^{−λ/µ} (λ/µ)^j / j! for j = 0, 1, ....

By the property Poisson arrivals see time averages, the long-run fraction of customers finding upon arrival j other customers present is p_j. Any customer seeing j other customers upon arrival enters the system with probability 1/(j + 1). Using the law of conditional probability, it now follows that the long-run fraction of arrivals that actually join the queue is

Σ_{j=0}^{∞} p_j/(j + 1) = (µ/λ) Σ_{k=1}^{∞} e^{−λ/µ} (λ/µ)^k / k! = (µ/λ)(1 − e^{−λ/µ}),

in agreement with the result that the long-run average number of customers served per unit time is µ(1 − p_0) = µ(1 − e^{−λ/µ}).
The long-run fraction of customers who go elsewhere is 1 − (µ/λ)(1 − e^{−λ/µ}).

11.45 This problem is an application of the infinite-server queueing model from Example 11.5. The steady-state probability that more than seven oil tankers are on the way to Rotterdam is insensitive to the shape of the sailing-time distribution and is given by the Poisson probability

Σ_{j=8}^{∞} e^{−4} 4^j / j! = 0.0511.

11.46 To solve this problem, a key observation is the following. If all s servers are busy, then, by the memoryless property of the exponential distribution, each of the s (remaining) service times is exponentially distributed with rate µ. The minimum of s independent random variables each having an exponential distribution with rate µ is exponentially distributed with rate sµ. As long as all s servers are working, the times between service completions are independent random variables each having an exponential distribution with rate sµ. In other words, service completions occur according to a Poisson process with rate sµ as long as s or more customers are present. Thus the conditional delay in queue of a customer finding j ≥ s other customers present upon arrival is the sum of j − s + 1 independent random variables each having an exponential distribution with expected value 1/(sµ) and thus has an Erlang(j − s + 1, sµ) distribution. The probability that an Erlang(j − s + 1, sµ) distributed random variable is larger than t is given by

Σ_{k=0}^{j−s} e^{−sµt} (sµt)^k / k! for t ≥ 0.

By the property Poisson arrivals see time averages, the steady-state probability that an arriving customer sees j other customers present is equal to p_j. By the law of conditional probability,

lim_{n→∞} P(W_n > t) = Σ_{j=s}^{∞} p_j Σ_{k=0}^{j−s} e^{−sµt} (sµt)^k / k! for t ≥ 0.

Using the formulas for p_j and P_delay, it is next a matter of some algebra to obtain

lim_{n→∞} P(W_n > t) = P_delay e^{−sµ(1−ρ)t} for t ≥ 0.
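The "matter of some algebra" in Problem 11.46 can also be verified numerically: evaluate the double sum directly (truncated, since p_j decays geometrically) and compare it with the closed form. A sketch, where the values of λ, µ, and s are illustrative assumptions rather than data from the problem:

```python
from math import exp, factorial

lam, mu, s = 4.0, 1.0, 5          # illustrative M/M/s parameters (assumed)
rho = lam / (s * mu)
a = lam / mu

# steady-state probabilities p_j of the M/M/s queue
p0 = 1.0 / (sum(a**k / factorial(k) for k in range(s))
            + a**s / (factorial(s) * (1 - rho)))
def p(j):
    return p0 * a**j / factorial(j) if j < s else p0 * a**j / (factorial(s) * s**(j - s))

P_delay = p0 * a**s / (factorial(s) * (1 - rho))   # Erlang delay probability

t = 0.7
# left-hand side: sum over j >= s of p_j times the Erlang(j-s+1, s*mu) tail
lhs = sum(p(j) * sum(exp(-s * mu * t) * (s * mu * t)**k / factorial(k)
                     for k in range(j - s + 1))
          for j in range(s, s + 150))              # truncation error is negligible
rhs = P_delay * exp(-s * mu * (1 - rho) * t)
print(abs(lhs - rhs) < 1e-10)
```

Interchanging the order of summation in the double sum is exactly the algebra step: the inner geometric tail Σ_{j≥s+k} p_j equals P_delay ρ^k, after which the Poisson series collapses to the exponential factor.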
11.47 The process describing the number of service requests present is a birth-and-death process with transition rates

λ_i = λ for i ≥ 0, µ_i = iµ for 1 ≤ i ≤ s, µ_i = sµ + (i − s)θ for i > s.

The limiting probabilities p_j can be recursively obtained from

jµ p_j = λ p_{j−1} for 1 ≤ j ≤ s,
[sµ + (j − s)θ] p_j = λ p_{j−1} for j > s.

Since the long-run average number of balking callers per unit time is Σ_{j=s+1}^{∞} (j − s)θ p_j and the average arrival rate of callers is λ, it follows that the long-run fraction of balking callers is

(1/λ) Σ_{j=s+1}^{∞} (j − s)θ p_j.

11.48 Let the random variable X_1(t) be the number of available bikes at the bike rental and X_2(t) be the number of bikes at the depot. The process {(X_1(t), X_2(t))} is a continuous-time Markov chain with state space I = {(i, j) : i + j ≤ 25, i ≥ 0, and 0 ≤ j ≤ 9}. In state (i, j) there are 25 − i − j bikes rented out. Take the hour as time unit and let λ = 10 and µ = 0.5. The transition rates are

q_{(i,j)(i+1,j)} = 0.75(25 − i − j)µ, q_{(i,j)(i,j+1)} = 0.25(25 − i − j)µ for j < 9,
q_{(i,9)(i+10,0)} = 0.25(25 − i − 9)µ, and q_{(i,j)(i−1,j)} = λ for i ≥ 1.

Denoting by p(i, j) the limiting probabilities and letting δ(0) = 0 and δ(i) = 1 for i ≥ 1, we have the balance equations

[λδ(i) + (25 − i − j)µ] p(i, j) = λ p(i + 1, j) + (25 − i − j + 1)µ [0.25 p(i, j − 1) + 0.75 p(i − 1, j)] for 0 ≤ i < 10, 0 ≤ j ≤ 9,
[λ + (25 − i)µ] p(i, 0) = λ p(i + 1, 0) + (26 − i)µ [0.25 p(i − 10, 9) + 0.75 p(i − 1, 0)] for 10 ≤ i ≤ 25,
[λ + (25 − i − j)µ] p(i, j) = λ p(i + 1, j) + (25 − i − j + 1)µ [0.25 p(i, j − 1) + 0.75 p(i − 1, j)] for 10 ≤ i ≤ 25, 1 ≤ j ≤ 9,

where p(i, −1) = p(−1, j) = 0 and p(i, j) = 0 for i + j > 25. The performance measures are

the average number of bikes at the bike rental = Σ_{i=0}^{25} Σ_{j=0}^{min(25−i, 9)} i p(i, j),
the average number of bikes at the depot = Σ_{i=0}^{25} Σ_{j=0}^{min(25−i, 9)} j p(i, j),
the fraction of tourists who cannot rent a bike = Σ_{j=0}^{9} p(0, j),

where the last result uses the property Poisson arrivals see time averages.
Further, the average number of transports per unit time from the depot to the bike rental is Σ_{i=0}^{16} 0.25(25 − i − 9)µ p(i, 9).

11.49 Let X(t) be the number of busy channels at time t. The process {X(t)} is a continuous-time Markov chain with state space I = {0, 1, ..., c}. The transition rates are q_{i,i−1} = iµ for i = 1, ..., c and q_{i,i+1} = (M − i)α for i = 0, 1, ..., c − 1. By equating the rate out of the set {j, ..., c} to the rate into the set {j, ..., c} for 1 ≤ j ≤ c, we find that the equilibrium probabilities p_j satisfy the recursive equations

jµ p_j = (M − j + 1)α p_{j−1} for j = 1, ..., c.

The equilibrium probabilities are given by the truncated binomial distribution

p_j = C(M, j) p^j (1 − p)^{M−j} / [Σ_{k=0}^{c} C(M, k) p^k (1 − p)^{M−k}] for j = 0, 1, ..., c,

where p = (1/µ)/(1/µ + 1/α) and C(M, j) denotes the binomial coefficient. The long-run average number of service requests generated per unit time when i service channels are busy is (M − i)α p_i. Thus the long-run fraction of service requests that are lost is

(M − c)α p_c / [Σ_{i=0}^{c} (M − i)α p_i].

Note: The probability model in this problem is known as the Engset loss model. This model has the property that the limiting probabilities p_j are insensitive to the specific form of the service-time distribution and thus require from the service time only its expected value. In the Engset model, the limiting probabilities are also insensitive to the shape of the on-time distribution of the sources when the on-time distribution and/or the service-time distribution is continuous. A proof of this deep result is beyond the scope of the book. By letting M → ∞ and α → 0 such that Mα remains equal to the constant λ, it follows from the Poisson approximation to the binomial probability that p_j converges to

[e^{−λ/µ} (λ/µ)^j / j!] / [Σ_{k=0}^{c} e^{−λ/µ} (λ/µ)^k / k!] for j = 0, 1, ..., c.

In other words, the Erlang loss model is a limiting case of the Engset loss model.
This is not surprising, since the arrival process of service requests becomes a Poisson process with rate λ when we let M → ∞ and α → 0 such that Mα = λ.

11.50 (a) Let X(t) be the number of working units at time t. The process {X(t)} is a continuous-time Markov chain with state space I = {0, 1, ..., c} and transition rates q_{i,i−1} = iα for 1 ≤ i ≤ c and q_{i,i+1} = µ for 0 ≤ i ≤ c − 1. The transition rate diagram is identical to the transition rate diagram in Figure 11.2 for Example 11.4 when s is replaced by c, λ by µ, and µ by α. The lifetime in Problem 11.50 is the repair time in Example 11.4 (or the service time in the Erlang loss model). Thus the limiting distribution of the number of working units is

p_j = [(µ/α)^j / j!] / [Σ_{k=0}^{c} (µ/α)^k / k!] for j = 0, 1, ..., c.

In particular, the long-run fraction of time the system is down is p_0. This performance measure is insensitive to the shape of the lifetime distribution, by the insensitivity property of the Erlang loss model.
(b) For the case of ample repairmen, the model is the same as the Engset model by identifying the number of working units with the number of active sources and taking M = c. The limiting probability of having j working units is

p_j = C(c, j) p^j (1 − p)^{c−j} / [Σ_{k=0}^{c} C(c, k) p^k (1 − p)^{c−k}] for 0 ≤ j ≤ c,

where p = (1/α)/(1/α + 1/µ). The long-run fraction of time the system is down is equal to p_0. Again, the insensitivity property applies.
Note: As a sanity check, in both case (a) and case (b) we have p_0 = (1/µ)/(1/µ + 1/α) when c = 1, in agreement with the result of Example 9.3.

11.51 (a) Let X(t) be the stock on hand at time t. The process {X(t)} is a continuous-time Markov chain with state space I = {1, 2, ..., R − 1}. The transition rates are q_{1,Q} = λ, q_{i,i−1} = λ for 2 ≤ i ≤ R − 1, q_{R−1,Q} = µ, q_{i,i+1} = µ for 1 ≤ i ≤ R − 2, and the other q_{ij} = 0. The balance equations are

(λ + µ) p_i = µ p_{i−1} + λ p_{i+1} for i = 1, ..., Q − 1,
(λ + µ) p_Q = λ p_1 + µ p_{Q−1} + λ p_{Q+1} + µ p_{R−1},
(λ + µ) p_i = µ p_{i−1} + λ p_{i+1} for i = Q + 1, ..., R − 1,

where p_0 = p_R = 0.
(b) The average stock on hand is Σ_{i=1}^{R−1} i p_i.
(c) The average number of stock replenishments per unit time is λ p_1 and the average number of stock reductions per unit time is µ p_{R−1}.
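As an illustration (not part of the manual), the sanity check in Problem 11.50 can be confirmed numerically: evaluate p_0 for case (a) (truncated Poisson weights) and case (b) (binomial weights with M = c) and verify that for c = 1 both reduce to (1/µ)/(1/µ + 1/α). The rates µ and α below are illustrative assumptions:

```python
from math import comb, factorial

def p0_single_repairman(c, mu, alpha):
    # case (a): p_j proportional to (mu/alpha)^j / j!, truncated at j = c
    r = mu / alpha
    weights = [r**j / factorial(j) for j in range(c + 1)]
    return weights[0] / sum(weights)

def p0_ample_repairmen(c, mu, alpha):
    # case (b): binomial(c, p) weights with p = (1/alpha)/(1/alpha + 1/mu)
    p = (1 / alpha) / (1 / alpha + 1 / mu)
    weights = [comb(c, j) * p**j * (1 - p)**(c - j) for j in range(c + 1)]
    return weights[0] / sum(weights)

mu, alpha = 3.0, 0.5   # illustrative repair and failure rates (assumed)
target = (1 / mu) / (1 / mu + 1 / alpha)
print(abs(p0_single_repairman(1, mu, alpha) - target) < 1e-12,
      abs(p0_ample_repairmen(1, mu, alpha) - target) < 1e-12)
```

For c > 1 the two down-time fractions differ, reflecting the single repairman versus ample repairmen assumption.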