Math230, 2018-19: Workshop, Coursework & Quiz Questions

Coursework and Quiz marks

The maximum mark obtainable through the handwritten coursework is 15 for each of CW01-CW09. Each week there are two or three standard coursework questions with marks summing to 15. Each week there is also a challenge question with 5 marks available. Each week you can obtain the 15 marks by answering the standard questions perfectly; however, if you drop a few marks on these, then you might regain them through the challenge question:

T = min(15, S + C),

where T = total marks, S = standard marks and C = challenge marks.

There is also a Revision Quiz that revises key material from the first year module on probability, and further quizzes associated with CW03-CW09. You should answer the revision quiz on Moodle by 11:59pm on the Sunday at the end of Week 2; it is worth a total of 30 marks. For QZ03-QZ09, you should answer QZ0k by 11:59pm on the Sunday at the end of Week k; each such quiz is worth 5 marks.

MATH230 Week 01 - Moodle-assessed problems

QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

Revision Quiz

M01.1 Basic probability
Nigel has 3 cards; the first has a triangle on one side, the second has a square and the third has a circle. The other side of each card is blank. Showing you only the blank sides, Nigel asks you to choose a card at random. If you choose the triangle he rolls an unbiased four-sided die (i.e. with the numbers 1–4 on the faces, with each face equally likely to be chosen); if you choose the square he rolls a six-sided die and if you choose the circle he rolls an eight-sided die. The end result of the experiment is a shape and a number. However, Nigel does not tell you which shape you chose and does not let you see him roll the die; he says he will simply call out the final number after a dramatic pause.
(i) What is the probability that you choose a triangle and the final number is a 3?
(ii) What is the probability that the final number is a 3?
(iii) What is the probability that the final number is not a 3?
(iv) What is the probability that the shape is a triangle or the final number is 3 or both?
(v) The final number is a 3 (Nigel has just called it out); what is the probability that your shape was a triangle?

M01.2 Discrete random variable
A discrete random variable, X, takes a value x ∈ {1, 2, 3, 4} with a probability of x/10. It cannot take any other value.
(i) What is P(X ≥ 3)?
(ii) What is E[X]?
(iii) What is $E[X^2]$?
(iv) What is E[1/X]?
(v) What is Var[X]?

M01.3 Continuous random variable
A continuous random variable, Y, has a cumulative distribution function of $F_Y(y) = c - 1/(1+y)^2$ for y > 0, and 0 otherwise.
(i) What is c?
(ii) What is P(1 ≤ Y ≤ 3)?
(iii) What is $f_Y(3)$ (the density, evaluated at y = 3)?
(iv) What is the third quartile of Y?
(v) What is E[1 + Y]?

M01.4 Variance, standard deviation and linearity of expectation
A random variable, Z, has E[Z] = 4 and Var[Z] = 9.
(i) What is E[10 − 2Z]?
(ii) What is $E[Z^2]$?
(iii) What is $E[(10 - 2Z)^2]$?
(iv) What is StdDev[Z]?
(v) What is StdDev[10 − 2Z]?

M01.5 Colds
The number of times that an individual contracts a cold in a given year follows a Poisson distribution with the expected number per year being 5. A new drug has been introduced which changes the distribution to Poisson with expected number 3 per year for 75% of the population. For the other 25% of the population the drug has no appreciable effect on colds. Let N be the number of colds in a year and B be the event that the drug is beneficial for you.
(i) The random variable that counts the number of colds when the drug is not beneficial is represented by
(A) B | N, (B) N | B, (C) $N \mid B^C$, (D) $N, B^C$, (E) B | N.
(ii) The distribution of the random variable in (i) is
(A) Poisson(3), (B) Poisson(0.25), (C) Poisson(0.75), (D) Poisson(5), (E) Poisson(1).
(iii) You try the drug for a year and have 2 colds. What is the approximate probability that the drug is beneficial for you?
(A) 0.75, (B) 0.224, (C) 0.051, (D) 0.889, (E) 1/2.

M01.6 Modelling
Which of the following distributions is the most appropriate for modelling each of these situations?
(A) Poisson, (B) Bernoulli, (C) Continuous Uniform, (D) Discrete Uniform, (E) Exponential, (F) Binomial, (G) Geometric.
(i) The number of people in the queue for tickets, excluding you, when you first arrive at the cinema.
(ii) The time from when you join the queue until the person at the very front of the queue has been served.
(iii) The seat number (i.e. position from left to right, rather than row) of the oldest person in the cinema.
(iv) The time during the film showing at which your mobile phone goes off (you accidentally leave it on, and it does go off).
(v) The number of people you need to ask before you find someone who prefers crisps to chocolate.
(vi) The number of vegetarians in your Math230 tutorial group (excluding you).
(vii) Whether there are any vegetarians in your Math230 tutorial group (excluding you).

MATH230 Week 01 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

W01.1 Pregnancy
An ectopic pregnancy is twice as likely to develop when a pregnant woman is a smoker as it is when she is a non-smoker. If 32% of women of childbearing age are smokers, what is the probability that a woman having an ectopic pregnancy is a smoker? You may find it helpful to define the following shorthand: A ≡ ectopic pregnancy, B ≡ smoker.

W01.2 Discrete cdf
A discrete random variable, R, has a cdf of
$$F_R(r) = \begin{cases} 0 & r < 0 \\ 1/9 & 0 \le r < 1 \\ 5/9 & 1 \le r < 2 \\ 1 & 2 \le r. \end{cases}$$
(a) What is P(R > 1)?
(b) What is P(R ≥ 1)?
(c) What is the pmf of R?
(d) What is the median of R?

W01.3 Discrete pmf to cdf
Obtain the cdf of a random variable with the following pmf.
$$p_R(r) = \begin{cases} (1-\theta)^r \theta & r = 0, 1, 2, 3, \ldots \\ 0 & \text{otherwise.} \end{cases}$$
Hint: Study Figure 2.1 in your notes. Your function should work for any real number r ∈ R. First calculate F at the integers, being careful to distinguish the index of summation from the upper limit of summation. Secondly extend from the integers to the real line.

W01.4 Continuous cdf
The continuous random variable X has a cdf of
$$F_X(x) = \begin{cases} 0 & x < 10 \\ (x-10)/10 & 10 \le x \le 20 \\ 1 & x > 20. \end{cases}$$
Find the pdf of X.

W01.5 pdf
The random variable X has a pdf of
$$f_X(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Find P(X < 1/2).
(b) Find P(X > 3/4 | X > 1/2).

W01.6 pdfcst
The random variable X has probability density function
$$f_X(x) = \begin{cases} a(4 - x^2) & -2 < x < 2 \\ 0 & \text{otherwise.} \end{cases}$$
(i) Find the constant, a, and the cumulative distribution function of X and, hence or otherwise, evaluate P(−1 < X < 0).
(ii) Find the median, $q_{5/32}$ and $q_{27/32}$ of X.

W01.7 Jabberwocky
The brave knight is lost in the Tulgey Wood. Three paths lead on from where he stands at the Tumtum tree, clutching his vorpal sword. Unbeknownst to him, all three paths lead out of the wood; however, just off the left path lies the Jubjub bird, sleeping. Two times in three a knight walking this path would pass the bird without waking it, but if awoken the Jubjub bird would carry him up to its nest of hungry young. Along the right path lurks the frumious Bandersnatch against which the knight has no hope of prevailing. The middle path leads to the Jabberwock; if the knight encounters the Jabberwock then he might fall victim to its jaws that bite, its claws that snatch or its beguiling eyes of flame; else he will defeat it with a snicker-snack of his vorpal sword; each of the four possibilities is equally likely. The knight is as likely as not to forge straight ahead down the centre path; if he does not, then the right and left paths appeal equally to him.
(a) What is the probability that the knight makes it out of the Tulgey Wood alive?
(b) The brave knight does not emerge from the wood; what is the probability that he encountered the Jabberwock?
Hint: It may be helpful to define: ‘A’=‘Alive’, ‘D’=‘Dead’, L=‘left path’, R=‘right’ and C=‘centre’.

MATH230 Week 01 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 2.

Cdf, pmf, pdf and quantiles

A01.1 Discrete pmfs to cdfs
Obtain the cdf of the following discrete random variables.
(i) R where, for some parameter θ ∈ (0, 1), $P(R = 0) = \theta^2$, $P(R = 1) = \theta(1-\theta)$, $P(R = 2) = \theta(1-\theta)$, $P(R = 3) = (1-\theta)^2$, and for r ∉ {0, 1, 2, 3}, P(R = r) = 0.
(ii) R ∼ Unif(0, m). [4]
Hint: The discrete uniform pmf is given near the start of Ch3 of your Math230 notes and in your first year notes. Study Figure 2.1 in your notes. Your function should work for any real number r ∈ R. First calculate F at the integers, being careful to distinguish the index of summation from the upper limit of summation. Secondly extend from the integers to the real line, either by stating ranges or using the int function, where relevant.

A01.2 Butterfly
The lifetime, X, in days of a species of butterfly has a density of
$$f_X(x) = \begin{cases} 0 & x \le 1 \\ b/x^5 & x > 1. \end{cases}$$
(i) Find the cdf of X in terms of b, being careful to show your working.
(ii) Write down the value of b and explain how you deduced this.
(iii) Find the probability that (i) a butterfly lives for more than two days, (ii) a butterfly lives for more than half a day.
(iv) Find the age which only 1% of butterflies reach. [7]

A01.3 Survivor
A discrete random variable, R, has a parameter θ ∈ (0, 1) and a pmf of
$$p_R(r) = \begin{cases} (1-\theta)^r \theta & r = 0, 1, 2, 3, \ldots \\ 0 & \text{otherwise.} \end{cases}$$
(i) Use the solution to WS01 Discrete pmf to cdf to write down the survivor function, $S_R(r)$, of R.
(ii) Hence, or otherwise, find P(R > a + b | R > a) for integer a ≥ 0 and real number b ≥ 0.
(iii) What is unusual about the formula for P(R > a + b | R > a)?
[4]

A01.4 Challenge
A random variable X has the following cdf.
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1/3 + x/3 & 0 \le x < 1 \\ 1 & x \ge 1. \end{cases}$$
By considering $\lim_{i \to \infty} \left[ F_X(x) - F_X(x - 1/i) \right]$, or otherwise, describe the random variable itself. [5]

MATH230 Week 02 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

W02.1 Discrete cdf moments
A discrete random variable, R, has a cdf of
$$F_R(r) = \begin{cases} 0 & r < 0 \\ 1/9 & 0 \le r < 1 \\ 5/9 & 1 \le r < 2 \\ 1 & 2 \le r. \end{cases}$$
Use the pmf that you calculated in WS01 Discrete cdf to answer the following.
(a) What is $E[R^a]$ for any a > 0?
(b) What is Var[R]?
(c) What is the skewness of R?

W02.2 pdfcst moments
Find the expectation, variance, and skewness for the random variable X in the question in WS1 pdfcst.

W02.3 Poisson accidents
Suppose that the number of accidents occurring on a highway each day is a Poisson random variable with expected number (i.e. λ parameter) 3.
(a) Write the pmf of the Poisson distribution with this parameter value.
(b) Find the probability that 3 or more accidents will occur today.
(c) Using conditional probability, repeat part (b) if you know that at least one accident occurred today.

W02.4 Indicator function
Let A be an interval on the real line. For a set A the function $I_A(x)$ of x is defined as
$$I_A(x) = \begin{cases} 1 & x \in A \\ 0 & x \notin A. \end{cases}$$
For the continuous random variable X, find $E[I_A(X)]$.
Hint: either consider the distribution of $Y = I_A(X)$ and use the known results for that or first consider an interval, A = [a, b], and draw the picture.

MATH230 Week 02 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 3.

Cdf, pdf and moments

A02.1 Skew2
A random variable, X, has an expectation of µ and a variance of $\sigma^2$.
(a) Show that its skewness can be written as
$$\frac{1}{\sigma^3}\left( E[X^3] - \mu^3 \right) - 3\frac{\mu}{\sigma}.$$
(b) Hence derive an expression for $E[X^3]$ in terms of µ and σ for a random variable whose pdf or pmf is symmetric about µ.
[5]

A02.2 Butterfly moments
The lifetime, X, in days of a species of butterfly has a density of
$$f_X(x) = \begin{cases} 0 & x \le 1 \\ b/x^5 & x > 1. \end{cases}$$
(You discovered the value of b in CW01.)
(a) Find the expectation of $X^a$. What constraints are there on a for the expectation to be finite?
(b) Write down the expectation and variance of X.
(c) Using the formula in Skew2, find the skewness of X. [6]

A02.3 Unif or Exp
Una and Ed have just phoned for a taxi. Una suggests modelling their waiting time as Unif(a, b) but Ed believes c + Exp(β) is better. Discuss the pros and cons of these options. Note: this question is not asking you to discuss how you would choose a, b, c or β. [4]

A02.4 Challenge Question
For a sufficiently smooth function, f(x), Taylor expansion about some point, µ, gives
$$f(x) = f(\mu) + (x - \mu) f'(\mu) + \frac{1}{2}(x - \mu)^2 f''(\eta),$$
for some η(µ, x) between µ and x. Consider any function f with $f''(x) \ge 0$ for all x ∈ R and show that E[f(X)] ≥ f(E[X]). Hence relate $E[X^2]$ to E[X] and $E[e^X]$ to E[X]. [5]

MATH230 Week 03 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

Exp, Gam, Beta and Weib

W03.1 Relay
Four athletes, Aaron, Bill, Clare and Donna, run a relay race; as soon as one finishes they pass the baton to the next athlete. The times in seconds for their individual legs are random, independent of each other, and are respectively distributed as: Gam(41, 4), Gam(38, 4), Gam(42, 4) and Gam(43, 4). Write, as an integral, the probability that they take less than 40 seconds to complete the relay; write the R code that would give this probability. Hint: look up the convolution property of the Gamma distribution in your notes and use it twice.

W03.2 Normalise and unit
(a) What values of c are required to make the following functions valid pdfs? (Hint: you will find it helpful to look up the Gamma and Beta distributions in your notes or use Appendix on Integration first.)
$$f_X(x) = \begin{cases} c x^3 \exp(-\beta x) & x > 0 \\ 0 & \text{otherwise} \end{cases}$$
(for β > 0) and
$$f_Y(y) = \begin{cases} c y^3 (1-y)^5 & 0 < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$
(b) Now use the unit integrability property to find $E[X^5]$ and $E[Y^2(1-Y)^3]$.

W03.3 Weibull
If X has a Weibull distribution then its density is
$$f_X(x) = \begin{cases} 0 & x < 0 \\ \alpha \beta^\alpha x^{\alpha-1} \exp\{-(\beta x)^\alpha\} & x \ge 0 \end{cases}$$
where β > 0 is the scale parameter and α > 0 is the shape parameter.
(a) Use the substitution $u = (\beta x)^\alpha$ and the definition of the Gamma function (or the unit integrability property of the Gamma distribution) to find E[X]; verify this against the answer in your notes.
(b) Use the same substitution idea to find the cdf of the Weibull distribution; verify this against the answer in your notes.

MATH230 Week 03 - Moodle-assessed problems

QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M03.1 Shape and symmetry
For each of the following statements, decide whether it is true or false.
(S1) For a symmetric distribution, the lower quartile and the upper quartile must be the same distance from the median.
(S2) For an asymmetric distribution, the lower quartile and the upper quartile can never be the same distance from the median.
Choose a letter according to:
A : (S1) true and (S2) true.
B : (S1) true and (S2) false.
C : (S1) false and (S2) true.
D : (S1) false and (S2) false.

M03.2 Gamma and Beta CDFs
(i) The cdf of a Beta(3, 2) random variable Y is $F_Y(y) = ay + by^2 + cy^3 + dy^4$. Find a, b, c and d and hence evaluate a + 2b + 4c + 8d.
(ii) The cdf of a Gam(2, 1) random variable X is $F_X(x) = a + bx + (c + dx)e^{-x}$. Find a, b, c and d and hence evaluate a + 2b + 4c + 8d.
Question format: the single number a + 2b + 4c + 8d will (typically) only be correct if you have calculated each of a, b, c and d correctly.

M03.3 Expectations
(i) The random variable R has a probability mass function of
$$p_R(r) = \begin{cases} 1/5 & r = -1 \\ 1/2 & r = 0 \\ 1/5 & r = 1 \\ 1/10 & r = 2. \end{cases}$$
Find $E[R^4]$.
(ii) The random variable X has a density of
$$f_X(x) = \begin{cases} 3 \cos x \sin^2 x & 0 \le x \le \pi/2 \\ 0 & \text{otherwise.} \end{cases}$$
Find $E[\sin^2 X]$ (Hint: either use inspection or the substitution u = sin t).

MATH230 Week 03 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 4.

Exp, Gam, Beta and Weib

A03.1 CWwk03: BandX
BandX improvises two types of song: manic and mellow. Manic songs have a length (in minutes) distributed as Exp(1), whilst mellow songs last for Gam(5, 1) minutes; the length of a given song is independent of the lengths of all other songs. BandX perform an improvised set of four manic songs and three mellow songs; the end of one song segues straight into the start of the next. What is the distribution of the total performance time for their set? (Provide an explanation as well as the final answer.) [3]

A03.2 BetaGamma
Let X ∼ Beta(α, β) and Y ∼ Gam(α, β) with pdfs respectively
$$f_X(x) = \begin{cases} \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1} & x \in (0, 1) \\ 0 & \text{otherwise} \end{cases}$$
and
$$f_Y(y) = \begin{cases} \dfrac{\beta^\alpha}{\Gamma(\alpha)} y^{\alpha-1} e^{-\beta y} & y > 0 \\ 0 & \text{otherwise} \end{cases}$$
for α > 0 and β > 0. Use the unit-integrability property to show that E[X] = α/(α + β) and to find $E[Y e^{-2Y}]$. [7]

A03.3 Weibull II
The Weibull distribution is important in reliability theory. Its probability density function can be written as
$$f_X(x) = \begin{cases} 0 & x < 0 \\ \alpha \beta^\alpha x^{\alpha-1} \exp\{-(\beta x)^\alpha\} & x \ge 0, \end{cases}$$
where β > 0 is the scale parameter and α > 0 is the shape parameter. Its cdf is
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - \exp\{-(\beta x)^\alpha\} & x \ge 0. \end{cases}$$
(a) Find $E[X^3]$.
(b) Lengths in cm, X, between flaws in a strand of wire can be assumed to have a Weibull distribution with scale 1 and shape 2. Find the numerical value of P(X > 2 | X > 1). Hint: you can check your answer with the R function pweibull. [5]

A03.4 Challenge
Let T ∼ Weib(α, β) be the failure time of a machine, and let ∆t be some very small length of time.
Use both the density and cdf of the Weibull from the previous question to find an approximate form for P(T ∈ (t, t + ∆t] | T > t), the probability that the machine will fail in the next small time interval given that it has not yet failed. Hint: look at the derivation of the memoryless property of the exponential distribution and the connection to the constant rate of the Poisson process. Describe the qualitatively different relationships between the failure probability and time that are obtained for different values of α. [5]

MATH230 Week 04 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

Mostly Normal

The following properties of Φ(z) are useful in the workshop questions: $\Phi^{-1}(1/2) = 0$, $\Phi^{-1}(3/4) = 0.674$, $\Phi^{-1}(0.99) = 2.326$, Φ(−1.3) = 0.0968 and Φ(−2.3) = 0.0107.

W04.1 Normal and Cauchy quantiles
(a) Find the median and inter-quartile range for a $N(\mu, \sigma^2)$ random variable. How do these quantities compare with the expectation and standard deviation? Find the 99th percentile of a N(0, 1) distribution.
(b) Find the median, inter-quartile range and 99th percentile for the Cauchy distribution. How do these compare with the same quantities for the standard Normal distribution?

W04.2 Tyre mileage
Tyre mileage to failure can be assumed to have a Normal distribution with mean µ = 36,500 miles and standard deviation σ = 5000 miles.
(a) What is the probability that a tyre will fail within 30,000 miles?
(b) If a tyre is known to have failed within 30,000 miles, what is the probability that it failed within 25,000 miles?

W04.3 Models
For the following random variables identify (with justification) which family of distribution (but not the parameter values) you would consider for modelling their distribution:
(a) marks out of 100 in a statistics exam paper,
(b) waiting times in a queue,
(c) proportion of votes in an election for a candidate.
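
The values of Φ and its inverse quoted for the Week 04 workshop can be reproduced numerically. The sketch below is not part of the original sheet: it uses Python's standard-library `statistics.NormalDist` as a stand-in for whatever software you prefer, and the helper names `phi` and `phi_inv` are chosen here purely for illustration.

```python
from statistics import NormalDist

# Standard Normal distribution N(0, 1): its cdf is Phi and its
# quantile function (inverse cdf) is Phi^{-1}.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def phi(z):
    """Standard Normal cdf, Phi(z). (Illustrative helper name.)"""
    return std_normal.cdf(z)


def phi_inv(p):
    """Standard Normal quantile, Phi^{-1}(p), for 0 < p < 1. (Illustrative helper name.)"""
    return std_normal.inv_cdf(p)


if __name__ == "__main__":
    # The probabilities and z-values below are the ones quoted in the sheet.
    for p in (0.5, 0.75, 0.99):
        print(f"Phi^-1({p}) = {phi_inv(p):.3f}")
    for z in (-1.3, -2.3):
        print(f"Phi({z}) = {phi(z):.4f}")
```

The same two helpers can be used to check answers that are expressed in terms of Φ, for example by standardising a Normal random variable before calling `phi`.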
MATH230 Week 04 - Moodle-assessed problems

QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M04.1 More modelling
Choose the most appropriate from the following distributions to model the random variable in each of the following scenarios.
(A) Beta. (B) Binomial. (C) Cauchy. (D) Discrete Uniform. (E) Exponential. (F) Gamma. (G) Geometric. (H) Normal/Gaussian. (I) Poisson.
(i) A fisherman repeatedly casts a line out into a fishing lake and then reels in the line; eventually there is a fish on his hook, at which point he packs up and goes home. Let W be the number of times he casts the line.
(ii) Four small children are playing hide and seek. Alan has just counted to 100 and opened his eyes. Let X be the time it takes him to find all three of the other children.
(iii) The expected mark for your end of year Math230 exam is µ. Two students, picked at random, have marks $M_1$ and $M_2$. We are interested in $Y = (M_1 - \mu)/(M_2 - \mu)$.
(iv) Today, a dentist is seeing 18 patients for a routine check up. Let Z be the number of these patients that turn out to require at least some dental work.
(v) Each individual has their own probability that they will trip over their own feet at least once next week. Let P be the foot-tripping probability of the next person you pass in the street.

MATH230 Week 04 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 5.

Mostly Normal

A04.1 Appointment durations
The duration of appointments at a doctor’s surgery can be assumed to be Normally distributed with mean 9 minutes and standard deviation 3 minutes.
(a) Express the probability of a randomly selected appointment taking less than 15 minutes in terms of the standard Normal cdf Φ.
(b) Assuming independent appointment durations, derive and express in terms of Φ the probability that at least one of the doctor’s 20 appointments will be longer than 15 minutes.
(c) Are the assumptions of independence and a Normal distribution reasonable? [5]

A04.2 Exam marks
Let X, the marks of a randomly selected student in a probability exam, be a Normal random variable. Lecturers are said to grade on the curve if they know the average µ and the standard deviation σ of the marks and then assign grades according to the following table:

Range of mark      Grade
X ≥ µ + σ          A
µ ≤ X < µ + σ      B
X < µ              C

(a) If a lecturer does grade on the curve, what are the probabilities of students getting each of grades A, B and C?
(b) If the above boundaries are unchanged, but (the bottom) 2 in every 15 of the students who would have obtained a C grade now obtain a D grade, where does the boundary for the D grade lie? [6]

A04.3 Moment identification
The figure shows the pdf for four different random variables, with their associated expectations and variances. The means and variances of the pdfs shown in the figure are
(a) Top left: E[X] = 4, Var[X] = 4,
(b) Top right: E[X] = 4, Var[X] = 2,
(c) Bottom left: E[X] = 1/2, Var[X] = 1/4,
(d) Bottom right: E[X] = 1/2, Var[X] = 1/8.

[Figure: four panels (top left, top right, bottom left, bottom right), each showing a pdf f(x) plotted against x.]

In each case, guess the distribution family (e.g. Gamma), evaluate its parameters by relating these to the mean and variance and choose the correct answer; there is no need to provide your working.
(a) The top left figure follows: (A) Gam(4, 1), (B) N(4, 4), (C) Gam(8, 2), (D) N(4, 2).
(b) The top right figure follows: (A) Gam(4, 1), (B) N(4, 4), (C) Gam(8, 2), (D) N(4, 2).
(c) The bottom left figure follows: (A) Exp(4), (B) Exp(1/4), (C) Exp(2), (D) Exp(1.5).
(d) The bottom right figure follows: (A) N(1/2, 1/2), (B) Beta(1/2, 1/2), (C) N(1/2, 1/8), (D) Beta(1/8, 1/4). [4]

A04.4 Challenge
For x > 0, let
$$I(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty \frac{t}{x}\, e^{-t^2/2}\, dt.$$
(a) Write down an inequality between I(x) and Φ.
(b) Evaluate I(x) analytically and hence re-write the inequality in terms of Φ and φ.
(c) Find a bound for Φ(−10). Check the tightness of the bound using R. [5]

MATH230 Week 05 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

Transformations

W05.1 Power transform
If X ∼ Exp(1), use the density method to find and identify by name the distribution of $Y = X^{1/c}$, for c > 0. In your notes, the cdf method is used to solve this problem.

W05.2 Bus times
The length of time X (in minutes) that it takes the bus to travel the 3 miles from campus to Lancaster is a continuous random variable with cdf
$$F_X(x) = \begin{cases} 0 & x \le 8 \\ 0.25(x - 8) & 8 < x \le 12 \\ 1 & x > 12. \end{cases}$$
Write down a transformation which converts X to the average speed for the trip in miles per hour. Derive the cdf and pdf of speed, and determine its expected value.

W05.3 ScaledExp
If X is an Exp(β) random variable and c > 0, show that the random variable cX follows an Exp(β/c) distribution.

W05.4 USwitch
If U ∼ Unif(0, 1), find the distribution of V = 1 − U.

W05.5 Imperfection
The diameter X (in mm) of a circular imperfection in a slice through a steel sample can be assumed to have pdf
$$f_X(x) = \begin{cases} 2x/9 & 0 < x \le 3 \\ 0 & \text{otherwise.} \end{cases}$$
Find and roughly sketch the pdf of the area of imperfection, Y.

MATH230 Week 05 - Moodle-assessed problems

QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.
M05.1 Transformations
For each of statements (i) - (v), choose from the following options the most general class of functions, g, for which the statement holds for a general continuous random variable X:
(A) all functions;
(B) all even functions (i.e. functions for which g(x) = g(−x));
(C) all 1-1 functions;
(D) all increasing 1-1 functions;
(E) all affine functions (i.e. functions, g(x) = ax + b);
(F) none of the above.
(i) P(g(X) ≤ t) = g(P(X ≤ t)).
(ii) E[g(X)] = g(E[X]).
(iii) median(g(X)) = g(median(X)).
(iv) lower quartile(g(X)) = g(lower quartile(X)).
(v) IQR(g(X)) = g(IQR(X)).
Hint: to rule out a class of functions, find/sketch a counter-example. E.g. (vi) g(StdDev[X]) = StdDev[g(X)] cannot be true for general even functions, since with $g(x) = x^2$ and X ∼ Bern(p), $\mathrm{StdDev}[X^2] = \mathrm{StdDev}[X] = \sqrt{p(1-p)} \ne p(1-p) = (\mathrm{StdDev}[X])^2$. For general affine functions, let StdDev[X] = σ; then g(StdDev[X]) = aσ + b ≠ |a|σ = StdDev[g(X)]. Often in finding counter-examples you also gain an understanding of when a rule might hold; in this example, for functions of the form g(x) = ax with a ≥ 0; so the answer to hypothetical Qn (vi) would be (F).

MATH230 Week 05 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 6.

Transformations

A05.1 Linear transform
Let X have pdf
$$f_X(x) = \begin{cases} 2x & 0 \le x \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
Find the pdf of Y = −5X + 2. [5]

A05.2 Two transformations
(a) If X ∼ Gam(α, β), use the density method to find the distribution of Y = βX for some β > 0.
(b) For some a > 0, let X have a density of
$$f_X(x) = \begin{cases} a/x^{a+1} & 1 \le x < \infty \\ 0 & \text{otherwise.} \end{cases}$$
Use the cdf method to find the cdf of $Y = X^a$. [8]

A05.3 Normal transformation
Let X ∼ $N(\mu, \sigma^2)$; use the symmetry of the density of X to suggest a transformation g such that Y = g(X) has the same density as X. (The identity transformation, Y = X, is not permitted.) Hint: where is the symmetry in WS05 USwitch? [2]

A05.4 Challenge: trig.
A random variable X has a density of
$$f_X(x) = \begin{cases} 12 \cos^3 x \sin^3 x & 0 \le x \le \pi/2 \\ 0 & \text{otherwise.} \end{cases}$$
Find the density of $Y = \sin^2 X$ and identify the distribution of Y. (Hint: the transformation $x \to \sin^2 x$ is 1-1 on [0, π/2].) [5]

MATH230 Week 06 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

Bivariate

W06.1 Discrete
Discrete random variables X and Y have joint probability mass function
Y X 0 0 0 2 6/15 1 2 0 4/15 3/15 2/15
Find the marginal pmf of X and Y. Find the conditional probability mass function of Y given X = 2.

W06.2 BiExp
The random variables (X, Y) have joint distribution function
$$F_{XY}(x, y) = 1 - \exp(-x) - \exp(-y) + \exp(-x - y)$$
for 0 < x < ∞, 0 < y < ∞. Obtain:
(a) the joint pdf,
(b) P(X < 1, Y < 1),
(c) P(X < 1),
(d) P(X + Y ≤ 1).

W06.3 GU
The random variables (X, Y) have joint pdf
$$f_{XY}(x, y) = \begin{cases} \dfrac{1}{\sqrt{2\pi}} \exp(-x^2/2) & -\infty < x < \infty,\ 0 < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$
Find the marginal distributions of X and Y and identify their forms.

W06.4 BiBeta
Let X and Y have joint pdf $f_{XY}(x, y) = 6(1 - y)$ for 0 ≤ x ≤ y ≤ 1 and 0 otherwise.
(a) Sketch the subset of (x, y) values for which both x ≤ 3/4 and y > 1/2. Hence find P(X ≤ 3/4, Y > 1/2).
(b) Obtain the marginal distribution for X and Y.

W06.5 Independent?
Random variables X and Y have joint probability density function
$$f_{XY}(x, y) = \frac{1}{2\pi}\, x^2 \exp\left\{ -\frac{1}{2}(x^2 + y^2) \right\}$$
for −∞ < x < ∞, −∞ < y < ∞.
(a) Are X and Y independent?
(b) Find the marginal densities of X and Y; if either has a standard form, identify it.

MATH230 Week 06 - Moodle-assessed problems

QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M06.1 MoreBuses
Peter is waiting at a bus stop where two different buses stop, both of which can take him home.
The joint distribution (in minutes) for the times X and Y he needs to wait for the two buses has joint pdf
$$f_{XY}(x, y) = \begin{cases} \dfrac{1}{150} \exp\left(-\tfrac{1}{10}x\right) \exp\left(-\tfrac{1}{15}y\right) & 0 < x < \infty,\ 0 < y < \infty \\ 0 & \text{otherwise.} \end{cases}$$
(i) The probability that Peter has to wait for longer than 1 minute to catch a bus is
(A) exp(−1/3), (B) exp(−2/3), (C) exp(−1/4), (D) exp(−1/6).
(ii) What is the probability that a bus with waiting time given by X arrives before a bus with waiting time given by Y?

M06.2 Integration range
Consider a continuous bivariate random variable (X, Y) whose density is non-zero precisely over the range 0 < X < 4, 0 < Y < 4, and over this range the density is g(x, y). For each of the probabilities (i) - (iii) below, choose all of the correct corresponding integrals. You may find it helpful to sketch the required regions.

(i) P(X ≤ 2 or Y ≤ 2 or both).
(A) $\int_{x=0}^{2} \int_{y=0}^{2} g(x, y)\, dy\, dx$.
(B) $\int_{x=0}^{4} \int_{y=0}^{2} g(x, y)\, dy\, dx + \int_{x=0}^{2} \int_{y=0}^{4} g(x, y)\, dy\, dx$.
(C) $\int_{x=0}^{4} \int_{y=0}^{2} g(x, y)\, dy\, dx + \int_{x=0}^{2} \int_{y=2}^{4} g(x, y)\, dy\, dx$.
(D) $1 - \int_{x=2}^{4} \int_{y=2}^{4} g(x, y)\, dy\, dx$.
(E) $\int_{x=0}^{2} \int_{y=2}^{4} g(x, y)\, dy\, dx + \int_{x=2}^{4} \int_{y=0}^{2} g(x, y)\, dy\, dx$.
(F) $\int_{x=0}^{2} \int_{y=0}^{4} g(x, y)\, dy\, dx + \int_{x=2}^{4} \int_{y=0}^{2} g(x, y)\, dy\, dx$.

(ii) P(X + Y ≤ 3).
(A) $\int_{x=0}^{4} \int_{y=0}^{3-x} g(x, y)\, dy\, dx$.
(B) $\int_{x=0}^{3} \int_{y=0}^{3-x} g(x, y)\, dy\, dx$.
(C) $\int_{x=0}^{3} \int_{y=0}^{3-x} g(x, y)\, dy\, dx + \int_{x=3}^{4} \int_{y=0}^{4} g(x, y)\, dy\, dx$.
(D) $1 - \int_{x=0}^{4} \int_{y=3-x}^{4} g(x, y)\, dy\, dx$.
(E) $1 - \int_{x=0}^{3} \int_{y=3-x}^{4} g(x, y)\, dy\, dx$.
(F) $1 - \int_{x=0}^{3} \int_{y=3-x}^{4} g(x, y)\, dy\, dx - \int_{x=3}^{4} \int_{y=0}^{4} g(x, y)\, dy\, dx$.

(iii) P(X + Y ≤ 5).
(A) $\int_{x=0}^{4} \int_{y=0}^{5-x} g(x, y)\, dy\, dx$.
(B) $\int_{x=0}^{1} \int_{y=0}^{5-x} g(x, y)\, dy\, dx$.
(C) $\int_{x=0}^{1} \int_{y=0}^{4} g(x, y)\, dy\, dx + \int_{x=1}^{4} \int_{y=0}^{5-x} g(x, y)\, dy\, dx$.
(D) $1 - \int_{x=1}^{4} \int_{y=5-x}^{4} g(x, y)\, dy\, dx$.
(E) $1 - \int_{x=0}^{1} \int_{y=5-x}^{4} g(x, y)\, dy\, dx$.
(F) $1 - \int_{x=0}^{1} \int_{y=0}^{4} g(x, y)\, dy\, dx - \int_{x=1}^{4} \int_{y=5-x}^{4} g(x, y)\, dy\, dx$.

MATH230 Week 06 - Assessed problems (coursework)

Submission is due at 5pm on Tuesday in Week 7.

Bivariate

A06.1 Construction
Construction firms A, B, C and D all bid for (the same) two contracts.
Each contract will be awarded to a single firm, one of A, B, C or D. Each contract is equally likely to be awarded to any firm, and the decisions on the two contracts are independent. Let X be the number of contracts awarded to firm A and Y be the number awarded to firm B.
(a) By considering all possible combinations of contract awards, or otherwise, write down the joint probability mass function for (X, Y) as a table.
(b) Find $F_{XY}(1, 0) = P(X \le 1, Y \le 0)$. [7]

A06.2 BiBeta2
Let X and Y have joint pdf
$$f_{XY}(x, y) = \begin{cases} 6(1 - y) & 0 \le x \le y \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Sketch the set of (x, y) values for which $f_{XY}(x, y) > 0$ and confirm that this is a probability density function.
(b) Indicate on the sketch the region for which x + y > 1 and find P(X + Y > 1). [8]

A06.3 Challenge
Suppose X and Y are independent. The notes state that, provided the range of X does not depend on y, for X and Y to be independent it is enough that $f_{XY}(x, y) = g(x)h(y)$. Independence is in fact equivalent to this: if the range of X does not depend on y then
$$f_{XY}(x, y) = g(x)h(y) \iff X \text{ and } Y \text{ are independent.}$$
Prove this. Hint: use the condition that independence is equivalent to $f_{XY}(x, y) = f_X(x) f_Y(y)$. [5]

MATH230 Week 07 - Workshop problems

Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.

Bivariate/summary/linear trans.

W07.1 AllStats
The rvs X and Y follow a distribution specified by X ∼ Exp(1) and Y | X = x ∼ Unif(0, x).
(a) Write down E[Y | X = x] and Var[Y | X = x].
(b) Find E[X] and E[Y].
(c) Find Cov[X, Y].
(d) Find Var[X] and [non-examinable] show that Var[Y] = 5/12.
(e) Find Corr[X, Y].

W07.2 Exam2000+(d)
Random variables X and Y have joint probability density function
$$f_{XY}(x, y) = \begin{cases} 24x(1 - y) & 0 < x < y < 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Confirm that this is a probability density function.
(b) Are X and Y independent? Give reasons.
(c) Find the marginal probability density functions of X and Y and identify the corresponding distributions.
(d) Find E[XY] and write down E[X]E[Y]; is the correlation between X and Y negative, positive or zero?

W07.3 Pesticide
Let X and Y denote the proportions of two chemicals in a pesticide. Suppose they have joint pdf
\[
f_{XY}(x, y) =
\begin{cases}
2 & x > 0,\ y > 0,\ x + y \le 1 \\
0 & \text{otherwise.}
\end{cases}
\]
(a) Find the marginal pdf of Y.
(b) Find the conditional pdf of X given Y = y ∈ (0, 1).

W07.4 Orthogonal
(a) Use the bilinearity of Cov[·, ·] to find Cov[X + Y, aX + bY] in terms of Var[X], Var[Y] and Cov[X, Y].
(b) Random variables X and Y have E[X] = 2, Var[X] = 20, E[Y] = 1, Var[Y] = 5 and Cov[X, Y] = 8. Let S = X + Y, and find all linear transformations T = aX + bY such that S and T are uncorrelated.
(c) Find the particular linear combination which gives E[T] = E[S].

W07.5 DoctorExp
The durations of appointments at a doctor’s surgery can be assumed to be independent and identically distributed exponential random variables with an expectation of 5 minutes. If there are 2 patients ahead of you in the waiting room and 1 patient already with the doctor, what are the expectation and variance of your waiting time?

W07.6 Lin3-2
Suppose $X_1$, $X_2$ and $X_3$ are independent random variables with expectations 3, −1 and 2, and variances 5, 8 and 9 respectively. Find the means, variances and covariance of $Y_1 = 3X_1 - 2X_2 + X_3$ and $Y_2 = X_2 - X_3$.

MATH230 Week 07 - Moodle-assessed problems
QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M07.1 Independent
For each of the following sets of two bivariate density or mass functions for random variables (V, W) and (X, Y), decide whether or not V and W are independent and decide whether or not X and Y are independent, then answer as follows:
(A) V and W are independent, and X and Y are independent.
(B) V and W are independent, and X and Y are dependent.
(C) V and W are dependent, and X and Y are independent.
(D) V and W are dependent, and X and Y are dependent.
In each case the density is zero outside of the specified range; you do not need to evaluate the constants $c_1$ and $c_2$.

(i) $f_{VW}(v, w) = c_1(vw + 1)$ for 0 < v ≤ 2 and 0 < w ≤ 3;
$f_{XY}(x, y) = \frac{\lambda^{k+1} y^{k-1} e^{-\lambda(x+y)}}{(k-1)!}$ for x > 0 and y > 0.

(ii) $f_{VW}(v, w) = c_2(vw + v + w + 1)$ for 0 < v ≤ 2 and 0 < w ≤ 3;
$f_{XY}(x, y) = \frac{1}{y}$ for 0 < x < y < 1.

(iii) V and W have the following joint probability mass function:

         v=0    v=1    v=2
  w=0    1/3    2/9    1/9
  w=1    2/9    1/9    0

$f_{XY}(x, y) = \frac{1}{2}(x + y)e^{-(x+y)}$ for x > 0 and y > 0.

(iv) V and W have the following joint probability mass function:

         v=0    v=1    v=2
  w=0    3/8    1/4    1/8
  w=1    1/8    1/12   1/24

$f_{XY}(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$ for 0 < y ≤ 1 and −∞ < x < ∞.

M07.2 Covariance and correlation
Consider two random variables X and Y and select all of the true statements below.
(A) Cov[X, Y] = 0 ⇒ X and Y are independent.
(B) X and Y are dependent ⇒ Cov[X, Y] ≠ 0.
(C) X and Y are independent ⇒ Cov[X, Y] = 0.
(D) If StdDev[X] = 3, StdDev[Y] = 2 and Cov[X, Y] = 2 then Corr[X, Y] = 1/3.
(E) If Var[X] = 3, Var[Y] = 2 and Cov[X, Y] = 2 then Corr[X, Y] = 1/3.

MATH230 Week 07 - Assessed problems (coursework)
Submission is due at 5pm on Tuesday in Week 8.
Bivariate/summary/linear trans.

A07.1 Bivariate expectations
The random variables X and Y have a density of $f_{XY}(x, y) = x + y$ for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and $f_{XY}(x, y) = 0$ elsewhere. Evaluate E[X] and E[XY].
[4]

A07.2 AllStats
X ∼ Gam(3, 1) and Y | (X = x) ∼ Exp(x); find
(a) E[Y];
(b) Cov[X, Y];
(c) Var[3X − 2Y], using (a) and (b) and the fact that Var[Y] = 3/4.
Hint: use the formula for the mean of an Exp(x) rv from your notes, and the formulae for the rth moment and for the variance of a Gamma rv. You might find it helpful to check answers using simulation in R.
[5]

A07.3 DrinksMachine
A drinks machine has a random amount Y (in gallons) in supply at the beginning of a given day and dispenses a random amount X during the day. It is not resupplied during the day and so X ≤ Y. The joint pdf is
\[
f_{XY}(x, y) =
\begin{cases}
1/2 & 0 < x \le y,\ 0 < y \le 2 \\
0 & \text{otherwise.}
\end{cases}
\]
Find the conditional density of X given Y = y. Evaluate the probability that less than 0.5 gallons are sold, given that the machine contains 1 gallon at the start of the day. Is this joint model sensible? (Hint: think about P(X = y | Y = y).)
[6]

A07.4 Challenge
Suppose $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables, and let $\bar{X}$ be their sample mean. Define $Y_i = X_i - \bar{X}$ for i = 1, 2, . . . , n. Show that $\bar{X}$ and $Y_i$ are uncorrelated for any i.
Hint: by symmetry, if the result holds for $Y_1$ it will hold for any i. Write both $\bar{X}$ and $Y_1$ as linear combinations, i.e. find constants $\{a_j\}$ and $\{b_j\}$, j = 1, 2, . . . , n, such that $\bar{X} = \sum_j a_j X_j$ and $Y_1 = \sum_j b_j X_j$. Use the vector results for covariances from notes.
[5]

MATH230 Week 08 - Workshop problems
Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.
Mgf + general trans.

W08.1 Binom
If X ∼ Bin(n, θ), use the fact that we can write $X = \sum_{i=1}^{n} X_i$, where the $X_i$ are IID Bern(θ) random variables, to:
(a) show that E[X] = nθ and that Var[X] = nθ(1 − θ);
(b) find the moment generating function $M_X(t)$.

W08.2 SumQuoExp
The random variables X ∼ Exp(β) and Y ∼ Exp(β) are independent. Find the joint and the marginal densities and distributions of $S = \frac{X}{X+Y}$ and $T = X + Y$. Are S and T independent?

W08.3 Exam2001
Let $X_1$ and $X_2$ be two continuous random variables with joint probability density function
\[
f_{X_1 X_2}(x_1, x_2) =
\begin{cases}
1/2 & |x_1| + |x_2| \le 1 \\
0 & \text{otherwise.}
\end{cases}
\]
(a) Sketch the area where $f_{X_1 X_2}(x_1, x_2) = 1/2$.
(b) Are $X_1$ and $X_2$ independent?
(c) Show that the marginal probability density function of $X_1$ is
\[
f_{X_1}(x_1) =
\begin{cases}
1 - |x_1| & -1 \le x_1 \le 1 \\
0 & \text{otherwise.}
\end{cases}
\]
(d) Derive the joint probability density function of $S = X_1 + X_2$ and $T = X_1 - X_2$.
(e) Are S and T independent?

W08.4 230Marks
Marks for course-work on this course are likely to follow a N(7, 2) distribution and the marks for the exam follow a N(45, 100) distribution. The total mark for the course is a sum of course-work and exam marks, with a pass and first corresponding to marks of 35 and 70 respectively.
(a) Assuming marks for the course-work and exam are independent, what are the probabilities of failing and of obtaining a first?
(b) Repeat (a) under the assumption of perfect correlation between course-work and exam marks.
(c) Do you find either of the assumptions of dependence realistic? State your reasons.

W08.5 Ratio of Normals
Prove that the ratio of two independent N(0, 1) random variables X and Y has the Cauchy distribution with pdf
\[
f(s) = \frac{1}{\pi(1 + s^2)} \quad \text{for } -\infty < s < \infty.
\]
Hint: try S = X/Y, T = Y, and remember to multiply by the absolute value of the Jacobian.

MATH230 Week 08 - Moodle-assessed problems
QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M08.1 Range
For each of the following pairs of random variables X and Y and the given transformation (X, Y) → (S, T) choose the joint range of S and T from the following options.
(i) X > 0 and Y > 0 with S = X and T = X/Y.
(A) 0 < S, 0 < T.
(B) 0 < S < T.
(C) 0 < T < S.
(ii) 0 < X < 4 and 0 < Y < 4 with S = X and T = X/Y.
(A) 0 < S and 0 < T.
(B) 0 < S < 4T.
(C) 0 < T < 4S.
(D) 0 < T < S < 4T.
(E) 0 < S < T < 4S.
(iii) 0 < X < 4 and 0 < Y < 4 with S = X and T = X + Y.
(A) 0 < S and 0 < T.
(B) 0 < T < S + 4.
(C) 0 < S < T + 4.
(D) 0 < S < T < S + 4.
(E) 0 < T < S < T + 4.

M08.2 TransExps
Let X and Y have a joint pdf of
\[
f(x, y) =
\begin{cases}
\exp(-x - y) & x > 0,\ y > 0 \\
0 & \text{otherwise.}
\end{cases}
\]
Let S = X and T = X + Y.
(i) The joint density of S and T is
(A) exp(−s) for 0 < s < t.
(B) exp(−t) for 0 < s < t.
(C) exp(−s − t) for 0 < s < t.
(D) t exp(−s) for 0 < s < t.
(E) s exp(−t) for 0 < s < t.
(F) None of the above.
(ii) The marginal density of T is
(A) Exp(1).
(B) Gam(2, 1).
(C) Gam(3, 1).
(D) Beta(2, 2).
(E) None of the above.

MATH230 Week 08 - Assessed problems (coursework)
Submission is due at 5pm on Tuesday in Week 9.
Mgf + general trans.

A08.1 Mgf
Consider a continuous random variable Z with moment generating function $M_Z(t) = (c - t^2/\theta^2)^{-1}$ for |t| < θ, where θ > 0 is a parameter of the distribution.
(a) What is the value of c? Why?
(b) Find the moment generating function of a random variable W that follows a double exponential (or Laplace) distribution with pdf $f_W(w) = \frac{\theta}{2} e^{-\theta|w|}$, for −∞ < w < ∞.
Hint: $\int_{-\infty}^{\infty} h(x)\,dx = \int_{-\infty}^{0} h(x)\,dx + \int_{0}^{\infty} h(x)\,dx$; what is |x| when x is negative?
(c) Let X ∼ Exp(θ) and Y ∼ Exp(θ) be independent. Using the formula for the mgf of an Exp(θ) random variable from your notes, find the mgf of X − Y and identify its distribution.
[6]

A08.2 Common sense
Let X > 0 be a random variable and let Y | X = x ∼ N(x, x). Write down a transformation S = g(X, Y) such that S is independent of X and explain why the resulting rv is independent of X (trivial transformations such as S = 0 or S ∼ Unif(0, 1) indep. of X and Y are not allowed).
[2]

A08.3
Let X ∼ Exp(1) and Y ∼ Exp(2) be independent.
(a) Using the convolution formula from Chapter 8, find the density of S = X + Y.
(b) Using the transformation S = X + Y and T = Y, find the joint density $f_{S,T}(s, t)$.
[7]

A08.4 Challenge
Let X have density
\[
f_X(x) =
\begin{cases}
\frac{2}{\sqrt{2\pi}} \exp(-x^2/2) & 0 < x < \infty \\
0 & \text{otherwise,}
\end{cases}
\]
and let Y ∼ Gam(n/2, n/2) for some n > 0 be independent of X.
(a) Show that the joint density, $f_{TV}(t, v)$, of $V = X^2$ and $T = X/\sqrt{Y}$ is
\[
f_{TV}(t, v) = \frac{2}{\sqrt{2\pi n}\,\Gamma(n/2)\,2^{n/2}} \left(\frac{n}{t^2}\right)^{(n+1)/2} v^{(n+1)/2 - 1} \exp\left[-\frac{v}{2}\left(1 + \frac{n}{t^2}\right)\right]
\]
for t > 0, v > 0 and 0 elsewhere.
(b) Find the density, $f_T(t)$, of T and simplify it so that the only terms involving t are of the form $(1 + t^2/n)$. (Hint: use the unit integrability property of the Gamma distribution.) (ASIDE: find the Student-t density (e.g. on Wikipedia) and note the similarities and differences between this and your formula.)
[5]

MATH230 Week 09 - Workshop problems
Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.
CLT and Chebyshev

W09.1 Basics
Suppose $X_1, X_2, \ldots, X_n$ represent energy consumption of n different households over a day in appropriate units, and are iid random variables each with mean µ and finite variance $\sigma^2$. Use the central limit theorem to give an approximate distribution for
(a) mean energy consumption;
(b) total energy consumption.

W09.2 AirlineCLT
An airline knows that the weight of a randomly chosen suitcase (in kg) is a random variable with expectation 16 and standard deviation 5. A cargo area holds 100 such suitcases. Use the CLT to find the approximate probability that the total weight will exceed 1700kg. What is the 99th percentile of the distribution of total weight? (Express all answers in terms of the cdf for the standard Normal distribution, Φ(·).)

W09.3 AirlineCLTVariable
An airplane will have enough fuel to reach its destination provided that the combined weight of the passengers and their luggage does not exceed 39 tonnes (39000kg). The mean and standard deviation of each passenger’s weight are 79kg and 12kg respectively; the mean and standard deviation of each passenger’s luggage are 12kg and 5kg respectively. For each passenger, their weight is independent of the weight of their luggage, and passengers are independent of each other in all aspects. Use the CLT to find the largest number of passengers that the airline can let on the plane such that the probability that it will have to divert to an alternative airport is ≤ 0.001.
You will need one of the following: Φ(0.001) = 0.500, Φ(0.999) = 0.841, $\Phi^{-1}(0.001) = -3.090$, $\Phi^{-1}(0.999) = 3.090$.

W09.4 AirlineChebyshevVariable
Consider the previous question, AirlineCLTVariable, but use Chebyshev’s inequality, rather than the CLT, to obtain an upper bound on the number of passengers, n.

W09.5 Exam2018
The 80 boys and girls at Ladbury Boarding School love potato-and-leek soup. Janice the cook has prepared 30 litres of the soup, and as each pupil places their bowl on her serving counter she ladles in some soup. The amount of soup each pupil receives has an expectation of 0.25 litres and a variance of 0.005 litres$^2$. The amounts are independent across pupils.
(a) For a random variable X, Markov’s inequality states that, subject to a condition on X, P(X ≥ t) ≤ E[X]/t. What is the condition on X?
(b) For a random variable Y with E[Y] = µ and Var[Y] = $\sigma^2$, Chebyshev’s inequality is: $P(|Y - \mu| \ge a) \le \sigma^2/a^2$. Prove Chebyshev’s inequality. (Hint: consider $\sigma^2$ and apply Markov’s inequality to a suitable function of Y.)
(c) Janice is concerned that, with her usual ladling, she may run out of soup.
(i) Use Markov’s inequality to obtain a bound on the probability of this.
(ii) Use Chebyshev’s inequality to obtain a bound on the probability of this.
(d) In fact, after all 80 pupils have been served, Janice has used up just 19 litres of soup. She wonders if she can give all pupils a second helping; they certainly all want one! The variance of the amount she ladles for each pupil is fixed, but she can alter its expectation. Use Chebyshev’s inequality to derive for her the maximum expected amount of second-helping soup per pupil that nonetheless ensures the probability the pupils all receive a second helping is at least 0.6.

W09.6 Insurance
Insurance companies consider annual profit on a policy for oil drilling platforms to be random variables with expectation 17.5 thousand pounds and variance of 1.62 (in units of [thousand pounds]$^2$).
(a) Assume that there are 50 such annual policies; how would you represent them?
(A) X, Y, . . . , (B) x, y, . . . , (C) $x_1, x_2, \ldots$, (D) $X_1, X_2, \ldots$.
(b) If the profits for different annual policies are mutually independent, find the probability that the average profit from 50 annual policies will exceed 18 thousand pounds.

W09.7 Exam2006
(a) State the Central Limit Theorem for a sequence of iid random variables $X_1, X_2, \ldots$ each with mean µ and finite variance $\sigma^2$.
(b) Find the expected value and variance of the random variable X ∼ Bern(θ) where 0 < θ < 1.
(c) Suppose $X_1, X_2, \ldots$ is a sequence of iid Bern(θ) random variables, and that $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. State the expected value and variance of $\bar{X}_n$.
(d) Find approximately $P(\bar{X}_n < 0.6)$ in terms of the standard Normal distribution function Φ(·), when it is known that θ = 0.5 and n = 100.

MATH230 Week 09 - Moodle-assessed problems
QZ k is due 11:59pm on the Sunday of Week k, except for the revision quiz (k=1), which is due on the Sunday ending Week 2. Numerical answers should be entered as decimals and accurate to 3dp.

M09.1 Exam2001
Two companies A and B produce batteries. Batteries from company A have an expected lifetime of $\mu_A = 9$ hours with a standard deviation of $\sigma_A = 2$ hours. Batteries from company B have an expected lifetime of $\mu_B = 10$ hours with a standard deviation of $\sigma_B = 1$ hour.
(i) Let $X_1, \ldots, X_{10}$ be the lifetimes of 10 randomly chosen batteries from company A. The distribution of the average lifetime $\bar{X}_{10} = \frac{1}{10} \sum_{i=1}^{10} X_i$ is approximately
(A) N(10, 4),
(B) N(9, 4),
(C) N(9, 0.2),
(D) N(9, 0.4).
(ii) Independent of the batteries already chosen, 10 randomly chosen batteries from company B are selected. Let $Y_1, \ldots, Y_{10}$ be the corresponding lifetimes, and let $\bar{Y}_{10} = \frac{1}{10} \sum_{i=1}^{10} Y_i$ be the average lifetime. The approximate distribution of $\bar{X}_{10} - \bar{Y}_{10}$ is
(A) N(−1, 0.3),
(B) N(−1, 0.5),
(C) N(1, 0.3),
(D) N(−1, 3).
(iii) Using the approximation in (ii), what is the probability that $\bar{X}_{10}$ is less than $\bar{Y}_{10}$?
(iv) As the probability $P(\bar{X}_{10} < \bar{Y}_{10})$ is not sufficiently close to 1, it is difficult, from samples of size 10, to identify which company makes the better batteries. Let $\bar{X}_n$ be the average lifetime of n randomly chosen batteries from company A, and let $\bar{Y}_n$ be the average lifetime of n randomly chosen batteries from company B. How big should n be for $P(\bar{X}_n < \bar{Y}_n)$ to be approximately 0.99?
(v) Now, suppose the numbers of batteries that are chosen from companies A and B are permitted to differ. Each sample from company B takes the same amount of effort as a sample from company A. To maximise $P(\bar{X}_{10} < \bar{Y}_{10})$ for a fixed total amount of sampling effort (50 batteries) you should find the integer number of samples for company A, n ≤ 50, which:
(A) maximises 4/n + 1/(50 − n);
(B) minimises 4/n + 1/(50 − n);
(C) maximises 4/(50 − n) + 1/n;
(D) minimises 4/(50 − n) + 1/n;
(E) satisfies some condition other than one of the above.

MATH230 Week 09 - Assessed problems (coursework)
Submission is due at 5pm on Tuesday in Week 10.
CLT and Chebyshev

A09.1 Markov vs Chebyshev
Let X be non-negative and have E[X] = Var[X] = 1 (e.g. X ∼ Exp(1)). For t > 1,
(a) use Markov’s inequality to find an upper bound on P(X ≥ t);
(b) use Chebyshev’s inequality to find an upper bound on P(X ≥ t).
(c) Which bound is preferable for large values of t?
[3]

A09.2 OrdersCLT
Times spent on processing orders are independent random variables with mean 1.5 minutes and standard deviation 1 minute. Let n be the number of orders an operator is scheduled to process in 2 hours. Use the CLT to find the largest value of n which gives at least a 95% chance of completion in that time.
[6]

A09.3 OrdersCheb
Times spent on processing orders are independent random variables with mean 1.5 minutes and standard deviation 1 minute. Let n be the number of orders an operator is scheduled to process in 2 hours.
Use Chebyshev’s Inequality to find the largest value of n which gives at least a 95% chance of completion in that time. Hint: write down the inequality you wish to satisfy and the inequality that Chebyshev provides, then reconcile the two.
[6]

A09.4 Challenge
Suppose that the random variable X has a finite moment generating function: $m_X(t) = E\left[e^{tX}\right] < \infty$ for all t ∈ R.
(a) Show that $P(X \ge x) \le m_X(t) e^{-tx}$ when t > 0 and x > 0. (Hint: either proceed in a similar manner to the proof of Markov’s inequality, or use Markov’s inequality directly.)
(b) Notice that the above bound is true for all such t and so
\[
P(X \ge x) \le \min_{t} m_X(t) e^{-tx}.
\]
Suppose that X ∼ N(0, 1) so that $m_X(t) = e^{t^2/2}$ and find a best upper bound on P(X > x).
[5]

MATH230 Week 10 - Workshop problems
Please have a go at the problems before the workshop so that in the workshop you can focus on those problems which you had trouble with.
WS Week 10: MVN

W10.1 MVN:2→1
The joint distribution of $(X, Y)'$ is bivariate Normal with mean $(3, 5)'$ and variance
\[
\begin{pmatrix} 3 & -1 \\ -1 & 2 \end{pmatrix}.
\]
Find the distribution of T = 4X − Y.

W10.2 Exam2016
(a) Two random variables, X ∼ N(0, 1) and U ∼ N(0, 1), are independent of each other. A new variable, Y = 65 X + 85 U , is constructed.
(i) Find E[Y] and Var[Y].
(ii) Identify the distribution of Y.
(iii) Find E[XY] and hence Corr[X, Y].
(iv) Imagine a large number of simulations from the joint distribution of X and Y and sketch a scatter plot of Y (y-axis) vs X (x-axis).
(b) Two random variables, X ∼ N(0, 1) and
\[
W =
\begin{cases}
1 & \text{with probability } 0.5 \\
-1 & \text{with probability } 0.5,
\end{cases}
\]
are independent of each other. A new variable, V = WX, is constructed.
(i) Find P(V ≤ v) in terms of the cdf of a standard Normal random variable, Φ, for a general v ∈ R. Hint: use the law of total probability, conditioning on the value of W.
(ii) Identify the distribution of V.
(iii) Find E[XV] and hence Corr[X, V].
(iv) Imagine a large number of simulations from the joint distribution of X and V and sketch a scatter plot of V vs X.

W10.3 Exam2008
Random variables X and Y have joint pdf
\[
f_{XY}(x, y) = \frac{1}{2\pi} \exp\left(-\frac{1}{2}(x^2 + y^2)\right) \quad \text{for } -\infty < x, y < \infty.
\]
(a) Find the marginal densities of X and Y and identify their form.
(b) Are X and Y independent? Give reasons.
(c) Suppose
\[
\begin{pmatrix} S \\ T \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.
\]
Give the mean and the variance of the random vector $(S, T)'$ and hence state its distribution.

W10.4 IndepN
The random variables $X_1, \ldots, X_n$ are independent with $E(X_i) = \mu_i$ and $\mathrm{Var}[X_i] = \sigma^2$ for 1 ≤ i ≤ n. For constants $a_i$, $b_i$, 1 ≤ i ≤ n, show that
\[
\mathrm{Cov}\left[\sum_i a_i X_i, \sum_i b_i X_i\right] = \sigma^2 \sum_i a_i b_i.
\]
Deduce that if $X_1, \ldots, X_n$ are normal random variables, then $\sum_i a_i X_i$ and $\sum_i b_i X_i$ are independent if and only if $\sum_i a_i b_i = 0$.

W10.5 MVN:4→3
Let $(X_1, X_2, X_3, X_4)'$ be multivariate Normal with
\[
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{pmatrix} \sim \mathrm{MVN}_4\left( \begin{pmatrix} 2 \\ 3 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 17 & 2 & -3 & -2 \\ 2 & 9 & 5 & -9 \\ -3 & 5 & 11 & 1 \\ -2 & -9 & 1 & 14 \end{pmatrix} \right).
\]
Find the distribution of $(X_1 + X_3,\ 2X_2 - X_4,\ X_3 + X_4)$.

W10.6 Conditional
Suppose that X and Y have a bivariate Normal distribution with parameters $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_Y^2$ and ρ. Show that the conditional distribution of X given Y = y is Normal with expectation $\mu_X + \rho(y - \mu_Y)\sigma_X/\sigma_Y$ and variance $\sigma_X^2(1 - \rho^2)$. You may assume that the marginal distribution of Y is $N(\mu_Y, \sigma_Y^2)$.

W10.7 Exam 2017
For two random variables X and Y with E[X] = 0, Var[X] = 1, E[Y] = $\mu_Y$, Var[Y] = 1 and Cov[X, Y] = η, two new random variables, S and T, are defined as
\[
\begin{pmatrix} S \\ T \end{pmatrix} = \begin{pmatrix} a & 0 \\ b & c \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}.
\]
(a) In terms of $\mu_Y$ and η, write down E[S] and E[T], and, using the covariance sandwich formula or otherwise, find Var[S], Var[T] and Cov[S, T].
(b) Now, and for the remainder of this question, suppose that X and Y are independent. Use the representation of S and T in terms of X and Y to find the conditional expectation and conditional variance of T | S = s.
(c) Further to the independence of X and Y, now suppose that X ∼ N(0, 1).
(i) Simplify the expression for Cov[S, T] from part (a) and hence find E[ST].
(ii) Find $E[S^2 T^2]$ (Hint: $E[X^4] = 3$).
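
The hint $E[X^4] = 3$ for X ∼ N(0, 1) can be sanity-checked by simulation, in the spirit of the A07.2 suggestion to check answers in R. The following is a minimal Python sketch rather than part of the question; the function name, sample size and seed are illustrative assumptions.

```python
import random


def estimate_fourth_moment(n_samples=200_000, seed=230):
    """Monte Carlo estimate of E[X^4] for X ~ N(0, 1).

    n_samples and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)  # fixed seed so that a run is reproducible
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)  # one draw from the standard Normal
        total += x ** 4
    return total / n_samples  # sample mean of X^4


if __name__ == "__main__":
    # The estimate should lie close to the value 3 quoted in the hint.
    print(estimate_fourth_moment())
```

The same pattern (draw many samples, average a function of them) can be used to check other expectations on this sheet, provided a full distribution is specified for every random variable involved.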