International Baccalaureate Higher Level

Discrete random variables

Learning outcomes
This work will help you to learn
• about probability distributions for discrete random variables
• how to calculate and use E(X), the expectation (mean)
• how to calculate and use E[g(X)], the expectation of a simple function of X
• how to calculate and use Var(X), the variance of X
• about the cumulative distribution function F(x)
• about the binomial and Poisson distributions
When a variable is discrete, it is possible to specify or describe all its possible numerical values, for example
• the number of females in a group of four students: the possible values are 0, 1, 2, 3, 4,
• the amount gained, in pence, in a game where the entry fee is 10p and the prizes are 50p and £1: the possible values are -10, 40, 90,
• the number of times you throw a die until a six appears: the possible values are 1, 2, 3, 4, 5, … to infinity.

Probability Distributions
Consider this situation: By mistake, three faulty fuses are put into a box containing two good fuses. The faulty and good fuses become mixed up and are indistinguishable by sight. You take two fuses from the box. What is the probability that you take
a. no faulty fuses,
b. one faulty fuse,
c. two faulty fuses?

Writing F for a faulty fuse and F' for a good fuse, the tree diagram gives

Outcome    Probability          Number faulty
F, F       3/5 × 2/4 = 0.3      2
F, F'      3/5 × 2/4 = 0.3      1
F', F      2/5 × 3/4 = 0.3      1
F', F'     2/5 × 1/4 = 0.1      0

a. P(no faulty fuses) = 0.1
b. P(one faulty fuse) = 0.3 + 0.3 = 0.6
c. P(two faulty fuses) = 0.3

The variable being considered here is 'the number of faulty fuses' and it can be denoted by X. Then the answers to the previous questions can be written as
a. P(X = 0) = 0.1
b. P(X = 1) = 0.6
c. P(X = 2) = 0.3
or placed in a table

x          0     1     2
P(X = x)   0.1   0.6   0.3

Since P(X = 0) + P(X = 1) + P(X = 2) = 1, X is a discrete random variable.
For a discrete random variable, the sum of the probabilities is 1:
$\sum_{\text{all } x} P(X = x) = 1$

The function that is responsible for allocating probabilities, P(X = x), is known as the probability density function of X (p.d.f. of X).

Two tetrahedral dice, each with faces labelled 1, 2, 3 and 4, are thrown and the score noted, where the score is the sum of the two numbers on which the dice land. Find the probability density function (p.d.f.) of X, where X is the random variable 'the score when the two dice are thrown'.

The p.d.f. of a discrete random variable Y is given by P(Y = y) = cy^2, for y = 0, 1, 2, 3, 4.
Given that c is a constant, find the value of c.

The discrete random variable W has probability distribution as shown.

w          -3     -2     -1     0      1
P(W = w)   0.1    0.25   0.3    0.15   d

Find
a. the value of d,
b. P(-3 ≤ W < 0),
c. P(W > -1),
d. P(-1 < W < 1),
e. the mode.

Expectation of X, E(X)
E(X) is read as 'E of X' and it gives an average or typical value of X, known as the expected value or expectation of X. This is comparable with the mean in descriptive statistics.

Experimental approach
The frequency distribution shows the results when an unbiased die is thrown 120 times.

Score, x       1    2    3    4    5    6    Total
Frequency, f   15   22   23   19   23   18   120

The mean score is
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1 \times 15 + 2 \times 22 + 3 \times 23 + 4 \times 19 + 5 \times 23 + 6 \times 18}{120} = 3.6$ (2 s.f.)

You could write this out in a different way:
$\bar{x} = 1 \times \frac{15}{120} + 2 \times \frac{22}{120} + 3 \times \frac{23}{120} + 4 \times \frac{19}{120} + 5 \times \frac{23}{120} + 6 \times \frac{18}{120}$
The fractions are the relative frequencies of the scores 1, 2, 3, 4, 5, 6 respectively. Notice that they are close to $\frac{20}{120} = \frac{1}{6}$.

Theoretical approach
When an unbiased die is thrown, the probability of obtaining a particular value is 1/6.
The probability distribution is P(X = x) = 1/6 for x = 1, 2, 3, 4, 5, 6.

Score, x   1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

The expected mean, or expectation of X, is obtained by multiplying each score by its probability, then summing. It is written E(X), so
$E(X) = \sum_{\text{all } x} x \, P(X = x)$
and
$E(X) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5$

A random variable X has probability distribution as shown. Find the expectation, E(X).

x          -2    -1    0      1     2
P(X = x)   0.3   0.1   0.15   0.4   0.05

Find the expected number of sixes when three fair dice are thrown.

Find the expectation, E(X).

x          1     2     3     4     5
P(X = x)   0.1   0.2   0.4   0.2   0.1

A fruit machine consists of three windows which operate independently. Each window shows pictures of fruits: lemons, apples, cherries or bananas. The probability that a window shows a particular fruit is as follows.
P(lemon) = 0.4, P(cherries) = 0.2, P(apple) = 0.1, P(banana) = 0.3

The rules for playing the game on the fruit machine are:
[Rules table: pictures of the winning combinations, one of them marked 'in any order'; the prizes shown are £1, 50p, 40p and 80p.]

Find the expected gain or loss if you play the game.

Expectation of any function of X, E[g(X)]
The definition of expectation can be extended to any function of X, such as 10X, X^2, 1/X, X - 4, etc.
In general, if g(X) is any function of the discrete random variable X, then
$E[g(X)] = \sum_{\text{all } x} g(x) \, P(X = x)$
For example
$E(10X) = \sum_{\text{all } x} 10x \, P(X = x)$
$E(X^2) = \sum_{\text{all } x} x^2 \, P(X = x)$
$E\left(\frac{1}{X}\right) = \sum_{\text{all } x} \frac{1}{x} \, P(X = x)$
$E(X - 4) = \sum_{\text{all } x} (x - 4) \, P(X = x)$

The random variable X has p.d.f. P(X = x) for x = 1, 2, 3 as shown.

x          1     2     3
P(X = x)   0.1   0.6   0.3

Calculate
a. E(X),
b. E(3),
c. E(5X),
d. E(5X + 3).

In general, for constants a and b,
E(a) = a
E(aX) = aE(X)
E(aX + b) = aE(X) + b

A six-sided die has faces marked with the numbers 1, 3, 5, 7, 9 and 11. It is biased so that the probability of obtaining the number R in a single roll of the die is proportional to R.
a. Show that the probability distribution of R is given by P(R = r) = r/36, r = 1, 3, 5, 7, 9, 11.
b. The die is to be rolled and a rectangle drawn with sides of length 6 cm and R cm. Calculate the expected value of the area of the rectangle.
c. The die is to be rolled again and a square drawn with sides of length 24R^(-1) cm. Calculate the expected value of the perimeter of the square.

Writing the constant of proportionality as k:

r          1    3    5    7    9    11
P(R = r)   k    3k   5k   7k   9k   11k

The probabilities sum to 36k = 1, so k = 1/36:

r          1      3      5      7      9      11
P(R = r)   1/36   3/36   5/36   7/36   9/36   11/36

X is the number of heads obtained when two coins are tossed. Find
a. the expected number of heads,
b. E(X^2),
c. E(X^2 - X).
x          0     1     2
P(X = x)   1/4   1/2   1/4

In general, for two functions of X, g(X) and h(X),
$E[g(X) + h(X)] = E[g(X)] + E[h(X)]$

Variance of X, Var(X)
Remember that variance = (standard deviation)^2.

Experimental approach
For a frequency distribution with mean $\bar{x}$, the variance $s^2$ is given by
$s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$ or $s^2 = \frac{\sum fx^2}{\sum f} - \bar{x}^2$

Theoretical approach
For a discrete random variable X, with $E(X) = \mu$, the variance is defined as follows:
$\mathrm{Var}(X) = E[(X - \mu)^2]$
Expanding the square gives an alternative form:
$\mathrm{Var}(X) = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2$

The random variable X has probability distribution as shown in the table:

x          1     2     3     4     5
P(X = x)   0.1   0.3   0.2   0.3   0.1

Find
a. E(X),
b. E(X^2),
c. Var(X),
d. σ, the standard deviation of X.

Two boxes each contain three cards. The first box contains cards labelled 1, 3 and 5; the second box contains cards labelled 2, 6 and 8. In a game, a player draws one card at random from each box and his score, X, is the sum of the numbers on the two cards.
a. Obtain the six possible values of X and find the corresponding probabilities.
b. Calculate E(X), E(X^2) and the variance of X.

Possible scores:

                 Second box
                 2     6     8
First box   1    3     7     9
            3    5     9     11
            5    7     11    13

x          3     5     7     9     11    13
P(X = x)   1/9   1/9   2/9   2/9   2/9   1/9

The following results relating to variance are useful. If a and b are any constants,
Var(a) = 0
Var(aX) = a^2 Var(X)
Var(aX + b) = a^2 Var(X)
For example
Var(2X) = 2^2 Var(X) = 4 Var(X)
Var(2X + 3) = 2^2 Var(X) = 4 Var(X)
Var(5 - X) = (-1)^2 Var(X) = Var(X)

The cumulative distribution function, F(x)
In a frequency distribution, the cumulative frequencies are obtained by summing all the frequencies up to a particular value. In the same way, in a probability distribution, the probabilities up to a certain value are summed to give the cumulative probability. The cumulative probability function is written F(x).
Consider the following probability distribution.
x          1      2     3     4      5
P(X = x)   0.05   0.4   0.3   0.15   0.1

F(1) = P(X ≤ 1) = 0.05
F(2) = P(X ≤ 2) = P(X = 1) + P(X = 2) = 0.05 + 0.4 = 0.45
F(3) = P(X ≤ 3) = 0.75
F(4) = P(X ≤ 4) = 0.9
F(5) = P(X ≤ 5) = 1

The cumulative distribution function is

x      1      2      3      4     5
F(x)   0.05   0.45   0.75   0.9   1

In general, for the discrete random variable X, the cumulative distribution function is F(x), where F(x) = P(X ≤ x).

The discrete random variable X has cumulative distribution function F(x) = x/6 for x = 1, 2, …, 6. Write out the probability distribution and suggest what X represents.

x      1     2     3     4     5     6
F(x)   1/6   2/6   3/6   4/6   5/6   1

x          1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

For a discrete random variable X the cumulative distribution function F(x) is shown.

x      1     2      3      4     5
F(x)   0.2   0.32   0.67   0.9   1

Find
a. P(X = 3),
b. P(X > 2).

The binomial distribution
In a particular population, 10% of people have blood type B. If three people are selected at random from the population, what is the probability that exactly two of them have blood type B?

[Tree diagram: three selections; at each stage the branch B has probability 0.1 and the branch B' has probability 0.9.]

P(exactly two type B) = P(B, B, B') + P(B, B', B) + P(B', B, B) = 3 × 0.9 × 0.1^2 = 0.027

Now consider the situation when eight people are selected. What is the probability that exactly two of the eight people will have blood type B?
Can you find the probability that exactly two have blood type B in a randomly selected group of 12 people?

Conditions for binomial model
For a situation to be described using a binomial model
• a finite number, n, of trials are carried out,
• the trials are independent,
• the outcome of each trial is deemed either a success or a failure,
• the probability, p, of a successful outcome is the same for each trial.

The discrete random variable X is the number of successful outcomes in n trials. Then X is said to follow a binomial distribution, written X ~ B(n, p) or X ~ Bin(n, p).
NOTE: The number of trials, n, and the probability of success, p, are both needed in order to describe the distribution completely.
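The blood-type count for three people (3 arrangements, each with probability 0.9 × 0.1^2) extends to larger groups by counting the arrangements with a binomial coefficient. The following is a minimal Python sketch of that idea, assuming independent selections with P(B) = 0.1 as in the example; the function name `prob_exactly` is illustrative and not part of the original notes.

```python
from math import comb


def prob_exactly(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p (illustrative helper)."""
    q = 1 - p  # probability of failure on a single trial
    # comb(n, r) counts the arrangements of r successes among n trials;
    # each arrangement has probability p**r * q**(n - r).
    return comb(n, r) * p**r * q ** (n - r)


# Exactly two people with blood type B in groups of 3, 8 and 12.
for n in (3, 8, 12):
    print(n, round(prob_exactly(2, n, 0.1), 4))
# prints:
# 3 0.027
# 8 0.1488
# 12 0.2301
```

For n = 3 this reproduces the value 0.027 obtained from the tree diagram.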
We write P(failure) as q, where q = 1 - p.
If X ~ B(n, p), the probability of obtaining r successes in n trials is P(X = r), where
$P(X = r) = \binom{n}{r} q^{n-r} p^{r}$ for r = 0, 1, 2, 3, …, n.

At Sellitall Supermarket, 60% of the customers pay by credit card. Find the probability that, in a randomly selected sample of ten customers,
a. exactly two pay by credit card,
b. more than seven pay by credit card.

Five independent trials of an experiment are carried out. The probability of a successful outcome is p and the probability of failure is 1 - p = q. Write out the probability distribution of X, where X is the number of successful outcomes in five trials. Comment on your answer.

The random variable X is distributed B(7, 0.2). Find, correct to three decimal places,
a. P(X = 3),
b. P(1 < X ≤ 4),
c. P(X > 1).

A box contains a large number of pens. The probability that a pen is faulty is 0.1. How many pens would you need to select to be more than 95% certain of picking at least one faulty one?

Expectation and variance of the binomial distribution
It can be shown that if X ~ B(n, p), then E(X) = np and Var(X) = npq, where q = 1 - p.

The random variable X is B(4, 0.8). Construct the probability distribution for X and find the expectation and variance. Verify that E(X) = np and Var(X) = npq.

x          0        1        2        3        4
P(X = x)   0.0016   0.0256   0.1536   0.4096   0.4096

The probability that it will be a fine day is 0.4. Find the expected number of fine days in a week and also the standard deviation.

X is B(n, p) with mean 5 and standard deviation 2. Find the values of n and p.

The mode of the binomial distribution
The mode is the value of X that is most likely to occur.
Consider the following probability sketches.
[Figure: probability sketches (p against x) for X ~ B(7, 0.35), X ~ B(4, 0.5), X ~ B(9, 0.5), X ~ B(6, 0.8) and X ~ B(20, 0.25). For B(20, 0.25) the probabilities for x = 13 to 20 are too small to illustrate.]

• when p = 0.5 and n is odd, there are two modes (more generally, this happens whenever (n + 1)p is an integer),
• otherwise the distribution has one mode.

The mode can be found by calculating all the probabilities and finding the value of X with the highest probability. Without a GDC this can be very tedious; it is usually only necessary to consider the probabilities of values of X close to the mean np.

The probability that a student is awarded a distinction in the Mathematics examination is 0.05. In a randomly selected group of 50 students, what is the most likely number of students awarded a distinction?

The Poisson distribution
Consider these random variables
• the number of emergency calls received by an ambulance control in an hour,
• the number of vehicles approaching an expressway toll bridge in a five-minute interval,
• the number of flaws in a metre length of material,
• the number of white corpuscles on a slide.
Assuming these occur randomly, they are all examples of variables that can be modelled using a Poisson distribution.

Conditions for Poisson model
• Events occur singly and at random in a given interval of time or space.
• λ, the mean number of occurrences in the given interval, is known and is finite.
The variable X is the number of occurrences in the given interval. If the above conditions are satisfied, X is said to follow a Poisson distribution, written X ~ Po(λ), where
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$ for x = 0, 1, 2, 3, … to infinity.

A student finds that the average number of amoebas in 10 ml of pond water from a particular pond is four. Assuming that the number of amoebas follows a Poisson distribution, find the probability that in a 10 ml sample
a. there are exactly five amoebas,
b. there are no amoebas,
c.
there are fewer than three amoebas.

These two results are useful in general: if X ~ Po(λ), then P(X = 0) = e^(-λ) and P(X = 1) = λe^(-λ).

Unit interval
Care must be taken to specify the interval being considered. In the previous example the mean number of amoebas in 10 ml of pond water from a particular pond is 4, so the number in 10 ml is distributed Po(4).
Now suppose you want to find a probability relating to the number of amoebas in 5 ml of water from the same pond. The mean number of amoebas in 5 ml is two, so the number in 5 ml is distributed Po(2).
Similarly, the number of amoebas in 1 ml of pond water is distributed Po(0.4).

On average the school photocopier breaks down eight times during the school week (Monday to Friday). Assuming that the number of breakdowns can be modelled by a Poisson distribution, find the probability that it breaks down
a. five times in a given week,
b. once on Monday,
c. eight times in a fortnight.

Mean and variance of the Poisson distribution
If X ~ Po(λ), then E(X) = λ and Var(X) = λ.

X follows a Poisson distribution with standard deviation 1.5. Find P(X ≥ 3).

[Figure: probability sketches (p against x) for X ~ Po(1), X ~ Po(1.6), X ~ Po(2), X ~ Po(2.2), X ~ Po(3), X ~ Po(3.8), X ~ Po(5) and X ~ Po(10).]
Notice that for small values of λ the distribution is very skew, but it becomes more symmetrical as λ increases.

The mode of the Poisson distribution
The mode is the value of X that is the most likely to occur, i.e. the value with the greatest probability. From the diagrams, we can see that
when λ = 1, there are two modes, 0 and 1,
when λ = 2, there are two modes, 1 and 2,
when λ = 3, there are two modes, 2 and 3.
In general, if λ is an integer, there are two modes, λ - 1 and λ. For example, if X ~ Po(8), the modes are 7 and 8.
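The two-mode pattern for integer λ can be checked without the diagrams by comparing consecutive probabilities. The short derivation below is a sketch that uses only the Poisson formula P(X = x) = e^(-λ) λ^x / x!.

```latex
\[
\frac{P(X = x)}{P(X = x - 1)}
  = \frac{e^{-\lambda}\lambda^{x} / x!}{e^{-\lambda}\lambda^{x-1} / (x-1)!}
  = \frac{\lambda}{x}, \qquad x = 1, 2, 3, \dots
\]
```

The ratio is greater than 1 while x < λ, so the probabilities are increasing; it is less than 1 once x > λ, so they are decreasing. When λ is a positive integer the ratio equals exactly 1 at x = λ, which gives P(X = λ - 1) = P(X = λ), and these two values share the greatest probability.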
Notice also that
when λ = 1.6, the mode is 1,
when λ = 2.2, the mode is 2,
when λ = 3.8, the mode is 3.
In general, if λ is not an integer, the mode is the integer below λ. For example, if X ~ Po(4.9), the mode is 4.
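The mode rules can be checked numerically by evaluating the Poisson probabilities directly. The following is a minimal Python sketch under the assumption that searching x = 0, 1, …, ⌊λ⌋ + 2 is enough (the probabilities decrease beyond ⌊λ⌋); the function names `poisson_pmf` and `poisson_modes` are illustrative and not part of the original notes.

```python
from math import exp, factorial, floor


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_modes(lam, tol=1e-12):
    """Value(s) of x with the greatest probability, found by direct search.

    Assumption: the search stops at floor(lam) + 2, because the
    probabilities decrease once x exceeds lam.
    """
    upper = floor(lam) + 2
    probs = [poisson_pmf(x, lam) for x in range(upper + 1)]
    best = max(probs)
    # A small relative tolerance treats equal probabilities as tied
    # even if floating-point rounding makes them differ slightly.
    return [x for x, p in enumerate(probs) if abs(p - best) <= tol * best]


for lam in (1, 2, 3, 8, 1.6, 2.2, 3.8, 4.9):
    print(lam, poisson_modes(lam))
```

For the integer values of λ the search returns the pair λ - 1 and λ, and for the non-integer values it returns the single integer below λ, in line with the statements above.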