Probability Distributions

A probability function is a function that assigns probabilities to the values of a random variable.

In the discrete case, individual probabilities are denoted by \( P(X = x) \), which indicates that the random variable can take various specific values. Every probability must lie between 0 and 1:

\[ 0 \le P(X = x) \le 1 \]

and the probabilities of all the outcomes must sum to 1:

\[ \sum_{x} P(X = x) = 1 \]

In the continuous case, the probability function is denoted by \( f(x) \), which indicates that a mathematical function is involved.

Probability distributions covered here:
Discrete probability distributions: Binomial, Poisson
Continuous probability distributions: Normal

Example 1.19
Check whether the distribution is a probability distribution.

x          0      1      2      3      4
P(X = x)   0.125  0.375  0.025  0.375  0.125

Solution

\[ \sum_{x=0}^{4} P(X = x) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) \]
\[ = 0.125 + 0.375 + 0.025 + 0.375 + 0.125 = 1.025 \ne 1 \]

so the distribution is not a probability distribution.

Binomial Distribution

An experiment that satisfies the following characteristics is called a binomial experiment:
1. The random experiment consists of n identical trials.
2. Each trial can result in one of two outcomes, which we denote by success, S, or failure, F.
3. The trials are independent.
4. The probability of success is constant from trial to trial. We denote the probability of success by p and the probability of failure by q = 1 - p.

Examples:
1. The number of heads obtained in tossing a coin 10 times.
2. The number of sixes obtained in tossing 7 dice.
3. A firm bidding for contracts will either get a contract or not.

A binomial experiment consists of n identical trials with probability of success p in each trial. The probability of x successes in n trials is given by

\[ P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}, \quad x = 0, 1, 2, \dots, n \]

If \( X \sim B(n, p) \), the mean, variance and standard deviation of X are

Mean: \( \mu = E(X) = np \)
Variance: \( \sigma^2 = V(X) = np(1 - p) = npq \)
Standard deviation: \( \sigma = \sqrt{npq} \)

where n is the total number of trials, p is the probability of success and q is the probability of failure.
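The two conditions for a probability distribution and the binomial formula above can be sketched in Python. This is a minimal illustration rather than part of the original notes: the function names is_probability_distribution and binomial_pmf, and the parameters n = 12 and p = 0.4, are assumptions chosen for the example.

```python
from math import comb, isclose, sqrt


def is_probability_distribution(probs):
    """Return True if every value lies in [0, 1] and the values sum to 1."""
    return all(0.0 <= p <= 1.0 for p in probs) and isclose(sum(probs), 1.0)


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), computed as nCx * p^x * q^(n - x)."""
    q = 1.0 - p
    return comb(n, x) * p**x * q ** (n - x)


# Table from Example 1.19: the values sum to 1.025, so the check fails.
example_1_19 = [0.125, 0.375, 0.025, 0.375, 0.125]
print(is_probability_distribution(example_1_19))  # False

# Illustrative parameters (an assumption for this sketch): n = 12, p = 0.4.
n, p = 12, 0.4
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(is_probability_distribution(pmf))  # True

# Mean, variance and standard deviation from the closed-form expressions.
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)
print(round(mean, 2), round(variance, 2), round(std_dev, 4))
```

Summing the full list of binomial probabilities is a quick sanity check that the formula has been coded correctly, since a valid distribution must sum to 1.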
Example 1.20
Given that \( X \sim b(12, 0.4) \), find
a) \( P(X = 2) \)
b) \( P(X = 3) \)
c) \( P(X = 4) \)
d) \( P(2 \le X < 5) \)
e) \( E(X) \)
f) \( Var(X) \)

SOLUTIONS:
a) \( P(X = 2) = {}^{12}C_{2} (0.4)^{2} (0.6)^{10} = 0.0639 \)
b) \( P(X = 3) = {}^{12}C_{3} (0.4)^{3} (0.6)^{9} = 0.1419 \)
c) \( P(X = 4) = {}^{12}C_{4} (0.4)^{4} (0.6)^{8} = 0.2128 \)
d) \( P(2 \le X < 5) = P(X = 2) + P(X = 3) + P(X = 4) = 0.0639 + 0.1419 + 0.2128 = 0.4186 \)
e) \( E(X) = np = 12(0.4) = 4.8 \)
f) \( Var(X) = npq = 12(0.4)(0.6) = 2.88 \)

Cumulative Binomial Distribution

When the sample is relatively large, binomial tables are often used. Because the probabilities in the tables are given in the cumulative form \( P(X \le k) \), other probability statements must first be rewritten in terms of \( P(X \le k) \), as in the following example.

Example 1.21
In a binomial distribution, n = 12 and p = 0.3. Find the following probabilities.
a) \( P(X < 5) = P(X \le 4) = 0.7237 \)
b) \( P(X \le 5) = 0.8822 \)
c) \( P(X \ge 9) = 1 - P(X \le 8) = 1 - 0.9983 = 0.0017 \)
d) \( P(5 < X \le 9) = P(X \le 9) - P(X \le 5) = 0.9998 - 0.8822 = 0.1176 \)
e) \( P(3 \le X \le 5) = P(X \le 5) - P(X \le 2) = 0.8822 - 0.2528 = 0.6294 \)

Example 1.22
In August 2009, David and Maria conducted a survey for Fortune magazine to examine CEOs' attitudes toward employees' personal problems. 30% of the CEOs interviewed felt that personal problems were none of the company's business. Assume that this result holds for the current population of CEOs. Using the binomial distribution tables, for a random sample of 15 CEOs, find the probability that
(i) the number of CEOs who hold this view is 10.
(ii) the number of CEOs who hold this view is between 9 and 12.
(iii) the number of CEOs who hold this view is at most 7.
(iv) Also find the mean and standard deviation of the binomial distribution.

Solution:
i) \( P(X = 10) \)
ii) \( P(9 \le X \le 12) \)
iii) \( P(X \le 7) \)
iv) \( \mu = np \), \( \sigma = \sqrt{npq} \)

Poisson Distribution

The Poisson distribution is the probability distribution of the number of successes in a given space*.
*Space can be a dimension, a place, a time, or a combination of them.

Examples:
1. The number of cars passing a toll booth in one hour.
2. The number of defects in a square meter of fabric.
3. The number of network errors experienced in a day.
A random variable X has a Poisson distribution, and is referred to as a Poisson random variable, if and only if its probability distribution is given by

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, 3, \dots \]

A random variable X having a Poisson distribution is written \( X \sim Po(\lambda) \), with \( E(X) = \lambda \) and \( Var(X) = \lambda \).

Example 1.23
Consider a Poisson random variable with \( \lambda = 3 \).
a) Write the Poisson distribution of X.
Then calculate the following probabilities:
b) \( P(X = 0) \)
c) \( P(X = 1) \)
d) \( P(X > 1) \)

Solution:
a) \( X \sim Po(3) \)
b) \( P(X = 0) \)
c) \( P(X = 1) \)
d) \( P(X > 1) \)

Example 1.24
The average number of traffic accidents on a certain section of highway is two per week. Assume that the number of accidents follows a Poisson distribution with mean 2 per week.
i) Find the probability of no accidents on this section of highway during a 1-week period.
ii) Find the probability of at most three accidents on this section of highway during a 2-week period.

Solution:
i) \( P(X = 0) \)
ii) \( P(X \le 3) \), where the mean for the 2-week period is \( 2 \times 2 = 4 \)

Poisson Approximation of Binomial Probabilities

The Poisson distribution is suitable as an approximation of binomial probabilities when n is large and p is small. The approximation can be made when \( n \ge 30 \) and either \( np < 5 \) or \( nq < 5 \).

Example 1.25
Suppose a life insurance company insures the lives of 4000 men aged 42. If actuarial studies show the probability that any 42-year-old man will die in a given year to be 0.001, find the probability that the company will have to pay x = 4 claims during a given year.

Solution:
Steps:
1. Write the binomial distribution: \( X \sim B(4000, 0.001) \).
2. The value of n is large and the value of p is very small, so check whether \( np < 5 \).
3. If yes, proceed to solve using the Poisson approximation with \( \lambda = np \). Use the formula \( \frac{e^{-\lambda} \lambda^{x}}{x!} \).

Exercise 1.4
1. Given that \( X \sim B(2, 0.4) \), find \( P(X = 0) \), \( P(X = 2) \), \( P(X \le 2) \), \( P(X \ge 1) \), \( E(X) \), \( Var(X) \). (ans: 0.36, 0.16, 1.0, 0.64, 0.8, 0.48)
2. In Kuala Lumpur, 30% of workers take public transportation.
In a sample of 10 workers,
i) what is the probability that exactly three workers take public transportation daily? (ans: 0.2668)
ii) what is the probability that at least three workers take public transportation daily? (ans: 0.6172)
3. Let \( X \sim Po(12) \). Using the Poisson distribution table, find
i) \( P(X \le 8) \) and \( P(X = 8) \) (ans: 0.1550, 0.0655)
ii) \( P(X \ge 4) \) and \( P(X > 4) \) (ans: 0.9977, 0.9924)
iii) \( P(4 \le X \le 14) \) (ans: 0.7697)
4. Last month ABC company sold 1000 new watches. Past experience indicates that the probability that a new watch will need repair during its warranty period is 0.002. Compute the probability that:
i) at least 5 watches will need warranty work. (ans: 0.0527)
ii) at most 3 watches will need warranty work. (ans: 0.8571)
iii) fewer than 7 watches will need warranty work. (ans: 0.9955)

Normal Distribution

Many continuous variables have distributions that closely resemble the normal distribution. The normal distribution can also be used to approximate various discrete probability distributions.
A continuous random variable X is said to have a normal distribution with parameters \( \mu \) and \( \sigma^2 \), where \( -\infty < \mu < \infty \) and \( \sigma^2 > 0 \), if the pdf of X is

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}, \quad -\infty < x < \infty \]

X is denoted by \( X \sim N(\mu, \sigma^2) \) with \( E(X) = \mu \) and \( V(X) = \sigma^2 \).

CHARACTERISTICS OF NORMAL DISTRIBUTION

'Bell shaped'
Symmetrical
Mean, median and mode are equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: \( -\infty \) to \( +\infty \)

[Figure: bell-shaped curve of f(X) against X, centred at μ with spread σ; Mean = Median = Mode]

Many Normal Distributions

By varying the parameters μ and σ, we obtain different normal distributions.

The Standard Normal Distribution

Any normal distribution (with any combination of mean and standard deviation) can be transformed into the standard normal distribution (Z). X units are transformed into Z units using

\[ Z = \frac{X - \mu}{\sigma} \]

The standardized normal distribution (Z) has a mean of \( \mu = 0 \) and a standard deviation of \( \sigma = 1 \), so \( \sigma^2 = 1 \). Z is denoted by \( Z \sim N(0, 1) \). Thus, its density function becomes

\[ f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^{2}}, \quad -\infty < z < \infty \]

Calculating Probabilities for a General Normal Random Variable

Most probabilities of interest involve x, a normal random variable with mean μ and standard deviation σ. You then have to standardize the interval of interest, writing it in terms of z, the standard normal random variable. Once this is done, the probability of interest is the area that you find using the standard normal probability distribution.

For a normal probability distribution \( X \sim N(\mu, \sigma^2) \), transform x to z using

\[ Z = \frac{X - \mu}{\sigma} \]

PATTERNS FOR FINDING AREAS UNDER THE STANDARD NORMAL CURVE

Example 1.26
a) Find the area under the standard normal curve for \( P(0 < Z < 1) \).
b) Find the area under the standard normal curve for \( P(-2.34 < Z < 0) \).

Exercise 1.5
Determine the probability or area for the portions of the normal distribution described.
a) \( P(0 < Z < 0.45) \)
b) \( P(-2.02 < Z < 0) \)
c) \( P(Z < 0.87) \)
d) \( P(-2.1 < Z < 3.11) \)
e) \( P(1.5 < Z < 2.55) \)
Answer: a) 0.1736, b) 0.4783, c) 0.8078, d) 0.9812, e) 0.0614

Exercise 1.6
Determine z such that
a) \( P(Z < z) = 0.25 \)
b) \( P(Z < z) = 0.36 \)
c) \( P(Z < z) = 0.983 \)
d) \( P(Z < z) = 0.89 \)

Solutions:
a) \( P(Z < z) = 0.25 \); \( z = -0.675 \)
b) \( P(Z < z) = 0.36 \); \( z = -0.3585 \)
c) \( P(Z < z) = 0.983 \); \( z = 2.12 \)
d) \( P(Z < z) = 0.89 \); \( z = 1.2265 \)

Example 1.28
Suppose X has a normal distribution N(25, 25). Find
a) \( P(24 \le X \le 35) \)
b) \( P(X > 20) \)

Solutions
Here \( \sigma = \sqrt{25} = 5 \).
a)
\[ P(24 \le X \le 35) = P\left( \frac{24 - 25}{5} \le Z \le \frac{35 - 25}{5} \right) = P(-0.2 \le Z \le 2) \]
\[ = P(Z \le 2) - P(Z \le -0.2) = P(0 < Z \le 2) + P(0 < Z < 0.2) = 0.4772 + 0.0793 = 0.5565 \]
b)
\[ P(X > 20) = P\left( Z > \frac{20 - 25}{5} \right) = P(Z > -1) = P(Z < 1) = 0.5 + 0.3413 = 0.8413 \]

Example 1.29
An assessment test is used to measure a person's readiness for college. The mathematics scores in the test are scaled to have a normal distribution with mean 500 and standard deviation 100.
i. What is the probability that a person taking the test will score below 350?
ii. Remedial assistance will be given to students in the bottom 10%. What is the maximum score of this group of students?

Solution:
i. \( P(X < 350) \)
ii. \( P\left( Z < \frac{x_{\max} - 500}{100} \right) = 0.10 \)

Exercise 1.7
1. Suppose X has a normal distribution, N(70, 4). Find
a) \( P(67 < X < 75) \)
b) \( P(X > 74) \)
2. Suppose the test scores of 600 students are normally distributed with a mean of 76 and a standard deviation of 8. The number of students scoring from 70 to 82 is:
Answer: 1. a) 0.927 b) 0.0228 2. 328 students

Normal Approximation of the Binomial Distribution

When the number of observations or trials n in a binomial experiment is relatively large, the normal probability distribution can be used to approximate binomial probabilities. A convenient rule is that such an approximation is acceptable when \( n \ge 30 \), and both \( np > 5 \) and \( nq > 5 \).
Given a random variable \( X \sim b(n, p) \), if \( n \ge 30 \) and both \( np > 5 \) and \( nq > 5 \), then \( X \approx N(np, npq) \) with \( \mu = np \) and \( \sigma^2 = npq \).

Continuity Correction Factor

A continuity correction is needed when a continuous curve is used to approximate a discrete probability distribution. 0.5 is added or subtracted as a continuity correction factor (c.c) according to the form of the probability statement, as follows:

a) \( P(X = x) \xrightarrow{\text{c.c}} P(x - 0.5 < X < x + 0.5) \)
b) \( P(X \le x) \xrightarrow{\text{c.c}} P(X < x + 0.5) \)
c) \( P(X < x) \xrightarrow{\text{c.c}} P(X < x - 0.5) \)
d) \( P(X \ge x) \xrightarrow{\text{c.c}} P(X > x - 0.5) \)
e) \( P(X > x) \xrightarrow{\text{c.c}} P(X > x + 0.5) \)

How to Calculate Binomial Probabilities Using the Normal Approximation

1. Find the necessary values of n and p. Calculate \( \mu = np \) and \( \sigma = \sqrt{npq} \).
2. Write the probability you need in terms of x.
3. Correct the value of x with the appropriate continuity correction factor.
4. Convert the necessary x-values to z-values using
\[ z = \frac{x \pm 0.5 - np}{\sqrt{npq}} \]
5. Use the standard normal table to calculate the approximate probability.

Example 1.30
In a certain country, 45% of registered voters are male. If 300 registered voters from that country are selected at random, find the probability that at least 155 are male.

Solutions:
X is the number of male voters, \( X \sim b(300, 0.45) \).
\[ P(X \ge 155) \xrightarrow{\text{c.c}} P(X > 155 - 0.5) = P(X > 154.5) \]
\( np = 300(0.45) = 135 > 5 \); \( nq = 300(0.55) = 165 > 5 \)
Therefore, \( X \approx N(135, 74.25) \) and
\[ P\left( Z > \frac{154.5 - 135}{\sqrt{74.25}} \right) = P(Z > 2.26) = 0.5 - 0.4881 = 0.0119 \]

Exercise 1.8
Suppose that 5% of the population over 70 years old has disease A. Suppose a random sample of 9600 people over 70 is taken. What is the probability that fewer than 500 of them have disease A?
Answer: 0.8186

Normal Approximation of the Poisson Distribution

When the mean of a Poisson distribution is relatively large, the normal probability distribution can be used to approximate Poisson probabilities. A convenient rule is that such an approximation is acceptable when \( \lambda > 10 \).

Given a random variable \( X \sim Po(\lambda) \), if \( \lambda > 10 \), then \( X \approx N(\lambda, \lambda) \).

Example 1.31
A grocery store has an ATM inside.
An average of 5 customers per hour come to use the machine. What is the probability that more than 30 customers come to use the machine between 8.00 am and 5.00 pm?

Solution:
X is the number of customers who use the ATM in 9 hours.
\( X \sim Po(45) \); \( \lambda = 45 > 10 \), so \( X \approx N(45, 45) \).
\[ P(X > 30) \xrightarrow{\text{c.c}} P(X > 30 + 0.5) = P(X > 30.5) \]
\[ = P\left( Z > \frac{30.5 - 45}{\sqrt{45}} \right) = P(Z > -2.16) = 0.5 + 0.4846 = 0.9846 \]

Exercise 1.9
The average number of accidental drownings in the United States per year is 3.0 per 100000 population. Find the probability that in a city of population 400000 there will be fewer than 10 accidental drownings per year.
Answer: 0.2358

Exercise 1.10
1. The weekly income of a shift foreman in the glass industry is reported to be normally distributed with a mean of $1000 and a standard deviation of $100. What is the probability of selecting a shift foreman in the glass industry whose income is
a) between $1000 and $1100?
b) between $790 and $1000?
c) between $840 and $1200?
Answer: a) 0.3413, b) 0.4821, c) 0.9224
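The normal approximations in Examples 1.30 and 1.31 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names normal_cdf and approx_upper_tail are hypothetical, and the standard normal CDF is computed from the error function instead of being read from a Z table.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def approx_upper_tail(x, mean, variance, inclusive):
    """Normal approximation of P(X >= x) (inclusive) or P(X > x) for a discrete X."""
    # Correction by 0.5: P(X >= x) -> P(X > x - 0.5), P(X > x) -> P(X > x + 0.5)
    boundary = x - 0.5 if inclusive else x + 0.5
    z = (boundary - mean) / sqrt(variance)
    return 1.0 - normal_cdf(z)


# Example 1.30: X ~ b(300, 0.45), P(X >= 155), using mean np and variance npq.
n, p = 300, 0.45
p_binom = approx_upper_tail(155, n * p, n * p * (1 - p), inclusive=True)

# Example 1.31: X ~ Po(45), P(X > 30), using mean and variance both equal to 45.
lam = 45
p_pois = approx_upper_tail(30, lam, lam, inclusive=False)

print(round(p_binom, 4), round(p_pois, 4))
```

The printed values should be close to the table-based answers 0.0119 and 0.9846. Small differences in the last digit are expected because the table lookups round z to two decimal places, while this sketch uses the unrounded z.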