Continuous Random Variables & Probability Distributions; The Normal Distribution

Chapter 4: Continuous Random Variables and Probability Distributions
Walid Sharabati, Purdue University
February 14, 2014
Professor Sharabati (Purdue University), Spring 2014, Continuous Random Variables (Slide 1 of 37)

Chapter Overview

- Continuous random variables
- Probability density function (pdf): definition and interpretation
- Cumulative distribution function (cdf): definition and interpretation
- Relationship between cdf and pdf
- Expectation, variance, and percentiles for continuous rvs
- Some continuous distributions: uniform and exponential
- The normal distribution: using the normal table, approximating the binomial distribution

Continuous rv and the Probability Density Function

- Continuous random variables: definitions, examples
- Probability density functions (pdf): definitions, interpretations, examples

Continuous rv

Definition: A random variable X is said to be continuous if its set of possible values includes an entire interval of numbers on the real line.

Example: Make depth measurements at a randomly selected location in a specific lake. Let X = the depth at this location. X can be any value between 0 and the maximum depth M.

Example: A chemical compound is randomly selected; let X = its pH value. X can be any value between 0 and 14.

Probability Density Function (PDF)

Definition: Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with $a \le b$,

$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$

The graph of f is the density curve; that is, the probability that X falls in [a, b] is the area under f(x) above this interval.

f(x) must satisfy the following:
1. $f(x) \ge 0$ for all x.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

Interpretations of f(x)

The density function f(x) describes how probability density is distributed along the measurement axis; f(x) is not itself a probability.
1. For any c, $P(X = c) = 0$; that is, the probability that X takes any specific value is 0.
2. We can only look at the probability that X falls in a specific interval, which is given by integrating f(x) over that interval.
3. For any two numbers a and b with $a < b$,

$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) = \int_a^b f(x)\,dx.$$
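
The two pdf conditions and the "probability as area" reading can be checked numerically. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] and the helper name `integrate` are assumptions made for this example, not part of the slides, and the midpoint rule stands in for exact integration.

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    # Hypothetical density chosen for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    # It is nonnegative everywhere, so condition 1 holds by construction.
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


total_area = integrate(f, 0.0, 1.0)      # condition 2: should be close to 1
prob_interval = integrate(f, 0.2, 0.5)   # P(0.2 <= X <= 0.5) as an area

print(total_area, prob_interval)
```

Because the interval probability is an area, including or excluding the endpoints 0.2 and 0.5 does not change the result.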

Pdf Example - Uniform

A bus arrives every 30 minutes; let X = the waiting time until a bus comes. The pdf of X is:

$$f(x) = \frac{1}{30}, \quad 0 \le x \le 30.$$

- What is the probability that the waiting time is longer than 5 minutes?
- What is the probability that the waiting time is between 5 and 10 minutes?

In general: given $b > a$, an rv X with pdf $f(x) = \frac{1}{b-a}$, $a \le x \le b$, is said to have a uniform distribution.

Pdf Example - Exponential

Let X = the life span of some bacteria (in hours). X is a continuous rv whose pdf is given as:

$$f(x) = 2e^{-2x}, \quad x \ge 0.$$

- What is the probability that the bacteria live more than 2 hours?
- What is the probability that the bacteria die within an hour?

In general: given $\lambda > 0$, an rv X with pdf $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$, has an exponential distribution.
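
One possible way to work the four questions above is to integrate each pdf in closed form. The helper names `uniform_prob` and `exponential_prob` are hypothetical, introduced only for this sketch; the slides themselves leave the questions as exercises.

```python
import math


def uniform_prob(lo, hi, a=0.0, b=30.0):
    """P(lo <= X <= hi) for X uniform on [a, b]; the interval is clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)


def exponential_prob(lo, hi, lam=2.0):
    """P(lo <= X <= hi) for X exponential with rate lam; hi may be math.inf."""
    lo = max(lo, 0.0)
    upper_term = 0.0 if hi == math.inf else math.exp(-lam * hi)
    # Integral of lam * e^{-lam x} from lo to hi is e^{-lam lo} - e^{-lam hi}.
    return math.exp(-lam * lo) - upper_term


p_wait_over_5 = uniform_prob(5, 30)              # (30 - 5) / 30
p_wait_5_to_10 = uniform_prob(5, 10)             # (10 - 5) / 30
p_live_over_2 = exponential_prob(2, math.inf)    # e^{-4}
p_die_within_1 = exponential_prob(0, 1)          # 1 - e^{-2}

print(p_wait_over_5, p_wait_5_to_10, p_live_over_2, p_die_within_1)
```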

Cumulative Distribution Function, Expectation, Variance and Percentile

- Cumulative distribution function (cdf) for continuous rv: definition and interpretation, relationship between pdf and cdf, examples
- Expectation and variance of continuous rv: definition, examples
- Percentile: definition and interpretation, examples

Cumulative Distribution Function (CDF)

Definition: The cumulative distribution function F(x) for a continuous rv X is defined for every number x by:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy,$$

that is, F(x) is the area under f to the left of x. We have:
1. $0 \le F(x) \le 1$.
2. F(x) is non-decreasing.

F(x) and f(x)

From the definition of the cdf, we can easily derive:
- $P(X \le x) = F(x) = \int_{-\infty}^{x} f(y)\,dy$
- $f(x) = F'(x)$ at every x for which the derivative $F'(x)$ exists.
- For $a < b$, $P(a < X < b) = \int_a^b f(x)\,dx = F(b) - F(a)$
- $P(X > a) = \int_a^{\infty} f(x)\,dx = 1 - F(a)$

Finding F(x) and Using F(x) to Compute Probabilities

Uniform cdf: Find the cdf F(x) for the uniform distribution:

$$f(x) = \begin{cases} \frac{1}{10} & 2 \le x \le 12 \\ 0 & \text{otherwise} \end{cases}$$

- What is $P(X < 6)$?
- What is $P(X > 3)$?

Hint: In general, for the uniform pdf

$$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

the cdf is given by:

$$F(x) = \begin{cases} 0 & x < a \\ \frac{x-a}{b-a} & a \le x < b \\ 1 & x \ge b \end{cases}$$

Example Continued

Exponential cdf: Find the cdf F(x) for the exponential distribution $f(x) = \lambda e^{-\lambda x}$, $\lambda > 0$, $x \ge 0$.

- What is $P(X > a)$?
- What is $P(a < X < b)$?
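
For the uniform cdf exercise above (a = 2, b = 12), the piecewise hint translates directly into a small function. This is a sketch only; the function name `uniform_cdf` and the extra interval [4, 9] are illustrative assumptions.

```python
def uniform_cdf(x, a, b):
    """F(x) for X uniform on [a, b]: 0 below a, (x - a)/(b - a) inside, 1 from b on."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


a, b = 2.0, 12.0
p_below_6 = uniform_cdf(6.0, a, b)                            # P(X < 6) = F(6)
p_above_3 = 1.0 - uniform_cdf(3.0, a, b)                      # P(X > 3) = 1 - F(3)
p_between = uniform_cdf(9.0, a, b) - uniform_cdf(4.0, a, b)   # P(4 < X < 9) = F(9) - F(4)

print(p_below_6, p_above_3, p_between)
```

Once F is available, every interval probability reduces to a difference of two cdf values, with no further integration.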
Hint: The general form of an exponential cdf is:

$$F(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases}$$

Expectation of Continuous rv

Definition (Expectation): The expectation or mean value of a continuous rv X with pdf f(x) is defined as:

$$E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$

The expectation of a continuous rv is an integral instead of a summation; it is a measure of the center of the distribution.

Properties of Expectation for Continuous rv

1. $E(aX + b) = aE(X) + b$
2. $E(a_1 X_1 + a_2 X_2 + \dots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_n E(X_n)$
3. Expectation of a function of X: if h(X) is any function of X, the expectation of h(X) is:

$$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx$$

Examples of Expectations

Uniform expectation: Find the expectation of the uniform rv with pdf:

$$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

Answer: $E(X) = \frac{a+b}{2}$

Exponential expectation: Find the expectation of the exponential rv with parameter $\lambda$.

Answer: $E(X) = \frac{1}{\lambda}$

Examples Continued

Find $E(X^2)$ for the uniform distribution with parameters a, b.

Answer: $E(X^2) = \frac{a^2 + ab + b^2}{3}$

Find $E(X^2)$ for the exponential distribution with parameter $\lambda$, i.e., $f(x) = \lambda e^{-\lambda x}$, $\lambda > 0$, $x \ge 0$.

Answer: $E(X^2) = \frac{2}{\lambda^2}$

Variance of Continuous rv

Definition (Variance): The variance of a continuous rv X with pdf f(x) and expectation E(X) is:

$$Var(X) = \int_{-\infty}^{\infty} (x - E(X))^2 \cdot f(x)\,dx = E[(X - E(X))^2]$$

The standard deviation of X is $\sqrt{Var(X)}$.

The variance of a continuous rv is an integral instead of a summation; it is a measure of the spread of the distribution.
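
The defining integrals for E(X), E(X^2), and Var(X) can be approximated numerically and compared with the closed-form answers quoted above. The sketch below assumes a uniform density with illustrative endpoints a = 2 and b = 12 and uses a midpoint-rule helper (`integrate`, a hypothetical name); it is a sanity check, not a derivation.

```python
def integrate(g, lo, hi, n=200_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


a, b = 2.0, 12.0   # illustrative endpoints (an assumption for this example)


def pdf(x):
    # Uniform density on [a, b]; the integrals below only evaluate it inside [a, b].
    return 1.0 / (b - a)


mean = integrate(lambda x: x * pdf(x), a, b)                     # E(X)
second_moment = integrate(lambda x: x ** 2 * pdf(x), a, b)       # E(X^2)
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), a, b)   # Var(X) by definition
std_dev = variance ** 0.5

print(mean, (a + b) / 2)                              # compare with (a + b) / 2
print(second_moment, (a * a + a * b + b * b) / 3)     # compare with (a^2 + ab + b^2) / 3
print(variance, std_dev)
```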
Properties of Variance:
1. $Var(aX + b) = a^2 Var(X)$
2. $Var(X) = E(X^2) - (E(X))^2 = E(X^2) - \mu_X^2$

Examples of Variances

Variance of uniform: Find the variance of the uniform rv with pdf:

$$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

Answer: $Var(X) = \frac{(b-a)^2}{12}$

Variance of exponential: Find the variance of the exponential rv with parameter $\lambda$.

Answer: $Var(X) = \frac{1}{\lambda^2}$

Percentiles of a Continuous Distribution

Definition: Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted $\eta(p)$, is defined by:

$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy$$

$\eta(p)$ is the value on the measurement axis such that 100p% of the area under the graph of f(x) lies to the left of $\eta(p)$ and 100(1 − p)% lies to the right.

For example, $\eta(0.8)$, the 80th percentile, is the value such that 80% of the population lies below $\eta(0.8)$ and 20% lies above.

Median of a Continuous rv: 50th Percentile

Definition: The median of a continuous distribution, denoted $\tilde{\mu}$, is the 50th percentile. That is:

$$0.5 = F(\tilde{\mu}) = \int_{-\infty}^{\tilde{\mu}} f(y)\,dy$$

that is, the median divides the area under the pdf into two equal halves.
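
When the equation $p = F(\eta(p))$ cannot be solved by hand, one option is to solve it numerically. The sketch below uses bisection on the exponential cdf with $\lambda = 2$ from the earlier example; the function name `percentile` and the search bracket [0, 50] are assumptions for illustration. For the exponential case, solving $1 - e^{-\lambda x} = p$ gives $\eta(p) = -\ln(1-p)/\lambda$, which serves as a check.

```python
import math


def percentile(cdf, p, lo, hi, tol=1e-10):
    """Solve cdf(x) = p on [lo, hi] by bisection.

    Assumes cdf is continuous and non-decreasing with cdf(lo) <= p <= cdf(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid      # the percentile lies to the right of mid
        else:
            hi = mid      # the percentile lies at or to the left of mid
    return (lo + hi) / 2.0


lam = 2.0  # rate of the exponential example used earlier


def exp_cdf(x):
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


median = percentile(exp_cdf, 0.5, 0.0, 50.0)   # 50th percentile
p80 = percentile(exp_cdf, 0.8, 0.0, 50.0)      # 80th percentile

print(median, math.log(2) / lam)
print(p80, -math.log(1 - 0.8) / lam)
```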

Exercise

Find the 50th percentile of the pdf given below:

$$f(x) = \begin{cases} \frac{3}{2}(1 - x^2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Normal Distribution

- Normal pdf
- Standard normal, pdf and cdf
- Normal table
- $z_\alpha$ notation
- Non-standard normal
- Examples

Normal Distributions

Definition: A continuous rv X is said to have a normal distribution with parameters $\mu$ and $\sigma$, where $-\infty < \mu < \infty$ and $\sigma > 0$, if the pdf of X is

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$

- $E(X) = \mu$.
- $Var(X) = \sigma^2$, and thus the standard deviation is $\sigma$.

Standard Normal Distribution

Definition: The normal distribution with parameter values $\mu = 0$ and $\sigma = 1$ is called the standard normal distribution. The standard normal rv is denoted by Z. Its pdf is:

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$$

The cdf, denoted by $\Phi(z)$ (instead of F(z)), is:

$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y)\,dy$$

- The standard normal density curve is called the z curve.
- The z curve is bell shaped and symmetric with respect to the y axis.
- $\Phi(z)$ gives the area under the standard normal density curve from $-\infty$ to z.
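
Outside of a printed table, $\Phi(z)$ is usually evaluated through the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below relies on Python's standard `math.erf`; the names `std_normal_pdf` and `phi` are chosen here for illustration and do not come from the slides.

```python
import math


def std_normal_pdf(z):
    """Standard normal density f(z) = exp(-z^2 / 2) / sqrt(2 pi)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def phi(z):
    """Standard normal cdf via Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


center = phi(0.0)                             # half the area lies left of 0
symmetry_gap = phi(-1.3) + phi(1.3) - 1.0     # close to 0, since Phi(-z) = 1 - Phi(z)

print(center, symmetry_gap, std_normal_pdf(0.0))
```

The symmetry check reflects the bell shape of the z curve: the area to the left of −z equals the area to the right of z.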

Standard Normal Table

$\Phi(z)$ has no closed form in terms of elementary functions, so standard normal cdf values have been tabulated using numerical methods. Let Z be a standard normal rv; find the following using the standard normal table:

1. $P(Z \le 0.85)$: $P(Z \le 0.85) = \Phi(0.85)$
2. $P(Z > 1.32)$: $P(Z > 1.32) = 1 - P(Z < 1.32) = 1 - \Phi(1.32)$
3. $P(-2.1 < Z < 1.78)$: $P(-2.1 < Z < 1.78) = P(Z < 1.78) - P(Z < -2.1) = \Phi(1.78) - \Phi(-2.1)$

Another Example

Let Z be a standard normal rv; find z when:

1. $P(Z < z) = 0.9278$: $P(Z < z) = \Phi(z) = 0.9278$; look for 0.9278 in the table and find z accordingly.
2. $P(|Z| < z) = 0.8132$: $P(-z < Z < z) = P(-z < Z < 0) + P(0 < Z < z) = 2P(0 < Z < z) = 2(\Phi(z) - \Phi(0)) = 2(\Phi(z) - \frac{1}{2}) = 2\Phi(z) - 1 = 0.8132$, thus $\Phi(z) = 0.9066$; look for 0.9066 in the table and find z accordingly.

$z_\alpha$ Notation

Later, when we discuss inferential statistics, we will need values on the measurement axis that capture small tail areas under the normal curve. These are denoted $z_\alpha$:

- $z_\alpha$ denotes the value on the measurement axis for which $\alpha$ of the area under the z curve lies to the right of $z_\alpha$.
- $1 - \alpha$ is the area under the z curve that lies to the left of $z_\alpha$.
- That is, $z_\alpha$ is the $100(1 - \alpha)$th percentile of the standard normal distribution.
- The z curve is symmetric with respect to the y axis, so the area to the left of $-z_\alpha$ is also $\alpha$.
- The $z_\alpha$ values are usually referred to as z critical values.

Example: What is $z_{0.05}$? Which percentile is it?

Nonstandard Normal Distributions

Proposition: If $X \sim N(\mu, \sigma^2)$, then

$$Z = \frac{X - \mu}{\sigma}$$

has a standard normal distribution. Thus

$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$

$$P(X \le a) = \Phi\left(\frac{a-\mu}{\sigma}\right), \qquad P(X \ge b) = 1 - \Phi\left(\frac{b-\mu}{\sigma}\right)$$

Empirical Rule

For a nonstandard normal curve:
1. Roughly 68% of the values are within $\sigma$ of the mean.
2. Roughly 95% of the values are within $2\sigma$ of the mean.
3. Roughly 99.7% of the values are within $3\sigma$ of the mean.

Exercise of Nonstandard Normal

Reaction time for an in-traffic response to a brake signal from standard brake lights can be modeled with a normal distribution with mean 1.25 sec and standard deviation 0.46 sec. What is the probability that the reaction time is between 1.00 and 1.75 sec?

Percentiles of an Arbitrary Normal

The (100p)th percentile of a normal distribution with mean $\mu$ and standard deviation $\sigma$ can be obtained directly from the corresponding percentile of the standard normal:

(100p)th percentile for $N(\mu, \sigma^2)$ = $\mu$ + [(100p)th percentile for $N(0, 1)$] · $\sigma$

Example: What is the 95th percentile of $N(\mu = 2, \sigma = 4.5)$?
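
The standardization proposition and the percentile formula above can be tried out with `statistics.NormalDist` from the Python standard library (Python 3.8+), used here in place of a printed table. This is a sketch under one assumption: in the final example, 4.5 is read as the standard deviation, as the notation $\sigma = 4.5$ suggests. Software values may differ slightly from table look-ups that round z to two decimals.

```python
from statistics import NormalDist

Z = NormalDist(0.0, 1.0)   # standard normal

# Reaction-time exercise: X is normal with mean 1.25 and std dev 0.46.
mu, sigma = 1.25, 0.46
p_reaction = Z.cdf((1.75 - mu) / sigma) - Z.cdf((1.00 - mu) / sigma)

# Same probability computed without standardizing by hand, as a cross-check.
X = NormalDist(mu, sigma)
p_direct = X.cdf(1.75) - X.cdf(1.00)

# z_{0.05} leaves area 0.05 to its right, so it is the 95th percentile of Z.
z_05 = Z.inv_cdf(0.95)

# 95th percentile of a normal with mean 2 and std dev 4.5: mu + z * sigma.
pct95 = 2.0 + z_05 * 4.5

print(round(p_reaction, 4), round(p_direct, 4), round(z_05, 4), round(pct95, 4))
```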

Normal Approximation to Binomial

Let X be a binomial rv based on n trials, each with probability of success p. If the binomial pmf (histogram) is not too skewed, X has approximately a normal distribution with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$:

$$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$$

In practice, the approximation is adequate provided that both $np \ge 10$ and $n(1 - p) \ge 10$.
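
To see how close the continuity-corrected approximation is, it can be compared with the exact binomial cdf. The values n = 50, p = 0.4, x = 20 below are hypothetical, chosen so that np = 20 and n(1 − p) = 30 both satisfy the rule of thumb; the function names are likewise illustrative.

```python
import math
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p), summing the pmf term by term."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p):
    """Continuity-corrected approximation Phi((x + 0.5 - np) / sqrt(np(1 - p)))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + 0.5 - mu) / sigma)


n, p, x = 50, 0.4, 20   # hypothetical values; np = 20 and n(1 - p) = 30, both >= 10
exact = binom_cdf(x, n, p)
approx = normal_approx_cdf(x, n, p)

print(f"exact={exact:.4f}  approx={approx:.4f}  |diff|={abs(exact - approx):.4f}")
```

Rerunning the comparison with a small n or a p near 0 or 1 (so that the rule of thumb fails) is a quick way to see the approximation degrade.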