ST 380 Probability and Statistics for the Physical Sciences
Continuous Random Variables

The Normal Distribution

The normal distribution plays a central role in probability theory and in statistics.

It is often used as a model for the distribution of continuous random variables. Like all models, it is always wrong, but sometimes useful.

Even when individual measurements are not normally distributed, the central limit theorem implies that sums or averages of many independent measurements (with finite variance) are approximately normally distributed.

Definition

A continuous random variable $X$ is said to have a normal distribution, with parameters $\mu$ and $\sigma$ (where $-\infty < \mu < \infty$ and $\sigma > 0$), if its pdf is

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \qquad -\infty < x < \infty.$$

We write $X \sim N(\mu, \sigma^2)$, meaning "$X$ is normally distributed with parameters $\mu$ and $\sigma$". Note that the second argument in this notation is $\sigma^2$, not $\sigma$.

    # Plot two normal densities on the same axes
    curve(dnorm(x, mean = 1, sd = 1), from = -6, to = 6)
    curve(dnorm(x, mean = -1, sd = 2), from = -6, to = 6, add = TRUE)

Standard Normal Distribution

If $Z$ is normally distributed with $\mu = 0$ and $\sigma = 1$, that is $Z \sim N(0, 1)$, then $Z$ has the standard normal distribution.

The pdf of $Z$ is denoted $\phi(z)$:

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}.$$

The cdf of $Z$ is denoted $\Phi(z)$:

$$\Phi(z) = \int_{-\infty}^{z} \phi(y)\, dy.$$

Mean

For the standard normal distribution,

$$E(Z) = \int_{-\infty}^{\infty} z\, \phi(z)\, dz = 0,$$

and for the general normal distribution with parameters $\mu$ and $\sigma$,

$$E(X) = \int_{-\infty}^{\infty} x\, f(x; \mu, \sigma)\, dx = \mu.$$

That is, the parameter $\mu$ is also the expected value of $X$.
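As a quick numerical check of the definitions above, the following R sketch compares the pdf formula with R's built-in `dnorm`, and uses `integrate` to confirm that the density integrates to 1 and has mean $\mu$. The values `mu = 1` and `sigma = 2` and the helper name `normal_pdf` are arbitrary choices for illustration, not taken from the slides.

```r
# Illustrative parameter values (assumptions, not from the slides)
mu <- 1
sigma <- 2

# The normal pdf written out from the definition (hypothetical helper)
normal_pdf <- function(x, mu, sigma) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2)
}

# Largest difference from R's built-in density on a grid of points
x <- seq(-6, 6, by = 0.5)
max_diff <- max(abs(normal_pdf(x, mu, sigma) - dnorm(x, mean = mu, sd = sigma)))

# Total probability and expected value by numerical integration
total <- integrate(function(x) dnorm(x, mean = mu, sd = sigma), -Inf, Inf)$value
mean_x <- integrate(function(x) x * dnorm(x, mean = mu, sd = sigma), -Inf, Inf)$value

print(c(max_diff = max_diff, total = total, mean = mean_x))
```

Up to numerical integration error, `total` should be close to 1 and `mean_x` close to `mu`.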

Variance

For the standard normal distribution,

$$V(Z) = \int_{-\infty}^{\infty} z^2\, \phi(z)\, dz = 1,$$

and for the general normal distribution with parameters $\mu$ and $\sigma$,

$$V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x; \mu, \sigma)\, dx = \sigma^2.$$

That is, the parameter $\sigma$ is also the standard deviation of $X$.

Standardizing

If $X \sim N(\mu, \sigma^2)$, then

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$

Tables of the standard normal distribution tell us that, for instance, $P(|Z| < 1.96) = 0.95$. So

$$\begin{aligned}
0.95 &= P(|Z| < 1.96) \\
     &= P\left(\left|\frac{X - \mu}{\sigma}\right| < 1.96\right) \\
     &= P(\mu - 1.96\sigma < X < \mu + 1.96\sigma).
\end{aligned}$$

Percentiles

Because the normal distribution is widely used, we use a special notation for its percentiles:

$$z_\alpha = [100(1 - \alpha)]\text{th percentile} = \eta_Z(1 - \alpha).$$

That is, $z_\alpha$ is the value for which

$$P(Z > z_\alpha) = \alpha.$$

For example, tables show that $z_{.025} = 1.96$.

Note: Other authors write this as $z_{.975}$; be careful!

Other Normal Distributions

If $X \sim N(\mu, \sigma^2)$, then

$$\begin{aligned}
F_X(x) &= P(X \le x) \\
       &= P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) \\
       &= P\left(Z \le \frac{x - \mu}{\sigma}\right) \\
       &= \Phi\left(\frac{x - \mu}{\sigma}\right).
\end{aligned}$$

Differentiating,

$$f_X(x) = \frac{1}{\sigma}\, \phi\left(\frac{x - \mu}{\sigma}\right).$$

Percentiles

If $X \sim N(\mu, \sigma^2)$ and $\eta_X(p)$ is its $(100p)$th percentile, then

$$p = F_X[\eta_X(p)] = \Phi\left(\frac{\eta_X(p) - \mu}{\sigma}\right) = \Phi[\eta_Z(p)],$$

so, because $\Phi$ is strictly increasing,

$$\frac{\eta_X(p) - \mu}{\sigma} = \eta_Z(p),$$

or

$$\eta_X(p) = \mu + \sigma\, \eta_Z(p) = \mu + \sigma\, z_{(1-p)}.$$

The Exponential Distribution

The continuous analog of the geometric distribution is the exponential distribution, with pdf

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0. \end{cases}$$
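The exponential pdf above corresponds to R's built-in `dexp`, which is parametrized by `rate`, playing the role of $\lambda$. A minimal sketch, with `lambda = 0.5` and the helper name `exp_pdf` chosen arbitrarily for illustration, compares the formula with `dexp` and confirms that the density integrates to 1:

```r
lambda <- 0.5  # illustrative rate (assumption, not from the slides)

# The exponential pdf written out from the definition (hypothetical helper)
exp_pdf <- function(x, lambda) ifelse(x >= 0, lambda * exp(-lambda * x), 0)

# Largest difference from R's built-in density, including a negative point
x <- c(-1, 0, 0.5, 1, 2, 5)
max_diff <- max(abs(exp_pdf(x, lambda) - dexp(x, rate = lambda)))

# Total probability by numerical integration over [0, Inf)
total <- integrate(function(x) dexp(x, rate = lambda), 0, Inf)$value

print(c(max_diff = max_diff, total = total))
```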
Integrating,

$$F(x; \lambda) = \int_{-\infty}^{x} f(y; \lambda)\, dy = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0. \end{cases}$$

Mean and Variance

Integration by parts shows that

$$E(X) = \int_{0}^{\infty} \lambda x e^{-\lambda x}\, dx = \frac{1}{\lambda}.$$

Another integration by parts shows that

$$V(X) = \frac{1}{\lambda^2},$$

so the standard deviation of $X$ is $1/\lambda$, the same as the expected value.

The "No Memory" Property

The exponential distribution is often used as a model for the time you wait until some event occurs.

Suppose you wait until time $t_0$, and the event has not occurred; what is the distribution of the remaining waiting time?

Since $P(X > x) = 1 - F(x; \lambda) = e^{-\lambda x}$ for $x \ge 0$, and the event $\{X > t_0 + t\}$ implies $\{X > t_0\}$, for $t \ge 0$ we have

$$\begin{aligned}
P(X > t_0 + t \mid X > t_0) &= \frac{P(X > t_0 + t)}{P(X > t_0)} \\
  &= \frac{e^{-\lambda(t_0 + t)}}{e^{-\lambda t_0}} \\
  &= e^{-\lambda t} \\
  &= P(X > t).
\end{aligned}$$

That is, the probability that you must wait at least a further time $t$ is the same as it was initially.

Reliability

Suppose that the event is the failure of a piece of equipment.

If the time to failure has the exponential distribution, the equipment does not age: its probability of failing in the next interval of a given length does not increase with time in service.

That makes it an interesting but unrealistic model for failure times.

The Gamma Distribution

The pdf of the Gamma distribution, with parameters $\alpha > 0$ and $\beta > 0$, is of the form

$$f(x; \alpha, \beta) = \begin{cases} k\, x^{\alpha - 1} e^{-x/\beta} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

for an appropriate normalizing constant $k$.

We determine $k$ from the requirement

$$\int_{-\infty}^{\infty} f(x; \alpha, \beta)\, dx = 1.$$

The definition of the Gamma function is

$$\Gamma(\alpha) = \int_{0}^{\infty} x^{\alpha - 1} e^{-x}\, dx,$$

so, substituting $u = x/\beta$,

$$1 = k \int_{0}^{\infty} x^{\alpha - 1} e^{-x/\beta}\, dx = k\, \beta^{\alpha}\, \Gamma(\alpha),$$

and

$$k = \frac{1}{\beta^{\alpha}\, \Gamma(\alpha)}.$$
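The identity used above to find the normalizing constant can be checked numerically. The sketch below integrates the unnormalized density $x^{\alpha-1} e^{-x/\beta}$ with R's `integrate` and compares the result with $\beta^{\alpha}\,\Gamma(\alpha)$, computed with R's built-in `gamma` function. The values `alpha = 2.5` and `beta = 1.5` are arbitrary illustrative choices, not taken from the slides.

```r
alpha <- 2.5  # illustrative shape parameter (assumption)
beta <- 1.5   # illustrative scale parameter (assumption)

# Integral of the unnormalized density x^(alpha - 1) * exp(-x / beta) over [0, Inf)
unnormalized <- function(x) x^(alpha - 1) * exp(-x / beta)
integral <- integrate(unnormalized, 0, Inf)$value

# Closed form obtained from the Gamma function
closed_form <- beta^alpha * gamma(alpha)

# The normalizing constant is the reciprocal of that integral
k <- 1 / closed_form

print(c(integral = integral, closed_form = closed_form, k = k))
```

The two values should agree up to numerical integration error.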
Finally,

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha}\, \Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta} & x \ge 0 \\ 0 & x < 0. \end{cases}$$

Mean and Variance

Again using integration by parts and properties of the Gamma function,

$$E(X) = \alpha\beta$$

and

$$V(X) = \alpha\beta^2.$$

Special Case

If $\alpha = 1$, the gamma pdf simplifies to the exponential pdf with parameter $\lambda = 1/\beta$.

We can view the gamma distribution as a generalization of the exponential distribution, with a shape parameter $\alpha$ in addition to the scale parameter $\beta$.
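The mean, the variance, and the special case can be checked numerically with R's built-in `dgamma`, which accepts `shape` and `scale` arguments corresponding to $\alpha$ and $\beta$. This is a minimal sketch; the values `alpha = 2.5` and `beta = 1.5` and the evaluation points are arbitrary illustrative choices, not taken from the slides.

```r
alpha <- 2.5  # illustrative shape parameter (assumption)
beta <- 1.5   # illustrative scale parameter (assumption)

# Mean and variance by numerical integration of the gamma pdf
gamma_pdf <- function(x) dgamma(x, shape = alpha, scale = beta)
mean_x <- integrate(function(x) x * gamma_pdf(x), 0, Inf)$value
var_x <- integrate(function(x) (x - mean_x)^2 * gamma_pdf(x), 0, Inf)$value

# Special case: with shape 1, the gamma pdf should match the exponential pdf
# with rate 1 / beta
x <- c(0.1, 1, 3, 10)
special_diff <- max(abs(dgamma(x, shape = 1, scale = beta) - dexp(x, rate = 1 / beta)))

print(c(mean = mean_x, alpha_beta = alpha * beta,
        var = var_x, alpha_beta_sq = alpha * beta^2,
        special_diff = special_diff))
```

Up to numerical integration error, `mean_x` should agree with `alpha * beta` and `var_x` with `alpha * beta^2`.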