AMS570 Lecture Notes #2
Review of Probability (continued)

Probability distributions.

(1) Binomial distribution

Binomial Experiment:
1) It consists of n trials.
2) Each trial results in 1 of 2 possible outcomes, "S" or "F".
3) The probability of getting a certain outcome, say "S", remains the same from trial to trial, say P("S") = p.
4) The trials are independent; that is, the outcomes of the previous trials do not affect the outcomes of the upcoming trials.
5) Let X denote the total # of "S" among the n trials in a binomial experiment. Then X ~ B(n, p), that is, X follows the Binomial distribution with parameters n and p. Its probability density function (pdf) is
f(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \dots, n
Here \binom{n}{x} = \frac{n!}{(n-x)!\, x!} = \frac{n(n-1)\cdots(n-x+1)}{x(x-1)\cdots 3 \cdot 2 \cdot 1}.

Eg. An exam consists of 10 multiple choice questions. Each question has 4 possible choices, of which only 1 is correct. Jeff did not study for the exam, so he just guesses at the right answer for each question (a pure guess, not an educated guess). What is his chance of passing the exam, that is, of getting at least 6 correct answers?

Answer: Yes, this is a binomial experiment with n = 10, p = 0.25, and "S" = choosing the right answer on a question. Let X be the total # of "S"; then X ~ B(n = 10, p = 0.25), and
P(pass) = P(X ≥ 6) = P(X = 6 or X = 7 or X = 8 or X = 9 or X = 10)
= P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)
= \binom{10}{6} 0.25^6 (1 - 0.25)^4 + \cdots

Bernoulli Distribution
X ~ Bernoulli(p). It can take on two possible values, say success (S) or failure (F), with probability p and (1 - p) respectively. That is,
P(X = 'S') = p; P(X = 'F') = 1 - p.
Let X = number of "S"; then X = 1 with probability p, and X = 0 with probability 1 - p.
The pdf of X can be written as
f(x) = P(X = x) = p^x (1 - p)^{1-x}, \quad x = 0, 1.

Relation between the Bernoulli RV and the Binomial RV.
(1) X ~ Bernoulli(p) is indeed a special case of the Binomial random variable with n = 1 (*only one trial), that is, B(n = 1, p).
(2) Let X_i ~ Bernoulli(p), i = 1, ..., n, and suppose the X_i's are all independent. Let X = \sum_{i=1}^{n} X_i. Then X ~ B(n, p) (***Exercise, prove this!).
Note: *** This links directly to the Binomial Experiment, with X_i denoting the number of 'S' in the i-th trial.

Basics of Statistical Inference.

1. Distributions, Mathematical Expectations, Random Variables

Eg. Let X be the height of a randomly selected male from the entire adult U.S. population. Then the population distribution is X ~ N(\mu, \sigma^2), where \mu is the population mean, \sigma^2 is the population variance, and N stands for the normal distribution. Its pdf is
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.
(*Note: the Normal distribution is our next distribution to review!)

Let X_1, X_2, ..., X_n be a random sample. Then X_1, X_2, ..., X_n are independent of each other, and each follows the same distribution as the population distribution. That is, the X_i's are independently and identically distributed (i.i.d.).

I. Cumulative distribution function (cdf)
If X is continuous: F(x) = P(X ≤ x) = \int_{-\infty}^{x} f(t)\,dt
If X is discrete: F(x) = P(X ≤ x) = \sum_{t \le x} P(X = t)

II. Probability density function (pdf)
Continuous random variable:
f(x) = [F(x)]' = \frac{d}{dx} F(x)
P(a ≤ X ≤ b) = \int_{a}^{b} f(x)\,dx
Discrete random variable (for which the p.d.f. is also called the probability mass function, p.m.f.):
f(x) = P(X = x)
P(a ≤ X ≤ b) = \sum_{a \le x \le b} P(X = x)
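As a quick numerical illustration of the discrete pmf/cdf relation above and of the earlier exam example, here is a minimal Python sketch (my own addition, not part of the original notes; the helper names binom_pmf and binom_cdf are hypothetical). It computes Jeff's passing probability P(X ≥ 6) for X ~ B(10, 0.25) both by summing the pmf and via the cdf.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x) = C(n, x) p^x (1 - p)^(n - x), the B(n, p) pmf."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x) = sum of the pmf over t = 0, ..., x."""
    return sum(binom_pmf(t, n, p) for t in range(x + 1))

n, p = 10, 0.25
p_pass = sum(binom_pmf(x, n, p) for x in range(6, n + 1))   # P(X >= 6)
print(p_pass)                  # ~0.0197, so Jeff passes with probability of about 2%
print(1 - binom_cdf(5, n, p))  # same value via the cdf: P(X >= 6) = 1 - F(5)
```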
2. Mathematical Expectation.

Continuous random variable: E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx
Discrete random variable: E[g(X)] = \sum_{\text{all } x} g(x) P(X = x)

Special cases:
1) (population) Mean: \mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx
2) (population) Variance: \sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = E(X^2) - [E(X)]^2
*** The above mean and variance formulas are for continuous X. Replace the integral with a summation (over all possible values of X) if X is discrete.
Var(X) = E[(X - \mu)^2] = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2

3) Moment generating function (mgf):
Definition: Suppose X is a continuous random variable (R.V.) with probability density function (pdf) f(x). The moment generating function (mgf) of X is defined as
M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx,
where t is a parameter. (*If X is discrete, change the integral to a summation.)

How do we use the mgf to generate the population moments?
First population moment: E(X) = \frac{d}{dt} M_X(t) \Big|_{t=0}
Second population moment: E(X^2) = \frac{d^2}{dt^2} M_X(t) \Big|_{t=0}
In general, the kth population moment is E(X^k) = \frac{d^k}{dt^k} M_X(t) \Big|_{t=0}.
(Exercise: prove the above moment generating properties.)

For the normal distribution, X ~ N(\mu, \sigma^2) with f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, -\infty < x < \infty, we have
M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx = e^{\mu t + \frac{\sigma^2 t^2}{2}}.

Theorem. If X_1, X_2 are independent, then f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2).

Theorem. If X_1, X_2 are independent, then M_{X_1 + X_2}(t) = M_{X_1}(t) M_{X_2}(t).
Proof. X_1, X_2 are independent iff (if and only if) f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2). Thus
M_{X_1 + X_2}(t) = E(e^{t(X_1 + X_2)}) = \int\!\!\int e^{t(x_1 + x_2)} f_{X_1, X_2}(x_1, x_2)\,dx_1\,dx_2 = \int e^{t x_1} f_{X_1}(x_1)\,dx_1 \int e^{t x_2} f_{X_2}(x_2)\,dx_2 = M_{X_1}(t) M_{X_2}(t).

Theorem. Under regularity conditions, there is a 1-1 correspondence between the pdf and the mgf of a given random variable X. That is, pdf f(x) ⟷ mgf M_X(t).
Note: One can use this property to identify the probability distribution from its moment generating function.

Special mathematical expectations for the binomial RV.
1. Let X ~ B(n, p); derive the moment generating function (m.g.f.) of X. Please show the entire derivation for full credit.
M_X(t) = E(e^{tX}) = \sum_{x=0}^{n} e^{tx} P(X = x) = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (p e^t)^x (1-p)^{n-x} = [p e^t + (1-p)]^n,
where the last step uses the binomial theorem: (a + b)^n = \sum_{x=0}^{n} \binom{n}{x} a^x b^{n-x}.

2. In addition, we have (*make sure you know how to derive these expectations):
E(X) = np
Var(X) = np(1 - p)

Exercises:
1. Show that the sum of n independent Bernoulli random variables, each with success probability p, follows B(n, p). Please show the entire derivation for full credit.
2. Let X_1, X_2 be two independent random variables following the binomial distributions B(n_1, p) and B(n_2, p), respectively. Derive the distribution of X_1 + X_2. Please show the entire derivation for full credit.

Solution:
1. Hint:
(1) The mgf of the Bernoulli(p) RV is p e^t + (1 - p).
(2) Let X_1, X_2, ..., X_n be i.i.d. Bernoulli(p), and let X = \sum_{i=1}^{n} X_i. Then
M_X(t) = E(e^{tX}) = E(e^{t \sum_{i=1}^{n} X_i}) = E(e^{t X_1}) E(e^{t X_2}) \cdots E(e^{t X_n}) = [p e^t + (1 - p)]^n.
(3) Since the above mgf is the same as the mgf of B(n, p), the 1-1 correspondence between the mgf and the distribution shows that X ~ B(n, p).
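To complement the mgf derivation, here is a small symbolic sketch (my own addition, assuming the sympy library is available) that recovers E(X) and Var(X) for X ~ B(n, p) by differentiating M_X(t) = [p e^t + (1 - p)]^n at t = 0, exactly as in the moment-generation property above.

```python
# Symbolic check (illustrative sketch, assumes sympy is installed):
# differentiate the binomial mgf at t = 0 to recover E(X) and Var(X).
import sympy as sp

t, n, p = sp.symbols('t n p', positive=True)
M = (p * sp.exp(t) + 1 - p)**n                   # mgf of X ~ B(n, p)

EX  = sp.simplify(sp.diff(M, t, 1).subs(t, 0))   # first moment  E(X)
EX2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))   # second moment E(X^2)
var = sp.simplify(EX2 - EX**2)                   # Var(X) = E(X^2) - [E(X)]^2

print(EX)   # n*p
print(var)  # n*p*(1 - p), possibly printed in the expanded form n*p - n*p**2
```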
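As a sanity check on Exercise 1 (a simulation, not a proof, and my own illustration rather than part of the notes), the sketch below repeatedly adds n = 10 independent Bernoulli(p = 0.25) outcomes and compares the empirical frequencies of the sum with the exact B(10, 0.25) pmf; the two columns should agree closely.

```python
# Simulation sketch: the sum of n i.i.d. Bernoulli(p) draws should follow B(n, p).
import random
from math import comb

def binom_pmf(x, n, p):
    """Exact B(n, p) pmf: C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p, reps = 10, 0.25, 100_000
counts = [0] * (n + 1)
for _ in range(reps):
    x = sum(1 for _ in range(n) if random.random() < p)  # one binomial experiment
    counts[x] += 1

for x in range(n + 1):
    # empirical relative frequency vs. exact pmf
    print(x, round(counts[x] / reps, 4), round(binom_pmf(x, n, p), 4))
```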