Data analysis
Ben Graham
MA930, University of Warwick
October 8, 2015

Common distributions
- The continuous Uniform[$a, b$] distribution
- The exponential distribution
- The Normal distribution (→ time series)
- The Binomial distribution
- The Poisson distribution (→ time series)

Transformations of Random Variables (Sec 2.1)
Transformations
- $X$ an r.v.
- $g : \mathbb{R} \to \mathbb{R}$
- $g(X)$ is also an r.v.
Examples:
- Suppose $X$ has p.d.f. $f_X(x) = \begin{cases} 1/4 & -1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$ and $g(x) = |x|$ or $g(x) = \mathbf{1}_{x > 1}$
- Suppose $X$ has p.d.f. $f_X(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$ and $g(x) = -\log x$ (PRNGs...)

Transformations of Random Variables (Example 2.1.9):
- $f_X(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$, the Normal $N(0,1)$ distribution
- $g(x) = x^2$. $Y = g(X)$.
- For $y > 0$, symmetry gives $P(Y < y) = P(-\sqrt{y} < X < \sqrt{y}) = 2P(X < \sqrt{y}) - 1$, so
  $f_Y(y) = \frac{d}{dy} P(Y < y) = 2 \frac{d}{dy} P(X < \sqrt{y}) = 2 \left. \frac{d}{dx} P(X < x) \right|_{x = \sqrt{y}} \times \frac{d\sqrt{y}}{dy} = \frac{f_X(\sqrt{y})}{\sqrt{y}} = \frac{1}{\sqrt{2\pi}} y^{-1/2} \exp(-y/2)$
- This is $\chi^2_1$: the $\chi^2$ distribution with one degree of freedom.

Transformations of Random Variables
Examples
- (Thm 2.1.10) Continuous r.v. $X$ with c.d.f. $F_X$.
- $F_X^{-1}$ is increasing
- $Y := F_X(X)$ has p.d.f. $f_Y(y) = \begin{cases} 1 & y \in (0, 1) \\ 0 & \text{otherwise} \end{cases}$
- Useful for simulation/Monte Carlo integration

Expectation (Section 2.2)
- Weighted average, weighted by probability
- $E[X]$
  - Discrete case: $E[X] = \sum_x x \cdot p_X(x)$
  - Continuous case: $E[X] = \int x \cdot f_X(x) \, dx$
- $E[g(X)]$
  - Discrete case: $E[g(X)] = \sum_x g(x) \cdot p_X(x)$
  - Continuous case: $E[g(X)] = \int g(x) \cdot f_X(x) \, dx$
- Linearity: $E[aX + bY] = aE[X] + bE[Y]$ (independence?) for $a, b \in \mathbb{R}$

Ex 2.2.2 The Exponential distribution
- The exponential rate($\lambda$) distribution is defined by $f_X(x) = \begin{cases} \lambda \exp(-\lambda x) & x \ge 0 \\ 0 & x < 0 \end{cases}$ (sometimes this is called the exponential($1/\lambda$) distribution)
- What is the c.d.f.?
- What is the mean?

Ex 2.2.3 The Binomial distribution
A discrete random variable has the Bin($n, p$) distribution if the p.m.f. is
$f_X(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
Mean value
- Directly?
- Writing $X = \sum_{i=1}^{n} B_i$ with independent Bernoulli($p$) $B_i$? Here $B_i = \begin{cases} 1 & \text{probability } p \\ 0 & \text{probability } 1 - p \end{cases}$

Ex 2.2.4 The Cauchy distribution
- $f_X(x) = \frac{1}{\pi} \cdot \frac{1}{1 + x^2}$
- $\int \frac{1}{1 + x^2} \, dx = \arctan(x) + C$
- Symmetric about $x = 0$?
- Mean of $X$?
- $\int \frac{x}{1 + x^2} \, dx = \frac{1}{2} \log(1 + x^2) + C$
- Mean of $|X|$?

Properties of expectation (Thm 2.2.5)
- Linearity: $E[aX + bY + c] = aE[X] + bE[Y] + c$
- Positivity: If $P(X \ge 0) = 1$ then $E[X] \ge 0$
Also:
- Independent $X$ and $Y$: $E[XY] = E[X]E[Y]$

Variance
- The variance of r.v. $X$ is defined by $\mathrm{Var}(X) = E[(X - EX)^2] = E[X^2] - (EX)^2$
- N.B. (Ex 2.2.6) $EX = \arg\min_b E[(X - b)^2]$. (Why the second power?)

Characteristic functions
- (Sec 2.6) r.v. $X$.
- Characteristic function $\varphi_X(t) = E[\exp(itX)]$.
  - $\varphi_{aX}(t) = \varphi_X(at)$
  - $\varphi_{X+b}(t) = e^{itb} \varphi_X(t)$
  - Independent $X, Y$: $\varphi_{X+Y}(t) = \varphi_X(t) \varphi_Y(t)$
- Thm 2.6.1 Convergence: Sequence of r.v.s $(X_k)$ such that $\lim_{k \to \infty} \varphi_{X_k}(t) = \varphi_X(t)$ in a neighborhood of 0. Then for all $x$ such that $F_X$ is continuous at $x$, $\lim_{k \to \infty} F_{X_k}(x) = F_X(x)$.

Poisson approximation
- The Poisson distribution
  - $P[X = k] = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \ldots$
  - Characteristic function $\varphi_X(t) = \exp[\lambda(e^{it} - 1)]$
- Binomial distribution
  - $P[X = k] = \binom{n}{k} p^k (1 - p)^{n - k}$
  - Characteristic function $\varphi_X(t) = (1 - p + p e^{it})^n$
- Convergence: Bin($n, \lambda/n$) → Poisson($\lambda$)
Law of small numbers

Weak Law of Large Numbers
- Convergence in distribution: $X_n \xrightarrow{D} X$ if for all $x \in \mathbb{R}$ such that $F_X$ is continuous at $x$, $F_{X_n}(x) \to F_X(x)$.
- $(X_i)_{i=1}^{\infty}$ i.i.d. r.v.s with mean $\mu$.
- $S_n = X_1 + \cdots + X_n$.
- $S_n / n \xrightarrow{D} \mu$
- $\varphi_{S_n/n}(t) = [\varphi_X(t/n)]^n = [1 + it\mu/n + o(t/n)]^n \to e^{it\mu}$ as $n \to \infty$

Central limit theorem
- Convergence in distribution: $X_n \xrightarrow{D} X$ if for all $x \in \mathbb{R}$ such that $F_X$ is continuous at $x$, $F_{X_n}(x) \to F_X(x)$.
- $(X_i)_{i=1}^{\infty}$ i.i.d. r.v.s with mean $\mu$, finite variance $\sigma^2 \ne 0$
- $A_n = \sum_{i=1}^{n} (X_i - \mu) / [\sigma \sqrt{n}]$.
- $A_n \xrightarrow{D} N(0, 1)$ as $n \to \infty$
- $\varphi_{A_n}(t) = [\varphi_{(X - \mu)/\sigma}(t/\sqrt{n})]^n = [1 - \tfrac{1}{2} t^2/n + o(1/n)]^n \to e^{-\frac{1}{2} t^2} = \varphi_{N(0,1)}(t)$ as $n \to \infty$

Cauchy distribution
$\varphi_X(t) = e^{-|t|}$
$\varphi_{S_n/n}(t)$ ???
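
Numerical sketch (not part of the original slides): one way to explore the closing question is to simulate sample means. The identity used for the weak law, $\varphi_{S_n/n}(t) = [\varphi_X(t/n)]^n$, gives $[e^{-|t|/n}]^n = e^{-|t|}$ for the Cauchy case, so $S_n/n$ should again be standard Cauchy for every $n$. The code below draws samples by the inverse c.d.f. method (Thm 2.1.10) and compares the spread of $S_n/n$ for exponential and Cauchy samples. The rate 2.0, the seed 930, the sample sizes, the helper names, and the choice of interquartile range as the spread measure (the Cauchy distribution has no variance) are all illustrative assumptions.

```python
import math
import random
import statistics


def sample_exponential(rate, rng):
    """Inverse c.d.f. sampling: F(x) = 1 - exp(-rate*x), so F^{-1}(u) = -log(1-u)/rate."""
    return -math.log(1.0 - rng.random()) / rate


def sample_cauchy(rng):
    """Inverse c.d.f. sampling: F(x) = 1/2 + arctan(x)/pi, so F^{-1}(u) = tan(pi*(u - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))


def sample_means(sampler, n, reps, rng):
    """Return `reps` independent realisations of S_n / n."""
    return [sum(sampler(rng) for _ in range(n)) / n for _ in range(reps)]


def iqr(values):
    """Interquartile range: a spread measure that exists even without a variance."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


rng = random.Random(930)  # fixed seed (arbitrary choice) for reproducibility
spread = {}               # n -> (IQR of exponential means, IQR of Cauchy means)
for n in (1, 10, 100):
    exp_means = sample_means(lambda r: sample_exponential(2.0, r), n, 2000, rng)
    cauchy_means = sample_means(sample_cauchy, n, 2000, rng)
    spread[n] = (iqr(exp_means), iqr(cauchy_means))
    print(f"n={n:4d}  IQR exponential: {spread[n][0]:.3f}  IQR Cauchy: {spread[n][1]:.3f}")
```

Under these assumptions the exponential column should shrink roughly like $1/\sqrt{n}$, as the weak law and the central limit theorem suggest, while the Cauchy column should stay near 2 (the quartiles of the standard Cauchy distribution are $\pm 1$) however large $n$ is.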