Simple stochastic models 1

Random variation 1
• Genetic, physiological
• Environmental
• Measurement error
• Random sampling

Random variation 2
• Random variability as nuisance
• Random variability of primary interest

Terminology
• Trial
• Sample space
• Event
• Probability
• Random variable

Discrete random variable X
Takes values: $x_0, x_1, x_2, \ldots$
with corresponding probabilities $p_0, p_1, p_2, \ldots$, where $P(X = x_k) = p_k$
$p_0 + p_1 + p_2 + \cdots = 1$

Discrete-integer random variable X
Takes values: $0, 1, 2, \ldots$
with corresponding probabilities $p_0, p_1, p_2, \ldots$, where $P(X = k) = p_k$
$p_0 + p_1 + p_2 + \cdots = 1$

Mean and variance of discrete random variable
Mean:
\[ E(X) = \sum_i p_i x_i \]
Variance:
\[ V(X) = \sum_i p_i \, [x_i - E(X)]^2 = E(X^2) - [E(X)]^2 \]

Binomial distribution (Bernoulli trials)
n – number of trials
p – probability of success
\[ P(k \text{ successes in } n \text{ trials}) = {}^{n}C_{k} \, p^k (1-p)^{n-k} \]
\[ {}^{n}C_{k} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]

Binomial distribution
X ~ binomial(n, p)
$E(X) = np$
$V(X) = np(1 - p)$
[Figure: P(X=k) plotted against k = 0, …, 10 for n = 10, p = 0.5]

Poisson distribution
X ~ Poisson($\lambda$)
k – number of events
\[ P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!} \]
$E(X) = \lambda$, $V(X) = \lambda$
[Figure: P(X=k) plotted against k = 0, …, 20 for $\lambda = 5$]

When the number of Bernoulli trials n is large and the probability of success p is small, the distribution of the number of successes is approximately Poisson with $\lambda = np$.

Examples
• 30% of women in Germany are smokers. We take a random (representative) sample of 20 women. The distribution of the number of smokers among them is X ~ binomial(20, 0.3).
• The probability that an accident leading to injury happens in a factory is p = 0.0001 per day. The number of accidents X over a ten-year period (n = 3650) is approximately Poisson, X ~ Poisson($\lambda$), with $\lambda = np = 0.365$.

Generating function of discrete integer random variables
Discrete-integer random variable X:
values $0, 1, 2, \ldots$
probabilities $p_0, p_1, p_2, \ldots$
Generating function:
\[ P(s) = p_0 + s\,p_1 + s^2 p_2 + \cdots = \sum_k s^k p_k \]

Properties
$P(1) = 1$
$P'(1) = E(X)$
$P''(1) = E[X(X-1)]$

Binomial and Poisson distributions
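As a quick numerical check of the generating function properties for these two distributions, here is a minimal Python sketch. The function names and the truncation point kmax are illustrative assumptions, not part of the slides. The variance is recovered as V(X) = P''(1) + P'(1) − [P'(1)]^2, which follows from E(X^2) = E[X(X−1)] + E(X).

```python
import math


def binomial_pmf(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ binomial(n, p)."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def poisson_pmf(lam, kmax):
    """Return [P(X=0), ..., P(X=kmax)] for X ~ Poisson(lam).

    The infinite support is truncated at kmax (an assumption of this sketch);
    choose kmax well above lam so the neglected tail is negligible.
    """
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(kmax + 1)]


def pgf(pmf, s):
    """P(s) = sum_k s^k p_k."""
    return sum(pk * s**k for k, pk in enumerate(pmf))


def pgf_d1(pmf, s):
    """First derivative P'(s) = sum_k k s^(k-1) p_k."""
    return sum(k * pk * s ** (k - 1) for k, pk in enumerate(pmf) if k >= 1)


def pgf_d2(pmf, s):
    """Second derivative P''(s) = sum_k k (k-1) s^(k-2) p_k."""
    return sum(k * (k - 1) * pk * s ** (k - 2) for k, pk in enumerate(pmf) if k >= 2)


def mean_var_from_pgf(pmf):
    """Mean and variance obtained from P'(1) and P''(1)."""
    mean = pgf_d1(pmf, 1.0)         # P'(1)  = E(X)
    fact2 = pgf_d2(pmf, 1.0)        # P''(1) = E[X(X-1)]
    return mean, fact2 + mean - mean**2


if __name__ == "__main__":
    for name, pmf in [
        ("binomial(10, 0.5)", binomial_pmf(10, 0.5)),
        ("Poisson(5), truncated at k=60", poisson_pmf(5.0, 60)),
    ]:
        mean, var = mean_var_from_pgf(pmf)
        print(name, "P(1) =", pgf(pmf, 1.0), "mean =", mean, "variance =", var)
```

For binomial(10, 0.5) the mean and variance should agree with np = 5 and np(1 − p) = 2.5 up to rounding, and for Poisson(5) both should be close to 5.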
Binomial: X ~ binomial(n, p), q = 1 − p
\[ P(s) = \sum_{k=0}^{n} s^k \binom{n}{k} p^k q^{n-k} = (q + sp)^n \]
Poisson: X ~ Poisson($\lambda$)
\[ P(s) = \sum_{k=0}^{\infty} s^k e^{-\lambda} \frac{\lambda^k}{k!} = e^{\lambda(s-1)} \]

Continuity theorem for generating functions

Two-dimensional discrete-integer random variables
Joint probability distribution: $P_{XY}(X=i, Y=k) = p_{ik}$
Two-dimensional generating function:
\[ P_{XY}(s_1, s_2) = \sum_{i,k} p_{ik} \, s_1^i s_2^k \]
Marginal distributions:
$P_X(s) = P_{XY}(s, 1)$
$P_Y(s) = P_{XY}(1, s)$

Independent discrete-integer random variables
X: $P_X(X=i) = p_i$
Y: $P_Y(Y=k) = p_k$
$P_{XY}(X=i, Y=k) = P_X(X=i)\,P_Y(Y=k) = p_i \, p_k$

Sum
$P_{XY}(X=i, Y=k) = p_{ik}$
$Z = X + Y$
\[ P(Z = m) = \sum_{i,k:\; i+k=m} p_{ik} \]

Expectation and variance of sums of independent random variables
$E(X+Y) = E(X) + E(Y)$
X, Y independent: $V(X+Y) = V(X) + V(Y)$

Sum
$Z = X + Y$
$P_Z(s) = P_{XY}(s, s)$
X, Y independent: $P_Z(s) = P_X(s)\,P_Y(s)$

Example
$X_1$ ~ binomial(1, p) (one Bernoulli trial): $P_{X_1}(s) = q + sp$
$X_n$ ~ binomial(n, p): $P_{X_n}(s) = (q + sp)^n$

Example
X ~ Poisson($\lambda$), Y ~ Poisson($\mu$)
$Z = X + Y$
$P_Z(s) = P_X(s)\,P_Y(s)$
\[ P_Z(s) = e^{(\lambda + \mu)(s-1)} \]
Z ~ Poisson($\lambda + \mu$)

Can we generalize the generating function method to non-integer discrete r.v.?
$P(X = x_k) = p_k$
values $x_0, x_1, x_2, \ldots$
probabilities $p_0, p_1, p_2, \ldots$
\[ P(s) = \sum_k p_k \, s^{x_k} \]
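The rule for sums of independent integer-valued variables can be checked numerically. The sketch below, in Python, computes the distribution of Z = X + Y by direct convolution, compares its generating function with the product of the two generating functions, and reproduces the two examples (n Bernoulli trials, and the sum of two Poisson variables). All function names, the parameter values 2 and 3, and the truncation point kmax are assumptions made for illustration.

```python
import math


def poisson_pmf(lam, kmax):
    """[P(X=0), ..., P(X=kmax)] for X ~ Poisson(lam), truncated at kmax."""
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(kmax + 1)]


def convolve(px, py):
    """Distribution of Z = X + Y for independent X, Y.

    Implements P(Z=m) = sum over i+k=m of px[i] * py[k].
    """
    pz = [0.0] * (len(px) + len(py) - 1)
    for i, a in enumerate(px):
        for k, b in enumerate(py):
            pz[i + k] += a * b
    return pz


def pgf(pmf, s):
    """Generating function P(s) = sum_k s^k p_k."""
    return sum(pk * s**k for k, pk in enumerate(pmf))


def bernoulli_sum_pmf(n, p):
    """Distribution of the sum of n independent Bernoulli(p) trials."""
    pmf = [1.0]                       # sum of zero trials is 0 with probability 1
    for _ in range(n):
        pmf = convolve(pmf, [1 - p, p])
    return pmf


if __name__ == "__main__":
    s = 0.7
    px = poisson_pmf(2.0, 40)         # X ~ Poisson(2), hypothetical parameter
    py = poisson_pmf(3.0, 40)         # Y ~ Poisson(3), hypothetical parameter
    pz = convolve(px, py)

    # Generating function of the sum versus product of generating functions
    print(pgf(pz, s), pgf(px, s) * pgf(py, s))

    # Convolution versus Poisson(2 + 3) for the first few values of m
    target = poisson_pmf(5.0, 40)
    print([round(pz[m] - target[m], 15) for m in range(5)])

    # n Bernoulli trials versus the closed form (q + s p)^n
    n, p = 10, 0.3
    print(pgf(bernoulli_sum_pmf(n, p), s), (1 - p + s * p) ** n)
```

Convolution costs on the order of len(px) × len(py) multiplications, whereas multiplying generating functions is a single product per value of s, which is the practical appeal of the method.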