Discrete Univariate and Multivariate Distributions, Intro to Continuous
stat 430
Heike Hofmann

Outline
• Overview: Binomial, Negative Binomial, Poisson, and Geometric
• Compound Discrete Distributions
• Continuous Random Variables: uniform, exponential

Overview
• Binomial: number of successes in n independent, identical Bernoulli trials
• Geometric: number of attempts until the first success in independent, identical Bernoulli trials
• Negative Binomial: number of attempts until the r-th success in independent, identical Bernoulli trials
• Poisson: number of ‘successes’ observed in a certain amount of time/space, given the rate at which they occur

Poisson approximation of the Binomial
• For large n (n ≥ 30) and small p (p ≤ 0.05), the Poisson distribution approximates the Binomial distribution: B_{n,p} ≈ Po_λ with λ = np

Multiple R.V.s
• Let X and Y be two categorical variables; then p_{X,Y} is the joint pmf:
• p_{X,Y}(x_i, y_j) = P(X = x_i and Y = y_j)
• marginal distributions: sum over all j for X, sum over all i for Y

          X=1    X=2    ...    X=I
   Y=1    π11    π12    ...    π1I
   Y=2    π21    π22    ...    π2I
   ...
   Y=J    πJ1    πJ2    ...    πJI        (all entries sum to 1)
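The row and column sums in the table above can be computed mechanically. A minimal sketch with a made-up 2×2 joint pmf (the probabilities are invented for illustration):

```python
# Marginal distributions from a joint pmf:
# sum over all j for X, sum over all i for Y.
# The entries below are made-up probabilities that sum to 1.
joint = {  # (x_i, y_j) -> pi_ij
    (1, 1): 0.10, (1, 2): 0.20,
    (2, 1): 0.30, (2, 2): 0.40,
}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})

p_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}  # marginal pmf of X
p_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}  # marginal pmf of Y

print(p_X)  # X = 1 carries probability 0.3, X = 2 carries 0.7
print(p_Y)  # Y = 1 carries probability 0.4, Y = 2 carries 0.6
```

Each marginal is itself a pmf, so its values sum to 1.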
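The rule of thumb B_{n,p} ≈ Po_λ with λ = np stated earlier can be checked numerically. This is a plain-Python sketch; the function names and the choice n = 100, p = 0.02 are my own:

```python
# Numerical check of the Poisson approximation to the Binomial
# for large n (n >= 30) and small p (p <= 0.05).
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 100, 0.02          # n >= 30 and p <= 0.05, so the rule applies
lam = n * p               # lambda = np = 2

# Largest pointwise gap between the two pmfs over k = 0..n
max_diff = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(n + 1))
print(f"max |B - Po| = {max_diff:.4f}")
```

The gap stays well below one percent here; it shrinks further as n grows and p shrinks with np held fixed.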
Joint distributions
• expected value:
  E[h(X, Y)] := Σ_{x,y} h(x, y) · p_{X,Y}(x, y)
• variance/covariance:
  Cov(X, Y) = E[(X − E[X]) (Y − E[Y])]
• moment generating function:
  M_X(t) = E[e^{tX}] = Σ_i e^{t·x_i} p_X(x_i),  with moments E[X^k] = d^k/dt^k M_X(t) |_{t=0}
• independence: X and Y are independent random variables if p_{X,Y} = p_X · p_Y

Rules for E and Var
• Let X, Y be random variables and a, b real values; then:
• Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
• Cov(X, Y) = E[XY] − E[X] E[Y]
• also: Cov(X, Y) = Cov(Y, X) and Cov(aX, bY) = ab Cov(X, Y)

Correlation
• ρ := Cov(X, Y) / √(Var(X) · Var(Y))   (read: “rho”)
• correlation measures the amount of linear association between X and Y
• Facts about ρ:
  • ρ is between −1 and 1
  • if ρ = 1 or −1, Y is a linear function of X:
    ρ = 1 → Y = aX + b with a > 0;  ρ = −1 → Y = aX + b with a < 0
  • ρ near ±1 indicates a strong linear association between X and Y; ρ near 0 indicates lack of linear association

Correlation
• If X, Y are independent, then Cov(X, Y) = 0 (and then the correlation is 0 as well)
• If Cov(X, Y) = 0, X and Y are not necessarily independent (but there is no linear relationship between X and Y)

Continuous Random Variables

Continuous R.V.s
• X is a continuous r.v. if its image is an interval in R
• e.g.
“measurements”: temperature, weight of a person, height, time to run one mile, ...

Discrete vs Continuous

                        discrete random variable               continuous random variable
image                   Im(X) finite or countably infinite     Im(X) uncountable
mass/density function   pmf: p_X(x) = P(X = x)                 pdf: f_X(x) = F′_X(x)
distribution function   F_X(t) = P(X ≤ t) = Σ_{k ≤ t} p_X(k)   F_X(t) = P(X ≤ t) = ∫_{−∞}^{t} f_X(x) dx
expected value          E[h(X)] = Σ_x h(x) · p_X(x)            E[h(X)] = ∫_{−∞}^{∞} h(x) · f_X(x) dx
variance                Var[X] = E[(X − E[X])²]                Var[X] = E[(X − E[X])²]
                               = Σ_x (x − E[X])² p_X(x)               = ∫_{−∞}^{∞} (x − E[X])² f_X(x) dx

Density & Distribution
• f is a density function if it is positive and integrates to 1
• F is a distribution function if it is monotone nondecreasing, has values between 0 and 1, and reaches those limits
• the density is the derivative of the distribution: F′ = f
• (which makes the distribution the antiderivative of the density)

Uniform Distribution
• X has a uniform distribution on [a, b] if all values have the same probability of being picked (the idea of a “random” number): every value has the same density
• f(x) = 1/(b − a) if a < x < b, and 0 otherwise
• distribution function: F_X(x) = 0 for x ≤ a, (x − a)/(b − a) for a < x < b, and 1 for x ≥ b
• E[X] = (a + b)/2, Var[X] = (b − a)²/12
[Figure 2.2: Density function of a uniform variable X on (a, b): constant height 1/(b − a) between a and b]
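As a quick sanity check of the uniform moments, a Monte Carlo sketch (the interval a = 2, b = 5, the sample size, and the seed are arbitrary choices of mine):

```python
# Monte Carlo check of E[X] = (a+b)/2 and Var[X] = (b-a)^2 / 12
# for X uniform on (a, b).
import random

random.seed(430)                 # arbitrary seed, for reproducibility
a, b = 2.0, 5.0
n = 100_000

xs = [random.uniform(a, b) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

print(mean, (a + b) / 2)         # both near 3.5
print(var, (b - a) ** 2 / 12)    # both near 0.75
```

With 100,000 draws the sample mean and variance agree with the formulas to roughly two decimal places.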
Exponential Distribution
• X is an exponentially distributed random variable if its density is:
  f_X(x) = λ e^{−λx} if x ≥ 0, and 0 otherwise
• λ is called the rate parameter
• E[X] = 1/λ, Var[X] = 1/λ²
[Figure 2.3: Density functions of exponential variables for different rate parameters (f_2, f_1, f_0.5)]

Exponential Distribution
• The Exponential distribution is memoryless: any event is just as likely to happen within the next minute as it was in the minute ten minutes ago.
• Let X be an exponentially distributed r.v., i.e. X ~ Exp_λ; then
  P(X ≤ s) = P(X ≤ t + s | X ≥ t)

Exponential Distribution
• When does a variable have an Exponential distribution?
• When we observe a process and count the number of occurrences (which is Poisson), the time or distance until the next event is Exponential (the continuous equivalent of the Geometric distribution).
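The memoryless identity P(X ≤ s) = P(X ≤ t + s | X ≥ t) can be verified directly from the exponential cdf F(x) = 1 − e^{−λx}, with no simulation needed; the rate and times below are arbitrary:

```python
# Check of the memoryless property via the exponential cdf.
from math import exp

lam, s, t = 0.5, 1.0, 10.0        # arbitrary rate and times

def cdf(x):
    """F(x) = 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lhs = cdf(s)                                  # P(X <= s)
rhs = (cdf(t + s) - cdf(t)) / (1 - cdf(t))    # P(X <= t+s | X >= t)
print(lhs, rhs)                               # the two agree
```

Algebraically both sides reduce to 1 − e^{−λs}: the conditional probability does not depend on how long we have already waited (t).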
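The Poisson/Exponential connection in the last bullet can be sketched by simulation: generate exponential waiting times between events by inverse transform and count how many events land in one unit of time; the mean (and variance) of that count should be close to the rate λ. All numbers below are illustrative choices of mine:

```python
# Events with Exp(lam) gaps -> counts per unit time are Poisson(lam).
import random
from math import log

random.seed(1)                    # arbitrary seed, for reproducibility
lam, n_runs = 4.0, 20_000

def count_events(lam):
    """Count events in [0, 1] when gaps are Exp(lam), via inverse transform."""
    t, count = 0.0, 0
    while True:
        # -log(1 - U)/lam is Exp(lam) when U ~ Uniform[0, 1)
        t += -log(1.0 - random.random()) / lam
        if t > 1.0:
            return count
        count += 1

counts = [count_events(lam) for _ in range(n_runs)]
mean_count = sum(counts) / n_runs
var_count = sum((c - mean_count) ** 2 for c in counts) / n_runs
print(mean_count, var_count)      # both near lam = 4, as Poisson predicts
```

That mean and variance come out (approximately) equal is itself a Poisson fingerprint: E[N] = Var[N] = λ.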