Discrete Univariate and Multivariate Distributions,
Intro to Continuous
Stat 430
Heike Hofmann
Outline
• Overview: Binomial, Negative Binomial,
Poisson, and Geometric
• Compound Discrete Distributions
• Continuous Random Variables: uniform,
exponential
Overview
• Binomial: number of successes in n
independent, identical Bernoulli trials
• Geometric: number of attempts until the
first success in independent, identical
Bernoulli trials
• Negative Binomial: number of attempts
until the r th success in independent,
identical Bernoulli trials
Overview
• Poisson: number of ‘successes’ observed in
a certain amount of time/space, given the
rate at which they occur
Gamma function:
• Γ(x + 1) = x · Γ(x), Γ(1) = 1
• Γ(k) = (k − 1)! if k is an integer
Poisson approximation of Binomial
• For large n (n ≥ 30) and small p (p ≤ 0.05),
the Poisson distribution approximates the
Binomial distribution:
Bn,p ≈ Poλ with λ = np
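The quality of this approximation is easy to check numerically. Below is a minimal Python sketch (the function names and the choice n = 100, p = 0.03 are ours) comparing the Binomial pmf with the Poisson pmf at λ = np:

```python
import math

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Binomial(n, p)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def pois_pmf(k, lam):
    # P(X = k) for X ~ Poisson(lam)
    return math.exp(-lam) * lam**k / math.factorial(k)

# large n (>= 30), small p (<= 0.05): Binomial(n, p) ~ Poisson(np)
n, p = 100, 0.03
lam = n * p
max_diff = max(abs(binom_pmf(k, n, p) - pois_pmf(k, lam)) for k in range(n + 1))
print(f"max pointwise pmf difference: {max_diff:.4f}")
```

For these values the two pmfs agree to within a few thousandths at every point.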
Multiple R.V.s
• Let X and Y be two categorical variables,
then pX,Y is the joint pmf:
• pX,Y(xi, yj) = P(X = xi and Y = yj)
• marginal distributions:
sum over all j for X,
sum over all i for Y

        X=1   X=2   ...   X=I
Y=1     π11   π12   ...   π1I
Y=2     π21   π22   ...   π2I
...     ...   ...   ...   ...
Y=J     πJ1   πJ2   ...   πJI

(all cell probabilities πij sum to 1)
Joint distributions
• expected value:
E[h(X, Y)] := Σx,y h(x, y) · pX,Y(x, y)
variance/covariance
Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
moment generating function:
MX(t) = E[e^(tX)] = Σi e^(t·xi) pX(xi)
E[X^k] = (d^k/dt^k) MX(t) |t=0
independence
X and Y are independent random variables,
if pX,Y = pX · pY
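As a quick sanity check on the MGF formula, the moments of a Bernoulli(p) variable can be recovered by numerically differentiating M(t) = (1 − p) + p·e^t at t = 0; a small Python sketch (the finite-difference step size is an arbitrary choice):

```python
import math

# MGF of X ~ Bernoulli(p): M(t) = E[e^(tX)] = (1 - p) + p * e^t
p = 0.3
M = lambda t: (1 - p) + p * math.exp(t)

# k-th moment of X = k-th derivative of M at t = 0 (finite differences here)
h = 1e-5
first_moment = (M(h) - M(-h)) / (2 * h)           # E[X]   = p
second_moment = (M(h) - 2 * M(0) + M(-h)) / h**2  # E[X^2] = p (since X^2 = X)
print(first_moment, second_moment)
```

Both numbers come out close to p = 0.3, as expected.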
Rules for E and Var
• Let X, Y be random variables,
let a, b be real values, then:
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X,Y)
Cov(X,Y) = E[XY] − E[X]·E[Y]
• also:
Cov(X,Y) = Cov(Y,X)
Cov(aX, bY) = ab Cov(X,Y)
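The variance rule holds exactly for sample moments too (as long as they all use the same divisor n), which makes it easy to verify on simulated data; a Python sketch with an arbitrary dependent pair X, Y:

```python
import random

random.seed(0)

# dependent pair: Y = X + noise (an arbitrary illustrative choice)
xs = [random.gauss(0, 1) for _ in range(100_000)]
ys = [x + random.gauss(0, 1) for x in xs]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((s - mu) * (t - mv) for s, t in zip(u, v)) / len(u)

a, b = 2.0, -3.0
lhs = var([a * x + b * y for x, y in zip(xs, ys)])
rhs = a**2 * var(xs) + b**2 * var(ys) + 2 * a * b * cov(xs, ys)
print(lhs, rhs)  # identical up to floating-point rounding
```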
Correlation
• ρ := Cov(X, Y) / √(Var(X) · Var(Y))
(read: “rho”)
• correlation measures the amount of linear
association between X and Y
Facts about ρ:
• ρ is between −1 and 1
• if ρ = 1 or −1, Y is a linear function of X:
ρ = 1 → Y = aX + b with a > 0,
ρ = −1 → Y = aX + b with a < 0
• ρ near ±1 indicates a strong linear association
between X and Y; ρ near 0 indicates lack of
linear association
Correlation
• If X,Y are independent,
Cov(X,Y) = 0
(then also correlation = 0)
• If Cov(X,Y) = 0,
X and Y are not necessarily independent
(but there’s no linear relationship between X and Y)
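This asymmetry shows up in a tiny worked example: take X uniform on {−1, 0, 1} and Y = X², so Y is completely determined by X, yet Cov(X, Y) = 0. A Python sketch:

```python
# X uniform on {-1, 0, 1}, Y = X^2: joint pmf has three equally likely cells
pmf = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}

E_X  = sum(x * p for (x, y), p in pmf.items())
E_Y  = sum(y * p for (x, y), p in pmf.items())
E_XY = sum(x * y * p for (x, y), p in pmf.items())
cov = E_XY - E_X * E_Y
print(cov)  # 0.0, even though Y is a deterministic function of X

# not independent: P(X=0, Y=0) != P(X=0) * P(Y=0)
p_x0 = sum(p for (x, y), p in pmf.items() if x == 0)  # 1/3
p_y0 = sum(p for (x, y), p in pmf.items() if y == 0)  # 1/3
print(pmf[(0, 0)], p_x0 * p_y0)  # 1/3 vs 1/9
```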
Continuous
Random Variables
Continuous R.V.s
• X is continuous r.v.,
if its image is an interval in R
• e.g. “measurements”: temperature, weight of a
person, height, time to run one mile, ...
Discrete vs Continuous
discrete random variable:
• image Im(X) finite or countably infinite
• probability mass function:
pX(x) = P(X = x)
• probability distribution function:
FX(t) = P(X ≤ t) = Σk≤t pX(k)
• expected value:
E[h(X)] = Σx h(x) · pX(x)
• variance:
Var[X] = E[(X − E[X])²] = Σx (x − E[X])² pX(x)
continuous random variable:
• image Im(X) uncountable
• probability density function:
fX(x) = F′X(x)
• probability distribution function:
FX(t) = P(X ≤ t) = ∫_{−∞}^{t} fX(x) dx
• expected value:
E[h(X)] = ∫_{−∞}^{∞} h(x) · fX(x) dx
• variance:
Var[X] = E[(X − E[X])²] = ∫_{−∞}^{∞} (x − E[X])² fX(x) dx
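The parallel formulas can be exercised side by side; a Python sketch using a fair die for the discrete column and the uniform density f(x) = 1 on (0, 1), integrated by a midpoint rule, for the continuous column (both example choices are ours):

```python
# discrete column: fair six-sided die, pX(x) = 1/6
support = range(1, 7)
p = 1 / 6
E_d = sum(x * p for x in support)               # 3.5
V_d = sum((x - E_d) ** 2 * p for x in support)  # 35/12

# continuous column: f(x) = 1 on (0, 1), integrals via midpoint rule
n = 100_000
grid = [(i + 0.5) / n for i in range(n)]
E_c = sum(x / n for x in grid)                  # -> 1/2
V_c = sum((x - E_c) ** 2 / n for x in grid)     # -> 1/12
print(E_d, V_d, E_c, V_c)
```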
Density & Distribution
• density function
f is a density function if it is non-negative and
integrates to 1
• distribution function
F is a distribution function if it is monotone
non-decreasing, has values between 0 and 1, and
reaches those limits (F(t) → 0 as t → −∞,
F(t) → 1 as t → ∞)
Density & Distribution
• density is derivative of distribution
• F’ = f
• (which makes distribution the antiderivative
of the density)
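The relationship F′ = f can be checked numerically for any smooth example; a sketch using F(x) = x² on (0, 1), whose density is f(x) = 2x:

```python
# F(x) = x^2 on (0, 1) is a distribution function; its density is f(x) = 2x
F = lambda x: x ** 2
f = lambda x: 2 * x

# central difference approximates F'(x); it should match f(x)
h = 1e-6
max_err = max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x))
              for x in [0.1, 0.25, 0.5, 0.9])
print(max_err)  # close to 0
```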
Uniform Distribution
• X has uniform distribution on [a, b], if all
values have the same probability to be picked
(idea of a “random” number)
• density:
f(x) = 1/(b − a) if a < x < b,
f(x) = 0 otherwise
• E[X] = (a + b)/2
Var[X] = (b − a)²/12
[Figure 2.2: density function of a uniform variable X on (a, b): Ua,b(x) is constant at 1/(b − a) between a and b]
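A quick simulation check of these formulas (the interval [2, 5] and the sample size are arbitrary choices):

```python
import random

random.seed(1)
a, b = 2.0, 5.0
draws = [random.uniform(a, b) for _ in range(100_000)]

m = sum(draws) / len(draws)
v = sum((x - m) ** 2 for x in draws) / len(draws)
print(m, (a + b) / 2)        # sample mean vs (a+b)/2 = 3.5
print(v, (b - a) ** 2 / 12)  # sample variance vs (b-a)^2/12 = 0.75
```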
Exponential Distribution
• X is an exponentially distributed random
variable, if its density is:
fX(x) = λe^(−λx) if x ≥ 0,
fX(x) = 0 otherwise
• λ is called the rate parameter
• E[X] = 1/λ
Var[X] = 1/λ²
[Figure 2.3: density functions of exponential variables for different rate parameters λ = 2, 1, 0.5]
Exponential Distribution
• Exponential distribution is memoryless:
any event is just as likely to happen within
the next minute as it was in the minute
ten minutes ago:
Let X be an exponentially distributed r.v., i.e.
X ~ Expλ, then
P(X ≤ s) = P(X ≤ t + s | X ≥ t)
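Memorylessness can be checked by simulation: conditioning on X ≥ t and shifting by t should reproduce the unconditional distribution. A Python sketch (the rate and the values of s, t are arbitrary choices):

```python
import random

random.seed(2)
lam = 0.5
draws = [random.expovariate(lam) for _ in range(200_000)]

s, t = 1.0, 3.0
p_uncond = sum(x <= s for x in draws) / len(draws)  # P(X <= s)
tail = [x - t for x in draws if x >= t]             # given X >= t, look at X - t
p_cond = sum(x <= s for x in tail) / len(tail)      # P(X <= t+s | X >= t)
print(p_uncond, p_cond)  # both near 1 - exp(-lam * s)
```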
Exponential Distribution
• When does a variable have an Exponential
distribution?
• when we observe a process, counting the
number of occurrences (which is Poisson), the
time or distance until the next event is Exponential
(the continuous equivalent of the Geometric)
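This connection can be illustrated by building a process from Exponential gaps and counting events per unit window; the counts should then look Poisson with the same rate. A Python sketch (the rate and horizon are arbitrary choices):

```python
import random

random.seed(3)
lam = 4.0        # event rate per unit time
horizon = 10_000

# build a Poisson process from Exponential inter-arrival times
counts = [0] * horizon
t = random.expovariate(lam)
while t < horizon:
    counts[int(t)] += 1
    t += random.expovariate(lam)

# counts per unit window should behave like Poisson(lam): mean == variance == lam
m = sum(counts) / horizon
v = sum((c - m) ** 2 for c in counts) / horizon
print(m, v)  # both near 4
```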