
R and Distributions

Useful Univariate Distributions
Modeling Univariate Distributions
Location, Scale, and Shape Parameters
Location parameter: shifts a distribution to the right or left without changing the distribution’s shape
or variability. If f is a pdf, µ is a location parameter for the density f if f can be written as f (x − µ), i.e.
f (x; µ) = f (x − µ; 0). This means that if µ is the location parameter of X, then a + µ is the location
parameter of a + X.
Scale parameter: quantifies dispersion. A parameter is a scale parameter for a univariate sample if the
parameter is multiplied by |a| when the data are multiplied by a. So if σ(X) is a scale parameter
for a random variable X, then σ(aX) = |a|σ(X), provided that the s.d. is finite. If F is a CDF and f the
corresponding pdf, λ is a scale parameter if F (x; λ) = F (x/λ; 1) or, equivalently,

f (x; λ) = λ^(−1) f (x/λ; 1)

If λ is a scale parameter (quantifying dispersion), then λ^(−1) is a precision parameter.
Examples:
• if X ∼ N [µ, σ²], then µ is a location parameter, σ is a scale parameter, and σ^(−1) is a precision
parameter.
• if X ∼ U [a, a + b], then a is a location parameter, b is a scale parameter, and b^(−1) is a precision
parameter.
Location-scale family of distributions: If f is a pdf such that:

f (x; µ, λ) = λ^(−1) f ((x − µ)/λ; 0, 1)

then f is a family of distributions with location parameter µ and scale parameter λ (for example Normal
and Uniform).
Shape parameter: any parameter that is not changed by location and scale changes; it affects the
distribution’s shape rather than shifting or stretching it. The most important moments for characterizing
shape are the skewness and the kurtosis.
Skewness:
it measures the degree of asymmetry, with:
• Symmetry implying zero skewness.
• Positive skewness (right skewness) indicating a relatively long right tail compared to the left tail.
• Negative skewness (left skewness) indicating the opposite.
The skewness of a random variable X is:

Sk = E[((X − E[X]) / sd[X])³]
Kurtosis: the kurtosis of a random variable X is:

Kur = E[((X − E[X]) / sd[X])⁴]
It is possible to prove that Kur > 1. Furthermore, the kurtosis is usually only considered for symmetric
distributions, even though symmetry is not necessary for its definition.
We can estimate them from an i.i.d. sample with the estimators:

Sk̂ = (1/n) Σ_{i=1}^n ((Xᵢ − X̄n) / S̄n)³

Kur̂ = (1/n) Σ_{i=1}^n ((Xᵢ − X̄n) / S̄n)⁴
Deviations of the sample skewness from 0 and/or of the sample kurtosis from 3 are possible indicators of
non-normality, but these estimators do not have good properties: they are biased and their sampling
distribution cannot, in general, be obtained.
Tests of Normality
The null hypothesis is that the sample comes from a normal distribution, and the alternative is that it comes
from a nonnormal distribution. Often the distribution of the test statistic for these tests is not of a common
family; for small values of n their results should be considered with care.
The Shapiro-Wilk test uses a test statistic that can be interpreted as the squared correlation between the
sample quantiles and the quantiles of a standard normal distribution.
The Jarque-Bera test uses a test statistic combining skewness and kurtosis:

JB = n (Sk̂²/6 + (Kur̂ − 3)²/24)

JB = 0 when the sample skewness is 0 and the sample kurtosis is 3, as in a normal sample, and it increases
otherwise.
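A minimal sketch of these quantities in base R. Note that sd() uses the n − 1 denominator, which may differ slightly from the S̄n convention above; the variable names are ours:

```r
# Sample skewness, sample kurtosis, and the Jarque-Bera statistic
set.seed(42)
x <- rnorm(200)

sk  <- mean(((x - mean(x)) / sd(x))^3)   # close to 0 for normal data
kur <- mean(((x - mean(x)) / sd(x))^4)   # close to 3 for normal data
jb  <- length(x) * (sk^2 / 6 + (kur - 3)^2 / 24)

# Shapiro-Wilk test from base R (stats): W close to 1 for normal data
shapiro.test(x)
```

The same JB statistic is implemented, with a p-value, by jarque.bera.test() in the tseries package.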
The Normal Distribution
The PDF of a normal distribution is:
f (x) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²))
The normal distribution with µ = 0 and σ² = 1 is called the standard normal distribution. Given X ∼ N [µ, σ²],
the transformed r.v. Z = (X − µ)/σ follows a standard normal distribution. The q-quantile of X can be obtained
as a linear transformation of the corresponding quantile of the standard normal: x_q = µ + σz_q.
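A quick check of this quantile relation in R (the example values µ = 3, σ = 0.5 are ours):

```r
# q-quantile of N(mu, sigma^2) computed directly...
mu <- 3; sigma <- 0.5
q1 <- qnorm(0.95, mean = mu, sd = sigma)
# ...and via the linear transformation of the standard normal quantile
q2 <- mu + sigma * qnorm(0.95)
all.equal(q1, q2)  # TRUE
```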
• A linear transformation of the normal distribution is itself a normal.
• If X ∼ N [µ, σ 2 ] then for any value of µ and σ 2 , we have that Sk = 0 and Kur = 3.
#dnorm(x, mean, sd) returns the PDF evaluated in x
dnorm(0, mean = 0, sd = 1)
## [1] 0.3989423
curve(dnorm, from = -4, to = 4)
(plot: standard normal density dnorm(x) for x in [-4, 4])
#pnorm(q) returns the integral from -inf to q of the normal pdf, where q is a z-score.
pnorm(0)
## [1] 0.5
#if we include lower.tail = FALSE it returns integral from q to +inf
pnorm(4.5, 5, 1, lower.tail = FALSE)
## [1] 0.6914625
curve(pnorm(x, 3, 0.5), from = 0, to = 6)
(plot: CDF pnorm(x, 3, 0.5) for x in [0, 6])
curve(qnorm(x, 3, 0.5), from = 0, to = 1)
(plot: quantile function qnorm(x, 3, 0.5) for x in [0, 1])
#The quantile function is given by qnorm(x), which is the inverse of pnorm()
qnorm(c(0.025, 0.975)) #critical values for two tailed 95% CI
## [1] -1.959964  1.959964
pnorm(qnorm(c(0.025, 0.975)))
## [1] 0.025 0.975
qnorm(0.05)
## [1] -1.644854
qnorm(0.95, lower.tail = FALSE)
## [1] -1.644854
Skew Normal Distribution
It’s an extension of the normal pdf that allows for skewness.
Let φ and Φ be the standard normal pdf and CDF, respectively. The pdf:

f (x; α) = 2φ(x)Φ(αx)

where α is the shape parameter, has the following properties:
• if α = 0 it coincides with the standard normal (symmetric).
• the skewness increases as α increases in absolute value: to the right if α > 0, to the left if α < 0.
(plot: skew-normal densities dsn(x) for several values of α)
Location-scale transformation: given the random variable X with pdf f (x; α), the linear transformation
Y = µ + λX is said to have a skew-normal distribution, with location parameter µ, scale parameter λ and
shape parameter α, so Y ∼ SN [µ, λ², α].
Let δ = α/√(1 + α²); then:
• E[Y ] = µ + λ√(2/π) δ.
• V ar[Y ] = λ²(1 − 2δ²/π).
• Sk[Y ] = ((4 − π)/2) · (δ√(2/π))³ / (1 − 2δ²/π)^(3/2).
• Kur[Y ] = 3 + 2(π − 3) · (δ√(2/π))⁴ / (1 − 2δ²/π)².
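The mean formula can be checked against the density using base R only (the function name dsn_manual is ours; the sn package provides an equivalent dsn):

```r
# Skew-normal density written from its definition: f(x; alpha) = 2*phi(x)*Phi(alpha*x)
dsn_manual <- function(x, alpha) 2 * dnorm(x) * pnorm(alpha * x)

alpha <- 3
delta <- alpha / sqrt(1 + alpha^2)
m_formula <- sqrt(2 / pi) * delta   # E[X] from the formula (mu = 0, lambda = 1)
m_numeric <- integrate(function(x) x * dsn_manual(x, alpha), -Inf, Inf)$value
all.equal(m_formula, m_numeric, tolerance = 1e-6)  # TRUE
```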
Student’s-t Distribution
T is a Student’s t distribution with ν degrees of freedom if it has the density function

f_T (x) = Γ((ν + 1)/2) / (√(νπ) Γ(ν/2)) · (1 + x²/ν)^(−(ν+1)/2)

where:
• E[T ] = 0, exists only if ν > 1.
• V ar[T ] = ν/(ν − 2) if ν > 2, ∞ for 1 < ν ≤ 2, and undefined otherwise.
• Sk[T ] = 0, for ν > 3.
• Kur[T ] = 3 + 6/(ν − 4) for ν > 4 –> always greater than (or equal to) 3.
• When ν → ∞ it converges to a Gaussian.
(plot: Student’s t densities dt(x, ν) for ν = 1, 5, 10, 400 on [−4, 4])
Comparing a Student’s t with a Gaussian:
(plot: Student’s t densities for ν = 1, 5, 10 against the Gaussian dnorm(x))
t-distribution
We can extend the standardized t-distribution by introducing a location and scale parameter with a linear
transformation.
If T has a tν distribution, then Y = µ + λT, µ ∈ R, λ ≠ 0 is said to have a tν [µ, λ²] distribution, where µ
is the location parameter and λ is the scale parameter. tν [µ = 0, λ = 1] = tν .
We have that:
• E[Y ] = µ, for ν > 1.
• V ar[Y ] = λ² ν/(ν − 2) if ν > 2.
• Sk[Y ] = 0, for ν > 3.
• Kur[Y ] = 3 + 6/(ν − 4) for ν > 4, and Kur[Y ] = +∞ for 2 < ν ≤ 4.
• ν can take any value above 0, not just the integers.
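The density of the location-scale t follows from the general λ^(−1) f ((y − µ)/λ) rule; a minimal sketch in base R (the function name dt_ls is ours):

```r
# Density of Y = mu + lambda * T, with T ~ t_nu
dt_ls <- function(y, nu, mu = 0, lambda = 1) dt((y - mu) / lambda, df = nu) / lambda

# With mu = 0, lambda = 1 it reduces to the standard t density
all.equal(dt_ls(1.3, nu = 5), dt(1.3, df = 5))  # TRUE
```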
Lognormal Distribution
The lognormal distribution is a distribution whose logarithm has a normal distribution, this can be used
(though not always optimally) to model right-skewed financial data. Let X ∼ N [µ, σ 2 ] and Y = exp(X).
Then Y is said to have a lognormal distribution Y ∼ lnN [µ, σ 2 ]. The two parameters are called log-mean
and log-variance, but actually they are the expected value and variance of log(Y ) (the normal). We have
that E(Y ) = exp(µ + σ 2 /2) and V ar(Y ) = (exp(2µ + σ 2 )(exp(σ 2 ) − 1)) = (E(Y ))2 (exp(σ 2 ) − 1) –> Recall
Jensen’s inequality.
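The mean formula can be checked numerically against the built-in dlnorm density (the example values of µ and σ are ours):

```r
# Check E(Y) = exp(mu + sigma^2/2) for Y ~ lnN(mu, sigma^2) by integrating y * f(y)
mu <- 0.5; sigma <- 0.8
m <- integrate(function(y) y * dlnorm(y, meanlog = mu, sdlog = sigma), 0, Inf)$value
all.equal(m, exp(mu + sigma^2 / 2))  # TRUE
```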
Let X ∼ lnN [µ, σ²]:
• The log-mean µ is a scale parameter.
• The log-standard deviation σ is a shape parameter.
• Sk(X) = (exp(σ²) + 2)√(exp(σ²) − 1) –> right skewed.
• Kur(X) = exp(4σ²) + 2 exp(3σ²) + 3 exp(2σ²) − 3 –> often with an important kurtosis.
curve(dlnorm, to = 6)
(plot: lognormal density dlnorm(x) for x in [0, 6])
curve(plnorm, to = 6)
(plot: lognormal CDF plnorm(x) for x in [0, 6])
Notice that quantiles (unlike moments) transform directly through exp:
qlnorm(p = 0.95)
## [1] 5.180252
exp(qnorm(p = 0.95))
## [1] 5.180252
The Binomial Distribution
We conduct n independent experiments and on each there are two possible outcomes; the probability of one
(“success”) is p and the probability of “failure” is q = 1 − p. It’s assumed p and q are constant across
experiments. The PDF of the Binomial(n, p) is:
P (Y = k) = (n choose k) p^k q^(n−k), k = 0, 1, 2, ..., n.

where (n choose k) = n!/(k!(n − k)!), E(Y ) = np and V ar(Y ) = npq. The Binomial(1, p) is also called the
Bernoulli distribution and its density is:

P (Y = y) = p^y (1 − p)^(1−y), y = 0, 1.

p^y is equal to either p (when y = 1) or 1 (when y = 0), and likewise for (1 − p)^(1−y).
plot(0:20, dbinom(0:20, 20, 0.3), type = 'h')
(plot: Binomial(20, 0.3) pmf dbinom(0:20, 20, 0.3))
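The np and npq formulas can be verified from the pmf itself, using the same Binomial(20, 0.3) as the plot:

```r
# Check E(Y) = n*p and Var(Y) = n*p*q by summing over the Binomial(20, 0.3) pmf
n <- 20; p <- 0.3; k <- 0:n
m <- sum(k * dbinom(k, n, p))           # expected value
v <- sum((k - m)^2 * dbinom(k, n, p))   # variance
all.equal(m, n * p)            # TRUE
all.equal(v, n * p * (1 - p))  # TRUE
```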
The Uniform Distribution
A Uniform(a,b) on the interval (a,b) has a PDF equal to 1/(b − a) on (a,b) and 0 outside this interval.
E(Y ) = (a + b)/2.

V ar(Y ) = (b − a)²/12.
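These two formulas can be checked by integrating against dunif (the endpoints a = 2, b = 5 are our example values):

```r
# Mean and variance of Uniform(2, 5) by numerical integration of its pdf
a <- 2; b <- 5
m <- integrate(function(x) x * dunif(x, a, b), a, b)$value
v <- integrate(function(x) (x - m)^2 * dunif(x, a, b), a, b)$value
all.equal(m, (a + b) / 2)     # TRUE
all.equal(v, (b - a)^2 / 12)  # TRUE
```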
The χ2 distribution
Definition: Let Z ∼ N (0, 1). Then Z² ∼ χ²₁, chi-square with 1 degree of freedom.
Sum: Let Zᵢ ∼ N (0, 1), i = 1, ..., k, independently. Then X = Σ_{i=1}^k Zᵢ² ∼ χ²ₖ with:
• E[X] = k.
• V ar[X] = 2k.
• M ode = max(k − 2, 0).
Its density function is:

f_X (x) = (1 / (2^(k/2) Γ(k/2))) x^(k/2 − 1) e^(−x/2)

where Γ() is the Gamma function:

Γ(z) = ∫₀^∞ exp(−x) x^(z−1) dx
(plot: chi-square densities dchisq(x, k) for k = 1, 2, 3, 5 on [0, 10])
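The sum definition and the moment formulas can be illustrated by simulation (sample size is our choice):

```r
# Sum of k squared standard normals behaves like a chi-square with k degrees of freedom
set.seed(1)
k <- 5
x <- colSums(matrix(rnorm(k * 10000), nrow = k)^2)
mean(x)  # close to E[X] = k
var(x)   # close to Var[X] = 2k
```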
Double Exponential (Laplace) Distribution
The random variable X follows a Laplace distribution with mean µ and scale parameter θ, so X ∼
Laplace[µ, θ], if X has pdf:

f (x) = (1 / (2θ)) exp(−|x − µ|/θ), −∞ < x < ∞

with −∞ < µ < ∞ and θ > 0.
The distribution is symmetric about µ with:
• E[X] = µ.
• V ar[X] = 2θ².
• Sk[X] = 0.
• Kur[X] = 6.
(plot: Laplace[0, 1] and Laplace[0, √2] densities against N [0, 1])
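Base R has no built-in Laplace density; a minimal sketch from the pdf above (the function name dlaplace is ours):

```r
# Laplace pdf: f(x) = exp(-|x - mu| / theta) / (2 * theta)
dlaplace <- function(x, mu = 0, theta = 1) exp(-abs(x - mu) / theta) / (2 * theta)

# It integrates to 1, and E[X^2] = Var[X] = 2 * theta^2 when mu = 0
integrate(function(x) dlaplace(x, 0, 1.5), -Inf, Inf)$value        # ~ 1
integrate(function(x) x^2 * dlaplace(x, 0, 1.5), -Inf, Inf)$value  # ~ 2 * 1.5^2 = 4.5
```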