STAT 350 - An Introduction to Statistics
Named Continuous Distributions
Jeremy Troisi

1 Uniform Distribution - $X \sim U(a, b)$

Probability is uniform, or the same, over an interval from a to b.
$X \sim U(a, b)$, $a < b$, where $a$ is the beginning of the interval and $b$ is the end of the interval.
The Uniform Distribution derives 'naturally' from Poisson Processes, and how it does so will be covered in the Poisson Process Notes. However, for the Named Continuous Distribution Notes, we will simply discuss its various properties.
1.1 Probability Density Function (PDF) - $f_X(x) = \frac{1}{b-a} : a < x < b$

$$f_X(x) = \begin{cases} \frac{1}{b-a} & a < x < b \\ 0 & \text{else} \end{cases}$$

1.1.1 Rules
1. $a < b \Rightarrow \frac{1}{b-a} > 0$ for $a < x < b$, and $f_X(x) = 0 \ge 0$ else $\Rightarrow f_X(x) \ge 0$ for all ($\forall$) $x$.

2. $\lim_{n \nearrow \infty} \int_{-n}^{n} f_X(x)\,dx = \lim_{n \nearrow \infty} \left[ \int_{-n}^{a} (0)\,dx + \int_{a}^{b} \frac{1}{b-a}\,dx + \int_{b}^{n} (0)\,dx \right] = (0) + \frac{1}{b-a}[x]_a^b + (0) = \frac{1}{b-a}(b-a) = 1$

Therefore, $f_X$ is a valid PDF.
1.2 Cumulative Distribution Function (CDF) - $F_X(x) = \frac{x-a}{b-a} : a < x < b$

$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt = \int_{-\infty}^{a} (0)\,dt + \int_{a}^{x} \frac{1}{b-a}\,dt = (0) + \frac{1}{b-a}[t]_a^x = \frac{x-a}{b-a}$ for $a < x < b$

$$\Rightarrow F_X(x) = \begin{cases} 0 & x \le a \\ \frac{x-a}{b-a} & a < x < b \\ 1 & x \ge b \end{cases}$$
Thus, geometrically, we take the (x − a) portion of the entire (b − a) length of the interval (or rectangle).
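This piecewise CDF translates directly into code. A minimal Python sketch, where the function name and the interval $U(2, 10)$ are illustrative choices, not from the notes:

```python
def uniform_cdf(x, a, b):
    """CDF of X ~ U(a, b): the (x - a) portion of the (b - a) interval length."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# For X ~ U(2, 10), x = 4 covers (4 - 2)/(10 - 2) = 1/4 of the rectangle.
print(uniform_cdf(4, 2, 10))   # 0.25
print(uniform_cdf(1, 2, 10))   # 0.0 (left of the interval)
print(uniform_cdf(12, 2, 10))  # 1.0 (right of the interval)
```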
1.3 Percentile, $p$: $x^* = a(1-p) + bp$

For $0 < p < 1$:

$F_X(x^*) = p \Rightarrow \frac{x^*-a}{b-a} = p \Rightarrow x^* - a = p(b-a) \Rightarrow x^* = p(b-a) + a = a(1-p) + bp$
Thus, the p percentile of a Uniform Distribution takes p proportion of the right endpoint value b and
the remaining (1 − p) proportion of the left endpoint value a.
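The weighted-endpoint form of the percentile is one line of code. A sketch, with illustrative names and numbers:

```python
def uniform_percentile(p, a, b):
    """p-th quantile of U(a, b): weight (1 - p) on a and weight p on b."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return a * (1 - p) + b * p

# Median of U(2, 10) is the midpoint; the 25th percentile of U(0, 8) is 2.
print(uniform_percentile(0.5, 2, 10))  # 6.0
print(uniform_percentile(0.25, 0, 8))  # 2.0
```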
1.4 Probability - $S_X(x) = \frac{b-x}{b-a} : a < x < b$

$S_X(x) = P(X > x) = 1 - P((X > x)^C) = 1 - P(X \le x) = 1 - F_X(x)$

$$\Rightarrow S_X(x) = \begin{cases} 1 & x \le a \\ \frac{b-x}{b-a} & a < x < b \\ 0 & x \ge b \end{cases}$$
Thus, geometrically, we take the (b − x) portion of the entire (b − a) length of the interval (or rectangle).
AND, for $x_1 < x_2$:

$$P(x_1 < X < x_2) = F_X(x_2) - F_X(x_1) = \begin{cases} 0 & x_1 < x_2 \le a < b \\ \frac{x_2-a}{b-a} & x_1 \le a < x_2 < b \\ \frac{x_2-x_1}{b-a} & a < x_1 < x_2 < b \\ \frac{b-x_1}{b-a} & a < x_1 < b \le x_2 \\ 0 & a < b \le x_1 < x_2 \end{cases}$$
Thus, geometrically, we take the (x2 − x1 ) portion of the entire (b − a) length of the interval (or rectangle).
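All five cases collapse into one computation if the CDF is clamped to [0, 1] before differencing. A sketch with an assumed helper name and an illustrative $U(2, 10)$:

```python
def uniform_interval_prob(x1, x2, a, b):
    """P(x1 < X < x2) for X ~ U(a, b), computed as F(x2) - F(x1)."""
    def cdf(x):
        # Clamping reproduces the 0 and 1 branches of the piecewise CDF.
        return min(1.0, max(0.0, (x - a) / (b - a)))
    return cdf(x2) - cdf(x1)

print(uniform_interval_prob(3, 5, 2, 10))   # 0.25 (interval fully inside)
print(uniform_interval_prob(0, 4, 2, 10))   # 0.25 (clipped at a)
print(uniform_interval_prob(8, 20, 2, 10))  # 0.25 (clipped at b)
```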
1.5 Mean/Expected Value - $\mu = E[X] = \frac{a+b}{2}$

$\mu = E[X] = \lim_{n \nearrow \infty} \int_{-n}^{n} x f_X(x)\,dx = \lim_{n \nearrow \infty} \left[ \int_{-n}^{a} x(0)\,dx + \int_{a}^{b} x\left(\frac{1}{b-a}\right)dx + \int_{b}^{n} x(0)\,dx \right]$

$= (0) + \frac{1}{2(b-a)}[x^2]_a^b + (0) = \frac{1}{2(b-a)}(b^2 - a^2) = \frac{(b-a)(b+a)}{2(b-a)} = \frac{a+b}{2}$
An expected value is a 'center of gravity' in the Physics sense. The mean is the location that supports the weight of our density and prevents it from toppling over in one direction or the other. Since the weight is distributed uniformly, or even more generally symmetrically (so long as the mean exists; see the Cauchy Distribution for a symmetric distribution that does not possess a mean), this place is the midpoint of $a$ and $b$, $\frac{a+b}{2}$, and we have constructed a perfect 'teeter-totter'.
Keep in mind, this value will always need to be bigger than or equal to the smallest possible value and smaller than or equal to the largest possible value. Beyond that, some basic logic can further narrow down whether your answer is plausible or has an error somewhere.
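The 'teeter-totter' claim can be checked by simulation; the interval $U(2, 10)$, sample size, and seed below are arbitrary illustrative choices:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
a, b = 2.0, 10.0
n = 100_000

sample_mean = sum(random.uniform(a, b) for _ in range(n)) / n
print(sample_mean)  # close to the midpoint (a + b)/2 = 6
```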
a<
1.6
a+b
2
= E[X] =
a+b
2
<b
(b−a)2
12
Variance - σ 2 = V ar[X] =
$\sigma^2 = Var[X] = E[X^2] - \mu^2 = \lim_{n \nearrow \infty} \int_{-n}^{n} x^2 f_X(x)\,dx - \left(\frac{a+b}{2}\right)^2$

$= \lim_{n \nearrow \infty} \left[ \int_{-n}^{a} x^2(0)\,dx + \int_{a}^{b} x^2\left(\frac{1}{b-a}\right)dx + \int_{b}^{n} x^2(0)\,dx \right] - \frac{(a+b)^2}{2^2}$

$= \left[(0) + \frac{1}{3(b-a)}[x^3]_a^b + (0)\right] - \frac{a^2 + 2ab + b^2}{4} = \frac{1}{3(b-a)}(b^3 - a^3) - \frac{(b-a)(a^2+2ab+b^2)}{4(b-a)}$

$= \frac{(b-a)(b^2+ab+a^2)}{3(b-a)} - \frac{(b-a)(a^2+2ab+b^2)}{4(b-a)} = \frac{4(b-a)(a^2+ab+b^2) - 3(b-a)(a^2+2ab+b^2)}{12(b-a)}$

$= \frac{(4a^2+4ab+4b^2) - (3a^2+6ab+3b^2)}{12} = \frac{a^2 - 2ab + b^2}{12} = \frac{(b-a)^2}{12}$
Intuitively, the variance should be based on the length of the interval, $(b-a)$. The furthest a point can be from the mean/midpoint on this interval is thus $\frac{b-a}{2}$. Squaring this value, as the variance operator does, yields $\frac{(b-a)^2}{4}$. The 'averaging' of this quantity divides by another 3, because we are integrating a quadratic term, $\int x^2\,dx = \frac{x^3}{3}$. Thus, the result is $\frac{(b-a)^2}{12}$.
Keep in mind, this value will always be bigger than zero and less than the largest distance from the center or mean, squared:

$$0 < \frac{(b-a)^2}{12} = \sigma^2 < \frac{(b-a)^2}{4}$$

1.6.1 Standard Deviation - $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{(b-a)^2}{12}} = \frac{\sqrt{(b-a)^2}}{\sqrt{12}} = \frac{(b-a)\sqrt{3}}{6}$
Keep in mind, this value will always be bigger than zero and less than the largest distance from the
center or mean.
0<
2
√
(b−a) 3
6
=σ=
√
(b−a) 3
6
=
b−a
√2
3
<
(b−a)
2
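The two algebraic forms of $\sigma$ and its upper bound can be verified numerically; the interval $U(2, 10)$ is illustrative:

```python
import math

a, b = 2.0, 10.0
sigma = (b - a) * math.sqrt(3) / 6        # (b - a)√3/6 form
sigma_alt = (b - a) / (2 * math.sqrt(3))  # equivalent (b - a)/(2√3) form

assert math.isclose(sigma, sigma_alt)
assert 0 < sigma < (b - a) / 2  # less than the largest distance from the center
print(sigma)  # about 2.309 for U(2, 10)
```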
2 Exponential Distribution - $X \sim Exp(\lambda)$
The Exponential Distribution is the random variable (r.v.) that models the waiting time (distance or other
continuous metric) until the next rare event. The Poisson Distribution, a Discrete Distribution, counts
the number of rare events over a continuous metric interval. The Exponential Distribution is interwoven
with the Poisson Distribution by measuring the length of the continuous metric until the next count. This
interwoven nature is known as the Poisson Process and will be in its own independent set of notes.
$X \sim Exponential(\lambda)$ where $\lambda > 0$ is the rate at which counts occur per waiting 'time', just as it was in the $Poisson(\lambda)$ Distribution.
The Exponential Distribution derives from the Geometric Distribution in the limit as $p \searrow 0$, with the discrete 'trial' metric extended to a continuous metric, making the Exponential Distribution a continuous analog of the Geometric Distribution. This will be demonstrated with a little 'hand waving':
Provided the following two items, we will derive the Survival Function $S_X(x) = P(X > x)$.

1. $Y \sim geo(p) \Rightarrow S_Y(y) = P(Y > y) = (1-p)^y$

2. $n \nearrow \infty$, $p \searrow 0$, such that (s.t.) $np \to \lambda > 0$

$\Rightarrow S_X(x) \stackrel{2}{=} \lim_{n \nearrow \infty,\, p \searrow 0,\, \text{s.t. } np \to \lambda > 0} P(Y > y)^n \stackrel{1}{=} \lim_{n \nearrow \infty,\, p \searrow 0,\, \text{s.t. } np \to \lambda > 0} \left[(1-p)^x\right]^n \stackrel{2}{=} \lim_{n \nearrow \infty,\, 0 < \lambda < \infty} \left[\left(1 - \frac{\lambda}{n}\right)^n\right]^x = [e^{-\lambda}]^x = e^{-\lambda x}$

$$\Rightarrow S_X(x) = P(X > x) = \begin{cases} 1 & x \le 0 \\ e^{-\lambda x} & x > 0 \end{cases}$$
Provided the Survival Function SX (x) we are able to derive all of the other properties of the Exponential
Distribution.
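The 'hand waving' limit can be watched numerically: discretize 'time' into $n$ trials per unit with success probability $p = \lambda/n$, and the discrete survival probability approaches $e^{-\lambda x}$. The rate $\lambda = 1.5$ and $x = 2$ below are illustrative:

```python
import math

lam, x = 1.5, 2.0  # illustrative rate and waiting time

# Survival of the discretized process: no success in the first n*x trials,
# each with success probability p = lam/n, gives (1 - p)^(n x) = [(1 - lam/n)^n]^x.
for n in (10, 100, 10_000):
    p = lam / n
    print(n, (1 - p) ** (n * x))

print("limit:", math.exp(-lam * x))  # e^(-lam x), the Exponential survival function
```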
2.1 Probability Density Function (PDF) - $f_X(x) = \lambda e^{-\lambda x} : x > 0$

$f_X(x) = \frac{d}{dx}[-S_X(x)] = \frac{d}{dx}[-e^{-\lambda x}] = \lambda e^{-\lambda x}$

$$\Rightarrow f_X(x) = \begin{cases} 0 & x \le 0 \\ \lambda e^{-\lambda x} & x > 0 \end{cases}$$

2.1.1 Rules
1. $\lambda > 0$ and $e^{-\lambda x} > 0 \Rightarrow \lambda e^{-\lambda x} \ge 0$ for $x > 0$, AND $0 \ge 0$ for $x \le 0 \Rightarrow f_X(x) \ge 0, \forall x$.

2. $\lim_{n \nearrow \infty} \int_{-n}^{n} f_X(x)\,dx = \lim_{n \nearrow \infty} \left[ \int_{-n}^{0} (0)\,dx + \int_{0}^{n} \lambda e^{-\lambda x}\,dx \right] = (0) + \lim_{n \nearrow \infty} \lambda \int_{0}^{-\lambda n} e^u \left(\frac{du}{-\lambda}\right) = \lim_{n \nearrow \infty} -[e^u]_0^{-\lambda n} = -\lim_{n \nearrow \infty}(e^{-\lambda n} - e^0) = -((0) - (1)) = 1$

Therefore, $f_X$ is a valid PDF.
2.2 Cumulative Distribution Function (CDF) - $F_X(x) = 1 - e^{-\lambda x} : x > 0$

$F_X(x) = P(X \le x) = 1 - S_X(x) = 1 - e^{-\lambda x}$

$$\Rightarrow F_X(x) = \begin{cases} 0 & x \le 0 \\ 1 - e^{-\lambda x} & x > 0 \end{cases}$$

2.3 Percentile, $p$: $x^* = -\frac{1}{\lambda}\ln(1-p)$
For $0 < p < 1$:

$F_X(x^*) = p \Leftrightarrow 1 - e^{-\lambda x^*} = p \Leftrightarrow S_X(x^*) = e^{-\lambda x^*} = 1 - p \Leftrightarrow \ln(e^{-\lambda x^*}) = -\lambda x^* = \ln(1-p) \Leftrightarrow x^* = -\frac{1}{\lambda}\ln(1-p)$
Thus, the p percentile of an Exponential Distribution inverts the exponential function to find the needed metric value.
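Inverting the CDF this way is exactly inverse-transform sampling: feeding $U(0, 1)$ draws through $x^* = -\ln(1-p)/\lambda$ produces Exponential draws. A sketch with an illustrative rate $\lambda = 2$:

```python
import math
import random

def exp_percentile(p, lam):
    """p-th quantile of Exp(lam): x* = -ln(1 - p)/lam."""
    return -math.log(1 - p) / lam

lam = 2.0
median = exp_percentile(0.5, lam)
print(median)  # ln(2)/2, about 0.347

# Inverse-transform sampling: about half of the draws fall below the median.
random.seed(0)
draws = [exp_percentile(random.random(), lam) for _ in range(50_000)]
print(sum(x <= median for x in draws) / len(draws))
```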
2.4 Probability - $S_X(x) = e^{-\lambda x} : x > 0$

$$S_X(x) = P(X > x) = \begin{cases} 1 & x \le 0 \\ e^{-\lambda x} & x > 0 \end{cases}$$

AND, for $x_1 < x_2$:

$$P(x_1 < X < x_2) = F_X(x_2) - F_X(x_1) = \begin{cases} 0 & x_1 < x_2 \le 0 \\ 1 - e^{-\lambda x_2} & x_1 \le 0 < x_2 \\ e^{-\lambda x_1} - e^{-\lambda x_2} & 0 < x_1 < x_2 \end{cases}$$

2.4.1 Memoryless Property (as with the Geometric Distribution) - $P(X > s+t \mid X > t) = P(X > s)$
$P(X > s+t \mid X > t) = \frac{P(X > s+t \cap X > t)}{P(X > t)} = \frac{P(X > s+t)}{P(X > t)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda t}} \stackrel{\text{property of exponential functions}}{=} e^{-\lambda s} = S_X(s)$

Similarly,

$P(X \le s+t \mid X > t) = \frac{P(X < s+t \cap X > t)}{P(X > t)} = \frac{P(t < X < s+t)}{P(X > t)} = \frac{e^{-\lambda t} - e^{-\lambda(s+t)}}{e^{-\lambda t}} = 1 - e^{-\lambda s} = F_X(s)$
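The memoryless identity is easy to confirm numerically from the survival function; the values of $\lambda$, $s$, and $t$ below are arbitrary:

```python
import math

def surv(x, lam):
    """S_X(x) for X ~ Exp(lam)."""
    return 1.0 if x <= 0 else math.exp(-lam * x)

lam, s, t = 0.7, 1.0, 3.0
cond = surv(s + t, lam) / surv(t, lam)  # P(X > s+t | X > t)

print(cond)          # equals ...
print(surv(s, lam))  # ... P(X > s): having already waited t changes nothing
```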
2.5 Mean/Expected Value ($\mu = E[X] = \frac{1}{\lambda}$)

$\mu = E[X] = \lim_{n \nearrow \infty} \int_{-n}^{n} x f_X(x)\,dx = \lim_{n \nearrow \infty} \left[ \int_{-n}^{0} x(0)\,dx + \int_{0}^{n} x(\lambda e^{-\lambda x})\,dx \right]$

Applying Integration-by-Parts (IbP):

$= (0) + \lambda \lim_{n \nearrow \infty} \left( \left[x\left(-\tfrac{1}{\lambda}e^{-\lambda x}\right)\right]_0^n - \int_0^n (1)\left(-\tfrac{1}{\lambda}e^{-\lambda x}\right)dx \right) = \lim_{n \nearrow \infty} \left( -[x e^{-\lambda x}]_0^n + \int_0^n e^{-\lambda x}\,dx \right)$

$= -\left((0) - (0)e^{-\lambda \cdot 0}\right) + \lim_{n \nearrow \infty} \int_0^{-\lambda n} e^u\left(\frac{du}{-\lambda}\right) = -\frac{1}{\lambda}\lim_{n \nearrow \infty}[e^u]_0^{-\lambda n} = -\frac{1}{\lambda}\lim_{n \nearrow \infty}(e^{-\lambda n} - e^0) = -\frac{1}{\lambda}(0 - 1) = \frac{1}{\lambda}$

2.6 Variance ($\sigma^2 = Var[X] = \frac{1}{\lambda^2} = \mu^2$)

$\sigma^2 = Var[X] = E[X^2] - \mu^2 = \lim_{n \nearrow \infty} \int_{-n}^{n} x^2 f_X(x)\,dx - \left(\frac{1}{\lambda}\right)^2 = \lim_{n \nearrow \infty} \left[ \int_{-n}^{0} x^2(0)\,dx + \int_{0}^{n} x^2(\lambda e^{-\lambda x})\,dx \right] - \frac{1}{\lambda^2}$

By IbP:

$= \lambda \lim_{n \nearrow \infty} \left( \left[x^2\left(-\tfrac{1}{\lambda}e^{-\lambda x}\right)\right]_0^n - \int_0^n (2x)\left(-\tfrac{1}{\lambda}e^{-\lambda x}\right)dx \right) - \frac{1}{\lambda^2} = -\lim_{n \nearrow \infty}[x^2 e^{-\lambda x}]_0^n + 2\lim_{n \nearrow \infty}\int_0^n x e^{-\lambda x}\,dx - \frac{1}{\lambda^2}$

By IbP a second time on the remaining integral:

$= \left((0) - (0)^2 e^{-\lambda(0)}\right) + 2\lim_{n \nearrow \infty}\left( \left[x\left(-\tfrac{1}{\lambda}e^{-\lambda x}\right)\right]_0^n + \frac{1}{\lambda}\int_0^n e^{-\lambda x}\,dx \right) - \frac{1}{\lambda^2} = \frac{2}{\lambda}\lim_{n \nearrow \infty}\int_0^n e^{-\lambda x}\,dx - \frac{1}{\lambda^2}$

$= \frac{2}{\lambda} \cdot \frac{1}{\lambda} - \frac{1}{\lambda^2} = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$

2.6.1 Standard Deviation ($\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{\lambda^2}} = \frac{1}{\lambda} = \mu$)

For the Exponential Distribution, the standard deviation equals the mean.
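The closed forms $\mu = 1/\lambda$, $\sigma^2 = 1/\lambda^2$, and $\sigma = \mu$ can be checked by simulation; the rate $\lambda = 2$ and the seed are illustrative:

```python
import random

random.seed(1)
lam = 2.0
n = 200_000
sample = [random.expovariate(lam) for _ in range(n)]

mean = sum(sample) / n
var = sum((x - mean) ** 2 for x in sample) / n

print(mean)  # near 1/lam = 0.5
print(var)   # near 1/lam^2 = 0.25
```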
3 Normal Distribution - $X \sim N(\mu, \sigma^2)$
The Normal Distribution is most utilized as an approximate distribution for sums and averages of a large number, $n$, of Independent, Identically Distributed (iid) r.v.s, a result derived from the Central Limit Theorem (CLT). As "Limit" is in the name, the CLT is a calculus limit: as $n \nearrow \infty$, such sums and averages approach a Normal Distribution. However, statistically there is no way to take a sample of infinite size, so only the approximation is utilized in statistics. Further, the assumptions described above can be relaxed.

1. Identical: The distributions need not be identical; it is merely the most common circumstance that statisticians find themselves in when taking samples, and the precision of the approximation will be very difficult to determine without this property.
2. Independent: The distributions need not be independent, but the dependency must continue to reduce and approach 0 as the sample size approaches infinity. The strictness of this requirement is beyond the scope of this course, so this course shall only utilize the CLT approximation for independent distributions.
Further, the Normal Distribution is known as a Location-Scale Distribution, meaning one need only know its location, in this case the measure of center given by the mean, and its scale, in this case the measure of spread given by the variance, the average squared distance away from the mean/center. Many Location-Scale Distributions, including the Normal Distribution, possess a very useful property of being able to standardize the distribution so it can be utilized in a simpler form:
Standard Normal Distribution: $Z = \frac{X-\mu}{\sigma} \sim Normal(0, 1)$

We will always transform any generic Normal Distribution $X$ to the Standard Normal Distribution, using $z = \frac{x-\mu}{\sigma}$ for purposes of computing probability, OR $x = \mu + z\sigma$ if one wishes to solve a percentile problem.
Finally, like the Uniform Distribution, the Normal Distribution has mean = median. These values also happen to be the same as the mode.
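The two directions of the standardization are each one line of code; the $N(100, 15^2)$ example values are illustrative:

```python
def standardize(x, mu, sigma):
    """z = (x - mu)/sigma: a generic Normal value to its Standard Normal value."""
    return (x - mu) / sigma

def unstandardize(z, mu, sigma):
    """x = mu + z*sigma: a Standard Normal value back to the generic scale."""
    return mu + z * sigma

# For X ~ N(100, 15^2), x = 130 sits two standard deviations above the mean.
print(standardize(130, 100, 15))    # 2.0
print(unstandardize(2.0, 100, 15))  # 130.0
```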
3.1 PDF - $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} : x \in \Re$

$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} : x \in \Re \;\Rightarrow\; f_{Z = \frac{X-\mu}{\sigma}}(z) = f_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} : z \in \Re$

3.1.1 Rules
1. $\sigma > 0 \Rightarrow \frac{1}{\sqrt{2\pi}\,\sigma} > 0$ and $e^{-\frac{(x-\mu)^2}{2\sigma^2}} > 0 \Rightarrow f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \ge 0, \forall x$

2. $\lim_{n \nearrow \infty} \int_{-n}^{n} f_X(x)\,dx = \lim_{n \nearrow \infty} \int_{-n}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \lim_{n \nearrow \infty} \frac{1}{\sqrt{2\pi}\,\sigma} \int_{\frac{-n-\mu}{\sigma}}^{\frac{n-\mu}{\sigma}} e^{-\frac{z^2}{2}}(\sigma\,dz) = \frac{1}{\sqrt{2\pi}} \lim_{n \nearrow \infty} \int_{-n}^{n} e^{-\frac{z^2}{2}}\,dz = \ldots = 1$

The completion of the "$\ldots$" part of the proof requires the use of a transformation to Polar Coordinates in 3 dimensions, prompting knowledge of integration over volumes from Multivariate Calculus, which is not required for this course. Thus, the rest of the proof is omitted. If interested, feel free to attempt the problem on your own or ask anyone working for STAT 350.

Therefore, $f_X$ is a valid PDF.
3.2 CDF - $F_X(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$

$F_X(x) = P(X \le x) = \lim_{n \nearrow \infty} \int_{-n}^{x} f_X(t)\,dt = \lim_{n \nearrow \infty} \int_{-n}^{x} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt = \frac{1}{\sqrt{2\pi}\,\sigma} \lim_{n \nearrow \infty} \int_{\frac{-n-\mu}{\sigma}}^{\frac{x-\mu}{\sigma}} e^{-\frac{z^2}{2}}(\sigma\,dz)$

$= \frac{1}{\sqrt{2\pi}} \lim_{n \nearrow \infty} \int_{-n}^{\frac{x-\mu}{\sigma}} e^{-\frac{z^2}{2}}\,dz = \lim_{n \nearrow \infty} \int_{-n}^{\frac{x-\mu}{\sigma}} f_Z(z)\,dz = F_Z\left(\frac{x-\mu}{\sigma}\right) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \ldots \text{Z-TABLE/R}$
3.3 Percentile, $p$

For $0 < p < 1$:

$F_X(x^*) = p \Leftrightarrow F_Z\left(\frac{x^*-\mu}{\sigma}\right) = F_Z(z^*) = p \Rightarrow$ use the Z-Table to find $z^*$ and use it to solve for $x^*$:

$z^* = \frac{x^*-\mu}{\sigma} \Rightarrow x^* = \mu + z^*\sigma$
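In place of a printed Z-Table, Python's standard library `statistics.NormalDist` (available since Python 3.8) can supply $z^*$; the $N(100, 15^2)$ and $p = 0.975$ values are illustrative:

```python
from statistics import NormalDist

mu, sigma, p = 100.0, 15.0, 0.975

z_star = NormalDist().inv_cdf(p)  # standard normal quantile, replacing the Z-Table
x_star = mu + z_star * sigma      # un-standardize to answer the percentile question

print(round(z_star, 2))  # 1.96
print(x_star)            # about 129.4
```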
3.4 Probability

$P(X > x) = 1 - P((X > x)^C) = 1 - P(X \le x) = 1 - F_X(x) = 1 - F_Z\left(\frac{x-\mu}{\sigma}\right) = 1 - \Phi\left(\frac{x-\mu}{\sigma}\right) = \ldots \text{Z-TABLE/R}$

AND, for $x_1 < x_2$:

$P(x_1 < X < x_2) = P\left(\frac{x_1-\mu}{\sigma} < Z < \frac{x_2-\mu}{\sigma}\right) = F_Z\left(\frac{x_2-\mu}{\sigma}\right) - F_Z\left(\frac{x_1-\mu}{\sigma}\right) = \Phi\left(\frac{x_2-\mu}{\sigma}\right) - \Phi\left(\frac{x_1-\mu}{\sigma}\right) = \ldots \text{Z-TABLE/R}$
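The $\Phi$ differences can likewise be computed with `statistics.NormalDist`; the $N(100, 15^2)$ example below recovers the familiar 68% rule for one standard deviation either side of the mean:

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0
x1, x2 = 85.0, 115.0  # mu - sigma and mu + sigma

Z = NormalDist()  # standard normal; Z.cdf plays the role of Phi
prob = Z.cdf((x2 - mu) / sigma) - Z.cdf((x1 - mu) / sigma)

print(prob)  # about 0.6827
```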
3.5 Mean/Expected Value ($\mu = E[X]$)

$E[X] = \lim_{n \nearrow \infty} \int_{-n}^{n} x f_X(x)\,dx = \lim_{n \nearrow \infty} \int_{-n}^{n} \left(x - (\mu - \mu)\right)\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$

$= \frac{1}{\sqrt{2\pi}\,\sigma} \lim_{n \nearrow \infty} \int_{-n}^{n} (x-\mu)\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx + \mu \lim_{n \nearrow \infty} \int_{-n}^{n} f_X(x)\,dx$

Substituting $z = -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2$, so that $dz = -\frac{x-\mu}{\sigma^2}\,dx$, in the first integral:

$= \frac{1}{\sqrt{2\pi}\,\sigma} \lim_{n \nearrow \infty} \int_{-\frac{1}{2}\left(\frac{-n-\mu}{\sigma}\right)^2}^{-\frac{1}{2}\left(\frac{n-\mu}{\sigma}\right)^2} (x-\mu)\, e^{z} \left(-\frac{\sigma^2}{x-\mu}\right) dz + \mu(1) = -\frac{\sigma}{\sqrt{2\pi}} \lim_{n \nearrow \infty} \left[e^{z}\right]_{-\frac{1}{2}\left(\frac{-n-\mu}{\sigma}\right)^2}^{-\frac{1}{2}\left(\frac{n-\mu}{\sigma}\right)^2} + \mu = -\frac{\sigma}{\sqrt{2\pi}}(0 - 0) + \mu = \mu$
This mathematics is thoroughly unnecessary though, as in defining the Normal Distribution the mean is necessarily defined as well.
3.6 Variance ($\sigma^2 = Var[X]$)

$\sigma^2 = Var[X] = \lim_{n \nearrow \infty} \int_{-n}^{n} (x-\mu)^2 f_X(x)\,dx = \lim_{n \nearrow \infty} \int_{-n}^{n} (x-\mu)^2 \left(\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right) dx = \ldots = \sigma^2$
Again, the completion of the proof requires mathematical knowledge in the area of probability known as Moment Generating Functions (MGFs) or Characteristic Functions, which is not required for this course. Thus, the rest of the proof is omitted.

Further, this mathematics is again thoroughly unnecessary, as in defining the Normal Distribution the variance is necessarily defined as well.