Math 105 - Notes on Continuous Random Variables

1 Probability and Distributions

1.1 Probability Density Functions
Definitions:
• A variable X whose value depends on the outcome of a random process is
called a random variable.
ex: X is the outcome of a coin toss
ex: X is the 1st number drawn in the next 6/49 lottery draw
ex: X is the age of an individual chosen at random from the Vancouver population
• A continuous random variable is a random variable that can assume any
value in an interval.
ex: X is the length of time until the next time you are sick.
ex: X is the weight of someone chosen at random from the Canadian population.
For continuous random variables, we are often interested in the probability P(c < X < d) that the variable takes a value between two numbers c and d.
• ex: What is the probability that you are not sick for at least 1 month? With X measured in months, we write this as P(1 < X < ∞).
• ex: What is the probability that the person chosen at random weighs less than 100 pounds? With X measured in pounds, we write this as P(0 < X < 100).
To find these probabilities, we need to know the probability density function for
the random variable.
Definition: If X is a continuous random variable, then a function f(x) with domain (a, b) is called a probability density function for X if and only if
1. $f(x) \ge 0$ for all x in (a, b),
2. $\int_a^b f(x)\,dx = 1$, and
3. for any $a \le c < d \le b$,
$$P(c < X < d) = \int_c^d f(x)\,dx.$$
Notes:
1. a and b could possibly be −∞ and ∞.
2. f(x) is NOT a probability (i.e. f(1) is not the probability that X = 1); it is the probability density.
3. P(c < X < d) is the area under the curve f (x) between c and d.
4. For a continuous random variable, the probability of any single value is
zero. ie. P(X = c) = 0. This also means that P(c ≤ X ≤ d) = P(c < X <
d) = P(c ≤ X < d) = P(c < X ≤ d).
5. The domain of f(x) may include both endpoints, [a, b], neither endpoint, (a, b), or just one of them.
EXAMPLE 1
The distribution f (x) of people’s ages in Canada is shown below, where 0 ≤ x ≤
100 is age in years and f (x) measures the probability density.
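[Figure: graph of y = f(x) for 0 ≤ x ≤ 100, with the vertical axis running from 0 to 0.02. The density is constant for 0 ≤ x ≤ 60 and decreases linearly to 0 at x = 100; the formula is given in Example 3.]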
(a) If m = f(0), find the exact value of m.
(b) Find the fraction of the population between 50 and 60 years old.
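SOLUTION
(a) The total area under the density must equal 1. Since f(x) = 0 outside [0, 100], and the region under the graph is a rectangle of width 60 and height m together with a triangle of base 40 and height m,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \;\Rightarrow\; \int_0^{100} f(x)\,dx = 1 \;\Rightarrow\; 60(m) + \frac{1}{2}(40)(m) = 1 \;\Rightarrow\; m = \frac{1}{80}.$$
(b)
$$P(50 < X < 60) = \int_{50}^{60} f(x)\,dx = 10 \cdot \frac{1}{80} = \frac{1}{8}.$$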
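The area argument in Example 1 can be checked with exact rational arithmetic. The following is a minimal Python sketch, assuming the shape of the graph is the one given by the formula in Example 3 (constant on [0, 60], then linear down to 0 at x = 100). The helper names `density` and `trapezoid_area` are illustrative and not part of the notes.

```python
from fractions import Fraction

def density(x, m):
    """Assumed shape of f: constant m on [0, 60], then linear down to 0 at x = 100."""
    if 0 <= x <= 60:
        return m
    if 60 < x <= 100:
        return m * Fraction(100 - x, 40)
    return Fraction(0)

def trapezoid_area(f, left, right):
    """Exact area under a function that is linear on [left, right]."""
    return Fraction(right - left) * (f(left) + f(right)) / 2

def unit_density(x):
    """The density with the unknown height m replaced by 1."""
    return density(x, Fraction(1))

# With m = 1 the total area is 60 + (1/2)(40) = 80, so the true m is 1/80.
unit_area = trapezoid_area(unit_density, 0, 60) + trapezoid_area(unit_density, 60, 100)
m = 1 / unit_area

# Part (b): area under the density between x = 50 and x = 60.
prob_50_60 = trapezoid_area(lambda x: density(x, m), 50, 60)

print(m, prob_50_60)
```

Using `Fraction` avoids rounding, so the printed values can be compared directly with the hand computation.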
EXAMPLE 2
Consider the probability density function
$$f(x) = \frac{c}{(x+1)^2}, \qquad x \ge 0.$$
Find the necessary value of c.
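SOLUTION
Set $\int_0^\infty f(x)\,dx = 1$ and then solve for c.
$$\begin{aligned}
1 = \int_0^\infty \frac{c}{(x+1)^2}\,dx &= c \cdot \lim_{b\to\infty} \int_0^b \frac{1}{(x+1)^2}\,dx \\
&= c \lim_{b\to\infty} \left[-\frac{1}{x+1}\right]_0^b \\
&= c \lim_{b\to\infty} \left(-\frac{1}{b+1} + \frac{1}{0+1}\right) \\
&= c.
\end{aligned}$$
Therefore, c = 1.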
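The limit in Example 2 can be illustrated numerically by integrating 1/(x+1)^2 over [0, b] for increasing b. This is only a sketch: the `simpson` helper is a hand-written composite Simpson's rule (an assumption of this example, not something the notes use), and the cutoffs 10, 100 and 1000 are arbitrary choices.

```python
def simpson(f, a, b, n=100_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def partial_integral(b):
    """Numerical value of the integral of 1/(x+1)^2 from 0 to b."""
    return simpson(lambda x: 1.0 / (x + 1.0) ** 2, 0.0, b)

for b in (10, 100, 1000):
    approx = partial_integral(b)
    exact = 1 - 1 / (b + 1)  # antiderivative -1/(x+1) evaluated from 0 to b
    print(b, round(approx, 6), round(exact, 6))
```

Both columns should approach 1 as b grows, which is why c = 1 makes the total probability equal to 1.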
1.2 Cumulative Distribution Functions
Definition: The cumulative distribution function F(x) for the continuous random variable X with density function f(t) whose range is (a, b) or [a, b] is defined by
$$F(x) = P(X \le x) = \int_a^x f(t)\,dt.$$
Note:
• By the Fundamental Theorem of Calculus, $F'(x) = f(x)$. (See text pg. 721 for proof.)
Properties of the Cumulative Distribution Function
1. F(x) is non-decreasing, since $F'(x) = f(x) \ge 0$.
2. $\lim_{x\to b} F(x) = 1$ and $\lim_{x\to a} F(x) = 0$.
3. For $a \le c < d \le b$, $P(c \le X \le d) = \int_c^d f(t)\,dt = F(d) - F(c)$ (by the FTC).
EXAMPLE 3
Consider again the density function f(x) for people's ages in Canada:
$$f(x) = \begin{cases} \dfrac{1}{80}, & 0 \le x \le 60 \\[4pt] \dfrac{100-x}{3200}, & 60 \le x \le 100 \\[4pt] 0, & \text{otherwise} \end{cases}$$
(a) Find the corresponding cumulative distribution function.
(b) Using the cumulative distribution function, find the probability that a Canadian
is between 50 and 60 years of age.
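SOLUTION
(a) $F(x) = \int_0^x f(t)\,dt$, so for $0 \le x \le 60$,
$$F(x) = \int_0^x \frac{1}{80}\,dt = \left[\frac{t}{80}\right]_0^x = \frac{x}{80}.$$
For $60 \le x \le 100$,
$$\begin{aligned}
F(x) = \int_0^x f(t)\,dt &= \int_0^{60} f(t)\,dt + \int_{60}^x f(t)\,dt \\
&= \frac{60}{80} + \int_{60}^x \frac{100-t}{3200}\,dt \\
&= \frac{3}{4} + \left[\frac{100t - \frac{t^2}{2}}{3200}\right]_{60}^x \\
&= \frac{3}{4} + \frac{1}{3200}\left(100x - \frac{x^2}{2} - \left(6000 - \frac{60^2}{2}\right)\right) \\
&= \frac{3}{4} + \frac{1}{3200}\left(100x - \frac{x^2}{2} - 4200\right) = \frac{x}{32} - \frac{x^2}{6400} - \frac{9}{16}.
\end{aligned}$$
So we have
$$F(x) = \begin{cases} 0, & x < 0 \\[2pt] \dfrac{x}{80}, & 0 \le x \le 60 \\[4pt] \dfrac{x}{32} - \dfrac{x^2}{6400} - \dfrac{9}{16}, & 60 < x \le 100 \\[4pt] 1, & x > 100. \end{cases}$$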
(b)
$$P(50 < X < 60) = F(60) - F(50) = \frac{60}{80} - \frac{50}{80} = \frac{10}{80} = \frac{1}{8}.$$
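The piecewise cumulative distribution function from Example 3 translates directly into code. The sketch below uses exact fractions so the result of part (b) can be reproduced without rounding; the function name `cdf` is an illustrative choice.

```python
from fractions import Fraction

def cdf(x):
    """Cumulative distribution function F(x) of the age density in Example 3."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    if x <= 60:
        return x / 80
    if x <= 100:
        return x / 32 - x**2 / 6400 - Fraction(9, 16)
    return Fraction(1)

# Part (b): P(50 < X < 60) = F(60) - F(50)
print(cdf(60) - cdf(50))

# Sanity checks: the two formulas agree at x = 60, and F reaches 1 at x = 100.
print(cdf(60), cdf(100))
```

Checking that the pieces agree at x = 60 and that F(100) = 1 is a quick way to catch arithmetic slips in the constant term.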
2 Statistical Measures of Probability Distributions

2.1 Median
The median of a random variable X that has probability density function f(x) is a number m such that
$$\int_{-\infty}^m f(x)\,dx = \frac{1}{2}.$$
What does this mean? The median m is the number for which the probability is exactly $\frac{1}{2}$ that the random variable will have a value greater than m, and $\frac{1}{2}$ that it will have a value less than m.
For example, suppose we look at a population of people. Let X be the age (in years) of a randomly chosen person from this population, and let m be the median age of the population. This means that half the population is less than or equal to m years old, and half is greater than or equal to m years old.
EXAMPLE 4
Find the median age for the age density function
$$f(x) = \begin{cases} \dfrac{1}{80}, & 0 \le x \le 60 \\[4pt] \dfrac{100-x}{3200}, & 60 \le x \le 100 \end{cases}$$
SOLUTION
We want to find m such that
$$\frac{1}{2} = \int_0^m f(x)\,dx.$$
First, note that
$$P(X < 60) = \int_0^{60} \frac{1}{80}\,dx = \frac{60}{80} = \frac{3}{4} > \frac{1}{2}.$$
So we know that m must be less than 60, and we just solve
$$\frac{1}{2} = \int_0^m \frac{1}{80}\,dx = \left[\frac{x}{80}\right]_0^m = \frac{m}{80},$$
from which we get that m = 40 is the median age.
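When the equation F(m) = 1/2 cannot be solved by hand, the median can be located numerically. The sketch below uses bisection, which is a swapped-in technique rather than the method of the notes; it relies only on F being non-decreasing. The function names `cdf` and `quantile` and the tolerance are illustrative assumptions.

```python
def cdf(x):
    """CDF of the age distribution (formula from Example 3), using floats."""
    if x < 0:
        return 0.0
    if x <= 60:
        return x / 80
    if x <= 100:
        return x / 32 - x**2 / 6400 - 9 / 16
    return 1.0

def quantile(p, lo=0.0, hi=100.0, tol=1e-10):
    """Find x with cdf(x) = p by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid   # the target lies to the right of mid
        else:
            hi = mid   # the target lies at or to the left of mid
    return (lo + hi) / 2

median = quantile(0.5)
print(round(median, 6))
```

The result should agree with m = 40 up to the chosen tolerance. Calling `quantile` with another value of p gives other percentiles of the same distribution.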
2.2 Mean
The mean value (average) of n numbers $a_1, a_2, a_3, \dots, a_n$ is given by
$$\frac{a_1 + a_2 + a_3 + \cdots + a_n}{n}.$$
Definition:
For a random variable X with density function f(x), a ≤ x ≤ b, the mean µ, or expected value E(X), is given by
$$\mu = E(X) = \int_a^b x f(x)\,dx.$$
This is the average value of the random variable X.
Notes:
1. The mean and median are usually not equal (see the examples that follow).
2. Also, don’t confuse the average value of a function with the average value
(mean) of a random variable.
EXAMPLE 5
Find the mean value of the probability density function
$$f(x) = \frac{2}{\sqrt{\pi}}\, e^{-x^2}, \qquad x \ge 0.$$
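SOLUTION
$$\mu = \int_0^\infty x f(x)\,dx = \int_0^\infty \frac{2x}{\sqrt{\pi}}\, e^{-x^2}\,dx = \frac{2}{\sqrt{\pi}} \cdot \lim_{b\to\infty} \int_0^b x e^{-x^2}\,dx.$$
Integrate this using the substitution $u = x^2$, so $du = 2x\,dx$ or $dx = \frac{du}{2x}$. Note that when x = 0, u = 0 and when x = b, $u = b^2$. So we have
$$\begin{aligned}
\mu &= \frac{1}{\sqrt{\pi}} \cdot \lim_{b\to\infty} \int_0^{b^2} e^{-u}\,du \\
&= \frac{1}{\sqrt{\pi}} \cdot \lim_{b\to\infty} \left[-e^{-b^2} + e^0\right] \\
&= \frac{1}{\sqrt{\pi}} \cdot 1 = \frac{1}{\sqrt{\pi}}.
\end{aligned}$$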
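The value of the mean in Example 5 can be cross-checked by numerical integration. This is a rough sketch: the `simpson` helper is a hand-written composite Simpson's rule, and the cutoff `upper = 10.0` is an assumption standing in for the infinite upper limit (the integrand is negligible beyond it).

```python
import math

def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def density(x):
    """Density from Example 5: f(x) = (2 / sqrt(pi)) * exp(-x^2) for x >= 0."""
    return 2 / math.sqrt(math.pi) * math.exp(-x * x)

upper = 10.0  # assumed cutoff replacing the infinite upper limit

total_prob = simpson(density, 0.0, upper)             # should be close to 1
mean = simpson(lambda x: x * density(x), 0.0, upper)  # should be close to 1/sqrt(pi)

print(total_prob, mean, 1 / math.sqrt(math.pi))
```

Checking that the density integrates to 1 first is a useful guard against a mistyped formula before trusting the computed mean.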
2.3 Variance and Standard Deviation
The variance and standard deviation of a random variable X with a given probability density function f(x) measure how much, on average, the actual value of the random variable differs from the mean µ. For example, if X is the temperature in Vancouver on a random day in January of last year, we would expect most of the actual temperatures on those January days to be near the mean (so low variance and standard deviation). If, on the other hand, X is the temperature in Vancouver on a random day from the whole of last year, we would expect many days to differ a lot from the mean temperature over the year (so high variance and standard deviation).
Definition: The variance, Var(X) or $\sigma^2$, of a random variable X with probability density function f(x), a ≤ x ≤ b, and mean µ is
$$\mathrm{Var}(X) = \sigma^2 = \int_a^b (x-\mu)^2 f(x)\,dx.$$
Note:
1. We can actually simplify this:
$$\begin{aligned}
\sigma^2 = \int_a^b (x-\mu)^2 f(x)\,dx &= \int_a^b (x^2 - 2x\mu + \mu^2) f(x)\,dx \\
&= \int_a^b x^2 f(x)\,dx - 2\mu \int_a^b x f(x)\,dx + \mu^2 \int_a^b f(x)\,dx \\
&= \int_a^b x^2 f(x)\,dx - 2\mu \cdot \mu + \mu^2 \cdot 1 \\
&= \int_a^b x^2 f(x)\,dx - \mu^2.
\end{aligned}$$
This last form is easier to use than the original formula, so we will often use the result
$$\mathrm{Var}(X) = \sigma^2 = \int_a^b x^2 f(x)\,dx - \mu^2.$$
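The equivalence of the two variance formulas can be illustrated numerically. The sketch below applies both to the age density from Example 3, chosen here only because its domain [0, 100] is finite; the `simpson` helper and the variable names are assumptions of this example, and no particular numerical value is claimed by the notes.

```python
def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def density(x):
    """Age density from Example 3."""
    if 0 <= x <= 60:
        return 1 / 80
    if 60 < x <= 100:
        return (100 - x) / 3200
    return 0.0

a, b = 0.0, 100.0
mu = simpson(lambda x: x * density(x), a, b)

# Definition: integral of (x - mu)^2 f(x)
var_definition = simpson(lambda x: (x - mu) ** 2 * density(x), a, b)

# Shortcut: integral of x^2 f(x), minus mu^2
var_shortcut = simpson(lambda x: x * x * density(x), a, b) - mu ** 2

print(mu, var_definition, var_shortcut)
```

The two variance values should agree up to floating-point error, which is the content of the simplification above. The shortcut needs only one extra integral once µ is known.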
Definition: The standard deviation, σ, of the random variable X is given by
$$\sigma = \sqrt{\sigma^2} = \sqrt{\mathrm{Var}(X)}.$$
EXAMPLE 6
The random variable X has probability density function f(x), where
$$f(x) = \begin{cases} \dfrac{4}{x^5}, & x \ge 1 \\[4pt] 0, & x < 1 \end{cases}$$
(a) Find the median of X.
(b) Find the mean of X.
(c) Find the variance and standard deviation of X.
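SOLUTION
(a)
$$\frac{1}{2} = \int_{-\infty}^m f(x)\,dx = \int_1^m \frac{4}{x^5}\,dx = \left[-x^{-4}\right]_1^m = -\frac{1}{m^4} + 1.$$
Solving this for m we get $\frac{1}{m^4} = \frac{1}{2}$. Hence the median is $m = 2^{1/4}$.
(b)
$$\begin{aligned}
\mu = \int_{-\infty}^{\infty} x f(x)\,dx = \int_1^\infty \frac{4}{x^4}\,dx &= \lim_{b\to\infty} 4 \int_1^b x^{-4}\,dx = 4 \cdot \lim_{b\to\infty} \left[-\frac{x^{-3}}{3}\right]_1^b \\
&= -\frac{4}{3} \lim_{b\to\infty} \left(\frac{1}{b^3} - 1\right) \\
&= \frac{4}{3}.
\end{aligned}$$
(c)
$$\begin{aligned}
\sigma^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 &= \int_1^\infty \frac{4}{x^3}\,dx - \frac{16}{9} \\
&= 4 \cdot \lim_{b\to\infty} \int_1^b x^{-3}\,dx - \frac{16}{9} = -2 \cdot \lim_{b\to\infty} \left[x^{-2}\right]_1^b - \frac{16}{9} \\
&= -2 \cdot \lim_{b\to\infty} \left(\frac{1}{b^2} - 1\right) - \frac{16}{9} \\
&= 2 - \frac{16}{9} = \frac{2}{9}.
\end{aligned}$$
Then we have
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{2}{9}} = \frac{\sqrt{2}}{3}.$$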
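The limits in Example 6 can be watched converging by evaluating the antiderivatives at a finite upper limit b. The sketch below uses exact fractions for the partial integrals; the helper names and the sample values of b are illustrative assumptions.

```python
from fractions import Fraction

def partial_mean(b):
    """Integral of x * 4/x^5 from 1 to b, from the antiderivative -4/(3 x^3)."""
    b = Fraction(b)
    return Fraction(4, 3) * (1 - 1 / b**3)

def partial_second_moment(b):
    """Integral of x^2 * 4/x^5 from 1 to b, from the antiderivative -2/x^2."""
    b = Fraction(b)
    return 2 * (1 - 1 / b**2)

mu = Fraction(4, 3)  # limiting value of the mean from part (b)

for b in (10, 100, 1000):
    mu_b = partial_mean(b)
    var_b = partial_second_moment(b) - mu**2
    print(b, float(mu_b), float(var_b))

# Part (a): the CDF is F(m) = 1 - 1/m^4, which should be 1/2 at the median.
median = 2 ** 0.25
print(1 - median ** -4)
```

As b grows, the second column approaches 4/3 and the third approaches 2/9, matching parts (b) and (c).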
EXAMPLE 7
The number of sunny December days that a person in Vancouver witnesses is modeled by the cumulative distribution function
$$F(x) = 1 - e^{-x} - x e^{-x}, \qquad x \ge 0.$$
(a) What is the probability that a person in Vancouver witnesses more than 5 sunny
days in December?
(b) Find the density function f (x) that corresponds to F(x).
(c) What is the mean of f (x)?
(d) What is the standard deviation of f (x)?
SOLUTION
(a)
$$P(X \le 5) = F(5) = 1 - e^{-5} - 5e^{-5} = \int_0^5 f(x)\,dx.$$
Since
$$P(X > 5) = \int_5^\infty f(x)\,dx = \int_0^\infty f(x)\,dx - \int_0^5 f(x)\,dx = 1 - F(5),$$
we have that
$$P(X > 5) = 1 - (1 - e^{-5} - 5e^{-5}) = e^{-5} + 5e^{-5} = \frac{6}{e^5}.$$
(b)
$$f(x) = F'(x) = e^{-x} + x e^{-x} - e^{-x} = x e^{-x}.$$
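(c) First we must find µ:
$$\begin{aligned}
\mu = \int_0^\infty x f(x)\,dx &= \lim_{b\to\infty} \int_0^b x^2 e^{-x}\,dx \\
&= \lim_{b\to\infty} \left( \left[-x^2 e^{-x}\right]_0^b + 2 \int_0^b x e^{-x}\,dx \right) \\
&= \lim_{b\to\infty} \left( -b^2 e^{-b} + 2\left[-x e^{-x}\right]_0^b + 2 \int_0^b e^{-x}\,dx \right) \\
&= \lim_{b\to\infty} \left( -b^2 e^{-b} - 2b e^{-b} - 2e^{-b} + 2 \right) \\
&= 2.
\end{aligned}$$
(d) Now find $\sigma^2$:
$$\begin{aligned}
\sigma^2 = \int_0^\infty x^2 f(x)\,dx - \mu^2 &= \int_0^\infty x^3 e^{-x}\,dx - 4 \\
&= \lim_{b\to\infty} \int_0^b x^3 e^{-x}\,dx - 4 \\
&= \lim_{b\to\infty} \left( \left[-x^3 e^{-x}\right]_0^b + 3 \int_0^b x^2 e^{-x}\,dx \right) - 4 \\
&= \lim_{b\to\infty} \left(-b^3 e^{-b}\right) + 3\mu - 4 \\
&= 6 - 4 = 2.
\end{aligned}$$
Here we used the fact, from part (c), that $\int_0^\infty x^2 e^{-x}\,dx = \mu = 2$.
Finally, we have $\sigma = \sqrt{\sigma^2} = \sqrt{2}$.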
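All four parts of Example 7 can be cross-checked numerically. The sketch below is not the method of the notes: it uses a central difference to approximate F'(x), a hand-written Simpson's rule for the integrals, and an assumed cutoff `upper = 60.0` in place of the infinite upper limit.

```python
import math

def cdf(x):
    """F(x) = 1 - e^(-x) - x e^(-x) for x >= 0."""
    return 1 - math.exp(-x) - x * math.exp(-x)

def density(x):
    """f(x) = x e^(-x), the derivative of F found in part (b)."""
    return x * math.exp(-x)

def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# (a) P(X > 5) = 1 - F(5)
p_more_than_5 = 1 - cdf(5)

# (b) a central difference at x = 2 should match f(2)
step = 1e-6
slope_at_2 = (cdf(2 + step) - cdf(2 - step)) / (2 * step)

# (c), (d) mean, variance and standard deviation
upper = 60.0  # assumed cutoff; the integrands are negligible beyond it
mean = simpson(lambda x: x * density(x), 0.0, upper)
variance = simpson(lambda x: x * x * density(x), 0.0, upper) - mean ** 2
std_dev = math.sqrt(variance)

print(p_more_than_5, 6 / math.e ** 5)
print(slope_at_2, density(2))
print(mean, variance, std_dev)
```

Each printed pair should agree closely, and the last line should be near 2, 2 and the square root of 2.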
Remark: As in this example, you may often want to use the fact that
P(X ≥ c) = 1 − P(X ≤ c)
and vice versa.