Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
June 19, 2008
Expectations
Definition
Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by E(X) or µ_X, is

E(X) = µ_X = Σ_{x∈D} x · p(x)

Example (Problem 30)
A group of individuals who have automobile insurance from a certain company is randomly selected. Let Y be the number of moving violations for which the individual was cited during the last 3 years. The pmf of Y is

y      0     1     2     3
p(y)   0.60  0.25  0.10  0.05

Then the expected value of moving violations for that group is

µ_Y = E(Y) = 0 · 0.60 + 1 · 0.25 + 2 · 0.10 + 3 · 0.05 = 0.60
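The weighted sum above is easy to check numerically. A minimal sketch in Python (the `pmf` dictionary and the `expected_value` helper are illustrative names, not from the text):

```python
# Expected value of a discrete rv: E(X) = sum of x * p(x) over all x in D.
# The pmf below is the moving-violations table from Problem 30.
pmf = {0: 0.60, 1: 0.25, 2: 0.10, 3: 0.05}

def expected_value(pmf):
    """Weighted average of the values by their probabilities."""
    return sum(x * p for x, p in pmf.items())

mu = expected_value(pmf)
print(mu)  # approximately 0.60
```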
For the same pmf:

y      0     1     2     3
p(y)   0.60  0.25  0.10  0.05

Assume the total number of individuals in that group is 100. Then there are 60 individuals with no moving violations, 25 with 1 moving violation, 10 with 2 moving violations, and 5 with 3 moving violations.
The population mean is calculated as

µ = (0 · 60 + 1 · 25 + 2 · 10 + 3 · 5) / 100 = 0.60

Regrouping,

µ = 0 · (60/100) + 1 · (25/100) + 2 · (10/100) + 3 · (5/100)
  = 0 · 0.60 + 1 · 0.25 + 2 · 0.10 + 3 · 0.05
  = 0.60

The population size is irrelevant if we know the pmf!
Examples:
Let X be a Bernoulli rv with pmf

p(x) = 1 − p if x = 0;  p if x = 1;  0 if x ≠ 0 or 1

Then the expected value of X is

E(X) = 0 · p(0) + 1 · p(1) = p

We see that the expected value of a Bernoulli rv X is just the probability that X takes on the value 1.
Examples:
Consider the card-drawing example again, and assume we have infinitely many cards this time. Let X = the number of draws until we get a ♠. If the probability of getting a ♠ is α, then the pmf of X is

p(x) = α(1 − α)^{x−1} for x = 1, 2, 3, . . . ;  0 otherwise

The expected value of X is

E(X) = Σ_{x∈D} x · p(x) = Σ_{x=1}^∞ x α(1 − α)^{x−1} = α Σ_{x=1}^∞ [− d/dα (1 − α)^x]

Interchanging the sum and the derivative,

E(X) = α {− d/dα [Σ_{x=1}^∞ (1 − α)^x]} = α {− d/dα ((1 − α)/α)} = α · (1/α²) = 1/α
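A quick numerical sanity check of E(X) = 1/α, truncating the series at a large number of terms (a sketch; the function name is ours):

```python
# Truncated series for the geometric mean: sum of x * alpha * (1-alpha)^(x-1).
def geometric_mean_partial(alpha, terms=10_000):
    return sum(x * alpha * (1 - alpha) ** (x - 1) for x in range(1, terms + 1))

alpha = 0.25  # probability of drawing a spade
approx = geometric_mean_partial(alpha)
print(approx)  # very close to 1/alpha = 4
```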
Example 3.20
Let X be the number of interviews a student has prior to getting a job. The pmf of X is

p(x) = k/x² for x = 1, 2, 3, . . . ;  0 otherwise

where k is chosen so that Σ_{x=1}^∞ (k/x²) = 1. (It can be shown that Σ_{x=1}^∞ (1/x²) < ∞, which implies that such a k exists.)
The expected value of X is

µ = E(X) = Σ_{x=1}^∞ x · (k/x²) = k Σ_{x=1}^∞ (1/x) = ∞

The expected value is NOT finite!
Heavy tail: a distribution with a large amount of probability far from µ.
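The divergence can be seen by watching the partial sums of k Σ 1/x grow. A sketch, using the fact that k = 6/π² is the constant that makes Σ k/x² equal 1:

```python
import math

# Partial sums of E(X) = k * sum(1/x): the harmonic series diverges,
# so these partial sums grow without bound even though the pmf sums to 1.
k = 6 / math.pi**2

def partial_mean(n):
    return k * sum(1 / x for x in range(1, n + 1))

print(partial_mean(10**2), partial_mean(10**4), partial_mean(10**6))  # keeps growing
```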
Example (Problem 38)
Let X = the outcome when a fair die is rolled once. If before the die is rolled you are offered either 1/3.5 dollars or 1/X dollars, would you accept the guaranteed amount or would you gamble?

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6
1/x    1    1/2  1/3  1/4  1/5  1/6

Then the expected dollars from gambling is

E(1/X) = Σ_{x=1}^6 (1/x) · p(x)
       = 1 · (1/6) + (1/2) · (1/6) + · · · + (1/6) · (1/6)
       = 49/120 > 1/3.5

so the gamble pays more on average than the guaranteed amount.
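Exact arithmetic with fractions confirms the comparison (a sketch using Python's standard `fractions` module):

```python
from fractions import Fraction

# E(1/X) for a fair die: each outcome x = 1..6 has probability 1/6.
payoff = sum(Fraction(1, x) * Fraction(1, 6) for x in range(1, 7))
print(payoff)                   # 49/120
print(payoff > Fraction(2, 7))  # 1/3.5 = 2/7; True: the gamble pays more on average
```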
Proposition
If the rv X has a set of possible values D and pmf p(x), then the expected value of any function h(X), denoted by E[h(X)] or µ_{h(X)}, is computed by

E[h(X)] = Σ_{x∈D} h(x) · p(x)
Example 3.23
A computer store has purchased three computers of a certain type at $500 apiece. It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any computers still unsold after a specified period at $200 apiece.
Let X denote the number of computers sold, and suppose that p(0) = 0.1, p(1) = 0.2, p(2) = 0.3, p(3) = 0.4.
Let h(X) denote the profit associated with selling X units; then

h(X) = revenue − cost = 1000X + 200(3 − X) − 1500 = 800X − 900.

The expected profit is

E[h(X)] = h(0) · p(0) + h(1) · p(1) + h(2) · p(2) + h(3) · p(3)
        = (−900)(0.1) + (−100)(0.2) + (700)(0.3) + (1500)(0.4)
        = 700
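The same expected profit in a few lines of Python (a sketch; the names are ours):

```python
# Expected profit E[h(X)] for the computer-store example, h(x) = 800x - 900.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def h(x):
    return 800 * x - 900  # profit when x computers are sold

expected_profit = sum(h(x) * p for x, p in pmf.items())
print(expected_profit)  # approximately 700
```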
Proposition
E(aX + b) = a · E(X) + b
(Or, using alternative notation, µ_{aX+b} = a · µ_X + b.)
e.g. for the previous example, E(X) = 2, so

E[h(X)] = E(800X − 900) = 800 · E(X) − 900 = 800 · 2 − 900 = 700

Corollary
1. For any constant a, E(aX) = a · E(X).
2. For any constant b, E(X + b) = E(X) + b.
Definition
Let X have pmf p(x) and expected value µ. Then the variance of X, denoted by V(X) or σ_X², or just σ², is

V(X) = Σ_{x∈D} (x − µ)² · p(x) = E[(X − µ)²]

The standard deviation (SD) of X is

σ_X = √(σ_X²)
Example:
For the previous example, the pmf is given as

x      0    1    2    3
p(x)   0.1  0.2  0.3  0.4

then the variance of X is (with µ = E(X) = 2)

V(X) = σ² = Σ_{x=0}^3 (x − 2)² · p(x)
     = (0 − 2)²(0.1) + (1 − 2)²(0.2) + (2 − 2)²(0.3) + (3 − 2)²(0.4)
     = 1
Recall that for the sample variance s², we have

s² = S_xx / (n − 1) = [Σ x_i² − (Σ x_i)²/n] / (n − 1)

Proposition

V(X) = σ² = [Σ_{x∈D} x² · p(x)] − µ² = E(X²) − [E(X)]²

e.g. for the previous example, the pmf is given as

x      0    1    2    3
p(x)   0.1  0.2  0.3  0.4

Then V(X) = E(X²) − [E(X)]² = 1² · 0.2 + 2² · 0.3 + 3² · 0.4 − (2)² = 1
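Both formulas give the same number, which is easy to verify directly (a sketch; the helper names are ours):

```python
# Variance two ways: the definition E[(X - mu)^2] and the shortcut E(X^2) - E(X)^2.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

mu = sum(x * p for x, p in pmf.items())                         # E(X) = 2
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mu**2

print(mu, var_definition, var_shortcut)  # 2.0 and two values near 1.0
```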
Proposition
If h(X) is a function of a rv X, then

V[h(X)] = σ_{h(X)}² = Σ_{x∈D} {h(x) − E[h(X)]}² · p(x) = E[h(X)²] − {E[h(X)]}²

If h(X) is linear, i.e. h(X) = aX + b for some nonrandom constants a and b, then

V(aX + b) = σ_{aX+b}² = a² · σ_X²  and  σ_{aX+b} = |a| · σ_X

In particular,

σ_{aX} = |a| · σ_X,  σ_{X+b} = σ_X
Example 3.23 continued
A computer store has purchased three computers of a certain type at $500 apiece. It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any computers still unsold after a specified period at $200 apiece. Let X denote the number of computers sold, and suppose that p(0) = 0.1, p(1) = 0.2, p(2) = 0.3, p(3) = 0.4. Let h(X) denote the profit associated with selling X units; then

h(X) = revenue − cost = 1000X + 200(3 − X) − 1500 = 800X − 900.

The variance of h(X) is

V[h(X)] = V(800X − 900) = 800² · V(X) = 640,000

And the SD is σ_{h(X)} = √(V[h(X)]) = 800.
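The rule V(aX + b) = a² · V(X) can be checked against the definition by building the pmf of the profit directly (a sketch):

```python
# Compare V[h(X)] computed from the pmf of h(X) with 800^2 * V(X).
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def variance(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

profit_pmf = {800 * x - 900: p for x, p in pmf.items()}  # pmf of h(X) = 800X - 900
print(variance(profit_pmf), 800**2 * variance(pmf))  # both near 640000
```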
Binomial Distribution
1. The experiment consists of a sequence of n smaller experiments called trials, where n is fixed in advance of the experiment;
2. Each trial can result in one of the same two possible outcomes (dichotomous trials), which we denote by success (S) and failure (F);
3. The trials are independent, so that the outcome on any particular trial does not influence the outcome on any other trial;
4. The probability of success is constant from trial to trial; we denote this probability by p.
Definition
An experiment for which Conditions 1–4 are satisfied is called a binomial experiment.
Examples:
1. If we toss a coin 10 times, then this is a binomial experiment with n = 10, S = Head, and F = Tail.
2. If we draw a card from a deck of well-shuffled cards with replacement, do this 5 times, and record whether the outcome is ♠ or not, then this is also a binomial experiment. In this case, n = 5, S = ♠, and F = not ♠.
3. Again we draw a card from a deck of well-shuffled cards but without replacement, do this 5 times, and record whether the outcome is ♠ or not. However, this time it is NO LONGER a binomial experiment:

P(♠ on second | ♠ on first) = 12/51 ≈ 0.235 ≠ 0.25 = P(♠ on second)

We do not have independence here!
Examples:
4. This time we draw a card from 100 decks of well-shuffled cards without replacement, do this 5 times, and record whether the outcome is ♠ or not. Is it a binomial experiment?

P(♠ on second draw | ♠ on first draw) = 1299/5199 ≈ 0.2499 ≈ 0.25
P(♠ on sixth draw | ♠ on first five draws) = 1295/5195 ≈ 0.2492 ≈ 0.25
P(♠ on tenth draw | no ♠ on first nine draws) = 1300/5191 ≈ 0.2504 ≈ 0.25
. . .

Although we still do not have independence, the conditional probabilities differ so slightly that we can regard these trials as independent with P(♠) = 0.25.
Rule
Consider sampling without replacement from a dichotomous population of size N. If the sample size (number of trials) n is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.
e.g. for the previous example, the population size is N = 5200 and the sample size is n = 5. We have n/N ≈ 0.1%, so we can apply the above rule.
Definition
The binomial random variable X associated with a binomial experiment consisting of n trials is defined as

X = the number of S's among the n trials

Possible values for X in an n-trial experiment are x = 0, 1, 2, . . . , n.
Notation
We use X ∼ Bin(n, p) to indicate that X is a binomial rv based on n trials with success probability p.
We use b(x; n, p) to denote the pmf of X, and B(x; n, p) to denote the cdf of X, where

B(x; n, p) = P(X ≤ x) = Σ_{y=0}^x b(y; n, p)
Example:
Assume we toss a coin 3 times and the probability of getting a head on each toss is p. Let X be the binomial random variable associated with this experiment. We tabulate all the possible outcomes, the corresponding X values, and the probabilities in the following table:

Outcome   X   Probability       Outcome   X   Probability
HHH       3   p³                TTT       0   (1 − p)³
HHT       2   p² · (1 − p)      TTH       1   (1 − p)² · p
HTH       2   p² · (1 − p)      THT       1   (1 − p)² · p
HTT       1   p · (1 − p)²      THH       2   (1 − p) · p²

e.g. b(2; 3, p) = P(HHT) + P(HTH) + P(THH) = 3p²(1 − p).
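The table can be reproduced by brute-force enumeration of the 2³ outcomes; a sketch evaluated at an arbitrary sample value of p:

```python
from itertools import product
from collections import defaultdict

# Enumerate all outcomes of three tosses; accumulate P(X = number of heads).
p = 0.3  # an arbitrary sample value for the head probability
dist = defaultdict(float)
for outcome in product("HT", repeat=3):
    heads = outcome.count("H")
    dist[heads] += p**heads * (1 - p) ** (3 - heads)

print(dist[2], 3 * p**2 * (1 - p))  # both equal b(2; 3, p)
```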
More generally, for the binomial pmf b(x; n, p), we have

b(x; n, p) = (number of sequences of length n consisting of x S's) · (probability of any particular such sequence)

where

number of sequences of length n consisting of x S's = C(n, x)
probability of any particular such sequence = p^x (1 − p)^{n−x}

(C(n, x) denotes the binomial coefficient "n choose x".)
Theorem

b(x; n, p) = C(n, x) p^x (1 − p)^{n−x} for x = 0, 1, 2, . . . , n;  0 otherwise
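The theorem translates directly into code via Python's `math.comb` (a sketch; the function name is ours):

```python
import math

# b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0..n, and 0 otherwise.
def binom_pmf(x, n, p):
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

print(binom_pmf(2, 3, 0.5))  # 0.375, i.e. 3 * (1/2)^3
print(sum(binom_pmf(x, 10, 0.3) for x in range(11)))  # sums to 1 (up to rounding)
```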
Example: (Problem 55)
Twenty percent of all telephones of a certain type are submitted for service while under warranty. Of these, 75% can be repaired, whereas the other 25% must be replaced with new units. If a company purchases ten of these telephones, what is the probability that exactly two will end up being replaced under warranty?
Let X = the number of telephones that end up being replaced, and let S = a telephone is submitted for service and then replaced. Then

p = P(submitted and replaced) = P(replaced | submitted) · P(submitted) = 0.25 · 0.2 = 0.05

Now,

P(X = 2) = b(2; 10, 0.05) = C(10, 2) · 0.05² · (1 − 0.05)^{10−2} = 0.0746
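The same computation in a couple of lines (a sketch):

```python
import math

# p = P(replaced) = P(replaced | submitted) * P(submitted) = 0.25 * 0.2 = 0.05,
# then P(X = 2) = b(2; 10, 0.05).
p = 0.25 * 0.2
prob = math.comb(10, 2) * p**2 * (1 - p) ** 8
print(round(prob, 4))  # 0.0746
```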
Binomial Tables
Table A.1, Cumulative Binomial Probabilities (page 664), tabulates B(x; n, p) = Σ_{y=0}^x b(y; n, p):

b. n = 10
                 p
x      0.01   0.05   0.10   . . .
0      .904   .599   .349   . . .
1      .996   .914   .736   . . .
2     1.000   .988   .930   . . .
3     1.000   .999   .987   . . .
...    ...    ...    ...

Then for b(2; 10, 0.05), we have

b(2; 10, 0.05) = B(2; 10, 0.05) − B(1; 10, 0.05) = .988 − .914 = .074
Mean and Variance
Theorem
If X ∼ Bin(n, p), then E(X) = np, V(X) = np(1 − p) = npq, and σ_X = √(npq) (where q = 1 − p).
The idea is that X = Y_1 + Y_2 + · · · + Y_n, where the Y_i are independent Bernoulli random variables, i.e.

Y_i = 1 with probability p;  0 with probability 1 − p

E(Y_i) = p and V(Y_i) = (1 − p)² p + (0 − p)² (1 − p) = p(1 − p).
Therefore E(X) = np and V(X) = np(1 − p) = npq.
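The theorem can be verified directly from the pmf for a small case (a sketch):

```python
import math

# Check E(X) = np and V(X) = np(1 - p) from the pmf for n = 10, p = 0.3.
n, p = 10, 0.3
pmf = [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))
print(mean, var)  # near np = 3 and np(1-p) = 2.1
```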
Example: (Problem 60)
A toll bridge charges $1.00 for passenger cars and $2.50 for other vehicles. Suppose that during daytime hours, 60% of all vehicles are passenger cars. If 25 vehicles cross the bridge during a particular daytime period, what is the resulting expected toll revenue? What is the variance of the toll revenue?
Let X = the number of passenger cars and Y = revenue. Then X ∼ Bin(25, 0.6) and

Y = 1.00X + 2.50(25 − X) = 62.5 − 1.50X.

E(Y) = E(62.5 − 1.5X) = 62.5 − 1.5E(X) = 62.5 − 1.5 · (25 · 0.6) = 40
V(Y) = V(62.5 − 1.5X) = (−1.5)² V(X) = 2.25 · (25 · 0.6 · 0.4) = 13.5
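The two linear-function rules applied to the toll-revenue example, in code (a sketch):

```python
# Toll revenue Y = 62.5 - 1.5X, where X ~ Bin(25, 0.6).
n, p = 25, 0.6
EX = n * p            # E(X) = 15
VX = n * p * (1 - p)  # V(X) = 6

EY = 62.5 - 1.5 * EX       # E(Y) = 40, by linearity of expectation
VY = (-1.5) ** 2 * VX      # V(Y) = 13.5; the shift 62.5 drops out
print(EY, VY)
```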