Applied Statistics I Liang Zhang June 10, 2008

Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
June 10, 2008
Liang Zhang (UofU)
Applied Statistics I
June 10, 2008
1 / 37
Measures of Variability
Sample I:   30, 35, 40, 45, 50, 55, 60, 65, 70
Sample II:  30, 41, 48, 49, 50, 51, 52, 59, 70
Sample III: 41, 45, 48, 49, 50, 51, 52, 55, 59
Sample Range: the difference between the largest and the smallest sample values.
e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70
the sample range is 70 − 30 = 40.
Deviation from the Sample Mean: the difference between an individual sample value and the sample mean.
e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70
the sample mean is 50, and thus the deviations from the sample mean are -20, -15, -10, -5, 0, 5, 10, 15, 20.
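The two definitions above can be checked directly on Sample I. The following is a minimal Python sketch; the function names `sample_range` and `deviations` are illustrative choices, not part of the lecture.

```python
# Sample I from the slides.
sample_1 = [30, 35, 40, 45, 50, 55, 60, 65, 70]

def sample_range(values):
    """Difference between the largest and the smallest sample values."""
    return max(values) - min(values)

def deviations(values):
    """Deviation of each observation from the sample mean."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]

print(sample_range(sample_1))  # 40
print(deviations(sample_1))    # [-20.0, -15.0, -10.0, -5.0, 0.0, 5.0, 10.0, 15.0, 20.0]
```

Note that Samples I and II have the same range (40) even though their values are spread differently, which is one reason to look beyond the range.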
Sample Variance: the sum of the squared deviations from the sample mean, divided by n − 1 (roughly the average squared deviation).
If our sample size is n, and we use x̄ to denote the sample mean, then the sample variance s² is given by:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{S_{xx}}{n-1} \]
Sample Standard Deviation: the square root of the sample variance,
\[ s = \sqrt{s^2} \]
e.g. for Sample I: 30, 35, 40, 45, 50, 55, 60, 65, 70, the mean is 50 and we have

x_i            30   35   40   45   50   55   60   65   70
x_i − x̄       -20  -15  -10   -5    0    5   10   15   20
(x_i − x̄)²    400  225  100   25    0   25  100  225  400

Therefore the sample variance is
(400 + 225 + 100 + 25 + 0 + 25 + 100 + 225 + 400)/(9 − 1) = 187.5
and the standard deviation is √187.5 ≈ 13.7.
e.g. for Sample II: 30, 41, 48, 49, 50, 51, 52, 59, 70, the mean is also 50 and we have

x_i            30   41   48   49   50   51   52   59   70
x_i − x̄       -20   -9   -2   -1    0    1    2    9   20
(x_i − x̄)²    400   81    4    1    0    1    4   81  400

Therefore the sample variance is
(400 + 81 + 4 + 1 + 0 + 1 + 4 + 81 + 400)/(9 − 1) = 121.5
and the standard deviation is √121.5 ≈ 11.0.
e.g. for Sample III: 41, 45, 48, 49, 50, 51, 52, 55, 59, the mean is also 50 and we have

x_i            41   45   48   49   50   51   52   55   59
x_i − x̄        -9   -5   -2   -1    0    1    2    5    9
(x_i − x̄)²     81   25    4    1    0    1    4   25   81

Therefore the sample variance is
(81 + 25 + 4 + 1 + 0 + 1 + 4 + 25 + 81)/(9 − 1) = 27.75
and the standard deviation is √27.75 ≈ 5.3.
The sample variance is 187.5 for Sample I, 121.5 for Sample II and 27.75 for Sample III: the more tightly the values cluster around the common mean of 50, the smaller the variance.
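The three worked computations can be reproduced with a few lines of Python. This is a sketch that follows the definition of s² given earlier (divisor n − 1); the helper name `sample_variance` is an illustrative choice.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    sxx = sum((x - mean) ** 2 for x in values)
    return sxx / (n - 1)

samples = {
    "Sample I": [30, 35, 40, 45, 50, 55, 60, 65, 70],
    "Sample II": [30, 41, 48, 49, 50, 51, 52, 59, 70],
    "Sample III": [41, 45, 48, 49, 50, 51, 52, 55, 59],
}

for name, values in samples.items():
    s2 = sample_variance(values)
    # Variances: 187.5, 121.5 and 27.75 respectively.
    print(f"{name}: s^2 = {s2}, s = {math.sqrt(s2):.1f}")
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which also use the n − 1 divisor, so they can serve as a cross-check.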
Remark: 1. Why use the sum of squares of the deviations? Why not sum the deviations themselves?
Because the sum of the deviations from the sample mean is always equal to 0:
\[ \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \bar{x} = \sum_{i=1}^{n} x_i - n\bar{x} = \sum_{i=1}^{n} x_i - n\Big(\frac{1}{n}\sum_{i=1}^{n} x_i\Big) = 0 \]
2. Why do we use the divisor n − 1 in the calculation of the sample variance, while we use the divisor N in the calculation of the population variance?
The variance measures deviation from the “center”. However, the “center” is different for a sample and for a population: the sample mean x̄ versus the population mean µ.
If we could use µ instead of x̄ in the definition of s², then we would take s² = Σ(x_i − µ)²/n.
But generally the population mean is unavailable to us, so our choice is the sample mean. In that case, the observations x_i tend to be closer to their own average x̄ than to the population average µ, so the squared deviations from x̄ tend to be too small. To compensate, we use the divisor n − 1.
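A small simulation can make the need for this compensation visible. The sketch below is not from the lecture: it assumes a standard normal population (true variance 1), a sample size of 5 and a fixed random seed, all chosen only for illustration. It averages the two candidate estimates over many samples.

```python
import random

def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean (S_xx)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def average_estimates(n=5, trials=20000, seed=0):
    """Average of S_xx/n and S_xx/(n-1) over many simulated samples.

    Assumption: the population is standard normal, so its variance is 1.
    """
    rng = random.Random(seed)
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sxx = sum_sq_dev(sample)
        total_n += sxx / n
        total_n_minus_1 += sxx / (n - 1)
    return total_n / trials, total_n_minus_1 / trials

avg_n, avg_n_minus_1 = average_estimates()
print(avg_n, avg_n_minus_1)
```

Under these assumptions the average with divisor n should come out near (n − 1)/n = 0.8, below the true variance, while the average with divisor n − 1 should come out near 1.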
3. It is customary to refer to s² as being based on n − 1 degrees of freedom (df).
s² is computed from the n quantities (x_1 − x̄)², (x_2 − x̄)², ..., (x_n − x̄)². However, the sum of x_1 − x̄, x_2 − x̄, ..., x_n − x̄ is 0. Therefore if we know any n − 1 of these deviations, we know all of them.
e.g. {x_1 = 4, x_2 = 7, x_3 = 1, x_4 = 10}.
Then the mean is x̄ = 5.5 and x_1 − x̄ = −1.5, x_2 − x̄ = 1.5 and x_3 − x̄ = −4.5. From that, we know directly that x_4 − x̄ = 4.5, since the four deviations sum to 0.
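The four-point example can be replayed in Python: given any three of the deviations, the fourth is forced by the zero-sum constraint. This is only an illustrative sketch; the variable names are not from the lecture.

```python
data = [4, 7, 1, 10]
mean = sum(data) / len(data)  # 5.5

# Suppose only the first n - 1 deviations are known.
known_deviations = [x - mean for x in data[:-1]]  # [-1.5, 1.5, -4.5]

# All n deviations sum to 0, so the last one is determined.
last_deviation = -sum(known_deviations)

print(last_deviation)      # 4.5
print(data[-1] - mean)     # 4.5, computed directly for comparison
```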
Some mathematical results for s²:
\[ s^2 = \frac{S_{xx}}{n-1}, \quad \text{where } S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}; \]
If y_1 = x_1 + c, y_2 = x_2 + c, ..., y_n = x_n + c, then s_y² = s_x²;
If y_1 = c x_1, y_2 = c x_2, ..., y_n = c x_n, then s_y = |c| s_x.
Here s_x² is the sample variance of the x's and s_y² is the sample variance of the y's; c is any nonzero constant.
e.g. in the previous example, Sample III is {41, 45, 48, 49, 50, 51, 52, 55, 59}; we can then calculate the sample variance as follows:

x_i      41    45    48    49    50    51    52    55    59    Σx_i  = 450
x_i²   1681  2025  2304  2401  2500  2601  2704  3025  3481    Σx_i² = 22722

Therefore the sample variance is
(22722 − 450²/9)/(9 − 1) = 27.75
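The computational shortcut and the two transformation rules for s² can be verified numerically on Sample III. The sketch below is illustrative only: the function names and the constants 10 and −2 are arbitrary choices, not values from the lecture.

```python
import math

def variance_definition(values):
    """s^2 from the definition: sum of squared deviations over n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def variance_shortcut(values):
    """s^2 from the shortcut S_xx = sum(x^2) - (sum(x))^2 / n."""
    n = len(values)
    sxx = sum(x * x for x in values) - sum(values) ** 2 / n
    return sxx / (n - 1)

sample_3 = [41, 45, 48, 49, 50, 51, 52, 55, 59]

print(variance_shortcut(sample_3))  # 27.75, same as the definition

# Adding a constant c leaves the variance unchanged (c = 10 is arbitrary).
shifted = [x + 10 for x in sample_3]
print(variance_definition(shifted))  # 27.75

# Multiplying by c scales the standard deviation by |c| (c = -2 is arbitrary).
scaled = [-2 * x for x in sample_3]
s_x = math.sqrt(variance_definition(sample_3))
s_y = math.sqrt(variance_definition(scaled))
print(math.isclose(s_y, 2 * s_x))  # True
```

The shortcut avoids computing each deviation, but with floating-point data of large magnitude it can lose precision to cancellation, so the definition is often the safer choice in software.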
Boxplots
e.g. A recent article (“Indoor Radon and Childhood Cancer”) presented the accompanying data on radon concentration (Bq/m³) in two different samples of houses. The first sample consisted of houses in which a child diagnosed with cancer had been residing. Houses in the second sample had no recorded cases of childhood cancer. The following graph presents a stem-and-leaf display of the data.
            1. Cancer |   | 2. No cancer
              9683795 | 0 | 95768397678993
 86071815066815233150 | 1 | 12271713114
             12302731 | 2 | 99494191
                 8349 | 3 | 839
                    5 | 4 |
                    7 | 5 | 55
                      | 6 |
                      | 7 |
                      | 8 | 5
Stem: Tens digit
Leaf: Ones digit
The boxplot for the 1st data set is:
The boxplot for the 2nd data set is:
We can also make the boxplot for both data sets:
Some terminology:
Lower Fourth: the median of the smallest half of the observations
Upper Fourth: the median of the largest half of the observations
Fourth spread: the difference between the upper fourth and the lower fourth,
fs = upper fourth − lower fourth
Outlier: any observation farther than 1.5fs from the closest fourth
An outlier is extreme if it is more than 3fs from the nearest fourth, and it is mild otherwise.
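The terminology above translates into a short routine. The sketch below makes one assumption the slides do not state: when n is odd, the median is included in both halves (a common textbook convention). It is applied to Sample II from earlier in the lecture; the function names are illustrative.

```python
from statistics import median

def fourths(values):
    """Lower and upper fourths: medians of the smallest and largest halves.

    Assumption: for odd n the middle observation belongs to both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2
    return median(ordered[:half]), median(ordered[n - half:])

def classify_outliers(values):
    """Return (fs, mild outliers, extreme outliers)."""
    lower, upper = fourths(values)
    fs = upper - lower
    mild, extreme = [], []
    for x in values:
        # Distance beyond the closest fourth (0 if x lies between the fourths).
        distance = max(lower - x, x - upper, 0)
        if distance > 3 * fs:
            extreme.append(x)
        elif distance > 1.5 * fs:
            mild.append(x)
    return fs, mild, extreme

sample_2 = [30, 41, 48, 49, 50, 51, 52, 59, 70]
print(fourths(sample_2))            # (48, 52)
print(classify_outliers(sample_2))  # (4, [41, 59], [30, 70])
```

For Sample II the fourths are 48 and 52, so fs = 4; anything more than 6 away from the nearest fourth is an outlier and anything more than 12 away is extreme.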
The boxplot for the 2nd data set is:
Sample Spaces and Events
Basic Concepts in Probability:
Experiment: any action or process whose outcome is subject to
uncertainty
e.g. tossing a coin 3 times, testing the pH value of some reagent,
counting the number of customers visiting a store in one day, etc.
Sample Space: the set of all possible outcomes of an experiment,
usually denoted by S
e.g. for the above 3 examples, the sample spaces are {TTT, TTH,
THH, THT, HHH, HHT, HTH, HTT}, [0,14] and {0, 1, 2, . . . , N,
. . . }, respectively.
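For a finite experiment such as three coin tosses, the sample space can be enumerated mechanically. The following Python sketch uses `itertools.product`; representing each outcome as a string such as "HTH" is a choice made here for illustration.

```python
from itertools import product

# Every sequence of three tosses, each toss being H or T.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=3)}

print(sorted(sample_space))
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(len(sample_space))  # 8
```

The other two sample spaces, the interval [0, 14] and the set {0, 1, 2, ...}, are infinite and cannot be listed this way.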
Event: any collection (subset) of outcomes contained in the sample space S.
An event is simple if it consists of exactly one outcome and compound if it consists of more than one outcome.
e.g. for the coin tossing example: {all the outcomes such that the first result is Head}, i.e. {HHT, HTH, HTT, HHH}, is an event, and it is a compound event;
{all the outcomes which have 3 consecutive Heads}, i.e. {HHH}, is also an event, and it is a simple event.
Examples:
For the pH value testing example:
{pH value is less than 7.0}, i.e. [0, 7.0), is an event, and it is compound;
{pH value is between 2.0 and 3.0}, i.e. [2.0, 3.0], is another event, and it is also compound.
For the customer counting example:
{the number of customers visiting in one day is less than 100}, i.e. {0, 1, 2, ..., 98, 99}, is an event, and it is compound;
{the number of customers visiting in one day is more than 200}, i.e. {201, 202, ...}, is also an event, and it is compound.
Another Coin Tossing Example:
This time our experiment is tossing a coin repeatedly until we get our first Head.
Then the sample space can be written as {H, TH, TTH, TTTH, TTTTH, ...} or, recording only the number of tosses, as {1, 2, 3, 4, 5, ...}.
Here are some events:
{we tossed at most 3 times}, i.e. {1, 2, 3} or {H, TH, TTH}.
{we tossed an even number of times in total}, i.e. {2, 4, 6, ...} or {TH, TTTH, TTTTTH, ...}.
Both events are compound.
Basic Set Theory
Complement: the complement of an event A, denoted by A’, is the set of all outcomes in S that are not contained in A.
e.g. for our first coin tossing example, if
A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, then
A’ = {the first outcome is not Head, i.e. Tail} = {TTT, TTH, THT, THH};
for the pH value testing example, if
A = {the pH value of the reagent is below 7.0} = [0, 7.0), then
A’ = {the pH value of the reagent is at least 7.0} = [7.0, 14].
Union: the union of two events A and B, denoted by A ∪ B, is the event consisting of all outcomes that are either in A or in B or in both events, that is, all outcomes in at least one of the events.
e.g. for the coin tossing example, if
A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and
B = {the last outcome is Head} = {HHH, TTH, HTH, THH}, then
A ∪ B = {the first or the last outcome is Head}
= {HHH, HHT, HTH, HTT, TTH, THH}
Intersection: the intersection of two events A and B, denoted by A ∩ B, is the event consisting of all outcomes that are both in A and in B.
e.g. for the coin tossing example, if
A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and
B = {the last outcome is Head} = {HHH, TTH, HTH, THH}, then
A ∩ B = {the first and the last outcomes are both Head}
= {HHH, HTH}
Null Event: the event consisting of no outcomes, denoted by ∅
e.g. the event {the first outcome is neither Head nor Tail} for the coin tossing experiment is a null event.
Mutually Exclusive: if two events A and B satisfy A ∩ B = ∅, then A and B are said to be mutually exclusive or mutually disjoint.
e.g. for the coin tossing example, if
A = {the first outcome is Head} = {HHH, HHT, HTH, HTT}, and
B = {the first outcome is Tail} = {THH, TTH, TTT, THT}, then
A ∩ B = {the first outcome is both Head and Tail}
= ∅
So A and B are mutually disjoint.
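Python's built-in sets mirror these operations closely, so the coin tossing examples for complement, union, intersection and disjointness can be checked directly. This is an illustrative sketch; the variable names (and the name `first_tail` for the second event in the last example) are choices made here.

```python
from itertools import product

# Sample space for three coin tosses.
S = {"".join(tosses) for tosses in product("HT", repeat=3)}

A = {o for o in S if o[0] == "H"}            # the first outcome is Head
B = {o for o in S if o[-1] == "H"}           # the last outcome is Head
first_tail = {o for o in S if o[0] == "T"}   # the first outcome is Tail

complement_A = S - A       # A': outcomes of S not in A
union = A | B              # in at least one of A, B
intersection = A & B       # in both A and B

print(sorted(complement_A))   # ['THH', 'THT', 'TTH', 'TTT']
print(sorted(union))          # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'TTH']
print(sorted(intersection))   # ['HHH', 'HTH']
print(A & first_tail == set())  # True: the two events are mutually disjoint
```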
Remark:
1. The union and intersection operations can be extended to more than two events.
e.g. for any three events A, B and C, the event A ∪ B ∪ C is the set of all outcomes contained in at least one of the three events;
similarly, A ∩ B ∩ C is the set of all outcomes contained in all three events.
2. Events A_1, A_2, ..., A_n are said to be mutually disjoint, or pairwise disjoint, if every two of them are mutually disjoint.
Venn Diagrams:
[Figure: four Venn diagrams illustrating A ∪ B, A ∩ B, mutually disjoint events, and the complement of A]
Axiomatic Probability
The objective of probability is to assign to each event A a number P(A), called the probability of the event A, which will give a precise measure of the chance that A will occur.
Probability Axioms:
AXIOM 1 For any event A, P(A) ≥ 0.
AXIOM 2 P(S) = 1.
AXIOM 3 If A_1, A_2, A_3, ... is an infinite collection of disjoint events, then
\[ P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i) \]
Proposition
P(∅) = 0, where ∅ is the null event. This in turn implies that the property contained in Axiom 3 is valid for a finite collection of events, i.e. if A_1, A_2, ..., A_n is a finite collection of disjoint events, then
\[ P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} P(A_i) \]
Examples:
1. Consider the coin tossing experiment in which we toss the coin only once. Then S = {H, T}.
Since P(S) = 1 (Axiom 2), and the events {H} and {T} are mutually disjoint, by Axiom 3 (in its finite form) we have
P({H}) + P({T}) = P({H} ∪ {T}) = P(S) = 1
If the coin is fair, we should assign 0.5 to P({H}) and 0.5 to P({T}).
If the coin is more likely to give a Head, then 0.8 for P({H}) and 0.2 for P({T}) may be suitable.
In fact, if p is any fixed number between 0 and 1, then P({H}) = p and P({T}) = 1 − p is an assignment consistent with the axioms.
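For a finite sample space, an assignment of probabilities to the individual outcomes can be checked against the axioms by brute force. The sketch below is not from the lecture: it defines P(event) as the sum of the outcome probabilities (an assumption that makes additivity hold by construction) and then tests Axiom 1, Axiom 2 and finite additivity over all pairs of disjoint events.

```python
import math
from itertools import combinations

def all_events(outcomes):
    """Every subset of a finite sample space, as frozensets."""
    items = sorted(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def prob(event, outcome_probs):
    """P(event), defined here as the sum of its outcome probabilities."""
    return sum(outcome_probs[o] for o in event)

def satisfies_axioms(outcome_probs):
    sample_space = frozenset(outcome_probs)
    events = all_events(sample_space)
    axiom1 = all(prob(e, outcome_probs) >= 0 for e in events)
    axiom2 = math.isclose(prob(sample_space, outcome_probs), 1.0)
    additive = all(
        math.isclose(prob(a | b, outcome_probs),
                     prob(a, outcome_probs) + prob(b, outcome_probs),
                     abs_tol=1e-12)
        for a in events for b in events if not (a & b)
    )
    return axiom1 and axiom2 and additive

print(satisfies_axioms({"H": 0.5, "T": 0.5}))   # True: fair coin
print(satisfies_axioms({"H": 0.8, "T": 0.2}))   # True: biased coin
print(satisfies_axioms({"H": 1.2, "T": -0.2}))  # False: violates Axiom 1
print(satisfies_axioms({"H": 0.5, "T": 0.4}))   # False: violates Axiom 2
```

The axioms only constrain which assignments are admissible; they do not say which admissible assignment describes a particular coin.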
2. Consider the coin-tossing experiment again, but this time we toss the coin
repeatedly until we get a Head. Then S = {H, TH, TTH, TTTH, TTTTH, . . . }.
If each toss independently gives a Head with probability 0.4 and a Tail with
probability 0.6, then P({H}) = 0.4, P({TH}) = (0.4)(0.6),
P({TTH}) = (0.4)(0.6)², P({TTTH}) = (0.4)(0.6)³, . . . .
Since {H}, {TH}, {TTH}, {TTTH}, {TTTTH}, . . . are mutually disjoint
and S = {H} ∪ {TH} ∪ {TTH} ∪ {TTTH} ∪ {TTTTH} ∪ . . . , we have
1 = 0.4 + (0.4)(0.6) + (0.4)(0.6)² + (0.4)(0.6)³ + · · ·
which agrees with the geometric series sum 0.4/(1 − 0.6) = 1.
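To see the infinite sum in Example 2 approach 1, one can compute its partial sums. This is only a numerical illustration, assuming independent tosses with P(Head) = 0.4 on each toss; the helper name `partial_sum` is hypothetical.

```python
def partial_sum(p, n):
    """Sum of P(first Head occurs on toss k) for k = 1..n.

    Each term is p * (1 - p)**(k - 1): k - 1 Tails followed by one Head.
    """
    return sum(p * (1 - p) ** (k - 1) for k in range(1, n + 1))

# The partial sums equal 1 - 0.6**n, so they increase toward 1.
for n in (1, 5, 10, 50):
    print(n, partial_sum(0.4, n))
```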
Axiomatic Probability
More Probability Properties
Proposition
For any event A, P(A) + P(A′) = 1, from which P(A) = 1 − P(A′).
Example 2.13
Consider a system of five identical components connected in series.
[Figure: five components connected in series]
Denote a component failure by F and success by S. Let A be the event
that the system fails. For A to occur, at least one of the individual
components must fail. If each component fails with probability
P({F}) = 0.1, what is P(A)?
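One way to answer Example 2.13 is through the complement: the system works only if all five components work. The sketch below assumes the components fail independently of one another, which the slide does not state explicitly; the function name `series_failure_probability` is hypothetical.

```python
def series_failure_probability(p_fail, n):
    """P(at least one of n components fails), assuming independent failures.

    Uses the complement rule: P(A) = 1 - P(A'), where A' is the event
    that all n components succeed.
    """
    p_all_succeed = (1 - p_fail) ** n
    return 1 - p_all_succeed

# Five components, each failing with probability 0.1: 1 - 0.9**5.
print(round(series_failure_probability(0.1, 5), 5))
```

Counting the complement directly is much shorter than summing over the 31 outcomes in which at least one component fails.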
Axiomatic Probability
Proposition
For any event A, P(A) ≤ 1.
(This follows from the previous proposition, since the complement of A has
probability at least 0.)
Proposition
For any two events A and B,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
A Venn diagram proof:
[Figure: Venn diagrams expressing A ∪ B as a sum of disjoint regions]
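The Venn diagram argument for the two-event formula can also be written out symbolically. The following is one possible derivation, not necessarily the one drawn on the slide; it writes A' for the complement of A and uses only Axiom 3 for disjoint events.

```latex
\begin{align*}
A \cup B &= A \cup (B \cap A'), & P(A \cup B) &= P(A) + P(B \cap A'),\\
B &= (A \cap B) \cup (B \cap A'), & P(B) &= P(A \cap B) + P(B \cap A').
\end{align*}
Subtracting the second probability equation from the first gives
\[
P(A \cup B) - P(B) = P(A) - P(A \cap B)
\quad\Longrightarrow\quad
P(A \cup B) = P(A) + P(B) - P(A \cap B).
\]
```

In both decompositions the two pieces are disjoint, which is what allows their probabilities to be added.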
Axiomatic Probability
Example 2.14
In a certain residential suburb, 60% of all households subscribe to the
metropolitan newspaper published in a nearby city, 80% subscribe to the
local paper, and 50% of all households subscribe to both papers. If a
household is selected at random, what is the probability that it subscribes
to (1) at least one of the two newspapers and (2) exactly one of the two
newspapers?
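Both parts of Example 2.14 follow from the two-event formula P(A ∪ B) = P(A) + P(B) − P(A ∩ B). The sketch below works the arithmetic with exact fractions; the variable names are hypothetical, with A standing for "subscribes to the metropolitan paper" and B for "subscribes to the local paper".

```python
from fractions import Fraction

p_a = Fraction(60, 100)      # P(A): metropolitan paper
p_b = Fraction(80, 100)      # P(B): local paper
p_both = Fraction(50, 100)   # P(A ∩ B): both papers

# (1) At least one paper: the addition rule for two events.
p_at_least_one = p_a + p_b - p_both

# (2) Exactly one paper: at least one, minus those who take both.
p_exactly_one = p_at_least_one - p_both

print(float(p_at_least_one), float(p_exactly_one))
```

Exact fractions are used here only to avoid floating-point rounding in the intermediate sums.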
Axiomatic Probability
Proposition
For any three events A, B, and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
             − P(A ∩ B) − P(B ∩ C) − P(C ∩ A)
             + P(A ∩ B ∩ C)
A Venn diagram interpretation:
[Figure: Venn diagram of three overlapping events A, B, and C]
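The three-event formula can be sanity-checked on a small finite example. The sketch below assumes a hypothetical sample space of 12 equally likely outcomes and three arbitrarily chosen events; none of these come from the lecture, and a single check is an illustration rather than a proof.

```python
from fractions import Fraction

# Hypothetical sample space: 12 equally likely outcomes.
S = set(range(1, 13))

def prob(event):
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

A = {n for n in S if n % 2 == 0}   # even outcomes
B = {n for n in S if n % 3 == 0}   # multiples of 3
C = {n for n in S if n <= 4}       # outcomes 1 through 4

lhs = prob(A | B | C)
rhs = (prob(A) + prob(B) + prob(C)
       - prob(A & B) - prob(B & C) - prob(C & A)
       + prob(A & B & C))

print(lhs, rhs)
```

The single probabilities count each pairwise overlap twice, so the pairwise terms are subtracted; the triple overlap is then added back because it was added three times and subtracted three times.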