Chapter 3
Selected Basic Concepts in Statistics
• Expected value, variance, standard deviation
• Numerical summaries of selected statistics
• Sampling distributions

Expected Value
Let y be a random variable with values y_1, y_2, …, y_N that have respective probabilities p(y_1), p(y_2), …, p(y_N).
The expected value of y, denoted E(y) or μ, is
E(y) = μ = Σ_{i=1}^{N} y_i·p(y_i)
• Weighted average
• Not the value of y you “expect”; a long-run average
E(y) Example 1
Toss a fair die once. Let y be the number of dots on the upper face.

 y    |  1    2    3    4    5    6
 p(y) | 1/6  1/6  1/6  1/6  1/6  1/6

E(y) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 21/6 = 3.5
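A minimal Python sketch of this weighted-average calculation (the expected_value helper and the use of Fraction are illustrative, not part of the slides):

    from fractions import Fraction

    def expected_value(values, probs):
        """Weighted average: E(y) = sum of y_i * p(y_i)."""
        return sum(y * p for y, p in zip(values, probs))

    # Fair die: each face 1..6 has probability 1/6
    faces = [1, 2, 3, 4, 5, 6]
    probs = [Fraction(1, 6)] * 6
    print(expected_value(faces, probs))   # 7/2, i.e. 3.5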
E(y) Example 2: Green Mountain Lottery
Choose 3 digits between 0 and 9. Repeats allowed, order of digits counts. If your 3-digit number is selected, you win $500. Let y be your winnings (assume ticket cost $0).

 y    |  $0     $500
 p(y) | 0.999   0.001

E(y) = $0(0.999) + $500(0.001) = $0.50
US Roulette Wheel and Table
• The roulette wheel has alternating black and red slots numbered 1 through 36.
• There are also 2 green slots numbered 0 and 00.
• A bet on any one of the 38 numbers (1–36, 0, or 00) pays odds of 35:1; that is . . .
• If you bet $1 on the winning number, you receive $36, so your winnings are $35.
American Roulette 0 - 00 (The European version has only one 0.)
US Roulette Wheel: Expected Value of a $1 Bet on a Single Number
Let y be your winnings resulting from a $1 bet on a single number; y has 2 possible values.

 y    |  -1     35
 p(y) | 37/38   1/38

E(y) = −1(37/38) + 35(1/38) = −2/38 ≈ −$0.05
So on average the house wins about 5 cents on every such bet. A “fair” game would have E(y) = 0.
The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in …
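A small Python sketch of the long-run-average interpretation of E(y); the simulation size and seed are arbitrary choices, not from the slides:

    import random

    # Exact expected winnings of a $1 single-number bet on a 38-slot US wheel
    print(-1 * (37 / 38) + 35 * (1 / 38))      # about -0.0526

    # The long-run average over many simulated bets approaches E(y)
    random.seed(1)
    n_bets = 1_000_000
    total = sum(35 if random.randrange(38) == 0 else -1 for _ in range(n_bets))
    print(total / n_bets)                      # roughly -0.05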
Variance and Standard Deviation
Let y be a random variable with values y_1, y_2, …, y_N that have respective probabilities p(y_1), p(y_2), …, p(y_N).
The variance of y, denoted V(y) or σ², is
V(y) = σ² = E[(y − μ)²] = Σ_{i=1}^{N} (y_i − μ)²·p(y_i)
The standard deviation of y, denoted SD(y) or σ, is the square root of the variance:
SD(y) = σ = √σ²
• Both measure spread around the middle, where the middle is measured by μ
Variance Example
Toss a fair die once. Let y be the number of dots on the upper face.

 y    |  1    2    3    4    5    6
 p(y) | 1/6  1/6  1/6  1/6  1/6  1/6

Recall μ = 3.5.
V(y) = (1 − 3.5)²(1/6) + (2 − 3.5)²(1/6) + (3 − 3.5)²(1/6) + (4 − 3.5)²(1/6) + (5 − 3.5)²(1/6) + (6 − 3.5)²(1/6)
     = (−2.5)²(1/6) + (−1.5)²(1/6) + (−.5)²(1/6) + (.5)²(1/6) + (1.5)²(1/6) + (2.5)²(1/6)
     = 17.5/6 ≈ 2.917
SD(y) = √2.917 ≈ 1.708
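A short Python sketch computing V(y) and SD(y) straight from the definition; the var_sd helper is illustrative only:

    import math
    from fractions import Fraction

    def var_sd(values, probs):
        """V(y) = sum of (y_i - mu)^2 * p(y_i); SD(y) = sqrt(V(y))."""
        mu = sum(y * p for y, p in zip(values, probs))
        var = sum((y - mu) ** 2 * p for y, p in zip(values, probs))
        return var, math.sqrt(var)

    faces = [1, 2, 3, 4, 5, 6]
    probs = [Fraction(1, 6)] * 6
    print(var_sd(faces, probs))   # (Fraction(35, 12), 1.7078...), i.e. about 2.917 and 1.708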
V(y) Example 2: Green Mountain Lottery

 y    |  $0     $500
 p(y) | 0.999   0.001

Recall μ = $0.50.
V(y) = (0 − .50)²(0.999) + (500 − .50)²(0.001)
     = (.50)²(0.999) + (499.5)²(0.001)
     = 249.75
SD(y) = √249.75 ≈ 15.8
Estimators for μ, σ², σ
Let y_1, y_2, …, y_n denote sample observations.
Sample mean: ȳ = (Σ_{i=1}^{n} y_i) / n
Sample variance: s² = Σ_{i=1}^{n} (y_i − ȳ)² / (n − 1)
Sample standard deviation: s = √s²
• s² is the “average” squared deviation from the middle
• Automate these calculations
• Examples
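One way to automate these calculations is Python’s statistics module; the sample values below are made up for illustration:

    import statistics

    # Hypothetical sample observations
    y = [4.2, 5.1, 3.8, 6.0, 4.9]

    ybar = statistics.mean(y)        # sample mean
    s2   = statistics.variance(y)    # sample variance, divides by n - 1
    s    = statistics.stdev(y)       # sample standard deviation, sqrt(s2)
    print(ybar, s2, s)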
Linear Transformations of Random Variables and Sample Statistics
• Random variable y with E(y) and V(y). Linear transformation y* = a + by: what are E(y*) and V(y*) in terms of the original E(y) and V(y)?
• Data y_1, y_2, …, y_n with mean ȳ and standard deviation s. Linear transformation y* = a + by gives new data y_1*, y_2*, …, y_n*: what are ȳ* and s* in terms of ȳ and s?
Linear Transformations
Rules for E(y*), V(y*), and SD(y*):
• E(y*) = E(a + by) = a + bE(y)
• V(y*) = V(a + by) = b²V(y)
• SD(y*) = SD(a + by) = |b|·SD(y)
Rules for ȳ*, s*², and s*:
• ȳ* = a + bȳ
• s*² = b²s²
• s* = |b|·s
Expected Value and Standard Deviation of the Linear Transformation a + by
Let y = number of repairs a new computer needs each year. Suppose E(y) = 0.20 and SD(y) = 0.55.
The service contract for the computer offers unlimited repairs for $100 per year plus a $25 service charge for each repair.
What are the mean and standard deviation of the yearly cost of the service contract?
Cost = $100 + $25y
E(cost) = E($100 + $25y) = $100 + $25·E(y) = $100 + $25(0.20) = $100 + $5 = $105
SD(cost) = SD($100 + $25y) = SD($25y) = $25·SD(y) = $25(0.55) = $13.75
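A tiny Python check of this arithmetic using the linear-transformation rules (variable names are illustrative):

    a, b = 100, 25          # cost = 100 + 25*y (dollars)
    E_y, SD_y = 0.20, 0.55  # given mean and SD of yearly repairs

    E_cost  = a + b * E_y   # a + b*E(y)
    SD_cost = abs(b) * SD_y # |b|*SD(y); the additive constant does not affect spread
    print(E_cost, SD_cost)  # 105.0 13.75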
Addition and Subtraction Rules for Random Variables
• E(X + Y) = E(X) + E(Y);  E(X − Y) = E(X) − E(Y)
When X and Y are independent random variables:
1. Var(X + Y) = Var(X) + Var(Y)
2. SD(X + Y) = √(Var(X) + Var(Y))
   SDs do not add: SD(X + Y) ≠ SD(X) + SD(Y)
3. Var(X − Y) = Var(X) + Var(Y)
4. SD(X − Y) = √(Var(X) + Var(Y))
   SDs do not subtract: SD(X − Y) ≠ SD(X) − SD(Y) and SD(X − Y) ≠ SD(X) + SD(Y)
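A simulation sketch of rules 1–4 for two independent random variables; the particular distributions, seed, and sample size are arbitrary choices for illustration:

    import random
    import statistics

    # Check that Var(X + Y) and Var(X - Y) both come out near Var(X) + Var(Y)
    random.seed(2)
    n = 200_000
    x = [random.gauss(10, 3) for _ in range(n)]    # Var(X) = 9
    y = [random.uniform(0, 12) for _ in range(n)]  # Var(Y) = 12**2 / 12 = 12

    print(statistics.variance(x) + statistics.variance(y))     # about 21
    print(statistics.variance([a + b for a, b in zip(x, y)]))  # about 21
    print(statistics.variance([a - b for a, b in zip(x, y)]))  # also about 21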
Example: rv’s NOT independent
• X = number of hours a randomly selected student from our class slept between noon yesterday and noon today.
• Y = number of hours the same randomly selected student from our class was awake between noon yesterday and noon today. Y = 24 − X.
• What are the expected value and variance of the total hours that a student is asleep and awake between noon yesterday and noon today?
• Total hours asleep and awake between noon yesterday and noon today = X + Y
• E(X + Y) = E(X + 24 − X) = E(24) = 24
• Var(X + Y) = Var(X + 24 − X) = Var(24) = 0
• We don’t add Var(X) and Var(Y) since X and Y are not independent.
Pythagorean Theorem of Statistics for Independent X and Y
[Diagram: right triangle with legs a = SD(X) and b = SD(Y) and hypotenuse c = SD(X + Y); the squares on the sides have areas a² = Var(X), b² = Var(Y), c² = Var(X + Y).]
a² + b² = c²  ⇔  Var(X) + Var(Y) = Var(X + Y)
a + b ≠ c     ⇔  SD(X) + SD(Y) ≠ SD(X + Y)
Pythagorean Theorem of Statistics for Independent X and Y
[Diagram: right triangle with legs 3 = SD(X) and 4 = SD(Y) and hypotenuse 5 = SD(X + Y); side squares 9 = Var(X), 16 = Var(Y), 25 = Var(X + Y).]
3² + 4² = 5², i.e. 9 + 16 = 25:  Var(X) + Var(Y) = Var(X + Y)
3 + 4 ≠ 5:  SD(X) + SD(Y) ≠ SD(X + Y)
Example: meal plans
Regular plan: X = daily amount spent
• E(X) = $13.50, SD(X) = $7
• Expected value and standard deviation of the total spent in 2 consecutive days? (assume the days are independent)
• E(X_1 + X_2) = E(X_1) + E(X_2) = $13.50 + $13.50 = $27
• SD(X_1 + X_2) ≠ SD(X_1) + SD(X_2) = $7 + $7 = $14
  SD(X_1 + X_2) = √Var(X_1 + X_2) = √(Var(X_1) + Var(X_2)) = √(7² + 7²) = √(49 + 49) = √98 ≈ $9.90
Example: meal plans (cont.)
Jumbo plan for football players: Y = daily amount spent
• E(Y) = $24.75, SD(Y) = $9.50
• The amount by which a football player’s spending exceeds regular student spending is Y − X
• E(Y − X) = E(Y) − E(X) = $24.75 − $13.50 = $11.25
• SD(Y − X) ≠ SD(Y) − SD(X) = $9.50 − $7 = $2.50
  SD(Y − X) = √Var(Y − X) = √(Var(Y) + Var(X)) = √(9.50² + 7²) = √(90.25 + 49) = √139.25 ≈ $11.80
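A short Python check of both meal-plan calculations (values taken from the two slides above):

    import math

    SD_X, SD_Y = 7.00, 9.50   # regular and jumbo plan daily SDs (dollars)

    # Two independent days on the regular plan
    print(math.sqrt(SD_X**2 + SD_X**2))   # about 9.90, not 7 + 7 = 14

    # Difference between an independent jumbo day and regular day
    print(math.sqrt(SD_Y**2 + SD_X**2))   # about 11.80, not 9.50 - 7 = 2.50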
For random variables, X + X ≠ 2X
Let X be the annual payout on a life insurance policy. From mortality tables E(X) = $200 and SD(X) = $3,867.
1) If the payout amounts are doubled, what are the new expected value and standard deviation?
   • The doubled payout is 2X. E(2X) = 2E(X) = 2($200) = $400
   • SD(2X) = 2SD(X) = 2($3,867) = $7,734
2) Suppose insurance policies are sold to 2 people. The annual payouts are X_1 and X_2. Assume the people behave independently. What are the expected value and standard deviation of the total payout?
   • E(X_1 + X_2) = E(X_1) + E(X_2) = $200 + $200 = $400
   • SD(X_1 + X_2) = √Var(X_1 + X_2) = √(Var(X_1) + Var(X_2)) = √(3867² + 3867²) = √(14,953,689 + 14,953,689) = √29,907,378 ≈ $5,468.76
The risk to the insurance co. when doubling the payout (2X) is not the same as the risk when selling policies to 2 people.
Estimator of the population mean μ
Let y_1, y_2, …, y_n denote sample observations.
Sample mean: ȳ = (Σ_{i=1}^{n} y_i) / n
• ȳ will vary from sample to sample
• What are the characteristics of this sample-to-sample behavior?
Numerical Summary of the Sampling Distribution of ȳ
Consider the sample observations y_1, y_2, …, y_n as independent observations of the population variable y with E(y) = μ and SD(y) = σ.
E(ȳ) = E( (Σ_{i=1}^{n} y_i) / n ) = (1/n) Σ_{i=1}^{n} E(y_i) = nμ/n = μ
Unbiased: a statistic is unbiased if it has expected value equal to the population parameter.
Numerical Summary of the Sampling Distribution of ȳ
Consider the sample observations y_1, y_2, …, y_n as independent observations of the population variable y with E(y) = μ and SD(y) = σ.
V(ȳ) = V( (Σ_{i=1}^{n} y_i) / n ) = (1/n²) V( Σ_{i=1}^{n} y_i ) = (1/n²) Σ_{i=1}^{n} V(y_i) = nσ²/n² = σ²/n
SD(ȳ) = σ/√n
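A simulation sketch of these two facts; the exponential population, sample size, and number of replications are arbitrary illustrative choices:

    import math
    import random
    import statistics

    # Check E(ybar) = mu and SD(ybar) = sigma / sqrt(n) for an exponential
    # population, which has mu = sigma (here both equal 5)
    random.seed(3)
    mu = sigma = 5.0
    n = 25
    ybars = [statistics.mean(random.expovariate(1 / mu) for _ in range(n))
             for _ in range(20_000)]

    print(statistics.mean(ybars), mu)                     # both about 5
    print(statistics.stdev(ybars), sigma / math.sqrt(n))  # both about 1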
Standard Error
• Standard error: the square root of the estimated variance of a statistic
• An important building block for statistical inference
V(ȳ) = σ²/n, estimated by V̂(ȳ) = s²/n
Standard error of the sample mean: SE(ȳ) = s/√n
Recall SD(ȳ) = σ/√n
Shape?
• We have numerical summaries of the sampling distribution of ȳ
• What about the shape of the sampling distribution of ȳ?

THE CENTRAL LIMIT THEOREM
The World is Normal Theorem
The Central Limit Theorem (for the sample mean ȳ)
If a random sample of n observations is selected from a population (any population), then when n is sufficiently large, the sampling distribution of ȳ will be approximately normal.
(The larger the sample size, the better will be the normal approximation to the sampling distribution of ȳ.)
The Importance of the Central Limit Theorem
When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is
N(μ, σ/√n)
• The shape of the population is irrelevant.
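A rough simulation sketch of the CLT: sample means drawn from a strongly skewed (exponential) population pile up in a roughly symmetric, bell-shaped way. The bucket width and scaling of the text histogram are arbitrary choices:

    import random
    import statistics
    from collections import Counter

    random.seed(4)
    n = 30
    means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
             for _ in range(10_000)]

    # Crude text histogram of the simulated sample means
    buckets = Counter(round(m, 1) for m in means)
    for value in sorted(buckets):
        print(f"{value:4.1f} {'#' * (buckets[value] // 50)}")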
Estimating the population total τ
Let y_1, y_2, …, y_n denote sample observations.
Sample mean: ȳ = (Σ_{i=1}^{n} y_i) / n
τ̂ = Nȳ  (N is the population size)
Estimating the population total τ
• Expected value
E(τ̂) = E(Nȳ) = N·E(ȳ) = Nμ = τ
Estimating the population total τ
• Variance, standard deviation, standard error
V(τ̂) = V(Nȳ) = N²·V(ȳ) = N²σ²/n
SD(τ̂) = √V(Nȳ) = √(N²·V(ȳ)) = Nσ/√n
SE(τ̂) = Ns/√n
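A minimal Python sketch of τ̂ = Nȳ and its standard error; the population size and sample values are hypothetical:

    import math
    import statistics

    # Hypothetical sample of n = 5 units from a population of N = 400 units
    N = 400
    y = [12, 9, 15, 11, 8]
    n = len(y)

    tau_hat = N * statistics.mean(y)                  # N * ybar
    se_tau  = N * statistics.stdev(y) / math.sqrt(n)  # N * s / sqrt(n)
    print(tau_hat, se_tau)                            # 4400.0 and about 490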
Finite population case
• Example: sampling with replacement to estimate τ
Population: {1, 2, 3, 4}, N = 4 (so τ = 10).
Sample n = 2 with varying selection probabilities: δ_1 = .1, δ_2 = .1, δ_3 = .4, δ_4 = .4.
Estimate τ with τ̂ = (1/n) Σ_{i=1}^{n} y_i/δ_i = (1/2) Σ_{i=1}^{2} y_i/δ_i
For the sample {1, 2}:
τ̂ = (1/2)(1/.1 + 2/.1) = (1/2)(10 + 20) = 15
Finite population case
• Example: sampling with replacement to estimate τ

 Sample | Prob. of sample |  τ̂   | V̂(τ̂)
 {1, 2} |      .02        |  15   | 25.0
 {1, 3} |      .08        | 35/4  | 1.5625
 {1, 4} |      .08        |  10   | 0
 {2, 3} |      .08        | 55/4  | 39.0625
 {2, 4} |      .08        |  15   | 25.0
 {3, 4} |      .32        | 35/4  | 1.5625
 {1, 1} |      .01        |  10   | 0
 {2, 2} |      .01        |  20   | 0
 {3, 3} |      .16        | 15/2  | 0
 {4, 4} |      .16        |  10   | 0
Finite population case
Example: sampling with replacement to estimate τ
• From the table:
E(τ̂) = 15(.02) + (35/4)(.08) + … + 10(.16) = 10 = τ
V(τ̂) = (15 − 10)²(.02) + (35/4 − 10)²(.08) + … + (10 − 10)²(.16) = 6.250
Finite population case
• Example: sampling with replacement to estimate τ
V̂(τ̂) = (1/n)·(1/(n − 1)) Σ_{i=1}^{n} (y_i/δ_i − τ̂)²
From the table,
E(V̂(τ̂)) = 25(.02) + 1.5625(.08) + 0(.08) + 39.0625(.08) + 25(.08) + 1.5625(.32) + 0(.01) + 0(.01) + 0(.16) + 0(.16) = 6.25 = V(τ̂)
Finite population case
Example: sampling with replacement to estimate τ
• Example summary
τ̂ = (1/n) Σ_{i=1}^{n} y_i/δ_i;  E(τ̂) = τ
V̂(τ̂) = (1/n)·(1/(n − 1)) Σ_{i=1}^{n} (y_i/δ_i − τ̂)²;  E(V̂(τ̂)) = V(τ̂)
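A short Python sketch that reproduces this example by enumerating every ordered sample of size 2, confirming E(τ̂) = 10, V(τ̂) = 6.25, and E(V̂(τ̂)) = V(τ̂) (up to floating-point rounding):

    from itertools import product

    # Population {1, 2, 3, 4}, n = 2 draws with replacement,
    # selection probabilities delta = (.1, .1, .4, .4), true total tau = 10
    y     = [1, 2, 3, 4]
    delta = [0.1, 0.1, 0.4, 0.4]
    n = 2

    E_tau_hat, E_V_hat, second_moment = 0.0, 0.0, 0.0
    for draws in product(range(4), repeat=n):    # all ordered samples
        prob = 1.0
        for i in draws:
            prob *= delta[i]
        terms   = [y[i] / delta[i] for i in draws]
        tau_hat = sum(terms) / n
        v_hat   = sum((t - tau_hat) ** 2 for t in terms) / (n * (n - 1))
        E_tau_hat     += prob * tau_hat
        second_moment += prob * tau_hat ** 2
        E_V_hat       += prob * v_hat

    V_tau_hat = second_moment - E_tau_hat ** 2
    print(E_tau_hat, V_tau_hat, E_V_hat)   # approximately 10.0, 6.25, 6.25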
Finite population case
Sampling with replacement to estimate pop. total τ
• In general
τ̂ = (1/n) Σ_{i=1}^{n} y_i/δ_i is unbiased for any choice of the δ_i;
V̂(τ̂) = (1/n)·(1/(n − 1)) Σ_{i=1}^{n} (y_i/δ_i − τ̂)²;  E(V̂(τ̂)) = V(τ̂)
• Want to choose the δ_i so that V(τ̂) is as small as possible.
Finite population case
• Sampling with replacement to estimate pop. total τ
A specific choice for the δ_i’s:
Suppose we know the values y_i, i = 1, …, N (so the population total τ is known), and choose δ_i = y_i/τ (assume all y_i > 0). Then
τ̂ = (1/n) Σ_{i=1}^{n} y_i/δ_i = (1/n) Σ_{i=1}^{n} y_i/(y_i/τ) = (1/n) Σ_{i=1}^{n} τ = nτ/n = τ
Every τ̂ estimates τ exactly.
Finite population case
• Sampling with replacement to estimate pop. total τ
In reality, we do not know the value of y_i for every item in the population.
BUT we can choose δ_i proportional to a known measurement highly correlated with y_i.
Finite population case
• Sampling with replacement to estimate pop. total τ
Example: we want to estimate the total number of job openings in a city by sampling industrial firms.
• Many small firms – employ few workers;
• A few large firms – employ many workers;
• Large firms influence the number of job openings;
• Large firms should have a greater chance of being in the sample to improve the estimate of total openings.
Firms can be sampled with probabilities proportional to the firm’s total work force, which should be correlated with the firm’s job openings.
Finite population case
• Sampling without replacement to estimate pop. total τ
Thus far we have assumed a population that does not change when the first item is selected; that is, we sampled with replacement.
When sampling without replacement this is not true.
Example: population {1, 2, 3, 4}; n = 2; suppose the draws are equally likely.
• Prob. of selecting 3 on the first draw is 1/4.
• Prob. of selecting 3 on the second draw depends on the first draw (the probability is 0 or 1/3).
Finite population case
Sampling without replacement to estimate pop. total τ
Let π_i = P(y_i is selected in the sample)
• The per-draw probabilities δ_i change with the draw
• Replace δ_i with the average probability π_i/n that y_i is selected across the n draws:
τ̂ = (1/n) Σ_{i=1}^{n} y_i/(π_i/n) = Σ_{i=1}^{n} y_i/π_i = Σ_{i=1}^{n} w_i·y_i,  where w_i = 1/π_i
Worksheet
End of Chapter 3