MathStat

Statistical Inference and Regression Analysis
Stat-GB.3302.30, Stat-UB.0015.01
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
Part 4 – Statistical Inference
4.1 – The Normal Family of Distributions
Normal
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], \quad -\infty < x < +\infty$$
μ = Mean
σ = Standard deviation
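As a quick numerical check of the normal density on this slide, the following sketch evaluates it directly and compares the result with Python's standard library. The function name `normal_pdf` and the values of `mu` and `sigma` are illustrative assumptions, not part of the lecture.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of Normal[mu, sigma^2] evaluated at x (requires sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative parameter values only.
mu, sigma = 2.0, 3.0

# Cross-check the hand-written formula against the standard library.
for x in (-1.0, 2.0, 6.5):
    assert math.isclose(normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))

# The height at the mean is 1 / (sigma * sqrt(2 * pi)).
peak = normal_pdf(mu, mu, sigma)
print(peak)
```

Setting `mu = 0` and `sigma = 1` gives the standard normal density.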
Standard Normal
$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}x^2\right), \quad -\infty < x < +\infty$$
μ = 0, σ = 1
Chi Squared 1 = Square of N(0,1)
Z ~ N[0,1]
$X = Z^2$ ~ Chi Squared with 1 degree of freedom = $\chi^2_1$
$$f(x) = \frac{(.5)^{.5}\, x^{.5-1}\, e^{-.5x}}{\Gamma(.5)}, \; x > 0 \quad = \text{Gamma}\left(\tfrac{1}{2}, \tfrac{1}{2}\right)$$
$$E[X] = E[Z^2] = \text{Var}[Z] + (E[Z])^2 = 1 + 0 = 1$$
$$\text{Var}[X] = E[X^2] - E^2[X] = E[Z^4] - E^2[Z^2] = 3 - 1 = 2$$
If $Z \sim N[\mu, \sigma^2]$ then $\left(\frac{Z-\mu}{\sigma}\right)^2 = (N[0,1])^2$ = Chi Squared(1) = $\chi^2_1$
[Figure: Density of Chi Squared[1], plotted against Z_SQD]
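The moments E[X] = 1 and Var[X] = 2 can be checked by simulation. The sketch below squares standard normal draws and computes the sample mean and variance; the seed and the number of draws `R` are arbitrary choices made for this illustration.

```python
import random

random.seed(12345)  # arbitrary seed, used only for reproducibility
R = 200_000         # number of simulated draws (illustrative)

# X = Z^2 with Z ~ N[0,1], so X is chi squared with 1 degree of freedom.
x = [random.gauss(0.0, 1.0) ** 2 for _ in range(R)]

mean_x = sum(x) / R
var_x = sum((xi - mean_x) ** 2 for xi in x) / (R - 1)
print(mean_x, var_x)  # should be close to 1 and 2
```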
Limit Result for Square of N(0,1)
Suppose $Z_N \xrightarrow{d} \text{Normal}(0,1)$
(as a consequence of a central limit theorem).
Then $Z_N^2 \xrightarrow{d} \chi^2_1$.
Sum of Two Independent Chi Squared(1) Variables
$$\chi^2_1(1) + \chi^2_1(2) \sim \chi^2_2$$
Chi squared with 2 degrees of freedom is
$$\text{Gamma}\left(\tfrac{1}{2}, \tfrac{2}{2}\right): \quad f(x) = \frac{(1/2)^{2/2}\, e^{-(1/2)x}\, x^{(2/2)-1}}{\Gamma(2/2)}$$
Expected Value = 2
Variance = 2*2 = 4
[Figure: Density of Chi Squared[2], plotted against X2]
Sum of N Independent Chi Squareds
$X_1, \ldots, X_N$ all independent, all chi squared(1)
$$X = \sum_{i=1}^{N} X_i \sim \text{Chi Squared}(N) = \text{Gamma}\left(\tfrac{1}{2}, \tfrac{N}{2}\right)$$
$$f(x) = \frac{(1/2)^{N/2}\, e^{-(1/2)x}\, x^{N/2-1}}{\Gamma(N/2)}$$
Mean = N, Variance = 2N.
(Prove by sum of independent variables each with mean 1 and variance 2.)
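The mean N and variance 2N can also be confirmed numerically from the density itself. The sketch below codes the chi squared(N) density and integrates it with a simple midpoint rule; the function names, the integration limit `upper` and the step count are assumptions chosen for this illustration.

```python
import math

def chi2_pdf(x, n):
    """Chi squared(n) density, i.e. Gamma(1/2, n/2), for x > 0."""
    return 0.5 ** (n / 2) * math.exp(-0.5 * x) * x ** (n / 2 - 1) / math.gamma(n / 2)

def moments(n, upper=150.0, steps=100_000):
    """Midpoint-rule approximations to the mean and variance of chi squared(n)."""
    h = upper / steps
    m1 = m2 = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        w = chi2_pdf(x, n) * h   # probability mass in this small cell
        m1 += x * w
        m2 += x * x * w
    return m1, m2 - m1 * m1

for n in (2, 5, 10):
    print(n, moments(n))  # approximately (n, 2n)
```

The midpoint rule is used only because it needs no extra libraries; it is accurate here because the density is smooth for N of 2 or more.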
Limit Result for Square of Normal
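Suppose $\frac{Z_N - \mu}{\sigma} \xrightarrow{d} \text{Normal}(0,1)$, so $\left(\frac{Z_N - \mu}{\sigma}\right)^2 \xrightarrow{d} \chi^2_1$.
Suppose $s_N^2 \xrightarrow{p} \sigma^2$.
Then $\frac{Z_N - \mu}{s_N} \xrightarrow{d} \text{Normal}(0,1)$ and $\left(\frac{Z_N - \mu}{s_N}\right)^2 \xrightarrow{d} \chi^2_1$.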
Noncentral Chi Squared
Suppose $Z \sim N[\mu, 1]$, $\mu \ne 0$
$Z^2$ ~ Noncentral chi squared with noncentrality parameter $\mu^2/2$
Suppose $Z_k \sim N[\mu, 1]$, $\mu \ne 0$, independent, $k = 1, \ldots, K$
$\sum_{k=1}^{K} Z_k^2$ ~ Noncentral chi squared[K] with noncentrality parameter $K\mu^2/2$
[Figure: Central and Noncentral Chi Squared densities (series CENTRAL and NONCNTRL), plotted against X]
t distribution
$$t_v = \frac{N[0,1]}{\sqrt{\chi^2_v / v}}$$
If v = 1, t = N[0,1]/|N[0,1]| (independent standard normals) = Cauchy. No finite moments.
Limiting Form of t
$$t_v = \frac{N[0,1]}{\sqrt{\chi^2_v / v}}$$
$\frac{1}{v}\chi^2_v$ has mean $\frac{1}{v} \cdot v = 1$ and variance $\frac{1}{v^2} \cdot 2v = \frac{2}{v}$.
As $v \to \infty$, random variable in denominator converges in mean square to 1. Implication:
$$t_v \xrightarrow{d} N[0,1]$$
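The convergence of t to N[0,1] can be seen in a small simulation. The sketch below builds t draws as a standard normal divided by the square root of an independent chi squared over its degrees of freedom, and reports the simulated share of draws beyond 1.96 in absolute value. The function names, seed and replication count are assumptions for this illustration; the chi squared draw uses `random.gammavariate` with shape v/2 and scale 2, which matches a rate of 1/2.

```python
import math
import random

random.seed(2024)  # arbitrary seed for reproducibility

def draw_t(v):
    """One draw of t_v = N[0,1] / sqrt(chi squared_v / v)."""
    z = random.gauss(0.0, 1.0)
    chi2_v = random.gammavariate(v / 2.0, 2.0)  # shape v/2, scale 2
    return z / math.sqrt(chi2_v / v)

def tail_share(v, reps=50_000, cutoff=1.96):
    """Simulated Prob(|t_v| > cutoff)."""
    return sum(abs(draw_t(v)) > cutoff for _ in range(reps)) / reps

for v in (1, 5, 30, 500):
    print(v, tail_share(v))  # approaches 0.05, the N[0,1] value, as v grows
```

With v = 1 (the Cauchy case) the tail share is far above 0.05, which illustrates how heavy the tails are when v is small.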
F Distribution
$x_n$ = numerator chi squared variable
$x_d$ = denominator chi squared variable
Independent
$$F_{n,d} = \frac{x_n / n}{x_d / d} = \frac{\chi^2_n / n}{\chi^2_d / d}$$
Limiting Form of F
Limit form of F relates to denominator degrees of freedom, den.
$x_{num}$ = numerator chi squared variable
$x_{den}$ = denominator chi squared variable
Independent
$$F_{num,den} = \frac{x_{num}/num}{x_{den}/den} = \frac{\chi^2_{num}/num}{\chi^2_{den}/den}$$
As $den \to \infty$, $\chi^2_{den}/den \to 1$ and $num \times F \xrightarrow{d} \chi^2_{num}$.
95% critical values for chi squared
95% critical values for limiting F distribution
Multiply value in last row by degrees of freedom. Equals value for chi-squared.
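The relationship between the two sets of critical values can be checked numerically. The sketch below is an illustration only and is not the source of the slide's tables: it computes the chi squared CDF from the series for the regularized incomplete gamma function, finds the 95% critical value by bisection, and divides by the degrees of freedom to get the limiting F value. The function names and the bisection bounds are assumptions.

```python
import math

def chi2_cdf(x, df):
    """Chi squared(df) CDF via the series for the regularized incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while term > 1e-16 * total:   # all terms are positive, so stop once they are negligible
        n += 1
        term *= y / (a + n)
        total += term
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))

def chi2_critical(df, prob=0.95):
    """Value c with Prob(chi squared(df) <= c) = prob, found by bisection."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for df in (1, 2, 3, 4, 5, 10):
    c = chi2_critical(df)
    # chi squared critical value, then the limiting F value (numerator df, denominator -> infinity)
    print(df, round(c, 3), round(c / df, 3))
```

Multiplying the second printed column by the degrees of freedom recovers the first, which is the relationship the slide describes.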
Special Case of F
$$t_k = \frac{N[0,1]}{\sqrt{\chi^2_k / k}}$$
$$t_k^2 = \frac{(N[0,1])^2}{\chi^2_k / k} = \frac{\chi^2_1 / 1}{\chi^2_k / k} = F(1, k)$$
Independence of Sample Mean and Variance in Normal Sampling
$X = (X_1, X_2, \ldots, X_N)$ = N independent Normal[$\mu, \sigma^2$]
Sample mean: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$
Sample variance: $s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (X_i - \bar{X})^2$
Main result: $\bar{X}$ and $s^2$ are independent.
Long elemental proof in text pp 195-197
Brief proof:
(1) $\bar{X}$ = sum of normals, $\bar{X} \sim \text{Normal}\left[\mu, \frac{\sigma^2}{N}\right]$
(2) $X_i - \bar{X}$ = linear function of normals, each is $\sim N\left[0, \sigma^2 \frac{N-1}{N}\right]$
(3) $\text{Cov}[\bar{X}, X_i - \bar{X}] = 0$
(4) In multivariate normals, zero covariance ==> independence
(5) $\bar{X}$ and $s^2$ are functions of independent variables ($s^2$ depends on the data only through the deviations $X_i - \bar{X}$)
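A simulation can illustrate the result, although it cannot prove it. The sketch below draws many normal samples, records the sample mean and sample variance of each, and computes their correlation across samples. The values of `mu`, `sigma`, `N`, the seed and the helper `corr` are assumptions for this illustration.

```python
import random

random.seed(7)                 # arbitrary seed for reproducibility
mu, sigma, N = 5.0, 2.0, 10    # illustrative values only
R = 20_000                     # number of simulated samples

means, variances = [], []
for _ in range(R):
    x = [random.gauss(mu, sigma) for _ in range(N)]
    xbar = sum(x) / N
    means.append(xbar)
    variances.append(sum((xi - xbar) ** 2 for xi in x) / (N - 1))

def corr(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5

r = corr(means, variances)
print(r)  # near zero, consistent with independence
```

A correlation near zero is only a necessary symptom of independence. Repeating the experiment with a skewed parent distribution would typically give a clearly nonzero correlation, which is why the normality assumption matters.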
Useful Result
$$\frac{(N-1)s^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{N}(X_i - \bar{X})^2 \sim \chi^2_{N-1}$$
Note, N-1 degrees of freedom, not N.
(Terms are not independent.)
Proof in text.
Limiting form: $\text{Cov}[X_i - \bar{X}, X_j - \bar{X}] = -\frac{\sigma^2}{N} \to 0$
Implication: $E[s^2/\sigma^2] = 1$, $\text{Var}[s^2/\sigma^2] = 2/(N-1)$
$\frac{s^2}{\sigma^2} \xrightarrow{p} 1$ (converges in mean square)
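The N-1 degrees of freedom can be checked by simulation: a chi squared variable with N-1 degrees of freedom has mean N-1 and variance 2(N-1). The sketch below simulates the scaled sum of squared deviations; the values of `mu`, `sigma`, `N`, the seed and the replication count are assumptions for this illustration.

```python
import random

random.seed(99)                # arbitrary seed for reproducibility
mu, sigma, N = 0.0, 3.0, 8     # illustrative values only
R = 50_000                     # number of simulated samples

q = []
for _ in range(R):
    x = [random.gauss(mu, sigma) for _ in range(N)]
    xbar = sum(x) / N
    # Sum of squared deviations over sigma^2, which equals (N-1) s^2 / sigma^2.
    q.append(sum((xi - xbar) ** 2 for xi in x) / sigma ** 2)

mean_q = sum(q) / R
var_q = sum((qi - mean_q) ** 2 for qi in q) / (R - 1)
print(mean_q, var_q)  # close to N - 1 = 7 and 2(N - 1) = 14, not N and 2N
```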
Distribution of the t statistic
$$\frac{\sqrt{N}\,(\bar{X} - \mu)/\sigma}{\sqrt{\dfrac{(N-1)s^2/\sigma^2}{N-1}}} = \frac{\bar{X} - \mu}{s/\sqrt{N}} \sim t_{N-1}$$
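The algebra behind the t statistic can be verified on a single simulated sample. The sketch below computes the ratio of a standard normal to the square root of a chi squared over its degrees of freedom, and compares it with the familiar form in which σ has cancelled. The values of `mu`, `sigma` and `N` are assumptions; they are known here only because the data are simulated.

```python
import math
import random

random.seed(3)                  # arbitrary seed for reproducibility
mu, sigma, N = 10.0, 4.0, 25    # illustrative values only
x = [random.gauss(mu, sigma) for _ in range(N)]

xbar = sum(x) / N
s2 = sum((xi - xbar) ** 2 for xi in x) / (N - 1)

# Ratio form: N[0,1] over sqrt(chi squared[N-1] / (N-1)).
z = math.sqrt(N) * (xbar - mu) / sigma
chi2 = (N - 1) * s2 / sigma ** 2
t_ratio = z / math.sqrt(chi2 / (N - 1))

# Familiar form: sigma cancels, so only the data and mu are needed.
t_stat = (xbar - mu) / (math.sqrt(s2) / math.sqrt(N))

print(t_ratio, t_stat)  # the two forms agree up to rounding error
```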
4.2 – Interval Estimation
Estimation
• Point Estimator: Provides a single estimate of the feature in question based on prior and sample information.
• Interval Estimator: Provides a range of values that incorporates both the point estimator and the uncertainty about the ability of the point estimator to find the population feature exactly.
Obtaining a Confidence Interval
• Pivotal quantity: f(estimator, parameters) that has a known distribution free of parameters and data
• Probability statement can be made about the pivotal quantity
• Manipulate the interval to describe the parameter.
Example – Normal Mean
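In random sampling from the normal distribution with mean μ and variance σ²,
$$\frac{\sqrt{N}\,(\bar{x} - \mu)}{s} \sim t[N-1]$$
This distribution is free of μ, σ² and the data x.
$$\text{Prob}\left[\left|\frac{\sqrt{N}\,(\bar{x} - \mu)}{s}\right| \le t^*\right] = (1 - \alpha)$$
Therefore,
$$\text{Prob}\left[\,|\bar{x} - \mu| \le \frac{s}{\sqrt{N}}\, t^*\right] = (1 - \alpha)$$
$$\text{Prob}\left[\bar{x} - \frac{s}{\sqrt{N}}\, t^* \le \mu \le \bar{x} + \frac{s}{\sqrt{N}}\, t^*\right] = (1 - \alpha)$$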
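The interval for the mean is simple to compute once t* has been looked up. The sketch below takes t* as an input rather than computing it, since the Python standard library has no t distribution; the function name `mean_interval` and the summary statistics are hypothetical.

```python
import math

def mean_interval(xbar, s, n, t_star):
    """Interval xbar -/+ t* s / sqrt(n); t_star must come from a t[n-1] table."""
    half_width = t_star * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical summary statistics; 2.131 is the tabled 97.5th percentile of t[15],
# which gives a 95% interval when n = 16.
lower, upper = mean_interval(xbar=10.0, s=2.0, n=16, t_star=2.131)
print(lower, upper)
```

A larger sample narrows the interval in two ways: the standard error s/√N falls, and t* moves toward the normal value of 1.96.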
t distribution – values of t*
Normal Variance
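In random sampling from the normal distribution,
\[ \frac{(N-1)s^2}{\sigma^2} \sim \chi^2[N-1] \]
Therefore,
\[ \mathrm{Prob}\left[ \chi^2_{\alpha/2} \le \frac{(N-1)s^2}{\sigma^2} \le \chi^2_{1-(\alpha/2)} \right] = (1-\alpha) \]
\[ \mathrm{Prob}\left[ \frac{1}{\chi^2_{1-(\alpha/2)}} \le \frac{\sigma^2}{(N-1)s^2} \le \frac{1}{\chi^2_{\alpha/2}} \right] = (1-\alpha) \]
\[ \mathrm{Prob}\left[ \frac{(N-1)s^2}{\chi^2_{1-(\alpha/2)}} \le \sigma^2 \le \frac{(N-1)s^2}{\chi^2_{\alpha/2}} \right] = (1-\alpha) \]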
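The chi-squared result on the Normal Variance slide can be checked by simulation. The sketch below is an illustration, not part of the original slides: it draws many normal samples, computes (N-1)s²/σ² for each, and compares the simulated mean and variance with those of a chi-squared[N-1] variable, which are N-1 and 2(N-1). The function names, the sample size N = 10, σ = 2, and the seed are all assumptions chosen for the example.

```python
import random
import statistics


def variance_pivot(sample, sigma2):
    """Return (N - 1) * s^2 / sigma^2 for one sample."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance, divisor N - 1
    return (n - 1) * s2 / sigma2


def simulate_pivots(n=10, mu=0.0, sigma=2.0, reps=20000, seed=1):
    """Draw `reps` normal samples of size n and return the pivot for each."""
    rng = random.Random(seed)
    pivots = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        pivots.append(variance_pivot(sample, sigma ** 2))
    return pivots


pivots = simulate_pivots()
# For chi-squared with N - 1 = 9 degrees of freedom, the mean is 9
# and the variance is 18; the simulated values should be close to these.
print("simulated mean:    ", statistics.mean(pivots))
print("simulated variance:", statistics.variance(pivots))
```

The pivot does not depend on μ or σ, so changing either argument should leave the simulated mean and variance essentially unchanged.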
Part 4 – Statistical Inference
27/34
Part 4 – Statistical Inference
28/34
GSOEP Income Data
Descriptive Statistics for 1 variable
--------+------------------------------------------------------------
Variable|     Mean   Std.Dev.   Minimum    Maximum   Cases  Missing
--------+------------------------------------------------------------
  HHNINC|  .353343    .157058   .035000   1.500000      24        0
--------+------------------------------------------------------------
For the mean, t* for 24-1 = 23 degrees of freedom = 2.069.
Confidence interval for the mean is .353343 +/- 2.069 * (.157058/sqrt(24))
= .353343 +/- .066331, or .287012 to .419674.
Confidence interval for the variance: critical values from chi squared [23] are
11.69 and 38.08. Confidence interval for σ² is
(24-1)(.157058²)/38.08 to (24-1)(.157058²)/11.69 = .014899 to .048533.
Confidence interval for σ is
.122061 to .220301.
Notice that the intervals are not symmetric around s² or s.
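The arithmetic for the GSOEP example can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: it takes the mean, standard deviation, and number of cases from the table, and the critical values (2.069, 11.69, 38.08) from the text rather than computing them. The function names are hypothetical.

```python
import math


def mean_interval(mean, s, n, t_star):
    """Interval for the mean: mean +/- t* * s / sqrt(n)."""
    half_width = t_star * s / math.sqrt(n)
    return mean - half_width, mean + half_width


def variance_interval(s, n, chi2_lower, chi2_upper):
    """Interval for sigma^2: (n-1)s^2/chi2_upper to (n-1)s^2/chi2_lower."""
    scaled = (n - 1) * s ** 2
    return scaled / chi2_upper, scaled / chi2_lower


# Values from the descriptive statistics table for HHNINC.
mean, s, n = 0.353343, 0.157058, 24

mean_lo, mean_hi = mean_interval(mean, s, n, t_star=2.069)
var_lo, var_hi = variance_interval(s, n, chi2_lower=11.69, chi2_upper=38.08)
# The interval for sigma is the square root of the interval for sigma^2.
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"mean:     {mean_lo:.6f} to {mean_hi:.6f}")
print(f"variance: {var_lo:.6f} to {var_hi:.6f}")
print(f"std.dev.: {sd_lo:.6f} to {sd_hi:.6f}")
```

Because s² is divided by the larger critical value to get the lower limit and by the smaller one to get the upper limit, s² = .024667 sits closer to the lower end of the variance interval than to the upper end.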
Part 4 – Statistical Inference
29/34
Large Sample Results
There are almost no other cases in which there exists an exact pivotal quantity.
Most estimators rely on large sample results based on central limit theorems:
\[ \frac{\mathrm{estimator} - \mathrm{parameter}}{\mathrm{standard\ error\ of\ estimator}} \xrightarrow{d} N(0,1) \]
Part 4 – Statistical Inference
30/34
Confidence Intervals
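Relying on the Central Limit Theorem
\[ \frac{\hat{\theta} - \theta}{\sqrt{\mathrm{EstimatedVar}[\hat{\theta}]}} \xrightarrow{d} N[0,1] \]
\[ \mathrm{Prob}\left[ \left| \frac{\hat{\theta} - \theta}{\sqrt{\mathrm{EstimatedVar}[\hat{\theta}]}} \right| \le z^* \right] \approx (1-\alpha) \]
Therefore, we use
\[ \mathrm{Prob}\left[ \hat{\theta} - z^* \sqrt{\mathrm{EstimatedVar}[\hat{\theta}]} \le \theta \le \hat{\theta} + z^* \sqrt{\mathrm{EstimatedVar}[\hat{\theta}]} \right] \approx (1-\alpha) \]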
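The large-sample interval on the Confidence Intervals slide can be sketched in code as follows. This is an illustration under stated assumptions, not the course's own implementation: the function names are hypothetical, z* is taken from the standard normal quantile function in Python's statistics module, and the proportion case uses the usual estimated variance p̂(1-p̂)/N. The count of 112 acceptances in 144 applications is made up for the example.

```python
import math
from statistics import NormalDist


def large_sample_interval(estimate, estimated_var, alpha=0.05):
    """Return estimate -/+ z* * sqrt(estimated_var), z* from N(0,1)."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z_star * math.sqrt(estimated_var)
    return estimate - half_width, estimate + half_width


def proportion_interval(successes, n, alpha=0.05):
    """Large-sample interval for a proportion, variance p(1-p)/n."""
    p_hat = successes / n
    return large_sample_interval(p_hat, p_hat * (1 - p_hat) / n, alpha)


# Hypothetical sample: 112 acceptances among 144 applications.
lo, hi = proportion_interval(112, 144)
print(f"95% interval for the proportion: {lo:.4f} to {hi:.4f}")
```

The same large_sample_interval function applies to any estimator for which an estimated variance is available; only the variance formula changes.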
Part 4 – Statistical Inference
31/34
Interpretation of The Interval
Not a statement about the probability that θ will lie in a specific interval.
In repeated sampling, 100(1-α) percent of the time, the interval will contain the true parameter.
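The repeated-sampling interpretation of the interval can be illustrated by simulation. The sketch below is an assumption-laden example, not taken from the slides: it draws samples from an exponential population with a known mean (a deliberately non-normal choice), builds the large-sample 95% interval from each sample, and records how often the interval contains the true mean. The function name, sample size, number of replications, and seed are all hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist


def coverage(true_mean=1.0, n=200, reps=2000, alpha=0.05, seed=7):
    """Share of large-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(reps):
        # Exponential data with mean true_mean (rate = 1 / true_mean).
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        xbar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if xbar - z_star * se <= true_mean <= xbar + z_star * se:
            hits += 1
    return hits / reps


rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

Each individual interval either contains the true mean or does not; the 95% figure describes the procedure across repeated samples, and the simulated share should be near, though not exactly, 0.95.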
Part 4 – Statistical Inference
32/34
Application: Credit Modeling
1992 American Express analysis of
• Application process: acceptance or rejection; X = 0 (reject) or 1 (accept).
• Cardholder behavior
  • Loan default (D = 0 or 1).
  • Average monthly expenditure (E = $/month).
  • General credit usage/behavior (Y = number of charges).
13,444 applications in November, 1992
Part 4 – Statistical Inference
33/34
Sample means of X (the acceptance proportions) in 100 samples with N = 144 in each sample.
0.7809 is the true proportion in the population of 13,444 we are sampling from.
Part 4 – Statistical Inference
34/34
Estimates plus and minus 1 and 2 standard errors
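The experiment with 100 samples of N = 144 can be imitated in code. The original application data are not available here, so this sketch makes an assumption: each application is simulated as an independent draw that is accepted with probability 0.7809, the true proportion stated above. For every sample it computes the estimate and its standard error, then reports the share of samples whose estimate plus and minus 1 and 2 standard errors contains 0.7809. The function name and seed are hypothetical.

```python
import math
import random


def simulate_bands(p_true=0.7809, n=144, samples=100, seed=1992):
    """Share of samples whose +/- 1 SE and +/- 2 SE bands contain p_true."""
    rng = random.Random(seed)
    within_1 = 0
    within_2 = 0
    for _ in range(samples):
        # Simulated acceptances: each draw is 1 with probability p_true.
        accepted = sum(rng.random() < p_true for _ in range(n))
        p_hat = accepted / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if abs(p_hat - p_true) <= se:
            within_1 += 1
        if abs(p_hat - p_true) <= 2 * se:
            within_2 += 1
    return within_1 / samples, within_2 / samples


share_1, share_2 = simulate_bands()
print(f"bands of 1 standard error containing 0.7809:  {share_1:.2f}")
print(f"bands of 2 standard errors containing 0.7809: {share_2:.2f}")
```

Under the normal approximation, roughly two thirds of the 1 standard error bands and about 95% of the 2 standard error bands would be expected to contain the true proportion; with only 100 samples the simulated shares will vary around those figures.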