UWHC Scholarly Forum
April 17, 2013
Ismor Fischer, Ph.D.
UW Dept of Statistics,
UW Dept of Biostatistics
and Medical Informatics
ifischer@wisc.edu
All slides posted at http://www.stat.wisc.edu/~ifischer/UWHC
• Click on image for full .pdf article
• Links in article to access datasets
“Statistical Inference”
POPULATION
Study Question:
Has “Mean (i.e., average) Age at
First Birth” of women in the U.S.
changed since 2010 (25.4 yrs old)?
Present Day: Assume “Mean Age at
First Birth” follows a normal distribution
(i.e., “bell curve”) in the population.
~ The Normal Distribution ~
[Figure: bell curve f(x), centered at the “population mean” μ, with spread measured by the “population standard deviation” σ]
• symmetric about its mean
• unimodal (i.e., one peak), with left and right “tails”
• models many (but not all) naturally-occurring systems
• useful mathematical properties…
Example: Body Temp (°F): low variability, small σ, centered at μ = 98.6
~ The Normal Distribution ~
Example: IQ score: high variability, large σ, centered at μ = 100
(compare Body Temp (°F): low variability, small σ, centered at μ = 98.6)
~ The Normal Distribution ~
[Figure: bell curve with 95% of the area between μ − 2σ and μ + 2σ, and 2.5% in each tail]
Approximately 95% of the population values are contained between μ − 2σ and μ + 2σ.
95% is called the confidence level.
5% is called the significance level.
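The "approximately 95% within 2σ" rule can be checked numerically. The sketch below uses only Python's standard library; the function name `central_probability` is an illustrative choice, not something from the talk.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def central_probability(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for any normally distributed X."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Area within 2 standard deviations of the mean: a little over 95%.
within_two_sigma = central_probability(2.0)

# Multiplier that leaves exactly 2.5% in each tail (slightly below 2).
exact_multiplier = std_normal.inv_cdf(0.975)

print(f"within 2 sigma: {within_two_sigma:.4f}")
print(f"exact 95% multiplier: {exact_multiplier:.3f}")
```

The multiplier "2" used throughout the slides is thus a rounded version of roughly 1.96.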
“Statistical Inference” via… “Hypothesis Testing”
μ cannot be found with 100% certainty, but can be estimated with high confidence (e.g., 95%).
“Null Hypothesis” H0: pop mean age μ = 25.4 (i.e., no change since 2010)
T-test
x1, x2, x3, x4, x5, … etc…, x400
FORMULA: sample mean age $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = 25.6$
Do the data tend to support or refute the null hypothesis?
Is the difference STATISTICALLY SIGNIFICANT, at the 5% level?
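~ The Normal Distribution ~
CENTRAL LIMIT THEOREM
[Figure: bell curve of the sample means, centered at μ with standard deviation σ/√n, narrower than the population curve; points x1, x2, x3, x4, x5, … etc… marked along the axis]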
~ The Normal Distribution ~
[Figure: bell curve with 95% of the area within ≈ 2σ/√n of μ, and 2.5% in each tail]
Approximately 95% of the population values are contained between μ − 2σ and μ + 2σ.
Approximately 95% of the sample mean values are contained between μ − 2σ/√n and μ + 2σ/√n.
Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.
95% margin of error: 2σ/√n
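A small simulation can make the two claims above concrete: sample means spread with standard deviation σ/√n, and about 95% of the intervals x̄ ± 2σ/√n capture μ. This is a sketch under stated assumptions: the population is taken to be exactly normal, and the values μ = 25.4 and σ = 1.6 are chosen for illustration only (in practice σ is not known). The function name `simulate` is hypothetical.

```python
import math
import random
import statistics

def simulate(mu, sigma, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n from Normal(mu, sigma).

    Returns (standard deviation of the sample means,
             fraction of intervals x_bar +/- 2*sigma/sqrt(n) that contain mu).
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    margin = 2 * sigma / math.sqrt(n)  # the "95% margin of error"
    sample_means = []
    hits = 0
    for _ in range(repetitions):
        x_bar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        sample_means.append(x_bar)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return statistics.stdev(sample_means), hits / repetitions

# Assumed values for illustration; they are not data from the study.
sd_of_means, coverage = simulate(mu=25.4, sigma=1.6, n=400, repetitions=2000)
print(f"sd of sample means: {sd_of_means:.3f} (theory: {1.6 / math.sqrt(400):.3f})")
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

Because the simulated population is already normal, this illustrates the σ/√n scaling and the coverage claim rather than the full Central Limit Theorem, which also covers non-normal populations.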
SAMPLE: n = 400 ages
sample mean $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = 25.6$
Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.
PROBLEM! The 95% margin of error is 2σ/√n, but σ is unknown the vast majority of the time!
sample variance s² = modified average of the squared deviations from the mean
sample standard deviation s
95% margin of error: 2σ/√n
FORMULA
sample variance: $s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
sample standard deviation: $s = +\sqrt{s^2} = 1.6$
95% margin of error: $2\,\frac{\sigma}{\sqrt{n}} \approx 2\,\frac{s}{\sqrt{n}} = 0.16$
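The sample variance and standard deviation formulas can be sketched directly in code. The five ages below are hypothetical values invented for this example (the study's 400 ages are not reproduced in the slides), and the function names are illustrative.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical ages, for illustration only.
ages = [22.0, 24.0, 25.0, 27.0, 30.0]

print("sample mean:", sum(ages) / len(ages))
print("sample variance:", sample_variance(ages))
print("sample standard deviation:", sample_sd(ages))

# Cross-check against the standard library, which also divides by n - 1.
print("library variance:", statistics.variance(ages))
```

Dividing by n − 1 rather than n is what makes this a "modified" average of the squared deviations.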
Approximately 95% of the intervals from x̄ − 2σ/√n to x̄ + 2σ/√n contain μ, and approx 5% do not.
[Figure: interval from 25.44 to 25.76, centered at x̄ = 25.6, extending one 95% margin of error 2s/√n = 0.16 to each side]
BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.44 and 25.76 years, with 95% “confidence” (…akin to “probability”).
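The interval endpoints follow from the summary statistics quoted in the slides (n = 400, x̄ = 25.6, s = 1.6). A minimal sketch, using the slides' rounded multiplier of 2; the function name `approx_95_interval` is an illustrative choice.

```python
import math

def approx_95_interval(sample_mean, sample_sd, n):
    """Approximate 95% confidence interval: sample mean +/- 2 * s / sqrt(n)."""
    margin = 2 * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

lower, upper, margin = approx_95_interval(sample_mean=25.6, sample_sd=1.6, n=400)
print(f"margin of error: {margin:.2f}")
print(f"interval: {lower:.2f} to {upper:.2f}")

# The null value from 2010 lies outside the interval.
null_mean = 25.4
print("null value inside interval:", lower <= null_mean <= upper)
```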
Two main ways to conduct a formal hypothesis test:
95% CONFIDENCE INTERVAL FOR µ
[Figure: number line with the null value μ = 25.4 lying to the left of the interval from 25.44 to 25.76, which is centered at x̄ = 25.6]
BASED ON OUR SAMPLE DATA, the true value of μ today is between 25.44 and 25.76 years, with 95% “confidence” (…akin to “probability”).
“P-VALUE” of our sample
IF H0 is true, then we would expect a random sample mean x̄ that is at least 0.2 years away from μ = 25.4 (as ours was), to occur with probability 1.24%.
Very informally, the p-value of a sample is a probability (hence a number between 0 and 1) that measures how well the sample “agrees” with the null hypothesis: it is computed assuming the null hypothesis is true, and gives the chance of a result at least as extreme as the one observed. Hence a very small p-value indicates strong evidence against the null hypothesis. The smaller the p-value, the stronger the evidence, and the more “statistically significant” the finding.
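The quoted 1.24% can be reproduced from the summary statistics. The sketch below assumes the large-sample normal approximation (with n = 400, the T-distribution is nearly indistinguishable from the normal curve); the function name `two_sided_p_value` is an illustrative choice.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sample_sd, n):
    """Two-sided p-value under the large-sample normal approximation."""
    standard_error = sample_sd / math.sqrt(n)         # s / sqrt(n)
    z = (sample_mean - null_mean) / standard_error    # distance from the null value in standard errors
    return 2 * (1 - NormalDist().cdf(abs(z)))         # area in both tails beyond |z|

p_value = two_sided_p_value(sample_mean=25.6, null_mean=25.4, sample_sd=1.6, n=400)
print(f"p-value: {p_value:.4f}")
print("significant at the 5% level:", p_value < 0.05)
```

Here the sample mean sits 2.5 standard errors from the null value. An exact T-test with 399 degrees of freedom would give a marginally larger p-value, with the same conclusion.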
FORMAL CONCLUSIONS:
• The 95% confidence interval corresponding to our sample mean does not contain the “null value” of the population mean, μ = 25.4 years.
• The p-value of our sample, .0124, is less than the predetermined α = .05 significance level.
Based on our sample data, we may (moderately) reject the null hypothesis H0: μ = 25.4 in favor of the two-sided alternative hypothesis HA: μ ≠ 25.4, at the α = .05 significance level.
INTERPRETATION: According to the results of this study, there exists a statistically significant difference between the mean ages at first birth in 2010 (25.4 years old) and today, at the 5% significance level. Moreover, the evidence from the sample data would suggest that the population mean age today is significantly older than in 2010, rather than significantly younger.
Check the normality assumption behind the T-test?
The reasonableness of the normality assumption is empirically verifiable, and in fact formally testable from the sample data. If violated (e.g., skewed) or inconclusive (e.g., small sample size), then “distribution-free” nonparametric tests can be used instead of the T-test.
Examples: Sign Test, Wilcoxon Signed Rank Test (not to be confused with the Wilcoxon Rank Sum Test = Mann-Whitney Test, which compares two independent samples)
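As one illustration of a distribution-free alternative, the Sign Test only counts how many observations fall above or below the null value and compares that count with a fair coin flip. The sketch below is a minimal two-sided version; the six ages are hypothetical, and the function name `sign_test_p_value` is an illustrative choice. Note that the Sign Test addresses the population median rather than the mean.

```python
from math import comb

def sign_test_p_value(values, null_median):
    """Two-sided Sign Test p-value for H0: population median = null_median."""
    above = sum(1 for v in values if v > null_median)
    below = sum(1 for v in values if v < null_median)
    n = above + below                  # observations equal to the null value are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # Under H0 each observation is above or below with probability 1/2,
    # so the smaller count follows a Binomial(n, 1/2) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical ages, for illustration only: 5 above 25.4 and 1 below.
ages = [26.0, 27.0, 28.0, 29.0, 30.0, 24.0]
p_value = sign_test_p_value(ages, null_median=25.4)
print(f"sign test p-value: {p_value:.5f}")
```

With only six observations, even a 5-to-1 split is not significant at the 5% level, which shows how little power such a test has for very small samples.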
Sample size n partially depends on the power of the test, i.e., the desired probability of correctly rejecting a false null hypothesis. HOWEVER……
~ The Normal Distribution ~
[Figure: bell curve with 95% of the area within ≈ 2 standard deviations of the “population mean” μ, and 2.5% in each tail; σ replaced by its sample estimate s]
Approximately 95% of the population values are contained between μ − 2s and μ + 2s.
Approximately 95% of the sample mean values are contained between μ − 2s/√n and μ + 2s/√n.
Approximately 95% of the intervals from x̄ − 2s/√n to x̄ + 2s/√n contain μ, and approx 5% do not.
…IF n is large, ≥ 30 traditionally.
But if n is small… the multiplier 2, called the “T-score”, increases (from ≈ 2 to a max of 12.706 for a 95% confidence level, reached at n = 2) as n decreases ⇒ larger margin of error ⇒ less power to reject.
If n is small, T-score > 2.
If n is large, T-score ≈ 2.
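The growth of the T-score for small n can be sketched numerically. The code below is a rough standard-library-only approximation, assuming the usual n − 1 degrees of freedom for a one-sample T-test: it integrates the T density with Simpson's rule and finds the 95% multiplier by bisection. The function names are illustrative; a statistics package would normally be used for this.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's T-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_score(n: int, confidence: float = 0.95) -> float:
    """Multiplier replacing '2' in the margin of error for a sample of size n."""
    df = n - 1
    target = 1 - (1 - confidence) / 2   # 0.975 for a 95% confidence level
    lo, hi = 0.0, 100.0
    for _ in range(40):                 # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (2, 5, 30, 400):
    print(f"n = {n:3d}: T-score = {t_score(n):.3f}")
```

The smallest possible sample for this test, n = 2, gives the maximum of about 12.706 quoted above, while n = 400 gives a value just under 2.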