Statistical inference
… is inference about a population
from a random sample drawn from it. It includes:
• point estimation
• interval estimation
• hypothesis testing (or statistical significance testing)
• prediction
Estimation and Inference
Suppose we have a single sample. The questions we might
want to answer are these:
• what is the population mean value?
• is the population mean value significantly different from
current expectation or recommended level?
• what is the level of uncertainty associated with our estimate
of the population mean level?
In order to be reasonably confident that our inferences are
correct, we need to establish some facts about the distribution
of the data:
• is the sample size large enough?
• are there outliers in the data?
• is the mean a sensible summary statistic?
• if data were collected over a period of time, is there evidence
of serial correlation?
• are the values normally distributed or not?
What is the level of uncertainty associated with our
point estimate x̄ of the population mean µ?
Confidence Intervals
for µ
x̄ ± margin of error
Confidence Interval
for µ
Shows the likely range in which the true population
mean is situated.
The more ‘confidence’ required the wider the
interval.
The convention is to calculate 95% confidence
intervals.
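To illustrate why more confidence means a wider interval, the sketch below prints the multiplier applied to the standard error at a few confidence levels. It assumes a normal-based multiplier taken from Python's standard library; the function name `normal_coefficient` is illustrative and does not come from the slides.

```python
from statistics import NormalDist

def normal_coefficient(confidence):
    """Two-sided normal multiplier for a given confidence level (e.g. 0.95)."""
    # Leave (1 - confidence) / 2 in each tail of the standard normal.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: multiplier = {normal_coefficient(level):.3f}")
```

The multiplier grows with the confidence level (roughly 1.645, 1.960 and 2.576 here), so the interval widens accordingly.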
Confidence Interval
for µ
x̄ ± confidence coefficient × σ/√n
Standard error: σ/√n
Gosset and the t-distribution
To improve the ‘precision’ of
the sample mean:
• decrease σ (?)
• increase n
σ/√n
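A minimal sketch of how the sample size affects precision, assuming the standard error takes the form σ/√n. The value of `sigma` is purely illustrative and is not taken from any data in these slides.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 1.0  # illustrative value only, not from the slides
for n in (25, 100, 400):
    print(f"n = {n:3d}: standard error = {standard_error(sigma, n):.3f}")
```

Each fourfold increase in n halves the standard error, so gains in precision become progressively more expensive.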
One-Sample T: Cadmium
Variable    N      Mean     StDev   SE Mean                95% CI
Cadmium   168  0.268690  0.163347  0.012602  (0.243810, 0.293571)
Interpretation ??
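As a starting point for the interpretation, the sketch below recomputes the standard error from the printed StDev and N, and backs out the multiplier implied by the printed interval. The normal-based interval is included only for comparison; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics copied from the output above.
n = 168
mean = 0.268690
st_dev = 0.163347
ci_lower, ci_upper = 0.243810, 0.293571

# Standard error of the mean: StDev / sqrt(N).
se_mean = st_dev / math.sqrt(n)

# Multiplier implied by the printed interval: half-width / standard error.
implied_coefficient = (ci_upper - ci_lower) / 2 / se_mean

# Normal-based 95% interval, for comparison with the printed one.
z = NormalDist().inv_cdf(0.975)
normal_ci = (mean - z * se_mean, mean + z * se_mean)

print(f"SE Mean             = {se_mean:.6f}")
print(f"implied coefficient = {implied_coefficient:.3f}")
print(f"normal-based 95% CI = ({normal_ci[0]:.6f}, {normal_ci[1]:.6f})")
```

The implied multiplier comes out slightly above the normal value of 1.96, which is what one would expect if the interval was built from the t-distribution with 167 degrees of freedom.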
Hypothesis Testing - Strategy
i. Take a random sample from the population of
interest and calculate a suitable test statistic.
ii. Investigate how likely the value of that
statistic is if some specified hypothesis is
true.
iii. Make a decision as to whether your
hypothesis is plausible given ii.
Significance.
The key concept is the amount of variation that we
would expect to occur by chance alone when nothing
scientifically interesting was going on.
If we measure bigger differences than we would expect
by chance, we say that the result is statistically
significant.
If we measure no more variation than we might
reasonably expect by chance alone, we say our result is
not statistically significant.
The first step in a hypothesis
test is to state a claim that we
will try to find evidence
against.
“Innocent until proven guilty …”
The hypothesis that the population
parameter is equal to some claimed
value is called the null hypothesis (H0).
The hypothesis that must be true if the
null hypothesis is false is called the
alternative hypothesis (H1 or HA).
Tests Concerning Means
One Sample Test
‘mean cadmium levels’
Is µ = 0.30 ?
H0: µ = 0.30
Various alternative hypotheses:
H1: µ ≠ 0.30 (two-sided)
H1: µ < 0.30 (one-sided)
H1: µ > 0.30 (one-sided)
One-Sample T: Cadmium
Test of mu = 0.3 vs not = 0.3
Variable    N      Mean     StDev   SE Mean                95% CI      T      P
Cadmium   168  0.268690  0.163347  0.012602  (0.243810, 0.293571)  -2.48  0.014
(Large sample with N > 30 → Normality Assumption not necessary)
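The T and P values in the output can be recomputed from the summary statistics. The sketch below is one way to do that using only the Python standard library: it assumes a one-sample t statistic with N - 1 degrees of freedom and integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `two_sided_p` are illustrative and do not come from the slides or from any statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3      # P(0 < T < |t|)
    return 2 * (0.5 - central_area)   # area in both tails

# Values copied from the output above.
n, mean, st_dev, mu_0 = 168, 0.268690, 0.163347, 0.3

se_mean = st_dev / math.sqrt(n)
t_stat = (mean - mu_0) / se_mean          # (sample mean - hypothesized mean) / SE
p_value = two_sided_p(t_stat, df=n - 1)

print(f"T = {t_stat:.2f}, P = {p_value:.3f}")
```

The result should agree with the printed T and P up to rounding. In practice a statistics package would supply the t-distribution directly; the numerical integration is used here only to keep the example self-contained.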
P-value:
• The probability of obtaining a test statistic at least as
extreme as the one observed, if the null hypothesis
were true.
• How likely are data like these if the true mean cadmium
level were equal to 0.3?
• A low p-value means the data would be unlikely if the
null hypothesis were true, which counts as evidence
against it (a p-value < 0.05 is conventionally considered low).
P-value: 0.014
Moderate evidence against H0.
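The question "how likely is this if the true mean were 0.3?" can also be explored by simulation. The sketch below is a rough illustration under strong assumptions that are not in the slides: the population is taken to be normal with mean 0.3 and standard deviation equal to the sample StDev. Because it treats the standard deviation as known, it will not match the t-based p-value exactly.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Values from the cadmium example; the normal population is an assumption.
n, observed_mean, st_dev, mu_0 = 168, 0.268690, 0.163347, 0.3
observed_gap = abs(observed_mean - mu_0)

n_sims = 5000
at_least_as_far = 0
for _ in range(n_sims):
    # Draw one sample of size n from a population in which H0 is true.
    sample = [random.gauss(mu_0, st_dev) for _ in range(n)]
    if abs(statistics.fmean(sample) - mu_0) >= observed_gap:
        at_least_as_far += 1

simulated_p = at_least_as_far / n_sims
print(f"simulated two-sided p-value = {simulated_p:.3f}")
```

The proportion of simulated sample means that land at least as far from 0.3 as the observed mean is a simulation-based counterpart of the p-value: only a small fraction of samples drawn under H0 look as extreme as the real one.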
Comparing CI & Hypothesis tests
• Confidence interval: A fixed level of confidence
is chosen. We determine a range of possible values
for the parameter that are consistent with the data (at
the chosen confidence level).
• Hypothesis test: Only one possible value for the
parameter is tested. We determine the strength of the
evidence provided by the data against the proposition
that the hypothesized value is the true value.
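The two approaches are linked: for a two-sided test and an interval built from the same t procedure, a hypothesized value is rejected at the 5% level exactly when it falls outside the 95% confidence interval. The short check below applies this to the cadmium figures; the variable names are illustrative.

```python
# Values copied from the cadmium output.
ci_lower, ci_upper = 0.243810, 0.293571  # 95% confidence interval
p_value = 0.014                          # two-sided test of mu = 0.3
mu_0 = 0.3
alpha = 0.05                             # matches the 95% confidence level

inside_interval = ci_lower <= mu_0 <= ci_upper
rejected = p_value < alpha

print(f"0.3 inside the 95% CI?  {inside_interval}")
print(f"H0 rejected at 5%?      {rejected}")
```

Here 0.3 lies just above the upper limit of the interval and the p-value is below 0.05, so both views lead to the same conclusion.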
CAUTION!
A non-significant test does
not imply that the null
hypothesis is true.