Methodology Glossary Tier 2

Calculation of Confidence Intervals for Point Estimates and Change

1 Introduction and Assumptions
We carry out sample surveys to make estimates about a population of interest. We can say
how good our estimates are by applying the principles of the central limit theorem,
which allow us to treat our estimate as one draw from the distribution of means that
many theoretical repeats of the survey would produce.
A confidence interval is a measure of uncertainty around the estimate from our
sample survey, telling us the range of values within which the true (population) mean
lies with a given degree of confidence. In other words, the confidence interval shows
us the range within which 95%, 99% or 99.9% of sample means could be expected
to lie if the survey were repeated. This helps us to decide whether a sample mean is
reliable enough for our purposes. A 95% confidence interval is the most commonly
used.
In these examples, we will be focussing on calculating confidence intervals for
percentages or proportions. Initially, we will assume that:
- The confidence intervals calculated are 95%.
- The data are collected through a simple random sample (SRS) and there are no design effects.
- The sample size is large (n > 30), so the approximation to a normal distribution can be used.
- The populations we are dealing with are large enough to be treated as infinite.
Don’t worry if you don’t know what all these assumptions mean at the moment, as
most will be explored later in this paper.
2 Calculating a confidence interval for a single percentage
In order to calculate a confidence interval, we need to know the following
information:
- The point estimate, which is the percentage or proportion estimated from our sample (or the sample mean);
- The standard error. This measures how far our estimate is likely to be from the mean estimate that would be obtained from many (theoretical) surveys. This is calculated as:
s.e. = √((p(1-p))/n)
where:
p = the point estimate
n = the sample size
The confidence interval (CI) is calculated as:
CI = p ± (1.96 * s.e.)
The value 1.96 specifies that this is a 95% Confidence Interval. To calculate other
levels of confidence, please see section 6 below.
EXAMPLE 1
A survey of 205 adults estimates that 43% have caring responsibilities.
So,
s.e. = √((0.43*0.57)/205) = 0.03458
CI = 0.43 ± (1.96*s.e.) = 0.43 ± 0.07
Therefore the 95% confidence interval for this estimate is (36%, 50%), i.e. we are
95% confident that the true percentage of adults with caring responsibilities
is between 36% and 50%.
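As a sketch, the calculation in Example 1 can be reproduced in a few lines of Python (the function name is illustrative):

```python
import math

def proportion_ci(p, n, z=1.96):
    """Confidence interval for a proportion from a simple random sample."""
    se = math.sqrt(p * (1 - p) / n)   # standard error
    return p - z * se, p + z * se     # (lower, upper)

# Example 1: 43% of 205 adults report caring responsibilities
lower, upper = proportion_ci(0.43, 205)
print(f"95% CI: ({lower:.0%}, {upper:.0%})")  # (36%, 50%)
```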
3 Effect of changes in sample size
Increasing the sample size will reduce confidence intervals and result in more
precise estimates, but it must be remembered that there are diminishing returns;
after a certain size of sample, further increases are not really worthwhile.
This is illustrated in the graph below.
[Figure: 95% confidence intervals on an estimate of 50% for different sample sizes. Y-axis: 95% CI (+/- p.p.), from 0 to 10; x-axis: sample size, from 100 to 2,000.]
So, increasing the sample size from 100 to 500 reduces the CIs from ±9.8 p.p. to ±4.4 p.p.,
whereas increasing the sample size further to 1000 only reduces the CIs to ±3.1 p.p.
Even when we examine doubling sample size, there is a clear tail-off in the benefits
gained, as shown in the table below.
Sample Size    95% CI on 50% Estimate
100            9.8
200            6.9
400            4.9
800            3.5
1600           2.5
3200           1.7
6400           1.2
12800          0.9
25600          0.6
51200          0.4
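Each doubling of the sample size divides the interval by only a factor of √2, which is the tail-off shown above. A minimal sketch, under the same SRS assumptions:

```python
import math

def ci_half_width(n, p=0.5, z=1.96):
    """95% CI half-width, in percentage points, for an SRS estimate."""
    return z * math.sqrt(p * (1 - p) / n) * 100

# Doubling the sample shrinks the interval by a factor of 1/sqrt(2)
for n in [100, 200, 400, 800]:
    print(f"{n}: +/- {ci_half_width(n):.1f} p.p.")
```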
Sample sizes for many Government surveys need to be large to allow a minimum
sample size for the smaller local authorities and other sub-groups of interest: the
large sample is required not for the Scotland-level estimates but for the
disaggregation. This is also the case when a sub-group of interest cannot
be identified on the sampling frame: it is then not possible to stratify by it, and
it cannot be over-sampled at the time of drawing the sample. A large overall
sample may therefore be required to obtain a sufficient number in a small sub-group.
4 Using confidence intervals to assess statistical significance and levels of change required
An important use of confidence intervals in Government is to make a statement
about whether there is a significant difference between two numbers. The question
can be posed in two slightly different ways:
 Looking backward: Has there been a significant increase between the last
two data points/is there a significant difference between two sub-groups?
 Looking forward: What level of change would be required between two data
points/ subgroups to say there had been a significant change?
Looking Backward
Quite often in Government, the information we receive is simply the estimate and the
relevant CIs. We need to be able to use this information as a rule of thumb to assess
whether we think there has been a significant change. If the intervals do not overlap
then there is a significant difference between two points, but if they do overlap it
does not necessarily mean there is no significant difference. This is best illustrated
with an example.
EXAMPLE 2
Imagine this was the information we had been given:

Year    Point Estimate    CI
2006    a%                ± 3 p.p.
2007    b%                ± 4 p.p.

p.p. = percentage point
Policy colleagues wish to know whether there had been a real change between 2006
and 2007.
Firstly, we can see if the following statement is true:
1. The difference between a and b is greater than or equal to the sum of the magnitudes of the intervals (in this case 7).
If this is the case, there is a significant difference between the two points.
So, for example, if a = 34% and b = 42%, then
b – a = 8 p.p.
This is greater than 7, so we conclude that there is a significant difference
between these points.
If the first statement is not true, we can then look to see if the second statement is
true:
2. The difference between a and b is smaller than the larger of the two intervals (in this case 4).
If this is the case, there is not a significant difference between the two points.
So, for example, if a = 34% and b = 37%, then
b – a = 3 p.p.
This is less than 7, so the first statement is not true.
This is less than 4, which means the second statement is true, and so we conclude
that there is not a significant difference between these points.
If neither statement 1 nor 2 is true, we can then see if the third statement is true:
3. The difference between a and b is greater than or equal to the square root of the sum of the squares of the two magnitudes (in this case √(3² + 4²) = √(9 + 16) = 5).
If this is the case, there is a significant difference between the two points.
So, for example, if a = 34% and b = 40%, then
b – a = 6 p.p.
This is less than 7, so the first statement is not true.
This is more than 4, so the second statement is not true.
This is greater than 5, so we conclude that there is a significant difference
between these points.
If none of these statements is true, then there is not a significant difference
between the points.
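The three statements can be collected into a single rule-of-thumb check. This is a sketch, not a formal hypothesis test, and the function name is illustrative:

```python
import math

def rough_significance(a, b, ci_a, ci_b):
    """Rule-of-thumb significance check for two estimates with CI half-widths.

    Applies the three statements in order: significant if the gap is at
    least the sum of the intervals; not significant if it is smaller than
    the larger interval; otherwise decided by comparing the gap with
    sqrt(ci_a**2 + ci_b**2).
    """
    diff = abs(b - a)
    if diff >= ci_a + ci_b:                  # statement 1
        return True
    if diff < max(ci_a, ci_b):               # statement 2
        return False
    return diff >= math.hypot(ci_a, ci_b)    # statement 3

print(rough_significance(34, 42, 3, 4))  # True  (statement 1)
print(rough_significance(34, 37, 3, 4))  # False (statement 2)
print(rough_significance(34, 40, 3, 4))  # True  (statement 3)
```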
EXAMPLE 3
The Scottish Household Survey collects information about smoking levels in adults.
Suppose we wish to know whether there has been a change in smoking levels
between 1999 and 2007.
Year    Smoking Level    95% CI
1999    30.4%            +/- 0.9 p.p.
2007    24.7%            +/- 1.0 p.p.
Is the difference between 1999 and 2007 greater than the sum of the
confidence intervals?
In this case, the difference is 5.7 percentage points.
The sum of the confidence intervals is 1.9 percentage points.
The answer to our question is Yes, so we conclude that there is a significant
difference between 1999 and 2007.
Suppose instead we want to know whether smoking levels in women and men are
different.
          Smoking Level    95% CI
Male      26.0%            +/- 1.4 p.p.
Female    24.0%            +/- 1.3 p.p.
Is the difference between men and women bigger than the sum of the
confidence intervals?
The difference is 2.0 percentage points.
The sum of the confidence intervals is 2.7 percentage points.
The answer to our question is No.
Is the difference between men and women smaller than the largest confidence
interval?
The difference is 2.0 percentage points.
The largest confidence interval is 1.4 percentage points.
The answer to our question is No.
So, we ask our final question:
Is the difference between men and women greater than or equal to the square
root of the sum of the squares of the two confidence intervals?
The difference is 2.0 percentage points.
We need to calculate:
√(1.4² + 1.3²) = 1.9
As 2.0 is larger than 1.9, the answer to our question is Yes. We therefore conclude
that there is a significant difference between the smoking rates of men and women.
Word of caution – multiple tests
When carrying out any kind of hypothesis testing, as we are doing with the
confidence intervals in these examples, we need to be aware of the probabilities of
false positives and false negatives. These are known as Type I (accepting that there
is a difference when in fact there is none, or false positive) and Type II (accepting
there is no difference when in fact there is, or false negative) errors.
The chances of finding a false positive are inflated greatly the more tests you do.
Intuitively, this makes sense: if you do 20 different comparisons at the 5% level, we
would expect, on average, one of them to be a Type I error. So, be wary of (e.g.)
comparing several different years in the same time series. Really, the test that
should be applied here is an ANOVA (as you will usually be testing whether any one
year is different from the others), or if you do want to know whether all the years are
different from each other, then a Bonferroni adjustment must be applied.
A Bonferroni adjustment is a simple method to ensure that a test is still significant to
the level we originally intended i.e. a way to keep the chances of a Type I error for
the comparisons as a whole at the 5% level.
EXAMPLE 4
Suppose 4 samples are to be compared, and the maximum overall probability of a
Type I error (or α) we would like is 0.05.
Our 4 samples are in different years:
2003
2004
2005
2006
The total number of comparisons here is 6.
Given our overall α, we need to calculate the α level for each of the 6 comparisons to
ensure that the overall α is 0.05. The Bonferroni adjustment divides the overall α
by the number of comparisons, so in this case:
α for each comparison = α/6 = 0.05/6 = 0.0083.
Therefore the CIs we would use to make these 6 comparisons, to ensure an overall
Type I error rate of 0.05, are 99.17% confidence intervals.
Or, roughly, to make 6 comparisons which keep the overall error rate at 0.05, 99%
CIs have to be used for each of the 6 comparisons.
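Example 4 can be sketched in Python; `NormalDist` from the standard library gives the z-score that corresponds to the adjusted confidence level:

```python
from statistics import NormalDist

# Example 4 as a sketch: 4 samples, all pairwise comparisons,
# overall Type I error rate (alpha) held at 0.05
samples = 4
comparisons = samples * (samples - 1) // 2      # 6 pairwise comparisons
alpha = 0.05 / comparisons                      # Bonferroni-adjusted alpha
z = NormalDist().inv_cdf(1 - alpha / 2)         # z-score for the adjusted CIs

print(f"alpha per comparison = {alpha:.4f}")    # 0.0083
print(f"confidence level = {1 - alpha:.2%}")    # 99.17%
print(f"z-score = {z:.2f}")
```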
Looking Forward
This is a reasonably simple figure to work out if the data are to be collected in the
same way (i.e. with the same design and sample size) and with the same sort of
assumptions as they were before.
If this is the case, and we have a point estimate plus the CI, we can answer the
question: How much change in the next period would be considered a statistically
significant change? This is best illustrated with an example. This will not give an
absolutely precise result, but it is usually good enough for our purposes (i.e. the
result will be the same to 1 decimal place).
EXAMPLE 5
Basically, we use the rules given in the previous section. Say we have an
estimate of 50% from a sample size of 600. The CI for this estimate is therefore
± 4 p.p.
To work out the change required we simply take the square root of the sum of the
squares, assuming that the next time point will have a similar design:
Change required = √(4² + 4²) = √(16 + 16) = 4√2 ≈ 5.7 p.p.
Therefore we simply have to multiply the CI by √2 to get the change required.
However, if we know that the design or sample size is likely to change in the next
year, it is slightly more difficult to answer. In that case, we can recalculate what the
CI would have been for our current data point under these other assumptions, and
use this, along with our real CI, to calculate the level of change required. Clearly this
requires us to have information about sample sizes and the design of the data.
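The "looking forward" rule reduces to one line; this sketch (illustrative name) also handles the case where a recalculated CI for the next time point differs from the current one:

```python
import math

def change_required(ci_now, ci_next=None):
    """Change needed between two points to count as significant."""
    if ci_next is None:
        ci_next = ci_now                 # same design and sample size assumed
    return math.hypot(ci_now, ci_next)   # square root of the sum of squares

# Example 5: CI of +/- 4 p.p. at both time points
print(round(change_required(4), 1))      # 5.7, i.e. 4 * sqrt(2)
```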
5 Finite Population Correction
The size of the population we are interested in does not normally affect confidence
intervals: a sample of 10,000 households in Scotland (representing a population of
about 2.5 million) will have the same confidence intervals as a sample of 10,000
households in England. For many calculations, especially if we are using a dataset
which represents households or adults in Scotland, the sample is such a small
proportion of the population that the population can be treated as if it were infinite.
It is the absolute size of the sample that is important.
Only when the sample represents over 5% of the population does the assumption of
an infinite population need to be dropped. In this case, a finite population correction
needs to be applied. This is included in the calculation of the standard error, and will
reduce s.e.s by the appropriate amount, depending on how large your sample is in
relation to the population. The calculation is shown in the equation below.
F = √((N-n)/(N-1))
where
F= Finite Population correction
N=Population size
n= sample size
This is then incorporated into the equation below to calculate the corrected standard
error:
s.e. = F * √((p(1-p))/n)
You will see from the equation that F tends towards 1 as the sample becomes a
smaller fraction of the population, which is why we do not have to worry about this
when populations are large but sample sizes are small in relation to N.
EXAMPLE 6
Suppose there is a need to survey a sample of Scottish Government employees
about their method of travel to work. There are 5,000 SG employees, and suppose a
simple random sample of 2,000 is drawn.
We discover that 40% of respondents travel to work by bus. If we were assuming an
infinite population, the 95% confidence interval around the estimate would be +/- 2.1
percentage points.
However, the finite population correction is:
F = √((5000-2000)/4999) = √(3000/4999) = √0.6 = 0.77
Therefore, the 95% confidence interval is +/- (0.77 * 2.1) ≈ +/- 1.7 p.p.
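Example 6 can be recomputed from first principles (illustrative names; the correction is applied as a multiplier on the SRS standard error):

```python
import math

def fpc_ci(p, n, N, z=1.96):
    """CI half-width with the finite population correction applied."""
    F = math.sqrt((N - n) / (N - 1))     # finite population correction
    se = math.sqrt(p * (1 - p) / n)      # SRS standard error
    return z * F * se

# Example 6: 40% of a sample of 2,000 from 5,000 employees travel by bus
half_width = fpc_ci(0.4, 2000, 5000)
print(f"+/- {half_width * 100:.1f} p.p.")  # +/- 1.7 p.p.
```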
6 Calculating confidence intervals which are not 95%
We have been concentrating on calculating 95% confidence intervals, as this is the
convention most commonly used. The z-score 1.96 is what specifies that this is a
95% confidence interval.
Another z-score can be used to specify another level of confidence. Some z-scores
are shown below.
Confidence Level    Z-Score
80%                 1.28
90%                 1.65
95%                 1.96
99%                 2.58
99.9%               3.29
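These z-scores come from the inverse CDF of the standard normal distribution. Python's standard library can reproduce the table (to three decimal places, so 90% appears as 1.645):

```python
from statistics import NormalDist

def z_score(confidence):
    """Two-sided z-score for a given confidence level (e.g. 0.95)."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in [0.80, 0.90, 0.95, 0.99, 0.999]:
    print(f"{level:.1%}: z = {z_score(level):.3f}")
```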
7 Calculating confidence intervals when the data are not proportions
This paper has concentrated on calculating confidence intervals for percentages
/proportions, but clearly this can be generalised to incorporate the variance measured
from any data collection. s² is used to represent the estimated variance here, and
this can be incorporated into the equation for the s.e. as shown below:
s.e. = √(s²/n)
In the case of a proportion, s² = p(1-p).
8 In the real world – how to incorporate survey design into confidence intervals
We have assumed in this paper that the data we have come from a Simple Random
Sample (SRS). However, in practice, sampling techniques like stratification,
systematic random sampling and clustering are commonly used to either improve the
spread of the sample or to save money.
Features of the design such as stratification and clustering can affect the standard
errors that are used to calculate confidence intervals. These are known as complex
standard errors, and are calculated through a number of iterative techniques. These
can then be expressed as a design factor, which is a multiplier which states by how
much the error is increased or decreased compared to a SRS design, given the
design you have used.
Please note that there is a different design factor for every level of every variable.
What is usually given from surveys is an average design factor to use in calculation
of confidence intervals, but remember that by definition it will either over or
underestimate the effect that the design has on your particular estimate. So be
careful to consider what you are estimating and how the design could impact on that.
For example, the Scottish Household Survey had an average design factor of 1.2,
but the design factor for accommodation type (which is more likely to be clustered in
geographical areas, and therefore more affected by the clustering in the sample) will
be larger.
The design factor is more useful for adjusting standard errors. But the design effect
is the square of the design factor, and tells you how much information you have
gained or lost by using a complex survey rather than a simple random sample. A
design effect of 2 means that you would need a survey twice the size of a simple
random sample to get the same amount of information, whereas a design effect of
0.5 means that a complex survey only half the size of a simple random sample
would achieve the same precision. Design effects of 2 are quite common, but those
of 0.5 are rare.
The design factor is incorporated into the CI equation as shown below:
CI = p +/- D*(1.96 * s.e.)
Where
D = Design Factor
If you would like to know more details about design effects and other issues related
to analysing complex survey data, then visit the Practical Exemplars of Analysis of
Surveys (PEAS) website, a useful resource created by Napier University:
http://www2.napier.ac.uk/depts/fhls/peas/index.htm
9 Summary Equation
The general equation for a confidence interval is:
P ± Z * F * D * √(s²/n)
Where:
P = the sample mean estimate. In many cases, from national government sources,
this will be the proportion of people with a certain characteristic.
Z = the z-score that we use, assuming normality, to specify the level of confidence.
For 95%, this will be 1.96.
F = the finite population correction. We don't have to worry about this in general until
our sample is about 5% of the population or more.
D = a design factor, which takes account of the clustering, stratification etc.
s² = the estimated variance from the sample.
n = the sample size.
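Under the assumptions set out above, the summary equation can be sketched as one small function (the names are illustrative, and the variance default follows section 7):

```python
import math

def confidence_interval(p, n, s2=None, z=1.96, N=None, D=1.0):
    """Sketch of the general equation: P +/- Z * F * D * sqrt(s2/n).

    s2 defaults to p*(1-p), the variance of a proportion; supplying the
    population size N applies the finite population correction; D is the
    design factor (1.0 for a simple random sample).
    """
    if s2 is None:
        s2 = p * (1 - p)
    F = 1.0 if N is None else math.sqrt((N - n) / (N - 1))
    half_width = z * F * D * math.sqrt(s2 / n)
    return p - half_width, p + half_width

# Example 1 revisited: SRS, effectively infinite population, no design effect
lower, upper = confidence_interval(0.43, 205)
print(f"({lower:.0%}, {upper:.0%})")  # (36%, 50%)
```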