CHAPTER EIGHT
Confidence Intervals, Effect Size,
and Statistical Power
NOTE TO INSTRUCTORS
Students have a tendency to think that if something is statistically significant, the story is over and that’s all a person needs to know. In other words, they frequently confuse “statistically significant” with “meaningful.” This chapter will help students recognize that this is not always the case. In addition to using the discussion questions and classroom exercises, present examples of studies that demonstrate a significant but not very meaningful difference between groups. It is also important to break students of the habit of using phrases such as “very significant”; although students might be tempted to describe an effect that way, emphasize that effect sizes are the appropriate way to convey the magnitude of an effect.
OUTLINE OF RESOURCES
I. Confidence Intervals
 • Discussion Question 8-1
 • Classroom Activity 8-1: Understanding Confidence Intervals
 • Discussion Question 8-2
II. Effect Size and prep
 • Discussion Question 8-3
 • Discussion Question 8-4
III. Next Steps: prep
IV. Statistical Power
 • Discussion Question 8-5
 • Discussion Question 8-6
 • Classroom Activity 8-2: Working with Confidence Intervals and Effect Size
V. Next Steps: Meta-Analysis
 • Discussion Question 8-7
 • Classroom Activity 8-3: Analyzing Meta-Analyses
 • Additional Reading
 • Online Resources
VI. Handouts
 • Handout 8-1: Classroom Activity: Working with Confidence Intervals and Effect Size
 • Handout 8-2: Analyzing Meta-Analyses
CHAPTER GUIDE
I. Confidence Intervals
1. A point estimate is a summary statistic from a
sample that is just one number as an estimate
of the population parameter.
2. Instead of using a point estimate, it is wiser
to use an interval estimate, which is based on
a sample statistic and provides a range of
plausible values for the population parameter.
> Discussion Question 8-1
What is the difference between a point estimate and an
interval estimate?
Your students’ answers should include:
 • A point estimate is a summary statistic from a sample that is just one number as an estimate of the population parameter. Point estimates are useful for gauging the central tendency, but by themselves they can be misleading.
 • An interval estimate is based on a sample statistic and provides a range of plausible values for the population parameter. Interval estimates are frequently used in media reports, particularly when reporting political polls.
Classroom Activity 8-1
Understanding Confidence Intervals
The following Web site provides a nice applet to help your students understand confidence intervals: http://www.ruf.rice.edu/~lane/stat_sim/conf_interval/index.html
The applet simulates a known population mean and standard deviation and allows you to control the sample size, providing a graphical display of the resulting confidence intervals.
3. A confidence interval is an interval estimate, based on the sample statistic, that we would expect to include the population mean a certain percentage of the time were we to sample from the same population repeatedly.
4. With a confidence interval, we expect to find
a mean in this interval 95% of the time that we
conduct the same study (if our confidence level
is 95%).
5. To calculate a confidence interval with a z
test, we first draw a normal curve that has the
sample mean in the center.
6. We then indicate the bounds of the confidence interval on either end and write the percentages under each segment of the curve.
7. Next, we look up the z statistics for the
lower and upper ends of the confidence interval
in the z table.
8. We then convert the z statistics to raw means for the lower and upper ends of the confidence interval. To do so, we first calculate the standard error as our measure of spread using the formula σM = σ/√N. Then, with this standard error and the sample mean, we can calculate the raw means at the lower and upper ends of the confidence interval. For the lower end, we use the formula MLower = –z(σM) + MSample; for the upper end, we use the formula MUpper = z(σM) + MSample.
9. Lastly, we should check our answer to ensure
that each end of the confidence interval is
exactly the same distance from the sample mean.
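For instructors who want to verify these steps numerically, here is a minimal Python sketch of the calculation (the population parameters, sample mean, and sample size shown are hypothetical, chosen only for illustration):

# 95% confidence interval around a sample mean, using the z distribution.
from scipy.stats import norm

mu, sigma = 100, 15        # population parameters (hypothetical)
m_sample, n = 105, 30      # sample mean and sample size (hypothetical)

std_error = sigma / n ** 0.5      # sigma_M = sigma / sqrt(N)
z_crit = norm.ppf(0.975)          # z for a 95% interval, about 1.96

m_lower = -z_crit * std_error + m_sample
m_upper = z_crit * std_error + m_sample
print(f"95% CI: [{m_lower:.2f}, {m_upper:.2f}]")   # symmetric around the sample mean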
> Discussion Question 8-2
How would you calculate a confidence interval with a z
test?
Your students’ answers should include:
To calculate a confidence interval with a z test:
 • Draw a normal curve with the sample mean in the center.
 • Indicate the bounds of the confidence interval on either end and write the percentages under each segment of the curve.
 • Look up the z statistics for the lower and upper ends of the confidence interval in the z table.
 • Convert the z statistics to raw means for the lower and upper ends of the confidence interval. For the lower end, use the formula MLower = –z(σM) + MSample; for the upper end, use the formula MUpper = z(σM) + MSample.
 • Lastly, check the answer to ensure that each end of the confidence interval is exactly the same distance from the sample mean.
II. Effect Size and prep
1. Increasing the sample size can lead to an increased test statistic during hypothesis testing. In other words, it becomes progressively easier to declare statistical significance as we increase the sample size.
2. An effect size indicates the size of a difference and is unaffected by sample size.
3. Effect size tells us how much two populations do not overlap. Two populations overlap less if either their means are farther apart or the variation within each population is smaller.
> Discussion Question 8-3
What is an effect size, and why would reporting it be
useful?
Your students’ answers should include:
 • An effect size is a measure of the degree to which groups differ in the population on the dependent variable.
 • It is useful to report the effect size because it provides a standardized value of the degree to which two populations do not overlap and addresses the relative importance and generalizability of your sample statistics.
4. Cohen’s d is a measure of effect size that assesses the difference between two means in terms of standard deviation, not standard error.
5. The formula for Cohen’s d for a z distribution is: d = (M – μ)/σ.
6. A d of .2 is considered a small effect size, a d of .5 is considered a medium effect size, and a d of .8 is considered a large effect size.
7. The sign of the effect size does not matter; only its magnitude is interpreted.
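A quick way to demonstrate this formula in class is the following Python sketch (the sample mean and population parameters are hypothetical):

# Cohen's d for a z test: d = (M - mu) / sigma
mu, sigma = 100, 15    # population mean and standard deviation (hypothetical)
m_sample = 105         # sample mean (hypothetical)

d = (m_sample - mu) / sigma
print(f"Cohen's d = {d:.2f}")    # 0.33, between the small and medium benchmarks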
> Discussion Question 8-4
Imagine you obtain an effect size of –0.3. How would you
interpret this number?
Your students’ answers should include:
 • If you obtained an effect size of –0.3, you would interpret it as a small effect; the negative sign indicates only the direction of the difference, because the sign of an effect size does not affect its magnitude.
III. Next Steps: prep
1. Another method we can use in hypothesis testing is prep, the probability of replicating an effect given a particular population and sample size. It is interpreted as “This effect will replicate 100(prep)% of the time.”
2. To calculate prep, we first calculate the specific p value associated with our test statistic.
3. Next, using Excel, we enter into one cell the formula =NORMSDIST(NORMSINV(1-p)/SQRT(2)), substituting the actual p value for p.
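The same value can be computed outside Excel; for example, this Python sketch uses the standard normal distribution (the p value shown is hypothetical):

# prep = Phi( Phi^-1(1 - p) / sqrt(2) ): probability of replicating the effect.
from math import sqrt
from scipy.stats import norm

p = 0.03                                    # one-tailed p value (hypothetical)
p_rep = norm.cdf(norm.ppf(1 - p) / sqrt(2))
print(f"prep = {p_rep:.3f}")                # about .91 for p = .03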
IV. Statistical Power
1. Statistical power is a measure of our ability
to reject the null hypothesis given that the
null hypothesis is false. In other words, it is
the probability that we will not make a Type II
error, or the probability that we will reject
the null hypothesis when we should reject the
null hypothesis.
2. Our calculation of statistical power ranges from a probability of 0.00 to 1.00. Historically, statisticians have used a probability of .80 as the minimum for conducting a study.
3. There are three steps to calculating statistical power. In the first step, we determine the information needed to calculate statistical power, including the population mean, the population standard deviation, the hypothesized mean for the sample, the sample size, and the standard error based on this sample size.
4. In step two, we calculate the critical value in terms of the z distribution and in terms of the raw mean so that statistical power can be calculated.
5. In step three we calculate the statistical
power or the percentage of the distribution of
means for population 2 (the distribution
centered on the hypothesized sample mean) that
falls above the critical value.
> Discussion Question 8-5
What is statistical power, and how would you calculate it?
Your students’ answers should include:
 • Statistical power is the probability of rejecting the null hypothesis when it is false.
 • You calculate statistical power in three steps. First, determine the characteristics of the two populations. Next, calculate the raw-mean value that serves as the cutoff. Finally, determine the percentage of the distribution of means for population 2 that falls above that cutoff.
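A sketch of the three-step calculation for a one-tailed z test follows (all numbers are hypothetical, and a one-tailed alpha of .05 is assumed):

# Statistical power for a one-tailed z test (hypothetical values).
from scipy.stats import norm

mu, sigma = 100, 15     # population 1 parameters (null hypothesis)
m_hyp, n = 106, 50      # hypothesized sample mean and sample size
alpha = 0.05            # assumed one-tailed alpha

# Step 1: standard error based on this sample size
std_error = sigma / n ** 0.5

# Step 2: critical value in z terms, then in raw-mean terms
z_crit = norm.ppf(1 - alpha)            # about 1.645
m_crit = mu + z_crit * std_error

# Step 3: proportion of population 2's distribution of means
# (centered on the hypothesized mean) that falls above the cutoff
power = 1 - norm.cdf((m_crit - m_hyp) / std_error)
print(f"Power = {power:.2f}")   # increasing N or alpha raises this value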
6. There are five ways that we can increase the power of a statistical test. First, we can increase alpha. Second, we could turn a two-tailed hypothesis into a one-tailed hypothesis. Third, we could increase N. Fourth, we could exaggerate the levels of the independent variable. Lastly, we could decrease the standard deviation.
> Discussion Question 8-6
What are ways that you could increase statistical power?
Your students’ answers should include:
Five ways that you could increase statistical power are:
 • Adopt a more lenient alpha level.
 • Use a one-tailed test in place of a two-tailed test.
 • Increase the size of the sample.
 • Exaggerate the levels of the independent variable.
 • Decrease the standard deviation.
Classroom Activity 8-2
Working with Confidence Intervals and Effect Size
For this activity, you will need to have the class
take a sample IQ test. You can find many examples
of abbreviated IQ tests online (www.iqtest.com is
one such site). Have students anonymously submit
their scores and compare the class data to data
for the general population (population mean = 100,
population standard deviation = 15). Using these
data:
 • Have students calculate the confidence interval for the analysis.
 • Have students calculate the effect size.
Use Handout 8-1, found at the end of this chapter,
to complete the activity.
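If you would like to check students’ answers quickly, the following Python sketch computes both quantities from a list of class scores (the scores shown are invented; substitute your class data):

# Confidence interval and Cohen's d for class IQ scores vs. mu = 100, sigma = 15.
from scipy.stats import norm

scores = [98, 112, 105, 101, 118, 95, 108, 103]   # hypothetical class data
mu, sigma = 100, 15

n = len(scores)
m_sample = sum(scores) / n
std_error = sigma / n ** 0.5
z_crit = norm.ppf(0.975)

ci_lower = m_sample - z_crit * std_error
ci_upper = m_sample + z_crit * std_error
d = (m_sample - mu) / sigma
print(f"M = {m_sample:.1f}, 95% CI = [{ci_lower:.1f}, {ci_upper:.1f}], d = {d:.2f}")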
V. Next Steps: Meta-Analysis
1. A meta-analysis is a study that involves the
calculation of a mean effect size from the
individual effect sizes of many studies.
2. A meta-analysis can provide added statistical power by considering many studies at once. In addition, a meta-analysis can help to resolve debates fueled by contradictory research findings.
> Discussion Question 8-7
What is a meta-analysis and why is it useful?
Your students’ answers should include:
 • A meta-analysis is a study that involves the calculation of a mean effect size from the individual effect sizes of many studies.
 • It is useful because it considers many studies at once and helps to resolve debates fueled by contradictory research findings.
3. The first step in a meta-analysis is to choose
the topic and make a list of criteria for which
studies will be included.
4. Our next step is to gather every study that
can be found on a given topic and calculate an
effect size for every study that was found.
5. Lastly, we calculate statistics: ideally, summary statistics, a hypothesis test, a confidence interval, and a visual display of the effect sizes.
6. A file-drawer analysis is a statistical calculation, following a meta-analysis, of the number of studies with null results that would have to exist for the mean effect size to no longer be statistically significant.
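To give students a concrete sense of step 5, the sketch below averages a set of effect sizes and places a confidence interval around the mean (the effect sizes are invented, and the simple unweighted approach shown here ignores the study weighting that many published meta-analyses use):

# Unweighted mean effect size and 95% CI across several studies.
from statistics import mean, stdev
from scipy.stats import t

effects = [0.41, 0.25, 0.60, 0.12, 0.35, 0.48]   # hypothetical Cohen's d values
k = len(effects)

m_d = mean(effects)
se = stdev(effects) / k ** 0.5                   # standard error of the mean d
t_crit = t.ppf(0.975, df=k - 1)
print(f"Mean d = {m_d:.2f}, "
      f"95% CI = [{m_d - t_crit * se:.2f}, {m_d + t_crit * se:.2f}]")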
Classroom Activity 8-3
Analyzing Meta-Analyses
Directions: In this activity, have students find a
meta-analysis within the psychological literature.
You may want to point them in the right direction
by suggesting journals that will typically publish
meta-analyses such as Psychological Bulletin,
Personality and Social Psychology Review, or
Journal of Applied Psychology. Once students have
found their meta-analysis, they should answer the
questions in Handout 8-2.
Additional Readings
Cohen, J. (1988). Statistical Power Analysis for
the Behavioral Sciences. Hillsdale, NJ: Lawrence
Erlbaum.
This is arguably the definitive source for
power analysis. Many of the procedural guidelines
for determining power that are useful in many
types of research design are clearly laid out in
this text.
Neyman, J. (1937). Outline of a theory of
statistical estimation based on the classical
theory of probability. Philosophical Transactions
of the Royal Society of London. Series A, 236,
333–380.
This is considered the seminal paper for
confidence intervals.
Rosenthal, R. (1994). Parametric measures of
effect size. In Cooper, H., and Hedges, L. V.
(Eds.), The handbook of research synthesis (pp.
231–244). New York: Russell Sage Foundation.
A very readable account of many of the techniques for calculating effect sizes. The chapter also includes a lot of background information about these techniques and how to interpret them.
Online Resources
This is an excellent Web site with numerous
statistical demonstrations that you can run in
your classroom to help explain the concepts
concretely: http://onlinestatbook.com/. Here you
will find demonstrations of effect size, goodness
of fit, and power.
Math World is an excellent and extensive resource site, providing background information and succinct explanations for all of the statistical concepts covered in the textbook and beyond: http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html
PLEASE NOTE: Due to formatting, the Handouts are only available in Adobe
PDF®.