PPT 07

Chapter 7
Statistical Issues in Research Planning and
Evaluation
Research Methods in Physical Activity
To plan your own study or evaluate a study by
someone else, you need to understand these concepts
and their interrelationships: probability, alpha,
power, sample size, and effect size.
Probability — The odds that a certain event will
occur. A concept of probability related to statistics is
called equally likely events.
• equally likely events — A concept of probability
in which the chances of one event occurring are the
same as the chances of another event occurring.
For example, if you roll a die, the chances of the numbers from 1 to 6
occurring are equally likely.
Another pertinent approach to probability involves relative frequency.
• relative frequency - A concept of probability concerning the comparative
likelihood of two or more events occurring.
For example, suppose that you toss a coin 100 times. You would expect
heads 50 times and tails 50 times; the probability of either result is one-half, or .50.
When you toss, however, you may get heads 48 times, or .48. This is the relative
frequency. You might perform 100 tosses 10 times and never get .50, but the
relative frequency would be distributed closely around .50, and you would still
assume the probability as .50.
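The coin-toss example can be tried out with a short simulation. This is a minimal sketch, assuming a fair coin modeled with Python's pseudo-random generator; the function name and the seed are illustrative choices, not part of the chapter.

```python
import random

def relative_frequency_of_heads(n_tosses, rng):
    """Toss a simulated fair coin n_tosses times; return the proportion of heads."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

rng = random.Random(7)  # fixed seed (arbitrary) so the run is repeatable

# Perform 100 tosses 10 times, as in the example above.
frequencies = [relative_frequency_of_heads(100, rng) for _ in range(10)]
average_frequency = sum(frequencies) / len(frequencies)

print(frequencies)        # individual relative frequencies scatter around .50
print(average_frequency)  # their average should sit close to .50
```

Individual runs rarely land exactly on .50, but the values cluster around it, which is why the probability is still taken to be .50.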
Probability in statistical tests
In a statistical test, you sample from a population of participants and
events. You use probability statements to describe the confidence that
you place in the statistical findings.
Frequently, you encounter a statistical test followed by a probability
statement such as p < .05. The interpretation is that a difference or
relationship of this size would be expected fewer than 5 times in 100
as a result of chance alone.
alpha
(level of significance)
In research, the test statistic is compared with a probability table for that
statistic, which tells you what the chance occurrence is. In behavioral
research, alpha (probability of chance occurrence) is frequently set at .05 or
.01 (the odds of obtaining findings this extreme by chance alone, when the
null hypothesis is true, are 5 in 100 or 1 in 100).
• These values are used to control for a type I error.
Error Types (Type I and Type II Errors)
In a study, the experimenter may make two types of error. A type I
error is to reject the null hypothesis when the null hypothesis is true.
For example, a researcher concludes that there is a difference between two
methods of training, but there really is not.
A type II error is not to reject the null hypothesis when the null
hypothesis is false. For example, a researcher concludes that there is no
difference between the two training methods, but there really is a difference.
Figure 7.1 (p. 116) is called a truth table, which displays type I and type II
errors.
You control for type I errors by setting alpha. For example, if
alpha is set at .05, then if 100 experiments are conducted, a true
null hypothesis of no difference or no relationship would be
rejected on about 5 occasions, on average. To some extent the issue is this: If you
had to make an error, which type of error would you be willing to make? The
level of alpha reflects the type of error that you are willing to make.
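The claim that alpha fixes the type I error rate can be checked by simulation. The sketch below is illustrative only: it assumes two groups drawn from the same normal population (so the null hypothesis is true), a two-tailed z test with a known standard deviation of 1, and arbitrary choices of 20 participants per group and 2,000 simulated experiments.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n_per_group, n_experiments, seed=1):
    """Proportion of simulated experiments that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    se = (2 / n_per_group) ** 0.5  # SE of the mean difference when sigma = 1
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the same population: no real difference exists.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(group_a) / n_per_group - sum(group_b) / n_per_group
        if abs(diff / se) > z_crit:
            rejections += 1  # a type I error
    return rejections / n_experiments

rate = type_i_error_rate(alpha=0.05, n_per_group=20, n_experiments=2000)
print(rate)  # expected to land near alpha
```

The observed rejection rate varies from run to run but should stay close to the chosen alpha.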
Acceptable variations in reporting alpha in research
Even when experimenters set alpha at a specific level (e.g., .05) before the
research, they often report the probability of a chance occurrence for the
specific effects of the study at the level it occurred (e.g., p = .012). This
procedure is appropriate (and recommended), because the researchers are
only demonstrating to what degree the level of probability exceeded the
specified level.
beta
Beta is the magnitude of the type II error.
Although the magnitude of type I error is specified by alpha, you may also
make a type II error, the magnitude of which is determined by beta (β).
See Figure 7.2 (p. 118), which shows the overlap of the score distributions on
the dependent variable for x (the sampling distribution if the null hypothesis is
true) and y (the sampling distribution if the null hypothesis is false).
Beta
By specifying alpha, you indicate that the mean of y (given a certain
distribution) must be at a specified distance from the mean of x before the
null hypothesis is rejected. But if the mean of y falls anywhere between the
mean of x and the specified y, you could be making a type II error (beta);
that is, you do not reject the null hypothesis when, in fact, there is a true
difference.
There is a relationship between alpha and beta; for example, as
alpha is set increasingly smaller, beta becomes larger.
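The trade-off between alpha and beta can be made concrete with the two sampling distributions x and y. The sketch below is a simplified illustration, assuming a one-tailed z test in which y is centered a fixed number of standard errors above x; the function name and the shift of 2.5 standard errors are hypothetical choices.

```python
from statistics import NormalDist

def beta_for_alpha(alpha, shift_in_se):
    """Type II error rate for a one-tailed z test.

    shift_in_se is the distance between the mean of x (null true) and the
    mean of y (null false), expressed in standard-error units.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)       # cutoff on x set by alpha
    return std.cdf(z_crit - shift_in_se)  # area of y falling below the cutoff

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(beta_for_alpha(alpha, shift_in_se=2.5), 3))
```

As alpha shrinks, the cutoff moves farther from the mean of x, so more of the y distribution falls short of it and beta grows.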
Meaningfulness (Effect Size)
Meaningfulness is the importance or practical significance of an effect or
relationship. The meaningfulness of a difference between two means can be
estimated by effect size (also called delta).
effect size (ES)— The standardized value that is the difference between the
means divided by the standard deviation. (ES vs. Sample size listed in Tables 7.3 and
7.4, p.119)
The formula for effect size is:
ES = (M1 - M2) / s
Also written as:
Cohen's d = (M1 - M2) / s_pooled
where s_pooled = √[(s1² + s2²) / 2]
This formula subtracts the mean of the second group (M2) from the mean of the
first group (M1) and divides the difference by the standard deviation. That places the
difference between the means in the common metric called standard deviation
units.
Pay attention to the formula: the greater the difference between the group
means and the smaller the variability within the groups, the greater the effect size.
An ES of 0.2 or less is small, about 0.5 is moderate, and 0.8 or more is large.
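The effect size calculation is short enough to work through in code. This is a minimal sketch using the pooled standard deviation given above (the square root of the average of the two group variances, which suits groups of equal size); the scores are made-up numbers for illustration only.

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Effect size: difference between means in pooled standard deviation units."""
    s_pooled = ((stdev(group1) ** 2 + stdev(group2) ** 2) / 2) ** 0.5
    return (mean(group1) - mean(group2)) / s_pooled

# Hypothetical scores, not data from any study.
experimental = [12, 14, 15, 16, 18]
control = [10, 12, 13, 14, 16]

d = cohens_d(experimental, control)
print(round(d, 2))  # 0.89, a large ES by the guidelines above
```

Here the means differ by 2 points and each group has a variance of 5, so the effect size is 2 divided by the square root of 5.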
Power
Power is the probability of rejecting the null hypothesis when the null
hypothesis is false (e.g., detecting a real difference), or the probability of
making a correct decision. Power ranges from 0 to 1. The greater the power
is, the more likely you are to detect a real difference or relationship.
Increasing the number of subjects raises power and makes it easier to reject
the null hypothesis, but are your results meaningful? There are important questions
to answer:
1) How large a difference is important in theory or practice? and,
2) How many participants are needed to declare an important
difference as significant?
Understanding the concept of power can answer these questions. If a
researcher can identify the size of an important effect through previous
research or even simply estimate an effect size (e.g., 0.5 is a moderate ES)
and establish how much power is acceptable (e.g., a common estimate in
the behavioral sciences is 0.8), then the size of the sample needed for a
study can be estimated.
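One rough way to carry out this estimate is the normal-approximation formula for comparing two independent group means. The sketch below is an assumption-laden illustration, not the method behind the chapter's tables: it assumes a two-tailed test and equal group sizes, and because it ignores the t distribution it can come out slightly smaller than published values.

```python
import math
from statistics import NormalDist

def participants_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect effect_size (two-tailed test)."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha / 2)  # cutoff set by alpha
    z_power = std.inv_cdf(power)          # distance needed for the desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)                   # round up to whole participants

for es in (0.2, 0.5, 0.8):
    print(es, participants_per_group(es))
# 0.2 -> 393, 0.5 -> 63, 0.8 -> 25
```

The required sample grows rapidly as the effect size shrinks: halving the ES roughly quadruples the number of participants needed.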
Power (refer to figures 7.3 and 7.4)
• Power is simply calculated as 1 – beta
• Beta should be kept at 4x alpha (seriousness of type I vs. type II error)
• Thus if alpha is .05, then beta is .20, and power is 0.8
• You can review the literature and determine the ES.
• Now you have the power (ability to find differences) and effect
size (meaningfulness), with a preselected alpha level. Based on this
information, you can determine how many subjects you will need to recruit
per group to achieve the desired outcomes (to find differences).
• Note how this works: if you have a small ES, more subjects are required to
find differences (obtain power), and the effect of the independent
variable may not be meaningful. Also, if you make the alpha level more stringent
and the ES is unchanged, you will have to recruit more subjects to find
significant results.
• The sample size is extremely influential on power (see Table 7.1).
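The influence of sample size on power can be sketched numerically. The function below is an approximation under stated assumptions (two equal-sized independent groups, a two-tailed z test standing in for the t test); the function name and the group sizes in the loop are illustrative.

```python
from statistics import NormalDist

def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) for comparing two group means."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true ES is effect_size.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic falls beyond either critical value.
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)

for n in (10, 20, 40, 64, 100):
    power = approximate_power(0.5, n)
    print(n, round(power, 2), round(1 - power, 2))  # n, power, beta
```

With a moderate ES of 0.5 and alpha at .05, power reaches roughly 0.8 (beta near .20) at about 64 participants per group under this approximation.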
Power
Keep in mind the relationships of alpha, sample size, and ES in planning a study.
1. If you have access to only a small number of participants, then you
need to have a really large ES or use a larger alpha, or both.
2. Do not just blindly specify the .05 alpha if detecting a real difference is the
main issue. Use a higher one, such as .20 or even .30. This approach is
extremely pertinent in pilot studies.
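Both points can be illustrated by asking how large an effect a small study can reliably detect at different alpha levels. This is a rough sketch using a normal approximation for two equal-sized groups and a two-tailed test; the group size of 10 and the target power of 0.80 are assumptions chosen for illustration.

```python
from statistics import NormalDist

def minimum_detectable_es(n_per_group, alpha, power=0.80):
    """Smallest ES detectable with the given power (normal approximation)."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha / 2)
    z_power = std.inv_cdf(power)
    return (z_alpha + z_power) * (2 / n_per_group) ** 0.5

# A pilot study with only 10 participants per group (hypothetical).
for alpha in (0.05, 0.20, 0.30):
    print(alpha, round(minimum_detectable_es(10, alpha), 2))
```

At alpha = .05 the study can reliably detect only a very large effect (an ES of about 1.25); relaxing alpha lowers that threshold, at the cost of more type I errors.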
Using Information in the Context of the Study
Context — The interrelationships found in the real-world
setting.
Are effect sizes for significant findings large enough to be
meaningful when interpreted within the context of the
study, or for the application of findings to other related
samples, or for planning a related study?
Remember: effect sizes are based on the difference
between the means (divided by the standard deviation).
The larger the effect size, the less the overlap between the
distribution of scores in the two groups (control and
experimental).
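The link between effect size and overlap can be quantified if both groups are assumed to be normally distributed with equal standard deviations. The sketch below relies on that assumption and uses the `overlap` method of Python's `statistics.NormalDist`, which returns the shared area under two normal curves.

```python
from statistics import NormalDist

def distribution_overlap(effect_size):
    """Shared area (0 to 1) of two unit-variance normal curves ES units apart."""
    control = NormalDist(mu=0.0, sigma=1.0)
    experimental = NormalDist(mu=effect_size, sigma=1.0)
    return control.overlap(experimental)

for es in (0.0, 0.2, 0.5, 0.8):
    print(es, round(distribution_overlap(es), 2))
```

With no effect the two distributions coincide completely, and the overlap falls steadily as the effect size grows.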
In very small samples a single unusual value can substantially influence the results.
Moreover, within- and between-participant variation (the error variance) tends to be large,
which causes the error term in tests of significance to be large, resulting in few significant
findings. On the other extreme of sample size, statistics has little value for very large
samples because nearly any difference or relationship is significant.
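The large-sample problem is easy to see with numbers. The sketch below is a simplified illustration: it assumes the observed effect size stays fixed at a trivially small value (0.05, a made-up figure) and uses a two-tailed z approximation for two equal-sized groups.

```python
from statistics import NormalDist

def approximate_p_value(observed_es, n_per_group):
    """Two-tailed p value for an observed ES between two equal-sized groups."""
    z = observed_es * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same trivial effect, tested with ever larger groups.
for n in (100, 1000, 10000):
    print(n, round(approximate_p_value(0.05, n), 4))
```

The effect never changes in size, yet it moves from clearly nonsignificant to highly significant purely because the sample grows.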
Context
Context is what matters with regard to meaningfulness.
You must ask yourself,
“Within the context of what I do, does an effect of this size matter?”
The answer nearly always depends on who you are and what you are doing
(and practically never on whether p = .05 or .01).
Thus, having a significant (reliable) effect is a necessary, but not sufficient,
condition in statistics. To meet the criteria of being both necessary and
sufficient, the effect must be significant and meaningful within the context of its
use. Said another way,
♦ estimates of significance are driven by sample size,
♦ estimates of meaningfulness are driven by the size of the difference, and
♦ context is driven by how the findings will be used.