PPT 09

Chapter 9
Differences Among Groups
Research Methods in Physical Activity
Purpose and Protocol of the Statistical Test
The purpose of the statistical test is to evaluate the null hypothesis at a
specific level of probability (e.g., p < .05). In other words, do the two levels
of treatment differ significantly (p < .05) so that these differences are not
attributable to a chance occurrence more than 5 times in 100?
The statistical test is always of the null hypothesis. All that statistics can do
is reject or fail to reject the null hypothesis. Statistics cannot accept the
research hypothesis. Only logical reasoning, good experimental design, and
appropriate theorizing can do so. Statistics can determine only whether
the groups are different, not why they are different.
The t and the F ratios are used to determine whether groups are
significantly different. In ANOVA (analysis of variance) techniques, R² is
also used to establish meaningfulness. R² is the percentage of variance in the
dependent variable accounted for by the independent variable (all discussed
later in the presentation).
The meaningfulness of the differences is estimated by effect size (ES).
Assumptions of t and F ratios
The uses of the t and the F distributions have four assumptions (in addition to
the assumptions for parametric statistics presented in chapter 6):
♦ Observations are drawn from normally distributed populations.
♦ Observations represent random samples from populations.
♦ The numerator and denominator are estimates of the same
population variance.
♦ The numerator and denominator of F (or t) ratios are independent.
When to use one- vs. two-tailed tests for calculation of
significance
When designing the research model, the researcher must decide
whether to use a one- or two-tailed test in determining the
significance of the findings.
Suppose you test a research hypothesis that predicts not only that
the sample mean will be different from the population mean but
that it will be different in a specific direction (for example, “it
will be lower”). This test is called a directional or one-tailed
test because the region of rejection is entirely within one tail of
the distribution.
Some hypotheses predict only that one value will be different from
another, without additionally predicting which will be higher. The
test of such a hypothesis is non-directional or two-tailed
because an extreme test statistic in either tail of the distribution
(positive or negative) will lead to the rejection of the null
hypothesis of no difference.
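As a rough illustration of how the rejection region changes, the sketch below compares one-tailed and two-tailed cutoffs at alpha = .05. It uses the standard normal distribution as a large-sample stand-in for the t distribution (an assumption made for simplicity; with small samples the t table cutoffs are larger). The function name rejects_null is hypothetical, and the one-tailed version shown tests the upper tail only.

```python
from statistics import NormalDist

ALPHA = 0.05
z = NormalDist()  # standard normal: a large-sample approximation to t (assumption)

# One-tailed: the entire rejection region (alpha) sits in one tail.
one_tailed_cutoff = z.inv_cdf(1 - ALPHA)
# Two-tailed: alpha is split evenly between the two tails.
two_tailed_cutoff = z.inv_cdf(1 - ALPHA / 2)


def rejects_null(statistic, two_tailed=True):
    """Return True if the statistic falls in the rejection region.

    The one-tailed branch tests for a difference in the upper (positive) tail.
    """
    if two_tailed:
        return abs(statistic) > two_tailed_cutoff
    return statistic > one_tailed_cutoff


print(round(one_tailed_cutoff, 3), round(two_tailed_cutoff, 3))
# The same statistic can be significant one-tailed but not two-tailed.
print(rejects_null(1.80, two_tailed=False), rejects_null(1.80, two_tailed=True))
```

The one-tailed cutoff is smaller, which is why the direction must be predicted before the data are collected.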
Types of t tests
t Test between sample and population mean (see formula, Eq. 9.1, p. 148)
The one-sample t test compares the mean score
of a sample to a known value. The known value is a population mean.
Hypotheses:
Null: There is no significant difference between the sample mean and the
population mean.
Alternate: There is a significant difference between the sample mean and the
population mean.
(we will review problem, calculation, and results in text with example 9.1, p. 149)
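A minimal sketch of the one-sample comparison, assuming the usual form t = (M - population mean) / (s / sqrt(n)) with df = n - 1. The scores and the population mean of 46 are made-up illustration values, not the data from example 9.1, and one_sample_t is a hypothetical helper name.

```python
import math
import statistics


def one_sample_t(scores, population_mean):
    """Return (t, df) for a sample compared with a known population mean."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    sample_sd = statistics.stdev(scores)       # uses the n - 1 denominator
    standard_error = sample_sd / math.sqrt(n)  # standard error of the mean
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


scores = [52, 48, 55, 50, 45]  # hypothetical sample
t, df = one_sample_t(scores, population_mean=46)
print(round(t, 3), df)
```

The resulting t is then compared with the tabled value for the given df and alpha.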
Types of t tests
Independent t Tests (see formula, Eq. 9.3, p. 149)
The Independent Samples t Test compares the mean scores of two groups on a
given variable. These groups are separate and independent of each other, but
are tested on the same variable.
Hypotheses:
Null: The means of the two groups are not significantly different.
Alternate: The means of the two groups are significantly different.
(Refer to Eqs. 9.3 and 9.4, p. 149, for the formulas.)
Note the degrees of freedom for the independent t test: df = n1 + n2 – 2.
There are two groups, and each group contributes df = n – 1, so df = (n1 – 1) + (n2 – 1).
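The sketch below computes an independent t from raw scores, assuming the form t = (M1 - M2) / sqrt(s1²/n1 + s2²/n2) with df = n1 + n2 - 2. With equal group sizes this gives the same t as the pooled-variance form. The two groups are made-up data, and independent_t is a hypothetical helper name.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for two separate groups measured on the same variable."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    standard_error = math.sqrt(v1 / n1 + v2 / n2)
    t = (m1 - m2) / standard_error
    df = n1 + n2 - 2  # (n1 - 1) + (n2 - 1)
    return t, df


treatment = [10, 12, 14, 16, 18]  # hypothetical scores
control = [6, 8, 10, 12, 14]      # hypothetical scores
t, df = independent_t(treatment, control)
print(t, df)
```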
Types of t tests
Independent t Tests
Meaningfulness
As with other statistical tests, we would like to know the meaningfulness of the
treatment effect.
To estimate the degree to which the treatment influenced the outcome, use
effect size (ES), the standardized difference between the means.
ES = (M1 – M2)/s
M1 = the mean of one group or level of treatment, M2 = the mean of a second
group or level of treatment, and s = the standard deviation (that of the control group when there is one).
* If there is no control group, then the pooled standard deviation (equation
9.7, p.150) should be used. Remember, effect size can be interpreted as
follows: An ES of 0.8 or greater is large, an ES around 0.5 is moderate, and an
ES of 0.2 or less is small.
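A short sketch of the effect size calculation when there is no control group, assuming the standard pooled standard deviation (each variance weighted by its degrees of freedom). The means, standard deviations, and group sizes are hypothetical, as are the function names.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pool two standard deviations, weighting each variance by its df."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


def effect_size(m1, m2, sd):
    """ES = (M1 - M2) / s, the standardized difference between the means."""
    return (m1 - m2) / sd


# Hypothetical summary statistics for two treatment groups.
s = pooled_sd(sd1=4.0, n1=10, sd2=6.0, n2=10)
es = effect_size(m1=30.0, m2=26.0, sd=s)
print(round(s, 3), round(es, 3))
```

With these made-up numbers the ES falls just under 0.8, so it would be read as moderate to large on the scale above.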
Homogeneity of Variance
(basic assumption of parametric statistics)
All techniques for comparisons between groups assume that the variances
(standard deviations squared) of the groups are equivalent. Although
mild violations of this assumption do not present major problems, serious
violations are more likely if group sizes are not approximately equal.
If this is the case, you will need to select formulas that account for
unequal numbers of subjects per group. In statistical software these are
typically listed as tests assuming unequal variances.
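One widely used test for unequal variances is Welch's t test. The slides do not name a specific procedure, so treating Welch's test as the one meant here is an assumption. The sketch shows how its adjusted degrees of freedom (the Welch-Satterthwaite approximation) drop below n1 + n2 - 2 when variances and group sizes differ. The summary values are hypothetical.

```python
def welch_df(v1, n1, v2, n2):
    """Welch-Satterthwaite degrees of freedom from group variances and sizes."""
    a = v1 / n1  # squared standard error contributed by group 1
    b = v2 / n2  # squared standard error contributed by group 2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Equal variances and equal n: the adjustment changes nothing.
df_equal = welch_df(v1=10.0, n1=5, v2=10.0, n2=5)

# Small group with the large variance: df is cut sharply.
df_unequal = welch_df(v1=4.0, n1=20, v2=36.0, n2=6)

print(df_equal, round(df_unequal, 2))
```

A lower df means a larger critical value, so the test becomes more conservative exactly when the homogeneity assumption is in doubt.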
Dependent t Tests
The Dependent t Test is used when the two groups of scores are related in
some manner. Usually, the relationship takes one of two forms:
♦ Two groups of participants are matched on one or more characteristics and
thus are no longer independent.
♦ One group of participants is tested twice on the same variable, and the
experimenter is interested in the change between the two tests.
(See formula 9.8.)
Note (see formula 9.9, p. 151): The same participants are tested twice (pretest
and posttest). Thus, we adjust the error term of the t test downward (make it
smaller) by taking into account the relationship (r) between the pretest and
posttest adjusted by their standard deviations.
The degrees of freedom for the dependent t test are df = N – 1,
where N = the number of paired observations.
See example 9.3, p. 152, for calculating the dependent t with the raw score formula.
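A minimal sketch of the dependent t computed from difference scores: t = mean(D) / (sd(D) / sqrt(N)), with df = N - 1. This difference-score form is algebraically equivalent to adjusting the error term by the pretest-posttest correlation, because the variance of D equals s1² + s2² - 2·r·s1·s2. The pretest and posttest scores are made up and are not the data from example 9.3.

```python
import math
import statistics


def dependent_t(pretest, posttest):
    """Return (t, df) for paired scores using the difference-score form."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must contain paired scores")
    differences = [post - pre for pre, post in zip(pretest, posttest)]
    n_pairs = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n_pairs))
    return t, n_pairs - 1


pretest = [10, 12, 9, 14, 11]    # hypothetical scores
posttest = [12, 15, 10, 17, 12]  # same participants, tested again
t, df = dependent_t(pretest, posttest)
print(round(t, 3), df)
```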
t tests and Power in Research (refer to formula 9.11,
p.154)
There are three ways to increase power in the independent t test.
(The dependent t test already has increased power because there is less within-group
variance [the same subjects provide repeated measures].)
1) The first level (M1 – M2) increases power if we can increase the difference
between M1 and M2. This occurs if there is a greater treatment effect. If
the value in the numerator becomes larger, then the t statistic becomes
larger, which increases the likelihood of rejecting the null hypothesis.
2) The second level of the independent t formula is the variance (s²) for
each group. If the variances are smaller, then the value in the denominator
becomes smaller, which increases the t statistic and the
likelihood of rejecting the null hypothesis.
3) Finally, the third level (n1, n2) is the number of participants in each group.
If n1 and n2 are increased and the first and second levels remain the
same, the denominator becomes smaller (note that s² is divided by n) and t
becomes larger, thus increasing the odds of rejecting the null hypothesis
and increasing power.
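The three levels can be checked numerically. This sketch assumes the form t = (M1 - M2) / sqrt(s1²/n1 + s2²/n2) and changes one level at a time from a made-up baseline. All numbers are hypothetical, as is the helper name t_from_summary.

```python
import math


def t_from_summary(m1, m2, var1, var2, n1, n2):
    """Independent t from group means, variances, and sizes."""
    return (m1 - m2) / math.sqrt(var1 / n1 + var2 / n2)


baseline = t_from_summary(m1=12, m2=10, var1=16, var2=16, n1=10, n2=10)

# 1) Larger difference between the means (greater treatment effect).
bigger_difference = t_from_summary(m1=14, m2=10, var1=16, var2=16, n1=10, n2=10)
# 2) Smaller variance within each group.
smaller_variance = t_from_summary(m1=12, m2=10, var1=4, var2=4, n1=10, n2=10)
# 3) More participants per group.
more_participants = t_from_summary(m1=12, m2=10, var1=16, var2=16, n1=40, n2=40)

for label, value in [
    ("baseline", baseline),
    ("bigger difference", bigger_difference),
    ("smaller variance", smaller_variance),
    ("more participants", more_participants),
]:
    print(f"{label}: t = {value:.3f}")
```

Each change raises t above the baseline, which is the sense in which each level contributes to power.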
How the strength of the t statistic is evaluated
After the null hypothesis is rejected, the strength (meaningfulness) of the
effects must be evaluated.
The t ratio has a numerator and a denominator. From a theoretical point of
view, the numerator is regarded as true variance, or the real difference
between the means. The denominator is considered error variance, or
variation about the mean. Thus,
t = true variance / error variance
If there are no differences between the variances, then t = 1.
Thus, when a significant t ratio is found, we are really saying that true
variance exceeds error variance to a significant degree. The amount by which
the t ratio must exceed 1.0 for significance depends on the number of
participants (df) and the alpha level established.
Relationships of t Test and Correlation Statistics
There are two sources of variance: true variance and error variance (true
variance + error variance = total variance).
• The t test is the ratio of true variance to error variance, whereas r is
the square root of the proportion of total variance accounted for by
true variance.
• To get t from r only means manipulating the variance components in a
slightly different way. This is because all parametric correlational and
differences-among-groups techniques are based on the general linear
model.
(See text pp. 157–158 for the mathematical application of this concept.)
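A small sketch of the conversion between t and r for a two-group comparison, assuming the standard relations r² = t² / (t² + df) and t = r·sqrt(df) / sqrt(1 - r²), where r is the correlation between group membership and the dependent variable. The t and df values are hypothetical, and the text's own worked example may present the steps differently.

```python
import math


def r_from_t(t, df):
    """Correlation implied by a t ratio with the given degrees of freedom."""
    return math.copysign(math.sqrt(t ** 2 / (t ** 2 + df)), t)


def t_from_r(r, df):
    """t ratio implied by a correlation with the given degrees of freedom."""
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)


t, df = 2.0, 8  # hypothetical independent t with n1 + n2 - 2 = 8
r = r_from_t(t, df)
print(round(r, 4), round(r ** 2, 4))  # r and proportion of variance accounted for
print(round(t_from_r(r, df), 4))      # converts back to the original t
```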
Classroom Examples on Using Excel
1) In MS Excel, if you have not already done so, select
the Office Button,
then select Excel Options, then select “Add-Ins”.
2) Choose Analysis ToolPak, click “Go”, check its box in the list, and click “OK”.
During this class session we will create and process data for the following
functions:
1) Descriptive Statistics
2) Percentiles
3) Correlation Data
4) Regression Data
5) Independent t Test
6) Dependent t Test
ANOVA (Analysis of Variance)
ANOVA is an extension of the independent t test. In fact, t is just a
special case of simple ANOVA in which there are two groups. Simple
ANOVA allows the evaluation of the null hypothesis among two or
more group means with the restriction that the groups represent
levels of the same independent variable.
See Table 9.1, p. 159, for an example. Notice that the explained (true) variance is the SS
between, and the unexplained (error) variance is the SS within.
Using ANOVA with more than two groups, instead of running several t tests,
guards against inflating the chance of a Type I error. With three groups,
separate t tests would use each group in two comparisons (e.g., 1 vs. 2 and
1 vs. 3) rather than only one, whereas the probability (alpha) was established
for a comparison between only two sets of scores. (FYI: Making this type of
comparison, in which the same group’s mean is used more than once, is an example
of increasing the experimentwise error rate.)
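To see how quickly the experimentwise error rate grows, the sketch below uses the common approximation 1 - (1 - alpha)^c for c comparisons. It treats the comparisons as independent, which is an assumption. Comparisons that share a group mean are not fully independent, so the figures are approximate.

```python
def pairwise_comparisons(k):
    """Number of distinct pairs among k groups."""
    return k * (k - 1) // 2


def experimentwise_error(alpha, comparisons):
    """Approximate chance of at least one Type I error across the comparisons."""
    return 1 - (1 - alpha) ** comparisons


ALPHA = 0.05
for k in (2, 3, 4, 5):
    c = pairwise_comparisons(k)
    rate = experimentwise_error(ALPHA, c)
    print(f"{k} groups: {c} comparisons, experimentwise error about {rate:.3f}")
```

With three groups and all three pairwise t tests, the approximate rate is already close to .14 rather than .05.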
ANOVA (Analysis of Variance)
Calculating Simple ANOVA (see Table 9.2, p.160)
Table 9.2 provides the formulas for calculating simple ANOVA and the
F ratio. This method, the so-called ABC method, is simple:
♦A = ∑X²: Square each participant’s score, sum these squared scores
(regardless of which group the participant is in), and set the total
equal to A.
♦B = (∑X)²/N: Sum all participants’ scores (regardless of group),
square the sum, divide by the total number of participants, and set the
answer equal to B.
♦C = (∑X1)²/n1 + (∑X2)²/n2 + . . . + (∑Xk)²/nk: Sum all scores in
group 1, square the sum, and divide by the number of participants in
group 1; do the same for the scores in group 2, and so on for however
many groups (k) there are. Then add the results for all groups and set the
answer equal to C.
Note these formulas are just partitioning variance in different ways.
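The ABC quantities can be sketched in a few lines. The sums of squares are then taken as SS between = C - B, SS within = A - C, and SS total = A - B. That partition is the standard one and is assumed to match Table 9.2. The three groups of scores are made up for illustration.

```python
def abc_sums_of_squares(groups):
    """Return (ss_between, ss_within, ss_total) using the A, B, C quantities."""
    all_scores = [x for group in groups for x in group]
    n_total = len(all_scores)

    a = sum(x ** 2 for x in all_scores)                      # A = sum of X squared
    b = sum(all_scores) ** 2 / n_total                       # B = (sum of X) squared / N
    c = sum(sum(group) ** 2 / len(group) for group in groups)  # C = per-group terms added

    return c - b, a - c, a - b


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical scores for k = 3 groups
ss_between, ss_within, ss_total = abc_sums_of_squares(groups)
print(ss_between, ss_within, ss_total)
```

Between and within always add up to the total, which is the sense in which the formulas only partition the variance.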
ANOVA (Analysis of Variance)
See Table 9.2, p. 160.
Degrees of freedom are used to determine the mean squares (MS) for
the between- and within-group variance (each MS is its SS divided by its df).
The F ratio is determined by dividing the MS between groups by the
MS within groups (MSB / MSW, i.e., the ratio of true
variance to error variance).
Note: The F ratio is increased by decreasing the within-group
variance, increasing the between-group variance, or both.
To determine whether F is significant, refer to Table 6, beginning on
page 431. The degrees of freedom in the numerator are k – 1 (the number
of groups minus one), and the degrees of freedom in the denominator
are N – k (the total number of participants minus the number of groups).
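A brief sketch of the step from sums of squares to F, using the degrees of freedom given above. The SS values, k, and N are hypothetical inputs and f_ratio is a hypothetical helper name. The resulting F would still be checked against the tabled value for (k - 1, N - k) degrees of freedom.

```python
def f_ratio(ss_between, ss_within, k, n_total):
    """Return (F, df_between, df_within) for a simple one-way ANOVA."""
    df_between = k - 1        # number of groups minus one
    df_within = n_total - k   # total participants minus number of groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical sums of squares for 3 groups and 9 participants.
f, df_b, df_w = f_ratio(ss_between=26.0, ss_within=6.0, k=3, n_total=9)
print(f, df_b, df_w)
```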
ANOVA (Analysis of Variance)
Follow-up Testing
With an ANOVA of three groups, if we find that significant
differences exist among the three group means, we do not
know whether all three groups differ. Thus, significant findings
must be followed up with a multiple comparison test.
Your text explains the Scheffé technique (you may review the
process, but you will not be examined on the protocol;
computer software will calculate significant findings between
groups).
You may determine meaningfulness by calculating Omega
Squared (see example 9.13, p.163)
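A sketch of omega squared, assuming one commonly used form: ω² = (SS between - (k - 1)·MS within) / (SS total + MS within). The text's example 9.13 may write the formula differently. The input values are hypothetical.

```python
def omega_squared(ss_between, ss_total, ms_within, k):
    """Estimated proportion of total variance accounted for by the treatments."""
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


# Hypothetical values for a 3-group ANOVA.
w2 = omega_squared(ss_between=26.0, ss_total=32.0, ms_within=1.0, k=3)
print(round(w2, 3))
```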
Factorial ANOVA
Factorial ANOVA — Analysis of variance in which there is
more than one independent variable.
See Table 9.3, p.165.
Look at table 9.3 and note that the first independent variable
(IV1) has two levels, labeled A1 and A2. In our example this IV
represents the intensity of training: high intensity and low
intensity. The second independent variable (IV2) represents
the level of fitness of the participants: low fitness (B1) and high
fitness (B2).
There are two MAIN effects: Fitness and Intensity
There may be interaction between the different levels of the
MAIN effects.
Factorial ANOVA
See Table 9.3, p.165. (2 X 2 ANOVA)
This particular factorial ANOVA is labeled a 2 (intensity of
training) × 2 (level of fitness) ANOVA (read “2-by-2 ANOVA”).
The true variance can be divided into three parts:
♦True variance because of A (intensity of training)
♦True variance because of B (level of fitness)
♦True variance because of the interaction of A and B
Each of these true variance components is tested against (divided by)
error variance to form the three F ratios for this ANOVA. Each of
these Fs has its own set of degrees of freedom so that it can be
checked for significance in the F table.
First determine if there is a significant interaction, then look at the
main effects.
Factorial ANOVA (review of possible outcome
scenarios)
Figure 9.2 (2 X 2 ANOVA)
… reflects a significant interaction, because the mean attitude
scores of the low-fitness group toward the low-intensity
training were higher than their attitude scores toward high-intensity
training, whereas the opposite was shown for the high-fitness
participants. They preferred the high-intensity program.
This example shows how power is increased by using a particular
type of statistical test. If we had just used a t test or simple ANOVA,
we would not have found any difference in attitude toward the two
levels of intensity (both means were identical, M = 25). When we
added another factor (fitness level), however, we were able to discern
that there were differences in attitude dependent on the level of
fitness of the participants.
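The pattern described here can be reproduced with made-up cell means. The values below are hypothetical (only the marginal mean of 25 for each intensity comes from the slide) and assume equal numbers of participants per cell. The marginal means hide any difference, while the interaction contrast shows that the two fitness groups respond in opposite directions.

```python
# Hypothetical mean attitude scores for each cell of the 2 x 2 design.
cell_means = {
    ("low_fitness", "low_intensity"): 30,
    ("low_fitness", "high_intensity"): 20,
    ("high_fitness", "low_intensity"): 20,
    ("high_fitness", "high_intensity"): 30,
}


def marginal_mean(intensity):
    """Mean for one intensity level, averaged over fitness (equal cell sizes)."""
    values = [m for (_, level), m in cell_means.items() if level == intensity]
    return sum(values) / len(values)


def intensity_effect(fitness):
    """Low-intensity mean minus high-intensity mean within one fitness group."""
    return (cell_means[(fitness, "low_intensity")]
            - cell_means[(fitness, "high_intensity")])


low_marginal = marginal_mean("low_intensity")
high_marginal = marginal_mean("high_intensity")
# Interaction contrast: does the intensity effect differ between fitness groups?
interaction = intensity_effect("low_fitness") - intensity_effect("high_fitness")

print(low_marginal, high_marginal, interaction)
```

A contrast of zero would correspond to parallel lines. Whether a nonzero contrast is significant still depends on the error variance and the F test.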
Factorial ANOVA (review of possible outcome
scenarios)
Figure 9.3 (2 X 2 ANOVA)
… shows a nonsignificant interaction. In this case, both groups
preferred the same type of program over the other; hence, the
lines are parallel. Significant interactions show deviations from
parallel (as was shown in figure 9.2). The lines do not have to
cross to reflect a significant interaction.
Figure 9.4 (2 X 2 ANOVA)
… shows a significant interaction in which the high-fitness
group liked both forms of exercise equally, but there was a
decided difference in preference in the low-fitness group, who
preferred the low-intensity program over the high-intensity
program.
Repeated Measures ANOVA
Repeated-measures ANOVA — Analysis of scores for the same
individuals on successive occasions, such as a series of test trials; also
called split-plot ANOVA or subject × trials ANOVA.
The most frequent use of repeated measures involves a factorial
ANOVA in which one or more of the factors (independent variables)
are repeated measures.
Benefits of Repeated Measures ANOVA
1. Provides the experimenter the opportunity to control for individual
differences among participants, probably the largest source of variation in
most studies.
2. The variation from individual differences can be identified and separated
from the error term, thereby reducing it and increasing power. Because of
the advantage of controlling individual differences, repeated-measures
designs are more economical because fewer participants are required.
Repeated Measures ANOVA
Benefits of Repeated Measures ANOVA (continued)
3. Repeated-measures designs allow the study of a phenomenon across
time. This feature is particularly important in studies of change in, for
example, learning, fatigue, forgetting, performance, and aging.
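The second benefit, separating individual differences from the error term, can be illustrated with a one-way repeated-measures layout. This is a sketch only: the scores are made up (rows are participants, columns are trials), and it is not the example from p. 170. The same data are analyzed twice, once removing the participant variation from the error term and once treating the trials as independent groups.

```python
# Hypothetical scores: each row is one participant, each column is one trial.
scores = [
    [2, 4, 6],
    [4, 5, 9],
    [6, 9, 9],
]
n_subjects = len(scores)
n_trials = len(scores[0])
all_scores = [x for row in scores for x in row]
grand_mean = sum(all_scores) / len(all_scores)

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

trial_means = [sum(row[j] for row in scores) / n_subjects for j in range(n_trials)]
ss_trials = n_subjects * sum((m - grand_mean) ** 2 for m in trial_means)

subject_means = [sum(row) / n_trials for row in scores]
ss_subjects = n_trials * sum((m - grand_mean) ** 2 for m in subject_means)

# Repeated measures: individual differences are removed from the error term.
ss_error_rm = ss_total - ss_trials - ss_subjects
df_trials = n_trials - 1
df_error_rm = (n_trials - 1) * (n_subjects - 1)
f_repeated = (ss_trials / df_trials) / (ss_error_rm / df_error_rm)

# Same scores treated as independent groups: individual differences stay in error.
ss_error_ind = ss_total - ss_trials
df_error_ind = len(all_scores) - n_trials
f_independent = (ss_trials / df_trials) / (ss_error_ind / df_error_ind)

print(ss_subjects, ss_error_rm, ss_error_ind)
print(round(f_repeated, 3), round(f_independent, 3))
```

With these numbers most of the would-be error variance belongs to differences between participants, so removing it gives a much larger F for the same trial effect.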
Problems with Repeated Measures ANOVA
1. Carryover effects. Treatments given earlier influence treatments
given later.
2. Practice effects. Participants improve at the task (dependent
variables) as a result of repeated trials in addition to the
treatment (also called the testing effect).
3. Fatigue. Participants’ performance is adversely influenced by
fatigue (or boredom).
4. Sensitization. Participants’ awareness of the treatment is
heightened because of repeated exposure.
Example of Repeated Measures ANOVA is found on p. 170.
End of Presentation