3.5 Inferential Statistics:
The Plot Thickens
We have been talking about ways to calculate and describe characteristics of data.
Descriptive statistics tell us about the distribution of our data, how varied the data are, and
the shape of the data. Now we are also interested in information related to population parameters. In other
words, we want to know whether relationships, associations, or differences exist within our data and
whether they are statistically significant. Inferential statistics help us make these determinations and allow
us to generalize the results to a larger population. We provide background about parametric and
nonparametric statistics and then show basic inferential statistics that examine associations among
variables and tests of differences between groups.
Parametric and Nonparametric Statistics
In the world of statistics, distinctions are made in the types of analyses an evaluator can use
based on distribution assumptions and the level of measurement of the data. For example,
parametric statistics are based on the assumptions of a normal distribution and randomized sampling that
results in interval or ratio data. These statistical tests usually determine the significance of differences or
relationships. The parametric statistical tests commonly used include t-tests, Pearson product-moment
correlations, and analyses of variance.
Nonparametric statistics are known as distribution-free tests because they are not based on the
assumptions of the normal probability curve. Nonparametric statistics do not specify conditions about
parameters of the population but assume randomization and are usually applied to nominal and ordinal
data. Several nonparametric tests do exist for interval data, however, when the sample size is small and
the assumption of normal distribution would be violated. The most common forms of nonparametric
tests are chi-square analysis, the Mann-Whitney U test, the Wilcoxon matched-pairs signed ranks test,
the Friedman test, and the Spearman rank-order correlation coefficient. These nonparametric tests are
generally less powerful than the corresponding parametric tests. Table 3.5(1) provides parametric
and nonparametric equivalent tests used for data analysis. The following sections will discuss these
types of tests and the appropriate parametric and nonparametric choices.
Table 3.5(1) Parametric and Nonparametric Tests for Data Analysis (adapted from Loftus and Loftus,
1982)
______________________________________________________________________________
Data                       Purpose of Test                     Parametric           Nonparametric
______________________________________________________________________________
Single sample              To determine if an association      ---                  Chi-square
                           exists between two nominal
                           variables

Single sample              To determine if the sample mean     One-sample t-test    Sign test
                           or median differs from some
                           hypothetical value

Two samples,               To determine if the populations     Independent t-test   Mann-Whitney
between subjects           of two independent samples have                          U test
                           the same mean or median

Two conditions,            To determine if the populations     Matched t-test       Sign test;
within subjects            of two samples have the same                             Wilcoxon signed
                           mean or median                                           ranks test

More than two              To determine if the populations     One-way ANOVA        Kruskal-Wallis
conditions, between        of more than two independent
subjects                   samples have the same mean
                           or median

More than two              To determine if the populations     Repeated measures    Friedman
conditions, within         of more than two samples have       ANOVA                analysis of
subjects                   the same mean or median                                  variance

Set of items with          To determine if the two             Pearson              Spearman
two measures on            measures are associated             correlation          rank-order
each item                                                                           correlation
______________________________________________________________________________
Associations Among Variables
Chi Square (Crosstabs)
One of the most common statistical tests of association is the chi-square test. For this test to be
used, data must be discrete and at nominal or ordinal levels. During the process of the analysis, the
frequency data are arranged in tables that compare the observed distribution (your data) with the
expected distributions (what you would expect to find if no difference existed among the values of a
given variable or variables). In most computerized statistical packages, this chi-square procedure is
called "crosstabs" or "tables." This statistical procedure is relatively easy to hand calculate as well. The
basic chi square formula is:
X² = Σ[(O - E)² / E]

where: X² = chi-square
       O = observed frequency
       E = expected frequency
       Σ = the sum across all cells of the table
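To make the crosstab procedure concrete, a rough sketch in Python (assuming the scipy and numpy libraries, with made-up frequency counts) is shown below. It compares a hypothetical two-category variable (gender) with a three-category program preference; the manual sum at the end simply mirrors the formula above.

# Chi-square test of association on hypothetical frequency counts.
from scipy.stats import chi2_contingency
import numpy as np

# Rows: gender (male, female); columns: preferred program (sports, arts, outdoor).
observed = [[20, 30, 10],
            [30, 15, 25]]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")

# The same statistic by hand, following X^2 = sum[(O - E)^2 / E].
obs = np.array(observed)
manual_chi2 = ((obs - expected) ** 2 / expected).sum()
print(f"manual chi-square = {manual_chi2:.2f}")

If p is less than .05, the observed distribution differs from the expected distribution, and an association between the two variables is likely.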
Spearman and Pearson Correlations
Sometimes we want to determine relationships between scores. As an evaluator, you have to
determine whether the data are appropriate for parametric or nonparametric procedures. Remember, this
determination is based upon the level of data and sample size. If the sample is small or if the data are
ordinal (rank-order) data, then the most appropriate choice is the nonparametric Spearman rank order
correlation statistic. If the data are thought to be interval and normally distributed, then the basic test for
linear (straight line) relationships is the parametric Pearson product-moment correlation technique.
The correlation results for Spearman or Pearson correlations can fall between +1 and -1. For the
parametric Pearson or the nonparametric Spearman, the results are interpreted in the same way even though a
different statistical procedure is used. You might think of the data as plotted on a graph with an x-axis and a y-axis. A positive correlation of +1 (r = 1.0) would mean that the slope of the line is 45 degrees
upward. A negative correlation of -1 (r = -1.0) would be 45 degrees downward (see Figure 3.5(2)). No
correlation at all would be r = 0 with a horizontal line. The correlation approaching +1 means that as one
score increases, the other score increases and is said to be positively correlated. A correlation
approaching -1 means that as one score increases, the other decreases, so is negatively correlated. A
correlation of 0 means no linear relationship exists. In other words, a perfect positive correlation of 1
occurs only if the values of the x and y scores are identical, and a perfect negative correlation of -1 only
if the values of x are the exact reverse of the values of y (Hale, 1990).
Insert Figure 3.5(2) about here
For example, if we found that the correlation between age and number of times someone swam at
a community pool was r = -.63, we could conclude that older individuals were less likely than younger
people to swim at a community pool. Or suppose we wanted to see if a relationship existed between
body image scores of adolescent girls and self-esteem scores. If we found r=.89, then we would say a
positive relationship was found (i.e., as positive body image scores increased, so did positive self-esteem
scores). If r=-.89, then we would show a negative relationship, (i.e., as body image scores increased,
self-esteem scores decreased). If r=.10, we would say little to no relationship was evident between body
image and self-esteem scores. If the score was r=.34 you might say a weak positive relationship exists
between body image and self-esteem scores. Anything above r=.4 or below r=-.4 might be considered a
moderate relationship between two variables. Of course, the closer the correlation is to r=+1 or -1, the
stronger the relationship.
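As a rough illustration of how these coefficients are obtained with a computerized package (a sketch assuming Python's scipy library, with invented body-image and self-esteem scores), both the parametric and nonparametric versions can be requested in a few lines:

# Pearson (parametric) and Spearman (nonparametric) correlations on
# hypothetical body-image and self-esteem scores for ten adolescent girls.
from scipy.stats import pearsonr, spearmanr

body_image  = [22, 31, 28, 35, 40, 19, 27, 33, 38, 25]
self_esteem = [30, 41, 36, 44, 50, 28, 35, 43, 47, 33]

r, p = pearsonr(body_image, self_esteem)         # interval, normally distributed data
rho, p_rho = spearmanr(body_image, self_esteem)  # ranked data or small samples

print(f"Pearson r = {r:.2f} (p = {p:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")

The resulting coefficient is interpreted exactly as described above: its sign gives the direction of the relationship and its distance from zero gives the strength.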
Correlations can also be used for determining reliability as was discussed in Chapter 2.3 on
trustworthiness. Let's say that we administered a questionnaire about nutrition habits to the adults in an
aerobics class for the purpose of establishing the reliability of the instrument. A week later we gave them the
same test again. All of the individuals were ranked based upon their scores for each test and then compared
through the Spearman procedure. If we found a result of r = .94, we would feel comfortable with the
correlation referred to as the reliability coefficient of the instrument because this finding would indicate
that the respondents' scores were consistent from test to re-test. If r = .36, then we would believe the
instrument to have weak reliability because the scores differed from the first test to the re-test.
Two cautions should be noted in relation to correlations. First, you cannot assume that a
correlation implies causation. These correlations should be used only to summarize the strength of a
relationship. In the above example where r=.89, we can't say that improved body image causes increased
self-esteem, because many other variables could be exerting an influence. We can, however, say that as
body image increases, so does self-esteem and vice versa. Correlations point to the fact that there is
some relationship between two variables whether it is negative, nonexistent, or positive. Second, most
statistical analyses of correlations also include a probability (p-value). A correlation might be
statistically significant as shown by the p-value, but the correlation coefficient itself (-1 to +1) gives
much more information. The p-value suggests that a relationship exists, but one must examine the
correlation statistic to determine the direction and strength of the relationship.
Tests of Differences Between and among Groups
Sometimes, you may want to evaluate by comparing two or more groups through some type of
bivariate or multivariate procedure. Again, it will be necessary to know the types of data you have
(continuous or discrete), the levels of measurement data (nominal, ordinal, interval, ratio), and the
dependent and independent variables of interest to analyze.
In parametric analyses, the statistical process for determining difference is usually accomplished
by comparing the means of the groups. The most common parametric tests of differences between means
are the t-tests and analysis of variance (ANOVA). In nonparametric analyses, you are usually checking
the differences in rankings of your groups. For nonparametrics, the most common statistical procedures
for testing differences are the Mann-Whitney U test, the Sign test, the Wilcoxon Signed Ranks test,
Kruskal-Wallis, and the Friedman analysis of variance. While these parametric and nonparametric tests
can be calculated by hand, they are fairly complicated and a bit tedious, so most evaluators rely on
computerized statistical programs. Therefore, each test will be described in this section, but no hand
calculations will be provided. For individuals interested in the manual calculations, any good statistics
book will provide the formulas and necessary steps.
Parametric Choices For Determining Differences
T-tests
Two types of t-tests are available to the evaluator: the two-sample or independent t-test and the
matched pairs or dependent t-test. The independent t-test, also called the two-sample t-test, is used to test
the differences between the means of two groups. The result of the analysis is reported as a t value. It is
important to remember that the two groups are mutually exclusive and not related to each other. For
example, you may want to know if youth coaches who attended a workshop performed better on some
aspect of teaching than those coaches who didn't attend. In this case, the teaching score is the dependent
variable, and the two groups of coaches (attendees and non-attendees) form the independent variable. If the t
statistic that results from the t-test has a probability < .05, then we conclude the two groups are different. We
would examine the means for each group (those who attended as one group and those who did not) and
determine which set of coaches had the higher teaching scores.
The dependent t-test, often called the matched pairs t-test, is used to show how a group differs
during two points in time. The matched t-test is used if the two samples are related. This test is used if
the same group was tested twice or when the two separate groups are matched on some variable. For
example, in the above situation where you were offering a coaching clinic, it may be that you would want
to see if participants learned anything at the clinic. You would test a construct such as knowledge of the
game before the workshop and then again after the training to see if any difference existed in knowledge
gained by the coaches that attended the clinic. The same coaches would fill out the questionnaire as a
pretest and a posttest. If a statistical difference of p <.05 occurs, you can assume that the workshop made
a difference in the knowledge level of the coaches. If the t-test was not significant (p>.05), you could
assume that the clinic made little difference in the knowledge level of coaches before and after the clinic.
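A brief sketch of both t-tests in Python (assuming the scipy library; the coaching scores below are invented for illustration) follows:

# Independent (two-sample) and dependent (matched pairs) t-tests.
from scipy.stats import ttest_ind, ttest_rel

# Independent t-test: teaching scores of coaches who attended the workshop vs. those who did not.
attended     = [78, 85, 82, 90, 74, 88, 81]
not_attended = [70, 76, 68, 80, 72, 75, 71]
t_ind, p_ind = ttest_ind(attended, not_attended)
print(f"independent t = {t_ind:.2f}, p = {p_ind:.3f}")

# Dependent t-test: knowledge scores for the same coaches before and after the clinic.
pretest  = [12, 15, 11, 14, 16, 13, 10]
posttest = [16, 18, 15, 17, 19, 16, 14]
t_rel, p_rel = ttest_rel(posttest, pretest)
print(f"matched-pairs t = {t_rel:.2f}, p = {p_rel:.3f}")

In each case, a p-value below .05 would lead us to conclude that the two sets of scores differ.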
Analysis of Variance (ANOVA)
Analysis of variance is used to determine differences among means when there are more than
two groups examined. It is the same concept as the independent t-tests except you have more than two
groups. If there is only one independent variable (usually nominal or ordinal level data) consisting of
more than two values, then the procedure is called a one-way ANOVA. If there are two independent
variables, you would use a two-way ANOVA. The dependent variable is the measured variable and is
usually interval or ratio level data. Both one-way and two-way ANOVAs are used to determine if three
or more sample means are different from one another (Hale, 1990). Analysis of variance results are
reported as an F statistic. For example, a therapeutic recreation specialist may want to know if
inappropriate behaviors of psychiatric patients are affected by the type of therapy used (behavior
modification, group counseling, or nondirective). In this example, the measured levels of inappropriate
behavior would be the dependent variable while the independent variable would be type of therapy that is
divided into three groups. If you found a statistically significant difference in the behavior of the patients
(p < .05), you would examine the means for the three groups and determine which therapy was best.
Additional post-hoc tests, such as the Bonferroni or Scheffé tests, may be used with ANOVA to
ascertain which groups differ from one another.
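The one-way ANOVA from the therapy example might look like the following sketch (Python with scipy; the behavior scores are hypothetical):

# One-way ANOVA on hypothetical counts of inappropriate behaviors under three therapies.
from scipy.stats import f_oneway

behavior_modification = [4, 3, 5, 2, 4, 3]
group_counseling      = [6, 7, 5, 8, 6, 7]
nondirective          = [9, 8, 10, 7, 9, 8]

F, p = f_oneway(behavior_modification, group_counseling, nondirective)
print(f"F = {F:.2f}, p = {p:.3f}")
# If p < .05, examine the group means (or run a post-hoc test such as Bonferroni)
# to see which therapy is associated with the fewest inappropriate behaviors.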
Other types of parametric statistical tests are available for more complex analysis purposes. The
t-tests and ANOVA described here are the most frequently used parametric statistics by beginning
evaluators.
Nonparametric Choices for Determining Differences
For any of the common parametric statistics, a parallel nonparametric statistic exists in most
cases. The use of these statistics depends on the level of measurement of the data and the sample size, as shown in
Table 3.5(1). The most common nonparametric statistics include the Mann-Whitney U, Sign Test,
Wilcoxon Signed Ranks Test, Kruskal-Wallis, and Friedman Analysis of Variance.
Mann-Whitney U Test
The Mann-Whitney U test is used to test for differences in rankings on some variable between
two independent groups when the data are not scaled in intervals (Lundegren & Farrell, 1985). For
example, you may want to analyze self-concept scores of high fit and low fit women who participated in
a weight-training fitness program. You administer a nominal scale form where respondents rank
characteristics as "like me" or "not like me." Thus, this test will let you determine the effects of this
program on your participants by comparing the two groups and their self-concept scores. This test is
equivalent to the independent t-test.
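A sketch of this analysis in Python (scipy assumed; the self-concept scores below are invented, treated here as the number of "like me" responses each woman gave) is:

# Mann-Whitney U test comparing self-concept scores of high-fit and low-fit participants.
from scipy.stats import mannwhitneyu

high_fit = [14, 17, 15, 18, 16, 13]
low_fit  = [10, 12, 9, 11, 13, 8]

U, p = mannwhitneyu(high_fit, low_fit)
print(f"U = {U:.1f}, p = {p:.3f}")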
The Sign Test
The Sign test is used when the independent variable is categorical and consists of two levels.
The dependent variable is assumed to be continuous but can't be measured on a continuous scale, so a
categorical scale is substituted (Hale, 1990). This test becomes the nonparametric equivalent to the
dependent or matched t-test. The Sign test is only appropriate if the dependent variable is binary (i.e.,
takes only two different values). Suppose you want to know if switching from wood to fiber
daggerboards on your Sunfish sailboats increases the likelihood of winning regattas. The Sign test will
let you test this question by comparing the two daggerboards as the independent variable and the number
of races won as the dependent variable.
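Because the Sign test amounts to asking how often one condition "wins," it can be run as a binomial test on the signs of the paired outcomes. The sketch below (Python with scipy; the regatta results are invented) tests whether the fiber daggerboard wins more often than chance would predict:

# Sign test run as a binomial test on the signs of paired regatta outcomes.
from scipy.stats import binomtest

fiber_better = 9   # paired regattas in which the fiber daggerboard finished ahead
wood_better  = 3   # paired regattas in which the wooden daggerboard finished ahead

result = binomtest(fiber_better, n=fiber_better + wood_better, p=0.5)
print(f"sign test p = {result.pvalue:.3f}")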
Wilcoxon Signed Ranks Test
The Wilcoxon test is also a nonparametric alternative to the t-test and is used when the
dependent variable has more than two values that are ranked. Positions in a race are a good example of
this type of dependent variable. As in the Sign test, the independent variable is categorical and has two
levels. For example, you might wish to know if a difference exists between runners belonging to a
running club and those that do not compared to their finish positions in a road race. The results of this
analysis indicate the difference in ranks on the dependent variable (their finish position) between the two
related groups (the two types of runners).
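A sketch of the Wilcoxon test in Python (scipy assumed) appears below; the finish positions are invented, and the club and non-club runners are treated as matched pairs (for example, paired by age group), since the Wilcoxon test requires related observations:

# Wilcoxon signed ranks test on paired road-race finish positions.
from scipy.stats import wilcoxon

club_member = [3, 7, 5, 12, 9, 15, 6, 11]
non_member  = [8, 10, 4, 20, 14, 18, 9, 16]

statistic, p = wilcoxon(club_member, non_member)
print(f"Wilcoxon statistic = {statistic:.1f}, p = {p:.3f}")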
Kruskal-Wallis
The Kruskal-Wallis test is equivalent to the one-way ANOVA. It is used when you have an
independent variable with two or more values that you wish to compare to a dependent variable. You use
this statistic to see if more than two independent groups have the same mean or median. Since this
procedure is a nonparametric test, you would use the statistic when your sample size is small or you are
not sure of the distribution. An example of an application would be to compare the attitudes of
counselors at a camp (the dependent variable) toward three different salary payment plans: weekly, twice
a summer, or end of the summer (independent variable).
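A sketch of the camp-counselor example in Python (scipy assumed; the attitude ratings are invented):

# Kruskal-Wallis test on counselor attitude ratings toward three payment plans.
from scipy.stats import kruskal

weekly        = [8, 9, 7, 8, 9]
twice_summer  = [6, 5, 7, 6, 5]
end_of_summer = [4, 3, 5, 4, 2]

H, p = kruskal(weekly, twice_summer, end_of_summer)
print(f"H = {H:.2f}, p = {p:.3f}")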
Friedman Analysis of Variance
The Friedman Analysis of Variance test is used for repeated measures analysis when you
measure a subject two or more times. You must have a categorical independent variable with more than
two values and a rank-order dependent variable that is measured more than twice (Hale, 1990). This test
is the nonparametric equivalent to repeated measures analysis of variance (sometimes called the Friedman two-way ANOVA by ranks). An
example of a situation when you might use this statistical procedure would be if you had ranked data
from multiple judges. For example, suppose you were experimenting with four new turf grasses on your
four soccer fields. You ask your five maintenance workers to rank the grasses on all four fields for
durability. You could analyze these results with the Friedman procedure.
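A sketch of the turf-grass example in Python (scipy assumed; the rankings are invented):

# Friedman test on durability rankings of four turf grasses,
# where each of five maintenance workers ranks all four grasses (1 = most durable).
from scipy.stats import friedmanchisquare

# One list per grass; positions line up across the five workers.
grass_a = [1, 2, 1, 1, 2]
grass_b = [2, 1, 2, 3, 1]
grass_c = [3, 4, 4, 2, 3]
grass_d = [4, 3, 3, 4, 4]

statistic, p = friedmanchisquare(grass_a, grass_b, grass_c, grass_d)
print(f"Friedman chi-square = {statistic:.2f}, p = {p:.3f}")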
Making Decisions about Statistics
The decision model for choosing the appropriate type of statistical procedure is found in Figure
3.5(3). This model provides a logical progression of questions about your parametric and nonparametric
data that will help you in your selection of appropriate statistics. Many other forms of statistical
measures are possible, but the ones described here will provide you with the most common forms used in
evaluations.
Insert Figure 3.5(3) about here
__________________
From Ideas to Reality
This chapter initially may seem complex and full of terms not familiar to you. We realize that
statistics are often confusing and overwhelming to some people, but as you begin to understand what they
mean, they can be extremely helpful. We acknowledge that one short chapter will not make you an
expert on these statistics, but we hope this chapter will at least show you some of the ways that we can
gain valuable information by examining the relationships between and among variables. The more that
you use statistical procedures, the more you will understand them. Thank goodness we have computers
these days to save time and effort. Computers, however, do not absolve us from knowing what statistics
to use to show associations or differences within our data. If you want more information than is offered
in this short chapter, you may want to consult one of the many statistics books that exist.
__________________
Now that You have Studied this Chapter, You Should be Able to:
--Explain when parametric and nonparametric statistics should be used
--Given a situation with the levels of measurement data known, choose the correct statistical
procedure to use
--Use the decision model for choosing appropriate statistical procedures to make decisions about
a statistical problem
Figure 3.5(2) Examples of Correlation Relationships
(two scatterplots: one showing a +1 positive correlation, the other a -1 negative correlation)

Figure 3.5(3) A Statistical Decision Tree
(adapted from Hale, 1990)
(The decision tree asks a sequence of questions: Is the dependent variable ordinal, interval, or ratio? Is it measured once, twice, or more than twice? Is the dependent variable binary? Are the samples related? Are you comparing a sample mean to a known or hypothesized population mean, and do you know the population standard deviation? Are you comparing the means of two samples, or of more than two groups? Is there a single independent variable or more than one? Are you describing the relation between two variables, and are the variables interval or ranked? Depending on the answers, the tree ends at the one-way or two-way chi-square, the Sign test, the Wilcoxon test, the Friedman nonparametric analysis of variance, the one-sample z-test or t-test, the dependent or independent t-test, the one-way or two-way ANOVA, or the Pearson or Spearman correlation.)