Formative Exercise 1 Answers

EDF 802
Dr. Jeffrey Oescher
Topic 1
Answers to the Formative Exercises
Revised 23 January 2014
The following are sample responses to Items 1-6. Please compare these to the responses you've written.
If there are any substantive differences, please let me know.
1. An examination of Table 1 indicates students in this sample answered approximately 70% of the
items on Exam 1 correctly. This is considered a ‘C’ on a ten point grading scale. Their
performance increased on Exam 2 where they correctly responded to slightly more than 81% of
the items. The grade associated with this score is a ‘B’. Variation in the scores for both exams is
moderate. This interpretation was determined by multiplying a standard deviation of approximately
10 by 4 (i.e., 10 * 4 = 40). Forty points on a scale of 100 points was deemed by me to be
moderate, as it represents slightly less than one-half of all possible scores.
Students’ attitudes can be described as somewhat positive based on the mean score for this
variable. Variation in the scores is moderate. This interpretation was determined by multiplying
the standard deviation of 0.51 by 4 (i.e., 0.51 * 4 = 2.04). This result represents about one half of
the total range of variation in scores on a four point scale, and it is deemed moderate by me.
Table 1
Descriptive Statistics for Variables in Formative Exercise 1

Variable     N    Mean     SD
Exam 1      30   51.67   10.17
Exam 2      30   60.97    9.96
Attitude    30    3.87    0.51
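The "four standard deviations" heuristic used above can be sketched in a few lines of Python. The summary values are taken from Table 1; SPSS would compute them from the raw data file, which is not reproduced here.

```python
# Sketch of the "moderate variation" heuristic from the answer above.
# Summary statistics come from Table 1; the raw scores are not available.
summary = {
    "Exam 1":   {"mean": 51.67, "sd": 10.17, "scale": 100},
    "Exam 2":   {"mean": 60.97, "sd": 9.96,  "scale": 100},
    "Attitude": {"mean": 3.87,  "sd": 0.51,  "scale": 4},
}

for name, s in summary.items():
    spread = 4 * s["sd"]            # rule of thumb: ~4 SDs span most scores
    fraction = spread / s["scale"]  # share of the full scale covered
    print(f"{name}: 4*SD = {spread:.2f} ({fraction:.0%} of the scale)")
```

For the attitude variable this reproduces the 0.51 * 4 = 2.04 calculation, about half of the four-point scale.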
2. A sampling distribution is a frequency distribution of a specific statistic calculated from repeatedly
sampling subjects from a population. The most common one discussed is that of the sampling
distribution of the mean. This results from having taken numerous samples of the same size from
a population, calculating the mean for each sample, and plotting these means into a frequency
distribution. Statisticians have determined the "shape" of a sampling distribution of the mean is
much like that of the common bell curve. Actually, there is a family of distributions based on the
number of subjects included in the sample. This family of distributions is known as the set of
t-distributions. Each distribution has a specific number of degrees of freedom calculated on the
basis of the size of the sample. Thus, one might say the sampling distribution of the mean for
samples of 20 subjects is a t-distribution with 19 degrees of freedom.
There are many sampling distributions, some of which you have likely studied (e.g., difference
between two means (t), difference between two proportions (normal z-distribution), one sample
case for the mean (t), one sample case for correlation (normal z-distribution)). There is a
sampling distribution for every descriptive statistic as well as statistics calculated from them. For
example, there is a sampling distribution for the difference between two means. Consider two
populations from which you sample 20 subjects from each. If you calculate the mean for both
samples and subtract one from the other, you have a single "difference between two means."
Repeating this process many times would result in many "differences between two means."
Plotting all of these differences would result in a frequency distribution of the difference between
means. Again, statisticians have determined the "shape" of this distribution is normal. In fact, it
too is a t-distribution, and again there is a family of distributions with each one based on the size
of the sample or samples used. In the example above, one would say the sampling distribution of
the difference between two means is a t-distribution with 38 degrees of freedom (i.e., n1 + n2 − 2 = 38).
An understanding of the nature of a sampling distribution is critically important to understanding
the logic of hypothesis testing. As you will see in my response to the next question, one of the
important steps in hypothesis testing is assuming the null hypothesis is true and generating
a sampling distribution based on this assumption. Picture in your mind a sampling distribution of
the difference between two means as a normal curve. Based on the null hypothesis of no
difference between the two means, we know the middle of this distribution is 0. We also know that
this distribution has the same characteristics as any normal curve. For example, approximately
two-thirds of the differences between means lie within ±1 "standard deviation" of the mean.
The standard deviation of this distribution is known as the standard error of the difference
between two means. (That language makes sense in that we have a distribution of the
differences between two means.) Calculating this standard error is somewhat laborious, but it can
be done. I let SPSS worry about this stuff at this point in my career. Once the value is
determined, we can predict that two-thirds of the differences between means lie within ±1
standard error. We could also predict that almost all of the differences between
means fall between ±2 standard errors. In fact we can calculate the actual percentage of
differences between means that lie above, below, or between any values of the actual observed
difference between sample means. This used to be done by looking at the tables of t values in the
appendices of textbooks, but I now use SPSS-Windows to tell me.
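The "two-thirds" and "almost all" figures above can be checked numerically. With moderate degrees of freedom the t-distribution is close to the normal curve, so the standard library's NormalDist gives a reasonable approximation:

```python
from statistics import NormalDist

# Proportion of differences between means expected to fall within ±1 and
# ±2 standard errors of 0, using the normal curve as a stand-in for a
# t-distribution with moderate degrees of freedom.
nd = NormalDist()
within_1 = nd.cdf(1) - nd.cdf(-1)
within_2 = nd.cdf(2) - nd.cdf(-2)
print(f"within ±1 standard error:  {within_1:.1%}")  # roughly two-thirds
print(f"within ±2 standard errors: {within_2:.1%}")  # almost all
```

Exact t-based percentages for a given degrees of freedom are what the old textbook tables (and now SPSS) provide.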
3. There are six basic steps in the inferential analysis of any hypothesis.
1. State the null and alternative hypotheses.
2. Set alpha (i.e., the level of significance).
3. Assume the null hypothesis is true and generate a sampling distribution of the
appropriate statistic.
4. Calculate the observed statistic from the sample data collected.
5. Map the observed statistic into the sampling distribution.
6. Ascertain if the observed statistic is typical or atypical of the values of the statistics in the
sampling distribution.
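The six steps can be traced in code. This sketch uses made-up attitude-style scores and a normal approximation to the t sampling distribution (standard library only), so it illustrates the logic rather than replacing SPSS:

```python
from statistics import NormalDist, mean, stdev
from math import sqrt

# Hypothetical scores invented for illustration.
scores = [3.4, 4.0, 3.8, 3.6, 4.2, 3.9, 3.5, 4.1, 3.7, 3.8]

# 1. State the null and alternative hypotheses: H0: mu = 3.00, H1: mu != 3.00.
mu0 = 3.00
# 2. Set alpha (the level of significance).
alpha = .05
# 3. Assume H0 is true: the sampling distribution of the mean is centered
#    on mu0 with standard error s / sqrt(n) (approximated here as normal).
se = stdev(scores) / sqrt(len(scores))
# 4. Calculate the observed statistic from the sample data.
t_obs = (mean(scores) - mu0) / se
# 5. Map the observed statistic into the sampling distribution.
p = 2 * (1 - NormalDist().cdf(abs(t_obs)))
# 6. Ascertain whether the observed statistic is typical or atypical.
print(f"t = {t_obs:.2f}, p = {p:.3f}, reject H0: {p < alpha}")
```

Steps 3 and 5 are the conceptual heart of the procedure; the rest is bookkeeping.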
4. The comparison of Exam 1 scores across Groups 1 and 2 requires the use of an independent
samples (i.e., two groups and one dependent variable) t-test. Why? Because I said so. Actually,
statisticians have determined the sampling distribution of the difference between two means for
independent samples is a t-distribution with n1 + n2 - 2 degrees of freedom. In this example, the
sampling distribution is t28. How was this hypothetical or theoretical sampling distribution
generated?
I will explain the steps for testing this hypothesis briefly. First, the null hypothesis states there is
no difference between the population means for Groups 1 and 2 (H0: µ1 − µ2 = 0). The
alternative hypothesis suggests there is a difference (H1: µ1 − µ2 ≠ 0). Alpha is set at .05.
Assuming the null hypothesis is true, a sampling distribution of the difference between two means
was generated. This distribution is centered on 0 and has a standard error of 3.78. (From where
did this standard error come and what is it?) An observed test statistic of t28 = .04 was calculated.
(How was this done?) When mapped into the underlying sampling distribution, this observed
statistic is quite common. That is, the probability level associated with it is .972, suggesting the
value of .04 is located quite close to the middle of the sampling distribution. As a result, the null
hypothesis is accepted. There is no difference between the mean Exam 1 scores for Groups 1
and 2 despite the fact that the scores are not exactly the same.
The write-up for this analysis is very simple. The comparison of the scores on Exam 1 for the
students in Groups 1 and 2 was non-significant (t28 = 0.04, p = .972), suggesting there is no
significant difference between the two groups' scores.
We will discuss in class how you could use a different SPSS procedure to analyze this data. I
recommend using either the ONE-WAY ANOVA or GLM-UNIVARIATE procedures as they have
many options that will become important later in the course.
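Outside of SPSS, the same independent-samples t-test can be run with scipy.stats.ttest_ind. The Exam 1 scores below are hypothetical stand-ins generated for illustration, since the actual data file is not reproduced here:

```python
import numpy as np
from scipy import stats

# Hypothetical Exam 1 scores for two groups of 15 subjects each, drawn
# from the same population (so there is no true difference to detect).
rng = np.random.default_rng(1)
group1 = rng.normal(loc=51.7, scale=10, size=15)
group2 = rng.normal(loc=51.7, scale=10, size=15)

# Independent-samples t-test; df = n1 + n2 - 2 = 28.
t, p = stats.ttest_ind(group1, group2)
print(f"t(28) = {t:.2f}, p = {p:.3f}")
```

With both samples drawn from one population, the observed t will usually land near the middle of the t28 sampling distribution, as in the exercise.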
5. The basic analysis for this question parallels that of the former question. The comparison of Exam
2 scores across Groups 1 and 2 requires the use of an independent samples (i.e., two groups
and one dependent variable) t-test. Statisticians have determined the sampling distribution of the
difference between two means for independent samples is a t-distribution with n1 + n2 − 2 degrees
of freedom. In this example, the sampling distribution is t28. (Actually, the statistical assumption of
equal variances between the two groups was violated based on Levene’s test for homogeneity of
variance. This requires an adjustment to the degrees of freedom for the underlying sampling
distribution. The correct sampling distribution is t24.59. This information can be found in the output file,
but it rarely has any impact on the results. At this point in the course, you can ignore this matter.)
To determine whether the two means are different requires the researcher to state the null and
alternative hypotheses: H0: µ1 − µ2 = 0 and H1: µ1 − µ2 ≠ 0 respectively. Alpha is set at .05.
Assuming the null hypothesis is true, a sampling distribution of the difference between two means
was generated. This distribution is centered on 0 and has a standard error of 3.35. An observed
test statistic of t24.59 = -2.49 was calculated. When mapped into the underlying sampling
distribution, this observed statistic is atypical of the other t-statistics in the sampling distribution.
That is, the probability level of .020 suggests the value of -2.49 is very unusual; a difference of
this magnitude is likely to be seen less than two percent of the time if the two means are equal.
The result is the rejection of the null hypothesis; there is a difference between the mean Exam 2
scores for Groups 1 and 2.
Again, the write-up for this analysis is very simple. The comparison of the scores on Exam 2 for
the students in Groups 1 and 2 was statistically significant (t24.59 = -2.49, p = .020), suggesting
there is a significant difference between these scores across the two groups.
We will discuss in class how you could use a different SPSS-Windows procedure to analyze this
data. I recommend using either the ONE-WAY ANOVA or GLM-UNIVARIATE procedures as
they have many options that will become important later in the course.
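The Welch adjustment described above corresponds to `equal_var=False` in scipy.stats.ttest_ind, with Levene's test available as scipy.stats.levene. The data here are hypothetical stand-ins for the Exam 2 scores, constructed with unequal spreads so the adjustment is relevant:

```python
import numpy as np
from scipy import stats

# Hypothetical Exam 2 scores for two groups of 15, deliberately given
# different variances so the equal-variance assumption is suspect.
rng = np.random.default_rng(2)
group1 = rng.normal(loc=57, scale=12, size=15)
group2 = rng.normal(loc=65, scale=6, size=15)

# Levene's test for homogeneity of variance.
w, p_lev = stats.levene(group1, group2)

# Welch t-test: the degrees of freedom are adjusted downward from
# n1 + n2 - 2, as with the t(24.59) distribution in the exercise.
t, p = stats.ttest_ind(group1, group2, equal_var=False)
print(f"Levene p = {p_lev:.3f}; Welch t = {t:.2f}, p = {p:.3f}")
```

As noted above, the adjustment rarely changes the conclusion, but it is the technically correct sampling distribution when variances differ.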
6. The comparison of the mean attitude scores for all subjects to a neutral value of 3.00 is
accomplished using a one sample test of the mean. The underlying sampling distribution is a
t-distribution with n − 1 degrees of freedom. Again, this comes from statistical theory. (Can you
describe how this sampling distribution was created?)
The null hypothesis suggests the population mean for attitude scores is 3.00. The alternative
hypothesis suggests the population mean is something other than 3.00 (i.e., negative or positive,
but not neutral). Alpha is set at .05. The statistical notation for these hypotheses is as follows: H0:
µ = 3.00 and H1: µ ≠ 3.00. Assuming the null hypothesis is true, a sampling distribution of t29 is
generated. This is centered on 3.00 and has a standard error of the mean of .09. (Can you find
this standard error in the PASW printout?) An observed value t29 = 9.30 was calculated.
At this point, I want to digress to explain the formula and how the value of 9.30 was calculated.
The actual formula for this test has as the numerator the sample attitude mean less the
hypothesized population mean. In this case the sample mean is 3.87 and the hypothesized
population mean is 3.00. The difference is 0.87, but this must be standardized by dividing by the
standard error of the mean. This standard error is .09. (Again, how did I know this?) Dividing 0.87
by the standard error results in a quotient of approximately 9.3; the small difference between this
and the statistic produced in SPSS is due to rounding error.
t = (x̄ − µ) / s_x̄   or   t = (3.87 − 3.00) / .09
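The same calculation can be run with scipy.stats.ttest_1samp. The attitude scores below are hypothetical, generated to resemble the reported mean of 3.87 and SD of 0.51 for 30 subjects, since the raw data are not reproduced here:

```python
import numpy as np
from scipy import stats

# Hypothetical attitude scores for 30 subjects, drawn to resemble the
# reported summary statistics (mean ~3.87, SD ~0.51).
rng = np.random.default_rng(3)
scores = rng.normal(loc=3.87, scale=0.51, size=30)

# One-sample t-test against the neutral value of 3.00; df = n - 1 = 29.
t, p = stats.ttest_1samp(scores, popmean=3.00)
print(f"t(29) = {t:.2f}, p = {p:.4f}")
```

With a true mean this far above 3.00 relative to the standard error, the observed t lands deep in the tail of the t29 sampling distribution, mirroring the result in the exercise.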
What needs to be determined now is where this observed value of t (i.e., 9.30) falls in the
underlying sampling distribution. The probability level given in the output file is .000, suggesting
this observed value is far from what would be expected under the assumption that the null
hypothesis is true. The null hypothesis is rejected, and the alternative hypothesis is accepted.
This analysis suggests the attitudes for this group are positive.
Once again the write-up is straightforward. The mean attitudinal score for the 30 subjects was
significantly different from a neutral value of 3.00 (t29 = 9.30, p = .000), suggesting the subjects'
attitudes are positive.