EDF 802
Dr. Jeffrey Oescher
Topic 1: Answers to the Formative Exercises
Revised 23 January 2014

The following are sample responses to Items 1-6. Please compare these to the responses you've written. If there are any substantive differences, please let me know.

1. An examination of Table 1 indicates students in this sample answered approximately 70% of the items on Exam 1 correctly. This is considered a ‘C’ on a ten-point grading scale. Their performance increased on Exam 2, where they correctly responded to slightly more than 81% of the items. The grade associated with this score is a ‘B’. Variation in the scores for both exams is moderate. This interpretation was determined by multiplying a standard deviation of approximately 10 by 4 (i.e., 10 * 4 = 40). Forty points on a scale of 100 points was deemed by me to be moderate, as it represents slightly less than one-half of all possible scores. Students’ attitudes can be described as somewhat positive based on the mean score for this variable. Variation in the scores is moderate. This interpretation was determined by multiplying the standard deviation of 0.51 by 4 (i.e., 0.51 * 4 = 2.04). This result represents about one-half of the total range of variation in scores on a four-point scale, and it is deemed moderate by me.

Table 1
Descriptive Statistics for Variables in Formative Exercise 1

Variable     N    Mean     SD
Exam 1      30   51.67  10.17
Exam 2      30   60.97   9.96
Attitude    30    3.87   0.51

2. A sampling distribution is a frequency distribution of a specific statistic calculated from repeatedly sampling subjects from a population. The most common one discussed is the sampling distribution of the mean. This results from taking numerous samples of the same size from a population, calculating the mean for each sample, and plotting these means into a frequency distribution. Statisticians have determined the "shape" of a sampling distribution of the mean is much like that of the common bell curve.
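The "multiply the standard deviation by 4" heuristic from item 1 can be sketched as follows. This is a minimal illustration using only the summary values copied from Table 1; the raw scores are not reproduced in this handout.

```python
# The "SD x 4" variation heuristic from item 1.
# Summary values are taken from Table 1; no raw data are shown in the handout.
summary = {
    "Exam 1":   {"sd": 10.17, "scale": 100},
    "Exam 2":   {"sd": 9.96,  "scale": 100},
    "Attitude": {"sd": 0.51,  "scale": 4},
}

for name, s in summary.items():
    spread = 4 * s["sd"]            # rough span covering most of the scores
    share = spread / s["scale"]     # fraction of the possible score range
    print(f"{name}: 4 * SD = {spread:.2f} ({share:.0%} of the scale)")
```

Running this reproduces the reasoning in the answer: roughly 40 points (a bit under half) of the 100-point exam scales, and about half of the 4-point attitude scale.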
Actually, there is a family of distributions based on the number of subjects included in the sample. This family of distributions is known as a set of t-distributions. Each distribution has a specific number of degrees of freedom calculated on the basis of the size of the sample. Thus, one might say the sampling distribution of the mean for samples of 20 subjects is a t-distribution with 19 degrees of freedom.

There are many sampling distributions, some of which you have likely studied (e.g., difference between two means (t), difference between two proportions (normal z-distribution), one-sample case for the mean (t), one-sample case for correlation (normal z-distribution)). There is a sampling distribution for every descriptive statistic as well as for statistics calculated from them. For example, there is a sampling distribution for the difference between two means. Consider two populations from which you sample 20 subjects each. If you calculate the mean for both samples and subtract one from the other, you have a single "difference between two means." Repeating this process many times would result in many differences between two means. Plotting all of these differences would result in a frequency distribution of the difference between means. Again, statisticians have determined the "shape" of this distribution is normal. In fact, it too is a t-distribution, and again there is a family of distributions with each one based on the size of the sample or samples used. In the example above, one would say the sampling distribution of the difference between two means is a t-distribution with 38 degrees of freedom (i.e., n1 + n2 - 2 = 20 + 20 - 2 = 38).

An understanding of the nature of a sampling distribution is critically important to understanding the logic of hypothesis testing. As you will see in my response to the next question, one of the important steps in hypothesis testing is assuming the null hypothesis is true and generating a sampling distribution based on this assumption.
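The repeated-sampling process described above can be simulated directly. This is a sketch of the idea, not the course's procedure: the population mean and standard deviation below are illustrative assumptions, not values from the exercise.

```python
# Simulating a sampling distribution of the difference between two means
# under a true null hypothesis (both samples come from the same population).
# Population parameters (mu=50, sigma=10) are illustrative assumptions.
import random
import statistics

random.seed(42)

def simulate_diffs(n=20, mu=50, sigma=10, reps=5000):
    """Repeatedly draw two samples of size n and record mean1 - mean2."""
    diffs = []
    for _ in range(reps):
        s1 = [random.gauss(mu, sigma) for _ in range(n)]
        s2 = [random.gauss(mu, sigma) for _ in range(n)]
        diffs.append(statistics.fmean(s1) - statistics.fmean(s2))
    return diffs

diffs = simulate_diffs()
print("center of the distribution:", round(statistics.fmean(diffs), 2))
print("SD of the diffs (the standard error):", round(statistics.stdev(diffs), 2))
```

The plotted differences pile up around 0, and their standard deviation is the standard error of the difference between two means discussed in the next paragraphs.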
Picture in your mind a sampling distribution of the difference between two means as a normal curve. Based on the null hypothesis of no difference between the two means, we know the middle of this distribution is 0. We also know that this distribution has the same characteristics as any normal curve. For example, approximately two-thirds of the differences between means lie within ±1 "standard deviation" of the mean. The standard deviation of this distribution is known as the standard error of the difference between two means. (That language makes sense in that we have a distribution of the differences between two means.) Calculating this standard error is somewhat laborious, but it can be done. I let SPSS worry about this stuff at this point in my career. Once the value is determined, we can predict that two-thirds of the differences between means lie within ±1 standard error. We could also predict that almost all of the differences between means fall within ±2 standard errors. In fact, we can calculate the actual percentage of differences between means that lie above, below, or between any values of the actual observed difference between sample means. This used to be done by looking at the tables of t values in the appendices of textbooks, but I now use SPSS-Windows to tell me.

3. There are six basic steps in the inferential analysis of any hypothesis.

1. State the null and alternative hypotheses.
2. Set alpha (i.e., the level of significance).
3. Assume the null hypothesis is true and generate a sampling distribution of the appropriate statistic.
4. Calculate the observed statistic from the sample data collected.
5. Map the observed statistic into the sampling distribution.
6. Ascertain if the observed statistic is typical or atypical of the values of the statistics in the sampling distribution.

4. The comparison of Exam 1 scores across Groups 1 and 2 requires the use of an independent samples (i.e., two groups and one dependent variable) t-test.
Why? Because I said so. Actually, statisticians have determined the sampling distribution of the difference between two means for independent samples is a t-distribution with n1 + n2 - 2 degrees of freedom. In this example, the sampling distribution is t28. How was this hypothetical or theoretical sampling distribution generated?

I will explain the steps for testing this hypothesis briefly. First, the null hypothesis states there is no difference between the population means for Groups 1 and 2 (H0: µ1 − µ2 = 0). The alternative hypothesis suggests there is a difference (H1: µ1 − µ2 ≠ 0). Alpha is set at .05. Assuming the null hypothesis is true, a sampling distribution of the difference between two means was generated. This distribution is centered on 0 and has a standard error of 3.78. (From where did this standard error come, and what is it?) An observed test statistic of t28 = .04 was calculated. (How was this done?) When mapped into the underlying sampling distribution, this observed statistic is quite common. That is, the probability level associated with it is .972, suggesting the value of .04 is located quite close to the middle of the sampling distribution. As a result, the null hypothesis is accepted. There is no difference between the mean Exam 1 scores for Groups 1 and 2 despite the fact that the scores are not exactly the same.

The write-up for this analysis is very simple. The comparison of the scores on Exam 1 for the students in Groups 1 and 2 was non-significant (t28 = 0.04, p = .972), suggesting there is no significant difference between these scores across groups. We will discuss in class how you could use a different SPSS procedure to analyze these data. I recommend using either the ONE-WAY ANOVA or GLM-UNIVARIATE procedures, as they have many options that will become important later in the course.

5. The basic analysis for this question parallels that of the former question.
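The independent-samples analysis from item 4 can be sketched outside SPSS as well. The groups below are hypothetical stand-ins (the raw Exam 1 scores are not reproduced in this handout); only the group sizes are chosen to match the 28 degrees of freedom above.

```python
# A sketch of the independent-samples t-test from item 4 using scipy.
# The scores are hypothetical; only the group sizes (15 + 15 - 2 = 28 df)
# mirror the analysis described in the text.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group1 = rng.normal(loc=51.5, scale=10, size=15)  # assumed group of 15
group2 = rng.normal(loc=51.8, scale=10, size=15)  # assumed group of 15

t, p = stats.ttest_ind(group1, group2)  # equal variances assumed by default
print(f"t(28) = {t:.2f}, p = {p:.3f}")
```

With population means this close, the observed t typically falls near the middle of the sampling distribution and the null hypothesis is retained, just as in the Exam 1 analysis.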
The comparison of Exam 2 scores across Groups 1 and 2 requires the use of an independent samples (i.e., two groups and one dependent variable) t-test. Statisticians have determined the sampling distribution of the difference between two means for independent samples is a t-distribution with n1 + n2 - 2 degrees of freedom. In this example, the sampling distribution is t28. (Actually, the statistical assumption of equal variances between the two groups was violated based on Levene’s test for homogeneity of variance. This requires an adjustment to the degrees of freedom for the underlying sampling distribution. The correct sampling distribution is t24.59. This information can be found in the output file, but it rarely has any impact on the results. At this point in the course, you can ignore this matter.)

To determine whether the two means are different requires the researcher to state the null and alternative hypotheses: H0: µ1 − µ2 = 0 and H1: µ1 − µ2 ≠ 0, respectively. Alpha is set at .05. Assuming the null hypothesis is true, a sampling distribution of the difference between two means was generated. This distribution is centered on 0 and has a standard error of 3.35. An observed test statistic of t24.59 = -2.49 was calculated. When mapped into the underlying sampling distribution, this observed statistic is atypical of the other t-statistics in the sampling distribution. That is, the probability level of .020 suggests the value of -2.49 is very unusual; a difference of this magnitude is likely to be seen less than two percent of the time if the two means are equal. The result is the rejection of the null hypothesis; there is a difference between the mean Exam 2 scores for Groups 1 and 2.

Again, the write-up for this analysis is very simple. The comparison of the scores on Exam 2 for the students in Groups 1 and 2 was statistically significant (t24.59 = -2.49, p = .020), suggesting there is a significant difference between these scores across groups.
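The Levene's test and Welch adjustment described above can be sketched as follows. The data are hypothetical (constructed with clearly unequal spreads); the exercise's raw Exam 2 scores are not reproduced here.

```python
# A sketch of checking homogeneity of variance with Levene's test and,
# when it is violated, using the Welch adjustment (equal_var=False),
# which changes the degrees of freedom as in the t(24.59) reported above.
# The groups are hypothetical, built with deliberately unequal spreads.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group1 = rng.normal(loc=58, scale=5, size=15)    # small spread
group2 = rng.normal(loc=66, scale=14, size=15)   # large spread

lev_stat, lev_p = stats.levene(group1, group2)   # homogeneity of variance
t, p = stats.ttest_ind(group1, group2, equal_var=False)  # Welch's t-test
print(f"Levene p = {lev_p:.3f}; Welch t = {t:.2f}, p = {p:.3f}")
```

A small Levene p-value signals unequal variances, which is the cue to read the "equal variances not assumed" row of the SPSS output.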
We will discuss in class how you could use a different SPSS-Windows procedure to analyze these data. I recommend using either the ONE-WAY ANOVA or GLM-UNIVARIATE procedures, as they have many options that will become important later in the course.

6. The comparison of the mean attitude scores for all subjects to a neutral value of 3.00 is accomplished using a one-sample test of the mean. The underlying sampling distribution is a t-distribution with n - 1 degrees of freedom. Again, this comes from statistical theory. (Can you describe how this sampling distribution was created?) The null hypothesis suggests the population mean for attitude scores is 3.00. The alternative hypothesis suggests the population mean is something other than 3.00 (i.e., negative or positive, but not neutral). Alpha is set at .05. The statistical notation for these hypotheses is as follows: H0: µ = 3.00 and H1: µ ≠ 3.00. Assuming the null hypothesis is true, a sampling distribution of t29 is generated. This is centered on 3.00 and has a standard error of the mean of .09. (Can you find this standard error in the PASW printout?) An observed value t29 = 9.30 was calculated.

At this point, I want to digress to explain the formula and how the value of 9.30 was calculated. The actual formula for this test has as the numerator the sample attitude mean less the hypothesized population mean. In this case the sample mean is 3.87 and the hypothesized population mean is 3.00. The difference is 0.87, but this must be standardized by dividing by the standard error of the mean. This standard error is .09. (Again, how did I know this?) Dividing 0.87 by the standard error results in a quotient of approximately 9.3. The small difference between this and the statistic produced in SPSS is due to rounding of the standard error.

t = (x̄ − µ) / s_x̄   or   t = (3.87 − 3.00) / .09

What needs to be determined now is where this observed value of t (i.e., 9.30) falls in the underlying sampling distribution.
The probability level given in the output file is .000, suggesting this difference is quite different from one that would be expected under the assumption the null hypothesis was true. The null hypothesis is rejected, and the alternative hypothesis is accepted. This analysis suggests the attitudes for this group are positive.

Once again the write-up is straightforward. The comparison of the mean attitudinal score for the 30 subjects was significantly different from a neutral value of 3.00 (t29 = 9.30, p = .000), suggesting the subjects' attitudes are positive.
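The one-sample test from item 6 can be sketched as follows. The ratings are hypothetical (the item-level attitude data are not reproduced in this handout); the by-hand calculation mirrors the formula in the digression above.

```python
# A sketch of the one-sample t-test from item 6 using scipy, plus the same
# statistic computed by hand with t = (xbar - mu) / (s / sqrt(n)).
# The 30 ratings are hypothetical stand-ins on a 1-5 scale.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
attitudes = np.clip(rng.normal(loc=3.87, scale=0.51, size=30), 1.0, 5.0)

t, p = stats.ttest_1samp(attitudes, popmean=3.00)

se = attitudes.std(ddof=1) / np.sqrt(len(attitudes))  # standard error of the mean
t_by_hand = (attitudes.mean() - 3.00) / se            # formula from the text
print(f"t(29) = {t:.2f}, p = {p:.4f}")
```

The two computations agree exactly, which is a useful check that the "standardize the mean difference by the standard error" formula is what the procedure is doing.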