Turner, J. Using statistics in small-scale language education research: Focus on non-parametric data

Chapter Five: Practice Problem Key

1. Identify the independent variable.
amount of exposure to authentic spoken Spanish

2. What type of scale defines the independent variable?
Amount of exposure to authentic spoken Spanish is a nominal variable.

3. Identify the levels of the independent variable.
Amount of exposure to authentic spoken Spanish has two levels, 2 hours and 6 hours. (Note that the 4 hours of reading is intended to reduce the possible threat to the internal validity of the study that would be present if one group of students had only 2 hours of homework and the other had a 6 hour requirement.)

4. Identify the dependent variable.
Accuracy of pronunciation in spoken Spanish (as measured by the Versant test).

5. What type of scale defines the dependent variable?
interval scale

6. Identify any explicit control variables.
level of study of Spanish (only beginning level students are included in the study)

7. Identify any explicit moderator variables.
There are no moderator variables.

8. What is the research design for this study?
I used the 3 ordered questions on page 80 in the textbook to determine the research design.

1) Is there an experimental treatment?
Yes, there is an innovation. The students listen to 6 hours of spoken Spanish instead of the usual 2 hours plus 4 hours of reading. (Note: If you perceive the study to simply have two different conditions, neither of which is an innovation/change from the usual practice, the design would be ex post facto; however, I believe the assignment to do 6 hours of listening is an innovation.)

2) Are there legitimate comparison groups?
The researcher randomly selected the participants from a population, and randomly assigned them to one of the two levels of the independent variable, forming legitimate comparison groups.

3) Are the legitimate comparison groups formed through random selection and random assignment?
Yes, so the design of the study is true experimental.

9. Follow the steps in statistical logic to determine whether there is a statistically significant difference in the groups’ performance on the Versant for Spanish test.

The research question that guides this investigation is something like this: “Do learners of Spanish who have 6 hours of weekly exposure to authentic spoken Spanish develop a higher degree of pronunciation accuracy than learners who have 2 hours of weekly exposure to authentic spoken Spanish?”

Step 1: State formal research hypotheses.

Null hypothesis: There is no statistically significant difference between the mean pronunciation score for the group that listened to authentic spoken Spanish for 6 hours a week and the group that listened to authentic spoken Spanish for 2 hours a week. [ $H_0: \bar{X}_1 = \bar{X}_2$ ]

Alternative hypothesis 1: The mean pronunciation score for the group that listened to authentic spoken Spanish for 6 hours a week is significantly higher than the mean pronunciation score of the group that listened to authentic spoken Spanish for 2 hours a week. [ $H_1: \bar{X}_1 > \bar{X}_2$ ]

Alternative hypothesis 2: The mean pronunciation score for the group that listened to authentic spoken Spanish for 6 hours a week is significantly lower than the mean pronunciation score of the group that listened to authentic spoken Spanish for 2 hours a week. [ $H_2: \bar{X}_1 < \bar{X}_2$ ]

Step 2: Set alpha.
I set alpha at .01 because the researcher may want to make a strong statement about the outcome of the research.

Step 3: Propose the statistic to be used to analyze the data.
I propose to use the Case II Independent Samples t-test statistic because 1) the data are randomly drawn from a population and the researcher wants to be able to generalize the findings to the population (so a parametric statistic is appropriate); 2) the independent variable has two levels (the group that listened to authentic spoken Spanish 2 hours a week and the group that listened to authentic spoken Spanish 6 hours a week); 3) the dependent variable may be normally distributed in the population; and 4) the researcher is interested in determining whether there is a statistically significant difference between the means of two groups (the two levels of the independent variable).

Step 4: Collect the data.
I’ll use the fabricated data presented in the problem.

Step 5: Check the assumptions for use of the Case II Independent Samples t-test formula.
I used R to calculate the descriptive statistics and check the assumptions. The commands are presented in the chart below. The assumptions for using the Case II t-test formula are:

1. The independent variable is nominal and has only two levels.
The independent variable is nominal and it has just two levels; one level is represented by the group that did 6 hours of listening a week and the second level is represented by the group that did 2 hours of listening a week.

2. The dependent variable is interval in nature and the scores in each group should be normally distributed.
The histogram for Class 2 looks a little skewed, but the Shapiro-Wilk analysis gives no evidence of a departure from normality (p = 0.1354, which is greater than alpha). The histogram for Class 1 looks normally distributed, and the Shapiro-Wilk analysis is consistent with that impression (p = 0.3616). (See the R commands and output, including the histograms, in the chart below.)

3. The two groups have exactly the same number of people or the variances of the two groups are approximately equal.
The number of people in the two groups is exactly the same.

Here are the R commands and output. In this problem, I entered the data for the two groups in two separate sets; class.1 is the group that did 2 hours of listening to authentic spoken Spanish each week and class.2 is the group that did 6 hours of listening to authentic spoken Spanish each week.

For each group, use the summary command to calculate the mean and identify the median. (The means are highlighted in yellow and the medians are highlighted in grey.)

    class.1 <- c(63, 68, 63, 58, 57, 59, 54, 57, 40, 45, 42, 47, 49, 49, 39, 47, 37, 34, 36, 39)
    class.2 <- c(62, 60, 60, 55, 58, 57, 55, 47, 36, 47, 48, 44, 44, 49, 42, 44, 39, 39, 38, 36)

    summary(class.1)

Here’s the output.

       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      34.00   39.75   48.00   49.15   57.25   68.00

    summary(class.2)

Here’s the output.

       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      36.00   41.25   47.00   48.00   55.50   62.00

Use the subset (table) command to determine the mode for each group. (The modes are highlighted in green.)

    subset(table(class.1), table(class.1) == max(table(class.1)))

Output:

    39 47 49 57 63
     2  2  2  2  2

    subset(table(class.2), table(class.2) == max(table(class.2)))

Output:

    44
     3

To calculate the standard deviation, I used the shortcut command sd. (The standard deviations are highlighted in pink.)

    sd(class.1)

Output: 10.20462

    sd(class.2)

Output: 8.583951

I used the hist command to make a frequency distribution of the scores for each group. (I added color and a main title to each histogram.)

    hist(class.1, col = "dark red", main = "Class 1 Pronunciation Scores")

The histogram for Class 1: [figure]

    hist(class.2, col = "lime green", main = "Class 2 Pronunciation Scores")

The histogram for Class 2: [figure]

To calculate the range, I retrieved the maximum and minimum scores from the output for the summary command and found the differences.

    68 - 34

Output: 34

    62 - 36

Output: 26

I checked each set of scores using the Shapiro-Wilk statistic to verify that they approximate the normal distribution shape. (Both exact p-values are greater than alpha = .01, so there is no evidence that either dataset departs from a normal distribution.)

    shapiro.test(class.1)

Output:

    Shapiro-Wilk normality test
    data: class.1
    W = 0.9496, p-value = 0.3616

    shapiro.test(class.2)

Output:

    Shapiro-Wilk normality test
    data: class.2
    W = 0.927, p-value = 0.1354

Step 6: Calculate the observed value of t.

Here’s the formula:

$$t_{observed} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Here are the calculations; I retrieved the values for the means and standard deviations from the R output:

$$t_{observed} = \frac{49.15 - 48}{\sqrt{\dfrac{10.2^2}{20} + \dfrac{8.58^2}{20}}} = \frac{1.15}{\sqrt{\dfrac{104.04}{20} + \dfrac{73.62}{20}}} = \frac{1.15}{\sqrt{8.88}} = \frac{1.15}{2.98} = .39$$

Step 7: Determine the degrees of freedom for the analysis and use this value and alpha (which was set in Step 2) to find the appropriate value of t_critical from the chart of t_critical values. (See Figure 5.9 in the textbook for a chart of t_critical values.)

The degrees of freedom for the Case II Independent Samples t-test statistic are determined using this formula: [(n1 – 1) + (n2 – 1)]. There are 20 people in each group, so there are 38 degrees of freedom for this analysis [(20 - 1) + (20 - 1)]. The alpha level was set at .01. There is no value of t_critical for df = 38, so use the value for df = 35 (see footnote 1). Use the value listed for alpha = .01. The critical value of t for the analysis is 2.7238.

Step 8: Compare the observed value and the critical value to interpret the formal research hypotheses.
Here are the rules for interpreting the formal research hypotheses using the critical value approach.

If t_observed < t_critical → accept the null hypothesis
If t_observed > t_critical → reject the null hypothesis

When I compared t_observed to t_critical, I saw that .39 < 2.7238. The critical value approach rules tell us that when the observed value of t is less than the critical value of t the null hypothesis is accepted, so I accepted the null hypothesis:

Null hypothesis: There is no statistically significant difference between the mean pronunciation score for the group that listened to authentic spoken Spanish for 6 hours a week and the group that listened to authentic spoken Spanish for 2 hours a week. [ $H_0: \bar{X}_1 = \bar{X}_2$ ]

Footnote 1: When using a chart of critical values, if the exact df for a study is not listed, tradition indicates that the nearest most conservative value should be used; df = 35 is more conservative than df = 40, so use the t_critical value for df = 35.

Step 9: Interpret the findings as a probability statement.
I can be 99% certain that there is no significant difference between the mean pronunciation score of the group that listened to authentic spoken Spanish for 6 hours weekly and the mean pronunciation score of the group that listened to authentic spoken Spanish for 2 hours weekly.

Step 10: Interpret meaningfulness of the findings.
Remember that there are two avenues for interpreting the meaningfulness of an inferential statistic.
First refer to the research question: “Do learners of Spanish who have 6 hours of weekly exposure to authentic spoken Spanish develop a higher degree of pronunciation accuracy than learners who have 2 hours of weekly exposure to authentic spoken Spanish?” We can be 99% certain that learners who received 6 hours of exposure each week to authentic spoken Spanish and learners who received 2 hours of exposure each week did not show any difference in pronunciation accuracy (t_observed = .39, df = 38, α = .01).

The second avenue is calculating the effect size. For t-tests the formula for effect size is:

$$\sqrt{\frac{t^2}{t^2 + df}}$$

so

$$\sqrt{\frac{.39^2}{.39^2 + 38}} = \sqrt{\frac{.15}{.15 + 38}} = \sqrt{\frac{.15}{38.15}} = \sqrt{.004} = .06$$

The effect size is .06. According to the guidelines for interpreting effect size (Field, Miles, & Field, 2012, p. 58), an effect size value above .10 is weak. The observed effect size of .06 falls below .10, so it does not reach even the weak range.

I believe the results of this study might be interpreted in this manner: The analysis indicated that learners who listened to 6 hours of authentic spoken Spanish each week did not develop more accurate pronunciation than learners who listened to 2 hours each week (t = .39, df = 38, α = .01, effect size = .06). Therefore, on the basis of this study, one can conclude that requiring students to have more than 2 hours a week of exposure to authentic spoken Spanish may not result in more accurate pronunciation.
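
As a cross-check on the hand calculations in Steps 6 through 10, the following R sketch recomputes the observed t, the degrees of freedom, the critical value, and the effect size directly from the two score sets. It is only an illustration, not part of the original key: the variable names (t.observed, deg.free, t.critical, reject.null, effect.size, check) are my own, the critical value is taken as two-tailed because Step 1 states both a "higher" and a "lower" alternative, and var.equal = TRUE is an assumption chosen so that R's built-in t.test reports the same df = 38 used above.

```r
# Fabricated practice data from the problem
# (class.1 = 2 hours of listening, class.2 = 6 hours of listening)
class.1 <- c(63, 68, 63, 58, 57, 59, 54, 57, 40, 45,
             42, 47, 49, 49, 39, 47, 37, 34, 36, 39)
class.2 <- c(62, 60, 60, 55, 58, 57, 55, 47, 36, 47,
             48, 44, 44, 49, 42, 44, 39, 39, 38, 36)

n1 <- length(class.1)
n2 <- length(class.2)

# Step 6: observed t from the Case II formula
# (difference in means divided by sqrt(s1^2/n1 + s2^2/n2))
t.observed <- (mean(class.1) - mean(class.2)) /
  sqrt(var(class.1) / n1 + var(class.2) / n2)

# Step 7: degrees of freedom and the two-tailed critical value at alpha = .01.
# qt() uses the exact df = 38, so the value is slightly smaller than the
# chart value for df = 35 (2.7238) used in the key.
deg.free   <- (n1 - 1) + (n2 - 1)
alpha      <- 0.01
t.critical <- qt(1 - alpha / 2, deg.free)

# Step 8: critical value approach (TRUE would mean "reject the null")
reject.null <- abs(t.observed) > t.critical

# Step 10: effect size, sqrt(t^2 / (t^2 + df))
effect.size <- sqrt(t.observed^2 / (t.observed^2 + deg.free))

# Cross-check with the built-in test; var.equal = TRUE is an assumption
# made here so that the reported df matches the key's df = 38
check <- t.test(class.1, class.2, var.equal = TRUE)

print(round(c(t.observed = t.observed, deg.free = deg.free,
              t.critical = t.critical, effect.size = effect.size), 4))
print(reject.null)
print(check)
```

Because the script works from unrounded means and variances, its values can differ from the hand calculations in the last decimal place, but they round to the same t = .39 and effect size = .06, and the decision about the null hypothesis is unchanged.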