Turner Answer Key for Chapter Eight
Using statistics in small-scale language education research: Focus on non-parametric data

Chapter Eight Practice Problem Answer Key

You can retrieve the data for this problem from the Companion Website. Download the dataset and save it on your computer as an Excel document in the comma-separated values format. Note that there are three columns of information in the dataset, but for this analysis, the independent variable values are in the column labeled “class” and the dependent variable values are in the column labeled “posttest.”

Step 1. State hypotheses
H0: There is no statistically significant difference in students’ language learning (as measured by their performance on the CELT) depending on the type of instruction they received.
H1: There is a statistically significant difference in students’ language learning (as measured by their performance on the CELT) depending on the type of instruction they received.

Step 2. Set alpha
alpha = .01

Step 3. Identify the appropriate statistic for the analysis
I propose to use a 1-way, between-groups ANOVA to analyze the data because:
the independent variable has more than 2 levels;
the dependent variable is interval or interval-like and the data are normally distributed in the population and in the groups defined by the levels of the independent variable;
the groups defined by the levels of the independent variable consist of different individuals;
I want to know whether there is a difference in the performance of the groups defined by the independent variable;
I am working in the parametric paradigm: my participants were randomly selected from a population and I wish to generalize conclusions from my study to that population;
there is a minimum of 5 (or 10) participants in each of the groups defined by the levels of the independent variable; and
the number of participants in each of the groups defined by the independent variable is exactly the same OR the variances of the groups are approximately equal.

Step 4. Collect the data
Retrieve the dataset from the Companion Website at www.routledge.com/cw/turner; save the dataset on your computer in the comma-separated values format before you import it into R. (You may get an error message if you don’t save the dataset in the comma-separated values format before importing it into R!)

Step 5. Check the assumptions
the independent variable has more than 2 levels.
  Yes. The independent variable has 4 levels. The type of instruction is indicated by the number 1, 2, 3, or 4 in the first column of the dataset, which has the label class.
the dependent variable is interval or interval-like and the data are normally distributed in the population and in the groups defined by the levels of the independent variable.
  Yes. The data are CELT scores.
  The CELT is designed to yield normally distributed scores when it is administered to its target audience. I’ll make histograms and calculate the Shapiro-Wilk statistic for each of the groups to verify that the scores are probably normally distributed.
the groups defined by the levels of the independent variable consist of different individuals.
  Yes. The four levels of the independent variable are represented by the four separate classes.
I want to know whether there is a difference in the performance of the groups defined by the independent variable.
  Yes.
I am working in the parametric paradigm: my participants were randomly selected from a population and I wish to generalize conclusions from my study to that population.
  Yes.
there is a minimum of 5 (or 10) participants in each of the groups defined by the levels of the independent variable.
  Yes, there are 12 people in each class.
the number of participants in each of the groups defined by the independent variable is exactly the same OR the variances of the groups are approximately equal.
  OK, there is exactly the same number of participants in each of the levels of the independent variable.

Step 6. Calculate the observed value of the statistic
I use R to calculate the observed value of F and the exact probability of the observed value. I located the dataset on the Companion Website (http://www.routledge.com/cw/turner) and saved the dataset on my computer as a comma-separated values Excel spreadsheet. (If you enter the data directly into R, see pages 232-233 for appropriate R commands.)
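Because the classes are coded with the numbers 1 to 4, R reads the class column as numeric unless it is told otherwise. The following is a minimal sketch, using made-up scores rather than the Companion Website dataset (demo.data and every value in it are hypothetical), of confirming the group sizes, running the Shapiro-Wilk test for each class in one call, and comparing the degrees of freedom that aov() reports when class is numeric with those it reports when class is a factor.

```r
# Synthetic stand-in for the Companion Website dataset; all values are made up.
set.seed(8)
demo.data <- data.frame(
  class    = rep(1:4, each = 12),                  # four classes, 12 students each
  posttest = round(rnorm(48, mean = 50, sd = 8))   # hypothetical CELT-like scores
)

# Confirm the number of participants in each level of the independent variable.
group.sizes <- table(demo.data$class)
print(group.sizes)

# Shapiro-Wilk p-value for each class, without creating four separate subsets.
shapiro.p <- tapply(demo.data$posttest, demo.data$class,
                    function(scores) shapiro.test(scores)$p.value)
print(round(shapiro.p, 4))

# Numeric codes are fitted as a single predictor (1 degree of freedom);
# a factor is fitted as four groups (k - 1 = 3 degrees of freedom).
df.numeric <- summary(aov(posttest ~ class, data = demo.data))[[1]]$Df
df.factor  <- summary(aov(posttest ~ factor(class), data = demo.data))[[1]]$Df
print(df.numeric)  # 1 46
print(df.factor)   # 3 44
```

With four groups of 12, a between-groups analysis has k - 1 = 3 and N - k = 44 degrees of freedom. An output that shows 1 and 46 indicates that class was treated as a number, so it is worth checking which form was used before interpreting F.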
> practice.problem.data = read.csv(file.choose(), header=T)   [I imported the dataset]
> View(practice.problem.data)   [I viewed the dataset to check the header names, the number of cases, etc.]

[Create a subset of the data for each level of the independent variable and view each of the subsets]
> Class1 = subset(practice.problem.data, practice.problem.data$class=="1")
> View(Class1)
> Class2 = subset(practice.problem.data, practice.problem.data$class=="2")
> View(Class2)
> Class3 = subset(practice.problem.data, practice.problem.data$class=="3")
> View(Class3)
> Class4 = subset(practice.problem.data, practice.problem.data$class=="4")
> View(Class4)

[Check the posttest scores for each of the classes to verify that the scores are probably normally distributed]
> shapiro.test(Class1$posttest)
        Shapiro-Wilk normality test
data:  Class1$posttest
W = 0.9835, p-value = 0.9939

> shapiro.test(Class2$posttest)
        Shapiro-Wilk normality test
data:  Class2$posttest
W = 0.9221, p-value = 0.3034

> shapiro.test(Class3$posttest)
        Shapiro-Wilk normality test
data:  Class3$posttest
W = 0.9151, p-value = 0.2476

> shapiro.test(Class4$posttest)
        Shapiro-Wilk normality test
data:  Class4$posttest
W = 0.9411, p-value = 0.5121

> par(mfrow = c(2,2))   [create a space for 4 histograms]
[Create a histogram of each of the sets of posttest scores and review them to confirm the likelihood of the scores being normally distributed]
> hist(Class1$posttest, breaks = 10, col = "plum")
> hist(Class2$posttest, breaks = 10, col = "deep pink 1")
> hist(Class3$posttest, breaks = 10, col = "orange")
> hist(Class4$posttest, breaks = 10, col = "dodger blue 3")

> study.model = aov(posttest~class, data = practice.problem.data)   [Command to carry out the analysis and ‘save’ information from the steps in the analysis]
> study.model   [Command requesting some of the information from the steps]
Call:
   aov(formula = posttest ~ class, data = practice.problem.data)

Terms:
                    class  Residuals
Sum of Squares   617.6042  2584.8750
Deg. of Freedom         1         46

Residual standard error: 7.496195

> res = aov(posttest~class, data = practice.problem.data)   [request information on residuals]
> summary(res)   [request that the information from the analysis be reported in a Source Table; the “(between)” and “(within)” labels and the Total row are my additions, shown in blue in the original Source Table]
                     Df   Sum Sq   Mean Sq   F value    Pr(>F)
class (between)       1    617.6    617.60    10.991  0.001792 **
Residuals (within)   46   2584.9     56.19
Total                47   3202.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> pairwise.t.test(practice.problem.data$posttest, practice.problem.data$class, paired = F, p.adjust.method = "bonferroni")   [request that all pairwise comparisons be made using the Bonferroni correction technique so the comparisons that show a statistically significant difference are identified]
        Pairwise comparisons using t tests with pooled SD
data:  practice.problem.data$posttest and practice.problem.data$class
  1      2      3
2 0.0137 -      -
3 0.0624 1.0000 -
4 0.0026 1.0000 1.0000
P value adjustment method: bonferroni

Step 7. Calculate the exact probability of the statistic
I simply retrieve the exact probability from the R output, so exact p = .001792

Step 8. Compare the exact probability to alpha
The rules for interpreting exact probability are:
If exact probability ≥ alpha → accept the null hypothesis
If exact probability < alpha → reject the null hypothesis
Exact p is less than alpha, which was set at .01, so reject the null hypothesis and accept the alternative hypothesis. A post hoc analysis of the data must be performed to identify where the significant differences are.
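As a rough check on Steps 7 and 8, the F ratio and its probability can be recomputed from the rounded values printed in the Source Table. This is only a sketch: the variable names are illustrative, and because the inputs are rounded, the result may differ from the R output in the last decimal places.

```r
# Values copied from the Source Table (rounded as printed there).
ms.between <- 617.60
ms.within  <- 56.19
df.between <- 1
df.within  <- 46
alpha      <- 0.01

# F is the ratio of the between-groups to the within-groups mean square.
f.observed <- ms.between / ms.within

# Upper-tail probability of the observed F under the null hypothesis.
p.exact <- pf(f.observed, df.between, df.within, lower.tail = FALSE)

# Apply the decision rule from Step 8.
decision <- if (p.exact < alpha) {
  "reject the null hypothesis"
} else {
  "accept the null hypothesis"
}

print(round(f.observed, 3))  # 10.991
print(p.exact)
print(decision)
```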
The Bonferroni method is used to identify which pairs show a significant difference; only Class 1 and Class 4 show a significant difference with alpha set at .01.

Step 9. Make the probability statement
There is 99% certainty of a statistically significant difference in students’ language learning (as measured by their performance on the CELT) depending on the type of instruction they received.

Step 10. Interpret the meaningfulness
There are two avenues for interpreting meaningfulness: 1) with reference to the research question, and 2) by calculating effect size. Effect size for the entire analysis can be determined by calculating omega squared (ω²). The values of SSB (Sums of Squares Between), MSW (Mean Sums of Squares Within), and SST (Sums of Squares Total) can be retrieved from the Source Table; k is the number of levels of the independent variable.

\[
\omega^2 = \frac{\mathrm{SSB} - (k-1)\,\mathrm{MSW}}{\mathrm{SST} + \mathrm{MSW}}
         = \frac{617.6 - (4-1)(56.19)}{3202.5 + 56.19}
         = \frac{617.6 - (3)(56.19)}{3258.69}
         = \frac{617.6 - 168.57}{3258.69}
         = \frac{449.03}{3258.69}
         = .138
\]

The results of the analysis could be summarized like this:
More information about the types of instruction received by the four different classes is needed to provide a meaningful interpretation of the analysis; however, we note that with 99% certainty there is a statistically significant difference among the four classes (F = 10.991, p = .001792). The post hoc Bonferroni analysis indicates that the comparison of Class 1 to Class 4 shows a statistically significant difference (p = .0026), though effect size (ω² = .138) is weak.
[Alternatively, the effect size could be calculated for each of the pairwise comparisons: Class 1 to Class 2, Class 1 to Class 3, Class 1 to Class 4, Class 2 to Class 3, Class 2 to Class 4, and Class 3 to Class 4. See page 240 in the text for guidance.]
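The omega-squared arithmetic and the post hoc decision above can be retraced in a few lines of R. This is a sketch that assumes the rounded values printed in the Source Table and in the pairwise comparison output; the variable names are illustrative and are not part of the original session.

```r
# Values copied from the Source Table (rounded as printed there).
ss.between <- 617.6    # SSB
ms.within  <- 56.19    # MSW
ss.total   <- 3202.5   # SST
k          <- 4        # number of levels of the independent variable

# omega squared = [SSB - (k - 1) * MSW] / (SST + MSW)
omega.sq <- (ss.between - (k - 1) * ms.within) / (ss.total + ms.within)
print(round(omega.sq, 3))  # 0.138

# Bonferroni-adjusted p-values copied from the pairwise comparison output.
adjusted.p <- c("1 vs 2" = 0.0137, "1 vs 3" = 0.0624, "1 vs 4" = 0.0026,
                "2 vs 3" = 1.0000, "2 vs 4" = 1.0000, "3 vs 4" = 1.0000)

# Keep only the comparisons whose adjusted p-value is below alpha = .01.
significant.pairs <- names(adjusted.p)[adjusted.p < 0.01]
print(significant.pairs)  # "1 vs 4"
```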