Turner Answer Key for Chapter Twelve 5 2015 Using statistics in small-scale language education research: Focus on non-parametric data

Chapter Twelve Practice Problems Answer Key

Study A: In this problem we investigate another of Maiya’s variables.

Research question: Is there a statistically significant relationship between the sex of a sandwich shop customer and the type of response the individual makes to a server’s greeting and offer of service?

I’ve assigned the role of independent variable to sex and the role of dependent variable to type of response. Now I follow the steps in statistical logic.

Step 1. State hypotheses
H0: The observed and expected frequencies are independent; that is, there is no statistically significant relationship between sex and the type of response an individual makes to a server’s greeting and offer of service.
H1: The observed and expected frequencies are related; that is, there is a statistically significant relationship between sex and the type of response an individual makes to a server’s greeting and offer of service.

Step 2. Set alpha
alpha = .01

Step 3. Identify the appropriate statistic for the analysis
I propose to analyze the data using the 2-way chi-squared statistic because:
1) the independent variable and the dependent variable are nominal;
2) the data are frequency counts;
3) each observation is independent of the others;
4) each person is counted only once;
5) degrees of freedom are greater than 1, so all expected frequencies must be greater than or equal to 5.
[df = (number of levels of the independent variable minus 1) multiplied by (number of levels of the dependent variable minus 1)]

Step 4. Collect the data
The data can be retrieved from the Companion Website (http://www.routledge.com/cw/turner9780415819947/s1/datasets/#section1).
For the independent variable sex, female is “1” and male is “2.” For the dependent variable type of response, “1” is greeting + politeness modal, “2” is politeness modal, and “3” is possible greeting.

Step 5. Check the assumptions
I propose to analyze the data using the 2-way chi-squared statistic because:
1) the independent variable and the dependent variable are nominal [Yes. Each variable represents a category.]
2) the data are frequency counts [Yes, we will do the analysis on the frequency counts; that is, how many people fall into each of the categories defined by the independent and dependent variables.]
3) each observation is independent of the others [Yes. Maiya noted in her explanation of how she collected her data that none of the customers consulted another before responding to the servers’ greeting and offer of service.]
4) each person is counted only once [Yes. Maiya noted that though some people ordered more than one sandwich when it was their turn for service, each person was greeted only one time.]
5) degrees of freedom are greater than 1, so all expected frequencies must be greater than or equal to 5. [This point will be checked as part of the calculation of the chi-squared value. R gives a warning if the assumption is not met.]

Step 6. Calculate the observed value of the statistic
I present the R commands below.

> maiya.data = read.csv(file.choose(), header = TRUE)
> View(maiya.data)
> chisq.test(maiya.data$sex, maiya.data$type)

        Pearson's Chi-squared test

data:  maiya.data$sex and maiya.data$type
X-squared = 2.4474, df = 2, p-value = 0.2941

Step 7. Calculate the exact probability of the statistic
I simply retrieve the exact probability from the R output; exact p = 0.2941

Step 8.
Compare the exact probability to alpha
The rules for interpreting exact probability are:
If exact probability ≥ alpha → accept the null hypothesis
If exact probability < alpha → reject the null hypothesis
The exact probability, p = 0.2941, is greater than alpha, .01, so I accept the null hypothesis.
H0: The observed and expected frequencies are independent; that is, there is no statistically significant relationship between sex and the type of response an individual makes to a server’s greeting and offer of service.

Step 9. Make the probability statement
We can be 99% certain that there is no statistically significant relationship between a customer’s sex and the type of response the individual makes to a server’s greeting and offer of service.

Step 10. Interpret the meaningfulness
There are two avenues for interpreting meaningfulness: 1) with reference to the research question, and 2) by calculating effect size. Effect size is calculated using the formula for phi and Cramer’s V.

\[ \phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{2.4474}{59}} = \sqrt{.0415} = .2037 \]

\[ \text{Cramer's } V = \sqrt{\frac{\phi^2}{(\text{rows} - 1) \text{ or } (\text{columns} - 1)^{*}}} = \sqrt{\frac{.2037^2}{1}} = \sqrt{.0415} = .2037 \]

[*(rows – 1) refers to the number of levels of the independent variable minus 1; (columns – 1) refers to the number of levels of the dependent variable minus 1. Here the smaller value, 2 – 1 = 1, is used.]

We discovered that there is no statistically significant relationship between the sex of a customer and the type of response he or she makes to a server’s greeting and offer of service (χ² = 2.4474; p = 0.2941). Though there is no statistically significant relationship, the effect size (Cramer’s V = .2037) indicates a moderate relationship.

Study B: There are two research questions:
1) Is there a statistically significant pattern of attitude among the participants toward pedagogical uses of technology? This question will be explored using the 1-way chi-squared statistic.
2) Is there a statistically significant relationship between the participants’ sex and their attitude toward pedagogical uses of technology? This question will be explored using the 2-way chi-squared statistic.

The steps in statistical logic for Question 1: Is there a statistically significant pattern of attitude among the participants toward pedagogical uses of technology?

Step 1. State hypotheses
H0: The observed and expected frequencies are independent; that is, there is no statistically significant pattern of attitude toward pedagogical uses of technology among the participants.
H1: The observed and expected frequencies are related; that is, there is a statistically significant pattern of attitude toward pedagogical uses of technology among the participants.

Step 2. Set alpha
alpha = .01

Step 3. Identify the appropriate statistic for the analysis
I propose to analyze the data using the 1-way chi-squared statistic because:
1) the independent variable is nominal;
2) the data are frequency counts;
3) each observation is independent of the others;
4) each person is counted only once;
5) when degrees of freedom are greater than 1, all expected frequencies must be greater than or equal to 5 (expected frequencies must be greater than 10 when df = 1).
[df is the number of levels of the variable minus 1, so df = 3 – 1 = 2]

Step 4. Collect the data (note that the data are fabricated).
Negative      5
Ambivalent    9
Positive     26

Step 5. Check the assumptions
I propose to analyze the data using the 1-way chi-squared statistic because:
1) the independent variable is nominal [Yes. The levels of the variable are categories.]
2) the data are frequency counts [Yes, I will do the analysis on the frequency counts, that is, the number of people in each of the three levels of the independent variable.]
3) each observation is independent of the others [Yes.
The individuals responded to the questionnaire independently; there was no discussion or collaboration among the participants.]
4) each person is counted only once [Yes. Each person completed only one questionnaire and thus is counted only once.]
5) degrees of freedom are greater than 1, so all expected frequencies must be greater than or equal to 5. [Yes. There are 40 participants and the independent variable has 3 levels; the expected frequency for each of the three levels of the variable is 13.33 (40/3 = 13.33).]

Step 6. Calculate the observed value of the statistic
I carry out the calculations in the table below; the observed value of the χ² statistic is the sum of the values in the last column: 5.21 + 1.41 + 12.04 = 18.66.

Cell   fo   fe      (fo - fe)             (fo - fe)²   (fo - fe)²/fe
A      5    13.33   5 – 13.33 = -8.33     69.39        69.39/13.33 = 5.21
B      9    13.33   9 – 13.33 = -4.33     18.75        18.75/13.33 = 1.41
C      26   13.33   26 – 13.33 = 12.67    160.53       160.53/13.33 = 12.04

Step 7. Because I did the calculations with a hand-held calculator rather than R, I didn’t calculate the exact probability. I’ll follow the critical value approach to interpret the outcome of the analysis, so I use the degrees of freedom and alpha to find the critical value (from a chart of critical values for χ²). (Check the Companion Website for the chart of critical values: http://www.routledge.com/cw/turner-9780415819947/s1/criticalvalue/ .) The formula for the degrees of freedom for a 1-way chi-squared analysis is the number of levels of the independent variable minus 1, so 3 – 1 = 2. The critical value for df = 2, alpha = .01 is 9.21.

Step 8. Compare the observed value to the critical value.
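As a cross-check before applying the decision rules, the hand calculation from Step 6 and the critical value from Step 7 can also be computed in R. This is a minimal sketch rather than part of the original answer key; the object names (observed, expected, chi_sq, deg_free, critical, builtin) are illustrative assumptions. Note that the unrounded statistic is 18.65; the hand-calculated 18.66 reflects rounding the expected frequency to 13.33.

```r
# Observed frequencies from Step 4 (fabricated data, as the answer key notes)
observed <- c(Negative = 5, Ambivalent = 9, Positive = 26)

# Expected frequency under H0: the same count in every category (40 / 3)
expected <- rep(sum(observed) / length(observed), length(observed))

# Chi-squared statistic computed directly from its definition
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom and the critical value at alpha = .01
deg_free <- length(observed) - 1
alpha <- 0.01
critical <- qchisq(1 - alpha, df = deg_free)

# The built-in 1-way (goodness-of-fit) test assumes equal proportions by default
builtin <- chisq.test(observed)

round(chi_sq, 2)     # 18.65
round(critical, 2)   # 9.21
chi_sq > critical    # TRUE, so the observed value exceeds the critical value
```

The built-in result is stored in builtin, so builtin$statistic and builtin$p.value can be inspected if an exact probability is wanted in addition to the critical value comparison.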
The rules are:
If the observed value is ≤ critical value → accept the null hypothesis
If the observed value is > critical value → reject the null hypothesis
The observed value, 18.66, is greater than the critical value, 9.21, so I reject the null hypothesis and accept the alternative hypothesis.
H1: The observed and expected frequencies are related; that is, there is a statistically significant pattern of attitude toward pedagogical uses of technology among the participants.

Step 9. Make the probability statement
We can be 99% certain that there is a statistically significant pattern of attitude toward pedagogical uses of technology among the participants.

Step 10. Interpret the meaningfulness
There are two avenues for interpreting meaningfulness: 1) with reference to the research question, and 2) by calculating effect size. Effect size is calculated using the formula for phi and Cramer’s V.

\[ \phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{18.66}{40}} = \sqrt{.467} = .683 \]

\[ \text{Cramer's } V = \sqrt{\frac{\phi^2}{(\text{rows} - 1) \text{ or } (\text{columns} - 1)}} = \sqrt{\frac{.683^2}{2}} = \sqrt{\frac{.466}{2}} = \sqrt{.233} = .483 \]

[Here the variable has 3 levels, so the divisor is 3 – 1 = 2.]

We discovered that there is a statistically significant pattern of attitude among the participants toward pedagogical uses of technology in language teaching (χ² = 18.66, df = 2, α = .01). Effect size indicates that the pattern is very strong (Cramer’s V = .483). [Please recall, though, that the data are fabricated!]

Research Question 2: Is there a statistically significant relationship between the participants’ sex and their attitude toward pedagogical uses of technology?

Step 1. State hypotheses
H0: There is no statistically significant relationship between participants’ sex and their attitude toward pedagogical uses of technology.
H1: There is a statistically significant relationship between participants’ sex and their attitude toward pedagogical uses of technology.
Note: I’ve assigned sex the role of independent variable and attitude the role of dependent variable.

Step 2. Set alpha
alpha = .01

Step 3.
Identify the appropriate statistic for the analysis
I propose to analyze the data using the 2-way chi-squared statistic because:
1) the independent variable and the dependent variable are nominal;
2) the data are frequency counts;
3) each observation is independent of the others;
4) each person is counted only once;
5) degrees of freedom are greater than 1, so all expected frequencies must be greater than or equal to 5.
[df = (number of levels of the independent variable – 1) × (number of levels of the dependent variable – 1), so df = (2 – 1)(3 – 1) = 2.]

Step 4. Collect the data.
Here are the data; note that the researcher collected data on age, too, but I’ve only analyzed the relationship between attitude and sex.

attitude  age  sex
3         1    1
3         1    1
2         1    1
1         1    1
3         1    1
3         1    1
3         1    1
2         1    2
2         1    2
2         1    2
1         1    2
3         1    2
3         2    2
3         2    2
3         2    1
2         2    1
3         2    1
3         2    1
1         2    2
1         2    2
2         2    2
3         2    2
3         2    1
3         2    1
3         2    1
1         2    1
3         2    2
3         3    2
2         3    2
2         3    2
2         3    1
3         3    1
3         3    1
3         3    2
3         3    2
3         3    2
3         3    1
3         3    1
3         3    2
3         3    2

Step 5. Check the assumptions
I propose to analyze the data using the 2-way chi-squared statistic because:
1) the independent variable and the dependent variable are nominal [Yes. Each variable represents a category.]
2) the data are frequency counts [Yes, I will do the analysis on the frequency counts; that is, how many people fall into each of the categories defined by the independent and dependent variables.]
3) each observation is independent of the others [Yes. The questionnaires were completed individually without collaboration.]
4) each person is counted only once [Yes. Each person is counted only 1 time in the dataset (each person completed only one questionnaire!).]
5) degrees of freedom are greater than 1, so all expected frequencies must be greater than or equal to 5. [This point will be checked as part of the calculation of the chi-squared value.]

Step 6.
Calculate the observed value of the statistic
You can import the dataset from the Companion Website. Save it on your computer as a comma-separated values (.csv) file.

> attitude = read.csv(file.choose(), header = TRUE)   # import the dataset from your computer using this command
> chisq.test(attitude$attitude, attitude$sex)         # calculate the observed value of the statistic using this command

        Pearson's Chi-squared test

data:  attitude$attitude and attitude$sex
X-squared = 1.8154, df = 2, p-value = 0.4035

Warning message:
In chisq.test(attitude$attitude, attitude$sex) :
  Chi-squared approximation may be incorrect

Note the warning! It indicates that the expected frequency assumption is not met! I checked the number of people in each attitude category using the table command:

> table(attitude$attitude)

 1  2  3
 5  9 26

With only 5 people in category 1, the expected frequencies for that category's cells fall below 5. Combine the negative and ambivalent categories by returning to the dataset you saved on your computer and recoding the people who have a negative attitude (1) as 2, thus redefining the levels of the attitude variable as negative/ambivalent or positive. Save the new dataset. Import the new dataset into R.

> attitude.recoded = read.csv(file.choose(), header = TRUE)      # import the new dataset
> chisq.test(attitude.recoded$attitude, attitude.recoded$sex)    # redo the analysis

        Pearson's Chi-squared test with Yates' continuity correction

data:  attitude.recoded$attitude and attitude.recoded$sex
X-squared = 0.989, df = 1, p-value = 0.32

Note that the warning no longer appears, and that R used the Yates continuity correction, which is appropriate given that the redefined attitude variable now has only 2 levels and the df for the new analysis is (2 – 1)(2 – 1) = 1. See pages 318 – 319 for an explanation of the Yates continuity correction formula.

Step 7.
Calculate the exact probability of the statistic
I simply retrieve the exact probability from the R output for the recoded analysis, so exact p = 0.32

Step 8. Compare the exact probability to alpha
The rules for interpreting exact probability are:
If exact probability ≥ alpha → accept the null hypothesis
If exact probability < alpha → reject the null hypothesis
The exact probability, p = 0.32, is greater than alpha, .01, so I accept the null hypothesis.
H0: The observed and expected frequencies are independent; that is, there is no statistically significant relationship between participants’ sex and their attitude toward pedagogical uses of technology.

Step 9. Make the probability statement
We can be 99% certain that there is no statistically significant relationship between participants’ sex and their attitude toward pedagogical uses of technology.

Step 10. Interpret the meaningfulness
There are two avenues for interpreting meaningfulness: 1) with reference to the research question, and 2) by calculating effect size. Because there is just one degree of freedom (after redefining the variable, attitude, by changing it from a 3-level to a 2-level variable, df = 1), effect size is calculated using the formula for phi.

\[ \phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{.989}{40}} = \sqrt{.0247} = .157 \]

We discovered that there is no statistically significant relationship between participants’ sex and their attitude toward pedagogical uses of technology (χ² = .989; p = 0.32). Effect size is weak (phi = .157).
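The effect sizes reported in Step 10 of each analysis can be reproduced with two small helper functions. This is a sketch under stated assumptions, not part of the original answer key: the function names phi_coef and cramers_v and the argument name divisor are hypothetical, and divisor stands for the "(rows – 1) or (columns – 1)" term used in the formulas above. The 2 × 2 counts at the end were tallied from the Study B data listing after combining categories 1 and 2; treat them as a convenience for checking, and rely on the Companion Website dataset for actual analysis.

```r
# Effect-size helpers (illustrative names, not from the answer key)
phi_coef <- function(chi_sq, n) {
  sqrt(chi_sq / n)
}

cramers_v <- function(chi_sq, n, divisor) {
  # divisor is the "(rows - 1) or (columns - 1)" term from the formula
  sqrt(phi_coef(chi_sq, n)^2 / divisor)
}

v_study_a    <- cramers_v(2.4474, n = 59, divisor = 1)  # Study A
v_study_b1   <- cramers_v(18.66, n = 40, divisor = 2)   # Study B, question 1
phi_study_b2 <- phi_coef(0.989, n = 40)                 # Study B, question 2

round(c(v_study_a, v_study_b1, phi_study_b2), 3)  # 0.204 0.483 0.157

# Recoded attitude (rows) by sex (columns), tallied from the data listing
recoded <- matrix(c(5, 15, 9, 11), nrow = 2,
                  dimnames = list(attitude = c("negative/ambivalent", "positive"),
                                  sex = c("1", "2")))

# For a 2 x 2 table chisq.test applies the Yates correction by default
yates <- chisq.test(recoded)
round(unname(yates$statistic), 3)  # 0.989
round(yates$p.value, 2)            # 0.32
```

Passing the chi-squared value and sample size explicitly keeps the helpers usable with hand-calculated statistics as well as with values copied from R output.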