CALICO 2008, San Francisco, March 18-22

Designing a computer delivered performance test: A pilot study

Carol A. Chapelle, Yoo-Ree Chung, Volker Hegelheimer, Nick Pendar, and Jing Xu
Iowa State University

1. Introduction

● Student placement is determined by performance on a 30-minute essay.
● Holistic essay rating is completed by two (sometimes three) trained raters.
● Placement decisions rest on what is essentially a one-item test, which some students question.
→ Additional information is needed to support placement decisions.
→ An SLA-based grammar test might be one possible solution.

Figure 1. English Placement Test procedures at Iowa State University
[Flowchart: the ENGLISH PLACEMENT TEST consists of Writing (essay), Reading (k=30), and Listening (k=30). Possible placements are 101B (G/UG), 101C (UG), 101D (G), 099R, and 099L. After a pass, UG students go on to First Year Composition and G students take no more English.]

2. Research Questions

1. Can test items constructed on the basis of research on grammatical development produce a short test with acceptable reliability?
2. Do students at three different levels of language development perform significantly differently on the test according to their proficiency levels?
3. Do students’ scores on other tests of language development correlate positively, as hypothesized, with their scores on the grammar test?
4. Do the empirical item difficulties correspond to the difficulties predicted by stage of development?

3. Basis for Test Development

● Natural development sequence in second language acquisition (SLA)
  • Morphosyntactic features: Andersen (1978), Bailey et al. (1974), Dulay & Burt (1973, 1974)
  • Syntactic features: Stauble (1984)
  • Tense and aspect: Bardovi-Harlig (2000), Bayley (1999)
● Bridging SLA and language assessment
  • Pienemann, Johnston, & Brindley (1988)
  • Norris (2000, 2005)

4. SLA Findings and Grammar Test Item Development

Table 1.
Target grammar areas of the items on the grammar test I (Spring 07 – Fall 07)

| Item # | Construct | Predicted Level of Acquisition | Research Background | Target Response | Task |
|---|---|---|---|---|---|
| Items 1 and 2 | Articles | Beginning (1) | Andersen (1978); Dulay & Burt (1973, 1974); Bailey et al. (1974) | She is the author of a/the novel that he really loves. | Gap-filling |
| Item 3 | Present perfect | Intermediate (2) | Andersen (1978) – Past participle | I have met/seen her twice more since then. | |
| Item 4 | Cancellation of S-V inversion in an embedded question | Advanced (3) | Norris (2000) | Can you tell me how I can get … | Jumbled word order |
| Item 5 | Use of preposition | Intermediate (2) | Pienemann & Johnston (1987) | … to the place? | Add a word if necessary & jumbled word order |
| Item 6 | Past tense and number agreement of verb ‘to be’ | Beginning (1) | Andersen (1978); Dulay & Burt (1973, 1974); Bailey et al. (1974) | When I was three … | |
| Item 7 | Gerund | Intermediate (2) | | He was good at changing … | |
| Item 8 | (Bare) infinitive as a VP complement | Intermediate (2) | | … it helped me imagine/to imagine … | |
| Item 9 | Noun (number) | Beginning (1) | Bailey et al. (1974); Dulay & Burt (1973, 1974) | … the story/stories better. | Change word forms if necessary |
| Item 10 | Modal + present perfect | Advanced (3) | Andersen (1978) – Past participle; Pienemann et al. (1988) – Aux | … she may/might/could have thought that … | Select words & change word forms if necessary |
| Item 11 | Multiple Wh-questions within the embedded sentence | Advanced (3) | Norris (2000); Hawkins (2001) | Did Freda discover who bought what? | Jumbled word order |
| Item 12 | Subject clause; Use of adverbs | Intermediate (2) | Pienemann et al. (1988) for use of ‘ly’ adverbs; Hawkins (2001) | It is nearly certain that oil prices will rise again. | Jumbled word order |

Note. Blank cells mark entries whose row assignment is unclear: the Task entry “Change word forms if necessary” may also cover some of Items 3 and 6 to 8, and the Research Background entry “Pienemann et al. (1988)” belongs to Item 7, Item 8, or both.

Table 2. Target grammar areas of the items on the grammar test II (Spring 08)

| Item # | Construct | Predicted Level of Acquisition | Research Background | Target Response | Task |
|---|---|---|---|---|---|
| Items 1 and 2 | Articles | Beginning (1) | Andersen (1978); Dulay & Burt (1973, 1974); Bailey et al. (1974) | She is the author of a/the novel that he really loves. | Gap-filling |
| Item 3 | Cancellation of the subject and verb inversion in the embedded question | Advanced (3) | Norris (2000) | Can you tell me how I can get … | Jumbled word order |
| Item 4 | Use of preposition | Intermediate (2) | Pienemann & Johnston (1987) | … to the place? | Add a word if necessary & jumbled word order |
| Item 5 | Passive | Advanced (3) | Andersen (1978) – Past participle | Jane, was the yearly report submitted yesterday? | Add a few words, change word forms & jumbled word order |
| Item 6 | Past progressive | Intermediate (2) | Bardovi-Harlig (2000) | I’m not sure. Tim was still working … | Change word forms & jumbled word order |
| Item 7 | Use of preposition | Intermediate (2) | Pienemann & Johnston (1987) | … on it. | Add a word if necessary & jumbled word order |
| Item 8 | Modal + present perfect | Advanced (3) | Andersen (1978) – Past participle; Pienemann et al. (1988) – Aux | … she may/might have thought that … | Select words & change word forms if necessary |
| Item 9 | Cancellation of the subject and verb inversion in the embedded question | Advanced (3) | Norris (2000) | Bob, do you remember where we are going to meet? | Jumbled word order |
| Item 10 | Relative clause (with prep) | Advanced (3) | Norris (2000) | Jordan ran into an old friend she lived with in college. | Jumbled word order |
| Item 11 | S-V inversion in a sentence beginning with a negation | Advanced (3) | Lardiere (2007, 2008) | Hardly ever have they seen such a mess. | Jumbled word order |
| Item 12 | Multiple Wh-questions within the embedded sentence | Advanced (3) | Norris (2000); Hawkins (2001) | Did Amy discover who bought what? | Jumbled word order |
| Item 13 | Subject clause; Use of adverbs | Intermediate (2) | Pienemann et al. (1988) for use of ‘ly’ adverbs; Hawkins (2001) | It is nearly certain that oil prices will rise again. | Jumbled word order |

Table 3.
General scoring rubric for assigning scores of 0, 1 or 2 to each item on the grammar test

0: No evidence of acquisition
  • Random word order
  • No/inappropriate morphological marking
  • Making no sense
1: Partial evidence of acquisition
  • Marginally acceptable
  • Use of a word that belongs to the construct group, but inappropriate (e.g., prep)
2: Full evidence of acquisition
  • Morphologically and syntactically accurate
  • Alternative responses
  • Spelling or punctuation errors ignored

5. Grammar Test Development and Pilot Procedures

Table 4. Timeline of the grammar test development and pilot procedures

| Timeline | Procedure | Examinees |
|---|---|---|
| Spring 2007 | SLA research review; Item Set I development; Pilot test I – administered during the EPT; Item evaluation & modification | Advanced/proficient speakers; Students in ESL courses; IEOP students |
| Summer 2007 | Pilot test II – administered during the EPT; Item analyses | New ISU-entering international students whose L1 is not English (for each semester); Already met TOEFL requirement |
| Fall 2007 | Pilot test III – administered during the EPT; Item analyses; New item development (Set II) | (as for Summer 2007) |
| Spring 2008 | Pilot test IV – administered during the EPT; Item analyses | (as for Summer 2007) |

6. Results

Table 5. Descriptive statistics for Spring 07, Summer 07, Fall 07 and Spring 08

| Semester | Items included | n | k | mean | SD | median | mode | Cronbach’s alpha reliability |
|---|---|---|---|---|---|---|---|---|
| Spring 07 | Ctrl [a] | 76 | 12 | 15.24 | 5.566 | 15.00 | 12 & 18 | .78 |
| Summer 07 | Ctrl | 30 | 12 (10*) | 16.93 | 3.453 | 16.50 | 14 | .54* |
| Summer 07 | All [b] | 30 | 16 (14*) | 22.10 | 4.737 | 21.50 | 18 | .70* |
| Fall 07 | Ctrl | 452 | 10 | 18.61 | 3.472 | 19.00 | 20 | .55 |
| Spring 08 | Ctrl | 152 | 13 | 13.58 | 5.826 | 13.00 | 7 & 16 | .76 |

Note. * For the Summer 07 test, items 1.2 & 4.4 had zero variance, so they were removed from the scale in the calculation of reliability. [a] Ctrl refers to the constructed-response items; [b] All includes two free-writing items.

Table 6.
Descriptive statistics for three proficiency level groups from Spring 2007

| Group | n | k | mean | SD | median | mode | max | min |
|---|---|---|---|---|---|---|---|---|
| Proficient English Speakers (3) | 14 | 12 | 23.00 | 1.30 | 23.50 | 24 | 24 | 0 |
| Upper Intermediate Speakers (2) | 51 | 12 | 14.78 | 3.80 | 15.00 | 12 & 18 | 24 | 0 |
| Lower Intermediate Speakers (1) | 11 | 12 | 7.45 | 2.42 | 8.00 | 6 & 10 | 24 | 0 |

Sig. Diff: F(2, 73) = 69.078, p < .001

Note: Results of the Scheffé post-hoc comparisons of groups indicated that each group was significantly different from the other ones.

Figure 2. Box and whiskers chart showing score distributions for three proficiency level groups (Spring 07)
[Chart: y-axis Controlled_Total, ticks 5 to 25; x-axis Proficiency Level, groups 1, 2, 3; chart label “98% CI”.]

Table 7. Correlations of the grammar test with the EPT and with the TOEFL

| Data Set | Correlation with the EPT Writing | Correlation with the TOEFL iBT |
|---|---|---|
| Summer 07 (All, k=16) | .593** (n=30) | -- |
| Summer 07 (Controlled, k=12) | .437** (n=30) | -- |
| Fall 07 (Controlled, k=12) | .434** (n=452) | -- |
| Spring 08 (Controlled, k=13) | .624** (n=152) | .704** (n=84) |

** Correlation is significant at 0.01 level (1-tailed).

Figure 3. Average item difficulties grouped by theoretically predicted level of emergence of the grammatical knowledge
Note: 1=Beginner level items (k=2, x̄=.82); 2=Intermediate level items (k=4, x̄=.62); 3=Advanced level items (k=7, x̄=.38)

7. Directions for Future Research

Figure 4. The automatic writing classifier

Figure 5. A tentative web-based interface for the grammar test

Table 8.
Actual responses for item 10

Score 0
  • oil that certainly it will rise again
  • oil that certain prices nearly will rise again
  • oil that will rise prices again nearly certain
  • oil prices that nearly will rise certain again
  • nearly that oil prices will certain rise again
  • certain prices rise again that will be oil
  • certain that nearly will oil prices rise again
  • certain oil that prices will nearly rise again
  • nearly certain that oil will rise prices again
  • nearly certain oil prices that will rise again
  • that oil prices will rise certain nearly again
  • certain that oil prices rise again nearly
Score 1
  • certain that oil prices will rise again
  • certain that oil prices will nearly rise again
  • certain nearly that the oil price will rise again
  • certain that oil prices will rise again nearly
  • certainly that oil prices will rise again
  • certain that nearly oil prices will rise again
  • certain nearly that the oil prices will rise again
  • oil prices that will certainly rise again nearly
  • certain that oil prices will again rise nearly
  • certain that oil prices nearly will rise again
Score 2
  • nearly certain that oil prices will rise again
  • nearly certain that the oil prices will rise again
  • nearly certain that the prices of the oil will rise again
  • nearly certain that oil prices will again rise

Figure 6. An overview of the validity argument for the grammar test

References

Andersen, R. (1978). An implicational model for second language research. Language Learning, 28, 221-282.
Bailey, N., Madden, C., & Krashen, S. (1974). Is there a ‘natural sequence’ in adult second language learning? Language Learning, 24(2), 235-243.
Bardovi-Harlig, K. (2000). Tense and aspect in second language acquisition: Form, meaning and use. Malden, MA: Blackwell.
Bayley, R. J. (1999). The primacy of aspect hypothesis revisited: Evidence from language shift. Southwest Journal of Linguistics, 18(2), 1-22.
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2008). Building a validity argument for the Test of English as a Foreign Language™. New York/London: Routledge Taylor & Francis Group.
Dulay, H. & Burt, M. (1973). Should we teach children syntax? Language Learning, 23, 245-258.
Dulay, H. & Burt, M. (1974). Natural sequences in child second language acquisition. Language Learning, 24, 37-53.
Hawkins, R. (2001). Second language syntax: A generative introduction. Cornwall, UK: Blackwell Publishing.
Lardiere, D. (2007). Ultimate attainment in second language acquisition: A case study. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. Publishers.
Lardiere, D. (2008). Feature assembly in second language acquisition. In J. M. Liceras, H. Zobl, H. Goodluck (Eds.), The role of formal features in second language acquisition (pp. 107-140). New York: Lawrence Erlbaum Associates, Taylor-Francis Group.
Larsen-Freeman, D. (1975a). The acquisition of grammatical morphemes by adult learners of English as a second language. Unpublished Ph.D. dissertation, University of Michigan.
Larsen-Freeman, D. (1975b). The acquisition of grammatical morphemes by adult ESL students. TESOL Quarterly, 9, 409-419.
Norris, J. (2000). Pearson “Test Your English” level ability finder: Grammar pilot test development and revision project. Unpublished manuscript.
Norris, J. (2005). Using developmental sequences to estimate ability with English grammar: Preliminary design and investigation of a web-based test. Second Language Studies, 24(1), 24-128. Retrieved Mar. 4, 2008 from http://www.hawaii.edu/sls/uhwpesl/on-line_cat.html.
Pienemann, M. (1999). Language processing and second language development: Processability Theory. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Pienemann, M. & Johnston, M. (1987). Factors influencing the development of language proficiency. In D. Nunan (Ed.), Applying second language acquisition research (pp. 45-141). Adelaide: National Curriculum Resource Center.
Pienemann, M., Johnston, M., & Brindley, G. (1988). Constructing an acquisition-based procedure for second language assessment. Studies in Second Language Acquisition, 10(2), 217-243.
Purpura, J. (2004). Assessing grammar. Cambridge: Cambridge University Press.
Stauble, A.-M. (1984). A comparison of a Spanish-English and a Japanese-English second language continuum: Negation and verb morphology. In R. Andersen (Ed.), Second language: A cross-linguistic perspective (pp. 323-353). Rowley, MA: Newbury House.

Appendix A: Item Analysis

Grammar Test Results: Controlled Items, Spring 08

| Item | Target Response | Construct | Stage [a] | IF (n=152) | ID (n=152) | Correlation with total score (n=84) | Correlation with EPT result (n=84) | Correlation with TOEFL (iBT) (n=84) |
|---|---|---|---|---|---|---|---|---|
| 01 (1.1) | a/the | Article | 1 | 0.76 | 0.44 | .386** | .361** | .239* |
| 02 (1.2) | the | Article | 1 | 0.87 | 0.28 | .439** | .177 | .198 |
| 03 (2.1) | How I can get | Cancellation of the S-V inversion in the indirect question | 3 | 0.32 | 0.46 | .413** | .178 | .226* |
| 04 (2.2) | to | Preposition | 2 | 0.61 | 0.31 | .280** | .149 | .047 |
| 05 (3) | Was the yearly report submitted…? | Passive | 3 | 0.55 | 0.60 | .580** | .476** | .410** |
| 06 (4.1) | was still working | Past progressive | 2 | 0.60 | 0.76 | .671** | .380** | .492** |
| 07 (4.2) | on (it) | Preposition | 2 | 0.70 | 0.56 | .599** | .510** | .379** |
| 08 (5) | may/might have thought | Modal + have p.p. | 3 | 0.23 | 0.57 | .602** | .444** | .423** |
| 09 (6) | Where we are going to | Cancellation of S-V inversion in the indirect question | 3 | 0.55 | 0.40 | .290** | .006 | .007 |
| 10 (7) | an old friend she lived with in college | Relative clause | 3 | 0.27 | 0.50 | .499* | .154 | .418** |
| 11 (8) | Hardly ever have they seen … | S-V inversion with a negation | 3 | 0.28 | 0.52 | .464** | .299** | .370** |
| 12 (9) | (Did) Amy discover who bought what? | Multiple-Wh embedded question (Wh-island) | 3 | 0.47 | 0.72 | .544** | .488** | .495** |
| 13 (10) | (It is) nearly certain that oil prices.. | Position of an adverb ‘nearly’ & subject clause | 2 | 0.58 | 0.53 | .516** | .342** | .472** |

[a] 1=Beginner; 2=Intermediate; 3=Advanced
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
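The grouping reported in the note to Figure 3 can be checked against the Stage and IF values listed in Appendix A. The Python sketch below is illustrative only and is not the authors' analysis. It assumes that IF is an item facility index on a 0 to 1 scale and that the Figure 3 averages are unweighted means of IF within each predicted stage. The names ITEMS and mean_if_by_stage are hypothetical.

```python
from collections import defaultdict

# (item number, predicted stage, IF) copied from the Appendix A values.
# Stage coding follows the appendix: 1=Beginner, 2=Intermediate, 3=Advanced.
ITEMS = [
    (1, 1, 0.76), (2, 1, 0.87), (3, 3, 0.32), (4, 2, 0.61), (5, 3, 0.55),
    (6, 2, 0.60), (7, 2, 0.70), (8, 3, 0.23), (9, 3, 0.55), (10, 3, 0.27),
    (11, 3, 0.28), (12, 3, 0.47), (13, 2, 0.58),
]


def mean_if_by_stage(items):
    """Return {stage: (k, mean IF)} with stages in ascending order.

    Assumption: the per-stage average is a plain unweighted mean.
    """
    grouped = defaultdict(list)
    for _item, stage, facility in items:
        grouped[stage].append(facility)
    return {
        stage: (len(values), sum(values) / len(values))
        for stage, values in sorted(grouped.items())
    }


if __name__ == "__main__":
    for stage, (k, mean) in mean_if_by_stage(ITEMS).items():
        print(f"stage {stage}: k={k}, mean IF={mean:.3f}")
```

Under these assumptions the sketch gives k = 2, 4, and 7 items for stages 1 to 3, with means of roughly 0.82, 0.62, and 0.38, in line with the note to Figure 3.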