ELT Testing and Assessment
Session 3

Today:
1. Article Discussion
2. Practical Steps for Test Construction
3. Intro to Scoring, Grading and Feedback

Practical Steps to Test Construction
1. Assessing Clear, Unambiguous Objectives
2. Drawing Up Test Specifications
3. Devising Test Tasks
4. Designing Multiple-Choice Test Items

1. Assessing Clear, Unambiguous Objectives
What is it you want to test? Ask yourself:
  ◦ What should Ss know?
  ◦ What should Ss be able to do?
Answers should be stated as overt performance within a clear target linguistic domain.
Ideally, these should be the objectives for the lesson/unit/course you are testing.

OBJECTIVES FOR LOW-INTERMEDIATE INTEGRATED COURSE
Form-focused objectives (listening & speaking) – Students will:
1. Recognize and produce tag questions, with the correct grammatical form and final intonation pattern, in simple social conversations
2. Recognize and produce wh-information questions with correct final intonation pattern and appropriate answers
Communication skills (speaking) – Students will:
3. State completed actions and events in a social conversation
4. Ask for confirmation in a social conversation
5. Give opinions about an event in a social conversation
6. Produce language with contextually appropriate intonation, stress and rhythm
Reading skills (simple essay or story) – Students will:
7. Recognize irregular past tense of selected verbs in a story or essay
Writing skills (simple essay or story) – Students will:
8. Write a one-paragraph story about a simple event in the past
9. Use conjunctions so and because in a statement of opinion

2. Test Specs should include:
a. Outline of test (what, when/how long, generally how)
b. Skills to be included
c. Item types and tasks

TASK: Create Test Specs for a midterm test for the course
Use the objectives listed above for the low-intermediate integrated course
You have 12 students
You have 30 minutes scheduled for the test
But you can also incorporate some form of assessment into the preceding class if you want
You need to test all skills

Ways to elicit responses …
Elicitation mode:
  Oral (S listens)
    ◦ Word, pair of words
    ◦ Sentence(s), question
    ◦ Directions
    ◦ Monologue, speech
    ◦ Pre-recorded conversation
    ◦ Interactive (live) dialogue
  Written (S reads)
    ◦ Word, set of words
    ◦ Sentence(s), question
    ◦ Directions
    ◦ Paragraph
    ◦ Essay, excerpt
    ◦ Short story, book
Response mode:
  Oral (for either oral or written elicitation)
    ◦ Repeat
    ◦ Read aloud
    ◦ Yes/no
    ◦ Short response
    ◦ Describe
    ◦ Role play
    ◦ Monologue (speech)
    ◦ Interactive dialogue
  Written (for either oral or written elicitation)
    ◦ Mark multiple-choice option
    ◦ Fill in the blank
    ◦ Spell a word
    ◦ Define a term (with a phrase)
    ◦ Short answer (2-3 sentences)
    ◦ Essay

TASK: test specs outline…
Speaking (time)
  ◦ Format:
  ◦ Task:
Listening (time)
  ◦ Format:
  ◦ Task:
Reading (time)
  ◦ Format:
  ◦ Task:
Writing (time)
  ◦ Format:
  ◦ Task:
Should cover:
- Topics (objectives)
- Implied elicitation and response formats
- Number of items in each section
- Time to be allocated for each

What do you think of this outline?
Speaking (5 minutes / person; previous day)
  ◦ Format: oral interview, T and S
  ◦ Task: T asks questions of S (Obj 3, 5; emphasis on 6)
Listening (10 minutes)
  ◦ Format: T makes audiotape in advance, with one other voice on it
  ◦ Task:
    a. 5 minimal pair items, m-c (Obj 1)
    b. 5 interpretation items, m-c (Obj 2)
Reading (10 minutes)
  ◦ Format: cloze test items (10 total) in storyline
  ◦ Task: Fill in the blanks (Obj 7)
Writing (10 minutes)
  ◦ Format: prompt for a topic about sitcom seen in class
  ◦ Task: writing a short opinion paragraph (Obj 9)

3. Devising Test Tasks
Sample: test item – 1st draft – Listening, part b
Directions: Listen to the question [on the tape]. Choose the sentence on your test page that is the best answer to the question:
Voice: Where did George go after the party last night?
S reads:
a. Yes, he did.
b. Because he was tired.
c. To Elaine’s place for another party.
d. He went home around eleven o’clock.
QUESTION: Any problem with this?

As you revise your test draft – ask yourself:
  ◦ Are the directions to each section absolutely clear?
  ◦ Is there an example item for each section?
  ◦ Does each item measure a specified objective?
  ◦ Is each item stated in clear, simple language?
  ◦ Does each m-c item have appropriate distractors?
  ◦ Is the difficulty of each item appropriate for your students?
  ◦ Is the language of each item sufficiently authentic?
  ◦ Do the sum of the items and the test as a whole adequately reflect the learning objectives?

4. More on M-C Test Items
Cons?
  ◦ Tests only recognition knowledge
  ◦ Guessing has effect on grades
  ◦ Restricts what can be tested
  ◦ Difficult to write good items
  ◦ Washback may be harmful
  ◦ Cheating easier
Pros?
  ◦ Practicality
  ◦ Reliability

M-C Terminology
M-C items are all receptive (selective) response items (t-t chooses from set)
Stem = stimulus
Options/alternatives = answer choices (typically 3-5)
Key = correct answer
Distractors = incorrect answers

What’s wrong with these tasks?
Excuse me, do you know ___?
a. where is the post office
b. where the post office is
c. where post office is
(Remember: task should be focused on specific learning objective)

How can we improve this?
My eyesight has really been deteriorating lately. I wonder if I need glasses. I think I’d better go to the ____ to have my eyes checked.
a. pediatrician
b. dermatologist
c. optometrist

How can this task be better?
We went to visit the temples, ____ fascinating.
a. which were beautiful
b. which were especially
c. which were holy

“Biased for best”
Goal: to assure Ss have opportunity to perform at highest personal level on test
  ◦ T. provides appropriate review
  ◦ T. suggests appropriate preparation and test-taking strategies
  ◦ T. structures test so stronger Ss are challenged and weaker Ss do not feel defeated/overwhelmed

Item Indices – used to accept, discard or revise items (after trialing or using the test)
1. Item Facility – extent to which an item is easy or difficult for proposed group

$$\text{IF} = \frac{\text{number of Ss answering the item correctly}}{\text{total number of Ss responding to that item}}$$

Example: IF = 13/20 = .65 or 65%
Range of .15 to .85 acceptable
QUESTION: Why would you want to include questions at either end of the spectrum in your test?

2. Item Discrimination – the extent to which an item differentiates between high- and low-ability test-takers
Divide t-t into 3 groups (high, middle, low)
Example: 30 students – 3 groups of 10

$$\text{ID} = \frac{\text{high-group number correct} - \text{low-group number correct}}{\tfrac{1}{2} \times \text{total of your two comparison groups}}$$

ID = (7 - 2) / (½ x 20) = 5/10 = .50
This is a moderate ID (high ID would be close to 1.0)

For item indices, large-scale tests use…
IRT – item response theory

For everyday classroom testing, teachers use…
intuition

Distractor efficiency = extent to which a) distractors “lure” t-ts and b) responses are evenly distributed among all distractors

Choices   High Ss (10)   Low Ss (10)
A         0              3
B         1              5
C*        7              2
D         0              0
E         2              0
(* = key)

QUESTION: What do we learn about D and E?
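The item indices above are simple enough to compute by hand, but a short script can help when a test has many items. The following is a minimal sketch, not a standard tool: the function names and the labels used in the distractor report ("unused", "lures low group", and so on) are assumptions made for illustration. The numbers are the ones from the IF and ID examples and from the distractor table above.

```python
from typing import Dict, Tuple


def item_facility(num_correct: int, num_responding: int) -> float:
    """IF = Ss answering the item correctly / Ss responding to the item."""
    if num_responding <= 0:
        raise ValueError("num_responding must be positive")
    return num_correct / num_responding


def item_discrimination(high_correct: int, low_correct: int,
                        comparison_total: int) -> float:
    """ID = (high-group correct - low-group correct) / (1/2 x comparison total).

    comparison_total is the combined size of the high and low groups.
    """
    if comparison_total <= 0:
        raise ValueError("comparison_total must be positive")
    return (high_correct - low_correct) / (0.5 * comparison_total)


def distractor_report(choices: Dict[str, Tuple[int, int]],
                      key: str) -> Dict[str, str]:
    """Label each distractor from its (high-group, low-group) counts.

    The labels are illustrative assumptions, not established terminology.
    """
    report = {}
    for option, (high, low) in choices.items():
        if option == key:
            continue  # the key is not a distractor
        if high + low == 0:
            report[option] = "unused"
        elif low > high:
            report[option] = "lures low group"
        elif high > low:
            report[option] = "lures high group"
        else:
            report[option] = "no group difference"
    return report


# Counts from the distractor table: option -> (high Ss, low Ss); C is the key.
SLIDE_TABLE = {"A": (0, 3), "B": (1, 5), "C": (7, 2), "D": (0, 0), "E": (2, 0)}

if __name__ == "__main__":
    print("IF:", item_facility(13, 20))
    print("ID:", item_discrimination(7, 2, 20))
    for option, label in distractor_report(SLIDE_TABLE, key="C").items():
        print(option, "->", label)
```

Running the report on the table is one way to start the discussion of options D and E; deciding whether to revise or discard them is still a judgment call for the teacher.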
Scoring – Weight:
Oral interview   40%   4 scores, 5-1 range x 2 = 40 points
Listening        20%   10 items @ 2 pts/each = 20 points
Reading          20%   10 items @ 2 pts/each = 20 points
Writing          20%   2 scores, 5-1 range x 2 = 20 points

Grading – take into account:
  ◦ Country, culture, context of this English classroom
  ◦ Institutional expectations (often unwritten)
  ◦ Explicit and implicit definitions of grades that you have established
  ◦ Relationship you have established with students/class
  ◦ Student expectations engendered in previous tests and quizzes

Feedback
  ◦ Letter grade
  ◦ Total score
  ◦ Subscores
  ◦ Right/Wrong indication
  ◦ Marginal comments
  ◦ Checklist of areas needing work
  ◦ Oral feedback after interview
  ◦ Self-assessment
  ◦ Peer checking
  ◦ Whole class discussion
  ◦ Individual post-test conference
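The weighting scheme on the Scoring slide can be checked with a few lines of code. This is a sketch under stated assumptions: the function names are invented here, and the sample student (ratings and number of correct items) is hypothetical, not data from the course. It assumes each oral and writing score is a rating from 1 to 5 that is doubled, and each listening or reading item is worth 2 points.

```python
from typing import Sequence


def rating_points(ratings: Sequence[int], expected_count: int,
                  multiplier: int = 2) -> int:
    """Sum 1-5 ratings and apply the multiplier (oral: 4 ratings, writing: 2)."""
    if len(ratings) != expected_count:
        raise ValueError(f"expected {expected_count} ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    return sum(ratings) * multiplier


def item_points(num_correct: int, num_items: int = 10,
                points_per_item: int = 2) -> int:
    """Points for an objectively scored section (listening or reading)."""
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct out of range")
    return num_correct * points_per_item


def total_score(oral: Sequence[int], listening_correct: int,
                reading_correct: int, writing: Sequence[int]) -> int:
    """Total out of 100: oral 40 + listening 20 + reading 20 + writing 20."""
    return (rating_points(oral, expected_count=4)
            + item_points(listening_correct)
            + item_points(reading_correct)
            + rating_points(writing, expected_count=2))


if __name__ == "__main__":
    # Hypothetical student: oral ratings, listening/reading correct, writing ratings.
    print(total_score([4, 3, 5, 4], 8, 7, [4, 3]))
```

With top ratings and all items correct the sections add up to 40 + 20 + 20 + 20 = 100 points, which matches the percentages on the slide.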