Vocabulary Assessment
Norbert Schmitt
University of Nottingham

• Nearly all teachers do vocabulary assessment of some sort, ranging from informal observation, to short quizzes, to more formal examinations
• While informal assessment may not be difficult, designing good vocabulary measures for higher-stakes purposes requires a considerable amount of expertise
• Most teachers (and educators and researchers in general!) lack this expertise

• I’ve been thinking about vocabulary measurement since the early 1990s.
• Here are 4 questions on test development which I came up with in 1994 (Thai TESOL Bulletin).

1. WHY DO YOU WANT TO TEST?
2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?)
3. WHAT ASPECTS OF THESE WORDS DO YOU WANT TO TEST?
4. HOW WILL YOU ELICIT STUDENTS' KNOWLEDGE OF THESE WORDS?

1. WHY DO YOU WANT TO TEST?
• To see if students have learned taught words (achievement)
• To see if students have vocabulary gaps (diagnostic)
• Placement
• Part of a proficiency test
• Motivation
• Washback (tests reflect educator goals)

2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?)
• It depends on the purpose of the test
• Achievement = Lexical items that have been taught
• Diagnostic = The lexical items a student is expected to know, or should know at a certain level
• Placement = The lexical items that will be taught in a course, or that a student may know at the level being taught in the course. Also the foundation vocabulary expected to be learned before entering the course.
• Proficiency = A range of vocabulary, especially some that will be challenging for the best students
• Motivation = Lexical items that were recently taught, or the items that the students see as useful for reaching their goals (e.g. TOEFL, university entrance exam) (or any vocabulary: testing always makes students study?)
• Washback = Any vocabulary, as the act of putting vocabulary on a test shows that it is important
• Is a way of highlighting education goals

2. WHAT WORDS DO YOU WANT TO TEST? (AND HOW MANY?)
• It depends
• How long should the test be? (low/high stakes)
• Longer is better, but it must be a practical length
• What sampling rate will you accept?

Sampling Rate
• You typically cannot test every lexical item
• So you need to extract a representative sample
• Depends on item format: checklist format allows more items than multiple-choice
• 1/5, 1/10, 1/100, 1/1,000? Many vocabulary tests have very low sampling rates (e.g. VLT is only 3/100)

How to Sample?
• Random
• Systematically: every nth item, every nth page, etc.
• Equal proportions of different word classes (nouns, verbs, etc.)
• Only the most difficult (least frequent?) items, on the assumption that these are the items which will not be known

3. WHAT ASPECTS OF THESE WORDS DO YOU WANT TO TEST?
• Which word knowledge aspects will you cover?
• Form-meaning link is the minimum specification
• It is also the typical specification (Why do you think this is so?)

4. HOW WILL YOU ELICIT STUDENTS’ KNOWLEDGE OF THESE WORDS?
• Which item format will you use?

Item Formats
• Let’s look at a number of item formats
• What word knowledge aspects do they address?
• Are they receptive or productive?
• Are they size or depth tests?
• What are their advantages and disadvantages?
• For what testing purposes might they be most useful? Least useful?
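Returning to the sampling question under question 2: the "random" and "every nth item" options can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The word list is a made-up placeholder for a 1,000-item frequency band, and the function names (systematic_sample, random_sample) are hypothetical, not taken from any published test.

```python
import random


def systematic_sample(items, n):
    """Systematic sampling: take every nth item, starting from the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return items[::n]


def random_sample(items, rate, seed=0):
    """Simple random sampling at a given rate (e.g. 0.1 for a 1/10 rate)."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be greater than 0 and at most 1")
    k = max(1, round(len(items) * rate))
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(items, k)


# Hypothetical placeholder list standing in for a 1,000-word frequency band.
word_list = [f"word{i:04d}" for i in range(1000)]

every_tenth = systematic_sample(word_list, 10)      # 1/10 sampling rate
three_per_hundred = random_sample(word_list, 0.03)  # 3/100 sampling rate

print(len(every_tenth), len(three_per_hundred))
```

The practical point is the trade-off already noted on the slides: a 1/10 rate gives 100 test items from this band, while a 3/100 rate gives only 30, so each tested word has to stand in for many more untested ones.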
Size & Depth Test Formats
• Next, let’s look at a number of (semi-)established test formats:
• Vocabulary Size test formats
  – Multiple-choice formats
  – Vocabulary Levels Test
• Vocabulary Depth Formats
  – Developmental Scales
    • Vocabulary Knowledge Scale
    • Schmitt and Zimmerman Scale
  – Word Associates Format

Checklist (Yes-No) Tests
• Checklist tests are straightforward to take
• Learners just check (✓) which words they think they know
• Here is a checklist test from one of the best known studies into the vocabulary size of native English speakers (NZ university students)

• Checklist tests are an efficient way of testing a lot of lexical items
• This allows a high sampling rate
• Easy to build and easy to mark
• But learners sometimes overestimate their knowledge (i.e. they check words they don’t actually know)
• How to control for this?
• Meara’s 1992 Checklist Tests

• The most common way is to add nonwords to the test and see whether learners check them as known
• If so, then their scores are adjusted down
• Meara’s adjustment table
• However, the adjustment formulas are all a bit wonky
• In some research, data is deleted if a certain number of nonwords are checked as known
• In the end, checklist tests don’t work very well if examinees are not honest and careful
• So the usefulness of the test format depends on the examinees’ behavior to a large extent

Adjusting Checklist (Yes-No) Tests
• Reaction Time (speed of response) is a viable way of adjusting accuracy
• Faster responses are usually more sure
• Pellicer-Sanchez and Schmitt (2012) Language Testing

Best adjustment formula by individual result and False Alarm rate
FA rate   0    1             2             3        4        8
NS        RT   H − FA > RT   H − FA = RT   -        -        -
NNS       RT   RT = Δm       H − FA        H − FA   H − FA   Isdt > H − FA

Vocabulary Knowledge Scale
• Often used as a depth test
• Is a developmental type of measurement
• But there
are many problems with this scale (see Researching Vocabulary for a full critique):
  – How many stages should the scale have?
  – Not an interval scale
  – Can’t use inferential statistics with it
  – Sentences often not informative
  – Not clear what VKS is measuring

Schmitt & Zimmerman Scale
• Suffers from many of the same problems as VKS
• Fewer stages make it more transparent?
• Written in a ‘can-do’ manner: easier for learners to say what they can do than what they know
• More closely connected to receptive vs. productive mastery
• Test uses nonwords (artivious, ploat) to assure honesty of response
• Which is better?

Word Associates Format (Read, 2000)
• One of the most used depth test formats
• Comes in 8-word and 6-word versions, some with boxes and some with words in lists
• Learners circle all of the words which are associated with the target word
• Left box is meaning-based
• Right box has collocations
• Ratio of answers per box can vary to make guessing more difficult

• If a learner correctly selects all correct associations and none of the distractors, then this shows good knowledge of the target word
• If a learner selects none of the correct options, then this indicates little or no knowledge of the word
• But what about ‘split’ answers: some correct options and some incorrect ones?
• MA research at Nottingham (Schmitt, Ng, & Garras, 2011) shows that this actually corresponds to little real knowledge of the words
• That is, split scores do not indicate reliable knowledge

Vocabulary Website
• Most research by Schmitt (and colleagues) is available on Norbert Schmitt’s personal website: www.norbertschmitt.co.uk
• There are also vocabulary resources, including vocabulary tests, on the site
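Returning to the nonword adjustment for checklist (yes-no) tests: the H − FA entry in the adjustment table refers to the hit rate minus the false alarm rate. A minimal sketch of that one correction follows, assuming that H is the proportion of real words checked and FA is the proportion of nonwords checked. The function name, the item counts, and the choice to floor the score at 0 are illustrative assumptions, not details from Meara’s tables or from Pellicer-Sanchez and Schmitt (2012).

```python
def adjusted_yes_no_score(real_checked, real_total, nonwords_checked, nonword_total):
    """Return H - FA: hit rate minus false alarm rate.

    Flooring the result at 0 is an assumption made for this sketch.
    """
    if real_total <= 0 or nonword_total <= 0:
        raise ValueError("the test needs at least one real word and one nonword")
    hit_rate = real_checked / real_total                 # H
    false_alarm_rate = nonwords_checked / nonword_total  # FA
    return max(0.0, hit_rate - false_alarm_rate)


# Hypothetical test with 60 real words and 40 nonwords.
careful = adjusted_yes_no_score(45, 60, 0, 40)    # no nonwords checked
careless = adjusted_yes_no_score(45, 60, 10, 40)  # 10 nonwords checked

print(careful, careless)
```

Both learners checked the same 45 real words, but the one who also checked 10 nonwords ends up with a lower adjusted score, which is the "adjusted down" behavior described on the checklist slides.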