Testing Grammar and Vocabulary

A presentation by
Marcia Tadjuddin
Which side are you on?
Grammar and vocabulary should be
tested in isolation. (direct testing of
grammar and vocabulary)
Pro or Con?
Develop your arguments!
Testing Grammar
• Why Test Grammar?
• As far as proficiency tests are concerned, there
has been a shift towards the view that since it is
language skills that are usually of interest, then
it is these which should be tested directly, not
the abilities that seem to underlie them.
Why test grammar?
• Probably, most proficiency tests that are
administered on a large scale still retain a grammar
component.
• One reason for this must be the ease with which
large numbers of items can be administered and
scored within a short period of time.
• The question of content validity: if we decide to test
writing ability directly, then we are severely limited
in the number of topics, style of writing and the
grammatical elements that we can cover in any
version of the test.
So, should we test grammar?
• Wherever the teaching of grammar is thought
necessary, consideration should be given to
the advisability of including a grammar
component in achievement tests.
• Whether or not grammar has an important place
in an institution’s teaching, it has to be accepted
that grammatical ability, or rather the lack of it,
sets limits to what can be achieved in the way of
skills performance.
Writing specifications
• Specification of content for achievement tests:
▫ From the syllabus, which lists the grammatical
structures to be taught.
▫ Inferred from textbooks and other teaching
materials.
• For placement tests:
▫ Includes all of the structures identified above
▫ Also perhaps those structures even for the lowest
• Reflect an attempt to give the test content
validity by selecting widely from the structures
• It should also take account of what are
regarded as the most important structures.
• It should not deliberately concentrate on the
structures that happen to be easiest to test.
Writing items
• Whichever techniques are chosen, it is important
for the text of the item to be written in
grammatically correct and natural language.
▫ E.g. (unnatural): We can’t work in this class
because there isn’t enough silence.
• To avoid unnatural language, it is recommended
to use corpus-based examples.
• Four techniques are presented here: gap filling,
paraphrase, completion and multiple choice. The
first three require production by the candidate; the
last calls only for recognition.
Gap Filling
• Ideally, gap filling items should have just one
correct response.
▫ E.g. The council must do something to improve
transport in the city. ______________, they
will lose the next election. [otherwise]
• An item with two possible correct responses may
be acceptable if the meaning is the same,
whichever is used.
▫ E.g. My dad works in a company _________
makes Nike shoes. [that/which]
Gap filling
• An item should probably be rejected if the different
possibilities give different meanings or involve quite
different structures, only one of which is the one that
is supposed to be tested.
Patient: My baby keeps me awake all night. She won’t
stop crying.
Doctor: __________ let her cry. She’ll stop in the end.
[just, I’d, Well, Then, etc.]
• This item may be improved by including the words
“then” and “just” so that they cannot fill the gap.
Doctor: Then _________ just let her cry. She’ll stop in
the end. [I’d, you]
Gap filling
• Adding to the context can often restrict the
number of possible correct responses to a single
one.
• This may be presented in a longer passage with
several gaps and can be used to test a set of
related structures (such as the articles) or a
variety of structures.
We live in _____ old house in _______ middle
of the village. There is ________ beautiful
garden behind ________ house. ______ roof of
______ house is in very bad condition.
Paraphrase
• Requires the student to write a sentence
equivalent in meaning to one that is given.
• It is helpful to give part of the paraphrase in
order to restrict the students to the grammatical
structure being tested.
▫ When I came, my mom was setting up the dinner
table.
▫ When I came, the dinner table ____________
Completion
• Can be used to test a variety of structures.
• Usually in the form of a conversational transcript.
• See the example on pg. 177
Multiple Choice
• There are times when gap filling will not test what
we want to test because there are too many
possible correct responses.
Gap filling:
They left at 7. They _________ be home by now.
Multiple Choice:
A: They left at 7. They _________ be home by now.
B: Yes, but we can’t count on it, can we?
a. can
b. could
c. will
d. must
Multiple Choice
• Can be used to test discontinuous elements.
A: Poor man, he ________ at that for days now.
B: Why doesn’t he give up?
a. was working
b. has been working
c. is working
d. had worked
Scoring Production Grammar Tests
• The important thing when scoring is to be clear
about what each item is testing, and to award points
for that only.
• Nothing should be deducted for non-grammatical
errors, or for errors in elements of grammar which
are not being tested by the item.
• If two elements are being tested in an item, then
points may be assigned to each of them.
• Alternatively, it can be predetermined that both
elements have to be correct for any points to be
awarded.
• To ensure scoring is valid and reliable, careful
preparation of the scoring key is necessary.
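The scoring rules on this slide could be written down as a scoring key along the following lines. This is only a minimal sketch under stated assumptions, not a procedure given in the presentation: the names `score_item`, `key` and `all_or_nothing`, and the weighting of one point per tested element, are hypothetical choices made for illustration.

```python
def score_item(response, key, all_or_nothing=False):
    """Score one production item against a scoring key.

    response: dict mapping each tested element to the candidate's answer.
    key: dict mapping each tested element to a set of acceptable answers
         (written in lower case).
    Only elements listed in the key are scored, so errors elsewhere in the
    response cost nothing.
    """
    correct = sum(
        1
        for element, accepted in key.items()
        if response.get(element, "").strip().lower() in accepted
    )
    if all_or_nothing:
        # Both (all) elements must be right for any points to be awarded.
        return len(key) if correct == len(key) else 0
    # Otherwise one point (an assumed weighting) per correct element.
    return correct


# Hypothetical two-element item: two article gaps in one sentence.
key = {"gap1": {"an"}, "gap2": {"the"}}
answer = {"gap1": "An", "gap2": "a"}

partial = score_item(answer, key)                      # one element correct
strict = score_item(answer, key, all_or_nothing=True)  # no points awarded
```

Whichever policy is chosen, fixing it in the key before marking begins is what keeps the scoring consistent across markers.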
Testing Vocabulary
Why test vocabulary?
• If there is little teaching of vocabulary, it may be
argued that there is little call for achievement tests
of vocabulary.
• For those who believe that systematic teaching of
vocabulary is desirable, vocabulary achievement
tests are appreciated for their backwash effect.
• The usefulness and feasibility of a general diagnostic
test of vocabulary is not readily apparent.
• In placement tests, we would be looking for some
general indication of the adequacy of the student’s
vocabulary. We would not normally require a
particular set of lexical items to be a prerequisite for
a particular language class.
Writing Specifications
• If vocabulary is being consciously taught, then
presumably all the items presented to the students
should be included in the specifications.
• We can add all the new items that the students have
met in other activities (reading, listening, etc.)
• Words should be grouped according to whether their
recognition or their production is required.
• A subsequent step is to group the items in terms of
their relative importance.
• For placement or proficiency tests, the
specifications may refer to one of the published word
lists that indicate the frequency of the words.
• Words can be grouped according to their
frequency and usefulness.
• From each of these groups, items can be taken at
random, with more being selected from the
groups containing the more frequent and useful
words.
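The sampling idea above (random selection within groups, weighted towards the more frequent and useful groups) might be sketched as follows. The function name `sample_items`, the group names, the weights and the word lists are all assumptions made for illustration; the words are placeholders, not entries from a published frequency list.

```python
import random


def sample_items(groups, weights, total, seed=None):
    """Draw roughly `total` words at random, more from higher-weighted groups.

    groups: dict mapping a group name to a list of words.
    weights: dict mapping the same group names to relative weights.
    Quotas are rounded, so the overall count may differ slightly from `total`.
    """
    rng = random.Random(seed)
    weight_sum = sum(weights.values())
    sample = {}
    for name, words in groups.items():
        # Share of the test allotted to this group, capped at the group size.
        quota = min(len(words), round(total * weights[name] / weight_sum))
        sample[name] = rng.sample(words, quota)
    return sample


# Placeholder groups and weights (hypothetical).
groups = {
    "high": ["house", "water", "work", "time"],
    "mid": ["garden", "village", "roof"],
    "low": ["gleam", "loathe"],
}
weights = {"high": 3, "mid": 2, "low": 1}

chosen = sample_items(groups, weights, total=6, seed=0)
```

With these weights and a total of six, the quotas work out to three, two and one word from the high, mid and low groups respectively.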
Writing Items:
Testing recognition ability
• Multiple choice can be recommended for this
type of testing problem.
• Distractors are usually readily available.
• There seems unlikely to be any seriously harmful
backwash effect, since guessing the meaning of
vocabulary items is something that we would
probably wish to encourage.
Recognizing Synonyms
E.g.: Which is the closest in meaning to “gleam”?
A. gather B. shine C. welcome D. clean
• Which distractors do you think are likely to be
• Whether the distractors would work as intended
would only be discovered through trialling.
• Note that all of the options are words that the
candidates are expected to know.
Recognizing Synonyms
• Another way is to have a common word as the
stem with 4 less frequent words as options.
E.g.: Which is the closest in meaning to “shine”?
A. malm B. gleam C. loam D. snarl
• The drawback: what distractors to use? How do
we ensure that the distractors are unfamiliar
words to the test takers? If the test taker knows
them, they will not distract.
Recognizing Definitions
Loathe means
a) dislike intensely
b) become seriously ill
c) search carefully
d) look very angry
• Note that all of the options are of about the same length.
• Test takers who are uncertain of which option is
correct will tend to choose the one which is
noticeably different from the others.
• In the example above, the writer has included some
notion of intensity in all of the options.
Recognizing Definitions
• The less frequent word or the difficult word could
also be used (although with the same concerns)
▫ One word that means to dislike intensely is
A. growl B. screech C. sneer D. loathe
• Thrasher (Internet) believes that vocabulary is best
tested in context. A better way to test knowledge of
loathe would be:
▫ Bill is someone I loathe for the humiliation that he has
caused me and my family.
A. like very much
B. dislike intensely
C. respect
D. fear
Recognizing appropriate word
for context
• Context, rather than a definition or a synonym,
can be used to test knowledge of a lexical item.
▫ E.g. The strong wind _______ the man’s efforts
to put up the tent.
▫ a. disabled b. hampered c. deranged d.
• Note that the context should not itself contain
words that the candidates are unlikely to know.
Recognizing appropriate word
for context
• Since learners and language users in general
normally meet vocabulary in context, providing
context in an item makes the task more authentic
and perhaps results in a more valid measure of the
candidate’s ability.
• The context may help activate a memory of the
word, in the same way as meeting it when reading in
a non-test situation.
• There could be some negative backwash when words
are presented in isolation.
• However, when we test vocabulary by means of
multiple choice, the range of possible distractors will
be wider if words are presented in isolation.
Testing Production Ability
Using Pictures
• The main difficulty in testing productive lexical
ability is the need to limit the candidate to the
lexical item that we have in mind using only
words. One way around this is to use pictures.
• However, this method is obviously restricted to
concrete nouns that can be unambiguously
drawn.
Testing Production Ability
Definitions
• May work for a range of lexical items.
• But not all items can be identified uniquely from
a definition (excluding all synonyms)
• Not all words can be defined entirely in words
more common or simpler than themselves.
Testing Production Ability
Gap filling
• This can take the form of one or more sentences
with a single word missing.
• Often there is an alternative word to the one we
have in mind. This problem can be solved by
giving the first letter of the word (possibly more)
and even an indication of the number of letters.
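The hinting device described above (giving the first letter and an indication of the number of letters) could be generated mechanically, roughly as follows. This is a sketch under assumptions: the function name `make_gap`, its parameters and the gap layout are hypothetical, and the sentence is borrowed from the tent example earlier in the presentation purely for illustration.

```python
def make_gap(sentence, target, letters_shown=1, show_length=True):
    """Replace the first occurrence of `target` with a hinted gap."""
    if target not in sentence:
        raise ValueError("target word not found in sentence")
    hint = target[:letters_shown]
    if show_length:
        # One underscore per hidden letter, so the word length is visible.
        hidden = len(target) - letters_shown
        gap = " ".join([hint] + ["_"] * hidden)
    else:
        # Fixed-width blank that gives no clue about the length.
        gap = hint + "_" * 8
    return sentence.replace(target, gap, 1)


sentence = "The strong wind hampered the man's efforts to put up the tent."
item = make_gap(sentence, "hampered")
# item: "The strong wind h _ _ _ _ _ _ _ the man's efforts to put up the tent."
```

Showing more letters narrows the candidate's choice further, at the cost of making the item easier.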
• While grammar and vocabulary contribute to
communicative skills, they are rarely to be
regarded as ends in themselves.
• It is essential that tests should not accord them
too much importance, and so create a
backwash effect that undermines the
achievement of the objectives of teaching and
learning where these are communicative in
nature.