Vocabulary Growth of the Advanced EFL Learner
Meral Ozturk*
mozturk@uludag.edu.tr
Abstract
This article reports the results of three studies conducted between 2005 and 2010 on the vocabulary growth of advanced EFL university students in an English-medium
degree programme. Growth in learners’ written receptive as well as productive
vocabularies was investigated in one longitudinal and two cross-sectional studies over
three years. While the first two studies used the receptive and semi-productive versions of
the Vocabulary Levels Test, study 3 used the more recent Vocabulary Size Test (Nation
and Beglar, 2007). The overall results of the three studies suggested that learners’
vocabularies did expand both receptively and productively; however, the growth was
rather modest. Learners' receptive vocabulary sizes were 5-6,000 words and expanded by
about 500 words a year. There was also evidence for severe attrition in the final year.
Productive vocabulary expanded by 10% in the longitudinal study. Receptive knowledge
of academic vocabulary did not improve significantly due to a ceiling effect, but
productive growth was significant. Frequency seems to have a stable overall effect on
vocabulary development. However, an implicational scale between the levels could be
established for only one of the three tests used (i.e. the Vocabulary Levels Test).
*Uludag University, Education Faculty, ELT Department, Turkey
Keywords: vocabulary growth, vocabulary size, receptive vocabulary, productive
vocabulary, frequency, advanced learner, EFL
Introduction
A large vocabulary size is important in using English. Research has shown that for
written receptive tasks like reading newspapers, novels, or academic texts, 8-10,000
words are necessary (Nation, 2006; Hazenberg and Hulstijn, 1996), and for spoken
receptive tasks like watching English TV programmes or movies, 7-8,000 words are
needed (Webb and Rodgers, 2009a; Webb and Rodgers, 2009b). For most EFL learners,
these targets are quite challenging if not impossible to attain. Part of the reason is that
English language courses do not usually target vocabulary beyond a few thousand (Cobb,
1995) on the assumption that having mastered the core vocabulary of English (i.e. the
most frequent 2000 or so words) learners will maintain progress on their own. While in
the earlier stages of language learning vocabulary learning is guided by the teacher and
the coursebook, the advanced learner is left to their own devices to learn a large
vocabulary mainly through language use. The question is whether extended language use
promotes such learning and whether learners continue to expand their vocabularies fast
enough to achieve the desired sizes. Another issue concerns patterns of lexical
development. Do L2 learners’ vocabularies grow in predictable ways, or do they grow
idiosyncratically depending on individual learners’ personal interests and needs? The
present study will investigate the potential of word frequency to predict the path of
development. Word frequency has long been a major guiding principle in setting lexical
targets for L2 learners, and it is assumed that learners should and will proceed according
to frequency. However, few studies have so far empirically tested it. The present study
will investigate these questions in relation to EFL learners who use English for academic
purposes in English-medium degree programmes. As noted by Meijer (2006), English-medium programmes are spreading in non-English-speaking countries ‘especially but not
exclusively in Europe’ and the kind of learner concerned does not represent a marginal
subset of English language learners.
L2 Vocabulary Growth
The literature on L2 vocabulary growth is rather small and only three of these
studies (Cobb & Horst, 2000; Schmitt & Meara, 1997; Milton & Meara, 1995)
specifically deal with vocabulary growth through academic study at the tertiary level
while others involve learning through direct language study in language courses rather
than learning through language use (Milton, 2009, pp.79-85; Laufer, 1998; Read,
1988). Unfortunately, the results of these studies are conflicting regarding the
evidence for progress. While Milton & Meara (1995) report significant gains in the
learners’ receptive vocabularies, the other two fail to provide evidence for any
substantial expansion of vocabulary size (Cobb & Horst, 2000; Schmitt & Meara, 1997).
Growth rates reported in these studies also vary considerably. Milton and Meara
(1995) investigated receptive vocabulary growth of European exchange students in a
British university and they estimated the annual growth rate to be 2650 words on
average. On the other hand, Cobb and Horst (2000) found that the second year
students’ receptive vocabularies in a university in Hong Kong differed from the first
years’ by only 200 words, and the first years did not make any significant gains after six
months. Schmitt and Meara’s (1997) EFL learners in Japan gained only 330 words
receptively in a year. Obviously, more data from similar contexts are needed in order
to gain insight into the nature of vocabulary growth of these learners.
The present study will improve on previous research in many ways. All three of
the aforementioned studies were limited in duration, not exceeding one year, during
which sizeable gains may be difficult to detect. The present studies, on the other hand,
will cover a much longer time span, i.e. three years. While previous studies looked at
receptive size only, the studies in this paper will look at both receptive and productive
size. The only other study which investigated productive growth is Laufer (1998), who
found an increase by 850 words in the vocabularies of her high-school learners in Israel
after one year of language study. In the present research, both cross-sectional (studies
1 and 3) and longitudinal (study 2) designs will be employed. While the first two
studies will use the receptive and (semi-) productive Vocabulary Levels Test (Nation,
2001) to measure vocabulary size, study 3 will use the recently developed Vocabulary
Size Test by Nation and Beglar (2007).
Growth in academic vocabulary, i.e. subtechnical vocabulary that occurs frequently
across a variety of academic disciplines but is not so common in non-academic texts
(Nation, 2001, p.187), will also be investigated. Although growth in this area is to be
expected given the opportunities for exposure to academic vocabulary, Cobb and Horst
(2000) did not find evidence for progress in the knowledge of academic words of the first
year students over six months or from the first to the second year. Conducted under
similar conditions, the present study will investigate if significant gains could be obtained
over several years of academic study.
It is expected that the context of the English-medium degree programme where
the three studies that will be reported here were conducted will provide enough
immersion in the target language to induce vocabulary development. However, the
generally held conviction among the students in the programme that their English
proficiency in general and vocabulary knowledge in particular deteriorated in the
course of their studies runs counter to this expectation. The present study will shed
light on this as well.
The Frequency Effect
An important factor affecting vocabulary development in a second language is
frequency of words in the language. Frequency exerts its influence through input. In the
L1, learners are exposed to words of varying frequency in receptive language use, and
words that appear more often in the input stand a better chance of being learnt as
repeated encounters raise the salience of the word, provide richer clues to meaning, and
strengthen memory traces. Vermeer (2001), in a study with native English-speaking
children in primary education, found significant correlations between the frequency of a
word in the input and the probability of knowing it. In L2 vocabulary acquisition,
frequency is likely to have a stronger effect. Input to L2 learners is usually graded in
vocabulary difficulty, which is largely decided on the basis of word frequency, to the
effect that high frequency vocabulary becomes even more frequent and the effect of
frequency more pronounced. Several studies have provided evidence for a frequency
effect. Even though frequency is a continuous variable, these studies used test
instruments where test words were drawn from lists of words divided, for the sake of
convenience (Meara, 2010, p.3), into one-thousand-word bands of frequency. These
studies have shown significant differences in scores between frequency levels and a
decrease in knowledge as the frequency level decreased. Laufer et al. (2004) measured
vocabulary knowledge of adult ESL learners with intermediate to advanced proficiency in
English in four one-thousand-word bands of frequency (2K, 3K, 5K and 10K) using the
Vocabulary Levels Test, and found that higher frequency words at 2K and 3K levels were
easier for these learners than the lower frequency words in the 5K level which, in turn,
were easier than the 10K level words. Milton (2009) reports a similar pattern in Greek
learners whereby learners' knowledge of words was highest at the 1K level and steadily
decreased over the following four adjacent levels of lower frequency. Laufer & Paribakht
(1998) also found scores to increase with word frequency across the levels. Milton
(2007) formulates this as ‘the frequency model of lexical learning’ profiling learners’
knowledge over frequency levels on a graph borrowed from Meara (1992). The following
graph is the vocabulary profile of a typical learner.
Figure 1. Vocabulary profile of a typical learner (Meara, 2010, p.6)
The model claims that ‘a typical learner’s knowledge is high in the frequent
columns and lower in the less frequent columns giving a distinctive downwards slope
from left to right. As learner knowledge increases, this profile moves upwards until it hits
a ceiling at 100% when the profile ought to flatten at the most frequent levels and the
downwards slope, left to right, shifts to the right into less frequent vocabulary bands.’
(Milton, 2007, p.49). Milton’s own research (2007, 2009) generally supported the model
yielding normal frequency profiles for 60% of the learners. On the other hand, a
substantial proportion of learners deviated from a normal profile and even the most able
learners were not able to hit the 100% ceiling in the highest frequency levels but
plateaued at around 85-90%. This suggests that while frequency has a strong effect on
vocabulary learning, other factors might be at play.
Some researchers went further and looked for the presence of an implicational
scale among the levels. Read (1988) has shown that frequency levels in the Vocabulary Levels Test (VLT) form an
implicational scale whereby a learner ‘…who achieved the criterion score at a lower
frequency level - say, the 5,000-word level - could normally be assumed to have mastered
the vocabulary of higher frequency levels - 2,000 and 3,000 words - as well’ (Read, 1988,
p.18). This finding has been replicated by Schmitt, Schmitt & Clapham (2001).
While frequency is clearly an important factor in vocabulary learning, the case for
frequency could be made from previous studies only for the earlier stages of vocabulary
learning since the test instruments used either did not measure knowledge in lower
frequency levels beyond 5K (X-Lex in Milton 2007, 2009) or measured only the 10K level
skipping the levels in between (Vocabulary Levels Test in other studies). Frequency might
not have the same strong effect on vocabulary learning in advanced levels as in earlier
levels. Since words that need to be learned at an advanced level will generally be of low
frequency, other factors like personal interest might become more decisive in
determining which words are learnt. The present study will test for an implicational scale
covering a greater range of frequency levels in receptive vocabulary knowledge in study
3. The presence of an implicational scale will also be tested for productive vocabulary
knowledge in studies 1 and 2. VLT scores in studies
1 and 2 will also be investigated for an implicational scale for the sake of comparison.
The three studies here will seek answers to the following research questions:
1. Do the written receptive and written productive vocabulary sizes of advanced EFL
learners in English-medium degree programs continue to grow over time and at
what rate do they grow?
2. Does the knowledge of academic vocabulary in a foreign language develop
receptively and productively through academic study?
3. Do word frequency levels form an implicational hierarchy in developing a written
receptive and a written productive vocabulary in a foreign language through
academic study?
Study 1
Fifty-five first-year and forty-five fourth-year students in four intact classes in the
ELT programme in a university in Turkey participated in the study. All spoke Turkish as
their L1. They were highly advanced in English as they had to pass a very competitive
national English test to be admitted to the programme. In the department, they were
immersed in an English language environment, which should be conducive to further
development of vocabulary. Beginning in the first year, all intradepartmental courses
are offered in English and take up 73% of all the courses in the four-year curriculum and
74% of the credit hours that have to be taken to graduate from the programme. In these
courses, the course material, lectures, class discussions, oral presentations, written
projects and exams are mediated through English. English language skills courses are
offered in the first year and the rest of the courses are related to learners’ subject area
which includes linguistics, English language teaching, language acquisition, and English
literature.
Study 1 employed a cross-sectional design comparing the first- and fourth-years
in terms of English vocabulary size. Any difference in vocabulary size between the two
groups is assumed to be the result of the extra years of exposure to English through
academic study by the fourth-year group as both groups studied under similar conditions
in the programme. Both groups had to take the same courses throughout with the
exception of a few interdepartmental elective courses which are taught in the learners’
L1; both groups were taught by the same teachers as the teaching staff is fairly stable; and the
course material is unlikely to have grossly differed in the three intervening years. Initial
English proficiency and vocabulary size of the two groups is likely to be very similar since
the national English test admits students to the department from a very narrow range of
scores each year. However, the method of calculating the national test scores was
changed between the years when the fourth-years and the first-years sat the
English test (2002 and 2005 respectively), and therefore a direct comparison of
the learners’ initial proficiency scores was not possible. On the other hand, the content
of the test has not changed from 2002 to 2005. On both occasions, 60% of the
items measured reading comprehension, 25% tested grammar and vocabulary, and 15%
were translation items. Thus, both groups studied for the same kind of exam, and any
washback effect from the test has probably led to the development of the same kind of
linguistic skills in English.
Learners’ receptive and productive vocabularies at different frequency levels
were measured. For this purpose, the receptive and (semi-) productive versions of the
Vocabulary Levels Test (Nation, 2001) were used. As a measure of receptive vocabulary
size, Version B of the VLT (Nation, 2001) was used. The test measures knowledge of 156
words in total from four frequency levels (2K, 3K, 5K, and 10K levels) as well as academic
vocabulary from the Academic Word List (i.e. AWL (Coxhead, 2000)). Thirty words from
each frequency band and thirty-six words from the AWL were tested. The test uses a
matching format as in the following example, where three words are being tested (horse,
pencil, wall):
1 business
2 clock
3 horse
4 pencil
5 shoe
6 wall

_____________ part of a house
_____________ animal with four legs
_____________ something used for writing
In scoring the test, one point was given for each correct answer and section
scores were computed by counting the number of correct answers in a given section. The
test had good overall and group reliabilities (KR 21=0.89 overall, KR 21=0.93 first years,
KR 21=0.75 fourth years).
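For reference, KR-21 can be computed from nothing more than the number of items and the mean and variance of learners' total scores. The sketch below is my own illustration rather than the authors' script; the variable names are hypothetical, and the original calculation may have used the population rather than the sample variance.

```python
import numpy as np

def kr21(total_scores, n_items):
    """Kuder-Richardson Formula 21 reliability estimate.

    total_scores : iterable of each learner's total raw score on the test
    n_items      : number of dichotomously scored items
                   (e.g. 156 for the receptive VLT, 90 for the productive version)
    """
    scores = np.asarray(total_scores, dtype=float)
    k = n_items
    mean = scores.mean()
    variance = scores.var(ddof=1)  # sample variance; ddof=0 would give the population variance
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))

# Hypothetical usage: overall reliability, then reliability per year group
# print(kr21(all_totals, 156), kr21(first_year_totals, 156), kr21(fourth_year_totals, 156))
```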
Productive vocabulary size was measured by Version C of the Productive Levels
Test (PVLT) (Nation, 2001). This test was chosen because of its structural similarity to the
receptive version although its validity as a test of productive vocabulary knowledge has
been questioned (Read, 2000, pp. 124-6; Schmitt, 2010, pp.203-5). The test format
simulates the mental processing of words in production where users go from a word’s
meaning to its form. It does not measure vocabulary that learners can use in production.
It measures vocabulary which is ‘available for productive use’ (Laufer & Nation, 1999,
p.41). As far as validity goes, the PVLT is no different from the receptive tests (the VLT, the
Vocabulary Size Test and the Yes-No tests (Meara, 2010)). These receptive tests are
practically decontextualised, and they do not measure ‘use’ of vocabulary, either, in that
in answering these tests learners are not using the words to understand written texts. To
borrow a common dichotomy from SLA, the receptive tests and the PVLT are tests of
lexical competence as opposed to lexical performance. In the present study, the test was
reliable overall and for each group of learners (KR 21=0.84 overall, KR 21=0.87 first years,
KR 21=0.73 fourth years).
The test measures knowledge of 90 words in total. It parallels the receptive
version and consists of the same sections (the four frequency levels and the academic
word level from the slightly larger University Word List (Xue and Nation, 1984)) with 18
items each. Although the words tested in the receptive and productive versions are not
entirely identical (in fact, two-thirds of the items in the productive version were
different), the two tests are not necessarily incomparable. The test words are random
selections from a frequency list, and the test scores represent knowledge of all words in
a frequency level rather than knowledge of the exact words tested. One word is no
better than another drawn from the same level. Therefore, this should not be a major
drawback, although ideally one would want all the words to be the same for full
comparability.
Each item in the productive test consists of a single sentence with a blank for one
of the words. The learner is asked to provide the missing word. The test, however, is
productive in a ‘controlled’ way (Laufer & Nation, 1999) as the beginning of each missing
word is given to limit the number of possible answers. Thus, only one word is possible for
a given blank. Here is an example for the word bicycle.
He was riding a bic___________.
The test was scored with one point given for each correct answer on the test.
There was no penalty for misspelled or wrongly inflected words unless the spelling
mistake distorted the pronunciation or the orthographic form of the word or the
inflection error involved an irregular form. Thus, willy for wily, council for counsel,
homojen (L1 spelling) for homogeneous were counted incorrect due to spelling errors.
Also, the inflected forms stretchen for stretched or thrusted for thrust were counted
incorrect. On the other hand, omission of letters which did not alter the spoken or
written form of the word to an important degree (e.g. orchides for orchids) and omission of
plurals or tense markers required by the sentence context were not considered
mistakes.
Both tests were given in one session during normal class hours in the second half
of the 2005-2006 academic year. Each group of learners was tested separately. Half of
the learners in a group were given the receptive test first while the other half were given
the productive test first in order to prevent fatigue from having an effect on test
performance. No two learners sitting next to each other received the same version first,
so that any possibility of cheating was eliminated. There were some overlapping items
between the receptive and productive tests, which might have provided an advantage to
those who answered the receptive test first. It was not possible to check for this
possibility in the present data. However, study 2, using the same first-year data as in
study 1, has found no statistically significant effect of test order on test scores.
Instructions in English appeared on the first page of each test. The learners were
advanced enough in English to understand the instructions and no problems with the
instructions were reported. Each student did both of the tests. On completing a given
version, the learner was immediately given the other version. The tests were completed
in about an hour.
The results for the receptive test are reported in means and mean percentages in
Table 1. One outlier from the fourth-year group and two outliers from the first-year
group were removed from the analysis of the receptive data. The one from the fourth-year group was removed for being too high (98% overall) and those of the first-years
were too low (25% and 36% respectively). The results were examined using a 2x4
analysis of variance with the year of study as the between-subjects factor with 2 levels
(Year 1 and Year 4) and the frequency level as the within-subjects factor with 4 levels
(2K, 3K, 5K, and 10K levels). The academic word section was analysed separately, as it did
not represent a level of frequency.
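For readers who wish to run this kind of analysis themselves, the sketch below sets up a 2 x 4 mixed-design ANOVA with Bonferroni-adjusted follow-up comparisons in Python. It is not the authors' original analysis; the pingouin library, the data file, and the column names are my own assumptions, and the scores are expected in long format (one row per learner per frequency level).

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: columns 'student' (id), 'year' (1 or 4),
# 'level' (2K, 3K, 5K, 10K) and 'score' (section score).
df = pd.read_csv("vlt_receptive_long.csv")

# Year of study as the between-subjects factor, frequency level as the
# within-subjects factor.
aov = pg.mixed_anova(data=df, dv="score", within="level",
                     subject="student", between="year")
print(aov)

# Bonferroni-adjusted pairwise comparisons between the frequency levels.
posthoc = pg.pairwise_tests(data=df, dv="score", within="level",
                            subject="student", padjust="bonf")
print(posthoc)
```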
Table 1. Receptive Test Results of learners in study 1

Level      Year   N     Mean     %     SD
2000       1      53    28.53    95%   1.70
2000       4      44    28.13    94%   1.71
3000       1      53    25.89    86%   3.87
3000       4      44    25.90    86%   2.52
5000       1      53    21.09    70%   6.06
5000       4      44    20.77    69%   3.64
10000      1      53    9.51     32%   5.87
10000      4      44    9.18     31%   3.75
Academic   1      53    30.70    85%   5.03
Academic   4      44    31.79    88%   3.18
Total      1      53    115.71   74%   19.84
Total      4      44    115.79   74%   10.91
ANOVA revealed the main effect for frequency significant (F=829.47, p=.000)
while the main effect for the year of study (F=.152, p=.697) and the interaction between
the two (F=.102, p=.959) were not statistically significant at the .05 level. Assuming equal
starting vocabularies for the two groups, it seems that the fourth years' receptive
vocabularies did not grow further in the three years. The similar performance of two
groups across the frequency levels suggested that they were not qualitatively different,
either. Overall, the differences between frequency levels were all significant according to
Bonferroni tests. The vocabulary profile of the whole group in Figure 2 shows that
learners’ scores across frequency levels linearly decreased with decreasing frequency.
Thus, learners’ knowledge was greatest at the highest frequency level, and systematically
decreased over the following levels as frequency decreased. Mean scores were very
close to the 100% ceiling in the 2K level (around 95%) and relatively high in the 3K level
(86%), but only 28 learners (29%) hit 100% in the 2K level and 12 learners (12%) in the
3K level.
Figure 2. Vocabulary profile of the whole group in receptive VLT in study 1 (N=97)
Following Read (1988) and Schmitt et al. (2001), a Guttman scalability analysis
(Hatch and Lazaraton, 1991) was used to test for the presence of an implicational scale
between the frequency levels. Both studies found high degrees of scalability whereby a
learner who attained the criterion for mastery at a given level can be safely assumed to
have also attained mastery at higher levels of frequency. As in the previous studies, the
criterion for mastery in the present study was set at 90%. Scores higher than or equal to
the criterion were assigned a 1 while scores below the criterion were assigned a 0. Hatch
& Lazaraton (1991) recommend at least .90 for the coefficient of reproducibility and .60
for the coefficient of scalability to be obtained in order to establish an implicational
scale. The results of the present study revealed coefficients higher than the minimum
values recommended (Crep=.995, MMrep=.822, Cscal=.971). They were also similar to, or
even higher than, those obtained in the previous two studies (Crep=.93 and .92 for the
two separate administrations in Read (1988), and .993 and .995 for the two different
versions of the test in Schmitt et al. (2001); Cscal=.90 and .84 in Read (1988) and .971
and .978 in Schmitt et al. (2001)). These results suggested that the frequency levels in the
VLT formed an implicational scale.
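As an illustration of how such coefficients can be obtained, the sketch below computes the coefficient of reproducibility, the minimum marginal reproducibility and the coefficient of scalability from a binary learner-by-level matrix. It is my own implementation, using a Goodenough-Edwards style error count, and may differ in detail from the exact procedure described in Hatch and Lazaraton (1991).

```python
import numpy as np

def guttman_coefficients(matrix):
    """Guttman scalogram coefficients for a binary learners-by-levels matrix.

    matrix : rows = learners, columns = frequency levels,
             1 = criterion (e.g. 90%) reached at that level, 0 = not reached.
    Returns (Crep, MMrep, Cscal).
    """
    X = np.asarray(matrix)
    n, k = X.shape

    # Order the columns from easiest (most 1s) to hardest.
    X = X[:, np.argsort(-X.sum(axis=0))]

    # Count errors as mismatches between each observed response pattern and
    # the ideal Guttman pattern implied by the learner's total score.
    totals = X.sum(axis=1).astype(int)
    ideal = np.zeros_like(X)
    for i, t in enumerate(totals):
        ideal[i, :t] = 1
    errors = int(np.sum(X != ideal))

    crep = 1 - errors / (n * k)                   # coefficient of reproducibility
    p = X.mean(axis=0)
    mmrep = float(np.mean(np.maximum(p, 1 - p)))  # minimum marginal reproducibility
    cscal = (crep - mmrep) / (1 - mmrep)          # coefficient of scalability
    return crep, mmrep, cscal

# Toy example: three learners, four levels (2K, 3K, 5K, 10K);
# the third response pattern contains Guttman errors.
print(guttman_coefficients([[1, 1, 1, 0],
                            [1, 1, 0, 0],
                            [1, 0, 1, 0]]))
```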
Both groups of learners displayed knowledge of many of the words in the
academic word section (85% and 88%); however, the difference between the two groups
did not reach statistical significance at the .05 level (t=1.304, p=.196).
The results of the productive test are given below in Table 2. One outlier from
each group was omitted from the analysis. The omitted fourth-year student scored too
high in the test overall (83%) and the first-year student scored too low (5%).
Table 2. Productive Test Results of learners in study 1

Level      Year   N     Mean    %     SD
2000       1      54    13.52   75%   3.12
2000       4      44    14.84   82%   2.11
3000       1      54    7.69    43%   3.67
3000       4      44    8.52    47%   3.19
5000       1      54    5.46    30%   2.28
5000       4      44    6.25    35%   1.79
10000      1      54    2.12    12%   1.98
10000      4      44    2.14    12%   1.41
Academic   1      54    7.20    40%   3.22
Academic   4      44    7.68    43%   2.70
Total      1      54    35.99   40%   12.63
Total      4      44    39.43   44%   8.98
The ANOVA yielded only one significant effect. The main effect for frequency
(F=877.30, p=.000) was significant while the main effect for year of study (F=2.91,
p=.091) and the interaction between the two (F=2.51, p=.059) were not. These results
suggested that learners' productive vocabularies did not grow significantly in three years,
but frequency was again effective in determining the overall course of development.
Bonferroni post-hoc tests have shown all differences between frequency levels
significant. Learners’ profile in Figure 3 shows a linear decrease with decreasing
frequency. Differently from the receptive scores, there is a sharp difference between the
2K scores and the rest. Still, 2K scores were relatively low (75% vs 82%), and only 8
learners (8%) hit 100%.
Figure 3. Vocabulary profile of the whole group in productive VLT in study 1 (N=98)
For the Guttman analysis the 90% criterion for mastery was not applicable as
there were too few scores which passed the criterion (i.e. 34 scores in 2K and none in the
3K, 5K and 10K levels). Therefore, a different criterion was used to assign the values of 1
or 0 to level scores. A score was assigned a 1 if it was higher than the learner’s score at
the next level of lower frequency. Thus, if a learner has scored 10 in the 2K level and 8 in
the 3K level, his/her 2K score was assigned a 1. If, on the other hand, he/she has scored
8 in the 2K but 10 in the 3K level, his/her 2K score received a 0. The results revealed a
relatively high coefficient of scalability, although lower than that of the receptive test,
whereas the coefficient of reproducibility was below the minimum of .90
recommended by Hatch & Lazaraton (1991), making it difficult to argue for
an implicational scale with sufficient confidence (Crep=.817, MMrep=.901, Cscal=.848).
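A minimal sketch of the adjacent-level recoding used here is given below. It is my own illustration; since the text does not state how the lowest-frequency level, which has no lower neighbour, was treated, that level is simply left out of the coding.

```python
import numpy as np

def adjacent_level_coding(scores):
    """Recode raw level scores with the adjacent-level criterion.

    scores : rows = learners, columns = frequency levels ordered from
             highest frequency (2K) to lowest (10K).
    A level receives a 1 if the learner's score there is higher than
    his/her score at the next, lower-frequency level; the last column
    has no lower neighbour and is excluded (an assumption).
    """
    X = np.asarray(scores, dtype=float)
    return (X[:, :-1] > X[:, 1:]).astype(int)

# A learner scoring 10, 8, 9 and 3 at the 2K, 3K, 5K and 10K levels
# would be coded 1, 0, 1 for the 2K, 3K and 5K levels respectively.
print(adjacent_level_coding([[10, 8, 9, 3]]))
```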
The difference between the two groups in the academic word section was not
statistically significant (t=.799, p=.426). The gap between receptive and productive
knowledge of academic vocabulary within the same individuals, however, seems rather
large. While learners’ receptive knowledge of academic words is quite substantial, their
productive vocabulary is half the size of their receptive vocabulary (85% vs 40% for the
first years and 88% vs 43% for the fourth years).
Nevertheless, the results of this study need to be treated with caution as the
study employed a cross-sectional design, and, although the first- and fourth-years were
argued to be similar in initial proficiency, there is still the possibility that they might have
been different. Therefore, Study 2 will employ a long-term longitudinal design in order to
verify the results of Study 1.
Study 2
Eighteen students from among the 55 first-year students in study 1 participated in
Study 2. Of the 55 students, only the results of those for whom fourth-year data could be
obtained were used in this study. These learners were tested twice on their vocabulary
knowledge. The first time was in 2005 when they were in their first year of study. They
were tested again in their final year in the programme in 2008. The same materials were
used on both occasions. The three-year lapse between the two testing events was
considered long enough for any kind of learning not to carry over from the first to the
second testing.
It was not possible to collect the fourth-year data in class. As the learners were
free to register for a course in any of the six groups that were available, the two intact
groups used in the first-year data no longer existed. Therefore, the tests had to be
distributed by hand to the 55 learners who participated in the first-year data to be
completed in their own time. Eighteen learners answered and returned the tests with a
return rate of 32%.
In order to keep testing conditions as similar as possible to the first administration,
the order of the two tests was counterbalanced across the participants. Half of the 55
learners were given written instructions to answer the receptive test first, and the other
half were asked to answer the productive test first. Fortunately, the data from the 18
learners preserved the balance in the order of the tests as it turned out that 9 of the
learners had answered the receptive test first and the other 9 learners had completed
the productive test first. The learners were instructed to answer the tests on their own
and not to consult external sources such as a dictionary or another person. After the scoring
of the tests was complete, the results were shared with the students individually.
The fact that some of the target words appeared in both tests had a potentially
contaminating effect on the data. There were 27 such words which amounted to about
one third of the items on the productive test. It was possible that the overlapping items
provided an advantage to those students who answered the receptive test first. Having seen
an item in the receptive test first might have aided the subsequent recall of the item in
the productive test. The learners noticed these overlaps, as they pointed them out to the
researcher after the testing.
In order to check for a possible advantage of the overlapping items on the
productive test results, the productive scores of students who did the receptive test first
in the fourth-year data (Mean= 56.89, SD=11.85) were compared to those who answered
the productive test first (Mean=50.33, SD=10.39) using an independent samples t-test. The
order effect could not be tested in the first-year data as information as to the order of
tests for individual learners was not available. Although those who answered the
receptive test first seem to have gained an advantage of about 6 items on the whole test,
this advantage was not statistically significant at the .05 level (t= -1.288, p=.216). Thus, it
appears that test order is unlikely to have influenced the results to a significant degree.
However, future studies using the Levels tests are advised not to counterbalance the
order of the tests, but to give the productive test before the receptive one.
The results of the receptive test for the first-year and for the fourth year are given
in Table 3 below. One subject was removed from the final analysis as she reported that
she obtained the original tests from the internet after the first administration and that
she had specifically studied the tests between the two administrations.
Table 3. Receptive Test Results of learners in study 2 (N=17)

Level      Year   Mean     %     SD
2000       1      28.94    96%   1.95
2000       4      29.24    97%   1.25
3000       1      27.41    91%   3.85
3000       4      27.88    93%   2.54
5000       1      23.71    79%   4.10
5000       4      24.88    83%   2.34
10000      1      11.47    38%   4.87
10000      4      13       43%   5.13
Academic   1      33.35    93%   2.26
Academic   4      34.29    95%   2.11
Total      1      124.88   83%   14.17
Total      4      129.29   86%   9.32

N.B. The maximum score for the academic vocabulary level is 36, for other frequency levels 30, and for the total 156.
The ANOVA results were similar to those of study 1. While the main effect for
frequency was statistically significant (F=265.92, p=.000), neither the main effect for year
of study (F=.944, p=.339), nor the interaction between the two (F=.386, p=.764) reached
significance. The non-significant effect of the year of study suggested that these learners
did not improve their vocabularies to a significant degree. This finding is surprising
because the learners who participated in this study did so voluntarily without any return
for their efforts other than the feedback on their lexical competence. They are likely to
have highly positive attitudes towards and a high degree of motivation for vocabulary
learning. A comparison of this group of learners with the larger group in Study 1 has
shown that they scored significantly better overall on the receptive test (t=-2.698,
p=.009) as well as on the productive test (t=-3.363, p=.001). Therefore, they were likely
to have a larger starting vocabulary. Even the better learners, however, do not seem to
make any progress receptively.
Bonferroni post-hoc tests on frequency revealed all differences between
frequency levels significant. The Guttman scalogram analysis using a 90% criterion for
mastery suggested the presence of an implicational scale between the levels (Crep=.986,
MMrep=.867, Cscal=.894). The vocabulary profile of the learners in Figure 4 is somewhat
flatter in the higher frequency levels than those in study 1, which resulted from the
closer performance of the study 2 learners in the 2K and 3K levels. Mean scores were
very close to 100% and about half (53%) of the learners hit 100% in the 2K and 3K levels
each.
Figure 4. Vocabulary profile of the whole group in receptive VLT in study 2 (N=17)
The difference in academic word scores between the first year and the fourth
year of study (i.e. 1 word on average), although statistically significant (t= -2.885,
p=.011), was very small in terms of the number of words learnt during the intervening
three years, amounting to an increase of about 16 words over the 570 words of the AWL.
Of course, this might be due to a ceiling effect. Learners’ scores in the academic
vocabulary section were already very high in the first year, only one to three items short
of the maximum score, and there was little room for improvement.
The results of the productive test are given in Table 4. The ANOVA revealed two
significant effects: the main effect for frequency (F=255.018, p=.000) and the main effect
for year of study (F=8.530, p=.006). The interaction was not significant (F=1.577, p=.200).
Table 4. Productive Test Results of learners in study 2 (N=17)

Level      Year   Mean    %     SD
2000       1      15.29   85%   1.99
2000       4      16.18   90%   1.29
3000       1      8.94    50%   2.82
3000       4      11.65   65%   2.94
5000       1      6.88    38%   1.69
5000       4      8.24    46%   2.56
10000      1      3.00    17%   1.46
10000      4      4.88    27%   3.04
Academic   1      9.00    50%   2.72
Academic   4      11.29   63%   2.62
Total      1      43.12   48%   8.43
Total      4      52.24   58%   9.62

N.B. The maximum score for each frequency level is 18, and for the total 90.
Overall, learners increased their scores by about 10% from year 1 to year 4. The
increase occurred uniformly in all sections of the test, by 1-2 words on average. The
Bonferroni test showed all the differences between frequency levels significant, and the
vocabulary profile in Figure 5 indicates a linear decreasing effect. Learners did rather well
in the 2K level (85% vs 90%), but only 3 learners (18%) hit 100%. The Guttman analysis
did not suggest an implicational scale between frequency levels (Crep=.844, MMrep=.892,
Cscal=.444) with both the reproducibility and the scalability coefficients being lower than
the minima (>.90 and >.60 respectively) suggested by Hatch & Lazaraton (1991).
Figure 5. Vocabulary profile of the whole group in productive VLT in study 2 (N=17)
For the academic vocabulary, the overall gain in three years was around 13%,
which was also statistically significant (t=-2.885, p=.011). In this study as in study 1, there
was a large gap between the receptive and productive knowledge of academic
vocabulary, with the gap getting smaller in the fourth year (the receptive-productive difference
being 43 percentage points in the first year (93% vs 50%) and 32 points in the fourth year (95% vs 63%)).
Study 3
Study 3 was conducted against the possibility that the receptive gains were
underrepresented in the first two studies due to the limitations of the test instrument
used. The Vocabulary Levels Test is not sensitive to gains in lower frequency levels
especially after the 5K level as the frequency bands selected are not spaced evenly and
more bands are measured from the first 5K words while the second 5K (up to 10K) is
measured with only one band. Any gains made within this broad band will not be
detected by the Levels Test. This is certainly a possibility for 20% (N=11) of the learners in
study 1 and 30% (N=6) of learners in study 2 who had already attained mastery at the
5K level and were likely to have moved on to lower frequency levels beyond. Study 3 will use a
measure which is more sensitive to gains in lower frequency vocabulary.
The participants in study 3 were drawn from the same context as in studies 1 and
2. There were 174 participants altogether who were in different stages of their studies.
48 of these were first-year students, 60 were second-years, 34 were third-years, and 32
were fourth-years. These groups were assumed to represent different levels in terms of
general English proficiency as well as vocabulary knowledge because of the differences in
the number of years of study.
In study 3, the size of learners’ receptive vocabularies was measured with the Vocabulary
Size Test (Nation and Beglar, 2007; also available at:
http://www.victoria.ac.nz/lals/staff/paul-nation/nation.aspx). Learners’ productive
knowledge could not be measured due to the absence of an equivalent productive test.
The Vocabulary Size Test is based on word frequency lists from the British National
Corpus, arranged into 14 one-thousand-word bands of decreasing frequency. The test
contains 10 target words from each frequency band with a sampling rate of 1 in 100. The
target words are presented in short sentences with non-defining contexts. The test uses
a multiple-choice format in which the choices are single-word or phrase-length
definitions. An example item from the 2K band is given below:
nil: His mark for that question was nil.
a. very bad
b. nothing (key)
c. very good
d. in the middle
In the present study, learners were tested only on nine of the fourteen frequency levels.
The 1K level and the four levels from 11K-14K were not tested. The 1K level was judged
to be too easy and the levels beyond the 10K too difficult for the learners, judging from the
performance of their peers in the Levels Test in the first two studies. The exclusion of
these levels resulted in a shorter and more feasible test. In scoring the test, one point
was given for each correct answer. The overall test score was converted to a size score
out of the 10,000 words targeted in this reduced version of the test. In the calculation of
overall size, the learner’s test score was multiplied by 100 as each word in the test
represented 100 word families. Given the English proficiency levels of the learners and
their near perfect performance in the 2K level in the Vocabulary Levels Test, the learners
were credited with knowledge of the whole of the 1K level words, and accordingly the
size score was increased by 1000 for each learner.
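The size calculation just described reduces to a simple linear conversion. The sketch below is my own illustration of it; the function and parameter names are hypothetical.

```python
def vst_size_estimate(raw_score, families_per_item=100, credited_1k=1000):
    """Convert a score on the reduced (9-level, 90-item) Vocabulary Size Test
    into a word-family estimate: each correct item stands for 100 word
    families, and every learner is additionally credited with the whole,
    untested 1K band.
    """
    return raw_score * families_per_item + credited_1k

# The highest and lowest scorers in study 3 (69 and 22 items correct)
# work out to 7900 and 3200 word families respectively.
print(vst_size_estimate(69), vst_size_estimate(22))
```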
The test was administered to students in normal class hours. The learners were
told that they were being asked to answer a test which measured how many words
they knew in English. It was believed that the learners would be more motivated to do
the test if they also benefited from it, and therefore, they were promised their test
results. Considering the concern some learners might feel in having their results
announced publicly on a notice board, they were given a choice to learn their test
score individually in private. No time limits were set for the test, but it was completed
in about 40 minutes by most students. The slowest learner took 50 minutes and the
fastest as little as 20 minutes.
The results of the test for the four learner groups are given in Table 5. The KR 21
reliabilities were mostly acceptable with the exception of the third years. Overall,
learners answered about half of the items on the test correctly (47 out of 90 items). This
converts to a receptive size of 5686 words on average. The scores spread over a wide
range: the highest score was 69 and the lowest was 22. Expressed in terms of vocabulary
size, these figures correspond to 7900 and 3200 word families respectively with a
difference of 4700 word families. None of the learners seems to have reached the 8-10,000-word
written-receptive target, and they are at varying distances from it.
Table 5. Results of the Vocabulary Size Test in study 3

Year             Stat   2K      3K      4K      5K      6K      7K      8K      9K      10K     Test    Size      KR 21
First (N=48)     Mean   6.54    5.90    5.67    4.83    4.23    3.67    5.00    3.17    3.44    42.44   5243.75   0.64
                 SD     1.37    1.57    1.66    1.45    1.36    1.31    1.56    1.36    1.70    7.84    783.86
Second (N=60)    Mean   7.58    6.40    6.18    6.07    4.33    3.98    6.28    3.68    3.13    47.65   5765.00   0.77
                 SD     1.43    1.56    1.47    2.07    1.59    1.42    2.03    1.47    1.83    9.61    961.08
Third (N=34)     Mean   8.03    6.88    7.15    6.74    4.65    4.74    7.03    4.00    4.03    53.24   6323.53   0.56
                 SD     1.17    1.39    1.65    1.29    1.41    1.29    1.57    1.26    1.51    6.94    694.16
Fourth (N=32)    Mean   7.53    6.16    6.72    5.97    3.72    4.03    5.91    3.34    2.91    46.28   5628.13   0.68
                 SD     0.95    1.39    1.37    1.49    1.69    1.31    1.94    0.97    1.45    8.24    823.54
Total (N=174)    Mean   7.37    6.31    6.33    5.84    4.25    4.05    6.01    3.54    3.35    47.05   5705.17   0.74
                 SD     1.39    1.53    1.62    1.79    1.53    1.38    1.93    1.34    1.70    9.12    912.11
F                       10.32   3.03    6.99    9.46    2.15    4.29    9.08    3.14    3.02    11.12
p                       .000*   .031*   .000*   .000*   .095    .006*   .000*   .027*   .031*   .000*
ANOVA revealed all the effects significant (main effect of frequency: F=212.485,
p=.000; main effect for the year of study: F=11.128, p=.000; interaction: F=2.735,
p=.000). Learners’ overall scores increased steadily with the year of study, except for a drop in
the last year. The Bonferroni post-hoc tests revealed all the differences between the first
three groups significant, while the fourth years’ scores were not significantly different
from the first years’ and the second years’, and they were significantly lower than the
third years’. The mean increase over the whole test between two successive groups (first
years vs second years and second-years vs third years) was about 5 words, which
corresponds to an annual increase of about five hundred words (521 words vs 559
words respectively). While these are not as impressive as the annual growth rate (i.e.
2650 words a year) reported in Milton and Meara (1995), they are a little better than the
330 word families reported in Schmitt and Meara (1997), the 200 words in Cobb and
Horst (2000), or the insignificant gains in Study 1 and 2 in this paper. The significant
increase in receptive vocabulary in study 3 in comparison to studies 1 and 2 could be
explained by the greater sensitivity of the Vocabulary Size Test to knowledge in the
lower frequency levels.
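These rates can be recovered directly from the group means in Table 5 (my own calculation), since each test item stands for 100 word families:

\[
(47.65 - 42.44)\times 100 \approx 521, \qquad (53.24 - 47.65)\times 100 \approx 559.
\]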
The vocabulary profile of the whole group across the levels of the test (cf. Figure
6) somewhat deviates from a normal profile. The most noticeable deviation is the
unexpected peak in the 8K scores, which were not significantly different from the 3K, 4K
and 5K scores according to Bonferroni tests, and more learners (i.e. 18 vs 14, 16, and 9
respectively) passed the 90% cut-point for mastery. An examination of the test words in
this level suggested that the presence of three cognates (palette, authentic, cabaret)
was likely to have boosted performance at this level. Disregarding the 8K level, learners’
profile shows a general decreasing trend with frequency so that the scores were highest
at the 2K level and lowest at the 10K level. However, this decrease seems to take
place slowly and to become obvious only over several frequency bands. There are three such
clusters in the data: 3K, 4K and 5K form a cluster, the 6K and 7K form another cluster,
and finally 9K and 10K also cluster together. Bonferroni tests revealed the differences
between the levels in a cluster non-significant (except the difference between 3K and
5K). Learners’ performance at the 2K level was significantly better than all other levels.
However, the scores were not as high as would be expected given the level of the
learners and their performance in the receptive version of the VLT. Learners scored 7.37
on average out of 10, and only 22% of the learners (i.e. 39) demonstrated mastery at this
level on the 90% criterion while only 6 (3%) learners hit 100%. In the light of this finding,
the validity of our earlier assumption that the learners would have full mastery of the 1K
level was checked with further data obtained from the second year group. Seventeen learners
answered the 1K level section of the test, where one of the distractors for the word
reason was replaced, as recommended by Beglar (2009), for failing to behave properly as a
distractor. The average score for this level was 9.52 out of 10 items, which suggested that
learners’ vocabulary sizes in Table 5 had been overestimated by only about 50 words in
each group. As this overestimation was uniform across proficiency groups, it does not
invalidate the foregoing conclusions drawn from the data concerning group differences.
Figure 6. Vocabulary profile of the whole group in the Vocabulary Size Test in study 3 (N=174)
The interaction effect seems to be the result of the fourth years’ performance
across the frequency levels (cf. Figure 7). While the other three groups display a
remarkably similar pattern of scores over the levels (cf. Figure 8, without the fourth
years), the fourth year group deviates from this pattern.
Figure 7. Vocabulary profiles of the four year groups in the Vocabulary Size Test in study 3 (N=174)
Figure 8. Vocabulary profiles of the first three groups in the Vocabulary Size Test in study 3 (N=142)
The fourth-years scored unexpectedly lower both overall and in individual test sections. Their
scores were always lower than those of the third years. They scored lower than the
second years in seven of the nine sections. They were almost always better than the first
years. They generally fell somewhere between the first and second years. To explain this
unexpected performance, the possibility of an initial discrepancy with the other groups
was investigated. Learners’ scores in the university admissions test were expected to
provide some clue to their initial proficiency and indirectly to the size of their
vocabularies as these two are closely related (Alderson, 2005). While being a composite
measure, the university admissions score is largely based on scores in an English
proficiency test and therefore would be highly indicative of English proficiency levels of
the learners prior to the start of their studies. Scores were not available for individual
learners who participated in this study. However, the descriptive statistics provided by
the National University Testing Centre for the whole student population in the
department (cf. Table 6) do not indicate any important gaps between the fourth-years
and the other groups. So, the fourth years were not initially disadvantaged. A sudden
attrition in the final year is not altogether unlikely, though. In an interview about their
test results, the learners suggested that they had limited extracurricular involvement
with English as they devote most of their time to the preparation for a written (non-English) exam for prospective teachers in order to secure positions as English language
teachers in state schools after graduation. The performance of the fourth years in this
study might also offer an explanation for the non-significant results in receptive scores in
the first two studies, which sampled from the first and fourth years skipping the
intermediate years. The decrease in receptive size in the fourth year might have
disguised the growth in the intervening years. It would be alarming, however, from the point of view of
L2 vocabulary learning if all this attrition took place in the final year and most of what was
gained in three years was lost in one year. Other studies also report attrition in
vocabulary knowledge. Schmitt and Meara (1997) note that 28% of their subjects
decreased in vocabulary size, and in Milton and Meara (1995) 5 of the 53 subjects
regressed to a lower size. This suggests that vocabulary learning is not only a matter of
learning new words or new aspects of known words, but also of preserving what is
known.
Table 6. Learners’ University Admissions Test scores in study 3

Current Level   Year of Admission   N     Mean   SD     Min.   Max.
First Years     2009                164   342    5.35   317    368
Second Years    2008                164   358    2.67   355    375
Third Years     2007                154   352    3.33   346    366
Fourth Years    2006                154   351    3.07   345    371
A number of Guttman scalogram analyses were performed on the data (cf. Table
7). The 90% criterion was not usable as there were often either too few or no scores that
met the criterion. Therefore, as in the productive analyses of the first two studies, a given score
was coded with respect to the score at the next (lower-frequency) level. This analysis was first applied to the
whole data, but it did not indicate the presence of an implicational scale between the
levels. In the hope of obtaining evidence for an implicational scale, the analyses were
repeated without the 8K level and without the fourth-year data, where the results had turned
out to be rather different from what was predicted. Another analysis included only the levels
corresponding to those in the VLT on the basis of the possibility that the VLT revealed an
implicational scale because it used larger frequency bands. None of these analyses,
however, suggested the presence of an implicational scale.
Table 7. Guttman scalogram results in study 3

         All     Without Level 8   Without Fourth Years   VLT Levels
Crep     0.539   0.554             0.536                  0.518
MMrep    0.54    0.573             0.655                  0.641
Cscal    0.39    0.044             0.344                  0.342
General Discussion
Receptive Growth
While the first two studies did not provide evidence for significant receptive
growth, study 3 suggested that this might be due to a backslide in the final year sweeping
away the gains made in the preceding three years. When the final year is excluded, study 3 has shown that
learners’ receptive vocabularies do increase, but slowly, by about 500 words a year.
Nation (1990, p.11) estimates the receptive growth rate for native speakers to be
between 1,000 and 2,000 new words per year. The learners in the present study did not
seem to expand their receptive vocabularies at that rate. These results are not dissimilar
to those obtained in some of the studies mentioned earlier, either (Cobb & Horst, 2000;
Schmitt & Meara, 1997). Although the first two studies in the present research did not
provide a size estimate, study 3 has shown that the size of learners’ vocabularies was
about 5-6,000 words. This is obviously lower than the 10,000-word target set for
academic study (Hazenberg & Hulstijn, 1996), suggesting that learners do not attain the
vocabulary target for academic reading incidentally through reading.
The learners’ main input for written receptive development came from the reading of
academic texts in their disciplines. Academic reading involves incidental learning of
vocabulary, and research suggests that such learning is minimal in L2 learners (see Horst,
Cobb and Meara, 1998 and Horst, 2005 for a review of this research). Of course, the slow
vocabulary growth in the present study might be the result of the specific learning
conditions in the institution where the research has been conducted. Nation (2007)
argues that substantial vocabulary learning from reading will occur if learners are
exposed to large amounts of text, and there is some research evidence for this (Horst,
2005). The amount of reading these learners have undertaken might have been
insufficient either because the learners were assigned small amounts of reading in the
courses they took or because there was not sufficient enforcement of the
completion of the reading assignments. However, slow growth seems typical for this kind
of learning context as other studies conducted in an EFL academic context (Schmitt &
Meara, 1997; Cobb & Horst, 2000) found similarly slow growth rates.
There were several disadvantages of the learning context which made it
unfavourable to vocabulary learning from reading. First, the learners in the present study
were relatively advanced learners with most of the high frequency vocabulary already in
stock. The words that these learners still need to learn are often low frequency words
which are less likely to be learned incidentally because learners will encounter them less
frequently in their reading and fewer opportunities will arise to learn them. Also, the
gaps in these learners’ present vocabularies might not be causing them too much trouble
in their reading, as most of them have a substantial number of words at their disposal to
scaffold them in their reading. An unknown word now and then might not seem too
serious, as it can be compensated for by guessing in ‘pregnant contexts’ (Mondria &
Wit-de Boer, 1991), looked up in a dictionary, or simply ignored. Therefore,
learners may not feel the need to make an effort to learn new words.
Second, the kind of vocabulary these learners were exposed to in their reading is
likely to be different from the kind of vocabulary measured by a size test based on
general written English. The learners were exposed to a specific type of English in their
disciplines which may not be lexically as diverse as general English or even general
academic English. There is research evidence that indicates use of smaller vocabularies in
specific disciplines. Sutarsyah et al. (1994) compared the vocabulary of an economics
textbook with a general corpus of academic English. The former contained less than half
the number of different words found in the general corpus (5438 and 12744
respectively). Ward (1999) examined the vocabulary of engineering and concluded that
2000 words will be sufficient for reading engineering texts. This suggests that, when
reading in their subject area, learners will encounter only a subset of the 8-10,000
vocabulary of general written academic English and perhaps even a smaller subset of
general written English, and there will be fewer opportunities for lexical learning from
reading these texts. The possibility of smaller vocabularies in specific disciplines
might also explain the negative correlation between study time and vocabulary gain in
Milton & Meara (1995), whereby learners who spent more time on academic study
gained fewer words on a general vocabulary size test.
Third, in academic reading ‘technical vocabulary’, i.e. vocabulary that relates to the
learners’ area of study, might seem more important to acquire, causing learners to pay
less attention to the learning of general vocabulary, which is reflected in insignificant gains on
the size tests. Lessard-Clouston (2006) reports significant gains in the technical
vocabularies of native and non-native theology students in Canada in both size and
depth over one term, providing evidence for enhanced attention to technical
vocabulary. In future research, the two aspects of learners’ vocabulary development
need to be investigated together to identify the interrelationships between general
and technical vocabulary development. There might be differences across disciplines in
this respect as specific disciplines are likely to make different demands on the learner
with respect to technical vocabulary. Chung & Nation (2003) have compared the
technical vocabulary of an applied linguistics text and an anatomy text, and found that
one in every five words in the applied linguistics text is technical while a technical word
occurred once in every three words in the anatomy text, suggesting that applied
linguistics might have a smaller technical vocabulary than anatomy. Studies on
vocabulary development in a variety of disciplines are, therefore, needed.
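By way of illustration, the sketch below shows one simple way such a technical-word density could be estimated, assuming a plain-text file and a pre-compiled list of technical terms for the discipline. Both inputs are hypothetical placeholders, and the calculation is far cruder than the rating-scale and corpus procedures used by Chung & Nation (2003).

import re

def technical_density(text_path, term_list_path):
    """Proportion of running words that appear in the technical term list."""
    with open(term_list_path, encoding="utf-8") as f:
        technical_terms = {line.strip().lower() for line in f if line.strip()}
    with open(text_path, encoding="utf-8") as f:
        tokens = re.findall(r"[a-z]+", f.read().lower())
    technical_tokens = [t for t in tokens if t in technical_terms]
    return len(technical_tokens) / len(tokens) if tokens else 0.0

# A density of 0.20 corresponds to one technical word in every five running
# words, and 0.33 to roughly one in three.
print(technical_density("anatomy_text.txt", "anatomy_terms.txt"))  # hypothetical files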
Productive Growth
The growth in written productive vocabulary (i.e. writing vocabulary) over three
years was statistically significant in the longitudinal data, but non-significant in the
cross-sectional data involving a greater number of subjects. The increase might be
characteristic of better learners, as the learners in the longitudinal study had larger initial
vocabularies than the larger group in the cross-sectional study. Although the increase is
only 10%, it is likely to have dropped to this figure from a higher percentage, since the
attrition observed in the receptive scores in the final year is likely to have occurred in the
productive scores as well. In the cross-sectional data, on the other hand, the increase in
the middle years might have been masked by this attrition.
On the whole, the learning environment can be argued to be conducive to
productive growth: in the programme, learners are given opportunities for written
production in the form of term papers and in-class exams. However, a greater increase
would have been expected given the length of study and the learners’ starting receptive
vocabularies. Considering that many of the words in the productive test were already
known to these learners receptively, given their performance on the receptive test, the
conversion of receptive into productive knowledge seemed rather slow. For faster
development, more frequent and regular production is recommended.
The unsatisfactory growth in productive vocabulary can also be related to the
idea of “comprehensible output” (Swain, 1985), which involves stretching one’s
linguistic resources in production. Certain learner behaviours in academic writing reduce
the pressure to stretch these resources. One of these is avoidance. Learners are known
to avoid difficult vocabulary in production (Blum & Levenston, 1978), and since words
known only receptively are difficult to produce, they might have been avoided in writing,
with the result that opportunities for learning them productively were lost. Another
student behaviour that might have a negative effect on comprehensible output involves
paraphrasing others’ work with minimal modification in written assignments, which
hardly stretches one’s linguistic resources. This kind of writing is not likely to contribute
much to learners’ productive vocabularies.
Growth in Academic Vocabulary
The high scores of the first-years on receptive academic vocabulary in studies 1 and
2 suggest that these learners already had receptive knowledge of a large proportion of
academic vocabulary (85-95%), probably acquired prior to their studies. The improvement
in receptive academic vocabulary over three years was therefore quite modest. The larger
starting academic vocabularies of these learners stand in contrast to Cobb and Horst’s
(2000) subjects in Hong Kong, who knew around 70% of the academic vocabulary at the
start of their studies, and to Read’s (1988) learners, who knew 64%. The advanced
receptive knowledge of academic vocabulary of the learners in the present study is likely
to have developed in the course of the long and painstaking preparation these learners
undertake to pass the English Test of the University Entrance Exam, which includes
non-fiction texts in which academic vocabulary is likely to occur frequently. It is also
interesting to note that learners’ scores on the academic section were remarkably higher
than those on the 5K section, although the words in the two sections were similar in
frequency (Laufer, 1998). From these data, academic vocabulary emerges as a
psychologically distinct category, although concerns have recently been raised as to the
validity of the AWL (Martinez et al., 2009; Wang et al., 2008) or even the existence of a
so-called academic vocabulary (Hyland & Tse, 2007).
While the development of receptive academic vocabulary was held back by a
ceiling effect, there was plenty of room for productive development, as these learners’
productive knowledge was about half the size of their receptive vocabulary in studies 1
and 2. The expansion in productive knowledge of academic vocabulary was relatively
large in the longitudinal data (13%). However, a greater improvement would have been
expected given the kind of academic work these learners had to undertake. The same
explanations as those offered for general productive vocabulary are likely to hold for the
somewhat unsatisfactory development of academic productive vocabulary as well.
Frequency
The effect of frequency was significant in all five tests used in the three studies:
learners’ vocabulary scores tended to decrease as the frequency of the words decreased.
This study thus provides further empirical support for the frequency model of lexical
learning, in that learners’ knowledge of higher-frequency words tends to develop faster
than their knowledge of lower-frequency words. At lower frequency levels, the effect of
frequency becomes apparent only over wider bands, e.g. 2,000-word frequency bands.
However, the presence of an implicational scale could only be established for the
receptive Vocabulary Levels Test. The two studies in the literature that found an
implicational scale (Read, 1988; Schmitt et al., 2001) also used the Vocabulary Levels Test.
When other tests are used, no implicational scale seems to be present. This might have to
do with the degree of knowledge required by different tests. Of the three tests used in the
present study, the VLT was easier than the Productive VLT, but it was also easier than the
VST. Nation and Beglar (2007, p. 11) claim that the two receptive tests are not of equal
difficulty and that the Vocabulary Size Test is slightly more demanding than the VLT.
Learners’ performance in the present study supported this claim: they answered correctly
a greater proportion of items in the higher frequency levels of the Levels Test, and a
greater number of learners displayed mastery. The two tests are therefore likely to be
tapping different degrees of receptive knowledge. These results suggest that the
frequency effect is relatively strong, and that it is stronger when the measurement
requires a lower degree of knowledge. Frequency seems to have predictive power for the
initial learning of words, which requires only limited knowledge. For deeper learning,
frequency may not be the sole factor determining L2 vocabulary development, and other
factors might be at play.
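To make the notion of an implicational scale concrete, the sketch below checks whether a learner’s scores across frequency levels follow the expected pattern, namely that mastery of a lower-frequency level implies mastery of every higher-frequency level. The 85% mastery criterion and the sample scores are assumptions chosen for illustration; they are not the scalability statistics reported in the studies cited above.

MASTERY_THRESHOLD = 0.85  # assumed criterion for "mastering" a level

def fits_implicational_scale(scores_by_level):
    """scores_by_level: proportions correct, ordered from highest- to lowest-frequency level."""
    mastered = [score >= MASTERY_THRESHOLD for score in scores_by_level]
    # Once mastery stops, it should not resume at any lower-frequency level.
    first_gap = mastered.index(False) if False in mastered else len(mastered)
    return not any(mastered[first_gap:])

# Example: proportions correct at the 2K, 3K, 5K and 10K levels
# for two hypothetical learners.
print(fits_implicational_scale([0.95, 0.90, 0.70, 0.40]))  # True: fits the scale
print(fits_implicational_scale([0.95, 0.60, 0.90, 0.40]))  # False: violates the scale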
Learners’ performance at the 2K level of the VST was lower than expected. While
the learners in studies 1 and 2 scored 95% or more at the 2K level of the VLT, the
learners in study 3 scored around 75% at this level. It is interesting that learners with
relatively large vocabularies of 5-6,000 words should still have gaps at the 2K level when
the test requires more precision. Few learners reached the 100% ceiling, but the plateau
seems to change with the test.
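For readers unfamiliar with how level scores relate to size estimates, the sketch below illustrates the common convention for frequency-sampled tests such as the VST, in which each correct item is taken to represent 100 word families (10 items per 1,000-word level). The per-level scores are hypothetical, only the first eight levels are shown, and the figures do not reproduce the actual test data of study 3.

# Assumed sampling convention: 10 items per 1,000-word frequency level, so
# each correct item stands for 100 word families.
items_per_level = 10
families_per_item = 100

# Hypothetical numbers of correct answers at the 1K-8K levels.
correct_by_level = {1: 10, 2: 8, 3: 8, 4: 7, 5: 6, 6: 5, 7: 4, 8: 3}

total_correct = sum(correct_by_level.values())
size_estimate = total_correct * families_per_item

print("2K level: {:.0%} correct".format(correct_by_level[2] / items_per_level))
print("Estimated receptive vocabulary size:", size_estimate, "word families")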
Conclusion
This study has investigated the vocabulary growth of advanced EFL learners in an
academic, mainly incidental, learning context. The conclusion suggested by the three
studies in this paper about the nature of L2 vocabulary development is that the expansion
of vocabulary size at advanced levels through academic study is rather slow, even though
there is some significant progress. Learners do not seem to add many new words to their
vocabularies, nor do they seem to transfer many words from receptive to productive
knowledge. Knowledge of academic vocabulary followed a similar pattern. The data
further suggest the possibility of regression in receptive size when the level of
involvement with the target language decreases. Thus, the data provided empirical
support for the present learners’ sense of deterioration in their lexical knowledge in the
final year; for the middle years, it seems more likely that slow growth was misperceived
as deterioration. Frequency seems to have a stable overall effect on vocabulary
development, whereby learners’ knowledge changes linearly with frequency. However,
for two of the three tests used, an implicational scale between the levels could not be
established.
The present research attempted to provide a broader and more accurate picture of
vocabulary growth at the advanced level by using both cross-sectional and longitudinal
data covering a longer time span than in previous studies, as well as by using multiple test
instruments. It should be noted, however, that the tests measure vocabulary knowledge,
not vocabulary use. Receptive and productive use of vocabulary involve knowledge and
skills beyond vocabulary knowledge alone, such as guessing unknown words or identifying
the referents of known words in receptive use, or using a word in a grammatically correct
and pragmatically appropriate way in productive use. It is therefore possible for a learner
to be able to use more or fewer words receptively or productively than suggested here.
Also, the results are valid only for written vocabulary and do not generalize to spoken
vocabulary. Finally, the test instruments employ a rather restricted definition of
vocabulary knowledge, limited to knowledge of the form and basic conceptual meaning of
a word. Correctly answered items on the tests cannot therefore be assumed to be known
to the learners in any further depth.
The plateau in vocabulary expansion suggested in this paper might be
characteristic of advanced vocabulary learning in academic learning situations, which is
largely incidental. More research into the vocabulary development of L2 learners in
different learning contexts (e.g. incidental learning contexts vs language courses; ESL vs
EFL) and at different proficiency levels is needed.
Growth in productive vocabulary also requires further research. However, we do
not yet have a productive vocabulary test instrument comparable to the Vocabulary Size
Test or the EVST, one which samples evenly from the frequency levels of a modern
frequency list.
Vocabulary size targets also need to be identified for receptive and productive
vocabulary through further research. The receptive target recommended for academic
study in English is based on a study of Dutch (Hazenberg and Hulstijn, 1996), and there
are no studies on productive targets. Reliable targets need to be established and
announced to language learners, teachers and materials writers if vocabulary growth is to
be sustained to the required levels.
The slow growth of the learners in the present study suggests that advanced
learners in English-medium degree programmes do not learn a great number of words
through academic study, and this does not seem to be confined to the context of the
present study. To ensure greater growth in such contexts, heavier reading and writing
requirements, as well as stronger enforcement of those requirements, are advised. Extra
support and guidance may also be given to learners in the form of an advanced
vocabulary course. It would be wrong to expect vocabulary to take care of itself.
Notes
1 “Word” refers to “word family” throughout the text.
Acknowledgements
I am grateful to the students of the ELT Department of the Faculty of Education at Uludag
University, Turkey, for giving their time to provide the data for this study.
References
Alderson, J.C. (2005). Diagnosing Foreign Language Proficiency: The Interface Between
Learning and Assessment. London: Continuum.
Beglar, D. (2009). A Rasch-based validation of the Vocabulary Size Test. Language
Testing, 26(4), 1-22.
Blum, S. & Levenston, E. A. (1978). Universals of lexical simplification. Language Learning,
28, 399-415.
Cameron, L. (2002). Measuring vocabulary size in English as an additional language.
Language Teaching Research, 6(2), 145-173.
Chung, T.M., & Nation, P. (2003). Technical vocabulary in specialized texts. Reading in a
Foreign Language, 15(2), 103-116.
Cobb, T. http://www.lextutor.ca/tests/levels
Cobb, T. (1995). Imported tests: Analysing the task. Paper presented at TESOL (Arabia),
Al-Ain, United Arab Emirates.
Cobb, T., & Horst, M. (2000). Vocabulary sizes of some City University students. City
University (HK) Journal of Language Studies, 1, 59-68.
Also retrievable at: http://www.lextutor.ca/cv/index.html#publications
Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213-238.
Hazenberg, S., & Hulstijn, J.H. (1996). Defining a minimal receptive second-language
vocabulary for non-native university students: An empirical investigation. Applied
Linguistics, 17(2), 145-163.
Horst, M. (2005). Learning vocabulary through extensive reading: A measurement study.
The Canadian Modern Language Review, 61(3), 355-382.
Horst, M., Cobb, T., & Meara, P. (1998). Beyond a clockwork orange: Acquiring second
language vocabulary through reading. Reading in a Foreign Language, 11, 207-223.
Hyland, K. & Tse, P. (2007). Is there an “Academic Vocabulary”? TESOL Quarterly, 41(2),
235-253.
Laufer, B. (1989). What percentage of lexis is essential for comprehension? In C. Lauren,
& M. Nordman (Eds.), From Humans Thinking to Thinking Machines (pp. 316-323).
Clevedon, UK: Multilingual Matters.
Laufer, B. (1991). How much lexis is necessary for reading comprehension? In P.J.L.
Arnaud, & H. Bejoint (Eds.), Vocabulary in Applied Linguistics (pp. 126-132).
Basingstoke: Macmillan.
Laufer, B. (1992). Reading in a foreign language: How does L2 lexical knowledge interact
with the reader’s general academic ability? Journal of Research in Reading, 15, 95-103.
Laufer, B. (1998). The development of passive and active vocabulary in a second
language: Same or different?. Applied Linguistics, 19(2), 255-271.
Laufer, B., & Paribakht, T.S. (1998). The relationship between passive and active
vocabularies: Effects of language learning context. Language Learning, 48(3), 365-391.
Laufer, B., & Nation, P. (1999). A vocabulary size test of controlled productive ability.
Language Testing, 16(1), 33-51.
Laufer, B., Elder, C., Hill, K., and Congdon, P. (2004). Size and strength: Do we need both
to measure vocabulary knowledge? Language Testing, 21(2), 202-226.
Lessard-Clouston, M. (2006). Breadth and depth: Specialized vocabulary learning in
theology among native and non-native English speakers. The Canadian Modern
Language Review, 63(2), 175-198.
Martinez, I.A., Beck, S.C., & Panza, C.B. (2009). Academic vocabulary in agriculture
research articles: A corpus-based study. English for Specific Purposes, 28, 183-198.
Meara, P. (1992). EFL Vocabulary Tests. Swansea: _lognostics.
Meara, P. (2010). EFL Vocabulary Tests (2nd ed.). Swansea: _lognostics.
Meijer, A. (2006). Second International Conference on Integrating Content and Language
in Higher Education. Journal of English for Academic Purposes, 5, 333-334.
Milton, J. (2007). Lexical profiles, learning styles and the construct validity of lexical size
tests. In H. Daller, J. Milton & J. Treffers-Daller (Eds.), Modelling and Assessing
Vocabulary Knowledge (pp. 47-58). Cambridge: Cambridge University Press.
Milton, J. (2009). Measuring Second Language Vocabulary Acquisition. Bristol:
Multilingual Matters.
Milton, J., & Meara, P. (1995). How periods abroad affect vocabulary growth in a foreign
language? ITL Review of Applied Linguistics, 107-108, 17-34.
Mondria, J. A., & Wit-de Boer, M. (1991). The effects of contextual richness on the
guessability and retention of words in a foreign language. Applied Linguistics, 12 (3),
249-267.
Nation, I.S.P. (1990). Teaching and Learning Vocabulary. Boston: Heinle & Heinle.
Nation, I.S.P. (2001). Learning Vocabulary in Another Language. Cambridge: Cambridge
University Press.
Nation, I.S.P. (2006). How large a vocabulary is needed for reading and listening? The
Canadian Modern Language Review, 63(1), 59-82.
Nation, I.S.P. http://www.victoria.ac.nz/lals/staff/paul-nation/nation.aspx
Nation, I.S.P. and Beglar, D. (2007). A vocabulary size test. The Language Teacher, 31(7),
9-12.
Nurweni, A., & Read, J. (1999). The English vocabulary knowledge of Indonesian
university students. English for Specific Purposes, 18(2), 161-175.
Read, J. (1988). Measuring the vocabulary knowledge of second language learners. RELC
Journal, 19(2), 12-25.
Read, J. (2000). Assessing Vocabulary. Cambridge, UK: Cambridge University Press.
Schmitt, N. (2010). Researching Vocabulary: A Vocabulary Research Manual.
Basingstoke, UK: Palgrave Macmillan.
Schmitt, N. and Meara, P. (1997). Researching vocabulary through a word knowledge
framework: Word associations and verbal suffixes. Studies in Second Language
Acquisition, 20, 17-36.
Schmitt, N., Schmitt, D. & Clapham, C. (2001). Developing and exploring the behaviour of
two new versions of the Vocabulary Levels Test. Language Testing, 18 (1), 55-88.
Sutarsyah, C., Nation, P., & Kennedy, G. (1994). How useful is EAP vocabulary for ESP? A
corpus based case study. RELC Journal, 25(2), 34-50.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and
comprehensible output in its development. In S. Gass, & C. Madden (Eds.), Input in
Second Language Acquisition (pp. 235-256). New York: Newbury House.
Wang, J., Liang, S.I., & Ge, G.C. (2008). Establishment of a medical academic word list.
English for Specific Purposes, 27, 442-458.
Ward, J. (1999). How large a vocabulary do EAP engineering students need? Reading in a
Foreign Language, 12(2), 309-323.
Webb, S. & Rodgers, M.P.H. (2009a). Vocabulary demands of television programs.
Language Learning, 59(2), 335-366.
Webb, S. & Rodgers, M.P.H. (2009b). The lexical coverage of movies. Applied Linguistics,
30(3), 407-427.
Xue, G. and Nation, I.S.P. (1984). A university word list. Language Learning and
Communication, 3, 215-229.