Research-based Principles of Vocabulary Teaching


Norbert Schmitt


Importance of Vocabulary

How important is vocabulary in first and second language use?


Vocabulary Size

How large a vocabulary size do you need to function in English?

a. 2,000 word families
b. 4,000 word families
c. 6,000 word families
d. 8,000 word families
e. 10,000 word families
f. 10,000+

It Depends!

Vocabulary Size (Word families)

Daily conversation

• 2,000 - 3,000 (minimum size which enables basic communication)
• 5,000 - 7,000 (size which enables conversation on a wide range of topics)

Read authentic texts

• 3,000 - 5,000 (begin to read a range of authentic texts)
• 8,000 - 9,000 (size which enables reading on a wide range of topics)

High / Low frequency Vocabulary

• Hi-frequency vocabulary (2,000 word families): TEACH
• Everything else = Low-frequency vocabulary? DISREGARD everything else

Mid-frequency Vocabulary (Schmitt & Schmitt, 2012)

• 3,000: Hi-frequency vocabulary
• 3,000 - 9,000: Mid-frequency vocabulary
• 9,000+: Low-frequency vocabulary

High / Mid / Low frequency Vocabulary

Band                        Size             Approach
Hi-frequency vocabulary     3,000            TEACH
Mid-frequency vocabulary    3,000 - 9,000    ADDRESS IN SOME WAY
Low-frequency vocabulary    9,000+           STRATEGIES
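The three bands can be sketched as a small lookup. This is only an illustration: the function name, the treatment of the exact boundary values (3,000 counted as high, 9,000 counted as mid), and the pairing of each approach label with a band (read from the slide's left-to-right layout) are assumptions, not something the slides specify.

```python
def frequency_band(rank: int) -> str:
    """Map a word family's frequency rank (1 = most frequent) to a band.

    Boundary handling is an assumption: rank 3,000 is treated as high
    and rank 9,000 as mid.
    """
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    if rank <= 3000:
        return "high"
    if rank <= 9000:
        return "mid"
    return "low"


# Approach labels as they appear on the slide; the pairing is assumed.
APPROACH = {
    "high": "TEACH",
    "mid": "ADDRESS IN SOME WAY",
    "low": "STRATEGIES",
}

for rank in (150, 4200, 12000):
    band = frequency_band(rank)
    print(rank, band, APPROACH[band])
```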

Where do these size requirement estimates come from anyway?


Fat City (95% coverage)

In December, to the delight of many __________ and the ______ of many doughnut lovers, the New York City Board of Health voted to ban artificial trans fats from restaurants, school cafeterias, pushcarts, and almost every other food-service establishment it oversees, which includes most everything except hospitals. Trans fats don’t occur naturally in the things people like but feel guilty eating, or at least not at high levels (there are small proportions in the fat in meat and dairy products). But artificial ones are plentiful in commercial foods, because they are easy to use, cheaper than natural fats, and keep practically forever.

Trans fats are made by pumping _______ gas into liquid fats usually in the presence of _____ so that they will remain solid at room temperature, like butter and ___; and they have the same wonderful properties in pie crusts, cookies, and cakes. Crisco, still _____ for solid shortening made by partial ___________ (of cottonseed oil), soon became the “______” choice for pie crust and fried chicken, making pastry almost as flaky and skin almost as crisp as ___ does.


Fat City (95% coverage)

In December, to the delight of many cardiologists and the dismay of many doughnut lovers, the New York City Board of Health voted to ban artificial trans fats from restaurants, school cafeterias, pushcarts, and almost every other food-service establishment it oversees, which includes most everything except hospitals. Trans fats don’t occur naturally in the things people like but feel guilty eating, or at least not at high levels (there are small proportions in the fat in meat and dairy products). But artificial ones are plentiful in commercial foods, because they are easy to use, cheaper than natural fats, and keep practically forever.

Trans fats are made by pumping hydrogen gas into liquid fats usually in the presence of nickel so that they will remain solid at room temperature, like butter and lard; and they have the same wonderful properties in pie crusts, cookies, and cakes. Crisco, still generic for solid shortening made by partial hydrogenation (of cottonseed oil), soon became the “sanitary” choice for pie crust and fried chicken, making pastry almost as flaky and skin almost as crisp as lard does.


The Truth About Beauty (98% coverage)

Cosmetics makers have always sold “hope in a jar” – creams and ______ that promise youth, beauty, sex appeal, and even love for the women who use them.

Over the last few years, the marketers at Dove have added some new-and-improved __________. They’re now promising self-esteem and cultural transformation.

Dove’s “Campaign for Real Beauty,” declares a press release, is “a global effort that is intended to serve as a starting point for societal change and act as a ______ for widening the definition and discussion of beauty.”

Along with its thigh-firming creams, self-tanners, and hair conditioners, Dove is peddling the crowd-pleasing notions that beauty is a media creation, that recognizing plural forms of beauty is the same as declaring every woman beautiful, and that self-esteem means ignoring imperfections.


The Truth About Beauty (98% coverage)

Cosmetics makers have always sold “hope in a jar” – creams and potions that promise youth, beauty, sex appeal, and even love for the women who use them.

Over the last few years, the marketers at Dove have added some new-and-improved enticements. They’re now promising self-esteem and cultural transformation.

Dove’s “Campaign for Real Beauty,” declares a press release, is “a global effort that is intended to serve as a starting point for societal change and act as a catalyst for widening the definition and discussion of beauty.”

Along with its thigh-firming creams, self-tanners, and hair conditioners, Dove is peddling the crowd-pleasing notions that beauty is a media creation, that recognizing plural forms of beauty is the same as declaring every woman beautiful, and that self-esteem means ignoring imperfections.

Coverage

Size Requirement

98-99% → 8,000-9,000 word families for reading
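Coverage here is the share of running words in a text that the reader knows. A minimal sketch of that calculation follows. It is a simplification: it matches word forms rather than word families, the tokenizer is a crude regular expression, and the function names and sample data are invented for illustration.

```python
import re


def coverage(text: str, known_words: set[str]) -> float:
    """Proportion of running words (tokens) in `text` found in `known_words`.

    Simplification: compares lower-cased word forms, not word families.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def unknown_per_100(cov: float) -> float:
    """Number of unknown running words per 100 at a given coverage level."""
    return (1.0 - cov) * 100


# Hypothetical learner vocabulary and sentence.
known = {"the", "cat", "sat", "on"}
print(coverage("The cat sat on the mat", known))
print(unknown_per_100(0.95), unknown_per_100(0.98))
```

Under this definition, 95% coverage leaves about one unknown word in every 20 running words, and 98% leaves about one in every 50.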

Schmitt, Jiang, and Grabe (2011)

[Figure: x-axis Vocabulary Coverage, 90% to 99%; y-axis scale 0 to 100; lines for Mean, +1 SD, and -1 SD]

Vocabulary Coverage vs. Listening Comprehension (van Zeeland & Schmitt, 2013)

Coverage

Size Requirement

95% → 2,000-3,000 word families for listening to spoken narratives

English Vocabulary Size of Foreign Learners (Laufer, 2000)

Country      Learners                 Vocabulary Size    Hours of Instruction
Japan        EFL University           2,000-2,300        800-1,200
China        English majors           4,000              1,800-2,400
Indonesia    EFL University           1,220              900
Oman         EFL University           2,000              1,350+
Israel       High school graduates    3,500              1,500
France       High school              1,000              400
Greece       Age 15, high school      1,680              660
Germany      Age 15, high school      1,200              400

How to Address Such Large Amounts of Vocabulary?

• Why not teach just the most ‘content-ful’ words?

• Technical Vocabulary

• ESP vocabulary

• Academic Support Vocabulary

• e.g. Academic Word List (Coxhead, 2000)


Developing LSP Word Lists

“Technical vocabulary ‘is dependent for a full appreciation of its meaning on the meaning of the other terms in the cluster of which it is a member.’”

(Godman and Payne, 1981: 37, in Coxhead and Nation, 2001)

An ESP Text (Wang, et al., 2008)

• Technical Vocabulary only

wounds wound healing. wounds wounds pressure ulcers leg ulcers. wounds clinical Pressure ulcers ischemia necrosis hospitalized mobility Leg ulcers etiologies. ulcers dysfunction backflow blood. blood macromolecules dermis, nutrients

+Medical AWL

Chronic wounds challenge wound healing. wounds involve area, incidence impacts. chronic wounds pressure ulcers leg ulcers. estimated affect clinical annually. ischemia necrosis, hospitalized mobility impaired. Leg ulcers variety etiologies. Venous dysfunction veins backflow venous blood. Venous blood macromolecules dermis barriers nutrients

A Complete Medical Text

Chronic wounds represent a different kind of challenge for wound healing. These wounds do not usually involve a large surface area, but they have a high incidence in the general population and thus have enormous medical and economic impacts. The most common chronic wounds include pressure ulcers and leg ulcers. In the United States alone, these wounds are estimated to affect more than 2 million people with total clinical treatment costs as high as $1 billion annually. Pressure ulcers, characterized by tissue ischemia and necrosis, are common among patients in long-term care settings, but patients hospitalized for short-term care settings are also at risk if mobility is impaired. Leg ulcers can have a variety of etiologies. Venous ulcers are the most common, often resulting from dysfunction of valves in veins of the lower leg that normally prevent the backflow of venous blood. Venous congestion leads to leakage of blood and macromolecules into the dermis, which can act as physical barriers to diffusion of oxygen and nutrients from the

Need Foundation Vocabulary

• So academic and technical word lists are useful, but cannot replace the need for learners to have a solid foundation of high- and mid-frequency vocabulary in place
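The ESP demonstration above strips a text down to the words on a given list. A rough sketch of that filtering step is shown here. The word list is a tiny invented sample, not the list used in the slides, and the function name is hypothetical.

```python
import re


def keep_listed_words(text: str, word_list: set[str]) -> str:
    """Return only the tokens of `text` found in `word_list`, plus '.' and ','."""
    kept = []
    for token in re.findall(r"[A-Za-z-]+|[.,]", text):
        if token in {".", ","} or token.lower() in word_list:
            kept.append(token)
    return " ".join(kept)


# Hypothetical mini word list, for illustration only.
technical = {"wounds", "ulcers", "ischemia", "necrosis"}
sentence = "Pressure ulcers, characterized by tissue ischemia and necrosis, are common."
print(keep_listed_words(sentence, technical))
```

Running the filter with a technical list alone, and then with a technical list plus high- and mid-frequency words, gives a quick sense of how much of a passage each list leaves readable.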

How to Address Such Large Amounts of Vocabulary?

• These are clearly large numbers of word families to learn

• Long-term goal

• Most teachers do not have the time, expertise, or opportunity to organize large amounts of vocabulary over a period of time

• Most will rely on textbooks to provide the principled selection and instruction of vocabulary


• A single textbook cannot do the job (even a vocabulary textbook)

• So are textbook series up to the task?


Chilean Case Study

(Diaz Berrocal)

• Ministry sets vocabulary targets for 8 years of ELT schooling

• Analyzed mandatory books for the 8 years (5th-12th grades)

• Counted number of word families in books

• Frequency analysis of those word families

• How much recycling was there?

Ministry Vocabulary Targets

Grade    Comprehended    New Learned
5th      250             250*
6th      500             250*
7th      800             300
8th      1,200           400
9th      1,500           300
10th     2,000           500
11th     2,500           500
12th     3,000           500

Textbook Vocabulary Load (without proper nouns)

Grade    Comprehended    Textbooks
5th      250*            1,086
6th      500*            1,142
7th      800             1,479
8th      1,200           1,568
9th      1,500           2,027
10th     2,000           2,565
11th     2,500           2,557
12th     3,000           2,696


Frequency of Words in Textbooks: Shared with COCA 3000

Grade    Without proper nouns    With proper nouns
5th      67%                     83%
6th      70%                     86%
7th      68%                     83%
8th      67%                     83%
9th      62%                     82%
10th     58%                     76%
11th     60%                     77%
12th     58%                     76%

All <95%

Recycling of 1st 3,000 Words

Former Texts       New Text    # Recycled    % Recycled
5                  6           632           46%
5+6                7           833           47%
5+6+7              8           1,053         56%
5+6+7+8            9           1,285         50%
5+6+7+8+9          10          1,575         50%
5+6+7+8+9+10       11          1,704         54%
5+6+7+8+9+10+11    12          1,807         54%
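One way a recycling count of this kind could be computed is sketched below. The slides do not say what denominator the study used for the percentage, so this version assumes it is the share of the new book's word families that already appeared in any earlier book. The function name and the sample data are invented.

```python
def recycling(former_texts: list[set[str]], new_text: set[str]) -> tuple[int, float]:
    """Count word families in `new_text` that occur in any earlier text.

    Returns (number recycled, percent of the new text's families recycled).
    The choice of denominator is an assumption, not taken from the study.
    """
    seen = set().union(*former_texts)
    recycled = new_text & seen
    percent = 100 * len(recycled) / len(new_text) if new_text else 0.0
    return len(recycled), percent


# Hypothetical word families for two earlier books and one new book.
book5 = {"house", "school", "run"}
book6 = {"school", "friend", "read"}
book7 = {"house", "read", "travel", "science"}
print(recycling([book5, book6], book7))
```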


Chilean Case Study

Conclusions

• Textbooks do not match Ministry goals

• No obvious approach to vocabulary selection or recycling

• Ministry gives size goals, not specific word lists

• Publishers given no guidance as to what words to use

• Different publishers do not liaise with each other to build a coherent program


Vocabulary Knowledge is a Complex Construct

What Does It Mean to Know A Word?

Form
• Spoken form
• Written form
• Word parts

Meaning
• Form and meaning
• Concept and referents
• Associations

Use
• Grammatical functions
• Collocations
• Constraints on use (register, frequency…)

(Nation, 2001: 27)


Lexical Organization

• “Vocabulary size is not a feature of individual words: rather it is a characteristic of the test taker’s entire vocabulary.”

(Meara and Wolter, 2004: 87)

• Size is a feature of the overall lexicon


Lexical Organization

• Nature of the Lexicon must be connected with vocabulary knowledge

• Better connected and more highly organized lexicons should relate to more vocabulary knowledge


Automaticity

• Should also lead to faster speed of access and use

• Fluency


How to Facilitate this Complex Learning for Large Numbers of Words?

• Incidental Learning

• Explicit Intentional Learning


Incidental Learning

• Many practitioners believe that all necessary vocabulary can be learned incidentally simply by being exposed to, and by using, language


Incidental learning does occur in L2 (Reading)

• Do Things Fall Apart? (Pellicer-Sánchez and Schmitt, 2010)

• Nigerian language Ibo

• Spelling recognition: 2-4 exposures = 16%; 10-17 exposures = 85%

• Word class recall: 2-4 exposures = 7%; 10-17 exposures = 54%

• Meaning recognition: 2-4 exposures = 33%; 10-17 exposures = 80%

• Meaning recall: 2-4 exposures = 5%; 10-17 exposures = 48%


Incidental learning does occur in L2 (Listening)

(Van Zeeland & Schmitt, 2013)


Problem: Incidental learning is limited by amount of exposure

• It takes at least 8-10 reading exposures to develop an initial form-meaning link and more for meaning recall knowledge (even more for listening)

• Other word knowledge types (e.g. collocation, register, derivative forms) will likely take many more exposures

• Most L2 learners do not read enough to ensure this number of repetitions (Cobb, 2007)

• SO incidental learning is useful, but not sufficient


Intentional learning

• Virtually all research shows that intentional learning with an explicit focus on the target linguistic features results in learning that is

– Stronger

– More durable

– More consistent among learners

• Productive mastery seems to come mainly from productive engagement


Is Knowledge of the Form-Meaning Link Enough?

• Learning a word might require more than just learning its meaning and form

• For receptive use, perhaps a meaning-recall level of mastery might suffice

• See/hear word form and retrieve/recall meaning

• All of the other ‘contextual’ word knowledge aspects are already in the discourse/text


Various Kinds of ‘Word Knowledge’ are Learned Differently

• But for productive use, learners have a concept in their head, but must produce the appropriate lexical form

• This requires most (all?) of the ‘contextual’ kinds of word knowledge

• These contextual aspects (e.g. collocation, connotation, register constraints) are more difficult to teach, and probably require large amounts of exposure to acquire incidentally

Different Types of Exposure and Learning

• Explicit Intentional Learning

• Can focus on most useful (frequent) words

• Stronger learning

• Mainly useful for ‘teachable’ word knowledge aspects like form-meaning, word class, affixes

• Hard to cover enough words

• Hard to build in enough recycling

Different Types of Exposure and Learning

• Incidental Learning

• Get exposure to a wide variety of words

• A way to get more recycling

• Provides context for learning ‘contextual’ types of word knowledge

• Incidental learning is useful, but the uptake is slow and inconsistent

Different Types of Exposure and Learning

• Intentional and incidental learning are complementary

• They add different things to vocabulary knowledge

• They need to be combined in any principled vocabulary program


Formulaic Language

• My discussion up until now has covered only single words, lemmas, or families

• There is a large amount of lexical patterning in language

• Formulaic language needs to be brought into the discussion of vocabulary use, acquisition, and pedagogy


What is Formulaic Language?

• Recurrent multi-word lexical items that have a single meaning or function (Schmitt, 2010)

• It is an umbrella term for a number of formulaic categories

– Idioms

– Collocations

– Phrasal verbs

– Lexical bundles

– Lexical phrases

– Phrasal expressions

– etc

Learner Use of Formulaic Language

• Learners don’t use many idioms

• Learners do use many high-frequency collocations (nice day)

• Learners don’t use many lower-frequency but tightly-bound collocations (preconceived notions)


• But learners often do not use the collocations they know appropriately

• Inappropriate collocation use is a leading problem in learner language

• Learners often use words with their correct meanings, but do not understand the correct context of use (collocation, register, frequency)


• Learners consistently overestimate their comprehension of reading texts that contain formulaic sequences that they either fail to identify or misunderstand, even at high levels of proficiency

(Martinez and Murphy, 2011)

Learner Acquisition of Formulaic Language

Boers & Lindstromberg (ARAL, 2012) reviewed acquisition research:

– Learning from exposure requires repetition (frequency)

– Intentional learning produced better results

– Raising awareness of formulaic language is not a powerful accelerator of learning

– Knowing the component words makes learning a formulaic sequence easier

– Providing learning strategies (dictionaries, concordance lines) produced mixed results

Pedagogical Implications

• Meunier review (ARAL, 2012)

• If formulaic sequences are so important:

• They need to be included in teaching syllabuses and materials

• We can’t assume they will just be learned from exposure

• They need to be incorporated into language tests to a greater extent


• But what formulaic sequences?

• In order to incorporate formulaic sequences into their teaching and testing, most practitioners need a list of formulaic sequences to address

• But what criteria to use?

Formulaic Framework (Martinez, 2013)

Frequency per 100 million words (BNC), from infrequent to frequent:

• take credit: 27
• take issue: 121
• take time: 910
• take place: 10,556

Transparency:

• Transparent: take credit, take time
• Opaque: take issue, take place

Combined framework (numbers in parentheses label the quadrants):

              Transparent        Opaque
Frequent      take time (2)      TAKE PLACE (1)
Infrequent    take credit (4)    take issue (3)
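The quadrant numbering can be sketched as a function of two yes/no judgments. Several things here are assumptions for illustration: the cutoff of 500 occurrences per 100 million separating frequent from infrequent, the reading of the quadrant numbers as a selection order, and the function and variable names. The frequencies and transparency judgments are the ones given on the slides.

```python
def quadrant(frequent: bool, opaque: bool) -> int:
    """Return the quadrant number (1-4) for a formulaic sequence."""
    if frequent and opaque:
        return 1
    if frequent:
        return 2
    if opaque:
        return 3
    return 4


# Frequencies per 100 million (BNC) and opacity judgments from the slides.
SEQUENCES = {
    "take credit": (27, False),
    "take issue": (121, True),
    "take time": (910, False),
    "take place": (10_556, True),
}

# Hypothetical cutoff between infrequent and frequent; not from the source.
FREQUENT_CUTOFF = 500

ranked = sorted(
    SEQUENCES,
    key=lambda phrase: quadrant(SEQUENCES[phrase][0] >= FREQUENT_CUTOFF,
                                SEQUENCES[phrase][1]),
)
print(ranked)
```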

PHRASE List

(Martinez & Schmitt, 2012)

• PHRASE List (PHRASal Expressions)

• Some formulaic sequences are very frequent

• 500 phrasal expressions within 5,000 BNC frequency level

• Based on same frequency as individual BNC words

• Phrases which are opaque and not easily guessable (1)

PHRASE List

• LEAD TO (CAUSE) 13,555 (1st 1,000 frequency level)

Excessive smoking can lead to heart disease.

• HAVE GOT TO (must) 12,270 (2nd 1,000 frequency level)

You have got to try this salad.

• BY THE TIME (when) 3,607 (3rd 1,000 frequency level)

By the time dinner started there were none left.

Integrated    Phrase                Frequency            Spoken     Written    Written     Example
List Rank                           (per 100 million)    general    general    academic
107           HAVE TO               83,092               ***        **         *           I exercise because I have to.
463           GOING TO (FUTURE)     28,259               ***        **         x           I’m going to think about it.
894           WAS TO                14,366               x          ***        **          The message was to be transmitted worldwide.
5502          MAKE UP ONE’S MIND    788                  ***        **         x           You’d better make up your mind.
5503          AT WORK               787                  x          ***        ***         There were strange forces at work.

Download Research Articles

• Most Norbert Schmitt (& co-author) publications are available for free download at his personal website: www.norbertschmitt.co.uk