Session 2
• Vocabulary size and vocabulary profiles of students
• An indicator of proficiency level
• Word frequencies
• Which words are most frequently used in the English language?
• What kind of words should learners focus on?
• Word lists
• Computer applications for assessing and learning vocabulary

Warming Up
Number of words in the English language:
Number of words a university-educated native English speaker knows:
Number of words that you know:
Vocabulary size needed for basic communication (i.e., to express what one wants to express, however simply):
Vocabulary size needed for reading (understanding any written text):
Can you think of a good way to measure people’s vocab sizes?

New curriculum proposed by EDB
Columns: Key Stage (KS); Stage target (no. of word families); Cumulative target (no. of word families)
KS1 (Pri 3)
KS2 (Pri 6)
KS3 (Sec 3)
KS4 (Sec 6)

How many words do you know? (Measuring vocab size)
1. Rough estimate of vocab size (by Goulden, Nation & Read, 1990); see Schmitt, 2000, pp. 7-8 for a more sophisticated version of this test
2. Paul Nation’s Vocabulary Levels Test measures the number of words that are known at various levels of frequency
How can you interpret your score?

Number of words in the English language:
• 1 to 2 million words
Goulden, Nation & Read (1990) estimated that Webster’s Third International Dictionary (published in 1961) contained around 267,000 entries and 54,000 word families. The latest edition of Webster's Third New International Dictionary, unabridged, published in 2000, is believed to contain over 472,000 entries.
Number of words a university-educated native English speaker knows:
• 20,000 word families

Vocabulary size needed for basic communication (i.e., to express what one wants to express, however simply):
• 2,000 most frequent words
• West’s (1953) General Service List: 95% coverage of informal spoken English (but only 80% coverage of written English)

Vocabulary size needed for reading (understanding any written text):
• Students need to know 95%-98% of the words in a text in order to understand the text (5,000 words: about 90% coverage, depending on the kind of text being read)

New curriculum proposed by EDB
Key Stage (KS)    Stage target    Cumulative target
KS1 (Pri 3)       1000            1000
KS2 (Pri 6)       1000            2000
KS3 (Sec 3)       1500            3500
KS4 (Sec 6)       1500            5000

Vocabulary size and text coverage
Source: Carroll, Davies & Richman, 1971 (as cited in Nation, 2001)

Recommended sequence for learners
1. First 2,000 words: 80% of text coverage
2. First 2,000 words + AWL: 90% of text coverage of a text that a student would typically read
3. First 2,000 words + AWL + Technical vocab: 95% of text coverage of a text that a student would typically read
4. First 2,000 words + AWL + Technical vocab + most frequently used prefixes, roots and suffixes

Strategies for learning words of different frequency levels
5,000 Word Level (general vocabulary)
• Training in guessing words from context
• Wide general reading: novels, newspapers and magazines
• Intensive reading of a variety of texts
• Advanced English Vocabulary workbooks
University Word Level (specialised academic vocabulary)
• Learn the words on the University Word List (Nation 1990) and Academic Word List (Coxhead, 2000)
• Intensive reading of university texts
10,000 Word Level (a wide, general vocabulary)
• Activities similar to the 5,000 word level, combined with learning prefixes and roots

What do you think are the ten most frequently used words in English?
10 most frequently used words
1.   2.   3.   4.   5.   6.   7.   8.   9.   10.
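The ideas of word frequency and text coverage above can be tried out on any text. The following is a minimal sketch, not the method used in the cited studies: it counts word forms (not word families), the helper names `tokenize` and `coverage_of_top_n` are made up for this illustration, and the sample sentence is invented.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into running words (tokens)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage_of_top_n(tokens, n):
    """Share of running words accounted for by the n most frequent word forms."""
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(tokens)


# Invented sample text, used only to show the calculation.
sample = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)

# The three most frequent word forms with their counts.
print(Counter(tokens).most_common(3))          # [('the', 4), ('sat', 2), ('on', 2)]

# Coverage given by the two most frequent forms: (4 + 2) / 13 running words.
print(round(coverage_of_top_n(tokens, 2), 2))  # 0.46
```

Running the same two functions on a longer text, such as a newspaper article, shows how quickly a small number of very frequent words comes to cover a large share of the running words.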
Frequency lists
West’s (1953) GSL (2,000 words)
How to make reading tests more comprehensible for learners
• Replacing difficult vocabulary with the GSL words, selected according to:
   • Frequency
   • Universality
   • Range
   • Usefulness

Discussion (p. 9):
Other than frequency, what criteria can we use in selecting words for learners?

Other than frequency?
• Relevance to students’ immediate and future needs
• Usefulness
• Frequently used (frequency)
• Used in a wide range of topics/domains (have a wide range)
• Related to personal experience (relevance to learners)
• Combinability (collocation) and word formation
   • blonde + girl / rising + prices / torrential + rain
   • Words made up of some familiar word parts (prefixes, roots, suffixes)

Local Situation: Sources of input for the EMB wordlists
Words taken from:
• General Service List (GSL)
• Most frequent words in British National Corpus (BNC)
• Academic Word List (AWL)
Teacher representatives then further selected words based on their judgment according to:
• themes recommended in the Government’s Curriculum Guides
• vocabulary content of approved textbooks
• other guidelines set by the research team (e.g. whether words are used in Hong Kong, ease of learning, etc.)

Sources for vocabulary lists
• GSL: GENERAL words; the classic list of the most frequent 2000 words
• BNC: 100-million-word collection of written and spoken texts
• AWL: ACADEMIC words; 570 words that occur frequently in academic texts across disciplines

Discussion
What are the benefits of using word lists (such as GSL, AWL)?
• To design a curriculum
• To decide which texts to use with students
• To decide which words in a text would cause difficulty to students

Tom Cobb’s Compleat Lexical Tutor (http://www.lextutor.ca/)
• Test (to get receptive and productive tests of various word levels)
• List_Learn (to learn words at various levels with an online concordancer and dictionary; to get lists of words from 1k to 20k level and AWL and UWL)
• Vocab Profiler (to see the vocab profile of one’s writing / to predict “readability” of a text for learners)
   • % of words at 2000 word level
   • % of academic words
   • % of words from beyond the most frequent 2000
   • type-token ratio
• Corpus-based Range checks whether a word is used more frequently in spoken or written English in the Brown Corpus. It also checks the range of a word in any of the 15 sub-corpora of the Brown Corpus, i.e. in which and how many of the 15 sub-corpora a word can be found. The sub-corpora cover a wide range of domains such as press, academic, and fiction.
• Text-based Range allows you to upload up to 25 texts of your own and check the range and frequency of a word in these 25 texts.

Type and Token
How many types are there in the following sentence?
How many tokens (running words) are there in the following sentence?
We need a vocabulary to talk about vocabulary.

Type-Token Ratio (a measure of “Lexical Richness”; sometimes loosely called “Lexical Density”)
How many types and tokens do you see here?
Watch out! I said watch out!
• 4 types
• 6 tokens / running words
• Type-token ratio: 4/6 (0.67)

Text written by a local HK 12-year-old
I have a rubber, an old, small rubber. Although it is so small that I can not use it anymore, I still keep it carefully in my drawer as it is so important for me. That is a long, long time that I have my rubber. Four years ago, when I was still an eight-years-old child, my parents bought me a rubber as my birthday present. I put it into my pencil-box and brought it to school everyday. We had an interesting game in the past.
We used our rubber to play with in the game. We pushed our rubber one by one and tried not to be pushed out at the desk by another rubber. We pushed and pulled our rubbers, soon our rubbers became older and smaller one day than one day.

Source: Arthur McNeill’s (2004) “VocabProfile” of a student’s text
First 1000 words: 88%
Second 1000 words: 12%
Academic words (AWL): 0%
Off-list words (less frequent words): 0%
75 types / 137 tokens = 0.55

Examples from Hong Kong sample
• Repetition of key words (need for lexical substitution: synonyms, superordinates / hyponyms, and pronoun substitution)
• The need for lexical enrichment (adjectives and adverbs)

Substitutes for “rubber”
• It (pronoun)
• One (pronoun)
• Eraser (synonym)
• Item of stationery (superordinate)
• Tool? (superordinate)

I have a rubber, an old, small one. Although it is so small that I can not use it anymore, I still keep it carefully in my drawer as it is so important for me. That is a long, long time that I have my favourite chosen possession. Four years ago, when I was still an eight-years-old child, my parents bought it for me as my birthday present. I put it into my pencil-box and brought it to school everyday. We had an interesting game in the past. We used our eraser to play with in the game. We pushed our stationery one by one and tried not to be pushed out at the desk by another opponent. We pushed and pulled our weapons, soon our rubbers became older and smaller one day than one day.

Text written by a local HK 16-year-old under exam conditions
Many students strive for academic excellency, but what is the motivation behind their hardwork? In this essay, I am going to explore the different aspects of learning, and analyse the pros and cons of each motivating factor. The hunger for knowledge and wisdom can motivate students to learn. They hope to widen their horizons through reading, watching educational programs, travelling and other ways.
To them, the world is a fascinating place, full of wonders and mysteries to unravel. Their love of learning motivates them to seek knowledge in all areas, from science and mathematics to arts.

Source: McNeill’s (2004) “VocabProfile” of a student’s text
First 1000 words: 73%
Second 1000 words: 6%
Academic words (AWL): 10.5%
Off-list words (less frequent words): 10.5%
69 types / 96 tokens = 0.72

Lexical enrichment
I was sweating. Ms Ip neared my table and put the exam paper in front of me. I closed my eyes and opened them a fraction of an inch. There, on top of the paper, was a 33. My heart sank. Then my teacher took away the paper and put another one in front of me. I took it and saw an 88 in the mark box. The first paper belonged to my neighbor, Sally.

Lexical enrichment
I was sweating [adv]. Ms Ip neared my table [adv] and put the [adj] exam paper in front of me. I [adv] closed my eyes and [adv] opened them a fraction of an inch. There, on top of the [adj] paper, was a 33. My heart sank [adv]. Then my teacher [adv] took away the paper and put another one in front of me. I [adv] took it and saw an 88 in the mark box. [???] The first paper belonged to my neighbor, Sally.

Local research on vocabulary size and vocabulary knowledge
Littlewood & Liu (1996)
Barber (1999, as cited in Fan, 2001) found a positive correlation between students’ vocabulary knowledge and their HKCEE results
Cobb & Horst (2000) (post-session reading)
• 40 first-year HKU/CUHK students knew around 3,500 words
• CityU students knew the most basic 2000 words; also performed well at the 3000-word level
• But low scores on UWL level
• Vocabulary growth over a period of 6 months: No
Fan (2001) (post-session reading)
• Vocabulary scores positively correlate with language proficiency
• Students from Chinese-medium schools and those with E in HKAL need help with vocabulary
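The type-token ratios and band percentages reported for the two student texts can be approximated with a short script. This is a rough sketch under stated assumptions, not the VocabProfile algorithm itself: the function names `type_token_ratio` and `vocab_profile` are invented here, the word-list band in the example is a tiny hypothetical stand-in for the real first-1000, second-1000 and AWL lists, and the simple tokenizer (which, for instance, splits hyphenated words) may give counts that differ slightly from those on the slides.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into running words (tokens)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(text):
    """Return (number of types, number of tokens, types / tokens)."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    types = set(tokens)  # each distinct word form counts once
    return len(types), len(tokens), len(types) / len(tokens)


def vocab_profile(text, bands):
    """Percentage of tokens falling in each band of a word-list dictionary.

    `bands` maps a band name to a set of lowercase words. Bands are checked
    in the order given; tokens found in no band are counted as 'off-list'.
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    counts = {name: 0 for name in bands}
    counts["off-list"] = 0
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                counts[name] += 1
                break
        else:
            counts["off-list"] += 1
    return {name: 100 * count / len(tokens) for name, count in counts.items()}


# The example from the Type-Token Ratio slide.
n_types, n_tokens, ratio = type_token_ratio("Watch out! I said watch out!")
print(f"{n_types} types / {n_tokens} tokens = {ratio:.2f}")  # 4 types / 6 tokens = 0.67

# Hypothetical, very small band used only to show the call pattern.
toy_bands = {"first 1000": {"i", "have", "a"}}
print(vocab_profile("I have a rubber", toy_bands))
```

With real word lists supplied as the bands, the same call gives a profile in the style of the tables above, which makes it easy to compare a learner's text before and after lexical substitution or enrichment.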