Session 2

• Vocabulary size and vocabulary profiles of students
• Often used in proficiency or entrance tests, as an indicator of proficiency level
• Word frequencies
• Which words are more frequently used in the English language?
• What kind of words should learners focus on?
• Some well-known word lists
• Computer applications for assessing vocabulary size and profile

Warming Up

• Number of words in the English language:
• Number of words a university-educated native English speaker knows:
• Number of words that you know:
• Vocabulary size needed for basic communication (i.e., to express what one wants to express, however simply):
• Vocabulary size needed for reading (understanding any written text):
• Can you think of a good way to measure people’s vocab sizes?

New curriculum proposed by EDB

Key Stage (KS)    Stage target (no. of word families)    Cumulative target (no. of word families)
KS1 (Pri 3)
KS2 (Pri 6)
KS3 (Sec 3)
KS4 (Sec 6)

Number of words in the English language:
• 1 to 2 million words (Schmitt, 2000)
• Goulden, Nation & Read (1990) estimated that Webster’s Third International Dictionary (published in 1961) contained around 267,000 entries and 54,000 word families.
Number of words a university-educated native English speaker knows:
• 20,000 word families

Vocabulary size needed for basic communication (i.e., to express what one wants to express, however simply):
• The 2,000 most frequent words
• West’s (1953) General Service List: 95% coverage of informal spoken English (but only 80% coverage of written English)

Vocabulary size needed for reading (understanding any written text):
• Students need to know 95%-98% of the words in a text in order to understand it
• 5,000 words give about 90% coverage, depending on the kind of text being read

New curriculum proposed by EDB

Key Stage (KS)    Stage target (no. of word families)    Cumulative target (no. of word families)
KS1 (Pri 3)       1000                                   1000
KS2 (Pri 6)       1000                                   2000
KS3 (Sec 3)       1500                                   3500
KS4 (Sec 6)       1500                                   5000

Vocabulary size and text coverage
Source: Francis and Kucera, 1982 (as cited in Nation & Waring, 1997, pre-session 2 reading)

How many words do you know? (Measuring vocab size)
• Take a dictionary and count the number of words that you know on 10 pages chosen at random.
• Divide the total by 10 and multiply by the number of pages in the dictionary.
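The dictionary-sampling estimate described above is simple arithmetic, and a short Python sketch may make it concrete. The function names, the page counts and the 1,500-page dictionary are all hypothetical, chosen only for illustration; the sketch follows the simple 10-page version of the method, not the more refined procedures in the literature.

```python
import random


def pick_sample_pages(total_pages, n_pages=10, seed=None):
    """Choose n_pages distinct page numbers at random (hypothetical helper)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_pages + 1), n_pages))


def estimate_vocab_size(known_per_page, total_pages):
    """Average the known words per sampled page, then scale to the whole dictionary."""
    if not known_per_page:
        raise ValueError("need at least one sampled page")
    mean_known = sum(known_per_page) / len(known_per_page)
    return round(mean_known * total_pages)


# Hypothetical data: words the learner recognised on each of 10 sampled pages.
counts = [12, 9, 15, 11, 8, 14, 10, 13, 9, 9]
# 110 known words / 10 pages = 11 per page; 11 x 1,500 pages = 16,500.
print(estimate_vocab_size(counts, total_pages=1500))
```

The estimate depends heavily on the size of the dictionary and on what is counted as a "word", so it is best read as a rough indication rather than a measurement.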
• The dictionary-sampling method is from Goulden, Nation & Read (1990); a more sophisticated version is given in Schmitt (2000).
• Paul Nation’s Vocabulary Levels Test measures the number of words that are known at various levels of frequency.

Recommended sequence for learners

1. First 2,000 words: 80% of text coverage
2. First 2,000 words + AWL: 90% of text coverage of a text that a student would typically read
3. First 2,000 words + AWL + technical vocab: 95% of text coverage of a text that a student would typically read
4. First 2,000 words + AWL + technical vocab + most frequently used prefixes, roots and suffixes

Strategies for learning words of different frequency levels

5,000 Word Level (general vocabulary)
• Training at guessing words in context
• Wide general reading: novels, newspapers and magazines
• Intensive reading of a variety of texts
• Advanced English vocabulary workbooks

University Word Level (specialised academic vocabulary)
• Learn the words on the University Word List (Nation, 1990) and Academic Word List (Coxhead, 2000)
• Intensive reading of university texts

10,000 Word Level (a wide, general vocabulary)
• Activities similar to the 5,000 word level,
• combined with learning prefixes and roots

Receptive Knowledge vs. Productive Knowledge

• Tang (2007) found that primary five students in Hong Kong knew about 40% of the most frequent 5,000 words.
• However, their limited use of vocabulary in writing suggested that more effort is needed to convert receptive knowledge into productive knowledge.

Tang, E. (2007). An exploratory study of the English vocabulary size of Hong Kong primary and junior secondary school students. The Journal of Asia TEFL, 4(1), 125-144.

What do you think are the ten most frequently used words in English?

10 most frequently used words
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
West’s (1953) GSL (2000 words)

• Based on a 5 million word written corpus
• % given for different meanings and parts of speech
• How to make reading texts more comprehensible for learners: replace difficult vocabulary with GSL words, which were selected according to:
  • Frequency
  • Universality (words used in different countries)
  • Range (words used to talk about a range of topics)
  • Usefulness (words used to define other words)
• Despite its age, later studies have confirmed that it provides an average of 82% text coverage (Hirsh & Nation, 1992; Sutarsyah, Nation & Kennedy, 1994)

Frequency Lists

Nation and Waring (1997) (pre-session 2 reading) suggest that a list of high frequency words used in a course should provide:
• Core meaning of the word
• Different word forms and parts of speech (word family)
• Variations of meaning (for polysemous words)
• Frequency information of the different meanings & uses
• Collocations
• Restrictions on use of the word

Local Situation: Sources of input for the EDB wordlists

Words taken from:
• General Service List (GSL)
• Most frequent words in British National Corpus (BNC)
• Academic Word List (AWL)

Teacher representatives then further selected words based on their judgment according to:
• themes recommended in the Government’s Curriculum Guides
• vocabulary content of approved textbooks
• other guidelines set by the research team (e.g. whether words are used in Hong Kong, ease of learning, etc.)
Sources for vocabulary lists

• GSL: GENERAL words. Classic list of the most frequent 2000 words
• BNC: 100 million word collection from written and spoken texts (you can get BNC lists from http://www.lextutor.ca/list_learn/)
• AWL: ACADEMIC words. 570 words that occur frequently in academic texts across disciplines

Tom Cobb’s Compleat Lexical Tutor (http://www.lextutor.ca/)

• Test (to get receptive and productive tests of various word levels)
• List_Learn (to learn words at various levels with an online concordancer and dictionary; to get lists of words from 1k to 20k level and AWL and UWL)
• Vocab Profiler (to see the vocab profile of one’s writing / to predict “readability” of a text for learners)
  • % of words at 2000 word level
  • % of academic words
  • % of words from beyond the most frequent 2000
  • type-token ratio
• Corpus-based Range checks whether a word is used more frequently in spoken or written English in the Brown Corpus. It also checks the range of a word across the 15 sub-corpora of the Brown Corpus, i.e. in which and how many of the 15 sub-corpora the word can be found. The sub-corpora cover a wide range of domains such as press, academic, and fiction.
• Text-based Range allows you to upload up to 25 texts of your own and check the range and frequency of a word in these texts.

Type and Token

• How many types are there in the following sentence?
• How many tokens (running words) are there in the following sentence?

We need a vocabulary to talk about vocabulary.

Type-Token Ratio (also called “Lexical Richness” or “Lexical Density”)

How many types and tokens do you see here?

Watch out! I said watch out!

• 4 types
• 6 tokens / running words
• Type-token ratio: 4/6 (0.67)

Text written by a local HK 12-year-old

I have a rubber, an old, small rubber. Although it is so small that I can not use it anymore, I still keep it carefully in my drawer as it is so important for me. That is a long, long time that I have my rubber.
Four years ago, when I was still an eight-years-old child, my parents bought me a rubber as my birthday present. I put it into my pencil-box and brought it to school everyday. We had an interesting game in the past. We used our rubber to play with in the game. We pushed our rubber one by one and tried not to be pushed out at the desk by another rubber. We pushed and pulled our rubbers, soon our rubbers became older and smaller one day than one day.

Source: Arthur McNeill’s (2004) “VocabProfile” of a student’s text

First 1000 words                       88%
Second 1000 words                      12%
Academic words (AWL)                   0%
Off-list words (less frequent words)   0%
Type-token ratio                       75 types / 137 tokens = 0.55

Examples from Hong Kong sample

• Repetition of key words (need for lexical substitution: synonyms, superordinates / hyponyms, and pronoun substitution)
• The need for lexical enrichment (adjectives and adverbs)

Substitutes for “rubber”

• It (pronoun)
• One (pronoun)
• Eraser (synonym)
• Item of stationery (superordinate)
• Tool? (superordinate)

I have a rubber, an old, small one. Although it is so small that I can not use it anymore, I still keep it carefully in my drawer as it is so important for me. That is a long, long time that I have my favourite chosen possession. Four years ago, when I was still an eight-years-old child, my parents bought it for me as my birthday present. I put it into my pencil-box and brought it to school everyday. We had an interesting game in the past. We used our eraser to play with in the game. We pushed our stationery one by one and tried not to be pushed out at the desk by another opponent. We pushed and pulled our weapons, soon our rubbers became older and smaller one day than one day.

Text written by a local HK 16-year-old under exam conditions

Many students strive for academic excellency, but what is the motivation behind their hardwork? In this essay, I am going to explore the different aspects of learning, and analyse the pros and cons of each motivating factor.
The hunger for knowledge and wisdom can motivate students to learn. They hope to widen their horizons through reading, watching educational programs, travelling and other ways. To them, the world is a fascinating place, full of wonders and mysteries to unravel. Their love of learning motivates them to seek knowledge in all areas, from science and mathematics to arts.

Source: McNeill’s (2004) “VocabProfile” of a student’s text

First 1000 words                       73%
Second 1000 words                      6%
Academic words (AWL)                   10.5%
Off-list words (less frequent words)   10.5%
Type-token ratio                       69 types / 96 tokens = 0.72

Pedagogical Implications

Process Writing may be used to improve students’ lexical richness:
• Peer revision of writing drafts
• Insertion of adjectives and adverbs
• Activation of recently learnt vocabulary

Lexical enrichment

I was sweating. Ms Ip neared my table and put the exam paper in front of me. I closed my eyes and opened them a fraction of an inch. There, on top of the paper, was a 33. My heart sank. Then my teacher took away the paper and put another one in front of me. I took it and saw an 88 in the mark box. The first paper belonged to my neighbor, Sally.

Lexical enrichment

I was sweating [adv]. Ms Ip neared my table [adv] and put the [adj] exam paper in front of me. I [adv] closed my eyes and [adv] opened them a fraction of an inch. There, on top of the paper, was a 33. My heart sank. Then my teacher [adv] took away the paper and put another one in front of me. I took it [adv] and saw an 88 in the mark box. [Adv] the first paper belonged to my neighbor, Sally.

Lexical enrichment

I was sweating [heavily]. Ms Ip neared my table [unexpectedly] and put the [horrible] exam paper in front of me. I [immediately] closed my eyes and [slowly] opened them a fraction of an inch. There, on top of the paper, was a 33. My heart sank. Then my teacher [swiftly] took away the paper and put another one in front of me. I took it [without thinking] and saw an 88 in the mark box.
[Fortunately] the first paper belonged to my neighbor, Sally.

Discussion

What are the benefits of using word lists (such as GSL, AWL)?
• To design a vocabulary curriculum
• To decide which texts to use with students
• To decide which words in a text would cause difficulty to students

Local research on vocabulary size and vocabulary knowledge

Cobb & Horst (2000) (post-session reading)
• CityU students knew the most basic 2000 words and also performed well at the 3000-word level
• But they scored low at the UWL level
• Vocabulary growth over a period of 6 months: none

Fan (2001) (post-session reading)
• Vocabulary scores correlate positively with language proficiency
• Students from Chinese-medium schools and those with an E in HKAL need help with vocabulary
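
The type-token ratio and the coverage percentages reported in the vocabulary profiles earlier in this session can be illustrated with a minimal Python sketch. It is not how the Vocab Profiler on the Compleat Lexical Tutor is implemented: the function names and the tiny word list are hypothetical, and the sketch matches exact word forms, whereas real profilers typically group words into families and frequency bands.

```python
import re


def tokenize(text):
    """Lower-case the text and return its running words (tokens)."""
    return re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())


def type_token_counts(text):
    """Return (number of types, number of tokens, type-token ratio)."""
    tokens = tokenize(text)
    types = set(tokens)  # each distinct word form counts once
    ratio = len(types) / len(tokens) if tokens else 0.0
    return len(types), len(tokens), ratio


def coverage(text, word_list):
    """Percentage of running words in the text that appear in word_list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in word_list)
    return 100 * known / len(tokens)


# The two example sentences from the Type and Token slides.
types, tokens, ratio = type_token_counts("Watch out! I said watch out!")
print(types, tokens, round(ratio, 2))  # 4 6 0.67

types, tokens, ratio = type_token_counts(
    "We need a vocabulary to talk about vocabulary."
)
print(types, tokens)  # 7 8

# Hypothetical mini word list standing in for a frequency band.
mini_list = {"watch", "out", "i"}
print(round(coverage("Watch out! I said watch out!", mini_list), 1))  # 83.3
```

Because the ratio tends to fall as a text gets longer, ratios from texts of very different lengths (such as the 137-token and 96-token samples above) should be compared with caution.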