Experiencing Vocabulary Learning Using Small Language Corpora

Višnja Kabalin Borenić, Department of Business Foreign Languages, Faculty of Economics and Business, University of Zagreb
Sanja Marinov, Department of Foreign Languages and PE, Faculty of Economics, University of Split
Martina Mencer Salluzzo, Department of Languages and Culture, Vern – University of Applied Sciences, Zagreb

Abstract

This article examines university students' responses to a set of exercises based on authentic corpus material. It aims to add to the database of potential exercises derived directly from corpus material. The research involved 51 students of business and tourism who were asked to complete a variety of exercises derived from corpus material and record their impressions in a journal. Because they combine quantitative and qualitative data (students’ success rates and comments), our results provide reliable guidelines for the design of corpus-based exercises. The results revealed that some learners appreciate the benefits of corpus consultation while others find it too time consuming or demanding. On the whole, the respondents recognised the benefits of autonomous learning, intensive reading and context reconstruction. We found the method beneficial and practicable for intermediate and advanced level students, provided that it is introduced gradually.

Key words: small language corpus, corpus-based exercises, vocabulary, university students, journal

Introduction

The usefulness of corpus data for language teaching has long been recognised (Willis, 1993; Tribble and Jones, 1990; Krishnamurthy, 2001) and corpus-informed language teaching materials are now taken for granted. A modern course book, for example, is entirely corpus-informed (McCarthy, 2004) and all major publishers now provide corpus-based dictionaries (O'Keeffe et al., 2007: 17). Experimenting with the direct application of corpus material and corpus methods in language classrooms is, however, a relatively recent phenomenon.
Corpora and concordancing were introduced into the language-learning environment in 1969 (McEnery and Wilson, 1997: 12), but it was Tim Johns’ (1986) work and his idea of data-driven learning in the 1980s that spawned interest and further empirical research (Tribble and Jones, 1990; Stevens, 1995; Cobb, 1997). The tool, however, is not yet widely used in language classrooms, and more empirical research is needed to help disseminate the idea and encourage the use of corpus-driven activities. More importantly, such research should identify new types of task and new language items that can be presented in this way, so as to facilitate the application of corpus-driven activities in the classroom. This is exactly the aim of this paper: to provide examples of possible tasks that can be designed using a small corpus and to analyse how students react to them, in terms of both their ability to solve the set problems and their opinions of and attitudes towards the given type of exercise. In doing so we hope to bring corpora directly into the classroom to help teach, explain, or practice particular language items.

Vocabulary teaching and corpora

In order to speak a language well it is essential to have a wide range of vocabulary. This fact is now taken for granted by both teachers and learners, but it has not always been that way. Not so long ago it was grammar that was given priority, and words were seen as mere gap fillers in predetermined syntactic structures. It was the careful study of language corpora that provided evidence of a vague and almost non-existent borderline between grammar and lexis (Sinclair, 1991). Carefully sorted corpus concordance lines highlighted patterns that depend on particular lexical items rather than syntactic structures and thus revealed that each lexical item has a little “grammar” of its own. Today, we approach vocabulary teaching with this revolutionary development in mind.
Furthermore, regular language classes cannot cover the huge number of vocabulary items that students need to learn. Students need to be enabled to do a lot of autonomous learning, and modern language instructors need to teach them both how to learn vocabulary and what to learn. Presenting corpus data in a variety of tasks can raise students’ awareness of what there is to learn and how to do it.

Methodology

Our sample consists of 51 undergraduate university students of economics and tourism who have been learning English for between 8 and 12 years. To make the sample more representative of the student population in non-philological studies, it comprises students from three different universities/faculties.

The test

The students were given a test that consisted of three different exercises based on and derived from a small corpus. The corpus compiled and used in this study consists of 450 000 tokens and is therefore classified as a small corpus. It represents one register and one genre because it includes only texts from the area of tourism, more specifically tour guides of the Mediterranean countries. It was originally compiled as a source of corpus-derived exercises in a project carried out with students of tourism (Marinov, 2011) but is now used for teaching purposes to address particular language issues when necessary.

Each exercise targets a particular language problem that we believed students at this level of language learning should be able to cope with and understand, but not without some difficulty. We concentrated on teaching vocabulary as it is believed that lexical information is much easier for learners to notice and study (Gaskell and Cobb, 2004). Wanting to include elements of “student research”, and knowing that students find coping with the whole range of lines so discouraging that they prefer and need guidance (Marinov, 2011), we provided the material in the form of a shortened concordance.
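To make the idea of a shortened concordance more concrete, the following minimal Python sketch builds keyword-in-context lines for a node word, sorts them alphabetically by the text to the right of the node, and thins them to a target number of lines. It is an illustration only, not the procedure used in this study: the function names (`kwic`, `right_sort`, `thin`, `render`), the context width and the sample sentences are all hypothetical, and keeping evenly spaced lines is merely one assumed way of shortening a concordance without editing the lines themselves.

```python
import re


def kwic(texts, forms, width=30):
    """Collect (left, node, right) context triples for every listed word form."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(f) for f in forms) + r")\b", re.IGNORECASE
    )
    hits = []
    for text in texts:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            hits.append((left, m.group(0), right))
    return hits


def right_sort(lines):
    """Sort concordance lines alphabetically by the context right of the node."""
    return sorted(lines, key=lambda line: line[2].strip().lower())


def thin(lines, target):
    """Keep evenly spaced lines so that at most `target` remain (an assumption)."""
    if len(lines) <= target:
        return list(lines)
    step = len(lines) / target
    return [lines[int(i * step)] for i in range(target)]


def render(lines, width=30):
    """Lay the lines out with the node word aligned down the centre."""
    return "\n".join(
        f"{left[-width:]:>{width}}  {node}  {right[:width]}"
        for left, node, right in lines
    )


if __name__ == "__main__":
    # Invented sentences for illustration; they are not taken from the corpus.
    sample = [
        "Buses run hourly from the port.",
        "The hotel is run by a local family.",
        "Ferries run daily in summer.",
    ]
    print(render(right_sort(kwic(sample, ["run", "runs", "running", "ran"]))))
```

Sorting on the right-hand context groups lines such as “run by” or “run daily” together, which is what makes recurrent patterns visible to the learner.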
Our research included three different exercises as described below.

Task 1 – the verb “run”

This task consists of an 83-line-long concordance of the verb “run”. A concordance is a screen display or printout of a chosen word or phrase in its different contexts, with that word or phrase arranged down the centre of the display along with the text that comes before and after it (McCarthy, 2004). The task concentrates on meanings. “Run” is a word all our students are familiar with, but only in a narrow range of its meanings. The aim of this exercise is, therefore, to extend this range and possibly raise the students’ awareness that many other highly frequent words have additional meanings to be learnt. In the “Mediterranean Europe” corpus there are as many as 562 tokens of “run”, so the concordance had to be shortened. In order not to lose the authenticity of the corpus data by editing it (Flowerdew, 1996), we used a feature of WordSmith Tools (Scott, 2004) to shorten the concordance automatically, which resulted in 83 lines. The concordance was then right-sorted, i.e. arranged alphabetically according to the text to the right of the node word. Observing the students' responses, we wanted to find out the following:

I how many and which of the meanings present the students managed to identify
II which of the meanings were more easily identifiable

Task 2

The students were required to complete 14 gapped sentences with one of two phrasal verbs, make for or make up for, and to organize the information in their personal vocabulary files by defining the meaning and providing an example sentence for each phrasal verb. Finally, the students were asked to comment on the task so that we could determine the following:

I The overall accuracy (score) expressed in percentages.
II Which of the phrasal verbs was understood better and used more correctly.
III Whether there is a connection between the score achieved and the students' perception of task difficulty.
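A gapped-sentence exercise of the kind described for Task 2 could, in principle, be drafted semi-automatically from corpus sentences. The sketch below is a hedged illustration in Python, not the procedure used to prepare our materials: the names `gap_item` and `accuracy`, the gap marker and the sample sentences are hypothetical. It blanks out an inflected form of make for or make up for and computes the percentage of correct responses. It cannot tell a phrasal verb from a literal sequence such as “made for children”, so every generated item would still need to be checked by the teacher.

```python
import re

# Inflected forms of "make", optionally followed by "up", then "for".
PATTERN = re.compile(r"\b(?:make|makes|made|making)\s+(up\s+)?for\b", re.IGNORECASE)


def gap_item(sentence, gap="_____"):
    """Return (gapped sentence, expected phrasal verb), or None if no match."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    answer = "make up for" if m.group(1) else "make for"
    return sentence[:m.start()] + gap + sentence[m.end():], answer


def accuracy(responses, key):
    """Percentage of responses that match the answer key."""
    if not key:
        raise ValueError("answer key must not be empty")
    correct = sum(r == k for r, k in zip(responses, key))
    return 100 * correct / len(key)


if __name__ == "__main__":
    # Invented sentences for illustration; they are not taken from the corpus.
    sentences = [
        "The sea views make up for the small rooms.",
        "Cobbled lanes and quiet squares make for a pleasant stroll.",
    ]
    items = [gap_item(s) for s in sentences]
    for gapped, _ in items:
        print(gapped)
    key = [answer for _, answer in items]
    print(accuracy(["make up for", "make up for"], key))
```

Deriving the answer key from the same match that creates the gap keeps the two in step, so a student's score can be computed directly from the list of responses.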
Task 3

In Task 3 students were asked to study 16 examples and deduce whether there was any difference in usage between made from and made of. They were then, as in Task 2, asked to make a vocabulary file entry for each of the collocations. In the journal entry, they had to note their impressions and any difficulties in solving the task.

Journal

Along with the corpus data, the students were asked to keep a journal in which they noted their opinions, feelings, and difficulties encountered while doing each of the assigned tasks. The journal consisted of generic questions and questions related to specific tasks. Students were also asked to explain the path they took while trying to solve the tasks.

Analysis

Journal questions 1 and 2

The introductory generic questions were as follows: “Have you already encountered this method of discovering meanings and studying new vocabulary and its usage?” and “Do you use search engines? Why? and When?” As regards familiarity with similar tasks, the majority of students (N = 39) responded negatively. When asked whether, why, and when they used internet search engines, most students (44) answered affirmatively, but indicated different levels of frequency. Judging by the answers obtained, we can conclude that the majority understood “search engines” to mean the Internet, Google Translate or computers in general. Several students mentioned that they used search engines to verify expressions they could not find in dictionaries. Three students mentioned wanting to see the context in which a particular phrase or expression is used. Finally, only one student mentioned looking for a similar authentic document in the target language. It is obvious that students should be given direct, explicit instruction about the differences between on-line dictionaries and search engines, and should be taught how to use internet search engines to advance their language learning.
Task analysis

Task 1 – meanings of “run”

The 83-line-long concordance included ten different meanings/usages of “run”. Their frequency was established and is presented in the second column of Table 1.

No.  Meaning                                        Frequency  Identified correctly
1    manage, operate, organize                      42         94%
2    transport, drive, ride                         25         96%
3    cost                                           3          39%
4    last                                           1          49%
5    stretch                                        6          53%
6    expire                                         1          61%
7    dilapidated                                    1          29.5%
8    go berserk                                     1          4%
9    read                                           1          9.8%
10   to include everything within a group or type   1          5.8%

Table 1: Meanings/usages of the verb “run” in the 83-line-long concordance from the corpus “Mediterranean Europe”

The sample as a whole managed to identify all ten meanings of the verb, but with varying success. The most easily identifiable senses were 1 and 2, which were also the two most frequent senses in the concordance. The rate of noticing is related to the frequency of occurrence, but not directly, as can be seen from the example of the next most frequently noticed sense (expire), which appears only once in the concordance. In other words, there is no clear and measurable connection between frequency of occurrence and the rate of recognition of a particular sense. The rate of recognition can be influenced by a number of factors, such as existing passive/active knowledge of the word/sense, the immediate context, language proficiency, and the seriousness with which a student has tackled the task (motivation, interest, patience), which in itself could be the topic of a separate study.

Students also “invented” some meanings of their own: they treated different uses of the same sense as separate senses. Most frequently they interpreted the passive usage of “run”, as in “well-run” or “run by”, as separate senses (24%).

Students' comments allowed us to establish how much they liked the exercise, what major difficulties they encountered and what strategies they used in finding the solutions. Content analysis of students' responses is presented in Table 2.
No.  Comment/idea                                                               Frequency
     General impressions
1    easy/initial problems quickly resolved                                     13
2    interesting/useful                                                         10
3    interesting but ... confusing/difficult/long                               5
     Major difficulties encountered
4    difficult to distinguish between the meanings                              9
5    lacking or difficult context                                               7
6    understand the meaning but cannot explain                                  5
7    time consuming                                                             6
     Strategies used
8    used Internet/dictionaries to find out                                     15
9    re-reading                                                                 7
10   careful analysis and concentration (which is good and helps acquisition)  2

Table 2: Students' opinions/comments on Task 1: different senses of the verb “run”

Quoted below are two students' exact words which we selected as extreme examples of the two ends of a spectrum of opinions.

Task 2 – phrasal verbs “make for” and “make up for”

The combined score for all students revealed a satisfactory overall accuracy, with 80% of all sentences completed correctly. More than 50% of students made fewer than two mistakes. At the other end of the spectrum, 9.8% of students had 7 or fewer correct answers. Altogether, the students had more difficulty understanding make for than make up for.

The connection between the score and the perception of task difficulty could only be examined for the 19 students who made comments about the difficulty. The score and the perceived task difficulty corresponded in 9 cases only. By contrast, 4 students with high scores found the task difficult and expressed uncertainty about their answers, and 6 students with very low scores maintained the task was easy. To conclude, correct and incorrect perceptions of task difficulty and one's own achievement appeared to be equally widespread, which we found surprising, as one would expect a higher level of self-awareness among university students.

Task 3 – difference between made of and made from

Students’ answers were evaluated on a scale from 0 to 3: 0 - no answer or completely wrong, 1 - fair, 2 - good, and 3 - very good.
An overview of their answers and journal comments is presented in Table 3.

Quality of answers  No. of students  Strategies applied                                    Perceived level of difficulty
0                   14               No effort made - no difference noticed                An easy task
1                   14               Strategies applied led to mistaken conclusions        An easy task
2                   10               Difference briefly explained (from dictionary)        An easy task
3                   13               Correct definitions provided with good examples of    A difficult task
                                     usage and explanation of one’s analytical approach /
                                     the logic applied

Table 3: Students' answers to Task 3: deducing the difference in meaning between made of and made from

The quality of students’ answers was inversely related to the perceived level of task difficulty: the students who invested little or no effort found the task easy, whereas the students who chose to ponder the sentences and work out the meaning for themselves found the task difficult but interesting and rewarding. Clearly, this kind of task is best suited to highly motivated, curious, and committed students.

Journal questions 4 and 5

Having completed three different corpus-based exercises, the students were required to outline what they saw as the particular benefits of this approach to language learning and to suggest potential users. The analysis of students’ answers regarding the advantages of corpus-based exercises revealed the following:

1. Praise for the inductive approach and learner autonomy. A significant number of students (19) appreciated the inductive approach and learner empowerment. They found that corpus-based exercises developed skills important for increasing the quality of learning and understanding through parallel observation of different examples of usage, practical application of knowledge, autonomy in establishing rules, creation of meaning from context and discovering relations between language phenomena.

2. Strong emphasis on the importance of context.
Many students (18) emphasised the importance of context, especially as it prevents literal translation, underpins deduction of meaning and enhances long-term memory.

3. Generally positive remarks about the method. A significant number of students thought that the corpus-based approach enhanced vocabulary learning (12) and was more interesting than traditional ways of learning (9). A smaller group appreciated the intense practice and focus on details (4), the abundance of examples (3) and the positive effects of the interaction between existing and newly acquired knowledge (3).

Finally, as regards the potential beneficiaries of this approach, students' responses fall into several distinct groups.

1. Emphasis on learners’ desire to learn. Answers in this group (21) revolve around the idea of the desire to learn as a prerequisite for employing this method. Seventeen students would recommend this approach to persons willing to invest effort and actively engage in building their vocabulary, either for study or for work, and the remaining four would recommend it to those whose English is weak but who wish to learn more.

2. Emphasis on level of English. Twenty-two comments (44%) mention or focus on prospective learners' level of English. Most students who fall into this group believe this approach suits advanced learners (15), a category that can include high school students as well. Three respondents would recommend it to individuals who are too self-confident about their level of English. Three students believe even beginners in primary education would benefit from this method, and one thinks that both advanced learners and beginners would find it beneficial. Finally, three students think the approach would be useful for people with specific vocabulary problems.

3. Emphasis on learners' professional or academic needs. This group recommends the method to individuals who are interested in the English language itself, who focus on details, i.e.
to language students (3) or to Croatian politicians because they “constantly embarrass us with their horrible English” (2).

4. Negative attitude. There are three students who would not recommend this approach to anyone.

Conclusion

Using authentic corpus material is a rather innovative and insufficiently explored teaching/learning method. In this research three corpus-based tasks were given to university students of business and tourism, and their responses were analyzed with respect both to accuracy/new learning and to their comments about this teaching/learning approach. All three exercises presented a challenge to most students and made them develop their own strategies for finding solutions and answers. In line with previous empirical studies, our research has shown that certain learners appreciate the benefits of corpus consultation while others find it too time consuming or demanding (Chambers & O’Sullivan, 2004; Chambers, 2005).

Each of the three exercises caused different problems, but they also shared some common ground, such as the lack of context, difficult context, or the need to invest more time and concentration in intensive reading. For some students the strategy of intensive and concentrated reading made up for the missing context, which they thus managed to reconstruct. Intensive reading is a skill that is rarely practiced in regular language courses. Therefore, the best practice would be to introduce corpus work and intensive reading gradually, as a long-term process and an integral part of the overall language-learning process (Kennedy and Miceli, 2001). The problem of non-existent context can also be tackled by encouraging students to use various reference materials and examine longer stretches of source texts.

A clear focus on the independent acquisition of new knowledge, combining the more familiar with the less familiar, has also been recognised. The process of language recycling, which helps shift passive knowledge into active use, is thus also initiated.
Depth of knowledge and long-term retention are recognised as the main advantages of this approach. Apart from recognising the value of learning vocabulary through several contextual encounters (Cobb, 1997), certain students also found the tasks challenging and therefore more motivating than traditional types of exercises. The complexity of the task increased motivation for higher-level (upper-intermediate or advanced) language learners, while it decreased motivation for those at lower levels. Finally, the students' appreciation of the skills developed in the course of doing the exercises (parallel observation, practical application of knowledge, establishing rules, drawing conclusions, deduction of meaning from context, finding relations between language phenomena) clearly emphasises the importance of procedural knowledge, which is enhanced by this approach.

Based on the obtained data and students’ journal responses, we can conclude that the method is beneficial and practicable for intermediate and advanced level students, but the corpus data should be edited to suit the particular students’ needs, abilities and language proficiency. Some challenge should be provided, to provoke interest and a feeling of success, but the task should not be too long or difficult, so as not to discourage the students. The list of student opinions obtained about corpus-driven learning can be used as a solid starting point for future research.

References:

1 Chambers, A. (2005). Integrating corpus consultation procedures in language studies. Language Learning & Technology 9 (2): 111-125.
2 Chambers, A., & O'Sullivan, Í. (2004). Corpus consultation and advanced learners' writing skills in French. ReCALL, 16(1): 158-172.
3 Cobb, T. (1997). Is there any measurable learning from hands-on concordancing? System, 25, 301-315.
4 Flowerdew, J. (1996). “Concordancing in Language Learning” in Pennington, M. (ed.), The Power of CALL: 97-113. Houston, TX: Athelstan.
5 Gaskell, D. and Cobb, T. (2004).
“Can learners use concordance feedback for writing errors?”. System 32: 301-319.
6 Johns, T. (1986). “Micro-concord: A language learner's research tool”. System, 14 (2): 151-162.
7 Kennedy, C. & Miceli, T. (2001). An evaluation of intermediate students' approaches to corpus investigation. Language Learning & Technology 5(3): 77-90.
8 Krishnamurthy, R. (2001). “Learning and Teaching through Context - A Data-driven Approach”. TESOL Spain Newsletter, Volume 24.
9 Marinov, S. (2011). The role of small specialised corpus in teaching ESP, unpublished master’s thesis, University of Zadar.
10 McCarthy, M. (2004). Touchstone - From corpus to coursebook. Cambridge: CUP.
11 McEnery, T. and Wilson, A. (1997). Teaching and language corpora, ReCALL 9 (1): 5-14.
12 O'Keeffe, A. et al. (2007). From Corpus to Classroom. Cambridge: Cambridge University Press.
13 Scott, M. (2004). WordSmith Tools, version 4, Oxford: Oxford University Press. ISBN: 0-19-459400-9.
14 Sinclair, J.M. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
15 Stevens, V. (1995). “Concordancing with language learners: Why? When? What?”. CAELL Journal, 6 (2): 2-10.
16 Tribble, C. and Jones, G. (1990). Concordances in the classroom. London: Longman.
17 Willis, D. (1993). Syllabus, corpus and data driven learning. IATEFL Conference Report: Plenaries.