Explorations of Grammar in the BAWE corpus of Academic English
Grammar and EAP: BALEAP PIM St Andrews, June 2015
Sheena Gardner
sheena.gardner@coventry.ac.uk
www.coventry.ac.uk/BAWE

1. The BAWE Corpus

Arts and Humanities (AH)
Applied Linguistics (115), English (106), Philosophy (106), History (96), Classics (82), Archaeology (76), Comparative American Studies (74), Other inc MFL & Theatre Studies (50)

              Level 1     Level 2     Level 3     Level 4     Total
Students      101         83          61          23          268
Assignments   239         228         160         78          705
Texts         255         229         160         80          724
Words         468,353     583,617     427,942     234,206     1,714,118

Life Sciences (LS)
Biological Sciences (169), Agriculture (134), Food Sciences (124), Psychology (95), Health and Social Care (81), Medical Science (80)

              Level 1     Level 2     Level 3     Level 4     Total
Students      74          71          42          46          233
Assignments   180         193         113         197         683
Texts         188         206         120         205         719
Words         299,370     408,070     263,668     441,283     1,412,391

Physical Sciences (PS)
Engineering (238), Chemistry (89), Computer Science (87), Physics (68), Mathematics (33), Meteorology (29), Cybernetics & Electronics (28), Planning (14), Architecture (9), Other (1)

              Level 1     Level 2     Level 3     Level 4     Total
Students      73          60          56          36          225
Assignments   181         149         156         110         596
Texts         181         154         156         133         624
Words         300,989     314,331     426,431     339,605     1,381,356

Social Sciences (SS)
Business (146), Law (134), Sociology (110), Politics (110), Economics (96), Hospitality, Leisure and Tourism Management (93), Anthropology (49), Publishing (30), Other inc Education (9)

              Level 1     Level 2     Level 3     Level 4     Total
Students      85          88          76          64          313
Assignments   207         197         166         207         777
Texts         216         198         170         207         791
Words         371,473     475,668     447,950     704,039     1,999,130

All disciplinary groups

                    Level 1     Level 2     Level 3     Level 4     Total
Total students      333         302         235         169         1039
Total assignments   807         767         595         592         2761
Total texts         840         787         606         625         2858
Total words         1,440,185   1,781,686   1,565,991   1,719,133   6,506,995

The British Academic Written English (BAWE) corpus was developed at the Universities of Warwick, Reading and Oxford Brookes under the directorship of Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics [previously called CELTE], Warwick), Paul Thompson (Department of
Applied Linguistics, Reading) and Paul Wickens (Westminster Institute of Education, Oxford Brookes), as part of the project An investigation of genres of assessed writing in British Higher Education, which was funded by the Economic and Social Research Council (project number RES-000-23-0800) from 2004 to 2007. We are indebted to the students who contributed their work and enabled the corpus to exist.

© Sheena Gardner St Andrews June 2015

2. Accessing the BAWE corpus

The BAWE corpus can be downloaded for research purposes via the Oxford Text Archive (http://ota.ahds.ac.uk/headers/2539.xml) for use with WordSmith Tools, AntConc and similar tools.

It can be freely accessed using the open version of the corpus query tool Sketch Engine at http://the.sketchengine.co.uk/open/, or you can register for the full version for greater capability.

Information about the corpus, ranging from wordlists to academic publications, is available at www.coventry.ac.uk/BAWE. The guide Using Sketch Engine with BAWE (Nesi and Thompson) is also available there.

Writing for a Purpose

Materials based on the project research are available for learners and teachers on the British Council Learn English website: http://learnenglish.britishcouncil.org/en/writing-purpose/writing-purpose

5 Writing Purposes and 13 Genre Families in the BAWE Corpus

Explanations, Exercises
Literature Survey, Methodology Recount, Research Report
Essay, Critique
Empathy Writing, Narrative Recount
Case Study, Design Specification, Problem Question, Proposal

Data is or Data are?

Do a simple search for data. How many instances (hits) are there in the corpus?

Which is more frequent, this data or these data? ______________________
Hint: sort by the left

Which is more frequent, data is or data are? __________________________
Hint: sort by the right

Table 1 shows the proportion of instances of data with singular forms (e.g.
this data is …) compared with the proportion with plural forms (e.g. these data are ...) in texts ranging from published scientific research through to American student writing (MICUSP is the Michigan Corpus of Upper-level Student Papers). What does this table suggest about the use of data?

Register                                           Source                Data with singular %   Data with plural %
Academic science writing                           Nature-Johns          10                     90
Academic journal articles across ten disciplines   Swales-Hyland         20                     80
Academic books, reports and journal articles       Google scholar        22                     78
Popular Science magazine                           New Scientist-Johns   26                     74
Newspaper                                          Guardian-Johns        60                     40
WWW                                                Google                67                     33
Student assignments                                BAWE                  79                     21
Student assignments                                MICUSP                82                     18

Table 1: data across registers

___________________________________________________________________________________

Johns (1996) suggests that “the (traditional) meaning "evidence used in experimental procedures" is most often plural, while the (more recent) meaning "digital information stored or manipulated by a computer" is most often singular.” Swales (2002) concurs with this distinction in his study of data in academic research articles.

Alternatively, these examples from a recent article in the journal Applied Linguistics suggest that it is important to look at the unfolding text, or perhaps to consider what is ‘countable’:

Data for the project was collected over a five year period …
The data included videos and transcripts of …
The data were analysed using …

Look again at the BAWE concordance lines. Does the Johns-Swales distinction between different senses of data still hold? Or could you formulate an alternative rule of thumb to help students?
___________________________________________________________________________________

Or perhaps we should be asking another question. In most instances data is not marked for singular or plural. What happens instead? Based on the concordance lines, what strategies could you suggest to help student writers avoid using either singular or plural forms with data?

___________________________________________________________________________________

As this exercise suggests, it is important not only to know what sort of corpus data you are searching (published research, general English, proficient student writing), but also to go beyond narrow grammatical questions such as ‘should data be used with a singular or plural verb?’ to explore different senses of the search term, different contexts, and whether the term is usually used in ways that do not require a choice between singular and plural forms.

REFERENCES (more at www.coventry.ac.uk/BAWE)

Bruce, I. (2010) Textual and discoursal resources used in the essay genre in sociology and English. Journal of English for Academic Purposes 9 (3) 153-166
Chen, Y.-H. & Baker, P. (2010) Lexical Bundles in L1 and L2 Academic Writing. Language Learning and Technology 14 (2) 30-49
Durrant, P. and J. Mathews-Aydınlı (2011) A function-first approach to identifying formulaic language in academic writing. English for Specific Purposes 30 (1) 58-72
Ebeling, S. O. & Wickens, P. (2012)
Interpersonal themes and author stance in student writing. In: Hoffmann, S., Rayson, P. and G. Leech (eds) English Corpus Linguistics: Looking Back, Moving Forward: Papers from the 30th International Conference on English Language Research on Computerized Corpora (ICAME 30) pp. 23-40
Gardezi, S. A. and H. Nesi (2009) Variation in the writing of economics students in Britain and Pakistan: the case of conjunctive ties. In: M. Charles, S. Hunston, D. Pecorari (eds) Academic Writing: At the Interface of Corpus and Discourse. London: Continuum pp. 236-250
Gardner, S. (2012) A pedagogic and professional Case Study genre and register continuum in Business and in Medicine. Journal of Applied Linguistics and Professional Practice 9 (1) 13-35
Gardner, S. (2012) Genres and registers of student report writing: an SFL perspective on texts and practices. Journal of English for Academic Purposes 11 (1) 52-63
Gardner, S. and J. Holmes (2009) Can I use headings in my essay? Section headings, macrostructures and genre families in the BAWE corpus of student writing. In: M. Charles, S. Hunston, D. Pecorari (eds) Academic Writing: At the Interface of Corpus and Discourse. London: Continuum pp. 251-271
Gardner, S. and H. Nesi (2013) A classification of genre families of university student writing. Applied Linguistics 34 (1) 1-29
Holmes, J. and H. Nesi (2009) Verbal and Mental Processes in Academic Disciplines. In: M. Charles, S. Hunston, D. Pecorari (eds) Academic Writing: At the Interface of Corpus and Discourse. London: Continuum pp. 58-72
Lee, D. & X. Chen (2009) Making a bigger deal of smaller words: function words and other key items in research writing by Chinese learners. Journal of Second Language Writing 18 (4) 281-296
Leedham, M. (2014) Chinese students’ writing in English: Implications from a corpus-driven study. Routledge.
McKenny, J.
(2005) Content analysis of dogmatism compared with corpus analysis of epistemic stance in student essays. Information Design Journal + Document Design 13 (1) 40-49
Nesi, H. (2008) Corpora and EAP. In: LSP: Interfacing Language with other Realms: Proceedings of the 6th Languages for Specific Purposes International Seminar. Universiti Teknologi Malaysia, Johor Bahru, Malaysia.
Nesi, H. (2008) Extended abstract: The form, meaning and purpose of university level assessed reflective writing. In: M. Edwardes (ed) Proceedings of the BAAL Annual Conference 2007. Edinburgh University. London: BAAL/Scitsiugnil Press.
Nesi, H. and P. Thompson (2011) Using Sketch Engine with BAWE. Available online at www.coventry.ac.uk/BAWE
Nesi, H. and E. Moreton (2011) EFL/ESL writers and the use of shell nouns. In: Tang, R. (ed) Academic Writing in a Second or Foreign Language: Issues and challenges facing ESL/EFL academic writers in higher education contexts. London: Continuum
Nesi, H. and S. Gardner (2015) Balancing old and new activity types on an academic writing website. In: Kavanagh, M. and Robinson, L. (eds) The Janus Moment in EAP: Revisiting the Past and Building the Future. Reading, UK: Garnet Education pp. 187-198
Nesi, H. & S. Gardner (2012) Genres across the Disciplines: Student writing in Higher Education. Cambridge: Cambridge University Press
Thompson, P. (2009) Shared disciplinary norms and individual traits in the writing of British undergraduates. In: M. Gotti (ed) Commonality and Individuality in Academic Discourse. Bern: Peter Lang pp. 53-82
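
A note for readers working with downloaded text rather than Sketch Engine: the counting step in the "Data is or Data are?" exercise can be roughed out in a few lines of Python. The sketch below is only an approximation under stated assumptions. It looks solely at the word immediately to the left of data (this/these) and immediately to the right (is/are/was/were/has/have). The cue lists and the names agreement_counts, singular_share and SAMPLE are illustrative choices, not part of the BAWE project or of any corpus tool, and the last sentence of the sample text is invented.

```python
import re

# Assumed cue words: the handout names only this/these and is/are;
# was/were and has/have are added here purely for illustration.
LEFT_CUES = {"this": "singular", "these": "plural"}
RIGHT_CUES = {
    "is": "singular", "was": "singular", "has": "singular",
    "are": "plural", "were": "plural", "have": "plural",
}

# The node word "data" with one optional word on each side. Because the
# neighbours must be joined to "data" by whitespace only, a word on the
# other side of a full stop or comma is not treated as a cue.
PATTERN = re.compile(
    r"(?:\b([A-Za-z]+)\s+)?\bdata\b(?:\s+([A-Za-z]+)\b)?",
    re.IGNORECASE,
)


def agreement_counts(text):
    """Classify each hit for 'data' by its adjacent singular/plural cues."""
    counts = {"hits": 0, "singular": 0, "plural": 0, "mixed": 0, "unmarked": 0}
    for match in PATTERN.finditer(text):
        counts["hits"] += 1
        left = (match.group(1) or "").lower()
        right = (match.group(2) or "").lower()
        cues = {LEFT_CUES.get(left), RIGHT_CUES.get(right)} - {None}
        if cues == {"singular"}:
            counts["singular"] += 1
        elif cues == {"plural"}:
            counts["plural"] += 1
        elif cues:
            # Conflicting cues, e.g. "this data are".
            counts["mixed"] += 1
        else:
            counts["unmarked"] += 1
    return counts


def singular_share(counts):
    """Singular hits as a rounded percentage of singular plus plural hits."""
    marked = counts["singular"] + counts["plural"]
    if marked == 0:
        return None
    return round(100 * counts["singular"] / marked)


# The first three sentences adapt the handout's examples; the last is invented.
SAMPLE = (
    "Data for the project was collected over a five year period. "
    "The data included videos and transcripts. "
    "The data were analysed using software. "
    "This data is stored on a server."
)

if __name__ == "__main__":
    result = agreement_counts(SAMPLE)
    print(result)
    print(singular_share(result))
```

In the sample, "Data for the project was collected" is counted as unmarked because the verb is not adjacent to data. This shows how crude adjacency cues are, and why reading the concordance lines themselves remains necessary.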