ELC 2010

Lexico-Grammatical Patterns in English Scientific Abstracts: presenting the research’s purposes and results

Carmen Dayrell, Arnaldo Candido Jr., Stella Tagnin, Sandra Aluísio
DLM; ICMC / NILC

Context

English for Academic Purposes
• Academic communication poses real challenges for novice researchers (Hyland 2009:ix)
• Demands are heavier for non-native speakers of English (Hyland 2009:5, Milton and Hyland 1999, Vold 2006)
• Difficulties relate to:
  • lexical and syntactical features of the target genre
  • rhetorical motivations behind linguistic choices
  • disciplinary variation
  • cultural differences across languages

Local Context
• Courses on English academic writing
• Writing tools for non-native speakers of English
• Assist graduate students in writing scientific papers in English

Courses on English Academic Writing (2004 to 2010)
• USP
  • Department of Physics (IFSC)
  • Department of Pharmaceutical Sciences (FCF)
  • Department of Computer Science (ICMC)
• UNESP
  • IBILCE
  • Dentistry and Biology/Genetics
• UFSCar
  • Department of Biology/Genetics

Writing tools: Scipo-Farmácia (http://www.nilc.icmc.usp.br/scipo-farmacia/)
• Abstract components: Background, Gap, Purpose, Methodology, Results, Conclusion
• Examples from published abstracts

Why Abstracts?
• Relevant in various academic contexts
• However … constructing an efficient, clear abstract is a fairly difficult task, even for experienced and widely published writers (Swales & Feak 2009: xiii)
• In Brazil: abstracts are part of most research papers written in Portuguese, as well as PhD and master’s dissertations

Purpose

General Objective
• Investigate the potential differences between English abstracts written by Brazilian graduate students and abstracts taken from published papers in the same disciplines

Aim of this study
• To investigate the recurring lexico-grammatical patterns used for presenting either the purposes or the results of the research

Rhetorical ‘moves’ in abstracts (Swales and Feak 2009: 5)
• Background / Introduction
• Purpose
• Methods / Materials / Subjects / Procedures
• Results / Findings
• Discussion / Conclusion / Implications / Recommendations

Lexico-grammatical patterns
Pattern: The AIM of this STUDY
• Determiner variants: the, the present
• AIM: aim, purpose, objective, goal, aims, objectives, purposes
• STUDY: study, work, investigation, article, research, project, paper

Corpora

Student Abstracts

                                              Abstracts   Tokens    Average Number of Words (ANW)
Physical Sciences and Engineering (ST-EXA)    169         34,151    202
Life and Health Sciences (ST-BIO)             138         27,911    202

Student Abstracts by discipline

Physical Sciences and Engineering (ST-EXA)
Disciplines            # texts
Physics                85
Computing              46
Earth Sciences         20
Engineering            18
Total                  169

Life and Health Sciences (ST-BIO)
Disciplines            # texts
Dentistry              47
Pharmaceutical Scs.    39
Biology                21
Biophysics             21
Bioengineering         5
Biomedical Scs.        5
Total                  138

English Abstracts: student (ST) and published (PB)

Physical Sciences and Engineering
Disciplines            ST     PB
Physics                85     425
Computing              46     230
Earth Sciences         20     100
Engineering            18     90
Total                  169    845

Life and Health Sciences
Disciplines            ST     PB
Dentistry              47     235
Pharmaceutical Scs.    39     195
Biology                21     105
Biophysics             21     105
Bioengineering         5      25
Biomedical Scs.        5      25
Total                  138    690

Published Abstracts

                                              Abstracts   Tokens     Average Number of Words (ANW)
Physical Sciences and Engineering (PB-EXA)    845         139,591    165
Life and Health Sciences (PB-BIO)             690         159,940    231

• Taken from papers published in various leading academic journals (CAPES - QUALIS A)
• Preference given to authors affiliated with universities in English-speaking countries

Methods

Methodology
1. Identification of rhetorical moves
2. Identification and comparison of lexico-grammatical patterns in ‘purposes’ and ‘results’

1. Identifying Rhetorical Moves
a) Automatic tagging
AZEA (Argumentative Zoning for English Abstracts) (Genovês et al. 2007)
• a corpus-based machine learning system
• PURPOSE: to automatically identify components of the schematic structure of scientific abstracts in English: Background, Gap, Purpose, Methodology, Result, Conclusion
• AZEA achieved 80.4% accuracy (kappa 0.73) using a very small training corpus

AZEA’s features
Basic Features
1. Sentence Length
2. Position within the abstract
3-5. Verb Tense, Voice and Modal
6. Previous Component
7-8. Formulaic patterns
14 additional features to distinguish between Results and Methods and improve accuracy

Azea-Web
http://www.nilc.icmc.usp.br/azea-web/

1a. AZEA tagging
<purpose> We propose a Local-Density approximation to calculate the entanglement entropy of the inhomogeneous one-dimensional Hubbard model. </purpose>
<background> Such inhomogeneity can be due to the finite size, the presence of impurities, or the periodic variation of the interaction and the external potential, as in superlattices. </background>
<purpose> We show that, to inhomogeneities due to finite size, our approximation reproduces the know thermodynamic limit and also the limit of the entanglement entropy in n=1, obtained by Cardy and Calabrese.
</purpose>

1b. Manual Validation
<purpose> We propose a Local-Density approximation to calculate the entanglement entropy of the inhomogeneous one-dimensional Hubbard model. </purpose>
<background> Such inhomogeneity can be due to the finite size, the presence of impurities, or the periodic variation of the interaction and the external potential, as in superlattices. </background>
<result> We show that, to inhomogeneities due to finite size, our approximation reproduces the know thermodynamic limit and also the limit of the entanglement entropy in n=1, obtained by Cardy and Calabrese. </result>

Manual Tagging: correcting sentence breaks
Automatic output (sentence wrongly split):
<purpose> We find aRb/aNa=1.959(5), </purpose>
<background> aK/aNa =1.786(6), </background>
<purpose> and aRb/aK=1.097(5). </purpose>
After manual correction:
<result> We find aRb/aNa=1.959(5), aK/aNa =1.786(6), and aRb/aK=1.097(5). </result>

Manual Tagging: multi-labels
Single label:
<purpose> Using whole-cell rapid-agonist application techniques and the cell-attached single-channel recording configuration, we examined human 5-HT3A(QDA) receptors expressed in human embryonic kidney 293 cells. </purpose>
Multiple labels:
<method> Using whole-cell rapid-agonist application techniques and the cell-attached single-channel recording configuration, </method>
<purpose> we examined human 5-HT3A(QDA) receptors expressed in human embryonic kidney 293 cells. </purpose>

Lexico-grammatical patterns
1. Semi-automatic identification of patterns: WordSmith Tools 5 (Scott 2007)
   • Starting point: most frequent items and clusters in each corpus
   • Analysis of the surrounding context
   • Patterns should occur at least once per 10,000 words in either corpus
2. Comparison of frequencies: statistical test of significance

Results

Overall …
Significant differences:
• Between student and published abstracts
• Across the two broad areas

PURPOSE: Life and Health Sciences (BIO)
Pattern: The AIM of this STUDY
Determiner variants shown: the present, the
[Bar chart: frequency per 10,000 words, STS vs PUB; y-axis 0.0 to 20.0]
Items listed on the chart (first group): aim, objective, purpose, aims, objectives, our, study, work, review, paper
Items listed on the chart (second group): aim, purpose, objective, goal, aims, objectives, purposes, intent, study, work, investigation, article, project, research, clinical trial, paper

PURPOSE: Life and Health Sciences (BIO)
Pattern: (In this STUDY), we VERB (the/a)
[Bar chart: frequency per 10,000 words, STS vs PUB; y-axis 0.0 to 20.0]
Verbs listed on the chart (first group): REPORT, DESCRIBE, INVESTIGATE, SHOW, ANALYSE, EVALUATE, DETERMINE, …
Verbs listed on the chart (second group): INVESTIGATE, EXAMINE, REPORT, PROPOSE, TEST, HYPOTHESIZE, DESCRIBE, PRESENT, SEEK TO, ANALYSE, EVALUATE, DEMONSTRATE, …

PURPOSE: Physical Sciences and Engineering (EXA)
1. The AIM of this STUDY
2. This STUDY VERB
3. (In this STUDY), we VERB (the/a)
[Bar chart: frequency per 10,000 words for patterns 1 to 3, STS vs PUB; y-axis 0.0 to 30.0]

RESULTS
1. Results VERB (that/the), e.g. The results show that
2. we VERB (that/the), e.g. we found that
[Bar chart: frequency per 10,000 words for patterns 1 and 2, ST-BIO, ST-EXA, PB-BIO and PB-EXA; y-axis 0.0 to 30.0]

Contributions

Main Contributions
1. Pedagogic applications
   a) Syllabus
   b) Teaching material
2. Development of writing tools

Pedagogic applications
Overuse and underuse
• Patterns
  • Results VERB (that/the)
  • BE PARTICIPLE to VERB (e.g. was found to be)
• Items within patterns
  • It BE observed that X
  • It BE shown/found that

Writing Tools: AZEA
• Manual validation
• AZEA++: new features to be considered:
  • Lexico-grammatical patterns
  • Multi-labels
  • Disciplinary variations

Future Work

Writing Tools
• Physical Sciences and Engineering
• Life and Health Sciences

Thank you!
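
Supplementary sketch: frequency comparison

The Methods slides describe normalising pattern counts to occurrences per 10,000 words, keeping patterns that reach at least one occurrence per 10,000 words in either corpus, and then applying a test of significance. The Python sketch below illustrates one way this could be done. The slides do not name the test, so the log-likelihood (G2) statistic used here is an assumption, chosen because it is common in corpus comparison. The corpus sizes come from the Corpora slides; the raw counts and all function names are hypothetical.

```python
import math

# Corpus sizes in tokens, as reported on the Corpora slides.
CORPUS_TOKENS = {"ST-BIO": 27_911, "PB-BIO": 159_940}


def per_10k(raw_count, corpus_tokens):
    """Normalise a raw count to occurrences per 10,000 words."""
    return raw_count / corpus_tokens * 10_000


def passes_threshold(count_a, tokens_a, count_b, tokens_b, minimum=1.0):
    """True if the pattern reaches `minimum` per 10,000 words in either corpus."""
    return max(per_10k(count_a, tokens_a), per_10k(count_b, tokens_b)) >= minimum


def log_likelihood(count_a, tokens_a, count_b, tokens_b):
    """Two-corpus log-likelihood (G2) for one pattern.

    Assumption: the slides only say "test of significance"; G2 is a stand-in.
    """
    total = count_a + count_b
    all_tokens = tokens_a + tokens_b
    expected_a = tokens_a * total / all_tokens
    expected_b = tokens_b * total / all_tokens
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


if __name__ == "__main__":
    # Hypothetical raw counts of one pattern in each corpus (not from the slides).
    student_count, published_count = 45, 120
    st, pb = CORPUS_TOKENS["ST-BIO"], CORPUS_TOKENS["PB-BIO"]

    print("ST-BIO per 10,000 words:", round(per_10k(student_count, st), 2))
    print("PB-BIO per 10,000 words:", round(per_10k(published_count, pb), 2))
    print("Meets threshold:", passes_threshold(student_count, st, published_count, pb))

    g2 = log_likelihood(student_count, st, published_count, pb)
    # 3.84 is the chi-square critical value for p < 0.05 with 1 degree of freedom.
    print("G2:", round(g2, 2), "significant at p < 0.05:", g2 > 3.84)
```

Normalising before comparing matters here because the published corpora are several times larger than the student corpora, so raw counts alone would overstate how often published abstracts use a pattern.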