Building up a Corpus of Technical Vocabulary – Strategies and Feasibility
Presenters: Dr. Aparna Palle, Preetha Anthony
GNITS, HYDERABAD

An overview of the presentation
• Introduction
• Theoretical premise
• Interfacing of ESP and Corpora
• Criteria for selection of words
• Web Tools
• The Corpus
• Classroom techniques
• Conclusion

What is a Corpus?
• Corpora or corpuses are simply large collections or databases of language, incorporating stretches of discourse ranging from a few words to entire books. (Norbert Schmitt, 2000)
• A corpus is a collection of naturally occurring texts that is usually stored on a computer. (Randi Reppen, 2011)
• A corpus is a large collection or database of machine-readable texts involving natural discourse in diverse contexts. (Bernardini, 2000)

Definition
• A corpus is an inventory of essential language inputs drawn from authentic contexts using web tools.

Why Corpus?
• Emphasis on the specific needs of the learners of professional courses.
• Limited vocabulary for performing academic tasks.
• Lack of knowledge of specialised vocabulary.
• Corpus data provide descriptive insights into how people use language.
• Acts as a tool that enables students and instructors to analyse both how people use different language forms at various levels of formality and how language fulfils multiple speech functions across contexts.

Why Corpus? (contd.)
• Learning activities centred on analysing corpus data are consistent with current principles of language-learning theory: students develop more autonomy when they receive guidance about how to observe language and make generalisations.
• Such activities promote noticing and grammatical consciousness-raising (Schmidt 1990), which can enhance second language learning and development.

Word-building criteria
• Frequency and range
• Keyword in context
• Collocation
• Homonymy
• Word families
• Idioms and set expressions, etc.
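
Two of the criteria above, frequency and range, together with the keyword-in-context view, can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the sample sentences are invented, the tokenizer is a simple letters-only regular expression, and the function names (`tokenize`, `frequency_and_range`, `kwic`) are hypothetical and do not belong to any of the web tools named in this presentation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_and_range(texts):
    """Return two counters computed over a list of texts.

    frequency[w]  = total occurrences of w across all texts
    text_range[w] = number of texts in which w occurs at least once
    """
    frequency, text_range = Counter(), Counter()
    for text in texts:
        tokens = tokenize(text)
        frequency.update(tokens)        # count every occurrence
        text_range.update(set(tokens))  # count each text only once per word
    return frequency, text_range


def kwic(text, keyword, width=3):
    """Return keyword-in-context lines with `width` words on each side."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical sample texts, invented for illustration only.
SAMPLE_TEXTS = [
    "A variable stores a value. Declare the variable before the loop.",
    "The loop ends when the flag is set. Debug the loop first.",
    "Each function returns a value to the caller.",
]
```

With these sample texts, `frequency_and_range(SAMPLE_TEXTS)` counts `loop` three times across two of the three texts, and `kwic(SAMPLE_TEXTS[1], "loop", width=2)` returns `the [loop] ends when` and `debug the [loop] first`. A word with high frequency but low range (here `variable`, twice in a single text) may be specific to one source rather than to the field as a whole.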
Web tools
• AWL Highlighter
• British National Corpus (BNC)
• Collins Cobuild Corpus Concordance Sampler
• Compleat Lexical Tutor
• Corpus.BYU.edu
• Corpus of Contemporary American English (COCA)
• WordSmith
Source: Materials Development in Language Teaching, ed. Brian Tomlinson (1998)

AWL Highlighter

Corpus of Computer Programming Word List (CCPWL)
Source from which the corpus was extracted: "C: The Complete Reference", Herbert Schildt

Distinguishing Technical Vocabulary (Computing) from others

Category 1: The word form appears rarely, if at all, outside this particular field.
Examples: debug, operand, recompile, loop (purely technical)

Category 2: The word form is used both inside and outside this particular field, but not with the same meaning.
Examples: characters, flag, error, default, constants (homonyms, specialised)

Category 3: The word form is used both inside and outside this particular field, but the majority of its uses with a particular meaning, though not all, are in this field. The specialised meaning it has in this field is readily accessible through its meaning outside the field.
Examples: variable, parameter, input, output, prefix, code (homonyms, general)

Category 4: The word form is more common in this field than elsewhere. There is little or no specialisation of meaning, though someone knowledgeable in the field would have a more precise idea of its meaning.
Examples: manuals, memory, application, functions (literal meaning)

Filling Word Parts
Noun             Verb         Adjective      Adverb
Compatibility    Programme    Incremental    Variously

Cutting up complex words
Word
• Decode
• Encode
• Debugging
• Relocation
Meaning
• a methodical process of finding and reducing the number of defects in a computer program or a piece of electronic hardware
• the process of assigning load addresses to various parts of a program and adjusting the code and data in the program to reflect the assigned addresses
• the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage
• the conversion of an encoded format back into the original sequence of characters

Meanings of the prefixes
• Re-: again
• En-: to put into; to cause to be
• De-: down, away; completely; removal, reversal

Choosing the Correct Form
Learning C is similar and ____ (easy). Instead of straight-away l______ (learn) how to write programs, we must first know what alphabets, numbers and special symbols are ____ (use) in C, then how _____ (use) them constants, variables and keywords are _____ (construct), and _____ (final) how are these _____ (combine) to form an _____ (instruct).

Strengthening the Form – Meaning Connection
Word
• Manual
• Syntax
• Default
Definition
• a value automatically assigned
• a well-structured collection of information for reference
• the set of rules that defines the combinations of symbols

Answering questions
• Qn. Differentiate between a syntax error and a semantic error.
• Ans. A syntax error is an error in the form of a statement: the code breaks the grammar rules of the language. A semantic error means the logic is invalid even though the statement is well formed.
• Qn. What is the difference between a character array and an integer array?
• Ans. A character array stores a sequence of characters, whereas an integer array stores a sequence of integers.
Defining in the second language
(a) Term (b) class (c) defining characteristics
(a) A character constant is (b) either a single alphabet, a single digit or a single special symbol (c) enclosed within single inverted commas.
(a) A variable in C is (b) a quantity which may vary (c) during programme execution.
(a) Keywords are (b) the words whose meaning has already been explained (c) to the C compiler.

Conclusion
• Writing skills of the learners would be enhanced with the appropriate use of technical vocabulary.
• Teaching of vocabulary becomes meaningful and enhances learners' academic writing.
• The learners would be able to produce better answers using words from the corpus, which meets the goal from the examination point of view.
• Enhancement of learner autonomy.
• Learners become confident in their discourse with the professional community.