Preposition Errors in ESL Writings
Mohammad Moradi
KOWSAR INSTITUTE

Introduction
• Increasing need for tools for instruction in English as a Second Language (ESL)
• Preposition usage is one of the most difficult aspects of English for non-native speakers
  – [Dalgish ’85]: 18% of sentences from ESL essays contain a preposition error
  – [Joel Tetreault]: 8-10% of all prepositions in TOEFL essays are used incorrectly

Dr. M. Moradi Kowsar Corp.

ESL Testing Corpus
• Collection of randomly selected TOEFL essays by native speakers of Chinese, Japanese and Russian
• 8192 prepositions total (5585 sentences)
• Error annotation reliability between two human raters:
  – Agreement = 0.926
  – Kappa = 0.599

Method                                                            Performance
[Eeg-Olofsson et al. ’03] Handcrafted rules for Swedish learners  11/40 prepositions correct
[Izumi et al. ’03, ’04] ME model to classify 13 error types       25% precision, 7% recall
[Lee & Seneff ’06] Stochastic model on restricted domain          80% precision, 77% recall
[De Felice & Pulman ’08] Maxent model (9 prep’s)                  ~57% precision, ~11% recall
[Gamon et al. ’08] LM + decision trees (12 prep’s)                80% precision

Why are prepositions hard to master?
• Prepositions perform many complex roles
  – Preposition choice in an adjunct is constrained by its object (“on Friday”, “at noon”)
  – Prepositions are used to mark the arguments of a predicate (“fond of beer”)
  – Phrasal verbs (“give in to their demands”)
    • “give in” = “acquiesce, surrender”
  – Multiple prepositions can appear in the same context
    • “…the force of gravity causes the sap to move _____ the underside of the stem.”

Features
• Prepositions are influenced by:
  – Words in the local context, and how they interact with each other (lexical)
  – Syntactic structure of context
  – Semantic interpretation

What to do?
• Develop a training set of error-annotated ESL essays (millions of examples?):
  – Too labor intensive to be practical
• Alternative:
  – Train on millions of examples of proper usage
    • Determine how “close to correct” the writer’s preposition is

Books
• The most reliable and trusted way is to use grammar books and dictionaries such as:
  • Practical English Usage
  • A-Z of Correct English
  • Common Mistakes in English
  • Oxford Advanced Learner's Dictionary
  • Webster's

Google-Ngram Features
• Typical way that non-native speakers check if usage is correct:
  – “Google” the phrase and alternatives
• Created a fast-access Oracle database from the POS-tagged Google N-gram corpus
• Queries provided frequency data for the +Combo features

Google Features
• Adding Google features had minimal impact
• Using solely Google features (or counts) as a classifier: ~45% accuracy on native text
• Disclaimer: very naïve implementation

MEDLINE
• Use search tags to check whether a phrase has been used in the more than 20 million articles indexed in MEDLINE
• But the results may contain mistakes that come from non-native journals
• Better to use advanced search to restrict the search language
• Ex: Regarding to[tw]

Here are some of the more common prepositions:
A aboard about above across after against along alongside amid among amongst around as aside astride at atop
B barring before behind below beneath beside besides between beyond but by
C circa concerning considering
D despite down during
E except excepting excluding
F failing following for from
I in including inside into
L like
M minus
N near nearby next notwithstanding
O of off on onto opposite outside over
P past per plus
R regarding round
S save since
T than through throughout till times to toward towards
U under underneath unlike until unto up upon
V versus via
W with within without worth

Here are some compound prepositions:
according to      by way of        instead of
ahead of          in addition to   on account of
apart from        in front of      in place of
with respect to   because of       in spite of
by means of       up on            aside from
prior to

Common Preposition Confusions
Writer’s Prep   Rater’s Prep   Frequency
to              null           9.5%
of              null           7.3%
in              at             7.1%
to              for            4.6%
in              null           3.2%
of              for            3.1%
in              on             3.1%

By and with
By is used to refer to the doer of an action; with is used to refer to the instrument with which the action is done.
Ex: It was done by Peterson in 1998.
Ex: It was done with the ELISA method.
By is also used to show the latest time at which an action will be finished, so it is usually used with the future tenses.
I shall be leaving by 6 o'clock.
I hope to finish the work by the end of this year.

On, in, at and by (time)
When speaking about time, at indicates an exact point of time, on a more general point of time, and in a period of time.
I shall be there at 4 pm.
We set out at dawn.
I was born on May 26.
The postman brought this letter in the morning.
I shall visit them in summer.
It is very hot in the day and quite cold at night. (Note that 'at night' is an exception to this rule.)

Beside and besides
Beside means ‘by the side of’ and besides means ‘in addition to’.
The house was beside the river. (= by the side of the river)
He stood beside me. (= by my side)
He plays tennis besides (in addition to) basketball and football.
Besides (in addition to) being a good speaker, he is also an excellent actor.

Examples
(In each example the correct preposition is given in parentheses, followed by the writer's original choice; null means no preposition.)
aim at, not on
The estimated time (for) of development
alpha-fetoprotein is still recommended to complement ultrasound (for) in the surveillance of HCC.
A total of 1132 female sand flies (from) due to two species….
are attributed (to) with approximately 15%...
The authors would like to thank (null) from…
these tests must depend (on) to careful evaluation
In order (to) for detecting…
Apart (from) of their affinity …
It accounts (for) null more than 10% of …..
in most instances leads (to) null the increase of infection and being genetically prone (to) for alpha-1 antitrypsin deficiency
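
Sketch: checking an example by counts
The examples above can be checked the way the Google-Ngram and MEDLINE slides describe: look up the phrase with each alternative preposition and compare how often each one occurs. The following is only a minimal sketch of that idea, not the system described in the slides. The table `TOY_COUNTS` and the helper names `build_phrase`, `rank_candidates` and `suggest` are hypothetical, and the counts are made-up placeholders, not real Google N-gram or MEDLINE frequencies.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase counts: placeholder numbers for illustration only,
# NOT real Google N-gram or MEDLINE frequencies.
TOY_COUNTS: Dict[str, int] = {
    "depend on careful": 120,
    "depend to careful": 1,
    "accounts for more": 300,
    "accounts more": 4,
}


def build_phrase(left: str, prep: str, right: str) -> str:
    """Join the context words around a candidate preposition.

    The candidate "null" stands for "no preposition", as in the slides.
    """
    words = [left] if prep == "null" else [left, prep]
    words.append(right)
    return " ".join(words)


def rank_candidates(
    left: str, right: str, candidates: List[str], counts: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Return (preposition, count) pairs, most frequent phrase first."""
    scored = [(p, counts.get(build_phrase(left, p, right), 0)) for p in candidates]
    # sorted() is stable, so candidates with equal counts keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def suggest(
    left: str,
    right: str,
    writer_prep: str,
    alternatives: List[str],
    counts: Dict[str, int],
) -> str:
    """Suggest the most frequent candidate, or keep the writer's choice
    when the count table has no evidence for any candidate."""
    # The writer's preposition goes first so that it wins ties.
    candidates = [writer_prep] + [a for a in alternatives if a != writer_prep]
    best, best_count = rank_candidates(left, right, candidates, counts)[0]
    return writer_prep if best_count == 0 else best


if __name__ == "__main__":
    # "these tests must depend (on) to careful evaluation"
    print(suggest("depend", "careful", "to", ["on", "of"], TOY_COUNTS))
    # "It accounts (for) null more than 10% of ..."
    print(suggest("accounts", "more", "null", ["for"], TOY_COUNTS))
```

Raw counts favour whatever is frequent, not whatever is correct, which fits the slides' remark that a purely count-based classifier was a very naïve implementation; a suggestion from such a lookup is a prompt to consult a grammar book, not a verdict.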