Curriculum vitae
Toldova Svetlana
February 20, 2015
Email: toldova@yandex.ru, stoldova@hse.ru

Personal Information
Date and place of birth: March 25, 1963, Moscow, Russia
Citizenship: Russian Federation
Affiliation: National Research University “Higher School of Economics”
Address: 21/4 Staraya Basmannaya Ulitsa, Moscow, Russia
Phone: +7-916-968-25-34

Education
May 1994: Candidate of Sciences* (PhD) in Linguistics, Lomonosov Moscow State University
June 1990: M.A. in Applied Mathematics, Moscow State University
June 1985: M.A. in Linguistics, Moscow State University

Theses
1985, M.A. thesis: “Nominacija odnogo i togo zche ob’ekta v svyaznom tekste” (Ways of referent nomination choice in coherent texts). 120 pp.
1988, Ph.D. dissertation: “Fokus vnimaniya i ierarchija discursa kak vazchnyje faktory vybora nominacii ob’ekta v tekste” (Focusing and discourse structure as important factors of reference choice in text). Department of Structural and Applied Linguistics, Faculty of Philology, Moscow State University. 180 pp.

Research interests
Theoretical linguistics (syntax, anaphora, discourse structure, Daghestanian languages, Finno-Ugric languages, differential object marking, reported speech), natural language processing (semantic-oriented analysis, knowledge representation of the text), corpora, Tungusic languages, language documentation.
Employment
1985-1987 Engineer, Research Center for the Study of Surface Properties and Vacuum (linguistic support in information system development)
1987-2007 Research Fellow, Department of Theoretical and Applied Linguistics, Faculty of Philology, Moscow State University
2007-2013 Associate Professor, State Academic University for the Humanities, Institute of Linguistics
2013 Senior Fellow, Institute of World Cultures, Lomonosov State University
2013-present Associate Professor, School of Linguistics, Faculty of Humanities, National Research University “Higher School of Economics”

Teaching
2000-2007 Associate Professor, State Academic University for the Humanities, Institute of Linguistics
2007-2014 Research Fellow, Associate Professor, Department of Theoretical and Applied Linguistics, Faculty of Philology, Moscow State University
2013 Associate Professor, State Academic University for the Humanities, Institute of Linguistics
2012-2013 Associate Professor, School of Linguistics, Faculty of Humanities, National Research University “Higher School of Economics”

Research positions
1983-1985 Research Assistant, Department of Theoretical and Applied Linguistics, Faculty of Philology, Moscow State University
1991-2002 Engineer, Russian Research Institute for Artificial Intelligence
2007-2010 Linguist, Medialogia (NLP consultant)
2010-2011 Leading Research Fellow, News360
2011-present Leading Research Fellow, Center for Semantic Technologies, National Research University “Higher School of Economics”

Teaching experience
School of Linguistics, Faculty of Humanities, National Research University “Higher School of Economics”
Computational Linguistics Natural Text Processing Tools for Linguistic Research Discourse

Lomonosov Moscow State University
Natural Language Processing. Introduction (2008-present)
Corpus Linguistics (2012-present)
Old Church Slavonic (2000-2014, seminars)
Quantitative linguistics (2000-2014, statistical methods in linguistics)
Verb agreement (typological approach) (autumn 2002 – spring 2003)
Typology of agreement (spring 2003)
Typology of complement clause (between clause and discourse) (autumn 2001 – spring 2002)
Typology of Anaphora
Introduction to fieldwork on the corresponding language (Mari, Komi, Udmurt, Khanty, Erzya, Moksha; some with Kalinina E.Ju.; 2000-present)
Typology of complement clause (with Serdobolskaya N. V., autumn 2005 – spring 2006)
Natural Language Understanding Systems (1998)
Ru-Eval: Evaluation of syntactic parsers for Russian (2011-2012)
Ru-Eval: Evaluation of Anaphora and Coreference resolution for Russian (2013-2014)
Finno-Ugric Languages (research seminar, 2013)

Russian State University for the Humanities, Institute of Linguistics
Corpus linguistics (2001-2013)
Introduction to linguistics (autumn 2003)
Syntax (seminars) (autumn 2005 - 2011)
Introduction to Natural Language Processing (2012-2013)

Computational and Corpus Linguistics Projects
1987-present: participation in the following research projects carried out by the Department of Theoretical and Applied Linguistics (Lomonosov Moscow State University), supported since 1994 by grants from the Russian Foundation for Humanities (Rossiiski Humanitarnii Nauchnii Fond), the Russian Fund for Fundamental Research (Rossijskij Fond Fundamentalnyh Issledovanij), and the Fund of the Russian Academy of Sciences.
1985-1989 “Modeling Dialogue in Telephone Number Information System” (Lomonosov State University together with the Russian Research Institute for Artificial Intelligence)
1993-1997 “AURA”: modeling natural text understanding for texts in a restricted domain
1992-1993 “Elaboration of a factual question-answering system: natural language question analysis”
1990-1991 “Linguistic aspects of text generation”
1993-1995 “Natural Language Understanding Modeling: linguistic aspects”
2004 Semantic annotation in the Russian National Corpus
2006 Morphological disambiguation for Russian (HMM tagger for Russian, mismatch analysis)
2007 Verb sense disambiguation based on verb patterns
2010 Ru-eval: Evaluation of Morphological taggers for Russian
2011-2012 Ru-eval: Evaluation of Dependency Parsers for Russian
2013-2014 Ru-eval: Evaluation of Anaphora and Coreference Resolution for Russian
2011 Research in syntax annotation and a syntax query interface for the Russian National Corpus
2012-2014 Treebank for Russian: manual annotation for basic syntactic relations
2015-2017 Platform for the evaluation of NLP systems for anaphora and coreference resolution and information extraction

Some outcomes
Pattern-based Generation Module for Dialogue in Telephone Number Information System. A prototype.
Text Understanding in Texts for Ultrasound Investigation. A prototype.
ALEX: a System for Multi-purpose Automatized Text Processing
Russian Treebank with parallel annotation: http://otipl.philol.msu.ru/~soiza/testsynt/ (system results for parsers evaluation for Russian)
RTB: Russian Treebank (system results for parsers evaluation for Russian with manual corrections for Subject and Object relations)
RuCor: Russian Coreference Corpus, manually annotated for coreference (http://ant1.maimbava.net/res03/ant1.php?b=big)
Pattern-based Core for Named Entities Recognition System for English (News-360)
Topic tagging system (News-360)
Reported speech detection for Russian (Medialogia)
Dictionary-based Subjectivity and Opinion Mining for Russian

Fieldwork and language documentation
Since 1987 I have participated in research projects carried out by the Department of Theoretical and Applied Linguistics (Lomonosov Moscow State University); since 1994 these projects have been supported by grants from the Russian Foundation for Humanities (Rossiiski Humanitarnii Nauchnii Fond). I have also taken part in projects carried out by the Center of Typology (Russian State University for the Humanities) on the documentation and description of endangered languages of Russia, and in two research projects on language documentation supported by SOAS:

1. Caucasian Languages
1987 Svan (Kartvelian)
1990-1996 Daghestanian languages (Dargva (Megeb, Icari), Godoberi, Bagvalal, Tsakhur)
2003-2006 Adyghe (Northwest Caucasian)

2. Finno-Ugric Languages (grants, 2000-2015)
2000-2001, 2004 Mari (Eastern)
2002 Komi (Pechora dialect)
2003-2005 Udmurt (Besermen)
2006-2007 Erzya (Mordvin)
2008-2009 Komi-Zyryan
2010-2012 Khanty
2013-2014 Moksha (Mordvin)

3. Turkic Languages
2001-2002 Khakas
2014 Bashkir

4. Tungusic Languages
Winter 2005, August 2005: Nanaj, Oroch, Negidal, Ulcha (grant of the Hans Rausing Endangered Languages Project, GB)
August 2006, 2007: Kur-Urmi dialect of Nanaj (supported by a grant of the Russian Foundation for Humanities)
2008-2011: Ulcha, Kur-Urmi (grant of the Hans Rausing Endangered Languages Project, GB)
2007-2008: Uilta (Orok) (supported by a grant of the Russian Foundation for Humanities)
2011-2012: Uilta (Orok) (supported by the Foundation for Fundamental Linguistic Research)

Publications
1. 1996. Gizatullina Ju., Toldova S. Pronouns. In A.E. Kibrik (ed.) Godoberi. Lincom Europa Studies in Caucasian linguistics, Muenchen-Newcastle.
2. 1996. Toldova S. Reflexivization. In A.E. Kibrik (ed.) Godoberi. Lincom Europa Studies in Caucasian linguistics, Muenchen-Newcastle.
3. 1996. Toldova S.Ju. Situacija konkurencii otnositel’no vybora nominacii objekta v tekste [Referential ambiguity and the NP choice in text]. In Proceedings of Dialogue’96 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow.
4. 1997. Toldova S.Ju. Opyt postrojenija systemy avtomaticheskogo analiza tekstov v ogranichennoj predmetnoj oblasti (na materiale tekstov ul’trazvukovyh issledovanij) [The experimental system of automatic text understanding in restricted domain]. In Proceedings of Dialogue’97 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow.
5. 1998. Sokolova E. G., Sosenskaya T. B., Toldova S. Ju., Fedorova O.V., Sharov S.A. K modelirovaniju jazykovoj deyatelnosti v spravochno-informacionnyh sistemah [Natural Language Communication Modeling in Informational Systems]. In Rakhilina E. V., Testelets Ja. G. (eds.) Sbornik statej k 60-letiju A.E.Kibrika, Izdatelstvo MGU, 1998.
6. 1998. Testelets Ja., Toldova S.Ju. Refleksivy v Dagestanskih jazykah i tipologija refleksiva [Reflexives in Daghestanian languages and the typology of reflexives]. In Voprosy Yazykoznanija. 1998, no. 4, 35-57.
7. 1998. Toldova S.Ju. Kommunikativnyje harakteristiki razlichnyh tipov kontextov i vybor anaforicheskih sredstv povtornoj nominacii [Informational characteristics of context and anaphoric choice in discourse]. In Proceedings of Dialogue’96 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow.
8. 1999. Kalinina E., Toldova S. Attributivizacija [Attributivization]. In A.E. Kibrik (ed.) Studies in Tsakhur: a typological perspective, Moscow.
9. 1999. Toldova S. Mestoimennaja referencija v konstrukcijah s kosvennoj rechju: mezhdu dejksisom i anaforoj [Pronominal reference in reported speech constructions: between deixis and anaphora]. In Proceedings of Dialogue’99 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow.
10. 1999. Toldova S. Mestoimennye sredstva podderzhanija referencii [Pronouns as the means of reference maintaining in discourse]. In A. Kibrik (ed.) Studies in Tsakhur: a typological perspective, Moscow.
11. 1999. Toldova S. Ju. Pattern Response Generation as a Part of the Model of the Dialogue with the Telephone Query System. In Proceedings of SPECOM’99 international workshop “Speech and Computer”, Moscow, 4-7 October 1999.
12. 2000. Toldova S. O kognitivnom podxode k nekotorym problemam referencii [Cognitive approach to the reference maintaining in discourse]. In Proceedings of Dialogue’00 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow.
13. 2002. Sistema ALEX kak sredstvo dlya mnogocelevoj obrabotki teksta [“Alex”: a system for natural language processing]. In Proceedings of Dialogue’05 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow, 2002.
14. 2002. Toldova S., Serdobolskaya N.V. Namerenija govorjaschego i referencialjnyje svojstva imennyh grupp [The Speaker’s Intentions and the Referential Properties of Noun Phrases]. In Proceedings of Dialogue’02 – International Workshop on Computational Linguistics and Intellectual Technologies. Moscow: Nauka, pp. 508–523.
15. 2002. Toldova S., Serdobolskaya N.V. Nekotoryje osobennosti oformlenija prjamogo dopolnenija v marijskom jazyke [Some Peculiarities of Direct Object Encoding in Mari]. In Lingvisticheskij bespredel, Moscow: Moscow Lomonosov University, pp. 106-125.
16. 2005. Toldova S., Serdobolskaya N.V. Ocenochnyje predicaty: tip ocenki i sintaksis konstrukcii [Evaluative predicates: semantics and syntax]. In Proceedings of Dialogue’05 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani. Moscow.
17. 2005. Toldova S., Serdobolskaya N.V. Rascheplennoje oformlenije pryamogo dopolnenija i ego diskursivnyj ves [Discourse characteristics of direct object and split direct object encoding]. In Fourth Winter Typological School (Fourth International School in Linguistic Typology and Anthropology). Armenia, Erevan, September 2005. Moscow: RSUH, pp. 323–326.
18. 2005. Toldova S., Sokirko A.V. Sravnenije effektivnosti dvuh metodik snyatija leksicheskoj i morphologicheskoj neodnoznachnosti dlya russkogo jazyka (Skrytaya model’ Markova i sintaksicheskij analizator imennyh grupp) [A comparison of two methods for morphological ambiguity resolution in Russian]. http://company.yandex.ru/grant/2005/01_Sokirko_92802.pdf
19. 2007. Kobritsov, Boris P., Olga N. Lashevskaja, and Svetlana Ju. Toldova. Snjatije semanticheskoj mnogoznachnosti s ispol'zovanijem modelej upravlenija, izvlechennykh iz elektronnykh tolkovykh slovarej [Word-sense disambiguation with the help of government patterns retrieved from electronic dictionaries]. Electronic publication. Mode of access: http://download.yandex.ru/IMAT2007/kobricov.pdf.
20. 2008. Toldova, Svetlana Ju., Galina I. Kustova, and Olga N. Lashevskaja. Semanticheskie fil'try dlja razreshenija mnogoznachnosti v Nacional'nom korpuse russkogo jazyka: glagoly [Semantic filters for word sense disambiguation in the Russian National Corpus: verbs]. In: Computational linguistics and intellectual technologies. Proceedings of International Workshop Dialogue'2008. Vol. 7 (14). Moscow: RGGU, 2008. Pp. 522-529.
21. 2009. Brykina M. M., Toldova S. Ju. Dokumentacija uiltinskogo jazyka: chto predlagaet XXI vek? [Uilta language documentation: what does the XXI century offer?]. In Roon T. (ed.) Kulturnoje nasledije narodov Dalnego Vostoka Rossii. Sakhalinskaya Oblast’. Uilta. Evenki. Juzchno-Sakhalinsk: Sakhalinskij gos. obl. krajevedcheskij muzej, 2009.
22. 2009. Kustova G., Toldova S. RNC: Semantic filters for the verb disambiguation [‘NKRJA: semanticheskije filtry dlja razreshenija mnogoznachnosti glagolov’]. In Russian national corpus: 2006–2008. New results and perspectives [‘Natsionalnyi korpus russkogo jazyka: 2006–2008. Novye rezultaty i perspektivy’]. Saint Petersburg: Nestor-Istorija.
23. 2010. Lyashevskaya, Olga, Irina Astaf'eva, Anastasia Bonch-Osmolovskaya, Anastasia Garejshina, Julia Grishina, Vadim D'jachkov, Maxim Ionov, Anna Koroleva, Maxim Kudrinsky, Anna Lityagina, Elena Luchina, Eugenia Sidorova, Svetlana Toldova, Svetlana Savchuk, and Sergej Koval'. Ocenka metodov avtomaticheskogo analiza teksta: morfologicheskije parsery russkogo jazyka [NLP evaluation: Russian morphological parsers]. In: Computational linguistics and intellectual technologies. Proceedings of International Workshop Dialogue'2010. Vol. 9 (16), 2010. Pp. 318-326.
24. 2012. Anastasia Gareyshina, Maxim Ionov, Olga Lyashevskaya, Dmitry Privoznov, Elena Sokolova, Svetlana Toldova. RU-EVAL-2012: Evaluating dependency parsers for Russian. In Proceedings of COLING 2012: Posters. Pp. 349–360. URL: http://aclweb.org/anthology/C/C12/C12-2035.pdf. December 9, 2012, IIT Bombay, Mumbai, India.
25. 2012. Serdobolskaya N., Toldova S. Differencirovannoe markirovanie prjamogo dopolnenija v finno-ugorskix jazykax [Differential object marking in Finno-Ugric languages]. In Finno-ugorskie jazyki: fragmenty grammaticheskogo opisanija. Formalnyj i funkcionalnyj podxody. Moscow: Jazyki slavjanskix kultur.
26. 2012. Bonch-Osmolovskaya A.A., Toldova S. Ju., Klincov V. P. Strategii introduktivnoj nominacii v tekstah SMI [Referent introduction into discourse in news reports]. Elektronnoje nauchnoje izdanije “Aktualnyje innovacionnyje issledovanija: nauka i praktika”. 2012. No. 4. URL: http://actualresearch.ru/nn/2012_3/Article/philology/bonchosmolovskaja20123.htm.
27. 2012. Toldova S.Ju., Sokolova E.G., Astaf'eva I., Gareyshina A., Koroleva A., Privoznov D., Sidorova E., Tupikina L., Lyashevskaya O. N. Ocenka metodov avtomaticheskogo analiza teksta 2011-2012: Sintaksicheskie parsery russkogo jazyka [NLP evaluation 2011-2012: Russian syntactic parsers]. In Computational linguistics and intellectual technologies. Proceedings of International Workshop Dialogue'2012. Vol. 11 (18), 2012. Moscow: RGGU. Pp. 797-809.
28. 2013. Akinina Y. S., Kuznetsov I. O., Toldova S.Ju. Sravnenije dvuh metodov avtomaticheskogo izvlechenija uchastnikov sobytij iz nestrukturirovannyh istochnikov [A comparison of two methods for extracting event participants]. In Nauchno-technicheskaya informacija. 2. Informacionnyje processy i sistemy. 2013. No. 6. Pp. 23-34.
29. 2013. Akinina Y. S., Kuznetsov I. O., Toldova S.Ju. The impact of syntactic structure on verb-noun collocation extraction. In Proceedings of Dialogue’13 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, v. 1, pp. 2–17. Moscow. http://www.dialog-21.ru/digests/dialog2013/materials/pdf/AkininaJS.pdf.
30. 2013. Akinina Yu. S., A. A. Bonch-Osmolovskaya, I. O. Kuznetsov, V. P. Klintsov, S. Yu. Toldova. Rol obschej i specificheskoj leksiki pri izvlechenii informacii iz teksta na primere analiza sobytija “Vvod novyh tehnologij” [The role of general and specific vocabulary in fact extraction: the case of innovation-event]. In Vestnik NGU: serija Informacionnyje tehnologii. Vol. 10, No. 4. Pp. 74-80. http://lib.nsu.ru:8080/xmlui/handle/nsu/257
31. 2013. Brykina M. M., Faynveyts A. V., Toldova S.Ju. Dictionary-based ambiguity resolution in Russian named entities recognition. A case study. In Proceedings of Dialogue’13 – International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani, v. 1. http://www.dialog-21.ru/digests/dialog2013/materials/pdf/brykinamm.pdf.
32. 2014. Serdobolskaya N.V., Toldova S. Glagol rechi manaš v marijskom jazyke: osobennosti grammatikalizacii [The speech verb manaš in Mari: grammaticalization]. Voprosy jazykoznanija. 2014. No. 6. Pp. 66-91. (1.5 printed sheets)
33. 2014. Toldova S., Serdobolskaya N.V. Leksicheskije svojstva glagola i oformljenije prjamogo dopolnjenija v komi jazyke [Lexical Semantics of Transitive Verbs and Direct Object Encoding in Komi]. In A.E.Kibrik et al. (eds). Lingvisticheskij bespredel-2. Collected works for the 80th jubilee of A.I.Kuznecova. Moscow: Moscow Lomonosov University. 2014. Pp. 164-175.
34. 2014. Serdobolskaya N. V., Toldova S. Ju. Konstrukcii s ocenochnymi predikativami: uchastniki situacii ocenki i semantika ocenochnogo predicata [Evaluative predicate constructions in Russian: event participants and predicate meaning]. Acta Linguistica Petropolitana. Trudy linguisticheskih issledovanij. 2014. Vol. 10, No. 2. Pp. 443-478.
35. 2014. Toldova S.Ju., Roytberg A., Nedoluzhko A., Kurzukov M., Ladygina A., Vasilyeva M., Azerkovich I., Grishina Y., Sim G., Ivanova A., Gorshkov D. Evaluating Anaphora and Coreference Resolution for Russian. In Proceedings of Dialogue’13 – International Workshop on Computational Linguistics and its Applications, ed. V. Selegej, v. 2. 2014. Pp. 681-695.
36. 2014. Toldova, Svetlana and Olga Lyashevskaya. Sovremennye problemy i tendencii kompjuternoj lingvistiki: COLING 2012 [State-of-the-art in computational linguistics: COLING 2012]. Voprosy Jazykoznanija, No. 1, 2014. Pp. 120-145.