Toldova Svetlana

Curriculum vitae
Toldova Svetlana
February 20, 2015
Email: toldova@yandex.ru
stoldova@hse.ru
Personal Information
Date and place of birth:
March 25, 1963, Moscow, Russia.
Citizenship: Russian Federation
Affiliation: National Research University “Higher School of Economics”
Address:
Moscow, 21/4 Staraya Basmannaya Ulitsa, Russia
+7-916-968-25-34
Education
May 1994: Candidate of Sciences* (PhD) in Linguistics, Moscow State Lomonosov University
June 1990: M.A. in Applied Mathematics, Moscow State University.
June 1985: M.A. in Linguistics, Moscow State University.
Theses:
1985, M.A. thesis: “Nominacija odnogo i togo zhe ob’ekta v svyaznom tekste” (Ways of
referent nomination choice in coherent texts). 120 pp.
1988, Ph.D. dissertation: “Fokus vnimaniya i ierarchija discursa kak vazchnyje factory vybora
nominacii ob’ekta v tekste” (Focusing and discourse structure as important factors of reference
choice in text). Department of Structural and Applied Linguistics, Faculty of Philology,
Moscow State University. 180 pp.
Research interests
Theoretical linguistics (syntax, anaphora, discourse structure, Daghestanian languages, Finno-Ugric
languages, differential object marking, reported speech), natural language processing (semantic-oriented
analysis, knowledge representation of the text), corpora, Tungusic languages, language documentation.
Employment
1985-1987
Engineer, Research center for the study of surface properties and vacuum (linguistic
support in Information System development)
1987 – 2007
Research fellow, Department of theoretical and applied linguistics, Faculty of Philology,
Moscow State University
2007 – 2013
Associate Professor, State Academic University for the Humanities, Institute of
Linguistics
2013
Senior Fellow, Institute of World Cultures, Lomonosov State University
2013 –
Associate Professor, School of Linguistics, Faculty of Humanities, National Research
University “Higher School of Economics”
Teaching
2000 – 2007
Associate Professor, State Academic University for the Humanities, Institute of
Linguistics;
2007 – 2014
Research Fellow, Associate Professor, Department of theoretical and applied linguistics,
Faculty of Philology, Moscow State University;
2013
Associate Professor, State Academic University for the Humanities, Institute of
Linguistics;
2012 – 2013
Associate Professor, School of Linguistics, Faculty of Humanities, National Research
University “Higher School of Economics”
Research positions
1983 – 1985
Research Assistant, Department of theoretical and applied linguistics, Faculty of
Philology, Moscow State University;
1991 – 2002
Engineer, Russian Research Institute for Artificial Intelligence;
2007 – 2010
Linguist, Medialogia (NLP consultant);
2010 – 2011
Leading Research Fellow, News360;
2011 –
Leading Research Fellow, Center for Semantic Technologies, National Research
University “Higher School of Economics”
Teaching experience
School of Linguistics, Faculty of Humanities, National Research University “Higher School of
Economics”
Computational Linguistics
Natural Text Processing
Tools for Linguistic Research
Discourse
Lomonosov Moscow State University
Natural Language Processing. Introduction (2008 – present)
Corpus Linguistics (2012-present)
Old Church Slavonic (2000-2014, seminars)
Quantitative linguistics (2000-2014, statistical methods in linguistics)
Verb agreement (typological approach) (autumn 2002 – spring 2003)
Typology of agreement (spring 2003)
Typology of complement clause (between clause and discourse) (autumn 2001 – spring 2002)
Typology of Anaphora
Introduction to fieldwork on the respective language (Mari, Komi, Udmurt, Khanty, Erzya,
Moksha (some with Kalinina E.Ju.), 2000 – present)
Typology of complement clause (with Serdobolskaya N. V., autumn 2005 – spring 2006)
Natural Language Understanding Systems (1998)
Ru-Eval: Evaluation of syntactic parsers for Russian (2011-2012)
Ru-Eval: Evaluation of Anaphora and Coreference resolution for Russian (2013-2014)
Finno-Ugric Languages (research seminar, 2013)
Russian State University for the Humanities, Institute of Linguistics
Corpus linguistics (2001-2013)
Introduction to linguistics (autumn 2003)
Syntax (seminars) (autumn 2005 - 2011)
Introduction to Natural Language Processing (2012-2013)
Computational and Corpus Linguistics Projects
1987 – present: participation in the following research projects carried out by the Department of
Theoretical and Applied Linguistics (Lomonosov Moscow State University), supported since 1994 by
grants from the Russian Foundation for Humanities (Rossiiski Humanitarnii Nauchnii Fond), the Russian
Fund for Fundamental Research (Rossijskij Fond Fundamentalnyh Issledovanij), and the Fund of the Russian
Academy of Sciences.
1985-1989
“Modeling Dialogue in telephone number information system” (Lomonosov State
University together with Russian Research Institute for Artificial Intelligence);
1993-1997
«AURA» – modeling natural text understanding for texts in restricted domain;
1992-1993
«Elaboration of factual Question-Answering system: natural language questions
analysis»;
1990-1991
«Linguistic aspects of text generation»;
1993-1995
«Natural Language Understanding Modeling: linguistic aspects».
2004
Semantic annotation in Russian National Corpus
2006
Morphological disambiguation for Russian (HMM tagger for Russian, mismatches
analysis)
2007
Verb sense disambiguation based on Verb patterns
2010
Ru-eval: Evaluation of Morphological taggers for Russian
2011-2012
Ru-eval: Evaluation of Dependency Parsers for Russian
2013-2014
Ru-eval: Evaluation of Anaphora and Coreference Resolution for Russian
2011
Research in Syntax annotation and Syntax query interface for the Russian National
Corpus
2012-2014
Treebank for Russian: manual annotation for basic syntactic relations
2015-2017
Platform for NLP systems for anaphora and coreference resolution and information
extraction evaluation
Some outcomes
Pattern-based Generation Module for Dialogue in Telephone Number Information System. A
prototype;
Text Understanding in Texts for Ultrasound Investigation. A prototype;
ALEX – a System for Multi-purpose Automatized Text Processing;
Russian Treebank with parallel annotation: http://otipl.philol.msu.ru/~soiza/testsynt/ (system
results for parsers evaluation for Russian)
RTB: Russian Treebank (system results for parsers evaluation for Russian with manual
corrections for Subject and Object relations)
RuCor – Russian Coreference Corpus – manually annotated for coreference
(http://ant1.maimbava.net/res03/ant1.php?b=big)
Pattern-based Core for Named Entities Recognition System for English – News-360
Topic tagging system – News-360
Reported speech detection for Russian (Medialogia)
Dictionary-based Subjectivity and Opinion Mining for Russian
Fieldwork and language documentation
Since 1987 I have participated in research projects carried out by the Department of
Theoretical and Applied Linguistics (Lomonosov Moscow State University); since 1994 these projects
have been supported by grants from the Russian Foundation for Humanities (Rossiiski Humanitarnii
Nauchnii Fond). I have also taken part in projects carried out by the Center of Typology (Russian State
University for the Humanities) on the documentation and description of endangered languages of Russia,
and in two language documentation research projects supported by SOAS:
1. Caucasian Languages
1987
Svan language (Kartvelian)
1990 -1996
Daghestanian languages (Dargva (Megeb, Icari), Godoberi, Bagvalal,
Tsakhur)
2003-2006
Adyghe (Northwest Caucasian)
2. Finno-Ugric Languages (grants, 2000-2015):
2000-2001, 2004
Mari (Eastern)
2002
Komi (Pechora dialect)
2003-2005
Udmurt (Besermen)
2006-2007
Erzya (Mordvin)
2008-2009
Komi-Zyrian
2010-2012
Khanty
2013-2014
Moksha (Mordvin)
3. Turkic Languages
2001-2002
Khakas
2014
Bashkir
4. Tungusic Languages
Winter 2005, August 2005 – Nanaj, Oroch, Negidal and Ulcha languages (grant of the Hans Rausing
Endangered Languages Project, GB)
August 2006, 2007 – Kur-Urmi dialect of Nanaj (supported by a grant of the Russian Foundation for
Humanities)
2008-2011
Ulcha, Kur-Urmi (grant of the Hans Rausing Endangered Languages Project, GB)
2007-2008
Uilta (Orok) (supported by a grant of the Russian Foundation for Humanities)
2011-2012
Uilta (Orok) (supported by the Foundation for Fundamental Linguistic Research)
Publications
1. 1996. Gizatullina Ju., Toldova S. Pronouns. In A.E. Kibrik (ed.) Godoberi. Lincom Europa
Studies in Caucasian linguistics, Muenchen-Newcastle.
2. 1996. Toldova S. Reflexivization. In A.E. Kibrik (ed.) Godoberi. Lincom Europa Studies in
Caucasian linguistics, Muenchen-Newcastle.
3. 1996. Toldova S.Ju. Situacija konkurencii otnositel’no vybora nominacii objekta v tekste
[Referential ambiguity and the NP choice in text]. In Proceedings of Dialogue’96 –
International Workshop on Computational Linguistics and its Applications, ed. A. Narinjani,
Moscow.
4. 1997. Toldova S.Ju. Opyt postrojenija systemy avtomaticheskogo analiza tekstov v
ogranichennoj predmetnoj oblasti (na materiale tekstov ul’trazvukovyh issledovanij) [The
experimental system of automatic text understanding in restricted domain]. In Proceedings of
Dialogue’97 – International Workshop on Computational Linguistics and its Applications, ed.
A. Narinjani, Moscow.
5. 1998. Sokolova E. G., Sosenskaya T. B., Toldova S. Ju., Fedorova O.V., Sharov S.A. K
modelirovaniju jazykovoj deyatelnosti v spravochno-informacionnyh sistemah [Natural
Language Communication Modeling in Informational Systems] // Rakhilina E. V.,
Testelets Ja. G. (eds.) Sbornik statej k 60-letiju A.E.Kibrika, Izdatelstvo MGU, 1998.
6. 1998. Testelets Ja., Toldova S.Ju. Refleksivy v Dagestanskih jazykah i tipologija refleksiva
[Reflexives in Daghestanian languages and the Typology of reflexives]. In Voprosy
Yazykoznanija. 1998, no.4, 35-57.
7. 1998. Toldova S.Ju. Kommunikativnyje harakteristiki razlichnyh tipov kontextov i vybor
anaforicheskih sredstv povtornoj nominacii. [Informational characteristics of context and
anaphoric choice in discourse]. In Proceedings of Dialogue’96 – International Workshop on
Computational Linguistics and its Applications, ed. A. Narinjani, Moscow.
8. 1999. Kalinina E., Toldova S. Attributivizacija [Attributivization]. In A.E. Kibrik (ed.)
Studies in Tsakhur: a typological perspective, Moscow.
9. 1999. Toldova S. Mestoimennaja referencija v konstrukcijah s kosvennoj rechju: mezhdu
dejksisom i anaforoj [Pronominal reference in reported speech constructions: between deixis
and anaphora]. In Proceedings of Dialogue’99 – International Workshop on Computational
Linguistics and its Applications, ed. A. Narin’yani, Moscow.
10. 1999. Toldova S. Mestoimennye sredstva podderzhanija referencii [Pronouns as the means of
reference maintaining in discourse]. In A. Kibrik (ed.) Studies in Tsakhur: a typological
perspective, Moscow.
11. 1999. Toldova S. Ju. Pattern Response Generation as a Part of the Model of the Dialogue with
the Telephone Query System, In Proceedings of SPECOM’99 international workshop «Speech
and Computer» Moscow 4-7 October 1999.
12. 2000. Toldova S. O kognitivnom podxode k nekotorym problemam referencii [Cognitive
approach to the reference maintaining in discourse]. In Proceedings of Dialogue’00 –
International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani,
Moscow.
13. 2002. Sistema ALEX kak sredstvo dlya mnogocelevoj obrabotki teksta. [“Alex” - a system for
Natural language processing] In Proceedings of Dialogue’05 – International Workshop on
Computational Linguistics and its Applications, ed. A. Narin’yani, Moscow, 2002.
14. 2002. Toldova S., Serdobolskaya N.V. Namerenija govorjaschego i referencialjnyje svojstva
imennyh grupp. [The Speaker’s Intentions and the Referential Properties of Noun Phrases]. In
Proceedings of Dialogue’02 – International Workshop on Computational Linguistics and
Intellectual Technologies Moscow: Nauka, pp. 508–523.
15. 2002. Toldova S., Serdobolskaya N.V. Nekotoryje osobennosti oformlenija prjamogo
dopolnenija v marijskom jazyke. [Some Peculiarities of Direct Object Encoding in Mari]. In
Lingvisticheskij bespredel, Moscow: Moscow Lomonosov University, pp. 106-125.
16. 2005. Toldova S., Serdobolskaya N.V. Ocenochnyje predicaty: tip ocenki i sintaksis
konstrukcii [Evaluative predicates: semantics and syntax]. In Proceedings of Dialogue’05 –
International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani.
Moscow.
17. 2005. Toldova S., Serdobolskaya N.V. Rascheplennoje oformlenije pryamogo dopolnenija i
ego diskursivnyj ves. (Discourse characteristics of direct object and split direct object
encoding) // Fourth Winter Typological School (Fourth International School in Linguistic
Typology and Anthropology). Armenia, Erevan, September 2005. Moscow: RSUH, pp. 323–
326.
18. 2005. Toldova S., Sokirko A.V. Sravnenije effektivnosti dvuh metodik snyatija leksicheskoj i
morphologicheskoj neodnoznachnosti dlya russkogo jazyka (Skrytaya model’ Markova i
sintaksicheskij analizator imennyh grupp) [The comparison of two methods for
morphological ambiguity resolution in Russian].
http://company.yandex.ru/grant/2005/01_Sokirko_92802.pdf
19. 2007. Kobritsov, Boris P., Olga N. Lashevskaja, and Svetlana Ju. Toldova. Snjatije
semanticheskoj mnogoznachnosti s ispol'zovanijem modelej upravlenija, izvlechennykh iz
elektronnykh tolkovykh slovarej [Word-sense disambiguation with the help of government
patterns retrieved from electronic dictionaries]. Electronic publication. Mode of access:
http://download.yandex.ru/IMAT2007/kobricov.pdf.
20. 2008. Toldova, Svetlana Ju., Galina I. Kustova, and Olga N. Lashevskaja. Semanticheskie
fil'try dlja razreshenija mnogoznachnosti v Nacional'nom korpuse russkogo jazyka: glagoly
[Semantic filters for word sense disambiguation in the Russian National Corpus: verbs].
In: Computational linguistics and intellectual technologies. Proceedings of International
Workshop Dialogue'2008. Vol. 7 (14). Moscow: RGGU, 2008. Pp. 522-529.
21. 2009. Brykina M. M., Toldova S. Ju. Dokumentacija uiltinskogo jazyka: chto predlagaet XXI vek?
[Uilta language documentation: what does the 21st century offer?]. In Roon T. (ed.)
Kulturnoje nasledije narodov Dalnego Vostoka Rossii. Sakhalinskaya Oblast’. Uilta. Evenki.
Juzchno-Sakhalinsk: Sakhalinskij gos. obl. Krajevedcheskij muzej, 2009.
22. 2009. Kustova G., Toldova S. RNC: Semantic filters for the verb disambiguation [‘NKRJA:
semanticheskije filtry dlja razreshenija mnogoznachnosti glagolov’]. Russian national corpus:
2006–2008. New results and perspectives. [‘Natsionalnyi korpus russkogo jazyka: 2006–2008.
Novye rezultaty i perspektivy’]. Saint Petersburg: Nestor-Istorija.
23. 2010. Lyashevskaya, Olga, Irina Astaf'eva, Anastasia Bonch-Osmolovskaya, Anastasia
Garejshina, Julia Grishina, Vadim D'jachkov, Maxim Ionov, Anna Koroleva, Maxim
Kudrinsky, Anna Lityagina, Elena Luchina, Eugenia Sidorova, Svetlana Toldova, Svetlana
Savchuk, and Sergej Koval'. Ocenka metodov avtomaticheskogo analiza teksta:
morfologicheskije parsery russkogo jazyka [NLP evaluation: Russian morphological parsers].
In: Computational linguistics and intellectual technologies. Proceedings of International
Workshop Dialogue'2010. Vol. 9 (16), 2010. Pp. 318-326.
24. 2012. Anastasia Gareyshina, Maxim Ionov, Olga Lyashevskaya, Dmitry Privoznov, Elena Sokolova,
Svetlana Toldova. RU-EVAL-2012: Evaluating dependency parsers for Russian. In
Proceedings of COLING 2012: Posters. Pp. 349–360. December 9, 2012, IIT Bombay, Mumbai,
India. URL: http://aclweb.org/anthology/C/C12/C12-2035.pdf.
25. 2012. Serdobolskaya N., Toldova S. Differencirovannoe markirovanie pramogo dopolnenija v
finno-ugorskix jazykax [Differential object marking in Finno-Ugric languages]. In Finno-ugorskie
jazyki: fragmenty grammaticheskogo opisanija. Formalnyj i funkcionalnyj podxody.
Moscow: Jazyki slavjanskix kultur.
26. 2012. Bonch-Osmolovskaya A.A., Toldova S. Ju., Klincov V. P. Strategii introduktivnoj
nominacii v tekstah SMI [Referent introduction into discourse in news reports]. //
Elektronnoje nauchnoje izdanije “Aktualnyje innovacionnyje issledovanija: nauka i praktika”.
2012. No. 4. URL:
http://actualresearch.ru/nn/2012_3/Article/philology/bonchosmolovskaja20123.htm.
27. 2012. Toldova S.Ju., Sokolova E.G., Astaf'eva I., Gareyshina A., Koroleva A., Privoznov D.,
Sidorova E., Tupikina L., Lyashevskaya O. N. Ocenka metodov avtomaticheskogo analiza
teksta 2011-2012: Sintaksicheskie parsery russkogo jazyka [NLP evaluation 2011-2012:
Russian syntactic parsers]. In Computational linguistics and intellectual technologies.
Proceedings of International Workshop Dialogue'2012. Vol. 11 (18), 2012. Moscow: RGGU.
Pp. 797-809.
28. 2013. Akinina Y. S., Kuznetsov I. O, Toldova S.Ju. Sravnenije dvuh metodov avtomaticheskogo
izvlechenija uchastnikov sobytij iz nestrukturirovannyh istochnikov. [The comparison of two
methods for event participant extraction]. In Nauchno-technicheskaya informacija. 2.
Informacionnyje processy i sistemy. 2013. No. 6. Pp. 23-34.
29. 2013. Akinina Y. S., Kuznetsov I. O, Toldova S.Ju. The impact of syntactic structure on
verb-noun collocation extraction. In Proceedings of Dialogue’13 – International Workshop on
Computational Linguistics and its Applications, ed. A. Narin’yani, v. 1, pp. 2–17. Moscow.
http://www.dialog-21.ru/digests/dialog2013/materials/pdf/AkininaJS.pdf.
30. 2013. Akinina Yu. S., A. A. Bonch-Osmolovskaya, I. O. Kuznetsov, V. P. Klintsov, S. Yu. Toldova.
Rol obschej i specificheskoj leksiki pri izvlechenii informacii iz teksta na primere analiza
sobytija “Vvod novyh tehnologij” [The role of general and specific vocabulary in fact
extraction: the case of innovation-event]. In Vestnik NGU: serija Informacionnyje tehnologii.
Vol. 10, No. 4. Pp. 74-80. http://lib.nsu.ru:8080/xmlui/handle/nsu/257
31. 2013. Brykina M. M., Faynveyts A. V., Toldova S.Ju. Dictionary-based ambiguity resolution
in Russian named entities recognition. A case study. In Proceedings of Dialogue’13 –
International Workshop on Computational Linguistics and its Applications, ed. A. Narin’yani,
v. 1. http://www.dialog-21.ru/digests/dialog2013/materials/pdf/brykinamm.pdf.
32. 2014. Serdobolskaya N.V., Toldova S. Glagol rechi manaš v marijskom jazyke: osobennosti
grammatikalizacii [The speech verb manaš in Mari: grammaticalization]. Voprosy
jazykoznanija. 2014. No. 6. Pp. 66-91. (1.5 printer's sheets)
33. 2014. Toldova S., Serdobolskaya N.V. Leksicheskije svojstva glagola i oformljenije prjamogo
dopolnjenija v komi jazyke. [Lexical Semantics of Transitive Verbs and Direct Object
Encoding in Komi]. In A.E.Kibrik et al. (eds). Lingvisticheskij bespredel-2. Collected works
for the 80th jubilee of A.I.Kuznecova. Moscow: Moscow
Lomonosov University. 2014. Pp. 164-175.
34. 2014. Serdobolskaya N. V., Toldova S. Ju. Konstrukcii s
ocenochnymi predikativami: uchastniki situacii ocenki i semantika ocenochnogo predicata
[Evaluative predicate constructions in Russian: event participants and predicate meaning] //
Acta Linguistica Petropolitana. Trudy linguisticheskih issledovanij. 2014. V. 10. No. 2. Pp. 443-478.
35. 2014. Toldova S.Ju., Roytberg A., Nedoluzhko A., Kurzukov M., Ladygina A., Vasilyeva M.,
Azerkovich I., Grishina Y., Sim G., Ivanova A., Gorshkov D. Evaluating Anaphora and
Coreference Resolution for Russian // In Proceedings of Dialogue’13 – International
Workshop on Computational Linguistics and its Applications, ed. V. Selegej, v. 2. 2014.
Pp. 681-695.
36. 2014. Toldova, Svetlana and Olga Lyashevskaya. Sovremennye problemy i tendencii
kompjuternoj lingvistiki: COLING 2012 [State-of-the-art in computational linguistics:
COLING 2012]. Voprosy Jazykoznanija, No. 1, 2014. Pp. 120-145.