eVikings II
Establishment of the Virtual Centre of Excellence for IST RTD in Estonia
FP5 IST accompanying measures project IST-37592

Deliverable D3.4: New dictionaries and linguistic software packages available on Internet

Institute of the Estonian Language
Compiled by Urmas Sutrop and Ülle Viks
Tallinn, September 2004

Description

The goal of this package was to enable web-based access to various electronic dictionaries and linguistic software. The Institute of the Estonian Language (IEL) was responsible for this task.

The main activity of IEL is the compilation of original academic dictionaries, all of which also exist in electronic form. Because the entering of dictionaries into computers began 25 years ago, several different mark-up systems are in use. Some dictionaries were already accessible in KeeleWeb [Language Web] (http://ee.www.ee/) and on the IEL home page (http://www.eki.ee/dict/), but large-scale electronic publication has been achieved in Keelevara (see Deliverable D3.2).

IEL has developed several software packages to help lexicographers and other linguists:

- technical aids for lexicographers (sorting, structure modification, comparison, etc.)
- tools for compiling special dictionaries (entry and editing programs)
- a universal XML-based dictionary editor
- morphology modules (available as freeware DLLs at http://www.eki.ee/tarkvara/): syllabification, part-of-speech and inflectional type recognition, stem generation, morphological synthesis and analysis
- a grammatical entry generator (adds grammatical data to a dictionary entry)

During this project several dictionaries have been standardised, converted to an XML-based standard for source texts, and integrated into Keelevara (see Lexical resources).
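
The XML-based source format mentioned above is not specified in this report. As a rough illustration of what access to such standardised entries might look like, the following Python sketch parses a small sample and indexes it by headword. The element and attribute names (dictionary, entry, headword, pos, definition, id) and the function load_entries are hypothetical and are not taken from the actual IEL or Keelevara mark-up.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry mark-up: the element and attribute names below are
# illustrative assumptions, not the schema actually used by IEL or Keelevara.
SAMPLE = """
<dictionary>
  <entry id="e1">
    <headword>keel</headword>
    <pos>noun</pos>
    <definition>language; tongue</definition>
  </entry>
  <entry id="e2">
    <headword>sõna</headword>
    <pos>noun</pos>
    <definition>word</definition>
  </entry>
</dictionary>
"""


def load_entries(xml_text):
    """Parse dictionary XML and return a dict keyed by headword."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.findall("entry"):
        headword = entry.findtext("headword")
        entries[headword] = {
            "id": entry.get("id"),
            "pos": entry.findtext("pos"),
            "definition": entry.findtext("definition"),
        }
    return entries


entries = load_entries(SAMPLE)
print(entries["keel"]["definition"])
```

Once every dictionary shares one mark-up, a single lookup routine of this kind can in principle serve all of them, which is the practical benefit of converting the older mark-up systems to a common standard.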

Some linguistic software packages are accessible through Keelevara, but they have not yet been fully integrated into its working environment (speech synthesiser, IEL's modules of Estonian morphology, Filosoft language software, translating browser). They are accessible via the home page of the Language technology project (http://www.eki.ee/keeletehnoloogia/tarkvara.html), whose partners are the University of Tartu (UT), the Institute of the Estonian Language (IEL), the Institute of Cybernetics (IOC) and Filosoft Ltd.

The main target of the project has been achieved, but the preparation of new dictionaries will continue. Final integration of the software packages into the Keelevara environment is under development.

List of linguistic resources

1. Lexical resources

Estonian language planning dictionary (ÕS 1999)
Eesti keele sõnaraamat ÕS 1999. Tallinn, 1999 (50 000 entries)
http://www.keelevara.ee/teosed/qs1999/

Defining dictionary of contemporary Estonian (Seletussõnaraamat)
Eesti kirjakeele seletussõnaraamat I–VI. Tallinn, 1988– (100 000 entries) (24 volumes available)
http://www.keelevara.ee/teosed/seletav/

Orthological dictionary for basic school (Õpilase ÕS)
Õpilase ÕS. Tallinn, 2004 (22 000 entries)
http://www.keelevara.ee/teosed/qpilase_qs/

English-Estonian MT dictionary (Inglise-eesti masint.)
(80 000 entries)
http://www.keelevara.ee/teosed/en-et_masin/

Estonian-English MT dictionary (Eesti-inglise masint.)
(80 000 entries)
http://www.keelevara.ee/teosed/et-en_masin/

Russian-Estonian dictionary (Vene-eesti)
Vene-eesti sõnaraamat I–IV. Tallinn, 1984–1994 (75 000 entries)
http://www.keelevara.ee/teosed/ru-et_eki/

Estonian-Russian dictionary (Eesti-vene)
Eesti-vene sõnaraamat I–V. Tallinn, 1997– (60 000 entries) (3 volumes available)
http://www.keelevara.ee/teosed/et-ru_eki/

Norwegian-Estonian dictionary (Norra-eesti)
T. Farbregd, S. Kangur, Ü. Viks. Norra-eesti eesti-norra sõnaraamat.
Tallinn, 1998 (21 000 Norwegian entries)
http://www.keelevara.ee/teosed/no-et_eksa/

Estonian-Norwegian dictionary (Eesti-norra)
T. Farbregd, S. Kangur, Ü. Viks. Norra-eesti eesti-norra sõnaraamat. Tallinn, 1998 (19 000 Estonian entries)
http://www.keelevara.ee/teosed/et-no_eksa/

Place names of the world (Maailma kohanimed)
P. Päll, Maailma kohanimed. Tallinn, 1999 (4200 entries)
http://www.keelevara.ee/teosed/maailma_kohanimed/

Concise dialect dictionary (Väike murdesõnastik)
Väike murdesõnastik I-II. Tallinn, 1982–1989 (80 000 entries)
http://www.keelevara.ee/teosed/vaike_murdesqnastik/

Poetic synonyms by J. Peegel (Poeetilised sünonüümid)
J. Peegel, Nimisõna poeetilised sünonüümid eesti regivärssides. Tallinn, 2004 (412 pp)
http://www.keelevara.ee/teosed/poeetilised/

First dictionary of Estonian slang by M. Loog (Slängisõnaraamat)
M. Loog, Esimene eesti slängisõnaraamat. Tallinn, 1991 (7500 entries)
http://www.keelevara.ee/teosed/slang/

Handbook of the Estonian language (Eesti keele käsiraamat)
M. Erelt, T. Erelt, K. Ross. Eesti keele käsiraamat. Tallinn, 1997
http://www.keelevara.ee/teosed/ekkr/

Personal names in Estonia, 1900-2004 (Mehenimed, Naisenimed, Perekonnanimed)
http://www.keelevara.ee/teosed/m_nimed/
http://www.keelevara.ee/teosed/n_nimed/
http://www.keelevara.ee/teosed/p_nimed/

Some additional dictionaries have been preprocessed and are ready to be integrated into the portal (in-house versions are available at http://www.eki.ee/dict/):

Estonian orthological dictionary (Õigekeelsussõnaraamat 1976)
Õigekeelsussõnaraamat. Tallinn, 1976 (114 000 entries)

Dictionary of Estonian idioms (A. Õim, Fraseoloogiasõnaraamat)
A. Õim, Fraseoloogiasõnaraamat. Tallinn, 1993 (6500 entries)

Dictionary of synonyms (A. Õim, Sünonüümisõnastik)
A. Õim, Sünonüümisõnastik. Tallinn, 1991 (10 000 entries)

Dictionary of antonyms (A. Õim, Antonüümisõnastik)
A. Õim, Antonüümisõnastik.
Tallinn, 1995 (2000 entries)

Word index of Saareste's thesaurus (Saareste indeks)
Andrus Saareste eesti keele mõistelise sõnaraamatu indeks. Uppsala, 1979

Etymological reference book (A. Raun, Etümoloogiline teatmik)
A. Raun, Etümoloogiline teatmik. Bloomington, 1982

Finnish-Estonian dictionary I, II (Soome-eesti suursõnaraamat)
Soome-eesti suursõnaraamat I–II. Tallinn, 2003 (90 000 entries)

2. Language software resources
(on the home page of the Language technology project: http://www.eki.ee/keeletehnoloogia/tarkvara.html)

Nimisõnafraaside filtreerija
Software for filtering noun phrases from running text (UT)

Inglise-eesti sõnastik
Implementation software of the English-Estonian dictionary (IEL)

ESTMORF, eesti keele morfoloogiline süntees ja analüüs
Morphological synthesis and analysis of Estonian (Filosoft)

EKI morfoloogiamoodulid
Modules of Estonian morphology (IEL)

TAHMM, ESTMORFi tulemuste ühestaja
Morphological disambiguator for ESTMORF output (Filosoft)

Eesti keele kõnesüntees ja ekraanilugeja
Speech synthesizer of Estonian and screen reader (IOC, IEL, Filosoft)

ELA, teksti lausestaja
Sentence splitter for running text (UT)

Links:
http://www.keelevara.ee/
http://www.eki.ee/dict/
http://www.eki.ee/keeletehnoloogia/tarkvara.html

Developed by Indrek Hein, Margit Langemets, Ülle Viks