IS1305 “European Network of e-Lexicography (ENeL)”
Working Group 2: Retro-digitised dictionaries

Objectives of WG 2 (according to the application)
set up guidelines and standards for turning paper dictionaries into a digital format
development of common standards in the field of e-lexicography for retro-digitised paper dictionaries already online or planning to go online (objective 3 of the action)

European Network of e-Lexicography, Working Group 2, Vienna, 14-4-2014

Tasks of WG 2: Task 1
1. establish an overview of existing retro-digitised dictionaries and an overview of dictionaries which should be retro-digitised (rank them by necessity of digitisation? → no, not necessary!)
→ necessary for this overview: a “scheme of categories” describing the dictionaries (to be developed in close exchange with WG 1, WG 2, WG 3)
result: database to browse (→ to be coordinated with WG 1)
→ question: different categories as search parameters?
time frame: year 1

Tasks of WG 2: Task 2
2. develop a standard workflow for the digitisation of dictionaries planning to go online, including the parameters necessary for estimating costs
– digitisation (full text, images, OCR)
– encoding of retro-digitised dictionaries
– development of GUI standards of presentation and design
– long-term preservation
– …
result: guidelines (to be written in such a way that policy makers understand them)
time frame: years 1-4

Tasks of WG 2: Task 3
3. define standards for the encoding of information and the description of relevant information categories for paper dictionaries
→ main objective: guarantee interoperability, platform independence
→ task: collect standards used within the action (TEI, LMF, ISO → give this question to MC)
→ questions: what markup languages to use? do we need a “minimal set” of standards for both retro-digitised and new, born-digital dictionaries?

Tasks of WG 2: Task 3 (continued)
3.1 part of task 3: establish an overview of software for the conversion of physical layout information to logical information
→ question: how to mark up the dictionaries (i.e. automatically, semi-automatically; are there “markup tools” to be re-used)?
result: best practices for the encoding of information, linked with dictionary database
time frame: years 1 and 2

Tasks of WG 2: Task 4
4. a) investigate relevant information categories to be added to the dictionary in order to make the dictionary content more readily accessible and interoperable
b) develop concepts for linking retro-digitised dictionaries
→ questions: which information do we need to interlink dictionaries (extra information?)? → describe the strategies
→ questions: integration of additional information to create new information (e.g. WordNet, wiki dictionary, FrameNet)?
→ question to address to the WGs: do you put additional information in your dictionary?
result: best practices
time frame: year 3

Tasks of WG 2: Task 5
5. investigate the possible use of dictionary content for computational linguistic applications
→ task is already done, no further need ⇒ clear task list!
time frame: year 4

Tasks of WG 2: Task 6
6. identify future funding sources and develop collaborative funding applications, considering the dictionary candidates for retro-digitisation and the working plan for digitisation
→ information to have on a European level
→ develop awareness in governments of Europe!
→ questions:
– national and international funds to go for financial support?
– develop guidelines / best practices for writing funding applications?
→ task 6: responsibility of steering group!
time frame: years 1-4

Tasks of WG 2
in Leiden we tried to divide tasks, to find responsible person(s) for the tasks, to form subgroups
→ not yet finished, especially for tasks 4 and 5 (task 5 already done, no need to find responsible person(s))

Participants
27 participants from 14 countries: Austria (1), Denmark (2), Finland (3), France (1), Germany (5), Hungary (1), Netherlands (2), Poland (2), Portugal (2), Romania (1), Serbia (2), Slovakia (1), Switzerland (3), United Kingdom (1)
see file “WG 2 Leiden 16-01-2014 minutes Annex1 participants.pdf”

Dictionaries in WG 2
see list in file “WG 2 Leiden 16-01-2014 minutes Annex2 dictionaries.pdf”
not yet complete
for now: 25 dictionaries of different types
– most of them monolingual
– 10 (?) languages
– most of them diachronic / historical dictionaries, standard language dictionaries, some dialect dictionaries

Plans / ideas / work in progress
bibliography of retro-digitised dictionaries available online (a student is using Citavi to organize the bibliography)
→ structure of the bibliography: language dictionaries, specific dictionaries (e.g. “A dictionary of food and nutrition”)
→ structure of entries: author, year of publication, title, place of publication, publisher, URL
(Adelung, Johann Christoph (1808): Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart. Mit beständiger Vergleichung der übrigen Mundarten, besonders aber der oberdeutschen. Wien: Richter.
Online: http://ds.ub.unibielefeld.de/viewer/image/1323497/1/LOG_0003/.)
→ work in progress (for now: 22 pages in a Word file)
→ questions:
– re-use in the Action?
– which information should be given in this bibliography of retro-digitised dictionaries (close connection to the “scheme of categories” describing the dictionaries?)
– bibliography as basis for the database of retro-digitised dictionaries and part of the dictionary portal?

Plans / ideas / work in progress (continued)
“collection” of “dictionary typologies”, trying to find a “scheme of categories” describing the dictionaries in the Action
problem: so far only German “typologies” have been considered
– Storrer: classification of internet dictionaries
  • retro-digitised dictionaries
  • born-digital dictionaries
  • dictionaries with user participation
  • user-generated dictionaries
  • finished dictionaries
  • dictionaries “under construction”
– Schlaefer:
  • language(s) covered: monolingual, multilingual
  • vocabulary/lexicon described
  • user group addressed
  • methodological basis
  • lexicographical basis
– Hausmann:
  • synchronic vs diachronic dictionary
  • historical vs contemporary dictionary
  • standard language vs dialect dictionary
  • …

Cooperation with other WGs
cooperation with WG 1 concerning
– the encoding of dictionaries
– the linking of information between dictionaries
– user interfaces
– the overview of dictionaries
cooperation with WG 3 in finding common approaches to linking contents of retro-digitised and innovative dictionaries
cooperation with WG 1, WG 3, WG 4 in identifying funding sources and developing funding applications

Decisions which have to be made / questions
“Scientific” aim:
– develop a “scheme of categories” describing the dictionaries (short standardized “profile”)
→ cooperation with WG 1, WG 3 and WG 4
→ question: which information should be given about the dictionaries?
1. information about the dictionary itself (short and clear description!)
  – dictionary type
  – language covered (source language, description language, target language)
  – year of publication (print and online)
  – number of entries
  – references, literature concerning the dictionary
  – …
2. information about the technical process
  – encoding
  – XML schema and documentation
  – year of publication
  – …
→ questions: which kinds of dictionaries to include / exclude? propose parameters / properties for all dictionaries which can function as search parameters in the dictionary portal (“search for dictionaries”)?

Decisions which have to be made / questions (continued)
Organisation:
mailing list for each WG? → establish at INL? (Google Groups for each WG, all of them including members of steering group)
how to exchange information / results of WGs within WGs and amongst all participants → can we use the intranet as envisaged in the proposal? or Google Groups and Google Docs? (“suitable instruments”?)
do we need slots for inter-WG meetings at all WG meetings?
specialist workshops preceding the WG meeting?
STSMs:
– “central” and open call? or call focused on certain topics fostering certain tasks in the action?
– information concerning reimbursement to participants?
Training Schools:
– how to organize? where? when? how long?
– number of participants? experts?
– budget?

To ask from participants of WG 2
short biographies concerning their background (like Anne did in WG 1, see minutes)?
– collect them for ENeL website? secured or open part of website?
continue to divide tasks / build subgroups (especially for task 4)
invite them to think about topics relevant for any concern of WG 2 not yet fixed in working plan;
invite them to think about experts to be involved in the discussions of WG 2 (specialist workshops)
invite them to think about topic(s) to deal with at Bolzano → fixed in Leiden: presentation of first results of task 3 (development of standards for the encoding of information and the description of relevant information categories for print dictionaries) at meeting in Bolzano
invite them once again to think about a 5-day meeting in the Lorentz Center in Leiden in 2016
invite them to think about the Training School in 2015: “Standard tools and methods for retro-digitising dictionaries” → date: year 2, semester 2 → Rute will check location with Vlado and Vera
give a description of “their” dictionary/ies according to our “dictionary scheme” (“deadline” depending on the decision on what this “dictionary profile” will look like)

Tasks for Bolzano
define a list of dictionaries to begin with (e.g.
bilingual synonym dictionaries)
define a list of dictionaries to be retro-digitised
define a list of metadata (ask all WGs for a list of dictionaries and a list of markup)
proposal with dictionary typology including definitions of technical terms used (end of June);
define
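
The short standardized dictionary “profile” discussed under “Decisions which have to be made / questions”, with its properties doubling as search parameters for the dictionary portal, could be sketched in Python roughly as follows. This is only an illustration: the class name, the field names, the `search` helper and the three sample dictionaries are assumptions made for the example, not decisions of WG 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DictionaryProfile:
    """One dictionary profile; all field names are hypothetical."""
    # 1. information about the dictionary itself
    title: str
    dictionary_type: str                  # e.g. "historical", "dialect"
    source_language: str
    description_language: str
    target_language: Optional[str] = None
    year_print: Optional[int] = None
    year_online: Optional[int] = None
    number_of_entries: Optional[int] = None
    references: List[str] = field(default_factory=list)
    # 2. information about the technical process
    encoding: Optional[str] = None        # e.g. "TEI"
    xml_schema: Optional[str] = None


def search(profiles, **criteria):
    """Return the profiles whose attributes equal every given criterion."""
    return [
        p for p in profiles
        if all(getattr(p, key) == value for key, value in criteria.items())
    ]


# Made-up sample data, for illustration only.
profiles = [
    DictionaryProfile("Example Dictionary A", "historical", "German", "German",
                      year_print=1808, encoding="TEI"),
    DictionaryProfile("Example Dictionary B", "dialect", "Dutch", "Dutch"),
    DictionaryProfile("Example Dictionary C", "historical", "German", "Latin",
                      encoding="TEI"),
]

hits = search(profiles, dictionary_type="historical", source_language="German")
print([p.title for p in hits])
```

Treating every profile field as a possible search parameter keeps the “scheme of categories” and the portal's “search for dictionaries” in one place; which fields are mandatory is left open here, as it is in the slides.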