Annex 3 - elexicography.eu

IS1305 “European Network of
e-Lexicography (ENeL)”
Working Group 2:
Retro-digitised dictionaries
Objectives of WG 2
(according to the application)
▪ set up guidelines and standards for turning paper
dictionaries into a digital format
▪ development of common standards in the field of
e-lexicography for retro-digitised paper
dictionaries already online or due to go online
(objective 3 of the action)
European Network of e-Lexicography
Working Group 2
Vienna, 14-4-2014
Tasks of WG 2 — Task 1
1. establish an overview of existing retro-digitised
dictionaries and an overview of dictionaries
which should be retro-digitised (necessity to be
digitised → ranking? → no, not necessary!)
→ necessary to give this overview: “scheme of
categories” describing the dictionaries (to develop in
close exchange with WG 1, WG 3, WG 4)
result: database to browse (→ to coordinate with
WG 1)
→ question: different categories as search
parameters?
time frame: year 1
Tasks of WG 2 — Task 2
2. develop a standard workflow for the digitisation of
dictionaries that are to go online, including the
parameters necessary for estimating costs
▪ digitisation (full text, images, OCR)
▪ encoding of retro-digitised dictionaries
▪ development of GUI
▪ standards of presentation and design
▪ long-term preservation …
result: guidelines (have to be written in such a way
that policy makers understand them)
time frame: years 1–4
Tasks of WG 2 — Task 3
3. define standards for the encoding of information
and the description of relevant information
categories for paper dictionaries
→ main objective: guarantee interoperability,
platform independence
→ task: collect standards used within the action
(TEI, LMF, ISO → give this question to MC)
→ questions:
▪ what markup languages to use?
▪ do we need a “minimal set” of standards for both
retro-digitised and new, born-digital dictionaries?
Tasks of WG 2 — Task 3
3.1 part of task 3: establish an overview of software
for the conversion of physical lay-out information to
logical information
→ question: how to mark-up the dictionaries (i.e.
automatically, semi-automatically; are there “markup tools” to be re-used)?
result: best practices for the encoding of
information, linked with dictionary database
time frame: year 1 and 2
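The conversion from physical layout to logical information mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the OCR output marks bold text with <b>…</b> and italics with <i>…</i>, and that every entry line follows the pattern "bold headword, italic part of speech, plain definition". The element names (entry, form, orth, gramGrp, pos, sense, def) are taken from the TEI dictionary vocabulary; the function name, the pattern, and the sample line are hypothetical. Real dictionaries are far less regular, which is why semi-automatic approaches are under discussion.

```python
import re
import xml.etree.ElementTree as ET

# Assumed input convention (hypothetical): bold = headword, italic = part of speech,
# remaining plain text = definition.
LINE_PATTERN = re.compile(
    r"<b>(?P<headword>.+?)</b>\s*<i>(?P<pos>.+?)</i>\s*(?P<definition>.+)"
)


def layout_to_entry(line: str) -> ET.Element:
    """Map one line of layout-tagged text to a TEI-style <entry> element."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        # Lines that break the assumed layout need manual (semi-automatic) handling.
        raise ValueError(f"line does not follow the assumed layout: {line!r}")
    entry = ET.Element("entry")
    form = ET.SubElement(entry, "form")
    ET.SubElement(form, "orth").text = match["headword"]
    gram = ET.SubElement(entry, "gramGrp")
    ET.SubElement(gram, "pos").text = match["pos"]
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "def").text = match["definition"]
    return entry


# Invented sample line, not taken from any dictionary in the Action.
sample = "<b>Haus</b> <i>n.</i> a building for living in"
xml_text = ET.tostring(layout_to_entry(sample), encoding="unicode")
print(xml_text)
```

A purely rule-based mapping like this only works where typography is used consistently; the error branch marks the point where manual correction would take over.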
Tasks of WG 2 — Task 4
4. a) investigate relevant information categories to
be added to the dictionary in order to make the
dictionary content more readily accessible and
interoperable
b) develop concepts for linking retro-digitised
dictionaries
→ questions:
▪ which information do we need to interlink
dictionaries (extra information?) → describe the
strategies
Tasks of WG 2 — Task 4
→ questions:
integration of additional information to create
new information (e.g. WordNet, wiki dictionary,
FrameNet)?
→ question to address to the WGs: do you put
additional information in your dictionary?
result: best practices
time frame: year 3
Tasks of WG 2 — Task 5
5. investigate the possible use of dictionary content
for computational linguistic applications
→ task is already done, no further need => clear task
list!
time frame: year 4
Tasks of WG 2 — Task 6
6. identify future funding sources and develop
collaborative funding applications considering the
dictionary-candidates to retro-digitise and the
working plan for digitisation
→ information to have at a European level
→ develop awareness in governments of Europe!
→ questions:
– which national and international funds to approach
for financial support?
– develop guidelines / best practices for writing
funding applications?
→ responsibility of steering group!
Tasks of WG 2 — Task 6
→ task 6: responsibility of steering group!
time frame: years 1–4
Tasks of WG 2
in Leiden we tried to divide the tasks, to find people
responsible for them, and to form subgroups
→ not yet finished, especially for tasks 4 and 5
(task 5 is already done, so nobody needs to be assigned to it)
Participants
▪ 27 participants from 14 countries: Austria (1),
Denmark (2), Finland (3), France (1), Germany (5),
Hungary (1), Netherlands (2), Poland (2), Portugal
(2), Romania (1), Serbia (2), Slovakia (1),
Switzerland (3), United Kingdom (1)
▪ see file “WG 2 Leiden 16-01-2014 minutes Annex1
participants.pdf”
Dictionaries in WG 2
▪ see list in file “WG 2 Leiden 16-01-2014 minutes
Annex2 dictionaries.pdf”
▪ not yet complete
▪ for now: 25 dictionaries of different types
– most of them monolingual
– 10 (?) languages
– most of them diachronic / historical
dictionaries, standard language dictionaries,
some dialect dictionaries
Plans / ideas / work in progress
▪ bibliography of retro-digitised dictionaries available
online (a student is using Citavi to organise the
bibliography)
→ structure of the bibliography: language
dictionaries, specific dictionaries (e.g. “A
dictionary of food and nutrition”)
→ structure of entries: author, year of
publication, title, place of publication, publisher,
url
(Adelung, Johann Christoph (1808): Grammatisch-kritisches Wörterbuch der
hochdeutschen Mundart. Mit beständiger Vergleichung der übrigen Mundarten,
besonders aber der oberdeutschen. Wien: Richter. Online:
http://ds.ub.uni-bielefeld.de/viewer/image/1323497/1/LOG_0003/.)
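The entry structure listed above (author, year of publication, title, place of publication, publisher, url) could be captured as a small record type. The following is only a sketch: the class name, the method name, and the citation pattern derived from the Adelung example are assumptions, the title is shortened, and the URL is a placeholder rather than the real address.

```python
from dataclasses import dataclass


@dataclass
class BibEntry:
    """One bibliography record with the six fields named on the slide."""
    author: str
    year: int
    title: str
    place: str
    publisher: str
    url: str

    def format(self) -> str:
        # Assumed pattern, modelled on the Adelung example:
        # Author (Year): Title. Place: Publisher. Online: URL.
        return (
            f"{self.author} ({self.year}): {self.title}. "
            f"{self.place}: {self.publisher}. Online: {self.url}."
        )


entry = BibEntry(
    author="Adelung, Johann Christoph",
    year=1808,
    title="Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart",
    place="Wien",
    publisher="Richter",
    url="http://example.org/adelung",  # placeholder, not the actual URL
)
print(entry.format())
```

Keeping the fields separate rather than storing a formatted string would make it easier to reuse the bibliography as a basis for a searchable database.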
Plans / ideas / work in progress
→ work in progress (for now: 22 pages in Word
file)
→ questions:
re-use in the Action?
which information should be given in this
bibliography of retro-digitised dictionaries (close
connection to the “scheme of categories”
describing the dictionaries?)
bibliography as basis for the database of retro-digitised
dictionaries and part of the dictionary portal?
Plans / ideas / work in progress
▪ “collection” of “dictionary typologies” trying to
find a “scheme of categories” describing the
dictionaries in the Action
▪ problem: so far only consideration of German
“typologies”
– Storrer: classification of internet dictionaries
• retro-digitised dictionaries
• born-digital dictionaries
• dictionaries with user participation
• user generated dictionaries
• finished dictionaries
• dictionaries “under construction”
Plans / ideas / work in progress
– Schlaefer:
• language(s) covered: monolingual, multilingual
• vocabulary/lexicon described
• user group addressed
• methodological basis
• lexicographical basis
– Hausmann
• synchronic vs diachronic dictionary
• historical vs contemporary dictionary
• standard language vs dialect dictionary
•…
Cooperation with other WGs
▪ cooperation with WG 1 concerning
– the encoding of dictionaries
– the linking of information between dictionaries
– user interfaces
– the overview of dictionaries
▪ cooperation with WG 3 in finding common
approaches to linking contents of retro-digitised
and innovative dictionaries
▪ cooperation with WG 1, WG 3, WG 4
– in identifying funding sources and developing
funding applications
Decisions which have to be made / questions
“Scientific” aim:
– develop a “scheme of categories” describing the
dictionaries (short standardized “profile”) →
cooperation with WG 1, WG 3 and WG 4
→ question: which information should be given about
the dictionaries?
1. information about the dictionary itself (short and
clear description!)
▪ dictionary type
▪ language covered (source language, description
language, target language)
Decisions which have to be made / questions
(→ 1. information about the dictionary itself)
▪ year of publication (print and online)
▪ number of entries
▪ references, literature concerning the dictionary
▪ …
2. information about the technical process
▪ encoding
▪ XML schema and documentation
▪ year of publication
▪ …
Decisions which have to be made / questions
→ questions:
▪ which kinds of dictionaries to include / exclude?
▪ propose parameters / properties for all
dictionaries which can function as search
parameters in the dictionary portal (“search for
dictionaries”)?
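How profile properties might double as search parameters in the portal can be shown with a short sketch. The property names ("type", "language"), the function name, and the three catalogue records are hypothetical; which properties are actually offered is exactly the open question above.

```python
from typing import Dict, List

Profile = Dict[str, str]


def search_dictionaries(profiles: List[Profile], **criteria: str) -> List[Profile]:
    """Return the profiles whose properties match every given criterion."""
    return [
        p for p in profiles
        if all(p.get(key) == value for key, value in criteria.items())
    ]


# Invented catalogue records for illustration only.
catalogue = [
    {"title": "Dictionary A", "type": "historical", "language": "de"},
    {"title": "Dictionary B", "type": "dialect", "language": "de"},
    {"title": "Dictionary C", "type": "historical", "language": "nl"},
]

hits = search_dictionaries(catalogue, type="historical", language="de")
print([p["title"] for p in hits])
```

Exact-match filtering like this only works if the property values are standardised, which is one argument for agreeing on a controlled vocabulary in the “scheme of categories”.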
Decisions which have to be made / questions
Organisation:
mailing list for each WG? → establish at INL?
(Google Groups for each WG, all of them including
members of steering group)
how to exchange information / results of WGs
within WGs and amongst all participants → can we
use the intranet as envisaged in the proposal? or
Google Groups and Google Docs? (“suitable
instruments”?)
do we need slots for inter-WG meetings at all WG
meetings?
specialist workshops preceding the WG meeting?
Decisions which have to be made / questions
Organisation:
▪ STSMs:
– “central” and open call? or call focused on
certain topics fostering certain tasks in the
action?
– information concerning reimbursement to
participants?
▪ Training Schools:
– how to organize? where? when? how long?
– number of participants? experts?
– budget?
To ask from participants of WG 2
▪ short biographies concerning their background
(like Anne did in WG 1, see minutes)?
– collect them for ENeL website? secured or open
part of website?
▪ continue to divide tasks / build subgroups
(especially for task 4)
▪ invite them to think about topics relevant for any
concern of WG 2 not yet fixed in working plan
To ask from participants of WG 2
▪ invite them to think about experts to be involved
in the discussions of WG 2 (specialist workshops)
▪ invite them to think about topic(s) to deal with at
Bolzano → fixed in Leiden: presentation of first
results of task 3 (development of standards for the
encoding of information and the description of
relevant information categories for print
dictionaries) at meeting in Bolzano
▪ invite them once again to think about a 5-day
meeting in the Lorentz Center in Leiden in 2016
To ask from participants of WG 2
▪ invite them to think about the Training School in
2015: “Standard tools and methods for retro-digitising dictionaries”
→ date: year 2, semester 2
→ Rute will check location with Vlado and Vera
▪ give a description of “their” dictionary/ies
according to our “dictionary scheme” (“deadline”
depends on the decision about what this “dictionary
profile” will look like)
Tasks for Bolzano
▪ define a list of dictionaries to begin with (e.g.
bilingual synonym dictionaries)
▪ define a list of dictionaries to be retro-digitised
▪ define a list of metadata
(ask all WGs for a list of dictionaries and a list of
markup)
▪ proposal with dictionary typology including
definitions of technical terms used (end of June);
define