DAM-LR - CORDIS

Project acronym: DAM-LR
Distributed Access Management for Language Resources
Proposal/Contract no.: 011841-CN
Construction of new infrastructure
1. Project Summary
DAM-LR proposes to develop and deploy an infrastructure for the European research community that is
interested in easy management of and access to linguistic resources of all kinds, such as large (multimedia)
corpora, lexicons, grammar descriptions and others. It will not only foster the local developments currently
taking place at various linguistic data centers by deploying prototypical archive solutions, but also integrate
these local archives virtually, such that users of linguistic resources see just one large collection; users have
just one identity to access the stored material; ingest mechanisms allow users to integrate new data into this
domain of linguistic resources; managers get efficient tools to manage resources in the distributed domain; and
managers get efficient mechanisms to deal with access management. The proposed DAM-LR concept will
therefore offer completely new opportunities for the producers of data, for the managers and for the users. In
doing so, DAM-LR will be a very important contribution to establishing a Semantic Web of language resources,
since only an integrated domain based on interoperable concepts such as unified access mechanisms will allow
agents to smoothly find their way in a complex domain of heterogeneous data types.
DAM-LR will be based on four pillars that have been discussed intensively at various international meetings:
- The metadata concept for language resources has been developed during the last four years and can be
regarded as stable and solid.
- The introduction of unique resource identifiers will be important for operating in distributed collections, and
DAM-LR can use well-proven technology here.
- A unified user and group management system will give persons one identity for accessing all resources.
- A unified access management system will allow managers to set access rights, and to delegate the ability to
set access rights, in the intended distributed domain.
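The last two pillars (one identity per person, and access rights that managers can both set and delegate) can be illustrated with a minimal sketch. It is not the DAM-LR design: the class AccessPolicy, its methods, and the identifier and user names below are hypothetical, chosen only to show how delegation of rights might be modelled for a single resource.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Access rules for one resource (hypothetical model, not the DAM-LR schema)."""

    resource_id: str                                  # unique resource identifier
    readers: set[str] = field(default_factory=set)    # users or groups with read access
    managers: set[str] = field(default_factory=set)   # identities allowed to set rights

    def _require_manager(self, actor: str) -> None:
        # Only managers may change the policy.
        if actor not in self.managers:
            raise PermissionError(
                f"{actor} may not set access rights on {self.resource_id}"
            )

    def grant_read(self, actor: str, principal: str) -> None:
        """Let `actor` give a user or group read access."""
        self._require_manager(actor)
        self.readers.add(principal)

    def delegate(self, actor: str, new_manager: str) -> None:
        """Let `actor` hand the right to set access rights to another identity."""
        self._require_manager(actor)
        self.managers.add(new_manager)

    def can_read(self, user: str, groups: set[str]) -> bool:
        """A user may read if named directly or through one of their groups."""
        return user in self.readers or bool(groups & self.readers)


# Hypothetical identifier and identities, for illustration only.
policy = AccessPolicy("res:0001", managers={"archive-manager"})
policy.delegate("archive-manager", "project-lead")
policy.grant_read("project-lead", "field-linguists")
print(policy.can_read("alice", {"field-linguists"}))
```

In a distributed setting the same user and group names would have to be recognised by every participating archive, which is what the unified user and group management pillar is meant to provide.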
2. Project website address:
http://www.mpi.nl/DAM-LR/ or http://www.mpi.nl/dam-lr/
3. Project Achievements:
Since the start of the project, progress has been made on several aspects of DAM-LR. A first version of the
local prototype of the Language Archive Management and Upload System (LAMUS) was presented.
The ISLE MetaData Initiative (IMDI, see http://www.mpi.nl/IMDI/) is used as a metadata framework to
organize the archive structure and describe important details about the stored resources. An international
workshop (http://www.mpi.nl/delaman/workshop/) about access management for distributed archives
served as the starting point for future discussions about existing architectures, frameworks and
technologies to be used within DAM-LR. The language resource center at LUND has been established and
work on the language resource archive has started. INL has established a language resource center and is
busy transferring technology and starting work on the language resource archive. SOAS has established its
language resource center, and work on data gathering has started.
The digital archive at the MPI for Psycholinguistics now holds about 40,000 sessions or resource
bundles covering over 100,000 language resource objects such as audio and video media, linguistic
annotations, lexica, field notes, sketch grammars and documentation. Researchers from different
projects continuously add new data to the existing corpus. Most of the corpus resources
are audio or video recordings, which consume large amounts of storage space; the archive currently
amounts to 11 terabytes of data.
The LAMUS system now under development at the MPI is intended to provide a secure and stable
management system that allows users to keep track of their data and upload new material into the
archive in a controlled web-based environment. The IMDI metadata is already used by about 50
institutions worldwide, including European language resource centers such as ELDA, BAS, Meertens,
Lund University, Helsinki University, Florence University, ILC, ILSP, DFKI and various National Sign
Language Communities.
4. List of participants
Participant number    Participant name                                                          Short name
(co-ordinator = N°1)  (Organisation, city, country)
1                     Max-Planck-Institute for Psycholinguistics, Nijmegen, The Netherlands     MPI
2                     Linguistics Department University of Lund, Lund, Sweden                   LUND
3                     School of Oriental and African Studies, University of London, London, UK  SOAS
4                     Institute for Dutch Lexicology, Leiden, The Netherlands                   INL