Making of a multimedia lexicon for the Marquesan and Tuamotuan

Towards a multimedia encyclopaedic lexicon for the Marquesan and Tuamotuan languages
Gaby Cablitz
Christian-Albrechts-Universität zu Kiel
Overview of this talk
• Motivation for our project
• Why multimedia dictionaries?
• Project objectives and basic design
• Some major developments for our project
• Examples of linking multimedia extensions with lexicographic data
• Web-based collaboration with speech communities
Motivation for the project
• How can a language documentation be made more accessible and usable to the speech community and the research community?
• Two problems:
1. Limited ways of structuring the archive
2. Primary data do not reveal much about language structure or the relatedness between words of a language
• Annotation of multimedia documents shows the meaning of a word in specific contexts, not the network of associations between words nor its full range of meanings -> need for structural data to understand primary data
• Role of lexicography is backgrounded in the DoBeS programme
• Dictionaries are necessary elements in language documentation projects
Multimedia dictionaries: beyond traditional
lexicography and language archiving
• New ways of presenting meaning: linking linguistic information with media files (video clips, photos, drawings, sound files)
• Multimedia extensions provide:
-> information on the pragmatics of lexical units (use in context)
-> information on cultural knowledge related to the meaning and use of lexical units (LU)
-> non-verbal aspects of cultural activities relevant for understanding the concepts encoded by LUs
• New form of archiving: dense network of lexical entries linked with all kinds of media and archive files
• Moving from a conventional dictionary towards an encyclopaedia
Major project objectives
• Major objectives:
1. Create a multimedia encyclopaedic lexicon for the Marquesan and Tuamotuan languages
2. Advance the development of LEXUS
3. Involve the speech community actively in lexicon creation via web-based collaboration
• Upload as much non-archived multimedia data as possible together with the lexical database in LEXUS (photos, drawings, photo galleries, etc.)
• Create links between lexical, multimedia and archive data in a thematically organised way
• Represent data in a way that reflects indigenous categorisation and understanding of relatedness between elements
• Create a database that is useful for language maintenance and language revival
Design focus: creating thematically
organised spaces
• Creation from an ethnobotanical perspective
• Plants are important in traditional material culture; a natural way of teaching traditional knowledge
• Linking of data is to be visualised in one space that allows continuous navigation through the database
Some major software developments
for our project purposes
• Improvement of UI issues, functionalities, etc.
• Development of the ViCoS tool: key feature for creating thematically organised spaces via relational links
• Unlike the Kirrkirr software, ViCoS can also integrate multimedia data, has good navigation and visualisation solutions, allows parts of a photo or drawing to be selected, and offers a user-friendly way of creating relational links (drag-and-drop option, etc.), making it accessible for speech communities
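The idea of a thematically organised space built from relational links can be sketched as a small data model. This is an illustrative sketch only, not the LEXUS or ViCoS data model: the names `MediaFile`, `LexicalEntry`, `ThematicSpace`, the relation label `material_for` and all headwords and file names are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaFile:
    """A media item attached to an entry (hypothetical structure)."""
    path: str   # placeholder file name, not a real archive path
    kind: str   # e.g. "photo", "drawing", "video", "sound"


@dataclass
class LexicalEntry:
    """A lexical entry with optional media attachments."""
    headword: str
    gloss: str
    media: list = field(default_factory=list)


@dataclass
class ThematicSpace:
    """A named space holding entries and labelled links between them."""
    name: str
    entries: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source, relation, target)

    def add_entry(self, entry):
        self.entries[entry.headword] = entry

    def link(self, source, relation, target):
        # Refuse links to entries that are not part of this space.
        for headword in (source, target):
            if headword not in self.entries:
                raise KeyError(f"unknown entry: {headword}")
        self.links.append((source, relation, target))

    def neighbours(self, headword):
        """Return (relation, other headword) pairs in either direction."""
        found = []
        for source, relation, target in self.links:
            if source == headword:
                found.append((relation, target))
            elif target == headword:
                found.append((relation, source))
        return found


# Placeholder data only; these are not real Marquesan or Tuamotuan words.
space = ThematicSpace("plants")
space.add_entry(LexicalEntry("PLANT_A", "a plant", [MediaFile("plant_a.jpg", "photo")]))
space.add_entry(LexicalEntry("TOOL_B", "a tool made from PLANT_A"))
space.link("PLANT_A", "material_for", "TOOL_B")
print(space.neighbours("TOOL_B"))
```

Storing links as labelled triples keeps navigation symmetric: starting from either entry, the related entry and its media can be reached in one step, which is the kind of continuous navigation the slides describe.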
Realisation in ViCoS
Realisation and navigation in ViCoS
Jump to photo gallery
Linking media with lexicographic
data: corpus-based examples
• Editing of corpus-based example sentences -> creating a resource for comparing spoken vs. written language
Link to archive
Linking media with lexicographic
data: made-up example sentences
Link to archive with
interlinearisation
Video clips: acting out meaning of
motion verbals
• Documenting word meaning
• Letting consultants design and act out word meaning without verbal interaction
• Supportive element of word meaning, also useful for language revival
• Creation of semantic word fields (e.g. CUT or BREAK verbals) in ViCoS
Web-based collaboration with speech
community
• Problematic aspects of web-based collaboration with the speech community (SC)
• Requirements for web-based collaboration with the speech community (e.g. capacity building)
• Problems of using a wiki-like lexicon tool
• Proposal for speech community-based participation in the process of lexicon creation
Basic challenges for an online
cooperation with speech community
• The current state of LEXUS and the proposed collaborative WSs have a wiki-like set-up based on consensus
• Who is a suitable administrator/primary editor?
• Is it really sufficient to make a web-based tool available and assume that an encyclopaedic lexicon will simply be created in a wiki-like manner by the speech community?
Design of collaborative WS by speech
community
• Panel of moderators interacting with administrator and SC
• Complicated system of collaborative WS, not realistic
• Development and implementation are time-consuming
• Organising, editing and revising large amounts of new data with multiple entry writers and multiple drafts can get out of control
Community-internal obstacles I:
linguistic situation
• In the context of endangered speech communities -> wiki-like set-up of collaborative WS is very problematic
• Documentation of lexical and cultural knowledge is not an easy task -> consultants do not share the same metalinguistic and cultural/encyclopaedic knowledge about words (Haviland 2006)
• Indigenous Polynesian languages -> undergoing rapid linguistic change
• Depending on age and upbringing -> metalinguistic knowledge is very heterogeneous
Community-internal obstacles II:
culture-specific reasons
• Problem rooted in traditional society: very secretive about culture; transmission of cultural knowledge is not a public affair -> often passed to only one selected person within a family
• Unlike in western cultures, cultural knowledge has no open, verifiable and codified standards
• Continuous loss of linguistic and cultural heritage feeds into many insecurities of speakers -> ground for conflicts about what is authentic knowledge and what is not -> results in “editing wars”?
• Within the speech community: accusations of re-inventing and transforming the language and culture; knowledgeable speakers often stigmatised as “liars”
-> withdrawal from documenting their endangered linguistic and cultural heritage
Community-internal obstacles III:
cultures with oral traditions
• No writing tradition; difficult to motivate literate speech community members to express knowledge in writing
• Most knowledgeable community members often cannot read or write; total lack of IT skills
• Recording is a better way of fixing knowledge
• Transmission of traditional knowledge is still “observing and learning by doing”
Capacity building in the speech
community
• Prerequisite: substantial training in the basics of lexicography and the usage of linguistic software
• Understanding lexicon structures (e.g. Toolbox) requires training and continuous familiarisation as well as constant repetition of usage over a protracted period of time
• Writing definitions, encyclopaedic articles and example sentences needs to be learned despite a simplified user interface
• New participants from the speech community have to be trained subsequently -> who does the training?
Psychological barriers
• Native speakers feel lost when having to edit lexical entries on their own
• Psychological block when writing lexical entries -> formal aspect of lexical entry structure puts pressure on contributors to do a good job
• Older community members have to learn to cooperate with younger community members who have good IT skills but lack knowledge about language and culture
Enrichment of lexicon with linguistic
and encyclopaedic knowledge
• Sensitising speakers to the difference between describing the meaning of a word/lexical unit (= definition) and writing an encyclopaedic article
-> encyclopaedic knowledge can be part of word meaning; lexical units can denote complex phenomena and procedures or culture-specific activities
• Enrichment of the lexicon with linguistic and cultural knowledge is still best achieved during fieldwork periods based on mutual dialogue between researcher and consultants
-> detailed investigations about language and culture, picking up on interesting comments, questions about grammar, semantic relations between lexemes, etc. are best obtained in face-to-face communication
-> miscommunications and misunderstandings can be instantly clarified
New proposal for online participation
by SC
• Both communities would like to have a limited “panel of moderators” interacting with the linguist
• Only reduced editing possibilities for the speech community
• Lexicon should be open to the community with reading rights only
• “Whiteboarding tool” should be available, coupled with the LEXUS tool
-> informal editing possible
Editing lexicon with whiteboarding
tool Twiddla
• Web-based tool, access to websites, easy editing possibilities; an edited page can be saved and sent as an attachment
Web-based whiteboarding tool
ReviewBasics
• Comment, annotate, mark up images, documents and videos, upload other media files, etc.
• User-friendly UI for editing documents and handling web-based collaboration
• Disadvantage: cannot access protected websites
Advantages of whiteboard editing
• Informal way of editing the lexicon and participating in its creation -> motivating effects on the speech community
• Pressure of producing good definitions, encyclopaedic articles, etc. is taken away; no need to deliver complete definitions, etc.
• Playful aspect motivates younger speech community members to participate and consequently learn about their language and culture
• No interference with the lexical database as such, only in accordance with moderators
• Workload reduced for the panel of moderators (accept or reject changes)
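The moderated workflow described above, in which informal suggestions never touch the lexical database until a moderator accepts them, can be sketched as follows. This is a minimal sketch under stated assumptions, not how LEXUS, Twiddla or ReviewBasics work: the names `Role`, `Suggestion`, `ModeratedLexicon` and all entry data are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    READER = "reader"        # community member with reading rights only
    MODERATOR = "moderator"  # member of the panel of moderators


@dataclass
class Suggestion:
    """An informal note on an entry, kept apart from the database."""
    headword: str
    note: str
    author: str
    status: str = "pending"  # "pending", "accepted" or "rejected"


class ModeratedLexicon:
    def __init__(self, entries):
        self.entries = dict(entries)  # headword -> definition text
        self.queue = []               # suggestions awaiting review

    def suggest(self, headword, note, author):
        # Anyone may suggest; the stored entries are left unchanged.
        suggestion = Suggestion(headword, note, author)
        self.queue.append(suggestion)
        return suggestion

    def review(self, suggestion, reviewer_role, accept):
        if reviewer_role is not Role.MODERATOR:
            raise PermissionError("only moderators may accept or reject")
        if suggestion.status != "pending":
            raise ValueError("suggestion was already reviewed")
        if accept:
            # Only an accepted suggestion changes the lexical database.
            current = self.entries.get(suggestion.headword, "")
            self.entries[suggestion.headword] = f"{current} {suggestion.note}".strip()
            suggestion.status = "accepted"
        else:
            suggestion.status = "rejected"


# Placeholder data only.
lexicon = ModeratedLexicon({"ENTRY_A": "draft definition"})
first = lexicon.suggest("ENTRY_A", "also used in context X", author="speaker_1")
second = lexicon.suggest("ENTRY_A", "unverified note", author="speaker_2")
lexicon.review(first, Role.MODERATOR, accept=True)
lexicon.review(second, Role.MODERATOR, accept=False)
print(lexicon.entries["ENTRY_A"])
```

Keeping suggestions in a separate queue is the design point: contributors can comment without pressure to produce a complete definition, and the moderators' task reduces to an accept or reject decision per suggestion.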
Conclusions
• A web-based tool like LEXUS can be a powerful tool for
1. Linguistic and cultural revival
2. Visualising primary and structural data together (e.g. lexicon) -> new form of archiving, making linguistic and cultural networks more visible in KS of ViCoS
• Online participation of the (Marquesan and Tuamotuan) speech community is problematic if LEXUS is set up in a wiki-like manner
• LEXUS needs to be adjusted to the culture-specific circumstances of the speech communities
• A simplified user interface for the SC will not solve the problem of online participation; contributors still need to learn the basics of lexicography
• Enriching a lexicon with detailed linguistic and encyclopaedic information through online participation of the SC is doubtful and will not replace extensive fieldwork