Jeremy Nordmoe

SIL International
It is crucial that the capture of
metadata occur as close as
possible to the collection of
resources to ensure that vital
information is collected accurately
and in a timely manner.
The SIL International
Language & Culture
Archives
42,000+ items
1,500+ languages
Given this scope and diversity of languages, the
archives must rely on field linguists to submit
complete and accurate metadata.
Past Practice
• Field linguists filled out a one-page
metadata questionnaire to accompany
physical material
• Archives staff processed items into the
database and onto shelves
Past Practice: Problems
1. Archiving physical material from the
field was cumbersome and risky
2. Missing or incomplete questionnaires
required Archives staff to:
o research the missing information,
o settle for minimal description, or
o postpone processing for lack of key
metadata.
New Practice
Deployment of a DSpace
Institutional Repository
• Facilitates digital archiving
• Empowers field linguists to directly
engage with the archives both in
data discovery and data
submissions
Roadblocks
Two issues hamper the
submission of language resources
to DSpace from the field:
1) a wide range of metadata options
2) limited internet connectivity
Challenge #1
So much metadata, so little time…
Deploying a simplified interface that handles the
intricacies of metadata schemas used for vastly
different types of resources:
• language documentation
• vernacular literacy products
• translated texts
• language & culture descriptions
• training materials
Many of the metadata fields are only relevant for
specific kinds of documents
DSpace web submission process
Challenge #2
A field linguist’s internet blues…
Working in remote areas with unreliable
or non-existent internet connections
Conventional web applications are
ineffective
File uploads over HTTP are not
resumable
Building an On-RAMP to
the Digital Repository
Resource And Metadata Packager
(RAMP)
a client side application that assembles
metadata and all relevant data files in a
‘package’ that the SWORD API decodes
into a submission in the repository
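
The packaging idea can be sketched as follows. This is a minimal illustration only: the slides do not describe the package layout RAMP produces or that the SWORD endpoint expects, so the manifest name `metadata.json`, the `data/` folder, and the function `build_package` are assumptions made for this sketch.

```python
import io
import json
import zipfile


def build_package(metadata, files):
    """Bundle a metadata record and data files into one in-memory zip archive.

    `metadata` is a dict of descriptive fields; `files` maps file names to bytes.
    The manifest name "metadata.json" is a placeholder chosen for this sketch,
    not the format RAMP or SWORD actually uses.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The descriptive record travels inside the same archive as the data.
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        for name, payload in files.items():
            archive.writestr("data/" + name, payload)
    return buffer.getvalue()


package = build_package(
    {"title": "Example word list", "language": "eng"},
    {"wordlist.txt": b"one\ntwo\nthree\n"},
)
```

Keeping metadata and files in a single archive means the package can be uploaded or copied to portable media as one unit.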
RAMP Initial Screen
Dots are labeled, show
progress through the steps
and allow jumping between
steps
Required metadata is denoted by an
orange bar to the left of the input
box. Brief help text appears below.
Metadata wizardry
Linguists proceed through a series of
data entry screens, each addressing a
small group of metadata elements
similar to a software installation
wizard.
As the user enters descriptive information, the selection and
contents of subsequent data entry screens are affected. As a
result, the user never encounters irrelevant questions.
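
One way to picture this conditional flow is a list of screens, each with an optional condition on earlier answers. The screen ids, field names, and the `show_if` key below are hypothetical and are not taken from RAMP itself.

```python
# Hypothetical screen definitions: each screen lists its fields and an
# optional condition on earlier answers that must hold for it to be shown.
SCREENS = [
    {"id": "basics", "fields": ["title", "resource_type"]},
    {"id": "recording", "fields": ["duration", "speakers"],
     "show_if": {"resource_type": ["audio", "video"]}},
    {"id": "translation", "fields": ["source_language"],
     "show_if": {"resource_type": ["translated text"]}},
    {"id": "contributors", "fields": ["name", "role"]},
]


def visible_screens(screens, answers):
    """Return the ids of screens whose conditions match the answers so far."""
    shown = []
    for screen in screens:
        condition = screen.get("show_if", {})
        # A screen with no condition is always shown.
        if all(answers.get(field) in allowed
               for field, allowed in condition.items()):
            shown.append(screen["id"])
    return shown


print(visible_screens(SCREENS, {"resource_type": "audio"}))
# ['basics', 'recording', 'contributors']
```

A user describing an audio recording never sees the translation screen, and vice versa.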
Users simply type ahead to select
a language code and name
Contributors are easily
classified by role
Users can initiate an
upload to the
repository directly
from this Summary
screen.
Users can also
export the package
to a portable media
device, if internet is
a problem.
Before uploading,
users have the
opportunity to
review all metadata
on one convenient
summary screen.
Users are encouraged to
maintain a library. A package
may be duplicated as a
template for future packages.
Uploading in chunks
The package of files and metadata is broken down into
chunks that are transmitted separately. If a portion
fails to upload, that portion is attempted again, thus
avoiding the need to re-send the entire package.
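
The retry-per-chunk behaviour can be sketched as below. The chunk size, retry limit, and the `send_chunk` callback are assumptions for illustration; the slides do not specify how RAMP transmits chunks or how many attempts it makes.

```python
def split_into_chunks(data, chunk_size):
    """Split a bytes object into consecutive pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_in_chunks(data, send_chunk, chunk_size=4, max_attempts=3):
    """Send each chunk in order, retrying only the chunk that failed.

    `send_chunk(index, chunk)` is a caller-supplied function that raises
    ConnectionError on failure. Returns the total number of send attempts.
    """
    attempts = 0
    for index, chunk in enumerate(split_into_chunks(data, chunk_size)):
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            try:
                send_chunk(index, chunk)
                break  # this chunk arrived; move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up after repeated failures on one chunk
    return attempts


# Simulated flaky connection: the second chunk fails once, then succeeds.
received = {}
failures_left = {1: 1}


def flaky_send(index, chunk):
    if failures_left.get(index, 0) > 0:
        failures_left[index] -= 1
        raise ConnectionError("simulated dropout")
    received[index] = chunk


total_attempts = upload_in_chunks(b"0123456789", flaky_send)
reassembled = b"".join(received[i] for i in sorted(received))
```

In this simulation only the failed chunk is sent a second time; the chunks that already arrived are not re-sent.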
How it works
The data entry screens, the rules (condition
steps) for displaying them, and the context-sensitive
help are dynamically generated
from a conveniently editable YAML
document maintained by the archiving
staff.
o Allows for quick and effortless changes
without requiring program code to be
written or modified.
o Customizable to work in additional
contexts.
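
As a rough sketch of this data-driven approach, the example below builds a screen from an editable text definition rather than from program code. JSON stands in for YAML here only so the sketch runs with the standard library; the field names, the `required` and `help` keys, and both helper functions are hypothetical.

```python
import json

# Stand-in for the editable definition document. The slides describe a YAML
# file; JSON is used here only so the sketch needs nothing beyond the
# standard library. All field names are hypothetical.
DEFINITION = """
{
  "screens": [
    {
      "title": "Language",
      "fields": [
        {"name": "language_code", "required": true,
         "help": "Type a few letters to find the language."},
        {"name": "dialect", "required": false,
         "help": "Optional: name the dialect, if known."}
      ]
    }
  ]
}
"""


def render_screen(screen):
    """Build the text lines for one data entry screen from its definition."""
    lines = [screen["title"]]
    for field in screen["fields"]:
        marker = "*" if field["required"] else " "  # flag required fields
        lines.append(f"{marker} {field['name']}: {field['help']}")
    return lines


def missing_required(screen, answers):
    """List required fields that the user has left empty."""
    return [f["name"] for f in screen["fields"]
            if f["required"] and not answers.get(f["name"])]


screen = json.loads(DEFINITION)["screens"][0]
for line in render_screen(screen):
    print(line)
```

Editing the definition text (adding a field, changing help wording, marking a field required) changes what the user sees without touching the functions.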
Future development
• integrate RAMP into other SIL open source
tools. The first of these will be SayMore – the
language documentation session organizer.
http://saymore.palaso.org/
• include a help menu providing more detailed
assistance.
• add a feature allowing users to create custom
templates or choose from a template library.
• add a mechanism for sorting and searching the
RAMP library.

Implementation
• RAMP launched alongside our institutional
repository in January 2011
• To date, early adopters have packaged and
uploaded roughly 1400 items representing a
wide variety of resources.
• Feedback from linguists has confirmed that
RAMP streamlined clunky DSpace features
and simplified the descriptive process by
limiting choices.
Ongoing Challenges
Insufficient Upload Capacity
At launch, RAMP handled files only as large
as 250MB, and many users expect to
upload considerably larger data sets,
especially in audio and video formats.
Recent development work has increased
capacity to at least 1.25GB per upload
Ongoing challenges
The SWORD API
• lacks specific error messages when an
import fails, which frustrates users and
makes troubleshooting difficult
• is unable to import several key pieces of
metadata requested in DSpace’s
‘initial questions’ and ‘upload’ screens,
requiring submission reviewers to
insert this data manually.
Ongoing challenges
Greater Simplification
Feedback reveals that the descriptive
process in RAMP may still be too
cumbersome for some.
Factors to investigate:
1. Generational
2. Cross-cultural
3. Work-load
Lessons learned
• Better communication and the enabling
of the auto-update during field beta
testing.
• Training plan lagged behind the launch.
• Integrating the manual, currently
delivered via the corporate intranet, into
the initial release in order to serve users
who have subpar internet capability.
CONCLUSION
Good metadata collection will vastly improve
the preservation and discovery of language
resources archived in both traditional and
digital repositories.
• Linguists who are collecting, analyzing and
publishing these resources are the experts
when it comes to describing their work.
• By addressing the dual obstacles of a complex
metadata schema and inadequate internet in
the field, RAMP enables linguists to easily
submit quality archive packages from the field.

AVAILABILITY
SIL invites the linguistic community to
examine, adapt and improve upon RAMP as a
tool for increasing both the quality and
quantity of documentation about the world’s
languages.
RAMP is built on Adobe AIR 2.7.1
Free download: http://get.adobe.com/air/
Download RAMP: http://ramp.leancoder.com/
Open source under the GNU General Public License v3.0