Visions: Future, Past, and Present

Chapter Nine
Visions
Future, past, and present
How to Build a Digital Library
Ian H. Witten and David Bainbridge


Digital libraries have practical advantages over
physical ones
Digital libraries offer the promise of far greater
universality
Mission of a Library

The mission of a library is twofold:
To collect, organize, and provide access to
information
 To pass it down to succeeding generations as a
record of culture

The Librarian’s Duty

The librarian has twin duties:
Access to the world’s literature for today’s readers
Preservation for future generations

Challenges for Digital Libraries



Today’s collections are mostly text
The real challenge is to create collections of
digital documents in diverse media types
Examples:

Music libraries that can be searched by humming
Libraries of the Future

Digital libraries
Have the potential to be far more flexible than
conventional ones
 Will be large
 Will not be static

Today’s Visions

Impersonal and utilitarian


Example: Figure 9.2
Real people in real environment
Example: Figure 9.3
 Kataayi cooperative in Uganda
 Low tech

Today’s Visions

Libraries are about connecting people with the
information they need
Tomorrow’s Visions

Sci-fi image


Personalized space


A kitchen for knowledge preparation
Workshop


Emphasis on preservation over access
Comfortable, personalized, dynamic, up to date
Your Visions?
Librarianship

Librarianship:
Selection, organization, and maintenance
 Wisdom and value judgments

What information to include
 How to organize the information

Working Inside the Digital Library

Digital Library


A library without walls but with boundaries
Working inside the digital library:
An environment that surrounds in an intellectual
sense
 More or less immersive
 Reacts and responds

Preserving the past
The Problem of Preservation

Technological progress comes at the expense of
preservation
The Problem of Preservation

Paper
Acid-based paper decomposes after only a few
decades
Film
Film containing nitrate decays quickly
Analog audio

Wax cylinders or magnetic tapes must be preserved
by transferring onto digital formats
The Problem of Preservation

A process of regular copying can be established
to preserve digital material without loss
The Digital Dark Ages

“No one understands how to archive digital
documents”
Preservation Technology




Enormous amounts of digital information are
already lost forever
Information technologies become obsolete very
quickly
Document and media formats continue to
proliferate
Technology standards will not solve
fundamental issues in the preservation of digital
information
Availability of Material



Libraries will shortly see a demographic bulge of
electronic material as the baby boom generation of
authors and academics contribute material gathered
during their careers
Much material will never make it into library collections
for preservation because of increasingly restrictive
intellectual property and licensing regimes
Archiving and preservation functions in a digital
environment will increasingly become privatized as
information continues to be commodified
Traditional Library Functions


Financial resources available to libraries and
archives continue to decrease
Libraries and archives will be required to
continue their existing archival and preservation
practices as the current paper publishing boom
continues
Preservation Strategies


Digital documents are vulnerable to loss because
the media on which they are stored decay and
become obsolete
They become inaccessible when the software or
hardware becomes obsolete
Preservation Strategies



Digital formats have advantages over analog
formats
Digital formats seem to promote preservation
The advantages make digital preservation even
harder
Preservation Strategies



Ease of creation causes information glut
Ease of copying makes “copies” seem
dispensable
Improvements in hardware and software
promote obsolescence
Preservation Strategies

“May all your problems be technical ones”
Computer people recognize that the technical
problems can be solved
 It’s the human part that causes problems



Administrative and political processes take time
and cause frustration
Technical problems have solutions which yield
to honest intellectual work
Preservation Strategies

Preservation is not a technical problem
Preservation Strategies

Four Preservation Strategies
Paper
 Museums
 Emulation
 Migration

Preservation Strategies

Paper and Museums
Involves printing the material on paper or microfilm
and storing in museums
 Not considered a long-term preservation strategy


Emulation and Migration

Involves preserving the physical stream of bits
and/or the logical means by which the bits are
interpreted as a document
Preservation Strategies

Emulation
Keeping the documents in exactly the same form
 Emulate the functionality of the original, obsolete
system on future, unknown systems

Preservation Strategies

Preserving the physical bit stream
Regular copying to new media
 Error detection to determine if degradation is
occurring
 Error correcting codes to ensure new generations
are faithful copies of the original
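
The copy-and-verify cycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an archival tool: it assumes SHA-256 digests as the error-detection mechanism, and the names `digest`, `refresh_copy`, `is_intact`, and `demo` are hypothetical. It covers detection only; error correction would need added redundancy (extra copies or an error-correcting code), which is not shown.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def refresh_copy(source: Path, target: Path) -> str:
    """Copy source to target (the "new medium") and verify the copy.

    Returns the digest to record; raises OSError if the copy differs.
    """
    expected = digest(source)
    shutil.copyfile(source, target)
    if digest(target) != expected:
        raise OSError(f"copy of {source} is not faithful")
    return expected


def is_intact(path: Path, recorded_digest: str) -> bool:
    """Error detection: is the stored bit stream unchanged since recording?"""
    return digest(path) == recorded_digest


def demo() -> tuple[bool, bool]:
    """Copy a file, then simulate decay by flipping one bit in the copy."""
    with tempfile.TemporaryDirectory() as d:
        old = Path(d) / "old_medium.bin"
        new = Path(d) / "new_medium.bin"
        old.write_bytes(b"digital document " * 1000)
        recorded = refresh_copy(old, new)
        before = is_intact(new, recorded)
        data = bytearray(new.read_bytes())
        data[100] ^= 0x01  # a single flipped bit
        new.write_bytes(bytes(data))
        after = is_intact(new, recorded)
    return before, after


print(demo())
```

Recording the digest at copy time is what makes later degradation detectable: the check needs no access to the original medium, only to the stored digest.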

Preservation Strategies

Preserving the logical interpretation
Emulate old interpreters on new hardware
 Backward compatibility

Preservation Strategies


An important feature of any format used for
preserving documents is that it is open: the
details are made publicly available
It must be open in principle as well as practice
Documented well enough for others to understand
and build their own interpreters
 Examples: PostScript and PDF

Preservation Strategies

Migration

Translating the document from the old format to a
format accepted by new software

Designed for near-obsolete software
Involves copying the physical bit stream to new
media
 Involves translation to a new logical format

Preservation Strategies

Emulation or Migration?

Migration may be cheaper
No special emulation software needs to be written
 Conversion software is usually available


Conversion is a kind of translation

May lose features of the data
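
The point that conversion may lose features can be made concrete with a toy migration. The sketch below is purely illustrative: both the "old" and "new" formats are hypothetical dictionaries invented for this example, and `migrate` and `NEW_FORMAT_FIELDS` are assumed names, not part of any real conversion tool. It translates a document to the new logical format and reports which features could not be carried over.

```python
from typing import Any

# Fields that the (hypothetical) new format is able to represent.
NEW_FORMAT_FIELDS = {"title", "author", "body"}


def migrate(old_doc: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Translate a document to the new logical format.

    Returns the migrated document and a sorted list of the old
    fields that have no counterpart in the new format.
    """
    new_doc = {k: v for k, v in old_doc.items() if k in NEW_FORMAT_FIELDS}
    new_doc["format_version"] = 2
    lost = sorted(k for k in old_doc if k not in NEW_FORMAT_FIELDS)
    return new_doc, lost


old = {
    "title": "Visions",
    "body": "Future, past, and present",
    "font": "Courier",
    "page_breaks": [40, 80],
}
migrated, lost = migrate(old)
print(migrated)
print("features lost in migration:", lost)
```

Reporting the dropped fields, rather than discarding them silently, lets a curator decide whether the loss is acceptable before the old copy is retired.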
Generalized Documents: A
Challenge for the Present


Text remains the principal means for searching
and browsing collections, even when they
contain documents in other media
Multimedia documents can be displayed
Linked to text documents
 Text may contain only captions
 Text is browsed and searched

Digital Libraries of Music



Music information retrieval
Motifs in music are analogous to key phrases in
text
OMR
Optical music recognition
 Music analog of OCR
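
One way to picture motif search is to treat a melody as a sequence of pitch intervals, so that a hummed query matches regardless of the key it is hummed in. The sketch below is a simplified illustration, not the method of any particular system: it assumes melodies are already transcribed to MIDI note numbers, uses exact matching, and the names `intervals` and `find_motif` and the toy collection are hypothetical. A real system would also need pitch tracking from audio and tolerance for humming errors.

```python
def intervals(notes: list[int]) -> list[int]:
    """Pitch differences between consecutive notes, in semitones."""
    return [b - a for a, b in zip(notes, notes[1:])]


def find_motif(melody: list[int], query: list[int]) -> list[int]:
    """Return note positions in `melody` where `query` occurs in any key."""
    m, q = intervals(melody), intervals(query)
    if not q:
        return []  # a single note carries no interval information
    return [i for i in range(len(m) - len(q) + 1) if m[i:i + len(q)] == q]


# Toy collection: titles mapped to MIDI note numbers (illustrative data).
collection = {
    "Twinkle Twinkle": [60, 60, 67, 67, 69, 69, 67],
    "Frere Jacques": [60, 62, 64, 60, 60, 62, 64, 60],
}

# The user hums "do re mi do" a fourth higher than the stored tune.
hummed = [65, 67, 69, 65]

hits = {}
for title, notes in collection.items():
    positions = find_motif(notes, hummed)
    if positions:
        hits[title] = positions
print(hits)
```

Comparing intervals instead of absolute pitches plays the role that case folding or stemming plays in text search: it removes variation that should not affect a match.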

Other Media




Images
Videos
Objects
Other Document Types
Images

Thumbnails
Visual material can be rapidly browsed using
thumbnails
 Captures the reader’s attention
 Gives a feeling for what the collection is about
 Images are difficult to search automatically, as
opposed to browsing them manually

Videos

Video
A sequence of pictures?
Cut detection
Techniques for locating where the scene changes
Movies
Browsed and manipulated using thumbnails
 Each thumbnail represents a typical image or the
initial image in a scene
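
A simple way to locate scene changes is to compare the brightness histograms of consecutive frames and flag a cut when they differ sharply. The sketch below is a minimal illustration under stated assumptions: frames are flat lists of grayscale pixel values (0 to 255), the threshold of 0.5 is an arbitrary choice, and `histogram` and `detect_cuts` are hypothetical names. It is one common approach, not necessarily the one the authors have in mind.

```python
def histogram(frame: list[int], bins: int = 8) -> list[float]:
    """Normalized grayscale histogram of a frame (pixel values 0-255)."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // 256] += 1
    return [c / len(frame) for c in counts]


def detect_cuts(frames: list[list[int]], threshold: float = 0.5) -> list[int]:
    """Return indices i where a cut is detected between frame i-1 and frame i."""
    cuts = []
    for i in range(1, len(frames)):
        h0, h1 = histogram(frames[i - 1]), histogram(frames[i])
        # Half the L1 distance between histograms: 0 = identical, 1 = disjoint.
        diff = sum(abs(a - b) for a, b in zip(h0, h1)) / 2
        if diff > threshold:
            cuts.append(i)
    return cuts


# Synthetic "video": three dark frames, then three bright frames.
dark = [20, 25, 30, 22] * 4
dark_noisy = [21, 27, 29, 24] * 4
bright = [220, 230, 210, 225] * 4
video = [dark, dark_noisy, dark, bright, bright, bright]

cuts = detect_cuts(video)
print(cuts)

# One thumbnail per scene: take the initial frame of each scene.
key_frames = [0] + cuts
print(key_frames)
```

The resulting key-frame indices are what a browsing interface would render as thumbnails, one per scene.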

Objects

Realia
Real artifacts
 Computer graphics allow three-dimensional objects
to be captured in the form of a data set


Artifacts


In libraries and museums, artifacts are indexed and
located on the basis of metadata
Books

Can be modeled as physical objects
Other Document Types

Teaching material
Multimedia elements
Research material
Laboratory notebooks
Scientific and engineering data
Results of experiments, simulations, and surveys
 Information is expressed in many forms

Generalized Documents in
Greenstone

Digital Library


Focused collection of digital objects, including text,
video, and audio
The Challenge

Integrate objects of all kinds of media into digital
libraries in such a way that each becomes a first-class
citizen
Generalized Documents in
Greenstone

Greenstone does not incorporate searching and
browsing techniques for non-textual media
Generalized Documents in
Greenstone

Solutions to current Greenstone limitations:
New modules can be added
 New search engine can be deployed by replacing or
augmenting the MG system that does text searching
 Browsing horizontal and vertical lists can be handled
by adding a new classifier
 New browsers can be added through Perl code
 New media types can be imported by adding new
plug-ins

Digital Libraries for Oral Cultures


Libraries are about literature
Literature:
The writings of a society, in prose or verse
 Broadly speaking, literature includes all types of
fiction and nonfiction writing intended for
publication

Digital Libraries for Oral Cultures



It should be possible to create digital library
collections intended for people in oral cultures
Useful for people who may be illiterate or semiliterate
Useful for people who cannot speak or read the
language of the digital library
Digital Libraries for Oral Cultures

Iconic Form


Serious practical information can be conveyed in a
purely iconic form
Examples
How to splint a broken forearm
 User manual for underground transport system
 Historical precedent of Beggar’s Bibles

Digital Libraries for Oral Cultures

Libraries for the illiterate


We are all illiterate with respect to some other
languages and cultures
Media types:
Static images
 Motion, sound, video, interaction, 3D objects,
simulations, virtual reality
