digital library - Alexandria University

Introduction to Digital Libraries
Digital Libraries Defined
What is a Library?
Main Entry: li·brary Pronunciation: 'lI-"brer-E;
1) a place in which literary, musical, artistic, or reference materials (as
books, manuscripts, recordings, or films) are kept for use but not for sale
2) a series of related books issued by a publisher
3) a collection of publications on the same subject
4) a collection of cloned DNA fragments that represent the genetic
material of a particular organism or tissue
What is a Library?
Collection of books, documents, newspapers, audio visual
materials kept and organized for people to read or borrow.
Characteristics
1.Collection of data objects
2.Collection of Metadata Structures
3.Collection of Services
4.Domain Focus
5.Quality Control
6.Preservation
A History of Libraries
Lyceum - Ancient Greece
– http://en.wikipedia.org/wiki/Lyceum
Alexandria - Ancient Egypt
– http://en.wikipedia.org/wiki/Library_of_Alexandria
Boston Public Library - First US public lending library
(1848)
– http://www.bpl.org/
– “The commonwealth requires the education of the people as
the safeguard of order and liberty”
the memex
The memex was a proposed desktop machine
that would store millions of books in microfilm.
It would have a mechanism that would allow any
known item to be retrieved from the collection rapidly.
But the problem remains: which items to look at?
memex
Vannevar Bush’s vision
– How far have we come?
– What did you notice about this article -- style or content or background or
anything else.
– Did the article suggest anything you would not want to see happen?
Image source:
kelty.rice.edu/375/images/memex/camera.jpg
http://www.knowledgesearch.org/presentations/etcon/images/memex.gif
digital libraries
• Generally, we can think of digital
libraries as
– information stored on a computer
– delivered via a network
– mimicking existing libraries
Digital libraries access multimodal data
[Figure: images, texts, video, and audio available through a digital library]
the semantic web
The semantic web is the actual successor to Lick’s
(J. C. R. Licklider’s) vision.
It’s still not done.
He also had too optimistic a vision about AI.
Searching images
[Figure: image search by query example, with relevance feedback from
positive and negative examples. Source: ICUDL06, YT Zhuang]
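Relevance feedback with positive and negative examples can be illustrated with a small sketch. The slide does not say which method the pictured system uses; the code below swaps in the classic Rocchio update as one common choice, and the feature vectors, weights, and function name are assumptions made for illustration only.

```python
def rocchio(query, positives, negatives, alpha=1.0, beta=0.5, gamma=0.5):
    """Move a query vector toward positive examples and away from negative ones.

    All vectors are plain lists of floats of equal length (for instance,
    hypothetical image feature vectors). The weights are illustrative.
    """
    def mean(vectors):
        # Component-wise average; an empty set contributes nothing.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    pos_mean = mean(positives)
    neg_mean = mean(negatives)
    return [
        alpha * q + beta * p - gamma * n
        for q, p, n in zip(query, pos_mean, neg_mean)
    ]


# The user marks one result as relevant and one as not relevant.
refined = rocchio(
    query=[1.0, 0.0],
    positives=[[1.0, 1.0]],
    negatives=[[0.0, 1.0]],
)
```

The refined query is then run again, so each round of feedback should bring the result list closer to what the user meant.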
What is a Digital Library (DL)?
“…a managed collection of information, with
associated services, where the information is
stored in digital formats and accessible over a
network”
– there are any number of alternate definitions, but this
seems fair enough
DL Definition
According to Gladney, H.M., et al. (1994):
“A digital library service is an assemblage of digital computing,
storage, and communications machinery together with the
software needed to reproduce, emulate, and extend the services
provided by conventional libraries based on paper and other
material means of collecting, storing, cataloguing, finding, and
disseminating information.”
DL Definition (1)
– Paul Duguid (1997) defines the digital library as an environment that
supports the full life cycle of information, not just a digital collection
with information management tools.
The concept of a "digital library" is not merely equivalent to a digitized
collection with information management tools. It is rather an environment to
bring together collections, services, and people in support of the full life cycle
of creation, dissemination, use, and preservation of data, information, and
knowledge. (Duguid, Paul, 1997).
DL Definition (2)
– The Internet is the digital library.
• Many different groups have appropriated the word “library” to
signify simply a collection of digital objects that people can
access from their desktops.
• But is this a "digital library"?
• For many common library requests, locating information on the
Internet remains highly inefficient compared to traditional library
sources; finding information is difficult.
DL Definition (3)
– Digital libraries will be cheaper than print libraries.
• A common assumption among technology reporters
about the costs of "digital libraries" is that digital is
cheaper than paper.
• It is no surprise that digital content providers are
resorting to contract agreements and licensing
mechanisms instead of normal copyright provisions.
Definitions
Digital Library
Collection of electronic resources that provides direct/indirect access to
a systematically organized collection of digital objects.
Hybrid Library
Provides services in a mixed-mode, electronic and paper, environment,
particularly in a coordinated way. Derived from a strand of eLib
which explored the issues surrounding the retrieval and delivery of
information in these types of environment but also investigated the
integration of different electronic services so that a single search
approach could be offered to the end user.
Virtual Library
Access to electronic information in a variety of remote locations
through a local online catalogue or other gateway, such as the Internet.
Advantages: Why digital libraries?
Access: brings library to users
– always available; better and wider delivery
Sharing: information resources; linking
Timeliness: easier to keep current
Searching, browsing: use of computer power
Information resources: new forms possible
Services: new & new forms possible
Costs: may save effort, money??
benefits: availability
Digital libraries bring the information closer to the
user than physical libraries can
– physically
– temporally
Even when you are in the physical library you still
get faster access to digital library items.
benefits: findability
Information can be more easily found in digital
than in print.
Some non-textual information is still only findable
via metadata.
But computer scientists are working on that.
benefits: sharing
Information can be shared.
Items cannot be damaged.
Items cannot be stolen.
benefits: updating
Information can be kept up-to-date more easily.
To update a book, you have to reprint all copies,
and replace them.
benefits: new media
Information can be created and manipulated in
completely new ways.
For example, location information can be combined
with subject information.
issue: costs
The cost of storing print information is very high.
It is a multiple of acquisition costs.
Digital storage devices decline in price.
But digital information manipulation requires
skills that are not easy to procure.
The overall cost comparison is difficult to assess.
The Study of Digital Libraries is
Multidisciplinary
computer science
– tools, protocols, transport
information science
– models of information access and storage
human factors
– usability, adaptability
law
– rights management
economics
– “it’s all about using…”
Problems for libraries
Integration between print and digital
– mixing new digital technology with print, local with global;
managing diverse resources - all difficult
– economic trade-off decisions; new economic relations
Competition for scarce resources sharpening
Institutional, cultural & social adjustments not easy
Bridging the digital divide
Resistance, threats:
– guerilla warfare within and nuclear annihilation without
drawbacks: monopoly dangers
Since the information only needs to be kept in
one copy, and others can access it, there are
inherent dangers of the build-up of monopolies.
One example is the Google search engine.
digital information was hard to use
Computers had to be driven by esoteric
commands.
Screens were hard to read from.
Telephone lines were hard to get to work to
transmit information.
Access costs to digital information were high.
The service aspect was important.
Economic issues
Costs not insignificant - WHO PAYS?
– Two traditions: old - users, new (“free”) - providers
Dilemma in library budgets
– licensing of digital publications vs. subscriptions
Publishers’ economics for digital publications
– approaches vary, not settled, even scared
– even: who is a publisher? - lines blurring
Economics of digital libraries still up in the air
– room for research & experimentation
Social issues
Legal issues: copyright protection, security
Individual: privacy protection; rights; obligations
– role in information exchanges, work, needs; life ...
Organizations: integration; changing structure
Traditional libraries: disappearing? changing?
Education: impact on all levels; integration
Computing & society: disparity between information
rich & poor; digital divide; equity
Example: The Internet Archive
Example: National Library of Australia
Example: National Library of Sweden
Egyptian Universities Libraries
Egyptian Libraries
building aspect
Building a digital library can basically take three
forms:
– electronic resource management
– repository building
– cross-repository services
Types of Digital Libraries
1. Stand-alone Digital Library (SDL)
2. Federated Digital Library (FDL)
3. Harvested Digital Library (HDL)
Stand-alone Digital Library (SDL)
This is the regular classical library implemented in a fully
computerized fashion. SDL is simply a library in which the
holdings are digital (scanned or digitized). The SDL is self-contained: the material is localized and centralized.
The ACM Digital Library
IEEE Computer Society DL
Federated Digital Library (FDL)
This is a federation of several independent SDLs, organized
around a common theme and coupled together on the network. An
FDL comprises several autonomous SDLs that form a networked
library with a transparent user interface. The different SDLs
are heterogeneous and are connected via communication
networks.
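A federation of autonomous libraries behind one transparent interface can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class names, the substring search, and the sample titles are all hypothetical, and a real FDL would talk to its members over a network protocol rather than in-process calls.

```python
from dataclasses import dataclass, field


@dataclass
class StandAloneLibrary:
    """A hypothetical autonomous SDL with its own local holdings."""
    name: str
    holdings: list = field(default_factory=list)

    def search(self, term):
        # Each SDL answers queries against its own collection only.
        return [t for t in self.holdings if term.lower() in t.lower()]


@dataclass
class FederatedLibrary:
    """A hypothetical FDL: one interface in front of several SDLs."""
    members: list

    def search(self, term):
        # Forward the query to every member and merge the answers,
        # tagging each hit with the library it came from.
        hits = []
        for library in self.members:
            for title in library.search(term):
                hits.append((library.name, title))
        return hits


library_a = StandAloneLibrary("University A", ["Thesis on Digital Preservation"])
library_b = StandAloneLibrary(
    "University B", ["Dissertation on Metadata", "Digital Divide Study"]
)
federation = FederatedLibrary([library_a, library_b])
results = federation.search("digital")
```

The user issues one query and sees one merged result list; which member holds each item is an implementation detail, which is what "transparent" means here.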
Networked Digital Library of Theses and Dissertations
Bibliographic Navigation Tools for Digital
Libraries
SCOPUS
ELIN
Knowledge Cite Library
Database Advisor
OCLC’s FirstSearch
Harvested Digital Library (HDL)
This is a virtual library providing summarized access
to related material scattered over the network.
An example of an HDL is the Internet Public Library
(IPL).
1. An HDL holds only metadata with pointers to the
holdings that are "one click away" in Cyberspace.
2. Developed by library professionals or computer
scientists
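The point that an HDL keeps only metadata and pointers, never the holdings themselves, can be made concrete with a small sketch. Everything named here is an assumption for illustration: the record fields, the class names, and the example.org URLs are hypothetical and do not describe how the IPL is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Summary of a remote item; the item itself stays at its source."""
    title: str
    subject: str
    url: str  # pointer to the holding, "one click away"


class HarvestedLibrary:
    def __init__(self):
        self.records = []

    def harvest(self, source_records):
        # Copy only the descriptive fields and the pointer; any full
        # content carried by the source record is deliberately ignored.
        for item in source_records:
            self.records.append(
                MetadataRecord(item["title"], item["subject"], item["url"])
            )

    def browse(self, subject):
        # Users browse the summaries and follow the pointer to the item.
        return [(r.title, r.url) for r in self.records if r.subject == subject]


library = HarvestedLibrary()
library.harvest([
    {"title": "World Atlas", "subject": "reference",
     "url": "https://example.org/atlas", "content": "(large file)"},
    {"title": "Poetry Anthology", "subject": "literature",
     "url": "https://example.org/poems", "content": "(large file)"},
])
reference_items = library.browse("reference")
```

Because the harvested library stores no content, it stays small, but every link depends on the remote source remaining available.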
Four Cornerstones of a
Digital Library
Community
Computer
Communication
technologies
Content
Contents
Images
.BMP .TIF .GIF .PNG .WMF .PICT .PCD .EPS
.EMF .CGM .TGA .JPG
Animation
.ANI .FLI .FLC
Video
.AVI .MOV .MPG .QT
Contents
Audio
.WAV .MID .SND .AUD .mp3
Web Page
.HTM .HTML .DHTML .HTMLS .XML
Text
.DOC .TXT .RTF .PDF
Programs
.COM .EXE
Contents
Markup standards
1. Hypertext Markup Language (HTML);
http://www.w3.org/MarkUp/
2. Extensible Markup Language (XML);
http://www.w3.org/XML/
3. Standard Generalized Markup Language
(SGML); http://www.w3.org/MarkUp/SGML/
Contents
Metadata standards
1. Dublin Core;
http://dublincore.org/
2. MARC 21;
http://lcweb.loc.gov/marc/
3. Encoded Archival Description (EAD);
http://lcweb.loc.gov/ead/
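To show what a record in one of these metadata standards looks like, here is a minimal sketch that builds a simple Dublin Core description with Python's standard library. The element names (title, type, format, language) and the namespace URI come from the Dublin Core element set; the `record` wrapper element, the helper name, and the field values are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element with one dc:* child per metadata field."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = dublin_core_record({
    "title": "Introduction to Digital Libraries",
    "type": "Text",
    "format": "application/pdf",  # placeholder value
    "language": "en",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Printing `xml_text` shows each field as a `dc:`-prefixed element. MARC 21 and EAD records are far richer, but the principle of describing an object with standardized, named fields is the same.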
How is a DL Different from a
Traditional Library?
TL has as its focus physical objects
– even if the card catalog (metadata) is electronic, the purpose
is to point you to a physical location
– trafficking in physical objects has both obvious and subtle
implications
• object can exist only in 1 place
• if you have it, I can’t have it (zero-sum distribution)
• I have to go to the object, or wait for it to come to me
TLs vs. DLs
DLs clearly better than TLs at:
– Disseminating and storing a variety of information
However, TL objects are more survivable
– Who will archive the research information?
• the publishers?
• the institutions?
• the authors?
– Will the average DL object still be accessible in 10
years?
How is a DL Different from a
Traditional Library?
Digital Library
• removing the physical restriction has obvious benefits
• multiple access, multiple listings, electronic transmission
• also complicates many other issues...
• intellectual property, terms and conditions, etc.
Note that a TL offers additional social and educational benefits
• Most TLs offer hybrid services too.
TLs vs. DLs
Where does publishing stop, and libraries begin?
– there have always been tensions between TLs and
traditional publishers, but the roles were fairly well
defined
– DLs can muddle the separation of these
responsibilities
• result: conflict, and/or new models
Traditional Players
[Figure: book store, publisher, library, and archive, arranged by
responsibility over time]
How is a DL different from a
database?
A traditional SQL database has as its basic element data
items in a relation:
    SELECT name
    FROM employee, project
    WHERE employee.deptnumber = '25' AND project.number = '100'
Databases exploit known structures and relations.
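The query on this slide can be tried against a throwaway in-memory database. This is a sketch under stated assumptions: the table layouts and sample rows are invented for illustration, and only the column names used in the query (`name`, `deptnumber`, `number`) come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schemas; the slide names only the columns used in the query.
conn.execute("CREATE TABLE employee (name TEXT, deptnumber TEXT)")
conn.execute("CREATE TABLE project (number TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Alice", "25"), ("Bob", "30")],
)
conn.executemany(
    "INSERT INTO project VALUES (?, ?)",
    [("100", "Digitization"), ("200", "Cataloguing")],
)

# The query as written on the slide: it filters both relations but does
# not join them on a shared key, so it works on their cross product.
rows = conn.execute(
    "SELECT name FROM employee, project "
    "WHERE employee.deptnumber = '25' AND project.number = '100'"
).fetchall()
conn.close()
```

The query works only because the structure of both relations is known in advance. A digital library usually cannot assume this, since its objects are heterogeneous documents described by metadata.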
How is a DL different from the WWW?
The keyword is managed
– The WWW is not managed
Some meta searchers (Yahoo, Lycos, Google)
attempt to add an organizational framework to their
web holdings
– However, most are focused on keyword searching (e.g.,
Google)
How is a DL different from the
WWW?
Another key difference is who controls the input
into the system
– most meta searchers hunt down their holdings
– some (DMOZ, Yahoo) have humans in the loop for
review and classification
DLs are generally more tightly controlled, and
have a targeted customer set