Collaboration on Digital Libraries
NEC
Dec. 27, 2000
Edward A. Fox
fox@vt.edu http://fox.cs.vt.edu
CS
DLRL
Internet TIC
Virginia Tech, Blacksburg, VA, USA
Acknowledgements (Selected)

Mentors: JCR Licklider, Michael Kessler, Gerard Salton

Sponsors: Adobe, IBM, Microsoft, NLM, NSF, OCLC,
SOLINET, SURA, UNESCO, US Dept. of Ed. (FIPSE), …

VT Faculty/Staff: Tony Atkins, Debra Dudley, John Eaton,
Gwen Ewing, Peter Haggerty, JAN Lee, Gail McMillan,
Manuel Perez, Len Peters, James Powell, …

VT Students: Emilio Arce, Fernando Das Neves, Brian
DeVane, Robert France, Marcos Goncalves, Scott Guyer,
Robert Hall, Brian Hobbs, Neill Kipp, Paul Mather, Tim
McGonigle, Todd Miller, Constantinos Phanouriou,
William Schweiker, Ohm Sornil, Hussein Suleman, Patrick
Van Metre, Laura Weiss, …
URLs
• http://fox.cs.vt.edu
• http://ei.cs.vt.edu/~dlib (Courseware)
• http://www.dlib.org (D-Lib Magazine)
• www.smete.org and later www.nsf.gov/nsdl
• www.ndltd.org and www.theses.org
• www.cstc.org (CSTC and JERIC)
• www.openarchives.org
• www.jcdl.org (JCDL’2001 – June 24-28)
Digital Library Courseware
http://ei.cs.vt.edu/~dlib/
• WWW pages or large PDF copy files
• CourseInfo quizzes based on books by Michael Lesk (MKP.com) and William Arms (MIT Press)
• Contents based on books, with other popular topics added (e.g., agents)
• Separate pages to supplement: Definitions, Resources (People, Projects), and References
JCDL 2001
• First Joint ACM/IEEE Conference on Digital Libraries (+ NSF DLI-2 PI mtg), http://www.jcdl.org
• June 24-28, 2001 in Roanoke, VA
• Conference Committee:
– General Chair: Edward A. Fox, Virginia Tech
– Program Chair: Christine Borgman, UCLA
– Treasurer: Neil Rowe, Naval Postgraduate School
– Posters Chair: Craig Nevill-Manning, Rutgers U.

Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes are Communications (bandwidth, connectivity), Computing (flops), and Digital content (less to more). Digital Libraries technology trajectory: intellectual access to globally distributed information.]
(Slide from S. Griffin, NSF)
PetaPlex Complex
[Diagram: a front-end machine (RS/6000, 1G RAM, 4 Proc.), Machines 1-4 (each labelled "Service"), and an array of nanoservers.]
PetaPlex
• Digital Library Machine (“super” object store): parallel computer / storage utility
• Research: inverted files, video server, … (supported by IBM, AOL, NSF, …)
• Knowledge Systems Incorporated is supplying VT-PetaPlex-1 with 2.5 terabytes through 100 nodes, each with: net connection + 25GB disk + 233 MHz Pentium + Linux
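
As a quick check of the figures on this slide, the stated 2.5 terabytes follows from 100 nodes with a 25GB disk each. The sketch below only does that arithmetic; the function name is illustrative, and it assumes decimal units (1 TB = 1000 GB) and that every node contributes its full disk.

```python
# Figures from the slide; the names here are illustrative, not part of PetaPlex.
NODES = 100
DISK_GB_PER_NODE = 25


def aggregate_capacity_tb(nodes: int, disk_gb_per_node: int) -> float:
    """Total raw disk capacity in terabytes, assuming 1 TB = 1000 GB."""
    return nodes * disk_gb_per_node / 1000


print(aggregate_capacity_tb(NODES, DISK_GB_PER_NODE))  # 2.5
```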
MARIAN
Multiple Access Retrieval of Information with Annotations
• (Marian the Librarian …)
• Evolved from CODER system to a distributed Online Public Access Catalog (OPAC), then DL backend, now becoming a full DL system
• From C/C++ to Java
• Future: NDLTD, NUDL, PetaPlex
• Use for campus collection management
• Use for www.theses.org as centralized system with gateway services: OAI, Harvest, Z39.50, …
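
To make the idea of an OAI gateway concrete, here is a minimal sketch of how a harvester might form a request for metadata records. It is an assumption-laden illustration: the verb and parameter names (`ListRecords`, `metadataPrefix`, `set`) follow the OAI-PMH specification as later standardized, which may differ from the protocol draft in use at the time of this talk, and the base URL is a hypothetical placeholder, not a real MARIAN or NDLTD endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build a ListRecords request URL (parameter names per the later OAI-PMH spec)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical archive address, used only for illustration.
print(list_records_url("http://archive.example.org/oai"))
# http://archive.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc
```

A central service such as the one described for www.theses.org would issue requests like this against each member archive and index the returned metadata locally.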

MARIAN Layers
[Diagram: multiple users connect to a stack of four layers, listed top to bottom]
User Interface Layer
User Information Layer
Search Engine Layer
Database Layer
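
The four-layer stack can be illustrated with a small sketch in which each layer talks only to the layer directly beneath it. This is not MARIAN's actual code or API: all class and method names are hypothetical, the database is an in-memory dictionary, and the search is a plain substring match standing in for a real retrieval engine.

```python
from typing import Dict, List, Tuple


class DatabaseLayer:
    """Bottom layer: stores records (here, an in-memory dict)."""

    def __init__(self, records: Dict[int, str]) -> None:
        self.records = records


class SearchEngineLayer:
    """Finds records containing every query term (case-insensitive)."""

    def __init__(self, db: DatabaseLayer) -> None:
        self.db = db

    def search(self, query: str) -> List[Tuple[int, str]]:
        terms = query.lower().split()
        return [
            (rid, text)
            for rid, text in sorted(self.db.records.items())
            if all(term in text.lower() for term in terms)
        ]


class UserInformationLayer:
    """Tracks per-user state (prior searches) and forwards queries downward."""

    def __init__(self, engine: SearchEngineLayer) -> None:
        self.engine = engine
        self.history: Dict[str, List[str]] = {}

    def search_for(self, user: str, query: str) -> List[Tuple[int, str]]:
        self.history.setdefault(user, []).append(query)
        return self.engine.search(query)


class UserInterfaceLayer:
    """Top layer: turns results into display strings."""

    def __init__(self, info: UserInformationLayer) -> None:
        self.info = info

    def handle(self, user: str, query: str) -> List[str]:
        return [f"{rid}: {text}" for rid, text in self.info.search_for(user, query)]


db = DatabaseLayer({1: "Digital library architectures", 2: "Electronic theses"})
ui = UserInterfaceLayer(UserInformationLayer(SearchEngineLayer(db)))
print(ui.handle("alice", "digital library"))  # ['1: Digital library architectures']
```

Because each layer depends only on its neighbor, layers can in principle be placed on separate machines, which is the kind of split the parallelism measurements on the next slide compare.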
MARIAN Parallelism
[Figure: Java part response time vs. query rate comparison (type 1 requests). X axis: query rate (#/min), 0 to 500. Y axis: response time (ms), 0 to 4000. Series: all modules in one machine; one "webgate"; two "webgate"s; four "webgate"s.]
[Diagram: MARIAN/DEByE mediation architecture]
User: NDLTD/NUDL/Digital Library User
Services: Search Services, Recommendation Services, etc.; Analysis, Indexing, Linking
MARIAN/DEByE Mediation Middleware: Fusion Layer, Belief Network Layer, Wrapper Generator, Local Data Store
Other labels: 5SL Source Description; Additional Evidential Information
Queries + Results pass through one wrapper per source
Metadata formats: Dublin Core, SOIF, MARC, RFC1807
Protocols: Harvest, Open Archives, Z39.50, Dienst, ...
Collections: German PhysDis, VT OAI, Greek Hellenic Dissertations, MIT ETD, ...
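
The mediation idea in this diagram (one wrapper per source format, with results merged centrally) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the records are simplified dictionaries rather than real MARC or RFC1807 data, and a simple de-duplicating merge stands in for the fusion layer. The belief network layer and wrapper generator are not modelled.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def wrap_marc(rec: Record) -> Record:
    """Map a simplified MARC-like record (tag -> value) to Dublin Core element names."""
    return {"title": rec.get("245", ""), "creator": rec.get("100", "")}


def wrap_rfc1807(rec: Record) -> Record:
    """Map a simplified RFC1807-like record (field -> value) to Dublin Core element names."""
    return {"title": rec.get("TITLE", ""), "creator": rec.get("AUTHOR", "")}


# One wrapper per source format; keys are illustrative labels.
WRAPPERS: Dict[str, Callable[[Record], Record]] = {
    "marc": wrap_marc,
    "rfc1807": wrap_rfc1807,
}


def fuse(results: List[Tuple[str, Record]]) -> List[Record]:
    """Normalize (format, record) pairs and drop duplicates by (title, creator)."""
    seen = set()
    merged: List[Record] = []
    for fmt, rec in results:
        dc = WRAPPERS[fmt](rec)
        key = (dc["title"].lower(), dc["creator"].lower())
        if key not in seen:
            seen.add(key)
            merged.append(dc)
    return merged


hits = [
    ("marc", {"245": "An ETD", "100": "Doe, J."}),
    ("rfc1807", {"TITLE": "an etd", "AUTHOR": "doe, j."}),  # same work, other source
    ("rfc1807", {"TITLE": "Other", "AUTHOR": "Roe"}),
]
print(fuse(hits))
```

Normalizing every source to a common element set before merging is what lets a single query interface sit in front of collections that speak different protocols.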
ENVISION
NSF “A User-Centered Database from the Computer Science Literature” (1991-93)
• Collected bib/typesetter data, converted to SGML
• Scanned thousands of page images
• MARIAN search engine (can be made available; also applied to the Virginia Tech library catalog), used as part of a prototype object-based DL with a tailored visualization interface (L. Nowell dissertation)

DL-Related Timeline
[Timeline figure with year marks 1985, 1990, 1995, 2000. Labels shown: WWW; Scholarly EPub in U’s; SGML; xxx; CSTR; PDF; NCSTRL; OAI; CoRR; XML; MPEG-7; JPEG, MPEG; PCs; Proposed DLI; DLI2; NSDL; Ugrad DL; TEI; (CSTC, iLumina, …); (Envision, EI); HyperCard; Java; DC; RDF; Hypertext Conf.; ETDs; NDLTD.]
Information Life Cycle
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/
Core of DL
• Collecting
– Authoring, Repositories, Archives, Museums, …
• Organizing
– Packaging of Data and Metadata, Storing
– Naming/Identifying and Cataloging
– Classification, Clustering, …
• Serving
– Indexing, Linking, Summarizing, Visualizing
– Browsing, Accessing, Searching, Filtering, Retrieving, Distributing, Using, …
Digital Libraries Shorten the Chain from
Editor, Reviewer, Publisher, A&I, Consolidator, Library
DL = Users Direct (Organized Artifact Mediated Communication)
[Diagram: a Digital Library at the center, used directly by Author, Teacher, Reader, Learner, Reviewer, Editor, Dr., Patient, Librarian.]
Author tools
www.physik.uni-oldenburg.de/EPS/mmm
A Digital Library Case Study
• Domain: graduate education, research
• Genre: ETDs = electronic theses & dissertations
• Submission: http://etd.vt.edu
• Collection: http://www.theses.org
• Project: Networked Digital Library of Theses & Dissertations, http://www.ndltd.org
(NDLTD – remember: ND LTD / NDL TD)
(also, newer NUDL: Networked University Digital Library, with e-courseware, etc.)
Status of the Local Project
• Approved by university governance Spring 1996; required starting 1/1/97
• Submission & access software in place
• Submission workshops for students (and faculty) occur often: beginner/adv.
• Faculty training as part of Faculty Development Initiative
• Over 3000 ETDs in collection
– Some have audio, video, large images, software, …
– Millions of accesses/yr – 100s to 1000s per work
What are the long term goals?
• Attract all TDs/yr: 50K D-US, 25K D-Germany, 10K TD-Canada, …
• >200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, …)
• Dramatic increase in knowledge sharing: literature reviews, bibliographies, …
• Services providing lifelong access for students: browse, search, prior searches, citation links
• Hundreds/thousands of downloads / year / work
[Workflow diagram: Student Gets Committee Signatures and Submits ETD → Signed → Grad School → Library Catalogs ETD, Access is Opened to the New Research → WWW]
NDLTD
US University Members (44)
Air University (Alabama)
Baylor University
Brigham Young University (part, whole)
Caltech
Clemson University
College of William & Mary
Concordia University (Illinois)
East Carolina University
East Tenn. State U. – require fall 2000
Florida Institute of Technology
Florida International University
George Washington University
Louisiana State University
Marshall University (W. Va.)
Miami University of Ohio
Michigan Tech
Mississippi State University
MIT
Naval Postgraduate School (CA)
New Mexico Tech
North Carolina State University
Penn. State University
Rochester Institute of Tech.
U. of Colorado Health Science Center
U. of Florida
U. of Georgia
University of Hawaii, Manoa
U. of Iowa
U. of Kentucky
U. of Maine
U. of North Texas – required since 8/99
U. of Oklahoma
U. of South Florida
U. of Tennessee, Knoxville
U. of Tennessee, Memphis
U. of Texas at Austin – required in 2001
U. of Virginia
U. Wisconsin - Madison
Vanderbilt U.
Virginia Commonwealth U.
Virginia Tech - required since 1/97
West Virginia U. - required fall 1998
Western Michigan U.
Worcester Polytechnic Inst.
National / Regional Projects
• Australia
– U. New South Wales (lead)
– U. of Melbourne
– U. of Queensland
– U. of Sydney
– Australian National U.
– Curtin U. of Technology
– Griffith U.
• Germany
– Humboldt University (lead)
– 3 other universities
– 5 learned societies: Math, Physics, Chemistry, Sociology, Education
– 1 computing center
– 2 major libraries
• Consorci de Biblioteques Universitàries de Catalunya, as group, www.cbuc.es:
– Universitat de Barcelona
– Universitat Autònoma de Barcelona
– Universitat Politècnica de Catalunya
– Universitat Pompeu Fabra
– Universitat de Girona
– Universitat de Lleida
– Universitat Rovira i Virgili
– Universitat Oberta de Catalunya
– Biblioteca de Catalunya
• OhioLink
• South Africa: ECHEA/SEALS
• India, Portugal, …
Other Countries with Members
• Belgium
• Brazil
• Canada
• Germany
• Hong Kong
• India
• Italy
• Korea
• Mexico
• Netherlands
• Norway
• Russia
• Singapore
• S. Africa
• S. Korea
• Spain
• Taiwan
• UK
Build Local ETD Site
[Diagram components: ETD Workshop/Training, Digital Library, Policies, Inspection/Approval]
CS Teaching Center (CSTC)
• Collection of reviewed online resources used to aid in teaching of Computer Science
• Supports author submission and peer-review process for new ACM Journal of Educational Resources In Computing (JERIC)
• Connected with NSDL (NSF 00-44)
http://www.cstc.org
CS Teaching Center (CSTC)

Instead of building large, expensive multimedia packages,
that become obsolete and are difficult to re-use, concentrate
on small knowledge units.

Learners benefit from having well-crafted modules that
have been reviewed and tested.

Use digital libraries as a powerful base of support for
learners, upon which a variety of courses, self-study
tutorials & reference resources can be built. [See NSF
NSDL - National Science (math, engineering, technology
education) Digital Library (formerly SMETE-lib) at
www.dlib.org/smete/public/smete-public.html; www.smete.org]

iLumina: NSF NSDL grant with COLLEGIS Research
Institute/Eduprise, UNCW, TCNJ, …
Browsing (1)
Browsing (2)
(From Lee Zia, NSF)
Programmatic History
• NSDL Program, NSF: FY 00-02
• DL Operational: Fall, 2002
• DLs & UG Earth Systems Education, initiated FY 99, continuing
• DLI 2 Special Emphasis in Undergrad Education, FY 98-99
• DLI 2 - NSF, et al., initiated in FY 98, continuing
• Digital Libraries Initiative (DLI 1) - NSF/NASA/ARPA, FY 94-97
Expectations of Tracks

Core Integration: to coordinate a distributed alliance of
resource collection and service providers, and to ensure
reliable and extensible access to and usability of the
resulting network of learning environments and resources

Collections: to aggregate and actively manage a subset of
the digital library’s content within a coherent theme or
specialty

Services: to increase the impact, reach, efficiency, and
value of the digital library in its fully operational form

Targeted Research: to have immediate impact on one or
more of the other three tracks
Tracks & 29 Projects
• 6 Core Integration: Columbia, Cornell, E. Michigan/MERIT, UCAR, UCB, UMissouri/NCSA (Biology, Eng., Teacher Ed.)
• 13 Collections: Atmosphere, Biology, Biosciences, Earth Systems, Engineering, Health Sciences, Math
• 9 Services: Competitive Intelligence, Component Environment, Earth Systems J., Metadata NLP, Managing LOs, Peer Review, Video
• 1 Targeted Research: Paths
NSDL Spine
[Diagram. Component groups: Portals & Clients; NSDL Services; Other NSDL Services; NSDL Collections (full-service collections); Referenced Items & Collections; Core Collection-Building Services; Core Usage Services; CI Services. Service labels shown: harvesting, persistence, protocol mediation, annotation, query transform, topic-map, registry, personalization, discussion.]
(Slide from Dave Fulker, Bill Arms – 11/2/2000)
Our Collaboration for NSDL
PARTNERS
• Hofstra
• Villanova
• Penn State (with NEC)
• Virginia Tech
• ACM, IEEE-CS, Morgan Kaufmann, …
Our Collaboration for NSDL
FUNDING
• $1M for 2 years, starting 9/1/2001 - NSF
• $225K: Hofstra (1 GRA, 1 PI)
• $175K: Villanova (1 GRA, 1 PI)
• $175K: Penn State (1 GRA, 1 PI)
• $425K: VT (4 GRAs, 3 PIs: Fox, Lee, Perez)
• ACM, IEEE-CS, Morgan Kaufmann, …
Our Collaboration for NSDL
STRENGTHS
• PetaPlex, MARIAN, NDLTD, CSTC, JERIC
• SIGCSE: SIG, Conference, Bulletin
• History as integrating theme; adding demos
• Special support for Hispanic community
• Niche portals, search engines, links across collections, citation data, for levels: undergrad, high school, middle school, etc.
Our Collaboration for NSDL
PROPOSAL PLAN
• Student project completed Fall 2000
• Kate will continue in Spring 2001 through an independent study
• Meetings: John visited VT, Boots and I visit Hofstra today, I visit NEC on 12/27, John visits VT again, …
• Get support letters, refine proposal, …
Our Collaboration for NSDL
DOCUMENTS
• “Computing Digital Library (CoDL)”
• Packet prepared by student group
• See their slides next
• Contents:
– Project report
– CoDL proposal outline
– Proposals from some successful NSDL groups