The NSDL Program Stephen Griffin National Science Foundation 1 The NSDL Program 2 1996 Vision articulated by NSF's Division of Undergraduate Education 1997 National Research Council workshop 1998 Preliminary grants through Digital Libraries Initiative 2 1998 SMETE-Lib workshop 1999 NSDL Solicitation 2000 6 Core Integration demonstration projects + 23 others funded 2001 1 large Core Integration System project funded 2002 More than 60 independent projects funded NSF-funded Research Programs NSF Solicitation New ideas Proposals New ideas Research 3 The NSDL Program NSF's objective Build a comprehensive digital library for all aspects of science education NSF's approach Solicitation encouraged wide diversity of proposals divided into general categories Best 60+ proposals funded -- more to follow Grants allow projects flexibility Result A splendid set of projects A challenge in interoperability! 4 The NSDL: The Challenge of Scale William Y. Arms Cornell University 5 Core Integration Philosophy Scientific and technical information Materials used in education Materials tailored to education 6 Core Integration Philosophy It is possible to build a very large digital library with a small staff. But ... Every aspect of the library must be planned with scalability in mind. Some compromises will be made. 7 How Big might the NSDL be? All branches of science, all levels of education, very broadly defined: Five year targets 1,000,000 different users 10,000,000 digital objects 10,000 to 100,000 independent sites 8 Resources for Core Integration Core Integration Budget $4-6 million Staff 25 - 30 Management Diffuse How can a small team, without direct management control, create a very large-scale digital library? 9 Collections: the Basic Assumption The Core Integration team will not manage any collections 10 Collections 11 The NSDL program funds only a fraction of the relevant collections. Every Collection is Different 12 The Core Integration Task ... ... to provide a coherent set of collections and services across great diversity. 13 Interoperability The Problem Conventional approaches to interoperability require partners to support agreements (technical, content, and business But NSDL needs thousands of very different partners ... most of whom are not directly part of the NSDL program The Approach A spectrum of interoperability 14 Levels of interoperability 15 Level Agreements Example Federation Strict use of standards (syntax, semantic, and business) AACR, MARC Z 39.50 Harvesting Digital libraries expose metadata; simple protocol and registry Open Archives metadata harvesting Gathering Digital libraries do not cooperate; services must seek out information Web crawlers and search engines Searching What to Index? When possible, full text indexing is excellent, but full text indexing is not possible for all materials (non-textual, no access for indexing). Comprehensive metadata is an alternative, but available for very few of the materials. What Architecture to Use? Few collections support an established search protocol (e.g., Z39.50) 16 Broadcast Searching does not Scale Collections User 17 User interface server The Metadata Repository Services The metadata repository is a resource for service providers. Users Metadata repository Collections 18 It holds information about every collection and item known to the NSDL. Search Architecture Portal Portal Portal OAI SDLIP Search and Discovery Services Metadata repository http Collections James Allan, Bruce Croft (University of Massachusetts, Amherst) 19 Support for Service Providers The Metadata Repository as a Resource Records are exposed through Open Archives Initiative harvesting protocol. Core Integration team will provide some services based on the metadata repository. The architecture encourages others to build services. 20 Metadata Strategy Metadata is expensive The NSDL cannot afford to create it manually 21 Metadata Strategy • Support eight standard formats • Collect all existing metadata in these formats • Provide crosswalks to Dublin Core • Expose records in the metadata repository for others to harvest • Concentrate on collection-level metadata • Use automatic generation to augment item-level metadata 22 Collection-level Operations Material in the NSDL is selected and managed as collections: • • • • Alexandria Digital Library Cornell course web sites JISC Resource Discovery Network Joe's web page Human effort will be used to select and integrate major collections. Automated methods (e.g., web crawling) will be used to identify and integrate additional collections. 23 Quality Control The Problem Material in the NSDL should be relevant. But we cannot select each item individually. The Approach Most selection and quality control decisions are made at a collection level, not at an item level. Information about quality will be maintained in a collection-level metadata record, which is stored in a central metadata repository. This metadata is made available to NSDL service providers. User interfaces can display quality information. 24 User Interfaces The Problem Cannot handcraft every web page Must be usable on a very wide range of equipment and with a very diverse group of users The Solution Data driven portals using a channel architecture. Interfaces guide the user to understand the library. One library, many portals. 25 26 27 The Mortal behind the Portal [This space left intentionally blank.] 28 Where is the Center of the Universe? Alexandria Library of Congress Elsevier NSDL Informedia Math DL 29 Joe's Pictures Where is the Center of the Universe? British Library Elsevier Internet Archive Library of Congress OCLC Harvard 30 NSDL Where is the Center of the Universe? email Google Office Course web sites News and weather 31 Bill Arms Directories Technical documentation NSDL Acknowledgement and Disclaimer The NSDL is a program of the National Science Foundation's Directorate for Education and Human Resources, Division of Undergraduate Education. The NSDL Core Integration is a collaboration between the University Center for Atmospheric Research (Dave Fulker), Columbia University (Kate Wittenberg) and Cornell University (Bill Arms). The Technical Director is Carl Lagoze (Cornell University). 32 The NSDL: The Challenge of Scale William Y. Arms Cornell University 33