Digital Library in a Box

Ming Luo, Hussein Suleman, Edward Fox
Virginia Tech
Subcontract to Collaborative Project led by University of Florida (also with NCSA at UIUC)

List of DL in a Box components (1)

Name | Source | Status | Description
IRDB Search Engine | VT | Available | A search engine based on an OAI-accessible data archive, with a pseudo-OAI (ODLSearch) interface for submitting queries and retrieving results.
DBBrowse Browse Engine | VT | Available | An indexing system that partitions a data source by multiple categories (flat and hierarchical) based on the metadata, where the data source is an OAI or ODL archive and the interface to request subsets of the data is pseudo-OAI (ODL-Browse). In effect, this provides a mechanism to browse based on categories in the metadata.
OAI/ODL Harvester | VT | Available | Harvests data from one or more archives. This is a template that does nothing useful besides printing the records to stdout; the Harvester class is intended to be subclassed to perform more useful functions.
OAIB | NCSA | Available | OAIB (Open Archives "in a box") is a component for exporting metadata stored in a relational database management system (RDBMS) over the Open Archives Initiative Protocol for Metadata Harvesting.
DBUnion Archive Merger Component | VT | Available | Merges different OAI-accessible archives into a single archive for local storage and processing, with a pseudo-OAI (ODL-Union) interface for access.

List of DL in a Box components (2)

Name | Source | Status | Description
XML File-based OAI Data Provider | VT | Available | A data provider module that operates over a set of XML files which contain the metadata. The requirements are designed to demand minimal effort while retaining all the flexibility of the OAI protocol.
OAI-PMH2 Data Provider | VT | Available | This toolkit implements the skeleton of OAI-PMH v2.0 in an object-oriented fashion, thus hiding the details of the protocol from code that is derived from the predefined class.
Submit Archive Component | VT | Available | Archive with an almost standard OAI interface, supplemented with one additional "PutRecord" verb to allow addition, modification, and deletion of records. In effect, this component creates an abstract view of a database by "filling in the gaps" in the OAI protocol to make this possible.
WhatsNew Engine | VT | Available | Lists a random sample of the most recently harvested records from a specific OAI or ODL source.
Threaded Annotation Engine | VT | Available | Manages an archive of external annotations that may be threaded and attached to arbitrary resources in a collection. This may be used for feedback on items or for general-purpose discussions.
MDEdit XML Schema-based Metadata Editor | VT | Available | A data provider module that operates over a set of XML files which contain the metadata. The requirements are designed to demand minimal effort while retaining all the flexibility of the OAI protocol.

List of DL in a Box components (3)

Name | Source | Status | Description
Grunk | NCSA | Available | Grunk (for GRammar UNderstanding Kernel) is a library for parsing and extracting structured metadata from semistructured text formats. It is based on a very flexible parsing engine capable of detecting a wide variety of patterns in text formats and extracting information from them.
Recommend Component | VT | Under Development | Exploits similarities among people and resources to recommend resources to users.
Rate Component | VT | Under Development | Allows users to assign numerical ratings to an item, the average of which is subsequently displayed to other users as a trivial peer review mechanism.
Review Component | VT | Under Development | Uses an appropriate set construction to allow more efficient indexing of the review component’s data and to generate metadata specifically filtered for particular users or resources.
Autoclassification Component | VT | Under Development | Automatically classifies the input metadata into different categories so that the user can browse the metadata.
Filter Component | VT | Planned | Acts as a filter during harvesting, based on rules or on the results of classification.

DL-in-a-box -> OCKHAM

• Hussein Suleman’s dissertation on Open Digital Libraries (ODL)
• Lightweight protocols: OAI -> XOAI
• Components: digital library construction by connecting selected elements from a pool
• Add idea of lightweight reference models
• Add peer-to-peer communication

Needs addressed by the OCKHAM Project

• The NSDL has extraordinary resources and services for scientific education.
• However, there has been limited integration and deployment of NSDL into the traditional library community, a valuable dissemination channel.
• Learning communities would realize many benefits from a coordinated set of networked services for dissemination of NSDL resources through traditional library protocols.

OCKHAM Project Goals

1. Reference Model Development
2. Middleware and Testbed Services Development
3. Evaluation
4. Dissemination and Networking

OCKHAM Library Network

• P2P network of interoperable web services using:
  – SOAP,
  – WSDL,
  – UDDI, and other protocols
• Project collaborators:
  – Emory,
  – Virginia Tech,
  – Arizona, and
  – Notre Dame (soon to include Oregon State)

OCKHAM Library Network

[Figure: the OCKHAM Library Network, with labels NSDL, NSDL Services, OCKHAM Services, Library Services, Teachers, Learners, Librarians]

OCKHAM Testbed Services

1. Interoperation Service: OAI-PMH-to-Z39.50
2. Searching Service
3. Alerting Service
4. Browsing Service
5. Conversion Service
6. Cataloging Service
7. Pathfinding Service
…
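
Many of the components listed earlier, as well as the OAI-PMH-to-Z39.50 interoperation service, rest on OAI-PMH harvesting. The OAI/ODL Harvester is described as a template that prints records to stdout and is meant to be subclassed. A minimal Python sketch of that subclassing pattern might look as follows. The class names (Harvester, TitleHarvester), the handle_record method, the base URL, and the sample record are all assumptions made for illustration; they are not the actual DL-in-a-Box API. Only the ListRecords verb, the metadataPrefix and from arguments, and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (base_url is an assumption)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


class Harvester:
    """Hypothetical template class: visits each record and prints it."""

    def harvest(self, response_xml):
        # Parse one ListRecords response and hand each record to the hook.
        root = ET.fromstring(response_xml)
        for record in root.iter(OAI_NS + "record"):
            self.handle_record(record)

    def handle_record(self, record):
        # Default behaviour: just print the raw record to stdout.
        print(ET.tostring(record, encoding="unicode"))


class TitleHarvester(Harvester):
    """Hypothetical subclass: collects Dublin Core titles by identifier."""

    def __init__(self):
        self.titles = {}

    def handle_record(self, record):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = record.findtext(f".//{DC_NS}title")
        self.titles[identifier] = title


# Invented sample response, so the sketch runs without network access.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open Digital Libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

harvester = TitleHarvester()
harvester.harvest(SAMPLE)
print(list_records_url("http://example.org/oai", from_date="2003-01-01"))
print(harvester.titles)
```

The sketch handles a single response only; a real harvester would also need to fetch the URL and follow OAI-PMH resumption tokens, both of which are omitted here.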