Digital Library Service Integration Senior Projects
Professors Bieber, Im and Wu
Information Systems Department
College of Computing Sciences
New Jersey Institute of Technology
http://is.njit.edu/dlsi
For more senior project information: http://is.njit.edu/dlsi/dlsi-sr-projects-s2003.doc

DL = Distance Learning = Digital Library
DLSI: DL Service Integration

Why Participate in DLSI?
• Real-world project – very helpful for students and teachers worldwide
• High visibility for you and NJIT
• Gain research experience and work with research teams
• Learn XML, XLS, and other skills
• Support: DLSI project leader and bi-weekly DLSI project meetings

Outline
• Motivation – supporting learning communities
• DLSI Architecture
• Senior Projects

Motivation: Community Knowledge Resides in...
• documents (published papers, reports, photos, videos, lesson plans, syllabi, etc.)
• discussions
• decisions
• conceptual models
• formal educational modules
• workflows/processes
• people’s expertise
• links/relationships among all these

DLSI Architecture
• Digital Library: Multimedia Document Services
• Asynchronous Discussion Tools (Groupware)
• Hypermedia Services (tours, annotations, links)
• Processes/Workflows
• Decision Analysis Support
• Conceptual Knowledge Structures
• Others...
[Diagram: an Integration layer over Service Modules (Doc, Disc, H/M, Proc) for manipulating and maintaining data, and Repositories (Doc, Disc, H/M, Proc) for storing data]
Integration:
• Linking related documents
• Discussing a document
• Annotating a discussion
• Tours of documents and discussion comments
• Annotating and discussing a community process

DLSI: Integration through Linking
Sample page: “Deeply Understanding Complexity” {document}
Here are some examples. The agricultural system is very complex. It consists of farmers in interaction with the environment (weather, soil, pests), the economy and society. One problem currently receiving a lot of attention is Integrated Pest Management. The problem unfortunately is being addressed separately by experts in Plant Pathology, Entomology, Agronomy, Botany, and Soil Science, with no real common ground (and little common understanding) to base our decisions on. Meetings and workshops tasked with integration have taken place, but these various subdisciplines could not seem to form an integrated viewpoint. No real methodology exists to discuss and analyze the systems, which each subdiscipline has developed. In the end, all IPM decisions end up unintegrated and therefore only partially effective. A systematic approach that would enable the different participants to discuss, and determine all the interrelationships, which would help researchers link their practices and derive their effects on each other's area and the environment, would greatly help the IPM research and lead to better deci[sions]
Links on the {document}:
• View Peer Review Comments {JESSE Peer Review service}
• Enter your own Peer Review Comment {JESSE Peer Review service}
• Search for similar/related documents {Core Search service}
• Other collections with this document {DLSI Collection Registry}
• Create a new comment on document {Core Annotation service}
• Add document to current Guided Tour {DLSI Guided Tour service}
• Start your own link from this document {DLSI Link service}
Links on “Plant Pathology” {concept}:
• Ask an expert about this concept {in the Virtual Reference Desk}
• Relevant NASA Experiments in Space {National Space Science Data Center}
• Related journal articles {in JESSE}
• Search for this concept {Core Search service}
• View Comments on this concept

DLSI is Based on the Dynamic Hypermedia Engine
• Automatically adds link anchors, links and other “hypermedia” services to applications:
  – comments
  – guided tours
  – structural search (based on links and relationships instead of keywords)
  – others...
• See separate presentation

Example application screen: 251 Requisition Header - Shipping and Text
Screen:
Vend: V0000304390 STRATEGIC SUPPLIES INTERN'L 71 UNION AVE
PR: R010294 Inv: Line: FOB: Rte:
Deliver-to Address
Name: MICHAEL BIEBER
Org: NJIT, CIS DEPARTMENT
Addr: 323 ML KING BLVD
City: NEWARK St: NJ Zip: 07102 Country: USA
Phone: 973 596 2681 Ext:
Delivery Service: UPS
Delivery Date: 03-12-2000
Requisition Codes: 58 128 PUX ZY2
Requisition Text:
Document Notes: N

Dynamic Hypermedia Engine
• Links generated based on application structure, not search or lexical analysis
Example display: 1997 Sales $127,322.12, 1997 Expenses $85,101.99
  – You cannot do a search on the display text “$127,322.12” to find related information…
  – But you can find relationships for the element Sales[1997]

DHE generates anchors and links from the Relationship Manager Rule Base
Links offered on the requisition screen for vendor V0000304390:
• Vendor Details
• Vendor Reliability
• Vendor Agreements
• Who Else Uses Vendor
• Your Purchasing History
• All Screens with this Vendor

Link Mapping Rules: Relationship Manager Rule Base
...
Vendor - Vendor IS - Vendor Details - {commands}
Vendor - Vendor IS - Vendor Reliability - {commands}
Vendor - Vendor IS - Vendor Agreements - {commands}
Vendor - Purchasing Data Warehouse - Who Else Uses Vendor - {commands}
Vendor - Purchasing IS - Your Purchasing History - {commands}
Vendor - CASE Workbench - All Screens with this Vendor - {commands}
...
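
The rule base above can be read as a lookup from an element's object type to the links offered for it. The following is a minimal sketch of that idea only, not DHE's actual implementation: the class names, the `generate_links` function, and every command string are hypothetical, since the slides show the commands only as `{commands}`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRule:
    object_type: str    # kind of element the rule applies to, e.g. "Vendor"
    target_system: str  # system that serves the link
    label: str          # text shown in the link menu
    command: str        # hypothetical command template; "{id}" is the element's identifier


@dataclass(frozen=True)
class Link:
    anchor: str
    label: str
    target_system: str
    command: str


# Rows mirror the slide; the command strings are invented placeholders for "{commands}".
RULE_BASE = [
    LinkRule("Vendor", "Vendor IS", "Vendor Details", "show_details {id}"),
    LinkRule("Vendor", "Vendor IS", "Vendor Reliability", "show_reliability {id}"),
    LinkRule("Vendor", "Vendor IS", "Vendor Agreements", "show_agreements {id}"),
    LinkRule("Vendor", "Purchasing Data Warehouse", "Who Else Uses Vendor", "list_users {id}"),
    LinkRule("Vendor", "Purchasing IS", "Your Purchasing History", "show_history {id}"),
    LinkRule("Vendor", "CASE Workbench", "All Screens with this Vendor", "list_screens {id}"),
]


def generate_links(object_type, element_id, rules=RULE_BASE):
    """Return one link per rule whose object type matches the element."""
    return [
        Link(element_id, r.label, r.target_system, r.command.format(id=element_id))
        for r in rules
        if r.object_type == object_type
    ]


if __name__ == "__main__":
    for link in generate_links("Vendor", "V0000304390"):
        print(f"{link.label} -> {link.target_system}: {link.command}")
```

Because the lookup is keyed on the object type rather than on the displayed text, an element such as Sales[1997] would get links only if rules for its type were added to the rule base.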
Prototype [figure]

Benefits of Integration for a system (collection/service)
• Users: direct access to related systems
  – enlarges a system’s feature set
• DLSI leads users to a system
  – systems gain wider use
• Users become aware of other systems
  – systems gain wider awareness
• Direct access to a system’s features
  – streamlined access (bypassing menus)

Finding Links
• Structural links (as with DHE)
  – when we know the object type
• Lexical analysis (Professor Wu)
  – NJIT Noun Phrase Extractor
  – NJIT Ontology Developer

Filtering & Rank Ordering Links
• Collaborative Filtering (Professor Im)
  – customize the link set for each user
• based on:
  – user-direct evaluations (ratings)
  – indirect evaluation (clickstream data)
• using a Collaborative Filtering Engine

[Diagram: the User's Web Browser talks to the Digital Library Service Integration Manager, which reaches each system through its wrapper]
• AVC Collection Wrapper: AVC Collection
• Collection Wrapper (i): Collection (i)
• Core Search Wrapper: Core Search & Discovery Service
• WIKI Service Wrapper: WIKI Service
• Service Wrapper (j): Service (j)

How to Integrate
(1) Develop a Wrapper
  – Parse all display screens to identify the “elements of interest” that DLSI will make into link anchors.
    • Parse each kind of display screen
    • Parse based on the standard template/layout or metadata provided
  – Also call the lexical analysis routines to identify key phrases for you

How to Integrate, cont.
(2) Develop Linking Rules
  – specify the “structural relationships” for recognized object types within the system being integrated.
    • e.g., author, address, concept, spacecraft, measurement
  – one rule per object type (class) per link
  – all linking rules are merged by DLSI, so rules for other systems apply automatically to your system
    • (e.g., annotations, discussions, related documents)

How to Integrate, cont.
(3) Initiate Communications
  – Several possible ways, depending on the application.

Outline
• Motivation – supporting learning communities
• DLSI Architecture
• Senior Projects
  – Project 1: AVC and AskNSDL/VRD
  – Project 2: Metis Workflow and JESSE
  – Project 3: NASA’s NSSDC
  – Project 4: Lexical Analysis and User Preferences
  – Project 5: myKnowledge

NSDL (National Science Digital Library)
• Sponsored by the National Science Foundation (NSF)
• Purpose: to provide educational resources in an integrated environment to students and teachers (kindergarten through graduate school)
• URL: http://www.nsdl.org/
• Flash Presentation: http://about.nsdl.org/flash

DLSI & NSDL
• DLSI is providing the integration for all of the NSDL system!
• Senior Projects will
  – be the first integration prototypes
  – provide necessary internal features

Project 1a: Atmospheric Visualization Collection
• Provides visualization tools and images of weather data from the Atmospheric Radiation Measurement (ARM) program
• ARM: the largest global change research program supported by the U.S. Dept. of Energy
• based at the Argonne National Laboratory
• URL: http://www.nsdl.arm.gov/visualization.shtml

Automated Links (AVC)
• From concepts found in the glossary and from instruments:
  – link to the glossary definition
  – link to lesson plans containing it
  – link to an instrument's page
  – link to ARM publications containing the keyword/instrument

Automated Links, cont. (AVC)
• From any relevant object to the appropriate data display page
• From any relevant object to the appropriate internal data file (for AVC internal developers)

Automated Links, cont. (AVC)
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.

Project 1b: Ask-NSDL & Virtual Reference Desk
• Based at Syracuse University
• See separate presentation

Automated Links (Ask-NSDL & VRD)
• Lexical analysis to find key phrases recognized in glossaries
• Links to other questions/answers for a key phrase
• Direct links to all relevant functions for experts and administrators (e.g., show all answers this expert made)

Automated Links, cont. (Ask-NSDL & VRD)
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.

Project 1b: Ask-NSDL & Virtual Reference Desk
• Same kind of links for the Virtual Reference Desk! (http://www.vrd.org/)

Project 2a: Metis Workflow Engine
• based at the University of Colorado at Boulder
• Workflow: the process to get something done
  – involves triggering events from/in different systems
  – some steps are automated (e.g., send email), others require people to do something
• See separate presentation

Automated Linking (Metis)
• Linking workflow definition tools to internal Metis documentation
• Linking workflow specifications and Metis displays with the systems involved in the workflow

Automated Links, cont. (Metis)
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.
Project 2b: JESSE/Picture of the Day
• Journal of Earth System Science Education
• Based at the Universities Space Research Association
• See external presentation
• URL: http://jesse.usra.edu/testing/
• Also, Earth Science Picture of the Day
• URL: http://epod.usra.edu

Automated Links (JESSE/POTD)
• Links among related pictures and articles
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.

Project 3: NASA’s National Space Science Data Center
• based at the Goddard Space Flight Center
• URL: http://nssdc.gsfc.nasa.gov/
• preliminary starting demo connecting NSSDC with the University of Arizona Document Summarizer (see next page)

Prototype [figure]

Automated Linking (NSSDC)
• Links among related space missions, experiments, astronauts & scientists, and definitions/explanations of key phrases/concepts
• Links utilizing other NASA systems

Automated Links, cont. (NSSDC)
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.

Project 4a (internal): Linking through Lexical Analysis
Finding Links
• Structural links (as with DHE)
  – when we know the object type
• Lexical analysis (Professor Wu)
  – NJIT Noun Phrase Extractor
  – NJIT Ontology Developer
• See external presentation

Project 4b (internal): User Preference Module
• UPM maintains a database of user preferences for any module integrating with DLSI.
  – communicates with other modules through DLSI’s existing message passing protocol
• UPM will communicate with users
  – to gather preferences
  – to get information from the user about his or her current task, so we can customize the generated links to that task and those preferences
• We have a fairly complete set of requirements already prepared.
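
As a rough illustration of how stored preferences and the user's current task might be used to customize generated links, here is a minimal sketch. It is not the UPM design: the class and function names, the 0-to-1 weight scale, the task boost, and the in-memory dictionary standing in for the preference database are all assumptions made for the example.

```python
class UserPreferenceStore:
    """In-memory stand-in for the UPM database (hypothetical layout)."""

    def __init__(self):
        # Maps (user, service) to a weight assumed to lie between 0.0 and 1.0.
        self._weights = {}

    def set_preference(self, user, service, weight):
        self._weights[(user, service)] = weight

    def weight(self, user, service):
        # Services the user has never rated count as 0.0.
        return self._weights.get((user, service), 0.0)


def customize_links(store, user, links, task_services=()):
    """Order (label, service) pairs: current-task services first, then by stored weight."""

    def score(link):
        _, service = link
        boost = 1.0 if service in task_services else 0.0  # assumed task boost
        return boost + store.weight(user, service)

    # sorted() is stable, so links with equal scores keep their original order.
    return sorted(links, key=score, reverse=True)


if __name__ == "__main__":
    store = UserPreferenceStore()
    store.set_preference("ann", "Core Search service", 0.5)
    links = [
        ("Create a new comment on document", "Core Annotation service"),
        ("Search for this concept", "Core Search service"),
        ("Add document to current Guided Tour", "DLSI Guided Tour service"),
    ]
    for label, service in customize_links(
        store, "ann", links, task_services={"DLSI Guided Tour service"}
    ):
        print(label, "{" + service + "}")
```

In a real module the weights would come from the preference database and the exchange with other modules would go through DLSI's message passing protocol, neither of which is modeled here.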
Project 5: myKnowledge
• Independent application; integrated with DLSI for all NSDL users
• Users maintain a knowledge base of ideas and references
• It has several fields where users can make notes, record references and mark characteristics (metadata) about a concept or DL “resource” (document, article or Web page)

myKnowledge
• myKnowledge information will be stored within a MySQL database.
• We have a fairly complete set of requirements already prepared.
• We need help determining the best way of presenting the information.

Automated Links (myKnowledge)
• Links to the actual resource on the Web
and, of course:
• Links to related objects, teaching notes and documents in other systems
• Links for additional services such as discussion, comments, guided tours, etc.