From semantic networks, to ontologies, and concept maps: knowledge tools in digital libraries
Marcos André Gonçalves
Digital Library Research Laboratory
Virginia Tech

Outline
• Introduction
• Semantic Networks in Information Retrieval
  • The MARIAN system
• Digital Library Ontologies
• Concept maps: knowledge representation and visualization in DLs

Introduction
• Experiment with how new knowledge representation tools can be used in digital libraries
  • Semantic networks: representation, retrieval, and inference of DL constructs and relationships
  • Ontologies: formalize, model, and generate DLs
  • Concept maps
    • Visualization tool
    • Supporting collaborative work
    • Transforming information to knowledge creation

Semantic Networks in DLs: MARIAN

Motivation
• Support rich DL information services which are:
  • Extensible
  • Tailorable
• Support large, diverse collections of digital objects which:
  • have complex internal structures
  • are in complex relationships with each other and with other non-library objects such as persons, institutions, and events

Design choices
Design choice | Objective | Examples of use
Semantic networks | Basic, unified representation of digital library structures | Document and metadata structure; hierarchical relationships of classification systems; concept maps
Weighting schemes | Support IR operations and services; quantitative representation of qualitative properties (similarity, uncertainty, quality) | Weighted links representing indexes; multi-field, multi-word, fusion of weighted IR sets; degree of similarity among concepts in different ontologies
Object oriented class system | Provide common behavior, extensibility, and opportunity for improved performance | Shared methods for matching different types of nodes (terms, controlled, free texts) and link topologies; multilingual support and common presentation methods
Lazy evaluation | Performance; management of large collections | Reduced number of search results; enhanced merging algorithms for weighted sets of search results

Design choices: semantic networks
• Represent knowledge in patterns of interconnected nodes
• Graph representation to express knowledge or to support automated systems for reasoning
• Sowa’s classification:
  • Definitional networks: inheritance hierarchies
  • Assertional networks: assert propositions
  • Implicational networks: implication as the primary relation
  • Executable networks: mechanism to pass messages (tokens, weights)
  • Learning networks: modify internal representations (weights, structure)
  • Hybrid networks
• Ability to measure similarity

Design choices: MARIAN semantic network
[Figure: MARIAN semantic network. Node classes: Person, ETD Metadata, Abstract, Subject, ETD Doc, Chapter, Section, Paragraph, Paper, term, id. Link classes: hasAuthor, occursInAuthor, hasAbstract, occursInAbstract, hasSubject, occursInSubject, describes, hasChapter, hasSection, hasParagraph, occursInParagraph, cites.]

MARIAN API (Main)
[Figure: class manager hierarchy. Classes shown: ClassMgr, nodeClassMgr, linkClassMgr, termClassMgr, TextClassMgr, controlledTextClassMgr, nGramClassMgr, EnglishRootClassMgr, SpanishRootClassMgr, EnglishTextClassMgr, SpanishTextClassMgr, ChineseTextClassMgr, unwtdLinkClassMgr, wtdLinkClassMgr, has*ClassMgr, occursIn*ClassMgr.]

Architecture and Implementation (cont.): The Search layer
• Mapping from abstract object description to weighted set of objects
• Types of search
  • Link activation
  • Search in context
• Searchers
  • OO search engines
  • Based on fusion
  • Examples: maximizing union searcher, summative union searcher
• Supported by
  • Tables: short-term memory of elements seen to date, checking each new element to keep or discard
  • Sequencers: take a set of incoming streams of weighted sets and produce a single output. Examples: PriQueueSequencer, MergeSequencer.

Architecture and Implementation (cont.): The Search layer
[Figure: an example query passing through the search layer (steps numbered 1 to 6). Query terms shown: "Digital", "Library", "E. A. Fox". Components shown: Parser (morphological matcher), OccursInAbstract Searcher, OccursInAdvisor Searcher, hasAbstract Searcher, hasAdvisor Searcher, and two Summative Union Searchers, ending in the final result set. Link labels: occursInAbstract, hasTitle, occursInAdvisor, hasAdvisor. Object ids shown: #2006:42369, #2007:74667, #2006:60812. Weighted sets shown (each element is an object id followed by its weight):
{#6029:65655:1.00, #6029:989:0.74, …}
{#6029:3000:0.85, #6029:65655:0.8, …}
{#6015:65655:0.90, #6015:3000:0.425, #6015:989:0.37, …}
{#6031:45634:1.0, #6031:5678:0.9, …}
{#6000:54544:1.0, #6000:2987:0.9, #6000:003:0.74, …}
{#6000:856:0.90, #6000:7890:0.425, …}]

Future Work
• Testing of:
  • Efficiency
    • OO class-model vs. instance level semantic network
    • Lazy evaluation
    • Tables and sequencers
  • Effectiveness with:
    • Structured documents and metadata
    • Fulltext
• Supporting richer networks of relationships
  • Citation linking
  • Multi-language term relationships

Future Work
• Support for other types of networks and graph-based digital objects and structures
  • Belief networks
  • Topic/Concept maps
  • Ontologies, classification schemes
• Supporting multimedia retrieval
• Support for CLIR

Ontologies for DLs

Motivation
• DLs are an ill-understood phenomenon
• Lack of formal models for DLs
• Ad-hoc development, interoperability

Formal Ontologies for DLs
• specify relevant concepts (the types of things and their properties) and the semantic relationships that exist between those concepts in a particular domain.
• use a language with a mathematically well-defined syntax and semantics to describe such concepts, properties, and relationships precisely

5S Model (informally)
• Digital libraries are complex information systems that:
  • help satisfy info needs of users (societies)
  • provide info services (scenarios)
  • organize info in usable ways (structures)
  • present info in usable ways (spaces)
  • communicate info with users (streams)

5S Model
Models | Examples | Objectives
Stream | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata; organization tools | Specifies organizational aspects of the DL content
Spatial | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

5S Model: Mathematical formal theory for DLs
5S | Definition
Streams | Sequences of elements of an arbitrary type
Structures | Labeled directed graphs
Spaces | Sets and operations on those sets
Scenarios | Sequences of events that modify states of a computation in order to accomplish some functional requirement
Societies | Sets of communities and relationships among them

[Figure: graph of the 5S concepts. Labels: measurable, measure, probability, vector, and topological spaces; relation; tuple; sequence; state; event; function; graph; grammar; 5S; streams; structures; spaces; scenarios; societies; services; structured stream; digital object; structural metadata specification; descriptive metadata specification; indexing service; browsing service; searching service; hypertext; metadata catalog; transmission; collection; repository; digital library (minimal).]

Ontologies for DLs

Realizations of the theory/ontology
• Meta-Model for a DL descriptive modeling language: 5SL (JCDL2002)
• Meta-Model for a DL visual modeling tool: 5SGraph (ECDL2003)
• Meta-Model for an XML Log Standard (ECDL2002, JCDL2003)

Realizations of the theory/ontology: 5S Meta-Schema
[Figure: 5S meta-schema. Models: Stream Model, Structural Model, Space Model, Scenario Model, Society Model. Elements shown: Text, Video, Audio, Image, Application; Collection, Document, Catalog, Metadata, Organizational Tool, Authority File, Classification Schema, Thesaurus, Ontology; User Interface, Rendering, Retrieval Model, Index; Services, Scenarios; Actor, Manager.]

Realizations of the theory/ontology: 5SGraph Interface

Future Work
• Semantic relationships
  • Only “syntactic” ones were defined
  • Constraints and dependencies (in form of axioms)
• Taxonomy of services
  • Composability, extensibility
• Formal definitions of properties of DL models/architectures and proofs
  • Completeness
  • Soundness
  • Equivalence

Concept maps: knowledge representation and visualization in DLs
• Challenges in Visual Interfaces for DLs (Chen & Borner)
  1. Supporting collaborative work
  2. Transforming information to knowledge creation
• Hypothesis: Concept maps can serve as a uniform visual abstraction to provide solutions for these problems.

What are concept maps
• Applications:
  1. Knowledge organization and creation
  2. Collaborative learning
  3. Domain summarization
  4. Browsing tool

GetSmart Experience (JCDL2003)
• Knowledge repository
[Figure: the DL as a knowledge repository. Labels: DL, Data, Information provider, information, knowledge, Knowledge repository.]

GetSmart Experience (Cont.)
• Collaborative learning: Group maps

GetSmart Experience (Cont.)
• Summarization tool

Summarization tool
• Supplement to document abstracts, both for one language and across languages (pilot experiment)

Papers | Group 1 (14) | Group 2 (14)
English papers | Original abstract | Original abstract plus concept map
Spanish papers | Original abstract plus translated version | Original abstract plus machine translated version plus translated concept map

Summarization tool (Cont.)
Pilot experiment results
Measure | Group 1 (14) average | Group 2 (14) average | P-value
Q1 (English) | 1.6631 | 1.3839 | 0.527
Q2 (English) | 1.6599 | 1.1310 | 0.185
Q3 (Spanish) | 1.7085 | 1.1039 | 0.209
Q4 (Spanish) | 1.6815 | 0.9831 | 0.030 *
Likert (English) | N/A | 3.6, 4.4 | 0.022 *
Likert (English) | N/A | 2.7, 4.3 | 0.001 *

Automatic generation
• Motivation:
  • Manual concept map construction is tedious and time-consuming
  • Novices will draw flawed or overly simplistic maps
  • Maintain uniformity
• Technique
  • Term co-occurrence (Gaines & Shaw)

Automatic generation (Cont.)
• Spanish documents
• Procedure:
  • Determine part-of-speech for each word
  • Collapse all inflected forms to root form
  • Concatenate noun phrases into one “concept”
  • Remove some stopwords, keep others for use in crosslinks

Browsing tools
• Visual aid to navigate through complex collections of inter-related digital objects
• Support multi-hierarchy browsing

Concept Maps’ support for DLs (cont.)
• Browsing and searching assistant

Future Work
• Improve the quality of automatically created concept maps
• Create repository of maps
• Provide services over the repository
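
The term co-occurrence technique named on the automatic generation slides can be illustrated with a small sketch. This is not the authors' implementation: it skips part-of-speech tagging, root-form collapsing, and noun-phrase concatenation, and simply counts how often two non-stopword terms share a sentence. The stopword list, the function names `concepts` and `cooccurrence_edges`, and the `min_count` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption; the slides do not give one).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is", "of", "on", "the", "to", "with"}


def concepts(sentence):
    """Lowercase, tokenize, and drop stopwords.

    A crude stand-in for the POS-tagging, root-form, and noun-phrase steps.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sorted(set(t for t in tokens if t not in STOPWORDS))


def cooccurrence_edges(text, min_count=2):
    """Count term pairs that share a sentence; keep pairs seen at least min_count times."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        for pair in combinations(concepts(sentence), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


text = (
    "Digital libraries organize documents. "
    "Concept maps summarize documents. "
    "Digital libraries use concept maps."
)
edges = cooccurrence_edges(text)
for (a, b), n in sorted(edges.items()):
    print(f"{a} -- {b} (co-occurs in {n} sentences)")
```

On this toy text the only pairs that reach the threshold are `concept -- maps` and `digital -- libraries`, each seen in two sentences. Each surviving pair would become a candidate link between two nodes of a concept map; a fuller pipeline would first merge such pairs into single noun-phrase concepts, as the procedure slide describes.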
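
Returning to the MARIAN search layer: the figure there shows two weighted sets, with instance weights 65655: 1.00, 989: 0.74 and 3000: 0.85, 65655: 0.8, next to a Summative Union Searcher and a set with weights 65655: 0.90, 3000: 0.425, 989: 0.37. Those numbers are consistent with summing each object's weights and dividing by the number of input sets. The sketch below reproduces them under that inferred rule, which the slides do not state explicitly. The function names are hypothetical, and the `maximizing_union` behavior is only assumed from the searcher's name.

```python
from collections import defaultdict


def summative_union(*weighted_sets):
    """Sum each object's weights across all input sets, then divide by the number of sets.

    The averaging rule is inferred from the figure's numbers, not taken from MARIAN's code.
    """
    totals = defaultdict(float)
    for weighted_set in weighted_sets:
        for obj_id, weight in weighted_set.items():
            totals[obj_id] += weight
    n = len(weighted_sets)
    return {obj_id: total / n for obj_id, total in totals.items()}


def maximizing_union(*weighted_sets):
    """Keep, for each object, its largest weight in any input set (assumed from the name)."""
    best = {}
    for weighted_set in weighted_sets:
        for obj_id, weight in weighted_set.items():
            best[obj_id] = max(weight, best.get(obj_id, 0.0))
    return best


# Instance ids and weights copied from two of the weighted sets in the figure.
first = {65655: 1.00, 989: 0.74}
second = {3000: 0.85, 65655: 0.80}

fused = summative_union(first, second)
ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
for obj_id, weight in ranked:
    print(obj_id, round(weight, 3))
```

An object found by both searchers (65655) ends up ranked above objects found by only one, which is the practical effect of fusing weighted sets this way.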