From semantic networks, to ontologies, and concept maps: knowledge tools in digital libraries
Marcos André Gonçalves
Digital Library Research Laboratory
Virginia Tech
Outline
• Introduction
• Semantic Networks in Information Retrieval
  • The MARIAN system
• Digital Library Ontologies
• Concept maps: knowledge representation and visualization in DLs
Introduction
• Explore how new knowledge representation tools can be used in Digital Libraries
• Semantic networks
  • Representation, retrieval and inference of DL constructs and relationships
• Ontologies
  • Formalize, model and generate DLs
• Concept Maps
  • Visualization tool
  • Supporting collaborative work
  • Transforming information to knowledge creation
Semantic Networks in DLs: MARIAN
Motivation
• Support rich DL information services which are:
  • Extensible
  • Tailorable
• Support large, diverse collections of digital objects which:
  • have complex internal structures
  • are in complex relationships with each other and with other non-library objects such as persons, institutions, and events

Design choices

Design choice: Semantic networks
  Objective: Basic, unified representation of digital library structures
  Examples of use: Document and metadata structure; hierarchical relationships of classification systems; concept maps

Design choice: Weighting schemes
  Objective: Support IR operations and services; quantitative representation of qualitative properties (similarity, uncertainty, quality)
  Examples of use: Weighted links representing indexes; multi-field, multi-word, fusion of weighted IR sets; degree of similarity among concepts in different ontologies

Design choice: Object-oriented class system
  Objective: Provide common behavior, extensibility, and opportunity for improved performance
  Examples of use: Shared methods for matching different types of nodes (terms, controlled, free texts) and link topologies; multilingual support and common presentation methods

Design choice: Lazy evaluation
  Objective: Performance; management of large collections
  Examples of use: Reduced number of search results; enhanced merging algorithms for weighted sets of search results
Design choices: semantic networks
• Represent knowledge in patterns of interconnected nodes
• Graph representation to express knowledge or to support automated systems for reasoning
• Sowa’s classification:
  • Definitional networks
    • Inheritance hierarchies
  • Assertional networks
    • Assert propositions
  • Implicational networks
    • Implication as the primary relation
  • Executable networks
    • Mechanism to pass messages (tokens, weights)
  • Learning networks
    • Modify internal representations (weights, structure)
    • Ability to measure similarity
  • Hybrid networks
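As a rough illustration of a definitional network, the sketch below stores typed links in a labeled directed graph and lets a node inherit links from its ancestors. The class name, the `isA` and `hasPart` labels, and the node names are hypothetical choices for this example; they are not taken from MARIAN.

```python
from collections import defaultdict


class SemanticNetwork:
    """Labeled directed graph: nodes joined by typed links (illustrative only)."""

    def __init__(self):
        # links[source][label] -> list of target nodes
        self.links = defaultdict(lambda: defaultdict(list))

    def add_link(self, source, label, target):
        self.links[source][label].append(target)

    def ancestors(self, node):
        """Follow 'isA' links transitively (the inheritance hierarchy)."""
        seen, stack = [], [node]
        while stack:
            current = stack.pop()
            for parent in self.links[current]["isA"]:
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen

    def inherited(self, node, label):
        """Targets of `label` on the node itself, then on each ancestor."""
        found = []
        for n in [node] + self.ancestors(node):
            found.extend(self.links[n][label])
        return found


# Hypothetical nodes: an ETD is a kind of Document and inherits its parts.
net = SemanticNetwork()
net.add_link("ETD", "isA", "Document")
net.add_link("Document", "hasPart", "Abstract")
net.add_link("ETD", "hasPart", "Chapter")
parts = net.inherited("ETD", "hasPart")  # own parts first, then inherited ones
```

Here `parts` contains both the ETD's own `Chapter` link and the `Abstract` link inherited from `Document`.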
Design choices: MARIAN semantic network
[Figure: example MARIAN semantic network. Node classes shown: ETD Metadata, ETD Doc, Person, Abstract, Subject, Chapter, Section, Paragraph, Paper, and term nodes, with id attributes. Link classes shown: hasAuthor, hasAbstract, hasSubject, hasChapter, hasSection, hasParagraph, describes, cites, occursInAuthor, occursInAbstract, occursInSubject, occursInParagraph.]
MARIAN API (Main)
[Figure: class manager hierarchy under ClassMgr. Class managers shown: nodeClassMgr, termClassMgr, linkClassMgr, TextClassMgr, nGram ClassMgr, EnglishRoot ClassMgr, SpanishRoot ClassMgr, controlledText ClassMgr, EnglishText ClassMgr, SpanishText ClassMgr, ChineseText ClassMgr, unwtdLink ClassMgr, wtdLink ClassMgr, has* ClassMgr, occursIn* ClassMgr.]
Architecture and Implementation (cont.)
The Search layer
• Mapping from abstract object description to weighted set of objects
• Types of search
  • Link activation
  • Search in context
• Searchers
  • OO search engines
  • Based on fusion
    • Examples: maximizing union searcher, summative union searcher
  • Supported by
    • Tables: short-term memory of elements seen to date, checking each new element to keep or discard
    • Sequencers: take a set of incoming streams of weighted sets and produce a single output. Examples: PriQueueSequencer, MergeSequencer.
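The roles of sequencers and tables can be sketched with Python generators. This is a minimal sketch, not MARIAN's implementation: `merge_sequencer` and `with_table` are hypothetical names, the input streams are assumed to be sorted by descending weight, and the document ids are made up. Because everything is lazy, only as many elements are consumed as the caller asks for.

```python
import heapq
from itertools import islice


def merge_sequencer(*streams):
    """Lazily merge streams of (object_id, weight) pairs into one stream.

    Each input stream is assumed to be sorted by descending weight,
    so the merged output is also in descending weight order.
    """
    return heapq.merge(*streams, key=lambda pair: -pair[1])


def with_table(stream):
    """Table: remember ids seen so far and discard repeats.

    On a descending stream the first occurrence of an id carries its
    highest weight, so this behaves like a maximizing union.
    """
    seen = set()
    for object_id, weight in stream:
        if object_id not in seen:
            seen.add(object_id)
            yield object_id, weight


# Hypothetical weighted result streams from two searchers.
title_hits = [("d1", 0.9), ("d2", 0.4)]
abstract_hits = [("d2", 0.8), ("d3", 0.5), ("d1", 0.3)]

merged = with_table(merge_sequencer(iter(title_hits), iter(abstract_hits)))
top_two = list(islice(merged, 2))  # only the first two results are computed
```

Taking just the top two results leaves the rest of both streams unread, which is the performance argument behind lazy evaluation on large collections.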
Architecture and Implementation (cont.)
The Search layer
[Figure: worked search example in numbered steps 1 to 6. Components shown: query, Parser (morphological matcher), OccursInAbstract Searcher, OccursInAdvisor Searcher, Summative Union Searcher (twice), hasAbstract Searcher, hasAdvisor Searcher, final result set. Query elements: the terms "Digital" and "Library" and the advisor "E. A. Fox". Link labels: occursInAbstract, hasTitle, occursInAdvisor, hasAdvisor. Node ids shown: #2006:42369, #2006:60812, #2007:74667. Weighted sets shown (object id followed by weight):
{#6029:65655:1.00, #6029:989:0.74, …}
{#6029:3000:0.85, #6029:65655:0.8, …}
{#6015:65655:0.90, #6015:3000:0.425, #6015:989:0.37, …}
{#6031:45634:1.0, #6031:5678:0.9, …}
{#6000:54544:1.0, #6000:2987:0.9, #6000:003:0.74, …}
{#6000:856:0.90, #6000:7890:0.425, …}]
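The weights in the figure suggest how fusion might work. The sketch below is an assumption, not a documented MARIAN algorithm: it sums each object's weights across the input sets and divides by the number of sets, which reproduces the weights of the #6015 set from the two #6029 sets (only the elements visible in the figure are used; the elided ones are left out). A maximizing union is included for contrast. Function names are hypothetical.

```python
def summative_union(*weighted_sets):
    """Sum weights per object across sets, then divide by the number of sets.

    The normalization step is inferred from the figure's numbers; the
    slides do not state it.
    """
    totals = {}
    for weighted_set in weighted_sets:
        for object_id, weight in weighted_set.items():
            totals[object_id] = totals.get(object_id, 0.0) + weight
    count = len(weighted_sets)
    return {object_id: total / count for object_id, total in totals.items()}


def maximizing_union(*weighted_sets):
    """Keep the highest weight seen for each object."""
    best = {}
    for weighted_set in weighted_sets:
        for object_id, weight in weighted_set.items():
            best[object_id] = max(weight, best.get(object_id, 0.0))
    return best


# Instance ids and weights copied from the two #6029 sets in the figure.
first = {65655: 1.00, 989: 0.74}
second = {3000: 0.85, 65655: 0.80}

summed = summative_union(first, second)
maxed = maximizing_union(first, second)
```

Under this reading, object 65655 gets (1.00 + 0.80) / 2 = 0.90, object 3000 gets 0.85 / 2 = 0.425, and object 989 gets 0.74 / 2 = 0.37, matching the figure. The maximizing union would instead give 65655 a weight of 1.00.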
Future Work
• Testing of:
  • Efficiency
    • OO class-model vs. instance level semantic network
    • Lazy evaluation
    • Tables and sequencers
  • Effectiveness with:
    • Structured documents and metadata
    • Fulltext
• Supporting richer networks of relationships
  • Citation linking
  • Multi-language term relationships
Future Work
• Support for other types of networks and graph-based digital objects and structures
  • Belief networks
  • Topic/Concept maps
  • Ontologies, classification schemes
• Supporting multimedia retrieval
• Supporting CLIR (cross-language information retrieval)
Ontologies for DLs
Motivation
• DLs are an ill-understood phenomenon
• Lack of formal models for DLs
  • Ad-hoc development, interoperability
• Formal Ontologies for DLs
  • specify relevant concepts (the types of things and their properties) and the semantic relationships that exist between those concepts in a particular domain
  • use a language with a mathematically well-defined syntax and semantics to describe such concepts, properties, and relationships precisely
5S Model (informally)
Digital libraries are complex information systems that:
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
5S Model

Model: Stream
  Examples: Text; video; audio; image
  Objectives: Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data

Model: Structures
  Examples: Collection; catalog; hypertext; document; metadata; organization tools
  Objectives: Specifies organizational aspects of the DL content

Model: Spatial
  Examples: Measure; measurable, topological, vector, probabilistic
  Objectives: Defines logical and presentational views of several DL components

Model: Scenarios
  Examples: Searching, browsing, recommending
  Objectives: Details the behavior of DL services

Model: Societies
  Examples: Service managers, learners, teachers, etc.
  Objectives: Defines managers, responsible for running DL services; actors, that use those services; and relationships among them
5S Model: Mathematical formal theory for DLs

5S: Streams
  Definition: Sequences of elements of an arbitrary type
5S: Structures
  Definition: Labeled directed graphs
5S: Spatial
  Definition: Sets and operations on those sets
5S: Scenarios
  Definition: Sequences of events that modify states of a computation in order to accomplish some functional requirement
5S: Societies
  Definition: Sets of communities and relationships among them
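One way to read these definitions is as plain data types. The sketch below is only an illustration under that reading; every class, field, and function name is a hypothetical choice made for the example, and the formal theory itself is stated in set-theoretic terms rather than code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Sequence, Set, Tuple

# Stream: a sequence of elements of an arbitrary type.
Stream = Sequence[Any]


@dataclass
class Structure:
    """Labeled directed graph: edges are (source, label, target) triples."""
    nodes: Set[str] = field(default_factory=set)
    edges: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_edge(self, source: str, label: str, target: str) -> None:
        self.nodes.update((source, target))
        self.edges.add((source, label, target))


@dataclass
class Space:
    """A set together with named operations on that set."""
    elements: Set[Any]
    operations: Dict[str, Callable[..., Any]]


# Scenario: a sequence of events, each mapping one state to the next.
State = Dict[str, Any]
Event = Callable[[State], State]


def run_scenario(events: Sequence[Event], state: State) -> State:
    for event in events:
        state = event(state)
    return state


@dataclass
class Society:
    """Communities and (community, relation, community) relationships."""
    communities: Set[str]
    relationships: Set[Tuple[str, str, str]]


# Hypothetical usage.
text_stream: Stream = "digital library"  # a string is a sequence of characters
doc = Structure()
doc.add_edge("document", "hasChapter", "chapter1")


def submit_query(state: State) -> State:
    return {**state, "query": "digital library"}


def return_results(state: State) -> State:
    return {**state, "results": ["doc1"]}


final_state = run_scenario([submit_query, return_results], {})
society = Society({"learners", "teachers"}, {("teachers", "teach", "learners")})
```

The searching scenario here is two events applied in order to an initially empty state.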
[Figure: 5S concept dependency map. Mathematical notions shown: measurable, measure, probability, vector, and topological spaces; relation; tuple; sequence; state; event; function; graph; grammar. 5S constructs shown: streams, structures, spaces, scenarios, societies. Derived DL concepts shown: services, structured stream, digital object, structural metadata specification, descriptive metadata specification, indexing service, browsing service, searching service, hypertext, metadata catalog, transmission, collection, repository, digital library (minimal).]
Ontologies for DLs

Realizations of the theory/ontology
• Meta-Model for a DL descriptive modeling language: 5SL (JCDL2002)
• Meta-Model for a DL visual modeling tool: 5SGraph (ECDL2003)
• Meta-Model for an XML Log Standard (ECDL2002, JCDL2003)
Realizations of the theory/ontology
5S Meta-Schema
[Figure: 5S meta-schema tree rooted at 5S, with branches Stream Model, Structural Model, Space Model, Scenario Model, and Society Model. Stream Model elements: Text, Video, Audio, Image, Application. Structural Model elements: Collection, Document, Catalog, Metadata, Organizational Tool, Authority File, Classification Schema, Thesaurus. Other elements shown: User Interface, Rendering, Retrieval Model, Index, Services, Scenarios, Actor, Manager, Ontology.]
Realizations of the theory/ontology
5SGraph Interface
[Figure: 5SGraph interface.]
Future Work
• Semantic relationships
  • Only “syntactic” ones were defined
  • Constraints and dependencies (in form of axioms)
• Taxonomy of services
  • Composability, Extensibility
• Formal definitions of properties of DL models/architectures and proofs
  • Completeness
  • Soundness
  • Equivalence
Concept maps: knowledge representation and visualization in DLs
Challenges in Visual Interfaces for DLs (Chen & Borner)
1. Supporting collaborative work
2. Transforming information to knowledge creation
Hypothesis: Concept maps can serve as a uniform visual abstraction to provide solutions for these problems.
What are concept maps
Applications:
1. Knowledge organization and creation
2. Collaborative learning
   • GetSmart Experience (JCDL2003)
3. Domain summarization
4. Browsing tool
[Figure: the DL as information provider and as knowledge repository; labels: data, information, knowledge.]
GetSmart Experience (Cont.)
• Collaborative learning: Group maps
GetSmart Experience (Cont.)
• Summarization tool
Summarization tool
• Supplement to document abstracts, both within one language and across languages
• Pilot experiment

English papers
  Group 1 (14): Original abstract
  Group 2 (14): Original abstract plus concept map
Spanish papers
  Group 1 (14): Original abstract plus translated version
  Group 2 (14): Original abstract plus machine translated version plus translated concept map
Summarization tool (Cont.)
• Pilot experiment results

                   Group 1 (14) average   Group 2 (14) average   P-value
Q1 (English)       1.6631                 1.3839                 0.527
Q2 (English)       1.6599                 1.1310                 0.185
Q3 (Spanish)       1.7085                 1.1039                 0.209
Q4 (Spanish)       1.6815                 0.9831                 0.030 *
Likert (English)   N/A                    3.6, 4.4               0.022 *
Likert (English)   N/A                    2.7, 4.3               0.001 *
Automatic generation
• Motivation:
  • Building concept maps by hand is tedious and time-consuming
  • Novices will draw flawed or overly simplistic maps
  • Maintain uniformity
• Technique
  • Term co-occurrence (Gaines & Shaw)
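A generic term co-occurrence step can be sketched as follows. This is not Gaines & Shaw's method as published; the sentence-level window, the `min_count` threshold, and the sample concepts are assumptions made for illustration. Two concepts are proposed as linked when they appear together in at least `min_count` sentences.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_links(sentences, min_count=2):
    """Propose concept-map links from sentence-level co-occurrence.

    `sentences` is a list of concept lists, one list per sentence.
    Returns {(concept_a, concept_b): count} for pairs seen together
    in at least `min_count` sentences.
    """
    pair_counts = Counter()
    for concepts in sentences:
        # Sort so each unordered pair has a single canonical key.
        for a, b in combinations(sorted(set(concepts)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical input: concepts already extracted from three sentences.
sentences = [
    ["digital library", "concept map", "user"],
    ["concept map", "digital library"],
    ["user", "query"],
]
links = cooccurrence_links(sentences)
```

With this input only the pair ("concept map", "digital library") co-occurs twice, so it is the only proposed link.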
Automatic generation (Cont.)
• Spanish documents
• Procedure:
  • Determine part-of-speech for each word
  • Collapse all inflected forms to root form
  • Concatenate noun phrases into one “concept”
  • Remove some stopwords, keep others for use in crosslinks
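The procedure can be sketched end to end on a single Spanish sentence. Everything specific here is an assumption: the tiny hand-made `LEXICON` stands in for a real part-of-speech tagger and stemmer (the slides do not name the tools used), determiners are the stopwords that get dropped, and any other word outside a noun phrase is kept as a candidate cross-link label.

```python
# Toy lexicon (hypothetical): word -> (part of speech, root form).
LEXICON = {
    "las": ("DET", "el"),
    "los": ("DET", "el"),
    "bibliotecas": ("NOUN", "biblioteca"),
    "digitales": ("ADJ", "digital"),
    "contienen": ("VERB", "contener"),
    "documentos": ("NOUN", "documento"),
    "para": ("PREP", "para"),
    "usuarios": ("NOUN", "usuario"),
}
DROPPED_TAGS = {"DET"}            # stopwords removed entirely
PHRASE_TAGS = {"NOUN", "ADJ"}     # tags that form noun phrases


def extract(sentence):
    """Return (concepts, crosslinks) for one sentence."""
    # Steps 1 and 2: tag each word and collapse it to its root form.
    tagged = [LEXICON[word] for word in sentence.lower().split()]

    # Step 3: concatenate runs of nouns/adjectives into one concept.
    # Step 4: drop some stopwords, keep other words as link labels.
    items, phrase = [], []
    for pos, root in tagged:
        if pos in PHRASE_TAGS:
            phrase.append(root)
            continue
        if phrase:
            items.append(("concept", " ".join(phrase)))
            phrase = []
        if pos not in DROPPED_TAGS:
            items.append(("link", root))
    if phrase:
        items.append(("concept", " ".join(phrase)))

    concepts = [text for kind, text in items if kind == "concept"]
    # A kept word sitting between two concepts labels a cross-link.
    crosslinks = [
        (items[i - 1][1], items[i][1], items[i + 1][1])
        for i in range(1, len(items) - 1)
        if items[i][0] == "link"
        and items[i - 1][0] == "concept"
        and items[i + 1][0] == "concept"
    ]
    return concepts, crosslinks


concepts, crosslinks = extract(
    "Las bibliotecas digitales contienen documentos para los usuarios"
)
```

For this sentence the concepts are "biblioteca digital", "documento", and "usuario", joined by the cross-links "contener" and "para". A word missing from the toy lexicon raises a `KeyError`, which a real tagger would not do.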
Browsing tools
• Visual aid to navigate through complex collections of inter-related digital objects
• Support multi-hierarchy browsing
Concept maps’ support for DLs (cont.)
• Browsing and searching assistant
Future Work
• Improve the quality of automatically created concept maps
• Create repository of maps
• Provide services over the repository