Cultural Tour
Applied to the Cultural Heritage Sector
for
© Intelligent Software Components, S.A. 2003
The Potential of Semantic Web Technology
 Enable a paradigm switch in searching information
 From
• Information Retrieval
 To
• Question Answering
 This work illustrates one such application for a particular domain
Google: Federico García Lorca
Archivo Virtual: Federico García Lorca
Spain member Organizations
Federico García Lorca
A Semantic Portal for the Spanish Silver Age
How Does it Work?
Ontology of Cultural Content
Knowledge Acquisition
Exploitation
Conclusions
The Overall Process
How does it work?
Ingredients
How does it work?
 Multiple, heterogeneous sources
 Ontology
 Knowledge Acquisition “Engine”
• Knowledge Parser
• R2O and ODEMapster
 Exploitation of knowledge
• Publishing the results
- Duontology
• Semantic Navigation
- Hyperlink-based navigation
- 3D navigation
• Semantic Search Engine
- Keyword based
- NLP queries
- Enriched documents with ontological information
Ontology of Cultural Content
Construction
 Constructed in collaboration with experts from
Residencia de Estudiantes
 Inspired by IFLA and MARC, and also based on
general ontologies like SUO and Cyc
 Ontology metrics (after population)
• 64 concepts
• 91 properties
• 60,000 instances
• 60,000 facts
• 40 MB in RDF(S) files
Ontology of Cultural Content
Illustration
The Sources
Knowledge Acquisition
ULAN
Toponyms and Persons
Residencia de Estudiantes
Supervision
ULAN: Union List of Artist Names
Residencia de Estudiantes: Revistas (Journals)
Knowledge Parser® Architecture
Knowledge Acquisition
 Source pre-processing: pre-processing types
 Information identification: pluggable strategies
 Ontology population: intelligent population of ontologies
Different Types of Pre-processing
Knowledge Acquisition
 Plain Text Model
• Regular expression check and retrieval
• Offset references
 DOM/Hypertext
• HTML object identification
• HTTP control and navigation
 NLP
• Basic NLP: tokenizer, morphology and chunk parsers
• Retrieve phrases using a head-driven approach
• Basic semantic relations (synonyms, hyponyms, etc.)
 Layout
• Rendered result of an HTML source: (X,Y) coordinates
• Visual operators: SAME_ROW, NEAR, etc.
[Diagram: Sources → Source Preprocess → Information Identification → Ontology Population]
[Diagram: the source pre-process yields four interpretations of each source: Text, DOM, Layout (rendered) and Language (NLP)]
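A minimal sketch of the Plain Text Model, assuming regular expressions are used both to retrieve pieces and to check their data type, and that each piece keeps offset references into the source. The names Piece, retrieve and check are illustrative only and are not the Knowledge Parser API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    """A retrieved fragment plus its character offsets in the source text."""
    text: str
    start: int
    end: int


def retrieve(pattern: str, source: str) -> list[Piece]:
    """Retrieve every match of `pattern`, keeping offset references."""
    return [Piece(m.group(0), m.start(), m.end())
            for m in re.finditer(pattern, source)]


def check(pattern: str, piece: Piece) -> bool:
    """Check that a whole piece conforms to a data-type pattern."""
    return re.fullmatch(pattern, piece.text) is not None


# Hypothetical source text used only for illustration.
source = "Federico García Lorca (1898-1936), poet and playwright."
years = retrieve(r"\d{4}", source)
first = years[0]
```

The offsets let a later stage go back from an ontology value to the exact span of the source it came from, for example `source[first.start:first.end]`.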
Explicit Extraction Knowledge
Knowledge Acquisition
Wrapping ontology
 Documents
 Pieces
 Relations
• Semantic
• Layout
 Data Types
• Meaning
• Basic Types
• HTML
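One way to picture such a description is sketched below, assuming that pieces carry a data type and rendered (x, y) coordinates and that layout relations are evaluated with visual operators such as SAME_ROW and NEAR. Every class name, type name and threshold here is hypothetical; the slides do not show the actual wrapping ontology format.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Piece:
    """A piece of a document with a data type and a rendered position."""
    name: str
    data_type: str  # hypothetical type names, e.g. "Label", "Year"
    text: str
    x: float
    y: float


def same_row(a: Piece, b: Piece, tolerance: float = 2.0) -> bool:
    """Visual operator: both pieces sit on (almost) the same baseline."""
    return abs(a.y - b.y) <= tolerance


def near(a: Piece, b: Piece, max_distance: float = 80.0) -> bool:
    """Visual operator: Euclidean distance below an assumed threshold."""
    return math.hypot(a.x - b.x, a.y - b.y) <= max_distance


LAYOUT_OPERATORS = {"SAME_ROW": same_row, "NEAR": near}


@dataclass(frozen=True)
class Relation:
    """A layout relation between two named pieces."""
    operator: str
    source: str
    target: str


@dataclass
class DocumentDescription:
    """A document described as named pieces linked by relations."""
    name: str
    relations: list[Relation] = field(default_factory=list)

    def holds(self, pieces: dict[str, Piece]) -> bool:
        """True when every declared relation holds for the given pieces."""
        return all(
            LAYOUT_OPERATORS[r.operator](pieces[r.source], pieces[r.target])
            for r in self.relations
        )


description = DocumentDescription(
    "artist-page",
    [Relation("SAME_ROW", "label", "value"), Relation("NEAR", "label", "value")],
)
label = Piece("label", "Label", "Born:", 10, 100)
value = Piece("value", "Year", "1898", 60, 101)
footer = Piece("value", "Year", "2003", 10, 300)
matches = description.holds({"label": label, "value": value})
rejected = description.holds({"label": label, "value": footer})
```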
Operators and Strategies
Knowledge Acquisition
 Operators
• Check: data types, relations, constraints
• Retrieve: obtains piece or document (precondition)
• Execute: navigate, select, etc…
 Strategies (operators applied for hypothesis
construction)
• Greedy: quick but not optimal
• Heuristics: hypothesis construction and pruning
• Optimal Backtracking: covering all search space
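The trade-off between the greedy and the backtracking strategy can be sketched as follows. This is an illustration under stated assumptions, not the Knowledge Parser implementation: a hypothesis assigns one candidate value to each slot, each assignment has an invented cost, and a single check operator enforces that birth precedes death.

```python
from typing import Optional

Hypothesis = dict[str, int]
Costs = dict[str, dict[int, float]]


def check(h: Hypothesis) -> bool:
    """Check operator: birth must precede death once both are assigned."""
    return "birth" not in h or "death" not in h or h["birth"] < h["death"]


def total_cost(h: Hypothesis, costs: Costs) -> float:
    return sum(costs[slot][value] for slot, value in h.items())


def greedy(costs: Costs) -> Optional[Hypothesis]:
    """Take the cheapest valid value per slot and never reconsider it."""
    h: Hypothesis = {}
    for slot, options in costs.items():
        valid = [v for v in options if check({**h, slot: v})]
        if not valid:
            return None
        h[slot] = min(valid, key=lambda v: options[v])
    return h


def backtracking(costs: Costs) -> Optional[Hypothesis]:
    """Explore the whole search space and keep the cheapest valid hypothesis."""
    slots = list(costs)
    best: list = [None, float("inf")]

    def extend(i: int, h: Hypothesis, cost: float) -> None:
        if i == len(slots):
            if cost < best[1]:
                best[0], best[1] = dict(h), cost
            return
        slot = slots[i]
        for value, c in costs[slot].items():
            h[slot] = value
            if check(h):
                extend(i + 1, h, cost + c)
            del h[slot]  # undo the choice before trying the next value

    extend(0, {}, 0.0)
    return best[0]


# Invented costs: lower means the candidate fits the slot better.
costs = {
    "birth": {1898: 2.0, 1936: 1.0, 1940: 9.0},
    "death": {1898: 9.0, 1936: 1.0, 1940: 6.0},
}
quick = greedy(costs)
optimal = backtracking(costs)
```

With these costs the greedy strategy commits to 1936 as the birth year and ends with a valid but more expensive hypothesis, while backtracking finds the cheaper assignment. A heuristic strategy would sit in between, pruning branches in `extend` whose partial cost already exceeds the best found.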
Ontology Population
Knowledge Acquisition
 Actions:
• Create new instance
• Modify existing instance
• Remove existing instance
• Relate existing instances
 Process:
• Hypothesis evaluation
• Population simulation
• Lowest cost simulation algorithm
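The actions and the simulation step above can be sketched as follows, assuming each hypothesis is turned into a plan of actions, each plan is simulated on a copy of the knowledge base, and the cheapest plan is kept. The cost table, the KnowledgeBase class and the action encoding are assumptions made for illustration; the slides do not specify them.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    instances: dict[str, dict[str, str]] = field(default_factory=dict)
    relations: set[tuple[str, str, str]] = field(default_factory=set)


# Illustrative costs only: creating or removing is dearer than modifying.
ACTION_COST = {"create": 3.0, "modify": 1.0, "remove": 5.0, "relate": 1.0}


def apply_action(kb: KnowledgeBase, action: tuple) -> None:
    """Apply one action, encoded as (kind, *arguments), in place."""
    kind, *args = action
    if kind == "create":
        instance_id, properties = args
        kb.instances[instance_id] = dict(properties)
    elif kind == "modify":
        instance_id, properties = args
        kb.instances[instance_id].update(properties)
    elif kind == "remove":
        (instance_id,) = args
        del kb.instances[instance_id]
        kb.relations = {r for r in kb.relations
                        if instance_id not in (r[0], r[2])}
    elif kind == "relate":
        subject, predicate, obj = args
        kb.relations.add((subject, predicate, obj))
    else:
        raise ValueError(f"unknown action: {kind}")


def simulate(kb: KnowledgeBase, plan: list) -> tuple:
    """Run a plan on a copy of the knowledge base; return (result, cost)."""
    trial = copy.deepcopy(kb)
    cost = 0.0
    for action in plan:
        apply_action(trial, action)
        cost += ACTION_COST[action[0]]
    return trial, cost


def populate(kb: KnowledgeBase, plans: list) -> KnowledgeBase:
    """Simulate every candidate plan and keep the lowest-cost outcome."""
    return min((simulate(kb, plan) for plan in plans), key=lambda r: r[1])[0]


kb = KnowledgeBase(instances={"lorca": {"name": "Federico García Lorca"}})
plan_new = [("create", "lorca2", {"name": "F. García Lorca", "birth": "1898"})]
plan_merge = [("modify", "lorca", {"birth": "1898"})]
updated = populate(kb, [plan_new, plan_merge])
```

Because the simulation works on a copy, the original knowledge base is untouched until a plan is chosen; here the cheaper plan enriches the existing instance instead of creating a duplicate.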
Publishing in a Semantic Portal
Exploitation
“Traditional” Publishing
Exploitation
Semantic Web Publication: Semantic Portal
Need for SW information publication on WWW (for humans)
[Diagram: the Knowledge Base (Milestone, Document, Person, Publication, Tool, Partner) is translated directly into a browsable Web site]
Drawbacks of direct publication/translation
• The semantic model is not necessarily user-friendly (relations, control attributes)
• An interface change entails a model change
• Publishing the whole model is not always desired
Decoupled Publishing
Exploitation
[Diagram: Knowledge base (RDF/RDFS) → Publication model (RDF/RDFS) → WWW Page (HTML), producing a browsable Web site]
[Diagram labels: the Publication Ontology acts as an ontology view over the Knowledge Base; RDQL queries; Java business logic; internal format (XML); XSL Transformations; architecture: Command Pattern]
[Diagram concepts: Person, Employee, Plan, Milestone, Document, Publication, Deliverable, Tool, Partner]
[Example: Person "V. Richard Benjamins" Works at Partner "iSOCO", published as "Richard at iSOCO"]
Visualization is independent of the Semantic Model
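A rough sketch of the decoupling idea, under several assumptions: the knowledge base is reduced to a list of triples, a plain Python filter stands in for the RDQL query that builds the publication view, and a small rendering function stands in for the XSL transformation. The predicate names (name, worksAt, internalId) are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical knowledge base content, written as (subject, predicate, object).
TRIPLES = [
    ("person:benjamins", "name", "V. Richard Benjamins"),
    ("person:benjamins", "worksAt", "partner:isoco"),
    ("person:benjamins", "internalId", "42"),  # control attribute, never published
    ("partner:isoco", "name", "iSOCO"),
]


def value(triples, subject, predicate):
    """Return the first object for a subject/predicate pair."""
    return next(o for s, p, o in triples if s == subject and p == predicate)


def publication_view(triples):
    """Build the internal XML, exposing only what the publication model needs."""
    root = ET.Element("employees")
    for subject, predicate, obj in triples:
        if predicate == "worksAt":
            employee = ET.SubElement(root, "employee")
            ET.SubElement(employee, "name").text = value(triples, subject, "name")
            ET.SubElement(employee, "organization").text = value(triples, obj, "name")
    return root


def render_html(view):
    """Presentation step: turn the internal XML into an HTML fragment."""
    items = "".join(
        f"<li>{escape(e.findtext('name'))} at {escape(e.findtext('organization'))}</li>"
        for e in view.iter("employee")
    )
    return f"<ul>{items}</ul>"


page = render_html(publication_view(TRIPLES))
```

Changing the page layout only touches `render_html`, and changing what is exposed only touches `publication_view`; the triples themselves stay as they are, which is the point of keeping visualization independent of the semantic model.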
Semantic Navigation
Exploitation
 Example: “Federico García Lorca”
 Returns instances
 Allows reference consulting
Semantic Navigation and Annotation. Onto-H
Exploitation
 Browsing between ontology and sources
New Ways of Visualising Semantic Web Content
Exploitation
[Diagram labels: Visualization Ontology, Visualization, Domain Ontology, GRAPH, ARTGALLERY]
New Ways of Visualising Semantic Web Content
Exploitation
[Screenshot labels: PEOPLE, PERSON, PLACES, CREATION]
Conclusions
 Towards a paradigm switch in searching?
 Detailed failure analysis needed: why does the search engine fail?
• KA limitation
- Not in ontology
- Missing/wrong instances
• Query construction
- NLP result (ambiguity)
- SeRQL query construction