Cultural Tour Applied to the Cultural Heritage Sector for
© Intelligent Software Components, S.A. 2003

The Potential of Semantic Web Technology
Enable a paradigm switch in searching information
From
• Information Retrieval
To
• Question Answering
This work illustrates an application in this line for one particular domain

Google: Federico García Lorca

Archivo Virtual: Federico García Lorca
Spain member Organizations

Federico García Lorca

A Semantic Portal for the Spanish Silver Age
• How Does it Work?
• Ontology of Cultural Content
• Knowledge Acquisition
• Exploitation
• Conclusions

How does it work?

The Overall Process

Ingredients
Multiple, heterogeneous sources
Ontology
Knowledge Acquisition “Engine”
• Knowledge Parser
• R2O and ODEMapster
Exploitation of knowledge
• Publishing the results
  - Duontology
• Semantic Navigation
  - Hyperlink-based navigation
  - 3D navigation
• Semantic Search Engine
  - Keyword based
  - NLP queries
  - Enriched documents with ontological information

Ontology of Cultural Content

Construction
Constructed in collaboration with experts from Residencia de Estudiantes
Inspired by IFLA and MARC, and also based on general ontologies like SUO and Cyc
Ontology metrics (after population)
• 64 concepts
• 91 properties
• 60,000 instances
• 60,000 facts
• 40 MB in RDF(S) files

Illustration

Knowledge Acquisition

The Sources
ULAN
Toponyms and Persons
Residencia de Estudiantes
Supervision

ULAN: Union List of Artist Names

Residencia de Estudiantes. Revistas (Journals)

Knowledge Parser® Architecture
Source Pre-processing, Information identification, Ontology population
Pre-Processing Types, Pluggable Strategies, Intelligent Population of Ontologies

Different Types of Pre-processing
[Diagram: Sources, Source Preprocess, Information Identification, Ontology Population; labels: Data, Presentation, Identification, Population, Structure, Hypothesis, Operators, Evaluation, Text, Description, Domain, Language]
Plain Text Model
• Regular Expression Check and Retrieval
• Offset References
DOM/Hypertext
• HTML object identification
• HTTP control and navigation
NLP
• Basic NLP: Tokenizer, Morphology and Chunk Parsers
• Retrieve phrases using a head-driven approach
• Basic semantic relations (synonyms, hyponyms, etc.)
Layout
• Rendered result of an HTML source: (X,Y) coordinates
• Visual Operators: SAME_ROW, NEAR, etc…
[Diagram: Source Pre-process interpretations: Text, DOM, Layout (Render), Language (NLP)]

Explicit Extraction Knowledge
Wrapping ontology
Documents
Pieces
Relations
• Semantic
• Layout
Data Types
• Meaning
• Basic Types
• HTML

Operators and Strategies
Operators
• Check: data types, relations, constraints
• Retrieve: obtains piece or document (precondition)
• Execute: navigate, select, etc…
Strategies (operators applied for hypothesis construction)
• Greedy: quick but not optimal
• Heuristics: hypothesis construction and pruning
• Optimal Backtracking: covering all search space
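
The contrast between the Greedy and Optimal Backtracking strategies on the Operators and Strategies slide can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the Knowledge Parser implementation: the names `is_year`, `is_name`, `cost`, `greedy`, `backtrack`, `PIECES` and `SLOTS` are hypothetical, a hypothesis is taken to be an assignment of text pieces to ontology slots, the Check operator is reduced to a data-type test, and the cost is simply the character spread of the chosen pieces (a stand-in for a NEAR-style layout relation).

```python
import re
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Piece = Tuple[str, int]          # (text, character offset in the source)
Check = Callable[[str], bool]    # hypothetical "Check" operator on a data type
Hypothesis = Dict[str, Piece]    # slot name -> chosen piece


def is_year(text: str) -> bool:
    """Check operator: the piece looks like a four-digit year."""
    return re.fullmatch(r"\d{4}", text) is not None


def is_name(text: str) -> bool:
    """Check operator: two or more space-separated alphabetic words."""
    return re.fullmatch(r"(?:[^\W\d_]+ )+[^\W\d_]+", text) is not None


def cost(hyp: Hypothesis) -> int:
    """Assumed cost: how far apart the chosen pieces are in the source."""
    offsets = [offset for _, offset in hyp.values()]
    return max(offsets) - min(offsets)


def greedy(pieces: List[Piece], slots: List[Tuple[str, Check]]) -> Optional[Hypothesis]:
    """Take the first unused piece that passes each slot's check."""
    hyp: Hypothesis = {}
    used = set()
    for slot, check in slots:
        for i, piece in enumerate(pieces):
            if i not in used and check(piece[0]):
                hyp[slot] = piece
                used.add(i)
                break
        else:
            return None  # no piece fits this slot
    return hyp


def backtrack(pieces: List[Piece], slots: List[Tuple[str, Check]]) -> Optional[Hypothesis]:
    """Explore every valid assignment and keep the lowest-cost hypothesis."""
    best: Optional[Hypothesis] = None

    def extend(k: int, hyp: Hypothesis, used: FrozenSet[int]) -> None:
        nonlocal best
        if k == len(slots):
            if best is None or cost(hyp) < cost(best):
                best = dict(hyp)
            return
        slot, check = slots[k]
        for i, piece in enumerate(pieces):
            if i not in used and check(piece[0]):
                hyp[slot] = piece
                extend(k + 1, hyp, used | {i})
                del hyp[slot]

    extend(0, {}, frozenset())
    return best


# Invented example pieces: an unrelated year far from the name, the right one near it.
PIECES: List[Piece] = [("1936", 0), ("Federico García Lorca", 50), ("1898", 72)]
SLOTS: List[Tuple[str, Check]] = [("name", is_name), ("birth_year", is_year)]

print(greedy(PIECES, SLOTS)["birth_year"][0])     # 1936 (first match, cost 50)
print(backtrack(PIECES, SLOTS)["birth_year"][0])  # 1898 (lowest cost, 22)
```

In this toy setting the greedy pass stops at the first year it finds, while the exhaustive pass visits the whole search space and keeps the cheaper hypothesis; a heuristic strategy would sit in between by pruning branches whose partial cost already exceeds the best found so far.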
Ontology Population
Actions:
• Create new instance
• Modify existing instance
• Remove existing instance
• Relate existing instances
Process:
• Hypothesis evaluation
• Population simulation
• Lowest cost simulation algorithm

Exploitation

Publishing in a Semantic Portal

“Traditional” Publishing
Semantic Web Publication: Semantic Portal
Need for SW information publication on WWW (for humans)
[Diagram: Knowledge Base and Browsable Web Site; labels: Milestone, Document, Person, publication, Tool, Partner]
Inconveniences of direct publication/translation
• Semantic model is not necessarily user-friendly (relations, control attributes)
• Interface change entails model change
• Model publication is not always desired

Decoupled Publishing
[Diagram: Knowledge base (RDF/RDFS), Publication model (RDF/RDFS), WWW page (HTML); labels: Knowledge Base, Publication Ontology, Ontology view, RDQL, Internal format (XML), Java Business Logic, XSL Transformations, Browsable Web Site; entities: Person, Employee, Partner, Plan, Milestone, Document, Deliverable, Tool, Architecture, Command Pattern, English; example: “V. Richard Benjamins”, “Works at”, “iSOCO”, “Richard at iSOCO”]
Visualization is independent of the Semantic Model

Semantic Navigation
Example: “Federico García Lorca”
Returns instances
Allows reference consulting
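
The idea behind the Decoupled Publishing slide (knowledge base, then a publication view, then a rendered page) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the portal's code: the slides name RDQL, Java business logic and XSL transformations, whereas this sketch stands in plain Python functions for all three, and the triples, identifiers (`person:benjamins`, `org:isoco`, `lastModifiedBy`) and function names are hypothetical.

```python
from html import escape
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical knowledge base; the last triple is a control attribute
# that should never reach the published page.
KB: List[Triple] = [
    ("person:benjamins", "type", "Employee"),
    ("person:benjamins", "name", "V. Richard Benjamins"),
    ("person:benjamins", "worksAt", "org:isoco"),
    ("org:isoco", "name", "iSOCO"),
    ("person:benjamins", "lastModifiedBy", "admin"),
]


def objects(kb: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stored for a given subject and predicate."""
    return [o for s, p, o in kb if s == subject and p == predicate]


def publication_view(kb: List[Triple]) -> List[Dict[str, str]]:
    """Stand-in for the query step: keep only what should be published."""
    rows = []
    for s, p, o in kb:
        if p == "worksAt":
            person = objects(kb, s, "name")
            organization = objects(kb, o, "name")
            if person and organization:
                rows.append({"person": person[0], "organization": organization[0]})
    return rows


def render_html(rows: List[Dict[str, str]]) -> str:
    """Stand-in for the transformation step: publication rows to HTML."""
    items = "".join(
        f"<li>{escape(r['person'])} works at {escape(r['organization'])}</li>"
        for r in rows
    )
    return f"<ul>{items}</ul>"


def render_text(rows: List[Dict[str, str]]) -> str:
    """A second interface over the same view; the knowledge base is untouched."""
    return "\n".join(f"{r['person']} at {r['organization']}" for r in rows)


view = publication_view(KB)
print(render_html(view))  # <ul><li>V. Richard Benjamins works at iSOCO</li></ul>
print(render_text(view))  # V. Richard Benjamins at iSOCO
```

Because both renderers consume the publication view rather than the triples, changing the interface does not force a change in the semantic model, and control attributes stay out of the published output.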
Semantic Navigation and Annotation. Onto-H
Browsing between ontology and sources

New Ways of Visualising Semantic Web Content
[Figure labels: Visualization Ontology Visualization Domain Ontology; GRAPH; ARTGALLERY]
[Figure labels: PEOPLE; PERSON; PLACES; CREATION]

Conclusions
Towards a paradigm switch in searching?
Detailed failure analysis needed.
Why does the search engine fail?
• KA limitation
  - Not in ontology
  - Missing/wrong instances
• Query construction
  - NLP result (ambiguity)
  - SeRQL query construction