RDFBrowser: A tool to analyse metadata. Bernhard Schueler, CSCI 8350, Spring 2002, UGA



Overview

• Intention

• Implementation

• Demo

• Where are the semantics?

• Unfinished feature: Synonyms, Homonyms, Similarity between RDF-Graphs

• What is unique about RDFBrowser

Bernhard Schueler RDFBrowser 2

Intention

Provide a tool to analyze RDF-based metadata.

This includes everything that is or will be built on top of RDF.

This tool should allow for:

• Convenient browsing,

• Comparisons between files, focusing on helping the user find semantic similarities/differences,

• Synonyms, homonyms, graph similarity.


Implementation – Parsing

Parsing of RDF files:

ARP, Another RDF Parser, by Jeremy Carroll, HP.

It uses the Xerces XML parser.

I use this parser to extract the triples of the RDF data model.
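As a rough illustration of the data model the parser delivers, each statement can be held as a (subject, predicate, object) record. The sketch below is written in Python rather than Java and does not use ARP's API; the `Triple` type, the `objects_of` helper, and the `example.org` URI are invented for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement of the RDF data model (hypothetical helper, not ARP's type)."""
    subject: str
    predicate: str
    obj: str


# Example statements; the subject URI is made up for illustration.
triples = [
    Triple("http://example.org/doc1",
           "http://purl.org/dc/elements/1.1/creator", "Bernhard Schueler"),
    Triple("http://example.org/doc1",
           "http://purl.org/dc/elements/1.1/title", "RDFBrowser"),
]


def objects_of(subject, predicate, statements):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]
```

Everything the browser shows can then be phrased as a query over such records, which is what the knowledge base described next is used for.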


Implementation – Knowledge base

RDFBrowser uses AMZI! Prolog to store and query the RDF triples.

AMZI! Prolog provides interfaces to Java, C, C++, Delphi and more. It runs under Windows and UNIX.

Prolog can easily be (ab)used as a database. First-argument indexing keeps lookups reasonably efficient when the first argument of a query is bound.

I considered it convenient, especially for advanced inferences such as finding synonyms (a feature which unfortunately is unfinished).
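The effect of first-argument indexing can be imitated outside Prolog. The following is a minimal Python sketch, not the AMZI! mechanism itself: the `TripleStore` class and its methods are hypothetical, and it only assumes that facts are bucketed by their first argument (here the subject).

```python
from collections import defaultdict


class TripleStore:
    """Toy fact store: triples are bucketed by their first argument (the subject)."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].append((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        """Return matching triples; None acts like an unbound Prolog variable."""
        if subject is not None:
            # First argument bound: one hash lookup selects the candidate bucket.
            candidates = self._by_subject.get(subject, [])
        else:
            # First argument unbound: every stored triple has to be scanned.
            candidates = [t for bucket in self._by_subject.values() for t in bucket]
        return [
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]
```

Queries that bind the subject touch only one bucket, while queries that bind only the predicate or the object fall back to a full scan. This is the "certain efficiency" and also its limit.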


Implementation – GUI

The graphical user interface is implemented in Java (JDK 1.3.1), mainly with the “Swing” library.

All parts of the system run at least under Windows and UNIX.


Demo

Not in here … but on the screen (hopefully).


Where are the semantics?

RDFBrowser tries to overcome syntactic barriers to help the user retrieve the semantics.

The ability to simultaneously browse files should highlight semantic relationships.

A feature to find synonyms, homonyms, and similar structures in the underlying RDF graph would provide semantic analysis of:

• Different descriptions of the same domain,

• Descriptions of different domains.


RDF graph

RDF graph – a matching


2 Graphs


Search space – Example


Search space - Exponential

The graph isomorphism problem is in NP, and the subgraph isomorphism problem is NP-complete.

The size of the search space is larger than n!.

Precisely: n(1 + (n-1)(1 + (n-2)(1 + (n-3)( … 2(1 + 1) … )))).

And that’s only exact matches!

9 nodes: more than 362,880 (= 9!) possible matchings.
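The nested expression above can be evaluated directly. The sketch below reads it as the recurrence T(n) = n · (1 + T(n-1)) with T(0) = 0, which is an interpretation of the slide's formula; the function name `search_space_size` is made up for the example.

```python
from math import factorial


def search_space_size(n):
    """Evaluate n(1 + (n-1)(1 + ... 2(1 + 1))) via T(k) = k * (1 + T(k-1)), T(0) = 0."""
    size = 0
    for k in range(1, n + 1):
        size = k * (1 + size)
    return size


def exceeds_factorial(n):
    """Check the slide's claim that the search space is larger than n!."""
    return search_space_size(n) > factorial(n)
```

For n = 9 this gives 986,409, compared with 9! = 362,880.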


Pruning: Ullmann vs. A*


My Approach

In a browser, it only makes sense to query subgraphs of a small, user-defined length.

Inexact matches are more likely than an exact match of 42 nodes.


My Approach – Pruning

Heuristics for pruning the search tree:

• Same labels of nodes and edges (URIs),

• Accuracy (percentage of unmatched nodes),

• Consider only reachable nodes.
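The three heuristics above can be combined in a small backtracking search. The following Python sketch is only one possible reading, not RDFBrowser's Prolog implementation: the names `find_matchings` and `is_blank` are hypothetical, blank nodes (written `_:x`) are assumed to match any node while URI-labelled nodes must match identically, and the query triples are assumed to be listed so that each triple touches a node seen earlier.

```python
def is_blank(node):
    """Assumption: blank nodes ("_:x") carry no URI and may match any target node."""
    return node.startswith("_:")


def find_matchings(query, target, max_unmatched=0.0):
    """Return all mappings {query node: target node} found by pruned backtracking.

    query, target: lists of (subject, predicate, object) triples.
    max_unmatched: tolerated fraction of query nodes left without a partner.
    """
    target_edges = set(target)
    target_nodes = {n for s, _, o in target for n in (s, o)}
    # Query nodes in order of first appearance in the triple list.
    order = list(dict.fromkeys(n for s, _, o in query for n in (s, o)))
    budget = int(max_unmatched * len(order))
    results = []

    def candidates(q, mapping):
        if not mapping:
            found = target_nodes  # nothing matched yet: any node may start
        else:
            # Reachability: only nodes one equally labelled edge away
            # from the image of an already matched neighbour.
            found = set()
            for s, p, o in query:
                if s in mapping and o == q:
                    found |= {to for ts, tp, to in target
                              if ts == mapping[s] and tp == p}
                if o in mapping and s == q:
                    found |= {ts for ts, tp, to in target
                              if to == mapping[o] and tp == p}
        # Same node labels: a URI-labelled query node needs the identical label.
        return sorted(t for t in found
                      if (is_blank(q) or q == t) and t not in mapping.values())

    def consistent(q, t, mapping):
        # Same edge labels: every query edge between q and a matched node
        # must exist in the target with the same predicate.
        for s, p, o in query:
            if s == q and o in mapping and (t, p, mapping[o]) not in target_edges:
                return False
            if o == q and s in mapping and (mapping[s], p, t) not in target_edges:
                return False
        return True

    def extend(i, mapping, misses):
        if i == len(order):
            results.append(dict(mapping))
            return
        q = order[i]
        for t in candidates(q, mapping):
            if consistent(q, t, mapping):
                mapping[q] = t
                extend(i + 1, mapping, misses)
                del mapping[q]
        # Accuracy: leave q unmatched only while the budget allows it.
        if misses < budget:
            extend(i + 1, mapping, misses + 1)

    extend(0, {}, 0)
    return results
```

With `max_unmatched=0.0` only exact matches of the query are returned. A larger value also admits partial mappings, which corresponds to the inexact matching discussed earlier.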


*Indexing techniques

There are advanced techniques for indexing graphs to speed up the average case, e.g. based on the number of adjacent nodes.

But imagine the tree of your direct ancestors (excluding uncles and aunts). Everyone has 2 parents…
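One way to read the ancestor remark: in such a regular graph almost every node has the same number of adjacent nodes, so an index keyed on that number separates very little. The Python sketch below illustrates this under that reading; `ancestor_tree` and `degree_histogram` are names invented for the example.

```python
from collections import Counter


def ancestor_tree(generations):
    """Edges (child, parent) of a pedigree in which every person has two parents."""
    edges = []
    people = ["p"]  # the person whose ancestors are listed
    for _ in range(generations):
        parents = []
        for person in people:
            for side in ("m", "f"):
                parent = person + side  # e.g. "pm" is p's mother
                edges.append((person, parent))
                parents.append(parent)
        people = parents
    return edges


def degree_histogram(edges):
    """Count how many nodes have each number of adjacent nodes."""
    degree = Counter()
    for child, parent in edges:
        degree[child] += 1
        degree[parent] += 1
    return Counter(degree.values())
```

For three generations the 15 nodes fall into just three degree classes (one root with 2 neighbours, six inner nodes with 3, eight leaves with 1), and adding generations only makes the two larger classes grow.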


What is unique about RDFBrowser

Regarding the semantic information contained in RDF files, this browser's weakness is its strength:

It does not consider anything on top of RDF, e.g. RDFS, DAML, OIL.

Thus, it can work with any of them.


Literature

Dennis Shasha, Jason T. L. Wang, Rosalba Giugno. Algorithmics and Applications of Tree and Graph Searching.

Jeremy Carroll. Matching RDF Graphs. Draft, July 2001.

Koebler, Schoening, Toran. The Graph Isomorphism Problem: Its Structural Complexity. Birkhaeuser, Boston, 1993.

