Linked Data

Scott E. Barasch
barasch1 [at] umbc [dot] edu
scottbarasch.com

Linked Data exemplifies the original vision of
the Semantic Web: a web of interconnected
information, such as the data stored in
FOAF, RDF, OWL or other files





All data must be named with a URI
This URI must be a valid URL
There must be a page at this URL which
contains the data that is represented by the
URI name
This URI / URL should NEVER change
Data should be interlinked between documents
/ files on the web
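
The first and last rules above can be checked mechanically. The following is a minimal sketch in Python, assuming the data is held as (subject, predicate, object) triples. The FOAF property URIs are real, but the example.org and example.net resources and all function names are hypothetical. The sketch does not fetch any page and cannot verify that a URL never changes.

```python
from urllib.parse import urlparse

# FOAF property URIs come from the real FOAF vocabulary; the example.org and
# example.net resources below are hypothetical placeholders.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def is_http_url(value: str) -> bool:
    """True if the value is a URI that is also an HTTP(S) URL with a host."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def subjects_are_named_with_urls(triples) -> bool:
    """Check that every subject is named with a URI that is a valid URL."""
    return all(is_http_url(s) for s, _, _ in triples)


def external_links(triples):
    """Return (subject, object) pairs that link to a resource on another host."""
    links = []
    for s, _, o in triples:
        if is_http_url(s) and is_http_url(o):
            if urlparse(s).netloc != urlparse(o).netloc:
                links.append((s, o))
    return links


triples = [
    ("http://example.org/people/alice", FOAF_NAME, "Alice"),
    ("http://example.org/people/alice", FOAF_KNOWS, "http://example.net/people/bob"),
]

print(subjects_are_named_with_urls(triples))
print(external_links(triples))
```

Only the second triple counts as an interlink here, because its object is a URL hosted somewhere other than its subject.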




In the past, Semantic Web data was not
published to the web
It was stored in a zip file, and often stored on
an external disk or tape media
An example of this is an ontology which
contains data about all of the Semantic Web
researchers
Recently this has changed, as the need for an
interwoven mesh of linked data has become
apparent

Many different ontologies contain similar
information for various data members




E.g. Name, SSN, Birthday, Zip Code, Telephone Number
These data members can be connected to join the
data from multiple ontologies into one large
collection of data that can be queried as a
whole.
The ultimate result would be to create an entire
mesh web of all the ontologies in the world, where
each ontology would be a node in a giant graph.
That graph would be the Semantic Web
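
The graph described above can be sketched in a few lines of Python. This is only an illustration: the three ontology names and their data members are hypothetical, and an edge is assumed to exist whenever two ontologies share at least one data member.

```python
from itertools import combinations

# Hypothetical ontologies, each reduced to the set of data members it holds.
ontologies = {
    "researchers": {"Name", "Birthday", "Telephone Number"},
    "addresses": {"Name", "Zip Code"},
    "census": {"Zip Code", "Birthday"},
}


def link_graph(ontologies):
    """Build edges between ontologies that share at least one data member.

    Each ontology is a node; the edge label is the set of shared members.
    """
    edges = {}
    for (a, members_a), (b, members_b) in combinations(sorted(ontologies.items()), 2):
        shared = members_a & members_b
        if shared:
            edges[(a, b)] = shared
    return edges


edges = link_graph(ontologies)
for (a, b), shared in edges.items():
    print(a, "<->", b, "via", sorted(shared))
```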







DBpedia - a dataset containing extracted data from Wikipedia; it contains
about 2.18 million concepts described by 218 million triples, including
abstracts in 11 different languages
DBLP Bibliography - provides bibliographic information about scientific
papers; it contains about 800,000 articles, 400,000 authors, and approx. 15
million triples
GeoNames - provides RDF descriptions of more than 6,500,000
geographical features worldwide.
Revyu - a review service that consumes and publishes Linked Data, primarily
from DBpedia.
riese - serving statistical data about 500 million Europeans (the first
linked dataset deployed with XHTML+RDFa)
UMBEL - a lightweight reference structure of 20,000 subject concept
classes and their relationships derived from OpenCyc, which can act as
binding classes to external data; also has links to 1.5 million named
entities from DBpedia and YAGO
Sensorpedia - A scientific initiative at Oak Ridge National Laboratory
using a RESTful web architecture to link to sensor data and related
sensing systems.

Creating a single ontology out of all of the
linked ontologies in the world would be a
nightmare



Data access and reasoning time would be
astronomical
The sheer load of a single user could possibly cripple
the network
No computer on earth could realistically process
such a large amount of data

Imagine a just-in-time access model for this
single ontology:





The multiple ontologies would be “linked” by
common data members (Name, Address, Zip Code,
etc.)
Users or agents know ahead of time which
ontologies they would wish to query
These queries go only to the individual ontologies
The data is returned to the user agent, which then
parses the data, and connects the similar data
members
These data members are “linked”, and a local subset
of the global single ontology is created for the
extracted data
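
The steps above can be sketched in Python as follows. This is a simplified illustration, not a real Semantic Web client: each ontology is assumed to be an in-memory list of records, the records are hypothetical, and Name is assumed to identify a person uniquely, which real data would not guarantee.

```python
# Hypothetical ontologies, each modelled as a list of records.
people = [
    {"Name": "Alice", "Zip Code": "21250"},
    {"Name": "Bob", "Zip Code": "10001"},
]
phones = [
    {"Name": "Alice", "Telephone Number": "555-0100"},
]


def query_source(source, wanted_name):
    """Query one individual ontology; only matching records are returned."""
    return [record for record in source if record.get("Name") == wanted_name]


def link_records(records, key_field="Name"):
    """Connect records that share the same value for a common data member."""
    linked = {}
    for record in records:
        linked.setdefault(record[key_field], {}).update(record)
    return linked


# The user agent knows ahead of time which ontologies it wants to query.
selected = [people, phones]
returned = [rec for source in selected for rec in query_source(source, "Alice")]

# The local subset of the global ontology, built only from the extracted data.
local_subset = link_records(returned)
print(local_subset)
```

Nothing about Bob is ever transferred or stored, which is the point of the model: the global ontology is never materialized, only the part a query touches.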



Linking Data by itself is not enough
We need to be able to follow those links, and
combine ontologies so that we can combine the
information stored in one ontology with the
data stored in many other ontologies
This merging of data gives us richer
information, and can sometimes yield new
information that is greater than the sum of
the information in the individual ontologies
we are querying.
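
A small Python sketch of how following a link can yield a fact that no single ontology contains. The two mappings and their values are illustrative assumptions: one ontology relates people to zip codes, the other relates zip codes to cities.

```python
# Ontology A (hypothetical): person -> zip code.
person_zip = {"Alice": "21250", "Bob": "99999"}

# Ontology B (hypothetical): zip code -> city.
zip_city = {"21250": "Baltimore"}


def derive_person_city(person_zip, zip_city):
    """Follow the shared Zip Code link to derive person -> city.

    Neither input states where a person lives by city; the fact only
    appears once the two ontologies are combined.
    """
    return {
        person: zip_city[zip_code]
        for person, zip_code in person_zip.items()
        if zip_code in zip_city
    }


person_city = derive_person_city(person_zip, zip_city)
print(person_city)
```

Bob is left out because the second ontology has no entry for his zip code; a link can only be followed when both sides hold the shared value.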




The concept of a data Mashup is how this is
accomplished today
A Mashup engine is the client side user agent.
Web Services query Semantic Web data
repositories and retrieve the requested data.
The data is connected, and a greater meaning is
discovered from small sets of disjoint data
which are now connected.

A Mashup is a way of combining related data
into a pictorial form using Socially Rich
computing technology to make the data easy to
read and understand







Charts
Graphs
Websites
Maps
Tables
Movies
AJAX Rich Applications





Web 2.0 is known as the Social / Collaborative
Web
Web 3.0 is another term used to express the
Semantic Web
The linked data is considered Web 3.0.
The practice of pulling the data into the
Mashup Engine is a mix between Web 3.0 and
Web 2.0
The practice of displaying the data in a Mashup
is referred to as Web 2.0.
http://www.jackbe.com/enterprisemashup/
http://www-01.ibm.com/software/lotus/products/mashups/
Data can be pulled from existing
Enterprise Datacenter Services,
and also from feeds on the
internet or Semantic Web.
Example input data can include:
XML, RDF, LDAP, SQL, CSV,
Office Documents, RSS Feeds,
Directory Servers, among
others.
Data mapping patterns, merging, looping, and logical operations are all supported
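
To make the mapping and merging steps concrete, here is a minimal Python sketch using only the standard library. It does not use the API of either product linked above. The two feeds, their field names, and the common schema are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two hypothetical input feeds in different formats.
CSV_FEED = "name,zip\nAlice,21250\nBob,10001\n"
XML_FEED = "<people><person name='Alice' phone='555-0100'/></people>"


def rows_from_csv(text):
    """Map CSV columns onto the common schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"Name": row["name"], "Zip Code": row["zip"]} for row in reader]


def rows_from_xml(text):
    """Map XML attributes onto the common schema."""
    root = ET.fromstring(text)
    return [
        {"Name": person.get("name"), "Telephone Number": person.get("phone")}
        for person in root.iter("person")
    ]


def merge_rows(*row_sets):
    """Loop over every feed and merge rows that share the same Name."""
    merged = {}
    for rows in row_sets:
        for row in rows:
            merged.setdefault(row["Name"], {}).update(row)
    return list(merged.values())


table = merge_rows(rows_from_csv(CSV_FEED), rows_from_xml(XML_FEED))
for row in table:
    print(row)
```

The resulting list of rows is the kind of table a mashup would then render as a chart, map, or web page.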