RDF: Building Block for the Semantic Web
Jim Ellenberger
Spring 2011
Semantic web
• What is it?
  • A phrase coined by Tim Berners-Lee, inventor of the WWW, and popularized in a 2001 Scientific American article
  • Berners-Lee and others have described it as a major component of “Web 3.0”
  • Wikipedia defines it well:
    • A “web of data” that enables machines to understand the semantics, or meaning, of information on the WWW
    • Extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata
    • Enables automated agents to access the Web more intelligently and perform tasks on behalf of users
RDF - Jim Ellenberger - May, 2011
Why do we need it?
• Traditional web technologies like HTML are focused on organizing, presenting and linking documents
  • Can’t directly access the meaning of information on the Web
  • Can’t provide consistent methods to aggregate and query information on the Web
• Semantic web technologies provide these missing capabilities
  • Information can be stored, aggregated and queried based on its meaning
  • All of this can be automated, because the information is available in machine-readable formats
How is the semantic web implemented?
• There is a need to encode and manipulate knowledge on the web, but how can it be done?
• Technologies that describe and manipulate information based on meanings and relationships
  • Resource Description Framework (RDF)
  • Data interchange formats (RDF/XML, N3, Turtle, N-Triples)
  • Notations (RDFS, OWL)
  • Query languages (SPARQL)
• My focus: RDF
  • Essentially, the building block for all semantic web technologies
  • Originally specified by the W3C as a metadata language; it was extended to accommodate semantic web concepts
  • See http://www.w3.org/RDF
RDF: general structure
• RDF is graph-based
  • Not hierarchical like XML and other data description formats
  • Single pieces of information are graph nodes and the relationships between them are graph edges
• Advantages of graph-based model
  • Virtually any kind and number of relationships can be represented - no need to adhere to a hierarchy
  • Diverse graphs can be combined as simply as defining a relationship between two nodes - no need for graphs to have compatible hierarchies
RDF statements
• The basic unit of information in RDF is a statement or triple with three components
  • Subject – thing the statement is about
  • Predicate or property – a property or characteristic of the subject
  • Object – the value of the property or characteristic
• Example, a statement about a camera:
  • The D300 – subject of the statement
  • is manufactured by – predicate
  • Nikon – object of the predicate
• This triple encodes a single piece of information: The D300 is manufactured by Nikon
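The camera statement can be sketched in a few lines of Python. This is only an illustration of the subject / predicate / object shape; the `Triple` class and `as_sentence` helper are names made up for this sketch, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# The camera statement, using plain labels for readability.
camera = Triple(subject="The D300",
                predicate="is manufactured by",
                object="Nikon")


def as_sentence(t: Triple) -> str:
    """Read a triple back as the single piece of information it encodes."""
    return f"{t.subject} {t.predicate} {t.object}"


print(as_sentence(camera))  # The D300 is manufactured by Nikon
```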
• Subjects and objects that make up RDF statements are called resources
• In order to be useful web wide, resources and the predicates that link them need identifiers that are:
  • Unique – to avoid confusion
  • Universally accessible – to make them usable web wide
• These identifiers are called URIs - Uniform Resource Identifiers
• The camera example in URIs:
  • http://dbpedia.org/page/Nikon_D300 - subject
  • http://mywebpage.org/camera#manufactured_by - predicate
  • http://www.dbpedia.org/resource/Nikon - object
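One way to see why URIs keep names unique is to treat each short name as a namespace plus a local part. The sketch below assumes a small prefix table; the prefix labels `dbpage` and `dbres` and the `expand` helper are hypothetical, while the namespace URIs are the ones used in the camera example.

```python
# Hypothetical prefix table: the labels are invented for this sketch,
# the namespace URIs are taken from the camera example.
PREFIXES = {
    "dbpage": "http://dbpedia.org/page/",
    "mypage": "http://mywebpage.org/camera#",
    "dbres": "http://www.dbpedia.org/resource/",
}


def expand(name: str) -> str:
    """Turn a short 'prefix:local' name into a full URI."""
    prefix, _, local = name.partition(":")
    return PREFIXES[prefix] + local


# The camera statement, written with short names and expanded to URIs.
camera_triple = tuple(
    expand(n)
    for n in ("dbpage:Nikon_D300", "mypage:manufactured_by", "dbres:Nikon")
)

for part in camera_triple:
    print(part)
```

Two sites can both define a `manufactured_by` predicate without clashing, because each expands to a different URI.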
More about URIs
• URIs are not URLs (but URLs are URIs)
  • URLs represent things retrievable from the web
  • URIs represent things identified on the web, which may or may not be retrievable
• Where do URIs come from?
  • Use an existing URI if an appropriate one exists
  • If one doesn’t exist, make your own
  • If you create your own, it must be universally accessible and must return data to RDF clients
Camera example in graph form
Camera example linked to other graphs
[Figure: the camera graph linked to other graphs. Surviving labels: [URL: stock_price_of], [URL of Stock Price], [URL: Review], [URL: review_of]]
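Linking graphs can be sketched as a set union over triples that share a node. In the sketch below the review URIs are hypothetical stand-ins for the [URL: ...] placeholders, the direction of the `review_of` edge (review to camera) is an assumption, and `match` is a helper invented for this example.

```python
D300 = "http://dbpedia.org/page/Nikon_D300"
NIKON = "http://www.dbpedia.org/resource/Nikon"
MADE_BY = "http://mywebpage.org/camera#manufactured_by"
# Hypothetical identifiers standing in for the [URL: ...] placeholders.
REVIEW = "http://example.org/reviews/1"
REVIEW_OF = "http://example.org/terms#review_of"

# Two independently published graphs, each a set of triples.
camera_graph = {(D300, MADE_BY, NIKON)}
review_graph = {(REVIEW, REVIEW_OF, D300)}

# Merging is a set union: the shared D300 URI connects the two graphs.
merged = camera_graph | review_graph


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Everything the merged graph says about the D300, as subject or object.
about_d300 = match(merged, s=D300) | match(merged, o=D300)
for triple in sorted(about_d300):
    print(triple)
```

Neither graph needed a compatible hierarchy; using the same URI for the camera was enough to connect them.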
What Does RDF Look Like in the Wild?
• RDF statements need to be serialized to be used on the WWW and processed by machines
• There are many formats used for this:
  • RDF/XML is probably the most common
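As a rough illustration of serialization, the camera statement can be written out in N-Triples, one of the simpler formats: each statement becomes one line of three URIs in angle brackets, ending with a period. The `to_ntriples` helper is a name made up for this sketch, and it handles only URI subjects, predicates and objects (no literals or blank nodes).

```python
def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


camera_graph = [
    (
        "http://dbpedia.org/page/Nikon_D300",
        "http://mywebpage.org/camera#manufactured_by",
        "http://www.dbpedia.org/resource/Nikon",
    ),
]

print(to_ntriples(camera_graph))
```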
RDF/XML Example
• RDF is not XML, but it can be encoded in XML
• The camera example, in RDF/XML:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:mypage="http://mywebpage.org/camera#">
  <rdf:Description rdf:about="http://dbpedia.org/page/Nikon_D300">
    <mypage:manufactured_by rdf:resource="http://www.dbpedia.org/resource/Nikon"/>
  </rdf:Description>
</rdf:RDF>
• XML tags and attributes
  rdf:RDF – begin RDF document
  rdf:Description – begin description of subject(s)
  rdf:about – URI for the subject
  mypage:manufactured_by – the predicate
  rdf:resource – URI for the object
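To show that the triple can be recovered from RDF/XML, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes the `mypage` prefix is bound to http://mywebpage.org/camera# (taken from the predicate URI in the camera example) and handles only this simple pattern of `rdf:Description` with `rdf:resource` objects; a real RDF parser covers far more of the syntax. The `extract_triples` helper is a name invented for this sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:mypage="http://mywebpage.org/camera#">
  <rdf:Description rdf:about="http://dbpedia.org/page/Nikon_D300">
    <mypage:manufactured_by rdf:resource="http://www.dbpedia.org/resource/Nikon"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Collect (subject, predicate, object) URIs from simple RDF/XML."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree names tags "{namespace}local"; dropping the
            # braces joins namespace and local name into the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, predicate, obj))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```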
A real world example:
• OpenCalais is a web service that automatically generates semantic metadata in RDF/XML from text submitted to it
• This is a portion of OpenCalais’ output when “D300” is submitted:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" ...>
  <rdf:Description rdf:about="http://d.opencalais.com/er/product/electronics...">
    <c:name>Nikon D300 Digital Camera</c:name>
  </rdf:Description>
</rdf:RDF>
• Essentially, the edited RDF code contains the triple
  • electronics product (subject)
  • name (predicate)
  • Nikon D300 Digital Camera (object)
What else is happening?
• DBpedia project
  • Publishes Wikipedia information in semantic web formats
• FOAF - Friend of a Friend project
  • Uses RDF to describe relationships among people
• OpenPSI project
  • Publishes UK government data in semantic web formats
• GoodRelations vocabulary
  • A means to publish product info in semantic web formats
Important Issues
• The amount of information that could be encoded is enormous
• Encoding meaning isn’t always straightforward -- e.g., what does “young” mean?
• Not everyone wants their information freely available
  • Information can be a commodity
  • Information can be a trade secret
• Accuracy -- how do we deal with information that is inaccurate or deceptive?
• Performance -- how will semantic web data stores perform compared to more traditional datasets?
• There is quite a bit more to RDF
  • RDF has more capabilities than described here
  • RDF has been expanded with other technologies to create still more capabilities
• There are also many related areas to explore
  • How can RDF data be created?
  • How can it be stored?
  • How can it be served and retrieved?
  • Once we retrieve RDF data, what should we do with it?