RDF: Building Block for the Semantic Web
Jim Ellenberger
UCCS CS5260
Spring 2011
Semantic web
• What is it?
• A phrase coined by Tim Berners-Lee, inventor of the WWW, and popularized by a 2001 Scientific American article
• Berners-Lee and others have described it as a major component of “Web 3.0”
• Wikipedia defines it well:
• A “web of data” that enables machines to understand the semantics, or meaning, of information on the WWW
• Extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata
• Enables automated agents to access the Web more intelligently and perform tasks on behalf of users
RDF - Jim Ellenberger - May, 2011
Why do we need it?
• Traditional web technologies like HTML are focused on organizing, presenting and linking documents
• Can’t directly access the meaning of information on the Web
• Can’t provide consistent methods to aggregate and query information on the Web
• Semantic web technologies provide these missing components
• Information can be stored, aggregated and queried based on its meaning
• All of this can be automated, because the information is available in machine-readable formats
How is the semantic web implemented?
• There is a need to encode and manipulate knowledge on the web, but how can it be done?
• Technologies that describe and manipulate information based on meanings and relationships
• Resource Description Framework (RDF)
• Data interchange formats (RDF/XML, N3, Turtle, N-Triples)
• Schema and ontology languages (RDFS, OWL)
• Query languages (SPARQL)
• My focus: RDF
• Essentially, the building block for all semantic web technologies
• Originally specified by the W3C as a metadata language; it was extended to accommodate semantic web concepts
• See http://www.w3.org/RDF
RDF: general structure
• RDF is graph-based
• Not hierarchical like XML and other data description formats
• Single pieces of information are graph nodes and the relationships between them are graph edges
• Advantages of graph-based model
• Virtually any kind and number of relationships can be represented - no need to adhere to a hierarchy
• Diverse graphs can be combined as simply as defining a relationship between two nodes - no need for graphs to have compatible hierarchies
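The node-and-edge model described on this slide can be sketched in a few lines of Python. This is only an illustration, not part of RDF itself: the labels `is_a` and `makes` are invented for the example, and the adjacency-dictionary layout is one assumed way to hold a graph in memory.

```python
from collections import defaultdict

# Each edge is (source node, relationship, target node).
# The labels are illustrative assumptions, not taken from any vocabulary.
edges = [
    ("D300", "manufactured_by", "Nikon"),
    ("D300", "is_a", "Camera"),
    ("Nikon", "makes", "D300"),  # points back at D300: a cycle, which a strict tree cannot hold
]

def build_graph(edge_list):
    """Group outgoing edges by their source node."""
    graph = defaultdict(list)
    for source, relation, target in edge_list:
        graph[source].append((relation, target))
    return dict(graph)

graph = build_graph(edges)

# Every value that appears at either end of an edge is a node.
nodes = {n for source, _, target in edges for n in (source, target)}
```

Because any node may carry any number of outgoing and incoming edges, nothing forces the data into a parent/child hierarchy.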
RDF statements
• The basic unit of information in RDF is a statement or triple with three components
• Subject – thing the statement is about
• Predicate or property – a property or characteristic of the subject
• Object – the value of the property or characteristic
• Example, a statement about a camera:
• The D300 – subject of the statement
• is manufactured by – predicate
• Nikon – object of the predicate
• This triple encodes a single piece of information: The D300 is manufactured by Nikon
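As a minimal sketch, the camera statement can be held in a three-field record. The class name `Triple`, the field name `obj` and the helper `as_sentence` are assumptions made for this example; plain text labels stand in for real identifiers.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str  # named obj to avoid shadowing Python's built-in object

# The camera statement from the slide, using plain labels.
statement = Triple("The D300", "is manufactured by", "Nikon")

def as_sentence(triple: Triple) -> str:
    """Read the triple back as the piece of information it encodes."""
    return f"{triple.subject} {triple.predicate} {triple.obj}"

print(as_sentence(statement))  # prints: The D300 is manufactured by Nikon
```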
RDF URIs
• Subjects and objects that make up RDF statements are called resources
• In order to be useful web-wide, resources and the predicates that link them need identifiers that are:
• Unique – to avoid confusion
• Universally accessible – to make them usable web-wide
• These identifiers are called URIs - Uniform Resource Identifiers
• The camera example in URIs:
• http://dbpedia.org/page/Nikon_D300 - subject
• http://mywebpage.org/camera#manufactured_by - predicate
• http://www.dbpedia.org/resource/Nikon - object
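The same statement can be written with the three URIs above. The sketch below also includes a rough check that each part is a full identifier rather than a bare label; `is_absolute_uri` is a hypothetical helper, and its test (a scheme plus a host) is a simplification that only suits http-style URIs.

```python
from urllib.parse import urlparse

# The camera triple, with each part replaced by the URI given on the slide.
camera_triple = (
    "http://dbpedia.org/page/Nikon_D300",           # subject
    "http://mywebpage.org/camera#manufactured_by",  # predicate
    "http://www.dbpedia.org/resource/Nikon",        # object
)

def is_absolute_uri(value: str) -> bool:
    """Crude check: the value has both a scheme and a host.

    This is an assumption-laden shortcut; URIs without a host part
    (for example urn: identifiers) would be rejected by it.
    """
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc)

all_parts_identified = all(is_absolute_uri(part) for part in camera_triple)
```

A bare label such as "Nikon" fails the check, which is the ambiguity that URIs are meant to remove.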
More about URIs
• Not every URI is a URL (but every URL is a URI)
• URLs represent things retrievable from the web
• URIs represent things identified on the web, which may or may not be retrievable
• Where do URIs come from?
• Use an existing URI if an appropriate one exists: http://dbpedia.org/page/Nikon_D300
• If one doesn’t exist, make your own: http://mywebpage.org/camera#manufactured_by
• If you create your own, it should be universally accessible and, ideally, return data to RDF clients
Camera example in graph form
[Figure: node http://dbpedia.org/page/Nikon_D300 connected by the edge http://mywebpage.org/camera#manufactured_by to node http://www.dbpedia.org/resource/Nikon]
Camera example linked to other graphs
[Figure: the camera graph (node http://dbpedia.org/page/Nikon_D300, edge http://mywebpage.org/camera#manufactured_by, node http://www.dbpedia.org/resource/Nikon) joined to other graphs through the additional edges [URL: stock_price_of] and [URL: review_of] and the additional nodes [URL of Stock Price] and [URL: Review]]
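Linking graphs like this can be sketched as a set union over triples. The camera triple uses the URIs from the earlier slides; the review graph is hypothetical, and its `example.org` URIs are placeholders invented for this sketch, since the deck leaves the real ones unspecified. The helper `touching` is likewise an assumption.

```python
CAMERA = "http://dbpedia.org/page/Nikon_D300"
NIKON = "http://www.dbpedia.org/resource/Nikon"

# The camera graph from the earlier slides: one triple.
camera_graph = {
    (CAMERA, "http://mywebpage.org/camera#manufactured_by", NIKON),
}

# A hypothetical second graph published by someone else.
# The example.org URIs are placeholders, not real vocabulary terms.
review_graph = {
    ("http://example.org/reviews/1", "http://example.org/terms#review_of", CAMERA),
}

# Combining graphs is a plain set union; the shared CAMERA URI links them.
merged = camera_graph | review_graph

def touching(graph, node):
    """Return every triple in which the node is the subject or the object."""
    return sorted(t for t in graph if node in (t[0], t[2]))

camera_facts = touching(merged, CAMERA)
```

No schema had to be reconciled: the two graphs connect only because both refer to the same camera URI.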
What Does RDF Look Like in the Wild?
• RDF statements need to be serialized to be used on the WWW and processed by machines
• There are many formats used for this:
• RDF/XML
• Turtle
• N3
• RDFa
• RDF/XML is probably the most common
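To make serialization concrete, here is a minimal sketch that writes triples in N-Triples, the line-per-statement format mentioned earlier in the deck. The function name `to_ntriples` is an assumption, and the sketch only handles triples whose three parts are all URIs; literal values would need quoting and escaping that is not shown.

```python
def to_ntriples(triples):
    """Serialize URI-only triples, one '<s> <p> <o> .' line per statement."""
    lines = []
    for subject, predicate, obj in triples:
        lines.append(f"<{subject}> <{predicate}> <{obj}> .")
    return "\n".join(lines)

camera_graph = [
    (
        "http://dbpedia.org/page/Nikon_D300",
        "http://mywebpage.org/camera#manufactured_by",
        "http://www.dbpedia.org/resource/Nikon",
    ),
]

serialized = to_ntriples(camera_graph)
print(serialized)
```

The other formats listed on this slide encode the same triples with different syntax.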
RDF/XML Example
• RDF is not XML, but it can be encoded in XML
• The camera example, in RDF/XML:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:mypage="http://mywebpage.org/camera#">
<rdf:Description rdf:about="http://dbpedia.org/page/Nikon_D300">
<mypage:manufactured_by rdf:resource="http://www.dbpedia.org/resource/Nikon"/>
</rdf:Description>
</rdf:RDF>
• XML tags and attributes
• rdf:RDF - begin RDF document
• rdf:Description – begin description of subject(s)
• rdf:about – URI for the subject
• mypage:manufactured_by – the predicate
• rdf:resource – URI for the object
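One way to see how the tags and attributes map back to a triple is to read the document with Python's standard XML parser. This is a rough sketch, not a full RDF/XML parser: it only understands the rdf:Description / rdf:about / rdf:resource pattern used here. The embedded snippet is a cleaned copy of the camera example, with attribute values quoted and the `mypage` namespace assumed to be `http://mywebpage.org/camera#` so that it matches the predicate URI used earlier in the deck. The function name `extract_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Cleaned copy of the camera example (namespace for mypage is an assumption).
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:mypage="http://mywebpage.org/camera#">
  <rdf:Description rdf:about="http://dbpedia.org/page/Nikon_D300">
    <mypage:manufactured_by rdf:resource="http://www.dbpedia.org/resource/Nikon"/>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(xml_text):
    """Collect (subject, predicate, object) from simple rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree names tags "{namespace}local"; joining both parts
            # gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                # No rdf:resource attribute: treat the element text as a literal.
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples

triples = extract_triples(RDF_XML)
```

For real data a dedicated RDF library is the safer choice, since RDF/XML allows many abbreviated forms that this sketch ignores.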
A real world example: OpenCalais
• OpenCalais is a web service that automatically generates semantic metadata in RDF/XML from text submitted to it
• This is a portion of OpenCalais’ output when “D300” is submitted:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:c="http://s.opencalais.com/1/pred/">
<rdf:Description rdf:about="http://d.opencalais.com/er/product/electronics...">
...
<c:name>Nikon D300 Digital Camera</c:name>
</rdf:Description>
</rdf:RDF>
• Essentially, the edited RDF code contains the triple
• electronics product (subject)
• name (predicate)
• Nikon D300 Digital Camera (object)
What else is happening?
• DBpedia project
•Publishes Wikipedia information in semantic web formats
•http://dbpedia.org
• FOAF - Friend of a Friend project
•Uses RDF to describe relationships among people
•http://www.foaf-project.org/
• OpenPSI project
•Publishes UK government data in semantic web formats
•http://www.openpsi.org/
• GoodRelations vocabulary
•A means to publish product info in semantic web formats
•http://www.heppnetz.de/projects/goodrelations/
Important Issues
• The amount of information that could be encoded is staggering
• Encoding meaning isn’t always straightforward -- e.g., what does “young” mean?
• Not everyone wants their information freely available
•Information can be a commodity
•Information can be a trade secret
• Accuracy -- how do we deal with information that is inaccurate or deceptive?
• Performance -- how will semantic web data stores perform compared to more traditional data stores?
Conclusion
• There is quite a bit more to RDF
• RDF has more capabilities than described here
• RDF has been expanded with other technologies to create still more capabilities
• There are also many related areas to explore
• How can RDF data be created?
• How can it be stored?
• How can it be served and retrieved?
• Once we retrieve RDF data, what should we do with it?