for librarians!
A hands-on-exploration of an often nebulous concept
Reinhard Engels, ABCD Library, October 2013
1. I am not an expert!
2. Apparently it takes more than a week to become one
3. Your brain may hurt
1. Convey some actual knowledge about LD
2. Let you pass a polygraph
3. Reassure that it’s OK to be confused
4. Lower the bar for asking “stupid questions”
1. Quick review of what Linked Data (LD) is
2. Look at some real LD (DBpedia, NY Times)
3. Make some simple LD (RDF “N-Triples”)
4. Query remote LD source (SPARQL on DBpedia)
5. How to embed LD in HTML (RDFa et al.)
6. Ponder things that are kinda sorta like LD
7. Recover!
• “a set of best practices for publishing and connecting structured data on the web”
• Conceived by the guy who invented the WWW
• Web of Data
• Turns the web into a giant database
• With a single, consistent API
• Simple, elegant, familiar mechanism: URIs and “typed links”
• For users: enables meaningful queries instead of just text string searches; research applications, consumer applications
• For creators: efficiency of not having to redundantly create and maintain data.
• One API for all data: this is a thing of beauty in itself.
1. Use URIs as names for things.
2. Use HTTP URIs, so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.
Linked Data: How? (RE’s formulation)
1. Describe things using RDF triples
2. Identify things using HTTP URIs
3. Those URIs should link to more LD (that other people have already created, whenever possible)
Linked Data: How? (RE’s even shorter reformulation)
1. Describe with RDF
2. Identify with HTTP URIs
3. Link to more LD
• RDF = Resource Description Framework
• F stands for framework, not file type!
• It’s a conceptual model
• “content agnostic” (can describe anything)
• Describe things using 3 terms (“RDF triples”)
   Term          Example 1   Example 2
1. Subject       Fred        Fred
2. Predicate     Likes       Date of Birth
3. Object        Wilma       October 2, 1973
1. Subject and Predicate MUST be URIs
2. Object may be URI or raw value (number, text, date, etc.)
   Term          Plain English   With raw-value object   With URI object
1. Subject       Fred            http://s.org/fred       http://s.org/fred
2. Predicate     Likes           http://p.org/likes      http://p.org/likes
3. Object        Wilma           “Wilma”                 http://o.org/wilma
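The two rules above can be sketched in a few lines of Python. This is only an illustration, not a real RDF library: the `URI` and `Triple` names and the `validate` helper are made up for this example, and s.org, p.org and o.org are the slide's placeholder domains.

```python
from typing import NamedTuple, Union


class URI(str):
    """Marks a string as an HTTP URI (as opposed to a raw literal value)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, str]  # the object may be a URI or a raw value


def validate(triple: Triple) -> bool:
    """Check rule 1: subject and predicate must both be URIs."""
    return isinstance(triple.subject, URI) and isinstance(triple.predicate, URI)


# Same statement twice: once with a raw-value object, once with a URI object.
likes_literal = Triple(URI("http://s.org/fred"), URI("http://p.org/likes"), "Wilma")
likes_linked = Triple(
    URI("http://s.org/fred"), URI("http://p.org/likes"), URI("http://o.org/wilma")
)

# A plain-English subject breaks rule 1.
not_valid = Triple("Fred", URI("http://p.org/likes"), "Wilma")

print(validate(likes_literal), validate(likes_linked), validate(not_valid))
```

Only the version with a URI object gives a reader something to follow to more data, which is the "link" in Linked Data.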
• What format should the referenced LD be in?
• If I go to http://o.org/wilma , what should I see there?
• Are predicates in RDF too? http://p.org/likes
(Why are you even here?)
(Why aren’t you making Linked Data NOW?)
“At first glance, the principles of Linked Data seem simple enough.
However experienced Web developers, designers and architects who attempt to put these ideas into practice often find themselves having to digest and understand debates about Web architecture, the semantic web, artificial intelligence and the philosophical nature of identity.”
– Ed Summers & Dorothea Salo
• It’s OK!
• Though core concepts are very simple
• It quickly gets confusing – it’s not just you
• Accidental: partially overlapping concepts.
• Intrinsic: simple parts make complex whole
• Danger: Is it too simple? (ambiguous)
“Make things as simple as possible, but not simpler.” – Einstein (paraphrased)
• There are a lot of things that are kinda sorta like LD!
• Semantic Web (1994)
• Web APIs (10,214 and counting)
• Facebook Open Graph?
• Schema.org and microdata? (Google, Yahoo, Microsoft)
• microformats
• Semantic Web: 1994
• “The vision of the Semantic Web is to extend principles of the Web from documents to data” – W3C
• “This simple idea [the Semantic Web]… remains largely unrealized.” – Tim Berners-Lee et al., 2006
Is Linked Data (2006) a:
• Special case: narrowing and focusing?
• Redo: “The semantic web done right?”
• Addition: Semantic web + links?
• Rebranding of a troubled project?
• URI, URL, URN, IRI, CURIE
• RDF “Serializations”: RDF/XML, RDFa, N-Triples, Turtle, JSON-LD
• Ontologies vs. ontology languages vs. “schema languages” vs. plain old RDF: RDFS, OWL, FOAF
• SPARQL
• 5 star LD!
• That means we need to link to other LD
• So we need to identify some existing LD to link to…
• Linked Data version of Wikipedia
• Take any Wikipedia URL
• Replace “en.wikipedia.org/wiki”
• With “dbpedia.org/page”
• And you have the LD expression of that concept.
• http://en.wikipedia.org/wiki/Cambridge,_Massachusetts
• http://dbpedia.org/page/Cambridge,_Massachusetts
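The URL swap described above is mechanical enough to sketch as a small Python function. The function name `wikipedia_to_dbpedia` is made up for this example, and it assumes an English Wikipedia article URL as input.

```python
def wikipedia_to_dbpedia(url: str) -> str:
    """Swap the Wikipedia host/path prefix for DBpedia's 'page' prefix."""
    old = "en.wikipedia.org/wiki"
    new = "dbpedia.org/page"
    if old not in url:
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    # Replace only the first occurrence so the article name is left untouched.
    return url.replace(old, new, 1)


print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Cambridge,_Massachusetts"))
# prints http://dbpedia.org/page/Cambridge,_Massachusetts
```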
• A set of RDF triples is called a “graph”
• Graph in this sense is a math/comp sci data structure
• Not a visual plot
“provide useful information using the standards…”
2409 RDF Triples about Cambridge
Let’s make an LD “comment” about Cambridge!
1. Open the “N-Triples” DBpedia file and find the existing English language comment
Triple 1
Subject:   <http://mylinkeddata.org/resource/123>
Predicate: <http://www.w3.org/2002/07/owl#sameAs>
Object:    <http://dbpedia.org/resource/Cambridge,_Massachusetts>

Triple 2
Subject:   <http://mylinkeddata.org/resource/123>
Predicate: <http://www.w3.org/2000/01/rdf-schema#comment>
Object:    "Cambridge is a pretty cool town"@en
Stick this data:
At this URL: http://mylinkeddata.org/resource/123
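For anyone who would rather generate the file than type it, here is a minimal Python sketch that writes the two triples about Cambridge as N-Triples lines. The helper names (`uri`, `literal`, `ntriple`) are made up for this example, and the escaping covers only backslashes and double quotes, so text containing line breaks would need more care.

```python
def uri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str, lang: str = "") -> str:
    """Quote a plain text value, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Join three already-formatted terms into one N-Triples line."""
    return f"{subject} {predicate} {obj} ."


me = uri("http://mylinkeddata.org/resource/123")
lines = [
    # Triple 1: our resource is the same thing as DBpedia's Cambridge.
    ntriple(me, uri("http://www.w3.org/2002/07/owl#sameAs"),
            uri("http://dbpedia.org/resource/Cambridge,_Massachusetts")),
    # Triple 2: our own English-language comment about it.
    ntriple(me, uri("http://www.w3.org/2000/01/rdf-schema#comment"),
            literal("Cambridge is a pretty cool town", "en")),
]
print("\n".join(lines))
```

Each printed line is one complete triple: subject, predicate, object, then a closing period.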
• Creating RDF triples is easy
• Figuring out the right HTTP URIs to use is hard
• Figuring out how to respond to any HTTP URI requests you receive is also harder than I would like
• SPARQL: Recursive acronym for SPARQL Protocol and RDF Query Language
• RQL is the part we’re interested in
• LD’s answer to SQL
• Instead of querying tables in a db
• You query a graph of RDF triples
• Using “triple patterns” (and some other stuff)
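To make “triple patterns” concrete, here is a toy matcher in Python. It is not SPARQL and not how a real triple store works internally; it only shows the idea that a pattern is a triple with some slots replaced by `?variables`. The `ex:` names and the tiny graph are invented for this example.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Bindings = Dict[str, str]


def match(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Match one pattern against one triple; return extended bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable binds on first use and must agree with itself afterwards.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(graph: List[Triple], patterns: List[Triple]) -> List[Bindings]:
    """Return every set of variable bindings that satisfies all patterns."""
    solutions: List[Bindings] = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


graph = [
    ("ex:fred", "ex:mainInterest", "ex:Nihilism"),
    ("ex:fred", "ex:mainInterest", "ex:Theology"),
    ("ex:fred", "ex:name", "Fred"),
    ("ex:wilma", "ex:mainInterest", "ex:Theology"),
    ("ex:wilma", "ex:name", "Wilma"),
]
patterns = [
    ("?person", "ex:mainInterest", "ex:Nihilism"),
    ("?person", "ex:mainInterest", "ex:Theology"),
    ("?person", "ex:name", "?name"),
]
print(query(graph, patterns))
# prints [{'?person': 'ex:fred', '?name': 'Fred'}]
```

Because `?person` appears in all three patterns, only someone who satisfies every one of them survives.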
Let’s query the DBpedia SPARQL endpoint!
Note: You want to point your browser to “snorql” (not sparql!): http://dbpedia.org/snorql
• Show me name and dates of birth and death for people whose “main interests” are theology and nihilism
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX : <http://dbpedia.org/resource/>
SELECT ?name ?birth ?death ?person WHERE {
?person dbo:mainInterest :Nihilism .
?person dbo:mainInterest :Theology .
?person dbo:birthDate ?birth .
?person foaf:name ?name .
?person dbo:deathDate ?death .
}
ORDER BY ?name
http://bit.ly/1ip6leF
http://wiki.dbpedia.org/OnlineAccess#h28-5
Play around with them. Swap out some parameters. To get ideas for other triple patterns, stare at your favorite DBpedia records, the ones you found by modifying Wikipedia URLs.
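The same query can also be sent from a script instead of the browser form. The sketch below uses only Python's standard library. It assumes DBpedia's public endpoint lives at http://dbpedia.org/sparql and that it honors the standard SPARQL JSON results media type; the function names are made up for this example. Nothing is sent until `run` is called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"  # assumption: DBpedia's public endpoint

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX : <http://dbpedia.org/resource/>
SELECT ?name ?birth ?death ?person WHERE {
  ?person dbo:mainInterest :Nihilism .
  ?person dbo:mainInterest :Theology .
  ?person dbo:birthDate ?birth .
  ?person foaf:name ?name .
  ?person dbo:deathDate ?death .
}
ORDER BY ?name
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Encode the query as a GET request using the protocol's 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for results in the SPARQL JSON results format.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run(query: str) -> list:
    """Send the request and return the result rows (needs network access)."""
    with urllib.request.urlopen(build_request(query), timeout=30) as response:
        data = json.load(response)
    return data["results"]["bindings"]


# Uncomment to try it against the live endpoint:
# for row in run(QUERY):
#     print(row["name"]["value"], row["birth"]["value"], row["death"]["value"])
```

Each row is a dictionary keyed by variable name, with the actual value under `"value"`.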
If you want to run SPARQL against your own RDF data…
• Install Apache Jena (Java framework)
• Use the command line ARQ tool
• Warning: probably too geeky for most folks in this room.
• But if you’re serious about going deeper, probably unavoidable
• SPARQL syntax harder than RDF
• But again, the hardest part seems to be figuring out what URIs to plug in
• Existing tools not very user friendly
• Promise of querying the entire Web of Data still a way off
• Regular LD sort of a parallel web of data
• RDFa and related technologies embed web of data within the web of documents
• The “a” stands for “attributes”
• Metatags on steroids
• But good, W3C doctor approved steroids!
• Sounds like an afterthought, but probably far more widely used than any other form of LD.
What does RDFa look like under the hood?
http://en.wikipedia.org/wiki/RDFa
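As a rough picture of what “under the hood” means: RDFa adds attributes such as `vocab`, `typeof` and `property` to ordinary HTML tags. The Python sketch below holds a made-up page fragment and pulls out the property/value pairs with the standard library's HTML parser. It is a toy, not a conforming RDFa processor, and the fragment's content (Fred, Wilma, o.org) is purely illustrative.

```python
from html.parser import HTMLParser

# A hypothetical page fragment using the FOAF vocabulary.
HTML = """
<div vocab="http://xmlns.com/foaf/0.1/" typeof="Person">
  <span property="name">Fred</span> knows
  <a property="knows" href="http://o.org/wilma">Wilma</a>.
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa 'property' attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name waiting for its text content

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "href" in attributes:
            # A link target is treated as the value: a URI, not text.
            self.pairs.append((prop, attributes["href"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        # The first text after a pending property becomes its value.
        if self._pending is not None:
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


collector = PropertyCollector()
collector.feed(HTML)
print(collector.pairs)
# prints [('name', 'Fred'), ('knows', 'http://o.org/wilma')]
```

A human visitor sees only “Fred knows Wilma.”; the attributes carry the same statement in a form a machine can read.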
• Graph? Sounds like LD!
• And indeed, uses RDFa
• But not “pure RDFa”
• And only for ingest
• http://graph.facebook.com/reinhard.engels
• http://graph.facebook.com/harvard
• http://graph.facebook.com/zuck
• Frustrated by ambiguities and the many competing ways of doing more or less the same thing
• Frustrated by disconnect between grand vision of one API for the Web of Data and the sorry little SPARQL queries I was able to run
• Not overjoyed that SEO spamming seems the one area in which LD is really succeeding
• “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” – Amara’s Law