Slide - American Library Association

Linked Library Data
Tuning Library Metadata for the [Semantic] Web
Presented 2011-03-16
ALCTS RDA Webinar Series
Corey A Harper
Topical Overview
- Semantic Web & RDF Intro
- Linked Open Data
- [Linked] Library Data
- Resource Description and Access (RDA)
  - Beyond MARC
  - As RDF Vocabularies
- Broader Interoperability
- Small steps forward…
2011-03-16
Harper - Linked Library Data - RDA Webinar Series
Hosted by the Association for Library Collections and Technical Services
2
Semantic Web
- Tim Berners-Lee’s original vision
  - “Weaving the Web” – 1999
- Then: Focus on Machine Reasoning
  - Scientific American Article
- Now: Focus on things & links
  - Reasoning & Inferencing less central
Semantic Web
- Originally:
  - Metadata standard built on XML
  - Metadata about “Web” things (documents)
- Eventually:
  - Metadata about all sorts of things
  - And about relationships between things
- What are the “things”?
Semantic Web Terminology
- Resource: Any “thing”
- Class: Abstraction of a type of thing
- Individual: An instance of a class
- Property: An attribute of an individual
- Statement/Triple:
  - A Resource (subject)
  - A Property (predicate / verb)
  - A Value (object) - Nodes
- Graph: Visual Representation of statements
- Ontology: A domain-specific collection of classes and properties
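
The terms above can be sketched in a few lines of Python by modeling each statement as a (subject, predicate, object) tuple and a graph as a set of such tuples. This is only an illustration: the `Triple` name, the `properties_of` helper, and the example.org URIs are assumptions made for the sketch, not part of the presentation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a resource, a property, and a value."""
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property
    obj: str        # URI of another resource, or a literal string


# Hypothetical identifiers under example.org, used only for illustration.
BOOK = "http://example.org/book/1"
AUTHOR = "http://example.org/person/1"

# A graph is simply a collection of statements.
graph = {
    Triple(BOOK, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
           "http://purl.org/ontology/bibo/Book"),
    Triple(BOOK, "http://purl.org/dc/terms/creator", AUTHOR),
    Triple(BOOK, "http://purl.org/dc/terms/title", "Weaving the Web"),
}


def properties_of(graph, subject):
    """Return the sorted property URIs used to describe one subject."""
    return sorted(t.predicate for t in graph if t.subject == subject)
```

Calling `properties_of(graph, BOOK)` lists the three properties attached to the book; the creator value is itself a resource, while the title value is a literal.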
Semantic Web Terminology
- Nodes: The Subjects and Objects in a Graph
- Arcs: The Predicates in a Graph
- Domains and Ranges: Constraints on Nodes
  - Domain: What things can be subjects
  - Range: What things (or strings) can be objects
- Literals: Values as strings rather than things
- Named Graphs: Graphs with URIs treated as nodes.
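
A minimal sketch of domains and ranges, treating them as checks on a statement in the way this slide describes. The `ex:` names, the `SCHEMA` and `TYPES` tables, and the `conforms` helper are all hypothetical. Note that RDFS reasoners typically use domain and range to infer the class of a node rather than to reject a statement, so this shows only one possible reading.

```python
# Hypothetical property declarations: each property names the class its
# subjects belong to (domain) and the class of its objects (range).
SCHEMA = {
    "ex:author": {"domain": "ex:Book", "range": "ex:Person"},
    "ex:title": {"domain": "ex:Book", "range": "literal"},
}

# Hypothetical class assignments for the known resources.
TYPES = {"ex:book1": "ex:Book", "ex:person1": "ex:Person"}


def class_of(node):
    """Known resources have a class; anything else is treated as a literal."""
    return TYPES.get(node, "literal")


def conforms(subject, predicate, obj):
    """Check one statement against the domain and range of its property."""
    rule = SCHEMA[predicate]
    return (class_of(subject) == rule["domain"]
            and class_of(obj) == rule["range"])
```

Under these assumptions, `conforms("ex:book1", "ex:author", "ex:person1")` holds, while the same statement with a plain name string as its object does not, because the range of `ex:author` calls for a thing rather than a string.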
Linked Open Data
- Use URIs as names for things
- Use HTTP URIs so that people can look up those names.
- When someone looks up a URI, provide useful information.
- Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html
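
One way to picture "looking up" an HTTP URI is a request that asks the server for RDF instead of HTML. The sketch below only prepares such a request with Python's standard library and does not send it. The `build_lookup` helper is hypothetical, and the DBpedia URI and the `application/rdf+xml` media type are illustrative choices rather than anything prescribed by the slides.

```python
import urllib.request


def build_lookup(uri, media_type="application/rdf+xml"):
    """Prepare (but do not send) a request asking for RDF about `uri`."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


# Example URI chosen for illustration only.
request = build_lookup("http://dbpedia.org/resource/Tim_Berners-Lee")

# Passing `request` to urllib.request.urlopen would perform the lookup;
# the response body could then be parsed for links to further URIs.
```

The `Accept` header is what lets the same URI serve a web page to a browser and structured data to a program.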
Data in the Cloud

- Hubs in the May 2008 Version:
  - FOAF
  - DBpedia
  - Geonames
  - MusicBrainz
- Myriad Sources coming online:
  - Thomson Reuters
  - New York Times
  - British Broadcasting Corporation
  - Government Data (UK, US and more)
  - Google and Facebook
  - More and More Library, Archive and Museum Data
DBpedia
- Structured Wikipedia Data
- Genres, Influences, External Links
- Multi-lingual / Multi-script labels
- Rich Semantics
- Many linkages to other datasets

DBpedia Model
- Partial basis in data entry conventions
- Infoboxes and Infobox Templates
- Metadata Entry Format
- Partial source of Ontology
  - Class Structure
  - Vocabulary Design
DBpedia
- 3.4 Million “things” described
- Ontology based on “infoboxes”
  - 1.5 million things classified
  - http://wiki.dbpedia.org/Ontology
- Approx. 50,000 “Properties”
  - Approx. 1,200 defined in ontology
What *things* are in our data???
…Library data is extremely complicated
Library Metadata
- Rich stores of MARC, MODS, &c.
- Robust Controlled Vocabularies
  - Subject Heading lists
  - Code lists
  - Thesauri
- Emerging data model in FR*
Bibliographic Vocabs

- Bibliographic Ontology (Bibo)
  - Zotero, Omeka, EPrints and Others
- FRBR – unofficial
  - And now Official (Thank you IFLA!)
- ISBD
- Resource Description and Access (RDA)

Linked Library [Archive, Museum] Data
- LIBRIS (Swedish Union Catalog)
- Library of Congress (LCSH, OSI)
- German National Library
- Hungarian National Library
- British Library
- Europeana
- Archives Hub & LOCAH

Library Authority Data
“Include links to other URIs, so that they can discover more things.”
Short of providing and linking to URIs, this *is* authority data.
This is what our authority files are for.
Library Controlled Vocabularies: Benefits
- Reputation: Trusted Tradition
- Mature: Time-tested and carefully developed
- General & Comprehensive: Cover large knowledge spaces

SKOS
- Simple Knowledge Organization System
- Properties and Classes for describing Controlled Vocabulary
- Heavily used in Linked Library Data
  - id.loc.gov
  - Virtual International Authority File (VIAF)
[Figure labels: skos:primaryTopic, bibo:book, skos:subject]
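
As a rough illustration of how SKOS describes a controlled vocabulary, the sketch below records two concepts with preferred labels, an alternate label, and a broader-term link. The `skos:prefLabel`, `skos:altLabel`, and `skos:broader` properties are part of SKOS; the `ex:` concept identifiers, the labels, and the helper functions are hypothetical, and real identifiers would come from a service such as id.loc.gov.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical concepts, written as (subject, property, value) statements.
triples = [
    ("ex:semantic-web", SKOS + "prefLabel", "Semantic Web"),
    ("ex:semantic-web", SKOS + "altLabel", "Web of Data"),
    ("ex:semantic-web", SKOS + "broader", "ex:world-wide-web"),
    ("ex:world-wide-web", SKOS + "prefLabel", "World Wide Web"),
]


def values(subject, prop):
    """Return the values of one SKOS property for one concept."""
    return [o for s, p, o in triples if s == subject and p == SKOS + prop]


def broader_labels(concept):
    """Follow skos:broader links and return the parents' preferred labels."""
    labels = []
    for parent in values(concept, "broader"):
        labels.extend(values(parent, "prefLabel"))
    return labels
```

Here the broader term is itself a concept with its own identifier, which is what allows a heading to be linked to rather than merely transcribed.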
Other Vocabularies
- Thesaurus for Economics
- French Subject Headings
- Swedish Subject Headings
- IconClass (not on web yet)
- OCLC Terminology Services
- Dewey Decimal Classification
- Virtual International Authority File
- Metadata Authority Description Schema (MADS)
Resource Description and Access
- Current focus on MARC
  - Much criticism
  - Within MARC, not a tremendous change
  - Different problems outside of MARC
- Possible focus outside of MARC
  - RDA as realization of FRBR
  - RDA as Metadata Vocabularies
  - RDA as related to Bibo
RDA as Metadata Vocabularies
- RDA elements, roles and vocabularies have been provisionally registered
- IFLA FRBRer and ISBD elements and vocabularies have been officially registered
- Discussions about long term maintenance of both RDA and the vocabularies
- Effort to create multi-language RDA Vocabularies
Slide Adapted from Diane Hillmann
Metadata Registries
- Formerly NSDL Registry
  - Now “Open Metadata Registry”
  - Managing Vocabularies
  - Providing Vocabulary Services
- RDA – Now adding translations
- IFLA Work
  - FRBR, FRAD, FRSAD, ISBD
RDA as realization of FRBR
- What will this look like?
- Probably *won’t* be stored in MARC
- Overly constrained by FRBR?
  - Properties have FRBR domains & ranges
  - Unofficial “Generalized” properties
- Non-FRBR metadata
  - Similar to DCMI’s range constraints…

Support Free Range Metadata!
Photo Credit: http://www.flickr.com/photos/ciwf/3217378769/
BIBO and RDAVocab
- Open question re: alignment
- Simplified view of Bib Data is useful
  - Interlinking with more general data
  - Interlinking with non-library domain data
- FRBR as internal model for library domain
  - Examples

Why Does This Matter?
Our descriptions no longer stand alone!
Connect our data with the rest of the Web
Allow others to reuse more easily
- FOAF, Geonames
- DBpedia
- MusicBrainz
- New York Times, Thomson Reuters
- Government Data - data.gov
- British Broadcasting Corporation
- Other Library, Archive and Museum Data
Conclusions

- Distributed bibliographic control environment
  - Linking Data
  - Focus on identification over description

“In short, by treating values as non-literal resources and assigning URIs to them we give ourselves (and others) the hooks on which to hang further descriptions.” - Andy Powell
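
The difference between a literal value and a URI value can be sketched as follows. All identifiers here (`ex:book1`, `ex:person1`, the prefixed property names, and the helper functions) are assumptions for illustration; the point is only that a URI value can be the subject of further statements, while a string cannot.

```python
# The creator recorded as a string: nothing more can be said about it.
literal_version = [
    ("ex:book1", "dc:creator", "Berners-Lee, Tim"),
]

# The creator recorded as a resource with its own (hypothetical) URI.
linked_version = [
    ("ex:book1", "dc:creator", "ex:person1"),
    ("ex:person1", "foaf:name", "Tim Berners-Lee"),
    ("ex:person1", "owl:sameAs", "http://dbpedia.org/resource/Tim_Berners-Lee"),
]


def describe(graph, node):
    """Return every (property, value) pair stated about `node`."""
    return [(p, o) for s, p, o in graph if s == node]


def follow_creator(graph, book):
    """Collect whatever the graph says about the book's creator value."""
    found = []
    for prop, value in describe(graph, book):
        if prop == "dc:creator":
            found.extend(describe(graph, value))
    return found
```

With the literal version, `follow_creator` finds nothing beyond the string itself; with the linked version, it reaches the person's name and a link out to another dataset.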
Future Work
- “Records” in Linked Library Data
- Vocabulary Alignment and Interoperability
  - DCMI planning in this space
- General Metadata Interoperability
  - Application Profiles?
- Archival Data for *context* - (EAC-CPF)
W3C Linked Library Data Incubator
- Collecting, Curating and Clustering over 50 Use Cases
- Mining use cases for functional requirements and design patterns
- Recommendations to W3C
  - Should lead to Working Groups
http://www.w3.org/2005/Incubator/lld/
Other Activities
- ALCTS/LITA Linked Library Data IG
- IFLA Semantic Web IG
  - https://wiki.d-nb.de/x/vA10Ag
- Open Knowledge Foundation
  - http://okfn.org/
- CKAN Linked Library Data Group:
  - http://ckan.net/group/lld
Thanks!
Questions?
corey.harper@nyu.edu
212.998.2479
@chrpr