Open Government Data - SWIB

Enrichment of Library Authority Files by Linked Open Data Sources
Gerd Zechmeister
Semantic Web Company – http://www.semantic-web.at
Presentation agenda
1. About us
2. LOD2 Project
3. Demonstration Scenario
4. Process & Results
5. Summary & Outlook
© Semantic Web Company – http://www.semantic-web.at/
About us
• Based in Vienna (privately held)
• 20 specialists from several fields
• Focus: Semantic (web) technologies & search applications
– 1st project based on semantic technologies in 2001
– Foundation of Semantic Web School in 2004 → Semantic Web Company GmbH since 2008
– PoolParty development started in 2007, on the market since 2009
PP Thesaurus Manager
1. Each concept in one or many concept schemes
2. Each concept has one URI
3. Each concept has one or more labels
4. (Poly-)hierarchical and non-hierarchical relations
5. Matching between concepts from various sources
SKOSsy
• Select DBpedia categories
• Choose extraction depth, data to extract and format (TTL, TriG etc.)
• Extract it and import it into PoolParty as a seed thesaurus
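The extraction step can be pictured with a small sketch. It is not how SKOSsy itself is implemented; it only assumes that DBpedia links categories with skos:broader and labels them with skos:prefLabel. The helper name `build_category_query` and the example category URI are assumptions made for illustration. The function builds a SPARQL CONSTRUCT query that collects all sub-categories of one category down to a chosen depth.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def build_category_query(category_uri: str, depth: int) -> str:
    """Return a SPARQL CONSTRUCT query for sub-categories up to `depth` levels.

    Hypothetical helper: each level is written as a fixed-length sequence
    path (skos:broader/skos:broader/...) and the levels are joined by UNION.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    branches = []
    for level in range(1, depth + 1):
        path = "/".join(["skos:broader"] * level)
        branches.append(f"{{ ?concept {path} <{category_uri}> }}")
    where = "\n    UNION\n    ".join(branches)
    return (
        f"PREFIX skos: <{SKOS}>\n"
        "CONSTRUCT {\n"
        "    ?concept a skos:Concept ;\n"
        "             skos:prefLabel ?label .\n"
        "}\n"
        "WHERE {\n"
        f"    {where}\n"
        "    ?concept skos:prefLabel ?label .\n"
        "}\n"
    )


# Assumed example category; any DBpedia category URI could be used instead.
query = build_category_query(
    "http://de.dbpedia.org/resource/Kategorie:Wirtschaft", depth=2
)
print(query)
```

The resulting query text could be sent to a DBpedia SPARQL endpoint and the returned graph saved as Turtle for import; both steps are left out here because they depend on the endpoint and tooling in use.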
LOD2 Project
• FP7 project (2010-2014)
• 15 partners (technology researchers, companies and service providers) from 11 European countries plus 1 associated partner from Korea
• Coordinated by the AKSW research group at the University of Leipzig
LOD Life-Cycle Management
• Extraction of RDF from text, XML and SQL
• Querying and exploration using SPARQL
• Authoring of Linked Data using a Semantic Wiki
• Semi-automatic link discovery between Linked Data sources
• Knowledge-base enrichment and repair
Demonstration Scenario
• Alignment
– Example data vs LOD resources in SKOS
– Identification of matching concepts
• Enrichment
– Addition of matches to the example data dump
Demonstration Scenario
• Applied tools and frameworks
Tool/Framework | Function
PoolParty | Using SKOS thesauri as graph/SPARQL endpoint
Virtuoso | Creating example data as graph/SPARQL endpoint
SILK | Comparing data to detect matching concepts
SKOSsy | Extracting categories from DBpedia to import them as a thesaurus into PoolParty
Demonstration Scenario
• Example Data
– Schlagwortnormdatei (SWD = keyword authority file) from DNB data dump
– 166,414 concepts in German with alignments to LCSH, RAMEAU etc.
– Expressed in SKOS (hierarchical and associative relations)
Demonstration Scenario
• SKOS vocabularies for alignment
– STW Thesaurus for Economics (Standard-Thesaurus Wirtschaft)
• 6520 concepts with English/German prefLabel
– European Union Thesaurus (EUROVOC)
• 6797 concepts with multilingual prefLabel
– Concepts extracted from DBpedia via SKOSsy: "Economy"
• 13294 concepts in German
Process & Results: Preparatory steps
1. Download
– SWD data dump from DNB server
2. Evaluation
– SKOS compatibility
3. Transformation
– SWD data as SPARQL endpoint
4. Vocabulary selection
– Focus on economy vocabularies
Process & Results: Alignment
• Specification in SILK Workbench
– Define data sources: SWD & EUROVOC
– Define tasks: compare all skos:prefLabels and deliver all matching links
– Initiate process and create output file
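The comparison task above can be sketched in plain Python. This is only a stand-in for illustration: SILK is configured through its own link specification, and the functions `normalise` and `match_pref_labels`, as well as the example.org entry, are hypothetical. The sketch indexes the preferred labels of the target vocabulary and reports every source concept whose label is equal after simple normalisation.

```python
from collections import defaultdict


def normalise(label: str) -> str:
    """Case-fold and collapse whitespace so trivially different labels compare equal."""
    return " ".join(label.casefold().split())


def match_pref_labels(source: dict[str, str], target: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source_uri, target_uri) pairs whose preferred labels are equal."""
    # Index the target vocabulary by normalised label for constant-time lookup.
    index = defaultdict(list)
    for uri, label in target.items():
        index[normalise(label)].append(uri)
    links = []
    for uri, label in source.items():
        for target_uri in index.get(normalise(label), []):
            links.append((uri, target_uri))
    return links


# URI -> skos:prefLabel; the example.org entry is a made-up non-matching concept.
swd = {"http://d-nb.info/gnd/4000107-6": "Abfallwirtschaft"}
eurovoc = {
    "http://eurovoc.europa.eu/1158": "Abfallwirtschaft",
    "http://example.org/vocab/1": "Energiepolitik",
}

for source_uri, target_uri in match_pref_labels(swd, eurovoc):
    print(source_uri, "matches", target_uri)
```

Exact label equality is the simplest possible comparison. It finds no near matches such as spelling variants, which is one reason a dedicated link discovery tool is used in practice.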
SILK Workbench
Alignment SWD vs EUROVOC
Process & Results: Alignment
Matching links between SWD (166,414 concepts) and:
– STW (6520 concepts): 3440
– EUROVOC (6797 concepts): 2169
– DBpedia "Wirtschaft" (13294 concepts): 1318
Process & Results: Enrichment
Upload of skos:exactMatch links to the SWD graph in Virtuoso
Process & Results: Enrichment
Subject | Predicate | Object
<http://d-nb.info/gnd/4000107-6> | skos:exactMatch | <http://de.dbpedia.org/resource/Abfallwirtschaft>
<http://d-nb.info/gnd/4000107-6> | skos:exactMatch | <http://eurovoc.europa.eu/1158>
<http://d-nb.info/gnd/4000107-6> | skos:exactMatch | <http://zbw.eu/stw/descriptor/13325-0>
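As a rough illustration of how such links could be prepared for upload, the sketch below serialises the three example matches as N-Triples lines and wraps them in a SPARQL 1.1 Update INSERT DATA request. The function names and the graph URI are assumptions; the slides do not say how the upload to Virtuoso was actually performed.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def to_ntriples(links):
    """Serialise (subject_uri, object_uri) pairs as skos:exactMatch N-Triples lines."""
    return [f"<{s}> <{SKOS_EXACT_MATCH}> <{o}> ." for s, o in links]


def to_insert_data(links, graph_uri):
    """Wrap the triples in an INSERT DATA request that targets one named graph."""
    body = "\n".join(f"    {line}" for line in to_ntriples(links))
    return f"INSERT DATA {{\n  GRAPH <{graph_uri}> {{\n{body}\n  }}\n}}"


# The three example links for the SWD concept "Abfallwirtschaft".
SUBJECT = "http://d-nb.info/gnd/4000107-6"
links = [
    (SUBJECT, "http://de.dbpedia.org/resource/Abfallwirtschaft"),
    (SUBJECT, "http://eurovoc.europa.eu/1158"),
    (SUBJECT, "http://zbw.eu/stw/descriptor/13325-0"),
]

# Hypothetical graph name; the real SWD graph URI is not given in the slides.
GRAPH = "http://example.org/graph/swd"

print(to_insert_data(links, GRAPH))
```

The N-Triples lines on their own could equally be written to a file and bulk-loaded, which tends to be more practical for several thousand links.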
[Diagram: SWD linked to DBpedia, EUROVOC and STW]
Process & Results: Enrichment
<skos:Concept rdf:about="http://d-nb.info/gnd/4000107-6">
  <skos:definition xml:lang="de">Weiter als im Gabler definiert, auch für öffentliche Abfallwirtschaft</skos:definition>
  <dnb:hasCoordinatedConcept-of>
    <dnb:CoordinatedConcept>
      <dnb:coordination-of rdf:resource="http://d-nb.info/ddc-sg/360"/>
      <dnb:coordination-of rdf:resource="http://d-nb.info/gnd/4000107-6"/>
      <dnb:det2 rdf:resource="http://d-nb.info/ddc/class/363.728"/>
    </dnb:CoordinatedConcept>
  </dnb:hasCoordinatedConcept-of>
  <skos:related rdf:resource="http://d-nb.info/gnd/4000100-3"/>
  <skos:related rdf:resource="http://d-nb.info/gnd/4076573-8"/>
  <dcterms:identifier>(DE-588)040001075</dcterms:identifier>
  <dcterms:identifier>(DE-588c)4000107-6</dcterms:identifier>
  <skos:broader rdf:resource="http://d-nb.info/gnd/4220414-8"/>
  <skos:prefLabel xml:lang="de">Abfallwirtschaft</skos:prefLabel>
  <skos:exactMatch rdf:resource="http://de.dbpedia.org/resource/Abfallwirtschaft"/>
  <skos:exactMatch rdf:resource="http://eurovoc.europa.eu/1158"/>
  <skos:exactMatch rdf:resource="http://zbw.eu/stw/descriptor/13325-0"/>
</skos:Concept>
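To show how a consumer might read the added links back out of such a record, here is a minimal sketch using Python's standard XML parser. The record is a shortened copy of the example, wrapped in an rdf:RDF root with the standard RDF and SKOS namespace declarations so that it parses on its own; the wrapper and the function name `exact_matches` are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Shortened version of the enriched record, made self-contained for parsing.
record = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://d-nb.info/gnd/4000107-6">
    <skos:prefLabel xml:lang="de">Abfallwirtschaft</skos:prefLabel>
    <skos:exactMatch rdf:resource="http://de.dbpedia.org/resource/Abfallwirtschaft"/>
    <skos:exactMatch rdf:resource="http://eurovoc.europa.eu/1158"/>
    <skos:exactMatch rdf:resource="http://zbw.eu/stw/descriptor/13325-0"/>
  </skos:Concept>
</rdf:RDF>"""


def exact_matches(rdf_xml: str) -> dict[str, list[str]]:
    """Map each concept URI to the list of its skos:exactMatch target URIs."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        targets = [
            match.get(f"{{{RDF}}}resource")
            for match in concept.findall(f"{{{SKOS}}}exactMatch")
        ]
        result[uri] = targets
    return result


for concept_uri, targets in exact_matches(record).items():
    print(concept_uri, "->", targets)
```

Treating RDF/XML as plain XML only works for records written in this regular shape; a full RDF parser would be the more robust choice for arbitrary input.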
Summary & Outlook
• Playground for future scenarios
– Linked Open Library Data
– LOD2 technology stack components
• Further applications
– Executing tasks for regular updates
– Link exchange with LOD providers
– Integration of data and cross-media (e.g. geo-references, images, AV files)
– Expansion of authority files for cataloguing (e.g. multilingual searches)
Get in contact!
Gerd Zechmeister
Research & Development Manager
g.zechmeister@semantic-web.at
Semantic Web Company GmbH
Mariahilfer Strasse 70/8
1070 Vienna - Austria
http://www.semantic-web.at/
http://poolparty.biz/
http://twitter.com/semwebcompany