Lopez - The Center for Geographic Analysis, Harvard University

Linking Open Data with Location:
Gazetteers and the Semantic Web
Xavier Lopez, Director, Product Management
Overview
• Linked Open Data (LOD) Concepts
• Role of Gazetteers in LOD
• Interconnected Web of Content
• Towards Geospatial Knowledge Management
Linked Data
“A method of publishing structured data,
so that it can be interlinked and become
more useful”
Wikipedia
Linked Open Data – The Opportunity
• Manage relationships for massive
collections of structured and
unstructured data
• Flexible and extensible data model
supports powerful search and end-user discovery of related content
• Enable users to define their social
networks or communities based on
common interests, subjects, image
scenes, locations, etc.
• Rich platform for data integration,
data repurposing, and better quality
control and classification
Semantic Aggregation & Navigation of Data
• Tactical, non-invasive, iterative
solution for strategic modernization
Simple Linked Data Architecture
[Diagram components: User; Query & results; Domain & Task Ontologies; Data Ontologies; (Reasoning/Inferencing) Engine; Data Sources]
Linked Open Data Cloud (2008)
Linked Open Data Cloud (2010)
Resource Description Framework (RDF)
RDF is a general framework for
describing a Web site's metadata, or the
information about the information on
the site. It provides interoperability
between applications that exchange
machine-understandable information
on the Web.
W3C
Modeling: A FOAF Example
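As a rough illustration of FOAF modeling, the sketch below describes two people as RDF triples (subject, predicate, object) and prints them as N-Triples. The FOAF and RDF namespace URIs are the published ones; the people, the example.org URIs, and the helper `to_ntriples` are assumptions made only for this example.

```python
# Minimal sketch: a FOAF-style description held as (subject, predicate, object) triples.
# The people and example.org URIs are hypothetical.
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

alice = "http://example.org/people/alice"
bob = "http://example.org/people/bob"

triples = [
    (alice, RDF_TYPE, FOAF + "Person"),
    (alice, FOAF + "name", '"Alice"'),
    (alice, FOAF + "knows", bob),
    (bob, RDF_TYPE, FOAF + "Person"),
    (bob, FOAF + "name", '"Bob"'),
]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines; literals are stored already quoted."""
    lines = []
    for s, p, o in triples:
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Because every statement has the same three-part shape, adding a new property (for example a home location) means adding triples, not altering a table definition.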
Gazetteers and Linked Open Data Services
• Provide common terms (place names) to link across
existing spatial data resources
• Enable consolidated view across the map layers
• Reconcile differences in data semantics so that they
can all “talk” and interoperate
• Resolve semantic discrepancies across databases,
gazetteers, and applications
• Integrate the full breadth of the enterprise content continuum
(structured, spatial, email, documents, web services)
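The idea of using place names as the common term between data resources can be sketched as follows. This is only an illustration under stated assumptions: the gazetteer entries, the `place:` identifiers, the two record sets, and the functions `resolve` and `link_by_place` are all hypothetical.

```python
# Hypothetical gazetteer: several name variants map to one canonical place identifier.
GAZETTEER = {
    "alpine": "place:alpine-ca",
    "alpine, ca": "place:alpine-ca",
    "alpine, california": "place:alpine-ca",
    "san diego": "place:san-diego-ca",
    "san diego, ca": "place:san-diego-ca",
}


def resolve(name):
    """Return the canonical place ID for a name, or None if the name is unknown."""
    return GAZETTEER.get(name.strip().lower())


# Two hypothetical sources that spell the same place differently.
hospitals = [
    {"name": "Hospital A", "city": "Alpine, CA"},
    {"name": "Hospital B", "city": "San Diego"},
]
road_closures = [{"road": "Road X", "near": "Alpine, California"}]


def link_by_place(left, left_key, right, right_key):
    """Join two record lists on the gazetteer ID of their place names."""
    index = {}
    for rec in right:
        place = resolve(rec[right_key])
        if place is not None:
            index.setdefault(place, []).append(rec)
    linked = []
    for rec in left:
        place = resolve(rec[left_key])
        for match in index.get(place, []):
            linked.append((place, rec, match))
    return linked


links = link_by_place(hospitals, "city", road_closures, "near")
for place, hospital, closure in links:
    print(place, hospital["name"], closure["road"])
```

A real gazetteer would also carry coordinates, feature types, and alternate-language names; the sketch keeps only the name-to-identifier step that makes the link possible.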
Modeling: Enterprise Integration
• Ordnance Survey maintains
definitive mapping data of Great
Britain, the world’s largest and
most detailed Geo DB
• Semantic Web is used to
integrate different, semantically
diverse sources of data
• General ontologies already developed to bridge differences in terminology
• The data is queried efficiently via the ontology or RDF
• Advantages include efficient data integration, data repurposing, and better
quality control and classification
Source: http://www.w3.org/2001/sw/sweo/public/UseCases/
From Linked Data
to Knowledge Management
Knowledge Management Conditions
• Filtering search queries with “context”
• Discovery of data relationships across…
  • Structured data (database, apps, web services, RSS schemas)
  • Unstructured data (email, office documents)
• Queries are not defined in advance
• Schemas are continuously evolving
• Support machine-to-machine interaction
• Location can be a common link, along with names, concepts,
synonyms
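One way to picture queries that are not defined in advance over an evolving schema is pattern matching over triples. The sketch below is an assumption-laden illustration, not a description of any product: the identifiers, predicates, and the `match` function are hypothetical.

```python
# Hypothetical triples drawn from structured and unstructured sources.
TRIPLES = [
    ("doc:report-17", "mentions", "place:alpine-ca"),
    ("db:hospital-a", "locatedIn", "place:alpine-ca"),
    ("db:hospital-a", "provides", "service:burn-treatment"),
    ("mail:msg-204", "mentions", "db:hospital-a"),
]


def match(pattern, triples=TRIPLES):
    """Return variable bindings for one (s, p, o) pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# An ad hoc question posed after loading: what refers to Alpine, and how?
print(match(("?s", "?p", "place:alpine-ca")))
```

New predicates can be added to `TRIPLES` at any time without changing `match`, which is the sense in which the schema can keep evolving; here the place identifier acts as the common link between a document and a database record.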
What Can Linked Data Enable?
Mapping & Geotagging
Analysis of Complex Relations
Social Network Relations
Rule-based Reasoning
Ontology-driven Map Apps
[Diagram: National Map Core Datasets; RDF & OWL Data; Application Ontologies]
• National Map core datasets (Geographic Names, Spatial Data, Raster Data): Simple Features, GeoRaster, Topology, Networks, Gazetteers, …
• RDF & OWL data: Data Integration, National Map schemas, Geographic names, Temporal, Naïve Geography, …
• Application ontologies: Situational Awareness, Theater, Targeting
A “Simple” Knowledge Query
“Which hospitals within 30 mins of Alpine, CA provide
burn treatment?”
• We need to associate a number of factors, including
hospital type and facilities – its accessibility after a
disaster – and the staff available
• The query needs to be structured based on
Concepts & Relationships that can be retrieved and
then customized for the specific query.
• Using this approach, a list of the hospitals
capable of dealing with a large number of burn cases
is returned to the user, together with the information
associated with the query.
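The approach in the bullets above can be sketched as a query over concepts and relationships. Everything in the example is invented for illustration: the hospital identifiers, the predicates (`hasUnit`, `driveMinutesFromAlpine`, `accessible`), the drive times, and the function names.

```python
# Hypothetical facts as (subject, predicate, object) triples.
FACTS = {
    ("hospital:a", "type", "Hospital"),
    ("hospital:a", "hasUnit", "BurnsUnit"),
    ("hospital:a", "driveMinutesFromAlpine", 25),
    ("hospital:a", "accessible", True),
    ("hospital:b", "type", "Hospital"),
    ("hospital:b", "hasUnit", "BurnsUnit"),
    ("hospital:b", "driveMinutesFromAlpine", 50),
    ("hospital:b", "accessible", True),
    ("hospital:c", "type", "Hospital"),
    ("hospital:c", "hasUnit", "Maternity"),
    ("hospital:c", "driveMinutesFromAlpine", 10),
    ("hospital:c", "accessible", True),
}


def values(subject, predicate):
    """All objects asserted for a given subject and predicate."""
    return {o for s, p, o in FACTS if s == subject and p == predicate}


def hospitals_with_burn_care(max_minutes):
    """Hospitals that have a burns unit, are accessible, and are within the time limit."""
    hospitals = sorted(s for s, p, o in FACTS if p == "type" and o == "Hospital")
    result = []
    for h in hospitals:
        has_unit = "BurnsUnit" in values(h, "hasUnit")
        reachable = True in values(h, "accessible")
        in_time = any(m <= max_minutes for m in values(h, "driveMinutesFromAlpine"))
        if has_unit and reachable and in_time:
            result.append(h)
    return result


print(hospitals_with_burn_care(30))
```

Each condition in the question (facility type, treatment capability, accessibility, travel time) becomes one relationship to check, so the same facts can be reused when the question changes, for example by passing a different time limit.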
“Typical” Analyst Query:
“Which hospitals within 30 mins of Alpine, CA provide burn treatment?”
[Diagram: annotations on the terms of the query]
• Feature reference
• Type?
• What does this mean?
• Buffer or proximity?
• Driving or flying?
• Road closures?
• Definition?
• Where is this?
• Centroid or outline?
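To show why “within 30 mins” is ambiguous, the sketch below takes the simple buffer reading: convert the time limit to a radius using an assumed average speed, then keep candidates whose great-circle (haversine) distance falls inside it. The coordinates, speeds, and hospital identifiers are hypothetical and do not represent real locations; a road-network reading with closures would need routing data and would generally give different answers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def buffer_radius_km(minutes, speed_kmh):
    """Radius reachable in the given time at a constant assumed speed."""
    return speed_kmh * minutes / 60.0


def within_buffer(origin, candidates, minutes, speed_kmh):
    """Names of candidates inside the straight-line buffer around the origin point."""
    radius = buffer_radius_km(minutes, speed_kmh)
    return [
        name
        for name, (lat, lon) in candidates.items()
        if haversine_km(origin[0], origin[1], lat, lon) <= radius
    ]


# Hypothetical point locations (a centroid reading of "where is this?").
origin = (32.80, -116.80)
candidates = {
    "hospital:a": (32.80, -116.60),
    "hospital:b": (32.80, -115.80),
}

print(within_buffer(origin, candidates, minutes=30, speed_kmh=60))   # assumed driving speed
print(within_buffer(origin, candidates, minutes=30, speed_kmh=200))  # assumed flying speed
```

The same 30 minutes selects a different set of hospitals depending on the assumed mode of travel, which is the kind of context an ontology has to make explicit.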
Ontologies for Problem Solving
[Diagram: linked ontology concepts: Specialists, Burns Unit, Weather, Treatment, Emergency Team, Duty Rota, Burns, Illness, Flood, Hazards, Skin Graft, Location, Vehicles, Hospital, Route, Type, A&E, Roads, Helipad, Beds, Obstructions, Type, Size]
Oracle 11g RDF/OWL Graph Data Management
• Storage & Loading
  • Native W3C RDF graph data store
  • Fast bulk, batch & incremental load
• Query
  • SQL: SEM_MATCH graph pattern query
  • SPARQL: supported via Jena plug-in
• Reasoning
  • RDF, OWL Prime, RDF++ semantic rules
  • Forward chaining inference model
  • User defined rule base
• Scalability
  • Scales to billions of triples
  • Partitioning, RAC, Adv. Compression
• Standards & Interoperability
  • Aligned with W3C specifications
  • Supported by leading semantic tools
Structured DBMS, Unstructured, Spatial, RSS, email, Documents
Conclusions
• Key semantic technologies are mature
• Semantic technologies are key enablers for
enterprise and Web
• Reuse of existing authoritative gazetteers is
needed
• Model the real world rather than data artifacts
Find out more...
oracle.com/database/spatial.html
Q&A
oracle.com/technology/products/spatial
oracle.com/technology/products/spatial/htdocs/pro_oracle_spatial.html
Information Explosion
• Structured data stores are growing in size
• Amount of semi-structured data is
expanding (XML, RDF, Semantics, Spatial)
• Metric data, beacons, sensors supplying
mega volumes
• Unstructured data is gathered at a
staggering pace (email, documents,
messages, streams, feeds)
Modeling Domain Information
Simple Transitive Reasoning
Asserted Facts
:California :partOf :USA
:USA :partOf :NorthAmerica
:partOf rdf:type owl:TransitiveProperty
Derived Facts
:California :partOf :NorthAmerica
Query: SELECT ?x ?y
FROM …
WHERE { ?x :partOf ?y }
Result:
?x            ?y
:California   :USA
:California   :NorthAmerica
:USA          :NorthAmerica
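The transitive inference on this slide can be sketched with a small forward-chaining loop. This is a simplified illustration, not the inference engine of any particular product: it handles only the transitivity rule, and the function name `forward_chain` is an assumption.

```python
# Asserted facts from the slide, as (subject, predicate, object) triples.
asserted = {
    (":California", ":partOf", ":USA"),
    (":USA", ":partOf", ":NorthAmerica"),
    (":partOf", "rdf:type", "owl:TransitiveProperty"),
}


def forward_chain(facts):
    """Apply the transitivity rule repeatedly until no new fact is derived."""
    facts = set(facts)
    transitive = {
        s for s, p, o in facts
        if p == "rdf:type" and o == "owl:TransitiveProperty"
    }
    changed = True
    while changed:
        changed = False
        for a, p1, b in list(facts):
            if p1 not in transitive:
                continue
            for b2, p2, c in list(facts):
                # (a p b) and (b p c) entail (a p c) when p is transitive.
                if p2 == p1 and b2 == b and (a, p1, c) not in facts:
                    facts.add((a, p1, c))
                    changed = True
    return facts


entailed = forward_chain(asserted)
derived = entailed - asserted

# Equivalent of: SELECT ?x ?y WHERE { ?x :partOf ?y }
rows = sorted((s, o) for s, p, o in entailed if p == ":partOf")
for x, y in rows:
    print(x, y)
```

The query returns three rows even though only two `:partOf` facts were asserted, because the derived fact is materialized before the query runs.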
Integrated Bioinformatics Networks
Source: Siderean Software
Text/Spatial Mining Workflow
[Diagram: workflow stages]
• Sources: Web Resources; News, Email, RSS; Content Mgmt. Systems; Spatial Data
• Information Extraction: Categorization, Feature/term Extraction
• Processed Document Collection (RDF/OWL)
• Ontology Engineering / Modeling Process: OWL Ontologies
• Domain Specific Knowledge Base
• Explore (Analyst): Browsing, Presentation, Reporting, Visualization, Query