Next Generation Semantic Data Environments Deborah L. McGuinness

Next Generation Semantic Data
Environments
(or Linked Data, Semantics, and Standards in Scientific Applications)
Deborah L. McGuinness
Tetherless World Senior Constellation Chair
Professor of Computer and Cognitive Science
Web Science Research Center Director
Rensselaer Polytechnic Institute, Troy, NY
With thanks to the extended RPI Tetherless World Team
OMG Semantics : From Research to Reality: Implementing the Semantic Web
March 20, 2013 Reston, VA
Trends:
More Data & More Diversity
• More data
– More open data
– More authoritative data
– More interest in and generation of metadata
– More enthusiast generated / maintained data
– More vocabularies, taxonomies, ontologies
• More diversity
– Broader human participation
• Trained scientists, citizens, enthusiasts, indigenous, …
– More locations – mobile as well as global
– More sensors – human, robots, implants, …
– Real time feeds
– Social sources – Twitter, Facebook, …
Increasing Requirements
• Data and data environments should:
– Support usability – not just by original authors
– Include (usable) documentation – metadata concerning collection
methods, sources, recency, assumptions, …
– Provide accessibility with transparent access policies
– Include schema / ontology information – including mapping information
used in integration, along with rationales
– Support queries (with usable and understandable interfaces)
– Document verification and curation methods, including access to tools
– Support AND encourage interactions; users should be able to comment,
question, contribute, discuss, …
Path moves from
Portal -> Virtual Observatory -> Online Community
Next: examples, foundations, and discussion
Semantic Environmental
and Ecological Monitoring
• Enable/Empower citizens & scientists to explore pollution
sites, facilities, regulations, and health impacts along with
provenance
• Demonstrates semantic monitoring possibilities
• Extend to endangered species and resource mgr issues
• Explanations and Provenance available
http://was.tw.rpi.edu/swqp/map.html and
http://aquarius.tw.rpi.edu/projects/semantaqua
1. Map view of analyzed results
2. Explanation of pollution
3. Possible health effect of contaminant (from EPA)
4. Filtering by facet to select type of data
5. Link for reporting problems
6. Extended with input from USGS, with population counts for birds & fish
Example Workflow (SemantAqua)
[Workflow diagram; stage labels: Publish, CSV2RDF4LOD Direct, visualize, derive, archive, CSV2RDF4LOD Enhance, Archive]
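The conversion stage of a workflow like this can be sketched in a few lines. This is a minimal illustration of turning CSV rows into RDF triples (N-Triples syntax), not the CSV2RDF4LOD tool itself; the base URI, the column names, and the sample rows are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical base URI, used only for illustration.
BASE = "http://example.org/water/"

def csv_to_ntriples(csv_text):
    """Turn each CSV row into N-Triples lines, one triple per non-empty cell."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # One subject URI per row (a naive "direct" mapping).
        subject = f"<{BASE}measurement/{row_number}>"
        for column, value in row.items():
            if value is None or value == "":
                continue  # skip empty cells rather than emit empty literals
            predicate = f"<{BASE}vocab/{column.strip().replace(' ', '_')}>"
            # Escape backslashes and quotes so the literal stays valid.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples

# Made-up sample rows; the second row has an empty "value" cell.
sample = "site,contaminant,value\nsite-A,arsenic,0.02\nsite-B,lead,\n"
ntriples = csv_to_ntriples(sample)
for line in ntriples:
    print(line)
```

An enhancement step would typically replace the plain literals with typed values and links to shared vocabularies; that part is omitted here.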
Reusable Ontologies
• Pollution ontology
describes the relationship
between a regulation
violation (a measurement),
a polluted thing, and a
polluted site
• Combined with other
ontologies (e.g. W3C Geo)
users can ask “Tell me all
of the polluted things
within 1 mile of my
location”
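The "within 1 mile of my location" question can be sketched without any semantic tooling, to show what the combined ontologies have to supply: a polluted flag and a latitude/longitude per thing. This is a minimal sketch under stated assumptions: the record layout, names, and coordinates are hypothetical, and a deployed system would presumably answer this with a query over RDF data rather than a Python list.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def polluted_things_near(things, lat, lon, radius_miles=1.0):
    """Names of things flagged as polluted within radius_miles of (lat, lon)."""
    return [
        t["name"]
        for t in things
        if t["polluted"]
        and haversine_miles(lat, lon, t["lat"], t["lon"]) <= radius_miles
    ]

# Hypothetical records; names and coordinates are invented for illustration.
things = [
    {"name": "well-1", "lat": 42.7300, "lon": -73.6800, "polluted": True},
    {"name": "stream-2", "lat": 42.8000, "lon": -73.6800, "polluted": True},
    {"name": "pond-3", "lat": 42.7310, "lon": -73.6790, "polluted": False},
]
nearby = polluted_things_near(things, 42.7284, -73.6918)
print(nearby)
```

Only well-1 is both polluted and inside the radius; stream-2 is several miles away and pond-3 is not flagged as polluted.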
Ontologies
• Water quality ontology
extends pollution to
describe water-related
pollution
• Further extended by
regulation ontologies
to provide “regulation
violation” inference
• Allows the reasoner to
match specific
regulations to
measurements that
violate them
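The matching of regulations to violating measurements can be illustrated with a plain rule check. This is only a sketch of the idea, not the reasoner or ontologies used in the project: the class names, regulation names, contaminants, and limits are all hypothetical, and the limits are not real regulatory values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Regulation:
    name: str
    contaminant: str
    limit: float  # maximum allowed value, in the same unit as the measurements

@dataclass(frozen=True)
class Measurement:
    site: str
    contaminant: str
    value: float

def find_violations(measurements, regulations):
    """Pair each measurement with every regulation whose limit it exceeds."""
    violations = []
    for m in measurements:
        for r in regulations:
            if m.contaminant == r.contaminant and m.value > r.limit:
                violations.append((m.site, r.name))
    return violations

# Hypothetical regulations with made-up limits for one contaminant.
regulations = [
    Regulation("example-federal-rule", "contaminant-x", 10.0),
    Regulation("example-state-rule", "contaminant-x", 5.0),
]
# Hypothetical measurements.
measurements = [
    Measurement("site-A", "contaminant-x", 7.5),
    Measurement("site-B", "contaminant-x", 2.0),
    Measurement("site-C", "contaminant-y", 99.0),
]
violations = find_violations(measurements, regulations)
print(violations)
```

The same measurement at site-A violates the stricter example rule but not the looser one, which is the kind of distinction separate regulation ontologies let a reasoner draw.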
Interface
Semantic Methodology and
Semantic Application Evolution
SemantAqua -> SemantEco -> DataOne
- modularizing, broadening, provenance, interaction
VSTO -> SESDI -> SPCDIS
- modularizing, provenance, broadening, interaction
Originally developed for Virtual Observatories (in solar
terrestrial), now in water quality, sea ice, volcanology,
mycology, …
McGuinness, Fox, West, Garcia, Cinquini, Benedict,
Middleton. The Virtual Solar-Terrestrial Observatory: A
Deployed Semantic Web Application Case Study for
Scientific Research. Proc. 19th Conf. on Innovative
Applications of Artificial Intelligence (IAAI-07),
http://www.vsto.org
Population Sciences Grid: Interventions,
Behaviors, and Policy
Extensible Mashups via Linked Data
• Diverse datasets from NIH
• Exploring interventions along with correlations with
behavior changes – in this case tobacco interventions
and smoking prevalence
• Accountable Mashups via Provenance
Award winning paper on multi-dimensional
analysis
An Example: Hawaii
Changes in cigarette use
viewed against policy changes
We link each state's year-by-year records to that
state across time, adding data for each
year.
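A minimal sketch of that linking step, assuming each input record is a (state, year, measure, value) tuple. The function name, measure names, and values are hypothetical placeholders, not real survey data.

```python
from collections import defaultdict

def link_by_state(records):
    """Group (state, year, measure, value) records into one timeline per state."""
    series = defaultdict(dict)
    for state, year, measure, value in records:
        # Each year under a state accumulates whatever measures exist for it.
        series[state].setdefault(year, {})[measure] = value
    # Sort years so each state's timeline reads in order.
    return {state: dict(sorted(years.items())) for state, years in series.items()}

# Placeholder records; the values are made up for illustration only.
records = [
    ("Hawaii", 2001, "smoking_prevalence", 2.0),
    ("Hawaii", 2000, "smoking_prevalence", 1.0),
    ("Hawaii", 2001, "policy_in_effect", True),
]
linked = link_by_state(records)
print(linked["Hawaii"])
```

Because every year hangs off the same state key, a new dataset for a given year can be added without touching the earlier ones.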
Ontology as API:
Adding Dimensions
This RDF creates this visual.
[Figures omitted; labels: graph, dataset, x axis, y axis]
Social Observatory – First Responder
effort (NIST funded)
Social Media use is on the rise. Every day, we write:
• 294 billion emails
• 2 million blog posts
• Over 40 Million Tweets*
First Responders, including Emergency Medical Personnel,
Firefighters, and Police Officers, have active online
communities on Social Media websites.
How can we leverage Social Media sites
… to gather requirements for active First Responders?
… to identify stakeholders within those First Responder communities?
[Figure panels: Finding Users; Finding Topics]
Web Data “Challenge
Response” Enablers
- HHS Award winning platform
- Target questions: “good hospital for my context”
- Prizm, DataCube Explorer, …
Open Government Data
TWC – Intl Open Government Data Sets
Mobile, Distributed, and Context-Aware Computing
Rensselaer Tetherless World Constellation
Web Observatory Foundations & Directions
THEMES
Multi-Dimensional Data Portals
Observatories: Science, Open Government, Health and Life Science, Social
Web Science Research Foundations
Making Data Transparent and Actionable
Provenance
Semantic Methodology
Social Network Analysis
Semantically-Enabled Visualization
Web Data "Challenge Response" Enablers
• Open Data Workflow
• International Open Government Data Sets
• Health and Human Services Data Challenge
• Semantic eScience Data Portals
• Social Media: Reasoning on Graph Database
• First Responder Network
Foundations: Web Layer Cake
Visualization APIs
S2S
Govt Data
Inference Web, Proof Markup Language, W3C Provenance Working group formal model, W3C incubator group, …
OWL 1 & 2 WG: Edited main OWL Docs, quick reference, OWL profiles (OWL RL); Earlier languages: DAML, DAML+OIL, Classic
Inference Web IW Trust, Air + Trust
DL, KIF, CL, N3Logic
Ontology repositories (Ontolingua), Ontology Evolution env: Chimaera, Semantic eScience Ontologies, MANY other ontologies
RIF WG
AIR accountability tool
SPARQL WG, earlier QL – OWL-QL, Classic’ QL, …
Govt metadata search
Linked Open Govt Data
SPARQL to Xquery translator
RDFS materialization (Billion triple winner)
Transparent Accountable
Datamining Initiative (TAM
Inference Web: Making Data Transparent and
Actionable Using Semantic Technologies
• How and when does it make sense to use smart system results & how do we
interact with them?
[Figure labels: Cognitive Asst -> CPOF & SIRI; Knowledge Provenance in Virtual Observatories; (Mobile) Intelligent Agents; Intelligence Analyst Tools -> Watson; Hypothesis Investigation / Policy Advisors; NSF Interops: SONET, SSIII – Sea Ice]
Moving to the Next Generation
Some focus areas to move to the next generation:
• Provenance – e.g., not just the sources and dates, but
enough to know when to depend on something.
• Policy – balance between sharing data, getting credit,
making data accessible to all (or all willing to follow the
rules)
• Social aspects – incentives, rewards, evolution,
customization
• Distributed, Mobile, and Context-aware
• Education – scientific method – promote creating testable
hypotheses, how to verify / replicate, etc.
• Broadly usable semantic methodology
• Moving to truly integrated communities
Discussion
• Semantic foundations are being used in a wide range of areas.
• They are not just for semantic practitioners any more
• Open as well as commercial software available
• Come join us!
• And if you are already there…
– What do you want from evolving observatory / collaboratory
infrastructure?
– What do you need from provenance and explanation infrastructures?
– Do you have tools, tool templates, and/or tool requirements?
– Do you have use cases?
– Are you using our (or another) semantic methodology?
More info – Deborah McGuinness dlm@cs.rpi.edu
Extra
Semantic Web (RPI)
[Figure labels: 2013, Research, Innovation, RDFa]
What is an Ontology?
[Figure: ontology spectrum, from least to most formal: Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]
Ontologies Come of Age McGuinness, 2001, and From AAAI Panel 99 – McGuinness, Welty, Uschold,
Gruninger, Lehmann
Plus basis of Ontologies Come of Age – McGuinness, 2003
Interface
Core and Framework Semantics Multi-tiered interoperability
used by