The role of geoscience ontologies in a knowledge infrastructure for e-science

The role of geoscience ontologies in a
knowledge infrastructure for e-science
Boyan Brodaric
Geological Survey of Canada
Geoscience Ontologies in Knowledge Infrastructure
Boyan Brodaric, Edinburgh, July 17
E-geoscience
• What are the types of geoscience concepts?
• Do they differ from other sciences?
• How do they relate to models and theories in CI?
Acknowledgements
eSci Institute Visitors Program
Theme 4: Spatial Semantics for Automating Geographic Information Processes (F. Reitsma)
Outline
• Cyberinfrastructure (CI)
• Knowledge infrastructure (KI)
• Geoscience ontologies in KI
  - types of concepts in geoscience ontologies
Cyberinfrastructure
• Cyberinfrastructure (CI)
  - cyber-networked resources: data, software, instruments, people, …
• E-science
  - scientific activity in CI: new paradigm for dramatic discoveries
• NSF Vision for CI
  - systems: HPC, connectivity
  - data: capture (sensors), manipulate (structure, analyse, integrate, visualize)
  - people: virtual collaboration, education
CI typical components
• Systems: HPC, connectivity
• Knowledge representation: theories, concepts
• Information: stores, analyses, integrations, visualizations
• People: collaboration, education
• Instruments (data): observations, measurements, experiments
• Models: models, simulations
geo-CI example: real-time severe storm modeling (LEAD)
Question: Is there a severe storm forming?
• Streaming observations
• Data Mining
• Forecast Model
• On-Demand Grid Computing
• Weather ontology and glossary
• resource discovery
geo-CI example: seismic hazard modeling (SCEC)
Question: What is the hazard threat?
• Monitoring, mapping
• Seismic Hazard Model
• Information
• On-Demand Grid Computing
• workflows
• ontologies
• info discovery
• info integration
Figure: InSAR Image of the Hector Mine Earthquake. A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake. It shows the displacement field in the direction of radar imaging. Each fringe (e.g., from red to red) corresponds to a few centimeters of displacement.
Two perspectives on CI
The ENGINEERING perspective: efficiency
• replicating existing geoscience practice in CI more efficiently
• doing more, faster: more computation, more resources, existing methods
  e.g. resource discovery, integration, use (finding, linking and using info., tools, …)
• is evolutionary: an incremental shift
The KNOWLEDGE perspective: creativity
• leveraging CI for new approaches to doing science
• new doings: new questions and results with new CI-driven methods
  e.g. knowledge discovery, integration, use (explicitly derive and test ideas in CI)
• is revolutionary: a paradigmatic shift
Efficiency perspective: ontology-driven map integration (GEON)
e.g. find all regions with sedimentary rocks
• metadata (geospatial projection, geol. concept): common concepts in metadata
• schema (rock type, geol. unit): common concepts in schema
• content (classifications, e.g. sedimentary): common concepts in content
• vocabulary: common vocabulary
  - english: sandstone, arenite, slate
  - french: grès, arénite, ardoise
• also shown: workflows, queries, ontology
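The vocabulary layer of this slide can be illustrated with a minimal sketch. It assumes a hand-built lookup from English and French rock names to a shared concept, plus a lookup from concept to rock class. The region names, the class assignments (textbook petrology, not taken from the slide) and the function name are hypothetical; this is not GEON's actual implementation.

```python
# Hypothetical sketch: multilingual rock names resolved to shared concepts.
# The terms come from the slide; the region names are invented.
COMMON_VOCABULARY = {
    ("en", "sandstone"): "sandstone",
    ("en", "arenite"): "arenite",
    ("en", "slate"): "slate",
    ("fr", "grès"): "sandstone",
    ("fr", "arénite"): "arenite",
    ("fr", "ardoise"): "slate",
}

# Assumed classification (textbook petrology): slate is metamorphic.
ROCK_CLASS = {
    "sandstone": "sedimentary",
    "arenite": "sedimentary",
    "slate": "metamorphic",
}


def regions_with_class(map_units, rock_class):
    """Return names of map regions whose rock term resolves to rock_class."""
    hits = []
    for region, lang, term in map_units:
        concept = COMMON_VOCABULARY.get((lang, term.lower()))
        if concept is not None and ROCK_CLASS.get(concept) == rock_class:
            hits.append(region)
    return hits


# Invented map units: (region name, language of the legend, rock term).
MAP_UNITS = [
    ("region A", "en", "Sandstone"),
    ("region B", "fr", "grès"),
    ("region C", "fr", "ardoise"),
    ("region D", "en", "arenite"),
]

print(regions_with_class(MAP_UNITS, "sedimentary"))
```

The query is phrased against the shared concept, so maps labelled in either language answer it without the user translating terms.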
Creativity perspective: theory-driven geoscience discovery
(after A.K. Sinha, 2004)
• ontology: "textbook knowledge", defining concepts in:
  - theories, laws
  - model types
  - classification systems
  - taxonomies
  - etc.
• objects; processes, events; theories; models
• hypothesis formation; hypothesis testing
Geoscience Ontologies in Knowledge Infrastructure
Boyan Brodaric, Edinburgh, July 17
Aspects of a knowledge infrastructure (KI) for CI
KI should support full discovery (adapted from Sowa, 2000)
• Discovery types: theory discovery, model discovery, information discovery, data discovery
• Reasoning steps: abduction, deduction, induction
• Elements: theory, ontology; model, model type; information; model; regularities (e.g. schemas, objects, processes); observation; data; sensors
• Diagram annotations: "Gap"; "Current geo-CI focus"
Gaps
• focus on efficiency as path to creativity
• missing explicit theory-discovery support
  - implicitly assumed to be done in heads of scientists
  - little infrastructure for scientific theories
    e.g. implications of alternate theories on models
  - limited support for pragmatics (agent actions)
• limited role of ontologies
  - background tools for info. manipulation, not scientific discovery
    e.g. mainly used in resource discovery and integration
  - must understand ontologies as geoscience artefacts, to aid discovery
Ontologies as geoscience artefacts
Toward formalizing concepts in geoscience models and theories (from Dekemo, 2004)
Axis labels: Universal, Situational, Particular
• Theory: utilizes universal concepts and their relations,
  e.g. theories (plate tectonics, …), normative guides (stratigraphic codes, textbooks, ...), classification systems (rock types, …), conceptual data models (e.g. NADM)
• Conceptual Model: utilizes generalized situations and relations,
  e.g. legends, stratigraphic lexicons, regional models
• Geospatial Model: utilizes states and their relations,
  e.g. maps, 3D models, simulations, …
Ontologies as geoscience artefacts
• Do concept types imply theory and model types?
  - what concept types are inherent to geosciences?
  - what gains are accrued from higher representation precision?
• Likely problem in KI re: concept types
  - using a concept too general or too specific for a task:
    e.g. using regional concepts to calculate the local earthquake or landslide risk for your home (too general)
    e.g. using the current local conditions for regional groundwater vulnerability estimates (too specific)
Need for abstraction distinctions
• risk models use physical qualities of rocks
  - e.g. density, porosity, permeability, …
• different rock quality values affect risk estimates
  - measured values at a site (point)
  - prototypical values for one polygon (on a map)
  - prototypical values for a class of polygons (in a map legend)
  - normative values (in a classification scheme)
Diagram labels: Geologic Map, Seismic Hazard Model, Groundwater Vulnerability Model
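The four kinds of value listed on this slide can be made explicit in data, so that a risk model knows which kind it is consuming. The following is a minimal sketch under stated assumptions: the `Source` ordering, the `pick_value` helper and the porosity numbers are all hypothetical and serve only to illustrate the distinction.

```python
from dataclasses import dataclass
from enum import IntEnum


class Source(IntEnum):
    """Abstraction level of a rock quality value, most specific first."""
    SITE_MEASUREMENT = 0     # measured value at a site (point)
    MAP_POLYGON = 1          # prototypical value for one polygon
    LEGEND_CLASS = 2         # prototypical value for a class of polygons
    CLASSIFICATION_NORM = 3  # normative value in a classification scheme


@dataclass(frozen=True)
class QualityValue:
    quality: str
    value: float
    source: Source


def pick_value(values, quality, wanted):
    """Pick the value for `quality` whose source is closest to `wanted`.

    Returns (value, gap). gap > 0 means the value is more general than
    wanted, gap < 0 more specific, 0 an exact match. Ties resolve to the
    earlier entry in `values`.
    """
    candidates = [v for v in values if v.quality == quality]
    if not candidates:
        raise LookupError(f"no value recorded for {quality!r}")
    best = min(candidates, key=lambda v: abs(v.source - wanted))
    return best.value, int(best.source - wanted)


# Invented porosity fractions, for illustration only.
VALUES = [
    QualityValue("porosity", 0.12, Source.SITE_MEASUREMENT),
    QualityValue("porosity", 0.18, Source.LEGEND_CLASS),
]

print(pick_value(VALUES, "porosity", Source.SITE_MEASUREMENT))
print(pick_value(VALUES, "porosity", Source.CLASSIFICATION_NORM))
```

Returning the gap alongside the value lets the calling model decide whether a mismatched level is acceptable for its task rather than hiding the substitution.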
Geoscience Ontology: abstraction levels for geoscience concepts
• Upper-level (universal)
  - spans geospace-time; definitional; identified by 'logic'; e.g. endurant
• Domain-level (domain specific)
  - spans geospace-time; definitional; defined by 'essences'; e.g. geologic formation (Millikan, 2000)
  - situated in geospace-time; situational; defined by 'histories'; e.g. formation X
• Individual
  - single entity; e.g. rock body #1
• State
  - single description; e.g. rock body #1 @ time1
Situated concepts
• historical, geo-temporal, situational
Example: Keli Mutu volcanism in Flores, Indonesia (Pasternack & Varekamp, 1994; 1997)
Defined by:
• common process history (Process S over t0, t1, …, tn)
• process history → situation
  - Concept: process=S, quality1=x-y
  - e.g. concept 'Keli Mutu Volcanic lake': Process = 'Keli Mutu volcanism', pH = 1.8 - 3.1
Instantiated by:
• individuals sharing a process
• extension (class)
  - individual1, individual2
  - e.g. 'Tiwu Ata Polo', 'Tiwu Ata Mbupu'
Described by:
• none/some common qualities
• potential prototype effects
• potential quality change in time
  - State: process=S, quality1=x, quality2=a
  - State: process=S, quality1=y, quality2=b
  - e.g. Process = 'Keli …', pH = 1.8, Location = Flores
  - e.g. Process = 'Keli …', pH = 3.1, Location = Flores
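A situated concept, as described on the slide above, admits members by shared process history, while qualities such as pH only describe the members. A minimal sketch, assuming the abbreviated process name on the two example states stands for 'Keli Mutu volcanism'; the class names, the `is_prototypical` helper and the third state are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    process: str
    ph: float
    location: str


@dataclass(frozen=True)
class SituatedConcept:
    name: str
    process: str     # the shared process history defines the concept
    ph_range: tuple  # observed spread of a quality, not a definition

    def is_instance(self, state):
        """Membership depends only on sharing the defining process."""
        return state.process == self.process

    def is_prototypical(self, state):
        """A member whose pH lies inside the range recorded for the concept."""
        low, high = self.ph_range
        return self.is_instance(state) and low <= state.ph <= high


lake = SituatedConcept("Keli Mutu Volcanic lake", "Keli Mutu volcanism", (1.8, 3.1))
state_1 = State("Keli Mutu volcanism", 1.8, "Flores")
state_2 = State("Keli Mutu volcanism", 3.1, "Flores")
# Hypothetical later reading, invented to show quality change in time.
state_3 = State("Keli Mutu volcanism", 4.0, "Flores")

for s in (state_1, state_2, state_3):
    print(s.ph, lake.is_instance(s), lake.is_prototypical(s))
```

The invented third state stays a member even though its pH falls outside the recorded range, which is the sense in which qualities can change over time without changing membership.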
Essential concepts
• ahistorical, spatial, definitional
Defined by:
• space of essential qualities (after Gardenfors, 2000)
  - Concept: Process Type, quality1=x-z, quality2=y-z, …
  - e.g. 'Granite': Process = 'Igneous', Quartz = 20-60%, Alkali Feldspar = …, Plagioclase = … (Jersey & Tarman)
Instantiated by:
• individuals with point in quality space
• extension (class)
  - individual1, individual2
  - State: quality1=x, quality2=y, …
  - State: quality1=z, quality3=z, …
  - e.g. 'Rock Sample 1234': Process = 'Igneous', Quartz = 25%, Alkali Feldspar = 40%, Plagioclase = 35%, Location = …
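An essential concept, by contrast, admits an individual when its qualities fall at a point inside the concept's quality space. A minimal sketch of that membership test follows; the class name, the dictionary keys and the counter-example sample are hypothetical. Only the quartz range is stated on the slide, so the feldspar ranges are left out rather than guessed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EssentialConcept:
    name: str
    process: str
    ranges: dict = field(default_factory=dict)  # quality -> (low, high)

    def contains(self, sample):
        """True if the sample's point lies inside every stated range."""
        if sample.get("process") != self.process:
            return False
        for quality, (low, high) in self.ranges.items():
            value = sample.get(quality)
            if value is None or not (low <= value <= high):
                return False
        return True


# Only the quartz range is given on the slide; other ranges are elided there.
granite = EssentialConcept("Granite", "Igneous", {"quartz_pct": (20, 60)})

rock_sample_1234 = {
    "process": "Igneous",
    "quartz_pct": 25,
    "alkali_feldspar_pct": 40,
    "plagioclase_pct": 35,
}
# Hypothetical counter-example, invented for contrast.
quartz_poor_sample = {"process": "Igneous", "quartz_pct": 5}

print(granite.contains(rock_sample_1234))
print(granite.contains(quartz_poor_sample))
```

No history or location enters the test, which matches the slide's description of these concepts as ahistorical and definitional.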
Geoscience Ontology: abstraction levels for geoscience concepts
• Upper-level (universal)
  - spans geospace-time; definitional; identified by 'logic'; e.g. endurant
• Domain-level (domain specific) (Millikan, 2000; Gardenfors, 2000)
  - spans geospace-time; schematic; template, frame, dimensions; e.g. geologic unit
  - spans geospace-time; definitional; defined by 'essences'; e.g. geologic formation
  - situated in geospace-time; situational; defined by 'histories'; e.g. formation X
• Individual
  - single entity; e.g. rock body #1
• State
  - single description; e.g. rock body #1 @ time1
Schematic concepts
• ahistorical, aspatial, definitional
Defined by:
• schematic space
  - Concept: quality1, quality2, …
  - e.g. 'Rock material': Process, Minerals
Filled in by an essential concept:
  - Concept: quality1=x-z, quality2=y-z, …
  - e.g. 'Granite': Process = 'Igneous', Quartz = 20-60%, Alkali Feldspar = …, Plagioclase = … (Jersey & Tarman)
Instantiated by:
  - individual1, individual2
  - State: quality1=x, quality2=y, …
  - State: quality1=z, quality3=z, …
  - e.g. 'Rock Sample 1234': Process = 'Igneous', Quartz = 25%, Alkali Feldspar = 40%, Plagioclase = 35%, Location = …
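A schematic concept can be read as a template that names dimensions without fixing their values. A minimal sketch of that reading, assuming a schema is just a set of dimension names; the helper names and the incomplete concept are hypothetical.

```python
# Hypothetical sketch: a schematic concept as a set of dimension names.
ROCK_MATERIAL_SCHEMA = frozenset({"process", "minerals"})


def conforms(schema, concept):
    """A concept conforms if it assigns something to every schema dimension."""
    return set(schema) <= set(concept)


def missing_dimensions(schema, concept):
    """Schema dimensions the concept leaves unfilled, in sorted order."""
    return sorted(set(schema) - set(concept))


# 'Granite' fills both dimensions; only the quartz range is stated on the slide.
granite = {"process": "Igneous", "minerals": {"quartz_pct": (20, 60)}}
# Invented incomplete concept, for contrast.
incomplete = {"process": "Igneous"}

print(conforms(ROCK_MATERIAL_SCHEMA, granite))
print(missing_dimensions(ROCK_MATERIAL_SCHEMA, incomplete))
```

Extra keys, such as a location on an individual, are allowed here; the schema only states what must be present.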
Example [LEKXIS]
Levels: [L] Logic-driven (upper), [E] schEma-driven (schematic), [K] Content-driven (essential), [X] conteXt-driven (situational), [I] Individual, [S] State
Upper level: Physical Object (origin, lifetime); Process; Material; Property

level       | concept          | origin            | lifetime      | lithology      | mappability
schematic   | Lithostrat Unit  | Geologic History  | Geologic Time | Earth Material | { mappable, … }
essential   | Formation        | Geologic History  | Geologic Time | Earth Material | mappable
situational | Formation X      | history of X      | Devonian      | sandstone      | mappable
individual  | Rock body Y      | history of Y      | Late Devonian | sandstone      | mappable
state       | Rock body Y @ t1 | history of Y @ t1 | Frasnian      | shale          | mappable
Example [LEKXIS]
Levels: [L] Logic-driven (upper), [E] schEma-driven (schematic), [K] Content-driven (essential), [X] conteXt-driven (situational), [I] Individual, [S] State
Upper level: Physical Object (origin, lifetime); Process; Material; Property

level       | concept            | origin                            | lifetime      | minerals                            | composition          | texture
schematic   | Earth Material     | Geologic History                  | Geologic Time | {Q, A, P, …}                        | { felsic, mafic, … } | {coarse, …}
essential   | Granite            | Geologic History                  | Geologic Time | Q=20-60, A=…, P=…, M<=90%           | felsic               | medium crystal
situational | Granites of Fm X   | geologic history of X             | Devonian      | Q=25-35, A=31-40, P=25-43, M=15-25% | felsic               | medium crystal
individual  | Granite of X1      | geologic history of granite of X1 | Late Devonian | Q=30, A=40, P=30, M=20%             | felsic               | medium crystal
state       | Granite of X1 @ t1 | history of gr. of X1 @ t1         | Frasnian      | …                                   | …                    | ….
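One property visible in this example is that each more specific level narrows the value range of the level above it (quartz 20-60, then 25-35, then 30). A minimal sketch of a check for that property, using only the quartz figures from the slide; the function names and the deliberately broken variant are hypothetical.

```python
# Quartz content (%) per level; a single value v is written as (v, v).
QUARTZ_PCT = {
    "essential": (20, 60),    # Granite
    "situational": (25, 35),  # Granites of Fm X
    "individual": (30, 30),   # Granite of X1
}
ORDER = ["essential", "situational", "individual"]


def refines(narrow, wide):
    """True if the interval `narrow` lies entirely inside `wide`."""
    return wide[0] <= narrow[0] and narrow[1] <= wide[1]


def broken_links(ranges, order=ORDER):
    """Return (general, specific) level pairs where the specific range escapes."""
    pairs = zip(order, order[1:])
    return [(g, s) for g, s in pairs if not refines(ranges[s], ranges[g])]


print(broken_links(QUARTZ_PCT))

# Invented inconsistent variant: the individual falls outside its situation.
bad = dict(QUARTZ_PCT, individual=(70, 70))
print(broken_links(bad))
```

A non-empty result would flag a concept placed under a parent it does not actually specialise.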
Example

level       | geographical region | geographical feature | political region | biological object
upper       | Geographical Region | Geographical Feature | Geographical Region | Biological Object
schematic   | Ecologic Rank       | Mountain             | Political Unit      | Biological Rank
essential   | Domain              | Fault-origin Mtn.    | Country             | Species
situational | Polar               | Canadian Rocky Mtn.  | Western country     | Human
individual  | this Polar region   | Mt. Whistler         | Canada              | Boyan
state       | this Polar @ now    | Mt. Whistler @ now   | Canada @ now        | Boyan @ now
Ontologies as geoscience artefacts
(from Dekemo, 2004)
• Theory (upper, schematic, essential): utilizes universal concepts and their relations,
  e.g. theories (plate tectonics, …), normative guides (stratigraphic codes, textbooks, ...), classification systems (rock types, …), conceptual data models (e.g. NADM)
• Conceptual Model (situational): utilizes generalized situations and relations,
  e.g. legends, stratigraphic lexicons, regional models
• Geospatial Model (individual, state): utilizes states and their relations,
  e.g. maps, 3D models, simulations, …
Application: semantic cube
Axes:
• individuals: One individual, Many individuals, All individuals
• places: One place, Many places, All places
• times: One time, Many times, All times
Levels placed in the cube: [L] Upper, [E] Schematic, [K] Essential (ahistorical), [X] Situational, [I] Individual, [S] State
Arrow label: increasing semantic granularity
Application: semantic confusion matrix
Semantic confusion =def inappropriate substitution of a concept from one level of granularity with a concept from another level, for some task.
• Over-granular = too specific, e.g. using local conditions for regional risk
• Under-granular = too general, e.g. using regional conditions for local risk
Matrix cells (rows and columns: State [S], Individual [I], Situational [X], Essential [K]):
• over-granular:
  - Individual [I] row, State [S] column: in time
  - Situational [X] row, State [S] column: in time and place
  - Situational [X] row, Individual [I] column: in place, individuals
  - Essential [K] row, State [S], Individual [I] and Situational [X] columns: in time, place, and individuals
• under-granular:
  - State [S] row, Individual [I] column: in time and place
  - State [S] row, Situational [X] column: in time, place, individuals
  - State [S] row, Essential [K] column: in time, place, individuals
  - Individual [I] row, Situational [X] column: in place, individuals
  - Individual [I] row, Essential [K] column: in time, place, individuals
  - Situational [X] row, Essential [K] column: in time, place, individuals
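The definition on this slide can be turned into a small check. This is a minimal sketch, assuming the four levels can be placed on a single general-to-specific ordering; the `Level` enum, the function name and the mapping of the two risk examples onto levels are assumptions made for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Concept levels ordered from most general to most specific."""
    ESSENTIAL = 0    # [K]
    SITUATIONAL = 1  # [X]
    INDIVIDUAL = 2   # [I]
    STATE = 3        # [S]


def semantic_confusion(required, used):
    """Classify substituting a `used` concept where `required` is needed."""
    if used > required:
        return "over-granular"   # too specific for the task
    if used < required:
        return "under-granular"  # too general for the task
    return "none"


# Assumed mapping of the slide's examples onto levels:
# regional risk needs situational concepts, local risk needs states.
print(semantic_confusion(Level.SITUATIONAL, Level.STATE))
print(semantic_confusion(Level.STATE, Level.SITUATIONAL))
```

A single ordering does not say in which respect (time, place, individuals) the substitution is off; that finer detail is what the cells of the matrix record.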
Final thoughts
• CI needs KI for full scientific discovery
• proposal for increased semantic granularity in geoscience ontologies to aid discovery
  - increased prominence of historical-geographical
• but does it work?... needs:
  - historical case studies
  - knowledge infrastructure implementation and testing
  - formal knowledge representation
Questions, comments?