Quality model PPT

GeoViQua: the quality challenges for GEOSS
YANG Xiaoyu, BLOWER Jon, CORNFORD Dan, LUSH Victoria,
MASO Joan, ZABALA Alaitz, Nüst Daniel
Center of Research in Ecology and Forestry Applications (CREAF)
contact@geoviqua.org
QUAlity aware VIsualisation for the Global Earth Observation system of systems
www.geoviqua.org
The problem
• Is there quality information in the GCI?
– There is some in the form of ISO19115 DQ elements and lineage
– Not enough
• The GEOSS Common Infrastructure does not follow a
global model for quality
• The GEOPortal search and results
– are not ranked by quality
– quality indicators are not shown
• Common data viewers do not generally include quality
information in parallel with the data
The aim
GeoViQua will provide a set of scientifically developed software components and services that facilitate the creation, search and visualization of quality information on EO data, integrated and validated in the GEOSS Common Infrastructure.
GEO S&T Label
Community building
Pilot case studies
Time table
[Figure: project timeline with periods numbered 1 to 36. Milestones and activities shown: Start; Requirements and Data Model phase finished; Workshops; Search & Visualization; GeoLabel; Metadata extraction; Quality elicitation; Pilot cases; Best practices quality encoding; Direct extraction from continuous variables; Extraction from categorical variables; User feedback; Validation; Prototypes; Mobile Solutions; Data ready; Quality recommendations; Testing solutions; User & technical requirements to CoP; Proposals evaluation; User & technical solutions to CoP; Final document.]
Community Views on Data Quality
• Many researchers refer to the ‘famous five’ as the common
criteria for evaluating spatial data quality
– lineage; completeness; consistency; positional accuracy; and
attribute accuracy.
• Broad scientific acceptance of the common spatial quality
elements does not apply to all cases for “fitness-for-use”
evaluation
– user requirements can go far beyond the widely accepted ‘famous
five’.
• We used semi-structured telephone and face-to-face
interviews with a variety of geospatial data users and
experts from a number of countries and application
domains.
What do users want?
• Users are exceedingly interested in good quality metadata records
– And information that can help to assess fitness-for-use of the data
• Users find metadata records typically incomplete with essential data omitted
– The process of dataset discovery and selection is more difficult
• Users are also interested in ‘soft’ knowledge about data quality
– Data providers’ comments on the overall quality of a dataset, known data errors, potential
data usage
– Peers’ reviews and recommendations (they contact their peers to obtain suggestions)
– Dataset provenance, citation and licensing information
• Citation is incomplete (lack of valid producer contact details), and licensing often missing
• Citation: users rely on data from good reputation producers
• Currently, some of these cannot be recorded in standard metadata
• Need to compare metadata records easily and systematically
– Side-by-side visualisation of all metadata elements would allow geospatial datasets to be
compared more effectively,
• especially when datasets are very similar and differences are hard to distinguish
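The side-by-side comparison idea can be sketched as follows. This is a minimal illustration, assuming each metadata record has been flattened to a dictionary of element name to value; the element names and values are hypothetical and not taken from any real catalogue.

```python
def compare_records(left, right):
    """Return rows (element, left value, right value, differs) for every
    element present in either record, sorted by element name."""
    rows = []
    for element in sorted(set(left) | set(right)):
        a = left.get(element)   # None marks an element missing on this side
        b = right.get(element)
        rows.append((element, a, b, a != b))
    return rows


# Hypothetical, nearly identical records (illustrative values only).
record_a = {"title": "DEM 30 m", "positional_accuracy_m": 2.5, "licence": "CC-BY"}
record_b = {"title": "DEM 30 m", "positional_accuracy_m": 1.5}

rows = compare_records(record_a, record_b)
for element, a, b, differs in rows:
    flag = "*" if differs else " "   # mark the elements that differ
    print(f"{flag} {element:<24}{a!s:<12}{b!s:<12}")
```

Marking only the differing elements is one way to make very similar datasets easier to tell apart; a missing element is treated as a difference.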
Producer’s-consumer’s quality
• Producer’s quality metadata
– In the producer’s metadata records
– Encoded in the classical ISO 19115/19139
– Some extensions required
– Stored in the current catalogues (GEOSS Clearinghouse, etc.)
• Consumer’s quality metadata
– In independent metadata repositories
– Linked to producer’s metadata by id
– Future component of the GCI?
– Contains comments, “like it”, star ratings, etc.
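A minimal sketch of how consumer feedback could live in an independent repository and still be linked to the producer’s record by id. The class names, field names and the `dataset-001` identifier are assumptions made for illustration; the slides do not define an encoding.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    """One consumer feedback item; all field names are hypothetical."""
    metadata_id: str              # id of the producer's metadata record
    comment: str = ""
    stars: Optional[int] = None   # e.g. a 1-5 star rating


class FeedbackRepository:
    """Independent store, kept apart from the producer's catalogue."""

    def __init__(self):
        self._by_id = defaultdict(list)

    def add(self, item):
        self._by_id[item.metadata_id].append(item)

    def for_record(self, metadata_id):
        """All feedback attached to one producer record."""
        return list(self._by_id.get(metadata_id, []))

    def mean_stars(self, metadata_id):
        stars = [f.stars for f in self.for_record(metadata_id)
                 if f.stars is not None]
        return sum(stars) / len(stars) if stars else None


repo = FeedbackRepository()
repo.add(Feedback("dataset-001", comment="Known gaps near the coast", stars=3))
repo.add(Feedback("dataset-001", stars=5))
print(repo.mean_stars("dataset-001"))
```

Because the link is only the record id, the producer’s catalogue does not need to change when feedback is added.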
The ISO classical view
Quality indicators
Provenance/Lineage
Usage
Add ‘soft’ knowledge to producer’s metadata
[Diagram: quality model. Labels recoverable from the figure:
– Metadata Packages: Dataset series (0..*), Metadata, User Feedback, Discovered Issues, Publication, Lineage, Quality
– Scope: Dataset, Subset of data, Feature Type
– Data Quality, Universe of Discourse, Non-quantitative Quality Information, Metaquality, Quality Element
– Quality Parameter (ISO 19157): Completeness, Positional Accuracy, Temporal Accuracy, Thematic Accuracy, Logical Consistency, Usability
– Quality Indicator (ISO 19157): Omission, Commission, Quantitative attribute accuracy, Non-quantitative attribute correctness, Classification correctness
– Quality Measure (ISO19157, UncertML): Missing Items, Number of Missing Items, Misclassification rate, Misclassification matrix]
Quality model is much more than positional accuracy
• There are many quantifiable aspects that can be
recorded
– Consistency, completeness, positional,
thematic and temporal accuracy…
• There are many qualitative aspects that are
needed
– Lineage (traceability), scientific papers, user
feedback, data usage…
GeoViQua Data model: statistical uncertainties
Value of the vertical DEM accuracy:
<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value>
        <gco:Record>3.6</gco:Record>
      </gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>
Explicit recognition that errors acceptably fit a Normal distribution with mean 1.2
• An overall positive bias was observed
• A difficult feature to convey by traditional means
<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueType>
        <gco:RecordType xlink:href="http://www.uncertml.org/distributions/normal">
        </gco:RecordType>
      </gmd:valueType>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value>
        <gco:Record>
          <un:NormalDistribution>
            <un:mean>1.2</un:mean>
            <un:variance>3.6</un:variance>
          </un:NormalDistribution>
        </gco:Record>
      </gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>
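To show what a client gains from the distribution-based encoding, here is a hedged sketch that reads the mean and variance from a cut-down copy of the fragment and derives a 90% interval for the error. The `gmd` and `gco` namespace URIs are the usual ISO 19139 ones, the `un` URI is the UncertML 2.0 one used elsewhere in these slides, and the fragment is reduced to the elements needed for the example.

```python
import math
import xml.etree.ElementTree as ET
from statistics import NormalDist

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "un": "http://www.uncertml.org/2.0",
}

# Simplified fragment, following the slide's structure.
FRAGMENT = """
<gmd:DQ_QuantitativeResult
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:un="http://www.uncertml.org/2.0">
  <gmd:valueUnit>m</gmd:valueUnit>
  <gmd:value>
    <gco:Record>
      <un:NormalDistribution>
        <un:mean>1.2</un:mean>
        <un:variance>3.6</un:variance>
      </un:NormalDistribution>
    </gco:Record>
  </gmd:value>
</gmd:DQ_QuantitativeResult>
""".strip()

root = ET.fromstring(FRAGMENT)
normal = root.find(".//un:NormalDistribution", NS)
mean = float(normal.find("un:mean", NS).text)
variance = float(normal.find("un:variance", NS).text)

# NormalDist takes a standard deviation, not a variance.
error = NormalDist(mu=mean, sigma=math.sqrt(variance))
lower, upper = error.inv_cdf(0.05), error.inv_cdf(0.95)
print(f"90% of errors expected between {lower:.2f} m and {upper:.2f} m")
```

The interval is centred on 1.2 m rather than on zero, which is how the positive bias becomes visible to the user; a single accuracy value such as 3.6 cannot carry that information.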
The need for a measure
dictionary
• Current quality measure names in the GCI
– Nothing to do with ISO19138 list of possible measures
– Not well defined
Absolute external positional accuracy              2
Anweisung Straßeninformationsbank (Bundes…         1
Codelist omission                                  2
completeness                                     198
Feature represented as a single object             2
horizontal                                      3146
Horizontal Positional Accuracy                  3265
Lagegenauigkeit                                    3
Latitude Resolution                             3437
Longitude Resolution                            3350
Mean value of positional uncertainties (2D)        3
Overlapping polygon                                2
Quantitative Attribute Accuracy Assessment       255
Rate of missing items                             87
Sach- und Geodatenüberprüfung                      7
Temporal Resolution                             2870
Überprüfung der Toplogie                           2
Valid code Test                                    2
Vertical Positional Accuracy                    1826
Vertical Resolution                              812
vertikal                                         348
Vollständigkeit                                    4
Data Quality Measure
Dictionary
• Some quality indicators are used, but the name and
description of the measure used to derive the indicator
are rarely well described.
• Problems can occur due to the lack of semantic
definitions of quality measures.
– “uncertainty at 90% significance level” ??.
• A Quality Measure Dictionary is proposed that
includes:
– vocabularies for quality measures
– associated semantic annotations
– integration of UncertML concepts and vocabularies.
• Composed of quality measures provided by
– ISO19138 → ISO19157
– UncertML.
[Diagram: a Quality Measure entry (ID=“”, Name=“”, Alias=“”) with the fields Description, Definition, Quality element, Basic measure, Value type, Value structure, Parameter, Example use and Source reference, plus an UncertML representation (URI=“”) that points to a URI in the UncertML Dictionary.]
• Measure has a unique ID
– quality element, value type, quality basic measure,
description, example use, etc.
• “uncertainty at 90% significance level” can be
annotated using UncertML vocabulary
“ConfidenceInterval”(URI:
http://www.uncertml.org/statistics/confidence-interval)
<un:ConfidenceInterval
xmlns:un="http://www.uncertml.org/2.0">
<un:lower level="0.05">
<un:values>3.14</un:values>
</un:lower>
<un:upper level="0.95">
<un:values>6.28</un:values>
</un:upper>
</un:ConfidenceInterval>
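A minimal sketch of what one dictionary entry and a name lookup might look like. The class and field names are hypothetical (loosely following the fields listed on the slide), and the measure ID is invented for the example; only the UncertML URI is taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityMeasure:
    """One dictionary entry; values used below are illustrative."""
    measure_id: str
    name: str
    aliases: tuple = ()
    quality_element: str = ""
    value_type: str = ""
    uncertml_uri: str = ""


class MeasureDictionary:
    def __init__(self):
        self._by_id = {}
        self._by_label = {}

    def register(self, measure):
        self._by_id[measure.measure_id] = measure
        # Index the name and every alias, ignoring case.
        for label in (measure.name, *measure.aliases):
            self._by_label[label.casefold()] = measure

    def lookup(self, label):
        """Resolve a free-text measure name to an entry, or None if unknown."""
        return self._by_label.get(label.casefold())


dictionary = MeasureDictionary()
dictionary.register(QualityMeasure(
    measure_id="qm-001",  # hypothetical ID
    name="ConfidenceInterval",
    aliases=("uncertainty at 90% significance level",),
    uncertml_uri="http://www.uncertml.org/statistics/confidence-interval",
))

entry = dictionary.lookup("Uncertainty at 90% significance level")
print(entry.uncertml_uri if entry else "not in dictionary")
```

Resolving free-text names through aliases is one way an ill-defined label found in a catalogue could be tied to a single, semantically annotated measure.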
Quality Metadata Levels
Hierarchy: Multiseries → Series → Sheet or Scene → Dataset (raster or feature instance)
Level: Multiseries
– Positional accuracy: 2.5 m
– Content date: 2009-2010
Level: theme=contour line
– Overwrite positional accuracy: 1.5 m
Level: sheet=73-30
– Overwrite content date: October 2009
Level: dataset (theme=contour line, sheet=73-30)
– Positional accuracy: 1.5 m
– Content date: October 2009
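The override behaviour across levels can be sketched with a simple fallback chain. This is an illustration only, assuming each level stores just the values it states or overwrites; the key names are hypothetical, the values are those on the slide.

```python
from collections import ChainMap

multiseries = {"positional_accuracy_m": 2.5, "content_date": "2009-2010"}
theme_contour = {"positional_accuracy_m": 1.5}    # overrides the multiseries value
sheet_73_30 = {"content_date": "October 2009"}    # overrides the multiseries value

# Most specific level first; lookups fall back to more general levels.
dataset = ChainMap(sheet_73_30, theme_contour, multiseries)
resolved = dict(dataset)
print(resolved)
```

The dataset (theme=contour line, sheet=73-30) stores nothing itself: its positional accuracy and content date are resolved from the levels above it.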
GEOSS common infrastructure
[Diagram: GEOSS Common Infrastructure. Labels recoverable from the figure:
– Main GEO Web Site; Registered Community Resources
– Client Tier: GEO Web Portals, Community Portals, Client Applications
– Registries: Components & Services, Standards and Interoperability, Best Practices Wiki, User Requirements
– Business Process Tier: GEOSS Clearinghouse, Community Catalogues, Workflow Management, Alert Servers, Processing Servers
– Access Tier: GEONETCast, Product Access Servers, Sensor Web Servers, Model Access Servers]
Before GEOSS
[Diagram. Labels recoverable from the figure:
– User, by SBA: Disasters, Health, Energy, Climate, Water, Weather, Ecosystems, Agriculture, Biodiversity
– Resource / Capacity, Business Process Tier: Capacity Catalogues
– Resource / Capacity, Access Tier: Product Access Servers, Model Access Servers, Sensor Web Servers, GEONETCast]
How GEOSS worked yesterday
[Diagram. Labels recoverable from the figure:
– User, by SBA: Disasters, Health, Energy, Climate, Water, Weather, Ecosystems, Agriculture, Biodiversity
– GEOSS Common Infrastructure: Components & Services Registry, GEOSS Clearinghouse Catalogue, DB, GEO Web Portal
– Resource / Capacity, Business Process Tier: Capacity Catalogues
– Resource / Capacity, Access Tier: Product Access Servers, Model Access Servers, Sensor Web Servers, GEONETCast]
How GEOSS is going to work
[Diagram, shown in two build steps. Labels recoverable from the figure:
– User, by SBA: Disasters, Health, Energy, Climate, Water, Weather, Ecosystems, Agriculture, Biodiversity
– GEOSS Common Infrastructure: Components & Services Registry, GEOSS Clearinghouse Catalogue, EuroGEOSS Broker, DB, GEO Web Portal
– Resource / Capacity, Business Process Tier: Community Catalogues, Capacity Catalogues
– Resource / Capacity, Access Tier: Product Access Servers, Model Access Servers, Sensor Web Servers, GEONETCast]
GeoViQua quality model
[Diagram: EuroGEOSS Broker model alongside the GeoViQua Model. Labels recoverable from the figure:
– Metadata Packages: Dataset series (0..*), Metadata, Comments/Peer Review, Discovered Issues, Publication, Lineage, Quality
– Scope: Dataset, Subset of data, Feature types
– Data Quality, Product Specification, Rules, Quality requirements, Universe of Discourse (i.e. Reality)
– Non-quantitative Quality Information, Metaquality, Quality Element
– Quality Parameter (ISO 19113): Completeness, Positional Accuracy, Temporal Accuracy, Thematic Accuracy, Logical Consistency, Usability
– Quality Indicator (ISO 19113): Omission, Commission, Quantitative attribute accuracy, Non-quantitative attribute correctness, Classification correctness
– Quality measure (ISO19114/ISO19138, UncertML): Missing Items, Number of Missing Items, Misclassification rate, Misclassification matrix]
Quality in GEOSS
[Diagram. Labels recoverable from the figure:
– User, by SBA: Disasters, Health, Energy, Climate, Water, Weather, Ecosystems, Agriculture, Biodiversity
– Enhanced geo-search tools
– GEOSS Common Infrastructure: Components & Services Registry, GEOSS Clearinghouse Catalogue, EuroGEOSS Broker, DB, GEO Web Portal
– Resource / Capacity, Business Process Tier: Community Catalogues, Capacity Catalogues
– Resource / Capacity, Access Tier: Product Access Servers, Model Access Servers, Sensor Web Servers, GEONETCast]
Including data quality in search
• SELECT * FROM GEOSS_GCI
WHERE positional_accuracy < 20 AND classification_correctness > 90 -- percent
Devillers R, Bédard Y, Jeansoulin R (2005) Multidimensional Management of Geospatial Data Quality Information for its Dynamic Use Within GIS
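A runnable sketch of the quality-aware query idea, using an in-memory SQLite table as a stand-in for the catalogue. The table layout, dataset ids and values are assumptions made for illustration; the GCI does not expose such a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoss_gci ("
    " dataset_id TEXT PRIMARY KEY,"
    " positional_accuracy REAL,"          # smaller is better
    " classification_correctness REAL)"   # percent
)
conn.executemany(
    "INSERT INTO geoss_gci VALUES (?, ?, ?)",
    [
        ("landcover-a", 15.0, 93.0),   # hypothetical records
        ("landcover-b", 30.0, 95.0),
        ("landcover-c", 10.0, 80.0),
    ],
)

# Filter on quality indicators and rank the hits by positional accuracy.
query = (
    "SELECT dataset_id FROM geoss_gci"
    " WHERE positional_accuracy < ? AND classification_correctness > ?"
    " ORDER BY positional_accuracy"
)
hits = [row[0] for row in conn.execute(query, (20, 90))]
print(hits)
```

The same indicators used for filtering can also drive the ordering of results, which addresses the earlier point that search results are not ranked by quality.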
Enhanced
geo-search
tools
Consumer’s data quality
• More informal
• Based on social network patterns
– Comments
– Linked data
– Like it
– Star ratings
• More dynamic
• Need for an encoding
• Need for an independent repository
GEOSSBack
http://www.ogc.uab.cat/GEOSSBack
• Just a prototype to play with and demonstrate a concept.
Producer’s+consumer’s GeoViQua Broker
[Component diagram “GeoViQua Components Agreed So Far”. Labels recoverable from the figure: CSW Clearinghouse; WMS; SOS-Q + SensorML; Capacity Catalogues; Sensor Registry Q; EuroGEOSS Discover broker Q; CSW; CSW-Q; GeoViQua Broker; WAF; Metadata Import tool (HDF, netCDF, others...); unknown; Feedback Server]
Quality Metadata comparison
Conclusions
• After user interviews
• Producer’s quality model
– GeoViQua quality model is based on ISO
– With extensions for ‘soft’ knowledge
– Inclusion of UncertML
– Quality measure dictionary
• Consumer’s quality model
– Based on social network patterns
– Encoded independently (from producers)
• Linked by the GeoViQua broker (extension/complement of the EuroGEOSS broker)
GEOLabel
• What is it?
– The GEO Label is intended to “assist the user to assess the scientific relevance, quality, acceptance and societal needs of the components” (ST-09-02 Task Team, 2010).
• Task performed in collaboration with the EGIDA FP7 project and the GEO task ST-09-02
• Purposes?
– be a quality indicator for GEOSS geospatial data and datasets
• Problem: usability depends on data application; there is no defined threshold.
– improve user recognition and trust in validated datasets
• Problem: who is going to certify this?
– assist in searching by providing users with visual clues of dataset quality and relevance
– provide accreditation, provenance, monitoring
– increase visibility of EO data
– emphasize open access and easy availability
• Possible shape?
– Certification label
– A formal way to present
• quality indicators
• provenance
• attribution
GEOLabel
• Until the end of this week
• Publicly available on the web
• We encourage you to participate!
Please participate in the
questionnaire:
http://geolabel.questionpro.com
just a couple of days left!!
Thanks
Joan.Maso@uab.cat
(CREAF)
How GEOSS is going to work
[Diagram. Labels recoverable from the figure:
– User, by SBA: Disasters, Health, Energy, Climate, Water, Weather, Ecosystems, Agriculture, Biodiversity
– Quality aware visualisation tools; Quality Access Broker
– GEOSS Common Infrastructure: Components & Services Registry, GEOSS Clearinghouse Catalogue, EuroGEOSS Broker, DB, GEO Web Portal
– Resource / Capacity, Business Process Tier: Community Catalogues, Capacity Catalogues
– Resource / Capacity, Access Tier: Product Access Servers, Model Access Servers, Sensor Web Servers, GEONETCast]
Quality map visualization
Quality aware
visualisation
tools
Express data quality using maps
Blackmond Laskey K, Wright EJ, da Costa PCG (2009) Envisioning uncertainty in geospatial information
Devillers R, Bédard Y, Jeansoulin R (2005) Multidimensional Management of Geospatial Data Quality Information for its Dynamic Use Within GIS
• Dark color represents poor quality and light color good quality
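The dark-for-poor, light-for-good convention can be sketched as a simple linear mapping from a quality score to a grey level. The function names, the 0 to 1 score range and the linear scaling are assumptions for illustration; the cited works may use other colour schemes.

```python
def quality_to_grey(quality, worst=0.0, best=1.0):
    """Map a quality score to an 8-bit grey level: dark for poor, light for good."""
    if best == worst:
        raise ValueError("best and worst must differ")
    fraction = (quality - worst) / (best - worst)
    fraction = min(1.0, max(0.0, fraction))   # clamp out-of-range scores
    return round(255 * fraction)


def grey_hex(quality, worst=0.0, best=1.0):
    """Return the grey level as a CSS-style hex colour."""
    level = quality_to_grey(quality, worst, best)
    return f"#{level:02x}{level:02x}{level:02x}"


for score in (0.0, 0.5, 1.0):
    print(score, grey_hex(score))
```

Passing `worst` greater than `best` inverts the scale, which suits indicators where a smaller value means better quality, such as a positional error in metres.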
Quality map visualization
Quality aware
visualisation
tools
• 3D representations
– representation of estimated water balance surplus/deficit and their uncertainty (using bars above and below the surface).
• Map representations have some problems
– They make visualization more complicated and difficult to understand
– They attract attention to the more uncertain objects!
Pang A (2001) Visualizing Uncertainty in Geo-spatial Data
MacEachren AM, Robinson A, Hopper S, Gardner S, Murray R, Gahegan M, Hetzler E (2005) Visualizing Geospatial Information Uncertainty: What We Know and What We Need to Know
Pilot Case scenarios
Agriculture
Global Carbon
Air Quality
Based on many user stories
among GEOSS SBA