Conflation scenarios, provenance queries and quality indicators from conflation processes
Scenario comments: (based on twiki CciScenario)
1. We may need to tailor the scenario based on how successful we are in incorporating conflation use cases.
2. At present we should be able to handle the linear features use case. While the present emphasis is on getting 1.1 to work with RoadMatcher, we also need to look at the ability to expand to 2.1, 1.3, 1.2, and 1.9 (and perhaps a variant of 1.3 using the general use case 2 approach).
3. If other tools that can work on conflation WPS use cases for points and polygons are available, these can be proposed as additional use cases.
4. Datasets for linear features in the USA (TNM, TDS, OSM) and Haiti (OSM, HEDP) should be included. Datasets may be expanded if points and polygons are added.
The scenario for invoking the conflation WPS should focus on updates to roads and airfields in both the USA and Haiti, for purposes such as reaching key personnel or routing sensitive cargo.
Provenance-related queries:
1. Automatically generated:
   a. When the conflated dataset is provided back to the user, a provenance query should provide a summary of the conflation process (when the dataset changed; data sources involved (conflation source and target)).
   b. When a user accesses a feature, a conflation summary should be provided if information has changed from the initial dataset (when changed; data source used for the change (conflation source)).
   c. Provenance flags: TBD (may be identified based on discussion of quality indicators).
2. User-generated:
   a. When reviewing a dataset:
      i. Contrast features from different sources (what came from which source: show origin from conflation source vs. conflation target), using a table or map.
      ii. Show new features (not in the original target dataset but added from the source during conflation).
      iii. Show modified features (in the original target dataset but modified from the source during conflation).
   b. When reviewing features:
      i. Show the feature with source, lineage, and change summary (table or map).
      ii. Show attributes with source, lineage, and change summary.
      iii. Contrast the prior feature and attribute with the revised versions.
      iv. Show qualitative and geopositional changes.
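To make the "new features", "modified features", and "contrast prior with revised" queries more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: datasets are modeled as plain dictionaries keyed by a feature id that is assumed to survive conflation, and the names classify_features and attribute_changes, the feature ids, and the attribute names are all hypothetical. None of this reflects the actual conflation WPS or RoadMatcher interfaces.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical layout: feature id -> attribute dict. The ids and attribute
# names used here are illustrative only and do not come from any real schema.
Feature = Dict[str, Any]
Dataset = Dict[str, Feature]


def attribute_changes(prior: Feature, revised: Feature) -> Dict[str, Tuple[Any, Any]]:
    """Map each changed attribute to its (prior, revised) values.

    A missing attribute and an attribute set to None are treated as equal.
    """
    names = sorted(set(prior) | set(revised))
    return {
        name: (prior.get(name), revised.get(name))
        for name in names
        if prior.get(name) != revised.get(name)
    }


def classify_features(target: Dataset, output: Dataset) -> Dict[str, List[str]]:
    """Group output feature ids as new, modified, or unchanged vs. the target.

    Assumes a matched feature keeps its target id through conflation.
    """
    groups: Dict[str, List[str]] = {"new": [], "modified": [], "unchanged": []}
    for fid, feature in output.items():
        if fid not in target:
            groups["new"].append(fid)        # not in target, added from source
        elif attribute_changes(target[fid], feature):
            groups["modified"].append(fid)   # in target, changed by conflation
        else:
            groups["unchanged"].append(fid)
    return groups


# Illustrative data only: original target dataset and conflated output.
target = {
    "r1": {"name": "Route 1", "surface": "paved"},
    "r2": {"name": "Rue A", "surface": None},
}
output = {
    "r1": {"name": "Route 1", "surface": "paved"},
    "r2": {"name": "Rue A", "surface": "gravel"},
    "r3": {"name": "Airfield Rd", "surface": "paved"},
}

groups = classify_features(target, output)
changes = attribute_changes(target["r2"], output["r2"])
print(groups)
print(changes)
```

The grouping could back the dataset-level queries (2.a.ii and 2.a.iii), while the per-attribute pairs could feed the feature-level contrast of prior and revised values (2.b.iii). Geometry changes are not covered by this sketch.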
Quality indicators
Derived from
1. Data Quality Analysis in the GEOSS Clearinghouse; Paula Díaz, Joan Masó, Eva Sevillano, Miquel Ninyerola, Alaitz Zabala, Ivette Serral, and Xavier Pons; article under review for the International Journal of Spatial Data Infrastructures Research, submitted 2012-03-07
2. Handbook on Data Quality Assessment Methods and Tools; Mats Bergdahl, Manfred Ehling, Eva Elvers, Erika Földesi, Thomas Körner, Andrea Kron, Peter Lohauß, Kornelia Mag, Vera Morais, Anja Nimmergut, Hans Viggo Sæbø, Ulrike Timm, Maria João Zilhão; Manfred Ehling and Thomas Körner (eds); Wiesbaden, 2007; http://epp.eurostat.ec.europa.eu/portal/page/portal/quality/documents/HANDBOOK%20ON%20DATA%20QUALITY%20ASSESSMENT%20METHODS%20AND%20TOOLS%20%20I.pdf
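As a rough illustration of how the completeness indicators tabulated in this section (feature density, feature density improvement, attribute population density, and attribute population change) might be computed, here is a minimal Python sketch. The function names, the dictionary-based feature records, and the sample values are assumptions made for illustration. Treating None and empty strings as "not populated" and reporting attribute population change in percentage points are also assumptions, not definitions taken from the cited sources.

```python
from typing import Any, Dict, List, Optional

# Hypothetical record layout: one attribute dict per feature in the
# conflation feature class (e.g. roads). Attribute names are illustrative.
Feature = Dict[str, Any]


def feature_density_improvement(target_count: int, output_count: int) -> Dict[str, Optional[float]]:
    """Increase in feature count from target/reference to output.

    Returns the absolute increase and the percent increase; percent is None
    when the target has no features, since the ratio is undefined.
    """
    absolute = output_count - target_count
    percent = 100.0 * absolute / target_count if target_count else None
    return {"absolute": absolute, "percent": percent}


def attribute_population_density(features: List[Feature]) -> float:
    """Average percentage of populated attributes per feature.

    Assumption: an attribute counts as populated unless it is None or "".
    """
    if not features:
        return 0.0
    shares = []
    for feature in features:
        if not feature:
            shares.append(0.0)
            continue
        populated = sum(1 for value in feature.values() if value not in (None, ""))
        shares.append(100.0 * populated / len(feature))
    return sum(shares) / len(shares)


# Illustrative data only: roads in the target and in the conflated output.
target_roads = [
    {"name": "Route 1", "surface": None},
    {"name": None, "surface": None},
]
output_roads = [
    {"name": "Route 1", "surface": "paved"},
    {"name": "Rue A", "surface": None},
    {"name": "Airfield Rd", "surface": "paved"},
]

density = feature_density_improvement(len(target_roads), len(output_roads))
# Attribute population change, expressed here in percentage points.
change = attribute_population_density(output_roads) - attribute_population_density(target_roads)
print(density)
print(round(change, 2))
```

The same functions could be run on the source dataset as well, so that values are reported for source, target, and output as the indicator definitions call for.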
Quality component | Quality indicator | Definition of measurement
Relevance (may not be available for OWS-9) | User satisfaction index | Scaled assessment of input quality (e.g. 5-star scale, number of likes); may be applicable more to VGI than standard datasets
Relevance | Contextual or semantic fitness | Calculated similarity measure of match of two input features or attributes
Positional accuracy | Absolute or external positional accuracy | Closeness of coordinate values to ground truth (can be applied to features in source, target, and output datasets)
Positional accuracy | Change in accuracy | Absolute (e.g. meters) or percent change in accuracy of output compared to target/reference
Positional accuracy | Conflation tool specific indicators | Provide values for accuracy of source to target (e.g. include RoadMatcher values for Maximum Distance (Hausdorff), Trim Distance, and Nearness)
Completeness | Feature density | Number of features in conflation feature class (e.g. roads) in source, target, and output database
Completeness | Feature density improvement | Increase in feature density from target/reference to output (may be measured in number or percent improvement; can apply to features added (in output but not original source) or to features modified (in source but updated in conflation))
Completeness | Attribute population density | Average percentage of attributes populated for features in conflation feature class (e.g. roads) in source, target, and output database
Completeness | Attribute population change | Change in attribute population from target/reference to output (measured percent change; can also apply to attributes added (in output but not original source) or to attributes modified (in source but updated in conflation))
Completeness | Conflation tool specific indicators | Raw count of segments/features and segments added or adjusted from tool summary reports (e.g. RoadMatcher AutoConflation Summary or Results State data for Integrated, Inconsistent, or Adjusted segments (features))
Logical consistency | Conceptual consistency | Adherence to rules of the conceptual schema
Logical consistency | Domain consistency | Adherence of values to the value domains
Logical consistency | Topological consistency | Correctness of the explicitly encoded topological characteristics of a dataset
Logical consistency | Format consistency | Degree to which data is stored in accordance with the physical structure of the dataset
Thematic accuracy | Quantitative attribute accuracy | Accuracy of quantitative attributes
Thematic accuracy | Non-quantitative attribute correctness | Correctness of non-quantitative attributes
Thematic accuracy | Thematic classification correctness | Comparison of the classes assigned to features or their attributes to a universe of discourse (ground truth or reference dataset) e.g.,