Scenario_Provenance_indicators_conflation


Conflation scenarios, provenance queries and quality indicators from conflation processes

Scenario comments: (based on twiki CciScenario)

1.

2.

3.

4.

We may need to tailor the scenario depending on how successful we are in incorporating the conflation use cases.

At present we should be able to support the linear features use case. While the current emphasis is on getting use case 1.1 to work with RoadMatcher, we also need to look at the ability to expand to 2.1, 1.3, 1.2, and 1.9 (and perhaps a variant of 1.3 using the general use case 2 approach).

If other tools that can handle conflation WPS use cases for points and polygons become available, those can be proposed as additional use cases.

Datasets for linear features in USA (TNM, TDS, OSM) and Haiti (OSM, HEDP) should be included. Datasets may be expanded if points and polygons are added.

The scenario for invoking the conflation WPS should focus on updates to roads and airfields in both the USA and Haiti, for example to reach key personnel or to route sensitive cargo.

Provenance-related queries:

1. Automatically generated:
   a. When the conflated dataset is provided back to the user, the provenance query should provide a summary of the conflation process (when the dataset changed; data sources involved (conflation source and target)).
   b. When the user accesses a feature, a conflation summary should be provided if information has changed from the initial dataset (when changed; data source used for the change (conflation source)).
   c. Provenance flags: TBD (may identify based on discussion of quality indicators).

2. User-generated:
   a. When reviewing dataset:
      i. Contrast features from different sources (what came from which source: show origin from conflation source v. conflation target) (using table or map)
      ii. Show new features (not in original target dataset but added from source during conflation)
      iii. Show modified features (in original target dataset but modified from source during conflation)
   b. When reviewing features:
      i. Show feature with source and lineage and change summary (table or map)
      ii. Show attributes with source and lineage and change summary
      iii. Contrast prior feature and attribute with revised
      iv. Show qualitative and geopositional changes
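The "show new features" and "show modified features" queries above can be illustrated with a small sketch. It is not tied to any particular conflation tool: it assumes, purely for illustration, that features in the original target dataset and in the conflated output share stable identifiers and are held as plain dictionaries. The names `classify_features` and `change_summary` and the sample road data are hypothetical.

```python
def classify_features(target, output):
    """Label each output feature as 'new', 'modified', or 'unchanged'.

    Assumption: both datasets map a stable feature id to a dict with
    'geometry' and 'attributes' keys.
    """
    status = {}
    for fid, feature in output.items():
        if fid not in target:
            status[fid] = "new"          # added from the conflation source
        elif feature != target[fid]:
            status[fid] = "modified"     # in the target, changed by conflation
        else:
            status[fid] = "unchanged"
    return status


def change_summary(before, after):
    """Contrast the prior and revised versions of one feature."""
    old_attrs, new_attrs = before["attributes"], after["attributes"]
    changed = {
        name: (old_attrs.get(name), new_attrs.get(name))
        for name in sorted(set(old_attrs) | set(new_attrs))
        if old_attrs.get(name) != new_attrs.get(name)
    }
    return {
        "geometry_changed": before["geometry"] != after["geometry"],
        "attributes_changed": changed,  # attribute name -> (old value, new value)
    }


# Hypothetical sample data: two target roads, three roads after conflation.
target = {
    "r1": {"geometry": [(0, 0), (1, 0)], "attributes": {"name": "Main St", "lanes": None}},
    "r2": {"geometry": [(1, 0), (1, 1)], "attributes": {"name": "Oak Ave", "lanes": 2}},
}
output = {
    "r1": {"geometry": [(0, 0), (1, 0)], "attributes": {"name": "Main St", "lanes": 2}},
    "r2": {"geometry": [(1, 0), (1, 1)], "attributes": {"name": "Oak Ave", "lanes": 2}},
    "r3": {"geometry": [(1, 1), (2, 1)], "attributes": {"name": "Elm Rd", "lanes": 1}},
}

status = classify_features(target, output)
summary = change_summary(target["r1"], output["r1"])
```

In practice a conflation tool may not preserve identifiers between target and output, in which case the matching step would have to come from the tool's own match results rather than from an id lookup.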

Quality indicators

Derived from

1. Data Quality Analysis in the GEOSS Clearinghouse; Paula Díaz, Joan Masó, Eva Sevillano, Miquel Ninyerola, Alaitz Zabala, Ivette Serral, and Xavier Pons; article under review for the International Journal of Spatial Data Infrastructures Research, submitted 2012-03-07

2. Handbook on Data Quality Assessment Methods and Tools; Mats Bergdahl, Manfred Ehling, Eva Elvers, Erika Földesi, Thomas Körner, Andrea Kron, Peter Lohauß, Kornelia Mag, Vera Morais, Anja Nimmergut, Hans Viggo Sæbø, Ulrike Timm, Maria João Zilhão; Manfred Ehling and Thomas Körner (eds); Wiesbaden, 2007 http://epp.eurostat.ec.europa.eu/portal/page/portal/quality/documents/HANDBOOK%20ON%20DATA%20QUALITY%20ASSESSMENT%20METHODS%20AND%20TOOLS%20%20I.pdf

| Quality component | Quality indicator | Definition of measurement |
| Relevance (may not be available for OWS-(9)) | User satisfaction index | Scaled assessment of input quality (e.g. 5-star scale, # likes); may be applicable more to VGI than standard datasets |
| Relevance | Contextual or semantic fitness | Calculated similarity measure of match of two input features or attributes |
| Positional accuracy | Absolute or external positional accuracy | Closeness of coordinate values to ground truth (can be applied to features in source, target, and output datasets) |
| Positional accuracy | Change in accuracy | Absolute (e.g. meters) or percent change in accuracy of output compared to target/reference |
| Positional accuracy | Conflation tool specific indicators | Provide values for accuracy of source to target (e.g. include RoadMatcher values for Maximum Distance (Hausdorff), Trim Distance, and Nearness) |
| Completeness | Feature density | Number of features in conflation feature class (e.g. roads) in source, target, and output database |
| Completeness | Feature density improvement | Increase in feature density from target/reference to output (may be measured in number or percent improvement; can apply to features added (in output but not original source) or to features modified (in source but updated in conflation)) |
| Completeness | Attribute population density | Average percentage of attributes populated for features in conflation feature class (e.g. roads) in source, target, and output database |
| Completeness | Attribute population change | Change in attribute population from target/reference to output (measured percent change; can also apply to attributes added (in output but not original source) or to attributes modified (in source but updated in conflation)) |
| Completeness | Conflation tool specific indicators | Raw count of segments/features and segments added or adjusted from tool summary reports (e.g. RoadMatcher AutoConflation Summary or Results State data for Integrated, Inconsistent, or Adjusted segments (features)) |
| Logical consistency | Conceptual consistency | Adherence to rules of the conceptual schema |
| Logical consistency | Domain consistency | Adherence of values to the value domains |
| Logical consistency | Topological consistency | Correctness of the explicitly encoded topological characteristics of a dataset |
| Logical consistency | Format consistency | Degree to which data is stored in accordance with the physical structure of the dataset |
| Thematic accuracy | Quantitative attribute accuracy | Accuracy of quantitative attributes |
| Thematic accuracy | Non-quantitative attribute correctness | Correctness of non-quantitative attributes |
| Thematic accuracy | Thematic classification correctness | Comparison of the classes assigned to features or their attributes to a universe of discourse (ground truth or reference dataset) e.g., |
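A minimal sketch of how a few of the indicators above might be computed, assuming features are available as plain dictionaries of attribute values and lines as lists of (x, y) vertices. All function names and sample values are hypothetical. The Hausdorff function is a discrete, vertex-only approximation and is not claimed to reproduce RoadMatcher's Maximum Distance calculation.

```python
import math


def feature_density_improvement(target_count, output_count):
    """Return (absolute, percent) increase in feature count from target to output."""
    added = output_count - target_count
    percent = 100.0 * added / target_count if target_count else float("nan")
    return added, percent


def attribute_population_density(features, attribute_names):
    """Average percentage of the listed attributes that are populated per feature.

    Assumption: an attribute counts as unpopulated when it is missing,
    None, or an empty string.
    """
    if not features or not attribute_names:
        return 0.0
    per_feature = [
        100.0
        * sum(f.get(name) not in (None, "") for name in attribute_names)
        / len(attribute_names)
        for f in features
    ]
    return sum(per_feature) / len(per_feature)


def hausdorff_distance(line_a, line_b):
    """Discrete Hausdorff distance between two vertex lists (same units as input)."""
    def directed(src, dst):
        # Farthest that any vertex of src lies from its nearest vertex in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(line_a, line_b), directed(line_b, line_a))


# Hypothetical road attributes before (target) and after (output) conflation.
names = ["name", "surface"]
target_roads = [
    {"name": "Main St", "surface": None},
    {"name": None, "surface": None},
]
output_roads = [
    {"name": "Main St", "surface": "paved"},
    {"name": "Oak Ave", "surface": None},
    {"name": "Elm Rd", "surface": "gravel"},
]

density_gain = feature_density_improvement(len(target_roads), len(output_roads))
target_density = attribute_population_density(target_roads, names)
output_density = attribute_population_density(output_roads, names)
population_change = output_density - target_density  # percentage points

# Two roughly parallel segments, coordinates in meters.
max_distance = hausdorff_distance([(0, 0), (10, 0)], [(0, 3), (10, 4)])
```

Because the Hausdorff sketch compares vertices only, it can overestimate the separation of lines whose vertices are sparse; densifying the lines first, or using a geometry library, would give a closer estimate.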
