COMP Today and Tomorrow

Common Observations and Measurements Profile
COMP
Today and Tomorrow
Data Management
IMD, CSB
September 2014
COMP Background
The Need for a Common Data Standard
• Incompatible data formats and metadata conventions among programs
• Assessment, comparison and integration of data sources requires excessive effort; often impractical (quoting the DST)
Primary Audience
• Scientists and data scientists
• Can make a major contribution to:
• Open Science (OGAP 2.0) where EC plays a leading role
• Open Data Initiatives
Based on WaterML 2.0 Part 1
“The core aspect of the model is in the correct, precise description of time
series. Interpretation of time series relies on understanding the nature of the
process that generated them. This standard provides the framework under
which time series can be exchanged with appropriate metadata to
allow correct machine interpretation and thus correct use for further
analysis.”
COMP XML Standards Architecture
• WaterML 2.0 Part 1 – Timeseries
• North American Profile of ISO 19115 (ISO/TS 19139)
• Observations and Measurements 2.0 (ISO 19156)
• SWE Common Data Model 2.0
• Geography Markup Language 3.2 (ISO 19136)
TimeSeriesML
• WMO/NOAA and EC want WaterML 2.0 Part 1 rebranded
• IMD is participating in the OGC TimeSeriesML SWG
COMP Features Available Today
XML Data Exchange
A common XML exchange profile for time series data that is
100% compliant with OGC international standards
– no local extensions
COMP Viewer
When you open an online COMP XML file in
your browser, the Viewer tracks down all the
external references and presents you with a
complete picture of the metadata and data
as an HTML report in the official language of
your choice – with outlining for easy
navigation
COMP Data Point Downloads
The Viewer also offers download file options
for the extraction of data values into tabular
formats for consumption by your analytical
software
Future Value Added Tools
• GIS Mapping
• Data Visualization
The Anatomy of COMP
Common Observations and Measurements Profile
A common XML exchange profile for time series data that is
100% compliant with the OGC international standards:
• wml2: WaterML 2.0 Part 1 – Timeseries
• om: Observations & Measurements
• swe: Sensor Web Enablement Common Data Model
• gml: Geography Markup Language
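As a minimal sketch of how an analyst might read time-value pairs from a file built on these standards, the Python below parses a small WaterML 2.0 style fragment with the standard library. The namespace URIs are the ones OGC publishes for the four prefixes listed above; the sample fragment, its values, and the function name `read_points` are assumptions made for illustration and are not taken from a real COMP file.

```python
import xml.etree.ElementTree as ET

# OGC namespace URIs for the prefixes used by the profile.
NS = {
    "wml2": "http://www.opengis.net/waterml/2.0",
    "om": "http://www.opengis.net/om/2.0",
    "swe": "http://www.opengis.net/swe/2.0",
    "gml": "http://www.opengis.net/gml/3.2",
}

# Hypothetical fragment for illustration only (not a real COMP file).
SAMPLE = """\
<om:OM_Observation xmlns:om="http://www.opengis.net/om/2.0"
                   xmlns:wml2="http://www.opengis.net/waterml/2.0"
                   xmlns:gml="http://www.opengis.net/gml/3.2"
                   gml:id="obs.1">
  <om:result>
    <wml2:MeasurementTimeseries gml:id="ts.1">
      <wml2:point>
        <wml2:MeasurementTVP>
          <wml2:time>2014-09-01T00:00:00Z</wml2:time>
          <wml2:value>12.5</wml2:value>
        </wml2:MeasurementTVP>
      </wml2:point>
      <wml2:point>
        <wml2:MeasurementTVP>
          <wml2:time>2014-09-01T01:00:00Z</wml2:time>
          <wml2:value>12.9</wml2:value>
        </wml2:MeasurementTVP>
      </wml2:point>
    </wml2:MeasurementTimeseries>
  </om:result>
</om:OM_Observation>
"""


def read_points(xml_text):
    """Return (time, value) tuples for every time-value pair in the document."""
    root = ET.fromstring(xml_text)
    points = []
    for tvp in root.iterfind(".//wml2:MeasurementTVP", NS):
        time = tvp.findtext("wml2:time", namespaces=NS)
        value = tvp.findtext("wml2:value", namespaces=NS)
        points.append((time, float(value)))
    return points


if __name__ == "__main__":
    for time, value in read_points(SAMPLE):
        print(time, value)
```

Because the sketch looks elements up by namespace URI rather than by prefix, it would keep working if a file declared the same namespaces under different prefixes.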
COMP Logical Data Model
Provides a simple, stable, logical layer used for:
• User interfaces
• Data resource modularization
Creating COMP XML
Two methods:
• Data Input Templates: Excel Input Templates
• Legacy system ETL: FME (Feature Manipulation Engine) with XSLT transformations
COMP components:
• Collection
• Monitoring Site
• Observation
• Data Points
• Contact
• Procedure
• SKOS Taxonomies
Example of COMP Use of SKOS Taxonomies
COMP XML
This XML fragment references a name and
a unit of measure in SKOS taxonomies
SKOS (Simple Knowledge Organization System) Taxonomies
Define these terms in English and French
• EC
• ISO-NAP
• NAtChem (Air Quality)
• Substance
• Unit of Measure
• WaterML2
• Species
• Bio-organism
• Water Quality
• Water Quantity
• Meteorology
• Ice Service
• Wildlife Service
• ?
COMP Viewer
Looks up these SKOS references and resolves them in English and French
en-CA
fr-CA
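The lookup described here can be sketched roughly as follows. This is not the Viewer's actual code (the slides indicate the Viewer is XSLT based); it is a Python illustration of resolving a SKOS concept reference to a preferred label in the requested official language. The concept URI, the labels, and the function name `resolve_label` are hypothetical, while the RDF and SKOS namespace URIs and the `skos:prefLabel` property are the standard W3C ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical taxonomy entry for illustration only.
SAMPLE_TAXONOMY = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/taxonomy/variable#water-temperature">
    <skos:prefLabel xml:lang="en">Water temperature</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">Température de l'eau</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>
"""


def resolve_label(taxonomy_xml, concept_uri, language):
    """Return the preferred label of a concept in the requested language.

    Languages are matched on their primary subtag, so a request for
    "fr-CA" is satisfied by a label tagged "fr". Returns None when the
    concept or the language is not found.
    """
    wanted = language.split("-")[0].lower()
    root = ET.fromstring(taxonomy_xml)
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        if concept.get(f"{{{RDF}}}about") != concept_uri:
            continue
        for label in concept.iter(f"{{{SKOS}}}prefLabel"):
            lang = label.get(XML_LANG, "")
            if lang.split("-")[0].lower() == wanted:
                return label.text
    return None


if __name__ == "__main__":
    uri = "http://example.org/taxonomy/variable#water-temperature"
    for language in ("en-CA", "fr-CA"):
        print(language, resolve_label(SAMPLE_TAXONOMY, uri, language))
```

Keeping labels in the taxonomy rather than in each data file means a term is translated once and every file that references it picks up both languages.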
COMP in Action
COMP XML File
http://www.ec.gc.ca/data_donnees/comp
When the user clicks on the file, it asks the
browser to render the XML using the COMP
Viewer instead of its own default XSLT script
(See 2nd line of syntax)
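The mechanism referred to is most likely the standard `xml-stylesheet` processing instruction, which normally sits on the second line of an XML file, directly after the XML declaration. As a hedged illustration, the Python sketch below reads that instruction and reports which stylesheet a file asks the browser to use. The stylesheet URL, the root element name, and the function name `stylesheet_href` are placeholders, not the real COMP Viewer location or schema.

```python
import re
from xml.dom import minidom

# Hypothetical file header; the href is a placeholder URL.
SAMPLE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="http://example.org/comp/viewer.xsl"?>
<Collection/>
"""


def stylesheet_href(xml_text):
    """Return the href of the first xml-stylesheet instruction, or None."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data is a list of pseudo-attributes.
            match = re.search(r'href="([^"]*)"', node.data)
            if match:
                return match.group(1)
    return None


if __name__ == "__main__":
    print(stylesheet_href(SAMPLE))
```

A check like this could be used to confirm that published files point at the intended Viewer stylesheet before they are placed online.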
Diagram components: COMP Viewer, COMP Files, XSLT, SKOS Taxonomies, Download Service, Data Point Extraction Scripts
1. Browser uses COMP XSLT
2. Selecting a download option invokes the Download Service
3.
4. The Download Service pipes back the output to the browser, invoking its standard file download facilities
7 On-the-fly Download Options
Domain Variables (Where, When?)
Fix a data point within the space and time extents of the time series, e.g. date, time, latitude, longitude, altitude, flight number, pass number. They make up the key of a data point.
Other vocabularies: this fits within the concept of dimensions in data warehousing.
Range Variables (What?)
Are the observed properties recorded in a time series, such as the values for wind speed, temperature, pollution levels, specimen size, etc.
Other vocabularies: this approximates the concept of facts in data warehousing.
Future Options?
• JSON
• KML
• OGC XML
• WMO/NOAA XML
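The split between domain and range variables can be illustrated with a short Python sketch that writes data points to a tab-separated table, using the domain variables as the key of each row. The variable names, the sample values, and the column layout are assumptions for illustration; this is not the layout of any of the COMP download formats.

```python
import csv
import io

# Domain variables (where, when) form the key of a data point.
DOMAIN = ["date", "latitude", "longitude"]
# Range variables (what) are the observed properties.
RANGE = ["wind_speed", "temperature"]

# Hypothetical data points for illustration only.
POINTS = [
    {"date": "2014-09-01", "latitude": 45.4, "longitude": -75.7,
     "wind_speed": 12.0, "temperature": 18.5},
    {"date": "2014-09-02", "latitude": 45.4, "longitude": -75.7,
     "wind_speed": 9.5, "temperature": 17.0},
]


def to_tsv(points, domain, range_):
    """Write points as TSV text with domain columns first, then range columns.

    Raises ValueError if two points share the same domain key, since the
    domain variables are meant to identify a data point uniquely.
    """
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(domain + range_)
    for point in points:
        key = tuple(point[name] for name in domain)
        if key in seen:
            raise ValueError(f"duplicate data point key: {key}")
        seen.add(key)
        writer.writerow([point[name] for name in domain + range_])
    return out.getvalue()


if __name__ == "__main__":
    print(to_tsv(POINTS, DOMAIN, RANGE))
```

Treating the domain variables as a key mirrors the data warehousing analogy: the key columns behave like dimensions and the remaining columns like facts.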
COMP Demo
• COMP Viewer
• Data Point Downloads
COMP Next Steps
Data Warehouse
XML-Relational Hybrid: indexed SQL tables pointing to XML CLOBs
Query Dimensions
• Temporal extent
• Spatial extent
• Sites
• Variables
• Techniques
API
• WFS Service
• Query
• Response: COMP XML payload
Query-specific Collections of COMP components are assembled on-the-fly for the API
Audience
• EC
• GOC
• International
Built-in Functionality
• COMP Viewer
• Data Point Downloads
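One way to picture the XML-relational hybrid is a table whose indexed columns carry the query dimensions while a text column holds the XML component itself. The Python sketch below uses SQLite purely as a stand-in; the table name, column names, element names, and the functions `build_store`, `add_observation` and `query_collection` are all assumptions for illustration, and the planned warehouse may be designed quite differently.

```python
import sqlite3


def build_store():
    """Create an in-memory store: indexed query columns plus an XML text column."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """
        CREATE TABLE observation (
            id INTEGER PRIMARY KEY,
            site TEXT NOT NULL,
            variable TEXT NOT NULL,
            start_time TEXT NOT NULL,  -- ISO 8601, so text comparison orders by time
            end_time TEXT NOT NULL,
            comp_xml TEXT NOT NULL     -- stands in for the XML CLOB
        )
        """
    )
    con.execute("CREATE INDEX idx_obs_site_var ON observation (site, variable)")
    con.execute("CREATE INDEX idx_obs_time ON observation (start_time, end_time)")
    return con


def add_observation(con, site, variable, start_time, end_time, comp_xml):
    con.execute(
        "INSERT INTO observation (site, variable, start_time, end_time, comp_xml) "
        "VALUES (?, ?, ?, ?, ?)",
        (site, variable, start_time, end_time, comp_xml),
    )


def query_collection(con, variable, start_time, end_time):
    """Assemble a collection of the components that overlap the requested period."""
    rows = con.execute(
        "SELECT comp_xml FROM observation "
        "WHERE variable = ? AND start_time <= ? AND end_time >= ? "
        "ORDER BY id",
        (variable, end_time, start_time),
    ).fetchall()
    return "<Collection>" + "".join(row[0] for row in rows) + "</Collection>"


if __name__ == "__main__":
    con = build_store()
    add_observation(con, "site-1", "temperature",
                    "2014-08-15T00:00:00Z", "2014-09-15T00:00:00Z",
                    '<Observation id="a"/>')
    add_observation(con, "site-1", "temperature",
                    "2014-10-01T00:00:00Z", "2014-10-31T00:00:00Z",
                    '<Observation id="b"/>')
    print(query_collection(con, "temperature",
                           "2014-09-01T00:00:00Z", "2014-09-30T23:59:59Z"))
```

The relational columns answer the "which components match" question quickly, and the stored XML is returned untouched, so the payload keeps whatever functionality the files already carry.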
What users want - What COMP offers
Easy access to high quality metadata
• that allows them to evaluate the quality of the data and the techniques used to acquire and process it
• The COMP Viewer gives them a comprehensive view of all the relevant metadata in an easily navigated format in their preferred official language
Easy access to data points for analysis
• In formats compatible with their analytical software
• The COMP Viewer provides instant access to on-demand generation of data point text files in a variety of formats
• Additional formats can be supported on request, both text and XML
Easy access to functionality wherever the dataset is found
• On a Web portal, through a Web service API, in Web accessible folders, or on FTP servers
• COMP XML files come with all this functionality built-in no matter where they are stored online
Backup Slides
Just in case!
COMP Logical Data Model
[Diagram: entity-relationship model linking Contact, Collection, Procedure, Monitoring Site (with hierarchy and related links), Observation, Time Series, Variable (Domain, Range), Data Point and Qualifier, annotated with cardinalities]
COMP Input Templates
[Diagram: the Taxonomy, Contact, Collection, Monitoring Site, Procedure and Data Point templates generate the corresponding resources (SKOS Taxonomies as .RDF/.XML; Contacts, Collection, Monitoring Sites, Procedures and Data Points as .XML), and the generated resources reference one another]
Define once and then reference wherever needed
Data Points in Composite CSV Format
Data Points in Composite TSV Format
Data Points in Simple TSV Format
Data Points in Coverage TLV Format