CSML

advertisement
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Data integration with the
Climate Science Modelling Language
Andrew Woolf1, Bryan Lawrence2, Roy Lowry3,
Kerstin Kleese van Dam1, Ray Cramer3, Marta
Gutierrez2, Siva Kondapalli3, Susan Latham2, Dominic
Lowe2, Kevin O’Neill1, Ag Stephens2
1CCLRC
e-Science Centre
2British Atmospheric Data Centre
3British Oceanographic Data Centre
1
Outline
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Background
Standards – a framework for interoperability
Climate Science Modelling Language (CSML)
Using CSML
2
Background
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Data integration requirements:
• scalability across providers
• warehousing not an option
• enhance access and use, ‘outwards-facing’ (e.g. impacts
community, policymakers)
• storage heterogeneity
 Semantics as integration ‘key’
• common language across providers (and users)
• supports wrapper/mediator architecture
3
Standards
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Emerging ISO standards
• TC211 – around 40 standards for geographic information
• Cover activity spectrum: discovery  access  use
…in a defined logical
structure…
…and described by
metadata.
…delivered through
services…
ISO 19101
Domain
Reference
Model
A geospatial dataset…
…consists of
features and
related objects…
4
Standards
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Geographic ‘features’
•
•
•
“abstraction of real world
phenomena” [ISO 19101]
Type or instance
Encapsulate important
semantics in universe of
discourse
Application schema
•
•
•
Defines semantic content and
logical structure of datasets
ISO standards provide toolkit:
• spatial/temporal
referencing
• geometry (1-, 2-, 3-D)
• topology
• dictionaries (phenomena,
units, etc.)
GML – canonical encoding
[from ISO 19109 “Geographic information –
Rules for Application Schema”]
5
Standards
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
The importance of governance
• Information community defined by shared semantics
• Need community process to manage those semantics (definitions,
models, vocabularies, taxonomies, etc.)
• e.g. CF conventions for netCDF files
• Role of Feature Type Catalogues [ISO 19110] and registers [ISO 19135]
Governance as driver for granularity
• Remit / interest determines appropriate granularity
• e.g. IOC, IHO, WMO
<measurement type=“Radiosonde”
measurand=“temperature”/>
abstract
<Sonde parameter=“temperature”/>
generic
<temperatureProfile/>
highly specialised
feature types spectrum
6
Aims:
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
• provide semantic integration mechanism for NDG data
• explore new standards-based interoperability framework
• emphasise content, not container
Design principles:
• offload semantics onto parameter type (‘phenomenon’, observable,
measurand)
• e.g. wind-profiler, balloon temperature sounding
• offload semantics onto CRS
• e.g. scanning radar, sounding radar
• ‘sensible plotting’ as discriminant
• ‘in-principle’ unsupervised portrayal
• explicitly aim for small number of weakly-typed features (in accordance
with governance principle and NDG remit)
7
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
CSML feature types
• defined on basis of geometric and topologic structure
CSML feature type
Description
Examples
TrajectoryFeature
Discrete path in time and space of a platform or instrument.
PointFeature
Single point measurement.
ProfileFeature
Single ‘profile’ of some parameter along a directed line in space.
ship’s cruise track, aircraft’s
flight path
raingauge measurement
wind sounding, XBT, CTD,
radiosonde
GridFeature
Single time-snapshot of a gridded field.
gridded analysis field
PointSeriesFeature
Series of single datum measurements.
tidegauge, rainfall timeseries
ProfileSeriesFeature
Series of profile-type measurements.
vertical or scanning radar,
shipborne ADCP, thermistor
chain timeseries
GridSeriesFeature
Timeseries of gridded parameter fields.
numerical weather prediction
model, ocean general
circulation model
8
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
ProfileSeriesFeature
CSML feature types
• examples...
ProfileFeature
GridFeature
9
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Application schema
•
•
logical structure and semantic content of NDG ‘Dataset’
Based on GML 3.1
«Type»
GML::AbstractGMLType
«Type»
GML::FeatureCollection
*
«Type»
AbstractArrayDescriptor
*
*
*
*
«Type»
Dataset
*
*
*
«Type»
UnitDefinitions
*
«Type»
ReferenceSystemDefinitions
*
«Type»
PhenomenonDefinitions
10
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Integration approaches: wrapper/mediator
dataInterface
dataInterface
mediatorService
mediatorService
wrapper
wrapper
BODC
BADC
11
Climate Science
Modelling Language
Numerical array descriptors
•
•
«Type»
GML::AbstractGMLType
+id
+metaDataProperty
+description
+name
provides ‘wrapper’ architecture
for legacy data files
‘Connected’ to data model
numerical content through
‘xlink:href’
«Type»
AbstractArrayDescriptor
+arraySize[1]
+uom[0..1]
+numericType[0..1]
+numericTransform[0..1]
+regExpTransform[0..1]
Three subtypes:
•
•
•
InlineArray
ArrayGenerator
FileExtract (NASAAmes,
NetCDF, GRIB)
Composite design pattern for
aggregation
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
*
+component
1
«Type»
AggregatedArray
+aggType[1]
+aggIndex[1]
«Type»
InlineArray
+values[*]
«Type»
ArrayGenerator
+expression[1]
«Type»
NASAAmesExtract
+variableName[1]
+index[0..1]
«Type»
AbstractFileExtract
+fileName[1]
«Type»
NetCDFExtract
+variableName[1]
«Type»
GRIBExtract
+parameterCode[1]
+recordNumber[0..1]
+fileOffset[0..1]
12
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Inline array
<NDGInlineArray>
<arraySize>5 2</arraySize>
<uom>udunits.xml#degreeC</uom>
<numericType>float</numericType>
<regExpTransform>s/10/9/ge</regExpTransform>
<numericTransform>+5</numericTransform>
<values>1 2 3 4 5 6 7 8 9 10</values>
</NDGInlineArray>
Array generator
<NDGArrayGenerator>
<arraySize>10001</arraySize>
<uom>udunits.xml#minute</uom>
<numericType>float</numericType>
<expression>0:5:50000</expression>
</NDGArrayGenerator>
13
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
File extract
<NDGNASAAmesExtract>
<arraySize>526</arraySize>
<numericType>double</numericType>
<fileName>/data/BADC/macehead/mh960606.cf1</fileName>
<variableName>CFC-12</variableName>
</NDGNASAAmesExtract>
<NDGNetCDFExtract gml:id="feat04azimuth">
<arraySize>10000</arraySize>
<fileName>radar_data.nc</fileName>
<variableName>az</variableName>
</NDGNetCDFExtract>
<NDGGRIBExtract>
<arraySize>320 160</arraySize>
<numericType>double</numericType>
<fileName>/e40/ggas1992010100rsn.grb</fileName>
<parameterCode>203</parameterCode>
<recordNumber>5</ recordNumber>
<fileOffset>289412</fileOffset>
</NDGGRIBExtract>
14
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Aggregated array
• arrays may be aggregated along an ‘existing’ or ‘new’ dimension
<NDGAggregatedArray gml:id="feat05cruisetrack">
<arraySize>2 50</arraySize>
<aggType>new</aggType>
<aggIndex>1</aggIndex>
<component>
<NDGNetCDFExtract>
<arraySize>50</arraySize>
<fileName>cruisetrack.nc</fileName>
<variableName>alat</variableName>
</NDGNetCDFExtract>
</component>
<component>
<NDGNetCDFExtract>
<arraySize>50</arraySize>
<fileName>cruisetrack.nc</fileName>
<variableName>alon</variableName>
</NDGNetCDFExtract>
</component>
</NDGAggregatedArray>
15
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Climate Science
Modelling Language
NetCDF
WCS
WFS
OPeNDAP
....
instantiateNetCDF(
DatasetID,
FeatureID)
<CSML>
<CSML>
<CSML>
Provides semantic
abstraction layer
<CSML>
(SAX) demarshalling
CSMLAbstractFeature
+writeNetCDF()
AbstractFileExtract
+read()
filestore
16
Status:
•
•
•
•
Climate Science
Modelling Language
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Initial feature types defined
First draft application schema complete
Trial software tooling being coded (parser, netCDF instantiation)
Initial deployment trial across BODC, BADC datasets
Future:
•
•
•
•
•
Separate out wrapper implementation (array descriptors)
Disallow ‘internal’ dictionaries
More strongly-typed features?
Follow (and pursue!) GML evolution, enhance compliance
Expand tooling
Related work
• WMO, IOC, IHO
• MarineXML
• MOTIIVE (INSPIRE)
17
Using CSML
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
EU project – MarineXML
<gml:definitionMember>
<om:Phenomenon gml:id="taxon">
<gml:description>The taxon name</gml:description>
<gml:name codeSpace="http://www.vliz.be">taxon</gml:name>
</om:Phenomenon>
</gml:definitionMember>
</NDGPhenomenonDefinitions>
<!--===================================================================-->
<gml:FeatureCollection>
<!-- ============================================================== -->
<gml:featureMember>
<NDGPointFeature gml:id="ICES_100">
<NDGPointDomain>
<domainReference>
<NDGPosition srsName="urn:EPSG:geographicCRS:4979" axisLabels="Lat Long" uomLabels="degree degree">
<location>55.25 6.5</location>
</NDGPosition>
</domainReference>
</NDGPointDomain>
<gml:rangeSet>
<gml:DataBlock>
<gml:rangeParameters>
<gml:CompositeValue>
<gml:valueComponents>
<gml:measure uom="#tn"/>
<gml:measure uom="#amount"/>
<gml:measure uom="#gsm"/>
</gml:valueComponents>
</gml:CompositeValue>
</gml:rangeParameters>
<gml:tupleList>
'ANTHOZOA',63.1,missing
'Scoloplos armiger',66.1,missing
'Spio filicornis',10,missing
'Spiophanes bombyx',60.3,missing
'Capitellidae',131.8,missing
'Pholoe',10,missing
'Owenia fusiformis',23.4,missing
'Hypereteone lactea',6.8,missing
'Anaitides groenlandica',13.2,missing
'Anaitides mucosa',6.8,missing
“MarineXML is an initiative of the IOC/IODE of
“... there
is a momentum
such
UNESCO
to improve
marinefrom
dataorganisations
exchange within
as“The
IHOcommunity.
and WMO
adopt
consistent
the marine
The
European
Commission
NDG
formatto
proved
a robust
recipient for
approaches
for the
vocabulary
oftotheir
data along
has provided
funding
contribution
initiative
the dataa from
each
community.
Itthis
produced
the
implementation
of ISOtoStandards
as part
ofreference
its 5th Framework
Programme
economical
files with few
redundant
elements,
prescribed
by thethe
[Open
undertake
a ‘pre-standardisation’
task
of
striking
about
right Geospatial
balance
between
weak
Consortium]...”
identifying
the approaches
and strong
typing.” the marine community
should adopt regarding XML technology to achieve
improved data exchange.”
18
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Using CSML
Managing semantics
conceptual model
Class1
-End1
-End2
«datatype»
DataType1
1
*
Class2
UGAS
GML dataset
parser
<gml:featureMember>
<NDGPointFeature gml:id="ICES_100">
<NDGPointDomain>
<domainReference>
<NDGPosition srsName="urn:EPSG:geographicCRS:4979"
axisLabels="Lat Long" uomLabels="degree degree">
<location>55.25 6.5</location>
</NDGPosition>
</domainReference>
</NDGPointDomain>
<gml:rangeSet>
<gml:DataBlock>
<gml:rangeParameters>
<gml:CompositeValue>
<gml:valueComponents>
<gml:measure uom="#tn"/>
<gml:measure uom="#amount"/>
<gml:measure uom="#gsm"/>
</gml:valueComponents>
</gml:CompositeValue>
</gml:rangeParameters>
<gml:tupleList>
'ANTHOZOA',63.1,missing
'Scoloplos armiger',66.1,missing
'Spio filicornis',10,missing
'Spiophanes bombyx',60.3,missing
'Capitellidae',131.8,missing
GML app schema
19
NESC workshop
Grid and Geospatial Standards
7-Sep-2005
Using CSML
«interface»
SAX2::ContentHandler
+setDocumentLocator()
+startDocument()
+endDocument()
+startElement()
+endElement()
+characters()
+processingInstruction()
+ignorableWhitespace()
+startPrefixMapping()
+endPrefixMapping()
+skippedEntity()
«interface»
SAX2::DTDHandler
+notationDecl()
+unparsedEntity()
«interface»
SAX2::ErrorHandler
+error()
+fatalError()
+warning()
SAX2::DefaultHandler
«interface»
SAX2::EntityResolver
+resolveEntity()
Stack of Builders (for UML
meta-model)
•
•
current class, object, attribute
specialised for particular
UMLXML mapping
Builder receives:
•
•
filtered SAX events
built object
Builder returns:
•
•
•
ObjectParser
-process()
builderStack
*
ObjectBuilder
-currentClass
-currentObject
-currentAttribute
+xmlElement() : BuilderUnion
+xmlCharacters() : BuilderUnion
+xmlAttribute() : BuilderUnion
+processObject() : BuilderUnion
ObjectPropertyBuilder
«union»
BuilderUnion
AbstractClass
AbstractObject
AbstractBuilder
ISO19118Builder
built object
new object class
new Builder (for inheritance
through substitutionGroups)
20
Download