Datatypes for OGSA

Dr Martin Westhead
Principal Consultant, EPCC
Telephone: +44 131 650 5958
Fax: +44 131 650 6555
Email: M.Westhead@epcc.ed.ac.uk

Overview
Uses of XML
Basic types
Complex types
Binary File Description
Discussion – feedback.

Uses of XML
Two distinct areas:
1. Meta data – input values, algorithm choices etc.
2. Real datasets – large scale datasets
More interest in (1) – natural use of XML
Some interest in (2) but:
  – Some issues with basic types (also a problem for 1)
  – Modifying apps to read/write XML hassle
  – XML layout may be inconvenient (e.g. arrays)
  – Why bother?
Proposed work
  – Fixes for basic types
  – XML schema for describing large binary files

Basic Types I
“XML Schema Part 2: Datatypes”
  – http://www.w3.org/TR/xmlschema-2/
Floating point:
  – Decimal – arbitrary precision decimal
  – Float – IEEE 754 (32 bit)
  – Double – IEEE 754 (64 bit)
  – Not defined: Quadruple ?? (128 bit)
Integers
  – long, int, short, byte
  – nonPositiveInteger, negativeInteger, nonNegativeInteger, positiveInteger
  – unsignedLong, unsignedInt, unsignedShort, unsignedByte

Basic Types II
Complex numbers
  – Proposed W3C standard:
    • http://www.w3.org/2001/03/XMLSchema/TypeLibrary-nn-math.xsd
    • Uses decimals only
    • Need floats/doubles
  – Alternative standard:
    • http://www.posc.org/ebiz/pefxml/bdoc/dt_complex.html
Proposal
  – Aim to contribute to W3C complex standard

Complex Types
Arrays
  – W3C proposal
    • http://www.w3.org/2001/03/XMLSchema/TypeLibrary-nn-array.xsd
    • Multidimensional array as a tree of lists (vectors)
    • Polymorphic
  – Sparse array supported in SOAP
    • http://www.w3.org/TR/SOAP/ (section 5.4.2.2)
    • Many alternative representations possible
What else?
  – Sets, bags, graphs, meshes?

Binary File Description
Part of metadata
  – Canonical description of output
  – Enables web services for:
    • Conversion
    • Returning part of data (e.g. slice of array)
Different levels of description
  – Accessing numbers
    • Big/little endian
    • Precision
    • Offset
    • Record separators…etc
  – Higher level structure
    • Array dimensionality/size
    • Variable names

Discussion
Feedback?
What do you/your users want to use XML for?
Value of standard binary description?
NeSC registry
  – XML schema references (Yahoo!)
  – XML schema mirrors
  – UDDI registry of eScience Grid Services
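
Illustration for the Binary File Description slide: a minimal sketch of how an XML description of a binary file could let a service return part of the data (here, one row of an array) without reading the whole file. The element and attribute names (`binaryFile`, `array`, `byteOrder`, `offset`, `dim`) are hypothetical and not taken from any published schema, and the row-major layout is an assumption. Only the ideas of endianness, precision, offset, dimensionality and variable name come from the slide.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical descriptor: names are illustrative only, not a real schema.
DESCRIPTOR = """
<binaryFile>
  <array name="temperature" type="double" byteOrder="big" offset="16">
    <dim size="3"/>
    <dim size="4"/>
  </array>
</binaryFile>
"""

# XML Schema floating point names mapped to (struct code, width in bytes).
TYPE_CODES = {"float": ("f", 4), "double": ("d", 8)}
# Endianness names mapped to struct byte-order prefixes.
ORDER_CODES = {"big": ">", "little": "<"}


def parse_descriptor(xml_text):
    """Extract the low-level and high-level description of the first array."""
    node = ET.fromstring(xml_text.strip()).find("array")
    code, width = TYPE_CODES[node.get("type")]
    return {
        "name": node.get("name"),
        "order": ORDER_CODES[node.get("byteOrder")],
        "code": code,
        "width": width,
        "offset": int(node.get("offset")),
        "shape": [int(d.get("size")) for d in node.findall("dim")],
    }


def read_row(data, desc, row):
    """Return one row of a 2D array, assuming row-major storage."""
    n_rows, n_cols = desc["shape"]
    if not 0 <= row < n_rows:
        raise IndexError("row out of range")
    # Skip the header (offset), then skip the rows before the requested one.
    start = desc["offset"] + row * n_cols * desc["width"]
    fmt = f"{desc['order']}{n_cols}{desc['code']}"
    return list(struct.unpack_from(fmt, data, start))


# Sample data: a 16-byte header followed by 12 big-endian doubles 0.0 .. 11.0.
data = b"\x00" * 16 + struct.pack(">12d", *[float(i) for i in range(12)])
desc = parse_descriptor(DESCRIPTOR)
print(desc["name"], read_row(data, desc, 1))  # temperature [4.0, 5.0, 6.0, 7.0]
```

The same descriptor could drive a conversion service: changing `byteOrder` or `type` in the description is enough to re-read the bytes differently, with no change to the application that wrote them.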