VOTable: an International Virtual Observatory data exchange format

Roy Williams, François Ochsenbein, Clive Davenhall, Daniel Durand, Pierre Fernique, David Giaretta, Robert Hanisch, Tom McGlynn, Alex Szalay, Andreas Wicenec

XML: Structured Information

<From>Antonio Stadivarius</From>
<To>Domenico Scarlatti</To>
<Date>
  <Day>13</Day>
  <Month>4</Month>
  <Year>1723</Year>
</Date>
<Body>
  Io bisogno una appartamento acoglienti a Cremona …
</Body>

• Separation of structure from presentation
  – 4/13/23
  – April 13, 1723
  – 13.iv.1723
• The computer can read the document: "Find all memos from April 1723"

VOTable

• Full metadata representation
• Hierarchy of RESOURCEs containing PARAMs and TABLEs
• UCD (unified content descriptor)
  – a has unit meter
  – a has UCD ORBIT_SIZE_SMAJ (semi-major axis of the orbit)
• Can reference remote and/or binary streams
• Table can be
  – Pure XML
  – "Simple Binary"
  – FITS Binary Table

VOTable Parentage

• Astrores
  – XML format for tables
  – Developed at CDS Strasbourg
  – Presented at ADASS 1999
  – Vizier implementation
• XSIL
  – XML format for Tables and Arrays
  – Developed at LIGO Caltech 2000
  – Extensible through Type-Class dynamic loading
  – Java parsing, browsing, editing
  – Matlab interface

Sample VOTable

<?xml version="1.0"?>
<!DOCTYPE VOTABLE SYSTEM "http://us-vo.org/xml/VOTable.dtd">
<VOTABLE version="1.0">
  <DEFINITIONS>
    <COOSYS ID="myJ2000" equinox="2000." epoch="2000." system="eq_FK5"/>
  </DEFINITIONS>
  <RESOURCE>
    <PARAM name="Observer" datatype="char" arraysize="*" value="William Herschel">
      <DESCRIPTION>This parameter is designed to store the observer's name</DESCRIPTION>
    </PARAM>
    <TABLE name="Stars">
      <DESCRIPTION>Some bright stars</DESCRIPTION>
      <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
      <FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
      <FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg" datatype="float" precision="F3" width="7"/>
      <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
            <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
          </TR>
          <TR>
            <TD>Vega</TD><TD>279.234</TD>
            <TD>38.782</TD><TD>8 7 8 6 8 6</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>

The DATA element can instead reference a FITS file:

<DATA>
  <FITS>
    <STREAM href="ftp://server.com/mydata.fits"
            expires="2002-02-22"
            actuate="onRequest"/>
  </FITS>
</DATA>

Table Cell

• A cell can be: scalar, arrays, variable length arrays, etc
• Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex

VOTable is Flexy

• eg Table of images
  – UCD="JPEG_IMAGE" datatype="unsignedByte" arraysize="*"
• eg Table of URL links
  – UCD="DATA_LINK" datatype="char" arraysize="*"

VOTable Schema (xsd)

Table Data Model

• Metadata
  – Class definition for Row
  – FIELD
    • data type
    • semantic type
• Data
  – Each Row is a list of Cells
  – Each Cell is an array of Primitives
    • may be variable length

Table Data Layout

• All metadata first
  – small, complex, XML
  – Class definition for table record
  – + params, description, etc etc
• Then data
  – (may be) large, remote
  – XML | binary | FITS
  – Instantiations of table record
  – All records MUST have same format

Param Data Model

• Param is "Table with one cell"
• Like a FIELD value
• But with a "value" attribute

Primitives

• All have fixed binary length
• Same as FITS primitives
• Except Unicode

datatype          Meaning             FITS   Bytes
"boolean"         Logical             "L"    1
"bit"             Bit                 "X"    *
"unsignedByte"    Byte (0 to 255)     "B"    1
"short"           Short Integer       "I"    2
"int"             Integer             "J"    4
"long"            Long integer        "K"    8
"char"            ASCII Character     "A"    1
"unicodeChar"     Unicode Character          2
"float"           Floating point      "E"    4
"double"          Double              "D"    8
"floatComplex"    Float Complex       "C"    8
"doubleComplex"   Double Complex      "M"    16

Multidimensional Array Cell

• A table cell can have lots of Primitives
• Example: WCS parameters are arrays
  – <FIELD name="CRVAL" datatype="double" arraysize="2"/>
• Example: up to 10 images, each 64x64
  – <FIELD name="thumbs" datatype="unsignedByte" arraysize="64x64x10*"/>

Hierarchy

• A VOTable contains RESOURCES
  – RESOURCE can contain:
    • TABLE
    • RESOURCE
    • etc etc
• Usage example
  – Many observations in the file, each is a RESOURCE
  – Each observation is
    • Parameters
    • Calibration table
    • Raw data table

Hierarchy

• New feature: GROUP

<TABLE name="Nutation and Aberration">
  <GROUP name="Nutation">
    <FIELD name="Longitude"/>
    <FIELD name="Obliquity"/>
  </GROUP>
  <GROUP name="Aberration">
    <GROUP name="Equinox 1950.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
    <GROUP name="Equinox 1955.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
  </GROUP>
</TABLE>

Unified Content Descriptors

• UCD is a "semantic type"
  – PHOT.INT-MAG.B: Integrated total blue magnitude
  – ORBIT.ECCENTRICITY: Orbital eccentricity
  – STAT.MEDIAN: Statistics Median Value
  – INST.QE: Detector's Quantum Efficiency
• Can be resolved by web service
  – to description, examples, etc
• Base + Specifiers
  – eg error in default right ascension
  – POS.EQ.RA, MAIN, ERROR

VOTable Friends

Some "self-describing" file formats

[Comparison table: the formats VOTable, BinX, MS Dataset, HDF and XDF against the properties XML, Table, Binary, Streaming, Semantics and Datacube]

XML Parsing

SAX: Event-Based
Handlers for StartElement, Text, EndElement, etc.

Found element BookCatalogue
Found element Book
Found Element Title
Found Text The Cambridge Star Atlas
Found End Element Title
….

Parsing

DOM: Document Object Model
Returns a tree-like Document object with data attached

BookCatalogue
  Book
    Title: Cambridge Star Atlas
    Author: Wil Tirion
    ISBN
  Book
    Title: Parallel Computing Works!

Binding to make a Parser

From the Schema an API and library is generated
• JAXB
• Breeze
• Castor

This is JAVOT (Caltech)

for (int i = 0; i < table.getFieldCount(); i++) {
    Field field = (Field) table.getFieldAt(i);
    String u = field.getUcd();
    if (u != null && u.equals("POS_EQ_RA_MAIN"))
        System.out.println("Field " + i + " is for RA");
}

VOTable Software

• Treeview from UK-VO
• VOPlot from India-VO
• VOTool from US-VO
• Mirage from Bell Labs
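
SAX sketch

The event-based parsing described on the XML Parsing slide can be tried on a VOTable with the SAX parser that ships with Java. The sketch below is a minimal illustration, not JAVOT and not any of the binding tools named above. The class name VOTableFieldLister, the method listFields and the SAMPLE string are hypothetical names introduced here. SAMPLE is a cut-down copy of the Sample VOTable with the DOCTYPE left out, so the parser does not try to fetch the DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, introduced only for this illustration.
class VOTableFieldLister {

    // Assumption: a shortened copy of the sample table, metadata only, no DOCTYPE.
    static final String SAMPLE =
        "<VOTABLE version=\"1.0\"><RESOURCE><TABLE name=\"Stars\">"
      + "<FIELD name=\"Star-Name\" ucd=\"ID_MAIN\" datatype=\"char\" arraysize=\"10\"/>"
      + "<FIELD name=\"RA\" ucd=\"POS_EQ_RA\" unit=\"deg\" datatype=\"float\"/>"
      + "<FIELD name=\"Dec\" ucd=\"POS_EQ_DEC\" unit=\"deg\" datatype=\"float\"/>"
      + "</TABLE></RESOURCE></VOTABLE>";

    // Collects "name [ucd]" for every FIELD start-element event.
    static List<String> listFields(String xml) throws Exception {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // The default parser is not namespace-aware, so qName holds the tag name.
                if ("FIELD".equals(qName)) {
                    found.add(atts.getValue("name") + " [" + atts.getValue("ucd") + "]");
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        for (String f : listFields(SAMPLE)) {
            System.out.println("Found FIELD " + f);
        }
    }
}
```

Because the handler only reacts to FIELD start events, the metadata can be read without building a tree for the data section, which fits the "all metadata first" layout.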