VOTable an International Virtual Observatory data exchange format Williams

VOTable
an International Virtual Observatory
data exchange format
Roy Williams
François Ochsenbein
Clive Davenhall
Daniel Durand
Pierre Fernique
David Giaretta
Robert Hanisch
Tom McGlynn
Alex Szalay
Andreas Wicenec
XML: Structured Information
<From>Antonio Stadivarius</From>
<To>Domenico Scarlatti</To>
<Date>
<Day>13</Day>
<Month>4</Month>
<Year>1723</Year>
</Date>
<Body>
Io bisogno una appartamento
acoglienti a Cremona …
</Body>
Separation of structure from presentation
4/13/23
April 13, 1723
13.iv.1723
The computer can read the document:
“Find all memos from April 1723”
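As a rough illustration of why structured dates help, the sketch below searches a small memo archive by month and year, then renders one stored date in the three presentation styles listed above. The `<Memos>` and `<Memo>` wrapper elements, the second memo, and the function names are assumptions made for this example; they are not part of the original memo.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive: the <Memos>/<Memo> wrappers and the second memo are
# assumptions, added so several memos fit in one well-formed document.
ARCHIVE = """
<Memos>
  <Memo>
    <From>Antonio Stadivarius</From>
    <To>Domenico Scarlatti</To>
    <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
    <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
  </Memo>
  <Memo>
    <From>Domenico Scarlatti</From>
    <To>Antonio Stadivarius</To>
    <Date><Day>2</Day><Month>5</Month><Year>1723</Year></Date>
    <Body>(hypothetical reply)</Body>
  </Memo>
</Memos>
"""

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]
ROMAN_MONTHS = ["i", "ii", "iii", "iv", "v", "vi",
                "vii", "viii", "ix", "x", "xi", "xii"]


def memos_from(xml_text, month, year):
    """Return the <Memo> elements whose <Date> matches the month and year."""
    root = ET.fromstring(xml_text)
    hits = []
    for memo in root.iter("Memo"):
        date = memo.find("Date")
        if (int(date.findtext("Month")) == month
                and int(date.findtext("Year")) == year):
            hits.append(memo)
    return hits


def render_date(date, style):
    """Render one structured <Date> element in a chosen presentation style."""
    d, m, y = (int(date.findtext(tag)) for tag in ("Day", "Month", "Year"))
    if style == "us":
        return f"{m}/{d}/{y % 100:02d}"
    if style == "long":
        return f"{MONTH_NAMES[m - 1]} {d}, {y}"
    if style == "roman":
        return f"{d}.{ROMAN_MONTHS[m - 1]}.{y}"
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    for memo in memos_from(ARCHIVE, 4, 1723):
        print(memo.findtext("From"), render_date(memo.find("Date"), "long"))
```

The stored date never changes; only the rendering function decides how it looks.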
VOTable
• Full metadata representation
• Hierarchy of RESOURCEs containing PARAMs and TABLEs
• UCD (unified content descriptor)
– a has unit meter
– a has UCD ORBIT_SIZE_SMAJ (semi-major axis of the orbit)
• Can reference remote and/or binary streams
• Table can be
– Pure XML
– "Simple Binary"
– FITS Binary Table
VOTable Parentage
• Astrores
– XML format for tables
– Developed at CDS Strasbourg
– Presented at ADASS 1999
– Vizier implementation
• XSIL
– XML format for Tables and Arrays
– Developed at LIGO Caltech 2000
– Extensible through Type-Class dynamic loading
– Java parsing, browsing, editing
– Matlab interface
Sample VOTable
<?xml version="1.0"?>
<!DOCTYPE VOTABLE SYSTEM "http://us-vo.org/xml/VOTable.dtd">
<VOTABLE version="1.0">
<DEFINITIONS>
<COOSYS ID="myJ2000" equinox="2000." epoch="2000." system="eq_FK5"/>
</DEFINITIONS>
<RESOURCE>
<PARAM name="Observer" datatype="char" arraysize="*" value="William Herschel">
<DESCRIPTION>This parameter is designed to store the observer's name
</DESCRIPTION>
</PARAM>
<TABLE name="Stars">
<DESCRIPTION>Some bright stars</DESCRIPTION>
<FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
<FIELD name="RA" ucd="POS_EQ_RA" ref="myJ2000" unit="deg"
datatype="float" precision="F3" width="7"/>
<FIELD name="Dec" ucd="POS_EQ_DEC" ref="myJ2000" unit="deg"
datatype="float" precision="F3" width="7"/>
<FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
<DATA>
<TABLEDATA>
<TR>
<TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
<TD>4 5 3 4 3 2 1 2 3 3 5 6</TD>
</TR>
<TR>
<TD>Vega</TD><TD>279.234</TD>
<TD>38.782</TD><TD>8 7 8 6 8 6</TD>
</TR>
</TABLEDATA>
</DATA>
</TABLE>
</RESOURCE>
</VOTABLE>
Alternatively, the DATA element can point to a remote FITS file:
<DATA>
<FITS>
<STREAM
href="ftp://server.com/mydata.fits"
expires="2002-02-22"
actuate="onRequest"/>
</FITS>
</DATA>
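A minimal sketch of reading the pure-XML (TABLEDATA) form of the sample with Python's standard library follows. It is not a full VOTable reader: the embedded document is a trimmed copy of the sample (DOCTYPE, DEFINITIONS and PARAM are left out for brevity), and the helper names `convert` and `read_table` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample table; DOCTYPE, DEFINITIONS and PARAM are omitted.
VOTABLE = """<VOTABLE version="1.0">
 <RESOURCE>
  <TABLE name="Stars">
   <FIELD name="Star-Name" ucd="ID_MAIN" datatype="char" arraysize="10"/>
   <FIELD name="RA" ucd="POS_EQ_RA" unit="deg" datatype="float"/>
   <FIELD name="Dec" ucd="POS_EQ_DEC" unit="deg" datatype="float"/>
   <FIELD name="Counts" ucd="NUMBER" datatype="int" arraysize="2x3x*"/>
   <DATA><TABLEDATA>
    <TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD>
        <TD>4 5 3 4 3 2 1 2 3 3 5 6</TD></TR>
    <TR><TD>Vega</TD><TD>279.234</TD><TD>38.782</TD>
        <TD>8 7 8 6 8 6</TD></TR>
   </TABLEDATA></DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>"""

INTEGER_TYPES = {"unsignedByte", "short", "int", "long"}
FLOAT_TYPES = {"float", "double"}


def convert(text, datatype, arraysize):
    """Turn the text of one TD into a Python value, guided by its FIELD."""
    if datatype in ("char", "unicodeChar"):
        return text  # character arrays are kept as one string
    if datatype in INTEGER_TYPES:
        cast = int
    elif datatype in FLOAT_TYPES:
        cast = float
    else:
        cast = str  # other datatypes are left as text in this sketch
    values = [cast(token) for token in text.split()]
    # A FIELD with an arraysize yields a list, otherwise a single scalar.
    return values if arraysize else values[0]


def read_table(xml_text):
    """Return (fields, rows); each row is a dict keyed by FIELD name."""
    table = ET.fromstring(xml_text).find(".//TABLE")
    fields = table.findall("FIELD")
    rows = []
    for tr in table.iter("TR"):
        row = {}
        for field, td in zip(fields, tr.findall("TD")):
            row[field.get("name")] = convert(
                td.text, field.get("datatype"), field.get("arraysize"))
        rows.append(row)
    return fields, rows


if __name__ == "__main__":
    fields, rows = read_table(VOTABLE)
    for row in rows:
        print(row)
```

The metadata (the FIELD elements) is read first and then drives how every cell is interpreted.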
Table Cell
• A cell holds Primitives as a scalar, an array, or a variable length array
• Primitives: boolean, bit, unsignedByte, short, int, long, char, unicodeChar, float, double, floatComplex, doubleComplex
VOTable is Flexy
• eg Table of images
– UCD="JPEG_IMAGE" datatype="unsignedByte" arraysize="*"
• eg Table of URL links
– UCD="DATA_LINK" datatype="char" arraysize="*"
VOTable Schema (xsd)
Table Data Model
• Metadata
• Class definition for Row
• FIELD
– data type
– semantic type
• Data
• Each Row is a list of Cells
• Each Cell is an array of Primitives
– may be variable length
Table Data Layout
• All metadata first
– small, complex, XML
• Class definition for table record
• + params, description, etc etc
• Then data
– (may be) large, remote
– XML | binary | FITS
• Instantiations of table record
• All records MUST have same format
Param Data Model
• Param is “Table with one cell”
• Like a FIELD value
• But with a “value” attribute
Primitives
• All have fixed binary length
• Same as FITS primitives
• Except Unicode

datatype          Meaning             FITS   Bytes
"boolean"         Logical             "L"    1
"bit"             Bit                 "X"    *
"unsignedByte"    Byte (0 to 255)     "B"    1
"short"           Short Integer       "I"    2
"int"             Integer             "J"    4
"long"            Long integer        "K"    8
"char"            ASCII Character     "A"    1
"unicodeChar"     Unicode Character          2
"float"           Floating point      "E"    4
"double"          Double              "D"    8
"floatComplex"    Float Complex       "C"    8
"doubleComplex"   Double Complex      "M"    16
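Because every primitive has a fixed binary length, the size of a fixed-length record follows from simple arithmetic. The sketch below copies the byte sizes from the Primitives table; the function names and the sample record are hypothetical, and the rule used for "bit" (eight bits packed per byte) is an assumption of this example.

```python
# Byte sizes copied from the Primitives table; "bit" is handled separately.
PRIMITIVE_BYTES = {
    "boolean": 1,
    "unsignedByte": 1,
    "short": 2,
    "int": 4,
    "long": 8,
    "char": 1,
    "unicodeChar": 2,
    "float": 4,
    "double": 8,
    "floatComplex": 8,
    "doubleComplex": 16,
}


def cell_bytes(datatype, count=1):
    """Bytes needed for a cell holding `count` primitives of one datatype."""
    if datatype == "bit":
        # Assumption: bits are packed eight to a byte, rounded up.
        return (count + 7) // 8
    return PRIMITIVE_BYTES[datatype] * count


def record_bytes(fields):
    """Size of one record; `fields` is a list of (datatype, count) pairs."""
    return sum(cell_bytes(datatype, count) for datatype, count in fields)


# Hypothetical record: a 10-character name plus RA and Dec stored as floats.
STAR_RECORD = [("char", 10), ("float", 1), ("float", 1)]

if __name__ == "__main__":
    print(record_bytes(STAR_RECORD), "bytes per record")
```

A reader that knows the record size can seek directly to the n-th record of a binary stream, provided no cell has variable length.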
Multidimensional Array Cell
• A table cell can have lots of Primitives
• Example: WCS parameters are arrays
– <FIELD name="CRVAL"
datatype="double"
arraysize="2"/>
• Example: up to 10 images, each 64x64
– <FIELD name="thumbs"
datatype="unsignedByte"
arraysize="64x64x10*"/>
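The arraysize strings shown here can be taken apart mechanically. The sketch below is one possible reading, not the reference parser: it assumes only the last dimension may carry a `*`, and that a number in front of the `*` is an upper limit, as in the "up to 10 images" example. The function names are hypothetical.

```python
from math import prod


def parse_arraysize(arraysize):
    """Split an arraysize string into (fixed_dims, is_variable, limit).

    Assumes only the last dimension may be variable ("*"), optionally
    preceded by an upper limit, as in "64x64x10*".
    """
    parts = arraysize.split("x")
    last = parts[-1]
    if last.endswith("*"):
        fixed = [int(p) for p in parts[:-1]]
        limit = int(last[:-1]) if last[:-1] else None
        return fixed, True, limit
    return [int(p) for p in parts], False, None


def shape_for(arraysize, n_values):
    """Work out the full shape of a cell that holds `n_values` primitives."""
    fixed, variable, limit = parse_arraysize(arraysize)
    block = prod(fixed)  # prod([]) is 1, which covers a plain "*"
    if not variable:
        if n_values != block:
            raise ValueError("cell does not match its fixed arraysize")
        return fixed
    if n_values % block:
        raise ValueError("cell is not a whole number of blocks")
    length = n_values // block
    if limit is not None and length > limit:
        raise ValueError("cell exceeds the declared limit")
    return fixed + [length]


if __name__ == "__main__":
    print(parse_arraysize("64x64x10*"))
    print(shape_for("2x3x*", 12))
```

With `arraysize="2x3x*"`, a cell of 12 values has a last dimension of 2 and a cell of 6 values has a last dimension of 1.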
Hierarchy
• A VOTable contains RESOURCES
– RESOURCE can contain:
• TABLE
• RESOURCE
• etc etc
• Usage example
• Many observations in the file,
– each is a RESOURCE
• Each observation is
– Parameters
– Calibration table
– Raw data table
Hierarchy
• New feature: GROUP
<TABLE name="Nutation and Aberration">
<GROUP name="Nutation">
<FIELD name="Longitude"/>
<FIELD name="Obliquity"/>
</GROUP>
<GROUP name="Aberration">
<GROUP name="Equinox 1950.0">
<FIELD name="C"/>
<FIELD name="D"/>
</GROUP>
<GROUP name="Equinox 1955.0">
<FIELD name="C"/>
<FIELD name="D"/>
</GROUP>
</GROUP>
</TABLE>
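One way a reader might use GROUP is to qualify each FIELD name by the groups around it, so that the two "C" columns can be told apart. The recursive walk below is a sketch along those lines; the function name `field_paths` and the " / " separator are choices made for this example only.

```python
import xml.etree.ElementTree as ET

# The GROUP example from the slide, with indentation added.
TABLE = """<TABLE name="Nutation and Aberration">
  <GROUP name="Nutation">
    <FIELD name="Longitude"/>
    <FIELD name="Obliquity"/>
  </GROUP>
  <GROUP name="Aberration">
    <GROUP name="Equinox 1950.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
    <GROUP name="Equinox 1955.0">
      <FIELD name="C"/>
      <FIELD name="D"/>
    </GROUP>
  </GROUP>
</TABLE>"""


def field_paths(element, prefix=()):
    """List every FIELD under `element`, qualified by its enclosing GROUPs."""
    paths = []
    for child in element:
        if child.tag == "GROUP":
            # Descend, remembering the group name as part of the path.
            paths += field_paths(child, prefix + (child.get("name"),))
        elif child.tag == "FIELD":
            paths.append(" / ".join(prefix + (child.get("name"),)))
    return paths


if __name__ == "__main__":
    for path in field_paths(ET.fromstring(TABLE)):
        print(path)
```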
Unified Content Descriptors
• UCD is a “semantic type”
– PHOT.INT-MAG.B: Integrated total blue magnitude
– ORBIT.ECCENTRICITY: Orbital eccentricity
– STAT.MEDIAN: Statistics median value
– INST.QE: Detector's quantum efficiency
• Can be resolved by web service
– to description, examples, etc
• Base + Specifiers
• eg error in default right ascension
• POS.EQ.RA, MAIN, ERROR
VOTable Friends
Some “self-describing” file formats
[Comparison table: formats VOTable, BinX, MS Dataset, HDF, XDF; properties XML, Table, Binary, Streaming, Semantics, Datacube]
XML Parsing
SAX: Event-Based
Handlers for StartElement, Text, EndElement, etc.
Found element BookCatalogue
Found element Book
Found element Title
Found text The Cambridge Star Atlas
Found end element Title
….
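The event stream above can be reproduced with the SAX module in Python's standard library. This is a small sketch: the one-book catalogue document and the `EventLogger` class name are assumptions made for the example.

```python
import xml.sax

# Hypothetical one-book catalogue, kept on one line so that no
# whitespace-only text events appear between elements.
CATALOGUE = (b"<BookCatalogue><Book><Title>The Cambridge Star Atlas</Title>"
             b"</Book></BookCatalogue>")


class EventLogger(xml.sax.ContentHandler):
    """Record each parsing event as a line of text."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"Found element {name}")

    def characters(self, content):
        # Ignore whitespace-only text between elements.
        if content.strip():
            self.events.append(f"Found text {content.strip()}")

    def endElement(self, name):
        self.events.append(f"Found end element {name}")


def log_events(xml_bytes):
    """Parse the document and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events(CATALOGUE):
        print(event)
```

The handler sees one event at a time and never holds the whole document, which is why this style suits large tables.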
Parsing
DOM: Document Object Model
Returns a tree-like Document object with data attached
BookCatalogue
  Book
    Title: Cambridge Star Atlas
    Author: Wil Tirion
    ISBN
  Book
    Title: Parallel Computing Works!
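For comparison with the event-based style, here is a sketch of the same kind of catalogue read through a DOM tree with `xml.dom.minidom`. The document text is an assumption reconstructed from the tree above (the ISBN is left empty because the slide gives no value), and the helper names are hypothetical.

```python
from xml.dom import minidom

# Assumed document matching the tree above; the ISBN element is left empty.
CATALOGUE = """<BookCatalogue>
  <Book>
    <Title>Cambridge Star Atlas</Title>
    <Author>Wil Tirion</Author>
    <ISBN/>
  </Book>
  <Book>
    <Title>Parallel Computing Works!</Title>
  </Book>
</BookCatalogue>"""


def titles(xml_text):
    """Return the text of every Title element in document order."""
    doc = minidom.parseString(xml_text)
    return [node.firstChild.data for node in doc.getElementsByTagName("Title")]


def author_of(xml_text, title):
    """Return the Author of the Book with the given Title, or None."""
    doc = minidom.parseString(xml_text)
    for book in doc.getElementsByTagName("Book"):
        book_title = book.getElementsByTagName("Title")[0].firstChild.data
        if book_title == title:
            authors = book.getElementsByTagName("Author")
            return authors[0].firstChild.data if authors else None
    return None


if __name__ == "__main__":
    print(titles(CATALOGUE))
    print(author_of(CATALOGUE, "Cambridge Star Atlas"))
```

The whole tree is in memory, so the code can move between a Book and its children freely, at the cost of holding the full document.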
Binding to make a Parser
An API and library are generated from the Schema
JAXB
Breeze
Castor
This is JAVOT (Caltech)
// Report which FIELD holds the main right ascension, identified by its UCD
for (int i = 0; i < table.getFieldCount(); i++) {
    Field field = (Field) table.getFieldAt(i);
    String u = field.getUcd();
    if (u != null && u.equals("POS_EQ_RA_MAIN"))
        System.out.println("Field " + i + " is for RA");
}
VOTable Software
Treeview from UK-VO
VOPlot from India-VO
VOTool from US-VO
Mirage from Bell Labs