Virtual Observatories
Ajit Kembhavi
IUCAA, Pune, India

Data Storage and Retrieval
The Astronomer, Vermeer (1632-1675)
The Library of Alexandria, 3rd Century BC

The Data Avalanche
Immense amounts of data are being produced by large telescopes using large area detectors. Terabytes of data are now available, and Petabytes will soon be available from frequent all sky imaging. Vast databases are also being produced through simulations.

Astronomical Data Explosion
~ 100 Gb/night
P. Quinn

Data Explosion
Peter Quinn

Wavelength Coverage
The data spans the electromagnetic spectrum from the radio to the gamma-ray region. Obtaining, analysing and interpreting the data in different wavebands involves highly specialised instruments and techniques. The astronomer needs new tools for using this wealth of data in multiwavelength studies.

Stars in the Milky Way

The Hertzsprung-Russell Diagram

The Alliance

Members of the IVOA

Interactions

Virtual Observatories
• Provide tools for data analysis, visualization and mining.
• Develop interoperability concepts to make different databases seamless.
• Manage vast data resources and provide these online to astronomers and other users.
• Empower astronomers by providing sophisticated query and computational tools, and computing grids for producing new science.

IVOA Technology Initiatives
The IVOA has identified six major technical initiatives to fulfill the scientific goal of the VO concept.
IVOA-LISTS
• REGISTRIES: These collect metadata about data resources and information services into a queryable database. The registry is distributed. A variety of industry standards are being investigated.
• DATA MODELS: This initiative aims to define the common elements of astronomical data structures and to provide a framework to describe their relationships.
• UNIFORM CONTENT DESCRIPTORS: These will provide the common language for metadata definitions for the VO.
• DATA ACCESS LAYER: This provides standardized access mechanisms to distributed data objects. Initial prototypes are a Cone Search Protocol and a Simple Image Access Protocol.
• VO QUERY LANGUAGE: This will provide a standard query language which will go beyond the limitations of SQL.
• VOTable: This is an XML mark-up standard for astronomical tables.

Science Initiatives
• Many IVOA projects have active Science Working Groups consisting of astronomers from a broad cross-section of the community representing all wavelengths.
• The focus here is to develop a clear perception of the scientific requirements of a VO.
• Projects within the working groups will develop new capabilities for VO based analysis.
• This will enable the community to create new research programs and to publish their data and research in a more pervasive and scientifically useful manner.

Virtual Observatory India
A collaboration between IUCAA and PSPL, with a grant from the Ministry of Communications and Information Technology
IUCAA
Persistent Systems Pvt. Ltd., Pune

Virtual Observatory - India
Data Archives and Mirrors at VO-I
• SDSS, 2MASS, Chandra, 2DFGRS, 2QZ, FIRST, NVSS
• VizieR, Aladin, ADS

Fast Computing
• Four Alpha Server ES-45 nodes, each with 4 processors and 8 GB RAM
• Fast, low latency interconnect
• Memory Channel Architecture
• TruCluster clustering environment (Tru64 Unix, DecMPI, OpenMP)

VO-India Software Projects
• VOPlot: visualizer for catalogue data
• VOTable C++ Parser
• VOTable Streaming Writer
• Data Converters
• FITS Browser
• User interfaces and query tools
• Applications beyond astronomy
All tools have web-based and stand alone versions.

The VOPlot Collaboration
Visualization and simple statistics of catalogue data. Integration with sky atlases.

The VOPlot Tool
• A VO-I + CDS collaboration
• First conceived as a web-based tool for VizieR
• Then integrated with Aladin
• VOPlot is now also a stand alone system
• It has been integrated with many databases
Sonali Kale, K.D. Balaji et al.
VOPlot
Colour-magnitude diagram
parallax

Catalog Data Interface Tool
• A tool to query catalog data.
• Simple, customizable, graphic interface.
• Not specific to type of data or catalogue.
• SQL queries for expert users.
• VO tools available for analysis: VOPlot, Aladin, VOStat, SIMBAD, NED...

Data Organization and Architecture

Browse Server Database

Create Views

On-the-fly GUI

Query using a Form

Query using SQL Directly

Results in VOPlot

Results in Aladin

Himalayan Chandra Telescope Data Archives

SDSS J125637-022452
• High proper motion L-subdwarf
• Optical spectra of mixed late M and mid L type
• Only the third L subdwarf known
• Positions 1986-2000
• Proper motion 0.617 arcsec / yr

Thank You

AVO Prototype Demo
• Astrogrid: Astronomy Catalogue Extractor
• AVO: Aladin+SED
• VO-India: VOPlot

FITS Manager
• View, create and add to FITS files
• Convert to other formats
Pallavi Kulkarni
Fits-manager

VOTable Java Streaming Writer
Acts on a data array in memory to convert it to the VOTable form, which is streamed row by row to an output file. Very large VOTables can be written without excessive memory.
Pallavi Kulkarni
VOTable-Java

VOTable
• This is a new data exchange standard produced through efforts led by Francois Ochsenbein of CDS, Strasbourg and Roy Williams of Caltech.
• VOTable is in XML format. Physical quantities come with sophisticated semantic information.

VOTable
• The format enables computers to easily parse the information and communicate it to other computers.
• Federation and joining of information become possible and Grid computing is easier.
• VOTable parsers have been developed in Perl, Java and C++.
• Enhancements and extensions are being considered.

Streaming Parser

Non-streaming Parser

VOTable Data
The data part in a VOTable may be represented using one of three different formats:
– FITS: VOTable can be used either to encapsulate FITS files, or to re-encode the metadata.
– BINARY: Supported for efficiency and ease of programming; no FITS library is required, and the streaming paradigm is supported.
– TABLEDATA: Pure XML format for small tables.

C++ VOTable Parser
Motivation:
– Provide a library for API based access to VOTable files.
– APIs can be directly used to develop VOTable applications without having to do raw VOTable processing.
– Streaming and non-streaming versions are available.
Sonali Kale, Sudip Khanna

C++ VOTable Parser
Salient Features:
• Implemented as a wrapper over XALAN-C++.
• XALAN-C++ is a robust implementation of the W3C recommendations for XSL Transformations (XSLT) and the XML Path Language (XPath).
• XPath queries can be used to access the VOTable data.

Project Design
[Class diagram. Elements shown: VTable, Metadata, Table Data, Link Collection, Link, Field Collection, Field, Values (minimum, maximum), Option Collection, Options, Row Collection, Row, Column Collection]

IUCAA HPC Facility: Hercules
Co-proposed by: Ajit Kembhavi, T. Padmanabhan, Tarun Souradeep
HPC Team: Sarah Ponthratnam, Sunu Engineer, Rajesh Nayak, Anand Sengupta
• Four Alpha Server ES-45 machines
• Each with 4 Alpha (21264C) processors
• 1.25 GHz clock speed
• Cache on chip: 64 KB-I, 64 KB-D
• Cache: 16 MB ECC DDR
• RAM: 3 x 8 GB + 12 GB
• Fast, low latency interconnect
• Memory Channel Architecture (MCA)
• High volume storage
• 1 Terabyte SCSI
• TruCluster clustering environment (Tru64 Unix, DecMPI, OpenMP)
Preliminary HPL benchmark: > 30 Gflops
ES-45 SPECfp2000: 1327
Linpack 1000x1000: 6847

Virtual Observatory - India
Persistent Systems
IUCAA

Caltech, Fermilab, JHU, NASA/HEASARC, Microsoft, NCSA/UIUC, NOAO, NRAO, Raytheon ITS, SDSC/UCSD, SAO/CXC, STScI, UPenn, UPitts/CMU, UWis, USC, USNO, USRA
CVO
NVO-People

Virtual Observatory - India
Ajit Kembhavi
Inter-University Centre for Astronomy and Astrophysics
Pune, India

Terapix

Jodrell Bank

Registry and DIS

High Volume Storage
RAID 5, 4 Terabyte

CVO Collaborations
• There are three major projects at the CVO involving collaborations with other VOs.
• CVO is collaborating with the German Astrophysical VO to incorporate ROSAT X-ray data and catalogues into the CVO system.
• CVO is collaborating with the Australian VO to incorporate 2QZ and 2DF galaxy spectra into the CVO database.
• CVO is an associate member of NVO and has put in place some components of the NVO galaxy morphology demo.

Australian-VO Collaborations
• The distributed volume renderer (dvr) software is a tool for rendering large volumetric data sets using the combined memory and processing resources of Beowulf-like clusters.
• A collaboration between the Melbourne site of Aus-VO and AstroGrid aims to develop the existing dvr software into a grid-based volume rendering service.
• Users will be able to select FITS-format cubes from a number of "Data Centres", have the data transferred to a chosen rendering cluster, and then proceed to visualise the volume of data remotely (See Demo).

C++ VOTable Parser
• Initial version
- Released on May 31st, 2002.
- Support only for reading of tables.
- Support only for pure-XML TABLEDATA and not for BINARY or FITS data streams.
- Runs on Windows NT 4.0, Windows 2000 and RedHat Linux 7.1.
• Future enhancements
- Can be incorporated quickly and efficiently.

Parser Design
Class Details
• VTable: In-memory representation of a single <TABLE> from the <RESOURCE> element in VOTable
• TableMetaData: Contains MetaData (Fields, Links and Description)
• Resource: Represents the <RESOURCE> element in the VOTable
• TableData: Contains Rows
• Field: Representation of <FIELD> from VOTable
• Row: Representation of <TR> from VOTable
• Column: Representation of <TD> from VOTable

Parser Design
API – Typical Operations
• File Level I/O Routines
– Open VOTable file
– Close VOTable file
• Table I/O Operations
– Get number of rows
– Get number of columns
– Get column (field) information (column name, column number, etc.)
– Accessing table data

Parser Implementation
• Development on Windows NT 4.0 platform using VC++.
• Ported to RedHat Linux 7.1/gcc-2.96 with zero effort.
• 18 C++ classes representing various elements of the VOTable format.
• 8500 lines of C++ code written for V1.1 release
• Project start date: April 7th 2002
• V1.1 Release: May 31st 2002
Current status: V1.2 design in progress

What is in Release V1.1
• Parser to serve as a building block for developing VOTable based applications.
• Can be easily used by users of CFITSIO library.
• Supports powerful XPath queries against VOTable files.
The first version of the VOTable parser can now be downloaded: http://vo.iucaa.ernet.in/~voi/html/infopage.html

VOTable Parser Demo
• Serves as a tutorial to help understand the basic APIs provided by the VOTable parser.
• Demonstrates how to access the data and metadata elements of a VOTable file.

Future Work
• Develop APIs for writing data in VOTable format.
• Develop APIs for supporting IMAGE data and FITS files in VOTable.
• Enhance existing API set to allow more elaborate and flexible operations on VOTable files.
• Support future VOTable versions.
• Develop applications for conversion between FITS and VOTable formats.

References
• The first version of the C++ parser can now be downloaded from the VO-India website: http://vo.iucaa.ernet.in/~voi
• VOTable Details: http://vizier.u-strasbg.fr/doc/VOTable/
• XALAN: http://xml.apache.org/xalan-c/index.html
• XPATH: http://www.w3.org/TR/xpath

Virtual Observatory - India

SDSS Data Features
Size: 900 GB
DBMS: Microsoft SQL (MS-SQL)
Data contains:
1) Spectroscopic data
2) Tiling data

SDSS Query Architecture
[Diagram. Elements shown: User, User Interface Client, Search, Submit Query/Request, MS-SQL Server, MS-SQL Database, Process Query, Output]
Output:
1) HTML
2) XML
3) CSV

Data Catalogs & Web services at IUCAA
Catalogs
Catalog       Description
2dfQSO        Size: 4 MB
2dfGRS        Size: 4 GB, organized as mSQL
2MASS         Size: 42 GB
Sky Survey    Size: 13 GB
FIRST         Size: 192 GB

Web Services
1) VizieR Services
The most complete library of astronomical catalogues (e.g. Guide Star catalogues, USNO-B1). Tools to select, extract and format records matching certain criteria.
2) Anglo-Australian 2DF System
Query tool to select records from the 2DF catalogue. Display Skymap & Spectrum (FITS) of objects in 2DF catalogue.

Star Positions

VO Schema
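
Illustrative Sketch: Writing and Reading TABLEDATA
The row-by-row streaming writer and the typical parser operations (number of rows, number of columns, field information, table data) described in the VOTable slides can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the VO-India Java writer or C++ parser: the function names write_votable, read_votable and count_rows_streaming are hypothetical, the document is assumed to be namespace-free with a single RESOURCE and TABLE, and only the pure-XML TABLEDATA form is handled.

```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def write_votable(out, fields, rows):
    """Stream a table to `out` as VOTable TABLEDATA, one <TR> per row.

    `fields` is a list of (name, datatype) pairs; `rows` is any iterable
    of sequences. Hypothetical helper, not part of any VO library.
    """
    out.write('<?xml version="1.0"?>\n<VOTABLE>\n<RESOURCE>\n<TABLE>\n')
    for name, datatype in fields:
        out.write(f'<FIELD name={quoteattr(name)} '
                  f'datatype={quoteattr(datatype)}/>\n')
    out.write('<DATA>\n<TABLEDATA>\n')
    for row in rows:
        # Each row is written as soon as it is produced, so `rows`
        # can be a generator and is never held in memory as a whole.
        cells = ''.join(f'<TD>{escape(str(v))}</TD>' for v in row)
        out.write(f'<TR>{cells}</TR>\n')
    out.write('</TABLEDATA>\n</DATA>\n</TABLE>\n</RESOURCE>\n</VOTABLE>\n')


def read_votable(source):
    """Non-streaming read: return (field_names, rows) of the first TABLE."""
    root = ET.parse(source).getroot()
    table = root.find('./RESOURCE/TABLE')
    names = [f.get('name') for f in table.findall('FIELD')]
    rows = [[td.text for td in tr.findall('TD')]
            for tr in table.findall('./DATA/TABLEDATA/TR')]
    return names, rows


def count_rows_streaming(source):
    """Streaming read: count <TR> elements, clearing each once it is seen."""
    n = 0
    for _event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == 'TR':
            n += 1
            elem.clear()  # release the cells of the finished row
    return n


if __name__ == '__main__':
    buf = io.StringIO()
    write_votable(buf,
                  [('RA', 'double'), ('DEC', 'double'), ('Vmag', 'float')],
                  [(10.5, -2.25, 12.1), (11.0, 3.5, 9.8)])
    names, rows = read_votable(io.StringIO(buf.getvalue()))
    print('columns:', len(names), names)
    print('rows:', len(rows))
    print('rows (streaming count):',
          count_rows_streaming(io.StringIO(buf.getvalue())))
```

Writing one <TR> at a time means the rows can come from a generator, so a very large table need not be held in memory. The iterparse variant gives the same benefit on the reading side, which is the distinction between the streaming and non-streaming parsers.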