THE EUROPEAN GRID OF SOLAR OBSERVATIONS

Robert D. Bentley1, Dept. Space and Climate Physics, University College London
Anthony Finkelstein, Dept. Computer Science, University College London
C. David Pike, Dept. Space & Technology, Rutherford Appleton Laboratory
Valentina Z. Zharkova, Dept. Cybernetics, University of Bradford
ABSTRACT
The European Grid of Solar Observations (EGSO) is a
Grid testbed funded by the European Commission under
the Information Society Technologies (IST) thematic
priority of the Fifth Framework Programme (FP5).
EGSO will provide the tools and infrastructure needed
to create a data grid that will form the fabric of a virtual
solar observatory.
EGSO started in March 2002 and will last for 36
months. The project involves eleven groups from ten
institutions located in five countries in Europe and the
US and is led by University College London; a total of
four groups are from the UK. The EGSO Consortium is
in discussion with other groups interested in creating a
virtual observatory with the aim of finding a solution
that is universally acceptable.
1. INTRODUCTION
The task of identifying solar data sets of interest, then
locating and retrieving them, remains a continuing
difficulty. The data are heterogeneous and widely
distributed, without any means to tie them together, and
there is no systematic way to identify observations
associated with a particular feature or type of event.
Also, the rapidly increasing volume and complexity of
solar data necessitate a sea change in the way the data
are handled.
EGSO, the European Grid of Solar Observations, is
designed to confront these issues. It will allow a user to
identify solar observations covering a given time
interval and pointing, or a type of feature; it will locate
the selected observations and then return them after any
necessary processing. To achieve its objectives, EGSO
is developing new forms of catalogues: unified
observing catalogues derived from existing catalogues,
and feature and event catalogues. It will provide the
tools to search these, and will federate data archives to
simplify the recovery of the data.
We want EGSO to succeed and every effort will be
made to ensure that it is attractive for scientists to use
and not too complex or onerous for data providers to
support. If EGSO is generally acceptable, we hope that
this will encourage participation and that this will
ensure its long-term viability.
The problems that EGSO addresses are not unique to
solar physics. Other disciplines also have distributed
data sets that are becoming too large to copy around and
a principal objective of EGSO is to develop tools
that can also be used on other projects.
1 For more information, e-mail: bentley@egso.org
2. OVERVIEW
2.1 The Generic Problem
In solar physics, observations are used to construct a
picture of the plasma in multi-dimensional parameter
space, including space, time, temperature and density.
The observations are made at different wavelengths
originating from different levels in the solar atmosphere
and the combined information allows the user to build
up an understanding of the changes in structures, motion
of material, sites of energy release, etc.
Observations from the ground and from space are both
important. Satellite-based observations are made at
wavelengths that do not penetrate the Earth’s
atmosphere, including UV, EUV and X-rays. Ground-based
observations, mainly in optical and radio
wavelengths, complement those from space.
Satellites are usually operated under the umbrella of
large organizations and the instruments they carry are
often built by international collaborations – as a
consequence, the data are handled in a more systematic
and open manner. Data are stored in archives, often at
mission level, with copies at one or more sites. The files
have various formats, including FITS, and range from
single images to extended intervals (an hour or orbit).
The ground-based observatories involved are both large
and small, and are located throughout the world,
scattered over many time zones. Since each observatory
observes for only a fraction of the day and is often
affected by weather, good coverage often means dealing
with a number of observatories. The data are usually
available as FITS files of single images. Often there is a
single copy of the data, managed by the observatory.
2.2 Data Analysis
When analyzing solar observations, the user undertakes
the following three steps, each of which is beset with problems:
• Identify suitable observations
Many studies relate to the state or evolution of features.
They involve time intervals from a few minutes to many
hours, and areas from a fraction to the whole solar disk.
Frequently, they make use of serendipitous rather than
planned observations: an instrument observes the Sun,
and events or features of interest are identified after
the fact. To gain any understanding, it is necessary to
use as many different wavelengths as possible.
When undertaking a study, the researcher will have
identified a particular event, or the occurrence of a type
of feature, and will then have the task of identifying the
observations they need to investigate the phenomenon.
Catalogues are key to this. Solar observing catalogues
differ in quality, contents and format and in their
availability and accessibility.
For several space-based instruments, the observing
catalogues are distributed within the SolarSoftWare
DataBase tree, making it possible to search them
with SolarSoft [1]. However, the size of the catalogues
deters most sites from holding a complete set. Also,
some catalogues have dependencies on ancillary data or
consist of multiple interrelated files making them
difficult to access except with specialized software.
• Retrieve the data
At some sites users can identify suitable observations
from observing catalogues distributed with SolarSoft,
but these are the exceptions. Also, tools to easily
conduct such searches for multiple instruments are
almost non-existent. As a consequence, the user often
has the problem of both identifying and retrieving the
data at the same time.
The data are heterogeneous, widely scattered, with
differing means of access. Being unable to identify
suitable observations a priori means the user has to
access many sites in their effort to gather data. Although
they often need only a subset of each data file, it is
necessary to retrieve several quite large files containing
extended intervals through sometimes quite crude
interfaces. This can be quite a painful experience.
• Process the data
Once the data have been retrieved, they have to be
processed. This usually involves the extraction and
calibration of a subset of the data. Here, solar physics
has a great advantage over many disciplines. SolarSoft
provides a common set of analysis tools that are
distributed globally; it also establishes the environment
in which to use them. Calibration data are usually
distributed with the software in the SolarSoft tree.
3. PROJECT DETAILS
3.1 Project Objectives
In the EGSO contract (IST-2001-32409), the project
declared several objectives including:
• Develop the middleware to federate solar data
archives across Europe, and beyond
• Create the tools to select, process and retrieve
distributed and heterogeneous solar data
• Provide the mechanism to produce standardized
observing catalogues for solar observations
• Provide the tools to create a solar feature catalogue
• Make all tools and middleware created by the
project open source
In essence, EGSO will create the fabric of a virtual solar
observatory and will provide an entry point into solar
data for other disciplines, including space weather,
climate physics, and astrophysics.
3.2 Project Phasing & Status
The work in EGSO is divided into four phases:
I. Project definition; consult the community; explore
and experiment with technologies
II. Architectural design; prepare system integration
and validation plan
III. Implementation of the design; development of
middleware and catalogues
IV. Product commissioning and delivery
EGSO is currently (in August 2003) in the early stages
of Phase III. A detailed set of system requirements was
drawn up during 2002. These were prioritised and used
as guidance for the EGSO Architecture – this was
delivered to the Commission in early 2003. The
architecture was refined over the following months and
implementation started in the summer of 2003.
3.3 Who is involved in EGSO
The EGSO consortium comprises groups that have
considerable experience in handling solar observations
and provide access to a representative subset of
currently available data. It also includes groups with the
expertise in information technologies (IT) that will be
needed to develop the project. Details of the consortium
are given in Table 1; the Swiss and US partners depend
on their own funding.
The two US partners in EGSO are also members of the
US Virtual Solar Observatory (US-VSO) which is
funded by NASA. Other US groups have recently
become associated with EGSO. These include other
members of the US-VSO, Stanford University and
Montana State University (MSU), and Lockheed-Martin,
lead of the Collaborative
Sun-Earth Connector (CoSEC), funded by NASA under
Living with a Star. EGSO, US-VSO and CoSEC held a
joint meeting in October 2002 and are now trying to
collaborate as closely as possible.
Other groups involved in EGSO include the European
Space Agency, and a branch of Astrium (which is acting
as an observer on the project).
Table 1. EGSO Project Consortium Members and Associate Members
Consortium Members                              Country
University College London (PI Group)            UK
  Dept. of Space and Climate Physics
  Dept. of Computer Science **
Rutherford Appleton Laboratory                  UK
  Dept. of Space and Technology (**)
University of Bradford                          UK
  Dept. of Cybernetics **
Institut d’Astrophysique Spatiale **            France
Observatoire de Paris-Meudon                    France
Istituto Nazionale di Astrofisica               Italy
  (Obs. of Turin, Naples & Trieste)
Politecnico di Torino                           Italy
  Dept. Automation & Informatics **
University of Applied Science                   Switzerland
  Dept. of Computer Science **
Solar Data Analysis Center, NASA-GSFC           USA
National Solar Observatory                      USA
Note: Groups marked “**” have IT expertise
Associate Members                               Country
Astrium plc. (UK/French/German company)         UK
Stanford University                             USA
Montana State University                        USA
Lockheed-Martin                                 USA
European Space Agency (SOHO Project)            Netherlands
4. DESIGN AND IMPLEMENTATION
Four workpackages will create the key components of
EGSO. Together these cover the three steps described in
Section 2.2. In the EGSO implementation, as far as
possible data will be extracted before being returned to
the user. This departs from the current process of
extracting after the data are retrieved, and it greatly
simplifies the user’s software installation.
The EGSO system requirements prepared in 2002 drew
on a number of sources:
• A user survey conducted in collaboration with
SpaceGrid, an ESA sponsored Grid study project
• Use Cases solicited from the solar community
• Consultation with the user community
• Brainstorming, etc. resulting in the EGSO Concepts
Document
The EGSO architecture is designed to meet these
requirements and be as flexible and extensible as
possible. It is divided into three roles: Provider, Broker
and Consumer.
The Provider role involves interactions with data and
other providers; it includes software that might reside
on data centre systems. The Consumer role includes the
interaction with the users of EGSO, and provides access
through the User Interface and any necessary workflow
capabilities. The Broker acts as a switching centre for
requests. It maintains registries that allow it to keep
track of available resources, including data and
metadata, and can make decisions on how best to satisfy
a request.
Below, we highlight a few features of the EGSO
system.
4.1 Feature Recognition Tools
Experience gained by the Observatory of Paris-Meudon
(using Hα data) is being combined with the expertise in
feature recognition of the Cybernetics group at Bradford
University to extend existing techniques to develop a set
of tools to detect solar features such as filaments,
sunspots, active regions, etc. Once developed and fully
evaluated, these techniques will be applied to a selected
set of synoptic data to build a valuable new catalogue,
the Solar Feature Catalogue. The tools will also be
available for the user to apply to individual images.
4.2 Catalogue Preparation & Access
Catalogues are key to locating the data, and their
importance will grow as the rapidly increasing volumes
of data prohibit the unnecessary copying of data sets.
Existing observing catalogues are heterogeneous and
this makes them difficult to search. EGSO will produce
standardized solar catalogues to simplify the searches of
existing data, and ensure that future data will be more
accessible. The idea of standardized cataloguing was
first proposed as the Whole Sun Catalogue [2]. The
proposed Unified Observing Catalogues (UOCs) will be
self-describing, quantized into fragments by instrument
and time interval, have dependencies on ancillary data
removed and errors corrected. The catalogues will be
designed so that they do not have to be held in a
centralized location: since the data are distributed, it is only
rational that the catalogues should be treated in the same
manner. If necessary, the UOC can be created as it is
needed. A side effect of this design will be that it will be
very easy to add new data sets, and to update catalogues
of existing data sets as new observations are made.
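The fragmentation of a UOC by instrument and time interval can be illustrated with a minimal Python sketch. It is not EGSO code: the names FragmentKey, fragment_keys and EPOCH, the one-day fragment length, and the instrument label "INSTR_A" are all assumptions made for illustration. The point is only that a query for a time range maps onto a small, computable set of fragments, so no centralized catalogue is needed and a new fragment can be added without touching the others.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical reference instant from which fragment boundaries are counted.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class FragmentKey:
    """Identifies one catalogue fragment: an instrument and an interval start."""
    instrument: str
    start: datetime


def fragment_keys(instrument, t0, t1, span=timedelta(days=1)):
    """Return the keys of every fragment that overlaps the interval [t0, t1).

    The fragment length `span` is an assumed parameter; the paper does not
    specify how long a fragment is.
    """
    if t1 <= t0:
        return []
    # Index of the fragment containing t0, and of the one containing the
    # last instant before t1 (t1 itself is excluded).
    first = (t0 - EPOCH) // span
    last = (t1 - EPOCH - timedelta(microseconds=1)) // span
    return [FragmentKey(instrument, EPOCH + i * span)
            for i in range(first, last + 1)]


# Usage: a query spanning midnight touches two daily fragments.
utc = timezone.utc
keys = fragment_keys("INSTR_A",
                     datetime(2003, 8, 1, 22, tzinfo=utc),
                     datetime(2003, 8, 2, 3, tzinfo=utc))
```

Because each key is derived from the query alone, a broker could ask only the sites holding those fragments, or request that a missing fragment be generated on demand.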
Two additional catalogues, the Solar Feature Catalogue
(SFC) and the Solar Event Catalogue (SEC), are
intended to provide a new entry point into solar data.
They will allow the user to search for events, features
and phenomena, rather than just date, time, location and
wavelength. Existing lists of events or features will
form the Event Catalogue. To extend these, and produce
a more systematic approach to solar features, the
Feature Catalogue will be produced using the feature
recognition software. A search of the Feature and Event
Catalogues will yield a list of dates, times and locations
that then link into a search of the observing catalogues.
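The two-stage search described above (feature or event hits first, observing catalogues second) might be sketched as follows. This is a hedged illustration only: FeatureHit, Observation, observations_for_hits and the 30-minute margin are hypothetical, and matching on pointing is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FeatureHit:
    """One entry returned by a feature or event catalogue search (assumed layout)."""
    kind: str
    time: datetime
    lon_deg: float
    lat_deg: float


@dataclass(frozen=True)
class Observation:
    """One entry of an observing catalogue (assumed layout)."""
    instrument: str
    start: datetime
    end: datetime


def observations_for_hits(hits, observations, margin=timedelta(minutes=30)):
    """Map each hit to the observations overlapping a window around its time.

    Only the time criterion is applied here; a fuller version would also
    compare the hit location with the instrument pointing and field of view.
    """
    result = {}
    for hit in hits:
        lo, hi = hit.time - margin, hit.time + margin
        result[hit] = [obs for obs in observations
                       if obs.start <= hi and obs.end >= lo]
    return result
```

The design choice worth noting is that the feature catalogue never needs to know which instruments exist; it only hands dates, times and locations to the second stage.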
The Solar Event Catalogue is being implemented as a
stand-alone server. This will hide the complex nature of
the event data, which must be gathered from a large
number of sources and are very heterogeneous. The
Server will permit complex searches over a number of
different types of list and will return the answer in a
standardized format. A similar server is planned for the
Solar Feature Catalogue. This will have much in
common with the SEC Server, but will have additional
capabilities to rotate image data to the same epoch to
allow the comparison of features on demand.
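Rotating a feature to a common epoch can be illustrated with a short sketch. It assumes the usual three-term form of a solar differential rotation law, rate = a + b sin²(lat) + c sin⁴(lat); the paper does not say which law or coefficients the SFC Server will use, so the default values below are rounded, illustrative numbers in degrees per day and should be replaced by whatever law is adopted. Frame corrections (synodic versus sidereal rate, observer motion) are ignored.

```python
import math


def rotation_rate(lat_deg, a=14.7, b=-2.4, c=-1.8):
    """Rotation rate in degrees per day at heliographic latitude lat_deg.

    a, b and c are illustrative, rounded coefficients (an assumption),
    not values taken from the EGSO design.
    """
    s2 = math.sin(math.radians(lat_deg)) ** 2
    return a + b * s2 + c * s2 * s2


def rotate_to_epoch(lon_deg, lat_deg, days, a=14.7, b=-2.4, c=-1.8):
    """Longitude of a feature after `days` (may be negative), wrapped to [0, 360)."""
    return (lon_deg + rotation_rate(lat_deg, a, b, c) * days) % 360.0


# Usage: bring two features seen a day apart to the epoch of the later image.
lon_equator = rotate_to_epoch(10.0, 0.0, 1.0)
lon_high_lat = rotate_to_epoch(10.0, 60.0, 1.0)
```

Features at high latitude advance more slowly than those at the equator, which is why a single rigid shift is not enough when comparing features from different epochs.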
It will be possible to access both the SFC and SEC
Servers from outside of EGSO. They will be usable by
VSO and CoSEC, as well as by projects like AstroGrid.
4.3 Tools to Select the Data
The entry point into EGSO for many users will be
through a graphical user interface (GUI). This will
allow the user to define criteria for selecting data
based either on date and time, pointing and wavelength,
or on features, events and phenomena. To assist the
user, synoptic images and other data will be used to
provide the context for high-resolution observations.
Once an initial search of the UOC provides a list of
available observations, the user will be able to refine the
selection with the aid of quick-look images and movies.
An alternative entry point into EGSO will be provided for
other communities that are interested in solar data.
These include climate physics, astrophysics,
solar-terrestrial physics and space weather. This entry point
will also provide access to EGSO from applications
such as IDL (e.g. from within SolarSoft).
4.4 Data Provider Federation
The metadata catalogues are used to relate the
heterogeneous solar data. They allow the user to
identify what observations are available; it is then
necessary to retrieve them.
The observations could be anywhere around the world,
in large or small data centres, and might even be held in
multiple locations – also, the data could be in the public
domain or proprietary. We recognise that the resources
available vary from centre to centre. Some smaller
data centres do not have resources for full federation,
but would still like to be involved. Mechanisms will be
provided to affiliate very small data sources to larger
centres; requests would be serviced by the larger
centres, which would then interact with the smaller sites
through some form of trusted host arrangement.
The user does not need to know this – the system should
take care of it all, selecting data sources and granting
user access as appropriate.
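One way to picture the source selection described in this section is the sketch below. It is an assumption-laden illustration, not the EGSO Broker: the Provider record, the resolve function, the host field standing in for the trusted host arrangement, and the simple set of authorised providers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Provider:
    """A data source known to a registry (assumed layout)."""
    name: str
    datasets: Set[str]
    public: bool = True
    # Name of the larger centre that services requests on behalf of a
    # small affiliated site; None for a fully federated provider.
    host: Optional[str] = None


def resolve(dataset, providers, authorised=frozenset()):
    """Return the contact points able to serve `dataset` for this user.

    Proprietary providers are skipped unless the user is authorised for
    them, and affiliated sites are reached through their host centre.
    """
    contacts = []
    for provider in providers:
        if dataset not in provider.datasets:
            continue
        if not provider.public and provider.name not in authorised:
            continue
        contact = provider.host or provider.name
        if contact not in contacts:
            contacts.append(contact)
    return contacts


# Usage: the small site never appears directly; its host answers for it.
providers = [
    Provider("big_centre", {"halpha"}),
    Provider("small_site", {"halpha", "radio"}, host="big_centre"),
    Provider("private_archive", {"radio"}, public=False),
]
radio_sources = resolve("radio", providers)
```

The caller only ever sees a list of contact points, which mirrors the intent that the user should not need to know where the data are held or how access is granted.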
5. THE EGSO DEMONSTRATOR
The first demonstrator of EGSO will be available
shortly. The initial implementation will include only a
subset of available solar data – enough to prove the
concept, and allow us to test standalone components.
Data from consortium members will be used in the first
instance. These provide what is needed to test EGSO:
heterogeneous data (both space- and ground-based)
scattered over a number of sites, with some duplication,
and with a variety of data formats and catalogue
capabilities. Emphasis will be placed on the user
interface so that we can establish the optimal way to
design it and provide maximum search capabilities.
6. SUMMARY
The European Grid of Solar Observations will provide
the tools necessary for a virtual observatory, but is
essentially a Grid testbed. EGSO will bring about a sea change
in the way solar data are accessed. In collaboration with
other groups, we are striving to ensure that the project
will find global acceptance and lead to the creation of a
worldwide virtual solar observatory.
Further details about the EGSO project can be found
under the URL http://www.egso.org
7. REFERENCES
1. Freeland, S.L. and Handy, B.N., SolarSoft, Solar
Physics, Vol. 182, 497, 1998.
2. Sanchez Duarte, L., Fleck, B. and Bentley, R., The
Whole Sun Catalogue, Proceedings of 1st Advances
in Solar Physics Euroconference, ASP Conf. Ser.
Vol. 119, 382, 1997.