Stellar spectra parametrization by neural networks: extraction of Teff, logg and [Fe/H] in
the Gaia RVS spectral region
Minia Manteiga Outeiro1, Diego Ordóñez Blanco2, Carlos Dafonte Vázquez2,
Bernardino Arcay Varela2 and Iciar Carricajo Marín1
1 Dep. of Navigation and Earth Sciences, University of A Coruña
2 Dep. of Information Technologies and Communications, University of A Coruña
Summary. The Gaia satellite, which is foreseen to be launched near the end of
2011, is one of the key scientific missions of the European Space Agency. Gaia will
carry out what is called a census of the Galaxy, by compiling exact information on
the nature and motion of its main components. It will perform precise astrometry,
multiepoch spectrophotometry and medium resolution spectroscopy (R=11500) for
the brighter sources.
Gaia will be equipped with one spectrograph, the RVS, which will contribute to the study of the nature of the sources and will allow radial velocities to be determined with precisions between 1 and 10 km/s for V=16-17 mag. The RVS domain is the Ca II infrared region, 847-874 nm, which is rich in diagnostic lines for the determination of stellar atmospheric parameters, in particular effective temperatures, surface gravities and overall metallicities. Comprehensive information about the Gaia project can be found in the Gaia web area.
Data handling, analysis and classification of information covering the complete
sky down to magnitude 17-18 is, without doubt, a challenge for both Astrophysics
and Computational Sciences. We present here our first results on the automatic
derivation of stellar parameters in the RVS spectral region, by the use of artificial
neural networks (ANN) trained with synthetic model spectra. It is shown that the
results achieved are comparable to those obtained by the use of spectrophotometry,
with their accuracy depending strongly on the spectral signal-to-noise ratio.
1 Gaia RVS instrument
The Radial Velocity Spectrometer (RVS) is an integral-field spectrograph dispersing the light of the field of view with a nominal resolving power R = 11500. The
RVS instrument operates in time-delayed integration mode, observing each
source about 40 times during the 5 years of the mission. The RVS wavelength
range, 847-874 nm, has been selected to coincide with the energy-distribution
peaks of G and K-type stars, which are the most abundant RVS targets. For
these late-type stars the wavelength interval displays three strong ionized calcium lines and numerous weak lines, mainly due to Fe, Si and Mg. In early-type
stars, RVS spectra will be dominated by hydrogen Paschen lines and
may contain weak lines such as Ca II, He I, He II and N I.
Over the 5-year mission, RVS will collect around 5 billion transit spectra of the brightest 100-150 million stars in the sky. The on-ground analysis
of this spectroscopic data set will be a complex and challenging task, not
only because of its volume but also because of the interdependence of the different instruments and modes of observation. As a consequence, data extraction and
parametrization should be performed in a fully automatic fashion. We
think that the use of Artificial Intelligence techniques and, in particular, artificial neural networks (ANN), is a good approach to be tested on the
Gaia-RVS dataset.
2 Spectralib: a library of synthetic spectra for Gaia RVS
Our initial approach consists of performing simulations of stellar parameter
extraction by means of synthetic spectra. We have used the Gaia RVS Spectralib, a library of 9285 stellar spectra compiled by A. Recio-Blanco and P. de
Laverny from Nice Observatory, and B. Plez from Montpellier University. The
spectra are based on the new generation of MARCS models from the Uppsala
Observatory and cover the spectral region 847.5-874.5 nm. A technical note describes the model atmospheres from which the synthetic
spectra were calculated and the parameters that were used (Recio-Blanco et al.,
2006). The grid consists of spectra corresponding to effective temperatures
between 4000 and 8000 K (step 250 K), logg between -1.0 and 5.0 (step 0.5 dex),
and overall metallicities between -5.0 and 1.0 (with a variable step from 1.0 to
0.25 dex). For each model atmosphere, alpha-element abundance variations
of +0.4, +0.2, +0.0, -0.2 and -0.4 dex were considered with respect to the
original abundances in the models. Access to RVS-Spectralib is open via the
ESA Gaia web pages.
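As a rough illustration of the grid sampling quoted above, the following Python sketch enumerates the Teff and logg axes and the alpha-element offsets. The metallicity axis is left out because its variable step is not fully specified in the text, and all variable names are illustrative rather than taken from Spectralib.

```python
import numpy as np

# Grid axes as quoted in the text; the names are hypothetical, not Spectralib's.
teff_axis = np.arange(4000, 8000 + 1, 250)              # effective temperature, K
logg_axis = np.arange(-1.0, 5.0 + 0.25, 0.5)            # surface gravity, dex
alpha_offsets = np.array([0.4, 0.2, 0.0, -0.2, -0.4])   # dex, relative to the models

n_teff = len(teff_axis)
n_logg = len(logg_axis)
n_alpha = len(alpha_offsets)

print("Teff values:", n_teff)
print("logg values:", n_logg)
print("alpha-element offsets:", n_alpha)
```

The counts only describe the sampling along each axis; they say nothing about which parameter combinations are actually present among the 9285 spectra of the library.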
3 The use of ANN for spectral parametrization
Among the different techniques of Artificial Intelligence, ANN have already
proved successful in classification problems: they are generally capable of
learning the intrinsic relations that reside in the patterns with which they are
trained. Several well-known previous works have applied this technique to the problem of stellar spectral parametrization, obtaining
different degrees of accuracy in the extraction of the parameters Teff, logg, [Fe/H]
and [alpha/H]. A summary of the current status of automated stellar classification techniques and achievable accuracies can be found in the reviews by
Bailer-Jones (2001) and Allende Prieto (2004).
Fig. 1. Schematic figure illustrating the location of the RVS optical module and
CCDs. Figure courtesy of EADS Astrium.
In order to test the ability to extract stellar atmospheric parameters
from Gaia-RVS spectra, we chose to train ANN with the ad hoc
synthetic spectra introduced in the previous section. A simple network architecture was chosen to perform the initial tests: a feedforward ANN
with 333 input nodes (the number of pixels in the spectra), 1 hidden layer
with 150 nodes, and 3 output nodes providing the three atmospheric parameters, Teff, logg and [Fe/H]. The original synthetic spectra were degraded in
spectral sampling by averaging the flux over every three pixels, which reduces the number of
input nodes from 1004 to 333 and avoids very long, and probably unnecessary,
computational times. An empirical rule restricts the number of nodes in
the hidden layer to 0.3-0.5 times the number of input nodes.
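The binning step and the 333-150-3 layout described above can be sketched in Python with NumPy as follows. This is only a minimal illustration under stated assumptions: the text does not say which pixels are discarded to go from 1004 to 333 points (here the first 999 are kept), nor which activation functions or weight initialisation were used (here a sigmoid hidden layer, linear outputs and small random weights). The function names are hypothetical.

```python
import numpy as np

N_PIXELS, N_INPUT, N_HIDDEN, N_OUTPUT = 1004, 333, 150, 3

def bin_spectrum(flux, n_out=N_INPUT, factor=3):
    """Average the flux over groups of `factor` pixels.

    Assumption: the first n_out * factor pixels are kept and the rest dropped.
    """
    trimmed = np.asarray(flux, dtype=float)[: n_out * factor]
    return trimmed.reshape(n_out, factor).mean(axis=1)

def init_network(rng):
    """Small random weights; the initialisation scheme is an assumption."""
    return {
        "w1": rng.normal(0.0, 0.1, size=(N_INPUT, N_HIDDEN)),
        "b1": np.zeros(N_HIDDEN),
        "w2": rng.normal(0.0, 0.1, size=(N_HIDDEN, N_OUTPUT)),
        "b2": np.zeros(N_OUTPUT),
    }

def forward(params, x):
    """One forward pass: sigmoid hidden layer, linear outputs (assumed)."""
    hidden = 1.0 / (1.0 + np.exp(-(x @ params["w1"] + params["b1"])))
    return hidden @ params["w2"] + params["b2"]

rng = np.random.default_rng(0)
flux = rng.uniform(0.5, 1.0, size=N_PIXELS)   # stand-in for a synthetic spectrum
x = bin_spectrum(flux)
outputs = forward(init_network(rng), x)       # untrained, so the values are meaningless
print(x.shape, outputs.shape)
```

With these sizes the hidden layer holds 150/333, or about 0.45, times the number of input nodes, which falls inside the empirical range quoted in the text.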
A total of 1764 well-distributed spectra were included in the training sets,
while tests were performed on subsets of 465 spectra. Typically, good network convergence and low parameter errors were achieved
after about 3000 training cycles, which translates to about 3 hours of computational time on an AMD 64 computer.
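The text does not state the training algorithm, so the following Python sketch simply assumes plain full-batch gradient descent with backpropagation of a mean squared error, applied to a scaled-down toy problem (random inputs, 20-8-3 nodes, 500 cycles) instead of the 1764 training spectra and 3000 cycles quoted above. The function name, the learning rate and the tanh activation are illustrative choices.

```python
import numpy as np

def train(x, y, n_hidden, n_cycles, lr, rng):
    """Train a one-hidden-layer network by full-batch gradient descent."""
    n_in, n_out = x.shape[1], y.shape[1]
    w1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(n_cycles):
        hidden = np.tanh(x @ w1 + b1)      # hidden activations
        pred = hidden @ w2 + b2            # linear outputs
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the gradient of the mean squared error.
        g_pred = 2.0 * err / err.size
        g_w2 = hidden.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_hidden = (g_pred @ w2.T) * (1.0 - hidden ** 2)
        g_w1 = x.T @ g_hidden
        g_b1 = g_hidden.sum(axis=0)
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses

rng = np.random.default_rng(1)
x_train = rng.normal(0.0, 1.0, (200, 20))             # toy inputs, not spectra
y_train = x_train @ rng.normal(0.0, 1.0, (20, 3))     # toy targets
y_train = (y_train - y_train.mean(axis=0)) / y_train.std(axis=0)

params, losses = train(x_train, y_train, n_hidden=8, n_cycles=500, lr=0.05, rng=rng)
print("first and last training loss:", losses[0], losses[-1])
```

Standardising the targets, as done here, keeps the three outputs on a comparable scale; whether a similar normalisation was applied to Teff, logg and [Fe/H] is not stated in the text.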
Gaia-RVS spectra will be of very different quality, depending mostly on
the stellar brightness. It has been proposed that the end-of-mission SN ratio
for a typical star in the Galaxy, a G5V with V=15.5 or an F2II with V=14.5,
will be about 10. In order to take into account the effect of the SN value on
the ANN performance, we have carried out tests with four values
of the SN ratio: 10, 20, 50 and 100. The noise model considered was simple
Gaussian white noise, introduced using the IRAF mknoise routine.
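For readers without IRAF, the noise step can be approximated in Python with NumPy. This sketch is a stand-in, not a reproduction of mknoise: it assumes continuum-normalised spectra, so that a given SN ratio corresponds to a per-pixel Gaussian standard deviation of 1/SN. The function name is hypothetical.

```python
import numpy as np

def add_gaussian_noise(flux, snr, rng):
    """Add white Gaussian noise to a spectrum.

    Assumption: the flux is continuum-normalised, so sigma = 1 / snr per pixel.
    """
    flux = np.asarray(flux, dtype=float)
    sigma = 1.0 / snr
    return flux + rng.normal(0.0, sigma, size=flux.shape)

rng = np.random.default_rng(42)
clean = np.ones(100_000)   # flat continuum used as a stand-in spectrum

# One noisy copy per SN value considered in the text.
noisy = {snr: add_gaussian_noise(clean, snr, rng) for snr in (10, 20, 50, 100)}

# Sanity check: recover the SN ratio from the scatter about the continuum.
measured = {snr: clean.mean() / spec.std() for snr, spec in noisy.items()}
for snr, value in measured.items():
    print(snr, round(value, 1))
```

A flat spectrum is used only so that the recovered SN ratio can be checked directly; for a real spectrum the same function would be applied to the binned synthetic fluxes.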
Fig. 2. Results on Teff parametrization. Columns show the mean errors in K for
the different values of the SN ratio of the training set, while lines refer to errors in
the validation sample.
4 Preliminary results
The mean errors in the extraction of effective temperatures, gravities and
metallicities are shown in Figures 2, 3 and 4. In each of the figures, columns
show the mean errors in each stellar parameter for the different values
of the SN ratio of the training set, while lines refer to that of the test sample.
The diagonal in each of the three figures shows the performance of
the network when the training and test sets have the same SN ratio. Mean
errors as low as 21.5 K, 0.09 dex and 0.08 dex for effective temperature, logg
and metal abundance, respectively, were reached for ANN trained and tested
on synthetic spectra with no noise added. The errors grow to 67 K,
0.16 dex and 0.11 dex in the case of SN 100; 95 K, 0.22 dex and 0.16 dex for
SN 50; 204 K, 0.44 dex and 0.27 dex for SN 20; and finally 382 K, 0.75 dex and
0.46 dex for SN 10.
From the data in the figures, it is clear that the SN ratio heavily influences
the learning process, and that poor results are obtained when the
training and test spectral samples have different SN values.
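The text does not define "mean error" precisely. A minimal sketch, assuming it is the mean absolute difference between predicted and true parameters over the test sample, could look as follows in Python. The numbers below are made-up toy values for illustration only, not results from this work.

```python
import numpy as np

def mean_abs_error(predicted, true):
    """Mean absolute error per parameter (columns: Teff, logg, [Fe/H])."""
    predicted = np.asarray(predicted, dtype=float)
    true = np.asarray(true, dtype=float)
    return np.mean(np.abs(predicted - true), axis=0)

# Toy values only: two test spectra, three parameters each.
true = np.array([[5000.0, 4.5, 0.0],
                 [6000.0, 4.0, -1.0]])
pred = np.array([[5100.0, 4.3, 0.1],
                 [5950.0, 4.2, -1.2]])

errors = mean_abs_error(pred, true)
print("mean |error| in Teff (K), logg (dex), [Fe/H] (dex):", errors)
```

Evaluating such a function for every pair of training and test SN values would fill one table per parameter, with the matched-SN cases on the diagonal.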
Fig. 3. Results on logg parametrization. Columns and lines as in Figure 2.
Fig. 4. Results on metallicity parametrization. Columns and lines as in previous figures.
5 Conclusions and future work
We presented our first results on the automatic derivation of stellar parameters in the RVS spectral region, by the use of artificial neural networks trained
with synthetic model spectra. The results achieved are comparable to those
obtained by the use of spectrophotometry, with the accuracy depending strongly on the signal-to-noise ratio of the spectra.
Our results show that ANN can be a good approach to extract atmospheric
parameters from Gaia-RVS spectra, provided that the SN ratio of the training
and test spectral sets is well characterized. Mean errors as low as 95 K,
0.22 dex and 0.16 dex for effective temperature, logg and metal abundance,
respectively, were reached for ANN trained and tested on synthetic spectra
with SN 50.
Future work includes tests with different ANN architectures; the consideration of spectra with the original Gaia-RVS sampling of
1004 flux points; an improvement in the statistics of the test set; and a
statistical study of the effect of noise on the ANN performance.
