An Analysis of Noise in the CoRoT Data
by
Aaron Sampson
Submitted to the Department of Physics
in partial fulfillment of the requirements for the degree of
Bachelor of Science in Physics
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2010
© Massachusetts Institute of Technology 2010. All rights reserved.
Author
Department of Physics
May 17, 2010
Certified by
Sara Seager
Associate Professor of Physics
Thesis Supervisor
Accepted by
Professor David E. Pritchard
Senior Thesis Coordinator, Department of Physics
An Analysis of Noise in the CoRoT Data
by
Aaron Sampson
Submitted to the Department of Physics
on May 17, 2010, in partial fulfillment of the
requirements for the degree of
Bachelor of Science in Physics
Abstract
In this thesis, publicly available data from the French/ESA satellite mission CoRoT,
designed to seek out extrasolar planets, was analyzed using MATLAB. CoRoT attempts
to observe the transits of these planets across their parent stars. CoRoT
occupies an orbit which periodically carries it through the Van Allen belts, resulting
in a very large number of high outliers in the flux data. Known systematics and
outliers were removed from the data and the remaining scatter was evaluated using the
median of absolute deviations from the median (MAD), a measure of scatter which is
robust to outliers. The level of scatter (evaluated with MAD) present in this data is
indicative of the lower limits on the size of planets detectable by CoRoT or a similar
satellite. The MAD for CoRoT stars is correlated with the magnitude. The brightest
stars observed by CoRoT display scatter of approximately 0.02 percent, while the
median value for all stars is 0.16 percent.
Thesis Supervisor: Sara Seager
Title: Associate Professor of Physics
Acknowledgments
I would like to thank everyone who helped me with this work, in particular Sukrit
Ranjan and Lisa Messeri, with whom I worked on the early stages of the project.
Thank you also to Dr. Suzanne Aigrain, who was extremely helpful in explaining the
CoRoT mission, how the data is reported, and her techniques for analyzing the data.
Thank you also to Elizabeth Adams for her help with understanding and calculating
transit properties. Above all, I would like to thank Professor Sara Seager for her
help and advice throughout the project. From the early stages of the analysis to the
writing of this thesis, her insight has been critical to its completion.
Contents
1 Introduction
2 Noise Reduction Methods
2.1 CoRoT Data
2.2 Corrections
2.3 Identifying and removing Systematics
2.4 Evaluating Scatter
3 Results
3.1 CoRoT Light Curves and Corrections
3.2 Scatter
3.3 Calculating Transit Properties
3.4 Limits on Detection
4 Relation to Other Missions
A Figures
B MATLAB Code
B.1 Read Light Curve
B.2 Apply Corrections
B.3 Save Corrections
B.4 Save MAD, Magnitude
List of Figures
A-1 This light curve is typical of CoRoT data before the application of
any corrections. There are a very large number of very high outliers,
a pronounced linear trend (upward, in this case, indicating that pointing
drift allowed more light into the area designated for the star), and low
outliers over one portion of the observing run.
A-2 After the application of the correction techniques described in Section 2,
the CoRoT light curves exhibit significantly reduced outliers and have
had any linear trend removed. The low outliers associated with
entry into or exit from the Earth's shadow (and loss of accuracy) are still
apparent toward the end of the light curve.
A-3 Scatter is plotted against R magnitude here for stars observed during
the first short run designated as Chromatic, meaning that they
are bright enough to have their flux reported in three separate color
channels: red, green, and blue. The R magnitude is reported to only
one decimal place for stars in the short run. Scatter for this data set
ranges from approximately 0.0003 to 0.002, with a positive correlation
between magnitude and scatter. Scatter here is the median of absolute
deviations from the median.
A-4 Plotted here is the scatter (MAD) versus R magnitude for the Monochromatic
stars from the short run, those stars too dim to have their flux
reported in separate color channels. The MAD values are in the same
range as the chromatic set, and higher for the dimmest stars.
A-5 Equivalent scatter-magnitude plots were made for the first long (150
day) run data sets. The magnitude of these stars is reported with
greater precision, with four decimal places in the FITS header, and
the correlation between magnitude and scatter is apparent.
A-6 Scatter (MAD) is plotted against R magnitude for Chromatic stars
observed during the first long observing run. The plot shows a clear
correlation between magnitude and scatter for these stars.
A-7 The blue curve above represents the relationship between the planet/star
radius ratio and transit depth. Also plotted here and below are the
transit depths of various planet/star pairs and the limits on transit
depth imposed by CoRoT-level noise.
A-8 More transit depths are shown here, for the Earth and sun and CoRoT-7 b,
both of which are on the same order of magnitude as the lowest
threshold of detectability imposed by the scatter in CoRoT light curves.
A-9 Here, a quadratic fit to the scatter/magnitude plot from the long run
monochromatic stars is shown along with transit depths for various
planet/star pairs, showing which types of planets could be detectable
at which magnitudes.
Chapter 1
Introduction
In recent years, there has been an abundance of newly discovered extrasolar planets,
planets which orbit stars other than our own. Detecting these planets beyond the
boundaries of our solar system requires highly precise and careful measurements of the
stars around which these previously unseen objects orbit. One of the best methods
for detecting these planets is observing their transit across their parent stars. As
the planets orbit their stars, they will sometimes pass directly between the star and
observers on Earth, thereby blocking part of the light coming from that star. Although
most known extrasolar planets have been discovered by other methods, transit
observation is a method which holds great potential, especially for discovering small,
Earth-like planets. Measuring transits for these planets of greatest interest is especially
challenging, however, because the change in light from the star is very small. It is
therefore of great interest to determine the noise limits on detecting such planets
through this method.
Exoplanet transits can actually provide a great deal of information. Precise measurements can reveal both primary (planet passing in front of star) and secondary
eclipses (planet passing behind star) as well as the smaller differences in incident flux
when the illuminated side of the planet is visible (before or after a secondary transit)
versus when the dark side is visible (near the primary transit) [10]. The difficulty in
discovering planets via observation of transits arises in large part from the fact that
the star-planet system needs to be oriented quite precisely in order for the two distant
bodies to align as viewed from Earth. In fact, for those most interesting extrasolar
planets, Earth-sized bodies occupying a habitable zone around their parent stars, the
probability of observing the system in proper alignment for a transit is around 0.5
percent [6].
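The quoted figure is consistent with the usual geometric estimate, which is an assumption added here rather than a formula stated in this thesis: for a circular orbit and a planet much smaller than its star, the chance that a randomly oriented orbit produces a transit is roughly the stellar radius divided by the orbital radius. A short Python sketch for an Earth-Sun analog, using rounded physical constants and an illustrative function name:

```python
SOLAR_RADIUS_KM = 6.957e5       # nominal solar radius (rounded)
ASTRONOMICAL_UNIT_KM = 1.496e8  # mean Earth-Sun distance (rounded)


def transit_probability(star_radius_km, orbit_radius_km):
    """Geometric transit probability for a circular orbit and a small planet."""
    return star_radius_km / orbit_radius_km


earth_analog = transit_probability(SOLAR_RADIUS_KM, ASTRONOMICAL_UNIT_KM)
print(f"Transit probability: {100 * earth_analog:.2f} percent")
```

The result is a little under half a percent, in line with the value cited from [6].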
CoRoT (COnvection ROtation and planetary Transits) is a satellite mission administered by the French and European Space Agencies (CNES and ESA). Launched
via Soyuz rocket on December 27, 2006 from Baikonur Cosmodrome in Kazakhstan,
the mission has two purposes. In addition to searching for extrasolar planets, CoRoT
performs asteroseismology measurements, studying the pulsation of stars. Seeking
out exoplanet transits and studying asteroseismology both require measuring (very
carefully and precisely) the intensity of light received from stars over periods of time.
The needs of these dual missions are, however, somewhat different. The added
difficulty for transit detection comes from the fact that, in order to be sure a planet
is observed while it is transiting, observations must be carried out throughout the
planet's entire orbital period. For a planet like Earth, the orbital period would be on
the order of one year, necessitating a similarly long session of observing. So, while the
detection of transits requires observing stars for as long as possible to maximize the
probability of finding the star-planet system in the proper alignment, asteroseismology
requires only relatively short periods of observation. For this reason, CoRoT alternates
between long and short runs of data collection, a compromise between its dual missions.
The short runs are 20 days in duration, and the long runs are 150 days. The publicly
available data from CoRoT is therefore released by run. The satellite also alternates
between pointing in two opposite directions, reversing direction every six months to
avoid the sun, which would otherwise come too close to its field of view. These two
patches of sky, each ten degrees in radius, are known as the two "eyes of CoRoT," and
all observations are made within these areas. Both lie in the galactic plane, one
toward the center of the galaxy and the other toward its outer edge [7].
The instruments on board CoRoT reflect the nature of its mission. CoRoT
consists of an afocal telescope, housed in a baffle designed to block reflected sunlight
from the Earth. The telescope is composed of two off-axis mirrors. Light reflected
from the mirrors reaches the craft's wide-field camera, four CCDs (charge-coupled
devices) in 2048 by 2048 pixel arrays, shielded from radiation by 10 mm of aluminum.
Each pixel is a 13.5 by 13.5 micrometer square, which corresponds to an angular
view of 2.32 arcseconds. In order to avoid saturation of the CCDs, the starlight is
defocused, allowing for more precise photometry. In addition, the starlight is passed
through a prism before reaching the CCD, spreading the light out into a spectrum
[8]. All of this means that although CoRoT is really just a telescope in space, there
will be no images from CoRoT of the type that the Hubble Space Telescope has been
dazzling the public with for years. As incredible as it would be to have an image of a
terrestrial extrasolar planet, there would simply be no way to spatially resolve such
a planet, and that is not CoRoT's mission.
To date, data from a total of seven data collection runs have been released. The
initial run was in the direction of Monoceros, away from the galactic center, and was
followed by the first short run in the direction of Serpens Cauda and the first long run
in the direction of Aquila, both toward the galactic center. The data analyzed in this
study come from these first long and short runs to be publicly released [7].
CoRoT occupies a circular polar orbit, carrying it over both of the Earth's poles.
It observes in directions less than ten degrees from perpendicular to its orbital plane,
which means that there are no occultations by the Earth of its targeted objects.
This allows it to observe continuously throughout its months-long data collection
runs. While in this respect the orbit is well-suited to the mission, unfortunately the
satellite's orbital characteristics do introduce several problems which must be dealt
with in processing the data. First, the satellite frequently crosses the region known
as the South Atlantic Anomaly. This is the region where the Van Allen radiation
belt comes closest to the surface of the Earth. The Van Allen belt is a toroidal
region around the Earth where plasma is trapped by the Earth's magnetic field. The
result of crossing through the South Atlantic Anomaly, then, is that the satellite is
bombarded with charged particles. Additionally, the satellite must travel in and out
of the shadow of the Earth. Data points affected by either of these sources of noise
should be identified and flagged, making them easy to remove, but the flagging process
is often imperfect, necessitating the corrections described below [9].
These problems have not, however, prevented CoRoT from making groundbreaking
discoveries. So far, the satellite has discovered nine confirmed extrasolar planets.
Most of these planets are quite like the vast majority of known extrasolar planets:
so-called "hot Jupiters," giant planets orbiting very close to their stars. Often these
worlds are as large as or larger than Jupiter but orbit within the distance of Mercury's
orbit. Planets of this type discovered by CoRoT range from 0.467 Jupiter masses to
3.31 Jupiter masses. The body designated CoRoT-3 b has a mass of 21.66 Jupiter
masses, so large that it is considered a brown dwarf, and not a planet at all. The most
significant planet to be discovered so far, however, is by far the smallest. Designated
CoRoT-7 b, this planet has a mass only 0.0151 times that of Jupiter, or 4.8 Earth
masses. This, along with its measured radius of 1.7 Earth radii, gives it a density
similar to Earth, meaning that it is in fact a small, terrestrial planet. While its position
very close to its star, with a semimajor axis of only 0.0172 astronomical units, makes
it a very hot, inhospitable place not much like Earth, it is still an extremely exciting
discovery that brings us one step closer to finding a close Earth analog [2].
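The claim of an Earth-like density can be checked with simple scaling: in Earth units, mean density goes as mass divided by radius cubed. A minimal Python sketch using only the mass and radius quoted above (the function name is illustrative, not from the thesis):

```python
def relative_density(mass_earths, radius_earths):
    """Mean density in units of Earth's density, from mass and radius in Earth units."""
    return mass_earths / radius_earths ** 3


# CoRoT-7 b values quoted in the text: 4.8 Earth masses, 1.7 Earth radii.
corot7b_density = relative_density(4.8, 1.7)
print(f"CoRoT-7 b density: {corot7b_density:.2f} times Earth's")
```

With these inputs the ratio comes out just below one, which is what "a density similar to Earth" expresses.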
Chapter 2
Noise Reduction Methods
2.1 CoRoT Data
CoRoT data is released in the standard astronomical Flexible Image Transport System (FITS) format. The science-quality data from CoRoT is designated N2, and is
released to the public one year after it becomes available to CoRoT scientists. This
analysis was performed using MATrix LABoratory (MATLAB) [1], whose built-in
fitsread function was used to read the CoRoT data files. Each file
made available by CoRoT corresponds to photometry data on one star over the course
of an entire observing run.
The relevant data is contained in the binary table of the FITS files. From here,
the date, measured flux, and error are read out using fitsread. The flux reported
by CoRoT is in some cases broken down into red, green, and blue color channels.
(Only the brighter stars observed by CoRoT were treated in this way.) The nomenclature used for these two ways of treating data is somewhat nonstandard. Where
the starlight was bright enough to report flux in multiple color channels, the stars are
referred to as "chromatic." "Monochromatic" stars do not have just one color band
reported, as one might expect, but the full white flux. These color bands reported by
CoRoT for "chromatic" stars originated not from multiple filters, but from a prism
on the satellite which spread the starlight out on the CCD. The red, green, and blue
channels were defined by taking the long, middle, and short wavelength parts of this
spectrum, divided based on percentage of light rather than at a fixed wavelength cutoff.
For F, G, and K type stars, most of the flux is in the red part of the spectrum, and
CoRoT's red channel is defined as the longest wavelength 40 percent of the light.
Therefore, after the date, red flux, green flux, blue flux, white flux, and error are
read in from the binary table, we choose the white flux (or equivalently, combine the
three color channels) and output a matrix with the date, flux, and flux error. This
matrix can be plotted to obtain a light curve like Figure A-1, or it can be saved as a
matrix for further processing.
Additional information is included in the header of each FITS file, which can be
accessed using another built-in MATLAB function, fitsinfo. This was the source for
the magnitude information for all stars. R magnitude was used for comparison with
scatter.
2.2 Corrections
After obtaining date, flux, and error data from the CoRoT FITS files, a number of
corrections were applied to the data to remove noise from known sources. First, any
light curves with clear irregularities (zero error or non-numerical flux values) were
rejected entirely. The principal correction then applied to each light curve was the
identification and removal of outliers. Light curves plotted from the raw data released
by CoRoT exhibited a very large number of high outliers. Aigrain et al. attribute this
abundance of high outliers principally to charged particles hitting the detector during
the satellite's frequent crossing of the South Atlantic Anomaly [9] (see Section 3.1).
The high outliers were removed after the application of a median filtering and
smoothing process. The data was binned according to timescales equivalent to the
orbital period of the satellite, 103 minutes. The smoothing of the data consisted of
applying two five-point boxcar filters, first a mean filter and then a median filter.
First, the standard mean filter is applied. This means that for every point in the
light curve, the mean of the surrounding five-point block (two points at earlier times,
the point itself, and two points at later times) is calculated. A new list of data points
is then created corresponding to the true data read in from the FITS file, but with the
true dates and the five-point block means replacing the original flux values. This
produces a smoothed light curve with fewer outliers. This standard mean filter is
then followed by a median filter. The median is more resistant to the influence of
outliers than the mean, and the smoothing therefore benefits from this step.
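The two-stage smoothing can be sketched as follows. This is an illustrative Python version, not the MATLAB code of Appendix B; the function names are hypothetical, and the choice to shrink the window at the ends of the light curve is an assumption, since the text does not say how edges were handled.

```python
from statistics import mean, median


def boxcar(values, reducer, width=5):
    """Apply a centred boxcar filter; the window shrinks at the edges (assumption)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(reducer(values[lo:hi]))
    return out


def smooth(flux):
    """Five-point mean filter followed by a five-point median filter."""
    return boxcar(boxcar(flux, mean), median)


# A flat light curve with one very high outlier.
flux = [10.0, 10.0, 10.0, 60.0, 10.0, 10.0, 10.0, 10.0, 10.0]
smoothed = smooth(flux)
print(smoothed)
```

The dates are left untouched; only the flux column is replaced by its smoothed counterpart, as described above.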
The outliers are identified and removed based on this smoothed data. The set
of smoothed data is broken up into samples of one thousand points for which the
median is calculated. For each point, the residual (the distance from the median
of the thousand point block) is calculated, and then the standard deviation for the
sample is found in the standard way by summing the squares of the residuals, dividing
by the number of points in the sample, and taking the square root.
$$\sigma = \sqrt{\frac{1}{N}\sum_i \left(f_i - \tilde{f}\right)^2} \qquad (2.1)$$
Now, returning to the original data, for each point more than three standard
deviations away from the median of the corresponding smoothed data, its index is
noted, and the corresponding point is removed from the original data. In this way, a
light curve free of the very high or low outliers is obtained. The cutoff of
three standard deviations allowed very high confidence that all rejected points were
the result of unusual events such as hot pixels resulting from cosmic ray hits on the
detector.
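A minimal Python sketch of this rejection step is given below. It assumes a smoothed copy of the flux is already available, and it is a stand-in for, not a copy of, the thesis's MATLAB routine; the function and argument names are hypothetical.

```python
from math import sqrt
from statistics import median


def reject_outliers(times, flux, smoothed, block=1000, n_sigma=3.0):
    """Drop raw points more than n_sigma from the median of their smoothed block."""
    kept_t, kept_f = [], []
    for start in range(0, len(flux), block):
        stop = min(start + block, len(flux))
        ref = smoothed[start:stop]
        centre = median(ref)
        # Equation (2.1): RMS residual of the smoothed block about its median.
        sigma = sqrt(sum((s - centre) ** 2 for s in ref) / len(ref))
        for t, f in zip(times[start:stop], flux[start:stop]):
            if abs(f - centre) <= n_sigma * sigma:
                kept_t.append(t)
                kept_f.append(f)
    return kept_t, kept_f


times = [float(i) for i in range(10)]
flux = [100.0, 101.0, 99.0, 100.0, 500.0, 100.0, 101.0, 99.0, 100.0, 100.0]
smoothed = [100.0, 100.5, 99.5, 100.0, 101.0, 100.0, 100.5, 99.5, 100.0, 100.0]
kept_t, kept_f = reject_outliers(times, flux, smoothed)
print(len(kept_f), "points kept")
```

Because the threshold is computed from the smoothed data, a single extreme point barely inflates the standard deviation and is therefore still rejected.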
2.3 Identifying and removing Systematics
Another problem with the data involves a long-term linear decay, which Aigrain et al.
note and assume to be of instrumental origin. In addition to this, other light curves
obtained seem to exhibit an opposite effect, an increase over the course of observation.
The instrumental origin of this phenomenon would seem to be a slow drift of the
satellite's field of view. This results in a linear trend in the flux data because there are
particular areas of the CCD designated as corresponding to particular stars. When
the satellite drifts, it can become misaligned and allow either more (if there is another
bright object nearby) or (more often) less light into the area designated for the star
of interest.
In order to deal with the problem of instrumentally introduced linear trends,
MATLAB's fitlin function was used to fit a linear polynomial to the light curve. This
line is then simply subtracted from the data, or more precisely, at each point in
the light curve, the y-value (flux) of the line at the corresponding x-value (time) is
subtracted from the data point.
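The detrending step can be illustrated in Python with a closed-form ordinary least-squares line, which stands in here for the MATLAB fitting routine named above; the function names are hypothetical.

```python
def fit_line(times, flux):
    """Ordinary least-squares slope and intercept of flux against time."""
    n = len(times)
    t_mean = sum(times) / n
    f_mean = sum(flux) / n
    num = sum((t - t_mean) * (f - f_mean) for t, f in zip(times, flux))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den
    return slope, f_mean - slope * t_mean


def detrend(times, flux):
    """Subtract the fitted line, evaluated at each time, from each flux value."""
    slope, intercept = fit_line(times, flux)
    return [f - (slope * t + intercept) for t, f in zip(times, flux)]


times = [0.0, 1.0, 2.0, 3.0, 4.0]
flux = [2.0 * t + 5.0 for t in times]  # purely linear drift
residuals = detrend(times, flux)
print(residuals)
```

For a purely linear input the residuals vanish to rounding error; on real data what remains is the flux variation about the trend.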
Many of the systematics one would expect to find in the CoRoT data are periodic
in nature, due to the orbital period of the satellite and its periodic entries and exits
from the Earth's shadow. One approach to characterizing and ultimately removing
these systematics would employ Fourier analysis. While this approach was initially
attempted, it was never fully developed, and instead long-timescale variation was
removed via median subtraction.
In this method of removing low-frequency variation (chosen as variation on timescales
longer than two days), the median of the two days of points surrounding each point
was calculated and subtracted off. This was effectively a high-pass filter, two days
being chosen because it is much longer than any expected transit duration. After
this subtraction, the overall median for the entire light curve was added back to each
point in order to restore the flux values to their original magnitude.
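A simple Python sketch of this running-median subtraction follows. It assumes times are in days and interprets "the two days of points surrounding each point" as a window of one day on either side; it is quadratic in the number of points and meant only to show the logic, not to reproduce the thesis's MATLAB implementation.

```python
from statistics import median


def highpass_median(times, flux, window_days=2.0):
    """Subtract a running median over window_days, then restore the overall median."""
    half = window_days / 2.0
    overall = median(flux)
    out = []
    for t0, f0 in zip(times, flux):
        local = [f for t, f in zip(times, flux) if abs(t - t0) <= half]
        out.append(f0 - median(local) + overall)
    return out


times = [0.5 * i for i in range(20)]   # ten days sampled twice per day
flux = [100.0 + t for t in times]      # slow drift, much longer than two days
filtered = highpass_median(times, flux)
print(max(filtered) - min(filtered))
```

The slow drift is largely removed while the overall flux level is preserved, which is the behaviour the paragraph above describes.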
2.4 Evaluating Scatter
The scatter that remains in the data after the application of the above corrections was
evaluated using the MAD (median of absolute deviations from the median) statistic.
This metric was calculated for each light curve by taking the median magnitude
value for the light curve, calculating the deviation of every magnitude value from this
median, and then taking the median of these deviations.
The MAD statistic is useful as a measure of dispersion which is less strongly
influenced by outliers than the variance or standard deviation. This is simply because
it involves taking the median of residuals rather than the mean, and as such remains
around the middle of residual values even when the residuals on one end are quite
extreme. This data displays a number of high outliers, even after that number is
greatly reduced by the outlier rejection procedures described in Section 2.2, so
the MAD is the best way to measure meaningful dispersion in the data.
$$\mathrm{MAD} = \mathrm{median}\left(\left|m - \mathrm{median}(m)\right|\right) \qquad (2.2)$$
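Equation (2.2) translates directly into a few lines of Python. The sketch below (standard library only, with made-up sample values) also contrasts the MAD with the standard deviation on a sample containing one extreme value:

```python
from statistics import median, pstdev


def mad(values):
    """Median of absolute deviations from the median, as in equation (2.2)."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 3.0, 4.0, 100.0]

print("MAD:", mad(clean), mad(with_outlier))
print("Standard deviation:", pstdev(clean), pstdev(with_outlier))
```

Replacing one value with an extreme outlier leaves the MAD unchanged in this example, while the standard deviation grows by more than an order of magnitude.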
Before calculating the scatter of light curves, they were in some cases resampled
to a uniform 512 second sampling rate. See Section 3.2 for plots of MAD with and
without this resampling step. There seems to be very little effect of this procedure
on the overall scatter.
The nonuniform sampling of the light curves comes from two sources. The correction
techniques described in Section 2.2 of course introduce gaps in the data through
outlier removal. Even the raw light curves exhibit some nonuniform sampling,
however. Actual exposure times are 32 seconds, and although the data is typically
rebinned to 512 seconds before transmission, this is not always the case. When there
is a possible detection of a transit, the raw data is reported with 32 second sampling [5].
The applied resampling process rebins oversampled regions to 512 seconds, accounting for both gaps and oversampled periods. The oversampled regions are rebinned by using the median flux value for these time blocks, this being resistant to
the influence of outliers. Any gaps in the data, those introduced by our corrections
or preexisting, are filled via linear interpolation.
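The resampling can be sketched in Python as below. Times are assumed to be in seconds and sorted, each bin is stamped with its centre time, and the function name is hypothetical; these are choices made for the illustration rather than details taken from the thesis code.

```python
from statistics import median


def resample(times, flux, step=512.0):
    """Median-bin onto a uniform grid, then fill empty bins by linear interpolation."""
    t0 = times[0]
    n_bins = int((times[-1] - t0) // step) + 1
    bins = [[] for _ in range(n_bins)]
    for t, f in zip(times, flux):
        bins[int((t - t0) // step)].append(f)
    grid_t = [t0 + (i + 0.5) * step for i in range(n_bins)]
    grid_f = [median(b) if b else None for b in bins]
    filled = list(grid_f)
    # The first and last bins always hold data, so every gap has two known neighbours.
    known = [i for i, v in enumerate(grid_f) if v is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            frac = (i - a) / (b - a)
            filled[i] = grid_f[a] + frac * (grid_f[b] - grid_f[a])
    return grid_t, filled


# Four 32 s samples in the first bin, one sample in the second, a gap, then one more.
times = [0.0, 32.0, 64.0, 96.0, 512.0, 2048.0]
flux = [10.0, 12.0, 50.0, 11.0, 20.0, 50.0]
grid_t, grid_f = resample(times, flux)
print(grid_f)
```

The oversampled first bin collapses to its median, so the single high value of 50 in that bin has no influence, and the two empty bins are filled along the straight line between their neighbours.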
The last step before calculating the MAD statistic is the conversion of flux values
from electron counts to magnitudes. This is done using the standard calculation of
apparent magnitude, using as a reference the accepted magnitude of the star given in
the FITS header and the median flux over the entire observation run.
$$m = -2.5 \log_{10}\left(\frac{f}{\mathrm{median}(f)}\right) + m_{\mathrm{reported}} \qquad (2.3)$$
The scatter among these magnitude values was then calculated using MAD as
described above.
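A short Python sketch of this conversion, taking the median flux of the run as the reference level and the header magnitude as the zero point, is shown here. The function name and the sample values are hypothetical, and all flux values are assumed to be positive.

```python
from math import log10
from statistics import median


def flux_to_magnitude(flux, m_reported):
    """Convert fluxes to magnitudes relative to the median flux of the run."""
    reference = median(flux)
    return [-2.5 * log10(f / reference) + m_reported for f in flux]


flux = [1000.0, 990.0, 1010.0, 1000.0, 100.0]
magnitudes = flux_to_magnitude(flux, m_reported=13.0)
print(magnitudes)
```

A point at the median flux maps to the reported magnitude, and fainter points map to larger magnitudes. For small fractional changes the magnitude change is about 1.086 times the fractional flux change, so a MAD quoted in magnitudes is numerically close to the fractional scatter in flux.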
Chapter 3
Results
3.1 CoRoT Light Curves and Corrections
The CoRoT data has a number of problems which can make exoplanet detection more
difficult, largely from instrumental sources. The most prominent of these effects,
immediately apparent in Figure A-1 or any plot of an unprocessed light curve, is the
very high number of high outliers. Other significant effects include an overall linear
trend upward or (more frequently) downward and a grouping of low outliers present
for only one segment of the light curve.
The first of these effects is principally attributable to cosmic ray hits on the
detector, the high number being due to the satellite's frequent crossing of the South
Atlantic Anomaly. As discussed in Chapter 1, the South Atlantic Anomaly is the
region where the Van Allen radiation belt comes closest to the surface of the Earth.
The satellite is effectively bombarded with a large amount of radiation every time
it passes through the region. When the on-board CCD is struck with this
radiation, the sensitivity of the affected pixels is temporarily altered, causing the
very high readings [9]. The frequent crossings of the South Atlantic Anomaly do
present a significant problem and make CoRoT's orbital path decidedly less than
ideal.
The linear trends observed in many light curves are most likely due to a drift in
the satellite's pointing accuracy. A slight drift off target can cause more or less light
to fall in the region of the CCD associated with the star. The light reported is only
that which falls into a designated region corresponding to the spread out spectrum
of the star from the prism in the detector. If the satellite's targeting drifts over
the course of its observation run, some of the starlight will fall outside this region,
causing the downward trend. Alternatively, light from a nearby star can contaminate
the region associated with the star of interest, causing a possible upward trend [5].
The low outliers may represent the results of the satellite's entry into or exit from the
shadow of the Earth, when the change in incident light can cause a loss of accuracy
in targeting [9].
The light curves can be significantly improved by applying the correction techniques described in Section 2. The corrections remove the very high or low outliers and
any linear trend in a light curve, all of these being effects that can be accounted for.
This allows an evaluation of the noise in the data after accounting for known sources
of unwanted signal. Cleaning up the data to this level allows a better estimation of
the limits that the data places on exoplanet detection.
The remaining scatter is likely due to unknown systematics or intrinsic stellar
variability. The low orbit of the satellite could be a source of noise beyond the
periodic South Atlantic Anomaly crossings. Understanding the impact of intrinsic
stellar variability would be particularly interesting, since it would present an obstacle
for any planet-finding mission, regardless of orbital characteristics or instrumentation.
Figures A-1 and A-2 show a typical CoRoT light curve both before and after the
application of the correction routines. The number and magnitude of outliers have
been greatly reduced, and the data no longer exhibit any linear trend.
3.2 Scatter
In order to determine the detectability of exoplanet transits with CoRoT, or from
data of a similar quality, it is essential to understand the level of scatter that remains
in the data after known systematics are removed. The scatter was evaluated via the
MAD statistic (median of absolute deviations from the median, see Section 2) for
each star. In addition to determining the overall level of scatter present, the scatters
of all stars in each data set (corresponding to an observing run) were plotted against
their R-magnitudes to show any correlation with magnitude, and the degree to which
the noise in individual light curves varies can be estimated from these plots.
The logarithm of the MAD metric is plotted in Figures A-3 to A-6 for most light curves
from the first long and short runs of observing, broken up into "chromatic" (bright
enough to separate the light into a spectrum) and "monochromatic" (dimmer) stars.
The MAD values fall in the range from $10^{-4}$ to $10^{-3}$ counts. Again, the MAD
statistic is calculated by finding the
median of the electron counts in the light curve, then calculating the absolute value
of the deviation from this median for each point, and finally taking the median of
these deviations.
The two observing runs for which scatter-magnitude plots were constructed were
the first long and short runs in the constellations Monoceros and Serpens Cauda
respectively. The long run plot exhibits an apparently stronger trend, as well as
somewhat closer grouping of the scatter plot, but both display approximately the
same general level of scatter for both point-to-point and two hour binned MAD.
Figures A-3 to A-6 show the logarithm of the MAD statistic for each star plotted
against its R magnitude. It should be noted that for the short run data sets, the
magnitudes were reported only to one decimal place, whereas for
the long run sets, the magnitudes were given to four decimal places. This creates a
significant visual difference in the plots. More significant is the previously mentioned
fact that while the short run scatters seem to exhibit a weak dependence on the
magnitude (with higher magnitude stars showing more scatter, as expected), the
correlation seems stronger for the long run.
The MAD statistics for this short run data set vary by a factor of about ten,
with a few stars exhibiting significantly more scatter. This suggests that the noise
in CoRoT light curves is somewhat inconsistent, with some light curves plagued by
much more noise. Some light curves were also rejected in the process of assembling
the scatter-magnitude plots if the data correction techniques removed a large majority
of the points in the curve, suggesting highly noisy data with a very weak signal. For
the short run, chromatic set, 2.2 percent of stars (28 of 1271) were rejected in this
fashion. For the monochromatic short run stars, only 0.07 percent (4 of 5706) were
rejected. For the long run 0.04 percent (3 of 7689) of the monochromatic stars and
2.5 percent (93 of 3719) of the chromatic stars were rejected.
The MAD could be evaluated in a few different ways for each data set, either by
measuring simple point-to-point scatter or by binning over longer timescales before
taking the MAD. In particular, the timescales of interest are those corresponding to
the duration of transits. It is the noise on these timescales that presents the greatest
obstacle to planet detection. Appendix A shows several plots of MAD after two-hour
binning versus R magnitude for the "chromatic" stars in the first short run of
observation.
Two hours was chosen based on the probable transit duration of the longest period
exoplanets CoRoT would be likely to detect, those with orbits of a few weeks. This
limit on the orbital period is imposed by the length of time for which CoRoT observes
the star, which is only a few months. (It would be highly unlikely to catch a planet
with a much longer orbital period in a transit during the relatively short observing
run.) There was little discernible difference between the level of scatter point to point
and after two-hour binning; however, all plots in Figures A-3 to A-6 are binned on two
hours.
3.3 Calculating Transit Properties
The detection of an exoplanet transit involves measuring small changes in the incident
flux from a star, and as such is limited by the noise in the flux data. The scatter
in light curves such as the ones obtained by CoRoT has the effect of limiting the
detectability of planets. Ultimately, it is the size of planets which can be detected
that is limited by the noise, but what is directly observed is the depth of the transit,
which is directly related to the planet radius. It is the transit depth, therefore, that
should be considered when determining what the limits on CoRoT's abilities to detect
planets are.
At the most basic level, determining the depth of a transit is straightforward: the
depth of the transit as a fraction of the incident flux from the star is simply equal to
the ratio of the projected areas of the planet and star, and therefore to the
ratio of the squares of their radii.
$$\text{Transit depth} = \left(\frac{R_{\text{planet}}}{R_{\text{star}}}\right)^{2} \qquad (3.1)$$
For this analysis, subtler effects such as limb darkening (the fact that the centers
of the observed disks of stars are more luminous than their outer edges, leading to
a rounded transit) can be ignored. What is critical is simply the total depth of the
transit (E. Adams, private communication, 2009).
First, say that we wish to detect a planet of radius equal to that of the Earth,
with a radius of around 0.01 solar radii, orbiting a star similar to our own, with a
radius equal to that of the sun. The transit depth would be correspondingly just 0.01
percent.
More generally, we can say that planets will likely be detected around stars with
a radius between 0.5 and 2 solar radii, and we can look for planets of 5 Earth radii.
These numbers would give transit depths in the range of 0.06 to 1 percent.
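The arithmetic behind these depths can be reproduced with a short sketch. It assumes, as the text does, an Earth radius of about 0.01 solar radii; the variable names and the anonymous function `depth` are illustrative only.

```matlab
% Transit depth as the square of the planet/star radius ratio.
depth = @(r_planet, r_star) (r_planet ./ r_star).^2;

r_earth = 0.01;                      % Earth radius in solar radii (value used in the text)

% Earth-sized planet transiting a Sun-like star.
depth_earth_sun = depth(r_earth, 1.0);

% Planet of 5 Earth radii around stars of 0.5, 1, and 2 solar radii.
r_star = [0.5 1 2];
depths = depth(5 * r_earth, r_star);

fprintf('Earth/Sun depth: %.4f percent\n', 100 * depth_earth_sun);
fprintf('5 Earth radii, star of %.1f solar radii: %.4f percent\n', ...
        [r_star; 100 * depths]);
```

The smallest star gives the deepest transit (1 percent) and the largest star the shallowest (about 0.06 percent), which brackets the range quoted above.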
3.4
Limits on Detection
Using the MAD statistic as a measure of the noise in the CoRoT light curves, we
can put limits on the depth of a transit which would be detectable using CoRoT
or another instrument providing data of similar quality. The median of all MAD
statistics calculated for all stars is 0.0016, or 0.16 percent.
Figure A-7 shows a plot of the depth of a transit as a function of the ratio of the radii of the
planet and its parent star. The depth of an exoplanet transit is simply given by the
square of this ratio. If we then require a signal-to-noise ratio of 10, then based on the
noise found to be present in the CoRoT data, only exoplanets with a transit depth
greater than 1.6 percent could be definitively detected for most stars. However, for
the brightest stars observed by CoRoT, the noise is significantly reduced, yielding a
MAD statistic closer to 0.03 percent, which would allow the detection of significantly
smaller signals, in the range of 0.3 percent.
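As a worked check of these thresholds, the following uses the simple criterion adopted in this section, namely that a transit is detectable when its depth exceeds the required signal-to-noise ratio times the MAD scatter. The symbols $\delta_{\min}$ and $\sigma_{\mathrm{MAD}}$ are introduced here only for illustration.

```latex
\begin{align*}
\delta_{\min} &= (\mathrm{S/N}) \times \sigma_{\mathrm{MAD}} \\
\text{median star:}\quad \delta_{\min} &= 10 \times 0.0016 = 0.016,
  & \frac{R_{\text{planet}}}{R_{\text{star}}} &\ge \sqrt{0.016} \approx 0.13 \\
\text{brightest stars:}\quad \delta_{\min} &= 10 \times 0.0003 = 0.003,
  & \frac{R_{\text{planet}}}{R_{\text{star}}} &\ge \sqrt{0.003} \approx 0.055
\end{align*}
```

For a star of one solar radius, a radius ratio of 0.055 corresponds to a planet of roughly five to six Earth radii, taking the Earth's radius as about 0.01 solar radii as in Section 3.3.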
This means that while truly Earth-like planets are not quite within CoRoT's reach,
planets that are slightly larger, but still relatively small and likely rocky, are certainly
detectable, as evidenced by the much-lauded discovery of CoRoT-7 b.
The case of CoRoT-7 b in fact pushes the limits of detectability. CoRoT-7 is
a relatively bright star, with an apparent magnitude of 11.7. This means that the
scatter in the data is at the 0.02 percent level. CoRoT-7 b has a radius of 0.15 Jupiter
radii and orbits a star of 0.87 solar radii, so the transit depth is 0.03 percent, just
above the level of the noise.
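The CoRoT-7 b figures can be checked with a few lines. This is a sketch under one stated assumption that is not in the text: Jupiter's radius is taken as approximately 0.10 solar radii. The variable names are illustrative.

```matlab
% Worked check of the CoRoT-7 b transit depth against the scatter.
r_jupiter = 0.10;               % ASSUMPTION: Jupiter radius in solar radii (approximate)
r_planet  = 0.15 * r_jupiter;   % planet radius in solar radii (0.15 Jupiter radii)
r_star    = 0.87;               % stellar radius in solar radii
scatter   = 0.0002;             % MAD scatter at the 0.02 percent level

depth = (r_planet / r_star)^2;  % Equation 3.1
ratio = depth / scatter;        % transit depth relative to the scatter

fprintf('transit depth: %.3f percent\n', 100 * depth);
fprintf('depth / scatter: %.2f\n', ratio);
```

The depth comes out at about 0.03 percent, roughly one and a half times the scatter for a single measurement, which is why this case sits at the edge of detectability.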
Chapter 4
Relation to Other Missions
Significantly, a great deal of the noise present in the CoRoT data has a known origin,
and a satellite with different orbital characteristics could avoid many of the problems
that face CoRoT. NASA's Kepler mission, launched in 2009, provides a strong contrast with CoRoT as a model for an extrasolar planet-finding space mission in this
and other respects. The first advantage Kepler has over CoRoT is its much longer
period of observation. CoRoT has discovered a number of short-period exoplanets,
but even its long observing runs are only 150 days. To have a high probability of
detecting a planet like Earth, however, whose period would be on the order of one year,
a much longer observing session is necessary. Kepler's mission involves observing
its selected field of view for a full three and a half years.
There is something of a trade-off inherent in this longer mission, however. CoRoT,
with its multiple shorter runs, moves to different parts of the sky (albeit parts that
are very nearby) over the life of its mission. Kepler, at least as its mission is currently
defined, will observe only one patch of the sky for its entire lifetime as a functional
spacecraft. This is made up for somewhat by the fact that Kepler has a much larger
field of view than CoRoT, but while this field of view will encompass around 100,000
stars bright enough to study [4], CoRoT will be able to observe around twice that
number [7].
The full field of view covered by Kepler's CCD array is 105 square degrees,
situated in the constellations Cygnus and Lyra; the mission is "specifically
designed to survey our region of the Milky Way galaxy" [3]. This field of view also
meets the requirement of being out of the ecliptic plane, meaning the stars under
observation will not be periodically blocked by the sun. There are around 100,000
stars bright enough to study in Kepler's field of view. For the Kepler mission, bright
enough means stars of magnitude 14 in the visual wavelength band.
Kepler's instrumentation also differs from that of CoRoT. The spacecraft consists
of a telescope with a spherical primary mirror 1.4 meters in diameter and a Schmidt
correcting plate to correct for spherical aberration. This spherical mirror allows for
Kepler's wider field of view, observed by its photometer, an array of 42 CCDs (charge
coupled devices). Each CCD on Kepler is 50 by 25 millimeters, or 2200 by 1024 pixels.
Onboard exposure times are three seconds, a period short enough to avoid saturation
of the CCDs with the starlight slightly defocused (spread out to 10 arcseconds) [4].
Kepler's instruments are housed within a spacecraft weighing just over 1,000 kilograms,
providing an aperture of 0.95 meters. Kepler is equipped with a solar array,
a high-gain antenna for data transmission, thrusters, and a radiator for cooling the
CCDs (thereby reducing thermal noise from the random motion of electrons). As it
watches these stars for years, Kepler will also rotate 90 degrees about its line of sight
every three months to keep its solar arrays in the sunlight and its CCD radiator
pointed into deep space [4].
Perhaps the most significant advantage Kepler has over CoRoT is its orbit. Unlike
CoRoT, in its circumpolar orbit, Kepler is not forced to contend with the radiation
of the Van Allen belts and periodic entry and exit from the Earth's shadow. This is
because Kepler is far more distant. Kepler does not orbit the Earth, but occupies an
Earth-trailing heliocentric orbit. This means that the spacecraft is following the Earth
at a distance as it orbits the sun, but with a slightly larger semimajor axis, giving the
craft a slightly longer orbital period of 372.5 days. Due to this longer period, Kepler
is slowly drifting farther and farther away from Earth. Another advantage of this
orbit is that, at a greater distance, the craft is subject to fewer
torques due to gravitational gradients than act on CoRoT or other satellites deeper in
the Earth's gravitational field, allowing for better maintenance of pointing accuracy.
The satellite will still have to contend with the radiation associated with solar flares.
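A rough check of how much larger the semimajor axis is follows from Kepler's third law. The calculation assumes a near-circular orbit and takes Earth's orbit as 1 AU with a period of 365.25 days; these reference values are assumptions made here, not figures from the mission documentation.

```latex
\[
a = \left(\frac{P}{P_{\oplus}}\right)^{2/3}\,\mathrm{AU}
  = \left(\frac{372.5}{365.25}\right)^{2/3}\,\mathrm{AU}
  \approx 1.013\,\mathrm{AU}
\]
```

On these assumptions the orbit is only about 1.3 percent wider than Earth's, consistent with the description of a slightly larger semimajor axis.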
All of this means that Kepler is positioned to have far more success at discovering
terrestrial, and perhaps habitable, worlds than CoRoT has had. Although all planets
discovered by Kepler so far are of the easy-to-detect hot Jupiter type, this is expected,
since it has been in operation for just a matter of months. As the mission continues,
we can expect to see more long-period planets transiting.
Understanding the capabilities and limitations of a satellite like CoRoT is still very
useful, however. To truly push our knowledge of extrasolar planets forward,
there will be a need for many more missions to search for these distant worlds, and
not all of them can be on the scale of Kepler. Understanding the sources of noise in
data from CoRoT and the limitations this puts on the detection of planets can inform
the design of future missions capable of searching new parts of the sky as yet unsearched by
CoRoT or Kepler.
Ultimately, these missions can make progress on answering some of the most fundamental questions that have inspired us to study the cosmos for years. Could there
be life elsewhere in the universe? And could we humans find a hospitable home other
than Earth elsewhere in the universe? The first step in answering these questions
is determining whether there are other planets with Earth's unique combination of
attributes making it so perfectly suited to the development of life. Three years from
now, Kepler may have discovered dozens of planets similar in size and orbital characteristics to our own. Farther down the road, many more missions could be pushing
our knowledge of such planets ever farther. And despite the wealth of questions to
be answered in our own solar system, knowing that these distant Earth-like planets
are out there, awaiting further study, will surely be the strongest inspiration and
incentive for the further study and exploration of space that could be hoped for.
Appendix A
Figures
Figure A-1: This light curve is typical of CoRoT data before the application of any
corrections. There are a very large number of very high outliers, a pronounced linear
trend (upward, in this case, indicating pointing drift allowed more light into the area
designated for the star) and low outliers over one portion of the observing run.
Figure A-2: After the application of the correction techniques described in Section 2,
the CoRoT light curves exhibit significantly reduced outliers and have had any linear
trend removed. The low outliers associated with entry or exit from the Earth's
shadow (and loss of accuracy) are still apparent toward the end of the light curve.
[Plot: MAD vs. R Magnitude for Short Run, Chromatic Stars (Resampled to 512 s). Horizontal axis: R Magnitude, 11 to 16. Vertical axis: MAD, logarithmic scale.]
Figure A-3: Scatter is plotted against R magnitude here for stars observed during
the first short run designated as Chromatic, meaning that they are bright enough to
have their flux reported in three separate color channels: red, green, and blue. The R
magnitude is reported to only one decimal place for stars in the short run. Scatter for
this data set ranges from approximately 0.0003 to 0.002, with a positive correlation
between magnitude and scatter. Scatter here is the median of absolute deviations
from the median.
[Plot: MAD vs. R Magnitude for Short Run, Monochromatic Stars (Resampled to 512 s). Horizontal axis: R Magnitude, 11 to 16. Vertical axis: MAD, logarithmic scale.]
Figure A-4: Plotted here is the scatter (MAD) versus R magnitude for the Monochromatic
stars from the short run, those stars too dim to have their flux reported in
separate color channels. The MAD values are in the same range as for the chromatic set,
and higher for the dimmest stars.
[Plot: MAD vs. R Magnitude for Long Run, Monochromatic Stars. Horizontal axis: R Magnitude, 11 to 16. Vertical axis: MAD, logarithmic scale.]
Figure A-5: Equivalent scatter-magnitude plots were made for the first long (150 day)
run data sets. The magnitude of these stars is reported with greater precision, with
four decimal places in the FITS header, and the correlation between magnitude and
scatter is apparent.
[Plot: MAD vs. R Magnitude for Long Run, Chromatic Stars (No Resampling). Horizontal axis: R Magnitude, 11 to 16. Vertical axis: MAD, logarithmic scale.]
Figure A-6: Scatter (MAD) is plotted against R magnitude for Chromatic stars observed
during the first long observing run. The plot shows a clear correlation between
magnitude and scatter for these stars.
[Plot: Transit Depth vs. Planet/Star Radius Ratio. Horizontal axis: Planet Radius / Star Radius. Marked levels: Jupiter/Sun Transit, Average CoRoT Star Limit.]
Figure A-7: The blue curve above represents the relationship between the planet/star
radius ratio and transit depth. Also plotted here and below are the transit depths
of various planet/star pairs and the limits on transit depth imposed by CoRoT-level
noise.
[Plot: Transit Depth vs. Planet/Star Radius Ratio. Horizontal axis: Planet Radius / Star Radius, 0.01 to 0.04. Marked levels: CoRoT-7 b Transit, CoRoT Bright Star Limit, Earth/Sun Transit.]
Figure A-8: More transit depths are shown here, for the Earth and sun and CoRoT-7
b, both of which are on the same order of magnitude as the lowest threshold of
detectability imposed by the scatter in CoRoT light curves.
[Plot: Detection Limits. Horizontal axis: R Magnitude. Marked levels: Jupiter/Sun Transit, CoRoT-7 b Transit, Earth/Sun Transit.]
Figure A-9: Here, a quadratic fit to the scatter/magnitude plot from the long run
monochromatic stars is shown along with transit depths for various planet/star pairs,
showing which types of planets could be detectable at which magnitudes.
Appendix B
MATLAB Code
B.1
Read Light Curve
function output=lightcurve(file,option,NStED_compatibility,color)
%LAST UPDATED 6/22/2009 SR
%This modifies lightcurve.m to work with chromatic and
%monochromatic data from CoRoT's N2 Public archive.
%output=lightcurve(file,option,NStED_compatibility,color)
%file is a filename (string).
%if option='plot', then it makes a plot. If option='errorplot'
%it makes a plot with errorbars. If
%option='matrix', then it returns a 3-column matrix with
%helio date, white flux, and white-flux error.
%if NStED_compatibility=1 then the flux will
%be scaled by 1/10000 and the date shifted by
%-2000, to make it compatible with the plots on
%the NStED website. If NStED_compatibility ~=1,
%then it won't be.
%If color = 'red' plots red flux. If color equals 'green'
%it plots green flux. If color equals 'blue' it plots blue
%flux. If using monochromatic data, enter 'white' for
%color. If want to convert from
%polychromatic, enter 'combine'
%The purpose of this function is to generate a
%lightcurve from the CoRoT exoplanet data.
%It has only been tested for this dataset.
data=fitsread(file, 'bintable'); %read in binary table of the fits file
datehel=data{1,3}; %heliocentric julian date
%this routine selects out the desired color
%channel.
if strcmp(color, 'white')
    flux=data{1,5}; %flux is measured flux in the given channel in electrons
    flux_err=data{1,6}; %flux_err is error in the measured flux
elseif strcmp(color, 'red')
    flux=data{1,5};
    flux_err=data{1,6};
elseif strcmp(color, 'green')
    flux=data{1,7};
    flux_err=data{1,8};
elseif strcmp(color, 'blue')
    flux=data{1,9};
    flux_err=data{1,10};
elseif strcmp(color, 'combine')
    flux=data{1,5}+data{1,7}+data{1,9};
    flux_err=sqrt(data{1,6}.^2+data{1,8}.^2+data{1,10}.^2);
else
    error('Invalid value for "color"')
end
%The lightcurves generated by the NStED database
%are unusual in that they displace the date by
%-2000 and scale the e- counts by 10^-4. We don't
%work with these since they're not physical, but
%for purposes of testing our initial code to make
%sure it could recover the NStED lightcurves, we
%included it.
if 1==NStED_compatibility
    flux=flux/10000; %scale flux by 10^-4
    flux_err=flux_err/10000; %ditto for error in flux
    datehel=datehel-2000; %shift date
end
if strcmp(option, 'plot') %do you want to create a plot?
    results=[datehel,flux,flux_err];
    [a,aerr,chisq,yfit]=fitlin(results(:,1),results(:,2),ones(size(results(:,1))));
    %do a chi-squared minimization fit
    %to obtain a linear trend to subtract off the data (incorporates errors)
    slope=a(2);
    yint=a(1);
    %x=datehel;
    %y=slope*datehel+yint;
    figure %open a new figure window
    plot(datehel, flux, 'k.')%,x,y)
    xlim([min(datehel) max(datehel)]) %sets limits on the x-axis
    ylim([min(flux) max(flux)]) %sets limits on the y-axis
    xlabel('Heliocentric Julian Date')
    ylabel('Flux')
    if strcmp(color, 'red')
        title('Red Flux')
    elseif strcmp(color, 'blue')
        title('Blue Flux')
    elseif strcmp(color, 'green')
        title('Green Flux')
    elseif strcmp(color, 'white')
        title('White Flux')
    elseif strcmp(color, 'combine')
        title('Combined White Flux')
    end
elseif strcmp(option, 'errorplot') %creates a plot with errorbars.
    figure %open a new figure window
    errorbar(datehel, flux, flux_err, 'k.')
    xlim([min(datehel) max(datehel)])
    ylim([min(flux) max(flux)])
    xlabel('Heliocentric Julian Date')
    ylabel('Flux')
    if strcmp(color, 'red')
        title('Red Flux')
    elseif strcmp(color, 'blue')
        title('Blue Flux')
    elseif strcmp(color, 'green')
        title('Green Flux')
    elseif strcmp(color, 'white')
        title('White Flux')
    elseif strcmp(color, 'combine')
        title('Combined White Flux')
    end
elseif strcmp(option, 'matrix') %return the read-in lightcurve.
    matrix(:,1)=datehel;
    matrix(:,2)=flux;
    matrix(:,3)=flux_err;
    output=matrix;
else
    error('Invalid value for "option"')
end
B.2
Apply Corrections
function output=data_correction(file,option,color)
%data_correction(file,option,color,numiter)
%file is the filename of the target
%option='plot' yields a plot. option='matrix'
%yields a 2-column matrix, col 1 is xdata, col 2
%is ydata
%color is 'white', 'red', 'green', 'blue', 'combine'
%Data corrections so far are boxcar filter and
%3-sigma cut, iterated.
%numiter is number of iterations
%USER-DEFINED CONSTANTS
outliersamplewidth=1000; %width of sample across which to compute cutoff
boxcarwidth=5; %width of boxcar smoothing filter
%numiter=3; %number of boxcar/outlier rejection iterations
%READ IN DATA
data=lightcurveN2(file, 'matrix', 0, color); %date, flux, flux error;
%REJECT POINTS WITH NaN, 0 ERROR
count=0;
for ind=1:size(data,1)
    if or(data(ind,3)==0, sum(isnan(data(ind,:)))>0)
        count=count+1;
        rejectlist(count)=ind;
    end
end
if exist('rejectlist', 'var')
    data(rejectlist,:)=[];
end
%MAIN CODE BEGINS -- SMOOTHING AND OUTLIER REJECTION
iter=0;
initnumbpoints=size(data,1);
while 1==1
    iter=iter+1;
    %disp(iter)
    count=0; %this variable tracks how far we are in the keeplist
    if iter>20
        break
    end
    %SMOOTHING
    %boxcar mean smoothing of data.
    clear thunk_data %temporary variable to store the smoothed data.
    dist_boxcar=floor(boxcarwidth/2); %width searched by code
    thunk_data=zeros(size(data)); %initialize thunk_data
    numpoints=size(data,1); %number of data points
    sigma=zeros(numpoints,1);
    for ind=1:(dist_boxcar)
        thunk_data(ind,:)=mean(data(1:(ind+dist_boxcar),:),1);
    end
    for ind=(dist_boxcar+1):(numpoints-dist_boxcar)
        thunk_data(ind,:)=mean(data((ind-dist_boxcar):(ind+dist_boxcar),:),1);
    end
    for ind=(numpoints-dist_boxcar+1):numpoints
        thunk_data(ind,:)=mean(data((ind-dist_boxcar):end,:),1);
    end
    %boxcar median smoothing of data.
    dist_boxcar=floor(boxcarwidth/2); %width searched by code
    thunk_data2=zeros(size(thunk_data)); %initialize thunk_data2
    for ind=1:(dist_boxcar)
        thunk_data2(ind,:)=median(thunk_data(1:(ind+dist_boxcar),:),1);
    end
    for ind=(dist_boxcar+1):(numpoints-dist_boxcar)
        thunk_data2(ind,:)=median(thunk_data((ind-dist_boxcar):(ind+dist_boxcar),:),1);
    end
    for ind=(numpoints-dist_boxcar+1):numpoints
        thunk_data2(ind,:)=median(thunk_data((ind-dist_boxcar):end,:),1);
    end
    thunk_data=thunk_data2;
    %------------------------------------------------
    %OUTLIER REJECTION
    flux=thunk_data(:,2); %flux (e-)
    %flux_errs=thunk_data(:,3); %error in flux
    dist_outlier=floor(outliersamplewidth/2); %width searched by code
    for ind=1:numpoints
        if numpoints<=dist_outlier
            sample=flux;
        elseif numpoints<=2*dist_outlier
            if ind+dist_outlier>numpoints
                sample=flux((end-dist_outlier):end);
            else
                sample=flux(ind:(ind+dist_outlier));
            end
        elseif ind<=dist_outlier
            sample=flux(1:(ind+dist_outlier));
            %sample_errs=flux_errs(1:(ind+dist_outlier));
        elseif (ind+dist_outlier)>numpoints
            sample=flux((ind-dist_outlier):end);
            %sample_errs=flux_errs((ind-dist_outlier):end);
        else
            sample=flux((ind-dist_outlier):(ind+dist_outlier));
            %sample_errs=flux_errs((ind-dist_outlier):(ind+dist_outlier));
        end
        median_val=median(sample); %find the median of the sample
        residuals=sample-median_val; %use the median to get the residuals
        sigma(ind)=sqrt(1/outliersamplewidth*sum(residuals.^2));
        if abs(flux(ind)-median_val)<3*sigma(ind) %only keep data w/in 3-sigma
            count=count+1;
            %results(count,:)=data(ind,:);%pruning step.
            keeplist(count)=ind;
        end
    end
    if exist('keeplist', 'var')
        data=data(keeplist,:);
    end
    sigmalist(iter)=mean(sigma);
    clear keeplist
    if iter>1 %unfortunately cannot initialize this.
        if (abs(sigmalist(iter)-sigmalist(iter-1))/sigmalist(iter)<.01)
            break
        end
    end
end
disp(iter)
flag_val=0; %originally, all is well
results=data;
if size(data,1)<=30 %data rejected if flag column triggers either conditions
    disp('Fewer than 30 points')
    flag_val=1;
elseif size(data,1)<=0.5*initnumbpoints
    disp('Fewer than half of original points remain')
    flag_val=1;
elseif iter > 20
    disp('Over 20 iterations -- convergence likely weak')
    flag_val=1;
else
    %------------------------------------------------
    %LINEAR DETRENDING
    [a,aerr,chisq,yfit]=fitlin(results(:,1),results(:,2),ones(size(results(:,1))));
    %to obtain a linear trend to subtract off the data (incorporates errors)
    slope=a(2);
    %yint=a(1); %this quantity is not used.
    results(:,2)=results(:,2)-slope*results(:,1); %subtract off linear trend
    %------------------------------------------------
    %MEDIAN SUBTRACTION
    binwidth=338; %338 points corresponds to two days with 512 second sampling.
    subtraction=zeros(size(results,1),1); %This initializes a matrix
    numpoints=size(results,1);
    clear subtraction
    dist_window=floor(binwidth/2); %width searched by code
    if numpoints>=340
        for ind=1:numpoints
            if ind<=dist_window %is point too far to left?
                subtraction(ind,:)=median(results(1:(ind+dist_window),2),1);
            elseif (ind+dist_window)>=numpoints %is point too far to the right?
                subtraction(ind,:)=median(results((ind-dist_window):end,2),1);
            else
                subtraction(ind,:)=median(results((ind-dist_window):(ind+dist_window),2),1);
            end
        end
        results(:,2)=results(:,2)-subtraction; %Remove median from surrounding span
        results(:,2)=results(:,2)+median(data(:,2));
    else
        flag_val=1;
        disp('Insufficient number of points for median subtraction.')
    end
end
flags=flag_val*ones(size(results,1),1);
results(:,size(results,2)+1)=squeeze(flags);
%RETURNING RESULTS
if strcmp(option, 'plot') %plot lightcurve
    figure %open a new figure window
    plot(results(:,1), results(:,2), 'k.')
    xlim([min(results(:,1)) max(results(:,1))])
    xlabel('Phase')
    ylabel('Power (e^{-})')
elseif strcmp(option, 'matrix') %return lightcurve
    output=results;
else
    error('Incorrect value for option.')
end
B.3
Save Corrections
function save_dispersion(directory, name)
%This function applies data_correction and
%calculates the median of absolute deviations from the median
%for all files in the directory input as 'directory'.
%directory=directory containing files of interest
%numiter=number of iterations of data_correction
%name=name of file in which to save data for plotting
filestructure=dir(directory);
numfilestructure=size(filestructure,1);
ticker=0;
for ind=1:numfilestructure
    element=filestructure(ind,1);
    if (element.isdir==0)
        ticker=ticker+1;
        filename=element.name;
        filenames{ticker,1}=filename;
    end
end
numfile=size(filenames,1);
disp(['Files to Process: ', num2str(numfile)]) %Print number of files
cd(directory) %Change to directory containing files
addpath('../')
Sca=zeros(numfile, 1);
Mag=zeros(numfile, 1);
tic; %timer initialized
count=0;
for ind=1:numfile
    disp(ind)
    file=filenames{ind,1};
    if strfind(file, 'CHR')
        data=data_correction(file, 'matrix', 'combine');
    elseif strfind(file, 'MON')
        data=data_correction(file, 'matrix', 'white');
    else
        disp('Warning: file found with neither MON nor CHR')
        continue %nothing to process for this file
    end
    flag=data(:,4);
    if median(flag)==1
        continue
    else
        file((end-4):end)=[]; %strip the '.fits' extension
        filenametosave=strcat('./processed_data/', file);
        save(filenametosave, 'data')
        time=fix(clock);
        save('./processed_data/timecomplete/TIMELASTFILECOMPLETED', 'time')
    end
end
cd('..')
toc %print total time
B.4
Save MAD, Magnitude
function saveMedAbsDev(directory, name, option)
%This function applies data_correction and
%calculates the median of absolute deviations from the median
%for all files in the directory input as 'directory'.
%directory=directory containing files of interest
%name=name of file in which to save data for plotting
%option=entering 'resample' for option will resample the data to 512
%seconds before calculating the MAD
drctry=strcat(directory, '/processed_data');
filestructure=dir(drctry);
numfilestructure=size(filestructure,1);
ticker=0;
for ind=1:numfilestructure
    element=filestructure(ind,1);
    if (element.isdir==0)
        ticker=ticker+1;
        filename=element.name;
        filenames{ticker,1}=filename;
    end
end
numfile=size(filenames,1);
disp(['Files to Process: ', num2str(numfile)]) %Print number of files
cd(directory) %Change to directory containing files
addpath('../')
Sca=zeros(numfile, 1);
Mag=zeros(numfile, 1);
tic; %timer initialized
count=0;
for ind=1:numfile
    disp(ind)
    file=filenames{ind,1};
    count=count+1;
    file((end-3):end)=[]; %strip the '.mat' extension
    file=strcat(file, '.fits');
    info=fitsinfo(file); %Obtain R-magnitude from FITS header.
    magnitude=info.PrimaryData.Keywords;
    Mag(count,1)=magnitude{34,2}; %matrix of magnitudes
    file((end-4):end)=[]; %strip the '.fits' extension
    filenametoload=strcat('./processed_data/', file);
    load(filenametoload)
    data(:,4)=[];
    medianflux=median(data(:,2)); %Median flux value for the lightcurve
    %------------------------------------------------
    %MAKE SURE UNIFORM TIME SAMPLE - ALL SAMPLED AT 512 SEC. IF 32 SEC,
    %take median to resample to 512. If time between points is greater than
    %512, return linear interpolation (LRM)
    timeinterval=2; %interval of sampling in hours
    if strcmp(option, 'resample')
        datehel=data(:,1);
        datehel=(fix(data(:,1)*1e5))/1e5; %roundoff date
        %initialize loop specific variables
        flag=0;
        skip_count=0;
        time=0;
        time1=0;
        xi=0;
        yi=0;
        zi=0;
        x=0;
        Y=0;
        Z=0;
        s_count=0;
        uniform_sample=zeros(size(data,1)-1,size(data,2)); %preallocate
        for time=1:size(data,1)-1
            if and(flag==1,skip_count<15)
                skip_count=skip_count+1;
                continue
            else
                flag=0;
                if (fix((datehel(time+1)-datehel(time))*1e5)/1e5)<.0059
                    if size(data,1)-time>15
                        s_count = s_count+1;
                        %median over the 16 points starting at this one
                        uniform_sample(s_count,:)=[data(time,1) ...
                            median(data(time:time+15,2)) ...
                            median(data(time:time+15,3))];
                        flag=1;
                        skip_count=0;
                    else
                        continue %throws out the last few points
                    end
                elseif (fix((datehel(time+1)-datehel(time))*1e5)/1e5)>.006
                    time_sample = .00592;
                    x = [datehel(time,1) datehel(time+1,1)];
                    Y = [data(time,2) data(time+1,2)];
                    Z = [data(time,3) data(time+1,3)];
                    xi = datehel(time,1):time_sample:datehel(time+1,1);
                    yi = interp1(x,Y,xi,'linear');
                    zi = interp1(x,Z,xi,'linear');
                    for time1=1:size(xi,2)-1
                        s_count = s_count+1;
                        uniform_sample(s_count,:)=[xi(time1) yi(time1) zi(time1)];
                    end
                else
                    %sampling time is 512 seconds
                    s_count = s_count+1;
                    uniform_sample(s_count,:)=data(time,:); %keeps original data
                end
            end
        end
        %REPLACES ORIGINAL DATA WITH REBINNED DATA SET (unused rows dropped)
        data=uniform_sample(1:s_count,:);
    end
    data(:,2)=Mag(count,1)-2.5*log10(((data(:,2))/2)/(0.5*medianflux));
    %Convert flux to magnitude
    times=data(:,1);
    times=(fix(times*1e5))/1e5;
    initialtime=times(1);
    finaltime=times(end);
    timevector=initialtime:(timeinterval/24):finaltime;
    madvals=zeros(length(timevector)-1,1); %reset for each file
    for dex=1:(length(timevector)-1)
        starttime=timevector(dex);
        endtime=timevector(dex+1);
        indices=find(and(times>starttime, times<endtime));
        datablock=data(indices,:);
        %flag 1 selects the median (rather than mean) absolute deviation
        madvals(dex,1)=mad(datablock(:,2),1);
    end
    Sca(ind,:)=1.48*median(madvals);
end
cd('./processed_data/scatter')
toc %print total time
save(name, 'Mag', 'Sca')
Bibliography
[1] Matlab. version 7.8.0, 2009.
[2] Transiting planets. http://exoplanet.eu/catalogtransit.php?munit=runit=punit=mode=1more=, May 2010.
[3] National Aeronautics and Space Administration. Kepler: Discoveries.
http://kepler.nasa.gov/Mission/discoveries/, 2010.
[4] National Aeronautics and Space Administration. Kepler: Photometer and spacecraft.
http://kepler.nasa.gov/Mission/MissionDesign/PhotometerAndSpacecraft/, 2010.
[5] Suzanne Aigrain. personal communication, 2009.
[6] Thomas Beatty. Design considerations for a space-based transit search for earth
analogs. Master's thesis, Massachusetts Institute of Technology, Department of
Earth, Atmospheric and Planetary Sciences, 2009.
[7] Centre National d'Études Spatiales. Observation strategy.
http://smsc.cnes.fr/COROT/GPmission.htm, 2006.
[8] Auvergne M. et al. The CoRoT satellite in flight: description and performance.
Astronomy and Astrophysics, January 2009.
[9] Suzanne Aigrain et al. Noise properties of the CoRoT data. Astronomy and Astrophysics, March 2009.
[10] Josh Winn. Transits and occultations. http://arxiv.org/abs/1001.2010v1, January 2010.