An Analysis of Noise in the CoRoT Data
by Aaron Sampson

Submitted to the Department of Physics in partial fulfillment of the requirements for the degree of Bachelor of Science in Physics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, June 2010

© Massachusetts Institute of Technology 2010. All rights reserved.

Author: Department of Physics, May 17, 2010
Certified by: Sara Seager, Associate Professor of Physics, Thesis Supervisor
Accepted by: Professor David E. Pritchard, Senior Thesis Coordinator, Department of Physics

Abstract

In this thesis, publicly available data from the French/ESA satellite mission CoRoT, designed to seek out extrasolar planets, was analyzed using MATLAB. CoRoT attempts to observe the transits of these planets across their parent stars. CoRoT occupies an orbit which periodically carries it through the Van Allen Belts, resulting in a very large number of high outliers in the flux data. Known systematics and outliers were removed from the data and the remaining scatter was evaluated using the median of absolute deviations from the median (MAD), a measure of scatter which is robust to outliers. The level of scatter (evaluated with MAD) present in this data is indicative of the lower limits on the size of planets detectable by CoRoT or a similar satellite. The MAD for CoRoT stars is correlated with the magnitude. The brightest stars observed by CoRoT display scatter of approximately 0.02 percent, while the median value for all stars is 0.16 percent.
Thesis Supervisor: Sara Seager
Title: Associate Professor of Physics

Acknowledgments

I would like to thank everyone who helped me with this work, in particular Sukrit Ranjan and Lisa Messeri, with whom I worked on the early stages of the project. Thank you also to Dr. Suzanne Aigrain, who was extremely helpful in explaining the CoRoT mission, how the data is reported, and her techniques for analyzing the data. Thank you also to Elizabeth Adams for her help with understanding and calculating transit properties. Above all, I would like to thank Professor Sara Seager for her help and advice throughout the project. From the early stages of the analysis to the writing of this thesis, her insight has been critical to its completion.

Contents

1 Introduction
2 Noise Reduction Methods
2.1 CoRoT Data
2.2 Corrections
2.3 Identifying and removing Systematics
2.4 Evaluating Scatter
3 Results
3.1 CoRoT Light Curves and Corrections
3.2 Scatter
3.3 Calculating Transit Properties
3.4 Limits on Detection
4 Relation to Other Missions
A Figures
B MATLAB Code
B.1 Read Light Curve
B.2 Apply Corrections
B.3 Save Corrections
B.4 Save MAD, Magnitude

List of Figures

A-1 This light curve is typical of CoRoT data before the application of any corrections.
There are a very large number of very high outliers, a pronounced linear trend (upward, in this case, indicating that pointing drift allowed more light into the area designated for the star), and low outliers over one portion of the observing run.

A-2 After the application of the correction techniques described in Section 2, the CoRoT light curves exhibit significantly reduced outliers and have had any linear trend removed. The low outliers associated with entry into or exit from the Earth's shadow (and loss of accuracy) are still apparent toward the end of the light curve.

A-3 Scatter is plotted against R magnitude here for stars observed during the first short run designated as Chromatic, meaning that they are bright enough to have their flux reported in three separate color channels: red, green, and blue. The R magnitude is reported to only one decimal place for stars in the short run. Scatter for this data set ranges from approximately 0.0003 to 0.002, with a positive correlation between magnitude and scatter. Scatter here is the median of absolute deviations from the median.

A-4 Plotted here is the scatter (MAD) versus R magnitude for the Monochromatic stars from the short run, those stars too dim to have their flux reported in separate color channels. The MAD values are in the same range as the chromatic set, and higher for the dimmest stars.

A-5 Equivalent scatter-magnitude plots were made for the first long (150 day) run data sets. The magnitude of these stars is reported with greater precision, with four decimal places in the FITS header, and the correlation between magnitude and scatter is apparent.

A-6 Scatter (MAD) is plotted against R magnitude for Chromatic stars observed during the first long observing run. The plot shows a clear correlation between magnitude and scatter for these stars.
A-7 The blue curve above represents the relationship between the planet/star radius ratio and transit depth. Also plotted here and below are the transit depths of various planet/star pairs and the limits on transit depth imposed by CoRoT-level noise.

A-8 More transit depths are shown here, for the Earth and sun and CoRoT-7 b, both of which are on the same order of magnitude as the lowest threshold of detectability imposed by the scatter in CoRoT light curves.

A-9 Here, a quadratic fit to the scatter/magnitude plot from the long run monochromatic stars is shown along with transit depths for various planet/star pairs, showing which types of planets could be detectable at which magnitudes.

Chapter 1 Introduction

In recent years, there has been an abundance of newly discovered extrasolar planets, planets which orbit stars other than our own. Detecting these planets beyond the boundaries of our solar system requires highly precise and careful measurements of the stars around which these previously unseen objects orbit. One of the best methods for detecting these planets is observing their transit across their parent stars. As the planets orbit their stars, they will sometimes pass directly between the star and observers on Earth, thereby blocking part of the light coming from that star. Although most known extrasolar planets have been discovered by other methods, transit observation is a method which holds great potential, especially for discovering small, Earth-like planets. Measuring transits for these planets of greatest interest is especially challenging, however, because the change in light from the star is very small. It is therefore of great interest to determine the noise limits on detecting such planets through this method.

Exoplanet transits can actually provide a great deal of information.
Precise measurements can reveal both primary eclipses (planet passing in front of star) and secondary eclipses (planet passing behind star), as well as the smaller differences in incident flux when the illuminated side of the planet is visible (before or after a secondary eclipse) versus when the dark side is visible (near the primary transit) [10].

The difficulty in discovering planets via observation of transits arises in large part from the fact that the star-planet system needs to be oriented quite precisely in order for the two distant bodies to align as viewed from Earth. In fact, for the most interesting extrasolar planets, Earth-sized bodies occupying a habitable zone around their parent stars, the probability of observing the system in proper alignment for a transit is around 0.5 percent [6].

CoRoT (COnvection ROtation and planetary Transits) is a satellite mission administered by the French and European Space Agencies (CNES and ESA). Launched via Soyuz rocket on December 27, 2006 from Baikonur Cosmodrome in Kazakhstan, the mission has two purposes. In addition to searching for extrasolar planets, CoRoT performs asteroseismology measurements, studying the pulsation of stars. Seeking out exoplanet transits and studying asteroseismology both require measuring (very carefully and precisely) the intensity of light received from stars over periods of time. The needs of these dual missions are, however, somewhat different. The added difficulty comes from the fact that in order to be sure a planet is observed while it is transiting, observations must be carried out throughout the planet's entire orbital period. For a planet like Earth, the orbital period would be on the order of one year, necessitating a similarly long session of observing.
So, while the detection of transits requires observing stars for as long as possible to maximize the probability of finding the star-planet system in the proper alignment, asteroseismology requires only relatively short periods of observation. For this reason, CoRoT alternates between long and short runs of data collection, a compromise between its dual missions. The short runs are 20 days in duration, and the long runs are 150 days. The publicly available data from CoRoT is therefore released by run. The satellite also alternates between pointing in two opposite directions, reversing direction every six months to avoid the sun, which would otherwise come too close to its field of view. These two ten-degree radius patches of sky are known as the two "eyes of CoRoT," and all observations will be made within these areas. Both lie in the galactic plane, one pointed toward the center of the galaxy and the other toward the outer edge [7].

The instruments on board CoRoT reflect the nature of its mission. CoRoT consists of an afocal telescope, housed in a baffle designed to block reflected sunlight from the Earth. The telescope is composed of two off-axis mirrors. Light reflected from the mirrors reaches the craft's wide-field camera, four CCDs (charge-coupled devices) in 2048 by 2048 pixel arrays, shielded from radiation by 10 mm of aluminum. Each pixel is a 13.5 by 13.5 micrometer square, which corresponds to an angular view of 2.32 arcseconds. In order to avoid saturation of the CCDs, the starlight is defocused, allowing for more precise photometry. In addition, the starlight is passed through a prism before reaching the CCD, spreading the light out into a spectrum [8]. All of this means that although CoRoT is really just a telescope in space, there will be no images from CoRoT of the type that the Hubble Space Telescope has been dazzling the public with for years.
As incredible as it would be to have an image of a terrestrial extrasolar planet, there would simply be no way to spatially resolve such a planet, and that is not CoRoT's mission.

To date, data from a total of seven data collection runs have been released. The initial run was in the direction of Monoceros, away from the galactic center, and was followed by the first short run in the direction of Serpens Cauda and the first long run in the direction of Aquila, both toward the galactic center. The data analyzed in this study come from these first long and short runs to be publicly released [7].

CoRoT occupies a circular polar orbit, carrying it over both of the Earth's poles. It observes in directions less than ten degrees from perpendicular to its orbital plane, which means that there are no occultations by the Earth of its targeted objects. This allows it to observe continuously throughout its months-long data collection runs. While in this respect the orbit is well suited to the mission, the satellite's orbital characteristics unfortunately introduce several problems which must be dealt with in processing the data. First, the satellite frequently crosses the region known as the South Atlantic Anomaly. This is the region where the Van Allen radiation belt comes closest to the surface of the Earth. The Van Allen belt is a toroidal region around the Earth where plasma is trapped by the Earth's magnetic field. The result of crossing through the South Atlantic Anomaly, then, is that the satellite is bombarded with charged particles. Additionally, the satellite must travel in and out of the shadow of the Earth. Data points affected by both of these sources of noise should be identified and flagged, making them easy to remove, but the flagging process is often imperfect, necessitating the corrections described below [9].

These problems have not, however, prevented CoRoT from making groundbreaking discoveries.
So far, the satellite has discovered nine confirmed extrasolar planets. Most of these planets are quite like the vast majority of known extrasolar planets: so-called "hot Jupiters," giant planets orbiting very close to their stars. Often these worlds are as large as or larger than Jupiter but orbit within the distance of Mercury's orbit. Planets of this type discovered by CoRoT range from 0.467 Jupiter masses to 3.31 Jupiter masses. The body designated CoRoT-3 b has a mass of 21.66 Jupiter masses, so large that it is considered a brown dwarf, and not a planet at all.

The most significant planet to be discovered so far, however, is by far the smallest. Designated CoRoT-7 b, this planet has a mass only 0.0151 times that of Jupiter, or 4.8 Earth masses. This, along with its measured radius of 1.7 Earth radii, gives it a density similar to Earth's, meaning that it is in fact a small, terrestrial planet. While its position very close to its star, with a semimajor axis of only 0.0172 astronomical units, makes it a very hot, inhospitable place not much like Earth, it is still an extremely exciting discovery that brings us one step closer to finding a close Earth analog [2].

Chapter 2 Noise Reduction Methods

2.1 CoRoT Data

CoRoT data is released in the standard astronomical Flexible Image Transport System (FITS) format. The science-quality data from CoRoT is designated N2, and is released to the public one year after it becomes available to CoRoT scientists. This analysis was performed using MATrix LABoratory (MATLAB) [1], which includes a built-in fitsread function that was used to read the CoRoT data files. Each file made available by CoRoT corresponds to photometry data on one star over the course of an entire observing run. The relevant data is contained in the binary table of the FITS files. From here, the date, measured flux, and error are read out using fitsread. The flux reported by CoRoT is in some cases broken down into red, green, and blue color channels.
(Only the brighter stars observed by CoRoT were treated in this way.) The nomenclature used for these two ways of treating data is somewhat nonstandard. Where the starlight was bright enough to report flux in multiple color channels, the stars are referred to as "chromatic." "Monochromatic" stars do not have just one color band reported, as one might expect, but the full white flux. The color bands reported by CoRoT for "chromatic" stars originate not from multiple filters, but from a prism on the satellite which spreads the starlight out on the CCD. The red, green, and blue channels were defined by taking the long, middle, and short wavelength parts of this spectrum, respectively, divided based on percentage of light rather than a fixed wavelength cutoff. For F, G, and K type stars, most of the flux is in the red part of the spectrum, and CoRoT's red channel is defined as the longest wavelength 40 percent of the light. Therefore, after the date, red flux, green flux, blue flux, white flux, and error are read in from the binary table, we choose the white flux (or equivalently, combine the three color channels) and output a matrix with the date, flux, and flux error. This matrix can be plotted to obtain a light curve like Figure A-1, or it can be saved as a matrix for further processing.

Additional information is included in the header of each FITS file, which can be accessed using another built-in MATLAB function, fitsinfo. This was the source for the magnitude information for all stars. R magnitude was used for comparison with scatter.

2.2 Corrections

After obtaining date, flux, and error data from the CoRoT FITS files, a number of corrections were applied to the data to remove noise from known sources. First, any light curves with clear irregularities (zero error or non-numerical flux values) were rejected entirely. The principal correction then applied to each light curve was the identification and removal of outliers.
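The channel combination of Section 2.1 and the initial rejection step of Section 2.2 might be sketched as follows. This is a Python illustration, not the MATLAB code of Appendix B; the function names are hypothetical, and treating a single zero error or non-finite flux value as grounds for rejecting the whole light curve is an assumption about what "clear irregularities" means.

```python
import math

def white_flux(red, green, blue):
    """Combine the three colour channels of a chromatic star into white flux."""
    return [r + g + b for r, g, b in zip(red, green, blue)]

def is_usable(flux, error):
    """Return False if any flux value is non-numerical (NaN or infinite)
    or any reported error is zero (assumed rejection criterion)."""
    fluxes_ok = all(math.isfinite(f) for f in flux)
    errors_ok = all(e != 0 for e in error)
    return fluxes_ok and errors_ok
```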
Light curves plotted from the raw data released by CoRoT exhibited a very large number of high outliers. Aigrain et al. attribute this abundance of high outliers principally to charged particles hitting the detector during the satellite's frequent crossing of the South Atlantic Anomaly [9] (see Section 3.1). The high outliers were removed after the application of a median filtering and smoothing process. The data was binned according to timescales equivalent to the orbital period of the satellite, 103 minutes.

The smoothing of the data consisted of applying two five-point boxcar filters, first a mean filter and then a median filter. First, the standard mean filter is applied. This means that for every point in the light curve, the mean of the surrounding five-point block (two points at earlier times, the point itself, and two points at later times) is calculated. A new list of data points is then created corresponding to the true data read in from the FITS file, but using these mean values for each block of data instead of the real data (that is, with the true dates and the five-point block means replacing the original flux values). This produces a smoothed light curve with fewer outliers. This standard mean filter is then followed by a median filter. The median is more resistant to the influence of outliers than the mean, and the smoothing therefore benefits from this step.

The outliers are identified and removed based on this smoothed data. The set of smoothed data is broken up into samples of one thousand points, for each of which the median is calculated. For each point, the residual (the distance from the median of the thousand-point block) is calculated, and then the standard deviation for the sample is found in the standard way by summing the squares of the residuals, dividing by the number of points in the sample, and taking the square root.
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(f_i - \tilde{f}\right)^2} \qquad (2.1)$$

Here $f_i$ are the smoothed flux values in the sample, $\tilde{f}$ is their median, and $N$ is the number of points in the sample. Now, returning to the original data, for each point more than three standard deviations away from the median of the corresponding smoothed data, its index is noted, and the corresponding point is removed from the original data. In this way, a light curve is obtained which is free of the very high or low outliers. The cutoff of three standard deviations allowed very high confidence that all rejected points were the result of unusual events such as hot pixels resulting from cosmic ray hits on the detector.

2.3 Identifying and removing Systematics

Another problem with the data involves a long term linear decay, which Aigrain notes and assumes to be of instrumental origin. In addition to this, other light curves seem to exhibit the opposite effect, an increase over the course of observation. The instrumental origin of this phenomenon would seem to be a slow drift of the satellite's field of view. This results in a linear trend in the flux data because there are particular areas of the CCD designated as corresponding to particular stars. When the satellite drifts, it can become misaligned and allow either more (if there is another bright object nearby) or (more often) less light into the area designated for the star of interest. In order to deal with the problem of instrumentally introduced linear trends, MATLAB's fitlin function was used to fit a linear polynomial to the light curve. This line is then simply subtracted from the data, or more precisely, at each point in the light curve, the y-value (flux) of the line at the corresponding x-value (time) is subtracted from the data point.

Many of the systematics one would expect to find in the CoRoT data are periodic in nature, due to the orbital period of the satellite and its periodic entries into and exits from the Earth's shadow. One approach to characterizing and ultimately removing these systematics would employ Fourier analysis.
While this approach was initially attempted, it was never fully developed, and instead long-timescale variation was removed via median subtraction. In this method of removing low-frequency variation (chosen as variation on timescales longer than two days), the median of the two days of points surrounding each point was calculated and subtracted off. This was effectively a high-pass filter, two days being chosen because it was much longer than any expected transit duration. After this subtraction, the overall median for the entire light curve was added back to each point in order to restore the flux values to their original magnitude.

2.4 Evaluating Scatter

The scatter that remains in the data after the application of the above corrections was evaluated using the MAD (median of absolute deviations from the median) statistic. This metric was calculated for each light curve by taking the median magnitude value for the light curve, calculating the absolute deviation of every magnitude value from this median, and then taking the median of these deviations. The MAD statistic is useful as a measure of dispersion which is less strongly influenced by outliers than the variance or standard deviation. This is simply because it involves taking the median of residuals rather than the mean, and as such remains around the middle of the residual values even when the residuals on one end are quite extreme. Because of this, for this data, which displays a number of high outliers (although a number greatly reduced by the outlier rejection procedures described in Section 2.2), the MAD is the best way to measure meaningful dispersion in the data.

$$\mathrm{MAD} = \mathrm{median}\left(\left|m - \mathrm{median}(m)\right|\right) \qquad (2.2)$$

Before calculating the scatter of light curves, they were in some cases resampled to a uniform 512 second sampling rate. See Section 3.2 for plots of MAD with and without this resampling step. There seems to be very little effect of this procedure on the overall scatter.
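The outlier rejection of Section 2.2 and the two-day median subtraction of Section 2.3 might be sketched as follows. This is a minimal Python illustration using only the standard library, not the MATLAB code of Appendix B. The function names, the truncated filter windows at the ends of the light curve, and the centering of the two-day window on each point are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, median

def boxcar(values, reducer, width=5):
    """Sliding-window filter; windows are truncated at the edges (assumption)."""
    half = width // 2
    return [reducer(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def reject_outliers(flux, block=1000, nsigma=3.0):
    """Return the indices of the points that survive the rejection step."""
    # Five-point mean filter followed by a five-point median filter.
    smoothed = boxcar(boxcar(flux, mean), median)
    keep = []
    for start in range(0, len(flux), block):
        chunk = smoothed[start:start + block]
        center = median(chunk)
        # Equation 2.1: RMS residual of the smoothed block about its median.
        sigma = sqrt(sum((s - center) ** 2 for s in chunk) / len(chunk))
        for i in range(start, start + len(chunk)):
            # Compare the ORIGINAL point with the smoothed block median.
            if abs(flux[i] - center) <= nsigma * sigma:
                keep.append(i)
    return keep

def subtract_running_median(times, flux, window_days=2.0):
    """High-pass filter: subtract the median of all points within a window
    centered on each point, then add back the overall median."""
    overall = median(flux)
    half = window_days / 2.0
    filtered = []
    for t, f in zip(times, flux):
        local = [fj for tj, fj in zip(times, flux) if abs(tj - t) <= half]
        filtered.append(f - median(local) + overall)
    return filtered
```

The running-median function scans the whole light curve for every point, which is simple to read but quadratic in the number of points; a real reduction of a 150 day run would want a sliding window over time-sorted data.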
The nonuniform sampling of the light curves comes from two sources. The correction techniques described in Section 2.2 of course introduce gaps in the data through outlier removal. Even the raw light curves exhibit some nonuniform sampling, however. This is because actual exposure times are 32 seconds, and although the data is typically rebinned to 512 seconds before transmission, this is not always the case. When there is a possible detection of a transit, the raw data is reported with 32 second sampling [5]. The applied resampling process rebins oversampled regions to 512 seconds, accounting for both gaps and oversampled periods. The oversampled regions are rebinned by using the median flux value for these time blocks, this being resistant to the influence of outliers. Any gaps in the data, whether introduced by our corrections or preexisting, are filled via linear interpolation.

The last step before calculating the MAD statistic is the conversion of flux values from electron counts to magnitudes. This is done using the standard calculation of apparent magnitude, using as a reference the accepted magnitude of the star given in the FITS header and the median flux over the entire observation run.

$$m = -2.5\,\log_{10}\left(\frac{f}{\mathrm{median}(f)}\right) + m_{\mathrm{reported}} \qquad (2.3)$$

The scatter among these magnitude values was then calculated using MAD as described above.

Chapter 3 Results

3.1 CoRoT Light Curves and Corrections

The CoRoT data has a number of problems which can make exoplanet detection more difficult, largely from instrumental sources. The most prominent of these effects, immediately apparent in Figure A-1 or any plot of an unprocessed light curve, is the very large number of high outliers. Other significant effects include an overall linear trend upward or (more frequently) downward and a grouping of low outliers present for only one segment of the light curve.
The first of these effects is principally attributable to cosmic ray hits on the detector, the high number being due to the satellite's frequent crossing of the South Atlantic Anomaly. As discussed in Section 1, the South Atlantic Anomaly is the region where the Van Allen radiation belt comes closest to the surface of the Earth. This effectively bombards the satellite with a large amount of radiation every time the satellite passes through the region. When the on-board CCD is struck with this radiation, the sensitivity of the affected pixels is temporarily altered, causing the very high readings [9]. The frequent crossings of the South Atlantic Anomaly do present a significant problem and make CoRoT's orbital path decidedly less than ideal.

The linear trends observed in many light curves are most likely due to a drift in the satellite's pointing accuracy. A slight drift off target can cause more or less light to fall in the region of the CCD associated with the star. The light reported is only that which falls into a designated region corresponding to the spread-out spectrum of the star from the prism in the detector. If the satellite's targeting drifts over the course of its observation run, some of the starlight will fall outside this region, causing the downward trend. Alternatively, light from a nearby star can contaminate the region associated with the star of interest, causing a possible upward trend [5]. The low outliers may represent the results of the satellite's entry into or exit from the shadow of the Earth, when the change in incident light can cause a loss of accuracy in targeting [9].

The light curves can be significantly improved by applying the correction techniques described in Section 2. The corrections remove the very high or low outliers and any linear trend in a light curve, all of these being effects that can be accounted for. This allows an evaluation of the noise in the data after accounting for known sources of unwanted signal.
Cleaning up the data to this level allows a better estimation of the limits that the data places on exoplanet detection. The remaining scatter is likely due to unknown systematics or intrinsic stellar variability. The low orbit of the satellite could be a source of noise beyond the periodic South Atlantic Anomaly crossings. Understanding the impact of intrinsic stellar variability would be particularly interesting, since it would present an obstacle for any planet-finding mission, regardless of orbital characteristics or instrumentation.

Figures A-1 and A-2 show a typical CoRoT light curve both before and after the application of the correction routines. The number and magnitude of outliers have been greatly reduced, and the data no longer exhibit any linear trend.

3.2 Scatter

In order to determine the detectability of exoplanet transits with CoRoT, or from data of a similar quality, it is essential to understand the level of scatter that remains in the data after known systematics are removed. The scatter was evaluated via the MAD statistic (median of absolute deviations from the median, see Section 2) for each star. In addition to determining the overall level of scatter present, the scatters of all stars in each data set (corresponding to an observing run) were plotted against their R magnitudes to show any correlation with magnitude, and the degree to which the noise in individual light curves varies can be estimated from these plots.

The MAD values, plotted on a logarithmic scale in Figures A-3 to A-6 for most light curves from the first long and short runs of observing, broken up into "chromatic" (bright enough to separate the light into a spectrum) and "monochromatic" (dimmer) stars, fall between 10^-4 and 10^-3 counts. Again, the MAD statistic is calculated by finding the median of the electron counts in the light curve, then calculating the absolute value of the deviation from this median for each point, and finally taking the median of these deviations.
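The MAD calculation restated above, together with the magnitude conversion of Equation 2.3, can be sketched in a few lines. This is a Python illustration rather than the MATLAB code of Appendix B, and the function names are hypothetical. Section 2.4 describes converting to magnitudes before taking the MAD, so the two functions are shown separately and can be chained in that order.

```python
from math import log10
from statistics import median

def flux_to_magnitude(flux, m_reported):
    """Equation 2.3: magnitudes relative to the median flux of the run,
    offset by the catalogue magnitude from the FITS header."""
    reference = median(flux)
    return [-2.5 * log10(f / reference) + m_reported for f in flux]

def mad(values):
    """Equation 2.2: median of absolute deviations from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])
```

As a small illustration of the robustness discussed in Section 2.4, the values 1, 2, 3, 4, 100 have a MAD of 1, while their standard deviation is about 39 because of the single high outlier.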
The two observing runs for which scatter-magnitude plots were constructed were the first long and short runs in the constellations Monoceros and Serpens Cauda respectively. The long run plot exhibits an apparently stronger trend, as well as somewhat closer grouping of the scatter plot, but both display approximately the same general level of scatter for both point-to-point and two-hour binned MAD.

Figures A-3 to A-6 show the logarithm of the MAD statistic for each star plotted against its R magnitude. It should be noted that for the short run data sets, the magnitudes were reported only to one decimal place, whereas for the long run sets, the magnitudes were given to four decimal places. This creates a significant visual difference in the plots. More significant is the previously mentioned fact that while the short run scatters seem to exhibit a weak dependence on the magnitude (with higher magnitude stars showing more scatter, as expected), the correlation seems stronger for the long run.

The MAD statistics for this short run data set vary by a factor of about ten, with a few stars exhibiting significantly more scatter. This suggests that the noise in CoRoT light curves is somewhat inconsistent, with some light curves plagued by much more noise. Some light curves were also rejected in the process of assembling the scatter-magnitude plots if the data correction techniques removed a large majority of the points in the curve, suggesting highly noisy data with a very weak signal. For the short run chromatic set, 2.2 percent of stars (28 of 1271) were rejected in this fashion. For the monochromatic short run stars, only 0.07 percent (4 of 5706) were rejected. For the long run, 0.04 percent (3 of 7689) of the monochromatic stars and 2.5 percent (93 of 3719) of the chromatic stars were rejected.
The MAD could be evaluated in a few different ways for each data set, either by measuring simple point-to-point scatter or by binning over longer timescales before taking the MAD. In particular, the timescales of interest are those corresponding to the duration of transits. It is the noise on these timescales that presents the greatest obstacle to planet detection. Appendix A shows several plots of MAD after two-hour binning versus R magnitude for the "chromatic" stars in the first short run of observation. Two hours was chosen based on the probable transit duration of the longest period exoplanets CoRoT would be likely to detect, those with orbits of a few weeks. This limit on the orbital period is imposed by the length of time for which CoRoT observes the star, which is only a few months. (It would be highly unlikely to catch a planet with a much longer orbital period in a transit during the relatively short observing run.) There was little discernible difference between the level of scatter point to point and after two-hour binning; however, all plots in Figures A-3 to A-6 are binned on two hours.

3.3 Calculating Transit Properties

The detection of an exoplanet transit involves measuring small changes in the incident flux from a star, and as such is limited by the noise in the flux data. The scatter in light curves such as the ones obtained by CoRoT has the effect of limiting the detectability of planets. Ultimately, it is the size of planets which can be detected that is limited by the noise, but what is directly observed is the depth of the transit, which is directly related to the planet radius. It is the transit depth, therefore, that should be considered when determining the limits on CoRoT's ability to detect planets.

At the most basic level, determining the depth of a transit is straightforward: the depth of the transit as a fraction of the incident flux from the star is simply equal to the ratio of the projected areas of the planet and star.
This is therefore equal to the ratio of the squares of the radii:

$$\text{Transit depth} = \left(\frac{R_{\text{planet}}}{R_{\text{star}}}\right)^{2} \qquad (3.1)$$

For this analysis, subtler effects such as limb darkening (the fact that the centers of the observed disks of stars are more luminous than their outer edges, leading to a rounded transit) can be ignored. What is critical is simply the total depth of the transit (E. Adams, private communication, 2009). First, suppose that we wish to detect a planet with the radius of the Earth, around 0.01 solar radii, orbiting a star similar to our own, with a radius equal to that of the Sun. The transit depth would correspondingly be just 0.01 percent. More generally, we can say that planets will likely be detected around stars with radii between 0.5 and 2 solar radii, and we can look for planets of 5 Earth radii. These numbers give transit depths in the range of 0.06 to 1 percent.

3.4 Limits on Detection

Using the MAD statistic as a measure of the noise in the CoRoT light curves, we can put limits on the depth of a transit that would be detectable using CoRoT or another instrument providing data of similar quality. The median of the MAD statistics calculated for all stars is 0.0016, or 0.16 percent. Figure A-7 shows the transit depth as a function of the ratio of the radii of the planet and its parent star; the depth is simply the square of this ratio. If we require a signal-to-noise ratio of 10, then based on the noise found to be present in the CoRoT data, only exoplanets with a transit depth greater than 1.6 percent could be definitively detected for most stars. However, for the brightest stars observed by CoRoT, the noise is significantly reduced, yielding a MAD statistic closer to 0.03 percent, which would allow the detection of significantly smaller signals, in the range of 0.3 percent.
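The arithmetic behind these limits can be reproduced with a short Python sketch. It uses only the rounded values quoted in the text (an Earth radius of about 0.01 solar radii, noise levels of 0.16 and 0.03 percent, and a signal-to-noise ratio of 10). The function names `transit_depth` and `min_detectable_depth` are hypothetical and do not appear in the thesis code.

```python
def transit_depth(planet_radius, star_radius):
    """Transit depth as a fraction of stellar flux (Equation 3.1).

    Both radii must be given in the same units.
    """
    return (planet_radius / star_radius) ** 2


def min_detectable_depth(noise, snr=10.0):
    """Smallest transit depth that reaches the required signal-to-noise ratio."""
    return snr * noise


R_EARTH = 0.01  # Earth radius in solar radii, rounded as in the text

cases = {
    "Earth around a Sun-like star": transit_depth(R_EARTH, 1.0),
    "5 Earth radii, 2 solar-radius star": transit_depth(5 * R_EARTH, 2.0),
    "5 Earth radii, 0.5 solar-radius star": transit_depth(5 * R_EARTH, 0.5),
}
limits = {
    "median CoRoT star (MAD 0.16 percent)": min_detectable_depth(0.0016),
    "brightest stars (MAD 0.03 percent)": min_detectable_depth(0.0003),
}

for label, depth in cases.items():
    print(f"depth, {label}: {100 * depth:.4f} percent")
for label, limit in limits.items():
    print(f"limit, {label}: {100 * limit:.2f} percent")
```

Comparing each depth with each limit shows directly which planet and star combinations clear the assumed threshold and which fall below it.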
Given these limits, truly Earth-like planets are not quite within CoRoT's reach, but planets that are slightly larger, yet still relatively small and likely rocky, are certainly detectable, as evidenced by the much-lauded discovery of CoRoT-7 b. The case of CoRoT-7 b in fact pushes the limits of detectability. CoRoT-7 is a relatively bright star, with an apparent magnitude of 11.7, so the scatter in its data is at the 0.02 percent level. CoRoT-7 b has a radius of 0.15 Jupiter radii and orbits a star of 0.87 solar radii, so its transit depth is 0.03 percent, just above the level of the noise.

Chapter 4 Relation to Other Missions

Significantly, a great deal of the noise present in the CoRoT data has a known origin, and a satellite with different orbital characteristics could avoid many of the problems that face CoRoT. NASA's Kepler mission, launched in 2009, provides a strong contrast with CoRoT as a model for an extrasolar planet-finding space mission in this and other respects. The first advantage Kepler has over CoRoT is its much longer period of observation. CoRoT has discovered a number of short-period exoplanets, but even its long observing runs last only 150 days. To have a high probability of detecting a planet like Earth, however, whose orbital period is on the order of one year, a much longer observing session is needed. Kepler's mission involves observing its selected field of view for a full three and a half years. There is something of a trade-off inherent in this longer mission, however. CoRoT, with its multiple shorter runs, moves to different parts of the sky (albeit parts that are very close together) over the life of its mission. Kepler, at least as its mission is currently defined, will observe only one patch of the sky for its entire lifetime as a functional spacecraft.
This is made up for somewhat by the fact that Kepler has a much larger field of view than CoRoT; even so, while Kepler's field of view will encompass around 100,000 stars bright enough to study [4], CoRoT will be able to observe around twice that number [7]. The full field of view covered by Kepler's CCD array is 105 square degrees, situated in the constellations Cygnus and Lyra; the mission is "specifically designed to survey our region of the Milky Way galaxy" [3]. This field of view also meets the requirement of lying out of the ecliptic plane, meaning that the stars under observation will not be periodically blocked by the Sun. For the Kepler mission, "bright enough" means stars of magnitude 14 or brighter in the visual wavelength band.

Kepler's instrumentation also differs from that of CoRoT. The spacecraft carries a telescope with a spherical primary mirror 1.4 meters in diameter and a Schmidt correcting plate to correct for spherical aberration. This design allows for Kepler's wide field of view, observed by its photometer, an array of 42 CCDs (charge-coupled devices). Each CCD is 50 by 25 millimeters, or 2200 by 1024 pixels. Onboard exposure times are three seconds, short enough to avoid saturating the CCDs, with the starlight slightly defocused (spread out to 10 arcseconds) [4]. Kepler's instruments are housed within a spacecraft weighing just over 1,000 kilograms, and the telescope provides an aperture of 0.95 meters. Kepler is equipped with a solar array, a high-gain antenna for data transmission, thrusters, and a radiator for cooling the CCDs (thereby reducing thermal noise from the random motion of electrons). As it watches these stars for years, Kepler will also rotate 90 degrees about its line of sight every three months to keep its solar arrays in the sunlight and its CCD radiator pointed into deep space [4].
Perhaps the most significant advantage Kepler has over CoRoT is its orbit. Unlike CoRoT, in its polar orbit around the Earth, Kepler is not forced to contend with the radiation of the Van Allen belts or with periodic entry into and exit from the Earth's shadow. This is because Kepler is far more distant: it does not orbit the Earth, but occupies an Earth-trailing heliocentric orbit. The spacecraft follows the Earth at a distance as it orbits the Sun, but with a slightly larger semimajor axis, giving it a slightly longer orbital period of 372.5 days. Because of this longer period, Kepler is slowly drifting farther and farther away from Earth. Another advantage of this orbit is that, at a greater distance, the craft is subject to fewer torques from gravitational gradients than CoRoT or other satellites deeper in the Earth's gravitational field, allowing pointing accuracy to be maintained more easily. The satellite will still have to contend with the radiation associated with solar flares.

All of this means that Kepler is positioned to have far more success than CoRoT has had at discovering terrestrial, and perhaps habitable, worlds. Although all planets discovered by Kepler so far are of the easy-to-detect hot Jupiter type, this is expected, since it has been in operation for just a matter of months. As the mission continues, we can expect to see more long-period planets transiting.

Understanding the capabilities and limitations of a satellite like CoRoT is still very useful, however. In order to truly push our knowledge of extrasolar planets forward, there will be a need for many more missions to search for these distant worlds, and not all of them can be on the scale of Kepler. Understanding the sources of noise in data from CoRoT, and the limitations this noise puts on the detection of planets, can inform the design of future missions capable of searching new parts of the sky as yet unsearched by CoRoT or Kepler.
Ultimately, these missions can make progress on answering some of the most fundamental questions that have inspired us to study the cosmos for years. Could there be life elsewhere in the universe? And could we humans find a hospitable home other than Earth? The first step in answering these questions is determining whether there are other planets with Earth's unique combination of attributes, which make it so perfectly suited to the development of life. Three years from now, Kepler may have discovered dozens of planets similar in size and orbital characteristics to our own. Farther down the road, many more missions could be pushing our knowledge of such planets ever farther. And despite the wealth of questions to be answered in our own solar system, knowing that these distant Earth-like planets are out there, awaiting further study, will surely be the strongest inspiration and incentive that could be hoped for to continue the study and exploration of space.

Appendix A Figures

Figure A-1: This light curve is typical of CoRoT data before the application of any corrections. There are a very large number of very high outliers, a pronounced linear trend (upward, in this case, indicating that pointing drift allowed more light into the area designated for the star), and low outliers over one portion of the observing run.

Figure A-2: After the application of the correction techniques described in Section 2, the CoRoT light curves exhibit significantly reduced outliers and have had any linear trend removed. The low outliers associated with entry into or exit from the Earth's shadow (and loss of accuracy) are still apparent toward the end of the light curve.

[Plot: MAD vs. R Magnitude for Short Run, Chromatic Stars (Resampled to 512 s). Horizontal axis: R Magnitude. Vertical axis: MAD, logarithmic scale.]

Figure A-3: Scatter is plotted against R magnitude here for stars observed during the first short run and designated as chromatic, meaning that they are bright enough to have their flux reported in three separate color channels: red, green, and blue. The R magnitude is reported to only one decimal place for stars in the short run. Scatter for this data set ranges from approximately 0.0003 to 0.002, with a positive correlation between magnitude and scatter. Scatter here is the median of absolute deviations from the median.

[Plot: MAD vs. R Magnitude for Short Run, Monochromatic Stars (Resampled to 512 s). Horizontal axis: R Magnitude. Vertical axis: MAD, logarithmic scale.]

Figure A-4: Plotted here is the scatter (MAD) versus R magnitude for the monochromatic stars from the short run, those stars too dim to have their flux reported in separate color channels. The MAD values are in the same range as for the chromatic set, and higher for the dimmest stars.

[Plot: MAD vs. R Magnitude for Long Run, Monochromatic Stars. Horizontal axis: R Magnitude. Vertical axis: MAD, logarithmic scale.]

Figure A-5: Equivalent scatter-magnitude plots were made for the first long (150 day) run data sets. The magnitude of these stars is reported with greater precision, to four decimal places in the FITS header, and the correlation between magnitude and scatter is apparent.

[Plot: MAD vs. R Magnitude for Long Run, Chromatic Stars (No Resampling). Horizontal axis: R Magnitude. Vertical axis: MAD, logarithmic scale.]

Figure A-6: Scatter (MAD) is plotted against R magnitude for chromatic stars observed during the first long observing run. The plot shows a clear correlation between magnitude and scatter for these stars.

[Plot: Transit Depth vs. Planet/Star Radius Ratio. Horizontal axis: Planet Radius / Star Radius. Marked levels: Jupiter/Sun Transit; Average CoRoT Star Limit.]

Figure A-7: The blue curve above represents the relationship between the planet/star radius ratio and transit depth. Also plotted here and below are the transit depths of various planet/star pairs and the limits on transit depth imposed by CoRoT-level noise.

[Plot: Transit Depth vs. Planet/Star Radius Ratio. Horizontal axis: Planet Radius / Star Radius. Marked levels: CoRoT-7 b Transit; CoRoT Bright Star Limit; Earth/Sun Transit.]

Figure A-8: More transit depths are shown here, for the Earth and Sun and for CoRoT-7 b, both of which are of the same order of magnitude as the lowest threshold of detectability imposed by the scatter in CoRoT light curves.

[Plot: Detection Limits. Horizontal axis: R Magnitude. Marked levels: Jupiter/Sun Transit; CoRoT-7 b Transit; Earth/Sun Transit.]

Figure A-9: Here, a quadratic fit to the scatter/magnitude plot from the long run monochromatic stars is shown along with transit depths for various planet/star pairs, showing which types of planets could be detectable at which magnitudes.

Appendix B MATLAB Code

B.1 Read Light Curve

function output=lightcurve(file,option,NStED_compatibility,color)
%LAST UPDATED 6/22/2009 SR
%This modifies lightcurve.m to work with chromatic and
%monochromatic data from CoRoT's N2 Public archive.
%output=lightcurve(file, option, NStED_compatibility, color)
%file is a filename (string).
%if option='plot', then it makes a plot. If option='errorplot'
%it makes a plot with errorbars. If
%option='matrix', then it returns a 3-column matrix with
%helio date, white flux, and white-flux error.
%if NStED_compatibility=1 then the flux will
%be scaled by 1/10000 and the date shifted by
%-2000, to make it compatible with the plots on
%the NStED website. If NStED_compatibility ~=1,
%then it won't be.
%If color = 'red' plots red flux. If color equals 'green'
%it plots green flux.
%If color equals 'blue' it plots blue flux.
%If using monochromatic data, enter 'white' for color.
%If want to convert from polychromatic, enter 'combine'
%The purpose of this function is to generate a
%lightcurve from the CoRoT exoplanet data.
%It has only been tested for this dataset.
data=fitsread(file, 'bintable'); %read in binary table of the fits file
datehel=data{1,3}; %heliocentric julian date
%this routine selects out the desired color
%channel.
if strcmp(color, 'white')
    flux=data{1,5}; %flux is measured flux in the given channel in electrons
    flux_err=data{1,6}; %flux_err is error in the measured flux
elseif strcmp(color, 'red')
    flux=data{1,5};
    flux_err=data{1,6};
elseif strcmp(color, 'green')
    flux=data{1,7};
    flux_err=data{1,8};
elseif strcmp(color, 'blue')
    flux=data{1,9};
    flux_err=data{1,10};
elseif strcmp(color, 'combine')
    flux=data{1,5}+data{1,7}+data{1,9};
    flux_err=sqrt(data{1,6}.^2+data{1,8}.^2+data{1,10}.^2); %errors add in quadrature
else
    error('Invalid value for "color"')
end
%The lightcurves generated by the NStED database
%are unusual in that they displace the date by
%-2000 and scale the e- counts by 10^-4. We don't
%work with these since they're not physical, but
%for purposes of testing our initial code to make
%sure it could recover the NStED lightcurves, we
%included it.
if 1==NStED_compatibility
    flux=flux/10000; %scale flux by 10^-4
    flux_err=flux_err/10000; %ditto for error in flux
    datehel=datehel-2000; %shift date
end
if strcmp(option, 'plot') %do you want to create a plot?
    results=[datehel,flux,flux_err];
    [a,aerr,chisq,yfit]=fitlin(results(:,1),results(:,2),ones(size(results(:,1)))); %do a chi-squared minimization fit
    %to obtain a linear trend to subtract off the data (incorporates errors)
    slope=a(2);
    yint=a(1);
    %x=datehel;
    %y=slope*datehel+yint;
    figure(gcf+1)
    plot(datehel, flux, 'k.')%,x,y)
    xlim([min(datehel) max(datehel)]) %sets limits on the x-axis
    ylim([min(flux) max(flux)]) %sets limits on the y-axis
    xlabel('Heliocentric Julian Date')
    ylabel('Flux')
    if strcmp(color, 'red')
        title('Red Flux')
    elseif strcmp(color, 'blue')
        title('Blue Flux')
    elseif strcmp(color, 'green')
        title('Green Flux')
    elseif strcmp(color, 'white')
        title('White Flux')
    elseif strcmp(color, 'combine')
        title('Combined White Flux')
    end
elseif strcmp(option, 'errorplot') %creates a plot with errorbars.
    figure(gcf+1)
    errorbar(datehel, flux, flux_err, 'k.')
    xlim([min(datehel) max(datehel)])
    ylim([min(flux) max(flux)])
    xlabel('Heliocentric Julian Date')
    ylabel('Flux')
    if strcmp(color, 'red')
        title('Red Flux')
    elseif strcmp(color, 'blue')
        title('Blue Flux')
    elseif strcmp(color, 'green')
        title('Green Flux')
    elseif strcmp(color, 'white')
        title('White Flux')
    elseif strcmp(color, 'combine')
        title('Combined White Flux')
    end
elseif strcmp(option, 'matrix') %return the read-in lightcurve.
    matrix(:,1)=datehel;
    matrix(:,2)=flux;
    matrix(:,3)=flux_err;
    output=matrix;
else
    error('Invalid value for "option"')
end

B.2 Apply Corrections

function output=data_correction(file,option,color)
%data_correction(file,option,color,numiter)
%file is the filename of the target
%option='plot' yields a plot. option='matrix'
%yields a 2-column matrix, col 1 is xdata, col 2
%is ydata
%color is 'white', 'red', 'green', 'blue', 'combine'
%Data corrections so far are boxcar filter and
%3-sigma cut, iterated.
%numiter is number of iterations
%USER-DEFINED CONSTANTS
outliersamplewidth=1000; %width of sample across which to compute cutoff
boxcarwidth=5; %width of boxcar smoothing filter
%numiter=3; %number of boxcar/outlier rejection iterations
%READ IN DATA
data=lightcurveN2(file, 'matrix', 0, color); %date, flux, flux error;
%REJECT POINTS WITH NaN, 0 ERROR
count=0;
for ind=1:size(data,1)
    if or(data(ind,3)==0, sum(isnan(data(ind,:)))>0)
        count=count+1;
        rejectlist(count)=ind;
    end
end
if exist('rejectlist', 'var')
    data(rejectlist,:)=[];
end
%MAIN CODE BEGINS -- SMOOTHING AND OUTLIER REJECTION
iter=0;
initnumbpoints=size(data,1);
while 1==1
    iter=iter+1;
    %disp(iter)
    count=0; %this variable tracks how far we are in the keeplist
    if iter>20
        break
    end
    %SMOOTHING
    %boxcar mean smoothing of data.
    clear thunk_data %temporary variable to store the smoothed data.
    dist_boxcar=floor(boxcarwidth/2); %width searched by code
    thunk_data=zeros(size(data)); %initialize thunk_data
    numpoints=size(data,1); %number of data points
    sigma=zeros(numpoints,1);
    for ind=1:(dist_boxcar)
        thunk_data(ind,:)=mean(data(1:(ind+dist_boxcar),:),1);
    end
    for ind=(dist_boxcar+1):(numpoints-dist_boxcar)
        thunk_data(ind,:)=mean(data((ind-dist_boxcar):(ind+dist_boxcar),:),1);
    end
    for ind=(numpoints-dist_boxcar+1):numpoints
        thunk_data(ind,:)=mean(data((ind-dist_boxcar):end,:),1);
    end
    %boxcar median smoothing of data.
    dist_boxcar=floor(boxcarwidth/2); %width searched by code
    thunk_data2=zeros(size(thunk_data)); %initialize thunk_data2
    for ind=1:(dist_boxcar)
        thunk_data2(ind,:)=median(thunk_data(1:(ind+dist_boxcar),:),1);
    end
    for ind=(dist_boxcar+1):(numpoints-dist_boxcar)
        thunk_data2(ind,:)=median(thunk_data((ind-dist_boxcar):(ind+dist_boxcar),:),1);
    end
    for ind=(numpoints-dist_boxcar+1):numpoints
        thunk_data2(ind,:)=median(thunk_data((ind-dist_boxcar):end,:),1);
    end
    thunk_data=thunk_data2;
    %-------------------------------------------------
    %OUTLIER REJECTION
    flux=thunk_data(:,2); %flux (e-)
    %flux_errs=thunk_data(:,3); %error in flux
    dist_outlier=floor(outliersamplewidth/2); %width searched by code
    for ind=1:numpoints
        if numpoints<=dist_outlier
            sample=flux;
        elseif numpoints<=2*dist_outlier
            if ind+dist_outlier>numpoints
                sample=flux((end-dist_outlier):end);
            else
                sample=flux(ind:(ind+dist_outlier));
            end
        elseif ind<=dist_outlier
            sample=flux(1:(ind+dist_outlier));
            %sample_errs=flux_errs(1:(ind+dist_outlier));
        elseif (ind+dist_outlier)>numpoints
            sample=flux((ind-dist_outlier):end);
            %sample_errs=flux_errs((ind-dist_outlier):end);
        else
            sample=flux((ind-dist_outlier):(ind+dist_outlier));
            %sample_errs=flux_errs((ind-dist_outlier):(ind+dist_outlier));
        end
        median_val=median(sample); %find the median of the sample
        residuals=sample-median_val; %use the median to get the residuals
        sigma(ind)=sqrt(1/outliersamplewidth*sum(residuals.^2));
        if abs(flux(ind)-median_val)<3*sigma(ind) %only keep data w/in 3-sigma
            count=count+1;
            %results(count,:)=data(ind,:); %pruning step.
            keeplist(count)=ind;
        end
    end
    if exist('keeplist', 'var')
        data=data(keeplist,:);
    end
    sigmalist(iter)=mean(sigma);
    clear keeplist
    if iter>1 %unfortunately cannot initialize this.
        %stop once the mean sigma changes by less than 1 percent between iterations
        if (abs(sigmalist(iter)-sigmalist(iter-1))/sigmalist(iter)<.01)
            break
        end
    end
end
disp(iter)
flag_val=0; %originally, all is well
results=data;
if size(data,1)<=30 %data rejected if flag column triggers either conditions
    disp('Fewer than 30 points')
    flag_val=1;
elseif size(data,1)<=0.5*initnumbpoints
    disp('Fewer than half of original points remain')
    flag_val=1;
elseif iter > 20
    disp('Over 20 iterations -- convergence likely weak')
    flag_val=1;
else
    %-------------------------------------------------
    %LINEAR DETRENDING
    [a,aerr,chisq,yfit]=fitlin(results(:,1),results(:,2),ones(size(results(:,1))));
    %to obtain a linear trend to subtract off the data (incorporates errors)
    slope=a(2);
    %yint=a(1); %this quantity is not used.
    results(:,2)=results(:,2)-slope*results(:,1); %subtract off linear trend
    %-------------------------------------------------
    %MEDIAN SUBTRACTION
    binwidth=338; %338 points corresponds to two days with 512 second sampling.
    subtraction=zeros(size(results,1),1); %This initializes a matrix
    numpoints=size(results,1);
    clear subtraction
    dist_window=floor(binwidth/2); %width searched by code
    if numpoints>=340
        for ind=1:numpoints
            if ind<=dist_window %is point too far to left?
                subtraction(ind,:)=median(results(1:(ind+dist_window),2),1);
            elseif (ind+dist_window)>=numpoints %is point too far to the right?
                subtraction(ind,:)=median(results((ind-dist_window):end,2),1);
            else
                subtraction(ind,:)=median(results((ind-dist_window):(ind+dist_window),2),1);
            end
        end
        results(:,2)=results(:,2)-subtraction; %Remove median from surrounding span
        results(:,2)=results(:,2)+median(data(:,2));
    else
        flag_val=1;
        disp('Insufficient number of points for median subtraction.')
    end
end
flags=flag_val*ones(size(results,1),1);
results(:,size(results,2)+1)=squeeze(flags);
%RETURNING RESULTS
if strcmp(option, 'plot') %plot lightcurve
    figure(gcf+1)
    plot(results(:,1), results(:,2), 'k.')
    xlim([min(results(:,1)) max(results(:,1))])
    xlabel('Phase')
    ylabel('Power (e^{-})')
elseif strcmp(option, 'matrix') %return lightcurve
    output=results;
else
    error('Incorrect value for option.')
end

B.3 Save Corrections

function save_dispersion(directory, name)
%This function applies data_correction and
%calculates the median of absolute deviations from the median
%for all files in the directory input as 'directory'.
%directory=directory containing files of interest
%numiter=number of iterations of data_correction
%name=name of file in which to save data for plotting
filestructure=dir(directory);
numfilestructure=size(filestructure,1); %number of directory entries
ticker=0;
for ind=1:numfilestructure
    element=filestructure(ind,1);
    if (element.isdir==0)
        ticker=ticker+1;
        filename=element.name;
        filenames{ticker,1}=filename;
    end
end
numfile=size(filenames,1);
disp(strcat('Files to Process: ', num2str(numfile))) %Print number of files
cd(directory) %Change to directory containing files
addpath('../')
Sca=zeros(numfile, 1);
Mag=zeros(numfile, 1);
tic; %timer initialized
count=0;
for ind=1:numfile;
    disp(ind)
    file=filenames{ind,1};
    if strfind(file, 'CHR')
        data=data_correction(file, 'matrix', 'combine');
    elseif strfind(file, 'MON')
        data=data_correction(file, 'matrix', 'white');
    else
        disp('Warning: file found with neither MON nor CHR')
        continue %skip unrecognized files rather than reuse stale data
    end
    flag=data(:,4);
    if median(flag)==1
        continue
    else
        file((end-4):end)=[];
        filenametosave=strcat('./processed_data/', file);
        save(filenametosave, 'data')
        time=fix(clock);
        save('./processed_data/timecomplete/TIMELASTFILECOMPLETED', 'time')
    end
end
cd('..')
toc %print total time

B.4 Save MAD, Magnitude

function saveMedAbsDev(directory, name, option)
%This function applies data_correction and
%calculates the median of absolute deviations from the median
%for all files in the directory input as 'directory'.
%directory=directory containing files of interest
%name=name of file in which to save data for plotting
%option=entering 'resample' for option will resample the data to 512
%seconds before calculating the MAD
drctry=strcat(directory, '/processed_data');
filestructure=dir(drctry);
numfilestructure=size(filestructure,1); %number of directory entries
ticker=0;
for ind=1:numfilestructure
    element=filestructure(ind,1);
    if (element.isdir==0)
        ticker=ticker+1;
        filename=element.name;
        filenames{ticker,1}=filename;
    end
end
numfile=size(filenames,1);
disp(strcat('Files to Process: ', num2str(numfile))) %Print number of files
cd(directory) %Change to directory containing files
addpath('../')
Sca=zeros(numfile, 1);
Mag=zeros(numfile, 1);
tic; %timer initialized
count=0;
for ind=1:numfile;
    disp(ind)
    file=filenames{ind,1};
    count=count+1;
    file((end-3):end)=[];
    file=strcat(file, '.fits');
    info=fitsinfo(file); %Obtain R-magnitude from FITS header.
    magnitude=info.PrimaryData.Keywords;
    Mag(count,1)=magnitude{34,2}; %matrix of magnitudes
    file((end-4):end)=[];
    filenametoload=strcat('./processed_data/', file);
    load(filenametoload)
    data(:,4)=[];
    medianflux=median(data(:,2)); %Median flux value for the lightcurve
    %-------------------------------------------------
    %MAKE SURE UNIFORM TIME SAMPLE - ALL SAMPLED AT 512 SEC.
    %take median to resample to 512.
    %IF 32 SEC, the median is taken to resample to 512.
    %If time between points is greater than
    %512, return linear interpolation (LRM)
    timeinterval=2; %interval of sampling in hours
    if strcmp(option, 'resample')
        datehel=data(:,1);
        datehel=(fix(data(:,1)*1e5))/1e5; %roundoff date
        %initialize loop specific variables
        flag=0; skip_count=0; time=0; time1=0;
        xi=0; yi=0; zi=0; x=0; Y=0; Z=0;
        uniform_sample=zeros(size(data,1)-1,size(data,2)); %preallocate
        s_count=0; %number of rows written so far (set once, before the loop)
        for time=1:size(data,1)-1
            if and(flag==1,skip_count<15)
                skip_count=skip_count+1;
                continue
            else
                flag=0;
                if (fix((datehel(time+1)-datehel(time))*1e5)/1e5)<.0059
                    if size(data,1)-time>15
                        s_count = s_count+1;
                        %median over the 16 points time..time+15
                        uniform_sample(s_count,:)=[data(time,1) median(data(time:time+15,2)) median(data(time:time+15,3))];
                        flag=1;
                        skip_count=0;
                    else
                        continue %throws out the last few points
                    end
                elseif (fix((datehel(time+1)-datehel(time))*1e5)/1e5)>.006
                    time_sample = .00592;
                    x = [datehel(time,1) datehel(time+1,1)];
                    Y = [data(time,2) data(time+1,2)];
                    Z = [data(time,3) data(time+1,3)];
                    xi = datehel(time,1):time_sample:datehel(time+1,1);
                    yi = interp1(x,Y,xi,'linear');
                    zi = interp1(x,Z,xi,'linear');
                    for time1=1:size(xi,2)-1
                        s_count = s_count+1;
                        uniform_sample(s_count,:)=[xi(time1) yi(time1) zi(time1)];
                    end
                else %sampling time is 512 seconds
                    s_count = s_count+1;
                    uniform_sample(s_count,:)=data(time,:); %keeps original data
                end
            end
        end
        %REPLACES ORIGINAL DATA WITH REBINNED DATA SET
        data=uniform_sample(1:s_count,:); %drop unused preallocated rows
    end
    data(:,2)=Mag(count,1)-2.5*log10(((data(:,2))/2)/(0.5*medianflux)); %Convert flux to magnitude
    times=data(:,1);
    times=(fix(times*1e5))/1e5;
    initialtime=times(1);
    finaltime=times(end);
    timevector=initialtime:(timeinterval/24):finaltime;
    clear madvals %avoid carrying over bins from the previous file
    for dex=1:(length(timevector)-1)
        starttime=timevector(dex);
        endtime=timevector(dex+1);
        indices=find(and(times>starttime, times<endtime));
        datablock=data(indices,:);
        %NOTE: mad(X) returns the mean absolute deviation by default;
        %mad(X,1) returns the median absolute deviation.
        madvals(dex,1)=mad(datablock(:,2));
    end
    Sca(ind,:)=1.48*median(madvals);
end
cd('./processed_data/scatter')
toc %print total time
save(name, 'Mag', 'Sca')
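For readers without MATLAB, the two-hour binning step at the end of B.4 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a port of the listing above: the function name `binned_mad` is hypothetical, bins are assigned by integer division of the elapsed time rather than by the strict inequalities used in B.4, bins with fewer than two points are skipped, and the synthetic magnitudes are invented for the example.

```python
import random
import statistics


def binned_mad(times_days, values, bin_hours=2.0, scale=1.48):
    """Scaled median, over time bins, of the per-bin MAD.

    times_days: observation times in days, in increasing order.
    values: one measurement (flux or magnitude) per time.
    """
    width = bin_hours / 24.0
    start = times_days[0]
    blocks = {}
    for t, v in zip(times_days, values):
        # Group points by which bin of length `width` they fall in.
        blocks.setdefault(int((t - start) // width), []).append(v)

    per_bin = []
    for block in blocks.values():
        if len(block) < 2:
            continue  # a single point carries no scatter information
        center = statistics.median(block)
        per_bin.append(statistics.median(abs(v - center) for v in block))

    if not per_bin:
        return float("nan")
    return scale * statistics.median(per_bin)


# Synthetic example: 2000 points at 512-second sampling, Gaussian noise.
rng = random.Random(1)
step = 512.0 / 86400.0  # 512 seconds expressed in days
times = [i * step for i in range(2000)]
mags = [13.0 + rng.gauss(0.0, 0.002) for _ in times]

print("two-hour binned scatter:", binned_mad(times, mags))
```

With 512-second sampling each two-hour bin holds about 14 points, so the per-bin MAD is itself a noisy estimate; taking the median across bins, as B.4 does, stabilizes the final value.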
Bibliography

[1] Matlab. Version 7.8.0, 2009.
[2] Transiting planets. http://exoplanet.eu/catalogtransit.php?munit=runit=punit=mode=1more=, May 2010.
[3] National Aeronautics and Space Administration. Kepler: Discoveries. http://kepler.nasa.gov/Mission/discoveries/, 2010.
[4] National Aeronautics and Space Administration. Kepler: Photometer and spacecraft. http://kepler.nasa.gov/Mission/MissionDesign/PhotometerAndSpacecraft/, 2010.
[5] Suzanne Aigrain. Personal communication, 2009.
[6] Thomas Beatty. Design considerations for a space-based transit search for Earth analogs. Master's thesis, Massachusetts Institute of Technology, Department of Earth, Atmospheric and Planetary Sciences, 2009.
[7] Centre National d'Études Spatiales. Observation strategy. http://smsc.cnes.fr/COROT/GPmission.htm, 2006.
[8] M. Auvergne et al. The CoRoT satellite in flight: description and performance. Astronomy and Astrophysics, January 2009.
[9] Suzanne Aigrain et al. Noise properties of the CoRoT data. Astronomy and Astrophysics, March 2009.
[10] Josh Winn. Transits and occultations. http://arxiv.org/abs/1001.2010v1, January 2010.