Instrument Science Report NICMOS 2001-02

NICMOS OTFR: A Feasibility Study For Improved Scientific Return

H. Bushouse, M. Dickinson, and E. Bergeron

September 4, 2001

ABSTRACT

We examine the feasibility and utility of performing bias drift corrections to NICMOS data in a pipeline environment, with a specific eye towards the use of these routines in an “On The Fly Reprocessing” (OTFR) system. We have investigated the feasibility of the required calibration software implementation, as well as the scientific quality of the data that would be produced by such a system. Our results indicate that the implementation would be quite straightforward, and that the calibrated data products would offer a vast improvement over the currently available products and would be immediately usable for scientific analysis with little or no further post-pipeline processing.

1. Motivation

The existence of certain data anomalies in the calibrated output products of the current NICMOS pipeline processing programs calls into question the utility of implementing an “On The Fly Reprocessing” (OTFR) capability for NICMOS data. An astronomer must often reprocess the data using special standalone techniques, not currently available in the standard pipeline, in order to produce calibrated products that are suitable for scientific analysis; this would render the products of an OTFR system unusable. We have performed a study to assess the feasibility of incorporating the standalone correction techniques into the pipeline system. Because some of the techniques occasionally require user intervention or decision making to produce the highest quality products, we wanted to investigate the level of benefit that would be realized if these techniques were applied in a “blind” or automatic fashion to all NICMOS datasets.
In the following sections we describe the data anomalies that are present in the current calibrated products, the special techniques that have been developed to remove their effects from NICMOS images, and the procedures and results of our study of the application of these techniques in a pipeline environment.

2. NICMOS Bias-Induced Data Anomalies

The noiseless bias levels of the NICMOS detectors have several components with different spatial and temporal properties. There is an initial bias level established when the detector is reset at the start of an exposure. This level is, in general, different for each quadrant of the detector, and is removed by the first step - the “zeroth read subtraction” step - of the standard pipeline processing program calnica.

There is an additional, spatially varying, bias component known as “shading.” The amplitude and shape of this component depend on the time interval since the previous readout. For a given detector temperature the shading pattern is very stable. It has been accurately characterized for all readouts of each predefined MultiAccum sequence and is included in the standard CDBS “dark” calibration reference files. Thus the shading pattern is removed when the dark images are subtracted from the science data during pipeline processing.

Unfortunately there are small variations in both the overall bias level and the shading pattern, which appear to depend on detector temperature. Dark reference data are available for a range of temperatures; we have therefore been able to measure and characterize the changes in the shading pattern as a function of temperature and have constructed temperature-dependent dark files for use in pipeline processing.
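As a concrete illustration, such a temperature-dependent dark could be synthesized by interpolating a library of darks taken at different temperatures. The following Python sketch is illustrative only - the function name, array layout, and simple linear interpolation scheme are assumptions, not the actual implementation used to build the reference files:

```python
import numpy as np

def make_temp_dependent_dark(dark_library, temperatures, t_detector):
    """Interpolate a stack of dark frames (one per calibration
    temperature) to the detector temperature of a science exposure.

    dark_library : ndarray, shape (n_temps, ny, nx) -- darks for one
                   MultiAccum readout, sorted by temperature
    temperatures : 1-D array of the calibration temperatures (K)
    t_detector   : detector temperature of the science exposure (K)
    """
    temperatures = np.asarray(temperatures, dtype=float)
    # Clamp to the calibrated range rather than extrapolate.
    t = np.clip(t_detector, temperatures.min(), temperatures.max())
    i = np.searchsorted(temperatures, t)
    if i == 0:
        return dark_library[0].copy()
    lo, hi = i - 1, i
    w = (t - temperatures[lo]) / (temperatures[hi] - temperatures[lo])
    return (1.0 - w) * dark_library[lo] + w * dark_library[hi]
```

In practice one such interpolation would be performed for every readout of the MultiAccum sequence, using the detector temperature recorded in the instrument status keywords.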
The overall bias level of each detector quadrant is not completely stable during a particular MultiAccum exposure sequence, and therefore the subtraction of the zeroth read by the pipeline does not completely remove the net bias level from each readout. This uncorrected bias effect goes by a variety of names, but is commonly called the “pedestal” in STScI NICMOS-related literature. We will use this term, as well as “floating bias,” interchangeably. The drifting bias levels are generally constant within a given detector quadrant, but can vary independently from quadrant to quadrant. Because they are unpredictable, they are not currently removed by the standard pipeline software. This has two consequences:

• Changing residual bias levels from readout to readout result in an apparently non-linear accumulation of counts vs. time. These can be gradual drifts (positive or negative) in the bias level, or occasional sharp jumps in the global bias for one or more image quadrants.

• The net (unremoved) bias, after division by the flat field during pipeline processing, introduces a residual flat field pattern on the processed images.

The first effect is usually small, but can sometimes be important. Not only is it impossible to correctly determine the intrinsic countrate of sources when the signal is accumulating non-linearly, but the non-linearity also makes the task of cosmic-ray identification and rejection much more difficult. The CR rejection routine in the calnica pipeline software works by performing a linear fit to the accumulating signal vs. time in each pixel and identifying outliers as CR hits. If the observed signal is non-linear, pixels in one or more readouts can be falsely identified and rejected as CR hits, leading to the assignment of incorrect countrates for those pixels.
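The linear-fit rejection scheme can be illustrated with a short sketch for a single pixel. This is not calnica's actual implementation - in particular, the robust (MAD-based) noise estimate below is an illustrative choice made so that a single large hit does not inflate its own rejection threshold:

```python
import numpy as np

def flag_cosmic_rays(counts, times, nsigma=4.0):
    """Flag readouts whose accumulated counts deviate from the
    best-fit linear ramp, for one pixel of a MultiAccum stack.

    counts : accumulated counts at each readout (1-D array)
    times  : exposure time of each readout (1-D array)
    Returns a boolean mask, True where a readout is a candidate CR hit.
    """
    counts = np.asarray(counts, float)
    times = np.asarray(times, float)
    # Least-squares linear fit: counts ~ rate * t + bias
    rate, bias = np.polyfit(times, counts, 1)
    resid = counts - (rate * times + bias)
    # Robust scale estimate (median absolute deviation) so that the
    # outlier itself does not dominate the noise estimate.
    med = np.median(resid)
    sigma = 1.4826 * np.median(np.abs(resid - med))
    if sigma <= 0:
        return np.zeros(counts.size, bool)
    return np.abs(resid - med) > nsigma * sigma
```

When a readout bias jump makes the whole ramp non-linear, residuals of this kind grow for every readout, which is precisely how valid samples come to be misidentified as CR hits.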
The second effect (flat field imprint) causes the greatest problems for data analysis, giving images the appearance of poor flatfielding and unremoved spatial structure, and limiting the user’s ability to detect and measure photometry for faint objects.

3. Correction Techniques

Removing the Effects of Temperature-Dependent Shading

A large library of dark images was obtained during the NICMOS operational lifetime. These darks cover both the small range of detector temperatures experienced as the cryogen supply gradually diminished and the instrument slowly warmed up, and the rapid rise through very high temperatures that immediately followed the complete exhaustion of the cryogen. These data have been used to characterize the behavior and dependence of the bias shading signal with detector temperature. An IDL program has been written that uses this information to generate a dark reference image for any standard MultiAccum readout sequence and chosen detector temperature.

The use of these temperature-specific darks in the calibration process can lead to improved shading subtraction compared to what is obtained using the single-temperature darks currently available in CDBS. If the shading is not completely subtracted from a science image, a bias signal gradient will be left in each image quadrant. While this particular aspect of the bias-related image anomalies is usually not as severe as the general problem of a spatially constant residual bias, it can at times lead to incorrect bias estimation and removal by the currently available pedestal correction techniques (described below), because those techniques make the implicit assumption that the residual bias is spatially constant within each image quadrant. It is therefore worth considering the use of temperature-dependent darks in the pipeline that would be used in an OTFR system. The use of these dark reference files could be implemented in one of two ways.
First, a program such as the IDL task that has already been written could be employed to generate an appropriate dark image on the fly for each science dataset as it is processed. Alternatively, a library of temperature-dependent darks could be generated and archived, although this would require quite a large number of reference files: a unique file would be needed for each combination of NICMOS camera, MultiAccum sequence, and temperature. With 3 cameras, 16 sequences, and perhaps 10-20 temperature points, this would require 500-1000 reference files. In either case, the information necessary to select a dark file of the appropriate temperature is contained in instrument status keyword values in the SPT files, and is therefore readily available to the pipeline process.

Removing the Bias Drifts Within a MultiAccum Exposure

If the sky and source components of a NICMOS MultiAccum exposure can be assumed to be intrinsically constant in time, then the existing STSDAS NICMOS task biaseq can be used to remove the changes in the bias level that occur between readouts. In essence, biaseq models the countrate at each pixel as the sum of a constant astronomical signal (sky plus source) and a potentially variable component (the bias drift). The drifting bias level is assumed to be constant throughout each quadrant of the detector, and an average bias offset is determined for each quadrant of each readout and removed from the data. A model image of the target scene is produced by simply averaging or medianing together the observed countrates at each pixel over some range of readouts from the MultiAccum sequence. The model is formed separately for each image quadrant. This model image then contains the constant astronomical signal plus the average or median change in bias level from one readout to another.
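The model-image construction just described can be sketched as follows. This is a simplification: the real biaseq task works quadrant by quadrant and supports straight medianing, and the function shown here (a trimmed mean of per-readout countrates) is a hypothetical illustration:

```python
import numpy as np

def scene_model(stack, times, lo_frac=0.25, hi_frac=0.25):
    """Form a model countrate image from a MultiAccum stack by
    computing per-pixel countrates for each readout, rejecting the
    lowest and highest fractions, and averaging the rest.

    stack : ndarray (n_reads, ny, nx) of accumulated counts
    times : exposure time of each readout
    """
    rates = stack / np.asarray(times, float)[:, None, None]
    rates = np.sort(rates, axis=0)
    n = rates.shape[0]
    lo = int(np.floor(lo_frac * n))
    hi = n - int(np.floor(hi_frac * n))
    if hi <= lo:          # too few readouts to trim; use them all
        lo, hi = 0, n
    return rates[lo:hi].mean(axis=0)
```

The trimming step plays the role of the outlier rejection described in Section 4, keeping cosmic-ray hits and other spurious samples out of the model scene.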
In the interactive standalone version of the biaseq task, the user is allowed to specify which readouts will be combined to produce this median image. The model image is then scaled to the exposure time of each readout in the MultiAccum sequence and subtracted from it. This effectively removes all constant signal from sources and sky background, leaving only the difference between the quadrant bias level and the average bias. The median residual bias signal is measured within each quadrant of each readout and subtracted from it, effectively removing any signal due to second-order and higher changes in the bias level from readout to readout. The resulting dataset is a MultiAccum exposure stack that should have a signal level that accumulates linearly with time, i.e., without temporal drifts or jumps in the bias level. This dataset is now suitable for further pipeline (calnica) processing, in which the data from the individual readouts can be reliably combined into a single, final image, with cosmic-ray rejection applied in the process.

Note, however, that the net bias or “pedestal” level has not been removed by this procedure. The mean bias difference per readout will still be present in the final image along with the constant astronomical signal. The presence of this spatially constant net bias signal results in an imprinting of the flatfield pattern into the image when the flatfield calibration step is applied.

Removing the Net Bias or “Pedestal” From Final Images

In order to remove the net bias or “pedestal” from processed NICMOS images it is necessary to model and separate the signal due to astronomical sources plus sky background from that of the net bias in each image quadrant. In general, however, it is difficult to measure the bias level independent of the sky plus source flux.
To do so, we take advantage of the fact that, before flatfielding has been applied, the recorded counts from the sky and sources are modulated by the detector flatfield pattern, while the bias is not. Conversely, if the image has already been flatfielded, the source plus sky signal will be unmodulated, while the bias signal will have attained a flatfield imprint.

One method, which has been implemented in the STSDAS NICMOS task pedsky, works well for measuring and removing the net bias from images that are only sparsely covered by astronomical targets. This restriction on field content arises because the pedsky task relies on the ability to effectively filter out signal from small and widely spaced targets so that it can accurately measure the sky background signal in each image quadrant. It iteratively solves for the best combination of sky and net bias signal levels that minimizes the residual flatfield imprint in the image. The amplitude of the flatfield imprint is estimated by simply computing the standard deviation of the pixel values within each quadrant, after rejection and filtering of signal from sources. It is assumed that the point at which the standard deviation of the pixel values is minimized corresponds to a minimization of the flatfield imprint.

Images containing large astronomical targets - especially nearby galaxies or very bright stars (whose PSF wings may cover much of the image), galactic nebulae, or crowded stellar fields - require a different approach, because the sky level cannot be measured in these images. Applying the pedsky technique of searching for the minimum in the standard deviation of pixel values within an image quadrant fails due to the inclusion of (spatially varying) source signal in a majority of image pixels. An alternative technique that can be used for these types of images has been developed by Roeland van der Marel (STScI) and is implemented in his standalone Fortran program unpedestal.
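The minimization at the heart of these techniques can be sketched as a simple grid search over trial bias amplitudes for one quadrant. This is an illustration only - the real tasks use their own solvers, per-quadrant bookkeeping, and source rejection, and the convention below (the residual bias carrying the inverse-flat imprint in a flatfielded image) is an assumed simplification:

```python
import numpy as np

def fit_pedestal(quad, flat, trial_range=(-50.0, 50.0), nsteps=201):
    """Grid-search sketch of the pedsky/pedsub idea for one quadrant.

    quad : flatfielded quadrant containing a residual bias imprint
    flat : the flat pattern that the residual bias picked up
    Subtract trial bias amplitudes times the flat pattern and keep
    the one that minimizes the pixel-to-pixel standard deviation.
    """
    trials = np.linspace(*trial_range, nsteps)
    stds = [np.std(quad - b * flat) for b in trials]
    best = trials[int(np.argmin(stds))]
    return best, quad - best * flat
```

With noiseless inputs the standard deviation falls to zero exactly when the trial amplitude matches the true residual bias; on real data the minimum is broadened by noise and, as discussed above, can be pulled astray by strong source signal unless the statistics are confined to high spatial frequencies.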
An enhanced version of this program, containing additional processing options, has been ported to the C language and implemented in the STSDAS NICMOS package as the pedsub task. This program uses the same general approach as pedsky, in that it also solves for the bias signal level in each image quadrant by iteratively subtracting different bias levels and searching for the value that minimizes the remaining pixel-to-pixel variations. It contains the additional feature, however, of allowing the user to select the spatial scale over which the pixel-to-pixel variations are measured. For sparse fields the variations can be measured over an entire image quadrant, as is done in pedsky. For images containing many or large sources, where the signal from the sources is the major contributor to the pixel-to-pixel variations on medium and large spatial scales, pedsub offers the option to compute the pixel variations in an unsharp-masked version of the image (the original image minus a median-filtered version of itself). The unsharp-masked image has the low spatial frequency source signal removed, effectively confining the statistics computation to high spatial frequencies only. Note that, in theory, the technique of minimizing the pixel variations at high spatial frequencies only should also prove effective when applied to images of sparse fields, and therefore could be universally applicable to all image types.

4. Pipeline Software Tests

A study has been conducted to evaluate the effectiveness of including the biaseq and pedsub algorithms in the standard NICMOS pipeline calibration program calnica.
The goal of this study was to determine whether the “blind” application of these correction algorithms in the non-interactive pipeline environment could produce a useful improvement (relative to the current pipeline) in the scientific quality of the resulting calibrated images, what fraction of images would see an improvement, and what fraction of images might actually be degraded by such a scheme.

For the purpose of this study a test version of calnica was implemented that contains the biaseq and pedsub corrections as additional calibration steps. This not only simplified the test procedures, but also proved the feasibility - from a software engineering point of view - of integrating the algorithms into calnica. In order to compare the relative effectiveness of the pedsky and pedsub algorithms, all calnica output images were also processed through pedsky without the pedsub correction applied in calnica.

The NICMOS images used for this test consisted of 95 datasets selected from the STScI archive, covering observation dates from July 1997 through October 1998, a range of exposure times and MultiAccum exposure sequences, short (F110W) to long wavelength (F205W, F237M) filters, as well as various source and field types, including regions of blank sky, sparse fields, bright stars, and large extended sources. All of the raw dataset image headers were updated to make use of the best reference files currently available in CDBS. The raw datasets were processed through the current standard calnica program (version 3.3), as well as the test version of calnica that includes the biaseq and pedsub corrections. The test version of calnica was applied twice to each image: once with the pedsub correction applied and once with the pedsky correction applied. The output images from the standard version of calnica were also further processed through both the pedsub and pedsky programs in order to assess the value of applying these corrections without having applied biaseq.
The test procedures therefore produced five complete sets of calibrated images, consisting of:

1. Normal calnica processing only

2. Normal calnica processing with the addition of pedsub

3. Normal calnica processing with the addition of pedsky

4. Calnica processing with biaseq and pedsub

5. Calnica processing with biaseq and pedsky

In order to apply the biaseq, pedsub, and pedsky corrections in an automatic fashion it is necessary to have the software make some decisions that would otherwise be made by a user when running the standalone versions of these tasks. In the case of biaseq the only decisions to be made are which readouts of the MultiAccum sequence to combine to form the model image of the scene, and what fraction of low and high outliers to reject at each pixel to produce a clean model image. In general it is advisable to use as many readouts as possible in this process, but it is also desirable to exclude any readouts with very short integration times (<1 sec), which occur at the start of every MultiAccum sequence and at the ends of “Multiple Initial and Final” (MIF) readout sequences. In the test version of calnica, logic was implemented in the biaseq step to choose all readouts other than these short ones. There are, however, two defined sequences in which all of the readout integration times are less than 1 second (“SCAMRR” and “MCAMRR”). For datasets using either of these sequences, the biaseq step in the test version of calnica uses all of the readouts to form the model image. Logic was also implemented to automatically set rejection limits for the readout combination process so that approximately the upper and lower quartiles of the data samples are rejected.

For pedsky the user is allowed to interact with the sky-fitting procedure, if desired, giving them the option of refining or overriding the procedure that automatically converges on the best combination of sky and bias values for an image.
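The automatic readout-selection rule described above for the biaseq step reduces to a few lines of logic. The sketch below is a minimal illustration; the function name and interface are hypothetical, not taken from calnica:

```python
def select_readouts(sample_times, min_time=1.0):
    """Pick the readouts used to build the biaseq model image:
    drop readouts shorter than `min_time` seconds, but if every
    readout is that short (the SCAMRR and MCAMRR sequences) fall
    back to using them all. Returns the selected readout indices.
    """
    keep = [i for i, t in enumerate(sample_times) if t >= min_time]
    return keep if keep else list(range(len(sample_times)))
```

For example, a sequence with two short initial reads followed by longer ones would keep only the long reads, while an all-short sequence would keep everything.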
For the purpose of this test, these interactive capabilities were disabled and the routine’s automatic sky-fitting procedures were used. The pedsub task gives an interactive user the ability to choose which type of image filtering scheme to use, which in turn sets the spatial scale over which the pixel-to-pixel variations will be measured and minimized. The default mode, which is what was used in these tests, is to use high spatial frequency information only.

One feature that is common to both the pedsky and pedsub tasks is the choice of the flatfield image to use when subtracting trial sky or bias values (since either the sky or bias signal will be modulated by the flatfield pattern). The default option, which is what was used for this test, is to simply use the flatfield image already named in the header of the science dataset and used by calnica to flatfield the calibrated images. There are certain circumstances (see below) in which it may be desirable to use an alternate image for this purpose.

5. Results

To date, we have performed only a qualitative assessment of the study results. A detailed quantitative study is necessary to evaluate any potential impacts of the pedestal correction algorithms on, for example, photometric accuracy. This quantitative study is in progress.

Qualitatively, every one of the test datasets showed a significant reduction in, if not complete removal of, the effects of “pedestal” in the final calibrated images when the biaseq and pedsky or pedsub routines were applied within pipeline processing. In the relatively small number of datasets that were not severely affected by the pedestal problem to begin with, the correction routines properly and automatically detected this and applied little or no correction. For most images, the greatest improvement came from the application of either the pedsky or pedsub corrections.
A smaller number of datasets saw dramatic improvements through the application of biaseq. As noted earlier, this is because the non-linear changes in bias level that biaseq removes are less prevalent than the linear component of the bias drift. In those datasets where biaseq did show a clear improvement, it was usually in the form of better cosmic-ray rejection. It should be noted, however, that no harm was ever observed due to the unnecessary application of the biaseq correction.

In images of blank or sparse fields, the results of pedsky and pedsub were comparable (see Figure 1 and Figure 2). This verified the assumption that for sparse-field images the use of high spatial frequency information only (in pedsub) produces results comparable to the pedsky approach of using information from all spatial scales. In images containing bright stars, large extended sources, or crowded fields, however, the pedsub technique produced clearly superior results (see Figure 3). In these cases the pedsky routine was incorrectly biased by the presence of large amounts of source signal.

The only cases in which the correction routines consistently failed to remove all of the pedestal signal were images taken through long-wavelength filters (e.g. F205W, F237M). In these images the background signal is dominated by telescope thermal emission, which, due to its difference in color compared to the internal calibration lamps, does not follow the same pattern as the standard flatfield images used by the correction routines. Even in these images, however, the pedestal signal was always significantly reduced; it just was not completely eliminated (see Figure 4).

6. Conclusions and Recommendations

Given that we have seen a significant improvement in all of the images used in our test, with no indication of any image ever coming out “damaged” by the process, we conclude that the data produced by an automatic OTFR system would be scientifically useful and a great improvement over the calibrated data produced by the current standard pipeline and contained in the DADS archive. The simple “steering” logic that was employed in our tests to guide the biaseq and pedsky/pedsub corrections is adequate to produce useful results in the vast majority of images. The test results also validate the assumption that the pedsub routine can be universally and safely applied to all types of data. While the biaseq correction does not appear to be absolutely necessary for every dataset, we have seen no cases in which its unnecessary application caused any problems.

Furthermore, the fact that both the pedsky and pedsub routines fail to remove all of the pedestal signature in some types of images when applied in a non-interactive, pipeline environment does not render the output product useless or force the user to interactively reprocess the dataset starting from the raw images. The standalone versions of the pedsky and pedsub routines are designed to accept as input the final, fully processed images produced by the calnica pipeline. Therefore, even if additional interactive processing is required to finish removing residual pedestal from an image, the calibrated output product from the pipeline can be used directly as input to either the pedsky or pedsub routine, without the need for complete reprocessing.
There are at least two options available for better handling of images taken through long-wavelength filters, where the use of the standard flatfield reference files in the pedestal correction routines does not produce completely satisfactory results. First, the simplest approach would be to turn off the pedestal correction step in the pipeline for images that use these particular filters, and leave it to the user to perform the pedestal correction on the fully calibrated images using an off-line, standalone version of the pedestal correction task and an appropriate sky background image. Second, suitable sky background images could be constructed and archived for the long-wavelength filters, and these images could be used by the pipeline pedestal correction routines instead of the standard flatfield images. Appropriate reference file keywords for such a sky background image already exist in NICMOS datasets, and therefore no additional header, keyword data base (KWBD), CDBS, or calibration software changes would be necessary to implement such a scheme.

The use of temperature-dependent dark reference files in the pipeline system would also benefit the quality of calibrated datasets. Given the logistics involved in creating, storing, and managing a database of 500-1000 static reference files, we recommend the alternative approach of using a task to build appropriate darks on the fly.

Finally, it is important to note that even if the pedestal correction routines are deemed too risky to implement in a pipeline system, it would still be worth the effort of building an OTFR pipeline for NICMOS simply to allow users to take advantage of the latest versions of the calibration software and reference data.
The NICMOS reference files and calibration software have both evolved and improved significantly over the operational lifetime of the instrument, and further improvements have come since the NICMOS end of life. For example, calnica now incorporates two new steps (zero-read signal correction and electronic bars removal) that were unavailable for part or all of the instrument’s operational lifetime, and significant improvements have been made to both the cosmic-ray rejection/readout combination step and the noise computation step. The calibration reference files, especially the darks, flats, and non-linearity corrections, have continued to improve with time, and substantial gains in data quality can be achieved simply by using the latest reference files. Therefore, even if no bias drift correction algorithms were added to the software and no temperature-dependent darks were implemented in the pipeline, it is nevertheless likely that a significant fraction of NICMOS data would benefit from the recalibration that would be an automatic product of OTFR. We therefore recommend the development of an OTFR capability for NICMOS data.

Figure 1: Processing results for a mostly blank field (camera 2, F110W). From top to bottom: normal processing, with biaseq and pedsub, with biaseq and pedsky. Pedsub and pedsky produce effectively identical results.

Figure 2: Processing results for a sparse field (camera 3, F160W). From top to bottom: normal processing, with biaseq and pedsub, with biaseq and pedsky. Pedsub and pedsky produce effectively identical results.

Figure 3: Processing results for an extended source (camera 2, F110W). From top to bottom: normal processing, with biaseq and pedsub, with biaseq and pedsky. Pedsky fails due to the presence of the large source.
Figure 4: Processing results for an image with thermal background emission (camera 2, F205W). The top image shows normal processing results, while the bottom shows the results with biaseq and pedsub applied. Pedsub has failed to accurately remove the “pedestal” signal due to the mismatch between the spatial variations in QE for the internal lamp flat and those for the thermal emission.