TECHNICAL REPORT

Title: STScI NIRSpec Calibration Pipeline Processing Description
Doc #: JWST-STScI-001859, SM-12
Date: 3 September 2009
Rev: -
Authors: T. Beck
Phone: 410-338-5038
Release Date: 24 November 2009

1.0 Abstract

NIRSpec observations will be carried out in three main observing modes: the multi-object spectroscopy (MOS) mode using the micro-shutter arrays (MSAs; referred to as MOS or MSA mode), the fixed slit (FS) observing mode for single object spectroscopy, and three-dimensional imaging spectroscopic observations made using the integral field unit (IFU). NIRSpec data reduction and calibration will be complex, particularly for the MOS observing mode, where spectra are acquired simultaneously for 100 or more targets. This document wholly supports and is consistent with the existing ESA NIRSpec data calibration documents, but provides more detail on how the NIRSpec data reduction pipeline for all science modes shall be carried out. At the end of this document, we have included a section on ‘open issues’, which highlights issues related to the NIRSpec data reduction and calibration that require further investigation. We plan to update this document several times during the implementation of the data calibration pipeline, as the detailed algorithms are defined for specific reduction steps and analysis methods are determined more conclusively.

2.0 Introduction

In this document, we outline the processing steps for the reduction of NIRSpec data in all modes. Two key companion documents that are referenced throughout the text are:

NIRSpec Calibration Plan, by De Marchi et al. Document ID = ESA-JWST-PL-2959

NIRSpec Science Data Pipeline Inputs and User Processing Requirements, by De Marchi et al. Document ID = ESA-JWST-RQ-2961 (referred to as the “Data Processing Requirements” document in all subsequent text).
When taken in combination, these existing documents provide a very detailed description of why the NIRSpec data reduction methods have been defined and adopted. This present document supports and expands upon the Calibration Plan and Science Requirements documents, but goes into more detail and emphasizes how the data might be reduced in the calibration pipeline. We do not outline why the proposed reduction methods were adopted, and we also do not discuss general data reduction philosophies that might be applicable to all JWST instruments.

Operated by the Association of Universities for Research in Astronomy, Inc., for the National Aeronautics and Space Administration under Contract NAS5-03127. Check with the JWST SOCCER Database at: http://soccer.stsci.edu/DmsProdAgile/PLMServlet To verify that this is the current version. JWST-STScI-001859 SM-12

NIRSpec data reduction and calibration will be complicated. This document discusses the data reduction for the initial processing from data ramps to count rate images (section 3), MSA mode observations (section 4), fixed slit (FS) mode (section 5) and IFU mode observations (section 6). We also discuss the potential format of, and information included in, reference files (although final formats are TBD and should be discussed), we propose some initial structures for the pipeline output data products, and we outline what may be needed for post-pipeline processing and analysis tools (e.g., “Next Level” reduction and analysis).

All steps in the data reduction process are laid out in detail in the MOS data reduction section. For the FS and IFU modes, many of the processing steps are very similar (or effectively identical) to the MOS data reduction. As a result, the steps for the FS and IFU modes that differ from the MOS reduction are presented in detail, and steps that are identical refer back to the description in the MOS section.
Also included in this document is a description of data reduction steps for processing NIRSpec “images” – these are frames acquired with the grating mirror in place, used for target acquisition and source placement verification (section 7). Reduction of these data will consist of general imaging mode processing, and will likely be similar to the reduction of NIRCam or TFI images. Also described in section 8 are proposed “Next Level” processing steps for NIRCam data, for images that have been acquired specifically for NIRSpec pre-imaging purposes. In this document, we identify the outputs from the NIRSpec data reduction and calibration process. As discussed at the JWST Calibration summit meeting in March 2009, we call these outputs “Browse Quality” at this point in the pipeline description. This phrase was adopted to describe the processing outputs from the early incarnation of the reduction pipeline. Of course, the data pipeline and calibration process should output optimal Science Quality data whenever possible. But we do note that in the early stages when problems and bugs in the pipeline processing are being worked out, the outputs are all likely to be of “Browse Quality”, not science and publication quality by most standards. As the pipeline processing improves, we hope that the Browse outputs will be replaced by properly calibrated science grade outputs. Unless otherwise stated, it is assumed throughout this document that we are describing the reduction process for sets of NIRSpec spectra that were acquired at the same time, within a single visit. As the details of specific data processing algorithms are determined, and as studies are done to clarify issues that are listed as “TBD”, this document will be updated to a new version (e.g., see Section 9 on ‘Open Issues’). 
3.0 NIRSpec Initial Processing Steps (CALWebb)

The JWST HAWAII-2RG near-infrared detectors will all use similar processing steps to reduce the raw detector MULTIACCUM datacubes (ramps) into count-rate images. See the NIRSpec Operations Concept document (OCD) for further description of the raw NIRSpec data. In this section, we describe the details of the reduction steps that will be adopted to execute this initial data processing. This portion of the JWST data reduction pipeline has (temporarily) been dubbed the “CALWebb” processing. Figure 1 presents the data reduction flow chart diagram for the CALWebb pipeline, and the following sections describe each step. At the time of the JWST multi-instrument data calibration summit in early April 2009, the order of the processing steps in this flow chart was consistent for all instruments (including MIRI). As we learn more about the characteristics of the flight detectors, the order of the processing steps may shift and the algorithms may vary, but the key goals of each task should not change appreciably. It should be noted that the CALWebb pipeline will operate on data from each of the two NIRSpec SCAs separately, and each SCA should have its own calibration reference files and supporting pipeline information.

Figure 1: The data reduction flow-chart for the initial processing steps, from raw datacubes to countrate images ("CALWebb").

The first sub-section in the description of the CALWebb processing includes a note on the raw data file structures that will be inputs into the data reduction and calibration pipeline processing steps.
The CALWebb flow chart and discussion for NIRSpec present two data processing steps which we hope will be adopted for the final reduction process, but these tasks are not as well defined as others and they may not be included in the flow charts for all of the near-infrared instruments. These two tasks are the Interpixel Capacitance (IPC) deconvolution and the Latent Image Correction.

3.1 NIRSpec Raw Data File Structures

The .fits file format is the known file structure for all JWST data, and all data in a single exposure from a single SCA should be stored in a single file (Kriss 2004). The separate frames or groups that comprise the final exposure will be stored as image planes in a three-dimensional datacube. In this case, the data array would have dimensions NROWS x NCOLUMNS x NGROUPS, where NROWS and NCOLUMNS designate the readout pixel array size or subarray size (in full-frame readout for all H2RG detectors, NROWS = NCOLUMNS = 2048 pixels). However, the further details of the JWST data format have not been finalized yet. For example, for exposures that have multiple integrations in order to avoid saturation on a bright target, the data array would have dimensions NROWS x NCOLUMNS x NGROUPS x NINTS. So, the raw .fits file that is input into the pipeline may have a 4-dimensional structure, with one datacube for each integration in the exposure. It is yet to be determined what the final exposure data structure will look like, and whether pixel data in the image planes may be reordered as the recorded science data is transformed into raw exposure data. In the following processing steps, we note where information on subarray processing will be important (e.g., in the CALWebb and FS processing).
We have deferred the discussion of pipeline reduction of multiple integrations in a single exposure (NINTS) to a future version of this document, once the final JWST file structures are adopted for this observing strategy. This is included in Section 9 on “open issues”.

3.2 Inter-pixel Capacitance Deconvolution

Description: The NIRSpec detectors have capacitive coupling of the pixels. The most obvious effect of this is the cross-like or “+-shaped” pattern of charge around the “hot” pixels. Effectively, the observed data array (A’) is the convolution of the true image array (A) with the IPC kernel (k) (McCullough 2008). Deconvolution of the observed data array by the measured IPC kernel will allow for the extraction of the true image, unaffected by this characteristic of the detector. McCullough (2008) describes a method to execute this deconvolution for the WFC3 H1RG detector, and proposes that the optimum correction is achieved if the deconvolution is performed on data in the first step of the pipeline reduction.

Input Data: The raw, unprocessed datacube with dimensions of 2048x2048 x ngroups, where ngroups is the number of up-the-ramp groups in the acquired data (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Files: The IPC kernel, k. k is a 3 by 3 matrix described by the values α and β, which quantify the capacitive coupling of the center pixel’s charge to each of its adjacent neighbors along columns (α) or rows (β) (McCullough 2008). The four corner elements of the IPC kernel are typically zero (or very close to zero), and the sum of all elements is 1.

Example IPC deconvolution kernel:

$$
k = \begin{pmatrix} 0 & \beta & 0 \\ \alpha & 1 - 2\alpha - 2\beta & \alpha \\ 0 & \beta & 0 \end{pmatrix}
$$

For most HAWAII-2RG detectors, α = β, and work on multiple detectors has shown that α is approximately 0.015.
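To make the kernel definition concrete, the following minimal Python sketch builds k from α and β, applies it to a small synthetic image containing one hot pixel, and then inverts the convolution in Fourier space. The function names, the FFT-based inverse, and the periodic boundary handling are assumptions made for illustration only; this is not the method of McCullough (2008) and not a proposed pipeline implementation.

```python
import numpy as np

def ipc_kernel(alpha, beta):
    """Build the 3x3 IPC kernel in the layout shown above (elements sum to 1)."""
    return np.array([[0.0,   beta,                            0.0],
                     [alpha, 1.0 - 2.0 * alpha - 2.0 * beta,  alpha],
                     [0.0,   beta,                            0.0]])

def _kernel_fft(kernel, shape):
    """FFT of the kernel, zero-padded to `shape` and centred on pixel (0, 0)."""
    padded = np.zeros(shape)
    padded[:3, :3] = kernel
    # Shift the kernel centre to the array origin so the image is not translated.
    padded = np.roll(padded, (-1, -1), axis=(0, 1))
    return np.fft.fft2(padded)

def ipc_convolve(image, kernel):
    """Simulate IPC: circular convolution of the true image with the kernel."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * _kernel_fft(kernel, image.shape)))

def ipc_deconvolve(image, kernel):
    """Undo IPC by dividing in Fourier space (assumes a noiseless, known kernel)."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) / _kernel_fft(kernel, image.shape)))

# Synthetic example: a single hot pixel of 1000 counts on an 8x8 frame.
true_image = np.zeros((8, 8))
true_image[4, 4] = 1000.0
k = ipc_kernel(0.015, 0.015)
observed = ipc_convolve(true_image, k)      # shows the "+" pattern around the hot pixel
recovered = ipc_deconvolve(observed, k)     # recovers the single hot pixel
```

For α = β = 0.015 the Fourier transform of the kernel never falls below 1 − 4α − 4β = 0.88, so the division is well conditioned; whether this remains acceptable with real noise and an uncertain kernel is one of the open questions noted in this section.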
The IPC kernels will need to be determined for both of the NIRSpec flight detectors, and we must verify the nature of the IPC kernel, i.e., that α = β and that the corner kernel components do equal zero.

Output Data: The IPC corrected datacube with dimensions 2048x2048 x ngroups. The cross or “+” shapes around hot pixels are corrected and all subsequent data will accurately reflect the true detector characteristics (such as for gain calculations).

NOTE: Many aspects of the IPC deconvolution step need to be finalized before the decision to implement this processing step is made. These include: 1) verifying that the deconvolution of the IPC kernel improves the noise statistics and accuracy of the data reduction, 2) verifying the IPC kernel shape, and whether it can be well described by the α and β parameters or if the full 3 x 3 matrix should be used for the deconvolution, 3) how stable the IPC kernel is in time, 4) whether uncertainties in the IPC kernel are ever significant in the deconvolution process, and 5) whether deconvolution of the IPC kernel should be the first processing step in the detector reduction, or if it would be better placed elsewhere in the reduction flow.

3.3 Flag Bad Pixels

Description: This task will create the uncertainty and data quality (UNC and DQ) image cubes and copy flag values from the static bad pixel mask file to the DQ image. This uses the bad pixel mask reference file, which contains an image array for known bad (hot or cold) pixels. The flag value may vary depending on the type of bad pixel (consistency in flag values between JWST instruments is desired). There will be one bad pixel mask file for each of the two NIRSpec detectors. Besides the truly “defective” bad pixels included in the reference images, other bad pixels may be flagged. For example, pixels that are saturated or have high flux levels and might show latency in the next accumulated detector image should also be identified. Reference pixels can also be flagged as bad.
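The flagging step described above can be sketched in a few lines of Python. The bit values, the function names, the (ngroups, nrows, ncolumns) axis order, and the simple threshold test for saturation are all hypothetical choices made for illustration; the actual JWST flag definitions and file layout are not specified in this document.

```python
import numpy as np

# Hypothetical DQ bit values (assumed for this sketch only).
DQ_HOT = 1
DQ_COLD = 2
DQ_SATURATED = 4

def flag_bad_pixels(cube, bad_pixel_mask, saturation_level):
    """Create UNC and DQ cubes for a ramp of shape (ngroups, nrows, ncolumns).

    bad_pixel_mask   : 2-D integer image of static flags for this detector
    saturation_level : scalar or 2-D image of saturation thresholds
    """
    unc = np.zeros(cube.shape, dtype=np.float32)
    dq = np.zeros(cube.shape, dtype=np.uint16)
    # Copy the static flags into every group (the 2-D mask broadcasts over groups).
    dq |= bad_pixel_mask.astype(np.uint16)
    # Flag any sample at or above the saturation threshold.
    dq[cube >= saturation_level] |= DQ_SATURATED
    return unc, dq

def latency_map(dq):
    """2-D map of pixels that saturated in at least one group of the ramp."""
    return np.any(dq & DQ_SATURATED, axis=0)
```

Keeping the flags as independent bits lets a pixel be both statically bad and saturated, and the 2-D map returned by `latency_map` is one possible form for the record of saturated pixels that a later latency correction would need.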
Input Data: The IPC corrected datacube of 2048x2048 x ngroups dimensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Files: The bad pixel mask image for each NIRSpec detector with dimensions 2048 x 2048, and a saturation reference image for each detector with dimensions 2040 x 2040.

Output Data: The datacube of 2048x2048 x ngroups (or nrows x ncolumns x ngroups, when subarrays are used for FS science), with three data extensions. The file consists of the bad pixel corrected data image cube in the first extension, followed by the Uncertainty image cube (UNC) and the Data Quality datacube (DQ).

Output Reference File: When applicable, a second output of this procedure is a 2040x2040 image (or nrows x ncolumns x ngroups, when subarrays are used for FS science), which flags the pixels that had high flux or were saturated (and perhaps records the count value?). This image output would keep the record of latency for the next detector image (see section 3.7). For this, the generic term of output Latency Map is adopted.

3.4 Reference Pixel Correction

Description: A 4 pixel wide border of reference pixels surrounds the 2040x2040 light sensitive pixels in the NIRSpec detectors. These pixels are used to correct slow bias drift. The optimal method for correcting the bias offset sampled by the reference pixels is TBD; most detector groups seem to do this correction in a different manner. The proper method to be adopted for correcting reference pixels will be the topic of further study.

Input Data: The bad pixel masked datacube image of 2048x2048 x ngroups dimension, with corresponding UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).
Input Reference Files: None

Output Data: Reference pixel corrected datacube image of dimension 2040x2040 x ngroups, with corresponding and updated UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

3.5 Linearity Correction

Description: Correct the up-the-ramp image datacube for the effects of detector non-linear flux response.

Input Data: The reference pixel corrected datacube image of 2040x2040 x ngroups dimension, with corresponding UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Information: This task requires coefficients for the linearity correction equation for each of the two NIRSpec detectors (assumed to be a ~3rd order polynomial, but flexibility should exist in case a different order polynomial is adopted). The input might be a single set of reference coefficients for each of the two NIRSpec detectors. The reference file should have a header parameter that specifies the function that should be used to interpret the coefficients. At this time, only a polynomial transformation needs to be implemented, but flexibility should exist in the format of the reference file in case the means for correcting the linearity evolves. Though it is presently thought unnecessary, it is TBD if multiple sets of coefficients may be needed, such as one set of coefficients for each of the 2040x2040 detector pixels. If linearity coefficients are required for every pixel, then the input reference file structure might be two input image array datacubes of dimension 2040x2040x~4, where the first two dimensions represent each pixel coordinate on the detector, and the third dimension holds the polynomial coefficients for the linearity correction.
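As a sketch of how the two coefficient layouts discussed above could share one code path, the following Python function evaluates a polynomial correction with either a single coefficient set per detector or one set per pixel. The function name, the lowest-order-first coefficient ordering, and the (ngroups, nrows, ncolumns) axis order are assumptions for illustration; the actual functional form is to be specified by the reference file.

```python
import numpy as np

def linearize(cube, coeffs):
    """Evaluate sum_i coeffs[i] * cube**i using Horner's rule.

    cube   : ramp of shape (ngroups, nrows, ncolumns)
    coeffs : shape (ncoef,) for one set per detector, or
             shape (ncoef, nrows, ncolumns) for one set per pixel;
             coeffs[0] is the constant term (assumed ordering).
    """
    cube = np.asarray(cube, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    result = np.zeros_like(cube)
    # Work from the highest-order coefficient down; per-pixel coefficient
    # planes broadcast against every group of the ramp.
    for c in coeffs[::-1]:
        result = result * cube + c
    return result
```

Because the polynomial order is taken from the length of `coeffs`, a change of order only requires a new reference file, which is the kind of flexibility the text asks for.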
Output Data: Linearized flux datacube image of dimension 2040x2040 x ngroups, with corresponding and updated UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science). The pipeline must apply the linearity correction to both the measured pixel values and the corresponding uncertainties, recording the result in an extension in the output file.

3.6 Dark Subtraction

Description: Correct the science image datacube for dark current by subtracting a corresponding dark image cube.

Input Data: The linearized science image datacube of dimension 2040x2040 x ngroups, with corresponding UNC and DQ image cubes (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Files: A dark image cube that has been processed through all previous steps (IPC corrected, reference pixel subtracted, linearized). The current plan is for acquisition of multiple NIRSpec detector dark images during long, ~10,600 s parallel observations using the NIRSpec NRSRAPID readmode (ngroups = 1000; e.g., see the NIRSpec OCD). Thus the input dark image cube will likely be of dimension 2040x2040x1000. The region of the 3D dark datacube image used to correct the science data can be extracted from this large dark cube. If the science data were acquired in the NRS or NRSSLOW read modes, which involve averaging or dropping frames in the readout pattern, then the input dark datacube may need to be processed to produce the same data structure and signal/noise characteristics.

Output Data: Dark-subtracted science datacube image of dimension 2040x2040 x ngroups, with corresponding and updated UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science). Each dark rate in the dark image cube has an associated uncertainty that will be propagated by the pipeline during processing.
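The dark subtraction and its uncertainty propagation can be illustrated with a short Python sketch. It assumes science data taken in NRSRAPID mode, so that science group i corresponds directly to dark group i, and it assumes the science and dark errors are uncorrelated so they add in quadrature. The function name and the (ngroups, nrows, ncolumns) axis order are hypothetical.

```python
import numpy as np

def subtract_dark(sci, sci_unc, dark, dark_unc):
    """Subtract the matching groups of a long dark cube from a science ramp.

    All arrays have shape (ngroups, nrows, ncolumns); the dark cube may
    contain more groups than the science ramp (e.g. 1000).
    """
    ngroups = sci.shape[0]
    if dark.shape[0] < ngroups:
        raise ValueError("dark cube has fewer groups than the science ramp")
    # Extract the region of the dark cube that matches the science ramp.
    dark_cut = dark[:ngroups]
    dark_unc_cut = dark_unc[:ngroups]
    out = sci - dark_cut
    # Quadrature sum: valid only if the two error terms are independent.
    out_unc = np.sqrt(sci_unc ** 2 + dark_unc_cut ** 2)
    return out, out_unc
```

For the NRS or NRSSLOW read modes, the dark cube would first have to be averaged or sub-sampled with the same readout pattern before a call like this one; that step is not shown.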
3.7 Image Latency Correction

Description: If the SCA was exposed to bright illumination in preceding science frames, provide a first-order correction to the present science image datacube for latency effects. To do this, accurate knowledge of the NIRSpec detector pixel charge trap structure and the exponential decay characteristics of the latency is needed. Additionally, the existence of this task implies that there is a method in place to track images and access the saturation and high flux maps for the data image acquired immediately prior to this science data. The implementation of this correction could be tricky and complicated (e.g., see Regan et al. 2009), and the details are TBD.

Input Data: The dark subtracted science datacube of dimension 2040x2040 x ngroups, with corresponding UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Files: A latency map of the saturated and high flux pixels flagged from the previously acquired image or images is needed (i.e., the output image from the bad pixel correction step and/or linearity correction, sections 3.3 and 3.5). Additional information on the exponential decay of the latency and a 2040x2040 image of the pixel charge trap characteristics will also be needed for each detector for this correction to be applied. (Details are TBD.)

Output Data: A latency corrected science datacube image of dimension 2040x2040 x ngroups, with corresponding and updated UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

NOTE: A number of issues need to be resolved before this latency correction method can be implemented. These include: 1) How easy or feasible is it going to be to track latency from one image to the next?
2) Will implementation of this require a whole new system of updatable calibration reference files for every exposure that is taken? 3) How accurate is it to flag previously saturated pixels and attempt a latency correction without knowing details of the charge trap maps for a detector? (E.g., derivation of charge trap maps is not something that is being tested for the NIRSpec flight SCAs during DS testing at Goddard.) Many questions need to be resolved before this can be adopted. In the early operations of JWST, mitigation of the effects of saturation will likely be best done with dithered observations.

3.8 Cosmic Ray Cleaning and Collapse to Rate Image

Description: Identify and remove hits by cosmic rays in the linearized, up-the-ramp datacubes, and collapse the cube into a cleaned, 2-dimensional count rate image. Cosmic rays can be flagged and removed as spurious outliers in the datacube by analyzing the slopes of each of the pixel ramps. The slope of the data ramps before and after the cosmic ray hit should be the same. The best estimate for the true ramp slope is found by analyzing the data to determine the slope and y-intercept in the intervals before and after the cosmic ray hit and taking a weighted mean of the two fits. Fixsen et al. (2000) and Regan (2007) showed the benefits of using optimum weighting analysis based on the signal-to-noise to determine the best slopes of the data ramps. The count rate images for each detector are constructed using the best slope fit for each pixel, with the appropriate factors for the detector gain and the group time used to scale the image to units of electrons/second.

Input Data: The dark subtracted and (where applicable) latent corrected science datacube image of 2040x2040 x ngroups dimensions, with corresponding UNC and DQ extensions (or nrows x ncolumns x ngroups, when subarrays are used for FS science).

Input Reference Information: Header keyword information on detector gain and group time.
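The segment-fitting idea in the description above can be illustrated for a single pixel ramp. This Python sketch flags a cosmic ray as an outlier in the group-to-group differences, fits a straight line to each uninterrupted segment, and combines the segment slopes with a weighted mean. The outlier threshold, the use of the median absolute deviation, and the weighting by the time leverage of each segment (appropriate for uniform, uncorrelated read noise) are simplifying assumptions; this is not the optimum weighting scheme of Fixsen et al. (2000) or Regan (2007).

```python
import numpy as np

def fit_ramp_with_jumps(ramp, group_time, gain, threshold=5.0):
    """Return (rate in electrons/second, indices of groups hit by a jump).

    ramp       : 1-D array of linearized counts (DN), one value per group
    group_time : time between groups in seconds
    gain       : detector gain in electrons/DN
    """
    ramp = np.asarray(ramp, dtype=float)
    times = np.arange(ramp.size) * group_time
    diffs = np.diff(ramp)
    med = np.median(diffs)
    # Robust scatter estimate; the small floor avoids a zero threshold
    # for noiseless test ramps.
    sigma = max(1.4826 * np.median(np.abs(diffs - med)), 1e-6)
    # A large difference between groups d and d+1 marks group d+1 as hit.
    jumps = np.flatnonzero(np.abs(diffs - med) > threshold * sigma) + 1
    slopes, weights = [], []
    for idx in np.split(np.arange(ramp.size), jumps):
        if idx.size < 2:
            continue  # a single group carries no slope information
        t = times[idx]
        slopes.append(np.polyfit(t, ramp[idx], 1)[0])   # DN per second
        weights.append(np.sum((t - t.mean()) ** 2))     # inverse slope variance, up to a constant
    rate_dn = np.average(slopes, weights=weights)
    return rate_dn * gain, jumps

# Example: 10 DN/group ramp, 10 s groups, with a 500 DN cosmic ray at group 5.
ramp = [0, 10, 20, 30, 40, 550, 560, 570, 580, 590]
rate, hit_groups = fit_ramp_with_jumps(ramp, group_time=10.0, gain=2.0)
```

Applying such a function to every pixel of the cube yields the 2-dimensional count rate image; a production version would be vectorized and would also propagate the UNC and DQ extensions.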
Output Data: The 2040x2040 2-dimensional count rate image in units of electrons/second, with corresponding and updated UNC and DQ extensions (or nrows x ncolumns, when subarrays are used for FS science).

4.0 NIRSpec MOS Data Reduction (MSA Mode)

As outlined in the NIRSpec Calibration Plan and Data Processing Requirements documents, the data reduction for MSA mode observations is more complicated than IFU or FS data reduction because the pipeline will need to rely on an instrument model to correct for throughput, including part of the flat field correction. The pixel-to-pixel flat (P-flat) fielding is decoupled from the instrument model and the low frequency flat (L-flat) correction. At the present time, it is assumed that the P-flat correction is wavelength insensitive, or only mildly dependent on wavelength. The MOS flat field data correction is thus broken into multiple steps: the P-flat is applied to the full detector image, and the L-flat and throughput correction will be best done once the data have been extracted into individual 2-D spectra for each open MSA shutter. Note that, as described in the Data Processing Requirements document, the character of the P-flat correction may be wavelength dependent. The optimal placement of the P-flat correction step in the data reduction pipeline is awaiting further information on the wavelength dependent character of the P-flats from the NIRSpec flight detectors. If the P-flat is wavelength dependent, then this correction will likely need to be incorporated in the throughput calibration in some manner (this is TBD). NIRSpec MSA data will always be acquired in full-frame readout; no subarrays will ever be used.

4.1 Data Combine

Description: Combine data images of targets that were acquired at the same nominal position within the same MSA shutter.
This would probably include combining data that were acquired within the same visit only, though perhaps taken after multiple guide star acquisitions (under the assumption that the NIRSpec target acquisition will need to be repeated every ~10000 seconds because of the need to re-point the high gain antenna). For data combination at this point in the reduction process, the number of input images may be linked to specific and fixed dithering strategies that are defined by the GOs in the APT.

Input Data: Collapsed count rate images of dimension 2040x2040, with corresponding UNC and DQ extensions. Several images will be taken as inputs, depending on the number to be combined.

Input Reference Files: Header keywords that verify the offset pattern may be needed.

Output Data: A single, combined output count rate image of dimension 2040x2040, with corresponding and updated UNC and DQ extensions.

NOTE: It will be important to have the option to combine spectra prior to the flat fielding steps. However, not all data will be acquired in a manner which allows for data combination at this early stage in the pipeline processing. It is still TBD if data combining at this point should be somehow merged into an automated pipeline, or if this should only be an option available to users who want to re-process their data in this manner.

4.2 Background Subtract

Description: Subtract the background flux from a MOS science target spectrum using a background spectrum that was taken nearby in time and through the exact same MSA slit. Subtracting background in this early stage of the reduction is a ‘ground-based’ observing bias to remove high sky flux background prior to the flat fielding step.
While we do not have the worry of high and variable sky flux, background subtraction may be appropriate here for bright NIRSpec targets that are photon noise dominated. The optimal implementation of background subtraction for the faintest targets may depend on detector characteristics such as correlated noise on spatial scales that span several MSA slits. It is possible that the signal-to-noise on very faint sources can be improved if background subtraction is done at a later point in the reduction, using background spectra that were acquired nearby but not in the same MSA slit as the science target. Further work is TBD for optimal MOS background subtraction strategies. Input Data: A count rate file of the science target image (or list of target image files) that has dimension 2040x2040 with UNC and DQ extensions, with its corresponding background file (or list of background files). For background subtraction at this point in the data reduction process, the number of input images and the number and ordering of the corresponding background images will likely be linked to specific and fixed dithering strategies defined by the GO in the APT. (Note that an image file that has a target in one shutter may have background for a different shutter. Hence, a given .fits file may appear at different positions in both the target image list and the background image list, depending on the ordering of the dither pattern). Input Reference Information: Header keyword information on dither pattern sequence and possibly keyword information propagated from the APT on whether shutters contain a target or are background. Output Data: Background-subtracted count rate image of dimension 2040x2040, with corresponding and updated UNC and DQ extensions. After this processing step, the MSA slits which have positive target spectra will be processed further by the pipeline. 
The slits that were used for background should not be processed, because the data are likely affected by the one-to-one image subtraction procedure.

NOTE: It will be important to have the option to subtract background from spectra prior to the flat fielding steps. However, not all data will be acquired in a manner which allows for background subtraction at this early stage in the pipeline processing. It is still TBD if background subtraction at this point should be somehow merged into an automated pipeline, or if this should only be an option available to users who want to reprocess their data in this manner.

Figure 2: The flow-chart diagrams for MSA Mode Data Reduction

4.3 P-Flat Correction

Description: Correct the NIRSpec full-frame detector data for pixel-to-pixel flat field variations. It is presently assumed that the pixel-to-pixel flat field variations of the NIRSpec flight detectors will generally be wavelength insensitive, or only a weak function of wavelength. As a result of this assumption, the P-flat correction should be applied directly to the full-frame images to remove sensitivity variations of the detector over small scales.

Input Data: The NIRSpec count rate science images of size 2040x2040 pixels, with corresponding UNC and DQ extensions.

Input Reference Files: The NIRSpec P-Flat images of size 2040x2040 pixels, with corresponding UNC and DQ extensions.

Output Data: P-flat corrected NIRSpec science images of size 2040x2040 pixels, with corresponding UNC and DQ extensions.
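A minimal Python sketch of the P-flat step is given below, showing how the UNC and DQ extensions of both the science image and the flat could be carried through the division. The flag value `DQ_NO_FLAT`, the function name, and the choice to leave pixels without a usable flat uncorrected are assumptions for illustration; first-order error propagation with independent science and flat errors is also assumed.

```python
import numpy as np

DQ_NO_FLAT = 16  # hypothetical flag: no usable flat-field value for this pixel

def apply_pflat(sci, sci_unc, sci_dq, flat, flat_unc, flat_dq):
    """Divide a count rate image by the P-flat and propagate UNC and DQ."""
    good = (flat > 0) & (flat_dq == 0)
    # Where the flat is unusable, divide by 1 so the pixel passes through unchanged.
    safe_flat = np.where(good, flat, 1.0)
    safe_flat_unc = np.where(good, flat_unc, 0.0)
    out = sci / safe_flat
    # First-order propagation for a quotient of independent quantities.
    out_unc = np.sqrt((sci_unc / safe_flat) ** 2
                      + (sci * safe_flat_unc / safe_flat ** 2) ** 2)
    out_dq = sci_dq.copy()
    out_dq[~good] |= DQ_NO_FLAT
    return out, out_unc, out_dq
```

Flagging rather than masking the unusable pixels keeps the output image at the full 2040x2040 size, so later steps can decide how to treat them.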
4.4 2-D Slit Extraction Description: Extract full frame images into 2-D data sub-image (windows), with one MSA slit spectrum per extracted 2-D data sub-image (See Figure 3). While it is still TBD how and when the extraction of individual MSA slit spectra into 2-D sub-windows will be handled in the NIRSpec pipeline, we emphasize that extraction of the MSA spectra into smaller slit spectra is very important. This will allow for easier processing on each individual spectrum, and the manner of processing the 2-D extracted data can be identical to long-slit data processing for many of the subsequent steps. The extracted window will encompass all data from the main centered slit, but it may also contain small regions of spectra from slits near to the target slit. If the data already had background subtraction executed in the previous steps, then only the slits with target data should be extracted into smaller 2-D image sub-windows. If background subtraction has not yet been done, both the target and background slit data should be extracted. Each MSA shutter will have associated x & y pixel coordinates which map out the spectral extraction box. For each grating+filter combination, every MSA shutter will map to its own unique 2-D image extraction sub-window. Input Data: P-Flat corrected data image of 2040x2040 size, with corresponding UNC and DQ extensions. Input Reference Files and Reference Information: The reference file and reference information will need to define the extraction box location pixel coordinates. To define the extraction box pixel coordinates, a reference file for each MSA quadrant might be used (as described below). A set of polynomial equations might be used to generate the x and y pixel positions of the extraction box. 
If the polynomial approach is adopted, it would be assumed that the reference information would include coefficients for the polynomial calculation, and the x and y pixel extraction box locations would be calculated by the pipeline for every MSA shutter to be extracted. (Tracy’s note: For the record, I don’t like the polynomial approach, because it means that x and y pixel locations might be extracted differently on different computer platforms; e.g., reducing data on one machine with a flat field generated on a different machine might not work if pixel values are calculated differently because of computer or platform settings. This also implies many redundant calculations for multiple observations through an identical MSA configuration.) The exact format of the pixel extraction reference information is TBD, but below we describe a file format structure that could be used.

An example of the reference file structure for MSA shutter spectrum extraction might be a file with four image extensions, each an image array datacube of real numbers with dimensions of 365x171x4. Each datacube array maps out the spectral extraction boxes for one MSA quadrant. The 365x171 dimensions of these reference datacubes correspond to the dimensions of the MSA quadrants. The third dimension of the datacube holds the pixel values that define the extraction box. Because there are four MSA quadrants in NIRSpec, a separate extraction datacube of this nature is needed for each quadrant. Four pixel values are needed to define the pixel coordinate image extraction locations.
For example, Figure 3 presents the extraction box location for an MSA slit spectrum, with xα, yα as the pixel coordinates of the lower left corner of the extraction box, and xβ, yβ as the pixel coordinates of the upper right corner. The two pixel locations, (xα, yα) and (xβ, yβ), define the full extraction sub-image location, and make up the four planes of the z dimension of the reference image. For a reference image with x dimension index $n = 1 \rightarrow 365$ and y dimension index $m = 1 \rightarrow 171$:

z dimension plane 1: $x_{\alpha}(n, m)$
z dimension plane 2: $y_{\alpha}(n, m)$
z dimension plane 3: $x_{\beta}(n, m)$
z dimension plane 4: $y_{\beta}(n, m)$

One reference pixel extraction datacube will be needed for each of the four MSA quadrants, and these are included in the reference file as the four datacube extensions. Additionally, reference datacube files are needed for each of the spectral configurations, for a total of 9 reference files and 36 datacube extensions (one extension for each of the four MSA quadrants for the prism, G140M+F070LP, G140H+F070LP, G140M+F100LP, G140H+F100LP, G235M, G235H, G395M and G395H spectral modes). The reference file must be able to distinguish between pixel locations on the two NIRSpec detectors (spectra in the R=2700 mode will extend over both detectors).

Figure 3: The pixel coordinates necessary to define the boundaries of the 2-D sub-image extraction boxes: xα, yα, xβ and yβ.

This task will ultimately need access to either header keyword information or calibrated sky coordinates propagated from the APT that define which shutters contain a target and which are background.
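The lookup described above can be sketched as follows: a hypothetical 365x171x4 reference cube for one quadrant is indexed by shutter position, and the four plane values are used to cut the sub-image out of the full frame. The function name `extract_slit`, the 0-based pixel convention, the inclusive box corners and the single-detector frame are assumptions for illustration only, since the report leaves the reference format TBD.

```python
import numpy as np


def extract_slit(image, box_cube, n, m):
    """Cut the 2-D sub-image for shutter (n, m) out of a full-frame array."""
    # Planes 0..3 hold x_alpha, y_alpha, x_beta, y_beta for each shutter.
    x_a, y_a, x_b, y_b = (int(round(float(v))) for v in box_cube[n, m, :])
    # numpy indexes [row, column] = [y, x]; both box corners are included.
    return image[y_a:y_b + 1, x_a:x_b + 1]


# Hypothetical reference cube for one MSA quadrant (values are made up).
box_cube = np.zeros((365, 171, 4))
box_cube[10, 20, :] = (100.0, 500.0, 611.0, 511.0)

sci = np.arange(2040 * 2040, dtype=float).reshape(2040, 2040)
sub = extract_slit(sci, box_cube, n=10, m=20)
print(sub.shape)  # (12, 512)
```

The same call would be repeated with the UNC and DQ planes so that each science sub-image keeps matching uncertainty and data quality windows.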
Output Data: Multiple 2-dimensional spectral slices that are x_spec x y_spatial in size, where the x dimension is the spectral dispersion dimension (x_spec) and y is the pixel length in the cross dispersion direction that the MSA spectrum extends over (y_spatial). Because of distortion and spectral curvature, the y_spatial dimension of each 2-D extraction window will be larger than the ~4 pixel length of the undispersed images of the MSA slits. The x_spec dimension depends upon the grating used and the wavelength region being sampled. In the discussion that follows, these extracted 2-D spectral images are referred to as sub-images. The 2-dimensional spectral sub-images may be organized into a single data file using multiple extensions. The number of science extensions will depend on whether only the target data are extracted (# of science extensions = # of MSA targets), or whether the target and background slits are both extracted (# of science extensions = # of open MSA slits). Each extracted science spectral data sub-image should also have its associated extracted UNC and DQ extensions.

4.5 Complete Throughput and L-Flat Correction

Description: Correct the spectra for throughput and low-frequency flat field variation, including all effects from the transmission of the optics and a default chromatic slit-loss correction. If the P-flat does turn out to be a slowly varying function of wavelength, then this P-flat wavelength dependence should also be merged into this correction.

Input Data: The extracted 2-D sub-images of dimension x_spec x y_spatial, with corresponding UNC and DQ planes.

Input Reference Files: The L-flat and throughput reference datacube, which will likely be an image datacube that includes all correction components for throughput and the low-frequency flat.
This reference image will be constructed from the instrument model initially, but will be verified by on-ground and in-orbit spectral flats acquired with the NIRSpec CAA flat field lamp, and may possibly be replaced by these empirical flats as the catalog of observed MSA slit-flat fields is built up over time. This image datacube will consist of 2-D throughput and flat correction model spectra for each slit in each of the four MSA quadrants. The default chromatic slit-loss correction will be based on a point source centered in each slit, and may be factored into the throughput values for the reference datacube. More work is TBD to determine the precise nature and format of the reference datacube, and supporting input reference values.

Output Data: The L-flat corrected, extracted 2-D sub-images of dimension x_spec x y_spatial, with corresponding UNC and DQ planes.

4.6 Initial Wavelength and Spatial Calibration

Description: Use the instrument model to provide the initial wavelength and spatial calibration for each of the spectral sub-images. This task, in practice, will likely only consist of the addition of calibration keywords that designate the initial spatial and spectral calibration reference values. As a result, this initial calibration will be flexible and could be done prior to (or in conjunction with) the L-flat correction step.

Input Data: The L-flat corrected, extracted 2-D sub-images of dimension x_spec x y_spatial, with corresponding UNC and DQ planes.

Input Reference Files: Reference model equations which link the open slit in the MSA quadrant to the extracted data from the detector. Based on the spectral configuration, the NIRSpec instrument model will provide the approximate spectral and spatial calibration, including distortion and spectral curvature.
The structure of the input model files is TBD.

Output Data: The L-flat corrected, extracted 2-D sub-images of dimension x_spec x y_spatial, with spatial and spectral calibration added and corresponding UNC and DQ planes.

4.7 Final Wavelength and Spatial Rectification

Description: Determine and apply the final wavelength and spatial rectification to the spectral sub-images. Empirical on-sky spectra and NIRSpec lamp wavelength calibration images will be acquired to verify the accuracy of the initial wavelength and spatial rectification processes, and can be used to generate correction factors to the initial calibration. The final rectification process will consist of merging the spectra onto a regular pixel grid, and removing the spatial and spectral curvature from the data (see the Data Processing Requirements document for further discussion). As described in the Data Processing Requirements document, during this step in the reduction process it may be desirable (or necessary) to combine spectral sub-images that were acquired in a fixed offset pattern. To do this, offset spectral data would be interpolated onto a finer grid and merged in combination with the wavelength and spatial rectification process, in a type of drizzle combination/rectification. This will require further development and may be hard-coded to be applied only if a fixed or associated offset pattern was used to acquire the data.

Input Data: The L-flat corrected, extracted 2-D sub-images of dimension x_spec x y_spatial, with spatial and spectral calibration keywords added and corresponding UNC and DQ extensions.

Input Reference Files: Rectification coefficients and transformation information derived from the instrument model and verified or corrected using empirical on-sky and lamp wavelength calibration data.
The structure of these reference inputs is TBD, but it should contain: (a) physical parameters for each grating, (b) coefficients of a two-dimensional polynomial describing MSA-to-detector distortion for all gratings, (c) coordinates describing the detector geometry, and (d) empirical corrections, if necessary.

Output Data: The spatially and spectrally rectified 2-D sub-images, with spatial and spectral calibration keywords added and corresponding UNC and DQ extensions. The final dimensions of the 2-D sub-images after the rectification process will differ from (be larger than) the input x_spec x y_spatial size (TBD).

Note: Some science applications will not wish to interpolate the MSA spectra onto a rectified spatial and spectral grid. For this reason, the following reduction steps should be constructed to work with data that have not been rectified onto a regular pixel grid (with the possible exception of the data combine and background subtraction steps, which may not work well on un-rectified data).

4.8 Data Combine

Description: Combine 2-D MSA spectral sub-images that have been rectified onto a regular pixel grid. The input spectra for combining need not have been acquired through the same slit. The initial accuracy and ease of pipeline implementation for this data combination step may require that the spectra were obtained in a fixed offset pattern so that merging of the spectra can be done in an automated fashion (Tumlinson 2009a). Ultimately, it is desirable to merge all target spectra together in the pipeline, regardless of the offset pattern. This will likely require a sophisticated means to track the target through any MSA slit position in a user-defined offset sequence. The data combination described here can be executed on either a target or background spectral sub-image.

Input Data: Multiple 2-D spectral sub-images that have been rectified onto a regular spatial and spectral pixel grid, with corresponding UNC and DQ extensions.
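As a minimal sketch of the data combine step in section 4.8, the following Python merges rectified sub-images that have already been registered onto a common pixel grid. The inverse-variance weighting, the rule that any nonzero DQ value excludes a pixel, and the `NO_DATA` bit are illustrative assumptions; the report does not prescribe a combination algorithm.

```python
import numpy as np

# Hypothetical DQ bit: no usable input contributed to this output pixel.
NO_DATA = 1 << 0


def combine_subimages(sci_stack, unc_stack, dq_stack):
    """Inverse-variance weighted mean over axis 0, skipping flagged pixels."""
    sci = np.asarray(sci_stack, dtype=float)
    unc = np.asarray(unc_stack, dtype=float)
    dq = np.asarray(dq_stack)

    # A pixel contributes only if it is unflagged, finite, and has UNC > 0.
    usable = (dq == 0) & np.isfinite(sci) & (unc > 0)
    safe_unc = np.where(usable, unc, 1.0)
    weights = np.where(usable, 1.0 / safe_unc ** 2, 0.0)

    wsum = weights.sum(axis=0)
    has_data = wsum > 0
    safe_wsum = np.where(has_data, wsum, 1.0)

    weighted = (weights * np.where(usable, sci, 0.0)).sum(axis=0)
    out_sci = np.where(has_data, weighted / safe_wsum, np.nan)
    out_unc = np.where(has_data, 1.0 / np.sqrt(safe_wsum), np.nan)
    out_dq = np.where(has_data, 0, NO_DATA).astype(dq.dtype)
    return out_sci, out_unc, out_dq


# Two synthetic exposures of the same target on the same grid.
sci = np.array([np.full((3, 5), 1.0), np.full((3, 5), 3.0)])
unc = np.ones((2, 3, 5))
dq = np.zeros((2, 3, 5), dtype=np.uint32)
dq[0, 0, 0] = 8  # flag one pixel in the first exposure

out_sci, out_unc, out_dq = combine_subimages(sci, unc, dq)
```

Registering spectra taken through different slits onto the common grid, which is the harder part of this step, is assumed to have been done beforehand using the offset pattern keywords.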
Input Reference Info.: Header keyword information on the spatial and spectral calibration and offset pattern will be needed for automated data combination.

Output Data: A single, merged and combined 2-D spectral sub-image with corresponding UNC and DQ extensions.

4.9 Background Subtract

Description: Subtract the background flux from 2-D sub-image spectra acquired through different MSA slits, if background flux has not yet been subtracted (i.e., in step 4.2). The background slit spectra used for this subtraction should be very near the target slit spectra, because the optical distortion through the NIRSpec field causes slits in different regions of the MSA quadrants to sample different areas on the sky. For ease of initial implementation, this background subtraction will likely be linked to specific offset sequence patterns selected by the user. As in the previous step for data combination, user-selected offset sequences will likely require a sophisticated means to track targets and identify corresponding background spectra. This is desirable, but will not be a goal of the initial pipeline implementation. The input spectra can be rectified onto a regular spectral and spatial pixel grid, or possibly un-rectified if the spectra were acquired through adjacent slits (TBD). Depending on the science philosophy, the target and background spectra may have been combined in the previous step. Subtraction of background using un-rectified spectra acquired in a slit adjacent to the science target could be important for removing the effects of correlated detector noise from the data.

Input Data: Multiple rectified (or un-rectified) 2-D spectral sub-images with corresponding UNC and DQ extensions. If the target spectra were rectified onto a regular pixel grid, then the background spectra must be rectified also.
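A minimal sketch of the subtraction in section 4.9, assuming the target and background sub-images already share the same pixel grid. The function name `subtract_background` is hypothetical, and adding the uncertainties in quadrature treats the two slits as statistically independent, which would not hold where correlated detector noise is present.

```python
import numpy as np


def subtract_background(tgt, tgt_unc, tgt_dq, bkg, bkg_unc, bkg_dq):
    """Subtract a background sub-image from a target sub-image."""
    if tgt.shape != bkg.shape:
        raise ValueError("target and background must share a pixel grid")
    out_sci = tgt - bkg
    # Quadrature sum: assumes uncorrelated noise in the two slits.
    out_unc = np.hypot(tgt_unc, bkg_unc)
    # A pixel flagged in either input stays flagged in the output.
    out_dq = tgt_dq | bkg_dq
    return out_sci, out_unc, out_dq


tgt = np.full((4, 6), 12.0)
tgt_unc = np.full((4, 6), 3.0)
tgt_dq = np.zeros((4, 6), dtype=np.uint32)
bkg = np.full((4, 6), 2.0)
bkg_unc = np.full((4, 6), 4.0)
bkg_dq = np.zeros((4, 6), dtype=np.uint32)
bkg_dq[0, 0] = 2  # one flagged background pixel

out_sci, out_unc, out_dq = subtract_background(
    tgt, tgt_unc, tgt_dq, bkg, bkg_unc, bkg_dq)
```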
Input Reference Info.: Header keyword information on offset pattern and/or background spectra location will be needed for automated background subtraction.

Output Data: Background subtracted 2-D spectral sub-images with corresponding UNC and DQ extensions.

4.10 Aperture Flux Correction

Description: Apply a default flux correction to the spectra for the effects of un-centered targets within the slits. Nearly all targets observed through the MSA will have slit-loss effects caused by imperfect centering, because the fixed MSA grid is used to observe multiple targets at once. As described in the Data Processing Requirements document, the spectra must be corrected for these slit-loss effects. This default correction assumes a point source PSF flux distribution.

Input Data: The 2-D spectral sub-image (may be rectified or un-rectified) with corresponding UNC and DQ extensions.

Input Reference Info.: Every target observed through the MSA must have associated reference information on the centering position of the target observed through the slit. This information is captured by the APT in the proposal planning process, and should be propagated to the DMS within the supporting visit meta-data that describe the target/background/MSA configurations and positions. The centering information for the aperture flux correction will likely need to come from this meta-data, associated with the APT MSA reference and slit definition that was captured during the MSA planning process. The magnitude of the aperture flux correction as a function of wavelength will depend upon the target centering. (Ideally, we’d like the aperture flux correction to also capture information on the source PSF shape and positioning based on the NIRCam pre-imaging. But this is likely a higher level pipeline goal.)
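To illustrate how a centering-dependent correction of this kind might be applied, the sketch below interpolates a tabulated point-source slit throughput to a target's offset and wavelength, then divides the spectrum by it. Everything here is hypothetical: the function name `aperture_correction`, the use of a single scalar offset (a real correction would depend on the two-dimensional position within the shutter), the table layout, and all numerical values. The report does not define the form of this reference information.

```python
import numpy as np


def aperture_correction(wave, flux, unc, offset,
                        grid_offsets, grid_waves, throughput):
    """Divide a spectrum by the point-source slit throughput.

    throughput has shape (len(grid_offsets), len(grid_waves)); both grids
    must be increasing, as required by np.interp.
    """
    # Interpolate the table to the target offset at each tabulated wavelength.
    t_offset = np.array([np.interp(offset, grid_offsets, throughput[:, j])
                         for j in range(throughput.shape[1])])
    # Then interpolate onto the wavelength of each spectral pixel.
    t_wave = np.interp(wave, grid_waves, t_offset)
    # For a 2-D sub-image with dispersion along the last axis, the
    # division broadcasts over the spatial axis.
    return flux / t_wave, unc / t_wave


# Made-up table: offsets in arcsec, wavelengths in microns.
grid_offsets = np.array([0.0, 0.1])
grid_waves = np.array([1.0, 5.0])
throughput = np.array([[0.8, 0.6],
                       [0.4, 0.3]])

wave = np.array([1.0, 3.0, 5.0])
flux = np.ones(3)
unc = np.full(3, 0.1)
corr_flux, corr_unc = aperture_correction(
    wave, flux, unc, 0.05, grid_offsets, grid_waves, throughput)
```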
Output Data: The 2-D spectral sub-image (may be rectified or un-rectified) that has been corrected for aperture flux losses, with corresponding UNC and DQ extensions.

4.11 Absolute Flux Calibration

Description: Apply the calibration which translates the count rate image units of e-/sec to flux calibrated data units. This will be a direct multiplicative factor applied uniformly to each pixel in the spectral images. The final units of the NIRSpec pipeline output data are TBD.

Input Data: The 2-D spectral sub-image (may be rectified or un-rectified) that has been corrected for aperture flux losses, with corresponding UNC and DQ extensions.

Input Reference Info.: Input flux calibration information, likely in the form of a photometric calibration header keyword determined from prior calibration observations.

Output Data: The flux calibrated 2-D spectral sub-image (may be rectified or un-rectified), with corresponding UNC and DQ extensions. This is a Browse Quality Data Output.

4.12 1-D Spectral Extraction

Description: Extract the 2-D spectral sub-images into 1-D spectra, using an optimal extraction method and/or a straight collapse of the spectra in the cross dispersion dimension. A weighted extraction that assumes the source profile (from pre-imaging, or the spectrally collapsed PSF) will likely provide a better result than a straight collapse of the spectrum.

Input Data: The flux calibrated 2-D spectral sub-image (may be rectified or un-rectified), with corresponding UNC and DQ extensions.

Input Reference Files: None.

Output Data: The flux calibrated 1-D spectra (may be rectified or un-rectified), with corresponding UNC and DQ extensions. This is a Browse Quality Data Output.

4.13 1-D Spectral Data Combine

Description: At the end of the full reduction process, it will be possible to do an automated combination of multiple spectra acquired on the same target through many different MSA slits.
Data combination at this point will be useful to increase the signal-to-noise to verify the spectra, particularly for very faint sources. In practice, it should also be possible to combine 1-D spectra acquired on the same MSA target that was observed with different MSA configurations, different pointings, or different visits. Executing data combination on many MSA target spectra acquired in different pointings or visits will require a method to trace targets through different data associations within an observing program.

Input Data: The flux calibrated 1-D spectra (may be rectified or un-rectified), with corresponding UNC and DQ extensions.

Input Reference Files: Reference header information on target name and spectral calibration is needed for 1-D data combination. Further information on target data associations will be necessary to combine data acquired in different pointings or visits.

Output Data: The combined, flux calibrated 1-D spectra (may be rectified or un-rectified), with corresponding UNC and DQ extensions. This is a Browse Quality Data Output.

4.14 ‘Browse Quality’ Data Products

As described in the previous section and presented in Figure 2, the main “browse quality” data outputs for NIRSpec MOS observations are:
• The flux calibrated 2-D spectral sub-images for each MSA target, with UNC and DQ extensions.
• The collapsed/extracted 1-D spectra for each target, with UNC and DQ extensions.
• The combined 1-D spectra for a target acquired over multiple offsets, with UNC and DQ extensions (and perhaps multiple pointings or visits).

These “browse quality” data outputs from the pipeline may fulfill the general observer’s science requirements and thus could be used directly for published results.
However, in the early stages of the pipeline implementation, the “Browse Quality” data outputs might not meet the NIRSpec or user requirements, and further processing may be warranted. In the NIRSpec MSA pipeline processing early in the JWST mission lifetime, it is likely that more manual interaction with the data processing steps will be necessary, and the GO could re-run the data reduction. The hope is that the Browse Quality outputs will be replaced by better calibrated science quality outputs as we learn more about the data reduction and calibration process. These outputs would serve as the viewable result of calibrated data within the JWST science archive.

4.15 “Next Level” Reduction

The primary challenge for “next level” NIRSpec data reduction in MOS mode is applying an aperture flux correction to spectra which accurately takes into account the light lost through the slit as a function of wavelength and source PSF shape (see the Data Processing Requirements document for further discussion and description). Work by the NIRSpec Science Team has shown that the difference between an aperture flux correction assuming a point-source PSF and one assuming an $r^{1/4}$ de Vaucouleurs galaxy surface brightness profile can be as large as 1 magnitude at some wavelengths. Since many NIRSpec science targets will not be point sources, a reduction routine or post-pipeline tool will be necessary in order to apply the proper aperture flux correction to all NIRSpec slit spectroscopy. To help define target profile shape, perhaps this task or tool will include the NIRSpec ‘Confirmation Image’ that is acquired during MSA target acquisition, or even the NIRCam pre-image used to define the target positions. A future goal should be to incorporate a proper aperture slit correction based on target profile shape into the automated MSA data reduction pipeline.
Further work is TBD to define and implement this (and it will most likely not be automated in the early part of the mission).

An additional challenge for “next level” MOS mode reduction is tracking and combining spectra of the same target taken through different MSA slits or at different times (visits or pointings). As such, the ‘data combine’ steps for 2-D sub-images and 1-D spectra in the MOS data pipeline flow chart should be able to take inputs from multiple observations of a given target. The complexity of this next level reduction arises from the need to track and locate all target spectra to be combined within all the visits in a science program. This will likely require that a unique target name and/or on-sky coordinates are linked to the open MSA target slits and propagated through the APT and the data headers. The name or coordinates must therefore be linked to each target slit at all pointings and visits (e.g., as in the MSA header information which includes target placement within a slit). Further work is TBD to better define this. (In HST-speak, I think this section would be called “CALNIRSPECB”).

5.0 NIRSpec Fixed Slit Data Reduction

Figure 4 presents the flow chart diagram for data reduction in the NIRSpec fixed slit mode. In practice, the FS data reduction is identical to the MSA data reduction, with the exception of the throughput and flat fielding correction steps. There are a set number of fixed slits, and because they are always open, flat fields can be acquired simultaneously with MSA flats for each of the FSs for every spectral configuration. As a result, the flat field and throughput correction can be done in a single processing step using an empirical flat field image, without decoupling the P-Flat for the full-frame images (as for MSA mode).
Figure 4: The flow-chart diagram for NIRSpec fixed slit data reduction

The NIRSpec FS data reduction pipeline, which processes the data using an empirical flat, should be constructed to work with MSA slit spectra. If a flat field spectrum is acquired through an MSA slit, then the MSA science data obtained through the same slit should be reducible through the FS data pipeline using the corresponding flat field.

Sections 5.1 and 5.2 briefly describe the slit sub-image extraction and the flat field and throughput correction processing steps for FS data reduction. All other steps included in the flow chart diagram in Figure 4 are identical to the MSA reductions (from section 4), and discussion of the details of these steps is not repeated here.

Some FS data may be acquired using subarrays defined for their associated slits. In these cases, extraction of the data into 2-D windows is not necessary. Because FS data is acquired for all observing programs in all modes (even if the slits are on blank sky), every NIRSpec science exposure should have the associated FS data processed through the reduction pipeline.

At the present time, it is assumed that the pipeline data processing of spectra acquired through the 1.6" x 1.6" square wide aperture is the same as FS data reduction. In practice, many datasets from this wide aperture may need to be processed in a different manner, because the data can be acquired in small subarrays with no reference pixels. The pipeline processing for this data is TBD.

5.1 2-D Slit Extraction

Description: Extract FS spectra into smaller 2-D sub-images, one sub-image associated with each FS.
This processing step is only necessary for FS data that was acquired in full frame or “ALLSLITS” mode, not for spectra taken with a subarray associated with the specific slit.

Input Data: The count rate image (combined from multiple frames, where applicable) of 2040 x 2040 size, with corresponding UNC and DQ extensions (or with 256 x 2040 size, for data taken in ALLSLITS mode).

Input Reference Files and Reference Information: A table of real numbers (or array/image, for consistency with MOS reduction) with dimensions of 5x4 that maps out the spectral extraction boxes for each of the five FSs. The first dimension of this array corresponds to the five FSs, and the second dimension holds the pixel values of the extraction box. As described in section 4.4, four pixel values are needed to define the pixel coordinate image extraction locations for each slit (Figure 3). One reference file will be necessary for each spectral configuration, for a total of 9 files (see section 4.4). The reference file must be able to distinguish between pixel locations on the two NIRSpec detectors (spectra in the R=2700 mode will extend over both detectors).

Output Data: 2-D spectral slice images that are x_spec x y_spatial in size, where the x dimension is the spectral dispersion dimension (x_spec) and y is the length in the cross dispersion direction that the FS spectrum extends over (y_spatial). This processing step will output 5 spectral sub-images, one for each of the FSs, with associated UNC and DQ extensions.

5.2 Flat Field and Throughput Correction

Description: Correct FS (or MSA) spectra for flat field and throughput effects using empirical flat images acquired with the NIRSpec CAA lamps.

Input Data: The 2-D extracted sub-image associated with each FS (or MSA slit), with corresponding UNC and DQ extensions.

Input Reference Files: The calibrated and throughput corrected 2-D extracted sub-window flat field image associated with each FS (or MSA slit), with corresponding UNC and DQ extensions.
Output Data: The flat and throughput corrected 2-D extracted sub-image associated with each FS, with corresponding UNC and DQ extensions.

5.3 ‘Browse Quality’ Data Products

The FS data reduction output “browse quality” data products will be the same as the data products for MOS mode:
• The flux calibrated 2-D spectral sub-images for each FS target, with UNC and DQ extensions.
• The collapsed/extracted 1-D spectra for each target, with UNC and DQ extensions.
• The combined 1-D spectra for a target acquired over multiple offsets, with UNC and DQ extensions.

Every NIRSpec observation will have associated FS pipeline data products.

5.4 “Next Level” Reduction

As for MOS mode observations, FS spectra may be acquired on targets in multiple pointings or over multiple visits. Hence, for combining multiple spectra in a “next level” reduction, the method in which target names are associated with MSA slit spectra and tracked through a general observing program will also be applicable to FS observations. (In HST-speak, I think this section would be called “CALNIRSPECB”).

6.0 NIRSpec IFU Data Reduction

Figure 5 presents the flow chart diagram for data reduction in the NIRSpec integral field unit (IFU) mode. In practice, the initial processing steps for IFU data reduction are identical to the FS data reduction, with the 2-D image extraction process creating sub-images for the 30 IFU virtual slits instead of the 5 FSs. The flat field and throughput correction for the IFU data will be the same as for the FS mode, using flat images acquired with the NIRSpec CAA lamps. The IFU reduction deviates from the FS process only after the wavelength calibration step.
The sections below describe only the extraction of the IFU virtual slits into sub-images, and the reduction steps after the wavelength and spatial calibration which construct and work with the 3-D IFU datacube files.

Figure 5: The flow-chart diagram for NIRSpec integral field spectroscopy data reduction.

6.1 2-D Slit Extraction

Description: Extract IFU spectra into smaller 2-D sub-images, one sub-image associated with each of the IFU virtual slits.

Input Data: The IFU data count rate image (combined from multiple frames, where applicable) of 2040x2040 size, with corresponding UNC and DQ extensions.

Input Reference Files and Reference Information: A table of real numbers (or array/image, for consistency with MOS reduction) with dimensions of 30x4 that maps out the spectral extraction boxes for each of the thirty IFU virtual slits. The first dimension of this array corresponds to the IFU slits, and the second dimension holds the pixel values of the extraction box. As described in section 4.4, four pixel values are needed to define the pixel coordinate image extraction locations for each slit (Figure 3). One reference file will be necessary for each spectral configuration, for a total of 9 files (see section 4.4). The reference file must be able to distinguish between pixel locations on the two NIRSpec detectors (spectra in the R=2700 mode will extend over both detectors).

Output Data: 2-D spectral slice images that are x_spec x y_spatial in size, where the x dimension is the spectral dispersion dimension (x_spec) and y is the length in the cross dispersion direction that the IFU spectrum extends over (y_spatial). This processing step
will output 30 spectral sub-images (extensions), one for each of the IFU slits, with associated UNC and DQ extensions.

6.2 Reformat the Data into a 3-D cube

Description: Reformat the wavelength calibrated, spatially rectified IFU virtual slit spectra into a 3-D datacube with dimensions 30 x ~30 x spec. The x dimension corresponds to the 30 virtual slits, and the z dimension is the extent of the spectra. The y dimension is ~30 because the slit extraction and spatial rectification process in the cross-dispersion direction may alter the spatial dimension by a small amount.

Input Data: Wavelength and spatially calibrated IFU virtual slit spectra, with 30 sub-image science extensions and corresponding UNC and DQ extensions.

Input Reference Files: None. (For an instrument-specific IFU cubing routine, no reference information is necessary. The way the slits map to the sky to make the 3-D datacube can be hard-coded in the routine. For a flexible IFU cubing routine, such as one that might also work for MIRI data, a reference file that describes the mapping of the virtual slits to the sky would be needed).

Output Data: A 3-D IFU datacube, x and y being the spatial axes and z the spectral dimension, with corresponding UNC and DQ extensions.

6.3 Absolute Flux Calibration

The absolute flux calibration of IFU data is essentially identical to the MSA and FS absolute calibration step, consisting only of a multiplicative calibration factor applied to the count-rate data. Some users may wish to work with IFU data that has not been processed through the 3-D cube stage. As such, the absolute flux calibration task should be able to process the IFU data outputs from the wavelength and spatial rectification step (prior to constructing the 3-D datacube). This is a Browse Quality Data Output.

6.4 Combine Datacubes

Description: Combine datacubes that were acquired at a given pointing position (small offsets).
The IFU dither patterns at a given position have been defined so that targets can be moved within the IFU image field to remove detector artifacts (Tumlinson 2009b). As such, many IFU observations will be executed using these small, in-field dithers. IFU data acquired using these “canned” dither patterns could be combined in the automated pipeline to form a datacube that is free of detector artifacts.
Input Data: A set of multiple 3-D IFU datacubes acquired using a default dither pattern, with corresponding UNC and DQ extensions.
Input Reference Info.: Header keyword information that describes the offset pattern used to acquire the IFU data.
Output Data: A single combined IFU datacube that has slightly wider spatial extent than the input cubes (depending on the offset pattern), with corresponding UNC and DQ extensions.
Browse Quality Data Output.
6.5 ‘Browse Quality’ Data Products
The “browse quality” data products of the IFU data reduction pipeline will be an absolute flux calibrated datacube of size 30 x ~30 x spec, and (where applicable) a dither combined, calibrated datacube of slightly greater spatial extent.
6.6 “Next Level” Reduction
The NIRSpec IFU component in APT will provide the option to construct wider-field mosaics consisting of many IFU tile positions (Tumlinson 2009b). Additionally, the option will exist to dither the IFU by integer slice widths, or use sub-slice offsets in the spatial or dispersion dimension. The former offsets will be used to remove instrument and pixel defects, and the latter dithering strategy will improve the spatial PSF and spectral LSF sampling. As a result, “next level” reduction will be needed to combine dithered IFU patterns and mosaic IFU datacubes together to form a larger spatial extent datacube.
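The combination of cubes dithered by integer offsets, as described above, can be illustrated with a minimal sketch. This is not the pipeline algorithm; it only shows the bookkeeping involved. It assumes that each cube is a NumPy array ordered (wavelength, y, x), that a DQ value of zero means "good", that the integer spaxel offsets have already been read from the header keywords, and that an unweighted mean is acceptable. The function name, the offset sign convention, and the output DQ value of 1 for "no valid data" are all hypothetical.

```python
import numpy as np


def combine_dithered_cubes(sci, unc, dq, offsets):
    """Average integer-offset cubes onto a common, slightly larger grid.

    sci, unc, dq : lists of arrays shaped (n_wave, ny, nx)
    offsets      : list of integer (dy, dx) spaxel shifts, one per cube
                   (assumed sign convention: where each cube lands on the grid)
    """
    n_wave, ny, nx = sci[0].shape
    dys = [off[0] for off in offsets]
    dxs = [off[1] for off in offsets]
    y0, x0 = min(dys), min(dxs)
    # The output grid grows by the full range of the dither pattern.
    out_shape = (n_wave, ny + max(dys) - y0, nx + max(dxs) - x0)

    flux_sum = np.zeros(out_shape)
    var_sum = np.zeros(out_shape)
    n_good = np.zeros(out_shape, dtype=int)

    for s, u, q, (dy, dx) in zip(sci, unc, dq, offsets):
        ys, xs = dy - y0, dx - x0
        window = (slice(None), slice(ys, ys + ny), slice(xs, xs + nx))
        good = (q == 0)  # flagged pixels do not contribute
        flux_sum[window] += np.where(good, s, 0.0)
        var_sum[window] += np.where(good, u ** 2, 0.0)
        n_good[window] += good

    covered = n_good > 0
    out_sci = np.zeros(out_shape)
    out_unc = np.zeros(out_shape)
    out_sci[covered] = flux_sum[covered] / n_good[covered]
    # Uncertainty of an unweighted mean of independent values.
    out_unc[covered] = np.sqrt(var_sum[covered]) / n_good[covered]
    # Hypothetical flag: 1 marks output pixels with no valid input data.
    out_dq = np.where(covered, 0, 1).astype(np.uint8)
    return out_sci, out_unc, out_dq
```

A pixel that is flagged in one cube but good in another is recovered from the good cube, which is the purpose of the in-field dithers. Sub-slice offsets would require resampling and are not covered by this sketch.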
Most IFU mosaics will probably be contained within a single visit, with spatial extents of <20” (same guide star). It is possible, though, that tiles for very large or very deep IFU mosaics must be acquired using different guide stars, in different visits, or at different times. Associating and tracking IFU data acquired through multiple visits may be important for large IFU mosaics. (In HST-speak, I think this section would be called “CALNIRSPECB”.)
7.0 NIRSpec “Imaging” Mode
NIRSpec “Imaging Mode” data will be acquired using the grating mirror and one of the filters used for target acquisition. Observations in this mode will be important for measuring field distortion and PSF shape during the commissioning of the instrument, and for monitoring instrument performance. During the TA process, a flat field lamp image is acquired using the NIRSpec CAA to verify the position of the grating mirror, and a cosmic-ray rejected image of the science field is acquired through the TA filter with the MSA shutters open. An optional ‘Confirmation Image’ is observed using the science filter and the science MSA configuration, to confirm the placement of the science targets within the MSA shutters.
While it is not assumed that NIRSpec “Imaging Mode” observations will be used extensively for science, images acquired during the standard MSA target acquisition process will prove useful for target position verification and user-interactive aperture flux corrections. As a result, it will be important to provide accurately reduced and calibrated NIRSpec images. The following sections describe the main processing steps for NIRSpec imaging mode observations (the steps described in the “CALWebb” processing are assumed to apply for the initial reduction to count rate images).
7.1 Flat Field Correction
Description: Correct the NIRSpec images for flat field effects.
Lamp flats acquired using the CAA may be used, with the note that the light from the calibration lamps does not go through the NIRSpec filters. Additionally, the MSA is always in place, which causes a grid pattern modulation of the flat field illumination on the detector. Yet, CAA lamp images will be acquired simultaneously with the acquisition images during the TA process, and these will have the same grating mirror shift and can serve well to correct the imaging flat field effects. Correction to the lamp flat using sky flats and/or a throughput correction (model) image may also be needed.
Input files: The reduced NIRSpec count rate images of dimension 2040x2040, with corresponding UNC and DQ extensions.
Input Reference Files: The reduced lamp flat images of dimension 2040x2040, with corresponding UNC and DQ extensions. If necessary, sky flat / throughput correction images with corresponding UNC and DQ extensions.
Output Files: The flat-field corrected NIRSpec images of dimension 2040x2040, with corresponding UNC and DQ extensions.
7.2 Absolute Flux Calibration
Description: Calibrate the NIRSpec images to an absolute flux scale (units are TBD).
Input files: The flat-field corrected NIRSpec images of size 2040x2040, with corresponding UNC and DQ extensions.
Input Reference Info.: Input flux calibration information, likely in the form of a photometric calibration header keyword determined from prior calibration observations.
Output Files: The absolutely calibrated NIRSpec images of size 2040x2040, with corresponding UNC and DQ extensions.
7.3 Field Distortion Calibration
It will be necessary to correct the NIRSpec images for field distortion effects for proper comparison with images and catalogs from other instruments (particularly the NIRCam or WFC3 pre-image).
Observations of the JWST astrometric calibration field in the Large Magellanic Cloud will be made during commissioning to verify and calibrate the NIRSpec field distortion (see NIRSpec Calibration Plan).
Description: Correct the calibrated NIRSpec images for field distortion.
Input files: The flux calibrated NIRSpec images of size 2040x2040, with corresponding UNC and DQ extensions.
Input Reference Info.: The NIRSpec field distortion model, likely in the form of polynomial coefficients that accurately describe the distortion.
Output Files: The distortion calibrated NIRSpec images, with corresponding UNC and DQ extensions (dimensions TBD).
8.0 “Next Level” NIRCam Data Reduction for NIRSpec Pre-Imaging
There are several processing steps beyond the baseline NIRCam imaging reduction that may be necessary in order for NIRCam images to be used efficiently for NIRSpec pre-imaging and MSA spectroscopy definition. At the present time (and to the best of our knowledge), the plan for NIRCam imaging data reduction is to construct flat field corrected images that have been calibrated for absolute flux for each of the 10 detectors. Next level processing steps such as image mosaicing are not planned for initial pipeline implementation. Below we describe three processing steps that may not be included in the baseline initial pipeline development for NIRCam, but will be crucial for NIRSpec pre-imaging.
8.1 NIRCam Field Distortion
NIRSpec target acquisition requires astrometric accuracy of better than 5 mas for MSA mode observations. NIRCam imaging will be the primary means to acquire accurate source positions for NIRSpec target acquisition reference targets (WFC3 images will likely be used at the beginning of the mission, but may not exist for many fields).
As a result, very accurate field distortion measurement and correction for NIRCam imaging is critical for NIRSpec. This is a ‘next level’ processing step for NIRCam images that will impact NIRSpec if it is not implemented early in the mission.
8.2 Mosaic Dithered NIRCam Frames
The NIRSpec team will likely define a limited set of fixed dither patterns for NIRCam imaging mode observations that will be optimized for MSA pre-imaging (e.g., Anderson 2009). Most of the NIRSpec MSA target acquisition and science observations will be defined by general observers after the NIRCam pre-images are acquired. The NIRCam data for pre-imaging obtained using a canned dither pattern should be automatically reduced into a larger mosaic image by the pipeline. If general observers are expected to process pre-imaging data into a larger mosaic to identify their target and reference sources, this will cause delays in the definition of MSA science observations and potential problems for scheduling.
8.3 NIRCam Imaging Source Extraction
General observers will need to estimate NIRSpec target acquisition exposure times and identify reference stars based on calibrated NIRCam images. To do this, accurate positions and magnitudes of sources within the images will be required. An automated source extraction procedure should be run on the NIRCam pre-images to provide reference magnitudes and positions for potential NIRSpec target acquisition sources. If general observers are expected to run their own source identification procedures to identify their reference sources for MSA target acquisition definition, this will cause delays in the definition of MSA science observations and potential problems for scheduling.
The same software that identifies the positions of MSA science targets could also be made to extract source profile information and save this for use in the pipeline spectral extraction and calculation of aperture flux corrections.
More work is TBD to demonstrate the feasibility of this.
9.0 Open Issues Regarding NIRSpec Data Reduction and Pipeline Calibration
(Notes from and discussion with J. Valenti)
9.1 General Open Issues
9.1.1 Propagation of MSA Shutter and Target Information
The NIRSpec data processing pipeline will need to know a lot of information about the MSA configuration, beyond the general shutter open/closed map that is uploaded to the spacecraft. The pipeline will additionally need to know which open MSA shutters are on science targets and which are open on background flux, and when this changes with dithers. Additionally, the pipeline should be able to capture information on the target centering within an MSA shutter, which is information that will be captured during the observation planning process with the MSA Planning tool in the APT.
In general, a method for automatically associating background shutters with sources for multiple MSA configurations will be difficult. It also might not be possible in all science applications to measure background flux close to the science sources. It should be possible to make multiple associations for background flux for MSA science targets in the pipeline, based on the user inputs to APT. Moreover, the pipeline should correct for known differences in projected shutter sizes, regardless of the separation between source and background shutters.
9.1.2 Exposure-Level data processing vs. Association data processing
The requirement that separate exposures (e.g., at the same position or different dither locations) be combined before extracting spectra complicates data processing. All data have to be received before any spectra can be extracted. Raw and/or intermediate products from data associations have to be queued until processing can be completed.
There are other, more subtle issues. We need to thoroughly investigate the relationship between exposure-level processing and data association processing (including dithers and background subtraction steps, which likely require associated data rather than single exposures). Should there be separate outputs at the exposure level and association level?
9.1.3 Data Reduction File Structures
The file structure for the process of NIRSpec data reduction has not been established. More information needs to be gathered and determined for the raw input file structure. We also need to decide how we handle multiple integrations within a single exposure. Perhaps the user can click a radio button in the APT that requests that integrations be combined into a single exposure image. This might clear up some confusion in the pipeline processing and make the reduction outputs more streamlined for observations with multiple integrations in a single exposure. (TBD).
In the data reduction process, if each MSA spectrum is stored in a separate FITS extension or a separate FITS file, then it makes sense to store the predicted target centering information from the MSA planning process within the associated header. If we bundle all of the spectra into a single array, then storing the predicted centering information in a single FITS header would be awkward. In the latter case, the predicted locations should be stored in a separate extension or more likely a separate file.
9.1.4 R=2700 Mode and Detector Gap-Spanning Observations
In the R=2700 (high resolution) grating settings with NIRSpec, the useful scientific spectral data will extend over both detectors in all observing modes (FS, MSA, IFU). This will also happen with some MSA configurations using the R=1000 gratings.
At some point in the reduction process, we will want to join the two segments of each spectrum that span the detector gap. We need to determine when and how in the data reduction this is accomplished. Regarding how, we have a few choices: 1) splice the two detector spectral segments together, leaving a discontinuity in the wavelength scale or 2) create a continuous wavelength scale and flag the missing values.
In the section of this document on extracting the 2-D data into sub-windows for each open MSA shutter, it is noted that the extraction window in R=2700 mode may extend to cover both detectors. This is the first processing step where the detector gap spanning spectral regions might be extracted and merged together. The details of this are TBD.
9.1.5 Pipeline Output Data Products
The final science quality reduced and calibrated pipeline data products for MSA science could consist of:
• Requested and/or actual MSA configuration with predicted and/or actual target locations
• For each target in each exposure: (a) rectified 2-D spectra, and (b) extracted 1-D spectra. Each spectrum has associated uncertainties and data quality flags.
• For each target in a group of associated exposures: (a) combined rectified 2-D spectra, (b) extracted combined rectified 2-D spectra, and (c) combined extracted 1-D spectra. Each spectrum has associated uncertainties and data quality flags.
Corresponding output products for FS and IFU science are assumed (with IFU outputs being 3-D datacubes, not extracted spectra). The final outputs are TBD.
9.1.6 Temporal Variability of Instrument Performance
Instrument performance will evolve over the course of the mission. [Our knowledge of calibration parameters will improve as well, but this paragraph is about actual changes in instrument performance.] For example, dark rates will increase due to radiation damage. In principle, any value that is stored in a reference file may evolve.
Calibration reference files must have one or more methods for describing this evolution.
The HST calibration database system (CDBS) contains a “use after” date for each reference file. When processing a particular observation, HST calibration pipelines select the reference file of the appropriate type with the latest “use after” date that precedes the observation date. Reference files that have not evolved over the course of the mission have a “use after” date that precedes the launch date. The “use after” formalism describes instrument performance as intervals of constant performance, punctuated by moments of discrete change. The “use after” capability is used extensively for darks, which degrade continuously due to radiation damage and suddenly due to anneals. Even relatively stable calibration parameters may change suddenly, for example when switching from primary to backup electronics.
Gradual evolution of instrument performance may be described by calibration parameters that have a piecewise linear dependence on time. CDBS does not (currently) provide this capability, but certain HST reference files store temporal evolution data internally.
JWST does not have to adopt the existing CDBS and HST reference file design, but JWST should “learn lessons” from HST. JWST reference files should have a general mechanism for describing both gradual and sudden changes in instrument performance. The CDBS “use after” mechanism has worked well for describing sudden changes. A more generic mechanism is needed to describe gradual changes, for example the ability to interpolate linearly between two reference files.
9.1.7 Correlated Errors
Random errors are straightforward to propagate and report. Correlated errors are harder to track and describe.
For example, an error in background subtraction can bias all pixels by the same amount, leading to correlated errors in the resulting spectrum points. These correlated errors may dominate random errors for high S/N observations. We should think about how to track and report correlated errors.
9.2 Issues from Specific Sections in the Document
9.2.1 Reference Pixel correction
Bias and photoelectrons have different noise characteristics, so the reference pixel subtraction step of the pipeline is the first opportunity to calculate uncertainties in pixel values. If the reference pixel correction procedure contributes significantly to overall uncertainty, then the pipeline must record bias uncertainties in the output data for this step. To calculate the uncertainty for each pixel value, the pipeline would need the gain now, rather than waiting until ramps are fitted (Section 3.7; see footnote 1).
Footnote 1: Alternatively, the pipeline could store the uncertainty in the bias in an extension with a different name (e.g., UNCBIAS), but there is no obvious advantage to deferring beyond this point the calculation of uncertainties in measured pixel values.
9.2.2 Dark Subtraction
We should describe the possibility that dark rates and associated uncertainties for each pixel will be described by a few function parameters, rather than explicit values at 1000 different times in the ramp. First, a fitted function is often more precise than the data that were fitted. The fit will have to take into account the fact that up-the-ramp samples are correlated. Second, a reference file containing function parameters will require only 2% of the space required to store every measured sample. Assuming 4-byte floating-point values, this reduces the file size from 33 GB to approximately 500 MB.
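The file sizes quoted above can be approximately reproduced with a short calculation. The text does not state how the 33 GB figure was derived, so the sketch below rests on assumptions: two detectors, 2040x2040 pixels each, and one 4-byte value per pixel per ramp sample. The parameter counts of 15 and 20 per pixel are purely illustrative and do not come from any reference file definition.

```python
# Assumed, illustrative quantities (not taken from a reference file definition).
N_DETECTORS = 2
NX = NY = 2040
N_SAMPLES = 1000        # up-the-ramp samples quoted in the text
BYTES_PER_VALUE = 4     # 4-byte floating-point values


def dark_file_bytes(values_per_pixel):
    """Size in bytes of a dark reference file storing this many values per pixel."""
    return N_DETECTORS * NX * NY * values_per_pixel * BYTES_PER_VALUE


full = dark_file_bytes(N_SAMPLES)
print(f"every sample stored: {full / 1e9:.1f} GB")

# Hypothetical numbers of fitted function parameters per pixel.
for n_params in (15, 20):
    size = dark_file_bytes(n_params)
    print(f"{n_params} parameters: {size / 1e6:.0f} MB "
          f"({100 * size / full:.1f}% of the full file)")
```

Under these assumptions, roughly 15 to 20 parameters per pixel bracket the quoted figures of approximately 500 MB and 2%. Storing an uncertainty alongside each value would double both file sizes and leave the ratio unchanged.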
Every user that wants to reprocess NIRSpec data will need to download this file, so smaller is better.
9.2.3 Data Combining and Background Subtraction
The philosophy and methods for combining data and subtracting off background flux need a more thorough investigation at all levels. Justification for data combining and background subtraction early in the reduction process should be made. If this is best done very early in the reduction, then reducing exposure-level data should probably progress before all dithered observations in association-level data are acquired. If this is best done at later stages, then this needs to be described as well.
Data combine and background subtraction steps were merged into the processing at early points in the reduction in part for time and efficiency reasons. It will be more efficient for a pipeline to extract and process 150+ shutter spectra from one combined, sky subtracted science image than to have to reduce the same 150+ shutter spectra many, many times from the different raw (uncombined) exposures. It may not be possible to execute data combine/background subtraction steps in this manner for all science programs, depending on the observing philosophy. These issues need to be investigated more conclusively.
Automatically associating background shutters with each source for non-standard MSA multi-shutter slit configurations is difficult, so I understand deferring implementation of that capability. However, allowing observers to suggest such associations is straightforward and should be implemented for Cycle 1.
In general, separate code (possibly in the same program) will be required to process original pixels versus spatially and spectrally rectified data. We should prioritize which code to develop first. Assuming both processing options are available, we need to develop criteria that the pipeline can use to select a processing strategy.
Alternatively, we could have the pipeline process every data set both ways (original pixels and rectified data).
9.2.4 L-Flat and Throughput Correction
The calibration approach advocated in the ESA Pipeline Requirements document may not be sufficient to remove the throughput dependence on wavelength and position in the field of view. Internal lamp spectra: (a) include the reflectivity of three calibration mirrors not used when observing external sources, and (b) exclude the reflectivity of eight mirrors and a filter that are used when observing external sources. An optical model could be used to correct measured flats for the three extra and nine missing optical elements. However, external sources observed by the telescope may fill the pupil differently than the calibration assembly, leading to different vignetting. A detailed optical model might be able to correct measured flats for different pupil illumination, but more likely the corrections will be determined by observing a set of external sources at different positions in the field of view.
Regardless of whether calibration data are obtained from internal lamp spectra, external sources, or both, we should consider whether to store throughput end-to-end or for each component separately. In the latter case, one or more pseudo-components would be needed to describe empirical corrections to the model.
Every MSA shutter will have a different end-to-end throughput because vignetting depends on location in the field of view (and hence location in the MSA). Storing two 2040 x 2040 throughput images (one per detector) for each shutter and each grating/filter combination is not feasible because the reference file would contain 2x10^13 (2 x 2040 x 2040 x 365 x 171 x 4 x 10) data values.
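The count of data values quoted above can be verified directly. In the sketch below, the factors are read as two detectors, 2040 x 2040 pixels per detector, 365 x 171 shutters in each of the 4 MSA quadrants, and 10 grating/filter combinations; the meaning assigned to the final factor of 10 is inferred from the sentence, and the 4-byte storage size is an assumption added here to put the count in perspective.

```python
# Factors from the text; the labels are an interpretation of each factor.
N_DETECTORS = 2
PIXELS_PER_DETECTOR = 2040 * 2040
N_SHUTTERS = 365 * 171 * 4      # shutters per quadrant times 4 quadrants
N_CONFIGS = 10                  # grating/filter combinations (inferred)

n_values = N_DETECTORS * PIXELS_PER_DETECTOR * N_SHUTTERS * N_CONFIGS
print(f"shutters: {N_SHUTTERS}")
print(f"data values: {n_values:.2e}")

# Assumed 4-byte values, to express the count as a file size.
BYTES_PER_VALUE = 4
print(f"storage: {n_values * BYTES_PER_VALUE / 1e12:.0f} TB")
```

The product is about 2.1x10^13 values, or roughly 83 TB at 4 bytes per value, which illustrates why a per-shutter, per-configuration image is not a practical reference file format.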
Using parameterized functions (two-dimensional polynomials, for example) to describe throughput as a function of MSA shutter and/or detector pixel would significantly reduce reference file size, but vignetting may have localized or sharp features that cannot be described by low-order polynomials.
It might be useful to factor the throughput correction into two components:
1. Sky-to-MSA throughput is a function of position in the field of view and perhaps filter. Optical models and calibration observations will establish: (a) whether sky-to-MSA vignetting depends on filter, and (b) whether the vignetting can be described by a parameterized two-dimensional function of position in the field of view. In the worst case, this throughput component would require one map of the MSA focal plane for each of the 7 filters (2x10^6 values). In the best case, a single two-dimensional function would describe the behavior for all filters (less than 100 parameters).
2. MSA-to-detector throughput is a function of MSA shutter index, location on the FPA, and perhaps grating. Optical models and calibration observations will establish: (a) whether MSA-to-detector vignetting depends on grating, (b) whether the vignetting can be described as a parameterized two-dimensional function of MSA shutter index, (c) whether the vignetting can be described as a one-dimensional function of location along a dispersed spectrum, and (d) whether the vignetting can be described as a parameterized two-dimensional function of location on the FPA. In the worst case, this component of the throughput would require two 2040 x 2040 images for each shutter and each of the 8 grating wheel elements (2x10^13 values, infeasible). In an intermediate case, each shutter would have a separate one-dimensional throughput vector, but vignetting would be independent of disperser (10^9 values, perhaps feasible).
In the best case, a parameterized function of MSA shutter index and location in the FPA would describe the vignetting for all dispersers and grating efficiency would be stored separately (10^3 parameters).
The functional form of throughput (including vignetting) will be uncertain until commissioning. Thus, the format of the throughput reference files must be flexible enough to accommodate a range of functional forms. In practice, this means the reference file must include a parameter that specifies the functional form of the throughput description.
It might be simpler to apply the correction for chromatic slit loss later, when correcting for location of the target in the aperture.
9.2.5 Aperture Flux Correction
Consider applying the chromatic slit loss correction here, rather than earlier in the process.
9.2.6 Absolute Flux Calibration
The conversion from measured counts per second to incident flux is a strong function of wavelength, yet the text describes the calibration information as data that can be stored in a header keyword, which suggests a scalar. A scalar conversion factor is only possible if the wavelength dependence has already been removed in a preceding step, for example the “Complete Throughput… Correction” that is described in Section 4.5. Such an approach would distort the meaning of “counts per second” at the shortest and longest wavelengths, where sensitivity is low and very large correction factors will be applied. Noise at these extreme wavelengths would also be amplified.
An alternative would be to let the conversion from measured counts per second to incident flux be a function of wavelength for each disperser. The pipeline would read this input reference information from a reference file, rather than the FITS header.
10.0 References
Anderson, J. A. 2009 “Dither Patterns for NIRCam Imaging” JWST-STScI-001738
Fixsen, D.J., Offenberg, J.D., Hanisch, R.J., Mather, J.C., Nieto-Santisteban, M.A., Sengupta, R., & Stockman, H.S. 2000, PASP, 112, 1350
Kriss, J. 2004 “Recommendations for JWST FITS Formats and Keywords” JWST-STScI-000380
McCullough, P. 2008 “Inter-pixel Capacitance: prospects for deconvolution” WFC3 2008-26
Regan, M. et al. 2009 “The Effect of Splitting the Exposure Time on the Observed Persistence after a Bright Exposure for a H2RG Detector” JWST-STScI-001743
Regan, M. 2007 “Optimum weighting of up-the-ramp readouts and how to handle cosmic rays” JWST-STScI-001212
Tumlinson, J. 2009a “NIRSpec Dithering Strategy Part 3: The Micro-Shutter Array”
Tumlinson, J. 2009b “NIRSpec Dithering Strategy Part 2: The Integral Field Unit”