Error Characterization

This is a supplement to the paper:

Profiling Tropospheric CO2 using the Aura TES and TCCON instruments

Le Kuai¹, John Worden¹, Susan Kulawik¹, Kevin Bowman¹, Sebastien Biraud⁴, Christian Frankenberg¹, Debra Wunch³, Brian Connor², Charles Miller¹, Run-Lie Shia³, and Yuk Yung³

1. Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Mail stop: 233-200, Pasadena, CA 91109, USA
2. BC Consulting Ltd., 6 Fairway Dr, Alexandra 9320, New Zealand
3. California Institute of Technology, 1200 E. California Blvd., Pasadena, CA, 91125, USA
4. Lawrence Berkeley National Laboratories, Berkeley, CA, 94720, USA

(A1) Errors in TCCON column-averaged CO2

In this section, we develop the error characterization for the estimates of TCCON CO2 column averages from the profile retrievals. First, the state vector in the retrieval dimension, $\boldsymbol{\gamma}$, is obtained from the profile retrieval of TCCON spectra:

$$\boldsymbol{\gamma} = \begin{bmatrix} \gamma_1[\mathrm{CO_2}] \\ \vdots \\ \gamma_{10}[\mathrm{CO_2}] \\ \gamma[\mathrm{H_2O}] \\ \gamma[\mathrm{HDO}] \\ \gamma[\mathrm{CH_4}] \\ \gamma_{cl} \\ \gamma_{ct} \\ \gamma_{fs} \\ \gamma_{zo} \end{bmatrix} \tag{A.1}$$

Each element of $\boldsymbol{\gamma}$ is a ratio between the state vector ($\mathbf{x}$) and its a priori ($\mathbf{x}_a$), except the last four, which are instrument parameters (continuum level: 'cl', continuum tilt: 'ct', frequency shift: 'fs', and zero level offset: 'zo'). For the target gas CO2, altitude-dependent scaling factors are retrieved. For the other, interfering gases, a constant scaling factor is retrieved. To obtain the concentration profile, the retrieved scaling factors first need to be mapped from the retrieval grids (i.e. 10 levels for CO2 and 1 level for each of the other three gases) to the 71 forward model levels:

$$\boldsymbol{\beta} = \mathbf{M}\boldsymbol{\gamma} \tag{A.2}$$

where $\mathbf{M} = \partial\boldsymbol{\beta}/\partial\boldsymbol{\gamma}$ is a linear mapping matrix relating the retrieval levels to the forward model altitude grid. Multiplying the scaling factor ($\boldsymbol{\beta}$) on the forward model levels by the concentration a priori ($\mathbf{x}_a$) gives the estimate of the gas profile.
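The mapping from retrieval levels to forward model levels can be illustrated with a small sketch. It is not the retrieval code used for TCCON: the text only says that the mapping matrix is linear, so the piecewise-linear interpolation in altitude, the two level grids, the scaling factors, and the flat 390 ppm a priori below are all assumptions made for illustration.

```python
def interpolation_matrix(coarse, fine):
    """Return M (len(fine) x len(coarse)) such that beta = M * gamma.

    Assumption: M is piecewise-linear interpolation in altitude, held
    constant beyond the end points. The text only states that M is linear.
    """
    matrix = []
    for z in fine:
        row = [0.0] * len(coarse)
        if z <= coarse[0]:
            row[0] = 1.0
        elif z >= coarse[-1]:
            row[-1] = 1.0
        else:
            for k in range(len(coarse) - 1):
                lo, hi = coarse[k], coarse[k + 1]
                if lo <= z <= hi:
                    w = (z - lo) / (hi - lo)
                    row[k], row[k + 1] = 1.0 - w, w
                    break
        matrix.append(row)
    return matrix


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical grids (km): 10 retrieval levels and 71 forward model levels.
retrieval_levels = [0.0, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 15.0, 25.0, 70.0]
model_levels = [float(k) for k in range(71)]

# Made-up retrieved CO2 scaling factors and a flat a priori profile (ppm).
gamma_co2 = [1.02, 1.01, 1.00, 1.00, 0.99, 0.99, 1.00, 1.00, 1.00, 1.00]
x_a = [390.0] * len(model_levels)

M = interpolation_matrix(retrieval_levels, model_levels)
beta = matvec(M, gamma_co2)                    # Eq. (A.2): beta = M gamma
x_hat = [xa * b for xa, b in zip(x_a, beta)]   # a priori times scaling factor
```

Because every row of this particular M sums to one, a retrieval that returns all scaling factors equal to 1 reproduces the a priori profile on the forward model grid.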
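We define $\mathbf{M}_x = \partial\mathbf{x}/\partial\boldsymbol{\beta}$; $\mathbf{M}_x$ is simply a diagonal matrix filled with the concentration a priori ($\mathbf{x}_a$):

$$\hat{\mathbf{x}} = \mathbf{M}_x\hat{\boldsymbol{\beta}} \tag{A.3}$$

By definition, $\mathbf{x}_a = \mathbf{M}_x\boldsymbol{\beta}_a$ and $\mathbf{x} = \mathbf{M}_x\boldsymbol{\beta}$. The Jacobian of the radiance with respect to the retrieved parameters is

$$\mathbf{K}_\gamma = \frac{\partial\mathbf{L}(\mathbf{M}\boldsymbol{\gamma})}{\partial\boldsymbol{\gamma}} \tag{A.4}$$

Using the chain rule, we obtain the equation relating the retrieval Jacobian to the full-state Jacobian:

$$\frac{\partial\mathbf{L}}{\partial\boldsymbol{\gamma}} = \frac{\partial\mathbf{L}}{\partial\mathbf{x}}\,\frac{\partial\mathbf{x}}{\partial\boldsymbol{\beta}}\,\frac{\partial\boldsymbol{\beta}}{\partial\boldsymbol{\gamma}} \tag{A.5}$$

or

$$\mathbf{K}_\gamma = \mathbf{K}_x\mathbf{M}_x\mathbf{M} = \mathbf{K}_\beta\mathbf{M} \tag{A.6}$$

If the retrieved estimates are close to the actual state, the estimated state for a single measurement can be expressed as a linear retrieval equation:

$$\hat{\boldsymbol{\beta}} = \boldsymbol{\beta}_a + \mathbf{A}_\beta(\boldsymbol{\beta} - \boldsymbol{\beta}_a) + \mathbf{M}\mathbf{G}_\gamma\mathbf{n} + \sum_l \mathbf{M}\mathbf{G}_\gamma\mathbf{K}_b^l\,\Delta\mathbf{b}^l \tag{A.7}$$

where $\mathbf{n}$ is a zero-mean Gaussian spectral noise vector with covariance $\mathbf{S}_e$, and the vector $\Delta\mathbf{b}^l$ is the error in the true state of a parameter ($l$) that also affects the modeled radiance, e.g. temperature or interfering gases. $\mathbf{K}_b^l$ is the Jacobian of parameter ($l$). $\mathbf{G}_\gamma$ is the gain matrix, defined by

$$\mathbf{G}_\gamma = \frac{\partial\boldsymbol{\gamma}}{\partial\mathbf{L}} = \left(\mathbf{K}_\gamma^T\mathbf{S}_e^{-1}\mathbf{K}_\gamma + \mathbf{S}_a^{-1}\right)^{-1}\mathbf{K}_\gamma^T\mathbf{S}_e^{-1} \tag{A.8}$$

with $\mathbf{S}_a$ the a priori covariance. To convert Eq. (A.7) to the state vector of concentration ($\hat{\mathbf{x}}$), we simply apply Eq. (A.3):

$$\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{M}_x\mathbf{A}_\beta\mathbf{M}_x^{-1}(\mathbf{x} - \mathbf{x}_a) + \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{n} + \sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{K}_b^l\,\Delta\mathbf{b}^l \tag{A.9}$$

The averaging kernel for $\boldsymbol{\beta}$ in the forward model dimension is

$$\mathbf{A}_\beta = \mathbf{M}\mathbf{G}_\gamma\mathbf{K}_\beta \tag{A.10}$$

We can define $\mathbf{A}_x = \mathbf{M}_x\mathbf{A}_\beta\mathbf{M}_x^{-1}$ as the averaging kernel for $\mathbf{x}$.

If we have N TCCON observations within a short time window (i.e. a 4-hr time window), they can be considered multiple measurements of the same air mass. Their mean is simply:

$$\bar{\hat{\mathbf{x}}} = \frac{1}{N}\sum_{i=1}^{N}\hat{\mathbf{x}}_i \tag{A.11}$$

Replacing $\hat{\mathbf{x}}_i$ with Eq. (A.9), we have

$$\bar{\hat{\mathbf{x}}} = \mathbf{x}_a + \frac{1}{N}\sum_{i=1}^{N}\mathbf{A}_x^i(\mathbf{x} - \mathbf{x}_a) + \frac{1}{N}\sum_{i=1}^{N}\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{n}_i + \frac{1}{N}\sum_{i=1}^{N}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{K}_b^{il}\,\Delta\mathbf{b}_i^l \tag{A.12}$$

where the first term represents the a priori profile, which stays the same within a day but varies from day to day. We assume that $\mathbf{A}_x^i$, $\mathbf{G}_\gamma^i$ and $\mathbf{K}_b^{il}$ have small variability:

$$\frac{1}{N}\sum_{i=1}^{N}\mathbf{A}_x^i = \bar{\mathbf{A}}_N, \qquad \mathbf{G}_\gamma^i = \mathbf{G}_\gamma \equiv \mathbf{G}, \qquad \mathbf{K}_b^{il} = \mathbf{K}_b^l \tag{A.13}$$

The estimated bias error for multiple measurements of the same air parcel is the residual between the mean of the retrieval estimates and the truth. Following Eqs. (A.9) and (A.11), the estimated bias error when observing the same air mass is:

$$\hat{\boldsymbol{\delta}}_N = \bar{\hat{\mathbf{x}}} - \mathbf{x} = (\mathbf{I} - \bar{\mathbf{A}}_N)(\mathbf{x}_a - \mathbf{x}) + \frac{1}{N}\sum_{i=1}^{N}\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{n}_i + \frac{1}{N}\sum_{i=1}^{N}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{K}_b^{il}\,\Delta\mathbf{b}_i^l \tag{A.14}$$

We assume that $\mathbf{n}_i$ and $\Delta\mathbf{b}_i^l$ are independent, normally distributed random variables for all N observations. We further assume that $\mathbf{n}_i$ is zero-mean while $\Delta\mathbf{b}_i^l$ has a non-zero mean ($\overline{\Delta\mathbf{b}^l}$). The expected mean bias error for the observations of the same air mass becomes:

$$E(\hat{\boldsymbol{\delta}}_N) = (\mathbf{I} - \bar{\mathbf{A}}_N)(\mathbf{x}_a - \mathbf{x}) + \sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{K}_b^l\,\overline{\Delta\mathbf{b}^l} \tag{A.15}$$

The random measurement noise $\mathbf{n}_i$ is a zero-mean variable with a diagonal covariance matrix $E[(\mathbf{n} - \bar{\mathbf{n}})(\mathbf{n} - \bar{\mathbf{n}})^T] = E(\mathbf{n}\mathbf{n}^T) = \mathbf{S}_e$. Systematic errors have non-zero expected values $\overline{\Delta\mathbf{b}^l}$ and covariance matrices $E[(\Delta\mathbf{b}^l - \overline{\Delta\mathbf{b}^l})(\Delta\mathbf{b}^l - \overline{\Delta\mathbf{b}^l})^T] = \mathbf{S}_b^l$ for the same air parcel. The covariance of the estimated bias error can be estimated as:

$$\mathbf{S}_{\hat{\delta}} = E\left[(\hat{\boldsymbol{\delta}}_N - E(\hat{\boldsymbol{\delta}}_N))(\hat{\boldsymbol{\delta}}_N - E(\hat{\boldsymbol{\delta}}_N))^T\right] = \frac{1}{N^2}\sum_{i=1}^{N}\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i)^T + \frac{1}{N^2}\sum_{i=1}^{N}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{K}_b^{il}\mathbf{S}_b^l(\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma^i\mathbf{K}_b^{il})^T \tag{A.16}$$

The spectral noise and systematic errors are assumed to be uncorrelated. Under the condition of Eq. (A.13), the covariance of the estimated bias error reduces to

$$\mathbf{S}_{\hat{\delta}} = \frac{1}{N}\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma)^T + \frac{1}{N}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{K}_b^l\mathbf{S}_b^l(\mathbf{M}_x\mathbf{M}\mathbf{G}_\gamma\mathbf{K}_b^l)^T \tag{A.17}$$

The sample covariance is

$$\hat{\mathbf{S}} = \frac{1}{N-1}(\hat{\mathbf{X}} - \bar{\hat{\mathbf{X}}})(\hat{\mathbf{X}} - \bar{\hat{\mathbf{X}}})^T \tag{A.18}$$

where $\hat{\mathbf{X}}$ is a matrix whose columns are the retrieved TCCON profiles, one for each of the N observations of the same air mass, and $\bar{\hat{\mathbf{X}}}$ is a matrix whose columns are the profiles of the mean values of the retrieval samples.

The sample covariance can be rewritten as

$$\hat{\mathbf{S}} = \frac{N}{N-1}\left[\mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G})^T + \sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l\mathbf{S}_b^l(\mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l)^T\right] \tag{A.19}$$

For a large number N, the actual variability of the retrievals can be represented by the sum of two covariance matrices:

$$\hat{\mathbf{S}} \approx \hat{\mathbf{S}}_{\mathbf{m}} + \hat{\mathbf{S}}_{\mathbf{sys}} \tag{A.20}$$

where the measurement error covariance is

$$\hat{\mathbf{S}}_{\mathbf{m}} = \mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G})^T \tag{A.21}$$

and the "systematic" error covariance is

$$\hat{\mathbf{S}}_{\mathbf{sys}} = \sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l\mathbf{S}_b^l(\mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l)^T \tag{A.22}$$

To examine how well the estimated systematic and random covariance matrices represent the actual variability in the retrievals, we can compare them to the actual variability of the retrieved samples. After applying a column-averaging operator, such as a pressure weighting function $\mathbf{h}$ (Connor et al., 2008), to the sample covariance, its square root can be compared to the root-mean-square of the collection of multiple retrievals within the 4-hr time window:

$$\mathbf{h}^T\hat{\mathbf{S}}\mathbf{h} \approx \mathbf{h}^T\hat{\mathbf{S}}_{\mathbf{m}}\mathbf{h} + \mathbf{h}^T\hat{\mathbf{S}}_{\mathbf{sys}}\mathbf{h} \tag{A.23}$$

The variability of the systematic errors when observing the same air mass is small, so the second term is less important for the variability of the bias error within the 4-hr time window.

For observations spanning multiple days, over which the air mass changes, the systematic errors can instead be assumed to be zero-mean random variables. For example, the mean bias error over J different days is the average of the bias errors for each air parcel:

$$\hat{\boldsymbol{\delta}}_{NJ} = \frac{1}{J}\sum_{j=1}^{J}\hat{\boldsymbol{\delta}}_{N_j} \tag{A.24}$$

and it can be estimated by

$$\hat{\boldsymbol{\delta}}_{NJ} = \frac{1}{J}\sum_{j=1}^{J}(\mathbf{I} - \bar{\mathbf{A}}_N)(\mathbf{x}_a^j - \mathbf{x}_j) + \frac{1}{J}\sum_{j=1}^{J}\left(\frac{1}{N_j}\sum_{i=1}^{N_j}\mathbf{M}_x\mathbf{M}\mathbf{G}_{ij}\mathbf{n}_{ij}\right) + \frac{1}{J}\sum_{j=1}^{J}\left(\frac{1}{N_j}\sum_{i=1}^{N_j}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_{ij}\mathbf{K}_b^{ijl}\,\Delta\mathbf{b}_{ij}^l\right) \tag{A.25}$$

where $\mathbf{n}_{ij}$ is still zero-mean random measurement noise and $\Delta\mathbf{b}_{ij}^l$ becomes a zero-mean variable. The systematic error contributions to the mean bias therefore average out from day to day. However, their covariance becomes a significant source of the variability of the bias. The expected mean bias reduces to

$$E[\hat{\boldsymbol{\delta}}_{NJ}] = \frac{1}{J}\sum_{j=1}^{J}(\mathbf{I} - \bar{\mathbf{A}}_N)(\mathbf{x}_a^j - \mathbf{x}_j) \tag{A.26}$$

The covariance of the estimated bias error can be expressed as

$$\mathbf{S}_{\hat{\delta}} = E\left[(\hat{\boldsymbol{\delta}}_{NJ} - E[\hat{\boldsymbol{\delta}}_{NJ}])(\hat{\boldsymbol{\delta}}_{NJ} - E[\hat{\boldsymbol{\delta}}_{NJ}])^T\right] = \frac{1}{J^2}\sum_{j=1}^{J}\frac{1}{N_j}\mathbf{M}_x\mathbf{M}\mathbf{G}_j\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G}_j)^T + \frac{1}{J^2}\sum_{j=1}^{J}\frac{1}{N_j}\sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}_j\mathbf{K}_b^{jl}\mathbf{S}_b^{jl}(\mathbf{M}_x\mathbf{M}\mathbf{G}_j\mathbf{K}_b^{jl})^T \tag{A.27}$$

where the spectral noise and systematic errors are assumed to be uncorrelated. J is the number of different air parcels being measured and $N_j$ is the number of retrievals for the same air mass j. The first term describes the variability of the bias error driven by measurement noise and the second term represents the contribution from systematic errors. The sample covariance becomes:

$$\hat{\mathbf{S}} = \frac{1}{J-1}(\hat{\mathbf{X}}_N - \mathbf{X}_J)(\hat{\mathbf{X}}_N - \mathbf{X}_J)^T \tag{A.28}$$

where $\hat{\mathbf{X}}_N$ is a matrix whose columns are the mean profiles of the multiple retrievals for each of the J air parcels, and $\mathbf{X}_J$ is a matrix whose columns are the true profiles (here, the aircraft-observed profiles) for the J air parcels. Assuming that the variability of $\mathbf{G}_\gamma$ and $\mathbf{K}_b$ is small, then:

$$\hat{\mathbf{S}} = \frac{J}{J-1}\left[\mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{S}_e(\mathbf{M}_x\mathbf{M}\mathbf{G})^T + \sum_l \mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l\mathbf{S}_b^l(\mathbf{M}_x\mathbf{M}\mathbf{G}\mathbf{K}_b^l)^T\right] \approx \hat{\mathbf{S}}_{\mathbf{m}} + \hat{\mathbf{S}}_{\mathbf{sys}} \tag{A.29}$$

After the column-averaging operator is applied, this should be consistent with the actual variability of the bias error in the retrieved column averages. In this case the second term, from the systematic errors, becomes dominant.

To estimate the actual bias, we need to compare the TCCON estimates to the aircraft measurements, which are considered the best estimates of the true state ($\mathbf{x}$). The aircraft profile is first mapped to the forward model pressure levels of the TCCON retrieval. The smoothing operator described in Rodgers and Connor (2003) is then applied, so that the profile is smoothed by the averaging kernel and a priori constraint from the TCCON profile retrieval:

$$\hat{\mathbf{x}}_{FLT} = \mathbf{x}_a + \mathbf{A}_x(\mathbf{x}_{FLT} - \mathbf{x}_a) \tag{A.30}$$

$\hat{\mathbf{x}}_{FLT}$ is the profile that would be retrieved from TCCON measurements for the same air sampled by the aircraft in the absence of other errors. Applying the column-averaging operator to $\hat{\mathbf{x}}_{FLT}$ gives estimates of the aircraft column averages. We can examine how well the estimated random and systematic covariance matrices represent the actual variability of N TCCON retrievals by comparing the calculated errors to the empirical errors from real retrievals.

(A2) Errors in PBL column-averaged CO2

To make PBL CO2 useful for carbon cycle science, we have to understand the potential sources of bias in the PBL CO2 product. The averaging kernel matrix can be decomposed into four sub-matrices that represent the boundary layer, the free troposphere, and their correlations:

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{BB} & \mathbf{A}_{BF} \\ \mathbf{A}_{FB} & \mathbf{A}_{FF} \end{bmatrix} \tag{A.31}$$

where the subscript 'B' represents the boundary layer and 'F' represents the free troposphere. The intercomparison equation in the form of Eq. (A.30) can be rewritten as

$$\begin{bmatrix} \hat{\mathbf{x}}^B \\ \hat{\mathbf{x}}^F \end{bmatrix} = \begin{bmatrix} \mathbf{x}_a^B \\ \mathbf{x}_a^F \end{bmatrix} + \begin{bmatrix} \mathbf{A}_{BB} & \mathbf{A}_{BF} \\ \mathbf{A}_{FB} & \mathbf{A}_{FF} \end{bmatrix}\begin{bmatrix} \mathbf{x}^B - \mathbf{x}_a^B \\ \mathbf{x}^F - \mathbf{x}_a^F \end{bmatrix} \tag{A.32}$$

$$\begin{bmatrix} \hat{\mathbf{x}}^B \\ \hat{\mathbf{x}}^F \end{bmatrix} = \begin{bmatrix} \mathbf{x}_a^B + \mathbf{A}_{BB}(\mathbf{x}^B - \mathbf{x}_a^B) + \mathbf{A}_{BF}(\mathbf{x}^F - \mathbf{x}_a^F) \\ \mathbf{x}_a^F + \mathbf{A}_{FB}(\mathbf{x}^B - \mathbf{x}_a^B) + \mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_a^F) \end{bmatrix} \tag{A.33}$$

Then the boundary layer aircraft profile and the free troposphere TES profile, with the TCCON averaging kernel applied, are expressed as

$$\hat{\mathbf{x}}_{FLT}^B = \mathbf{x}_a^B + \mathbf{A}_{BB}(\mathbf{x}_{FLT}^B - \mathbf{x}_a^B) + \mathbf{A}_{BF}(\mathbf{x}_{FLT}^F - \mathbf{x}_a^F) \tag{A.34}$$

$$\hat{\mathbf{x}}_{TES}^F = \mathbf{x}_a^F + \mathbf{A}_{FB}(\mathbf{x}_{TES}^B - \mathbf{x}_a^B) + \mathbf{A}_{FF}(\mathbf{x}_{TES}^F - \mathbf{x}_a^F) \tag{A.35}$$

We simplify the TCCON estimated profile as a linear term plus a random noise error and a systematic error:

$$\hat{\mathbf{x}}_{TCCON} = \mathbf{x}_a + \mathbf{A}(\mathbf{x} - \mathbf{x}_a) + \mathbf{G}\mathbf{n} + \mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b} \tag{A.36}$$

Applying the column-integral operator ($\mathbf{C}$) of Eq. (1) gives the estimate of the total column amount:

$$\mathbf{C}\mathbf{f}_G^{dry} = \int_{z_s}^{\infty} f_G^{dry}(z)\cdot n(z)\cdot dz \tag{27}$$

The superscript F on the column-integral operator denotes integration within the free troposphere, and B denotes integration within the boundary layer. No superscript denotes integration from the bottom to the top of the atmosphere. The total column amount can be decomposed into the components of the boundary layer, the free troposphere, and the cross terms.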
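Written out, the total column amount is:

$$C_{TCCON}^{tot} = \mathbf{C}\mathbf{x}_a + \mathbf{C}\mathbf{A}(\mathbf{x} - \mathbf{x}_a) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b}$$

$$= \mathbf{C}^B\mathbf{x}_a^B + \mathbf{C}^F\mathbf{x}_a^F + \mathbf{C}^B[\mathbf{A}_{BB}(\mathbf{x}^B - \mathbf{x}_a^B)] + \mathbf{C}^B\mathbf{A}_{BF}(\mathbf{x}^F - \mathbf{x}_a^F) + \mathbf{C}^F\mathbf{A}_{FB}(\mathbf{x}^B - \mathbf{x}_a^B) + \mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_a^F) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b} \tag{A.37}$$

Similarly, for the free tropospheric partial column amount, we apply the column-integral operator within the free troposphere to the TES profile in Eq. (A.35):

$$C_{TES}^F = \mathbf{C}^F\hat{\mathbf{x}}_{TES}^F = \mathbf{C}^F\mathbf{x}_a^F + \mathbf{C}^F\mathbf{A}_{FB}(\mathbf{x}_{TES}^B - \mathbf{x}_a^B) + \mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}_{TES}^F - \mathbf{x}_a^F) \tag{A.38}$$

The PBL column amount is the difference between Eqs. (A.37) and (A.38):

$$C_{TCCON+TES}^B = C_{TCCON}^{tot} - C_{TES}^F = \mathbf{C}^B\mathbf{x}_a^B + \mathbf{C}^B[\mathbf{A}_{BB}(\mathbf{x}^B - \mathbf{x}_a^B)] + \mathbf{C}^B\mathbf{A}_{BF}(\mathbf{x}^F - \mathbf{x}_a^F) + \mathbf{C}^F\mathbf{A}_{FB}(\mathbf{x}^B - \mathbf{x}_{TES}^B) + \mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_{TES}^F) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b} \tag{A.39}$$

Applying the column-integral operator within the boundary layer to Eq. (A.34) gives the PBL amount from the aircraft measurements:

$$C_{FLT}^B = \mathbf{C}^B\hat{\mathbf{x}}_{FLT}^B = \mathbf{C}^B\mathbf{x}_a^B + \mathbf{C}^B\mathbf{A}_{BB}(\mathbf{x}_{FLT}^B - \mathbf{x}_a^B) + \mathbf{C}^B\mathbf{A}_{BF}(\mathbf{x}_{FLT}^F - \mathbf{x}_a^F) \tag{A.40}$$

The bias error of the PBL column amount can be estimated as the difference between Eq. (A.39) and Eq. (A.40):

$$\hat{C}^B = C_{TCCON+TES}^B - C_{FLT}^B = \mathbf{C}^B\mathbf{A}_{BB}(\mathbf{x}^B - \mathbf{x}_{FLT}^B) + \mathbf{C}^B\mathbf{A}_{BF}(\mathbf{x}^F - \mathbf{x}_{FLT}^F) + \mathbf{C}^F\mathbf{A}_{FB}(\mathbf{x}^B - \mathbf{x}_{TES}^B) + \mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_{TES}^F) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b} \tag{A.39}$$

If the cross sub-matrices of the averaging kernel are nearly null matrices and the errors in the flight measurements are negligible, then the bias error in the PBL column amount simplifies to:

$$\hat{C}^B = \mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_{TES}^F) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b} \tag{A.40}$$

Then the bias in the PBL column average is

$$\hat{X}_{CO_2}^{PBL} = \frac{\mathbf{C}^F\mathbf{A}_{FF}(\mathbf{x}^F - \mathbf{x}_{TES}^F) + \mathbf{C}\mathbf{G}\mathbf{n} + \mathbf{C}\mathbf{G}\mathbf{K}_b\,\Delta\mathbf{b}}{C_{AIR}^B} \tag{A.41}$$

where $C_{AIR}^B$ is the column amount of air within the boundary layer. The first term on the right represents the bias contributed by the TES free tropospheric CO2, and the last two terms are the bias from the TCCON column estimate.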
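The simplified PBL bias expression can be evaluated with a few lines of code. The sketch below assumes, as in the text, that the cross blocks of the averaging kernel and the flight errors are negligible. The function name `pbl_bias`, the two-level free troposphere, and every number in it are hypothetical; the TCCON noise and systematic terms are passed in as already column-integrated scalars.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


def pbl_bias(c_f, a_ff, x_f_true, x_f_tes, noise_col, sys_col, air_col_b):
    """Bias in the PBL column average under the simplified expression.

    c_f        free-troposphere column-integral weights (C^F)
    a_ff       free-troposphere block of the averaging kernel (A_FF)
    x_f_true   true free-troposphere profile (x^F)
    x_f_tes    TES free-troposphere profile (x^F_TES)
    noise_col  column-integrated TCCON noise term, C G n
    sys_col    column-integrated TCCON systematic term, C G K_b delta_b
    air_col_b  column amount of air in the boundary layer (C^B_AIR)
    """
    diff = [t - s for t, s in zip(x_f_true, x_f_tes)]
    tes_term = dot(c_f, matvec(a_ff, diff))
    return (tes_term + noise_col + sys_col) / air_col_b


# Made-up example: TES is 1 ppm low at both free-troposphere levels and the
# TCCON noise and systematic terms are set to zero.
c_f = [3.0, 3.0]                   # relative air-column weights
a_ff = [[0.4, 0.2], [0.2, 0.3]]
x_f_true = [390.0, 389.0]          # ppm
x_f_tes = [389.0, 388.0]           # ppm
bias = pbl_bias(c_f, a_ff, x_f_true, x_f_tes,
                noise_col=0.0, sys_col=0.0, air_col_b=4.0)
```

Because the result is divided by the boundary-layer air column, the same free-tropospheric error produces a larger PBL bias when the boundary layer is shallow: in this sketch, halving `air_col_b` doubles the returned value.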
(A3) Estimating the free tropospheric CO2 column using TES and GEOS-Chem

To estimate the free tropospheric CO2, retrieved TES CO2 fields are assimilated into the GEOS-Chem model. GEOS-Chem is a global 3-D chemical transport model (CTM) for atmospheric composition developed by NASA Goddard, with sources and additional modifications specific to the carbon cycle as described in Nassar et al. (2010) and Kulawik et al. (2011). TES CO2 at all pressure levels between 40°S and 40°N, along with the predicted sensitivity and errors, was assimilated for the year 2009 using 3D-Var assimilation. We compare model output with and without assimilation to surface based in situ aircraft measurements from the U.S. DOE Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) site during the ARM-ACME (www.arm.gov/campaigns/aaf2008acme) and HIPPO-2 (hippo.ucar.edu/) missions. We find improvement in the seasonal cycle amplitude in the mid-troposphere at the SGP site, but also discrepancies with HIPPO at remote oceanic sites, particularly outside the latitude range of assimilation (Kulawik et al., manuscript in preparation).

References:

Nassar, R., D. B. A. Jones, et al. (2010). "Modeling global atmospheric CO2 with improved emission inventories and CO2 production from the oxidation of other carbon species." Geoscientific Model Development 3(2): 689-716.

Kulawik, S. S., K. W. Bowman, M. Lee, R. Nassar, D. B. A. Jones, S. C. Biraud, S. Wofsy, D. Wunch, W. Gregg, J. R. Worden, C. Frankenberg, "Constraints on near surface and free Troposphere CO2 concentrations using TES and ACOS-GOSAT CO2 data and the GEOS-Chem model", AGU presentation, 2011.

Kulawik, S. S., J. R. Worden, S. Wofsy, S. C. Biraud, R. Nassar, D. B. A. Jones, E. T. Olsen, G. B. Osterman, "Comparison of improved Aura Tropospheric Emission Spectrometer (TES) CO2 with HIPPO and SGP aircraft profile measurements", accepted by Atmos. Chem. Phys., 2012.