Deliverable D430.1 / Deliverable D430.2
Ref: CARBONES-D430.1-REP-LSCE-012-01-00, CARBONES-D430.2-REP-LSCE-013-01-00
Date: 31/03/2011

CCDAS evaluation: Error estimations on parameters (D430.1) and output fields (D430.2)

Authors (Company): Philippe PEYLIN (LSCE), Cédric BACOUR (NOVELTIS), Abdou Khane (CLIMMOD)
Approval (Company): Pascal PRUNET (NOVELTIS)

CHANGE RECORDS
Issue 1, 05/02/2016: Document creation

Table of contents

1. INTRODUCTION
2. CARBONES CCDAS VERSION V0 AND ERROR ESTIMATION
   2.1 OVERALL CCDAS APPROACH
   2.2 PRINCIPLE OF ERROR ESTIMATION
   2.3 CCDAS COMPONENTS AND ASSOCIATED ERROR ESTIMATION
       2.3.1 Land component (step 1 and 2)
       2.3.2 Ocean component (step 3)
       2.3.3 Atmospheric component (step 4)
3. PRIOR ERROR STATISTICS ON PARAMETERS AND OBSERVATIONS
   3.1 LAND COMPONENT (STEP 1 & 2)
       3.1.1 Prior error statistics on the parameters
       3.1.2 Error statistics on the observations and the model
   3.2 OCEAN COMPONENT (STEP 3)
   3.3 ATMOSPHERIC COMPONENT (STEP 4): ERROR ON CO2 CONCENTRATIONS
4. ESTIMATED ERROR STATISTICS FROM THE MODEL-DATA FUSION: PARAMETERS & STATE VARIABLES
   4.1 ASSIMILATION OF SATELLITE NDVI (STEP 1)
   4.2 ASSIMILATION OF IN SITU FLUX MEASUREMENTS (STEP 2)
   4.3 ASSIMILATION OF OCEAN PCO2 DATA (STEP 3)
   4.4 ASSIMILATION OF ATMOSPHERIC CO2 DATA (STEP 4)
5. SUMMARY AND PERSPECTIVES
   5.1 SUMMARY OF THE ERROR ESTIMATIONS
   5.2 FUTURE ASSIMILATION OF BIOMASS DATA
6. REFERENCES

1. Introduction

The initial aim of the two reports (condensed into one; see the justification below) was to present the uncertainties associated with the 20-year CARBONES reanalysis products. However, several constraints led us to revise the objectives of these deliverables, as detailed in the following remarks:

- First, we have chosen to merge the deliverables D430.1 and D430.2 on the error estimations associated with the Carbon Cycle Data Assimilation System (CCDAS). The separation between the errors on the parameters and the errors on the output fields is not straightforward, at least given the set-up of the first V0 version of the system (see below). A more logical split is to discuss first the a priori errors used as input to the CCDAS, and then the estimated errors on both the parameters and the associated output fields.
- This report should be considered as a preliminary report describing part of the errors associated with the CCDAS. Because of technical problems that were not initially anticipated (i.e., the completion of the adjoint of ORCHIDEE, the compatibility between the different data streams to be assimilated in the CCDAS, …), the global system is only now entering the "production phase". We have thus not yet characterized the uncertainties associated with the simultaneous use of all input data streams.
- Finally, we recall that the CCDAS approach for this first V0 version has been modified compared to the initial proposal in the Document of Work (DoW). Although these changes are detailed in a previous report (D410.1), we briefly recall the new CCDAS approach and the slight modifications made since the report D410.1.
Given these remarks, we can only provide a first indication of the uncertainties associated with the different CARBONES products. This report will therefore be revised in 6 months in order to incorporate the results from the "global optimization step" (see below). In the following, we describe:

- in section 2, the various components of the CCDAS (models and data streams), together with the principles of the error estimation on parameters and of the propagation of errors to the space of the state variables;
- in section 3, the determination of the a priori error statistics on the model parameters and observations;
- in section 4, the determination of the posterior errors on model parameters and state variables;
- finally, in section 5, a summary of the results and a description of the anticipated impact on error estimation of the assimilation of biomass data in the future version (V1) of the CCDAS.

2. CARBONES CCDAS version V0 and error estimation

Version V0 of the system has been described in detail in a previous report (D410.1). We briefly recall below the different steps that are performed (section 2.3) in order to define the context of the error estimations associated with each step.

2.1 Overall CCDAS approach

Results obtained during the consolidation phase of the system led us to propose a sequential assimilation of the different data streams: satellite products measuring vegetation activity (e.g. NDVI), ecosystem fluxes measured at several sites, ocean pCO2 measurements, and atmospheric CO2 measurements. The reasons for this new approach (compared to what was proposed in the DoW) are detailed in the report D410.1.
The different steps of this sequential approach are described in Figure 1 and are as follows:

- Step 1: Assimilation into ORCHIDEE of the remotely sensed products of vegetation greenness (NDVI) derived from MODIS; the prior parameters, with their values and error covariance (X0 and P0), are optimized to produce a first set of optimized parameters X1 with error covariance P1.
- Step 2: Assimilation of in situ flux measurements; the parameters X1 and their error covariance P1 are used as input to the optimization system (adding new parameters) and further optimized to produce the second set of optimized parameters X2 with error covariance P2.
- Step 3: Assimilation of ocean pCO2 measurements into a statistical model (neural network) to produce a priori air-sea fluxes.
- Step 4: Final assimilation step using the atmospheric CO2 measurements as a global constraint; the parameters X2 and their error covariance P2 are used as prior for the ORCHIDEE model, and the air-sea fluxes from step 3 are used as prior ocean fluxes. The result of this last optimization step will consist of i) final parameters for ORCHIDEE with the associated optimized land fluxes and ii) the optimized ocean fluxes.

In the remainder of the document we refer to these four sequential assimilation steps.

Figure 1: Sequential assimilation of the different data streams in 4 steps.

2.2 Principle of error estimation

For each step described above (except for the computation of the prior ocean-atmosphere CO2 flux, step 3), the model parameters are optimized using a 4D-Var assimilation method.
The approach relies on the minimization of a misfit function J(x) that measures the mismatch between 1) a set of observations Y and the corresponding model outputs M(x), and 2) the values x of the parameters to optimize and some prior information on them, xp. Both terms are weighted by the prior error covariance matrices on the observations, R, and on the parameters, Pb (Tarantola, 1987):

\[ J(x) = (Y - M(x))^{t}\, R^{-1}\, (Y - M(x)) + (x - x_p)^{t}\, P_b^{-1}\, (x - x_p) \]   Eq. 1

Within this Bayesian inversion framework, we account for uncertainties regarding the model and the observations (through R) and the prior parameters (through Pb), assuming that the errors on prior parameters and observations follow Gaussian distributions. The optimal set of parameters is the one that minimizes the misfit function. The posterior uncertainties associated with these optimized parameter values are characterized by the matrix Pb':

\[ P_b' = \left[ H^{t}\, R^{-1}\, H + P_b^{-1} \right]^{-1} \]   Eq. 2

Pb' is the posterior error covariance matrix on the parameters. Its determination follows Gaussian assumptions on the distribution of the parameter values and errors, and assumes that the model is linear in the vicinity of the solution. The matrix H is the Jacobian matrix of the model M at the minimum of J: it quantifies the sensitivity of the model outputs with respect to each parameter (∂M(x)/∂x).

Finally, we need to assess the impact of the data assimilation on the uncertainties associated with the state variables, i.e. the carbon fluxes and stocks. This step involves the propagation of the parameter error variance-covariance matrix (Pb') to the state variables, and is usually performed assuming linearity of the model around the optimal parameter set.
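As an illustration of Eq. 1 and Eq. 2, the following minimal sketch evaluates the misfit function and the posterior covariance for a toy linear model M(x) = H x. All numerical values (H, R, Pb, xp, Y) are made up for the example, and the closed-form minimum used here is only valid because the toy model is linear; it is not the ORCHIS implementation.

```python
import numpy as np

def cost(x, y, H, R_inv, Pb_inv, xp):
    """Misfit function J(x) of Eq. 1 for the toy linear model M(x) = H x."""
    dy = y - H @ x
    dx = x - xp
    return dy @ R_inv @ dy + dx @ Pb_inv @ dx

def posterior_covariance(H, R_inv, Pb_inv):
    """Posterior parameter error covariance Pb' of Eq. 2."""
    return np.linalg.inv(H.T @ R_inv @ H + Pb_inv)

def correlation(cov):
    """Correlation matrix derived from a covariance matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Illustrative (hypothetical) set-up: 3 observations, 2 parameters.
H = np.array([[1.0, 0.5],
              [0.0, 2.0],
              [1.0, 1.0]])                    # Jacobian dM/dx
R = np.diag(np.array([0.1, 0.2, 0.1]) ** 2)   # observation error covariance
Pb = np.diag(np.array([0.5, 0.5]) ** 2)       # prior parameter error covariance
xp = np.array([1.0, 1.0])                     # prior parameter values
y = np.array([1.6, 2.3, 2.1])                 # observations

R_inv, Pb_inv = np.linalg.inv(R), np.linalg.inv(Pb)
Pb_post = posterior_covariance(H, R_inv, Pb_inv)

# For a linear model, the minimum of J is available in closed form.
x_opt = xp + Pb_post @ H.T @ R_inv @ (y - H @ xp)

print("J(prior)    =", cost(xp, y, H, R_inv, Pb_inv, xp))
print("J(optimum)  =", cost(x_opt, y, H, R_inv, Pb_inv, xp))
print("posterior sd =", np.sqrt(np.diag(Pb_post)))
print("posterior correlations =\n", correlation(Pb_post))
```

In the sketch, the posterior standard deviations are smaller than the prior ones, which is the error reduction brought by the observations under the linear and Gaussian assumptions stated above.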
Following the standard rule of error propagation, the corresponding posterior error covariance matrix on the state variables, R'sv, can be expressed as:

\[ R'_{sv} = H_{sv}\, P_b'\, H_{sv}^{t} \]   Eq. 3

where Hsv is the Jacobian matrix of the model relating the parameters to the state variables of interest (fluxes and stocks). Note that this matrix may differ from the matrix H, as the latter represents the sensitivity of the observations (which are not necessarily the state variables) to the parameters. In the following section we briefly recall, for each inversion step, the associated parameters and how the above uncertainty calculation has been implemented.

2.3 CCDAS components and associated error estimation

2.3.1 Land component (step 1 and 2)

The assimilation system component is based on the ORCHIS tool (described in D410.1), which estimates the set of ORCHIDEE parameters x that provides the best fit between model outputs and observations. Its functional description is recalled in Figure 2.

Figure 2: Functional description of the ORCHIS assimilation system for the ORCHIDEE model.

The ORCHIS system can combine several streams of data. At the end of the iterative process minimizing the cost function J(x), a routine written in Fortran 90 computes the posterior uncertainties on the model parameters from Eq. 2. The system has been designed so that any assimilation run requires only simple modifications of a configuration file, with various possible options. The user can tune:

- the number of parameters to optimize, as well as their respective uncertainty and range of variation;
- the number of observations to assimilate, their temporal resolution, as well as their respective uncertainty;
- the main optimization parameters, such as the maximum number of loops, termination criteria, etc.
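The propagation rule of Eq. 3 (section 2.2) can be sketched in a few lines. The matrices below are hypothetical values chosen for the illustration; in the real system Hsv would come from the sensitivity of ORCHIDEE fluxes and stocks to its parameters.

```python
import numpy as np

def propagate_to_state(H_sv, Pb_post):
    """Eq. 3: posterior error covariance on the state variables."""
    return H_sv @ Pb_post @ H_sv.T

# Hypothetical posterior parameter covariance (2 parameters).
Pb_post = np.array([[0.04, 0.01],
                    [0.01, 0.09]])

# Hypothetical sensitivities of a state variable at 3 time steps.
# The first row is a time step where the variable does not respond
# to either parameter, so its propagated error is zero.
H_sv = np.array([[0.0,  0.0],
                 [1.0,  0.5],
                 [2.0, -1.0]])

R_sv = propagate_to_state(H_sv, Pb_post)
sigma_sv = np.sqrt(np.diag(R_sv))   # posterior standard deviation per time step
print(sigma_sv)
```

The square root of the diagonal of R'sv gives a time series of posterior standard deviations; off-diagonal terms carry the error correlations between time steps.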
In the first V0 version of the CARBONES CCDAS, we use the multisite version of ORCHIS, which optimizes a mean set of parameters using information from several measurement sites simultaneously. Consequently, the observation vector is the concatenation of the observation vectors from all the sites, whereas the size of the control vector x depends on the assumptions of "genericity" that are made for the parameters. The following types of parameters can be considered:

- site-specific parameters. These parameters are not considered as generic and therefore vary for each site considered (e.g. the multiplicative factor of soil carbon pools, which is strongly tied to the site history);
- region-generic parameters. They are involved in PFT-independent processes, such as heterotrophic respiration or energy balance. For each of these parameters, the control vector x has as many components as there are regions considered in the optimization;
- region-and-PFT-generic parameters. These parameters describe the behavior of the sites located in the same region and sharing the same PFT (e.g. the maximum leaf area index, the maximum rate of carboxylation, etc.).

We have chosen to use the ORCHIDEE default parameter values as the a priori information xp on the parameters (Bayesian term in the misfit function). Furthermore, each parameter has been assigned a range of variation (minimum and maximum values) in order to restrict the solution to physically acceptable values. For PFT-specific parameters, the range of variation depends on the type of vegetation. The broad ranges of variation (across all PFTs) for each parameter are given in Table 1.
Parameter name        Lower bound   Upper bound
Vcmax_opt             17            140
Fstressh              0.8           10
Humcste               0.2           10
Gsslope               0             15
Tphoto_opt_c          5             57
Tphoto_min_c          -10           18
Tphoto_max_c          18            75
Kpheno_crit           0.5           2
Senescence_temp_c     -5            22
LAI_MAX               1.5           10
SLA                   0.004         0.05
Leafagecrit           30            1110
Klaihappy             0.35          0.7
Tau_leafinit          5             30
LAI_init              0.1           10
Z0_over_height        0.02          0.1
Kalbedo_soil          0.8           1.2
Kalbedo_veg           0.8           1.2
So_capa_dry           0.9*1e6       2.7*1e6
So_capa_wet           1.5*1e6       4.5*1e6
Q10                   1             3
Moistcont_a           -2            0
Moistcont_b           1.8           6
Moistcont_c           0.1           0.6
Moistcont_min         0.1           0.6
KsoilC                0.25          4
Maint_resp_c          0.1           2
Maint_resp_slope_c    0.04          0.48
Frac_growthresp       0.1           0.5
Dpu_cste              0.1           6
Z_decomp              0.05          5
Hcrit_litter          0.01          5

Table 1: Broad range of variations (across all PFTs) of the ORCHIDEE parameters.

Assimilation of satellite NDVI products (step 1):

In the first step we assimilate NDVI products derived from MODIS observations over the 2000-2008 period. In the V0 version of the CCDAS, ORCHIDEE does not embed any radiative transfer model, so we use a simple observation operator to compare ORCHIDEE Leaf Area Index (LAI) outputs to satellite observations. The Fraction of Absorbed Photosynthetically Active Radiation (fAPAR) is derived by ORCHIDEE as a function of the LAI, the latter being computed prognostically by the model at a daily time step:

\[ \mathrm{fAPAR} = 1 - \exp(-0.5 \times \mathrm{LAI}) \]   Eq. 4

NDVI and fAPAR are linearly related. As we are more confident in the seasonal behaviour of fAPAR than in its absolute value, the misfit function only accounts for the normalized values of both NDVI and fAPAR. To avoid the influence of spurious outliers, we reject values below the 5% threshold and above the 95% threshold.
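The observation operator of Eq. 4 and the pre-processing described above can be sketched as follows. Two points are assumptions made for this illustration and are not specified in the text: the 5% / 95% thresholds are interpreted as percentiles of each time series, and the normalization is taken to be a min-max rescaling to [0, 1]. The LAI and NDVI series are synthetic.

```python
import numpy as np

def fapar_from_lai(lai):
    """Eq. 4: fAPAR = 1 - exp(-0.5 * LAI)."""
    return 1.0 - np.exp(-0.5 * np.asarray(lai, dtype=float))

def reject_and_normalize(series, low=5.0, high=95.0):
    """Mask values outside the [low, high] percentile range, then rescale.

    Assumptions: thresholds are percentiles of the series and the
    normalization is a min-max rescaling of the retained values.
    """
    s = np.asarray(series, dtype=float)
    lo, hi = np.nanpercentile(s, [low, high])
    kept = np.where((s >= lo) & (s <= hi), s, np.nan)
    return (kept - np.nanmin(kept)) / (np.nanmax(kept) - np.nanmin(kept))

# Synthetic one-year daily LAI with a single growing season.
days = np.arange(365)
lai = 0.5 + 2.5 * np.sin(np.pi * days / 365.0) ** 2
fapar = fapar_from_lai(lai)

# Synthetic NDVI, linearly related to fAPAR, with one spurious value.
ndvi = 0.2 + 0.6 * fapar
ndvi[100] = 5.0

fapar_norm = reject_and_normalize(fapar)
ndvi_norm = reject_and_normalize(ndvi)
print(np.isnan(ndvi_norm[100]))   # the spurious value is rejected
```

Because only normalized series enter the misfit function, the comparison constrains the timing of the seasonal cycle rather than the absolute level of fAPAR.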
As described in the D410.1 report, the satellite data are used to constrain only a few phenological parameters, whose respective roles in the seasonal cycle are recalled in Figure 3. Compared to the initial description of the system in the D410.1 report, the LAI_MAX (maximum leaf area index) parameter has been replaced in the optimization by the parameter Klaihappy, which controls the value of LAI at which plants stop using their carbohydrate reserves. LAI_MAX has been discarded from the optimization because the assimilation is conducted on normalized data, which loses the information carried by the maximum LAI (or fAPAR) values. As LAI_MAX affects the amplitude of the seasonal cycle, its value would have been changed during the optimization for any marginal improvement in the phase of the model phenology, without any real direct constraint.

Figure 3: Optimized ORCHIDEE phenological parameters.

Finally, as explained in the report describing the CCDAS (D410.1), we have chosen to use only a subset of all possible MODIS measurements for each PFT. Instead of using all grid points with a significant PFT fractional cover, we selected those that have the maximum PFT coverage and that present enough clear-sky measurements to accurately characterize the seasonal evolution of the vegetation activity. Typically we selected around 10-20 pixels for each PFT.

Assimilation of flux measurements (step 2):

Compared to the initial description of the CCDAS in the D410.1 report, we slightly increased the complexity of this step in order to increase the efficiency of the overall system, and in particular of the last global optimization (step 4). Step 2 now combines two types of observations. The first is the standard net ecosystem carbon exchange (NEE) and latent heat flux (LE) measured in situ at a collection of flux tower sites. The data are described in the report D300.1.
We assimilate only daily means rather than the high-frequency measurements. The second type of observation is additional NEE derived from a previous 4D-Var atmospheric assimilation of atmospheric CO2 concentrations (such as described in section 4.4, following Chevallier et al. 2010). The estimated fluxes are used as pseudo-observations, at a daily time step and at the spatial resolution of the atmospheric transport model (LMDz) grid: 3.75°x2.5°. Only a subset of "pseudo-observations" (on the order of 10) has been selected for each plant functional type. We selected the grid cells that are dominated by the considered PFT (following the ORCHIDEE land cover map).

The use of these additional "pseudo-observations" allows pre-optimizing the ORCHIDEE model. The objective is to find a set of parameters that already produces net carbon fluxes that are partly compatible with the seasonal cycle of the atmospheric CO2 concentrations. This approach will help to reduce the number of iterations in the final global optimization (step 4).

Note that for both types of observation, the daily data are further smoothed using a moving average window of ±15 days (for both the model and the observations) in order to remove high-frequency variations in the data that cannot yet be properly captured by ORCHIDEE. Finally, there are on the order of 50 parameters to be optimized, depending on the set-up.

2.3.2 Ocean component (step 3)

The ocean fluxes are computed as:

\[ F_{CO2} = K_{ex} \left( pCO_2^{aq} - pCO_2^{atm} \right) \]   Eq. 5

where pCO2 is the partial pressure of CO2 in the sea surface water (aq) and in the atmosphere at the interface with the water (atm), Kex is the exchange coefficient (piston velocity times solubility), and F is the flux directed to the atmosphere.
In this first V0 version we have developed a statistical model based on a neural network technique to reconstruct the relationship between the surface water pCO2 and a set of variables that are assumed to control its variations to first order (see report D410.1 for more details). These variables are the Sea Surface Temperature (SST), the Sea Surface Salinity (SSS), the Mixed Layer Depth (MLD), and the chlorophyll content (CHL). The atmospheric pCO2 input to the system is taken from an atmospheric inversion model (Chevallier et al. 2010). Many formulations of the exchange coefficient exist in the literature. They all depend on the wind speed, and the spread between them is largest at high wind speeds.

For this particular step, we will not use the classical Bayesian error estimation/propagation. The sea surface water pCO2 calculated from the neural network will be compared to independent pCO2 measurements in order to evaluate the characteristics of the errors, i.e. the shape of the distribution and an associated standard deviation. The neural network will also be used to derive the spatial correlations between these errors.

2.3.3 Atmospheric component (step 4)

The final step consists in the assimilation of atmospheric CO2 measurements to optimize simultaneously the ocean and land fluxes, using prior air-sea fluxes for each grid cell from step 3 and the pre-optimized ORCHIDEE model for the land fluxes (steps 1 & 2). The approach relies on the iterative minimization of a cost function, following the principles of the four-dimensional variational (4D-Var) systems developed for numerical weather prediction [e.g., Courtier et al., 1994]. A single inversion is performed that covers the 20 years at once.
The operator that links the variables to be optimized (i.e., the surface fluxes) and the observations (i.e., the atmospheric measurements) in the inversion scheme is version 4 of the LMDZ transport model [Hourdin et al., 2006], nudged to European Centre for Medium-Range Weather Forecasts (ECMWF) analyses. The scheme is described in more detail in the report D410.1.

Following Chevallier et al. (2007), we rely here on a Monte Carlo approach to estimate the error statistics of the inverted ocean fluxes (the flux of each pixel): these are reconstructed from an ensemble of inversions using synthetic data as input. The ensemble is defined in such a way that it rigorously explores the statistics of the prior errors and of the observation errors. In other words, as the ensemble of inversions grows, the statistics of the corresponding ensemble of observations converge toward the assigned observation error statistics. The same feature applies to the ensemble of prior fluxes, whose statistics converge toward the assigned prior error statistics. By construction, the ensemble of inverted fluxes then follows the theoretical error statistics of the posterior fluxes. This feature will be exploited here with a synthetic 20-year inversion over the same period.

For the errors on the ORCHIDEE parameters, we will use the standard approach described in section 2.2, with the linear approximation for the ORCHIDEE model. The computation of Pb' will be performed with Eq. 2. The H term will combine i) the adjoint of the transport model LMDz, which provides the sensitivity of the cost function to all surface fluxes, with ii) the tangent linear model of ORCHIDEE (first version of CARBONES), which provides the sensitivity of these fluxes to the main parameters. In the future version of the system, we will use the adjoint model of ORCHIDEE (under completion) to derive these sensitivities more efficiently.
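The principle of the Monte Carlo error estimate can be illustrated on a toy linear-Gaussian problem. The sketch below is not the LMDZ-based system: the matrices H, Pb and R are made-up values, and each ensemble member is solved with a closed-form gain instead of an iterative 4D-Var minimization. It only shows that the spread of an ensemble of inversions, fed with priors and observations perturbed according to Pb and R, reproduces the theoretical posterior covariance of Eq. 2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (hypothetical values): 2 fluxes, 3 observations.
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
Pb = np.diag([1.0, 0.5])        # assigned prior error covariance
R = np.diag([0.2, 0.2, 0.4])    # assigned observation error covariance

R_inv = np.linalg.inv(R)
A_theory = np.linalg.inv(H.T @ R_inv @ H + np.linalg.inv(Pb))  # Eq. 2
K = A_theory @ H.T @ R_inv      # gain of the linear analysis

x_true = np.array([2.0, -1.0])  # synthetic truth
n_members = 20000

# Perturb the prior and the observations with the assigned statistics.
prior_err = rng.multivariate_normal(np.zeros(2), Pb, size=n_members)
obs_err = rng.multivariate_normal(np.zeros(3), R, size=n_members)
x_prior = x_true + prior_err
y = H @ x_true + obs_err

# One inversion per ensemble member (vectorized over members).
x_post = x_prior + (y - x_prior @ H.T) @ K.T

# The ensemble covariance of the posterior errors estimates Pb'.
A_mc = np.cov(x_post - x_true, rowvar=False)
print("theory:\n", A_theory)
print("Monte Carlo:\n", A_mc)
```

The agreement improves as the number of members grows, which is the convergence property described above.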
3. Prior error statistics on parameters and observations

3.1 Land component (step 1 & 2)

3.1.1 Prior error statistics on the parameters

For the determination of the prior error variance-covariance matrix on the parameters, Pb, only the diagonal elements (variances) are accounted for, because the error correlations between the various parameters are difficult to estimate. The a priori errors on the parameters are thus assumed to be uncorrelated. Rather large uncertainties have been assigned to each parameter so that the inversion is mainly driven by the observations. The Bayesian term thus has a smaller influence on the retrieved parameter values, but it still ensures the stability of the algorithm towards a proper determination of a unique minimum of the misfit function.

The a priori errors on the ORCHIDEE parameters are determined with a generic approach: for a given parameter, the prior standard deviation is set to 40% of its prescribed definition interval. For the parameters depending on PFT, this implies that the prior errors take distinct values for each PFT considered.

3.1.2 Error statistics on the observations and the model

Only the diagonal elements of the prior variance-covariance matrix on the observations, R, are considered. Whether the assimilation is performed on satellite NDVI products or on in situ flux measurements, we have chosen a rather general approach to define the corresponding prior uncertainties: the a priori estimate of the observation error is set equal to the Root Mean Square Error (RMSE) between the various measurements (site dependent) and the corresponding ORCHIDEE outputs using the a priori (standard) parameterization.
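The two generic rules above translate directly into a few lines of code. In this sketch, the parameter bounds are the broad ranges of Table 1 (the actual PFT-dependent ranges are not reproduced), while the observation and model series are invented for the illustration.

```python
import numpy as np

# Broad ranges (lower, upper) copied from Table 1 for three parameters.
bounds = {
    "Vcmax_opt": (17.0, 140.0),
    "Q10": (1.0, 3.0),
    "Klaihappy": (0.35, 0.7),
}

def prior_sigma(lower, upper, fraction=0.40):
    """Prior standard deviation set to 40% of the definition interval."""
    return fraction * (upper - lower)

sigma_b = np.array([prior_sigma(lo, hi) for lo, hi in bounds.values()])
Pb = np.diag(sigma_b ** 2)   # diagonal prior covariance (no correlations)

def rmse(obs, model):
    """Root mean square error between observations and model outputs."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    return float(np.sqrt(np.mean((obs - model) ** 2)))

# Made-up observations and prior-model outputs for one site.
obs = np.array([1.0, 2.0, 3.0, 4.0])
prior_model = np.array([1.5, 1.5, 3.5, 3.5])

sigma_obs = rmse(obs, prior_model)          # a priori observation error
R = np.eye(obs.size) * sigma_obs ** 2       # diagonal observation covariance
print(dict(zip(bounds, sigma_b)), sigma_obs)
```

Setting the observation error to the prior model-data RMSE folds the model error into R, which is why this section refers to errors on both the observations and the model.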
Note that the assimilation system makes it possible to further scale these a priori uncertainties, considering that our ability to make the model and the various observations match should be consistent with the error statistics used in the assimilation (Gaussian hypothesis).

Assimilation of flux data

The computation of daily means for NEE should, in an ideal case, either give a lower weight to data measured during the night or keep only daytime values. The error associated with night-time measurements is usually higher than that of daytime observations, for several reasons:

- atmospheric stratification resulting from low turbulence (calm nights) may result in an accumulation of CO2 within the canopy, which leads to an underestimation of the night-time net CO2 flux;
- the accumulated CO2 can be released suddenly at dawn, as turbulence increases at the night-day transition, which leads to an overestimation of the net CO2 flux;
- these error sources for the CO2 flux at night also affect the other fluxes.

However, the information content of night-time data on ecosystem heterotrophic respiration is highly valuable, because in the absence of plant photosynthetic activity this signal dominates most of the observations. Given that we choose not to use the information content of the full diurnal cycle (in order to focus on the seasonal time scale), we kept the night-time measurements for the computation of the daily mean fluxes.

3.2 Ocean component (step 3)

We describe in this section how the errors affecting the sea-air fluxes estimated by the ocean component of the CARBONES CCDAS are characterized. To estimate the error on the flux, we start from the flux formula of Eq. 5 and estimate the error coming from each term:

\[ dF_{CO2} = dK_{ex} \left( pCO_2^{aq} - pCO_2^{atm} \right) + K_{ex} \left( d\,pCO_2^{aq} - d\,pCO_2^{atm} \right) \]   Eq. 6

Eq. 6 can be seen as a formula to propagate the errors coming from the terms of Eq. 5 at each time step:

\[ \mathrm{Err\_}F_{CO2} = \mathrm{Err\_}K_{ex} \left( pCO_2^{aq} - pCO_2^{atm} \right) + K_{ex} \left( \mathrm{Err\_}pCO_2^{aq} - \mathrm{Err\_}pCO_2^{atm} \right) \]   Eq. 7

The error Err_Kex will be derived from the spread between different formulations of the exchange coefficient as a function of wind speed (Figure 4). The figure shows that the larger the wind speed, the larger the difference between the formulations. We are thus currently deriving a formulation that expresses the error on Kex as proportional to the wind speed.

Figure 4: Spread of the piston velocity as a function of wind speed across different formulations in the literature.

The errors Err_pCO2(atm) are provided by the atmospheric inversion system and are taken, as a first approximation, to be negligible. The errors Err_pCO2(aq) are estimated from the residual errors of the statistical neural network model. Figure 5 presents the histogram of the differences between the pCO2(aq) computed by the neural network and the raw pCO2(aq) data from the Takahashi database (see more details on these input data in the report D300.1). The histogram indicates that our statistical model produces unbiased estimates, with a distribution centred on zero. The spread of the histogram is also relatively small, with 95% of the data showing errors below 10%.

Figure 5: Performance of the statistical model pCO2(aq) estimator on an independent set of climatological observations. The x-axis corresponds to the relative error; the y-axis to the number of observations.
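The flux formula (Eq. 5) and the error propagation (Eq. 7) of this section can be sketched as below. The numerical values and units are arbitrary and only serve the illustration. Following the text, the atmospheric pCO2 error defaults to zero (taken as negligible in a first approximation). Note that Eq. 7 is a signed first-order propagation; other ways of combining the terms are not discussed in the report and are not shown here.

```python
def flux(kex, pco2_aq, pco2_atm):
    """Eq. 5: sea-air CO2 flux, positive towards the atmosphere."""
    return kex * (pco2_aq - pco2_atm)

def flux_error(kex, pco2_aq, pco2_atm, err_kex, err_pco2_aq, err_pco2_atm=0.0):
    """Eq. 7: first-order propagation of the errors on Kex and pCO2."""
    return err_kex * (pco2_aq - pco2_atm) + kex * (err_pco2_aq - err_pco2_atm)

# Arbitrary illustrative values (not taken from the report).
kex = 0.05          # exchange coefficient
pco2_aq = 380.0     # sea surface water pCO2
pco2_atm = 360.0    # atmospheric pCO2
err_kex = 0.01      # error on Kex (e.g. from the spread of formulations)
err_pco2_aq = 5.0   # error on the neural-network pCO2 estimate

f = flux(kex, pco2_aq, pco2_atm)
err_f = flux_error(kex, pco2_aq, pco2_atm, err_kex, err_pco2_aq)
print(f, err_f)
```

With these values, the Kex term and the pCO2 term contribute comparable amounts to the flux error, which illustrates why both sources need to be characterized.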
3.3 Atmospheric component (step 4): error on CO2 concentrations

In this section, we describe the error statistics that will be applied to the atmospheric CO2 concentrations used in the CCDAS. The location of the different stations is described in the report on the "Input data stream of the CCDAS" (D300.1).

The uncertainty assigned to each observation within the inversion system includes the error of the measurement, the error of the forward model that simulates it from the parameters to be optimized, and the representativeness error (i.e. the mismatch between the scale of the measurement and the scale of the transport model). It is here time-independent: its variance is set to half the variance of the high-frequency variability of the de-seasonalized and de-trended CO2 time series measured at a given station (even for the two stations, CSJ and KOT, where the data are provided as daily averages). The high-frequency variability is calculated following Masarie and Tans (1995). The resulting error varies between a few tenths of a ppm for marine stations and several ppm for continental ones, reaching 6 ppm at the CBW0200 station in the Netherlands. It is thus more than one order of magnitude larger than the measurement error at all stations.

Because of the large temporal error correlations of the transport models that simulate the measurements in the flux inversion systems, the continuous measurements have been further de-weighted by multiplying the observation error by the square root of the number of local data each day. Error correlations are therefore neglected. This reference set-up is well adapted for flux inversion (Chevallier et al. 2010).
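The two rules used to assign the observation error can be written compactly. This is only a sketch under stated assumptions: the high-frequency residuals are taken as already computed (the Masarie and Tans (1995) procedure is not reproduced), the variance is the population variance of those residuals, and the function name and arguments are hypothetical.

```python
import numpy as np

def observation_sigma(hf_residuals, n_per_day=1):
    """Observation error (same unit as the residuals, e.g. ppm).

    hf_residuals: high-frequency variability of the de-seasonalized and
        de-trended CO2 series at one station (assumed precomputed).
    n_per_day: number of local data per day; values above 1 de-weight
        continuous measurements by a factor sqrt(n_per_day).
    """
    variance = 0.5 * np.var(np.asarray(hf_residuals, dtype=float))
    return float(np.sqrt(variance) * np.sqrt(n_per_day))

# Made-up residuals (ppm) for the illustration.
residuals = np.array([2.0, -2.0, 2.0, -2.0])

sigma_flask = observation_sigma(residuals)                    # one datum per day
sigma_continuous = observation_sigma(residuals, n_per_day=4)  # four data per day
print(sigma_flask, sigma_continuous)
```

Inflating the error by sqrt(n) means that n same-day data, treated as uncorrelated, carry together the weight of a single datum with the original error, which is a simple way of compensating for the neglected temporal correlations.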
This reference set-up will be adapted to account for the errors of the ORCHIDEE model, which should further lessen the weight of the observations in the inversion system (by increasing their errors).

4. Estimated error statistics from the model-data fusion: parameters & state variables

As explained in the introduction, the different steps of our model-data fusion approach have not all been completed yet. Steps 1, 2, and 3 are partially completed, while the final step is still in a "test phase". We thus describe below, for each step, only preliminary error estimates.

4.1 Assimilation of satellite NDVI (step 1)

In order to illustrate the estimation of posterior errors on model parameters and model state variables when assimilating NDVI products, we present the results of the multisite inversion for the Boreal Needleleaf Summergreen PFT. Note that because we choose to optimize only ORCHIDEE parameters that are dependent on the PFT, we can conduct a separate optimization for each PFT. The results for all PFTs have only been partially analysed so far, and we briefly discuss the typical results obtained.

The assimilation has been conducted on daily MODIS NDVI products aggregated at 0.7°x0.7° and available for the 2000-2008 period, for 15 pixels. These pixels have been chosen because of their thematic homogeneity with respect to this type of vegetation (fraction of Boreal Needleleaf Summergreen > 50%).

Figure 6 illustrates the error correlation matrix on the optimized phenological parameters (the correlation matrix is directly estimated from the posterior error covariance matrix on parameters Pb' presented in §2.2, Eq. 2). The analysis of the posterior correlation matrix on parameters indicates the level of constraint that the observations used provide on each parameter.
For the combination of satellite observations over several different pixels considered through the multisite optimization, the optimized parameters are only slightly correlated: the maximum correlations between the different parameters are lower than 0.3 in absolute value. The results thus indicate that there is enough information in the remote sensing signal to resolve the different phenological stages that these parameters control.

Figure 6: Posterior correlations between the estimated phenological parameters for the assimilation conducted on satellite NDVI data for the 15 sites located in the Northern Hemisphere.

The propagation of errors in the space of the state variables (fAPAR is chosen as a first example) is illustrated in Figure 7 for two of the pixels that are used in the optimization. We present the temporal variation over the 2000-2008 period of the posterior standard deviation (square root of the diagonal elements of the matrix R'sv, see §2.2, Eq. 3). The posterior uncertainty on simulated fAPAR is rather small (usually below 0.015) compared to the prior observation error, which was set to about 0.4 for all pixels considered. This strong error reduction is a direct consequence of the high number of observations that are used. The posterior error is close to zero in winter, due to the insensitivity of the model to variations of the phenological parameters in this season. The variations of the error during the growing season are mainly associated with the parameter Leafagecrit (controlling the age of leaves), and to a lesser extent with the parameter Klaihappy (controlling the value of the leaf area index after which vegetation stops using carbohydrate reserves).
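The propagation of parameter errors to a state variable can be sketched as follows. Eq. 3 of §2.2 is not reproduced in this section, so the example assumes the usual linear form R = H P H^T, where P is the posterior parameter covariance and H holds the sensitivities of the state variable to each parameter at each time step. All names and numbers are hypothetical.

```python
import math


def state_std(jacobian, param_cov):
    """Posterior standard deviation of a state variable at each time step.

    jacobian: one row per time step, one column per parameter
        (sensitivity of the state variable to the parameter).
    param_cov: posterior error covariance matrix of the parameters.
    Only the diagonal of H P H^T is computed.
    """
    n = len(param_cov)
    stds = []
    for row in jacobian:
        var = sum(row[i] * param_cov[i][j] * row[j]
                  for i in range(n) for j in range(n))
        stds.append(math.sqrt(max(var, 0.0)))
    return stds


# Illustrative sensitivities for two parameters at three dates.
H = [
    [0.0, 0.0],   # winter: the state variable does not respond
    [0.5, 0.1],   # leaf onset: strong sensitivity to the first parameter
    [0.2, 0.3],   # mid growing season
]
P = [[0.0004, 0.0],
     [0.0, 0.0001]]

stds = state_std(H, P)
print(stds)
```

With zero sensitivity the propagated error vanishes, whatever the parameter uncertainty; this is the mechanism behind a posterior error that is close to zero in winter.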
The strong error peaks at the beginning and end of the growing season are attributed to the parameters Kpheno_crit and Senescence_temp_c, which drive respectively the date of leaf onset and the threshold temperature at which leaves enter senescence. The implications of the error reduction on the phenology parameters, after the assimilation of MODIS data, for the error reduction on the estimated carbon fluxes are still under investigation. These results will be presented and discussed in a revised version of this report.

Figure 7: Posterior uncertainties on simulated fAPAR time series for two pixels located in the Northern Hemisphere. The first three digits of the pixel name relate to the latitude coordinate; the last two digits refer to the longitude.

4.2 Assimilation of in situ flux measurements (step 2)

We illustrate in this section the determination of the posterior error on the ORCHIDEE model parameters for one particular case (one PFT), knowing that the optimization for the other PFTs, using as many sites as possible, is still being completed. As noted above for the first step of satellite data assimilation, the optimization for the different PFTs can be conducted separately given that we choose to optimize PFT-specific parameters. The delay in the different optimizations is partly due to difficulties in the completion of the optimization system and in the collection of the FluxNet observations. The results presented below also do not yet account for the additional “pseudo-observations” from a previous atmospheric inversion, as described in section 2.3.1. They are obtained with daily variations of Net Ecosystem CO2 Exchange (NEE) and Latent Heat (Qle) fluxes measured at the sites.
Figure 8: Locations of the measurement sites used for the optimization. The yellow and green latitude bands are the regions used to group the parameters.

The results are presented for assimilations performed with twelve sites representative of the Temperate Broadleaved Summergreen ecosystem, distributed in two predefined regions corresponding to the Northern and Southern hemispheres (Figure 8). All sites have more than 70% of their vegetation represented by temperate deciduous broadleaved forests, the rest of the biomass being C3 grasslands. We illustrate below only the results for the multisite assimilation conducted on the sites located in the Northern Hemisphere region (green region of Figure 8).

The performance of the multisite assimilation approach, in terms of optimized parameter values and corresponding posterior uncertainties, is presented in Figure 9. The results are compared to those of the assimilations conducted on each site separately (single-site versus multisite optimization). The notion of "parameter genericity" varies with the parameters considered. For some parameters (for instance Vcmax_opt, Fstressh, Tphoto_opt_c, SLA, Tau_leafinit), the value estimated by the multisite optimization is an average (within the estimation uncertainty) of the individual values obtained with the single-site assimilation approach. Considering that the model fit to the data after the optimization is very similar in the single-site and multisite approaches (though slightly better in the case of the single-site assimilation), this indicates that a common set of parameter values may be derived for the various sites considered. On the other hand, other parameters show very strong site dependencies that the multisite assimilation fails to resolve: Gsslope (slope of the stomatal conductance), Kpheno_crit, Leafagecrit, Q10, etc.
For the parameters depending on PFT, this can indicate that the type of vegetation actually differs between the sites considered (i.e. different species with different behaviours), due to a strong influence of the local climatic drivers, soil characteristics, physiological properties, and management histories, contrary to our prior assumption that all sites share the same generic PFT concept with the same parameters. Note that this feature can also be related to the numerical assimilation procedure, which can fail to capture the global minimum of the misfit function due to non-linearities and/or limited numerical accuracy. For instance, the sensitivity of the model to the phenological parameters Kpheno_crit and Senescence_temp_c is computed with a finite-difference approach that is less accurate than the use of the tangent linear model of ORCHIDEE (used for the other parameters).

Figure 9: Optimized values of the parameters for the assimilation conducted on the sites of the Northern Hemisphere region. Prior values and uncertainties are in thin black, multisite optimized values and uncertainties in thick black, and the single-site values and uncertainties in thick color.

Figure 10 presents the posterior correlation matrix on some of the optimized parameters. It generally exhibits rather low correlations between the various parameters. One can note strong error correlations between the different site-dependent KsoilC parameters controlling the size of the soil carbon pools.
From a general point of view, the highest correlations are obtained for parameters involved in the same processes. This is for instance the case for Fstressh and Humcste, both governing plant hydric stress, or for the parameters tightly driving photosynthesis (Vcmax_opt, the maximal rate of carboxylation; Gsslope, the slope of stomatal conductance; and Tphoto_opt_c, the temperature at which photosynthesis is optimal). The correlation gauges the level of interaction between the parameters. High correlation values indicate that the corresponding parameters cannot be resolved separately with the observations that are currently used. Such a matrix is important to analyse in order to assess the potential of the overall assimilation process and especially to avoid over-interpreting the results. The correlations also highlight the parameters, and thus the processes, that are not well constrained by the chosen set of observations. In this particular case, they point to the need for specific measurements to optimize separately the parameters controlling the above-ground respiration (growth and maintenance) versus the below-ground heterotrophic respiration. These could be soil chamber measurements.

Figure 10: Posterior correlations between the estimated parameters for the assimilation conducted on the sites of the Northern Hemisphere region.

The propagation of the estimated error on the parameters onto the space of the state variables is shown in Figure 11. It presents the temporal variation of the posterior uncertainty (standard deviation) on the Net Ecosystem CO2 Exchange (NEE) and Latent Heat (Qle) fluxes for four sites (over the specific period for which observations are available at each site).
Once again one can observe a strong reduction of the error, considering that the prior errors on NEE and Qle are on the order of 2 gC/m2/day and 20 W/m2, respectively. Except for the Willow Creek site (US-WCr), the temporal variation of the posterior uncertainty mimics the seasonal cycle for NEE, whereas this feature is less pronounced for Qle. For the latent heat flux, the error exhibits stronger synoptic variations. For NEE, the highest errors are obtained during the peak of the growing season, when photosynthesis is at its maximum. This is linked in particular to the error on the parameters that control the maximum photosynthetic capacity and its dependency on temperature and soil moisture.

These results still need further investigation. In particular we need to derive the errors associated with the annual and seasonal carbon fluxes for individual sites, as in Figure 11, but also for the total carbon flux of a given region. This is the particular objective of step 4 of the current V0 version of the system and it will only be fully achieved in the coming months.

Figure 11: Posterior uncertainties on simulated NEE and latent heat fluxes for four sites located in the Northern Hemisphere region: Hainich (Germany), Soroe (Denmark), Hesse (France), and Willow Creek (USA).

4.3 Assimilation of ocean PCO2 data (step 3)

Step 3 of the sequential CCDAS approach consists in the provision of net air-sea carbon fluxes from an ensemble of observations including in particular PCO2^aq data. Section 3.2 described the principle of the approach and the errors associated with the input fields: the error on the exchange coefficient, Err_Kex, and the error on the partial pressure of CO2 at the surface of the ocean, Err_PCO2^aq.
From the different terms of Eq. 7 (see §3.2), we can derive a covariance matrix of the flux error, V_Fco2, by taking a temporal mean (denoted E) over the error estimates at each time step:

$$ V_{F_{CO2}} = E\left[ \left( Err_{F_{CO2}} - E(Err_{F_{CO2}}) \right)^{T} \left( Err_{F_{CO2}} - E(Err_{F_{CO2}}) \right) \right] \qquad \text{(Eq. 8)} $$

With this approach, we assume a temporally constant error covariance matrix. This matrix characterizes only the spatial covariance of the error on the fluxes. Given that the error estimation on the fluxes is still being completed, we will describe it more comprehensively in a subsequent version of this report.

4.4 Assimilation of atmospheric CO2 data (step 4)

The final step of the overall model-data fusion, i.e. the assimilation of the atmospheric CO2 observations to adjust simultaneously land and ocean fluxes, is not completed yet. We faced some technical problems and the system is currently only in the test phase (see report D410.1 for more details on the reasons for the delay). Thus, we only provide below some typical uncertainties that would arise from this last step. For that purpose, we used the set-up of Chevallier et al. (2010) and performed a classical atmospheric inversion with the same atmospheric CO2 data as constraint and the same transport model (LMDz), but solving for the weekly land and ocean surface fluxes at the resolution of the transport model (3.75 x 2.5 degrees). The first V0 version of CARBONES will only replace the land component by the ORCHIDEE model, solving for model parameters rather than the fluxes themselves. We first describe the uncertainties obtained in this configuration and then discuss the expected modifications with our CCDAS set-up.

Figure 12 illustrates the uncertainties obtained by the system for a 20-year inversion. As explained by Chevallier et al. (2007), the Monte Carlo approach yields an estimate of the degrees of freedom for signal (DOFS) of the observation system.
The DOFS quantifies the number of independent quantities about the fluxes that the inversion system exploits. In this case it is about 400 per year. The distribution of the fractional uncertainty reduction over the globe (Figure 12b and Figure 12c) shows where in space these 400 pieces of information lie. Further, these maps quantify the knowledge brought by the surface measurements on the weekly CO2 surface fluxes for the first (1988–1997) and second (1998–2008) decades of the study. The fractional uncertainty reduction is defined as 1 minus the ratio of the posterior error standard deviation to the prior error standard deviation. A value of 0 indicates that the observations have not added any information to the prior. A value of 1 would be reached if the observations gave a perfect knowledge of the fluxes. The impact of the measurements results from the combination of assigned prior errors, assigned observation errors, observation density, and transport characteristics. It is mostly located in the vicinity of the stations, with values larger than 30% at continuous stations such as those in eastern Canada, South Africa, and Finland. The difference between Figure 12b and Figure 12c mainly reflects the evolution of the network between the two decades, with several stations added to the network.

Figure 12: (a) Quadratic mean of the standard deviation of the errors of the prior weekly fluxes (gC m−2 per day) throughout the 20 years. Expected uncertainty reduction in each grid point provided by surface stations for estimation of 8-day-mean CO2 surface fluxes for the periods (b) 1988–1997 and (c) 1998–2008. The reduction is defined as [1 − (sa/sb)], with sa the quadratic mean of the posterior error standard deviation and sb the prior error standard deviation.

The impact appears to be larger when the fluxes solved in each grid point and 8-day period are aggregated in space and time. Figure 13 presents it at the scale of the widely used 22 TransCom3 regions of Gurney et al. (2002) and for weekly, monthly, and yearly averages for the second decade. In the mid- and high latitudes of the Northern Hemisphere lands, where most stations are located, all regional flux estimates are improved by more than 20% and by up to 60% (North American Boreal, North American Temperate, and Eurasian Boreal regions with annual fluxes). As an example, for the TransCom3 “Europe” region the inversion theoretically reduces the flux uncertainty from 1.0 to 0.6 GtC.yr−1. The figures for the lands in the tropics and in the Southern Hemisphere are between 10 and 30%. Over ocean basins the reduction lies between 0 and 25%.

Figure 13: Expected uncertainty reduction provided by surface stations for estimation of CO2 surface fluxes in the 22 TransCom3 regions for the period 1988–2008. As in Figure 12, the error reduction is defined as [1 − (sa/sb)], with sa the posterior error standard deviation and sb the prior error standard deviation. Results for weekly (blue bars), monthly (red bars), and annual (green bars) fluxes are shown. Note that for annual fluxes, sa and sb are computed on an ensemble of 21 realizations of the yearly errors only.

The inversion spreads a sink of a few GtC.yr−1 over the lands to yield the CO2 growth rate seen by the measurements. This large negative increment varies in space and in time.
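The uncertainty reduction used for Figures 12 and 13 is a simple ratio, and a toy example helps to see why it can grow when fluxes are aggregated. The sketch below first applies the definition to the “Europe” figures quoted above (1.0 to 0.6 GtC.yr−1), then to two hypothetical grid cells for which the observations constrain mainly the sum of the two fluxes. The covariance values and function names are invented for illustration and are not outputs of the inversion.

```python
import math


def uncertainty_reduction(posterior_std, prior_std):
    """Fractional uncertainty reduction, 1 - (sa / sb)."""
    return 1.0 - posterior_std / prior_std


def aggregated_std(cov):
    """Standard deviation of the sum of all fluxes with covariance `cov`."""
    # Var(sum) is the sum of all elements of the covariance matrix.
    return math.sqrt(sum(sum(row) for row in cov))


# "Europe" example quoted in the text: 1.0 -> 0.6 GtC/yr.
europe = uncertainty_reduction(0.6, 1.0)

# Two hypothetical cells: independent prior errors, and posterior errors
# that are negatively correlated because only their sum is well observed.
prior_cov = [[1.0, 0.0],
             [0.0, 1.0]]
post_cov = [[0.6, -0.4],
            [-0.4, 0.6]]

cell_reduction = uncertainty_reduction(math.sqrt(post_cov[0][0]),
                                       math.sqrt(prior_cov[0][0]))
total_reduction = uncertainty_reduction(aggregated_std(post_cov),
                                        aggregated_std(prior_cov))
print(europe, cell_reduction, total_reduction)
```

In this toy case the reduction for the sum of the two cells is clearly larger than for either cell alone, because the negative posterior correlation cancels part of the error in the total.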
As an example of the spatial variability of the increment, the mean budget of the North American Boreal region remains around 0 throughout the years before and after the inversion, while the inversion reduces the North American Temperate budget by a few tenths of a GtC.yr−1 without a noticeable trend and increases a positive trend in the Eurasian Temperate region. However, even at the regional/continental scale the carbon fluxes are still uncertain, with a few significant error correlations between adjacent regions. Figure 14 illustrates the monthly error correlations between the “classical” TransCom3 regions. For example, the European flux increment could well be placed in Eurasia instead of Europe, given the strong negative correlation between these two regions. A larger observation network toward Eastern Europe and Siberia would be needed to resolve this ambiguity. Note that the correlations computed for yearly fluxes were found to be hardly reliable because the size of the ensemble in this case (21 members) is too small, but we expect that they behave similarly to the monthly flux correlations shown here. For more details see Chevallier et al. (2010).

Figure 14: Correlations between the posterior uncertainties of the monthly regional fluxes aggregated at the scale of the 11 TransCom3 land regions. Names are abbreviated.

Expected changes of the error statistics with our CCDAS set-up:

The optimization of ORCHIDEE parameters instead of the biospheric fluxes themselves is expected to have significant impacts both on the land fluxes and on their uncertainty. We summarize below the expected outcomes:

First, the use of a model will avoid the problem of the changing size of the observation network. In the case of an atmospheric inversion, the appearance of a new station usually induces abrupt changes in the spatial distribution of the surface fluxes.
Such artefacts will disappear when solving for parameters that control the fluxes for the entire period, because the optimization of parameters associated with Plant Functional Types (13 different PFTs) induces strong correlations between all fluxes of a given PFT. As a consequence, the spatial distribution of the error reduction, illustrated in Figure 12 for the standard atmospheric inversion, will be much smoother. The magnitude of the error reduction is also likely to be much larger than illustrated above. Indeed, with only around 50 parameters, the number of degrees of freedom of the inversion is much smaller than with the optimization of all grid-cell fluxes (even with spatial prior error correlations). Finally, the error correlations between the flux errors of regions sharing similar PFTs will be much larger than those depicted in Figure 14 for the classical flux inversion. All these changes will be presented and discussed in a revised version of this report.

5. Summary and perspectives

5.1 Summary of the error estimations

This report should be considered as preliminary, given that the different steps are still being completed. More specifically, the final step (system under verification) will only be launched after finalization of the previous steps. We will thus update the current report within the next 6 months. However, the above results and analyses already highlight the potential of the CCDAS version 1, in terms of error estimates on the parameters that are optimized and on the targeted state variables (mainly the carbon fluxes).

For the land component, the error associated with the different ORCHIDEE parameters will likely be rather small given the large number of observational constraints compared to the number of unknown parameters.
This large error reduction on the parameters will induce a large error reduction in the state variables simulated by ORCHIDEE, i.e. the carbon fluxes and stocks of large regions. For the ocean fluxes, our two-step approach, with the computation of prior fluxes from a large set of PCO2 data followed by the final optimization of the air-sea flux of each model grid cell (using the atmospheric CO2 data), is likely to bring a weaker constraint than for the land fluxes (many more unknowns) and thus larger errors. However, the next version of the CCDAS (V1) will move towards the optimization of a few parameters controlling the air-sea fluxes, rather than the fluxes themselves. This will reduce the number of unknowns and thus increase the ratio of observations to parameters.

However, the posterior errors on the estimated parameters are directly linked to the assigned prior errors and to the errors assigned to each data stream. Given that these errors are difficult to estimate, the errors returned by the CCDAS should be interpreted with some caution. As discussed in Santaren et al. (2007), the ratios between the uncertainties of the different parameters are likely to be more robust than the absolute values. As a consequence, the spatial and temporal variations of the errors associated with the fluxes (state variables) are also more robust than the flux uncertainties themselves.

5.2 Future assimilation of biomass data

We now provide an example of the future use of biomass data to constrain more directly the modelled carbon stocks. In this example, we describe a preliminary study with ORCHIDEE, based on the assimilation of aboveground biomass at one site (Le Bray). Le Bray is a maritime pine forest located close to Bordeaux. It was planted in 1970 and some thinning has been done during the past 40 years. Note that it also suffered severe storm damage in 1999. Aboveground biomass data were available for all years between 2004 and 2007.
The ORCHIDEE model was first run to equilibrium using the classical spin-up approach (recycling the meteorology for 1000 years to equilibrate the soil carbon pools), after which the tree biomass was clear-cut. The forest was then grown to its realistic age (40 years). With this simulation we clearly see that the model overestimates the aboveground biomass (Figure 16). In a next step, three model parameters related to the allocation of NPP to aboveground biomass, leaf biomass and root biomass were optimized using the same method as described above. These parameters are the residence time, which describes the turnover rate of woody biomass, and two allocation parameters: r0_opt, which affects the amount of biomass transferred to fine roots, and s0_opt, which affects the amount of biomass allocated to sapwood. Uncertainties on these three parameters were greatly reduced after the optimization (Figure 15). The estimation of the aboveground biomass by the model is significantly improved after the assimilation (Figure 16). The associated errors on both the parameters and the carbon stock (Figure 15 and Figure 16) are largely reduced by the optimization. Such an error reduction, obtained with a small number of observations (one estimate each year), illustrates the potential of the biomass data as an additional model constraint, given that the eddy-covariance measurements and the satellite data carry less information on the partition between aboveground and belowground biomass. These data will thus be considered as the next data stream to assimilate in the future version of the CCDAS.

Figure 15: A priori and a posteriori allocation parameter values and their uncertainties.
Figure 16: Measured and modeled aboveground biomass before and after the optimization. The uncertainties on the biomass state variable are indicated: 2 sigma values are plotted for the prior and posterior model outputs and for the observations.

6. References

Courtier, P., J.-N. Thépaut, and A. Hollingsworth (1994), A strategy for operational implementation of 4D-Var, using an incremental approach, Q. J. R. Meteorol. Soc., 120, 1367–1387, doi:10.1002/qj.49712051912.

Chevallier, F., P. Ciais, T. J. Conway, T. Aalto, B. E. Anderson, P. Bousquet, E. G. Brunke, L. Ciattaglia, Y. Esaki, M. Fröhlich, A. J. Gomez, A. J. Gomez-Pelaez, L. Haszpra, P. Krummel, R. Langenfelds, M. Leuenberger, T. Machida, F. Maignan, H. Matsueda, J. A. Morguí, H. Mukai, T. Nakazawa, P. Peylin, M. Ramonet, L. Rivier, Y. Sawa, M. Schmidt, P. Steele, S. A. Vay, A. T. Vermeulen, S. Wofsy, and D. Worthy (2010), CO2 surface fluxes at grid point scale estimated from a global 21-year reanalysis of atmospheric measurements, J. Geophys. Res., 115, D21307, doi:10.1029/2010JD013887.

Chevallier, F., F.-M. Bréon, and P. J. Rayner (2007), The contribution of the Orbiting Carbon Observatory to the estimation of CO2 sources and sinks: Theoretical study in a variational data assimilation framework, J. Geophys. Res., 112, D09307, doi:10.1029/2006JD007375.

Gurney, K. R., et al. (2002), Towards robust regional estimates of CO2 sources and sinks using atmospheric transport models, Nature, 415(6872), 626–630, doi:10.1038/415626a.

Hourdin, F., et al. (2006), The LMDZ4 general circulation model: Climate performance and sensitivity to parametrized physics with emphasis on tropical convection, Clim. Dyn., 27, 787–813, doi:10.1007/s00382-006-0158-0.
Masarie, K. A., and P. P. Tans (1995), Extension and integration of atmospheric carbon dioxide data into a globally consistent measurement record, J. Geophys. Res., 100(D6), 11,593–11,610, doi:10.1029/95JD00859.

Michalak, A. M., L. Bruhwiler, and P. P. Tans (2004), A geostatistical approach to surface flux estimation of atmospheric trace gases, J. Geophys. Res. Atmos., 109(D14).

Santaren, D., P. Peylin, N. Viovy, and P. Ciais (2007), Optimizing a process-based ecosystem model with eddy-covariance flux measurements: Part 1. A pine forest in southern France, Global Biogeochem. Cycles, 21(2).

Tarantola, A. (1987), Inverse problem theory: Methods for data fitting and parameter estimation, Elsevier, Amsterdam.