GOES-R3 Periodic Reporting

Project Team: Bayesian Cloud Mask
Reporting Period: September 2012 - February 2013
Team Lead: Eileen Maturi, NOAA/NESDIS
Team Members:
Jonathan Mittaz, University of MD, CICS
Christopher Merchant, University of Edinburgh, Edinburgh, UK
Christopher Old, University of Edinburgh, Edinburgh, UK
Claire Bulgin, University of Edinburgh, Edinburgh, UK
Project Title: Development of a Bayesian Cloud Mask for GOES-R
Project Number: 05

Executive Summary

The development of the GOES-R proxy Bayesian cloud screening software is proceeding on schedule. The methodology is consistent with that currently used in the Generalized Bayesian Cloud Screening software for processing GOES-13 over the ocean, but extends the technique to proxy GOES-R channels and to land as well as ocean. Data from the SEVIRI instrument onboard MSG2 are being used as a proxy for the GOES-R instrument.

The Bayesian cloud clearing requires a set of probability density functions (PDFs) that delineate the locus of observations across multiple channels under cloudy conditions. These PDFs have been generated using the proxy GOES-R data. The locus of observations under clear conditions is addressed by on-the-fly simulation. The project is exploiting the fast forward model "RTTOV 11" for this simulation, which will allow efficient demonstration of principle within the project resource constraints. (Alternative simulators, such as "CRTM", may be substituted in any subsequent implementation if required.) RTTOV 11 includes a reflectance atlas which, of the available options tested, has been found to be the most appropriate for defining the land surface reflectivity.

The GOES-R proxy software is currently ready to carry out batch processing of a large set of SEVIRI data to generate the simulation error statistics. Adjustment of simulations using these results will make the Bayesian cloud clearing as accurate as possible.
The next phase of the project will be to use the PDFs and simulation capability for experimental cloud detection, including iterative problem solving to improve the demonstrated capability.

Milestones

(1) Generate the prototype Probability Density Functions defining the conditional probabilities required for the Bayesian cloud screening over land.
(2) Test the new RTTOV11 visible model over land and compare its performance against the existing VISRTM model currently used in the Bayesian cloud screening software.
(3) Implement the RTTOV11 visible model in the GOES-R proxy Bayesian cloud screening software being developed.

07/27/2016 GOES-R3 Status Report Template NESDIS STAR GOES-R

Accomplishments & Plans

Accomplishments:

1) Prototype Probability Density Functions for the clear and cloudy conditional probabilities

The Bayesian statistic used to determine how likely a scene pixel is clear of cloud requires definitions of conditional probabilities of the pixel being clear given the observations and of the pixel being cloudy given the observations. In the Generalised Bayesian Cloud Screening (GBCS) software the spectral conditional probability of the pixel being clear given the observations is constructed on-the-fly using a forward model to simulate the top-of-atmosphere (TOA) radiance and reflectance for the instrument channels. The joint probability distribution is calculated using a function derived on the assumption that the observation and simulated BT errors are Gaussian. The same assumptions cannot be made for the conditional probability of the pixel being cloudy given the observations, nor for the textural conditional probabilities for both the cloudy and clear cases. For these cases empirical probability density functions (PDFs) in the form of look-up-tables (LUTs) are generated from existing observations that have been cloud screened.
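To make the role of the two conditional terms concrete, the following is a minimal sketch, not the GBCS implementation. It reads the clear and cloudy terms as likelihoods of the observation under each class, as in the standard form of Bayes' theorem, and combines them with a prior. The single-channel Gaussian, the function names, the prior of 0.3 and the cloudy LUT value of 0.02 are all assumptions chosen for illustration; the real scheme works on a joint multi-channel distribution.

```python
import math


def gaussian_clear_likelihood(obs_bt, sim_bt, sigma):
    """Likelihood of an observed BT given clear sky, for one channel.

    Assumes the combined observation and simulation error is Gaussian
    with standard deviation sigma (kelvin). Single-channel simplification.
    """
    z = (obs_bt - sim_bt) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_clear(p_obs_clear, p_obs_cloudy, prior_clear):
    """Posterior probability of clear sky from Bayes' theorem."""
    clear_term = p_obs_clear * prior_clear
    cloudy_term = p_obs_cloudy * (1.0 - prior_clear)
    total = clear_term + cloudy_term
    if total == 0.0:
        # Neither class explains the observation: fall back to the prior.
        return prior_clear
    return clear_term / total


# Hypothetical values: the simulated clear-sky BT comes from the forward
# model, the cloudy term from an empirical LUT.
p_clear_obs = gaussian_clear_likelihood(obs_bt=288.0, sim_bt=289.0, sigma=1.0)
p_cloudy_obs = 0.02  # illustrative LUT value, not taken from the report
print(posterior_clear(p_clear_obs, p_cloudy_obs, prior_clear=0.3))
```

When both terms are equal the posterior reduces to the prior, which is a quick sanity check on any implementation.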
The textural PDFs (based on the local standard deviation of BTs in a 3×3 box) are effective in separating cloud and clear scenes from the thermal channel data over the ocean, as the textural structure of the SST fields is very different to that of cloud fields. This relationship needs to be tested over land for both the thermal and visible channels, as the land texture will not necessarily be that different from clouds.

The PDFs currently being used to screen cloud over the ocean for the GOES-13 processing are not sufficient for cloud clearing over the land due to the variability in both the land surface emissivity and reflectance caused by vegetation, soil type, rock, snow/ice, moisture content, etc. Therefore extra information is required in the conditional input to include the effect of the land surface on the thermal and visible radiation emitted/absorbed and reflected.

It was initially suggested that a single collapsible multi-dimensional PDF be generated covering all conditional cases. However, to reliably populate a PDF requires at least 1000 times more pieces of data than the number of cells within the PDF. Given the number of dimensions required to define the large multi-dimensional PDF over land, there were insufficient data available to generate this single PDF. It was agreed that a set of PDFs would be generated based on the IR and visible data separately, and spectral and textural information separately. For the IR textural data separate day/night PDFs were required, as the GBCS code cannot accommodate solar zenith angles greater than 90° (i.e. night time solar angles). The set of PDFs generated is (i) IR Spectral, (ii) Visible Spectral, (iii) IR Textural (Day), (iv) IR Textural (Night), (v) Visible Textural.

When these PDFs were being generated a suitable forward model had not been implemented within the GBCS code for clear scene simulations over land, so empirical PDFs for the conditional probability of clear scene were also constructed.
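The texture measure behind the textural PDFs, the local standard deviation of BTs in a 3×3 box, can be illustrated with a short sketch. The function name, the use of the population standard deviation (dividing by 9) and returning None at image edges are assumptions made here; the report does not state how the GBCS code handles these details.

```python
import math


def local_std_3x3(image, row, col):
    """Standard deviation of the 3x3 box centred on (row, col).

    image is a list of equal-length rows of brightness temperatures.
    Returns None where the full 3x3 box does not fit inside the image
    (edge handling is an assumption of this sketch).
    """
    n_rows, n_cols = len(image), len(image[0])
    if row < 1 or col < 1 or row > n_rows - 2 or col > n_cols - 2:
        return None
    box = [image[r][c]
           for r in range(row - 1, row + 2)
           for c in range(col - 1, col + 2)]
    mean = sum(box) / 9.0
    variance = sum((v - mean) ** 2 for v in box) / 9.0
    return math.sqrt(variance)


# A uniform field has zero texture; one colder pixel raises it.
smooth = [[290.0] * 3 for _ in range(3)]
broken = [[290.0, 290.0, 290.0],
          [290.0, 290.0, 290.0],
          [290.0, 290.0, 281.0]]
print(local_std_3x3(smooth, 1, 1), local_std_3x3(broken, 1, 1))
```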
These clear scene probability distributions are required to define the locus of interest around which the cloudy PDFs are constructed; they also provide an alternative means of testing the clear-sky simulations.

The cloudy PDF dimensions were chosen so as to focus on the region around the core of the clear-sky data cluster. Fine resolution or extended range where cloud is 100% certain is an inefficient use of computing time and look-up-table space. From previous work it has been found that there are no clear scenes where the 11μm channel brightness temperature (BT) is below 260K, so the Bayesian code initially applies a coarse screening to flag a pixel as cloudy if the 11μm BT is less than 260K. The cell size along each dimension was determined by taking a few months of data and finding the spacing that best covered the locus of the observations assuming clear scene. Table 1 lists the parameters used to define each of the PDFs, the range of values for each dimension, the cell size along each dimension, and the number of cells in each dimension.

Within this project component the SEVIRI instrument deployed on MSG2 is being used as a proxy for the GOES-R instrument, as it has a comparable sensor set. One year of SEVIRI data collected at 15 minute intervals during 2009 was used to construct the PDF LUTs. The clear/cloudy scenes were determined using the CMS cloud mask provided with the SEVIRI data.

The land surface type was defined using the International Geosphere/Biosphere Program (IGBP) surface type map. This map uses 18 surface classifications and is a single representative map of the surface type. Given the lack of seasonal variability in the land surface classification it was decided that the number of classifications would be reduced by combining similar classes. A set of six land surface classes was constructed from the 18 used by IGBP, defined as forest, desert, snow, water, urban, and vegetation.
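As an illustration of how a pixel might be placed in a LUT, the sketch below combines the reduced surface classes (following the mapping in Table 2), the coarse 260K screen, and a cell index computed from a range and cell size as listed in Table 1. The function and variable names, the 0-based indexing and the clamping of out-of-range values to the edge cells are assumptions of this sketch, not details taken from the GBCS code.

```python
import math

# IGBP class number -> reduced class name, following Table 2.
IGBP_TO_REDUCED = {}
for igbp in (1, 2, 3, 4, 5):
    IGBP_TO_REDUCED[igbp] = "forest"
for igbp in (6, 7, 8, 9, 10, 11, 12, 14, 18):
    IGBP_TO_REDUCED[igbp] = "vegetation"
IGBP_TO_REDUCED.update({13: "urban", 15: "snow", 16: "desert", 17: "water"})


def coarse_cloud_flag(bt11_kelvin, threshold=260.0):
    """True if the 11 micron BT is below the coarse cloudy threshold."""
    return bt11_kelvin < threshold


def cell_index(value, lower, cell_size, n_cells):
    """0-based LUT cell for value; out-of-range values are clamped
    to the first or last cell (an assumption of this sketch)."""
    idx = math.floor((value - lower) / cell_size)
    return min(max(idx, 0), n_cells - 1)


# 11 micron BT dimension from Table 1: 230K to 350K in 2K cells (60 cells).
bt11 = 287.3
if not coarse_cloud_flag(bt11):
    print(IGBP_TO_REDUCED[16], cell_index(bt11, 230.0, 2.0, 60))
```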
The mapping between this reduced set of classes and the IGBP classes is given in Table 2.

The surface type that will have the largest seasonal impact on cloud clearing over land is snow cover. To minimize this seasonal effect, MODIS data from the MOD10C1 collection, provided by the National Snow and Ice Data Center (NSIDC), were used to identify snow pixels in the SEVIRI imagery. Daily mappings of the MODIS snow data onto the SEVIRI disk were constructed prior to generating the PDFs to significantly reduce the computing time required.

The total number of scene pixels available from the year of SEVIRI data is 4.8×10^11. There were 3.8×10^11 cloud covered pixels and of these 1.9×10^11 were over land. The IR Spectral PDF LUT has the largest number of cells, 2.4×10^7. The other four PDF LUTs had fewer than 5.2×10^5 cells. Therefore there were at least 4 orders of magnitude (10000×) more data than LUT cells. For comparison, the number of cells in a single collapsible PDF based on the dimensions used for the individual PDFs is 7.5×10^16. This is 5 orders of magnitude larger than the data available from a single year of data, i.e. we would need of the order of 100,000 years of data just to have the same number of data points as PDF cells.

For the PDFs to be effective in identifying cloud, the principal directions of the clear and cloudy PDFs need to be misaligned so that the two distributions can be separated. There are too many dimensions to reasonably visually represent the full shape of the PDFs in N-dimensional space. Figures 1 to 4 present examples of the PDFs generated. The PDFs have been collapsed on to two dimensions for the spectral PDFs and one dimension for the textural PDFs. In all cases specific surface types have been selected to show that there is variation in the PDF structure between surface classes.
In all cases there is a measurable difference in the location of the distribution peaks between the cloudy and clear PDFs. However, the separation for the forest surface type is smaller than that for the desert, indicating that some surface classes are going to need more tuning than others to effectively distinguish cloud from clear surface.

A full set of prototype clear and cloudy conditional probability PDF look-up-tables has been constructed using a broad selection of dimensional options. Once the GOES-R Bayesian cloud clearing software is complete, these PDFs will be tested to determine the optimal set of dimensions and parameter ranges to use for cloud clearing over land.

2) Testing of new visible channel forward model (RTTOV 11)

It has been shown by Mackie et al. (2010) that using the data from the visible channels during the day significantly improves the cloud clearing over land. To do this effectively requires a good definition of the land surface reflectivity and a suitable forward model of the visible channel clear-scene top-of-atmosphere (TOA) reflectance. Mackie et al. (2010) constructed a seasonal bidirectional reflection distribution function (BRDF) atlas from MODIS MCD43C1 products, using data from the years 2006 and 2007. In their analysis they used the VISRTM visible channel forward model, developed at the University of Edinburgh (UoE), to calculate the TOA reflectance.

The latest version of the SAF NWP radiative transfer model, RTTOV 11, includes a forward model for the visible channel TOA reflectance. Auxiliary to this visible model is a monthly BRDF atlas generated from MODIS data collected in 2007. The RTTOV11 radiative transfer model will be used in the development of the GOES-R prototype Bayesian cloud screening software. The structure of the underlying generalised Bayesian cloud screening (GBCS) code is such that any forward model can be implemented at a later stage.
Our in-house knowledge of the RTTOV models means that its use in the development of the prototype GOES-R code is the most time-efficient way of optimizing the cloud clearing over land given the project constraints.

The initial testing of the RTTOV 11 visible channel model was carried out in two stages. The first stage was to compare the monthly BRDF atlas supplied with the RTTOV11 software against the seasonal BRDF atlas constructed by Mackie et al. (2010). The second stage was to compare the modelled TOA reflectance values simulated using the RTTOV11 model against those simulated using the VISRTM model (currently implemented in the GBCS code).

The key difference between the two BRDF atlases is the spectral resolution. The seasonal BRDF atlas is constructed using the spectral bands associated with the MODIS channels. In general these spectral bands differ between instruments. The new monthly BRDF atlas takes account of this by using a principal component analysis (PCA) of the USGS hyperspectral measurements database for soils and vegetation surfaces to generate spectral response functions between 0.4μm and 2.5μm. A BRDF spectrum between 0.4μm and 2.5μm is obtained by combining the BRDF values at the 7 MODIS bands with the 6 leading components of the PCA of the hyperspectral data (for details see Strahler et al., 1999).

To compare the BRDF atlases, the TOA reflectance values for clear-scene over land pixels from a single SEVIRI slot (1st June 2009, 12 noon GMT) were simulated using VISRTM with surface reflectivity values taken from the two atlases. Clear-scene pixels were identified using the CMS land mask supplied with the SEVIRI imagery; only cases where the solar zenith angle was less than 80° were considered. ECMWF ERA Interim model fields were used to define the atmospheric state for the forward model. The ERA Interim fields were interpolated onto the pixel location and the forward model run for each pixel using the two BRDF values.
The seasonal BRDF values were interpolated to the day associated with the slot being processed, while the values from the monthly BRDF atlas were simply taken for the month associated with the slot being processed.

The histograms of modelled minus observed data are presented in Figure 5. These show that using the UoE in-house seasonal BRDF atlas on average leads to an under-estimation of the TOA reflectance. The broad width of the histograms associated with the seasonal BRDF atlas indicates that it is providing a poor estimate of the surface reflectance. In contrast, use of the RTTOV11 monthly BRDF atlas leads to a significant improvement in the modelled TOA reflectance, with a near-zero mean difference and significantly narrower distributions. The full-disk images of the modelled minus observed difference for the 0.6μm channel (Figure 6) show that there is no real pattern to where the seasonal BRDF atlas produced an under-estimation, whereas the differences calculated using the RTTOV BRDF atlas show that most of the under-estimations occur next to cloud, resulting in a long negative tail to the distribution. This implies that the cloud mask may be incorrectly classifying scenes next to cloudy areas.

The RTTOV11 monthly BRDF atlas was used in making a comparison between the RTTOV11 visible channel model and the VISRTM model. Again, only clear-scene over land pixels where the solar zenith angle is less than 80° were considered and the same single slot (1st June 2009, 12 noon) was processed. ERA Interim fields interpolated onto the pixel location were used to define the atmospheric state for the two forward models. For comparison, histograms of the modelled minus observed TOA reflectance were calculated from the two models; the results are shown in Figure 7. For the SEVIRI slot processed the RTTOV 11 visible model produced on average a better agreement with the observations for the 0.6μm and 1.6μm channels, i.e.
the mean difference was closer to zero and the histograms were narrower. The histogram for the RTTOV 11 model of the 0.8μm channel was slightly broader than that for VISRTM and the mean slightly further from zero. In general RTTOV11 was overestimating the TOA reflectance in the 0.8μm channel for this particular slot.

To better understand the differences in the model accuracy between channels, two more mid-day slots were processed: 21st March 2009 and 21st December 2009. The histograms of the TOA reflectance differences for the 0.8μm channel are presented in Figure 8. These data indicate that the observed channel differences may be due to model bias. Calculation of the bias statistics will be carried out during the next reporting period.

Overall, the RTTOV11 visible model does as well as or better than VISRTM in calculating the TOA reflectance over land. The main disadvantage of the RTTOV 11 model is that it is significantly slower to run on a per-pixel basis than VISRTM.

3) Implementation of RTTOV11 visible model in GOES-R proxy Bayesian cloud screening software

An earlier version of the RTTOV radiative transfer model has been implemented in the GBCS code for calculating the thermal channel TOA brightness temperatures (BT). To optimise the speed of the image processing of a slot, RTTOV is only run at the atmospheric profile data locations, i.e. at a significantly lower spatial resolution compared with the imagery. To obtain the BT at an image pixel location, the modelled BTs from the four surrounding atmospheric profile locations are interpolated to the pixel location using the forward model tangent linears. A similar approach can be applied to the visible channels if we assume that the atmospheric transmission can be approximated by the TOA reflectance divided by the surface reflectance at each atmospheric profile location.
These transmission values can be interpolated onto the pixel location and the result multiplied by the surface reflectance at the pixel location to reconstruct the local TOA reflectance. This effectively interpolates the atmospheric contribution onto the pixel location; however, we have to assume that the effects of the viewing angles are negligible between profiles.

This method has been implemented in the GOES-R proxy Bayesian cloud clearing code being developed. The time to process a single slot was reduced from 20 minutes to 4 minutes using this method. Further improvements in speed can be achieved by optimizing the way in which the BRDF atlas is accessed, as this still needs to be done per-pixel.

Comparison histograms of the modelled minus observed TOA reflectance for the per-pixel and per-profile processing are presented in Figure 9. There is no noticeable difference between the per-pixel and per-profile results for the 0.8μm and 1.6μm channels. The 0.6μm channel shows a reduction in the height of the peak with a slight broadening of the histogram on the positive side to compensate. The same pattern is seen in the two other slots that were processed. The RTTOV11 visible model includes Rayleigh scattering, which increases with decreasing wavelength, i.e. the greatest impact will be seen in the 0.6μm channel. It is most probable that the small changes observed in the histogram peak for the 0.6μm channel are due to the non-linear nature of this Rayleigh scattering.

The changes in the histogram shapes observed between the per-pixel and per-profile processing are negligible compared to the changes in the mean modelled minus observed TOA reflectance difference found between slots. This indicates that other uncertainties in the processing have a greater impact on the simulated TOA reflectance values.

The RTTOV11 model has been fully interfaced with the GOES-R proxy Bayesian cloud clearing code in such a way that it is compatible with the GBCS software.
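The per-profile approximation described in this section (TOA reflectance divided by surface reflectance at each profile location, interpolated to the pixel, then multiplied by the pixel's surface reflectance) can be sketched as follows. The report only says the values are "interpolated"; the bilinear weighting, the function names and the requirement that surface reflectance is non-zero are assumptions made for this illustration.

```python
def pseudo_transmission(toa_reflectance, surface_reflectance):
    """Approximate atmospheric transmission at a profile location.

    Assumes surface_reflectance is greater than zero.
    """
    return toa_reflectance / surface_reflectance


def bilinear(v00, v01, v10, v11, fx, fy):
    """Bilinear interpolation between four corner values.

    fx and fy are fractional positions (0 to 1) of the pixel between
    the corners in the x and y directions.
    """
    top = v00 * (1.0 - fx) + v01 * fx
    bottom = v10 * (1.0 - fx) + v11 * fx
    return top * (1.0 - fy) + bottom * fy


def pixel_toa_reflectance(corner_toa, corner_surface, fx, fy, pixel_surface):
    """Reconstruct the TOA reflectance at a pixel from four profiles."""
    t = [pseudo_transmission(toa, sfc)
         for toa, sfc in zip(corner_toa, corner_surface)]
    return bilinear(t[0], t[1], t[2], t[3], fx, fy) * pixel_surface


# Hypothetical values: forward-model TOA reflectance and atlas surface
# reflectance at the four surrounding profile locations.
corner_toa = [0.16, 0.24, 0.08, 0.32]
corner_surface = [0.2, 0.3, 0.1, 0.4]
print(pixel_toa_reflectance(corner_toa, corner_surface, 0.25, 0.6, 0.25))
```

Only the surface reflectance lookup remains per-pixel in this scheme, which is consistent with the remark above that BRDF atlas access is the remaining per-pixel cost.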
The interfacing has been fully tested. At this stage the GOES-R proxy software is ready to be run in a batch mode for processing multiple slots, simulating the TOA values for all of the SEVIRI visible and thermal IR channels. The next stage before the Bayesian cloud clearing can be implemented is to determine the model biases and construct the error covariance matrix required by the Bayesian scheme.

References:

Mackie, S., C. J. Merchant, O. Embury & P. Francis, 2010. Generalized Bayesian cloud detection for satellite imagery. Part 2: Technique and validation for daytime imagery. International Journal of Remote Sensing, 31, 2595-2621, doi:10.1080/01431160903051711.

Strahler, A. H., W. Lucht, C. B. Schaaf, T. Tsang, F. Gao, X. Li, J.-P. Muller, P. Lewis & M. J. Barnsley, 1999. MODIS BRDF/albedo product: Algorithm Theoretical Basis Document, NASA EOS-MODIS document, v5.0, 53pp, NASA Goddard Space Flight Cent., Greenbelt, Md.

Additional Information

1. Interaction with operational partners – None
2. Conference/workshop participation
- The development of the GOES-R Bayesian cloud clearing over land will be presented at the ESA Living Planet Symposium, Edinburgh, UK, 09-13 September, 2013.
- The GOES-R Risk Reduction Bayesian Cloud Mask will be presented at NOAA Science Week, 18-22 March 2013.
3. Funding concerns – None
4. Outside project publicity – See presentations
5. Journal articles – None to date

Plans for the next Reporting Period:

There are four components in the next phase of the GOES-R proxy Bayesian cloud screening code development by the University of Edinburgh:

(1) Calculate the visible channel forward model bias and the corresponding error covariance matrix. These data are required by the Bayesian cloud screening scheme. The GOES-R proxy software is ready to be run in batch mode to generate the statistics required to define the bias and error covariance.
A subset of slots from a full year of SEVIRI data will be used to generate these statistics.
(2) Implement and test the full Bayesian cloud clearing scheme using the bias and error covariance data.
(3) Test the prototype clear and cloudy conditional PDFs within the Bayesian cloud clearing software and determine the optimal set of PDF parameters for cloud clearing over land. This will be determined by comparing the cloud clearing with the CMS cloud mask.
(4) Quantify how well the GOES-R Bayesian cloud clearing code is performing at clearing cloud over land. This will be done using a set of expertly screened scenes and CALIPSO cloud data.

The last two components (3 & 4) will effectively run in parallel, as the cloud screening quality needs to be tested to determine the best combination of PDF parameters and dimension ranges.

Once the land component has been completed and the relevant changes to the code have been made, it will be ported over to the NOAA/STAR system and tested there. Once it is working and tested, development of the ocean cloud mask will begin, building on the land cloud mask work and modifications.

Key Graphics

Table 1: Dimensions used for generation of the PDF LUTs. The same dimensions are used for both the clear and cloudy class PDFs.
PDF LUT              Dimension Parameter                    Range          Cell Size   N Cells
IR Spectral          11μm BT                                230K – 350K    2K          60
                     Satellite Zenith Angle                 0° - 60°       10°         6
                     Day/Night Flag                         0° - 180°      90°         2
                     Surface Type                           1–6            1           6
                     11μm - 12μm BT Difference              -5K – 15K      2K          10
                     3.9μm - 11μm BT Difference             -14K – 60K     2K          37
                     8.9μm - 11μm BT Difference             -20K – 10K     2K          15
Visible Spectral     Surface Type                           1–6            1           6
                     Satellite Zenith Angle                 0° - 60°       10°         6
                     0.6μm - 1.6μm Reflectance Difference   -80 – 100      10          18
                     0.8μm Reflectance                      0 – 120        4           30
IR Textural (Day)    Surface Type                           1–6            1           6
                     Day/Night Flag                         0° - 180°      90°         2
                     Satellite Zenith Angle                 0° - 60°       10°         6
                     Solar Zenith Angle                     0° - 90°       5°          18
                     11μm Texture                           0–8            0.02        400
IR Textural (Night)  Surface Type                           1–6            1           6
                     Day/Night Flag                         0° - 180°      90°         2
                     Satellite Zenith Angle                 0° - 60°       10°         6
                     11μm Texture                           0–8            0.02        400
Visible Textural     Surface Type                           1–6            1           6
                     Day/Night Flag                         0° - 180°      90°         2
                     Satellite Zenith Angle                 0° - 60°       10°         6
                     Solar Zenith Angle                     0° - 90°       5°          18
                     1.6μm Texture                          0–8            0.02        400

Table 2: Mapping from the reduced set of surface classes to the IGBP surface classes.

Reduced Class Set    IGBP Class Mapping
Forest               (1) Evergreen needle forest
                     (2) Evergreen broadleaf forest
                     (3) Deciduous needle forest
                     (4) Deciduous broadleaf forest
                     (5) Mixed forest
Desert               (16) Barren / desert
Snow                 (15) Snow / ice
Water                (17) Water
Urban                (13) Urban
Vegetation           (6) Closed shrubs
                     (7) Open shrubs
                     (8) Woody savannah
                     (9) Savannah
                     (10) Grassland
                     (11) Wetlands
                     (12) Crops
                     (14) Crop / mosaic
                     (18) Tundra

Figure 1: Examples of empirical IR spectral cloudy and clear conditional probability PDF LUTs: 11μm vs 11μm - 12μm. Two specific surface types (desert, forest) have been chosen to show that there is variation between surface classes. All other dimensions have been collapsed onto the two displayed.
Figure 2: Examples of empirical IR spectral cloudy and clear conditional probability PDF LUTs: 8.9μm - 11μm vs 11μm - 12μm. Two specific surface types (desert, forest) have been chosen to show that there is variation between surface classes. All other dimensions have been collapsed onto the two displayed.

Figure 3: Examples of empirical visible spectral cloudy and clear conditional probability PDF LUTs: 0.6μm - 1.6μm vs 0.8μm. Two specific surface types (desert, forest) have been chosen to show that there is variation between surface classes. All other dimensions have been collapsed onto the two displayed.

Figure 4: Examples of the empirical textural clear and cloudy conditional probability PDF LUTs for the 11μm thermal channel and the 1.6μm visible channel. Two surface types (desert, forest) have been selected to show variation. All other dimensions have been collapsed onto the texture dimension.

Figure 5: Histograms of the modelled minus observed TOA reflectance values calculated using the interpolated seasonal BRDF atlas values (red line) and the monthly BRDF atlas values (blue line) for the three SEVIRI visible channels. Panels: (a) 0.6μm channel comparison, (b) 0.8μm channel comparison, (c) 1.6μm channel comparison. Positive (negative) difference values correspond to the model over (under) estimating the TOA reflectance.

Figure 6: Full disk images of modelled minus observed 0.6μm TOA reflectances calculated using VISRTM and ERA Interim atmospheric fields, with surface reflectivity taken from (a) the seasonal BRDF atlas and (b) the RTTOV11 monthly BRDF atlas.
Figure 7: Histograms of the modelled minus observed TOA reflectance values calculated using VISRTM (red line) and RTTOV11 (blue line) for the three SEVIRI visible channels. Panels: (a) 0.6μm channel comparison, (b) 0.8μm channel comparison, (c) 1.6μm channel comparison. Surface reflectivity was defined using the RTTOV 11 monthly BRDF atlas. Positive (negative) difference values correspond to the model over (under) estimating the TOA reflectance.

Figure 8: Histograms of the RTTOV11 modelled minus observed TOA reflectance for the 0.8μm channel for three separate slots from different months. Panels: (a) slot 21st March 2009, 12 noon, (b) slot 1st June 2009, 12 noon, (c) slot 21st December 2009, 12 noon. These three histograms show variation in the mean difference over time, suggesting the presence of model bias.

Figure 9: Histograms of the modelled minus observed TOA reflectance values calculated per pixel (red line) and per profile (blue line) for the three SEVIRI visible channels. Panels: (a) 0.6μm channel comparison, (b) 0.8μm channel comparison, (c) 1.6μm channel comparison. Only the 0.6μm channel shows a significant difference in the height of the peak. All observed differences are negligible compared to other biases.