
DeSurvey: Module 1.1 Climate Data

Deliverable 1.1.1.1

Report: Downscaled Climate Data Availability

Prepared by:

University of East Anglia, UK

University of Evora, Portugal

National Observatory of Athens, Greece

Corresponding author:

Tom Holt, Climatic Research Unit, UEA, Norwich, UK t.holt@uea.ac.uk

DeSurvey is funded by the Commission of the European Union under the Framework VI Programme: Global Change and Ecosystems

NB not to be cited or reproduced without prior written permission of the authors

1.0 Introduction

This deliverable addresses the provision of climate data for the Mediterranean as required by other DeSurvey modules. Using a limited number of examples, the methodology for downscaling data to a 1 km grid is described, together with an assessment of how this would be incorporated in the DeSurvey Surveillance System for general application in any drought area anywhere in the world. Although the documentation focuses on the provision of a 1 km grid, coarser-resolution grids could easily be extracted by trivial modification of the code, using the same source datasets.

Note that our definition of the Mediterranean covers the domain shown in Figure 1. This larger area enables more comprehensive consideration of the ambient climatology and keeps boundary biases away from the main area of interest.

Figure 1 Domain of the Mediterranean climate data (12°W to 36°E, 30°N to 48°N)

2.0 Historical data

2.1 Introduction

By historical data, we mean data up to the present day derived from observations, as opposed to future climate projections obtained from numerical models. Ideally, we would start with station observations, evenly distributed throughout the Mediterranean, all of comparable quality and freely available over a common period of time. Unfortunately, this is not possible for the following reasons:

1. The network of available observation stations does not give even spatial coverage of the Mediterranean: it is biased towards low altitudes and level terrain, and coverage is most comprehensive in the countries of the European Mediterranean.

2. Station data can be prohibitively expensive and their use is often restricted.

3. Station data describe conditions at the location of an instrument and may not be representative of the surrounding area.

4. Observations may suffer from inhomogeneities caused by irregularities in observational technique or changes in instrumentation. Bias may also exist because of missing values.

Therefore, we use gridded reanalysis data from the ERA-40 set (Simmons and Gibson, 2000) to perform the downscaling of historical data.

As will be described, we do use station data, but only to validate the downscaling using data which is freely available to DeSurvey partners in Portugal and Greece.


2.2 ERA-40

ERA-40 data have the following advantages over station data:

1. The data are on a grid which is almost evenly distributed in space throughout the globe. Thus, they have little spatial bias and are representative of an area rather than a single location.

2. ERA-40 data are free and carry few restrictions on research use.

3. ERA-40 data have no missing values and undergo limited inhomogeneity checking during the assimilation. However, most reanalysis data are based on assimilated observations from numerous sources, and the coverage changes with time, leading to inhomogeneity, with generally better quality in recent decades than in the early part of the record.

4. ERA-40 data are easily accessible via the Web in the UK, although this may not be the case in other countries.

In addition to the advantages described above, the ERA-40 data do have limitations:

1. ERA-40 precipitation is not based on station measurements, but derives solely from the boundary layer climate of the assimilation model.

2. Gaps in the source observations underpinning ERA-40 are filled by what is essentially a physically consistent form of interpolation. But, no matter how sophisticated the method, missing value replacement always involves error.

3. As with all datasets, ERA-40 data are liable to processing errors, although errors are documented promptly and usually corrected.

4. The last full year of the ERA-40 data is currently 2001.

The following ERA-40 variables have been processed, ready for downscaling on demand to a 1 km grid for locations in the Mediterranean:

• air temperature at 2 m (°C)
• total convective precipitation (mm)
• total large-scale precipitation (mm)
• total evaporation (mm)
• u-component of the 10 m wind (m/s)
• v-component of the 10 m wind (m/s)

The temporal resolution of the data is 6-hourly from 1958 to 2001 (inclusive). This enables examination of a smoothed diurnal cycle. Apart from the precipitation and evaporation data, the ERA-40 variables used here are instantaneous values at 6-hourly intervals. Precipitation and evaporation are forecast accumulations over a 6-hour period.
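Since the wind is supplied as u and v components, users will often need to derive speed and meteorological direction themselves. As an illustration only (the DeSurvey code is written in Matlab), the standard conversion can be sketched in Python:

```python
import math

def wind_speed_dir(u, v):
    """Convert u (eastward) and v (northward) wind components, in m/s,
    to speed (m/s) and meteorological direction: the direction the wind
    blows FROM, in degrees (0 = north, 90 = east)."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction

# A northerly wind (blowing towards the south): u = 0, v = -1
speed, direction = wind_speed_dir(0.0, -1.0)
```

The negated arguments to `atan2` implement the meteorological "from" convention; a westerly (u positive, v zero) correctly yields a direction of 270°.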

The climate model used to assimilate data into the ERA-40 set is a spectral model providing data on spherical harmonics which are interpolated onto grids for analysis purposes. The N80 grid used here is an efficient compressed Gaussian grid, symmetrical about the Equator. By latitude, the data are irregularly spaced at approximate intervals of 1.125°. By longitude, the data are regularly spaced at intervals of 1.125°, but poleward of about 26° the number of grid points along a line of latitude starts to fall from 320 to 18 at 89°. This is to accommodate convergence of lines of longitude away from the Equator.


To simplify presentation and analysis, a routine was written to extract any subset of the global N80 data onto a regular 1° x 1° latitude/longitude grid. This software was used to create the Mediterranean data used in DeSurvey.
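The extraction routine itself is not reproduced here. As a purely illustrative sketch (in Python rather than the Matlab of the actual software, and using a simple nearest-neighbour assignment rather than whatever interpolation the routine applies), regridding scattered reduced-grid points onto a regular grid looks like this:

```python
import numpy as np

def regrid_nearest(src_lat, src_lon, src_val, tgt_lats, tgt_lons):
    """Assign each regular-grid cell the value of the nearest source point.

    src_lat, src_lon, src_val: 1-D arrays of scattered source points
    tgt_lats, tgt_lons: 1-D sequences defining the regular target grid
    Returns an array of shape (len(tgt_lats), len(tgt_lons)).
    """
    src_lat = np.asarray(src_lat, float)
    src_lon = np.asarray(src_lon, float)
    src_val = np.asarray(src_val, float)
    out = np.empty((len(tgt_lats), len(tgt_lons)))
    for i, la in enumerate(tgt_lats):
        for j, lo in enumerate(tgt_lons):
            d2 = (src_lat - la) ** 2 + (src_lon - lo) ** 2
            out[i, j] = src_val[np.argmin(d2)]
    return out

# Toy example: three source points on a diagonal, 3 x 3 target grid
lats = np.array([30.0, 31.0, 32.0])
lons = np.array([10.0, 11.0, 12.0])
vals = np.array([1.0, 2.0, 3.0])  # value tied to latitude here
grid = regrid_nearest(lats, lons, vals, [30, 31, 32], [10, 11, 12])
```

A production routine would use bilinear or higher-order interpolation, but the structure (scattered source, regular target) is the same.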

3.0 Model data

3.1 Introduction

To assess climate change in the future, the best available data come from computer models of climate. These operate on a succession of timesteps, evaluating equations describing atmospheric processes at each step and using the results as input to the next. Results are saved at predetermined intervals as data for particular periods of time. Although the model output available to DeSurvey appears to correspond to normal time, it is important to understand that there is not a one-to-one correspondence between model time and real time. That is, data for 15th March 2001 from the model will not be the same as observed data on that date. This is because the model is a climate model rather than a numerical forecasting model and has to provide projections of climate hundreds of years into the future, rather than for a few days. Thus, there is a trade-off, in terms of computer time, between forecast precision and length of forecast. What climate models do provide are data that are representative of climate at a particular time. So, one can expect the mean climate, and the variability about the mean over a period of time, to be reasonable estimates. For validation purposes, the frequency distribution of model climate should be very similar to the observed frequency distribution.
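One simple measure for such a distributional comparison is the maximum distance between the two empirical cumulative distribution functions (the Kolmogorov-Smirnov statistic). An illustrative Python sketch, not part of the DeSurvey software:

```python
import numpy as np

def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of two samples:
    0 for identical distributions, approaching 1 for disjoint ones."""
    a, b = np.sort(a), np.sort(b)
    allv = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, allv, side='right') / len(a)
    cdf_b = np.searchsorted(b, allv, side='right') / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Model output validates well if its statistic against observations is small
stat = ks_statistic([14.1, 15.3, 16.2], [14.0, 15.5, 16.1])
```

In practice one would compare long daily series of model and observed values, variable by variable and season by season.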

To assess the impact of future climate change, models use scenarios of climate change to influence the model physics. Thus, a storyline of future CO2 emissions is imposed, otherwise the properties of future climate would be similar to today's climate. Various storylines are available, enabling some assessment of likely uncertainty in projections of future climate.

There are two main types of computer model which operate in the way described above:

1. General Circulation Models (GCMs) – these cover the whole globe, typically provide output on a relatively coarse grid (250 km by 250 km, for example), and produce continuous daily data from 1850 to perhaps 2200. GCMs are usually fully coupled, that is, they have realistic synchronous interactive physics between the atmosphere and ocean.

2. Regional Climate Models (RCMs) – these are relatively high resolution models (50 km x 50 km or less) devised to examine the climate of a region in detail. RCMs usually have a simplified ocean to reduce the run time.

Data from RCMs over Europe are considered the best option for downscaling future climate. At the time of writing, the best source of such data is the archive of PRUDENCE (an EU-funded project, completed 2005, see http://prudence.dmi.dk/), containing freely available data from several RCMs. Unfortunately, the PRUDENCE generation of RCMs only provides data for two 30-year periods – 1961-1990 and 2070-2099 – because of limitations on computer resources at the time. Currently, the ENSEMBLES project is producing a similar archive with continuous time series, but these will not be available until year 3 of DeSurvey at the earliest. Thus, the PRUDENCE data can be used to assess future changes in climate over Europe, but cannot show the development of those changes in time. Because the PRUDENCE project focused on the countries of the European Union, most models do not cover the southern and eastern coasts of the Mediterranean. An exception is the Hadley Centre's HadRM3P model, which is therefore chosen as the source of future climate data for DeSurvey.


3.2 HadRM3P

The HadRM3P model (Hudson and Jones, 2002; Hadley Centre, 2002) provides data on a regular 44 km grid for the whole of Europe on a rotated co-ordinate system. The same variables as identified above from the ERA-40 data are available at 6-hourly intervals for the periods 1961-1990 and 2070-2099. We interpolate these onto a regular 0.5° x 0.5° latitude/longitude grid to simplify presentation and analysis. This requires a small adjustment to the downscaling routine but, since this grid is closer to the native grid of the data, it will give greater accuracy than the 1° x 1° grid used for the ERA-40 data.

3.3 Downscaling

3.3.1 Introduction

The procedure commonly referred to as “statistical downscaling” is essentially interpolation using an additional variable to weight the interpolated values, giving higher accuracy. As with all interpolation, there is an error in the interpolated values because it is not possible to “create” data.

Interpolating from a 1° (roughly 115 km) grid to a 1 km grid is particularly difficult since the differences between 1 km grid points will be very small for most climate variables when the initial data are so widely dispersed. Moreover, it is not easy to find an additional variable with the required resolution.

3.3.2 Method

Very valuable properties of spline interpolation are that it provides smoothly continuous 1st and 2nd derivatives and is accurate at the original grid nodes. Thus, whilst preserving accuracy, the method incorporates rate-of-change information that is critical for continuously varying fields such as temperature. Many enhancements have been built on this basic property of splines. For example, Renka (1997) describes a method of implementing spatially-varying tension which reduces the tendency of splines to "undershoot" and "overshoot" between the original nodes. Similarly, in several papers Hutchinson (for example, Hutchinson, 1991) describes the application of thin plate splines to climate data. The thin plate spline is a form of spline under tension where the effect of tension is analogous to pressing a thin sheet of metal over the original data protruding in the z-direction on an x-y grid, giving a minimally bent surface that passes through all points z. Thus, the interpolation is "stiffer" than the other well-known analogy for tension splines – the rubber sheet passing through all points.

All the climate variables of interest to DeSurvey vary to some extent with altitude. Therefore, we investigated the possibility of weighting the thin plate spline interpolation using a topographic database.
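The idea can be illustrated with a hedged Python sketch using SciPy's `RBFInterpolator` with a thin-plate-spline kernel, treating elevation as a third coordinate alongside longitude and latitude. The DeSurvey implementation is a Matlab adaptation of the routines of Billings et al. (2002), so this is an analogy, not the actual code, and the data below are synthetic:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

# Synthetic coarse-grid "observations": lon, lat and an elevation covariate
n = 30
lon = rng.uniform(31.0, 33.0, n)
lat = rng.uniform(39.0, 41.0, n)
elev = rng.uniform(0.0, 1800.0, n)                 # metres, stands in for GLOBE
temp = 25.0 - 0.0065 * elev + 0.3 * (41.0 - lat)   # synthetic temperature field

# Thin plate spline in (lon, lat, elev) space; elevation scaled to km so the
# coordinates are of comparable magnitude
nodes = np.column_stack([lon, lat, elev / 1000.0])
tps = RBFInterpolator(nodes, temp, kernel='thin_plate_spline')

# Key property noted in the text: exact at the original nodes
fitted = tps(nodes)

# Evaluating at a fine-grid cell with its own 1 km elevation gives the
# topography-weighted downscaled value
t_fine = tps(np.array([[32.0, 40.0, 1.2]]))
```

With smoothing left at zero the interpolant passes through every node, mirroring the accuracy-preserving property of thin plate splines described above.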

The GLOBE (Global Land One-km Base Elevation) topography database (GLOBE Task Team, 1999) offers a global topography at 1 km resolution which should be ideal for our purposes. GLOBE is an amalgam of many different datasets with an international team of developers and peer reviewers. According to the manual (Hastings and Dunbar, 1999, and an up-to-date online version at http://www.ngdc.noaa.gov/mgg/topo/report/index.html), there are several sources of error in GLOBE, but at least these are documented. For most of the Mediterranean domain there are not likely to be serious problems with the GLOBE data.

Several studies describe downscaling climate data using topography: for example, Hutchinson (1998a and 1998b) examines precipitation, and Jarvis and Stuart (2001a and 2001b) examine temperature.


Using Matlab, we adapted the routines of Billings et al. (2002) to calculate thin plate smoothing splines from our 1° x 1° ERA-40 data and project the result onto a 1 km x 1 km grid using the GLOBE topography as an added covariate.

The 1 km data make very large datasets and take a long time to compute. For example, to downscale a single ERA-40 variable for the whole of the Mediterranean domain at 6-hourly intervals from 1958 to 2002 would take about 2 months on a modern desktop PC and would create an ASCII output file of about 2,220 GBytes (2.2 TBytes). Clearly, analyzing such a dataset, or incorporating it into an impacts model, would require a supercomputer. To overcome these practical difficulties, we devised a method of analysis which operates as a "black box" Matlab script. The procedure, illustrated by an example, is as follows:

1. Define the latitude and longitude of a node of interest. In this example we use the grid point 40°N 32°E.

2. Determine for how many degrees of latitude and longitude about the central node 1 km data are needed. In the example, we create a domain of 1° about the central node. Figure 2 shows the 1 km topography data for this domain.

Figure 2 1 km topography (m) for 1° about the example node.

3. Run the downscaling routine for the required domain. Figure 3 shows the downscaling (example timestep 7 for surface temperature data) without using topography; Figure 4 shows the effects of incorporating topography into the downscaling.
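The first two steps amount to turning a node and a half-width into a bounding box for the data request. A trivial sketch (illustrative Python, not the DeSurvey Matlab code):

```python
def domain_bounds(lat0, lon0, half_deg):
    """Bounding box (lat_min, lat_max, lon_min, lon_max) of a square
    domain extending half_deg degrees about a central node."""
    return (lat0 - half_deg, lat0 + half_deg,
            lon0 - half_deg, lon0 + half_deg)

# The example node at 40N 32E with a 1 degree half-width gives the
# 39-41N, 31-33E domain shown in Figure 2
bounds = domain_bounds(40.0, 32.0, 1.0)
```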


Figure 3 Downscaled temperatures (timestep 7) without topography

Figure 4 Downscaled temperatures (timestep 7) using topography

To give the example a geographical context, Figure 5 shows the downscaling with topography for 2° about the central node. The clear area to the top left of the map is the Black Sea.


Figure 5 Downscaled temperatures for 2° about the central node.

To give some indication of the practicalities of downscaling to 1 km, Table 1 gives the logistics of producing the above output for a single variable for all timesteps from 1958-2002. There are several points of interest in this table. The size of the output file is given for single precision ASCII output from Matlab. This could be made smaller by using NetCDF format, for example, or by compressing. However, in the worst case, analysis software and impacts models will require ASCII input, so we present the worst case. The time taken is on a 64-bit desktop PC with 1 GByte RAM using Matlab. This could be speeded up by using Fortran (possibly by a factor of 2) at the expense of increased development time, increased likelihood of errors, and lack of graphical support. Using 2 degrees about the central node multiplies the time and file size for 1 degree by a factor of 4. In fact, the time is longer than this because of the reduced memory available with bigger variable sizes. Some improvement is possible by manipulating the size of the data chunks processed at a given time. Note that, to estimate the file size and run time, the square of the number of degrees about the central node is the factor to apply to the 1-degree domain parameters. So, the file size for three degrees would be 14.92 * 9, or 134.28 GBytes, and the downscaling would take at least 4.4 * 9, or 39.6 hours to complete. In practice, the run time would be longer than this.

degrees about central node | size of ASCII output file (GBytes) | time taken to do downscaling (hours)
one                        | 14.92                              | 4.4
two                        | 59.29                              | 29

Table 1 Statistics of downscaling to a 1 km grid
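The scaling rule described above can be written down directly, using the 1-degree baseline from Table 1 (a sketch of the arithmetic only, not DeSurvey software):

```python
# Baseline from Table 1: a 1-degree domain, single variable, 1958-2002
BASE_SIZE_GB = 14.92   # single precision ASCII output
BASE_TIME_H = 4.4      # Matlab on a 64-bit desktop PC with 1 GByte RAM

def estimate(degrees_about_node):
    """Estimated ASCII file size (GBytes) and minimum downscaling time
    (hours). Actual run times are longer for large domains because of
    reduced available memory (see text)."""
    factor = degrees_about_node ** 2
    return BASE_SIZE_GB * factor, BASE_TIME_H * factor

size_gb, hours = estimate(3)   # the three-degree example from the text
```

For three degrees this reproduces the figures quoted above: about 134.28 GBytes and at least 39.6 hours.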

3.4 Validation

Validation has been performed using data from selected stations in Greece and Portugal. Stations were chosen at different altitudes, enabling detailed testing of the effect of including topography in the interpolation. Here we present a subset of the full validation for temperature using data from Greek stations.


Figure 6 shows that downscaled temperatures are acceptably close to observed temperatures at all stations. The rms error of about 2-3°C is actually very good when one considers that the comparison is between interpolated spatial averages and a point measurement.
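The rms error quoted here is the usual root-mean-square of the daily differences between the downscaled and observed series; for reference, a minimal Python version (the validation itself was done with the project's own tools):

```python
import numpy as np

def rms_error(downscaled, observed):
    """Root-mean-square difference between two equal-length series."""
    d = np.asarray(downscaled, float) - np.asarray(observed, float)
    return float(np.sqrt(np.mean(d ** 2)))

# A constant 2 degree offset gives an rms error of exactly 2
err = rms_error([17.0, 12.0, 20.0], [15.0, 10.0, 18.0])
```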

The validation for the other variables is also reassuring, although there is an apparent problem with the precipitation data which may be caused by an error in the creation of the 1° x 1° ERA-40 data. This is under investigation and will take only a few days to fix.

4.0 Module 1.1 Web Pages

It was clear from the AGM that DeSurvey partners would value Web Pages describing the availability of downscaled climate data. Although this is not an explicit DeSurvey deliverable, it is felt that it would aid in the development of the Surveillance System in the longer term, and in the immediate dissemination of climate data. A Web version of this document is in preparation, which will provide links to downloadable data, all data documentation, and validation reports. It will also provide a calculator to enable users to estimate the size of requested downscaled data sets. This should prevent the problem of people requesting data they cannot use because it is too demanding of computing resources. It should also deter speculative requests for data.


Figure 6 Comparison of mean daily downscaled temperatures with station temperatures, 1958-2002, for six Greek stations. The top row of diagrams shows a scattergram of downscaled v. observed temperature; r² ranges from 0.95 to 0.98. The bottom row of diagrams shows the error by day between the downscaled and observed temperatures; the rms error is typically 2-3°C (from Kostopoulou et al., 2006).


5.0 Conclusions

A method of downscaling climate data to 1 km, using topography as an additional covariate in a thin plate spline interpolation, has been developed. The results compare quite well with station data and should give physically consistent results for anywhere in the Mediterranean domain.

Because of the size of the downscaled datasets, it is not practical to downscale for the whole Mediterranean and let users create their own subsets. Instead, data will be provided on demand about a central grid node defined by latitude and longitude co-ordinates.

Although demonstration of the method was limited to ERA-40 reanalysis data, exactly the same procedure can be applied to data from Regional Climate Models to give highly detailed scenarios of future climate.

It is proposed to create a set of Module 1.1 Web Pages to document the available data, to provide links to validation reports, to supply data for download, and to enable potential users to calculate the size of requested datasets.

6.0 Addendum

Although not strictly part of this Deliverable, it is important that all DeSurvey Modules think in terms of providing the Surveillance System.

6.1 Conversion for DeSurvey Surveillance System

The downscaling software system, although implemented in Matlab, will also be converted to R for the Surveillance System. R is free software with the advantage that it can be run either as scripts or as "black box" software. It is thus ideal for the numerical aspects of the Surveillance System, whose specification requires that the software be accessible to non-climate specialists. In fact, a lot of this work has already been done.

6.2 Data availability

Due to license restrictions, we cannot make the ERA-40 N80 data available to future users of the Surveillance System. However, any reanalysis data would be appropriate and, for example, the NCEP Reanalysis data have no restrictions on downloading. Unfortunately, NCEP data are only half the resolution of ERA-40 data, making it more difficult to justify downscaling to 1 km. One alternative might be for UEA to provide 1° x 1° ERA-40 data for regions as needed. This will be investigated.

So far as model data are concerned, the regions covered by Regional Climate Models are very limited. However, the Hadley Centre makes their RCM (PRECIS) available free to developing countries. Since this model can run on a Desktop PC under Linux, it may be possible for countries to develop their own model data. Alternatively, the latest batch of GCMs are running at spatial resolutions close to ERA-40. Since these are global datasets, the way forward may be to downscale GCM data rather than RCM data.


References

Billings, S.D., G.N. Newsam, and R.K. Beatson, 2002: Smooth fitting of geophysical data using continuous global surfaces. Geophysics, vol. 67, no. 6, pp 1823-1834.

GLOBE Task Team and others (Hastings, David A., Paula K. Dunbar, Gerald M. Elphingstone, Mark Bootz, Hiroshi Murakami, Hiroshi Maruyama, Hiroshi Masaharu, Peter Holland, John Payne, Nevin A. Bryant, Thomas L. Logan, J.-P. Muller, Gunter Schreier, and John S. MacDonald), eds., 1999: The Global Land One-kilometer Base Elevation (GLOBE) Digital Elevation Model, Version 1.0. National Oceanic and Atmospheric Administration, National Geophysical Data Center, 325 Broadway, Boulder, Colorado 80303, U.S.A. Digital data base on the World Wide Web (URL: http://www.ngdc.noaa.gov/mgg/topo/globe.html) and CD-ROMs.

Hadley Centre, 2002: The Hadley Centre regional climate modelling system: PRECIS update 2002. Downloadable web document from the UK Meteorological Office: http://www.metoffice.com/research/hadleycentre/pubs/brochures/B2002/precis.pdf

Hastings, David A., and Paula K. Dunbar, 1999: Global Land One-kilometer Base Elevation (GLOBE) Digital Elevation Model, Documentation, Volume 1.0. Key to Geophysical Records Documentation (KGRD) 34. National Oceanic and Atmospheric Administration, National Geophysical Data Center, 325 Broadway, Boulder, Colorado 80303, U.S.A.

Hudson, D.A., and R.G. Jones, 2002: Regional Climate Model Simulations of Present-Day and Future Climates of Southern Africa. Hadley Centre Technical Note 39, Hadley Centre, Met Office, Exeter, UK, 41 pp.

Hutchinson, M.F., 1991: The application of thin plate smoothing splines to continent-wide data assimilation. In: Jasper, J.D. (ed.), BMRC Research Report No. 27, Data Assimilation Systems. Melbourne: Bureau of Meteorology, 104-113.

Hutchinson, M.F., 1998a: Interpolation of Rainfall Data with Thin Plate Smoothing Splines – Part 1: Two Dimensional Smoothing of Data with Short Range Correlation. Journal of Geographic Information and Decision Analysis, vol. 2, no. 2, pp 139-151.

Hutchinson, M.F., 1998b: Interpolation of Rainfall Data with Thin Plate Smoothing Splines – Part 2: Analysis of Topographic Dependence. Journal of Geographic Information and Decision Analysis, vol. 2, no. 2, pp 152-167.

Jarvis, C.H., and N. Stuart, 2001a: A Comparison among Strategies for Interpolating Maximum and Minimum Daily Air Temperatures. Part 1: The Selection of "Guiding" Topographic and Land Cover Variables. Journal of Applied Meteorology, 40, pp 1060-1074.

Jarvis, C.H., and N. Stuart, 2001b: A Comparison among Strategies for Interpolating Maximum and Minimum Daily Air Temperatures. Part 2: The Interaction between Number of Guiding Variables and the Type of Interpolation Method. Journal of Applied Meteorology, 40, pp 1075-1084.

Kostopoulou, E., C. Giannakopoulos, P. Le Sager, H. Flocas, T. Holt, B. Psiloglou, and M. Hatzaki, 2006: Comparison of ERA-40 reanalysis downscaled temperature with observational data over Greece. Poster presentation to the EGU, Vienna, April 2006.

Renka, R.J., 1997: Algorithm 773: SSRFPACK: Interpolation of scattered data on the surface of a sphere with a surface under tension. ACM Trans. Math. Softw., 23, 3.

Simmons, A.J., and J.K. Gibson, 2000: The ERA-40 Project Plan. ERA-40 Project Report Series No. 1. Downloadable Web document available from ECMWF: http://www.ecmwf.int/publications/library/ecpublications/_pdf/era40/ERA40_PRS_1.pdf
