SUNY Albany Trajectory Model
Matthew Janiga
Version 1.0
October 20, 2011

Contents
1. Introduction
2. Setting up a run
2.1. Setting up the main input grid file
2.2. Setting up the auxiliary grid file
2.3. Setting up the parcel coordinate file
2.4. Setting up the run file
3. Examples
4. Reading an output file

1. Introduction and Computation

The regional 3D kinematic trajectory model discussed here is based on the TRAJ3D (Noone and Simmonds, 1999) and FLEXTRA (Stohl et al., 1995) models. The goal was to produce a trim, computationally accurate, and user-friendly program for computing 3D kinematic trajectories from either numerical simulation or reanalysis data. The program is written in Fortran 90 and reads and writes NetCDF files directly through the Unidata NetCDF library (http://www.unidata.ucar.edu/software/netcdf/). NetCDF was chosen because of the ease with which many input data formats (GRIB, WRF output, etc.) can be converted to NetCDF using the NCAR Command Language (NCL). This keeps SIMTRAJECTORY itself small and portable. In addition, NCL provides an excellent interface for plotting and analyzing the output NetCDF trajectories.

The program requires four input files to run. The first is the grid file, which contains u, v, and ω. The grid file is used to compute the kinematic trajectories and to determine the position of the trajectories in time-lev-lat-lon space. The auxiliary file contains n variables, which may be used to determine additional properties along a trajectory. These may be 2D (rain rate, MSLP, etc.) or 3D (potential temperature, mixing ratio, etc.) but must have the same coordinates as the grid file. The trajectory file contains the time, pressure, latitude, and longitude of the initial position of each trajectory to be computed (and, implicitly, the number of trajectories).
A namelist file specifies the location of these input files on disk, along with computational and input parameters such as the time step. The time step for computing the trajectories is specified in seconds; a positive (negative) time step produces a forward (backward) trajectory.

The choice of interpolation is an important decision in trajectory models because it strongly affects both the smoothness of the trajectories and the computational time. The following interpolation options are available for each dimension:

Horizontal: bicubic or bilinear
Vertical: linear or cubic polynomial
Time: cubic or linear

Fig. 1) Illustration of bicubic and bilinear interpolation for a given 4x4 grid. http://en.wikipedia.org/wiki/Bicubic_interpolation

Bicubic interpolation is strongly encouraged for the horizontal dimension unless the run is a climatological study and computational power is at a premium. The difference between bicubic and bilinear interpolation is shown in Figure 1.

The use of cubic interpolation in time sets this model apart from most trajectory models. It requires supplying 1 "padding element" in time, but for most research applications this is not a major difficulty, and the extra computation is not really a hindrance on current computers. The result is a much smoother time series for the trajectories. Cubic interpolation in time is encouraged for reanalysis data, where the input times are widely spaced. For a numerical simulation, the same problem can instead be solved by using linear interpolation in time and increasing the time resolution of the grid file.

Vertical interpolation can be done using either linear interpolation or a cubic polynomial. Linear interpolation should only be used if the vertical levels are evenly spaced. Otherwise, use the cubic polynomial interpolation, which uses Neville's algorithm (see pp. 118-120 of Numerical Recipes by Press or http://en.wikipedia.org/wiki/Neville%27s_algorithm).
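As a rough illustration of the cubic polynomial option, the following Python sketch applies Neville's algorithm to four unevenly spaced pressure levels. It is not taken from the model's Fortran source; the function name `neville`, the use of exactly four surrounding levels, and the sample values are assumptions made for this example.

```python
def neville(xs, ys, x):
    """Evaluate at x the polynomial passing through all points (xs[k], ys[k]).

    Four points give a cubic polynomial; the xs need not be evenly spaced.
    """
    p = list(ys)  # p[i] starts as the degree-0 interpolant through point i
    n = len(xs)
    for m in range(1, n):       # m = degree of the polynomials built in this pass
        for i in range(n - m):  # combine neighbours so p[i] spans points i..i+m
            p[i] = ((x - xs[i + m]) * p[i] + (xs[i] - x) * p[i + 1]) / (xs[i] - xs[i + m])
    return p[0]


# Hypothetical example: potential temperature (K) on four unevenly spaced
# pressure levels (Pa), interpolated to a parcel at 72500 Pa.
levels = [92500.0, 85000.0, 70000.0, 60000.0]
theta = [296.0, 300.0, 308.0, 314.0]
theta_parcel = neville(levels, theta, 72500.0)
```

Because the recurrence only uses differences between the level values, the same routine works unchanged for any level spacing, which is why it suits irregular pressure levels.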
Reading the data from the NetCDF files is the main bottleneck. This cost is greatly reduced if the NetCDF files are uncompressed, although uncompressed files use more disk space. It would be trivial to modify the program so that a large part of the grid file is stored in memory. This would greatly increase the speed of the program and would be ideal for climatological studies.

Time steps are computed along a great circle using the haversine formula and an iterative process that accounts for changes in the wind between t0 and t1. This iteration follows Petterssen (1940) and is a standard feature of trajectory models. First, the position at t1 is estimated from the 3D wind at t0; this is a "zero-acceleration" or first-order estimate, which does not account for the change in wind during the time step (see Stohl et al., 1998 for an excellent review of trajectory methods):

$$X^{1}(t_1) \approx X(t_0) + (\Delta t)\,\frac{\partial X}{\partial t}(t_0)$$

The wind at t0 is then averaged with the 3D wind estimated at this first-guess position at t1:

$$X^{2}(t_1) \approx X(t_0) + \frac{(\Delta t)}{2}\left[\frac{\partial X}{\partial t}(t_0) + \frac{\partial X^{1}}{\partial t}(t_1)\right]$$

and, in general, for iteration i:

$$X^{i}(t_1) \approx X(t_0) + \frac{(\Delta t)}{2}\left[\frac{\partial X}{\partial t}(t_0) + \frac{\partial X^{i-1}}{\partial t}(t_1)\right]$$

The process continues until convergence occurs, $X^{i} \approx X^{i-1}$, where X is the position, or until a pre-defined maximum number of iterations has been performed. One advantage of the Petterssen method is that convergence is assured. This iterative process is a "constant acceleration" or second-order numerical estimate.

Even though the trajectories are constrained to constant acceleration within a time step, the use of cubic interpolation in time allows for "variable acceleration" between analysis times. In other words, the variable acceleration between analysis times is to some extent accounted for by the constant acceleration changing magnitude from one time step to the next (provided the time step is smaller than the interval between analysis times).
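The Petterssen iteration described above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the model's Fortran code: the position is a single scalar rather than a point on the sphere, the wind comes from a hypothetical `velocity(x, t)` function instead of interpolated grid data, and the names `petterssen_step`, `max_iter`, and `tol` are invented for this example.

```python
def petterssen_step(x0, t0, dt, velocity, max_iter=10, tol=1e-9):
    """Advance a parcel from x0 at time t0 to time t0 + dt.

    velocity(x, t) is a user-supplied function returning the wind at
    position x and time t (hypothetical; the model interpolates a grid).
    """
    v0 = velocity(x0, t0)
    x1 = x0 + dt * v0                      # zero-acceleration first guess
    for _ in range(max_iter):
        v1 = velocity(x1, t0 + dt)         # wind at the current guess, at t1
        x_new = x0 + 0.5 * dt * (v0 + v1)  # average of start and end winds
        converged = abs(x_new - x1) < tol
        x1 = x_new
        if converged:
            break
    return x1


# A wind that increases linearly in time, u = 2 t: the averaged estimate is
# exact here, whereas the first guess alone would not move the parcel at all.
x_end = petterssen_step(0.0, 0.0, 3.0, lambda x, t: 2.0 * t)

# A negative time step integrates backward, as for a backward trajectory.
x_start = petterssen_step(x_end, 3.0, -3.0, lambda x, t: 2.0 * t)
```

In the model itself each position update is taken along a great circle rather than by simple addition, but the structure of the loop (first guess, averaged wind, convergence test, iteration cap) is the same idea.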
However, in the end, experimenting with the time step and using trajectory ensembles are really the only ways to be sure the trajectories are representative.

2. Setting up a Run

This section describes the steps necessary to produce a run using SIMTRAJECTORY. At this time there are strict limitations on the structure of the NetCDF files input to the program. All variable names in this section are CASE SENSITIVE.

2.1 Setting up the gridfile

netcdf test_grid {
dimensions:
    time = 13 ;
    lev = 19 ;
    lat = 145 ;
    lon = 301 ;
variables:
    float u(time, lev, lat, lon) ;
        u:long_name = "zonal_wind" ;
        u:units = "m s^-1" ;
        u:_FillValue = 99999.9f ;
    double time(time) ;
        time:calendar = "standard" ;
        time:units = "seconds since 2006-08-04 00:00:00" ;
    float lev(lev) ;
        lev:long_name = "pressure" ;
        lev:units = "Pa" ;
    float lat(lat) ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
    float v(time, lev, lat, lon) ;
        v:long_name = "meridional_wind" ;
        v:units = "m s^-1" ;
        v:_FillValue = 99999.9f ;
    float w(time, lev, lat, lon) ;
        w:long_name = "pressure_vertical_velocity" ;
        w:units = "Pa s^-1" ;
        w:_FillValue = 99999.9f ;
    float z(time, lev, lat, lon) ;
        z:long_name = "height" ;
        z:units = "m" ;
        z:_FillValue = 99999.9f ;

// global attributes:
        :comments = "Matthew Janiga (matthew.janiga@gmail.com)" ;
        :source = "/ct13/janiga/data/amma_reanalysis/" ;
        :creation_date = "Sun Mar 27 15:25:53 UTC 2011" ;
}

Fig. 1) Example header of a 0.5° gridfile from August 2006 over Africa and the Atlantic.

The structure of the grid file must match that shown in the above ncdump of test_grid.nc, with the exception of the global attributes, which are for the user's personal use. The grid file requires four 4-D variables (u, v, w, z) and four 1-D coordinate variables (time, lev, lat, lon). The dimensions of the 4-D variables must follow this order, and the dimension names cannot deviate from those shown.
All 4-D variables must have coordinate variables that increase monotonically; time, lat, and lon must be uniformly spaced, but lev may be variably spaced. This requires that the horizontal grid be a regular latitude-longitude (Cartesian) grid; wrap points are not handled at this time. Although lon must be continuously increasing, negative values are allowed; longitudes cannot, however, exceed 360.0°. The attribute "_FillValue" must be set regardless of whether there are missing values. If missing values are encountered, interpolation is not performed, the position at that time is set to missing, and the trajectory in question is terminated.

The lat and lon variables must be in degrees and of type float. The lev variable must have units of Pa and be of type float. The time variable must be of type double and have units of seconds since the first element in the dimension.

2.2 Setting up the Auxiliary File

The structure of the auxiliary file must be the same as that of the grid file. Variables in the auxiliary file may be either 3D (time, lat, lon) or 4D with the same structure as the grid file. Variables whose names and dimensions match those specified in the namelist file are used; all others are ignored.

2.3 Setting up the Trajectory File

The trajectory file contains several arrays of length n, where n is the number of trajectories to be computed. The trajectory file must contain the following arrays:

1.) time: the initial time in seconds since t0 in the grid file (double)
2.) pres: the initial pressure in Pa (float)
3.) lat: the initial lat in degrees north (float)
4.) lon: the initial lon in degrees east (float)

The lat and lon arrays must follow the same convention as those in the grid file. For example, 0:360 and -180:180 are both valid longitude ranges for a grid file; however, a 0:360 grid file cannot be accompanied by trajectory file longitudes less than 0, and wrapping is not handled at this time (polar data is not handled either).
Trajectory points within 2 elements of the edge of the domain in x, y, or t (i.e., points below element 2 or above element nmax-1, counting up from 1) are allowed but are not optimal for the tricubic interpolation. In other words, the x, y, and t dimensions should each have at least 1 padding element. The initial positions are not required to coincide with elements of the grid file. The figure below shows an example trajectory file containing six trajectories.

netcdf traj_array {
dimensions:
    traj = 6 ;
variables:
    float time(traj) ;
        time:units = "seconds since 2006-08-04 00:00:00" ;
    int traj(traj) ;
    float lev(traj) ;
        lev:units = "Pa" ;
    float lat(traj) ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
    float lon(traj) ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;

// global attributes:
        :comments = "Matthew Janiga (matthew.janiga@gmail.com)" ;
        :creation_date = "Mon Apr 4 18:41:49 UTC 2011" ;
data:

 time = 31600, 31600, 31600, 31600, 31600, 31600 ;

 traj = 1, 2, 3, 4, 5, 6 ;

 lev = 72500, 72500, 72500, 72500, 72500, 72500 ;

 lat = -3.2, -2.2, -1.2, 1.2, 2.2, 3.2 ;

 lon = -3.2, -2.2, -1.2, 1.2, 2.2, 3.2 ;
}

Fig. 2) Example header and contents of a trajfile from August 2006 over Africa and the Atlantic. This trajectory array corresponds to a SW-NE oriented line of trajectories initiated at the same level and time.

2.4 Setting up the Namelist File

The namelist file must have the extension .nml; it provides input to the program that is not contained in the NetCDF files. The program is run with the command line input

    simtraj my_namelist.nml

Some useful debugging information is printed to the screen.

&PARAMS
    gridfile = 'input/test_grid.nc'
    auxfile = 'input/aux_grid.nc'
    trajfile = 'input/traj_array.nc'
    outfile = 'output.forward.nc'
    naux = 2
    aux_dim_t = 4,3
    aux_name_t = 'pv','mslp'
    nstep = 200
    time_step = 1800
/

The namelist file specifies the location and name of the three input NetCDF files and of the output NetCDF file.
The other variables are:

naux: the number of auxiliary variables
aux_dim_t: the number of dimensions (3 or 4) of each auxiliary variable
aux_name_t: the name of each auxiliary variable in the NetCDF file
nstep: the maximum number of time steps
time_step: the time (s) that is multiplied by the 3D wind to determine the following position

3. Program Structure (needs work)

For computational expediency, the trajectories themselves are stored in memory. This places some limits on the number of trajectories and time steps that can be computed in one run; however, on most computers the limit is on the order of thousands of time steps and trajectories. Furthermore, without considerable additional coding, the output files could be used as restart files to circumvent this limitation.

Trajectories are advected by using the u and v wind to determine the subsequent lat and lon after the length of time specified in time_step. This is done by solving the haversine formula, which relates two pairs of lat and lon points to a distance along a great circle and a bearing. The subsequent pressure is the current pressure plus the pressure vertical velocity times the time step.

In the script comp_simtraj.sh, the gfortran flag "-fdefault-real-8" is set, which makes real variables 64-bit doubles by default.

4. Examining Output (needs work)

5. Comparison to Other Trajectory Models

This trajectory model is designed with synoptic and mesoscale meteorology in mind, not air pollution. However, it is useful to compare the numerical methods used in this model with those used in other models.
The following table shows the numerical methods used in some popular trajectory models (see also Stohl et al., 2001).

SUNY Albany V1.0 (as of 2011)
    Horizontal coordinate system: Mercator
    Vertical coordinate system: Pressure
    Spatial resolution: Variable
    Time resolution: Variable
    Integration method: N steps, Petterssen (1940)
    Time step: Variable
    Horizontal interpolation: Bicubic or bilinear
    Vertical interpolation: Cubic polynomial or linear
    Time interpolation: Linear or cubic
    Vertical velocity: Omega

SUNY Albany V1.1 (planned)
    Horizontal coordinate system: Polar stereographic, Mercator, or Lambert conformal
    Vertical coordinate system: Pressure or sigma
    Spatial resolution: Variable
    Time resolution: Variable
    Integration method: Petterssen (1940), based on a convergence criterion
    Time step: Variable
    Horizontal interpolation: Bicubic or bilinear
    Vertical interpolation: Cubic polynomial or linear
    Time interpolation: Linear or cubic
    Vertical velocity: Omega

Anatha's Program (as of 2011)
    Horizontal coordinate system: Mercator, only global data
    Vertical coordinate system: Log pressure
    Spatial resolution: Variable
    Time resolution: Variable
    Integration method: First order, no Petterssen iteration
    Time step: Variable, but only backward
    Horizontal interpolation: Bicubic
    Vertical interpolation: Linear
    Time interpolation: Linear
    Vertical velocity: Omega

FLEXTRA V3.1 (as of 1999)
    Horizontal coordinate system: Mercator or polar stereographic, depending on latitude
    Vertical coordinate system: η
    Spatial resolution: Variable
    Time resolution: Variable
    Integration method: Petterssen (1940) until convergence
    Time step: Flexible, based on local Courant number
    Horizontal interpolation: Bicubic
    Vertical interpolation: Quadratic polynomial
    Time interpolation: Linear
    Vertical velocity: Determined from spherical harmonics

HYSPLIT V4 (as of October 2009)
    Horizontal coordinate system: Polar stereographic, Mercator, or Lambert conformal
    Vertical coordinate system: Variable
    Spatial resolution: Variable
    Time resolution: Variable
    Integration method: Petterssen (1940) ?
    Time step: ?
    Horizontal interpolation: Bilinear
    Vertical interpolation: ?
    Time interpolation: ?
    Vertical velocity: ?

LAGRANTO (?)

6. Future Work

1. Make it easier to change the interpolation scheme.

2. Replace the bicubic subroutine with one that uses a lookup table.

3. Include the ability to change the time chunk read in each time. For instance, read in 100 time steps at once so that all calculations can be performed in memory.

4. Include the ability to handle global data, including wrap points and calculations at the poles. Use the Taylor (1996) code used by FLEXTRA/FLEXPART and HYSPLIT to allow the native coordinate to be Mercator, Lambert conformal, or polar stereographic. From the FLEXTRA documentation:

Special considerations have been given to the poles. Since, in a latitude/longitude grid, the wind direction at the poles is not defined, additional fields of the horizontal wind are created. They are defined on the same latitude/longitude grid, but contain the wind components transformed to a polar stereographic projection. For the trajectory calculations, north and south of some parallel of latitude (definable by the user via parameters switchnorth and switchsouth, e.g., ±75°), FLEXTRA switches back and forth between the latitude/longitude grid and the polar stereographic one. That means, the latitude/longitude grid remains the reference grid, but for advecting the trajectory, the polar stereographic projection grid is used intermediately.
For this, the wind is transformed to the polar stereographic coordinates, but is still given at their original positions on the latitude/longitude grid. For the coordinate transformations, the FORTRAN routines described by Taylor (1997) are used.

http://zardoz.nilu.no/~andreas/flextra/flextra3.html

5. Make the number of Petterssen iterations depend on convergence.

6. For computing trajectory climatologies, the only feasible method would be to read a large period (6-90 days) into memory and perform the calculations in memory. Stohl's warm trajectory paper computes 350 million 6-day trajectories. Without this optimization, this would require reading in 1x10^16 data points, compared to about 1x10^11 points if everything is done in memory. For dozens to several thousand

FLEXPART for WRF code: https://github.com/sajinh/flx_wrf2/tree/master/src_gfortran2