BMEGUI Tutorial 6

Mean trend and covariance modeling
1. Objective
Spatial research analysts/modelers may want to remove a global offset (called mean trend
in BMEGUI manual and tutorials) from the space/time random field Z(s,t) at spatial
location s and time t, and use the detrended (residual) data X(s,t) for the subsequent
geostatistical analysis. Consider the following relationship:
Z(s,t) = µ(s,t) + X(s,t)     (1)
where Z(s,t) represents the field of interest, µ(s,t) is a deterministic global offset and
X(s,t) is a spatially autocorrelated residual space/time random field. Removing the global
offset µ(s,t) from the field Z(s,t) is optional and depends on the choice of the modeler.
Once the global offset is removed, the geostatistical analysis is performed on the residual
field X(s,t) = Z(s,t) - µ(s,t), which yields an estimate of the residual value X(s_k,t_k) at an
unsampled point (s_k,t_k). To obtain the final prediction, the global offset µ(s_k,t_k) is
added back to the residual estimate: Z(s_k,t_k) = µ(s_k,t_k) + X(s_k,t_k). If the
modeler can identify and quantify a meaningful global offset, this offset explains a
portion of the variability of the raw data Z(s,t), and the residual data are expected to have
lower variability, which can make the geostatistical analysis of the residual field X(s,t)
more successful. However, there is a real danger of overfitting the data when deriving
the global offset, which could leave residual data with too little
autocorrelation for a successful geostatistical estimation of the residual field.
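The decomposition in equation (1) can be illustrated with a minimal Python sketch. The observations, the linear offset, and the stand-in residual estimate below are all made up for illustration; they are not taken from the tutorial data, and the averaging step is only a placeholder for a real BME estimate.

```python
# Minimal sketch of equation (1): Z = mu + X.
# All numbers here are hypothetical and only illustrate the bookkeeping.

def detrend(z_values, mu_values):
    """Return the residuals X = Z - mu at the data points."""
    return [z - mu for z, mu in zip(z_values, mu_values)]

def add_back(x_estimate, mu_at_target):
    """Return the estimate of Z at an estimation point: mu + X."""
    return mu_at_target + x_estimate

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical observations that drift upward in time.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
z = [10.1, 11.9, 14.2, 15.8, 18.0]
mu = [10.0 + 2.0 * t for t in times]   # assumed global offset

x = detrend(z, mu)
print(variance(z), variance(x))        # the residuals vary much less than the raw data

# Placeholder residual estimate at t = 5 (a real analysis would use BME here).
x_hat = sum(x) / len(x)
z_hat = add_back(x_hat, 10.0 + 2.0 * 5.0)
print(z_hat)
```

In this toy example the offset explains most of the variability, which is the favorable situation described above.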
The global offset can be modeled as a space/time additive separable function of two
separate components, i.e. a temporal global offset and a spatial global offset. The degree
of smoothness in the space/time global offset can be controlled by applying an
exponential filter with user-defined search radius and smoothing range parameters. At
this stage the analyst/modeler has the flexibility of choosing a space/time global offset
from an infinite number of offsets that spans from an offset with long range variability
(i.e. very smooth, and thus pretty uninformative) to an offset with short range variability
(i.e. highly variable, and thus very informative). As explained above, a very informative
space/time global offset may leave too little autocorrelation in the residual data to
conduct a successful geostatistical estimation of the residual field. On the other hand, a
flat space/time global offset leaves a high variability in the residuals, which may produce
estimates with high posterior variance. Thus, there is a tradeoff between lowering the
variability of the residuals and preserving their autocorrelation structure, and hence the
modeler should explore a full assortment of global offsets ranging from
smooth/uninformative to highly variable/very informative in order to select a
compromise that explains some of the consistent space/time trends in the raw data
while leaving reasonable autocorrelation in the residuals.
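To make the role of the two filter parameters concrete, here is a small Python sketch of an exponentially weighted moving average in one dimension. The kernel form exp(-d / smoothing_range), the function name, and the data are assumptions chosen for illustration; the filter actually implemented in BMEGUI may differ in its details.

```python
import math

def exponential_filter(coords, values, target, search_radius, smoothing_range):
    """Exponentially weighted average of the values within search_radius of target.

    The weight exp(-d / smoothing_range) is an assumed kernel for illustration.
    """
    num = 0.0
    den = 0.0
    for c, v in zip(coords, values):
        d = abs(c - target)
        if d <= search_radius:
            w = math.exp(-d / smoothing_range)
            num += w * v
            den += w
    return num / den

# Made-up daily values at a single station.
days = [float(d) for d in range(10)]
obs = [3.0, 5.0, 2.0, 6.0, 4.0, 7.0, 3.0, 6.0, 2.0, 5.0]

# Large parameters: every point contributes almost equally, so the trend is nearly flat.
flat = [exponential_filter(days, obs, d, 1000.0, 1000.0) for d in days]
# Small parameters: only the point itself contributes, so the trend follows the data.
rough = [exponential_filter(days, obs, d, 0.1, 0.1) for d in days]

print(max(flat) - min(flat))    # small spread: smooth, uninformative trend
print(max(rough) - min(rough))  # spread of the data: highly variable trend
```

Subtracting the flat trend leaves nearly all of the variability in the residuals, while subtracting the rough trend leaves none, which is the tradeoff described above.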
The primary objective of this tutorial is to model a mean trend (global offset), remove it
from the data to obtain residual data, and then examine the effect of
the global offset on the covariance of the residual data. More specifically, this tutorial
considers five global offsets with varying degrees of smoothness, and explores the effect
of each of these global offsets on the covariance model parameters (i.e. sill and range) of
the corresponding residual data. This tutorial will help you understand the importance of
the global offset and its impact on the covariance model of the resulting residual data.
2. Install BMEGUI 3.0.0
See tutorial 1.
3. Data
To get the tutorial data, download the data file “data06.csv” from the Tutorial Data Files
and save it in a folder called “work06”. Open the data file using a spreadsheet editor or a
text editor to see the data available. The original data were downloaded from publicly
available online resources and compiled and prepared for this tutorial.
4. BMEGUI Operation
i. Start BMEGUI: double-click on the BMEGUI desktop icon. This launches the BMEGUI
window. (See the BMEGUI 3.0.0 user’s manual for more details.)
ii. Workspace and data file selection: Click on the “Select Working Directory” button on
the “data and directory selection BMEGUI screen” and select the ‘work06’ folder.
Then click on the “Select Data File” button and select the data file ‘data06.csv’.
Figure 1: Data and directory selection BMEGUI screen
iii. Click on the “OK” button. The “Data Field” screen appears after BMEGUI reads the data
and sets the working directory.
iv. In the “Data Field Setting” section, select the following column names from the dropdown
menu in each field.
   X Field: Long
   Y Field: Lat
   Time Field: Time_sinceJan1_2007
   ID: ID
   Data Field: PM25
v. In the “Unit/Name” section, input the following units and name of data in each entry
box.
   Space Unit: deg.
   Time Unit: days
   Data Unit: ug/m3
   Name of Data: PM25
Figure 2: The “Data Field” screen
vi. Click on the “Next” button. The “Data Distribution” screen appears.
vii. Check the basic statistics (mean, standard deviation, coefficient of skewness, and
coefficient of kurtosis) of the raw data and the log-transformed data in the “Statistics”
section.
viii. Check the histograms of the raw data and the log-transformed data. By clicking the “Raw
Data” and “Log Data” tabs in the “Histogram” section, you can switch between the
histograms.
Figure 3: The “Data Distribution” screen showing the Histogram of “Raw Data” (upper) and “Log
Data” (lower)
ix. Since the log-transformed data look approximately normally distributed, click on the
“Use Log-transformed Data” select button at the bottom of the window.
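The comparison made in steps vii to ix can be reproduced outside the GUI with a short Python sketch. The values below are made up (they are not the PM25 data), and the population-moment formulas and non-excess kurtosis are assumptions; BMEGUI may use slightly different estimators.

```python
import math

def summary_stats(values):
    """Mean, standard deviation, skewness and kurtosis from population moments.

    Kurtosis is reported without subtracting 3 (a normal distribution gives 3).
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in values) / n / std ** 3
    kurt = sum((v - mean) ** 4 for v in values) / n / std ** 4
    return mean, std, skew, kurt

# Hypothetical right-skewed positive values, constructed so that their logs are symmetric.
raw = [math.exp(v) for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
logged = [math.log(v) for v in raw]

raw_stats = summary_stats(raw)
log_stats = summary_stats(logged)
print("raw skewness:", raw_stats[2])
print("log skewness:", log_stats[2])
```

A skewness close to zero after the log transform is the kind of evidence that supports working with the log-transformed data.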
x. Click on the “Next” button. The “Exploratory Data Analysis” screen appears. At this
stage, BMEGUI allows you to perform a temporal and spatial exploratory data
analysis.
Figure 4: “Exploratory Data Analysis” screen
xi. Click on the “Temporal Evolution” tab. Change the “Station ID” and see the
corresponding temporal distribution of the data.
xii. Click on the “Spatial Distribution” tab. Change the “Time” and see the corresponding
spatial distribution of the data.
xiii. Click on the “Next” button. The “Mean Trend Analysis” screen appears.
NOTE: We will fit mean trend (global offset) models with 5 different levels of
smoothness (case 1 through case 5), ranging
from smooth (uninformative trend model) to variable (informative trend model).
Case 1:
xiv. Click on the “Model mean trend and remove it from data” button to plot the mean
trend in the temporal and spatial domains.
xv. Here we want to fit a global offset with long range variability that will result in a
very flat mean trend (i.e. a nearly constant global offset). To get a flat mean trend,
we have to enter large values for the search radius and smoothing range
parameters of the exponential filter. Enter the following parameter values, and
click on the “Recalculate Mean Trend” button.
                   Spatial   Temporal
Search Radius      15        1000
Smoothing Range    15        1000
Figure 1(a): The “Mean Trend Analysis” screen
Figure 2(b): The “Mean Trend Analysis” screen
In the figures above, we can see that the temporal and spatial mean trends (global offsets) are
extremely smooth and nearly flat. BMEGUI will remove this flat mean
trend from the data to obtain the residual (detrended) data.
xvi. Click on the “Next” button. The “Space/Time Covariance Analysis” screen appears.
At this step BMEGUI calculates and plots experimental covariance values using
the residual data.
xvii. We can manually edit temporal and spatial lags and their corresponding lag
tolerances to obtain more pairs of experimental covariance values (red dots) if
needed. Here we will only edit the temporal lags and their lag tolerances. To edit
the temporal lags, please click on the “Temporal Component” tab, and then click
on the “Edit Temporal Lags…” button. A dialog box with default lags appears.
Enter the following values in the “Temporal Lag” and “Temporal Lag Tolerance”
fields of the dialog box.
Temporal Lag:
0.0,20.0,40.0,68.3333333333,136.666666667,205.0,273.333333333,341.666666667,410.0,478.333333333,546.666666667,615.0,683.333333333
Temporal Lag Tolerance:
0.0,10,20.0,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667
xviii. Click on the “OK” button. The experimental covariance plot (shown in red dots) is
automatically updated based on the entered temporal lags and corresponding
tolerances.
Figure 3: The “Space/Time Covariance Analysis” screen, showing Spatial and Temporal
Components of the covariance
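The effect of the lag and lag tolerance settings can be sketched in a few lines of Python. This is a simplified, purely temporal estimator written for illustration, with made-up times and residuals; it is not the routine BMEGUI uses, which works on space/time pairs.

```python
def experimental_covariance(times, values, lag, tolerance):
    """Average product of mean-centred values over all pairs whose time
    separation lies within [lag - tolerance, lag + tolerance].

    Returns (covariance, number_of_pairs).
    """
    mean = sum(values) / len(values)
    total = 0.0
    count = 0
    n = len(values)
    for i in range(n):
        for j in range(i, n):
            separation = abs(times[i] - times[j])
            if abs(separation - lag) <= tolerance:
                total += (values[i] - mean) * (values[j] - mean)
                count += 1
    if count == 0:
        return float("nan"), 0
    return total / count, count

# Hypothetical observation times (days) and residual values.
times = [0.0, 5.0, 12.0, 20.0, 31.0, 40.0, 48.0, 60.0]
resid = [0.2, -0.2, 0.5, 0.1, -0.3, 0.4, -0.1, 0.3]

# The first three lag/tolerance pairs entered in the dialog box above.
for lag, tol in [(0.0, 0.0), (20.0, 10.0), (40.0, 20.0)]:
    cov, n_pairs = experimental_covariance(times, resid, lag, tol)
    print(lag, tol, n_pairs, cov)
```

A wider tolerance puts more pairs in each lag class, which gives a more stable experimental covariance value at the cost of resolution along the lag axis.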
xix. Now we can fit a covariance model that matches the experimental covariance values
(red dots) as closely as possible. We will use a two-structure covariance model to
ensure a good fit with the experimental covariance values. To fit a two-structure
covariance model, enter 2 in the “Number of covariance structure(1-4)” field.
xx. Now we have to enter the covariance model parameters for each of the two covariance
structures. Input the following model parameters:
Structure 1:
   Sill: 0.2
   Spatial Model: exponentialC
   Spatial Range: 4
   Temporal Model: exponentialC
   Temporal Range: 7
Structure 2:
   Sill: 0.19
   Spatial Model: exponentialC
   Spatial Range: 100
   Temporal Model: exponentialC
   Temporal Range: 75
xxi. Click on the “Plot Model” button. A plot of the covariance model is superimposed on the
experimental covariance values.
Figure 4: The covariance model, shown on the Spatial Component (upper) and Temporal
Component (lower) plot
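As a rough illustration of what a two-structure model looks like numerically, the following Python sketch evaluates a nested space/time covariance using the case 1 parameters. It assumes that each structure is space/time separable and that the exponential model has the form sill · exp(-3h/range); both are assumptions made for this sketch, so check the BMEGUI manual for the exact definition of exponentialC.

```python
import math

def st_covariance(r, tau, structures):
    """Nested covariance: sum over structures of
    sill * exp(-3 r / spatial_range) * exp(-3 tau / temporal_range).

    The separable form and the factor 3 are assumptions for illustration.
    """
    return sum(
        s["sill"]
        * math.exp(-3.0 * r / s["spatial_range"])
        * math.exp(-3.0 * tau / s["temporal_range"])
        for s in structures
    )

# Case 1 parameters entered in step xx.
case1 = [
    {"sill": 0.2, "spatial_range": 4.0, "temporal_range": 7.0},
    {"sill": 0.19, "spatial_range": 100.0, "temporal_range": 75.0},
]

# Covariance along the spatial axis (temporal lag 0) and along the temporal axis.
for r in (0.0, 1.0, 4.0, 20.0):
    print("r =", r, st_covariance(r, 0.0, case1))
for tau in (0.0, 7.0, 75.0):
    print("tau =", tau, st_covariance(0.0, tau, case1))
```

At zero lag the model equals the sum of the two sills, and the short-range structure decays quickly while the long-range structure carries the covariance at larger lags.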
xxii. Click on the “Temporal Distribution” tab. To obtain the time series of BME estimates
at Station “43”, set the following estimation parameters in the “New Plot” section:
   BME Parameters: Use default settings
   Estimation Parameters:
      Station ID: 43
      Estimation Period: 1.0 days to 10.0 days
   Display Parameter: Use default setting
xxiii. Click on the “Estimate” button. A new tab labeled “Plot ID: 0001” appears, and a
new entry appears on the list in the “Plot List” section.
Figure 5: The “BME Estimation” screen
xxiv. Click on the “Plot ID: 0001” tab and check the time series of BME estimates.
Figure 6: Time series of BME estimates
xxv. Click on the “Quit” button to close the screen. A dialog box appears. Click on the
“OK” button of that dialog box to confirm that you want to quit BMEGUI.
Case 2:
Repeat steps i through xiv.
xxvi. To obtain the mean trend using new parameters, input the following parameter values,
and click on the “Recalculate Mean Trend” button.
                   Spatial   Temporal
Search Radius      0.001     0.1
Smoothing Range    0.001     0.1
Figure 7(a): The “Mean Trend Analysis” screen
Figure 8(b): The “Mean Trend Analysis” screen
xxvii. Click on the “Next” button. The “Space/Time Covariance Analysis” screen appears.
xxviii. Click on the “Temporal Component” tab, then on the “Edit Temporal Lags…” button.
A dialog box appears.
xxix. Input the following values in the “Temporal Lag” and “Temporal Lag Tolerance”
fields of the dialog box.
Temporal Lag:
0.0,20.0,40.0,68.3333333333,136.666666667,205.0,273.333333333,341.666666667,410.0,478.333333333,546.666666667,615.0,683.333333333
Temporal Lag Tolerance:
0.0,10,20.0,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667,34.1666666667
xxx. Click on the “OK” button. The experimental covariance plot (shown in red dots) is
automatically updated.
Figure 9: The “Space/Time Covariance Analysis” screen, showing Spatial and Temporal
Components of the covariance
xxxi. Enter 2 in the “Number of covariance structure(1-4)” field.
xxxii. Input the following model parameters:
Structure 1:
   Sill: 0.05
   Spatial Model: exponentialC
   Spatial Range: 1.5
   Temporal Model: exponentialC
   Temporal Range: 5
Structure 2:
   Sill: 0.0619
   Spatial Model: exponentialC
   Spatial Range: 3
   Temporal Model: exponentialC
   Temporal Range: 25
xxxiii. Click on the “Plot Model” button. A plot of the covariance model is superimposed on the
experimental covariance values.
Figure 10: The covariance model, shown on the Spatial Component (upper) and
Temporal Component (lower) plot
Case 3:
Repeat all steps in case 2 with the following changes:
xxvii. To obtain the mean trend using new parameters, input the following parameter values,
and click on the “Recalculate Mean Trend” button.
                   Spatial   Temporal
Search Radius      1         60
Smoothing Range    1         60
xxviii. Input the following model parameters:
Structure 1:
   Sill: 0.18
   Spatial Model: exponentialC
   Spatial Range: 3.9
   Temporal Model: exponentialC
   Temporal Range: 2
Structure 2:
   Sill: 0.153
   Spatial Model: exponentialC
   Spatial Range: 95
   Temporal Model: exponentialC
   Temporal Range: 30
Case 4:
Repeat all steps in case 2 with the following changes:
xxix. To obtain the mean trend using new parameters, input the following parameter values,
and click on the “Recalculate Mean Trend” button.
                   Spatial   Temporal
Search Radius      0.2       10
Smoothing Range    0.2       10
xxx. Input the following model parameters:
Structure 1:
   Sill: 0.157
   Spatial Model: exponentialC
   Spatial Range: 3.7
   Temporal Model: exponentialC
   Temporal Range: 2
Structure 2:
   Sill: 0.13
   Spatial Model: exponentialC
   Spatial Range: 85
   Temporal Model: exponentialC
   Temporal Range: 20
Case 5:
Repeat all steps in case 2 with the following changes:
xxxi. To obtain the mean trend using new parameters, input the following parameter values,
and click on the “Recalculate Mean Trend” button.
                   Spatial   Temporal
Search Radius      0.1       5
Smoothing Range    0.1       5
xxxii. Input the following model parameters:
Structure 1:
   Sill: 0.11
   Spatial Model: exponentialC
   Spatial Range: 3
   Temporal Model: exponentialC
   Temporal Range: 2
Structure 2:
   Sill: 0.1312
   Spatial Model: exponentialC
   Spatial Range: 30
   Temporal Model: exponentialC
   Temporal Range: 15
The analysis carried out above in BMEGUI can be summarized using the tables shown
below.
Table 1: Smoothing parameters used to obtain the 5 different global offset models. The
search radius is set to the same value as the smoothing range. Case 1 and case 2 are the two
extremes of smoothness in the global offset and are tabulated in the first and last rows.
        Spatial Component                  Temporal Component
case    Search radius   Smoothing range    Search radius   Smoothing range
        (deg.)          (deg.)             (days)          (days)
1       15              15                 1000            1000
3       1               1                  60              60
4       0.2             0.2                10              10
5       0.1             0.1                5               5
2       0.001           0.001              0.1             0.1
Table 2: Fitted covariance model parameters (sill and autocorrelation range) for each
global offset model.
        Structure 1                               Structure 2
case    Sill     Spatial range   Temporal range   Sill     Spatial range   Temporal range
                 (deg.)          (days)                    (deg.)          (days)
1       0.2      4               7                0.19     100             75
3       0.18     3.9             2                0.153    95              30
4       0.157    3.7             2                0.13     85              20
5       0.11     3               2                0.132    30              15
2       0.05     1.5             5                0.0619   3               25
A careful comparison of Table 1 and Table 2 shows that the experimental covariance
changes with the smoothness of the mean trend (the global offset).
An extremely smooth (i.e. flat and uninformative) mean trend results in
higher residual variance and larger spatial and temporal autocorrelation ranges. On the
other hand, decreased smoothness in the mean trend results in smaller residual variance
but also shorter spatial and temporal autocorrelation ranges. Ideally, we seek large spatial
and temporal autocorrelation ranges but low variance for the residuals.
In order to see how the autocorrelation range and the residual variance change for each
mean trend model, we calculate for each mean trend model (case 1 to 5) the residual
variance as the sum of the two covariance sills, as well as the variance weighted spatial
range and the variance weighted temporal range. Each mean trend model is then
represented as a circle in the following plot:
[Plot: variance weighted range (vertical axis, 0 to 60) versus variance (horizontal axis,
0.1 to 0.45). Series: spatial component, temporal component, mean spatial and temporal
components.]
The mean trend model obtained in case 1 had the maximum smoothness (i.e. it is flat) and
therefore the largest residual variance. This mean trend model is therefore
represented by the circles with the highest residual variance. On the other hand, the mean
trend obtained in case 2 had the least smoothness and therefore the smallest
residual variance. This mean trend model is represented by the circles with the
lowest residual variance. Cases 3-5 have residual variances that lie between those of cases 1
and 2, which are the extremes of smoothness.
As can be seen from the plot, each mean trend model represents a tradeoff between
residual variance and covariance range. As we start from the smoothest mean trend model
(with the highest residual variance) and decrease the mean trend smoothness (i.e. we
move toward low residual variance), the covariance range decreases.
The optimal level of smoothness in the mean trend is the
breakpoint beyond which a further decrease in smoothness results in a drastic decrease in
autocorrelation range. This point is shown in green in the plot.
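The quantities behind the plot can be recomputed from the fitted parameters with a short Python sketch. It assumes that "variance weighted range" means the sill-weighted average of the two structure ranges, and that the combined series is the simple mean of the spatial and temporal values; these definitions are assumptions, since the tutorial does not spell out the formulas.

```python
# Fitted parameters as listed in Table 2, per case:
# ((sill 1, spatial range 1, temporal range 1), (sill 2, spatial range 2, temporal range 2))
table2 = {
    1: ((0.2, 4.0, 7.0), (0.19, 100.0, 75.0)),
    3: ((0.18, 3.9, 2.0), (0.153, 95.0, 30.0)),
    4: ((0.157, 3.7, 2.0), (0.13, 85.0, 20.0)),
    5: ((0.11, 3.0, 2.0), (0.132, 30.0, 15.0)),
    2: ((0.05, 1.5, 5.0), (0.0619, 3.0, 25.0)),
}

def summarize(structures):
    """Return (residual variance, weighted spatial range, weighted temporal range, mean)."""
    variance = sum(sill for sill, _, _ in structures)
    spatial = sum(sill * ar for sill, ar, _ in structures) / variance
    temporal = sum(sill * at for sill, _, at in structures) / variance
    return variance, spatial, temporal, (spatial + temporal) / 2.0

summary = {case: summarize(structures) for case, structures in table2.items()}

for case, (variance, spatial, temporal, mean_range) in summary.items():
    print(f"case {case}: variance={variance:.4f} "
          f"spatial={spatial:.2f} temporal={temporal:.2f} mean={mean_range:.2f}")
```

Listing the cases in order of decreasing variance makes it easy to see where the weighted range starts to fall off sharply.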
Conclusion: The degree of smoothness in the space/time global offset can be controlled
by the search radius and smoothing range parameters. A very informative space/time
global offset leaves too little autocorrelation in the residuals to conduct a successful
geostatistical analysis of the residual field. On the other hand, a flat space/time global
offset leaves a large variability in the residuals, which produces a covariance model with
high variance. Thus, there is a tradeoff between residual variability and autocorrelation
range, and hence one should choose a space/time global offset that captures some of the
variability in the data and leaves reasonable autocorrelation in the residuals to conduct a
successful geostatistical analysis of the residual field.