This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
The Effect Of Database
Generalization On The Accuracy Of
The Viewshed
Peter F. Fisher¹
Abstract -- This paper examines the effects of database generalization on the
area which is determined to be visible in GIS analysis. Many different methods
of generalization are possible, but here, for any cell at the target resolution,
elevations are determined from: the arithmetic mean, the maximum, the
minimum, and the maximum difference from the mean of the cells within the
kernel, and all possible combinations of regular spacings. The resolutions
studied are 0.5, 0.33, 0.25, and 0.2 of the original study area. The viewsheds
determined over these different resolution DEMs are compared with a number of
possible viewsheds derived by generalization of the viewshed over the original
DEM. Of those tested the maximum deviation from the mean within kernel
provides the best estimate of the pattern and area of the viewshed at all
resolutions.
INTRODUCTION
In recent years considerable attention has been paid to two related lines of
research. The representation of digital spatial data at multiple resolutions,
where either information is stored at multiple resolutions or is generalized from
a detailed (large) scale to a generalized (smaller) scale (Buttenfield and
McMaster 1991; Muller, Lagrange and Weibel 1995). For specific studies see,
for example, Brown et al. (1993), Chang and Tsai (1991), Isaacson and Ripple
(1990), Joao (1995) and Painho (1995). Few pieces of work have taken a
dataset at one single resolution and examined derived products in alternative
generalizations of the data, but see work of McMaster (1987).
The research reported here is in this same vein. It takes the viewshed as it is
determined in one DEM at the largest resolution, and uses measurements of the
'quality' or 'accuracy' of the viewshed when it is determined from alternative
derived DEMs at smaller resolutions. The purpose is to comment on the
changing nature of the viewshed and the performance of the alternative
generalization operators defined.
¹ Senior Lecturer, University of Leicester, Leicester, United Kingdom
THE VIEWSHED
The viewshed is one of the standard procedures included with GIS designed
for analysis of elevation data. It is intended to distinguish locations which can
be seen from a particular viewing location (in-view) from those which are out-of-view. A line-of-sight is drawn from the viewer to the viewed location, and if
the elevation rises above the elevation of the line-of-sight at any point along the
line-of-sight then the viewed position at the end of the line is determined to be
out-of-view. Otherwise it is in-view. Here the method of point-to-point
determination across a rectangular lattice is used (Fisher 1993).
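The line-of-sight test described above can be sketched as follows. This is a minimal illustration, not the Turbo Pascal implementation used in the study: the function names, the `eye_height` parameter, and the choice of reading terrain height from the nearest lattice cell along the line are all assumptions made for the example.

```python
def line_of_sight(dem, viewer, target, eye_height=0.0):
    """Return True if `target` is in view from `viewer` over the lattice `dem`."""
    (r0, c0), (r1, c1) = viewer, target
    z0 = dem[r0][c0] + eye_height   # elevation of the viewer
    z1 = dem[r1][c1]                # elevation of the viewed location
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for k in range(1, steps):       # intermediate sample points only
        t = k / steps
        # Assumption: terrain height is read from the nearest lattice cell.
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        sight_z = z0 + t * (z1 - z0)  # height of the line-of-sight at this point
        if dem[r][c] > sight_z:       # terrain rises above the line: out-of-view
            return False
    return True


def viewshed(dem, viewer, eye_height=0.0):
    """Binary viewshed: 1 for in-view cells, 0 for out-of-view cells."""
    return [[1 if line_of_sight(dem, viewer, (r, c), eye_height) else 0
             for c in range(len(dem[0]))]
            for r in range(len(dem))]
```

For a single row of elevations `[0, 5, 0]` viewed from the first cell, the middle cell blocks the last one, so the result is `[[1, 1, 0]]`.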
EXPERIMENTAL PROCEDURE
For the research reported here two Ordnance Survey DEMs from the
United Kingdom were used: one of the Malvern Hills and one including the
southeast part of Dartmoor. Each is a 20 x 20 km tile of the national elevation
coverage of 1:50,000 gridded elevation data recording elevations at 50 m
intervals. Therefore each is 401 x 401 values. Within each test area 100
random points were selected within an active area which excluded a 60-cell
buffer from the edge of the tile. A region 120 x 120 cells was then taken
around each point. The 60,60 cell within each area was taken as the view point
and the viewshed determined. These are binary products which are referred to
below as the original viewsheds.
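The sampling step can be sketched as below. The names `sample_viewpoints` and `window`, the fixed seed, and the use of zero-based indices (so that the viewpoint falls at index 60 of a 120-cell window) are assumptions for illustration; the paper does not state its indexing convention or random number generator.

```python
import random

TILE = 401   # cells per side of one tile
HALF = 60    # buffer width and half-width of the 120 x 120 region


def sample_viewpoints(n, tile=TILE, buffer=HALF, seed=0):
    """Draw n random (row, col) viewpoints at least `buffer` cells from the edge."""
    rng = random.Random(seed)
    return [(rng.randrange(buffer, tile - buffer),
             rng.randrange(buffer, tile - buffer))
            for _ in range(n)]


def window(dem, centre, half=HALF):
    """Cut the 2*half x 2*half region in which `centre` lands at cell (half, half)."""
    r, c = centre
    return [row[c - half:c + half] for row in dem[r - half:r + half]]
```

Because every viewpoint lies at least 60 cells from the edge, each window is guaranteed to fit inside the tile.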
Generalizations were then made of the DEM itself, the original viewshed
and the viewpoint, so that viewsheds could be determined. For simplicity the
original DEMs are generalized so that pixels in the derived datasets map exactly
to the original. The generalizations included 0.5, 0.33, 0.25, and 0.2 times
reductions, giving DEM arrays of 60 x 60, 40 x 40, 30 x 30 and 20 x 20. The
generalization procedures are discussed in the next section.
Statistical comparisons include the bias and error of the area of the viewshed
in the generalized DEMs, derived from a comparison with the area of the original
viewshed, and the kappa coefficient of agreement between the viewshed in the
generalized DEMs and a number of alternative generalizations of the original
viewshed.
GENERALIZATION OPERATORS
Generalizing the DEM
It is easy to envisage many alternative approaches to the generalization of a
gridded terrain dataset. The methods used here are entirely based on raster
filtering. They do not include any surface modelling methods. Another
approach is to resample using one of a number of well known algorithms,
including nearest neighbour, and bilinear interpolation. These methods,
however, are generally reserved for the transformation of one gridded dataset
to another where the orientation of the grid varies, as well as the size of the
grid. Other approaches to generalizing the grid use a more regular geometric
approach, and are investigated here.
Two primary types of generalization are used. The first type searches for
values in the original DEM on a regular spacing basis, while the others take
summary statistics from within a kernel area.
Regular Spacing
The simplest method of generalization is to take a value at a regular
spacing, m, so that the values in the derived DEM are every m th value in x
and y in the original DEM. If m = 2 then there are 4 possible realizations of
this process: $z_d = z_1$, then $z_d = z_2$, then $z_d = z_3$, and finally $z_d = z_4$
(where $z_1$, $z_2$, $z_3$, and $z_4$ are the alternative elevation values, and $z_d$ is
the value in a realization of the generalization process). If m = 5 then there
are 25 different possible values.
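The regular spacing operator and its m x m alternative realizations can be sketched with list slicing. The function names and the (row offset, column offset) keying are hypothetical choices for this example.

```python
def regular_spacing(dem, m, row_offset, col_offset):
    """Take every m-th value in both directions, starting at the given offsets."""
    return [row[col_offset::m] for row in dem[row_offset::m]]


def all_realizations(dem, m):
    """All m * m realizations, keyed by (row_offset, col_offset)."""
    return {(i, j): regular_spacing(dem, m, i, j)
            for i in range(m) for j in range(m)}
```

With m = 2 the dictionary holds 4 derived DEMs and with m = 5 it holds 25, one for each position within the kernel.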
Summary Statistics
1) Mean
Within the kernel area the most obvious summary statistic is the mean.
2) Maximum and Minimum
Two further generalization operators are the maximum and minimum values
within the kernel.
3) Maximum Deviation from the Mean
Finally, the value of the elevation which is the largest deviation from the
mean within the kernel is recorded.
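The four summary statistics can be sketched as interchangeable operators applied to non-overlapping m x m kernels. The names are illustrative, and the handling of ties in `max_deviation` (the first value encountered is kept) is an assumption, since the text does not say how ties are resolved.

```python
def mean(values):
    return sum(values) / len(values)


def max_deviation(values):
    """The elevation in the kernel that lies furthest from the kernel mean."""
    mu = mean(values)
    # Assumption: on ties, the first value encountered is returned.
    return max(values, key=lambda z: abs(z - mu))


def generalize_dem(dem, m, operator):
    """Apply `operator` to each non-overlapping m x m kernel of `dem`."""
    rows, cols = len(dem) // m, len(dem[0]) // m
    out = []
    for R in range(rows):
        out_row = []
        for C in range(cols):
            kernel = [dem[R * m + i][C * m + j]
                      for i in range(m) for j in range(m)]
            out_row.append(operator(kernel))
        out.append(out_row)
    return out
```

Passing `mean`, `max`, `min` or `max_deviation` as `operator` gives the four derived DEMs; for the kernel `[1, 2, 3, 10]` the mean is 4 and the maximum deviation operator returns 10.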
Generalizing the Viewshed
For comparison with derived viewsheds at multiple levels of generalization
it is necessary to have a generalization of the original viewshed. It should be
recalled that the viewshed treated here is a binary phenomenon, any cell can
only be treated as being in-view or out-of-view. Three different strategies are
used:
1) All
A cell in the generalized dataset is taken to be in-view if all cells in the
kernel are in-view. Otherwise it is out-of-view.
2) Majority
A cell in the generalized dataset is taken to be in-view if the majority of cells
in the kernel are in-view. Otherwise it is out-of-view. Where there is an even
number of cells, n, in the kernel the majority is taken as n/2 + 1.
3) Any
A cell in the generalized dataset is taken to be in-view if any cell in the
kernel is in-view. Otherwise the generalized cell is out-of-view.
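The three strategies differ only in how many in-view cells a kernel must contain. A minimal sketch follows, in which the function name and the string rule labels are hypothetical, and the majority threshold for odd-sized kernels (the integer part of n/2, plus 1) is an assumption extending the even-kernel rule stated above.

```python
def generalize_viewshed(view, m, rule):
    """Generalize a binary viewshed over m x m kernels by 'all', 'majority' or 'any'."""
    n = m * m
    thresholds = {
        "all": n,                 # every cell in the kernel is in-view
        "majority": n // 2 + 1,   # n/2 + 1 for even n; assumed (n + 1)/2 for odd n
        "any": 1,                 # at least one cell is in-view
    }
    need = thresholds[rule]
    rows, cols = len(view) // m, len(view[0]) // m
    return [[1 if sum(view[R * m + i][C * m + j]
                      for i in range(m) for j in range(m)) >= need else 0
             for C in range(cols)]
            for R in range(rows)]
```

A 2 x 2 kernel with exactly two in-view cells is therefore out-of-view under the majority rule, since three cells are required.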
Generalizing the Viewpoint
Finally, the treatment of the viewpoint is considered. Viewshed algorithms
treat the viewpoint very differently. Here, the cell considered to be the
viewpoint in any generalization is the generalized cell which contains the
viewpoint in the ungeneralized DEM. This approach has the advantage over
others that the viewpoint is always at the centre of a cell in any dataset, and so
coincides geometrically with a height value in the corresponding DEM. The
DEM elevation at the viewpoint is always known as part of the generalized
DEM and never inferred. Many possible realizations of the original viewshed
correspond to the viewshed at a particular scale, and so results here must be
regarded as provisional.
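Locating the generalized cell which contains the original viewpoint reduces to integer division of the cell indices by the kernel size. The sketch assumes zero-based indices and kernels aligned to the origin of the region; the function name is hypothetical.

```python
def generalized_viewpoint(viewpoint, m):
    """Index of the generalized cell containing the original viewpoint cell."""
    r, c = viewpoint
    return (r // m, c // m)
```

Under these assumptions the (60, 60) viewpoint maps to (30, 30) for m = 2 and to (12, 12) for m = 5.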
Implementation
The above approach was implemented, together with the viewshed
algorithm, in Turbo Pascal 7.0. The sampling, generalization by all methods,
generation of the viewshed and calculation of summary and comparative
statistics of the viewsheds were all integrated. It ran on a Pentium PC which
had been checked for faults on the processor. Idrisi v 4.1 was used in support
of the analysis to view data at different resolutions, etc. and analysis was done
in Excel 5.0 and using bespoke programs.
RESULTS
Generalizing the Original Viewshed
As noted above, three different versions of generalizing the original
viewshed were derived. These are based on all, a majority and any of the
pixels within the kernel being in view in the original. The results of this analysis
can best be seen in Table 1, where the correlation between the area of the
original and generalized viewsheds is reported together with the mean bias and
error as a result of the generalization for both test areas. It can be seen that the
correlation for the majority operator is always closest to 1.0, and that it has
by far the lowest values of bias and error at all levels of generalization. In short,
the area of the viewshed derived by the majority operator is predictable, and is
very similar to the original area, even over large resolution changes, while the
values as a result of the any and all operators are not so predictable at the larger
resolution changes. Furthermore, the values of the majority operator are very
close to the values for the original.
Table 1. -- Bias and Error in generalizations of the Original Viewshed.
Dartmoor
Generalizations Malvern
Error
Correl.
Bias
Error
Correl. Bias
2.70
0.997
2.42
3.39
3.78
0.993
0.5 x Any
0.97
0.999
1.34
0.997
-1.19
-0.87
Major
2.79
0.993
-2.48
-3.60
4.06
0.985
All
5.98
6.58
0.983
4.50
4.97
0.993
0.33 x Any
0.09
0.43
0.999
0.24
0.999
Major
-6.21
7.02
0.943
-4.45
5.04
0.972
All
7.10
0.988
6.46
8.36
9.12
0.971
0.25 x Any
0.63
0.998
0.77
0.997
-0.50
-0.56
Major
6.75
0.940
-5.96
-7.93
8.99
0.881
All
9.09
0.980
8.30
10.54
11.46
0.958
0.2 x Any
0.61
0.995
0.70
0.997
-0.26
-0.29
Major
7.92
0.896
-6.99
-9.13
10.39
0.803
All
The Area of the Viewshed
When the viewshed is determined over the generalized DEMs, there are a
number of ways to compare it with the original viewshed. Two parameters are
examined here: the area visible, and the kappa coefficient of agreement. The
area of the viewshed at different resolutions is easily determined, and in Table 2
the mean bias and error are given for both test areas at all four resolutions. In
addition, the correlation coefficients between the two areas are reported. The
"best" generalization is that which yields the smallest bias and error and the
largest correlation.
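The three area measures can be sketched as follows. The paper does not define "bias" and "error" explicitly, so the definitions used here are assumptions: bias as the mean signed difference between derived and original areas, error as the root-mean-square difference, and correlation as Pearson's coefficient. Expressing area as a percentage of cells in view, so that grids of different resolution are comparable, is likewise an assumption.

```python
from math import sqrt


def percent_visible(view):
    """Visible area as a percentage of all cells (assumed unit of comparison)."""
    cells = [v for row in view for v in row]
    return 100.0 * sum(cells) / len(cells)


def bias(original, derived):
    """Mean signed difference, derived minus original (assumed definition)."""
    return sum(d - o for o, d in zip(original, derived)) / len(original)


def rms_error(original, derived):
    """Root-mean-square difference (assumed definition of 'error')."""
    return sqrt(sum((d - o) ** 2 for o, d in zip(original, derived)) / len(original))


def correlation(x, y):
    """Pearson correlation coefficient between two lists of areas."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Each function takes one value per viewpoint, so in the setting of this study the lists would hold 100 areas per test area. Under these definitions the error can never be smaller than the absolute bias, which is consistent with the values in the tables.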
Table 2 is divided by the amount and method of generalization. The
values reported for regular spacings are for the minimum area and the maximum
area determined among the 4, 9, 16 and 25 possible realizations. There is
considerable variation among these, as is reflected in the error measures, which
are always the largest and the smallest of any generalization. On the other hand,
the correlation coefficients for these are on the whole poorer than the best of
the summary statistic values.
Among the summary statistics of the generalizations, 21 out of the 24
measures of accuracy show the maximum deviation from the mean to give the
best performance. The correlation coefficients and mean errors yielded by this
method are always the best of those reported. The bias is sometimes smaller
for some other method (in two cases the maximum within kernel and once the
minimum within kernel).
Table 2. -- Error measures for the area of the viewshed in generalizations of
the viewshed as compared with the area in the original viewshed.

                                    Malvern                   Dartmoor
Generalizations                Bias   Error Correl.      Bias   Error Correl.
0.5 x   regular     Max        6.17    8.17   0.802      3.60    4.34   0.956
        spacing     Min        1.14    3.32   0.913      0.47    1.82   0.965
        summary     Max        4.04    6.56   0.809      2.18    3.18   0.946
        statistics  Min        4.27    6.63   0.823      2.03    3.12   0.957
                    Mean       4.52    6.76   0.830      2.23    2.97   0.972
                    MaxDev     3.39    4.02   0.967      2.11    2.58   0.992
0.33 x  regular     Max       11.39   13.68   0.706      7.52    8.75   0.871
        spacing     Min        2.28    6.42   0.725      1.03    3.33   0.897
        summary     Max        6.85    9.34   0.768      4.50    6.05   0.874
        statistics  Min        8.51   11.25   0.688      4.57    6.59   0.880
                    Mean       7.87   10.10   0.768      4.74    6.17   0.910
                    MaxDev     6.60    8.08   0.884      4.08    4.85   0.979
0.25 x  regular     Max       16.34   18.89   0.600     12.26   14.26   0.712
        spacing     Min        3.34    7.42   0.673      1.76    4.88   0.812
        summary     Max        9.55   12.24   0.658      7.46    9.40   0.784
        statistics  Min       12.61   15.86   0.600      7.25    9.48   0.806
                    Mean      11.22   14.01   0.675      7.38    9.12   0.836
                    MaxDev     9.65   11.23   0.844      6.29    7.53   0.955
0.2 x   regular     Max       20.75   23.23   0.526     16.60   18.87   0.648
        spacing     Min        4.15    8.08   0.651      2.30    5.51   0.738
        summary     Max       12.45   15.88   0.542      9.78   12.20   0.705
        statistics  Min       15.83   18.96   0.580     10.26   12.62   0.685
                    Mean      14.00   17.12   0.607     10.14   12.38   0.706
                    MaxDev    12.77   14.60   0.790      8.97   10.57   0.927
The latter of these shows an enormous spread of values which tends to be
more typical of the 0.2 generalizations than that yielded by the maximum
deviation from the mean. The 0.5 generalizations are generally relatively well
correlated as is shown by the correlation coefficients, but this can rapidly
deteriorate.
The Arrangement of the Viewshed
The kappa coefficient of agreement between two sets of data ranges from 0
to 1, where 1 reflects perfect correspondence in the arrangements and 0
disagreement. It has been widely applied to examining the confusion matrix in
remote sensing (Congalton et al., 1983). Here the same approach is used.
Kappa is determined for 2 x 2 tabulations which, for a particular resolution,
compare the visibility of pixels in the generalized binary viewshed, as
determined by the majority method, with that determined through the DEM
when generalized by one of the methods under test. Summary results are
reported in Table 3 where the maximum, minimum and mean values of the
coefficient for the 100 viewpoints in each study area are reported. The
maximum and minimum values are reporting the best and the worst agreements
between patterns in any situation. The best generalization method should yield
the largest value of kappa at any reduction.
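The kappa calculation for two binary viewsheds of equal size can be sketched as below, using the standard form (observed agreement minus chance agreement, divided by one minus chance agreement). The function name is hypothetical, and returning 1.0 when both grids consist entirely of the same single class, where kappa is strictly undefined, is a convention adopted only for this example. A value of 0 from this formula corresponds to agreement no better than chance.

```python
def kappa(a, b):
    """Cohen's kappa for two binary grids of the same shape."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    n = len(pairs)
    observed = sum(1 for x, y in pairs if x == y) / n
    pa = sum(x for x, _ in pairs) / n   # proportion in-view in grid a
    pb = sum(y for _, y in pairs) / n   # proportion in-view in grid b
    expected = pa * pb + (1 - pa) * (1 - pb)   # agreement expected by chance
    if expected == 1.0:
        return 1.0   # convention for this sketch: identical single-class grids
    return (observed - expected) / (1 - expected)
```

In the comparison described above, `a` would be the majority generalization of the original viewshed and `b` the viewshed determined over one of the generalized DEMs.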
The results for regularly spaced samples report the agreements for the
maximum and minimum areas in any single generalization test. The maximum
viewshed is the best in 22 out of the 24 values reported. In other words, the
largest viewshed generated by regular spacing from the kernel area yields the
best agreement in the spatial pattern of the viewshed among all other
generalizations of the DEM. On the other hand, the minimum area viewshed
always yields the worst agreement. This method of generalization provides an
envelope of all other values. Indeed, in the 0.2 generalizations of both the
Malvern and Dartmoor test areas, regular spacing yielding the minimum visible
area gives almost total disagreement with the original viewshed (0.054 and
0.020).
Table 3. -- Kappa coefficients of agreement between the generalized version
of the original viewshed and the viewshed determined over the generalized DEM.
Generalizations
Malvern
Dartmoor
IMax
Mean Min IMax
Mean Min
0.661
0.924 0.818
0.899 0.744 0.285
0.5 x
regular Max
0.135
0.845 0.688
0.850 0.591 0.201
spacing Min
summary Max
statistics Min
Mean
MaxDev
0.33 x regular Max
1 0.886 0.694 0.235) 0.877 0.772 0.480
0.162
0.805 0.465 0.106
0.796 0.540
spacing Min
0.868
0.607
0.864
0.666
0.201
0.182
summary Max
statistics Min
Mean
MaxDev
0.25 x regular Max
spacing Mm
summary Max
statistics Min
Mean
MaxDev
0.238
0.833 0.555 0.174
0.827 0.634
0.2 x
regular Max
0.020
0.613 0.320
0.711 0.288 0.054
spacing Min
summary Max
statistics Min
Mean
MaxDev
In 17 out of 24 summaries, the maximum deviation within kernel yields the
best agreement. In 5 cases the mean is best, and the maximum and minimum
within kernel are best once each. With 0.2 generalization the worst levels of
agreement are again very poor (0.015 in the 0.2 generalization of the Malvern
area by minimum within kernel). Both the mean and the best levels of
agreement in all situations are, however, quite acceptable.
CONCLUSION
A number of conclusions can be drawn from the current work. For the
viewshed it would appear that the generalization of the binary viewshed to
other grid resolutions is best performed by determining those cells at the target
resolution with the majority of visible cells. Other immediately apparent
methods of generalization do not yield results which match the area and pattern
of the original viewshed.
Generalization of the DEM yields many alternative possible viewsheds.
Regular spacing of the kernel area yields both the best and worst estimates of
the visible area. The result is therefore unpredictable, and although the method
is fast and convenient it is not to be recommended as a basis of generalization.
Among the statistical summaries of a kernel area, generalization of the DEM by
determination of the maximum deviation within the kernel yields the viewshed
which best reflects both the area and arrangement of the visible area. The
results are the most stable and most frequently the best for this method. It is
therefore the method to be recommended (of those tested) in any situation
where a DEM requires generalization, and the viewshed is the derived product
which is uppermost in the investigator's interest.
Finally it should be noted that in most cases neither the area nor the pattern
of the viewshed is badly disrupted by generalization of the DEM, although it
can be if an injudicious method of generalization is used.
As stated in the introduction, very little research is reported on propagating
the effects of alternative generalizations of spatial databases. Research similar
to that reported here can be envisaged for almost any spatial data and derived
product. Indeed, it is crucial for very many applications that we know best how
to generalize both categorical data (the binary viewshed), and continuous data
(the DEM), and possible consequences of generalization. While the results
reported here seem quite conclusive, and are very likely to be generalizable to
other areas, so long as the viewshed is of interest, there is absolutely no
guarantee that either the recommended method of generalization should be the
same for all derived products, or that there is not a better generalization
operator for the viewshed.
ACKNOWLEDGEMENTS
I would particularly like to thank Jo Wood for some insightful suggestions
in the course of this work.
REFERENCES
Brown, D.G., Bian, L., and Walsh, S.J., 1993. Response of a distributed
watershed erosion model to variations in the input data aggregation levels.
Computers & Geosciences 19, 499-509.
Buttenfield, B.P., and McMaster, R.B. (Editors) 1991. Map Generalization:
Making Rules for Knowledge Representation. London: Longman.
Chang, K.-t., and Tsai, B.-w. 1991. The effect of DEM resolution on slope and
aspect mapping. Cartography and Geographic Information Systems 18, 69-77.
Congalton, R.G., Oderwald, R.G., and Mead, R.A., 1983. Assessing Landsat
classification accuracy using discrete multivariate analysis statistical techniques.
Photogrammetric Engineering and Remote Sensing 49, 79-87.
Fisher, P.F. 1993. Algorithm and Implementation Uncertainty in Viewshed
Analysis. International Journal of Geographical Information Systems 7, 331-347.
Isaacson, D.L., and Ripple, W.J. 1990. Comparison of 7.5-minute and
1-degree digital elevation models. Photogrammetric Engineering and Remote
Sensing 56, 1523-1527.
Joao, E.M. 1995. The importance of quantifying the effects of generalization.
In GIS and Generalization: Methodology and Practice, edited by Muller,
J.C., Lagrange, J.P. and Weibel, R. London: Taylor & Francis. pp. 183-193.
McMaster, R.B. 1987. The geometric properties of numerical simplification.
Geographical Analysis 19, 330-346.
Muller, J.C., Lagrange, J.P. and Weibel, R. (Editors) 1995. GIS and
Generalization: Methodology and Practice. London: Taylor & Francis. 257 p.
Painho, M. 1995. The effects of generalization on attribute accuracy in natural
resource maps. In GIS and Generalization: Methodology and Practice,
edited by Muller, J.C., Lagrange, J.P. and Weibel, R. London: Taylor &
Francis. pp. 194-206.