This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
Attribute and Positional Accuracy
Assessment of the
Murray Darling Basin Project,
Australia
Fitzgerald, R.W.¹, Ritman, K.T. & Lewis, A.
Abstract: The Murray Darling Basin comprises a land area of
1,058,000 km2 covering a substantial portion of SE Australia and
encompassing Australia's largest river system. A basin wide woody
vegetation dataset has been assembled from LANDSAT TM imagery
supplemented with aerial photography at a nominal scale of 1:100,000.
The attribute accuracy assessment method is based on a multi-stage
systematic sample design. A rectangular grid is placed over the entire
Murray Darling Basin dataset and at each primary sample point, a
secondary grid is created formed from a square contiguous set of aerial
photos.
A half tone, grey scale transparency covering approximately 100 km2
is generated from satellite digital imagery (LANDSAT TM) for each
primary grid point at a contact scale matching the available air photos.
Overprinted on this transparency are the secondary grid points as patch
size sampling frames. The air photos are then visually co-registered
underneath the transparency with Landcover features from the
LANDSAT TM image. Attribute data (presence or absence of woody
vegetation) is collected directly from the air photos for each secondary
grid point at 4 different patch sizes.
Positional accuracy is assessed by recording 40 or more ground
control points from the 1:100,000 map sheet containing the primary
grid point. These Eastings and Northings are compared to their position
in the LANDSAT TM image.
The outcome of the accuracy assessment is a Basin wide attribute
and positional accuracy statement and spatial variability contour maps
of attribute and positional accuracy as GIS overlays.
INTRODUCTION
This paper discusses the attribute and positional accuracy assessment
methodology developed for the Murray Darling Basin project² (MDBP). The Murray
Darling Basin (MDB) covers an area exceeding 440 x 1:100,000 map sheets with a
total area of 1,058,000 km2. The diverse range of vegetation and land forms creates
challenges in terms of methodology and logistics.
¹ Infoplus, PO Box 125, Queanbeyan, NSW 2620, Australia. Fax: +61 (0)6 299 1331,
E-mail: bfitzger@pcug.org.au
The initial focus of the accuracy assessment method includes three stratification
hypotheses. The first is that local geographic variation in classification
accuracy is a function of vegetation type, terrain and substrate. The second is that
the smaller the patch of woody vegetation, the lower the attribute accuracy. The
third is that the overall accuracy percentage is proportional to the percentage of
vegetation cover. If the vegetation cover is patchy, then the ratio between the
length of the boundary around a patch polygon and its enclosed area is of interest.
This ratio was examined by Crapper et al. (1986).
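The boundary-to-area ratio mentioned above can be illustrated with a short
sketch. It is not taken from Crapper et al. (1986); the function name and the
two example patches are hypothetical, and coordinates are assumed to be in metres.

```python
import math


def perimeter_area_ratio(vertices):
    """Return (perimeter, area, perimeter / area) for a simple polygon.

    vertices: list of (x, y) tuples in metres, ordered around the boundary.
    """
    n = len(vertices)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        perimeter += math.hypot(x2 - x1, y2 - y1)
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
    area = abs(twice_area) / 2.0
    return perimeter, area, perimeter / area


# Two hypothetical 1 ha patches: a compact square and a narrow strip.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
strip = [(0, 0), (400, 0), (400, 25), (0, 25)]

square_ratio = perimeter_area_ratio(square)[2]
strip_ratio = perimeter_area_ratio(strip)[2]
```

Both patches cover the same area, but the strip has roughly twice the boundary
length per unit area, which is the kind of patchiness the ratio is meant to expose.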
The attribute and positional accuracy assessment methodology will:
a. Be statistically sound, practical, inexpensive to implement, easily
   understood by project staff and portable to MDBP's GIS;
b. Produce scalable attribute and positional accuracy assessments of the
   woody/non-woody vegetation dataset as:
   i. Attribute accuracy assessment statistics (error matrix, user
      and producer accuracies and Kappa statistic) at different
      spatial scales;
   ii. Spatial variability maps of attribute and positional accuracy
      for the entire MDBP.
A brief examination of the MDB woody/non-woody GIS layer provided an
insight into data quality and processing standards. The details of these standards
including lineage are outlined in the Draft specification (Ritman, 1995).
The Victorian & South Australian groups have examined attribute accuracy
assessment methodologies. The work of Czaplewski et al. (1992) proved to be the
most substantial and well documented attribute accuracy assessment methodology
available. They used aerial photo interpretation as their pseudo ground truth and a
systematic sample. Tadrowski et al. (1990) in South Australia investigated
supervised classification aided by manual photo interpretation and a stratified simple
random sample for attribute accuracy assessment.
The production of the digital woody vegetation dataset at a nominal Basin-wide
scale of 1:100,000 was achieved by a two stage process (Ritman, 1995). Woody
vegetation is defined as any perennial vegetation having a height exceeding 2m and
a density greater than 20% crown cover (McDonald et al., 1990).
² This paper is a product of a consulting project titled "Attribute Accuracy Assessment for Project M305",
DLWC, September 1995, Murray Darling Basin Project M305.
The first stage was the digital classification from LANDSAT TM imagery of a
template of only woody vegetation. This method is based on that of Gilbee &
Goodson (1992) and comprises an unsupervised 100 ISO class cluster analysis of
LANDSAT TM imagery, followed by manual aggregation based on field input,
aerial photos and ancillary data. In addition, a filter (an ARC/INFO AML process
created by Dr. K. Ritman), designed to remove patches of diagonally and
orthogonally connected cells, was passed over the resulting woody vegetation layer
to remove unconnected vegetation patches smaller than 0.25 ha. The resultant woody
dataset is in raster format with 25 x 25 m pixels.
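The small-patch filter can be sketched as a connected-component pass over the
raster. This is not the ARC/INFO AML process itself, only a minimal Python
illustration under stated assumptions: cells are treated as connected both
orthogonally and diagonally, and with 25 m pixels (0.0625 ha each) a patch of
fewer than 4 cells is taken to be below 0.25 ha. The function name is hypothetical.

```python
from collections import deque


def remove_small_patches(grid, min_cells=4):
    """Clear woody patches made of fewer than min_cells connected cells.

    grid: list of rows of 0/1 values (1 = woody).
    Returns a new grid; the input is left unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one patch using 8-neighbour connectivity.
            patch = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                patch.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(patch) < min_cells:
                for y, x in patch:
                    out[y][x] = 0
    return out


# Hypothetical 5 x 5 raster: one 4-cell patch, one single cell and
# one 2-cell diagonal patch.
raster = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
filtered = remove_small_patches(raster)
```

In this example only the 4-cell block (0.25 ha) survives; the isolated cell and
the diagonal pair are cleared.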
The next stage was either a manual or digital classification of vegetation structural
elements such as genus, density class and growth form. Only the woody vegetation
layer is subject to the attribute and positional accuracy methodology described in this
paper.
A BRIEF REVIEW OF THE LITERATURE AND PREVIOUS
STUDIES.
Attribute classification accuracy is usually assessed by constructing a contingency
table of a classified map versus ground truth or reference data (Congalton, 1991;
Veregin, 1989). The resulting error or confusion matrix C is a k x k matrix where
k is the number of discrete classes in the classification scheme. In the case of the
MDB woody/non-woody dataset, k = 2.
The most commonly used index of attribute classification accuracy is the overall
accuracy percent (OA%). Confidence limits for the OA% can be constructed easily
from either the binomial distribution or the normal approximation to the binomial
distribution. Van Genderen et al. (1978) logically extend the use of the binomial
confidence limits to estimate sample sizes given expected classification accuracy and
confidence levels.
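As an illustration of the normal approximation mentioned above, the following
sketch computes confidence limits for an overall accuracy proportion. The
function name and the example counts are hypothetical, and z = 1.96 is assumed
for a 95% confidence level.

```python
import math


def oa_confidence_interval(n_correct, n_total, z=1.96):
    """Normal-approximation confidence limits for an overall accuracy.

    Returns (p, lower, upper) as proportions clipped to the range 0..1.
    """
    p = n_correct / n_total
    half_width = z * math.sqrt(p * (1.0 - p) / n_total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical example: 40 of 50 sample points correctly classified.
p, lower, upper = oa_confidence_interval(40, 50)
```

The approximation becomes unreliable when the sample is small or the accuracy is
close to 0% or 100%; exact binomial limits would be the safer choice there.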
The OA% is a simple index of classification accuracy which has its limitations.
It cannot differentiate between errors of omission and commission, it cannot
reliably be used to compare error matrices with different sample sizes, and it
cannot account for correct classification by chance alone.
One of the best methods developed to overcome these limitations is the Kappa
statistic discussed extensively by Congalton (1991) and Fitzgerald & Lees (1994a).
It statistically quantifies the level of agreement and has been shown to give a less
biased estimate of classification accuracy than the OA% (Rosenfield & Fitzpatrick-Lins, 1986).
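A minimal sketch of how the OA%, the user and producer accuracies and the Kappa
statistic can be derived from an error matrix is given here. It uses the
standard formulas rather than any code from the project; the function name and
the example counts are hypothetical, and rows are assumed to hold the mapped
class while columns hold the reference class.

```python
def accuracy_statistics(matrix):
    """Summarise a k x k error matrix.

    matrix[i][j] is the number of sample points mapped as class i whose
    reference (air photo) class is j. Every row and column total is
    assumed to be non-zero.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    correct = sum(matrix[i][i] for i in range(k))

    overall = correct / n
    # Agreement expected by chance, from the row and column marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (overall - chance) / (1.0 - chance)

    return {
        "overall": overall,
        "kappa": kappa,
        "user": [matrix[i][i] / row_totals[i] for i in range(k)],
        "producer": [matrix[i][i] / col_totals[i] for i in range(k)],
    }


# Hypothetical counts for 49 sample points.
# Rows: mapped woody, mapped non-woody. Columns: reference woody, non-woody.
example = [[20, 5],
           [4, 20]]
stats = accuracy_statistics(example)
```

For these counts the overall accuracy is about 82% while Kappa is about 0.63,
showing how much of the raw agreement could be attributed to chance.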
Sampling schemes, especially when viewed across the spectral, spatial,
environmental, taxonomical and temporal domains, can introduce substantial
bias into estimates of classification accuracy. Congalton (1988, 1991)
compares the relative effects of five sampling schemes including random and
stratified random sampling on classification accuracy. Franklin & Hiernaux (1991)
discuss the effect of sampling schemes on woody vegetation classification while
Fitzgerald & Lees (1994b) discuss scale and its relationship to floristic structure.
High accuracy surveying standards exist for assessing the accuracy of topographic
maps in all three dimensions. The state and national mapping agencies are
responsible for the surveying standards and integrity of the map base. The implicit
assumption made in this study is that this map base is generally correct³.
Positional accuracy quantifies the accuracy of feature locations after various
image processing and GIS transformations have been applied. A number of tests are
available to assess positional accuracy including deductive estimates, internal
evidence checks, comparisons to source documents and reference to independent
sources of higher accuracy (Veregin, 1989). The latter is the most desirable and the
one used in this study. The independent source is the AUSLIG 1:100,000 map base
series.
Spatial variability maps of both attribute and positional accuracy will be produced
for the MDB from this accuracy assessment methodology. The 12 x 18 primary
systematic grid (described below) will contain the derived data values of OA% and
Kappa (attribute accuracy) and 2D-RMS and CEP (positional accuracy). These
gridded values will then be interpolated to a surface from which a contour map
(isometric lines) of attribute and positional accuracy will be produced.
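The paper does not name the interpolation method used to turn the gridded
values into a surface. Purely as an illustration, the sketch below uses inverse
distance weighting (an assumption, not the project's method) to estimate a
value such as OA% at an arbitrary location from the primary grid points. The
function name and the sample values are hypothetical.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (x, y, value) tuples, e.g. primary grid points
    carrying an OA% or Kappa value.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical primary grid points with OA% values of 80 and 90.
grid_values = [(0.0, 0.0, 80.0), (10.0, 0.0, 90.0)]
midpoint = idw_estimate(grid_values, 5.0, 0.0)
```

Evaluating such an estimator over a fine lattice would give the surface from
which contour lines are then traced.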
The accuracy of this interpolation is dependent on the number and spatial
distribution of the observed sample values. Systematic sampling (aligned or
unaligned) is the best of all the sampling techniques tested to minimise the effects
of spatial distribution on contouring (Veregin, 1989).
SAMPLE DESIGN
The constraints on the sample design for the attribute and positional accuracy
assessment for the MDBP were as follows:
a. The extent of the MDB (1,058,000 km2) precludes field checking as
   the major source of ground truth. Aerial photo interpretation is the
   only practical means in this case;
b. The design must be practical, simple and expedient to implement.
   Specifically, maps, air photos, air photo run maps and satellite
   imagery should be handled as efficiently as possible, with the
   minimum number of each being accessed;
c. Attribute accuracy must be assessed at a minimum patch size of 1
   hectare across the entire basin. The patch size scale effect on
   classification accuracy should be assessed if possible at scales of
   0.25, 1.00, 4.00 and 9.00 ha;
d. The results of the accuracy assessment must be statistically
   defensible and easily interpreted by the end user community.
³ This assumption proved incorrect. Field experience of project staff demonstrated that the map base is not
always reliable. Combined with budgetary restrictions, the positional accuracy component of the
methodology was cancelled in the implementation phase.
With these constraints in mind a review of the literature suggested that the four
best contenders for the sample design were: Simple random sampling (SRS);
Stratified simple random sampling (STSRS); Cluster sampling; Systematic sampling
(SYS).
Simple random sampling has the advantage of being the easiest to construct.
Implementation, especially in the field, can be problematic. Statistically, the
estimates are easily produced and robust. For the purposes of the MDBP, SRS
was considered impractical to implement on such a large scale dataset.
Stratified SRS is the most often used design as judged from the literature. It is
statistically more efficient than SRS (Cochran, 1977), can produce less biased results
than SRS and is a little easier to implement (Congalton, 1988; Janssen & van der
Wel, 1994; Van Genderen et al., 1978). Constructing a stratified SRS is more
complex than an SRS by itself and still suffers from the problems of access to ground
truth sites. The claimed statistical gains over SRS are also very dependent on the
spatial autocorrelation structure of the dataset (Congalton, 1988) which is unknown.
Stratified SRS was not considered any more practical to implement than SRS for the
MDBP.
Cluster sampling is the preferred design when cost of access to the ground truth
sites is at a premium. Moisen et al. (1994) showed that cluster sampling with a fixed
cost had a higher relative efficiency than either systematic or simple random
sampling. The construction and implementation of cluster sampling are more
complex than either stratified or simple random sampling. Once again the
practicalities of this design precluded its use in the MDBP.
Systematic samples have as their most attractive feature their ease of construction
and implementation. Systematic sampling is particularly suited to spatial problems
involving two dimensions or more as noted by Cochran (1977). Cochran (1977) and
Congalton (1988) propose that unaligned systematic sampling is superior to aligned
or centred systematic sampling.
A large number of authors and studies have used systematic sampling. Goodchild
et al. (1991) utilised a single stage systematic sample of 1347 sites in the CALVEG
study. Czaplewski et al. (1992) used a systematic sample of 363 plots for the
Victorian CNR Tree cover project.
The attribute accuracy assessment method recommended for the MDBP is a two
stage systematic sample with subsampling units of equal size. The sampling unit is
a woody vegetation patch. The sample design has the practical advantage of
simplifying the acquisition and handling of the aerial photos used to acquire the
pseudo ground truth values for the attribute accuracy assessment.
In the first sampling stage, a 12 x 18 rectangular primary sampling grid (216
primary grid points) is placed over the MDB dataset. This grid is oriented 40° from
North to incorporate linear trends in Landcover categories.
At each primary sample point, a 7x7 secondary grid (49 secondary grid points)
is created orientated to the AMG grid. Thus the number of secondary sample points
will be: 12 x 18 x 7 x 7 = 10,584. Due to the irregular outline of the MDB,
approximately 10% of the primary grid points will fall outside of the Basin and will
be excluded. Thus the final sample size will be less than 10,584, possibly around
9,500. The sampling unit is a woody vegetation patch.
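The two stage grid and the resulting sample counts can be sketched as follows.
The grid spacings, the origin and the direction of the 40° rotation are not
given in the text and are hypothetical here; only the grid dimensions and the
10% exclusion estimate come from the paper.

```python
import math


def grid_points(n_rows, n_cols, spacing, origin=(0.0, 0.0), angle_deg=0.0):
    """Regular grid of (easting, northing) points rotated about its origin."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    points = []
    for i in range(n_rows):
        for j in range(n_cols):
            x, y = j * spacing, i * spacing
            points.append((origin[0] + x * cos_t - y * sin_t,
                           origin[1] + x * sin_t + y * cos_t))
    return points


# Primary grid: 12 x 18 points, rotated 40 degrees (spacing is hypothetical).
primary = grid_points(12, 18, spacing=80_000.0, angle_deg=40.0)
# Secondary grid: 7 x 7 points aligned to the map grid (spacing is hypothetical).
secondary = grid_points(7, 7, spacing=1_500.0)

total_points = len(primary) * len(secondary)
# Roughly 10% of primary points are expected to fall outside the Basin.
expected_inside = round(len(primary) * 0.9) * len(secondary)
```

With these dimensions the full design has 10,584 secondary points, and
excluding about 10% of the primary points leaves roughly 9,500.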
This two stage design has the added advantage of being scalable. The grid cell
size at either sample stage can be varied to reflect desired confidence limits, budget
and time constraints⁴. The presence/absence of woody vegetation at the 4 patch sizes
(0.25, 1, 4 and 9 ha) will be collected at each secondary grid point. This data will
be used to investigate the effect of patch size on classification accuracy.
To collect the attribute information, a half tone, grey scale transparency covering
approximately 100 km2 is generated from satellite digital imagery (LANDSAT TM)
for each primary grid point at a contact scale matching the available air photos.
Overprinted on this transparency are the secondary grid points as patch size sampling
frames. The air photos are then visually coregistered underneath the transparency
with Landcover features from the LANDSAT TM image (MDBC, 1995).
Attribute data (presence or absence of woody vegetation) is collected directly
from the air photos for each secondary grid point at 4 different patch sizes. This
information forms the pseudo ground truth values for comparison to the LANDSAT
TM derived vegetation layer (woody/non-woody). The attribute data for each of the
49 secondary grid points is compiled and crosstabulated with the classified
LANDSAT TM woody vegetation values. The OA%, user and producer accuracies
along with the Kappa statistic are derived from this error matrix to assess the
attribute accuracy at each primary grid point. These estimates are statistically valid
at the secondary grid point level.
Positional accuracy is assessed by recording 40 or more ground control points
(GCPs) from the 1:100,000 map sheet containing each primary grid point. These
Eastings and Northings are compared to their position in the LANDSAT TM image.
From these GCPs, the mean and standard deviation of the absolute differences along
with the 2D-RMS and Circular Error probability (CEP) radius are computed.
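The text does not define the 2D-RMS or CEP formulas, so the sketch below rests
on two assumptions: the 2D-RMS is taken as the root mean square of the radial
(Easting and Northing combined) differences, and the CEP radius is taken
empirically as the smallest observed radial error that contains the requested
share of GCPs. All function names and coordinates are hypothetical.

```python
import math


def radial_errors(map_coords, image_coords):
    """Radial distance in metres between each map GCP and its image position."""
    return [math.hypot(me - ie, mn - im)
            for (me, mn), (ie, im) in zip(map_coords, image_coords)]


def rms_2d(errors):
    """Root mean square of the radial errors (assumed meaning of 2D-RMS)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def empirical_cep(errors, level=0.9):
    """Smallest observed radius containing at least `level` of the GCPs."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(level * len(ordered)))
    return ordered[rank - 1]


# Four hypothetical GCPs (Easting, Northing) from the map and the image.
map_gcps = [(1000, 2000), (1500, 2500), (3000, 1000), (4000, 4000)]
image_gcps = [(1003, 2004), (1506, 2508), (3000, 1000), (4030, 4040)]

errors = radial_errors(map_gcps, image_gcps)
rms = rms_2d(errors)
cep90 = empirical_cep(errors, level=0.9)
```

A real assessment would use 40 or more GCPs per map sheet; four are used here
only to keep the arithmetic easy to follow.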
The specified positional accuracy of the woody vegetation dataset at 1:100,000
scale is 90% ± 50 metres. The results from the 3 spot checks indicate that all the
mean 2D-RMS values are > 50 m and the 85% CEP are 2 to 3 times the 50 m
specification.
⁴ Budgetary constraints during the implementation of the accuracy assessment methodology dictated that the
sampling grids be reduced to 10 x 15 primary and 6 x 6 secondary points giving a total of 5,400 sampling
points.
SAMPLE SIZE DETERMINATION
The sampling unit for the attribute accuracy assessment is a patch of woody
vegetation. The minimum Basin wide patch size is 1 ha. The total number of 1 ha
patches is N = 105,800,000 and at the minimum mapping unit size of 0.25 ha, N =
423,200,000.
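The patch counts above follow directly from the Basin area. The short sketch
below reproduces the arithmetic and also shows how many 25 m pixels make up
each of the four patch sizes; the constant and function names are hypothetical.

```python
BASIN_AREA_KM2 = 1_058_000
HA_PER_KM2 = 100
PIXEL_SIDE_M = 25
M2_PER_HA = 10_000


def patch_count(patch_size_ha):
    """Number of patches of the given size that tile the whole Basin."""
    return round(BASIN_AREA_KM2 * HA_PER_KM2 / patch_size_ha)


def pixels_per_patch(patch_size_ha):
    """Number of 25 x 25 m pixels in a patch of the given size."""
    return round(patch_size_ha * M2_PER_HA / PIXEL_SIDE_M ** 2)


n_one_ha = patch_count(1.0)
n_quarter_ha = patch_count(0.25)
pixels = {size: pixels_per_patch(size) for size in (0.25, 1.0, 4.0, 9.0)}
```

The four patch sizes correspond to square windows of 2 x 2, 4 x 4, 8 x 8 and
12 x 12 pixels.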
One of the statistical problems faced in the sample design is that there are very
few accepted guidelines for determining sample sizes in spatial analysis. Many well
respected authors and studies pluck a sample size from the ether with no
justification. Goodchild et al. (1991) use a sample size of 1347 sites systematically
sampled from 56,973 sites and do not describe how they arrived at this figure.
The beginnings of the statistical definition of sample size in spatial analysis are
seen in the early work of Van Genderen et al. (1978). They outline the use of the
binomial distribution as a means for deciding the sample size based on a confidence
requirement (95%) and a specified minimum classification accuracy in the population
of 85% correct. This corresponds with the US Geological Survey Circular 671
operational job specification. Rosenfield & Melly (1980) and Veregin (1989)
provide a similar rationale based on the confidence interval for proportions, again
based on the binomial distribution.
At the secondary grid level, the proposed sample design takes an area of
10 x 10 km. This corresponds to 10,000 x 1 ha patches. Thus based on the binomial
theory, a sample of 60 one hectare patches from 10,000 one hectare patches, a
sample fraction of 0.6%, should provide an unbiased estimate of the population (here
a 10 x 10 km area) classification accuracy.
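One common reading of the binomial argument is sketched below: find the
smallest sample size n for which an error-free sample would be unlikely if the
true accuracy were only at the stated level. This is an assumed simplification
and may not match the published tables exactly; the function name is hypothetical.

```python
import math


def min_sample_size(accuracy, confidence=0.95):
    """Smallest n such that accuracy ** n <= 1 - confidence.

    If the true accuracy were no better than `accuracy`, a sample of this
    size with no errors at all would occur with probability at most
    1 - confidence.
    """
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(accuracy))


n_95 = min_sample_size(0.95)   # 95% correct at 95% confidence
n_85 = min_sample_size(0.85)   # 85% correct at 95% confidence
sample_fraction = 60 / 10_000  # 60 patches out of 10,000 one hectare patches
```

Under this reading the 95% case needs 59 points, which is close to the 60 used
for the secondary grid, and the sample fraction works out to 0.6%.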
However, at the Basin wide level, 60 one hectare sample points amongst
105,800,000 (a sample fraction of 0.0001%) is a tad small! Theoretically it is
justifiable. Typical telephone polling of the entire Australian population has sample
fractions of 0.01%, 100 times that of the above!
Unfortunately, the Remote Sensing & GIS literature gives little guidance as to
what constitutes an acceptable sample size. Discussions with Dr. Ray Czaplewski
and Gretchen Moisen confirmed this view. Congalton (1991) offers a rule of thumb
of 100 sample sites per classification category. Czaplewski & Catts (1992)
recommend sample sizes of 500 to 1000 based on simulation studies.
The decision on the final sample sizes was constrained by 3 factors. The number
of primary sampling points needed to be as large as practicable to minimise the
contour interpolation error. The secondary grid sample size of 60 (95% correct
classification with 95% confidence) based on the binomial distribution seemed
reasonable based on the literature and discussions with colleagues. The sample size
had to accommodate the patch size sampling frames on the transparent overlay.
The compromise decided on was a primary grid of 12 x 18 (216 points) and a
secondary grid of 7x7 (49 points) giving a total sample size of 10,584 patches. The
secondary grid size is statistically defensible and the primary grid size gives
sufficient control points for the contour interpolation.
ANALYSIS OF THE ATTRIBUTE AND POSITIONAL ACCURACY.
The Positional accuracy assessment will be based on a summary table of the
absolute differences between the map and digital image coordinates (E & N) of the
GCPs recorded at the secondary grid level for each primary grid point. The GCP
outliers (extreme differences, i.e. > 1 km) will be identified, documented and
removed from the analysis. Summary statistics (means, standard deviations,
maximum & minimum) for each primary grid point will be aggregated into a Basin
wide positional accuracy report. The 2D-RMS and Circular error probability radius
(CEP) for each primary grid point will be created from the raw GCP differences
discussed above. A contour plot of the 2D-RMS from each primary grid point will
be produced. This contour map illustrates the spatial variability of positional
accuracy across the MDB.
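The outlier screening and summary table for one primary grid point could look
like the sketch below. The text does not say whether the 1 km limit applies to
each coordinate or to the combined offset; here a GCP is assumed to be an
outlier when either its Easting or its Northing difference exceeds 1,000 m.
The function name and the example differences are hypothetical.

```python
import statistics


def summarise_gcps(differences, outlier_limit=1000.0):
    """Summarise absolute GCP differences for one primary grid point.

    differences: list of (dE, dN) in metres (map minus image).
    Returns (summary, outliers); outliers are documented, not analysed.
    """
    kept, outliers = [], []
    for de, dn in differences:
        if abs(de) > outlier_limit or abs(dn) > outlier_limit:
            outliers.append((de, dn))
        else:
            kept.append((de, dn))

    summary = {}
    for name, index in (("easting", 0), ("northing", 1)):
        values = [abs(d[index]) for d in kept]
        summary[name] = {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),  # needs at least 2 GCPs
            "min": min(values),
            "max": max(values),
        }
    return summary, outliers


# Hypothetical differences in metres; the last GCP is an obvious blunder.
gcp_differences = [(30.0, -40.0), (-50.0, 20.0), (10.0, 60.0), (1500.0, -20.0)]
summary, outliers = summarise_gcps(gcp_differences)
```

One such summary per primary grid point would then be aggregated into the Basin
wide positional accuracy report.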
The Attribute accuracy assessment follows the trend in most of the literature in
utilising an analysis of the error matrix. For each Primary grid point and for each
patch size, an error matrix of the pseudo ground truth values of woody/non-woody
derived from the air photo interpretation versus the values derived from the digital
imagery is constructed.
From this error matrix (for each patch size) the Overall Accuracy % (OA%),
Kappa statistic and user and producer accuracies are computed for each primary grid
point and for each patch size. They can be reported separately as the secondary level
sample size is sufficient to make them statistically valid. The contents of the Basin
wide attribute accuracy summary table are then contour mapped to produce the spatial
variability maps of Attribute accuracy.
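Building one error matrix per patch size from the collected observations can be
sketched as follows. The record layout (patch size, mapped value, reference
value, with 1 for woody and 0 for non-woody) and the function names are
assumptions made for illustration only.

```python
from collections import defaultdict


def error_matrices_by_patch_size(records):
    """Cross-tabulate mapped versus reference values for each patch size.

    records: iterable of (patch_size_ha, mapped, reference) where mapped
    and reference are 1 (woody) or 0 (non-woody).
    Returns {patch_size_ha: matrix} with matrix[mapped][reference] counts.
    """
    matrices = defaultdict(lambda: [[0, 0], [0, 0]])
    for patch_size, mapped, reference in records:
        matrices[patch_size][mapped][reference] += 1
    return dict(matrices)


def overall_accuracy(matrix):
    """Proportion of points on the diagonal of a 2 x 2 error matrix."""
    total = sum(sum(row) for row in matrix)
    return (matrix[0][0] + matrix[1][1]) / total


# Hypothetical observations at two of the four patch sizes.
observations = [
    (0.25, 1, 1), (0.25, 1, 0), (0.25, 0, 0), (0.25, 0, 1),
    (1.0, 1, 1), (1.0, 0, 0), (1.0, 1, 1), (1.0, 0, 1),
]
matrices = error_matrices_by_patch_size(observations)
oa_by_size = {size: overall_accuracy(m) for size, m in matrices.items()}
```

Comparing the resulting accuracies across patch sizes is one simple way to
examine the patch size effect described in the sample design.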
CONCLUSIONS
The attribute accuracy assessment project developed and trialed a methodology
for assessing the attribute and positional mapping accuracy of the woody vegetation
layer (woody/ non-woody) within the MDB dataset. The flexibility of two stage
systematic sampling, its simplicity of implementation, the large number of regular
grid points distributed across the basin at different sampling levels and the potential
flexibility for post stratification and interpolation were the deciding factors in
choosing it for the attribute accuracy assessment for the MDBP.
ACKNOWLEDGEMENTS
I'm indebted to the practical experience of Adam Choma of CNR, Peter Knock
of DLWC, Bathurst and Graeme Dudgeon (NSW Dept. Agriculture, Orange). Mrs.
Kim Smith my research assistant persevered with the GCPs. On the statistical front,
Dr. Ray Czaplewski of the US Forestry Service, Fort Collins Colorado USA proved
an invaluable sounding board. Gretchen Moisen of the US Dept. of Agriculture,
Ogden Utah USA provided insights into accuracy assessment on large scale datasets.
Dr. Brian Lees (Australian National University, Geography Dept.) provided early
thoughts.
REFERENCES
Cochran, W.G., 1977, Sampling Techniques (3rd ed.), John Wiley & Sons, NY
Congalton, R.G., 1988, Comparison of sampling schemes used in generating
error matrices for assessing the accuracy of maps generated from remotely
sensed data., Photogrammetric Engineering and Remote Sensing, v 54, n 5 May
1988, p 593-600
Congalton R.G., 1991, A review of assessing the accuracy of classifications of
remotely sensed data., Remote sensing of Environment, v 37, n 1, p 35-46
Crapper, P.F., Walker, P.A., Nanninga, P.M., 1986, Theoretical prediction of the
effect of aggregation on grid cell data sets., Geo-processing, 3, p 155-166.
Czaplewski, R. & Catts, G.P., 1992, Calibration of Remotely Sensed proportion or
area estimates for misclassification error. Remote Sensing of Environment, v 39,
p 29-43
Czaplewski, R., Goodson, P., Gilbee, A., Razier, P., Choma, A., 1992, Accuracy
assessment of Remotely Sensed thematic maps. Project Brief, Dept Conservation
and Environment, Victoria, Australia, September 29, 1992.
Fitzgerald, R.W. & Lees, B.G., 1994a, Assessing the classification accuracy of
multisource remote sensing data., Remote Sensing of Environment, v 47, n 3,
p 362-368
Fitzgerald, R.W. & Lees, B.G., 1994b, Spatial Context and scale relationships
in raster data for thematic mapping in natural systems, Spatial Data Handling
Conference, Edinburgh, Scotland, Sept 1994. Published by Taylor & Francis, UK
(in press).
Franklin, J., Hiernaux, P.H.Y., 1991, Estimating Foliage and Woody Biomass
in Sahelian and Sudanian Woodlands Using a Remote-Sensing Model.,
International Journal of Remote Sensing, v 12, n 6, p 1387-1404
Goodchild, M.F., Davis, F.W., Painho, M., Stoms, D.M., 1991, The use of vegetation
maps in geographic information systems for assessing conifer lands in California.
National Centre for Geographic Information and Analysis, Dept Geography, Uni
California. Report 91-2B NCGIA
Gilbee, A. & Goodson, P., 1992, Mapping tree cover across Victoria using
Thematic Mapper digital data and a GIS., Proc. 6th. Australian Remote
Sensing Conference, Wellington, NZ, Nov 1992.
Janssen, L. L.F. & van der Wel, F. J. M., 1994, Accuracy assessment of satellite
derived land-cover data: a review, Photogrammetric Engineering & Remote
Sensing, v 60, n 4, April, p 419-426
McDonald, R.C., Isbell, R.F., Speight, J.G., Walker, J. & Hopkins, M.S., 1990,
Australian Soil and Land Survey; Field Handbook, ed. 2, Inkata Press,
Melbourne, Australia.
Moisen, G.G., Edwards, T.C. Jnr, Cutler, D.R., 1994, Spatial sampling to assess
classification accuracy of Remotely Sensed data. Environmental Information
Management and Analysis: Ecosystem to Global Scales Michener, Brunt and
Stafford (eds). p 159-176
Murray Darling Basin Commission, 1995, Recipe for the Attribute and Positional
Accuracy Assessment of the Murray Darling Basin Project., internal working
document, by R W Fitzgerald, Infoplus for DLWC Land Information Centre,
Bathurst, Australia.
Ritman, K.T., 1995, Structural vegetation data; A speczfications for the Murray
Darling Basin Project M305. DLWC, Land Information Centre, Bathurst,
Australia
Rosenfield, G.H. & Fitzpatrick-Lins, K., 1986, A coefficient of agreement as a
measure of thematic classification accuracy., Photogrammetric Engineering and
Remote Sensing, v 52, n 2, p 223-227.
Rosenfield, G.H. & Melly, M.L., 1980, Applications of statistics to thematic
mapping, Photogrammetric Engineering and Remote Sensing, v 46, n 10,
October, p 1287-1294
Tadrowski, T., Hart, D.G.D., Schepp, K., 1990, A study of the possibilities and
accuracies of 1:50 000 vegetation mapping using Remotely Sensed data. 8th
Australian Inst Cartographers Conference, Darwin, Australia, May 1990.
Van Genderen, J.L., Lock, B.F. & Vass, P.A., 1978, Remote Sensing: Statistical
testing of thematic map accuracy. Remote Sensing of the Environment, v 7, p 3-14.
Veregin, H., 1989, A taxonomy of error in spatial databases. National Centre of
Geographic Information and Analysis, Technical Report 89-12, December 1989.