Document 11863946

advertisement
This file was created by scanning the printed publication.
Errors identified by the software have been corrected;
however, some errors may remain.
A Method for Measuring the Spatial
Accuracy of Coordinates Collected Using
the Global Positioning System
Thomas 0wens1 and David ~ c c o n v i l l e ~
Abstract.--The study applied a method to measure the spatial accuracy of point
data. The quantity of interest was the difference between global positioning
system (GPS) coordinates and surveyed coordinates. This difference is a
bivariate quantity and the probability distribution function (PDF) can be
described by an ellipse with the center at 8 and Y. The confidence ellipse is an
estimate of accuracy; the sample mean is or is not significantly different from the
survey locations at a given a. The tolerance ellipse is an estimate of precision;
a given percentage of the population sampled is enclosed in the tolerance ellipse
at a given a. Thirty-five locations were measured with a GPS receiver at
surveyed locations in the Mississippi River floodplain near La Crosse, Wisconsin.
The average offset was 0.18 m in the easting (X) direction and - 1.13 m in the
northing (Y) direction. Hotelling's one sample test deter~inedthat H, (no
significant departure from the survey locations exists) was rejected at the 0.05
level, which indicated a systematic error in the sample. Ninety-five percent of
the population sampled (at the 0.05 level) was contained in an ellipse centered
on (0.18 m, - 1.13 m), had a major axis of 7.49 m, a minor axis of 5.12 m, and
an angle of 87.74 degrees.
INTRODUCTION
The Environmental Management Technical Center acquires and uses spatial
data for inventory and analysis of trends on the Upper Mississippi River in the
implementation of the Long Term Resource Monitoring Program. Evaluating
accuracy and precision of spatial data in geographic information systems (GIs) is
important for determining the appropriate use of these data. However, a good
method has not been documented for measuring the accuracy of location
measurements because of the problem's complexity and the difficulty in obtaining
accurate location information with traditional survey techniques.
' Cartographer, Environmental Management Technical Center, Onalaska, WI.
Professor, Sr. Mary's University, Winona, MN
It is conceptually easier to determine the spatial accuracy of points rather than
lines or polygons. A point may represent a specific location such as a survey
point or it may represent an area displayed at a small scale. If a point represents
a real single location, it is possible to measure its displacement from its "true"
location. Lines and polygons are complex abstract objects and measuring their
spatial accuracy is more complex (Goodchild and Gopal 1989). For this reason,
this study concentrates on the spatial accuracy of points and leaves the spatial
accuracy of polygons for later study.
The advent of the Global Positioning System (GPS) has reduced the expense
and complexity of traditional survey techniques for measuring the location of
objects on the earth. The GPS enables nonsurveyors to determine the location of
objects with relative ease, but gives no information about the data's accuracy and
precision. This study devises a straight-forward, repeatable, and statistically
sound method of estimating the horizontal accuracy of GPS derived coordinates.
Two measures of accuracy are widely used , RMS and CEP. Root Mean Square
(RMS) error is the square root of the sum of the square of the euclidian distances
of the error, divided by the number of observations. Circle Error Probable (CEP),
is defined as a circle that will enclose 50 percent of the dot print.
These
measures have two disadvantages: (I) they contain no information about the
direction of the error, and (2) they are only measures of accuracy, with no
information about the variability (precision) of the measurements.
Error and accuracy are conceptually the opposite sides of the same coin. Error
is the difference between a quantity's correct value and its estimated value (Thapa
and Bossler 1992). Accuracy is the nearness of quantities to their true values
(Bolstad et al. 1990). Errors may be classified into three types: (1) gross errors
or blunders, (2) systematic errors, and (3) random errors. Gross errors are caused
by carelessness or mistakes of the observer using the equipment. Systematic
errors are due to environmental factors, instrumental imperfections, and human
limitations; they introduce bias into the estimate. Even after all gross and
systematic errors have been eliminated, some small random errors will remain.
Precision is the similarity of repeated measurements among themselves. In a
sample of coordinates representing the same position, precision is the deviation of
the points from the mean coordinate. Total error for a particular measurement is
the sum of the mean (systematic) and deviation (random) errors. Average error
(accuracy) and distribution about this mean (precision) characterize the positional
uncertainty of digital spatial data, which is related to the statistical distribution of
the errors and can be modeled by a probability distribution function (PDF). The
average difference between the test and control data estimates the accuracy, while
the frequency distribution establishes the theoretical PDF which contains
information on both accuracy and precision.
When a sample of GPS coordinates is taken at surveyed locations, the quantity
of interest is the difference in GPS coordinates from the surveyed coordinates.
The X and Y difference between GPS coordinate estimates and the surveyed
coordinate is a bivariate quantity. The joint PDF can be described by a
probability ellipse (Batschelet 1981) with the center at x and P. An ellipse is an
appropriate shape for a joint PDF because it has two dimensions; it is not
rectangular because the joint probability of points occurring in the comers is
small, and it is generally not circular, because X and Y are not necessarily the
same. The ellipse can be described with five statistics: the sample means of (1)
X, (2) and Y, the sample standard deviations of (3) X, (4) and Y, (5) and the
correlation coefficient between X and Y. The quantities X and Y are jointly
distributed, where X and Y depend on each other, but different pairs are
independent of each other in a sample.
There are three ellipses of interest: the standard ellipse, the confidence ellipse,
and the tolerance ellipse. The standard ellipse is a descriptive tool used to
visualize the shape of the ellipse and its orientation. It is readily calculated and
contains about 40% of the sample population. It is not dependent on the sample
size and cannot be used for statistical inference. Five statistics are required to
construct the ellipse: X, Y, s: , sf , and the correlation coefficient r (r =
Cov(X,Y)/s,s, and Cov(X,Y) = ll(n-l)z(Xi - x)(Y, - P)), which measures the joint
behaviors of X and Y. The center of the ellipse is (x, 7 )and the equation of the
ellipse is as follows:
For ease of calculation, some coefficients are shortened:
An ellipse is defined by a major axis (a), a minor axis (b), and an angle 0 of
translation, which is the angle that the major axis is offset from the X axis (and
the angle the minor axis is offset from the Y axis).
Another coefficient must be calculated to arrive at the ellipse parameters:
R = [(A - c)' + 4 ~ ~ 1 ' ~
Then we can calculate the ellipse parameters:
(6)
0 = arctan[2B/(A - C - R)]
(9)
The other two ellipses have an identical shape and orientation but have different
major and minor axes. The Hotelling's confidence ellipse covers the sample's
center with a given accuracy and estimates accuracy. In addition, Hotelling's onesample test can be used to determine if the bivariate sample mean is significantly
different than 0. The confidence ellipse is a region that contains the population
mean at a given probability with the center of the ellipse at (x, Y). The equation
of the ellipse is as follows:
,
is based
Note that the right side of the expression has an added term, T ~which
on the familiar F value for the univariate solution where:
.,
F,,,
denotes the critical value from the F one-tailed distribution with n-2
degrees of freedom and significance level of a. The confidence ellipse is then
calculated using the same coefficients described for the standard ellipse.
The tolerance ellipse is an estimate of precision. It is an estimate of the
confidence ( e g , 95%) of a percentage of the population sampled (e.g, 95%) that
is enclosed in the tolerance ellipse. The tolerance ellipse is a region that contains
a given percent of the population at a given probability with the center of the
ellipse at (x, 7 ) . The equation of the ellipse is:
The right side of the expression has the term H, which is based on the noncentral x2 distribution and is approximated by Table 2 in Chew (1966). The
tolerance ellipse is then calculated using the same coefficients described for the
standard ellipse.
This elliptical method of describing spatial accuracy and precision is valid
because it is (1) intuitive, (2) understandable graphically, (3) statistically sound,
(4) sufficient (e.g., it describes all relevant parameters), and (5) concise.
METHODS
The location of the study included Navigation Pools 7 and 8 of the Upper
Mississippi River near La Crosse, Wisconsin. This reach of the river is a complex
of islands, channels, and backwaters surrounded by 500-ft high bluffs.
Forty-one surveyed ground locations were used as test points. Thirty-six thirdorder points were surveyed by the U.S. Army Corps of Engineers and were
distributed along the main channel of the river. Five first-order points located at
road intersections in the floodplain were surveyed by the Wisconsin and
Minnesota Department of Highways.
A Trimble Pathfinder Basis Plus GPS receiver was used in the field as the
rover. The GPS measurements were taken with the rover on June 8, 9, and 17,
1994. The PDOP mask was set to 5. The receiver was set to take one position
per second and the horizon mask was set at 10". The unit received signals until
approximately 200 readings were taken at each point. A Trimble Pathfinder
Professional receiver was established at a surveyed point to act as the base station.
The base was set to take positions once every 5 seconds, had a PDOP mask of 5,
and a horizon mask of 10". The measurements were post-processed to introduce
differential correction using Trimble's PFinder software on a personal computer.
Survey measurements were recorded in the State Plane Coordinate System,
Minnesota South zone, and converted to the Universal Transverse Mercator
(UTM) projection, zone 15. The GPS measurements were taken in WGS-84 and
converted to the UTM projection, zone 15, NAD27.
The X (easting) and Y (northing) differences between the survey GPS
measurements were tested to determine if their distribution was normal using the
x2 (chi-square) test (Zar 1984). If the easting and northing differences were
normally distributed, it is possible to use bivariate normal tests to test for
systematic bias from zero and to describe the PDF with ellipses (Batschelet 1981).
Hotelling's confidence ellipse, the standard ellipse, and the tolerance ellipse (Chew
1966) were calculated to described the PDF. Hotelling's one-sample test was run
to determine if the sample average was significantly different than 0.
RESULTS AND DISCUSSION
Thirty-five usable locations were differentially corrected from the 41 locations
measured. Five of the original locations were not usable because the base station
and rover did not receive positions from the same satellites and so the GPS
readings were not corrected during differential post-processing (possibly due to
setting the mask angle to 10" on both the rover and the base station). One
surveyed point had incorrect coordinates and was discarded.
The average distance from the survey points was -1.13 m in the northing (Y)
direction and 0.15 m in the easting (X) direction.
The easting and northing distances were tested for normality. H, (that the
distributions were normal) was not rejected at the 0.05 level. Thus, we could
calculate the descriptive and statistical ellipses and test to determine if there was
a significant difference from 0 for the mean distance.
The standard ellipse was centered on the sample means of (0.15, -1.13), and had
a major axis of 2.60 m and a minor axis of 1.77 m with an angle from the X axis
of 87.74" (figure 1). The confidence ellipse had a major axis of 1.12 m and a
minor axis of 0.77 m, with the same angle of 87.74". Hotelling's one-sample test
was used to determine if the sample means were significantly different than 0.
The result was that H, (that the population means equal 0) was rejected at the 0.05
level. The tolerance ellipse, which was calculated to contain 95% of the
population at the 95% confidence level, had a major axis of 7.49 m and a minor
axis of 5.12 m.
The confidence ellipse does not encompass the origin (0, 0), which graphically
shows that the sample mean is different than 0. Thus, the sample has systematic
error that is offset in the south and east directions. We can also say that we are
95% confident that 95% of the population we sampled is contained in an ellipse
that is centered on 0.15, -1.13, and has a major axis of 7.49 m, a minor axis of
5.12 m with an angle of 87.74" at the 0.05 level.
CONCLUSION
The purpose of this study was to apply a method to measure the spatial
accuracy of point data. Hotelling's one-sample test made inferences about the
sample data's means and a tolerance ellipse described the PDF of the population.
The confidence ellipse is an intuitive graphic that shows whether systemic error
exists in the sample. This method is theoretically sound and provides intuitive and
concise statistics about a population. We can now describe the accuracy of the
sample data generated by GPS described in the study as having a mean that is
0.18, - 1.13 m, and has systemic error in the east and south directions. We can
further state that 95% of the population is contained in an ellipse (at the 95%
confidence level) with a center at 0.18, -1.13 m having a major axis of 7.49 m,
a minor axis of 5.12 m, and an angle of translation from the horizontal of 87.74".
This method to measure and describe the spatial accuracy of point data is useful
for three reasons. First, the method is repeatable; similar results will be obtained
for a given set of points when the method is repeated and accuracy statements can
be compared for different sets of data. Second, it is statistically valid; this method
provides a probability statement about the accuracy statement. Third, it is
sufficient. Merely stating that the GPS-derived points are "within 5 m of their
true location" is not sufficient because no probability statement is associated with
the accuracy statement. In addition, X and Y coordinates are related, but do not
vary in a perfect one-to-one relationship; it is necessary to describe how both X
and Y vary, along with the confidence ellipse, which describes the accuracy, and
the tolerance ellipse, which describes the precision.
GPS Offsets from Survev Points
Comparison of Survey Points and GPS ?oints
1
-
/,+
\
-----A
3
/
/
L
Confidenc Ellipse
Standard E lipse
7aelmC-e 3 l i ~ -
-15
-10
-5
0
5
10
15
Meters
GPS Easting Offset
Figure 1.--Distribution of distances between the GPS-derived coo~dinatesand survey-derived
coordinates and ellipses which describe the sample, accuracy, and probability distribution function.
ACKNOWLEDGEMENTS
This study was supported by the U.S. Army Corps of Engineers and the
National Biological Service through the Upper Mississippi River System Long
Term Resources Monitoring Program.
REFERENCES
Batschelet, E. 1981. Circular statistics in biology. Academic Press, New York.
310 pp.
Bolstad, P., P Gessler, and T. Lillesand. 1990. Positional uncertainty in manually
digitized map data. International Journal of GIs. 4(4): 399-4 12.
Chew, V. 1966. Confidence, predictive, and tolerance regions for the multivariate
normal distribution. American Statistical Association Journal 61:605-617.
Goodchild, M., and S. Gopal. 1989. Preface. Pages xi-xv in National Center for
Geographic Information and Analysis. M. Goodchild and S. Gopal, editors.
Accuracy of Spatial Database. Taylor and Francis, New York.
Thapa, K., and J. Bossler. 1992. Accuracy of spatial data used in GIs.
Photogrammetric Engineering and Remote Sensing 58(6): 835-84 1.
Zar, J. 1984. Biostatistical analysis. Prentice-Hall, Englewood Cliffs, New Jersey.
620 pp.
BIOGRAPHICAL SKETCHES
Thomas Owens is a Cartographer with the National Biological Service's
Environmental Management Technical Center in Onalaska, Wisconsin. He
received a Master of Science in forest management and remote sensing from the
University of Minnesota in 1978. He is Leader of the Cartographic Services
Group and provides GIs expertise to the EMTC.
David McConville is a codirector of the Resource Studies Center and Professor
of Biology at Saint Mary's University of Minnesota. He holds a Ph. D. in
fisheries from the University of Minnesota and has over 25 years of experience
working on aquatic ecosystems and, in particular, Mississippi River ecology.
Download