Variance Estimators for Airborne Laser Surveys

Error Sources in Regional Airborne LiDAR Surveys
Ross Nelson
Biospheric Sciences Branch, Code 614.4
NASA-Goddard Space Flight Center
Greenbelt, Maryland 20771 USA
1-301-614-6632
Ross.F.Nelson@nasa.gov
November 17, 2005
EXTENDED ABSTRACT
Airborne lasers may be used as sampling tools to estimate regional (e.g.,
county, state, province, prefecture, ecozone) forest resources. LiDARs (Light
Detection and Ranging) are used to measure distances from aircraft to forest
canopy and from aircraft to ground along flight transects 10s to 1000s of
kilometers long. The height of the forest canopy, i.e., the difference between
these ranging measurements, can be quantitatively related to the amount of
wood on the ground. When used in conjunction with Line Intercept Sampling
techniques (LIS, Kaiser 1983; DeVries 1986), airborne LiDAR profiles (Figure 1)
may be employed to estimate forest merchantable volume, total aboveground dry
biomass, carbon, and forest fuel loads over large areas.
One critical component of this sampling procedure is the development of
an equation or set of equations to predict, for instance, biomass, as a function of
laser height measurements.
Typically, these equations are developed by
locating ground plots in selected areas beneath portions of the airborne LiDAR
transect. Ground-measured biomass is paired with coincident laser
measurements of forest height, and parametric or nonparametric regression
approaches are used to construct the predictive equations. Investigators employ
different models to relate ground-measured biomass to laser-measured forest
heights.
Height-biomass relationships generally tend to be linear or slightly
curvilinear, and the relationships are frequently heteroskedastic (Lambert et al.
2005, see Figure 2). Log-log models are commonly employed (e.g., Næsset
2002; Næsset and Gobakken 2005) to control biomass scatter as height
increases.
Although these log models improve regression fit and control
heteroskedasticity, an argument has been made (Nelson et al. 2004) that
regional estimates are less accurate when regression estimates are back-transformed,
even after accounting for the back-transformation bias (Wiant and
Harner 1979).
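The log-log fit and its back-transformation can be sketched as follows. This is a minimal illustration on synthetic data, not the models used in the studies cited above: the function names, the coefficients, and the noise level are all assumptions, and the correction factor exp(sigma^2 / 2) is the usual lognormal one, which may differ from the correction the cited authors applied.

```python
import numpy as np

def fit_loglog(height, biomass):
    """Least-squares fit of ln(biomass) = a + b * ln(height)."""
    x = np.log(height)
    y = np.log(biomass)
    b, a = np.polyfit(x, y, 1)           # polyfit returns slope, then intercept
    resid = y - (a + b * x)
    sigma2 = resid.var(ddof=2)           # residual variance on the ln scale
    return a, b, sigma2

def predict_biomass(height, a, b, sigma2, correct_bias=True):
    """Back-transform a log-log prediction to the original units."""
    naive = np.exp(a + b * np.log(height))
    # Assumed lognormal correction for back-transformation bias.
    return naive * np.exp(sigma2 / 2.0) if correct_bias else naive

# Synthetic (made-up) heights in m and biomass in t/ha.
rng = np.random.default_rng(0)
height = rng.uniform(5.0, 30.0, size=200)
biomass = np.exp(0.5 + 1.5 * np.log(height) + rng.normal(0.0, 0.4, size=200))

a, b, sigma2 = fit_loglog(height, biomass)
naive = predict_biomass(height, a, b, sigma2, correct_bias=False)
corrected = predict_biomass(height, a, b, sigma2)
print(f"slope={b:.3f}, sigma2={sigma2:.3f}")
print(f"mean naive={naive.mean():.1f}, mean corrected={corrected.mean():.1f}")
```

The uncorrected prediction estimates the conditional median rather than the mean, so it sits below the corrected one whenever the residual variance is positive.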
Figure 1. An airborne LiDAR profile acquired over the state of Delaware, on the mid-Atlantic
coast of the eastern US, in the summer of 2000. The bottom picture is a color-infrared airphoto
with land cover delineated and the aircraft flight line (yellow) superimposed. The six-digit numbers
(hhmmss, GMT) denote the aircraft location, recorded once every 2 seconds by the LiDAR GPS.
The aircraft is flying at ~50 m/s; approximately 1.4 km of flight line is shown. The top graph depicts
the associated laser trace of vegetation heights (from Nelson et al. 2004). The laser spike at the
stream actually depicts null or zero laser returns; water absorbs the near-infrared (λ=0.905 μm)
pulse.
Fifty-six flight lines totaling over 5159 km of flight data over the state of
Delaware were acquired during the summer of 2000. A small profiling LiDAR
(the Portable Airborne Laser System, PALS, Nelson et al. 2003a) was used to
collect the forest canopy height information. The 56 parallel lines, oriented N-S
along the long axis of the state, were spaced 1 km apart. Delaware includes 3
counties (New Castle, Kent, and Sussex), and county and statewide estimates of
forest volume, biomass, carbon, impervious surface area, open water area, and
wildlife habitat have been generated (Nelson et al. 2003b, 2004, 2005). Variance
estimates reported in these studies were calculated assuming that the
systematically acquired data were actually a random sample.
Figure 2. A scatterplot of 90th-percentile laser height (d90, X axis, in m) versus total aboveground dry
biomass (tagdb, Y axis, in t/ha) for two study areas in North Carolina (NC) and Tennessee
(ORNL–Oak Ridge National Lab) USA. The blue and red points are deciduous forest; the green
points are loblolly pine plantations. BioSAR is a vegetation RaDAR. PALS height values are
illustrated.
An assumption of randomness in a systematic survey can lead to a
significant variance overestimate (Osborne 1942; Nyyssönen et al. 1967; 1971),
especially if a population is ordered or spatially autocorrelated (Cochran 1977,
pg. 221; Sukhatme et al. 1984, pg. 417). Conversely, a laser-based line intercept
sample ignores various sources of error, including the regression error discussed
above, leading to variance underestimates. The objective of this study is twofold:
(1) Determine and quantify the effects of including regression error in the
variance of laser-based estimates of biomass. (2) Test three different weighted
variance estimators, namely a simple random sampling estimator (SRS), a successive
differences estimator (SD, Lindeberg 1924; 1926; Guest 1951), and Newton's
Method (NM, T. Gregoire, personal communication, 2005), and compare these
to the empirical systematic sample variances to identify the most accurate
estimator.
LIS Sampling Error without and with Regression Error:
County and statewide standard errors are calculated without and with
regression error. Regression error is introduced by adding random error to the
regression estimate of biomass at the segment level, where a segment is a short
section of a linear laser transect ≤40 m long and completely contained within a cover
type. Regression error is assumed to be normally distributed, and the size of the
error is based on the standard deviation of the regression residuals, as follows:
$\varepsilon^{reg}_{jks} = N[0,1] \cdot s^{reg}_{j}$, where $s^{reg}_{j}$ is the regression RMSE, and j is the land cover
subscript. Weighted sums of segment estimates are averaged to calculate flight
line estimates, and weighted flight line estimates are averaged to calculate
regional estimates. Weights are related to lengths of segments or flight lines.
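The procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the study's processing code: the segment records, the RMSE values, and the function name `regional_estimate` are all hypothetical, weights are taken to be segment and flight line lengths, and cover types are pooled into a single regional mean for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical segment records (all values are made up for illustration).
line_id = np.array([0, 0, 0, 1, 1, 2, 2, 2])        # flight line of each segment
cover = np.array([0, 1, 0, 0, 1, 1, 1, 0])          # land cover class j
seg_len = np.array([40.0, 35.0, 40.0, 20.0, 40.0, 40.0, 30.0, 40.0])  # m, <= 40
pred = np.array([120.0, 60.0, 110.0, 130.0, 55.0, 65.0, 70.0, 100.0])  # t/ha
rmse_by_cover = np.array([25.0, 15.0])              # assumed regression RMSE per class

def regional_estimate(values, seg_len, line_id):
    """Length-weighted averaging: segments -> flight lines -> region."""
    lines = np.unique(line_id)
    line_len = np.array([seg_len[line_id == k].sum() for k in lines])
    line_mean = np.array([
        np.average(values[line_id == k], weights=seg_len[line_id == k])
        for k in lines
    ])
    return np.average(line_mean, weights=line_len)

# Regression error: a standard normal draw scaled by the RMSE of the
# segment's cover class, added to the segment-level prediction.
noise = rng.standard_normal(pred.size) * rmse_by_cover[cover]

est_without = regional_estimate(pred, seg_len, line_id)
est_with = regional_estimate(pred + noise, seg_len, line_id)
print(est_without, est_with)
```

Repeating the noisy run many times and comparing the spread of the resulting estimates with the no-noise case is one way to gauge how much regression error adds to the sampling error.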
When standard errors based on strictly linear predictive models are
compared with ln-ln model standard errors, the ln-ln standard errors are, on
average, 4-5 times larger than the comparable linear results. The addition of
linear regression error adds, on average, 2-10% to the LIS error. The addition of
back-transformed ln-ln regression error adds, on average, 20-40%. Though the
ln-ln regression models had uniformly higher R² values, results suggest that, at
least with the models developed in this study and with the procedures employed
to process the flight line data, the use of ln-ln models to predict biomass leads to
inflated variances, poorer cross-validation accuracy (Nelson et al. 2004), and
excessive bias. These results reflect the fact that small residual errors can grow
significantly when the ln(biomass) estimates are back-transformed.
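The growth of small log-scale errors on back-transformation can be written as a short worked relation. The symbols are notation assumed here for illustration and do not come from the study: $\hat{y}$ is the back-transformed prediction and $\delta$ is an error on the ln scale.

```latex
\ln \hat{y}_{\mathrm{err}} = \ln \hat{y} + \delta
\quad\Longrightarrow\quad
\hat{y}_{\mathrm{err}} = \hat{y}\, e^{\delta},
\qquad
\hat{y}_{\mathrm{err}} - \hat{y} = \hat{y}\,\bigl(e^{\delta} - 1\bigr)
```

The error is therefore multiplicative in the original units. With $\delta = 0.2$, $e^{0.2} - 1 \approx 0.22$, so a 200 t/ha prediction shifts by about 44 t/ha while a 20 t/ha prediction shifts by only about 4.4 t/ha.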
Considering a strictly linear, predictive regression model, 95% confidence
limits are on the order of 5-10 t/ha for the forested cover classes if an area the
size of Delaware is transected with flight lines spaced 2 km apart, i.e., 28 flight
lines. The more ubiquitous the cover type on the landscape, in general, the
smaller is the standard error of estimate.
Estimating Regional Sampling Error – 3 Estimators
Results for the three variance estimators (SRS, SD, and NM) were
compared to empirical results to see how well the estimators tracked the
systematic sample results, with regression error included. The results indicated
that, for study areas between 2500 and 5000 km², the weighted simple random
sampling estimator, averaged across cover type and sampling intensity, tracked
the empirical standard errors within ~0-20%. The SRS estimator, within these
areal bounds, is conservative. Below 2500 km², unfortunately, the SRS
estimator quickly becomes pathologically nonconservative. The weighted
successive differences estimator was the most accurate of the three considered on
study areas below 2500 km². The SD estimator was consistently conservative
throughout the range of areas considered in this study. The SD estimator
overestimated the systematic sample standard errors, including regression error,
by 10-33%. Trends suggest that the weighted SRS estimator should be
considered on areas exceeding 5000 km², though this observation is based on
extrapolation.
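For readers unfamiliar with the first two estimators, the sketch below shows one common textbook form of each for a length-weighted mean of flight line estimates. The exact weighted forms used in this study are not given in the abstract, so the formulas, the function names, and the synthetic data are assumptions. Newton's Method is not sketched because the abstract does not describe it.

```python
import numpy as np

def weighted_mean(y, w):
    """Weighted mean of flight line estimates y with weights w."""
    return np.sum(w * y) / np.sum(w)

def var_srs(y, w):
    """Variance of the weighted mean, treating lines as a random sample."""
    n = y.size
    d = w * (y - weighted_mean(y, w))
    return n / (n - 1) * np.sum(d ** 2) / np.sum(w) ** 2

def var_sd(y, w):
    """Successive differences form: uses differences between adjacent lines."""
    n = y.size
    d = w * (y - weighted_mean(y, w))
    return n / (2 * (n - 1)) * np.sum(np.diff(d) ** 2) / np.sum(w) ** 2

# Synthetic example: 28 parallel lines, in spatial order, with a made-up
# east-west trend in biomass (t/ha) and made-up line lengths (km).
rng = np.random.default_rng(1)
y = np.linspace(50.0, 150.0, 28) + rng.normal(0.0, 5.0, size=28)
w = rng.uniform(60.0, 100.0, size=28)

se_srs = np.sqrt(var_srs(y, w))
se_sd = np.sqrt(var_sd(y, w))
print(f"mean={weighted_mean(y, w):.1f}  SE(SRS)={se_srs:.2f}  SE(SD)={se_sd:.2f}")
```

With equal weights, `var_srs` reduces to the familiar s²/n. On spatially ordered data with a trend, as in this synthetic example, the successive differences form removes much of the trend from the variance, so it comes out smaller than the random-sample form.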
LITERATURE CITED
1. Cochran, William G. 1977. Sampling Techniques, 3rd ed., John Wiley &
Sons, New York. 428 pgs.
2. DeVries, P.G. 1986. Sampling Theory for Forest Inventory. Springer-Verlag,
New York. 399 p.
3. Guest, P.G. 1951. The Estimation of Standard Error from Successive Finite
Differences. Journal of the Royal Statistical Society, Series B (Methodological)
13(2): 233-237.
4. Kaiser, L. 1983. Unbiased Estimation in Line Intercept Sampling. Biometrics
39: 965-976.
5. Lambert, M.-C., C.-H. Ung, and F. Raulier. 2005. Canadian national tree
aboveground biomass equations. Canadian Journal of Forest Research 35:
1996-2018.
6. Lindeberg, J.W. 1924. Über die Berechnung des Mittelfehlers des Resultates
einer Linientaxierung. Acta Forestalia Fennica 25: 3-22. (in German)
7. Lindeberg, J.W. 1926. Zur Theorie der Linientaxierung. Acta Forestalia
Fennica 31(6): 3-9. (in German)
8. Næsset, E. 2002. Predicting forest stand characteristics with airborne
scanning laser using a practical two-stage procedure and field data. Remote
Sensing of Environment 80: 88-99.
9. Næsset, E., and T. Gobakken. 2005. Estimating forest growth using canopy
metrics derived from airborne laser scanner data. Remote Sensing of
Environment 96(3-4): 453-465.
10. Nelson, R. F., G. Parker, and M. Hom. 2003a. A Portable Airborne Laser
System for Forest Inventory. Photogrammetric Engineering and Remote
Sensing 69(3): 267-273.
11. Nelson, R.F., E.A. Short, and M.A. Valenti. 2003b. A Multiple Resource
Inventory of Delaware Using Airborne Laser Data. BioScience 53(10): 981-992.
12. Nelson, R.F., M. Valenti, A. Short, and C. Keller. 2004. Measuring Biomass
and Carbon in Delaware Using an Airborne Profiling LiDAR. Scandinavian
Journal of Forest Research 19: 500-511. [Erratum. 2005, 3: 283-284.]
13. Nelson, R.F., C. Keller, and R. Ratnaswamy. 2005. Locating and
Estimating the Extent of Delmarva Fox Squirrel Habitat Using an Airborne LiDAR
Profiler. Remote Sensing of Environment 96(3-4): 292-301.
14. Nyyssönen, A., P. Kilkki, and E. Mikkola. 1967. On the Precision of Some
Methods of Forest Inventory. Acta Forestalia Fennica 81. 60 pgs.
15. Nyyssönen, A., P. Roiko-Jokela, and P. Kilkki. 1971. Studies on
improvement of the efficiency of systematic sampling in forest inventory. Acta
Forestalia Fennica 116. 26 pgs.
16. Osborne, J.G. 1942. Sampling Errors of Systematic and Random Surveys
of Cover-type Areas. Journal of the American Statistical Association 37(218): 256-264.
17. Sukhatme, P.V., B.V. Sukhatme, S. Sukhatme, and C. Asok. 1984.
Sampling Theory of Surveys with Applications. Iowa State University Press,
Ames, Iowa. 526 pgs.
18. Wiant, H.V., and E.J. Harner. 1979. Percent Bias and Standard Error in
Logarithmic Regression. Forest Science 25(1): 167-168.