DSM overview

Soil Science Division
Digital Soil Mapping Primer
Tom D’Avello|Suzann Kienast-Brown
2.22.2018
USDA is an equal opportunity provider, employer, and lender.
Principles and Concepts
• Overview
• Why digital soil mapping?
• Soil-landscape models
• Environmental covariates
• Training data
• Prediction of soil classes or properties
• Validation and uncertainty
Overview
Digital Soil Mapping
– The generation of geographically referenced soil
databases based on quantitative relationships between
spatially explicit environmental data and measurements
made in the field and laboratory (McBratney et al., 2003)
– The spatial prediction of soil classes or properties from
point data using a statistical algorithm
Overview
Digital Soil Map
– Raster composed of 2-dimensional cells (pixels)
organized into a grid
– Each pixel has a specific geographic location and
contains soil data
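The raster idea above can be sketched in a few lines. This is a minimal illustration, assuming a north-up grid of square pixels; the class name `SoilRaster`, its fields, and the clay values are hypothetical and not taken from the primer.

```python
from dataclasses import dataclass


@dataclass
class SoilRaster:
    """Minimal north-up raster: values[row][col], origin at the upper-left corner."""
    x_min: float      # x coordinate of the left edge
    y_max: float      # y coordinate of the top edge
    cell_size: float  # width and height of a square pixel, in map units
    values: list      # list of rows; each row is a list of soil values

    def cell_center(self, row, col):
        """Return the map coordinates of the center of pixel (row, col)."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return x, y

    def value_at(self, x, y):
        """Return the soil value stored in the pixel that contains point (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("point falls outside the raster extent")
        return self.values[row][col]


# Hypothetical 2 x 3 grid of clay content (%), 30 m pixels
clay = SoilRaster(x_min=500000.0, y_max=4200000.0, cell_size=30.0,
                  values=[[18, 22, 25], [20, 24, 31]])
```

Each pixel is addressed two ways: by (row, col) inside the grid and by map coordinates on the ground, which is what lets the same cell be matched against covariate layers.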
Overview
Conventional soil mapping
– “Where is the boundary between two soils?”
– Focus on the marginal areas
Digital soil mapping
– Central concept is well defined
– Variation expressed across the landscape
Why?
Digital soil mapping
– Illustrate spatial distribution of soil classes or properties
– Can document the uncertainty of the prediction
– Documentation of tacit knowledge
– Reproducible and consistent
– Can be used for
• Initial soil maps
• Update or refine existing soil maps
• Generate specific interpretations
• Assess risk
– Rapid inventory, re-inventory, and project-based
management of lands in a changing environment
• FLEXIBLE
Soil-Landscape Models
Factors of Soil Formation (Jenny 1941)
S = f (Cl, O, R, P, T)
– Soils are a function of 5 environmental factors
• Climate
• Organisms
• Relief
• Parent Material
• Time
– Conceptual model (tacit knowledge)
Soil-Landscape Models
SCORPAN (McBratney et al., 2003)
S = f (S, C, O, R, P, A, N) + ε
– Soil, at a specific point in space and time
• Soil classes, Sc
• Soil attributes, Sa
– Empirical quantitative function of environmental
covariates
• Soil (class, or directly or remotely sensed property)
• Climate
• Organisms
• Relief
• Parent Material
• Age
• N = Spatial Position
– Plus an estimation of error or uncertainty
Soil-Landscape Models
SCORPAN facilitates
– Quantification of the relationships between
environmental covariates and soil classes or properties
in a spatial context
– Estimation of error or uncertainty of the spatial
prediction of soil classes or properties
Environmental Covariates
Choosing covariates
– What factors influence soil formation?
• Link to pedological knowledge
– What data do I have? Want? Need?
– Data should represent the multiple SCORPAN
covariates that influence soil development in your
project area
[Figure panels: multi-path smoothed wetness index; relative position]
Environmental Covariates
S = f (S, C, O, R, P, A, N)
– Soil
• Legacy soils data, field measurements
– Climate
• Temp, precip, ET, solar radiation, etc.
– Organisms
• Land cover, NDVI, spectral data and derivatives
– Relief
• Elevation derivatives
– Parent material
• Geology, spectral data and derivatives
– Age
• Landform models
– Spatial position
• Inherent in georeferenced data layers
Elevation and spectral data/derivatives
Environmental Covariates
Choosing covariates
– Consider resolution (spatial, spectral, temporal,
radiometric) and what is required to meet the
project needs
• Target soil feature
– Compare needs to the range of data
available
– Elevation and spectral derivatives are a
powerful combination for predicting soil
classes or properties in most areas
• Project needs and physical features of the
area may require only one of these data
sources
Sampling for Training Data
Digital soil mapping is dependent upon the
relationship between
– Predictor variables (covariates)
– Target soil feature (soil class or property)
True for all modeling methods
Samples of covariates representing the
distribution of the target soil feature are
required
– Training data
• “Trains” the model to predict similar occurrences
Sampling for Training Data
Prediction is successful when
– Precise, observed locations of typical soil members
are available
Directed (purposive) field samples
– Knowledge-based classification approaches
– Should not be used exclusively
Random or stratified sampling
– More robust
– Less prone to bias
– Sampling design appropriate for modeling objectives
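Stratified random sampling, named above, might be sketched as follows. This is only an illustration under stated assumptions: the strata (three landform positions), the cell ids, and the function name `stratified_sample` are hypothetical, and a real design would be chosen to fit the modeling objectives.

```python
import random
from collections import defaultdict


def stratified_sample(cells, n_per_stratum, seed=0):
    """Draw up to n_per_stratum random cells from each stratum.

    cells: iterable of (cell_id, stratum) pairs.
    Returns a dict mapping stratum -> list of sampled cell ids.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_stratum = defaultdict(list)
    for cell_id, stratum in cells:
        by_stratum[stratum].append(cell_id)
    sample = {}
    for stratum, ids in sorted(by_stratum.items()):
        k = min(n_per_stratum, len(ids))
        sample[stratum] = rng.sample(ids, k)  # sampling without replacement
    return sample


# Hypothetical project area: 100 cells assigned to three landform strata
cells = [(i, ("summit", "backslope", "footslope")[i % 3]) for i in range(100)]
sites = stratified_sample(cells, n_per_stratum=5, seed=42)
```

Drawing within each stratum guarantees that every stratum is represented, which a purely directed set of sites does not.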
Sampling for Training Data
DIGITAL SOIL MAPPING = FIELD WORK
Soil scientist engaged in digital soil mapping
Predict Soil Classes or Properties
Optimal set of SCORPAN covariates chosen
Training data collected
Apply a model to the data and predict soil
classes or properties
Predicting Soil Classes
Classification is the process of predicting
discrete classes
– Sorting pixels into finite number of classes based on
data values and distribution in feature space
– If a pixel satisfies class criteria, then it’s assigned to that
class
Classification
Unsupervised
– Algorithm identifies clusters in the data
• Non-informational clusters
– Analyst must interpret resulting clusters
• Informational classes
– No training data required
– Exploratory
– K-means, ISODATA
Supervised
– Analyst selects training sites
– Algorithm creates decision boundaries from class
statistics to classify data
– Training data required
– Beyond exploratory
– Knowledge-based classification (ArcSIE), fuzzy
classification, predictive modeling (random forests,
neural networks)
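For the supervised case, a very simple way to build decision boundaries from class statistics is a nearest-centroid (minimum distance to mean) classifier. The sketch below is an illustration only, not one of the methods listed above; the covariates (slope and NDVI), class labels, and function names are hypothetical.

```python
import math


def class_centroids(training):
    """Mean covariate vector for each class.

    training: list of (covariate_vector, class_label) pairs.
    """
    sums, counts = {}, {}
    for vec, label in training:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(pixel, centroids):
    """Assign the pixel to the class whose centroid is nearest in feature space."""
    return min(centroids, key=lambda lab: math.dist(pixel, centroids[lab]))


# Hypothetical training sites: (slope %, NDVI) -> soil class
training = [([2.0, 0.70], "A"), ([4.0, 0.60], "A"),
            ([25.0, 0.20], "B"), ([35.0, 0.30], "B")]
centroids = class_centroids(training)
label = classify([5.0, 0.5], centroids)
```

Because the distance is computed on raw values, covariates with large numeric ranges dominate; in practice the covariates would usually be rescaled first.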
Unsupervised Classification
ISODATA
– Shows natural clustering in the data and how potential
classes may be distributed across the landscape
– Covariates include both terrain and spectral data
derivatives
Boundary Waters Canoe Area Wilderness, MN
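ISODATA builds on k-means by also splitting and merging clusters. The simpler k-means loop is sketched here to show how clusters emerge from the data without training sites. The sample points and the choice of initial centers are hypothetical, and this is not the implementation used for the map above.

```python
import math


def kmeans(points, initial_centers, n_iter=50):
    """Plain k-means: alternate assignment and center update until stable."""
    centers = [list(c) for c in initial_centers]
    k = len(centers)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments unchanged, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical pixels described by two rescaled covariates
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
          [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
labels, centers = kmeans(points, initial_centers=[points[0], points[3]])
```

The resulting cluster numbers carry no soil meaning by themselves; as the previous slide notes, the analyst still has to interpret them as informational classes. Results also depend on the initial centers, so several starts are commonly compared.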
Supervised Classification
Predictive modeling (machine learning)
– Classes and probability surface for one class from random
forests (tree-based)
– Covariates include both terrain and spectral derivatives
Boundary Waters Canoe Area Wilderness, MN
Predicting Soil Properties
Regression and interpolation methods
predict continuous soil properties
Regression
– Estimates the relationship between the dependent (soil
observations) and independent variables (covariates)
– Linear regression, logistic regression, random forests
Interpolation
– Models spatial patterns based on values at known
locations and the assumption that locations that are
closer to one another are more similar than those that
are further apart
– Geostatistics (kriging)
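The interpolation assumption above (nearby locations are more similar than distant ones) can be illustrated with inverse distance weighting. IDW is a simpler stand-in chosen for brevity; it is not kriging and it does not model spatial structure with a variogram. The sample coordinates and values are hypothetical.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: list of (xs, ys, value) tuples at known locations.
    """
    num = 0.0
    den = 0.0
    for xs, ys, value in samples:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            return value  # exactly on a sample point: return its value
        w = 1.0 / d ** power  # closer samples receive larger weights
        num += w * value
        den += w
    return num / den


# Hypothetical observations: (x, y, measured property value)
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midpoint = idw(5.0, 0.0, samples)
near_first = idw(2.0, 0.0, samples)
```

A point halfway between two samples gets their average, while a point closer to one sample is pulled toward that sample's value.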
Regression
Predictive modeling (machine learning)
– Properties and associated probability surface from
generalized linear model (regression)
Beef Basin area, UT
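As a minimal regression sketch, ordinary least squares with a single covariate shows how a fitted relationship is turned into predictions for unsampled cells. The covariate (elevation), the soil property (organic carbon), and all values are hypothetical; the example shown above used a generalized linear model with real covariates.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical training data: elevation (m) and organic carbon (g/kg)
elevation = [100.0, 200.0, 300.0, 400.0]
organic_carbon = [10.0, 14.0, 18.0, 22.0]
intercept, slope = fit_line(elevation, organic_carbon)

# Apply the fitted relationship to covariate values at unsampled cells
unsampled_elevation = [150.0, 350.0]
predicted = [intercept + slope * e for e in unsampled_elevation]
```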
Predicting Soil Properties
Raster stack
– Continuous raster soil properties
– Key soil property layers at fixed depths
• 0-5 cm, 5-15 cm, 15-30 cm, 30-60 cm, 60-100 cm, and 100-200 cm
• Target soil properties
– Total profile depth (cm)
– Plant exploitable soil depth (cm)
– Organic carbon (g/kg)
– pH (x10)
– Sand (g/kg)
– Silt (g/kg)
– Clay (g/kg)
– Gravel (m3/m3)
– ECEC (cmolc/kg)
– Bulk density of fine earth (<2 mm) fraction
(excluding gravel) (Mg/m3)
– Bulk density of whole soil (includes gravel)
(Mg/m3)
– Available water holding capacity (mm)
– Prediction uncertainty
[Figure: concept of soil property layers at 0-5 cm, 5-15 cm, 15-30 cm, and 30-60 cm]
Predicting Soil Properties
Continuous soil property raster stack
– Interpretations for management and use
• The data stack becomes the database
– Add slope, climate, etc. layers needed for
calculating interpretations
– Class data – taxonomic or technical
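One way to picture "the data stack becomes the database" is a lookup keyed by property and depth interval, from which per-pixel values are pulled and combined. The sketch below is an assumption-laden illustration: the layout of `stack`, the property name `clay_g_kg`, the values, and the thickness-weighted mean are hypothetical and are not an official interpretation.

```python
# Hypothetical stack: property name -> {(top_cm, bottom_cm) -> 2-D grid of values}
stack = {
    "clay_g_kg": {
        (0, 5):   [[180, 200], [220, 240]],
        (5, 15):  [[200, 220], [240, 260]],
        (15, 30): [[260, 280], [300, 320]],
    }
}


def profile(stack, prop, row, col):
    """Return [(top, bottom, value), ...] for one pixel, ordered by depth."""
    layers = stack[prop]
    return [(top, bot, layers[(top, bot)][row][col])
            for (top, bot) in sorted(layers)]


def weighted_mean(stack, prop, row, col, max_depth):
    """Thickness-weighted mean of a property from the surface to max_depth (cm)."""
    total = 0.0
    thickness = 0.0
    for top, bot, value in profile(stack, prop, row, col):
        part = min(bot, max_depth) - top  # portion of this layer above max_depth
        if part > 0:
            total += value * part
            thickness += part
    if thickness == 0:
        raise ValueError("no layers above max_depth")
    return total / thickness


clay_0_30 = weighted_mean(stack, "clay_g_kg", row=0, col=0, max_depth=30)
```

Other layers such as slope or climate could be added to the same structure as extra keys and combined in the same per-pixel way.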
Interpolation
Geostatistics
– Ordinary kriging applied to predict soil K concentrations
Salt Lake Valley, UT
Validation and Uncertainty
All soil mapping methods rely on models
– Conventional soil mapping
• Qualitative
• Relies on conceptual model (CLORPT)
– Digital soil mapping
• Quantitative
• Relies on quantitative model (SCORPAN)
All models approximate reality and are
subject to error and uncertainty
Validation and Uncertainty
Quantitative nature of digital soil mapping
– Lends itself to quantitative assessments of accuracy and
uncertainty
Communication of accuracy and
uncertainty associated with soil spatial
predictions is imperative
Integral part of delivering modeled products
given their use in resource management
decision making and risk assessment
Accuracy
Predicted values on a map will deviate from
true values to some extent
Accuracy
– Difference between predicted value at a location and
measured value at same location
– High prediction accuracy
• Small difference between predicted and observed
values
Accuracy measures quantify prediction quality
using a validation data set
Accuracy
Soil classes
– User’s, producer’s, overall accuracy
• Confusion matrix
– Kappa, Tau index (weighted Tau), Brier score
Soil properties
– MSE, RMSE, standard error, modified R2
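Several of the measures listed above can be computed directly from a validation set. The sketch below covers overall, producer's, and user's accuracy plus Cohen's kappa for classes, and RMSE for properties. The validation labels and values are hypothetical, and the function names are illustrative.

```python
import math
from collections import Counter


def class_accuracy(observed, predicted):
    """Overall, producer's, and user's accuracy plus Cohen's kappa."""
    classes = sorted(set(observed) | set(predicted))
    cm = Counter(zip(observed, predicted))  # confusion matrix as pair counts
    n = len(observed)
    obs_tot = Counter(observed)    # reference totals per class
    pred_tot = Counter(predicted)  # map totals per class
    overall = sum(cm[(c, c)] for c in classes) / n
    # Producer's accuracy: correct / all validation sites observed in the class
    producers = {c: cm[(c, c)] / obs_tot[c] for c in classes if obs_tot[c]}
    # User's accuracy: correct / all validation sites predicted as the class
    users = {c: cm[(c, c)] / pred_tot[c] for c in classes if pred_tot[c]}
    # Agreement expected by chance (undefined for kappa if it equals 1)
    expected = sum(obs_tot[c] * pred_tot[c] for c in classes) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


# Hypothetical validation set of soil classes
observed = ["A", "A", "A", "B", "B", "C"]
predicted = ["A", "A", "B", "B", "B", "C"]
overall, producers, users, kappa = class_accuracy(observed, predicted)
```

Producer's accuracy answers "how often is a real class A site mapped as A?", while user's accuracy answers "how often is a pixel mapped as A really A?".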
Uncertainty
Sources
– Positional accuracy of pedon
location
– Covariate accuracy (e.g.,
vertical uncertainty of DEM)
– Soil class or property
measurement (e.g.,
taxonomic classification or lab
measurement)
– Model structure (e.g., using
linear model for curvilinear
data)
Uncertainty
Soil classes
– Memberships or probabilities
– Class confusion measures (confusion index, distance
files, entropy)
Soil properties
– Prediction intervals
Uncertainty measures can be generated
during the modeling process or calculated
from results, and represented spatially
Uncertainty
Soil classes – confusion index
As CI approaches 1, confusion between classes increases
Similar measures:
• Distance files in ERDAS IMAGINE
• Entropy
Powder River Basin, WY
Brungard et al., 2015
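Both measures can be computed per pixel from the predicted class probabilities. The sketch assumes the commonly used definition of the confusion index, CI = 1 - (largest probability - second-largest probability), and an entropy scaled to the range 0 to 1; the slides do not give formulas, so both definitions are assumptions here, and the probability vectors are hypothetical.

```python
import math


def confusion_index(probs):
    """1 minus the gap between the two most probable classes (assumed definition)."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, log(number of classes)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))


# Hypothetical class probabilities for two pixels (three classes each)
confident_pixel = [0.90, 0.05, 0.05]
confused_pixel = [0.50, 0.50, 0.00]
ci_confident = confusion_index(confident_pixel)
ci_confused = confusion_index(confused_pixel)
```

A pixel whose two leading classes are equally probable has a confusion index of 1, matching the statement above that confusion increases as CI approaches 1.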
Uncertainty
Soil properties – prediction interval
– Often shown as lower, mean, upper prediction interval
(similar to low, RV, high)
– Prediction interval width shows spatial variability of
uncertainty
[Figure panels, Beef Basin area, UT: lower 90% prediction interval; mean; upper 90% prediction interval; prediction interval width]
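The four quantities shown for each pixel (lower limit, mean, upper limit, and interval width) can be derived from a set of predictions for that pixel. The sketch assumes many predictions per pixel are available, for example from an ensemble or bootstrap of models; the slides do not say how the intervals were produced, and the values below are hypothetical.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def prediction_interval(realizations, level=90):
    """Return (lower, mean, upper, width) for one pixel."""
    tail = (100 - level) / 2.0  # e.g. 5% in each tail for a 90% interval
    lower = percentile(realizations, tail)
    upper = percentile(realizations, 100 - tail)
    mean = sum(realizations) / len(realizations)
    return lower, mean, upper, upper - lower


# Hypothetical predictions of one soil property at a single pixel
pixel_predictions = [21.0, 23.5, 19.8, 25.1, 22.4, 20.9, 24.0, 22.8]
lower, mean, upper, width = prediction_interval(pixel_predictions, level=90)
```

Running this for every pixel gives four rasters; mapping the width shows where the prediction is least certain.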
Digital Soil Mapping
Summary
– Predicting soil classes or properties from field observations
and environmental covariates using a statistical algorithm
– Spatial representation of classes or properties
– Includes estimates of accuracy and uncertainty
– Consistency and reproducibility
– Flexible product to meet variety of user needs
Digital Soil Mapping
Focus
– Fundamental pedology
• Knowledge of the soil resource as a natural body
• Existing and newly acquired
– Field component
– Latest technological resources
• Applied adaptively throughout process
and in combination with soil knowledge
Digital Soil Mapping Training
Foundational Prerequisites Taken in
the Following Order:
1. Spatial Analyst Workshop (NRCS-NEDC-000271)
2. Statistics for Soil Survey Part 1 (NRCS-NEDC-000400)
3. Intro to Digital Soil Mapping (NRCS-NEDC-000272)
Digital Soil Mapping with ArcSIE (NRCS-NEDC-000273)
– Prerequisites
• All 3 foundational prerequisites
Statistics for Soil Survey Part 2 (NRCS-NEDC-000332)
– Prerequisites
• Statistics for Soil Survey Part 1
Remote Sensing for Soil Survey Applications (NRCS-NEDC-000244)
– Prerequisites
• All 3 foundational prerequisites
• Intro to Digital Remote Sensing (available online from
Michigan State University)