Sparse versus Dense Datasets

Sparse Versus
Dense Spatial Data
R.L. (Bob) Nielsen
Professor of Agronomy
Purdue University
West Lafayette, IN
Spatial data & GIS
Spatial data are the
fundamental components of
agricultural GIS.
Growers hope to minimize or
manage spatial yield
variability in order to increase
or maximize profitability.
The causes of yield variability must
therefore be determined, which requires the
acquisition of additional spatial data sets or
‘layers’ of information.
Spatial data sets can be ...
Dense
Many data points
per acre
• e.g., grain yield data sets often consist of 300 to 600 data points per acre
Sparse
Fewer data points
per acre
• e.g., typical grid soil sampling results in an average of 0.4 data point per acre
GIS software …
Interpolates or fills in the spatial
'holes' in the data to create pretty
color maps that mysteriously become
the essence of truth for believers.
• Dense data sets have fewer 'holes' per acre than do sparse data sets
• Thus, less interpolation is required
• Thus, the resulting map is intuitively more believable
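To illustrate why fewer 'holes' mean a more believable map, here is a minimal sketch that interpolates a made-up surface from a sparse grid and from a dense grid and compares the errors. Everything in it is an assumption for illustration: the synthetic `true_surface`, the field size, the sample spacings, and the choice of inverse distance weighting (IDW) with the 8 nearest samples. GIS packages offer several interpolators (IDW, kriging, splines), and the slides do not say which one was used.

```python
import math

def true_surface(x, y):
    """Hypothetical smooth field property, used only for this demo."""
    return math.sin(x / 40.0) + math.cos(y / 30.0)

def grid_samples(size, spacing):
    """Sample the surface at the centers of square cells of the given spacing."""
    coords = [spacing / 2 + i * spacing for i in range(int(size // spacing))]
    return [(x, y, true_surface(x, y)) for x in coords for y in coords]

def idw(x, y, samples, power=2.0, k=8):
    """Inverse-distance-weighted estimate at (x, y) from the k nearest samples."""
    nearest = sorted(samples, key=lambda s: (s[0] - x) ** 2 + (s[1] - y) ** 2)[:k]
    num = den = 0.0
    for sx, sy, value in nearest:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        if d2 == 0.0:
            return value  # check point sits exactly on a sample
        w = 1.0 / d2 ** (power / 2)
        num += w * value
        den += w
    return num / den

def mean_abs_error(samples, size, step=10):
    """Average |estimate - truth| over a regular grid of check points."""
    checks = [(x, y) for x in range(0, size + 1, step)
              for y in range(0, size + 1, step)]
    total = sum(abs(idw(x, y, samples) - true_surface(x, y)) for x, y in checks)
    return total / len(checks)

SIZE = 400                                # side of a square demo field, arbitrary units
sparse = grid_samples(SIZE, spacing=200)  # 4 samples
dense = grid_samples(SIZE, spacing=20)    # 400 samples

sparse_mae = mean_abs_error(sparse, SIZE)
dense_mae = mean_abs_error(dense, SIZE)
print(f"sparse grid mean abs error: {sparse_mae:.3f}")
print(f"dense grid mean abs error:  {dense_mae:.3f}")
```

Both runs produce a complete, smooth-looking map. Only the error against the known surface shows how much of the sparse map is the interpolator's guesswork.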
Yield data are dense …
One-second readings at 3 mph
equal 1 data point every 4.4 ft
• 600 data points per acre with a 6-row combine header
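The arithmetic behind these figures can be checked with a short sketch. The 30-inch row spacing (giving a 15 ft header width) is an assumption, since the slide only states a 6-row header. The function name is made up for this example.

```python
# Rough check of yield-monitor data density from the slide's figures.
FT_PER_MILE = 5280
SQ_FT_PER_ACRE = 43560

def yield_points_per_acre(speed_mph, log_interval_s, header_width_ft):
    """Return (distance between readings in ft, readings per acre)."""
    ft_per_reading = speed_mph * FT_PER_MILE / 3600 * log_interval_s
    area_per_reading = ft_per_reading * header_width_ft
    return ft_per_reading, SQ_FT_PER_ACRE / area_per_reading

# Assumption: 6 rows at 30-inch spacing gives a 15 ft header width.
spacing_ft, density = yield_points_per_acre(3, 1, 6 * 30 / 12)
print(f"{spacing_ft:.1f} ft between readings, {density:.0f} points per acre")
```

Under that assumption the result is about 660 points per acre, in line with the rounded figure of 600 on the slide. A wider header or a faster ground speed lowers the density proportionally.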
Yield maps are believable …
Very little interpolation
required to create yield map.
Soil sample data are sparse …
Typical 2.5-acre sampling grid
• Only 0.4 point per acre
Organic matter surface
Interpolated from o.m. values of
2.5 acre soil sample data
Soil surface color from
reclassified aerial IR
Mediocre correlation
Soil o.m. surface map
interpolated from 2.5-acre samples
Half-acre soil sampling
More intense sampling
Five times as many data points as before
• Still sparse relative to aerial imagery
Soil surface color from
reclassified aerial IR
Improved correlation
Soil o.m. surface map
interpolated from half-acre samples
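The sampling densities behind the two organic matter maps can be compared directly. This is a small sketch using only figures from the slides (2.5-acre and half-acre grids, and roughly 600 yield points per acre). The helper name is made up for illustration.

```python
def points_per_acre(acres_per_sample):
    """Sampling density for a grid with one sample per `acres_per_sample` acres."""
    return 1 / acres_per_sample

coarse = points_per_acre(2.5)  # typical 2.5-acre grid
fine = points_per_acre(0.5)    # half-acre grid
yield_density = 600            # yield monitor figure quoted earlier in the slides

print(f"2.5-acre grid:  {coarse:.1f} points per acre")
print(f"half-acre grid: {fine:.1f} points per acre ({fine / coarse:.0f}x the 2.5-acre grid)")
print(f"yield data is still about {yield_density / fine:.0f}x denser than the half-acre grid")
```

Even the more intensive grid is two orders of magnitude sparser than yield monitor data, which is why its interpolated map still deserves caution.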
Consequence of sparse data
Poor interpolation of
spatial variability
2.5 ac soil O.M. map
Aerial image, reclassified
half-ac soil O.M. map
The challenge …
In order to interpret yield maps
wisely, you will need far more data
layers than just soil nutrient levels
and soil types.
• Many factors influence yield!
• Acquiring these data will require forethought, time, timeliness, attention to detail, and (of course)
The good news
Some of the additional data sets you will
acquire will be dense and, therefore,
satisfactory for creating spatial maps
Soil EC
Aerial photography
Satellite imagery
The bad news
Some of the additional data sets you will
acquire will be sparse data sets, the maps
from which must be taken with the
proverbial ‘grain of salt’.
• Soil nutrients
• Plant populations
• Stand uniformity
• Plant height
• Insect pressure
• Disease pressure
• Weed pressure
• Soil compaction
Bottom Line:
Data collected by field scouting,
including soil nutrient sampling, are
often too sparse for GIS programs to
accurately interpolate spatial variability.
• Yet, more intensive data collection is often cost- and time-prohibitive.
Example: Plant Counts in
Late Planted Soybean
Approx. 10 plant
population checks
per acre on a fairly
equal grid basis
292 total data
points on 30 ac.
Cost: Three hikers,
two GPS units, one
Directed sampling
Added another 80
population checks
on the fly as our
eyeballs dictated
372 data points
Cost: Included in
first day’s work
Revisited field, second day
GIS map did not agree
completely with our
eyeballs, so revisited
Added another 54
population checks
• Total of 426 data points on 30 ac.
Cost: Three hikers,
one GPS unit, one
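The running totals across the three sampling rounds can be tallied as follows. The counts and the 30-acre field size come from the slides. The round labels are shorthand chosen for this sketch.

```python
FIELD_ACRES = 30

# (label, population checks added in that round), as reported on the slides
rounds = [
    ("original grid", 292),
    ("directed, on the fly", 80),
    ("second-day revisit", 54),
]

total = 0
densities = []
for label, added in rounds:
    total += added
    densities.append(total / FIELD_ACRES)
    print(f"{label:22s} +{added:3d} -> {total} points, "
          f"{total / FIELD_ACRES:.1f} per acre")
```

The density rises from just under 10 to about 14 checks per acre, still far below the hundreds of points per acre in yield monitor data.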
Soy population map
Based on original grid samples
(10 per acre)
> 200k
150 to 200k
100 to 150k
50 to 100k
< 50k
Did additional
sampling help?
Original data
Minor, but potentially
useful improvements
Original data plus
directed samples
on the fly
Including revisit
Green vegetation index
(NDVI) from IR aerial
image (8 July)
Not perfect, but acceptable
Our map of populations
(17 June)
Sample as densely as time and
money will allow.
• From the perspective of crop scouting or monitoring, you can never have too much data!
• Remember, you rarely have a visual idea of what the true spatial pattern is.
• So, sometimes directed sampling is not feasible.
Sample in as much of an equidistant
pattern as is logistically possible.
• Better for GIS software, easier on the person in the field.
• Begin with a grid pattern, modify with additional directed sampling as suggested by other data layers or your own eyes.
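As a planning aid, the grid-first recommendation can be sketched in a few lines: pick a target density, derive the spacing, and lay out equally spaced sample locations. The rectangular 990 ft x 1320 ft field (30 acres) and the function names are hypothetical. Real field boundaries are irregular, and directed samples would be added to this list afterward.

```python
import math

SQ_FT_PER_ACRE = 43560

def grid_spacing_ft(points_per_acre):
    """Side length (ft) of the square cell that each sample represents."""
    return math.sqrt(SQ_FT_PER_ACRE / points_per_acre)

def grid_points(width_ft, length_ft, points_per_acre):
    """Cell-centered, equally spaced sample locations for a rectangular field."""
    s = grid_spacing_ft(points_per_acre)
    nx = max(1, round(width_ft / s))
    ny = max(1, round(length_ft / s))
    dx, dy = width_ft / nx, length_ft / ny
    return [((i + 0.5) * dx, (j + 0.5) * dy) for i in range(nx) for j in range(ny)]

# Hypothetical 30-acre field, 990 ft x 1320 ft, at about 10 points per acre.
points = grid_points(990, 1320, 10)
print(f"{grid_spacing_ft(10):.0f} ft spacing, {len(points)} grid points")

# Directed samples (locations chosen by eye or from other data layers)
# would simply be appended to the same list.
```

At 10 checks per acre the spacing works out to 66 ft, which gives a sense of the walking involved before any directed sampling is added.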
Thanks for your attention!