JOURNAL OF GEOPHYSICAL
RESEARCH, VOL. 104, NO. D6, PAGES 6199-6213, MARCH 27, 1999
A comparison of paired histogram, maximum likelihood, class
elimination, and neural network approaches for daylight
global cloud classification using AVHRR imagery
T. A. Berendes,¹ K. S. Kuo,¹ A. M. Logar,² E. M. Corwin,² R. M. Welch,¹
B. A. Baum,³ A. Pretre,⁴ and R. C. Weger⁵
Abstract. The accuracy and efficiency of four approaches to identifying clouds and aerosols in
remote sensing imagery are compared. These approaches are as follows: a maximum likelihood
classifier, a paired histogram technique, a hybrid class elimination approach, and a
back-propagation neural network. Regional comparisons were conducted on advanced very high
resolution radiometer (AVHRR) local area coverage (LAC) scenes from the polar regions, desert
areas, and regions of biomass burning, areas which are known to be particularly difficult. For
the polar, desert, and biomass burning regions, the maximum likelihood classifier achieved
94-97% accuracy, the neural network achieved 95-96% accuracy, and the paired histogram approach
achieved 93-94% accuracy. The primary advantage to the class elimination scheme lies in its
speed; its accuracy of 94-96% is comparable to that of the maximum likelihood classifier.
Experiments also clearly demonstrate the effectiveness of decomposing a single global classifier
into separate regional classifiers, since the regional classifiers can be more finely tuned to
recognize local conditions. In addition, the effectiveness of using composite features is
compared to the simpler approach of using the five AVHRR channels and the reflectance of
channel 3 treated as a sixth channel as the elements of the feature vector. The results varied,
demonstrating that the features cannot be chosen independently of the classifier to be used. It is
also shown that superior results can be obtained by training the classifiers using subclass
information and collapsing the subclasses after classification. Finally, ancillary data were
incorporated into the classifiers, consisting of a land/water mask, a terrain map, and a computed
sunglint probability. While the neural network did not benefit from this information, the
accuracy of the maximum likelihood classifier improved by 1%, and the accuracy of the paired
histogram method increased by up to 4%.
1. Introduction
Identifying clouds and aerosols in remote sensing imagery is
an important first step in the retrieval of both surface and
atmospheric properties, as well as in estimating radiative forcing
for climate studies. The traditional approach to this problem has
been to use a series of threshold tests related to spectral contrast
[d'Entremont, 1986; Inoue, 1987; Prabhakara et al., 1988; Key
and Barry, 1989], radiance spatial contrast [Arking, 1964],
radiance temporal contrast [Reynolds and Vonder Haar, 1977;
Minnis and Harrison, 1984], radiance spatial variance contrast
[Coakley and Bretherton, 1982; Coakley and Baldwin, 1984],
and radiance temporal variance contrast [Gutman et al., 1987].
¹Department of Atmospheric Sciences, Global Hydrology and Climate Center, University of
Alabama in Huntsville.
²Department of Mathematics and Computer Science, South Dakota School of Mines and
Technology, Rapid City.
³Atmospheric Sciences Division, NASA Langley Research Center, Hampton, Virginia.
⁴Martin and Associates, Inc., Mitchell, South Dakota.
⁵Institute of Atmospheric Sciences, South Dakota School of Mines and Technology, Rapid City.
Copyright 1999 by the American Geophysical Union.
Paper number 98JD02584.
0148-0227/99/98JD-02584$09.00
This paper compares the accuracy and efficiency of four
alternative approaches to the cloud identification problem: a
maximum likelihood classifier, a new paired histogram
technique, a hybrid class elimination approach, and a
back-propagation neural network.
There are currently two operational global cloud classification
schemes, both of which are based upon threshold tests. The
International Satellite Cloud Climatology Project (ISCCP) cloud
masking algorithm is described by Rossow [1989], Rossow and
Garder [1993], Rossow et al. [1989a, b], and Seze and Rossow
[1991]. The ISCCP algorithm is based on the premise that the
observed visible and infrared radiances are caused by only two
types of conditions, "cloudy" and "clear," and that the ranges of
radiances and the variability that is associated with these two
conditions do not overlap [Rossow and Garder, 1993]. As a
result, the algorithm is based upon thresholds, where a pixel is
classified as "cloudy" only if at least one radiance value is
distinct from the inferred "clear" value by an amount larger than
the uncertainty in that "clear" value.
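The threshold rule in the preceding paragraph can be illustrated with a minimal sketch. It is not the ISCCP implementation: the function name, the per-channel lists, and the use of an absolute difference are assumptions made only for illustration.

```python
def is_cloudy(observed, clear, uncertainty):
    """Flag a pixel as cloudy if at least one radiance differs from its
    inferred clear-sky value by more than the clear-sky uncertainty.

    All three arguments are equal-length sequences, one entry per channel
    (hypothetical layout; the source does not specify a data structure).
    """
    return any(abs(o - c) > u for o, c, u in zip(observed, clear, uncertainty))


# Visible reflectance is well above the clear value, so the pixel is cloudy.
print(is_cloudy([0.30, 285.0], [0.10, 288.0], [0.05, 4.0]))
# Both channels lie within the clear-sky uncertainty, so the pixel is clear.
print(is_cloudy([0.12, 287.0], [0.10, 288.0], [0.05, 4.0]))
```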
The National Oceanic and Atmospheric Administration
Clouds from AVHRR (CLAVR) algorithm (phase 1) examines
multispectral information, channel differences, and spatial
differences and then employs a series of sequential decision tree
tests [Stowe et al., 1991]. Cloud-free, mixed (variable cloudy),
and cloudy regions are identified for 2x2 global area coverage
(GAC) pixel arrays. If all four pixels in the array fail all the
cloud tests, then the array is labeled as cloud-free (0% cloudy);
if all four pixels satisfy just one of the cloud tests, then the array
is labeled as 100% cloudy. If one to three pixels satisfy a cloud
test, then the array is labeled as mixed and assigned an arbitrary
value of 50% cloudy. If all four pixels of a mixed or cloudy
array satisfy a clear-restoral test (required for snow/ice, ocean
specular reflection, and bright desert surfaces), then the pixel
array is reclassified as "restored clear" (0% cloudy).
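The 2x2 array labeling just described can be sketched as follows. This is only an illustration of the decision logic stated in the text; the function name and the boolean inputs (whether each pixel satisfies at least one cloud test, and whether it satisfies a clear-restoral test) are hypothetical, and the individual CLAVR tests themselves are not reproduced.

```python
def label_gac_array(cloud_flags, restoral_flags):
    """Label a 2x2 GAC pixel array from per-pixel test outcomes.

    cloud_flags[k]    -- True if pixel k satisfies at least one cloud test
    restoral_flags[k] -- True if pixel k satisfies a clear-restoral test
    Returns (label, percent_cloudy).
    """
    n_cloudy = sum(bool(f) for f in cloud_flags)
    if n_cloudy == 0:
        return ("cloud-free", 0)
    # Mixed or cloudy arrays are restored to clear only if all four pixels pass.
    if all(restoral_flags):
        return ("restored clear", 0)
    if n_cloudy == 4:
        return ("cloudy", 100)
    return ("mixed", 50)
```

For example, `label_gac_array([True, False, False, True], [False] * 4)` returns the mixed label with the arbitrary 50% value.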
While cloud detection over ocean surfaces is relatively well
established, it is much more difficult over land surfaces.
Difficulties in cloud discrimination increase progressively with
increasing brightness of the background. In addition, cloud
detection techniques are difficult to apply over land because of
the high spatial heterogeneity of land surfaces, as well as
seasonal and regional (including topographic) influences [Eck
and Kalb, 1991; Cihlar and Howarth, 1994]. Threshold tests
would need to be highly specific to the individual background in
order to be accurate. Alternative methods of cloud detection and
classification have been developed to address this weakness.
These methods promise improved accuracy over threshold
approaches, especially over difficult surfaces such as ice/snow,
deserts, smoke, and sunglint but at a higher computational cost.
Therefore the four techniques described here, a maximum
likelihood classifier, a paired histogram technique, a hybrid class
elimination approach, and a back-propagation neural network,
are intercompared in terms of both accuracy and computational
expense. These techniques are described in detail in section 3.
The regional comparisons were conducted on advanced very
high resolution radiometer (AVHRR) local area coverage (LAC)
scenes from the polar regions, desert areas, and regions of
biomass burning, areas which are known to be particularly
difficult. Three additional data sets were created to test the
generalization abilities of the classifiers. The "global" data set
contains samples from all areas of the Earth, the "other" set
contains global data minus vectors which would fall within the
purview of the regional classifiers, and the "twilight" set
contains vectors from the polar regions where the solar zenith
angle is between 80° and 85°.
In addition to the accuracy and computational expense issues
related to the choice of the classifier selected, this investigation
focuses upon three other questions. First, does the use of more
advanced features (other than just the five AVHRR channels and
a sixth channel derived from the reflectance of channel 3)
significantly improve classification accuracy? It has been shown
that spectral, textural, or combinations of spectral and textural
features can be effective for cloud identification, particularly in
the polar regions [Baum et al., 1995; Ebert, 1987, 1989; Welch
et al., 1990, 1992; Key and Barry, 1989; Key, 1990; Rabindra et
al., 1992; Tovinkere et al., 1993]. These categories represent a
large number of potential features; thus various approaches for
feature reduction are utilized in order to select a small subset
that best separates the classes [Richards, 1993]. In the present
investigation, a large number of features, composed of linear and
nonlinear combinations of the six channels of data, are generated
for the training data, and the paired histogram system selects a
small, optimal set of features for each class pair. In this study,
three features, which are optimally separated and not highly
correlated, are selected for each pair of classes. A subset of
these features, reduced by tighter divergence and correlation
requirements, is used by the neural network and maximum
likelihood classifiers. This technique is compared to the simpler
approach of using the six channels of data for each pixel as the
elements of the feature vector. Although textural features have
been effective in other research, they were not considered here
because of the large computational overhead.
The second question considered is as follows: can the
accuracy of global classifiers be improved by subdividing them
into specialized regional classifiers? Experiments were
performed to test the effectiveness of decomposing a single
global classifier into separate regional classifiers, since the
regional classifiers can be more finely tuned to recognize local
conditions.
The final question concerns whether the accuracy of the
classifiers can be improved by using subclasses. The EOS
Clouds and the Earth's Radiant Energy System (CERES) effort
does not require intraclass distinctions. For example, seven
categories of water cloud have been identified in the satellite
scenes considered, but the primary task is to identify all of them
as water cloud. Thus the question was whether it would be
more effective to use the subclass information in designing the
classifiers or to consider all seven types of water cloud as a
single class.
Section 2 provides detailed information about the data,
classes, and features, and section 3 describes the various
classification methodologies. Section 4 presents the results, and
section 5 concludes.

2. Data and Features

The various classifiers are trained and tested using AVHRR
LAC satellite imagery. The spatial resolution of the LAC data is
1.1 km at nadir. The spectral data include AVHRR channels 1
(0.5-0.68 µm), 2 (0.725-1.1 µm), 3 (3.55-3.93 µm), 4 (10.5-11.5
µm), and 5 (11.5-12.5 µm), which include visible, near-infrared,
midinfrared, and infrared window regions. A sixth channel is
created from the reflectance of channel 3. This channel is
derived by removing the channel 3 thermal emission, which is
estimated using channel 4 emission temperature [Allen et al.,
1990]. Owing to the fact that the classifiers are trained and
tested on data spanning the period from 1985 until 1993 with the
NOAA 9, 10 and 11 sensors, calibration accuracy is essential.
The calibration algorithm is a standard technique and is well
documented in the literature [e.g., Brown et al., ; Rao et al.,
1993; Rao and Chen, 1994; Weinreb et al., 1990; Kidwell, 1995].
Prior to their use in the classifier systems, the raw satellite
channel values are calibrated to physical temperature and
reflectance values. Then, the calibrated values for each channel
are scaled between 0 and 1.0. Specifically, the temperature
channel value, which ranges from 180 K to 330 K, and the
reflectance value, which ranges from 0 to 100, are scaled from
0.0 to 1.0.

2.1. Sample Selection
A total of 268,472 samples were selected from 91 AVHRR
LAC scenes to create six data sets. The global data set contains
all of the available training and testing data. The regional data
sets, namely, South America at the time of heavy biomass
burning, the polar regions, desert areas, and "twilight" regions of
high solar zenith angle (Θ0 > 80°), are extracted from the global
data. Another subset of the global data, labeled the "other" data,
is the global data which falls outside of the polar, desert, and
biomass-burning regions. The dates on which the imagery was
acquired can be found in Table 1a. The twilight samples were
taken from the polar and "other" data set, so a separate entry is
not given in Table 1a. The data were divided into training and
testing sets for each region. The number of samples used for
training and testing can be found in Table 1b. It is important to
note that the training and testing samples were taken from
independent data sets, that is, from different regions or from the
same region but on different days or different years. It is a
standard practice to extract both training and testing samples
from the same orbital swaths, but while this improves the
reported accuracy of the classifier, it also provides a distorted
estimate of classification accuracy because the data sets are not
independent. The independence of the training and testing sets
was an important requirement for this research and should be a
consideration in the evaluation of any classification results.

Table 1a. Dates of Imagery Used

Region                                        Date of Imagery
Biomass Burning (South America)               Aug.-Sept. 1985 and Sept. 1986
Polar (Beaufort Sea, Barents Sea,             April-June 1987, April-Sept. 1988 and 1989,
  Greenland Sea, and Baffin Bay)                June and December 1992, and January and October 1993
Desert (North Africa and the Middle East)     June 1986, July 1988, and March, May, and June 1989
Other                                         January and October 1993

Table 1b. Number of Samples Used for Each Class in Each Data Set

Columns: Data Set, Scenes, Water, Snow/Ice, Ice Cloud, Land, Water Cloud, Desert, Sunglint, Haze/Smoke

Polar
Train  33  1644  10408  5979  1233  19199  3284
Test   21  1286  7144  4754  1302  12431  708
Train  45  11366  8292  10294  7464  9733  2808  6132
Test   40  9806  5811  5911  4123  4337  1176  4204
Train  47  13101  28541  19515  2640  1880  8143  2509  1103  6440
       37  1143  776  5459
Test   1497  2601  1648  440
South America, Desert, Other
Train  39  6917  12580  17208  26423  1272  2304
Test   25  4468  6596  9255  15789  1304  672
Train  55  12982  12971  27608  25462  55014  6517  2172  8220
Test   36  8190  5429  21354  22594  38915  2524  1812  4896
Global, Twilight
Train  18  620  1272  1240  432  3676
Test   16  488  404  712  240  2452

Accurate labeling of training and testing samples is an
important first step in the classification process. An experienced
analyst was aided in this process by the Satellite Imagery
Visualization System (SIVIS). SIVIS was developed by V.
Tovinkere to run on Silicon Graphics workstations. It is
composed of a set of display tools for visualization and analysis
of satellite data. Plate 1 shows a typical SIVIS session. Tools
provided to the analyst include histogram equalization, contrast
stretch, gray scale inversion, and a spatial coherence plot. The
analyst can display channel, channel ratio, and channel
difference values along with ecosystem, sunglint, and other
ancillary information. Labeled samples are stored in a database
which can be viewed and manipulated outside of SIVIS, if
desired. The 268,472 samples used in the present investigation
were all generated using the SIVIS system.

Plate 1. Sample Satellite Imagery Visualization System (SIVIS) session. The image in the upper left-hand
corner labeled "Tile Image" is taken over Norway and Sweden and is displayed as a three-band overlay. It is a
section of the larger scene shown in the lower left-hand corner labeled "Snapshot Toolbox."

2.2. Classes

The eight classes which have been identified in the various
data sets are listed in Table 2. However, histogram analysis of
the data shows that some of the classes are multimodal in some
of the features, indicating the presence of subcategories of that
class. For example, snow-covered mountain is a subclass of
snow/ice, displaying a distinct secondary peak in the histogram;
therefore, snow-covered mountains may be considered a subclass
of snow/ice. The subclasses also are listed in Table 2.
Note that some of the classes may not be represented in the
training set for a region. For example, deserts are not found in
polar scenes, and the snow/ice class generally is not found in the
desert scenes.

2.3. Feature Generation

There is considerable uncertainty in the literature as to the
selection of an optimum set of features for use in classification,
and there is an entire subdiscipline devoted to feature reduction
[Richards, 1993]. Some authors simply utilize the various
sensor channels, while others form sets of higher-order features,
such as channel differences, channel ratios, and normalized
channel difference ratios (e.g., the normalized difference
vegetation index). However, many of these features may be
highly correlated or may provide little discriminatory
information. While it is generally accepted that higher-order
features do provide somewhat higher accuracy in classification,
there is little information concerning the costs and benefits.
Since there is no reliable guidance as to an optimum selection of
features, the present investigation has taken the approach of
examining a very extensive number of features for possible
incorporation into the classification process. The 185 features
generated in this study are listed in Table 3.
It should be noted that although 185 features are considered,
the feature selection phase selects a much smaller subset of
features which are effective for separating classes. Generally,
only about 10-40 features are selected by the various
classifiers. Note that features 102-161 are generated using the
hue, saturation, value transform given by Foley et al. [1990].
This transform uses three channel values, or channel differences,
to generate a set of three new numerical features which are
representative of the red, green, and blue color combinations an
analyst uses to determine class membership.

Table 2. List of Classes and Subclasses

Class Index   Class Label        Subclasses
1             Water
2             Snow/ice           snow/ice; broken sea ice; snow-covered mountains
3             Ice cloud
4             Land (nondesert)
5             Water cloud        thin water cloud over land; water cloud over desert;
                                 stratus over land; thin water cloud in polar regions;
                                 stratus over oceans; cumulus over oceans; cumulus over land
              Desert
              Sunglint           channel 1 < 11%; channel 1 > 11%
              Haze               smoke over land; smoke over oceans;
                                 dust/aerosols over land; dust/aerosols over oceans

Table 3. Explanation of Features

Feature    Explanation
1-6        six calibrated channel values of AVHRR channels 1, 2, 3, 4, and 5 and the
           reflectance of channel 3 (6)
7-21       15 channel ratios, A/B, where A and B are the channels listed above,
           that is, ch1/ch2, ch1/ch3, ch1/ch4, ch1/ch5, ch1/ch6, ch2/ch3, ch2/ch4,
           ch2/ch5, ch2/ch6, ch3/ch4, ch3/ch5, ch3/ch6, ch4/ch5, ch4/ch6, ch5/ch6
22-36      15 channel differences, A - B
37-51      15 values of arctan(A/B)
52-66      15 2-D Euclidean distances for channels A and B, that is, \sqrt{A^2 + B^2}
67-86      20 3-D Euclidean distances for channels A, B, and C
87-101     15 normalized differences (A - B)/(A + B)
102-161    20 sets of hue, saturation, and value (HSV) for channel combinations
           of A, B, and C (note each combination of A, B, and C produces
           three features for a total of 60 features)
162-185    hue, saturation, and value for each of the following (24 features):
           [ch1, ch1-ch4, ch6], [ch1-ch2, ch1-ch4, ch1-ch6], [ch2-ch3, ch3-ch4, ch4-ch5],
           [ch2-ch3, ch3-ch4, ch4-ch6], [ch1-ch2, ch2-ch3, ch3-ch4],
           [ch3-ch4, ch4-ch5, ch3-ch5], [ch1-ch2, ch1-ch3, ch1-ch4], [ch1, ch4-ch5, ch6]

AVHRR, advanced very high resolution radiometer; ch, channel; 2-D, two-dimensional; and 3-D,
three-dimensional.

2.4. Feature Selection

Most classifiers can only utilize a relatively small number of
features. Examples are the parallelepiped, minimum distance,
Mahalanobis, maximum likelihood, and back-propagation neural
network approaches [Richards, 1993]. For those methods, it is
not feasible to use more than 10-20 features because of
dimensionality problems. However, one set of features may be
most valuable for distinguishing between two particular classes,
while an entirely different set of features may be optimal for
separating two other classes.
The goal of the paired histogram method is to find a set of
features for each pair of classes which provides optimal
separation between the pair. For each feature (F), the
normalized histograms I_F and J_F are created for classes i and j,
respectively. The two histograms are scaled to the minimum and
maximum of their combined range for that feature and are
discretized into 256 bins. The histograms of each pair of classes
are compared, and two measures, overlap (O) and divergence
(DIV), are computed as follows:

\[ O(F)_{ij} = \sum_{x=1}^{256} I_F(x)\, J_F(x) \qquad (1) \]

\[ DIV(F)_{ij} = |\mu_i - \mu_j| / (\sigma_i + \sigma_j) \qquad (2) \]

where μ_i and μ_j are the means of F for classes i and j,
respectively, σ_i and σ_j are the standard deviations of F for classes
i and j, respectively, and x is the histogram bin. This process is
repeated for each feature. The result is that for each class, 185
histograms corresponding to the 185 features are generated. The
divergence and overlap measures are computed for each pair of
classes.
For each pair of classes, the features are sorted by largest
divergence. Features with equal divergence values then are
sorted by smallest overlap value. However, some features may
contain redundant information. Therefore the following
procedure is adopted. For each pair of classes, the first feature
in the sorted list is chosen. Next the correlation between the
first feature and successive features in the list is computed.
Only the training vectors from the two classes under
consideration are used in the correlation computation.
Features with correlations greater than 0.9 are discarded. The
next feature in the list with a correlation less than or equal
to 0.9 is accepted. The final feature selected from the list must
satisfy the condition that the correlation between it and each of
the other two features is less than or equal to 0.9. The resulting
three features, F1, F2, and F3, are those that provide a high
degree of separation between that pair of classes with the least
redundancy. These features, or a subset described in section 2.5,
are used by all the classifiers.
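The selection procedure of section 2.4 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function names, the array layout (rows are training vectors, columns are features), and the use of the population standard deviation are assumptions.

```python
import numpy as np


def overlap_and_divergence(fi, fj, bins=256):
    """Overlap (equation 1) and divergence (equation 2) of one feature
    for the training values fi (class i) and fj (class j)."""
    lo = min(fi.min(), fj.min())
    hi = max(fi.max(), fj.max())
    # Both histograms share the combined range and are normalized to sum to 1.
    hist_i, _ = np.histogram(fi, bins=bins, range=(lo, hi))
    hist_j, _ = np.histogram(fj, bins=bins, range=(lo, hi))
    hist_i = hist_i / hist_i.sum()
    hist_j = hist_j / hist_j.sum()
    overlap = float(np.sum(hist_i * hist_j))
    divergence = float(abs(fi.mean() - fj.mean()) / (fi.std() + fj.std()))
    return overlap, divergence


def select_three_features(Xi, Xj, max_corr=0.9):
    """Pick up to three features for one class pair: sort by largest
    divergence (ties broken by smallest overlap) and skip any feature whose
    correlation with an already chosen feature exceeds max_corr."""
    n_features = Xi.shape[1]
    scores = [overlap_and_divergence(Xi[:, f], Xj[:, f]) for f in range(n_features)]
    order = sorted(range(n_features), key=lambda f: (-scores[f][1], scores[f][0]))
    # Correlation uses only the training vectors of the two classes in the pair.
    corr = np.corrcoef(np.vstack([Xi, Xj]), rowvar=False)
    chosen = []
    for f in order:
        if all(corr[f, g] <= max_corr for g in chosen):
            chosen.append(f)
        if len(chosen) == 3:
            break
    return chosen
```

Calling `select_three_features(Xi, Xj)` once per class pair yields the per-pair feature triples; features that are constant in both classes would need special handling, which is omitted here.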
2.5. Ancillary Data

In addition to the features described in section 2.4, three other
types of information are available to the classifiers: a water
mask, terrain maps, and a sunglint probability. The water
percentage is taken from the Fleet Numerical Oceanography
Center [1992] (fnocwat.img file) and the National Oceanic
and Atmospheric Administration-Environmental Protection
Agency [1992]. This file is in raster format with each pixel
representing a 10-min by 10-min region on the Earth. It is used
to determine whether a pixel is over land or ocean. If the
percentage of water in the 10-min by 10-min region is 100%,
then the pixel is considered to be over open ocean, and all land
classes are eliminated as possible classes. Care is taken to
account for at least large rivers and lakes and islands. Likewise,
the 10-min spatial resolution terrain map is used to identify
mountainous regions. The snow-covered mountain subclass is
allowed only over appropriate terrain. Finally, the sunglint
probability mask identifies regions of potential sunglint. The
sunglint class is not allowed outside of these regions.
Early in the investigation it became clear that sunglint was
easily confused with other classes and was frequently
misclassified as cloud. Therefore a sunglint probability measure
was devised to assist in that discrimination. At each pixel, the
solar zenith angle, the viewing angle, and the azimuth difference
are already computed. With these angles, the orientation
necessary for the sea surface to produce specular reflection under
current viewing geometry can be calculated. The angles
necessary for this calculation are shown in Figure 1.
We define the following: n_0 is the unit vector pointing to the
Sun, n_s is the unit vector pointing to the satellite, n is the unit
normal vector of the sea surface producing specular reflection.
Using the following relations, one can solve for the components
of n:

\[ \mathbf{n} \cdot (\mathbf{n}_0 \times \mathbf{n}_s) = 0, \qquad
   \mathbf{n} \cdot \mathbf{n}_0 = \mathbf{n} \cdot \mathbf{n}_s = \cos\left(\tfrac{1}{2}\Theta\right) \qquad (3) \]

Figure 1. Angles needed for the sunglint probability computation. Note that Δφ is the relative
azimuth angle, θ_s is the observation angle, θ_0 is the solar zenith angle, and Θ = cos^{-1} n_z
as defined in the paper.

Once n is found,

\[ \mathbf{n} = (n_x, n_y, n_z) \]

where n_z gives the cosine of the angle between n and the local
zenith, the distribution found by Cox and Munk [1954] can be
used, with the assumption of isotropy, to estimate the percent
relative probability of sunglint, that is,

\[ 100\% \times \exp\left[-\frac{1}{2}\left(\frac{\Theta}{8.5}\right)^2\right] \]

where

\[ \Theta = \cos^{-1} n_z \qquad (4) \]

and 8.5 (degrees) is interpolated from Cox and Munk [1954,
Figure 3]. This probability is used by the classifiers to improve
the cloud/sunglint distinction.

3. Methodology

Four classification schemes are described below: maximum
likelihood, paired histogram, class elimination, and a
back-propagation neural network.

3.1. Maximum Likelihood Classifier

The maximum likelihood technique is among the most
popular supervised classification schemes for classifying remote
sensing data. A description of the technique is given by Duda
and Hart [1973] and Richards [1993]. The maximum
likelihood method uses the probability distributions of the
features to perform classification. As is commonly done, the
maximum likelihood method used here makes the assumption
that the distribution function for each feature is normal. The
probability that a vector x is in class j, p(c_j | x), is estimated using
Bayes' theorem, that is,

\[ p(c_j \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid c_j)\, p(c_j)}{p(\mathbf{x})} \qquad (5) \]

where

\[ p(\mathbf{x} \mid c_j) = \frac{1}{(2\pi)^{M/2} |\Sigma_j|^{1/2}}
   \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_j)^T \Sigma_j^{-1} (\mathbf{x} - \boldsymbol{\mu}_j)\right] \qquad (6) \]

and μ_j is the mean vector for class j, Σ_j is the covariance matrix,
and the superscript T denotes the transpose.
The feature selection method described in section 2.4 selects
and ranks the three best features that distinguish between each
class pair. For the maximum likelihood classifier, only the top
feature for each pair is retained. Given N classes, there are
N(N-1)/2 class pairs and therefore N(N-1)/2 possible features.
However, the set of selected features is considerably smaller
than this because some features are common to several class
pairs. This process typically yields 20-70 features, which still
results in a very large covariance matrix. Furthermore, some of
the features in the set may be highly correlated with other
features, resulting in a nearly singular matrix. To avoid
numerical instability and to reduce the number of features to a
more manageable size, the correlation matrix for this set of
features is calculated, and highly redundant features are
deselected. Note that the entire set of training vectors is used in
this process. The final number of features varies from 7 to 11
for the different regional and global classifiers described in
section 4.
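As an illustration of equations (5) and (6), the following sketch fits one Gaussian per class and assigns a vector to the class with the largest posterior. It is a generic Gaussian maximum likelihood classifier, not the authors' implementation: the function names are hypothetical, the priors p(c_j) are assumed to be the class frequencies in the training data (the text does not say how they were set), and at least two features are assumed. Logarithms are used because p(x) is the same for every class and can be dropped.

```python
import numpy as np


def fit_gaussian_classes(X, y):
    """Estimate (mean vector, covariance matrix, prior) for every class label in y."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params


def log_posterior(x, mean, cov, prior):
    """log p(x | c_j) + log p(c_j), with p(x | c_j) the normal density of equation (6)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # (x - mu)^T Sigma^{-1} (x - mu)
    m = len(x)
    return np.log(prior) - 0.5 * (m * np.log(2.0 * np.pi) + logdet + maha)


def classify(x, params):
    """Return the class label with the largest (unnormalized) log posterior."""
    return max(params, key=lambda c: log_posterior(x, *params[c]))
```

A nearly singular covariance matrix, the problem the text addresses by deselecting redundant features, would show up here as a very large negative `logdet` and an unstable `solve`.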
3.2. Paired Histogram Classifier

The features described in section 2.4 form the basis of the
paired histogram technique. The histograms, which are used as
discriminators between pairs of classes, determine the result of a
ballot accumulation algorithm.
Given a class pair i and j, three features are chosen based
upon divergence and overlap as described in section 2.4. The
histogram for feature F in class i is denoted by I_F and that of
feature F in class j is denoted by J_F. To classify a pixel Z in an
image, the feature vector must first be computed for that pixel.
For each feature, F, the pixel feature value Z_F is scaled to the
appropriate range for the histograms I_F and J_F (e.g., Figure 2).
The feature value is mapped to the appropriate bins of the two
histograms I_F and J_F. If I_F > J_F, class i receives one vote, if I_F <
J_F, class j receives one vote, and if I_F = J_F, neither class receives
a vote.

Figure 2. (a)-(c) The histograms that are examples of three paired histograms. These histograms plot the
frequency of occurrence of training vector values for classes land (4) and water cloud (6). Histogram in Figure 2a
shows the plot for (ch1 - ch5)/(ch1 + ch5). Histogram in Figure 2b shows the saturation of channels 2, 4, and 5.
Histogram in Figure 2c shows the value (intensity) of channels 2-3, 3-4, and 4-6. (d)-(f) The histograms that are
the result of performing the canonical transformation on a combination of the three histograms of Figures
2a-2c.
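The ballot accumulation described in section 3.2 can be sketched as follows. The data layout is hypothetical: `pair_models` maps a class pair (i, j) to a list of tuples holding the feature index, the combined feature range, and the two histograms. Clamping out-of-range values to the end bins is also an assumption; the text does not say how such values are handled.

```python
from collections import Counter


def bin_index(value, lo, hi, bins=256):
    """Map a feature value onto a histogram bin over the range [lo, hi]."""
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)  # clamp values outside the range


def accumulate_votes(pixel, pair_models):
    """Count one vote per (class pair, feature) for the class whose
    histogram is larger at the pixel's bin; equal heights give no vote."""
    votes = Counter()
    for (i, j), features in pair_models.items():
        for f, lo, hi, hist_i, hist_j in features:
            b = bin_index(pixel[f], lo, hi, len(hist_i))
            if hist_i[b] > hist_j[b]:
                votes[i] += 1
            elif hist_i[b] < hist_j[b]:
                votes[j] += 1
    return votes
```

The class with the largest count in the returned `Counter` would then be taken as the winner.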
For example, if feature 1 is (ch1 - ch5)/(ch1 + ch5) and its
value for pixel Z is 0.3, then in Figure 2 we see that 0.3 falls
within the range where the histogram for class 4, land, is greater
than for class 6, water cloud, and class 4 gets one vote. Note that in
many cases, neither class will receive a vote. This procedure is
applied to all pairs of classes for all features, and the class
receiving the largest number of votes is declared the winner. In
the case of ties, the classes not involved in the tie are eliminated,
and the procedure is repeated.

3.3. Class Elimination Approaches

The paired histogram approach described above used the
combination of divergence, overlap, and correlation to reduce the
number of features considered for classification. An analogous
procedure may be used to decrease the number of classes
considered when making a classification. A hybrid approach,
one that combines elements of the paired histogram technique
with the maximum likelihood classifier, was designed for this
purpose.
The set of potential classes is first reduced by performing a
canonical transform on the histograms generated for each pair of
classes. The canonical transform, described by Richards [1993],
is similar in concept to principal component analysis. The
principal component transformation often is used to map the
image data into a new uncorrelated vector space. It produces a
space in which the data have the largest variance along the first
axis, have the next largest variance along a second, mutually
orthogonal axis, and so on. However, this approach is based
upon global covariance of the full data set and is not sensitive to
class structure in the data. The canonical transformation offers
an alternative approach in which the classes have the largest
possible separation between their means when projected onto the
new axis.
In the present investigation, a canonical transformation is
made for each class pair. The histograms in the left column of
Figure 2 are examples of three paired histograms used in the
approach described in section 2. These histograms plot the
frequency of occurrence of training vector values for classes land
(4) and water cloud (6). For example, the histogram in Figure
2a shows the plot for the feature (ch1 - ch5)/(ch1 + ch5). The
histograms in the right column are the result of performing the
canonical transformation on a combination of the three
histograms in the left column. The technique for combining
these histograms is given by Richards [1993]. The graph in
Figure 2d shows the eigenvalue for the first canonical axis. Note
that the eigenvalue is 58.4 for the graph in Figure 2d and is
4.7e-06 for graphs in Figures 2e and 2f. This demonstrates that
the maximum class separability information can be extracted
from the first canonical axis. This was the case for all of the
pairwise histograms.
The class elimination procedure may be described as follows.
For a given pixel to be classified, compute the canonical value
corresponding to a class pair, for example, 1 and 2. If that
canonical value does not lie within the region defined by the

further tests between classes 1 and 3, classes 1 and 4, etc., are
performed. If class 2 remains, then tests are made between
classes 2 and 3, classes 2 and 4, etc., until class 2 is eliminated.
In the process, classes 2, 3, and 4 may be eliminated. Then the
next test would be between classes 5 and 6, etc., until the list is
exhausted.
The result of this process is that typically one to four classes
remain. If only one class remains, then this is declared the
winner. If two to four classes remain, then these are sent to the
maximum likelihood classifier for final processing. However,
the maximum likelihood classifier used in the final stage is very
fast, since the multidimensional space is significantly reduced.
On rare occasions, it may occur that all classes are eliminated.
Should this occur, then the elimination procedure has failed and
the full maximum likelihood classifier is utilized. Obviously,
this procedure cannot possibly succeed if the correct class is
improperly eliminated.

3.4. Neural Network Classifier

The multilayer perceptron network, trained using the
back-propagation algorithm, that was implemented for this
investigation is described in many sources such as Rumelhart et
al. [1986] and, more recently, in an article by Paola and
Schowengerdt [1995]. A single perceptron, or node, receives one
or many external inputs on weighted input lines, computes the
weighted sum of the inputs, and generates an output which is a
function of that sum. The computed function generally is
nonlinear and continuous, which produces a mapping from the
input space to the classification space. Multilayer perceptron
networks contain perceptrons arranged into an input layer, an
output layer, and one or two hidden layers. Since arbitrary
decision surfaces can be constructed with two hidden layers,
more than two will not add functionality. In this investigation, a
single hidden layer proved sufficient.
The knowledge of the network is stored in the connection
weights, which are set to small random numbers when the
network is initialized. During training, the error for a particular
iteration is computed as the aggregate squared difference
between the desired network outputs and the outputs actually
produced by the weights for that iteration (t):

\[ Error(t) = \frac{1}{2} \sum_i \left( Actual_i(t) - Desired_i(t) \right)^2 \qquad (7) \]

where desired_i is the true class for vector i and actual_i is the
class selected by the network. The weights are changed using
the back-propagation learning algorithm, which is based on the
gradient descent error minimization technique. The change in a
weight is determined by

\[ w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \qquad (8) \]

where
= •+aAw(j(t-1)
(9)
histogramof class1, then class1 is hereaftereliminatedfrom
anyfurtherconsideration.If the canonicalvaluedoeslie within
theregiondefinedby class2, thenclass2 is potentiallya correct The constantrl, called the learningrate, was set to 0.1 for all
class, and further tests are run. If the canonical value lies experiments. Also note that the secondterm in (9) is a
outsideof the histogramsdescribingboth class 1 and class2, momentumterm, addedto speedconvergence.The momentum
then both classes are eliminated from further consideration.
constantot addsa percentage
of the previousweightupdateto
This processis continuedwith othernoneliminated
classpairs. the current update. An c• value of 0.5 was used in all
For example,if class 1 is eliminatedat somestage,then no experiments.
AWu(t)
r••U
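The weight update in equations (8) and (9) can be illustrated with a minimal sketch. It assumes that the first term of (9) is the usual negative error gradient scaled by the learning rate; the function name momentum_step and the one-weight toy error surface are hypothetical and are not taken from the paper. Only the constants 0.1 (learning rate) and 0.5 (momentum) come from the text.

```python
def momentum_step(weights, grads, prev_deltas, eta=0.1, alpha=0.5):
    """Apply equations (8) and (9) once to a flat list of weights.

    eta is the learning rate and alpha the momentum constant; the
    defaults are the values quoted in the text. The gradient form of
    the first term is an assumption of this sketch.
    """
    new_weights, new_deltas = [], []
    for w, g, d_prev in zip(weights, grads, prev_deltas):
        delta = -eta * g + alpha * d_prev  # equation (9)
        new_weights.append(w + delta)      # equation (8)
        new_deltas.append(delta)
    return new_weights, new_deltas


# Toy error surface (illustrative only): E(w) = 0.5 * (w - 3)^2,
# whose gradient is dE/dw = w - 3.
weights, deltas = [0.0], [0.0]
for _ in range(200):
    grads = [w - 3.0 for w in weights]
    weights, deltas = momentum_step(weights, grads, deltas)

print(weights[0])  # approaches the minimum at w = 3
```

On this convex toy surface the iteration settles at the single minimum; on a real network error surface the same update may stop at a local minimum.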
These equations give weight updates for one presentation of a training vector. The process of adjusting the weights is repeated until the error in the network stabilizes. This procedure does not guarantee a global minimum and most often finds a local minimum. The network was trained on the same samples as the other classifiers and used the same features as the maximum likelihood classifier.

4.0 Results

This investigation was conducted in order to construct an operational global algorithm for detecting clouds over all types of surfaces in support of the EOS CERES Science Team. There are two primary considerations: accuracy and computational efficiency. However, there are a number of other important issues, including the choice of the classifier, whether or not specially designed features enhance classification accuracy, whether or not it is beneficial to subdivide the global classifier into subregions, whether or not the use of ancillary data are beneficial, and whether or not dividing classes into subclasses is beneficial. In each case, results are presented in terms both of classification accuracy and computational cost. As mentioned in section 2.1, the labeling of the training and testing samples is critical to the performance of the classifier. It is likely that some small amount of error exists in the samples as a result of analyst error. Thus the reported results are relative to the accuracy of the human analyst.

4.1. Choice of Classifier

There are a large number of classification approaches which have been reported in the literature and, clearly, not all of them are included in this study. The four approaches that were selected for comparison are the traditional maximum likelihood method (ML), a new paired histogram approach which utilizes features which best separate class pairs (PH), a hybrid paired histogram/maximum likelihood approach based upon class elimination (CE), and a back-propagation neural network (NN). Table 4 shows the results for each of the following regions: (1) the polar regions (poleward of 60° latitude), (2) South America (SA) with periods of intense biomass burning, (3) desert regions (limited to the Middle East, stretching from Morocco to Afghanistan), (4) "other" (described as global but excluding the polar, biomass burning and desert regions), (5) global (combination of polar, SA, desert and other regions), (6) twilight (samples taken from the polar regions). Note that the classifiers are for daylight conditions only, defined for solar zenith angles up to θ0 = 80°. The twilight classifier is a special case which specifically targets solar zenith angles in the range 80° ≤ θ0 ≤ 85°. This is important because virtually all operational cloud classification schemes are limited to values of θ0 less than 75°-80°, because of the low reflected radiances caused by the low Sun angles.

Table 4 shows the summary of overall classification accuracies and computation times to compute 100,000 pixels using a Silicon Graphics Octane workstation. Two sets of experiments are listed for comparison, one using the generated features and the other simply using the channel data as discussed in section 4.2. The results demonstrate that the derived features are more accurate but require additional computation. Note that there is only a single entry for the class elimination scheme since it always begins with the full feature set. This classifier performs the elimination based on the features which best separate pairs of classes. Unlike the neural network, if the nonlinear combinations of features are not present in the feature set, the class elimination technique cannot create them. Thus the technique is not effective if only the six channel values are used as features. The results of both sets of tests are discussed below. Note also that the results in Table 4 use subclass information for training (discussed in section 4.4) and ancillary data during classification (discussed in section 4.3). The overall accuracies for the polar region are 94.3% (ML), 94% (CE), 94.5% (NN), and 92.8% (PH).

Table 5 gives the confusion matrices for the polar region for the ML, PH, CE, and NN full feature classifiers. The row number indicates the actual class, and the column number indicates the class that the classifier chose (class numbers are explained in Table 2). For example, in the ML classifier portion of Table 5, the value 2.1 in column 1 of row 2 indicates that 2.1% of the class 2 pixels were incorrectly classified as class 1. All four classifiers have high accuracies for the water, land, water cloud, and sunglint classes; the main problem is the separation of the snow/ice class from the ice cloud (cirrus) class. The ice cloud class accuracies are 85.2% (ML), 84.3% (CE),
Table 4. Overall Accuracy and Computation Times for the ML, CE, NN, and PH Methods for Each of the Regional and Global Classifiers Using Both the Complete Set of Features (Full) and the Smaller Six-Channel Feature Set

                       ML             CE           NN              PH
Classifier        Full   Six-ch      Full     Full   Six-ch    Full   Six-ch

                                  Accuracy, %
Polar             94.3    94.0       94.0     94.5    95.4     92.8    86.5
South America     96.8    94.0       95.8     94.1    96.5     94.4    90.6
Desert            95.0    93.7       94.2     95.8    95.4     93.7    84.7
Other             94.6    93.4       93.0     96.4    95.0     91.0    89.1
Global            89.8    90.4       88.6     92.4    91.2     89.1    81.5
Twilight          93.0    90.1       93.0     93.1    92.1     92.0    88.3

                             Computational Time, s
Polar             11.8    10.6       11.0      5.2     1.9     17.0    16.9
South America     10.2     8.3        8.2      4.4     2.7     10.2    10.1
Desert            16.5     9.1        9.3      7.1     2.0     13.2    13.3
Other             12.4     8.9        9.5      5.9     3.4     13.5    13.1
Global            12.4     6.8        7.7      8.5     3.5     14.5    14.3
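The timing comparisons quoted in the text can be checked directly against the computation times in Table 4. The short sketch below does this for the class elimination (CE) method relative to the full-feature maximum likelihood (ML) classifier. It assumes the flattened table columns read, in order, ML full, ML six-channel, CE, NN full, NN six-channel, PH full, PH six-channel; the helper name reduction_pct is hypothetical.

```python
# Full-feature computation times (seconds per 100,000 pixels) as read
# from Table 4; the column assignment is an assumption of this sketch.
times = {
    "Polar":         {"ML": 11.8, "CE": 11.0, "NN": 5.2, "PH": 17.0},
    "South America": {"ML": 10.2, "CE": 8.2,  "NN": 4.4, "PH": 10.2},
    "Desert":        {"ML": 16.5, "CE": 9.3,  "NN": 7.1, "PH": 13.2},
    "Other":         {"ML": 12.4, "CE": 9.5,  "NN": 5.9, "PH": 13.5},
    "Global":        {"ML": 12.4, "CE": 7.7,  "NN": 8.5, "PH": 14.5},
}


def reduction_pct(baseline, candidate):
    """Percentage by which candidate is faster than baseline."""
    return 100.0 * (baseline - candidate) / baseline


# Time saved by class elimination relative to maximum likelihood.
ce_vs_ml = {
    region: reduction_pct(t["ML"], t["CE"]) for region, t in times.items()
}

for region, pct in ce_vs_ml.items():
    print(f"{region:14s} CE is {pct:4.1f}% faster than ML")
```

Under that column reading, the smallest and largest reductions round to 7% (polar) and 44% (desert), and the global reduction rounds to 38%, which agrees with the figures given for the CE method in section 4.1.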
1
2
3
I
98.9
0.1
2.1
0.0
0.4
0.0
0.0
92.1
12.2
0.0
1.5
0.0
95.0
3.0
2
3
4
5
17
1.6
0.0
0.0
0.0
0.0
84.9
9.1
0.3
1.1
0.0
1
2
97.8
1.7
0.1
92.3
3
4
5
17
0.0
0.1
0.0
0.0
12.8
0.8
1.4
0.0
I
2
3
4
5
17
99.0
3.9
0.0
0.3
0.0
0.0
0.0
89.2
8.9
0.4
1.2
0.0
4
5
17
0.0
0.0
0.9
0.1
2.3
85.2
0.0
0.1
0.0
0.0
0.0
97.8
0.3
0.0
3.5
2.6
1.8
97.9
0.0
0.1
0.0
0.0
0.3
100.0
0.0
0.0
1.7
0.3
9.8
88.5
0.0
0.4
0.0
0.0
0.0
97.2
0.5
0.0
3.7
2.4
2.1
97.9
0.0
0.0
0.0
0.3
0.2
100.0
0.0
2.0
0.0
0.0
2.0
3.9
0.1
0.0
84.3
0.0
0.0
0.0
0.0
95.7
0.3
0.0
2.9
3.3
97.8
0.0
0.0
0.2
0.5
100.0
0.9
0.0
0.0
96.9
0.2
5.1
0.0
2.2
0.3
2.2
98.2
0.4
0.2
1.0
0.0
0.2
0.3
94.5
MLClass•er
2
3
4
5
17
PH •ass•er
I
classifiers
confused some of the smoke/haze
NNc•ss•er
Table
Class
I
2
I
99.9
0.0
0.0
2
0.2
98.5
1.1
3
0.0
0.2
classifiers, but for the fundamental task of
4
0.1
identifying cloud pixels in the polar region, the NN approach is marginally superior. A sample scene is included in Plate 2 which depicts the classification made by the ML, NN, and PH
5
0.0
17
the polar regions, are underdetected with all classifiers. The converse confusion, snow/ice interpreted as cloud, also occurs 6-13% of the time. The overall accuracies are comparable for the
and NN
thin cloud.
6. Confusion Matrices in South America
5
17
27
0.0
0.0
0.1
0.0
0.1
0.1
0.0
0.0
99.5
0.0
0.3
0.0
0.0
0.0
0.0
96.9
0.2
0.0
2.7
1.3
0.4
0.7
96.4
1.0
0.1
0.2
0.0
0.0
0.0
2.6
97.2
0.0
27
0.2
0.0
0.0
9.8
5.3
1.1
83.6
90.8% (NN), and 88.5% (PH), with the major misclassification as snow/ice. This means that ice clouds, which are prevalent in
ML
with
Clearly, a more robust haze/smoke algorithm is needed for biomass-burning regions. For this region, the ML approach is superior. It exhibits better performance for the haze/smoke pixels and does not have difficulty identifying sunglint. Sunglint pixels were difficult for the network in every region studied.
Overall accuracies for the desert region were 95% (ML), 94.2% (CE), 95.8% (NN), and 93.7% (PH). Corresponding confusion matrices for each of the classifiers are given in Table 7. The bright desert regions are confused with cloud by both the ML and PH classifiers and with smoke/haze by the PH classifier. The NN has very high accuracy (99.4%) for these pixels. Water clouds, which are much more infrequent in this data set than in other regions, were a source of error for all four classifiers. The accuracies were 84% (ML), 83.4% (CE), 74.8% (PH), and 67% (NN). In this region, sunglint is misclassified by
CEclass•er
0.0
3.6
90.8
0.0
0.1
0.0
distinguish between water cloud and ice cloud to ensure accuracy. Those clouds are, in fact, water clouds.
For South America, overall accuracies of 96.8% (ML), 95.8% (CE), 94.1% (NN), and 94.4% (PH) were achieved (Table 4). Corresponding confusion matrices for each of the classifiers are given in Table 6. The primary difficulty is haze/smoke, which is prevalent in the data set chosen, with 83.6% (ML), 82.9% (CE), 77% (PH), and 75% (NN) accuracies. The majority of the errors occur when smoke/haze is misclassified as land because of the spectral signature of the underlying surface showing through the haze. Similarly, all the
Table 5. Confusion Matrices in the Polar Region
Class
3
4
MLClass•er
PHC•ss•er
classifiers. The scene includes land, ice/snow, thin water cloud, multilayer water cloud, and ice cloud. Note that both the multilayer water cloud, which can be seen in the lower right of
I
97.5
0.0
0.0
0.0
0.0
2.5
0.0
2
3
0.0
0.0
98.6
0.6
0.5
99.1
0.0
0.0
0.8
0.3
0.0
0.0
0.0
Plate 2, and the thin water cloud, which covers the central
4
0.1
0.0
0.0
93.1
0.6
0.7
5.5
5
17
27
0.0
0.2
0.0
1.3
0.0
0.0
0.6
0.0
0.0
1.1
0.0
14.3
93.5
4.2
6.4
3.3
95.7
2.3
0.3
0.0
77.0
I
97.9
0.0
0.0
0.0
0.0
2.1
0.0
2
0.0
96.9
0.5
0.6
1.3
0.1
0.7
3
0.0
0.2
99.3
0.0
0.1
0.0
0.4
4
5
0.1
0.0
0.0
1.2
0.0
0.4
97.0
0.7
0.2
95.6
0.0
2.0
2.7
0.1
portion of the image, are both classified as water cloud. The results are similar for all three classifiers; however, the problem noted above can be seen in the upper left corner of the classified images. The specks of white contained within the pink ice cloud are ice cloud pixels misclassified as ice/snow. As mentioned previously in this section, the neural network has the fewest such misclassifications.
Problems with ice cloud versus ice/snow motivated another set of experiments which demonstrated that the ice cloud versus ice/snow ambiguity can be removed by augmenting the data set. Using SIVIS, ice cloud pixels which
0.1
CEclass•ier
17
0.2
0.0
0.0
0.0
2.6
97.3
0.0
27
0.0
0.0
0.0
10.3
5.2
1.7
82.9
scenes,relabeled with the correct classification, and added to the
1
98.5
0.0
0.0
0.0
0.0
1.4
0.0
training set. Selecting and adding samples to fine tune regional classifiers in this way is an effective tool for improving classification accuracy. Note, also, that the lower left corner has pink clouds which appear to be ice cloud. One of the difficulties with histogram equalization is the potential for exaggerating the intensity of the pixels. The analyst must use temperature data to
2
0.0
93.1
3.8
0.0
3.1
0.0
0.0
0.0
had been misclassified as ice/snow were extracted from the
NNc•ss•er
3
0.0
0.4
99.6
0.0
0.0
0.0
4
0.0
0.0
0.0
95.7
1.2
0.0
3.1
5
0.0
0.2
0.8
0.5
96.2
2.2
0.1
17
4.4
0.0
0.0
0.0
0.0
0.0
0.0
7.1
12.5
10.2
88.5
2.2
75.0
27
0.0
Table 7. Confusion Matrices in the Desert Region
Class
1
2
3
1
97.4
0.0
0.1
2
0.0
97.3
1.6
3
0.0
0.0
4
0.4
5
14
17
27
3.5
0.0
6.3
0.0
0.1
0.7
0.0
0.0
0.0
4
27
lower for all classifiers on this data set. A comparison with the regional classification results given above demonstrates that significant improvement is possible by subdividing the globe into regions and tuning the regional classifiers to local conditions. As expected, the same problems with sunglint and
1.6
0.0
smoke/haze exist for this data set, but accuracies decrease in
0.0
0.0
almost all categories. Classification of the pixels in the twilight data, which is not included in the global results given, is 10% less accurate when those pixels are incorporated into the global
5
14
17
0.0
0.9
0.0
0.5
0.6
0.0
99.8
0.0
0.2
0.0
0.0
0.0
0.2
96.0
0.0
0.0
0.0
0.0
0.6
0.0
0.0
0.7
2.8
84.0
6.6
12.4
0.5
0.2
4.0
93.4
0.0
5.0
0.2
4.7
0.0
80.5
6.1
0.1
2.5
0.0
0.7
87.7
MLClass•er
classifier.
I
97.9
0.0
0.1
0.0
0.4
0.0
1.6
0.0
For the twilight data, the overall accuracies are 93% (ML), 93% (CE), 93.1% (NN), and 92.1% (PH). Note that these data were taken from the polar regions and have a limited number of potential classes, specifically, water, ice/snow, ice cloud, land,
2
0.0
97.0
0.3
1.4
1.3
0.0
0.0
0.0
and water
3
0.0
0.4
99.4
0.0
0.3
0.0
0.0
0.0
4
0.3
0.0
0.2
96.1
0.5
0.7
3.4
0.0
0.5
0.0
0.1
0.0
0.9
0.4
1.6
74.8
5.2
0.6
5
14
6.2
84.5
7.7
0.1
6.5
9.8
accurate for identifying cloud pixels, but both have approximately 25% misclassification of ice/snow. In contrast, the ML classifier correctly identifies 94.8% of the ice/snow but has difficulty distinguishing land from water (27% error). In spite of the misclassifications, the results are encouraging, demonstrating that accurate cloud retrieval in scenes with high solar zenith angles is possible.
Finally, the computational cost of these methods can be found in Table 4. It is clear that the neural network approach is the most efficient and should be used when its accuracy is comparable to one of the other approaches. For the polar, desert, biomass-burning and "other" data, the neural network processing time was 55% less than that required for the ML classifier and was 57% less than for the PH method. For the global data, the
PH •ass•er
17
8.3
0.0
0.0
0.0
15.5
0.0
74.0
2.2
27
1.8
0.0
0.0
2.1
2.7
8.6
0.0
84.8
I
96.8
0.0
0.1
0.0
0.5
0.0
2.4
0.2
2
0.0
98.5
0.0
0.6
0.9
0.0
0.0
0.0
3
4
0.0
0.4
0.0
0.1
99.5
0.2
0.0
94.8
0.5
3.8
0.0
0.2
0.0
0.3
0.2
CEclassg•er
0.0
5
3.3
0.7
0.0
0.7
83.4
4.1
5.1
2.7
14
0.0
0.0
0.0
0.0
6.3
93.6
0.0
0.0
17
6.3
0.0
0.0
0.0
13.0
0.0
80.0
0.7
27
0.0
0.0
0.0
0.7
0.5
5.0
6.1
87.7
I
2
97.2
0.0
0.0
96.1
0.2
1.2
0.3
1.2
0.4
0.0
0.0
1.1
2.3
0.0
0.0
0.0
3
0.0
0.9
99.1
0.0
0.0
0.0
0.0
0.0
4
0.4
0.0
0.2
98.8
0.4
0.0
0.0
0.1
5
14
3.4
0.0
0.3
0.0
0.0
0.0
1.9
0.5
67.0
0.0
7.9
99.4
5.8
0.0
13.7
0.0
17
27
4.1
0.0
0.0
0.0
0.0
0.0
0.0
6.1
16.0
0.9
0.0
9.1
74.9
0.0
5.0
82.9
NNc•ss•er
all classifiers, with only 74-80% accuracies. Even the ML classifier, which performed well on sunglint in South America, has a 20% error rate. A visual inspection of scenes from this area shows a greater range of sunglint values and a very gradual gradient between sunglint and water that overlaps with the spectral signature of water cloud. Because of the problems with sunglint, note that most classifiers described in the literature avoid sunglint regions. As with the previous data sets, haze/smoke continues to be difficult to distinguish from the underlying surface. On the basis of accuracy alone, neither the NN nor ML classifier is the clear choice for this region.
The overall accuracies for the "other" data are 94.6% (ML), 93% (CE), 96.4% (NN), and 91% (PH). Note that the polar, desert, and biomass-burning areas were specifically omitted from this data set. Sunglint continues to be a source of error with only 72-79% accuracy and is primarily confused with water cloud and smoke/haze. The performance on cloud classification is comparable for the NN and ML classifiers, making it difficult to determine which would be the better choice for this region. The NN is more accurate for ice cloud and land, while the ML has better performance for sunglint and smoke/haze. There were insufficient samples in this data set to train for snow/ice and bright surface (desertlike) conditions.
The results for the global data set are 89.8% (ML), 88.6% (CE), 92.4% (NN), and 89.1% (PH). Note that the accuracy is
cloud. The NN and PH classifiers are the most
decrease is much smaller, 31% for ML and 41% for PH, since the global nets are the largest and therefore the most computationally intensive. When compared to the ML classifier, the CE method reduces the computation time by 7-44%. However, the accuracy of the CE method is slightly lower than that of the ML classifier to which it converges. The decrease in accuracy of 0.8% in the desert region and 1.2% in the global region provides a 44% and 38% reduction, respectively, in processing time. A small increase in accuracy for this method would make it a superior choice to that of the maximum likelihood method, particularly for an operational classifier. Note that the comparison is between the full feature data sets for the ML and CE classifiers. The six-channel ML has comparable accuracy and performance to that of the full feature CE. However, it may not always be possible to use such a small number of features. The advantage of the CE algorithm will be most apparent for classification tasks which use a large number of features.
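The pairwise elimination step of the CE method described in section 3.3 can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: each class pair is assumed to have a single canonical axis stored as a weight vector, the "region defined by the histogram" of a class is simplified to a closed interval along that axis, and the names eliminate_classes, axes, and ranges are hypothetical. The toy numbers are invented for the example.

```python
from itertools import combinations


def eliminate_classes(pixel, classes, axes, ranges):
    """Return the set of classes that survive pairwise elimination.

    axes[(a, b)]   : assumed first canonical axis for the pair (a, b)
    ranges[(a, b)] : {a: (lo, hi), b: (lo, hi)}, the extent of each
                     class histogram along that axis (an interval is a
                     simplification of the histogram region)
    """
    alive = set(classes)
    for a, b in combinations(sorted(classes), 2):
        if a not in alive or b not in alive:
            continue  # no further tests on an eliminated class
        # Canonical value: projection of the pixel onto the pair's axis.
        value = sum(w * x for w, x in zip(axes[(a, b)], pixel))
        for c in (a, b):
            lo, hi = ranges[(a, b)][c]
            if not lo <= value <= hi:
                alive.discard(c)  # value falls outside class c's histogram
    return alive


# Toy data: three classes and a two-feature pixel (illustrative only).
axes = {(1, 2): (1.0, 0.0), (1, 3): (0.0, 1.0), (2, 3): (1.0, 1.0)}
ranges = {
    (1, 2): {1: (0.0, 0.5), 2: (0.4, 1.0)},
    (1, 3): {1: (0.5, 1.0), 3: (0.0, 0.3)},
    (2, 3): {2: (0.0, 2.0), 3: (0.0, 2.0)},
}
survivors = eliminate_classes((0.2, 0.8), [1, 2, 3], axes, ranges)
print(survivors)
```

In this sketch the surviving classes would then be passed to the maximum likelihood classifier, so the cost of the expensive step scales with the number of survivors rather than with the full class list. If the true class is eliminated by an overly narrow interval, no later step can recover it, which is the failure mode noted in section 3.3.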
4.2. Choice of Features
The classification results presented in section 4.1 are based upon the use of a large number of features, especially for the PH method. However, there is a considerable penalty for computing the various features and, in the case of the elimination methods, the canonical histograms. In the results presented in section 4.1, the ML and NN methods used combinations of 7, 9, 11, 10, 11, and 7 features for the polar, SA, desert, "other", global and twilight classifiers, respectively. Conversely, the PH approach utilizes 82, 70, 94, 90, 118, and 42 features for these same regions. The question addressed in this section is the degree to which the derived features contribute to higher accuracy and at what cost. Table 4 shows the overall classification accuracies and computation times for 100,000 vectors for the ML, PH, and
NN classifiers, using both the full set of features described in detail above and using only the original five AVHRR channels plus the computed reflectance values of channel 3.

With the exception of the polar region, the ML method computation time is reduced for all of the data sets. Reductions of 45% are shown for the global and "other" classifiers and with little overall difference in classification accuracy. A detailed intercomparison of the performance matrices reveals that, although the overall accuracies are similar, the accuracies for specific classes change. However, there were only three significant differences. The accuracy of the six-channel network decreased by 13% for the smoke/haze class in the desert and decreased by 9% for the sunglint class in the "other" data set. Interestingly, the accuracy of the six-channel network increased by 12% for the smoke/haze class in South America. For all other cases, the six-channel approach produces equivalent accuracy with a large reduction in computational expense.

Table 4 shows that the NN approach using the six-channel data achieved comparable results for all the data sets, with some slight increases and some slight decreases in accuracy for specific classes, and required only 39-72% of the computation time required for the compound feature data. These lower computational times are caused by a decrease in the number of input nodes, and a corresponding reduction in the number of weight updates required, and by removing the feature value computations. For the twilight data, accuracy dropped by 1.0%, for the "other" data it fell 1.4%, and for the global data the decrease was 1.2%. Although these decreases are small, the differences were investigated using a visualization program. It is clear that the network "builds" features similar to those discussed in section 2.3. For example, the network found a separation between two classes that was only visible by displaying all three bands as the feature channel 1 - channel 2. The decrease in accuracy suggests that features essential to making class distinctions in the twilight and "other" data sets were not being created by the network but were included in the derived feature set. More research is necessary to determine if a change in network topology would impact the complexity of the combinations of features the network was able to approximate or if the missing features can be identified and included in the feature vectors.

Table 4 also shows that while the computational times are approximately equal for the full feature and six-channel classifiers using the PH method, there is a penalty in classification accuracy of 2-7%. Methods that are highly nonlinear, such as ML and especially NN, suffer little from using the six-channel features. They are capable of warping the decision surfaces sufficiently to accommodate the clustered data in each class. However, the PH method does not have this luxury. It simulates nonlinearity through the use of complex features, such as ratios and difference ratios. Thus the more nonlinear the classifier, the less it is useful to utilize complex features.

4.3. Utilizing Ancillary Data Sets

All of the classifiers were modified to utilize ancillary data sets. In particular, a land-water mask was used to eliminate the water class over land surfaces, spatial resolution terrain maps were used to identify mountainous regions, and the sunglint probability mask identified regions of potential sunglint. The sunglint class is not allowed outside of these regions. The question arises as to the cost-benefit of utilizing the ancillary information in the various classifiers.

The ML accuracy for the various polar, SA, desert, "other," global, and twilight classifiers decreased by 0.5-1.5% when the ancillary data were not used. There is very little advantage in using the ancillary data as additional inputs to the neural network. The highly nonlinear structure of the NN approach can adjust the weights without having to utilize this additional information. This is true for both the full feature and six-channel classifiers.

The situation is much different for the PH method. For the full set of features, use of the ancillary information leads to an increase in overall accuracy of 0.5-3% for the polar, SA, desert, "other," global, and twilight classifiers. The differences are even larger for the six-channel approach, with increases in accuracy of 0.5% to 4% and a 9% increase for the desert region.

The use of ancillary data may also decrease the computational times. Typical savings are 10-15% using the ML approach, because the classifier can eliminate certain classes from consideration. The savings are much larger for the PH approach, with computational times decreasing by as much as 50%. The efficiency of the neural network was unchanged.

4.4. Use of Subclasses

All of the results presented above have used subclasses. That means that the classifiers actually considered a much larger set of classes than are reported. These subclasses were then "collapsed" after the classification process was completed. For example, the water cloud class consists of seven subclasses. All of the water cloud subclass results are collapsed after classification, and the final results are seen in the performance matrices. Obviously, this procedure complicates the classification approach by effectively requiring a much larger and more complex decision surface to be generated, and it is more computationally expensive. The impact of training without subclass distinctions was tested on the ML, PH, and NN classifiers.

As expected, fewer classes resulted in reduced computation time for all three classifiers. The reductions were 40-50% (ML), 10-35% (NN), and 50-65% (PH). Less expected was the resulting reduction in classification accuracy exhibited by every classifier. The reductions in classification accuracy of 1-3% demonstrate that many classes are not well clustered in multidimensional space. The subclasses form more compact clusters which may be easier to separate. The significantly higher computational expense is justified for this study since a loss of accuracy of 2% can be significant for the purposes of the EOS CERES team. Note that similar experiments using the six-channel data produced reductions in accuracy of up to 8%.

5. Conclusions

This paper compares the accuracy and efficiency of four classification schemes for identifying pixels in AVHRR imagery. The four classifiers represent three distinct approaches to this problem. The maximum likelihood classifier is a traditional statistical technique, the paired histogram method is based upon algorithms for identifying surfaces which separate pairs of classes, and the neural network relies upon the network's ability to learn nonlinear separating surfaces which isolate each class from all others. The fourth classifier, the class elimination technique, is a variation of the maximum likelihood technique which uses a canonical transform to reduce the number of potential classes and, concomitantly, to increase classification
speed. The class elimination technique is identical with the maximum likelihood method when no classes can be eliminated. Although the fundamental goal of this research is to distinguish cloud from noncloud pixels in AVHRR imagery, all of the techniques were sufficiently robust to allow for the identification of many more classes and subclasses. Eight classes, which can be split into 20 subclasses, were selected for this study.

The regional comparisons were conducted on AVHRR LAC scenes from the polar regions, desert areas, and regions of biomass burning; areas which are known to be particularly difficult. Three additional data sets were created to test the generalization abilities of the classifiers. The "global" data set contains samples from all areas of the Earth, the "other" set contains global data minus vectors which would fall within the purview of the regional classifiers, and the "twilight" set contains vectors from the polar regions where the solar zenith angle is between 80° and 85°.

Accuracy is reported for sets of testing data which are independent of the training data, that is, the testing data are from different swaths than the training data and had not been previously used by the classifiers in any way. For the polar, desert, and biomass-burning regions, the maximum likelihood classifier achieved 94-97% accuracy, the neural network achieved 94-96% accuracy, and the paired histogram approach achieved 93-94% accuracy. The primary advantage to the class elimination scheme lies in its speed. Its accuracy of 94-96% is an average of 1% lower than that of the maximum likelihood method, but speedups of 7-44% for these regions are worthy of note. Although the overall accuracies are very similar, there are differences for specific classes, particularly for sunglint and smoke/haze. The maximum likelihood classifier has the highest accuracy for these difficult classes. For other classes, the accuracy of the neural network is similar and in some cases is superior to that of the maximum likelihood classifier. The neural network also has the smallest classification times. Although the paired histogram method is rarely the most accurate, its performance is similar to that of the other two classifiers, and it is clearly the most flexible method. For all of these techniques, additional efforts need to be focused upon regions of strong sunglint, cumulus clouds, silty water, slush, and thin smoke/haze.

Experiments also clearly demonstrate the effectiveness of decomposing a single global classifier into separate regional classifiers, since the regional classifiers can be more finely tuned to recognize local conditions. Using the full feature set, all of the classifiers produced accuracies between 89 and 92% when trained and tested on global data, a significant decrease from the results reported above. Interestingly, the results were slightly better, 94% for the maximum likelihood, 96% for the neural network, and 91% for the paired histogram technique, when using the "other" data. This demonstrates that the complexity of the classification task decreases when the three difficult regions listed above are removed.

Experiments using the "twilight" data show the varying impact of a large solar zenith angle. The majority of the errors in this data set occurred in distinguishing snow/ice from ice cloud and separating land from water. However, the overall classification accuracies of 92-93% using the full feature set are encouraging, demonstrating that accurate cloud retrieval in scenes with high solar zenith angles is possible.

Feature selection is a critical step in classifier design. A delicate balance is required between providing enough information to perform the classification and providing too much which can, at best, seriously impact classification speed, and at worst, introduce noise into the classifier. A large number of features are generated for the training data, and those with the greatest potential for identifying particular classes are retained. For the paired histogram and class elimination techniques, features are selected which exhibit the lowest correlation and the highest divergence. A subset of these features, reduced by tighter divergence and correlation requirements, is used by the neural network and maximum likelihood classifiers. This technique is compared to the simpler approach of using the six channels of data for each pixel as the elements of the feature vector. The paired histogram method suffered declines in accuracy from 1-9% using the six-channel data. The maximum likelihood classifier was less sensitive to the change in features selected, with small decreases in accuracy for some of the experiments. The neural network was the least affected, with some small decreases and some small increases. Thus the feature selection method cannot be chosen independently of the classifier to be used. In all cases, classification speed was increased by using the six-channel data. The most significant improvements occurred for the neural network, which experienced as much as a 72% reduction in classification time.

No one classifier solved all problems. The maximum likelihood and neural network approaches have comparable accuracies. The maximum likelihood is slightly more accurate for the biomass-burning area, but the neural network has the superior performance for the global data. However, the neural network is the least computationally expensive approach. Considering both efficiency and accuracy, the neural network using regional classifiers and the six-channel data as input is presently the best choice for this task. Although the paired histogram classifier appears to be the least attractive because it was neither the most accurate nor the most efficient, that approach may ultimately prove the most useful. The paired histogram classifier is clearly the most flexible, can most easily incorporate ancillary data as they become available, and is well suited for handling multimodal class distributions. Similarly, the current accuracy of the class elimination technique is insufficient for this project. However, the classification speed would make this a viable option if the accuracy could be increased by 1-2%. Further study of both techniques is warranted, particularly as additional ancillary databases become available.

Acknowledgments. This work was supported by National Aeronautics and Space Administration Contract NAS 1-19077 which is part of the Earth Observing System (EOS) Clouds and the Earth's Radiant Energy System (CERES) program. Support was also provided by NAGW 3740, which is managed by Robert J. Curran

References

Allen, R. C., P. A. Durkee, and C. H. Wash, Snow/cloud discrimination with multispectral satellite measurements, J. Appl. Meteorol., 29, 994-1004, 1990.
Arking, A., Latitudinal distribution of cloud cover from TIROS II photographs, Science, 143, 569-572, 1964.
Baum, B. A., T. Uttal, M. Poellet, T. P. Ackerman, J. M. Alvarez, J. Intrieri, D. O'C. Starr, J. Titlow, V. Tovinkere, and E. Clothiaux, Satellite remote sensing of multiple cloud layers, J. Atmos. Sci., 52, 4210-4230, 1995.
Brown, J. W., O. B. Brown, and R. H. Evans, Calibration of advanced very high resolution radiometer infrared channels: A new approach to nonlinear correction, J. Geophys. Res., 98, 18257-18268, 1993.
Cihlar, J., and J. Howarth, Detection and removal of cloud contamination from AVHRR images, IEEE Trans. Geosci. Remote Sens., 32, 583-589, 1994.
B. A. Baum, Atmospheric Sciences Division, NASA Langley Research Center, Hampton, VA 23681.
T. A. Berendes, K. S. Kuo, and R. M. Welch, Department of Atmospheric Sciences, Global Hydrology and Climate Center, University of Alabama in Huntsville, Huntsville, AL 35806.
E. M. Corwin, A. M. Logar, Department of Mathematics and Computer Science, South Dakota School of Mines and Technology, Rapid City, SD 57701.
A. Pretre, Martin and Associates, Inc., 1515 N. Sandborn Blvd., Mitchell, SD 57301.
R. C. Weger, Institute of Atmospheric Sciences, South Dakota School of Mines and Technology, Rapid City, SD 57701.

(Received May 6, 1998; revised July 29, 1998; accepted July 30, 1998)