CLUSTER
Menu Location: Image Processing/Hard Classifiers

CLUSTER provides an unsupervised classification of input images using a histogram peak technique.

See Also: ISOCLUST

CLUSTER Operation

CLUSTER first asks that you specify the number of files (bands) to be used in the classification. The maximum number is 7. You may enter the band names individually or insert a raster group file (.rgf). Next, specify an output image name. You are then required to specify the number of grey levels to be used to stretch the bands (see Note 3 for full details). Then specify the percentage to be saturated from each end of the data distribution of each band. Specify the generalization level for the classification, either broad or fine. The broad classification gives a general picture of spectral classes, while the fine classification gives intricate details (see Note 2).

Finally, CLUSTER requires you to specify a clustering rule. Your options are as follows:

1. Drop least significant clusters: Specify the percentage to drop, e.g., 1.0 to drop 1%. The clusters are ranked in terms of how much of the image they describe, and only those clusters that are cumulatively larger than the percentage threshold are retained. After the minor clusters are rejected, cells previously assigned to one of the omitted clusters are reassigned to the most similar of the retained clusters. This option works well as a "blind" classifier.
2. Set maximum number of clusters: Enter this maximum. The clusters are ranked in terms of how much of the image they describe. The first N clusters are retained and all other pixels are reassigned to the most similar of those N clusters.
3. Retain all clusters: Select this option to retain all clusters. The clusters will be ordered according to the total area they occupy, from most to least. Then use HISTO to see a graphic histogram of the classification using a class width of 1.
Look for significant breaks in the slope of the histogram. If you find one, it suggests you have found a natural clustering level. Then run CLUSTER again and set the maximum number of clusters to the value you saw at the break point in the histogram. This is probably the optimum approach for this technique.

CLUSTER Notes

1. The maximum number of bands is 7 and the maximum number of clusters is 255, with cluster values in the range 1-255.

2. CLUSTER uses a histogram peak technique of cluster analysis. This is equivalent to looking for the peaks in a one-dimensional histogram, where a peak is defined as a value with a greater frequency than its neighbors on either side. Once the peaks have been identified, all possible values are assigned to the nearest peak and the divisions between classes fall at the midpoints between peaks. Here a one- to seven-dimensional histogram is used to find the peaks. A peak is thus a class where the frequency is higher than all of its cardinal neighbors. The diagonal neighbors are omitted because of the correlation between bands. In the broad classification, a class must contain a frequency higher than all of its non-diagonal neighbors. In the fine classification, this is relaxed, permitting one non-diagonal neighbor to have a higher frequency. This accommodates true peaks which are otherwise missed because a nearby peak of greater magnitude obscures the usual dip between the peaks.

3. CLUSTER does not use the histogram from the original images to define peaks. Instead, it stretches each input image into a given number of grey levels before calculating their histograms. There are three stages involved in this process, as explained below. First, the histogram of each input image is obtained by going through the original input image's digital numbers. From this histogram, the cumulative proportion at each grey level is found by adding up the frequency of each grey level from the lowest to the highest grey level.
The cutoff points are then determined according to the saturation percentage specified. For example, to saturate an image by 5% in each tail, the 5% and 95% cumulative proportion cutoff points are determined by checking the cumulative proportion from the histogram. To derive the histogram for each input image, the grey levels of the input images are linearly stretched into 256 levels, ranging from 0 to 255. This allows CLUSTER to handle any of the three data types (byte, integer, or real) in that data range. Second, each image is linearly stretched into the given grey levels by applying the cutoff points. In the stretch process, a pixel's new value is evaluated using the following procedure: [equation missing from source] Finally, histograms are developed from each newly stretched image. The histograms derived from the newly stretched images are used to find peaks to perform the cluster analysis.

4. CLUSTER works best if a non-zero saturation percentage is given. This ensures that the bins specified for each band occupy the region where data are most concentrated. Generally, a saturation of about 1% to 2.5% works best for cluster analysis.

5. The result of CLUSTER may be used with MAKESIG to develop signatures for supervised classification techniques such as MAXLIKE.

6. The maximum number of clusters in the output result is 255. If the cluster analysis detects more than 255 clusters, the least significant classes in terms of class size are dropped.

7. The limitation for the combination of grey levels (G) and number of bands (B) is: [equation missing from source] For example, if seven bands are used in the analysis, grey levels cannot be greater than ten.

8. The CLUSTER algorithm is modified from a histogram peak technique described by John A. Richards, Remote Sensing Digital Image Analysis: An Introduction. Berlin: Springer, 1986.
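The one-dimensional case described in Note 2 can be sketched in a few lines of Python. This is only an illustration of the peak rule and the nearest-peak assignment, not CLUSTER's actual code: the function names and the sample histogram are hypothetical, the multi-band case and the fine generalization level are not covered, and ties at a midpoint are arbitrarily given to the lower peak.

```python
def find_peaks(hist):
    """Return the grey levels whose frequency exceeds both neighbours (1-D case)."""
    peaks = []
    for i, freq in enumerate(hist):
        left = hist[i - 1] if i > 0 else -1            # no neighbour beyond the edge
        right = hist[i + 1] if i < len(hist) - 1 else -1
        if freq > 0 and freq > left and freq > right:
            peaks.append(i)
    return peaks


def assign_to_nearest_peak(n_levels, peaks):
    """Map every grey level to the 1-based cluster number of its nearest peak."""
    return [
        1 + min(range(len(peaks)), key=lambda k: abs(level - peaks[k]))
        for level in range(n_levels)
    ]


# Hypothetical histogram of one band already stretched to 8 grey levels.
hist = [2, 9, 4, 1, 3, 7, 5, 0]
peaks = find_peaks(hist)
labels = assign_to_nearest_peak(len(hist), peaks)
print(peaks)   # grey levels that are peaks
print(labels)  # cluster number for each grey level
```

With this sample histogram the two peaks sit at grey levels 1 and 5, so the class boundary falls at their midpoint.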
CLUSTER Macro Command

Running this module in macro mode requires seven or eight parameters:

1: x (to indicate that macro mode is being used)
2: raster group file (which contains up to seven images to be analyzed)
3: output filename (the new file being produced)
4: number of grey levels to linearly stretch each input image (e.g., 6)
5: saturation percentage on each tail of the scale (e.g., 1.0 for 1%)
6: generalization level (1=broad / 2=fine)
7: clustering rule (1=drop least significant / 2=set maximum number / 3=retain all)

If clustering rule is 1 in 7 above, then:
8: percentage to drop (e.g., 1.0 for 1%)

If clustering rule is 2 in 7 above, then:
8: maximum number of clusters

e.g., "cluster x rastergroup*clusters*6*1.0*2*1*1.5"
e.g., "cluster x rastergroup*clusters*6*1.5*2*2*12"
e.g., "cluster x rastergroup*clusters*6*1.0*1*3"

See Also: Creating and Running Macro Files

FISHER
Menu Location: Image Processing/Hard Classifiers

The FISHER module provides image classification based on discriminant analysis. A single output image is created.

The FISHER classifier conducts a linear discriminant analysis of the training site data to form a set of linear functions that express the degree of support for each class. The assigned class for each pixel is then that class which receives the highest support after evaluation of all functions. These functions have a form similar to that of a multivariate linear regression equation, where the independent variables are the image bands, and the dependent variable is the measure of support. In fact, the equations are calculated such that they maximize the variance between classes and minimize the variance within classes. The number of equations will be equal to the number of classes, each describing a hyperplane of support. The intersections of these planes then form the boundaries between classes in band space.
FISHER Operation

Enter the signature group file that defines the class signatures. Then enter a name for the classified output image. (The signature files contain information about the images used to create them. The same image set is automatically used in the classification.)

FISHER Notes

1. Fisher (1936) was the first to suggest that classification should be based on a linear combination of the discriminating variables. He proposed using a linear combination that maximizes differences among classes while minimizing variation within classes. Linear classification functions are derived based on his proposal. There are as many classification functions as there are classes. Each function allows us to compute classification scores for each pixel for each class, by applying the formula:

$$S_i = c_i + \sum_{j=1}^{m} w_{ij} x_j$$

where the subscript $i$ denotes the respective class; the subscripts $j = 1, 2, \ldots, m$ denote the $m$ bands; $c_i$ is a constant for the i'th class; $w_{ij}$ is the weight for the j'th band in the computation of the classification score for the i'th class; $x_j$ is the observed value for the respective case for the j'th band; and $S_i$ is the score for class i. The class with the highest score is assigned to the pixel as its class.

Furthermore, the constant and the weights are calculated using the following formula: [equation missing from source] In that formula, N is the total number of pixels over all classes and P is the number of classes; the remaining terms are an element from the inverse of the sum of the variance/covariance matrices of the classes, and the observed value in the i'th band for the k'th class.

2. The coefficients of the discriminant function are displayed once a FISHER classification is finished.
Here is an example of such results:

Constant Band1 Band2 Band3 Band4 Band5 Band7 [column alignment lost in source]
Water -4.45717 1.13783 0.67007 -0.68765
Conifer -173.38107 3.65205 -1.15369
Deciduous -403.71720 0.21985 0.17839 -0.35184 -0.76625 4.65476 -2.46579 2.79120 1.04704 0.15153 -0.99773 4.45367 1.85645 -0.63264 12.34566 3.35809 6.13796 8.34237
Cement -8211.43092 40.83527 -1.55131
Asphalt -662.69219 -3.49598 3.20418 3.20418 1.52584 -0.43992
Rocks -722.02100 -5.49247 6.08489 6.08489 0.56010 5.12147 6.97500
Grass -472.58807 -1.12114 -0.38630 -0.38630 7.76313 2.56569 4.60033 1.01245

In the above table, the first column holds the class names, the second column holds the constant term of the above equations, and the following columns hold the coefficients (weights) for each band used in the FISHER classification. For a given pixel vector, whose dimension is the number of bands, the score for each class is calculated from the above table, and the class with the highest score is assigned to the pixel. For example, if a pixel vector is given as (6, 5, 4, 8, 4, 1), its scores for the classes are calculated as (5.1, -133.6, -349.7, -7865.0, -577.3, -670.1, -411.6). In this case, the pixel is classified as water, because water receives the highest score. Read by rows, the table has a header row followed by one row of coefficients for each class; read by columns, it has a class name column followed by a constant column and then one coefficient column for each band.

3. References:
Klecka, W. R., 1980, Discriminant analysis. Series: Quantitative applications in the social sciences, a SAGE UNIVERSITY paper.
Fisher, R. A., 1936, The use of multiple measurements in taxonomic problems. Annals of Eugenics 7: 179-188.

4. FISHER is limited to the input of 64 bands and 255 signatures.
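The scoring rule in Notes 1 and 2 (a constant plus a weighted sum of band values for each class, with the highest score winning) can be sketched as follows. The function names and the two-class, two-band coefficients are hypothetical values chosen for illustration; they are not taken from the table above, and the sketch does not derive the coefficients from training data.

```python
def fisher_scores(pixel, constants, weights):
    """Score each class as constant + sum(weight_j * band_value_j)."""
    return [
        c + sum(w * x for w, x in zip(class_weights, pixel))
        for c, class_weights in zip(constants, weights)
    ]


def fisher_classify(pixel, class_names, constants, weights):
    """Return the name of the class with the highest score, plus all scores."""
    scores = fisher_scores(pixel, constants, weights)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best], scores


# Hypothetical coefficients: one constant and one weight per band for each class.
class_names = ["water", "grass"]
constants = [-4.0, -20.0]
weights = [[1.0, 0.5],   # water: Band1, Band2
           [0.5, 3.0]]   # grass: Band1, Band2

label_a, scores_a = fisher_classify((6, 2), class_names, constants, weights)
label_b, scores_b = fisher_classify((2, 8), class_names, constants, weights)
print(label_a, scores_a)
print(label_b, scores_b)
```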
FISHER Macro Command

Running this module in macro mode requires three parameters:

1: x (to indicate that macro mode is being used)
2: signature group file (*.sgf)
3: output image (name of the image showing the classification result)

e.g., "fisher x training*fisher"

See Also: Creating and Running Macro Files

ISOCLUST
Menu Location: Image Processing/Hard Classifiers

ISOCLUST is an iterative self-organizing unsupervised classifier based on a concept similar to the well-known ISODATA routine of Ball and Hall (1965) and cluster routines such as the H-means and K-means procedures.

See Also: CLUSTER, MAKESIG, MAXLIKE

ISOCLUST Operation

ISOCLUST first asks that you specify the number of files (bands) to be used in the classification. The maximum number is 7. You may enter the band names individually or insert a raster group file (.rgf). You then have the option of entering a title for the output classification image. Select the Next button. You will see progress reports at the bottom-right indicating that the CLUSTER and HISTO modules are being run. Then a histogram will appear indicating the frequency of pixels that belong to each seed cluster. These clusters are based on the fine generalization option in CLUSTER, with all clusters being shown. Examine this histogram for significant breaks of slope. These represent logical levels of generalization from which a classification might be seeded. (See the section on Unsupervised Classification in the Classification of Remotely Sensed Imagery chapter of the IDRISI Guide to GIS and Image Processing for details on this technique. Also see the module CLUSTER.) Once you have determined the number of clusters you wish to have in the final image, either close or minimize the histogram. Then specify the number of iterations, the number of clusters, and the minimum sample size per class required in the training process within an iteration.
(See the modules MAKESIG and CLUSTER for a complete explanation of these parameters.) Finally, enter the name of the output image. Because of the efficiency of the seeding step, very few iterations are required to produce a stable result. The default of 3 iterations works well in most instances.

Note that three modules are called by the ISOCLUST module: CLUSTER, MAKESIG, and MAXLIKE. CLUSTER is used to derive the initial clusters, or seed, from the set of input images. This result is then fed into an iteration of MAKESIG and then MAXLIKE. Since MAXLIKE uses the results from MAKESIG, the number of pixels in each training category is critical for MAKESIG to produce a representative variance/covariance matrix, which is then used by MAXLIKE. Therefore, the parameter that specifies the minimum sample size per class is critical to achieving a stable and reliable result. During the MAKESIG process, training categories with fewer pixels than the given sample size are not considered valid training categories, so no signatures are produced for them. After the first iteration, the result from MAXLIKE becomes the new seed for MAKESIG. Then MAXLIKE is run again from this result, and the iterations continue until the specified number is reached.

ISOCLUST Notes

1. Ball, G.H., and Hall, D.J., 1965. A Novel Method of Data Analysis and Pattern Classification. Menlo Park, CA: Stanford Research Institute.

2. The typical logic of the family of cluster algorithms found in Ball and Hall (1965) and in the H-means and K-means procedures is as follows:
a) The user decides on the number of clusters to be uncovered. (One is clearly blind in determining this. As a consequence, a common approach is to ask for a large number and then aggregate clusters after interpretation. A more efficient approach to this problem will be offered below, based on the specific implementation in IDRISI.)
b) A set of N clusters is then arbitrarily located in band space.
In some systems, these locations are randomly assigned. In most, they are systematically placed within the region of high frequency reflectances.
c) Pixels are then assigned to their nearest cluster location.
d) After all pixels have been assigned, a new mean location is computed.
e) Steps c) and d) are iteratively repeated until no significant change in output is produced.

The implementation of this general logic in IDRISI is different in the following respects:
a) The user specifies the bands to be used.
b) A histogram is produced representing clusters that express the frequency with which they occur across the bands. The user examines the graph and looks for significant breaks in the curve. These represent major changes in the generality of the clusters.
c) The user specifies the number of clusters to be created, based on one of these major breaks in the histogram. The particular break the user chooses will depend on the level of generality desired in the output image. (See the section on Unsupervised Classification in the Classification of Remotely Sensed Imagery chapter of the IDRISI Guide to GIS and Image Processing for details on this technique. Also see the module CLUSTER.)

The cluster seeding process is actually done with the CLUSTER module in IDRISI. CLUSTER is truly a clustering algorithm (as opposed to a segmentation operation, as is true of many so-called clustering routines). This leads to a far more efficient and accurate placement of clusters than either random or systematic placement. The iterative process makes use of a full Maximum Likelihood procedure. This provides a very strong cluster assignment procedure.

3. The maximum number of bands is 7 and the maximum number of clusters is 255, with cluster values in the range 1-255.
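The seed, signature, and reassignment loop described above can be illustrated with a heavily simplified Python sketch. It is an assumption-laden stand-in, not the ISOCLUST code: it uses a single band, a univariate Gaussian in place of the variance/covariance matrices built by MAKESIG, equal prior probabilities in place of a full MAXLIKE run, and a hand-written seed in place of CLUSTER output. All function names and data values are hypothetical.

```python
import math


def make_signatures(values, labels, min_samples):
    """Mean and variance per class; classes below min_samples get no signature."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    signatures = {}
    for label, members in groups.items():
        if len(members) < min_samples:
            continue  # too few pixels: not a valid training category
        mean = sum(members) / len(members)
        var = sum((v - mean) ** 2 for v in members) / len(members)
        signatures[label] = (mean, max(var, 1e-6))  # floor avoids division by zero
    return signatures


def log_likelihood(value, mean, var):
    """Log of a univariate Gaussian density."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)


def isoclust_sketch(values, seed_labels, min_samples, iterations):
    """Alternate signature building and maximum likelihood reassignment."""
    labels = list(seed_labels)
    for _ in range(iterations):
        signatures = make_signatures(values, labels, min_samples)
        labels = [
            max(signatures, key=lambda c: log_likelihood(v, *signatures[c]))
            for v in values
        ]
    return labels


# Hypothetical single-band pixel values and seed clusters; cluster 3 is too small.
values = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1, 3.0]
seed = [1, 1, 1, 1, 2, 2, 2, 2, 3]
result = isoclust_sketch(values, seed, min_samples=2, iterations=3)
print(result)
```

In this example the single-pixel seed cluster produces no signature, so its pixel is absorbed by the nearer of the two surviving clusters.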
ISOCLUST Macro Command

Running this module in macro mode requires six parameters:

1: x (to indicate that macro mode is being used)
2: input raster group filename (.rgf)
3: output filename
4: maximum number of classes to classify
5: minimum sample size for training (normally at least 10 times the number of bands)
6: number of iterations

e.g., "isoclust x landsat*landcover*15*60*3"

See Also: Creating and Running Macro Files

KMEANS
Menu Location: Image Processing/Hard Classifiers

KMEANS provides an unsupervised classification of input images using a K-means clustering technique.

See Also: CLUSTER, ISOCLUST

KMEANS Operation

KMEANS first requires that you specify the number of files (bands) to be used in the classification. You may enter the band names individually or insert a raster group file (.rgf). Next, specify an output image name and an optional mask image. In the mask image, areas that are zero will not be included in the analysis. You are then required to specify the maximum number of output clusters to classify (see Note 2). Next, specify a rule for initializing cluster centroids. The options include random seed, diagonal axis, and random partition (see Note 4). Then, specify the stopping criteria to terminate the clustering process. You can terminate the clustering when the percentage of migrating pixels falls below a specified percentage of all image pixels, and when a maximum number of iterations is reached. The clustering will terminate when either of these criteria is satisfied. Finally, KMEANS offers an option to eliminate small clusters by merging them with larger clusters that are closest to their means. Specify the minimum number of pixels per cluster as a proportion of the total pixels in the image.

KMEANS Notes

1. The maximum number of bands is 64 and the maximum number of clusters is 256.

2.
KMEANS uses the so-called K-means clustering technique to partition n-dimensional imagery into K exclusive clusters. KMEANS begins by initializing K centroids (means), then assigns each pixel to the cluster whose centroid is nearest, updates the cluster centroids, then repeats the process until the K centroids are fixed. This is a heuristic, greedy algorithm for minimizing SSE (Sum of Squared Errors); hence, it may not converge to a global optimum. Since its performance strongly depends on the initial estimation of the partition, a relatively large number of clusters is generally recommended to acquire as complete an initial pattern of centroids as possible. See the classification chapter in: Richards, J.A., and X. Jia, 1999. Remote Sensing Digital Image Analysis (New York: Springer), for more detail.

3. KMEANS uses Euclidean distance for calculating the distances between pixels and cluster centroids. The distance between a pixel and a cluster centroid takes the general form:

$$d(x, \mu_k) = \sqrt{\sum_{j=1}^{n} (x_j - \mu_{kj})^2}$$

where $x_j$ is the value of the pixel in band $j$, $\mu_{kj}$ is the value of the centroid of cluster $k$ in band $j$, and $n$ is the number of bands.

4. The performance of KMEANS is highly dependent on the initialization of the centroids. The random partition rule works by randomly assigning each pixel to one of K clusters, and then determines the initial centroids based on those initial assignments. The random seed rule first randomly picks K points in the data set as the initial centroids, and then assigns each pixel to the nearest centroid according to the minimum-distance rule. The diagonal axis rule sets the initial K centroids from the n-dimensional value space of the input bands in a systematic way, i.e., it evenly retrieves K points on a diagonal line that runs from the minimum-value vector point to the maximum-value vector point derived from the band series.
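A minimal Python sketch of the procedure in Notes 2 to 4 is given below, using the diagonal axis rule and the two stopping criteria from the Operation section. The exact spacing KMEANS uses along the diagonal is not stated in this document, so the even spacing at the midpoints of K equal segments is an assumption, as are the function names, default parameter values, and sample pixels. Masking and small-cluster merging are omitted.

```python
def diagonal_axis_centroids(pixels, k):
    """Place k centroids evenly on the line from the min vector to the max vector.

    Assumption: centroids sit at the midpoints of k equal segments of the diagonal.
    """
    n_bands = len(pixels[0])
    lo = [min(p[b] for p in pixels) for b in range(n_bands)]
    hi = [max(p[b] for p in pixels) for b in range(n_bands)]
    return [
        [lo[b] + (hi[b] - lo[b]) * (i + 0.5) / k for b in range(n_bands)]
        for i in range(k)
    ]


def squared_distance(a, b):
    """Squared Euclidean distance; the square root does not change the ranking."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sketch(pixels, k, max_iterations=100, min_migration=0.01):
    """Stop when the migrating fraction drops below min_migration or at max_iterations."""
    centroids = diagonal_axis_centroids(pixels, k)
    labels = [-1] * len(pixels)
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in pixels
        ]
        migrated = sum(a != b for a, b in zip(labels, new_labels)) / len(pixels)
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if migrated < min_migration:
            break
    return labels, centroids


# Hypothetical two-band pixels forming two obvious groups.
pixels = [(1, 1), (2, 1), (1, 2), (9, 9), (8, 9), (9, 8)]
labels, centroids = kmeans_sketch(pixels, k=2)
print(labels)
print(centroids)
```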
KMEANS Macro Command

Running this module in macro mode requires nine or ten parameters:

1: x (to indicate that macro mode is being used)
2: input raster group file (images to be analyzed)
3: mask filename (enter mask filename, or enter 'none' if no mask is used)
4: output filename (the new file being produced)
5: maximum number of output clusters (the maximum number of clusters to classify)
6: cluster centroid initialization rule (1=random seed / 2=diagonal axis / 3=random partition)
7: percentage of migrating pixels (e.g., 1.0 for 1%)
8: maximum number of iterations
9: merge clusters (y=yes / n=no)

If cluster merging choice is Y in 9 above, then:
10: minimum cluster size as a proportion of the pixels in the image (e.g., 1.0 for 1%)

e.g., "kmeans x rastergroup*mask*kclusters*25*1*1.0*100*Y*1.0"
e.g., "kmeans x rastergroup*none*kclusters*16*3*1.0*50*N"

See Also: Creating and Running Macro Files

KNN
Menu Location: Image Processing/Hard Classifiers
Image Processing/Soft Classifiers/Mixture Analysis

KNN is a k-nearest neighbor classifier that can perform both hard and soft classifications. KNN uses the k-nearest neighbors from a subset of all of the training samples in determining a pixel's class or its degree of membership in a class. For a hard classification, a pixel is assigned to the class which dominates the k-nearest neighbors. When soft classification is required, each category's proportion among the k-nearest neighbors is assigned to the pixel as a degree of membership in that category.

KNN Operation

KNN first requires you to enter the signature group file that defines the classes. Next, enter a name for the classified output image. Then enter the k value. This value must be an integer and cannot be greater than the smallest sample size of all training categories. Enter the maximum number of samples per class allowed in the classification. Finally, choose whether to generate a soft output and, if so, enter a prefix for the output files.
The prefix will also be used to name the raster group file containing the soft output files.

KNN Notes

1. The KNN process is a fairly simple calculation. In theory, if using only one input image, for example, the Euclidean distance is calculated from each pixel in this image to all chosen training pixels in all classes. Then, for the specified k, only those k-nearest neighbors are examined in determining an image pixel's class or degree of membership in a class. For the hard classification output, the class that dominates the k-nearest neighbors is assigned to that pixel. For the soft classification output, the proportion for each category among the k-nearest neighbors is evaluated. For each class an image is produced with each pixel assigned its respective proportion.

2. Note that this technique may favor classes with larger sample sizes (i.e., more training pixels) over those with smaller sample sizes. Likewise, the value of k may affect the decision of assigning a pixel to a class and its degree of membership in each class.

3. The value of k is limited to the smallest training class sample size.

4. The training samples for each class may be resampled if the class size is larger than the given limit. For example, if the limit is set to 30 pixels and a particular class has 60 training pixels, then every other training pixel will be chosen. If the actual training sample size of a class is less than the given limit, all of the training samples will be used in the analysis.

5. Processing time is greatly influenced by the k value and the number of sampled training pixels.
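The hard and soft outputs described in Note 1 can be illustrated with a short Python sketch. The function name, the layout of the training list, and the sample values are hypothetical; the resampling of large classes from Note 4 is not shown, and the way KNN itself breaks ties is not stated in this document.

```python
from collections import Counter


def knn_classify(pixel, training, k):
    """Classify one pixel from its k nearest training pixels.

    training is a list of (band_values, class_name) pairs.
    Returns the dominant class (hard) and each class's proportion (soft).
    """
    nearest = sorted(
        training,
        key=lambda sample: sum((a - b) ** 2 for a, b in zip(pixel, sample[0])),
    )[:k]
    counts = Counter(class_name for _, class_name in nearest)
    hard = counts.most_common(1)[0][0]
    soft = {class_name: n / k for class_name, n in counts.items()}
    return hard, soft


# Hypothetical two-band training pixels; k does not exceed the smallest class size.
training = [
    ((1, 1), "water"), ((2, 2), "water"), ((3, 3), "water"),
    ((6, 6), "grass"), ((8, 8), "grass"), ((9, 9), "grass"),
]
hard, soft = knn_classify((4, 5), training, k=3)
print(hard)
print(soft)
```

Here two of the three nearest training pixels are water, so the hard result is water and the soft memberships are two thirds and one third.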
KNN Macro Command

Running this module in macro mode requires six parameters:

1: x (to indicate that macro mode is being used)
2: input signature group file (.sgf)
3: output image (hard classification result)
4: input k value
5: maximum number of training samples per class
6: output prefix for soft output (specify 'none' if no soft output is required)

e.g., "knn x trainingsgf*knnimg*50*30*none"
e.g., "knn x trainingsgf*knnimg*50*30*knnsoft"

See Also: Creating and Running Macro Files

MAXLIKE
Menu Location: Image Processing/Hard Classifiers

MAXLIKE undertakes a Maximum Likelihood classification of remotely sensed data based on information contained in a set of signature files. The Maximum Likelihood classification is based on the probability density function associated with a particular training site signature. Pixels are assigned to the most likely class based on a comparison of the posterior probability that it belongs to each of the signatures being considered. MAXLIKE is also known as a Bayesian classifier since it has the ability to incorporate prior knowledge using Bayes' Theorem. Prior knowledge is expressed as a prior probability that each class exists. It can be specified as a single value applicable to all pixels, or as an image expressing different prior probabilities for each pixel.

See Also: MINDIST, PIPED, MAKESIG, SIGCOMP, BAYCLASS

MAXLIKE Operation

MAXLIKE first requires you to choose the manner in which prior probabilities are to be incorporated. Prior probabilities may be entered in several formats. When no knowledge exists about the prior probabilities with which each class can occur, you should indicate that equal prior probabilities should be used (the default). When you have reasonable knowledge of the expected proportional area of each class over the image as a whole, you should choose the second option (specify a prior probability value for each signature).
Thus if you expect that 42% of the area is under a given cover type, the a priori probability of that class is 0.42. The third option is to enter prior probabilities as a separate real number image (with values between 0 and 1) for each class. This allows you to incorporate spatial predictive models into your determination of prior probabilities. For example, you may decide that an area known in the past to be forest is highly likely to have changed to residential near roads and highly unlikely to have done so far away from roads. This can be expressed quite easily since each pixel is given a separate prior probability value using this approach. As always, the sum of probabilities for each pixel must be 1.0 (see Note 5). The final option allows you to specify either a uniform value or an image of probabilities for each signature. In all cases except equal probabilities, prior probabilities are specified in the second dialog screen.

MAXLIKE classifies an image based on the information contained in a series of signature files. These need to have been previously created with MAKESIG. MAXLIKE requires that you enter signature names by specifying a Signature Group File (".sgf") or entering them individually in the signature grid. Signature Group files are created with IDRISI Explorer. To specify signatures individually, you must first indicate the number of files using the spin button and then enter each signature name. Regardless of the manner in which signatures are entered, and in all cases except for equal probabilities, prior probabilities must be specified alongside each signature in the probability definition input box, either by specifying a value or a prior probability image. When using the fourth option (a uniform value or an image of probabilities), you will need to specify for each signature whether to use a probability value or image. A pick list will appear in the Probability Value/Image column for you to select.
MAXLIKE also allows you to exclude a specific proportion of the least likely pixels from any classification. This causes the pixels with the least likelihood of belonging to any of the classes for which you have signature data to be left unclassified. Ordinarily this proportion is specified with the aid of a Chi-Squared Distribution table. A small table has been built into the module for the cases of 0% (where all pixels are classified), 1%, and 5%. You also have the option to enter the Chi-Squared value directly. Lastly, you must specify a name and title for the output image. The module produces an image with cells assigned a class code equal to the signature position in the signature list and a legend that lists the signatures by name.

MAXLIKE Notes

1. The procedures used for this module were derived from the formulas published in Richards, J.A., 1986. Remote Sensing Digital Image Analysis, Berlin: Springer-Verlag.

2. Be sure that you have adequate data to support this classifier. In general, you should aim to have a sample of pixels equal to at least the number of bands times 10.

3. Maximum Likelihood classification works very well when the training sites are well defined (i.e., are reasonably homogeneous). However, when they are not so well defined it can perform very poorly. In these situations it is better to use the Minimum Distance to Means classifier (MINDIST).

4. The maximum number of signatures is 255, within the range 1-255. The maximum number of bands is 64.

5. If the probabilities entered do not sum to one, the values are normalized so they do sum to one.
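The role of the prior probabilities and the normalization in Note 5 can be illustrated with a simplified Python sketch. It is not the MAXLIKE implementation: to stay short it assumes a single band, so each signature is reduced to a mean and a variance rather than a mean vector and a variance/covariance matrix, and the exclusion of unlikely pixels by a Chi-Squared threshold is left out. The function names and signature values are hypothetical.

```python
import math


def normalize_priors(priors):
    """Rescale prior probabilities so that they sum to one."""
    total = sum(priors)
    return [p / total for p in priors]


def maxlike_one_band(value, signatures, priors):
    """Return the 1-based position of the most probable signature.

    signatures is a list of (mean, variance) pairs for a single band.
    Priors must be greater than zero because their logarithm is taken.
    """
    priors = normalize_priors(priors)

    def discriminant(i):
        mean, var = signatures[i]
        # log prior + log Gaussian density, dropping the constant shared by all classes
        return math.log(priors[i]) - 0.5 * math.log(var) - (value - mean) ** 2 / (2 * var)

    return 1 + max(range(len(signatures)), key=discriminant)


# Hypothetical signatures for two classes in one band.
signatures = [(10.0, 4.0), (20.0, 4.0)]

equal_priors = maxlike_one_band(14.0, signatures, [1, 1])
weighted = maxlike_one_band(15.0, signatures, [1, 3])
normalized = normalize_priors([0.5, 0.2, 0.4, 0.1, 0.6])
print(equal_priors, weighted)
print(normalized)
```

The second call shows the effect of the priors: a value exactly halfway between the two means is assigned to the class with the larger prior probability.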
MAXLIKE Macro Command

Running this module in macro mode requires five or six parameters:

1: x (to indicate that macro mode is being used)
2: output filename (new file to be created)
3: input signature group file (.sgf)
4: proportion of pixels to exclude (between 0.0 and 1.0)
5: prior probability option (1 = equal prior probabilities / 2 = single prior probability for each class / 3 = use prior probability images / 4 = use mixture of single values and images)

If 2 or 4 in parameter 5 above, then:
6: input signature parameter file (with extension; see note below)

If 3 in parameter 5 above, then:
6: input raster group file (.rgf; files must be in the same order as the signature group file)

e.g., "maxlike x landcover*classes*0.05*1"
e.g., "maxlike x landcover*classes*0.05*2*probval.txt"
e.g., "maxlike x landcover*classes*0.05*3*probrgf"
e.g., "maxlike x landcover*classes*0.05*4*probimgandval.txt"

Note: The signature parameter file should be a text file located in the data directory. This may be created with EDIT and may have any extension. However, if it is given an extension of ".id$", it will be deleted during the normal cleanup undertaken by IDRISI when you exit the program. The structure of this file depends on how prior probabilities are being specified.

If single prior probabilities for each class are used: create a list in which the first line indicates the number of values in the list. It must be the same as the number of signatures. Then, on each following line, indicate a prior probability value.

e.g.,
5
0.5
0.2
0.4
0.1
0.6

If a mixture of single values and images is to be used: create a list in which the first line indicates the number of pair lines in the list. It must be the same as the number of signatures. Then, each following pair line consists of a number to indicate the prior probability format (1=single value / 2=filename), followed by either the prior probability value or the image name (whichever is appropriate).
e.g.,
5
1 0.5
1 0.2
2 prior_p1
2 prior_p2
1 0.6

See Also: Creating and Running Macro Files

MINDIST
Menu Location: Image Processing/Hard Classifiers

MINDIST undertakes a Minimum Distance to Means classification of remotely sensed data based on the information contained in a set of signature files. The Minimum Distance to Means classification is based on the mean reflectance on each band for a signature. Pixels are assigned to the class with the mean closest to the value of that pixel. To account for differences in the variability of signatures, MINDIST allows band-space distances to be normalized. MINDIST is slower than the parallelepiped classification procedure, PIPED, and faster than the maximum likelihood classification, MAXLIKE. MINDIST is commonly applied when the number of pixels used to define signatures is very small or when training sites are not well defined.

See Also: MAXLIKE, PIPED, MAKESIG, SIGCOMP

MINDIST Operation

MINDIST first requires that you choose the distance type to be used, raw or normalized (Z-scores), and the maximum search distance. The normalized (Z-scores) distance type is recommended, as it accounts for differences in the variability of signatures. The maximum search distance determines the distance beyond which a pixel will not be classified. For raw Dn values, enter a distance between the minimum and maximum Dn values in the images being processed. For normalized (Z-score) units, enter the desired Z-score value. Statistically, a Z-score of 1.96 should leave about 5% of pixels unclassified; however, the actual number varies considerably depending on the size of the training sites used. Regardless of which distance type you choose, you can specify an infinite distance. Doing so causes all pixels to be classified, regardless of how far they are from the closest class mean. MINDIST classifies an image based on the information contained in a series of signature files.
These need to have been previously created with MAKESIG. MINDIST requires that you enter signature names either by specifying a Signature Group File (".sgf") or by entering them individually in the signature grid. Signature Group Files are created with IDRISI Explorer. To specify signatures individually, first indicate the number of files using the spin button, then enter each signature name. Finally, enter the output filename. The module produces an image with cells assigned a class code equal to the signature position in the signature list and a legend that lists the signatures by name.

MINDIST Macro Command

Running this module in macro mode requires five parameters:

1: x (to indicate that macro mode is being used)
2: input signature group file (.sgf)
3: output filename (the new file to be created)
4: distance type (1=raw / 2=normalized [Z-score])
5: maximum distance (a real number distance [0=unlimited])

e.g., "mindist x trainsig*landcov*2*0"

Note: Signature Group File: All images should be located in the same directory as the .sgf file. This file can be created with IDRISI Explorer.

See Also... Creating and Running Macro Files

PIPED

Menu Location: Image Processing/Hard Classifiers

PIPED undertakes a parallelepiped classification of remotely sensed data based on the information contained in a set of signature files. The parallelepiped classification is based on a set of lower and upper threshold reflectances determined for a signature on each band. To be assigned to a particular class, a pixel must exhibit reflectances within this range for every band considered. The parallelepiped procedure is the fastest of the classification routines. It is also potentially the least accurate.

See Also... MAKESIG SIGCOMP MINDIST MAXLIKE

PIPED Operation

PIPED classifies an image based on the information contained in a series of signature files.
These need to have been previously created with MAKESIG. PIPED requires that you enter signature names either by specifying a Signature Group File (".sgf") or by entering them individually in the signature grid. Signature Group Files are created with IDRISI Explorer. To specify signatures individually, first indicate the number of files using the spin buttons, then enter each signature name.

However the signatures are entered, the opening dialog requires that you specify how the thresholds are to be established. Two options are offered: either the minimum and maximum values from the training site data can be used, or a specified number of standard deviations from the mean (z-scores) may be used. The latter is preferred and is the default. A z-score of 1.96 would be expected to exclude (i.e., leave unclassified) the 5% of pixels that are least like any of the signatures given. Similarly, a value of 2.58 would exclude the 1% of most outlying pixels.

Finally, PIPED requires that you enter the name of an output image. The module produces an image with cells assigned a class code equal to the signature position in the signature list and a legend that lists the signatures by name.

PIPED Notes

1. The parallelepiped procedure does not guarantee that a pixel falls within the thresholds of only one class, and overlaps do occur. In these cases, the class chosen is the last signature encountered that meets the threshold criteria. When asked for signature names, you should therefore specify them in reverse order of importance or likelihood, starting with the least important or likely and ending with the most important or likely.

2. Pixels with values that do not fall within the range of any of the parallelepipeds will be left unclassified (category zero).
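The decision rule described in the Operation and Notes sections above can be illustrated with a short Python sketch. This is not IDRISI code: the function name piped_classify, the signature tuples, and the sample values are hypothetical, only the z-score threshold option is shown, and treating the thresholds as inclusive is an assumption.

```python
from typing import List, Sequence, Tuple

# Hypothetical signature record: (name, per-band means, per-band standard deviations).
Signature = Tuple[str, Sequence[float], Sequence[float]]


def piped_classify(pixel: Sequence[float],
                   signatures: List[Signature],
                   z: float = 1.96) -> int:
    """Return the 1-based list position of the last signature whose box
    contains the pixel, or 0 if the pixel lies outside every box."""
    assigned = 0  # category zero = unclassified
    for position, (_name, means, stds) in enumerate(signatures, start=1):
        # The pixel must lie within mean +/- z * std on every band.
        # Inclusive bounds are an assumption of this sketch.
        inside = all(
            mean - z * std <= value <= mean + z * std
            for value, mean, std in zip(pixel, means, stds)
        )
        if inside:
            assigned = position  # a later matching signature overrides an earlier one
    return assigned


# Made-up two-band signatures, listed from least to most important.
signatures = [
    ("water", [20.0, 10.0], [5.0, 5.0]),
    ("forest", [25.0, 15.0], [5.0, 5.0]),
]

overlap_class = piped_classify([22.0, 12.0], signatures)    # inside both boxes
water_only = piped_classify([12.0, 2.0], signatures)        # inside the first box only
unclassified = piped_classify([100.0, 100.0], signatures)   # outside every box
```

Because a later match overrides an earlier one, the pixel that falls inside both boxes takes the position of the second signature, which is why Note 1 recommends listing the most important signatures last.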
PIPED Macro Command

Running this module in macro mode requires four parameters:

1: x (to indicate that macro mode is being used)
2: input signature group file (.sgf)
3: output filename (the new file to be created)
4: threshold value (-1 = use min/max; a value > 0 is the z-score to be used, e.g., 1.96)

e.g., "piped x classes*landcov*-1"

See Also... Creating and Running Macro Files
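The macro examples for MAXLIKE, MINDIST, and PIPED share one pattern: the module name, the letter x, then the remaining parameters joined by asterisks. A small Python helper can assemble such a line from its parts. The helper name macro_line is hypothetical, and the sketch only formats text: it does not run IDRISI or check that the parameters are valid for a given module.

```python
def macro_line(module: str, *params: object) -> str:
    """Join a module name, the macro-mode flag "x", and the remaining
    parameters (separated by asterisks) into one macro line."""
    fields = []
    for p in params:
        if isinstance(p, float):
            # "g" formatting avoids a trailing ".0" (e.g. -1.0 becomes "-1").
            fields.append(format(p, "g"))
        else:
            fields.append(str(p))
    return f"{module} x " + "*".join(fields)


# Rebuild the example lines shown in the macro sections.
piped_example = macro_line("piped", "classes", "landcov", -1)
mindist_example = macro_line("mindist", "trainsig", "landcov", 2, 0)
maxlike_example = macro_line("maxlike", "landcover", "classes", 0.05, 1)
```

Building lines this way keeps the parameter order visible at the call site, which matters because PIPED and MINDIST take the signature group file before the output filename while MAXLIKE lists the output filename first.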