Lab 9: Image Classification
Geog 418/518, Fundamentals of Remote Sensing

Name 1:
Name 2:

Lab Overview

In this lab, you will create a classified image. To accomplish this, you will:

- Define spectral signatures for key categories of features.
- Evaluate the spectral signatures.
- Use the signatures to create a maximum likelihood supervised classification.
- Conduct a qualitative evaluation of the accuracy of your classification.

Background: The Supervised Classification Process

The classification techniques you will explore today are commonly used by remote sensing professionals. All the techniques you will apply are described in your textbook or in the online ERDAS Field Guide (1999), as well as being covered in class. In a typical supervised classification, the steps that are followed include:

- Surveying a field map that provides the "ground truth" for the classification. This map is usually surveyed by a ground-based field team that visits a number of sites where the features of interest (e.g., certain tree types or land use classes) are located. The field team maps the location and extent of the features so that they can be precisely located using image coordinates, even when the features cannot be clearly seen on the image. For example, the field team might map the location and extent of wheat fields. Although you might not be able to visually differentiate wheat from other crops on a Landsat image, you could still map the wheat fields to the image using the coordinates of the ground truth data. In other words, you can overlay the ground truth map onto the image to show where the wheat fields are. (Note that this requires you to have accurately georeferenced images and maps.)

- Selecting "training sites" from known locations on the image. For example, you might select a certain number of pixels (at least 30, and preferably 50 or more, for maximum likelihood classification) from the wheat field portions of the image. The spectral signature of these wheat field pixels will then be used to search for other similar pixels on the image.

- Evaluating the training pixels to determine whether any appear to be non-representative (e.g., different from the other wheat field training pixels).

- Classifying the image using the training pixels.

- Evaluating the classification.

As a first approximation, the evaluation of the classification can be done using a visual, qualitative comparison of the imagery and the classification map. This is the approach you will apply today. Please be aware, however, that while qualitative assessment is useful for understanding the classification process and making broad, generalized statements about the classification (e.g., this feature was confused with that feature; the classification is awful, mediocre, good, etc.), it is not sufficient for most management or research applications. In today's lab, part of the reason you can get away with using a qualitative evaluation is that you will have a high spatial resolution image with 1-m pixels on which you can clearly and unambiguously (most of the time) identify features. Because features can be identified directly on the high resolution imagery, you can bypass the field survey and the collection of a ground truth data set (at least for the purposes for which we are using the lab). In most cases, however, you will not have 1-m imagery. When you use coarser spatial resolution imagery such as TM5, determining accuracy based solely on visual interpretation is usually not a viable solution.

Furthermore, if you want quantitative measures of accuracy (e.g., percent correctly classified), you cannot use simple visual comparisons of classification maps and images. When using coarser pixel resolution, or when deriving quantitative measures of accuracy, you must have ground-based maps to train the image and test the classification results. These quantitative accuracy assessments are carried out by overlaying the ground truth data on the classification map and counting how often the classification is correct or incorrect for each category, as sketched below. In today's lab you will conduct a visual, qualitative estimate of accuracy, but will not do a quantitative evaluation with ground truth and classification maps.
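ERDAS Imagine performs this overlay-and-count internally, but the logic is simple enough to sketch. Below is a minimal, illustrative numpy sketch (not ERDAS code; the array names and the nodata convention are assumptions made for the example) of computing percent correctly classified, overall and per category:

```python
import numpy as np

def percent_correct(ground_truth, classified, nodata=0):
    """Overall and per-class percent correctly classified.

    ground_truth, classified : same-shaped integer arrays of class codes;
    pixels equal to `nodata` in ground_truth (unsurveyed) are skipped.
    """
    valid = ground_truth != nodata
    truth = ground_truth[valid]
    pred = classified[valid]
    overall = 100.0 * np.mean(truth == pred)
    per_class = {}
    for c in np.unique(truth):
        in_class = truth == c           # ground-truth pixels of class c
        per_class[int(c)] = 100.0 * np.mean(pred[in_class] == c)
    return overall, per_class
```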
Part 1: Acquiring spectral signatures to "train" the classification

Ideally, pixels used as training sites should:

- Be scattered around the image to capture the variability in reflectance due to illumination differences. Training site pixels should come from at least 5 to 10 locations around the image.
- Number at least 10*n pixels per classification category, where n is the number of bands in the imagery (a worked sketch of this rule appears after question 2 below).
- Have similar spectral reflectances within each category. For example, in order to identify wheat fields on an image, you should only use training sites from a green wheat field, or a golden wheat field, or a mowed wheat field, but not all three at the same time. If you want to classify all three stages of wheat, you should compile separate spectral signatures for each category of wheat.
- Be accessible for ground truthing and verification. Choosing pixel locations on a ridge top may be wonderful for a mountaineer, but does not make for easy logistics. You may often wish to visit the training sites in the field to understand the land cover at each site.

In reality, logistical and image limitations often force you to follow less-than-ideal practices. It is quite common to use only 30 to 50 training pixels per category, and sometimes all of these pixels may come from one site. If you do not follow the ideal rules laid out above, it is especially important to be aware of how the training sites may lead to confusion within the classification. An excellent overview of all these issues can be found in Congalton and Green (1999).

Your first order of business is to collect some training sites. Open the Lamar_TM5 image as a true color composite. We will attempt to classify the following features on this image: (1) water; (2) trees; (3) gravel bar; (4) woody debris; (5) sage; and (6) willow/alder/sedge.

[Figure 1. True color image of the Lamar River, 08-03-1999, Yellowstone National Park. Labels on the image identify wood, sage, gravel bars, water, individual logs, conifers, willow/sedge, and the Photo 1 perspective. The accompanying file (Lab_classification_photos) has ground photos of the key features shown on the image.]

Also open the file Lab_classification_photos and take a look at the ground photos for the various categories. You will want to think carefully about these ground photos and their spectral reflectances as you answer the questions in this lab and evaluate your spectral signature files and classification.

1. What is the minimum number of pixels you should choose to use as training sites for any one feature type (e.g., water)?

2. Based on a visual examination of the true color composite (Figure 1) and the ground photos, which features do you think might be difficult to separate based on their spectral reflectances?
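Before you start collecting AOIs, it helps to know what a training set ultimately becomes. For a parametric classifier such as maximum likelihood, a spectral signature is summarized by per-band statistics of the training pixels: a mean vector and a band-to-band covariance matrix. Below is a minimal sketch of that idea (illustrative only, not the ERDAS .sig format; `pixels` is an assumed (n_pixels, n_bands) array extracted from your AOIs), which also checks the 10*n rule of thumb from the list above:

```python
import numpy as np

def build_signature(pixels):
    """Summarize training pixels as a mean vector and covariance matrix.

    pixels : (n_pixels, n_bands) array of training-pixel values.
    Warns when the 10*n rule of thumb (10 pixels per band) is not met.
    """
    n_pixels, n_bands = pixels.shape
    if n_pixels < 10 * n_bands:
        print(f"warning: {n_pixels} pixels < 10*{n_bands} rule of thumb")
    mean = pixels.mean(axis=0)          # per-band mean
    cov = np.cov(pixels, rowvar=False)  # band-to-band covariance
    return mean, cov
```

For a 6-band TM image, the rule of thumb works out to at least 60 pixels per category.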
To create spectral signature files, you first select areas of interest (AOIs) that capture the spectral signal of the feature of interest (e.g., trees). The following steps lead you through several different ways of collecting areas of interest for your spectral signature files. The use of different techniques for creating areas of interest is intended to show you all the options for defining spectral signatures; you should not interpret this to mean that any method can be used at any time. You should consider the goals of the project and the accuracy requirements before choosing a particular method. For example, in my research work I always use ground-based maps to define where my training sites are (the approach described in section 1.1 immediately below).

1.1 Using the polygon tool to collect AOIs

In the View window, open AOI/Tools and select the Polygon icon. Locate 5 areas scattered across the image that are water and draw an AOI polygon at each site. In total, the number of pixels in the 5 sites should equal or exceed your answer to question 1 above, which will give you a sense of how big the AOIs should be. Remember, it is okay to collect more training pixels than the minimum required, but it is not okay to select fewer. By the same token, if you highlight all or a large majority of the water, you will have nothing left to classify, and your training pixels will have picked up so much within-water spectral variation that they won't work as well as a smaller number of "pure" water pixels. As you move from one AOI water site to the next, you will have to re-select the Polygon tool for each new AOI. Only select areas where it is clear you are selecting water and nothing else. In other words, avoid edges and anything that might be a mixed pixel.

After drawing all five AOIs, select them all in the Viewer window by clicking on AOI/Group. Select or deselect subsets of the AOIs by holding down the shift key while left-clicking on an AOI with your mouse. After grouping the AOIs, select File/Save/AOI Layer As... in the Viewer window and save the AOI as water_training.aoi. This way your work will not disappear if you have a computer hiccup.

You have now created an AOI. The next step is to add it to a signature file, which is done in the Signature Editor window. Open Classifier/Signature Editor. In the Signature Editor window select Edit/Add, or click on the Add icon. The pixels within the AOI polygon boundaries will be added as a row of data (note: the AOI must be highlighted in the Viewer window before it will be added to the Signature Editor). Click on the Signature Name cell for the row and rename it Water. Click on the color cell and change it to blue. Check the Count column to determine how many pixels you selected. If you do not have enough pixels to meet the criteria you outlined in question 1, go back to the AOI tool and select some more pixels from different sites, then add those pixels to the Signature Editor. Remove the water AOI from the screen by opening the View/Arrange Layers window, right-clicking on the AOI layer, and deleting it. Save your signature file (not the AOI file) as Lamar_TM5_training before going on to the next step.

1.2 Using the seed growing tool to define AOIs

In your AOI window, select the Seed Properties tool (or go to AOI/Seed Properties through the Viewer window menu). Set the Euclidean distance to 50, then select the Seed Tool icon and click on a tree top. Locating tree tops accurately can be more difficult than you might imagine at first glance. Use shadows to help distinguish trees from other green features. (Note: I selected the Euclidean distance of 50 after some trial and error. Try resetting the distance to higher or lower numbers to see how this affects your AOI definitions.) Repeat this process for at least 5 (and preferably more) tree tops scattered across the image. Group the tree top AOIs and save them as tree_training.aoi. Add the AOI as a layer to your Signature Editor. Give this row a Signature Name of Trees and assign it a dark green color. Save the revised signature file.
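The Euclidean distance setting has a concrete meaning: a candidate pixel joins the AOI only if its spectral vector lies within that distance of the seed's. The sketch below is my own simplification of that region-growing idea (illustrative only, not the actual ERDAS algorithm; a real implementation may compare against a running region mean rather than the original seed value):

```python
import numpy as np
from collections import deque

def grow_region(image, seed_rc, max_dist, max_pixels=50):
    """4-connected region growing from a seed pixel.

    image    : (rows, cols, bands) array
    seed_rc  : (row, col) of the seed pixel
    max_dist : Euclidean spectral distance threshold
    Returns a boolean mask marking the grown AOI.
    """
    rows, cols, _ = image.shape
    seed_val = image[seed_rc].astype(float)
    mask = np.zeros((rows, cols), dtype=bool)
    queue = deque([seed_rc])
    while queue and mask.sum() < max_pixels:
        r, c = queue.popleft()
        if not (0 <= r < rows and 0 <= c < cols) or mask[r, c]:
            continue
        # reject neighbors spectrally too far from the seed
        if np.linalg.norm(image[r, c].astype(float) - seed_val) > max_dist:
            continue
        mask[r, c] = True
        queue.extend([(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])
    return mask
```

Raising `max_dist` mirrors what you will see in the Viewer: a looser threshold grows larger, less spectrally pure AOIs.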
1.3 Using the inquire cursor to define AOIs

Change the band combinations on your Lamar_TM5 image to assign band 5 to red, band 4 to green, and band 3 to blue. This will highlight the woody debris much better than the true color composite does. Accumulations of wood can be relatively small, so you want to be precise in locating your AOIs. One way to be precise is to use the Utility/Inquire Cursor to "seed" the AOI; the Inquire Cursor can be positioned precisely, to the nearest pixel.

If you have closed it, reopen the AOI/Seed Properties window. Set the Euclidean distance to 25 and the number of pixels to 50. Select Utility/Inquire Cursor or select the Inquire Cursor icon (the crosshair on the Viewer window). Locate it on top of a piece of wood and click the Grow at Inquire button in the Region Growing Properties window. Repeat this process for about 10 locations around the image. If your experience is like mine, you will occasionally grab more material than you want. When this happens, delete that AOI. Add the wood AOI as a layer to your Signature Editor. Give this row a Signature Name of Wood and assign it a sienna color. Remember to save the revised signature file.

Delete all the AOIs from the Viewer window before proceeding to the next step. Don't worry: as long as you have saved the AOI and spectral signature files, the areas of interest can be brought back up. Removing them from the Viewer window simply serves to simplify the screen before the next step.

1.4 Using feature space to collect AOIs

The approaches above used either (1) spatial features to define the training site, which is what you did when you drew the polygons on the water, or (2) the spectral characteristics (Euclidean distance) within a certain spatial distance of a seed pixel to define a training site. One can also use spectral criteria alone to define training pixels by using a tool called Feature Space. Feature Space creates a scatter plot of the spectral values of two bands for all the pixels on the image. Using the Inquire Cursor, you can then explore where individual pixels from the image are located on the scatter plot. In turn, this information can be used to create an AOI that highlights all pixels with those spectral values.

In the Signature Editor window, select Feature/Create/Feature Space Layers. Use Lamar_TM5 as the input layer and also use Lamar_TM5 as the output root name. In the rows that open up under the output root name, highlight Lamar_TM5_3-4.fsp. This will create an image that plots band 3 versus band 4. Click OK. Open a second Viewer and display Lamar_TM5_3-4.fsp.img. This is a scatter plot of band 3 versus band 4 for all the pixel values in the image. The colors indicate how many pixels share a given pair of values: cooler colors (purple, blue) represent fewer pixels with those values, while warmer colors (yellow, red) indicate that a larger number of pixels have those values.
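The feature space image is essentially a two-dimensional histogram of the two bands. A short matplotlib sketch of the same plot (the random `band3`/`band4` arrays are stand-ins so the example runs; with real band arrays the warm and cool regions would mirror the ERDAS feature space image):

```python
import numpy as np
import matplotlib.pyplot as plt

# band3 and band4 stand in for the image's band arrays; synthetic
# values are used here only so the sketch runs on its own.
rng = np.random.default_rng(0)
band3 = rng.integers(0, 256, size=(400, 400))
band4 = rng.integers(0, 256, size=(400, 400))

# 2-D histogram of band 3 vs band 4: the count in each bin plays the
# role of the warm/cool colors in the ERDAS feature space image.
plt.hist2d(band3.ravel(), band4.ravel(), bins=128, cmap="jet")
plt.xlabel("Band 3 DN")
plt.ylabel("Band 4 DN")
plt.colorbar(label="pixel count")
plt.show()
```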
In the Signature Editor, select Feature/View/Linked Cursors. Enter the Viewer number in which your feature space (the scatter plot) is displayed, click Select, then left-click in the Viewer with the feature space image. Click Link, then left-click in the Viewer with the image of the Lamar River. A cursor should appear in both Viewers. Move the cursor in the image Viewer (probably your Viewer #1) to the sagebrush region and watch where the cursor on the feature space moves. Move the cursor on the image to a number of points where there is sagebrush to determine which portion of the scatter plot corresponds to the spectral signal of sage.

3. Move the cursor to some of the major categories that we are trying to separate (trees, willows, wood, etc.). Which category or categories (if any) is the sage most likely to be confused with from a spectral perspective? Why do you think this is the case? Remember to look at the ground photos to develop a feel for the composition of the different features.

4. What is the potential disadvantage of using the feature space to define the training sites rather than defining an AOI on the image?

If you have closed it, reopen your AOI/Tools window. Click on the Polygon tool and create a closed polygon around the area on your feature space image that corresponds to the sage. Then click on Edit/Add in the Signature Editor. Name your new category Sage and assign it a gold color. The feature space AOI only indicates a range of spectral values; it does not associate those values with specific pixels on the image. For this reason, you will notice that there is no pixel count associated with the sage row in the Signature Editor. To fix this, click on Feature/Statistics in the Signature Editor window. This will associate the spectral values with specific pixels.

1.5 Collecting additional AOIs for willows and gravel bar

Using one of the four methods outlined above, create training site files for willow/sedge/grass (one category) and gravel bar. Assign a color of green to the willows and a color of dark gray to the gravel. Remember to save your AOIs for each group.

5. Which approach to collecting an AOI did you use to create a spectral signature file for willows? Explain why you chose this approach to "find" willow locations.

Before continuing on to the next portion of the lab, be sure to save your signature file.

Part 2: Evaluating the Training Sites

You have now collected a lot of training pixels, but you do not yet know whether the training pixels all fit the criteria outlined at the start of the lab. In particular, you don't know how homogeneous the training sites are and how pure the training pixels are. The following sections outline several approaches for evaluating your training sites.

2.1 Checking your training site AOIs

It is wise to check your training site AOIs to make sure their locations are correct. For example, when I first selected AOIs for gravel bars, I mistakenly included some of the AOIs from the wood training set as well. I did not realize this until I went back to the image and examined the AOI locations for each training set. To check your AOI locations, go to the Signature Editor window and move the > symbol by clicking in the > column to select a feature type. Then select View/Image AOI. The AOIs for that training set should open up on the image. Examine the AOIs for that feature to make sure they correspond to the locations you intended. If the locations are incorrect, edit the AOIs or recreate them.

NOTE: you will not be able to view the AOIs you collected using the Feature Space approach this way; a way to do this is outlined in the next section. I have found that ERDAS Imagine gets a little spooky when you ask it to bring up AOIs. You may have to close the Viewer window and reopen it to refresh its memory. In this case, you will have to reopen your signature files through the Signature Editor window.
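Before moving on, it is worth seeing what a feature-space AOI means in image terms, since section 2.2 below relies on exactly this mapping: the training pixels are simply those whose band values fall inside the selected spectral region. Here is an illustrative numpy sketch using a rectangular spectral box (a real AOI polygon would need a point-in-polygon test; the numeric limits in the usage comment are invented):

```python
import numpy as np

def feature_space_mask(band3, band4, b3_range, b4_range):
    """Boolean image mask of pixels whose (band3, band4) values fall
    inside a rectangular feature-space region.

    band3, band4       : 2-D arrays of band values
    b3_range, b4_range : (low, high) limits of the spectral box
    """
    return ((band3 >= b3_range[0]) & (band3 <= b3_range[1]) &
            (band4 >= b4_range[0]) & (band4 <= b4_range[1]))

# e.g. sage_mask = feature_space_mask(band3, band4, (60, 90), (40, 70))
# (the numeric limits above are made up purely for illustration)
```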
2.2 Using an image mask to check your feature space training sites

When you chose the sage training pixels using the Feature Space approach, you were never able to see which pixels on the image corresponded to the spectral region you selected as your training set. To determine this, highlight the sage row in your Signature Editor window (the sage row will be highlighted in yellow). Then select Feature/Masking/Feature Space to Image. Click Apply in the FS to Image Masking window that pops up, then left-click in the Viewer window containing the image. The pixels that correspond to the spectral range you highlighted in the feature space will now show up on the image. In my case, the mask showed me that I had highlighted too large a spectral region: anything that even vaguely resembled dry vegetation was included in my training set. This is inappropriate, because the training sites should represent only pure pixels of sage. I therefore had to delete my original set of sage training sites and create a new one in the feature space window using a much smaller AOI polygon that highlighted only a relatively small subset of the sage sites.

2.3 Comparing Spectral Signatures

One of the keys to effective classification is to avoid overlap in the training spectra for the different categories. Detecting spectral overlap is therefore a very useful exercise. Open a new Viewer window and open Lamar_TM5_3-4.fsp.img. In the Signature Editor window, highlight all the rows, then select Feature/Objects. A Signature Objects window will open. In the Signature Objects window, select Viewer 2 (or whichever Viewer your Lamar_TM5_3-4.fsp.img spectral plot is open in) and left-click in the feature space window. Make sure the Plot Ellipse box is checked on. Click OK. Ellipses should be drawn around certain areas on your spectral plot. Significant overlap between the ellipses is bad, because it indicates that the training sites (at least in bands 3 and 4, shown on this image) are spectrally similar, which in turn makes it hard to differentiate the features. Try changing the standard deviation on the Signature Editor window to higher and lower values.

6. Is there more or less overlap when you set the standard deviation to a lower value (e.g., 1.0 instead of 2.0)? Why is this (i.e., what does the standard deviation represent)?

7. Which features overlap (note that the colors of the ellipses correspond to the colors you have assigned each category)? Why do you think these categories overlap? Remember to look at the ground photos for additional information.

8. What is the largest standard deviation value you can choose that avoids any overlap?

2.4 Examining the training set histograms and statistics

You can also examine the distribution of pixel values for each training set. Move the > symbol in the Signature Editor window to the water row, then select View/Histograms. A histogram of band 1 will pop up along with a Plot Option window. Examine the histogram for water in band 1, then change the bands in the Plot Option window to see how the values change with wavelength.
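If you would rather see all six bands at once, the same check is easy to script. An illustrative matplotlib sketch follows (the synthetic `water_pixels` array is a stand-in for your real training set so the example runs on its own):

```python
import numpy as np
import matplotlib.pyplot as plt

# water_pixels stands in for an (n_pixels, 6-band) training set;
# synthetic values are used here only so the sketch runs standalone.
rng = np.random.default_rng(0)
water_pixels = rng.normal(loc=40, scale=5, size=(60, 6))

fig, axes = plt.subplots(2, 3, figsize=(10, 6))
for band, ax in enumerate(axes.ravel()):
    # one histogram per band; look for uni- vs multimodal shapes
    ax.hist(water_pixels[:, band], bins=20)
    ax.set_title(f"Band {band + 1}")
fig.suptitle("Water training set: per-band histograms")
plt.tight_layout()
plt.show()
```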
9. Use the table below to describe the shape and range of the values for your training sets.

Band | Water: uni-, bi-, or multimodal? | Water: range of values | Wood: uni-, bi-, or multimodal? | Wood: range of values
-----|----------------------------------|------------------------|---------------------------------|----------------------
  1  |                                  |                        |                                 |
  2  |                                  |                        |                                 |
  3  |                                  |                        |                                 |
  4  |                                  |                        |                                 |
  5  |                                  |                        |                                 |
  6  |                                  |                        |                                 |

Typically, one would wish to have a unimodal distribution of values for each training set, which would indicate that the training pixels represent a single category of features. The modality of a training set depends entirely on which locations you chose as training sites, but I would guess that your wood and tree training sites display bimodal or multimodal distributions in some bands.

10. Are there legitimate variations in wood reflectance that might lead one to accept a training set with bimodal or multimodal distributions? If so, what features of wood might be driving this multimodal reflectance? (Remember to look at the ground photos.)

2.5 Contingency tables

A contingency table overlays ground truth and classified pixels to determine how well they match up. One can apply the contingency table technique to a classification of the training site pixels, which represent your ground truth (i.e., you know that these pixels are wood, trees, etc.). In theory, since the training pixels are known to be a given feature, a classification based on those same training sites should work extremely well for those locations. For example, you would expect the statistics for the willow training set to classify all pixels in that training set as willow and not as sage. If the willow training set classifies some of its own pixels as sage, that suggests a problem with the training set.

In the Signature Editor window, highlight all the rows, select Evaluate/Contingency, leave the default as Maximum Likelihood, and click OK. A table showing the ability of the training sites to classify themselves correctly will pop up. Refer to your class notes or the text if you need to refresh your memory on how to read a contingency table.

11. Which features (if any) were most confused? Do the features that were confused fit with your answers to questions 2, 3, and 7 above?

2.6 Separability Indices

Finally, a number of numeric indices are available for evaluating the separability of features. These can be accessed in the Signature Editor window by selecting Evaluate/Separability, choosing the desired index, and clicking OK. Feel free to run these and look over some of the index results. The online ERDAS Imagine Field Guide provides an overview of how to interpret the numeric results.

Part 3: Supervised Classification

Once you are confident that your training sites are appropriate, you can apply them using a number of different classification approaches. Today's lab focuses on the maximum likelihood approach, which is probably the most commonly used classification algorithm.

Select Classifier/Supervised Classification. Enter Lamar_TM5.img as your input image, Lamar_TM5_training.sig as your signature file, and Lamar_TM5_class_max as your output file. Under the Parametric Rule, make sure that Maximum Likelihood is selected. For the Non-parametric Rule, select Feature Space. Make sure Fuzzy Classification is checked on with 2 layers entered. The fuzzy classification will create two layers of classification: the first layer indicates the most probable class for each pixel, and the second layer indicates the second most probable class for each pixel. Comparing these layers can provide good insights into which classes are being confused by the algorithm, but it may also be a real indicator of mixed pixels, where a given pixel contains, for example, both gravel and wood. Once you have all the settings correctly entered, click OK.
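For reference, the maximum likelihood rule itself is compact: each pixel is scored against each class's Gaussian model (the signature mean and covariance), and the best and second-best scoring classes correspond to the two fuzzy layers. Below is a simplified numpy/scipy sketch of that rule (assuming equal prior probabilities; not the exact ERDAS implementation):

```python
import numpy as np
from scipy.stats import multivariate_normal

def max_likelihood_classify(image, signatures):
    """Per-pixel maximum likelihood classification with two fuzzy layers.

    image      : (rows, cols, bands) array
    signatures : list of (mean, cov) tuples, one per class
    Returns (best, second): class-index arrays, the analogue of the
    two fuzzy classification layers.
    """
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    # log-likelihood of every pixel under each class's Gaussian model
    scores = np.column_stack([
        multivariate_normal.logpdf(pixels, mean=m, cov=c)
        for m, c in signatures])
    order = np.argsort(scores, axis=1)        # ascending scores
    best = order[:, -1].reshape(rows, cols)   # most probable class
    second = order[:, -2].reshape(rows, cols) # second most probable
    return best, second
```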
Part 4: Qualitative Evaluation of the Classification

Close all your open windows, then open Lamar_TM5_class_max in Viewer 1, specifying that you want layer 1 opened in Pseudo Color. If the classification has gone well, the classified image will open with the same colors assigned to each class that you assigned to the features in your signature file.

12. At first glance, which category is classified best? In terms of spectral characteristics (reflectance, absorption), why is this the case?

Now open Lamar_TM5 in the same Viewer as a true color composite, making sure that you uncheck the Clear Display box in the Raster Options. With both layers open in the same Viewer, select Utilities/Swipe from the Viewer window menu. Move the swipe back and forth to look at how different features have been classified and evaluate the quality of the classification.

In a second Viewer window, open layer 1 of the classified image as a pseudo color image. In the same window, open layer 2 as a pseudo color image (remember to uncheck the Clear Display box). If layer 2 comes up in gray scale, you may have to go into Raster/Attributes to reassign the image the same colors as layer 1. Use the Swipe tool to see which class was the next most probable after the layer 1 classification.

13. Based on your visual analysis, which three categories were most commonly mixed up?

14. Consider the full range of ways in which you can control the classification process, ranging from the imagery you use (think resolution), to ground truthing, to selection of training sites, to image restoration, to image enhancement and classification approaches. What are three actions you might take to improve the classification accuracy for this particular site? Explain why you think those approaches might work for this site.

If you are feeling inspired, try some other classification approaches, such as minimum distance or parallelepiped, and see how their accuracies compare to your maximum likelihood approach.

References:

Congalton RG, Green K (1999) Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. New York, Lewis Publishers, 137 p.