Spectral Analysis and Classification of Hyperspectral Data

Wendy Zhang
Southern University at New Orleans

Mentor: Lloyd McGregor
Lockheed Martin Space Operations - Stennis Programs

NASA Faculty Fellowship Program 2003

ABSTRACT

Imaging spectrometers, or "hyperspectral sensors," are remote sensing instruments that combine the spatial presentation of an imaging sensor with the analytical capabilities of a spectrometer. The AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) collects data in 224 contiguous bands, each approximately 9.6 nm (nanometers) wide, between 0.40 and 2.45 μm (microns). The main objective of the AVIRIS project is to identify, measure, and monitor constituents of the Earth's surface and atmosphere based on molecular absorption and particle scattering signatures. Research with AVIRIS data predominantly focuses on understanding processes related to the global environment and climate change. In this project, we emphasize spectrally oriented classification procedures for land cover mapping, in particular surface classification using AVIRIS data.

Keywords: hyperspectral data, spectral analysis, endmember, classification

1. Introduction

NASA's Earth Sciences program is primarily focused on providing high quality data products to its science community. NASA also recognizes the need to increase its involvement with the general public, including areas of information and education. Many different Earth-sensing satellites, with diverse sensors mounted on sophisticated platforms, are in Earth orbit or soon to be launched. These sensors are designed to cover a wide range of the electromagnetic spectrum and are generating enormous amounts of data that must be processed, stored, and made available to the user community.

Imaging spectrometers, or "hyperspectral sensors," are remote sensing instruments that combine the spatial presentation of an imaging sensor with the analytical capabilities of a spectrometer. They may have up to several hundred narrow spectral bands, with spectral resolution on the order of 10 nm or narrower, and they acquire images throughout the visible, near-IR, and thermal IR portions of the spectrum. The AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) collects data in 224 contiguous bands, each approximately 9.6 nm wide, between 0.40 and 2.45 μm. Because of the large number of very narrow bands sampled, hyperspectral data enable remote sensing to replace data collection that was formerly limited to laboratory testing or ground site surveys. The main objective of the AVIRIS project is to identify, measure, and monitor constituents of the Earth's surface and atmosphere based on molecular absorption and particle scattering signatures. Research with AVIRIS data predominantly focuses on understanding processes related to the global environment and climate change. AVIRIS data have hundreds of spectral bands; by comparison, broad-band multispectral scanners such as the Landsat Thematic Mapper (TM) have only 6 spectral bands, with spectral resolution on the order of 100 nm or greater. Each pixel has an associated, continuous spectrum that can be used to identify the surface materials. The end result of the high spectral resolution of an imaging spectrometer is that we can identify materials, where with broad-band sensors we could only discriminate between materials.
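To make this data model concrete, the sketch below represents a hyperspectral scene as a three-dimensional cube (lines x samples x bands) and pulls out the continuous spectrum associated with a single pixel. The array sizes, band count, and wavelength grid are illustrative values patterned after AVIRIS, not the actual flight data used in this project.

```python
import numpy as np

# Hypothetical AVIRIS-like cube: 512 lines x 614 samples x 224 bands
# (reflectance stored as scaled integers, as in many delivered products).
lines, samples, bands = 512, 614, 224
cube = np.random.randint(0, 8000, size=(lines, samples, bands), dtype=np.int16)

# Approximate band-centre wavelengths: 224 contiguous ~9.6 nm bands from 0.40 to 2.45 um.
wavelengths = np.linspace(0.40, 2.45, bands)

# Each pixel carries a full spectrum: one reflectance value per band.
row, col = 100, 200
pixel_spectrum = cube[row, col, :]

for wl, refl in zip(wavelengths[:3], pixel_spectrum[:3]):
    print(f"{wl:.3f} um -> scaled reflectance {refl}")
```

A broad-band sensor would collapse the same spectrum into a handful of values, which is why hyperspectral resolution allows materials to be identified rather than merely discriminated.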
In this project, we emphasize spectrally oriented classification procedures for land cover mapping, in particular surface classification using AVIRIS data. The initial goal of the research is to decide which spectra from a spectral library are most appropriate for identifying a particular ground cover. The term "endmember" describes a substance or material that has a unique spectral signature; an endmember may be a single substance or a composite of materials. It is expected that, based on the extracted endmember spectra, the data patterns can be easily retrieved and the most appropriate bands can be identified. The overall objective of the image classification procedure is to automatically categorize all pixels in an image into land cover classes or themes.

2. Overview of Hyperspectral Data and Image Analysis

2.1 AVIRIS

AVIRIS is a proven instrument in the realm of Earth remote sensing. It is a unique optical sensor that delivers calibrated images of the upwelling spectral radiance in 224 contiguous spectral channels (bands) with wavelengths from 400 to 2500 nanometers. AVIRIS has been flown on two aircraft platforms: a NASA ER-2 jet and the Twin Otter turboprop. The AVIRIS sensor collects data that can be used for characterization of the Earth's surface and atmosphere from geometrically coherent spectroradiometric measurements. These data can be applied to studies in the fields of oceanography, environmental science, snow hydrology, geology, volcanology, soil and land management, atmospheric and aerosol studies, agriculture, and limnology. Applications under development include the assessment and monitoring of environmental hazards such as toxic waste, oil spills, and land/air/water pollution. With proper calibration and correction for atmospheric effects, the measurements can be converted to ground reflectance data, which can then be used for quantitative characterization of surface features.

2.2 Spectral Regions

The principal regions of the electromagnetic spectrum and their utility for remote sensing are summarized below, with wavelengths measured in nanometers (nm), microns (μm), and centimeters (cm).

Gamma ray (< 0.03 nm): Incoming radiation is completely absorbed by the upper atmosphere and is not available for remote sensing.
X-ray (0.03 to 3.0 nm): Completely absorbed by the atmosphere; not employed in remote sensing.
Ultraviolet (0.03 to 0.4 μm): Incoming wavelengths less than 0.3 μm are completely absorbed by ozone in the upper atmosphere.
Photographic UV band (0.3 to 0.4 μm): Transmitted through the atmosphere; detectable with film and photodetectors, but atmospheric scattering is severe.
Visible (0.4 to 0.7 μm): Imaged with film and photodetectors; includes the reflected energy peak of the Earth at 0.5 μm.
Infrared (0.7 to 100 μm): Interaction with matter varies with wavelength; atmospheric transmission windows are separated by absorption bands.
Reflected IR band (0.7 to 3.0 μm): Reflected solar radiation that contains no information about the thermal properties of materials; the band from 0.7 to 0.9 μm is detectable with film and is called the photographic IR band.
Thermal IR band (3 to 5 μm and 8 to 14 μm): Principal atmospheric windows in the thermal region; images at these wavelengths are acquired by optical-mechanical scanners and special vidicon systems, but not by film.
Microwave (0.1 to 30 cm): Longer wavelengths can penetrate clouds, fog, and rain; images may be acquired in the active or passive mode.
Radar (0.1 to 30 cm): Active form of microwave remote sensing; radar images are acquired at various wavelength bands.
Radio (> 30 cm): Longest wavelength portion of the electromagnetic spectrum; some classified radars with very long wavelengths operate in this region.
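As a quick illustration of how these regions partition the spectrum, the sketch below classifies a wavelength (given in microns) into one of the broad regions listed above. The boundaries are the approximate values from the table, and the function name and structure are only illustrative.

```python
def spectral_region(wavelength_um: float) -> str:
    """Return the broad spectral region for a wavelength in microns,
    using the approximate boundaries tabulated above."""
    regions = [
        (3e-8,  "gamma ray"),          # < 0.03 nm
        (3e-3,  "X-ray"),              # 0.03 nm - 3.0 nm
        (0.4,   "ultraviolet"),        # includes the photographic UV band (0.3-0.4 um)
        (0.7,   "visible"),
        (3.0,   "reflected infrared"),
        (14.0,  "thermal infrared"),   # main windows at 3-5 um and 8-14 um
        (100.0, "far infrared"),
        (3e5,   "microwave / radar"),  # 0.1 - 30 cm
    ]
    for upper, name in regions:
        if wavelength_um < upper:
            return name
    return "radio"

# Example: an AVIRIS band centred at 2.2 um falls in the reflected IR.
print(spectral_region(2.2))   # -> "reflected infrared"
print(spectral_region(10.0))  # -> "thermal infrared"
```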
2.3 Spectral Response Patterns

Three basic types of earth features, green vegetation, dry bare soil, and clear lake water, are normally spectrally separable. Water and vegetation may reflect nearly equally in visible wavelengths, yet these features are always separable in near-IR wavelengths. Since the spectral responses measured by remote sensors over various features often permit an assessment of the type and condition of the features, these responses are referred to as spectral signatures. Many earth surfaces manifest very distinctive spectral reflectance and emittance characteristics, and these characteristics result in spectral response patterns. A spectral signature differs from a spectral response pattern: "signature" tends to imply a pattern that is absolute and unique, whereas the spectral response patterns measured by remote sensors may be quantitative but are not absolute, and may be distinctive but are not necessarily unique.

2.4 Spectral Resolution, Spectral Sampling, and Spectral Modeling

Spectral resolution determines the way we see individual spectral features in materials measured using imaging spectrometry. Spectral resolution refers to the width of an instrument response (band-pass) at half of the band depth (the Full Width at Half Maximum, FWHM). The spectral resolution required for a specific sensor is a direct function of the material you are trying to identify and the contrast between that material and the background materials. Spectral sampling refers to the band spacing, i.e., the quantization of the spectrum at discrete steps. Quality spectrometers are usually designed so that the band spacing is about equal to the band FWHM. Spectral modeling shows that the spectral resolution requirements for imaging spectrometers depend upon the character of the material being measured.

2.5 HyMap Data

HyMap is a state-of-the-art aircraft-mounted commercial hyperspectral sensor developed by Integrated Spectronics, Sydney, Australia, and operated by HyVista Corporation. HyMap provides excellent spatial, spectral, and radiometric performance. The system is a whiskbroom scanner utilizing diffraction gratings and four 32-element detector arrays (1 Si, 3 liquid-nitrogen-cooled InSb) to provide 126 spectral channels covering the 0.44 to 2.5 μm range over a 512-pixel swath. Spectral resolution varies from 10 to 20 nm, with 3 to 10 m spatial resolution and a signal-to-noise ratio over 1000:1.
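To connect the notions of band FWHM and band spacing discussed in Section 2.4, the sketch below simulates how a sensor with Gaussian band-passes would sample a high-resolution laboratory spectrum: each band's value is the lab spectrum weighted by a Gaussian response whose width is set by the FWHM. The band centers, FWHM, and input spectrum are hypothetical values, not the response functions of any particular instrument.

```python
import numpy as np

def resample_to_sensor(lab_wl, lab_refl, band_centers, fwhm):
    """Convolve a high-resolution spectrum with Gaussian band responses.

    lab_wl, lab_refl : wavelength grid (um) and reflectance of a lab spectrum
    band_centers     : sensor band-centre wavelengths (um)
    fwhm             : band full width at half maximum (um), scalar or per band
    """
    sigma = np.asarray(fwhm) / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    sigma = np.broadcast_to(sigma, np.shape(band_centers))
    out = np.empty(len(band_centers))
    for i, (c, s) in enumerate(zip(band_centers, sigma)):
        weights = np.exp(-0.5 * ((lab_wl - c) / s) ** 2)
        out[i] = np.sum(weights * lab_refl) / np.sum(weights)
    return out

# Hypothetical lab spectrum sampled every 1 nm between 0.4 and 2.5 um.
lab_wl = np.arange(0.4, 2.5, 0.001)
lab_refl = 0.3 + 0.1 * np.sin(10 * lab_wl)        # stand-in for a real measurement

# Sensor bands spaced roughly one FWHM apart (about 10 nm), as noted in Section 2.4.
band_centers = np.arange(0.4, 2.5, 0.010)
sensor_refl = resample_to_sensor(lab_wl, lab_refl, band_centers, fwhm=0.010)
print(sensor_refl.shape)
```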
3. Hyperspectral Image Spectral Analysis and Mapping

3.1 Spectral Feature Fitting and Analysis

Spectral Feature Fitting™ (SFF™) is an absorption-feature-based method for matching image spectra to reference endmembers (the different materials contained in each spatial resolution cell). Most methods used for analysis of hyperspectral data still do not directly identify specific materials; they only indicate how similar a material is to another known material, or how unique it is with respect to other materials. Techniques for direct identification of materials via extraction of specific spectral features from field and laboratory reflectance spectra have been in use for many years. All of these methods require that the data be reduced to reflectance and that a continuum be removed from the reflectance data prior to analysis. A continuum is a mathematical function used to isolate a particular absorption feature for analysis; it corresponds to a background signal unrelated to the specific absorption features of interest. Spectra are normalized to a common reference using a continuum formed by defining the high points of the spectrum (local maxima) and fitting straight line segments between these points. The continuum is removed by dividing it into the original spectrum.

Spectral feature fitting requires that reference endmembers be selected from either the image or a spectral library, that both the reference and unknown spectra have the continuum removed, and that each reference endmember spectrum be scaled to match the unknown spectrum. A "scale" image is produced for each endmember selected for analysis by first subtracting the continuum-removed spectra from one, thus inverting them and making the continuum zero. A single multiplicative scaling factor is then determined that makes the reference spectrum match the unknown spectrum. Assuming that reasonable spectral ranges have been selected, a large scaling factor corresponds to a deep spectral feature, while a small scaling factor indicates a weak spectral feature. A least-squares fit is then calculated band by band between each reference endmember and the unknown spectrum, and the total root-mean-square (RMS) error is used to form an RMS image for each endmember.

The Spectral Analyst™ matches unknown spectra to library spectra and provides a score with respect to the library spectra. It uses several methods to produce a score between 0 and 1, with 1 equaling a perfect match.
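The following sketch illustrates the continuum-removal and scaling steps described above for a single, isolated absorption feature. It approximates the continuum as a straight line between the end points of the analysis window (a simplification of the segment-wise continuum between local maxima), divides it out, inverts the result, and then solves for the single multiplicative scale factor and the RMS error by least squares. All arrays and names are illustrative; this is not the ENVI implementation.

```python
import numpy as np

def continuum_removed(wl, refl):
    """Remove a straight-line continuum fitted between the window end points."""
    continuum = np.interp(wl, [wl[0], wl[-1]], [refl[0], refl[-1]])
    return refl / continuum

def feature_fit(wl, reference, unknown):
    """Scale a continuum-removed reference feature to an unknown spectrum.

    Returns (scale, rms): a large scale means the unknown exhibits a deep
    version of the reference absorption feature; rms measures fit quality.
    """
    ref_depth = 1.0 - continuum_removed(wl, reference)   # invert: continuum -> 0
    unk_depth = 1.0 - continuum_removed(wl, unknown)
    scale = np.dot(ref_depth, unk_depth) / np.dot(ref_depth, ref_depth)
    residual = unk_depth - scale * ref_depth
    rms = np.sqrt(np.mean(residual ** 2))
    return scale, rms

# Hypothetical absorption feature near 2.2 um, sampled at 10 nm spacing.
wl = np.arange(2.10, 2.35, 0.01)
reference = 1.0 - 0.30 * np.exp(-0.5 * ((wl - 2.20) / 0.02) ** 2)
unknown = 1.0 - 0.15 * np.exp(-0.5 * ((wl - 2.20) / 0.02) ** 2) + 0.005 * np.random.randn(wl.size)

scale, rms = feature_fit(wl, reference, unknown)
print(f"scale = {scale:.2f}, RMS = {rms:.4f}")   # scale near 0.5: a weaker feature
```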
3.2 Spectral Angle Mapper (SAM)

The Spectral Angle Mapper (SAM) is a physically based spectral classification method that uses the n-dimensional angle to match pixels to reference spectra. The algorithm determines the spectral similarity between two spectra by calculating the angle between them, treating them as vectors in a space with dimensionality equal to the number of bands. SAM compares the angle between the endmember spectrum (considered as an n-dimensional vector, where n is the number of bands) and each pixel vector in n-dimensional space; smaller angles represent closer matches to the reference spectrum. SAM is an automated method for comparing image spectra to individual spectra or to a spectral library. SAM assumes that the data have been reduced to apparent reflectance (true reflectance multiplied by some unknown gain factor controlled by topography and shadows).

Consider a reference spectrum and an unknown spectrum from two-band data. The two materials are represented in the 2-D scatter plot by a point for each given illumination, or as a line (vector) for all possible illuminations. The method uses only the "direction" of the spectra and not their length; the angle between the vectors is the same regardless of length. The length of a vector relates to how fully the pixel is illuminated, so the method is insensitive to the unknown gain factor and all possible illuminations are treated equally. Poorly illuminated pixels fall closer to the origin. The "color" of a material is defined by the direction of its unit vector.

Figure 3.1 Two-dimensional example of the Spectral Angle Mapper: the spectra of materials A and B plotted as vectors in the Band 1 versus Band 2 plane, separated by the spectral angle.

The SAM algorithm generalizes this geometric interpretation to nb-dimensional space. SAM determines the similarity of an unknown spectrum t to a reference spectrum r by applying the following equation:

\alpha = \cos^{-1}\left( \frac{\sum_{i=1}^{n_b} t_i r_i}{\left( \sum_{i=1}^{n_b} t_i^{2} \right)^{1/2} \left( \sum_{i=1}^{n_b} r_i^{2} \right)^{1/2}} \right)

where nb equals the number of bands in the image. For each reference spectrum chosen in the analysis of a hyperspectral image, the spectral angle α is determined for every image spectrum (pixel). This value, in radians, is assigned to the corresponding pixel in the output SAM image, with one output image for each reference spectrum. The derived spectral angle maps form a new data cube with the number of bands equal to the number of reference spectra used in the mapping. Gray-level thresholding is typically used to empirically determine those areas that most closely match the reference spectrum while retaining spatial coherence.

The SAM algorithm takes as input a number of "training classes" or reference spectra from ASCII files, ROIs, or spectral libraries. It calculates the angular distance between each spectrum in the image and the reference spectra in n dimensions. The result is a classification image showing the best SAM match at each pixel, and a "rule" image for each reference spectrum showing the actual angular distance in radians between each image spectrum and that reference spectrum. Darker pixels in the rule images represent smaller spectral angles, i.e., spectra that are more similar to the reference spectrum. Because SAM assumes that the data have been reduced to apparent reflectance and uses only the "direction" of the spectra and not their "length," the SAM classification is insensitive to illumination effects; the technique is relatively insensitive to illumination and albedo effects when used on calibrated reference data.
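A minimal NumPy sketch of the SAM rule and classification images described above: it evaluates the spectral-angle equation for every pixel against each reference spectrum, then assigns each pixel to the reference with the smallest angle. The cube, reference spectra, and angle threshold are placeholders, not the project's actual ENVI workflow.

```python
import numpy as np

def spectral_angles(cube, references):
    """Spectral angle (radians) between every pixel spectrum and each reference.

    cube       : (lines, samples, nb) apparent reflectance
    references : (n_ref, nb) reference spectra
    returns    : (lines, samples, n_ref) rule images
    """
    pixels = cube.reshape(-1, cube.shape[-1]).astype(np.float64)
    refs = np.asarray(references, dtype=np.float64)
    dots = pixels @ refs.T                                   # sum_i t_i r_i
    norms = np.linalg.norm(pixels, axis=1)[:, None] * np.linalg.norm(refs, axis=1)[None, :]
    cosines = np.clip(dots / norms, -1.0, 1.0)
    return np.arccos(cosines).reshape(cube.shape[0], cube.shape[1], -1)

def sam_classify(cube, references, max_angle=0.1):
    """Best-match class per pixel; -1 where the smallest angle exceeds the threshold."""
    angles = spectral_angles(cube, references)
    best = angles.argmin(axis=-1)
    best[angles.min(axis=-1) > max_angle] = -1
    return best, angles

# Hypothetical 3-class example on a small cube.
rng = np.random.default_rng(0)
cube = rng.uniform(0.0, 1.0, size=(50, 60, 224))
references = rng.uniform(0.0, 1.0, size=(3, 224))
class_image, rule_images = sam_classify(cube, references, max_angle=0.5)
print(class_image.shape, rule_images.shape)    # (50, 60) (50, 60, 3)
```

Thresholding the minimum angle mirrors the gray-level thresholding step mentioned above: pixels whose best match is still too far from every reference remain unclassified.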
3.3 Spectrum Profile

Every material has its own set of reflectance values for each band that can be used to identify it. In a spectral profile, the y-coordinate shows the reflectance value of the material, here scaled from 0 to 8000, and the x-coordinate identifies the band wavelengths. Wavelengths in the profile are measured in microns, where 1 μm = one millionth of a meter. The shape of the plot in the spectrum profile can be used to identify the material.

4. Experiments

The experiment is set up to find appropriate endmembers for classifying the surface cover of Stennis Space Center using AVIRIS data taken on July 29, 1999. The data file used is Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) apparent reflectance mosaic data covering part of Stennis Space Center, Mississippi, USA, calibrated using the ATREM (ATmospheric REMoval) atmospheric modeling software. The data cover the 0.3704 to 2.5101 μm range in 224 spectral bands, and the image data file size is 984 MB. The software used for this project is the Environment for Visualizing Images (ENVI), version 3.6, running on UNIX systems. The spectral libraries used are the USGS Vegetation Spectral Library (wavelength range 0.3951 to 2.56 μm, with wavelength accuracy on the order of 0.5 nm in the near-IR and 0.2 nm in the visible), the Jasper Ridge Spectral Library for green vegetation, dry vegetation, and rocks, and the Johns Hopkins University Spectral Library for man-made materials (0.42 to 14 μm), man-made materials (0.3 to 12.5 μm), and vegetation (0.3 to 14 μm).

4.1 Methodology

Traditional supervised and unsupervised classification techniques require very long processing times for hyperspectral data because of their dependence on the number of wavebands (224). A more serious problem is the need to estimate class signatures, i.e., the mean vector and covariance matrix, when using algorithms such as maximum likelihood that are based on second-order statistics. The difficulty lies in the small number of available training pixels per class compared with the number of wavebands used, and is related directly to the Hughes phenomenon. If too few training samples are used, the class model may fit the training data very well and classification accuracy on the training data can be very high; however, classification accuracy on testing data will be poor. In this case the classifier is overtrained and the estimated statistics are unreliable. Most methods used for analysis of hyperspectral data still do not directly identify specific materials; they only indicate how similar a material is to another known material or how unique it is with respect to other materials. Techniques for direct identification of materials via extraction of specific spectral features from field and laboratory reflectance spectra have been in use for many years. It is therefore not feasible to use regular classification methods to classify images using AVIRIS hyperspectral data.

The spectral libraries are provided by different agencies and laboratories. We browse image spectra and compare them to the spectral libraries: by visually comparing and contrasting the corresponding AVIRIS spectra with the features shown in the laboratory spectra, we judge how similar the two spectra are, and we plot the image spectrum and the similar library spectrum together to create a spectral profile. ENVI has a spectral matching tool, the Spectral Analyst™. The Spectral Analyst™ scores an unknown spectrum against the library and provides a score with respect to each library spectrum. It uses three methods, Spectral Angle Mapping, Spectral Feature Fitting, and Binary Encoding, to produce a score between 0 and 1, with 1 equaling a perfect match. We placed a weight of 0.33 on each method and tried to use the Spectral Analyst™ to identify spectra. Regions of Interest (ROIs) are used to extract statistics and average spectra from groups of pixels. We create ROIs of the pixels we have examined, extract statistics and spectral plots of the selected ROIs, and compare the spectral features of each mean spectrum to identify any unique characteristics.
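To make the equal-weight scoring concrete, the sketch below combines three normalized similarity measures, a spectral-angle score, a feature-fitting score, and a binary-encoding score, with a weight of 1/3 each, mirroring the weighting used with the Spectral Analyst™. The individual scoring functions are simplified stand-ins (the angle is mapped linearly onto [0, 1], binary encoding uses the spectrum mean as its threshold), not the ENVI implementations.

```python
import numpy as np

def sam_score(unknown, reference):
    """Map the spectral angle onto [0, 1], where 1 means identical direction."""
    cos_a = np.dot(unknown, reference) / (np.linalg.norm(unknown) * np.linalg.norm(reference))
    angle = np.arccos(np.clip(cos_a, -1.0, 1.0))
    return 1.0 - angle / (np.pi / 2.0)

def binary_encoding_score(unknown, reference):
    """Fraction of bands whose above/below-mean encoding agrees."""
    code_u = unknown > unknown.mean()
    code_r = reference > reference.mean()
    return float(np.mean(code_u == code_r))

def sff_score(unknown, reference):
    """Crude feature-fitting score: 1 minus the normalized RMS difference."""
    rms = np.sqrt(np.mean((unknown / unknown.max() - reference / reference.max()) ** 2))
    return max(0.0, 1.0 - rms)

def spectral_analyst_score(unknown, reference, weights=(1/3, 1/3, 1/3)):
    scores = (sam_score(unknown, reference),
              sff_score(unknown, reference),
              binary_encoding_score(unknown, reference))
    return float(np.dot(weights, scores))

# Hypothetical pixel spectrum scored against two library spectra.
rng = np.random.default_rng(1)
pixel = rng.uniform(0.1, 0.6, 224)
library = {"grass": pixel * 0.9 + 0.01, "walnut_leaf": rng.uniform(0.1, 0.6, 224)}
for name, spectrum in library.items():
    print(name, round(spectral_analyst_score(pixel, spectrum), 3))
```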
4.2 Findings

4.2.1 Comparison of image spectra and spectral library

We compared the pixel spectra in the spectral profile of the SSC hyperspectral data (n_03_mosaic) with spectra from the spectral libraries, using the following pairings:

Library: USGS Vegetation Spectral Library; Spectrum: aspenlf2.spc Aspen_Leaf_B DW92-3; Plot file: Aspenlf2_plot.jpg
Library: USGS Vegetation Spectral Library; Spectrum: grass.spc Lawn-grass GDS91 (Green); Plot file: Grass_plot.jpg
Library: USGS Vegetation Spectral Library; Spectrum: bluespru.spc Blue_Spruce DW92-5 Needle; Plot file: Bluespru_plot.jpg
Library: USGS Vegetation Spectral Library; Spectrum: juniper.spc Juniper_Bush IH91-4B whole; Plot file: Juniper_plot.jpg
Library: USGS Vegetation Spectral Library; Spectrum: firtree.spc Fir_Tree IH91-2 complete; Plot file: Firtree_plot.jpg
Library: USGS Vegetation Spectral Library; Spectrum: rabbit.spc Rabbitbrush ANP92-27 whole; Plot file: Rabbit_plot.jpg

We notice from the above spectral profiles that the image apparent reflectance spectra best match the library spectra in the 1.5 to 2.5 μm wavelength range.

4.2.2 Results of the Spectral Analyst

The results from the Spectral Analyst™ are disappointing. Selecting the spectrum for pixel x:110, y:1759, the Spectral Analyst dialog in Figure 4.1 shows that Spectral Feature Fitting (SFF) scores 0, meaning it does not match the pixel to any spectrum in the USGS Vegetation library. The highest overall score, 0.543, matches walnut leaf rather than the grass that the pixel spectrum represents. Figure 4.2 gives a better match with grass, but SFF still scores 0.

Figure 4.1 The Spectral Analyst dialog for pixel 110, 1759
Figure 4.2 The Spectral Analyst dialog for pixel 589, 1585

4.2.3 Use of Average Spectra

We select an ROI and then extract statistics and a spectral plot of the selected ROI. We find that the mean spectra of the ROIs are close to the spectral profiles of the individual pixels.
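A small sketch of the ROI averaging step: given a boolean mask marking the ROI pixels, it computes the mean and standard deviation spectrum of those pixels, which is what gets compared against the individual pixel profiles above. The cube and mask are placeholders, not the SSC mosaic data.

```python
import numpy as np

def roi_statistics(cube, mask):
    """Mean and standard deviation spectrum of the pixels selected by a 2-D mask.

    cube : (lines, samples, nb) reflectance cube
    mask : (lines, samples) boolean ROI mask
    """
    roi_pixels = cube[mask]                  # (n_pixels, nb)
    return roi_pixels.mean(axis=0), roi_pixels.std(axis=0)

# Hypothetical ROI: a 20 x 20 pixel block in a random cube.
rng = np.random.default_rng(2)
cube = rng.uniform(0.0, 0.6, size=(200, 300, 224))
mask = np.zeros(cube.shape[:2], dtype=bool)
mask[50:70, 100:120] = True

mean_spectrum, std_spectrum = roi_statistics(cube, mask)
print(mean_spectrum.shape, std_spectrum.shape)   # (224,) (224,)
```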
5. Problems and Future Studies

I have been given opportunities by my NASA colleagues, the SSC University Affairs Officer, and others at Stennis Space Center to gain training with, and exposure to, new remote sensing technologies, approaches, and processes. I have gained a deeper understanding of, and hands-on experience with, AVIRIS hyperspectral data by working with it this summer. My ultimate research objective is to increase the performance of image processing using AVIRIS hyperspectral data by grouping the useful endmembers and reducing the number of bands used. My current study involves only basic spectral analysis, and there is a great gap between the theoretical methodology and its application. Through the experiments, we have found that it is extremely difficult to identify vegetation species by spectral analysis: the spectra of vegetation change with season, climate, environment, and growing conditions, and there is a lack of vegetation spectral libraries. It is a great challenge for me to find spectral patterns of vegetation using hyperspectral data. I will continue this research with my NASA and Lockheed Martin Space Operations colleagues and also seek help from the USDA research group in Weslaco, Texas.