Segmentation of Dual-Frequency Polarimetric SAR Data for an Improved Land Cover Classification

Ken Yoong LEE and Timo Rolf BRETSCHNEIDER
EADS Innovation Works Singapore
110 Seletar Aerospace View, Singapore 797562
Tel: +65 65927324; Fax: +65 66591276
E-mail: {ken-yoong.lee, timo.bretschneider}@eads.net

Abstract

In this paper an existing hybrid segmentation algorithm is extended for segmenting multi-look polarimetric synthetic aperture radar data. There are four modules in this extended hybrid segmentation algorithm, namely 1) speckle suppression by a spatially adaptive filter, 2) edge magnitude computation based on Roy’s largest eigenvalue, 3) initial segmentation by morphological watershed transform, and 4) region merging. The performance of the extended hybrid segmentation algorithm was examined using NASA/JPL AIRSAR POLSAR data. The obtained segmentation outputs were then used as the inputs for land cover classification. The supervised classification was performed on per-pixel and per-segment bases using a complex Wishart classifier. The targeted land cover classes included bareland, bridge, built-up, mangrove forest, marshland, oil palm, rice paddy, rubber, scrub-grassland, water body, diversified cropland, dryland forest, fish farm, mudflat as well as road. From the obtained results, the overall accuracies were 63% and 67% for the C- and L-band, respectively. An improved accuracy of 75% was attained by using the dual-frequency combination. Furthermore, the per-segment approach was found to improve the accuracy with reduced salt-and-pepper effects in the classification outputs.

Key Words: polarimetric synthetic aperture radar, segmentation, classification, land cover

1. Introduction

With the realisation of multi-frequency fully polarimetric synthetic aperture radar (POLSAR) imaging systems, such as SIR-C, AIRSAR, E-SAR, EMISAR and Pi-SAR, it is now possible to capture complete polarimetric signatures of the Earth’s surface.
There is growing interest within the remote sensing community in utilising the acquired data for various applications (Boerner et al., 1998). One of the key applications is land cover classification. Multi-frequency POLSAR data can, through different frequency and polarisation combinations, provide abundant and valuable information (e.g. intensity, total power, polarisation ratio, phase difference and correlation coefficient), which helps in understanding and quantifying the physical interactions between radar waves and illuminated land cover features.

However, POLSAR data suffer inherently from speckle noise, which can degrade land cover classification. Although speckle suppression is a critical pre-processing task, it remains challenging: it is difficult for a speckle filter to distinguish perfectly between noise and the image features that are to be preserved, such as edges, lines and point targets. Moreover, the presence of speckle noise can restrict the applicability of conventional image processing techniques. In particular, conventional edge detectors, region-based segmentation algorithms and pattern recognition classifiers need to be refined by taking both the speckle noise characteristics and the POLSAR data contents into account.

Apart from this, most of the proposed classifiers for POLSAR data operate on a per-pixel basis. The classification outputs based on individual pixels, however, can be unsatisfactory due to the presence of inherent speckle noise and the exclusion of intrinsic spatial neighbourhood information. Hence, a per-segment approach is preferable for POLSAR image classification. Keeping these considerations in mind, a selected hybrid segmentation algorithm is extended in this paper for POLSAR data, followed by a quantitative evaluation of the per-segment approach in improving land cover classification.

This paper is organised as follows: Section 2 presents the extended hybrid segmentation algorithm. In Section 3, the segmentation and classification of POLSAR data are discussed. Concluding remarks are given in Section 4.

2. Extended Hybrid Segmentation Algorithm

A hybrid segmentation algorithm was proposed by Haris et al. (1998) for segmenting optical pictures. Its advantage is that both edge and region information is considered during the segmentation process. There are four main modules in this segmentation algorithm: 1) noise reduction, 2) gradient computation, 3) initial segmentation by morphological watershed transform, and 4) region merging. In this study the hybrid segmentation algorithm was extended for segmenting multi-look POLSAR data, where the four modules become 1) speckle suppression, 2) edge magnitude computation, 3) initial segmentation by morphological watershed transform, and 4) region merging. The modules are executed consecutively and are discussed in more detail below.

2.1 Speckle Suppression

POLSAR data are inherently corrupted by speckle noise. This presence of speckle noise can significantly degrade the performance of image segmentation. Thus, a spatially adaptive speckle filter, which was proposed by Lee and Bretschneider (2007), is employed for speckle suppression. The proposed filter consists of five processing steps, namely 1) edge detection, 2) line detection, 3) texture analysis, 4) classification based on scattering mechanisms, and 5) despeckling. The obtained filtering output is then used as an input for computing edge magnitudes.

2.2 Edge Magnitude Computation

The purpose of this module is to generate an edge magnitude output, which is then used as an input into the subsequent watershed transformation. In this module, the edge detector based on Roy’s largest eigenvalue (Lee, 2009, Chapter 4) is employed to compute edge magnitude from the filtered POLSAR data. The corresponding processing steps are outlined in Lee (2009, p. 43).
The edge magnitude of each pixel, i.e. the maximum Roy’s largest eigenvalue, is computed by using four basic edge templates of 3×3 pixels as shown in Figure 1.

2.3 Initial Segmentation by Watershed Transform

The concept of watershed was first applied by Beucher and Lantuejoul (1979) to image segmentation problems. To date, the use of the morphological watershed transform for segmenting single-frequency single-polarisation SAR data is well recognised and can be found in Fjørtoft et al. (1998), Lemarechal et al. (1998), Li et al. (1999) and others. Its advantages are that (i) it is a natural segmentation approach, (ii) it is independent of any form of probability density function in the segmentation process, and (iii) it requires no user-specified control parameters. However, the drawback is that in most cases it produces an over-segmented result.

The watershed transform contains two steps: sorting and flooding. The input edge magnitude image is considered as a topographic surface, where each pixel value becomes the altitude of the surface. Firstly, all pixels are sorted in increasing order according to their edge magnitude value. The flooding is then carried out based on an immersion simulation. There are two rules in the flooding process: (i) random access to any pixel in the image and (ii) direct access to the neighbours of a given pixel. The water first rises from the minimum magnitude pixels, which are also known as local minima. The flooding process is continued for each magnitude k until the maximum is reached. Each k-magnitude pixel that is adjacent to an already labelled watershed region is added to that region. A new region is constructed for a pixel that is not connected to any existing region.

As mentioned above, the watershed transform tends to produce an over-segmented output due to the presence of spurious local minima. In order to remove these local minima, the edge magnitude image is thresholded prior to the watershed transform:

$$
I(x, y) = \begin{cases} g(x, y) & \text{if } g(x, y) > T \\ T & \text{otherwise} \end{cases} \qquad (1)
$$

The variable $g(x, y)$ denotes the edge magnitude value of a pixel located at coordinate $(x, y)$ and $T$ is the user-defined threshold. The thresholding replaces all lower magnitude values by the uniform threshold $T$, so that insignificant local minima are removed and the small regions they would create are eliminated.

2.4 Region Merging

To further remedy the over-segmentation problem, a modified hierarchical stepwise optimisation (HSWO) algorithm is proposed here. This modified version follows closely the original framework (Beaulieu and Touzi, 2004), except for the use of Roy’s largest eigenvalue as the stepwise criterion. The modified HSWO algorithm involves four processing steps and proceeds iteratively as follows:

Step 1: The segmented output resulting from the watershed transform is used as an initial image partition.

Step 2: For each adjacent pair of regions i and j, compute the stepwise criterion:

$$
SC = \left( \frac{1}{N_i} + \frac{1}{N_j} \right) \cdot \frac{1}{\max\left\{ \mathrm{ch}_1\left(C_i C_j^{-1}\right),\ \mathrm{ch}_1\left(C_j C_i^{-1}\right) \right\}} \qquad (2)
$$

Step 3: Perform the globally best merging. To achieve this, find and merge the spatially adjacent region pair with the maximum stepwise criterion value over the entire image.

Step 4: The process is terminated if no further merging is needed; otherwise, go to Step 2. The termination depends on the preset desired number of regions.

As can be seen in Equation (2), the stepwise criterion consists of two terms. The first term favours the merging of small regions, where $N_i$ and $N_j$ refer to the sizes of the two regions i and j, respectively. The larger the region sizes $N_i$ and $N_j$, the smaller the value of $1/N_i + 1/N_j$. The second term measures the dissimilarity of the two regions. Here $C_i$ and $C_j$ are the average covariance matrices of the regions i and j, respectively. The eigenvalue $\mathrm{ch}_1(C_i C_j^{-1})$ refers to the largest eigenvalue of $C_i C_j^{-1}$ and max denotes the maximum operator. If $C_i$ and $C_j$ are identical, the second term is equal to unity because both $\mathrm{ch}_1(C_i C_j^{-1})$ and $\mathrm{ch}_1(C_j C_i^{-1})$ are unit eigenvalues.
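
As an illustration only, the thresholding of Equation (1) and the stepwise criterion of Equation (2) can be sketched in Python with NumPy. The sketch assumes that Equation (2) is the product of the region-size term and the reciprocal of the larger of the two largest eigenvalues. The function names, the `regions` dictionary and the `adjacency` list are hypothetical and are not taken from the original implementation.

```python
import numpy as np


def threshold_edge_magnitude(g, T):
    """Equation (1): replace every edge magnitude not exceeding T by T."""
    g = np.asarray(g, dtype=float)
    return np.where(g > T, g, T)


def largest_eigenvalue(A):
    """Largest eigenvalue (real part) of a square matrix."""
    return float(np.max(np.linalg.eigvals(A).real))


def stepwise_criterion(C_i, N_i, C_j, N_j):
    """Equation (2), read as size term times similarity term (assumption)."""
    lam_ij = largest_eigenvalue(C_i @ np.linalg.inv(C_j))
    lam_ji = largest_eigenvalue(C_j @ np.linalg.inv(C_i))
    size_term = 1.0 / N_i + 1.0 / N_j
    similarity_term = 1.0 / max(lam_ij, lam_ji)
    return size_term * similarity_term


def best_merge(regions, adjacency):
    """Step 3: return the adjacent pair with the maximum stepwise criterion.

    regions maps a label to (average covariance matrix, pixel count);
    adjacency is a list of (label_i, label_j) pairs of neighbouring regions.
    """
    return max(
        adjacency,
        key=lambda pair: stepwise_criterion(*regions[pair[0]], *regions[pair[1]]),
    )
```

Under this reading, two regions with identical average covariance matrices give a similarity term of one, so the criterion reduces to 1/N_i + 1/N_j and the smallest similar neighbours are merged first.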
For $C_i \neq C_j$, in contrast, the second term of Equation (2) is less than unity.

3. Experiment and Discussion

In this section the details of the NASA/JPL POLSAR data and the simulated POLSAR data are first given, followed by a discussion of the segmentation and classification results.

3.1 Experimental Data

The nine-look dual-frequency NASA/JPL POLSAR data (CCT ID: CM6419) were acquired on 19th September 2000 using the AIRSAR imaging system onboard a DC-8 research aircraft. The aircraft flew at approximately 8 km altitude during the PACRIM-2 science mission. The look angles were 26° and 62.5°, corresponding to the near and far ranges of 9 km and 17 km, respectively. Only the C- and L-band data were examined in this study since the P-band suffers from undesirable radio frequency interference.

The test data cover a coastal plain in the north-west of Peninsular Malaysia, which is situated at Kuala Muda in Kedah state. It is bounded between 5° 36’ N and 5° 44’ N latitude as well as between 100° 23’ E and 100° 29’ E longitude. Being an active agricultural zone, large portions of the area are irrigated land, which is cultivated mainly with rice paddy crops. Both rubber and oil palm trees are planted on a moderate scale. Mangrove forests are found along both banks of Sungai Merbok.

In order to evaluate effectively the performance of the extended hybrid segmentation algorithm, four multi-look POLSAR datasets were simulated following the procedure suggested by Lee et al. (1994, see Appendix). These datasets included four-look C-band, four-look L-band, nine-look C-band, and nine-look L-band. Each simulated set consists of 400 columns and 150 rows. Figure 2 presents the simulated POLSAR data containing four different land cover classes, namely rubber, oil palm, scrub-grassland, and rice paddy.

3.2 Segmentation

The capabilities of the extended hybrid segmentation algorithm were first examined using the multi-look simulated POLSAR datasets and then with the NASA/JPL POLSAR data.
As shown in Figure 3, the extended hybrid segmentation algorithm produced satisfactory results, where all simulated datasets were successfully segmented into the desired eight regions. It was observed that the region boundaries were slightly better delineated in the L-band compared with the C-band. In addition, an increasing number of looks helped to improve the segmentation results.

For the NASA/JPL POLSAR test data, five different region numbers were heuristically chosen and examined, i.e. 10000, 25000, 50000, 100000 and 250000. Figures 4 and 5 show subsets of selected C- and L-band outputs, respectively. It was found that the segmentation outputs relied strongly on the user-specified number of regions. As indicated by the yellow arrows in Figures 4 and 5, segmentation errors occurred in both the C- and L-band outputs when the desired number of regions was set to 10000. From the segmentation outputs, it was noticed that some pixels in the oil palm plantation area were wrongly merged into the same region as the neighbouring agricultural land. The oil palm plantation area is labelled with “A”, whereas the agricultural land is given the label “B” in the figures.

3.3 Classification

In n-look POLSAR data, each pixel consists of a 3×3 Hermitian polarimetric covariance matrix:

$$
C = \begin{bmatrix}
\langle |S_{HH}|^2 \rangle & \sqrt{2}\,\langle S_{HH} S_{HV}^{*} \rangle & \langle S_{HH} S_{VV}^{*} \rangle \\
\sqrt{2}\,\langle S_{HV} S_{HH}^{*} \rangle & 2\,\langle |S_{HV}|^2 \rangle & \sqrt{2}\,\langle S_{HV} S_{VV}^{*} \rangle \\
\langle S_{VV} S_{HH}^{*} \rangle & \sqrt{2}\,\langle S_{VV} S_{HV}^{*} \rangle & \langle |S_{VV}|^2 \rangle
\end{bmatrix} \qquad (3)
$$

where $S_{rt}$ denotes the scattering element of the received polarisation r and transmitted polarisation t, the asterisk denotes complex conjugation and $\langle \cdot \rangle$ denotes averaging over the n looks. The subscripts H and V represent horizontal and vertical polarisations, respectively. Goodman (1963) showed that a matrix $A = nC$ obeys a complex Wishart distribution. Based on this result, Lee et al. (1994) derived a distance measure and introduced the so-called supervised complex Wishart classifier for multi-look POLSAR data. The distance measure is defined by

$$
d(C, C_m) = \ln |C_m| + \mathrm{tr}\left(C_m^{-1} C\right). \qquad (4)
$$
The matrix $C$ is the covariance matrix for a candidate pixel p, while $C_m$ is the average covariance matrix of a target class m. The operators $|\cdot|$ and tr denote the determinant and trace of a matrix, respectively. The pixel p is assigned to the nearest class, i.e. the class with the smallest distance. By assuming that the POLSAR data of different frequencies are statistically independent, the distance measure in Equation (4) can be generalised for classifying multi-frequency POLSAR data (Lee et al., 1994). The generalised distance measure is given as follows:

$$
d(C_n, C_{m,n}) = \sum_{n=1}^{N} \left[ \ln |C_{m,n}| + \mathrm{tr}\left(C_{m,n}^{-1} C_n\right) \right], \qquad (5)
$$

where $C_n$ is the covariance matrix for a candidate pixel p in the n-th frequency band. The matrix $C_{m,n}$ refers to the average covariance matrix of a target class m in the n-th frequency band. The variable $N$ denotes the total number of frequency bands.

In this study the complex Wishart classifier was employed to classify dual-frequency fully polarimetric SAR data and single-frequency fully polarimetric SAR data. The supervised classification was performed on per-pixel and per-segment bases. The inputs for the per-pixel classification were both the unfiltered and filtered C- and L-band data. The filtered C- and L-band data resulting from the spatially adaptive filter (Lee and Bretschneider, 2007) with a 7×7 window were selected for the experiments. In the per-segment classification, the segmented outputs from the hybrid segmentation algorithm with 100000 target regions were tested for each frequency band. The covariance matrix of each pixel in the segmented output was the average covariance matrix of the region to which the pixel belongs. Figures 6 and 7 show the per-pixel and per-segment classification outputs. In the following, the discussion on the classification results is divided into three parts: 1) assessment of classification accuracies, 2) effect of radar frequencies, and 3) comparison between per-pixel and per-segment approaches.
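
The distance measures of Equations (4) and (5) and the per-segment averaging described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the implementation used in the experiments. It assumes that every covariance matrix is a 3×3 Hermitian positive definite array; the function names and the `class_centres` dictionary are hypothetical.

```python
import numpy as np


def wishart_distance(C, C_m):
    """Equation (4): d(C, C_m) = ln|C_m| + tr(C_m^{-1} C)."""
    _, logdet = np.linalg.slogdet(C_m)  # log of |det(C_m)|, numerically stable
    return float(logdet + np.trace(np.linalg.solve(C_m, C)).real)


def multi_frequency_distance(C_bands, C_m_bands):
    """Equation (5): sum of the Equation (4) distances over all bands."""
    return sum(wishart_distance(C, C_m) for C, C_m in zip(C_bands, C_m_bands))


def classify(C_bands, class_centres):
    """Assign the class with the smallest distance.

    class_centres maps a class label to a list holding one average
    covariance matrix per frequency band.
    """
    return min(
        class_centres,
        key=lambda m: multi_frequency_distance(C_bands, class_centres[m]),
    )


def segment_average(covariances, labels, segment):
    """Average covariance matrix of all pixels carrying one segment label.

    covariances has shape (rows, cols, 3, 3); labels has shape (rows, cols).
    """
    return covariances[labels == segment].mean(axis=0)
```

For single-frequency data the lists passed to `classify` hold one matrix each; for the dual-frequency case they hold the C-band and L-band matrices. In a per-segment run, the output of `segment_average` would replace the covariance matrix of every pixel in that segment before classification.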

3.3.1 Assessment of Classification Accuracies

To evaluate the classification outputs, the existing topographical and land use maps as well as optical remotely sensed images (i.e. Landsat-5 TM, MASTER and ASTER data) were used for locating test samples. Fieldwork was carried out in order to validate the suitability of the test samples. Table 1 tabulates the computed overall accuracies and Kappa statistics. As expected, the classification results using the unfiltered data were relatively poor. The overall accuracies of the C- and L-band were 35.09% and 42.16%, respectively. Improved results were obtained by using the filtered data, where the overall accuracies were 62.78% and 66.95% for the C- and L-band, respectively. A further improved accuracy of 74.82% was attained by using a dual-band input, i.e. the combination of the C- and L-band. The increase in accuracy was due to the complementary information of the C- and L-band for classifying the mangrove forest and rice paddy classes. In the C-band alone, the mangrove forests were found to be largely misclassified into the rubber category, while the rice paddy fields were mistakenly classified as water bodies in the L-band.

In the experiments, some difficulties were encountered in classifying the diversified cropland, dryland forest, fish farm, mudflat and road segments. The classification difficulties were mainly caused by the poor separation between these and other land cover classes. In the C-band, the diversified cropland was erroneously classified into the rubber category, while it was assigned to the mangrove forest and oil palm categories in the L-band. The dryland forest was wrongly grouped into the mangrove forest and rubber categories in the C-band, whereas it was grouped only into the rubber category in the L-band. The fish farms were improperly classified as bareland and marshland in the C- and L-band, respectively.
For the road segments, misclassification into the bareland category was observed. The mudflat was misclassified as rice paddy fields in the C-band and as water bodies in the L-band.

3.3.2 Effect of Radar Frequencies

From the results, it was found that the effectiveness of POLSAR data for land cover classification depends on the radar frequency. For example, the rice paddy fields were better classified in the C-band than in the L-band. The rice crops became transparent in the L-band imaging because the crop heights were shorter than the L-band wavelength (i.e. 24 cm). The scattering over the rice paddy fields in the L-band was almost entirely contributed by the water surface. Hence, the rice paddy fields were partially misclassified as water bodies. For the mangrove forests, a better classification result was obtained using the L-band data. With the longer wavelength, the L-band showed a stronger radar penetration in both mangrove forests and rubber plantation areas. Consequently, the L-band provided a more distinct discrimination between mangrove forests and rubber plantation areas.

3.3.3 Comparison between Per-pixel and Per-segment Classification Approaches

It is well known that per-pixel classification approaches tend to produce classification outputs with salt-and-pepper effects over homogeneous areas. To mitigate these effects, per-segment approaches or post-classification operations are normally employed. In this study, the salt-and-pepper effects were observed in the C- and L-band per-pixel classification results. The per-segment classification approach was found to improve the C-band classification accuracy. The overall accuracy of the C-band per-pixel output was 62.78%, while it increased to 68.01% for the per-segment classification. For the L-band classification outputs, only a slight increase in accuracy was observed.
An overall accuracy of 66.95% was obtained from the per-pixel L-band result, while the computed overall accuracy was 68.47% for the per-segment result. As expected, the use of the dual-band inputs in the per-segment classification produced the best result, with an overall accuracy of 76.60% (Table 1).

4. Conclusions

In this study an existing hybrid segmentation algorithm, which uses both edge and region information, was extended for multi-look POLSAR data. Applied to NASA/JPL POLSAR C- and L-band data, the segmentation outputs were found to rely on the user-defined region number, which is employed as the termination rule. In land cover classification, an improved accuracy was attained by using the dual-frequency input. As expected, the comparison between per-pixel and per-segment approaches showed that the latter improved the classification accuracy, where the salt-and-pepper effects were reduced significantly in the classification outputs.

References

Beaulieu, J.-M. and Touzi, R. (2004). Segmentation of textured polarimetric SAR scenes by likelihood approximation. IEEE Transactions on Geoscience and Remote Sensing, 42(10), 2063-2072.

Beucher, S. and Lantuejoul, C. (1979). Use of watersheds in contour detection. Proceedings of International Workshop on Image Processing: Real-time Edge and Motion Detection / Estimation, 2.1-2.12.

Boerner, W.-M., Mott, H., Lüneburg, E., Livingstone, C., Brisco, B., Brown, R. J., Paterson, J. S., Cloude, S. R., Krogager, E., Lee, J. S., Schuler, D. L., van Zyl, J. J., Randall, D., Budkewitsch, P., and Pottier, E. (1998). Polarimetry in radar remote sensing: basic and applied concepts. In: Manual of Remote Sensing – Principles and Applications of Imaging Radar, edited by Henderson, F. M. and Lewis, A. J., 3rd ed., John Wiley, New York, 271-357.

Fjørtoft, R., Lopès, A., Marthon, P., and Cubero-Castan, E. (1998). An optimal multiedge detector for SAR image segmentation. IEEE Transactions on Geoscience and Remote Sensing, 36(3), 793-802.

Goodman, N. R. (1963). Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction). Annals of Mathematical Statistics, 34(1), 152-177.

Haris, K., Efstratiadis, S. N., Maglaveras, N., and Katsaggelos, A. K. (1998). Hybrid image segmentation using watersheds and fast region merging. IEEE Transactions on Image Processing, 7(12), 1684-1699.

Lee, J. S., Grunes, M. R., and Kwok, R. (1994). Classification of multi-look polarimetric SAR imagery based on complex Wishart distribution. International Journal of Remote Sensing, 15(11), 2299-2311.

Lee, K. Y. (2009). Polarimetric synthetic aperture radar image processing for land cover classification. Ph.D. thesis, Nanyang Technological University.

Lee, K. Y. and Bretschneider, T. (2007). Spatially adaptive despeckling for multi-look polarimetric synthetic aperture radar imagery. CDROM Proceedings of the 28th Asian Conference on Remote Sensing, Kuala Lumpur, Malaysia.

Lemarechal, C., Fjørtoft, R., Marthon, P., Cubero-Castan, E., and Lopes, A. (1998). SAR image segmentation by morphological methods. SPIE Proceedings, vol. 3497, 111-121.

Li, W., Benie, G. B., He, S., Wang, D.-C., Ziou, D., and Gwyn, Q. H. J. (1999). Watershed-based hierarchical SAR image segmentation. International Journal of Remote Sensing, 20(17), 3377-3390.
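
The overall accuracy and Kappa statistic reported in Table 1 are summary measures of a confusion matrix built from the test samples. The paper does not state how they were computed, so the following Python sketch assumes the conventional definitions (proportion of correctly classified samples and Cohen's kappa). The function name and the small matrix in the usage lines are hypothetical and do not reproduce the experimental data.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    confusion[i][j] is the number of test samples of reference class i
    that the classifier assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the main diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement from the row and column marginal totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example, for illustration only.
accuracy, kappa = overall_accuracy_and_kappa([[40, 10], [20, 30]])
print(f"overall accuracy = {accuracy:.2%}, kappa = {kappa:.4f}")
```

The Kappa statistic discounts the agreement expected by chance, which is why it is lower than the overall accuracy in every column of Table 1.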

Table 1: Assessment of per-pixel and per-segment classification results. Class rows and overall accuracy give the percentage of test samples correctly classified; the Kappa statistic is dimensionless.

Class / measure   Per-pixel      Per-pixel      Per-pixel      Per-pixel      Per-pixel      Per-segment    Per-segment    Per-segment
                  unfiltered     unfiltered     filtered       filtered       filtered       C-band         L-band         C- and L-band
                  C-band         L-band         C-band         L-band         C- and L-band
Bareland          27.42          32.21          33.11          40.02          56.74          52.39          35.78          83.39
Bridge            53.25          72.19          67.46          84.02          91.12          71.01          82.25          86.39
Built-up          80.97          66.85          85.33          79.90          87.90          88.37          83.96          96.36
Mangrove forest   12.76          45.65          44.05          84.46          89.10          50.06          92.77          91.52
Marshland         36.45          36.17          78.03          74.15          92.38          77.89          87.94          94.39
Oil palm          30.20          45.77          82.07          93.72          95.65          94.36          99.62          100.00
Rice paddy        74.20          19.16          85.46          29.36          59.77          86.35          18.21          58.36
Rubber            27.80          69.48          81.53          99.69          99.88          89.95          99.93          100.00
Scrub-grassland   65.52          47.12          99.59          57.97          83.65          100.00         62.36          85.02
Water body        99.77          99.52          100.00         99.81          99.94          100.00         100.00         100.00
Overall accuracy  35.09          42.16          62.78          66.95          74.82          68.01          68.47          76.60
Kappa statistic   0.2742         0.3403         0.5685         0.6144         0.7046         0.6281         0.6323         0.7251

Figure 1. 3×3 edge templates with different orientations: (a) 0°, (b) 45°, (c) 90° and (d) 135°.

Figure 2. Simulated POLSAR data. (a) and (b) are the four- and nine-look C-band, while the four- and nine-look L-band are given in (c) and (d). For each image, the HH, HV and VV intensities are displayed in the RGB colour space. The areas of scrub, rice paddy, oil palm and rubber are labelled as A, B, C and D, respectively.

Figure 3. Results from the extended hybrid segmentation algorithm. Panels: four-look C-band, nine-look C-band, four-look L-band and nine-look L-band.

Figure 4. C-band outputs resulting from the extended hybrid segmentation algorithm. Panels: watershed transform input, 100000 regions, 50000 regions and 10000 regions. Note that each display is only part of the entire processed output.

Figure 5. L-band outputs resulting from the extended hybrid segmentation algorithm. Panels: watershed transform input, 100000 regions, 50000 regions and 10000 regions. Note that each display is only part of the entire processed output.

Figure 6. Complex Wishart classification results. Per-pixel classification outputs of (a) C-band unfiltered data, (b) C-band filtered data, (d) L-band unfiltered data, and (e) L-band filtered data. Per-segment classification outputs of (c) C- and (f) L-band segmented data. Please refer to Figure 7 for the legend.

Figure 7. Complex Wishart classification results obtained from NASA/JPL POLSAR dual-frequency data. (a) and (b) are, respectively, per-pixel and per-segment classification outputs. The legend is shown alongside the panels.