Characterization of homesteads in India using C band Synthetic Aperture Radar data

Varunika Jain, C Patnaik and S Panigrahy
EPSA Space Applications Centre, Ahmedabad, INDIA
Contact details: cpatnaik@sac.isro.gov.in Phone: 91 79 26914037 Fax: +91 79 26915846

ABSTRACT

The rural settlements in India are complex in composition, consisting of built-up area and gardens around homes with miscellaneous trees. These homesteads, which are basically settlements, have an impact on the agrarian set-up. The homestead pattern and composition in India is undergoing rapid change due to population pressure. With an expanding population, agricultural land is being used up for these settlements. This paper suggests a methodology to map these homesteads. An attempt has been made to characterize them based on vegetation density as a function of backscatter. Multi-temporal Radarsat-2 SAR Wide 2 beam data has been used and the role of dual polarization SAR data has been explored. The cross polarization ratio has been used along with temporal HH data. Results are encouraging and show potential to map and characterize homesteads in India.

Keywords: Homesteads, settlements characterisation, backscatter, cross polarization, Vegetation

1. Introduction:

In the Indian landscape, one encounters two distinct types of rural settlement pattern. In many parts, rural life centres on the nucleated village, a compact collection of dwellings. In other parts, the so-called “village” area may contain separate clusters of homesteads. The term “homestead” has been variously defined. A homestead means a dwelling house and the land on which it stands, together with any attached garden, orchard or outbuildings used for the purpose of horticulture or agriculture, and any tank (Dey, 1993).
In a subsistence economy, a sizable part of family income in real terms comes from the homestead, which plays a significant role in family nutrition and social status (Marsh, 1998; Hanstad and Lokesh, 2002). In the Indian scenario, homestead trees contribute significantly to total tree cover. Homestead trees now form one of the nine categories of “Trees outside Forest” surveyed by the Forest Survey of India. The homestead pattern and composition in India is undergoing rapid change due to population pressure. As the population increases, new households are formed, either annexed to an existing homestead or established as new homesteads in agricultural fields. Preference for proximity to roads, canals or other utility services modulates the homestead pattern in an area. Knowledge of the spatial pattern of homesteads is thus essential for studying these changes. In this context, satellite remote sensing data, particularly microwave data, has a significant role to play. The sensitivity of radar data to dielectric and geometric properties has been found to be an advantage in the detection of settlements. The agglomeration of di- and trihedral corner reflectors in urban environments makes these regions stand out as clusters of more or less bright signal returns in radar data. This effect has been used for monitoring the urban footprint (Haack, 1984; Henderson, 1995; Henderson and Xia, 1997). However, the highly variable nature of the urban landscape increases the complexity of the interactions between urban features and radar signals. The effects of some landscape factors on the appearance and interpretability of radar images in general have been reviewed by Fung and Ulaby (1983) and Simonett and Davis (1983).
Both theoretical considerations and empirical observations indicate that cross-polarized imagery is less susceptible to the specular return from dihedral and trihedral reflectors (cardinal effect) that is apparent on like-polarized imagery. On like-polarized imagery, similar land use types often appear dissimilar, or vice versa. However, numerical analysis of pixel values and the use of automated or manual interpretation strategies have led to mixed conclusions as to the preferred (like or cross) polarization. Textural information from polarised and/or multifrequency radar data has proven particularly valuable for urban studies. In general, large collections of structures with little or no vegetation are clearly visible in HH polarization, while HV polarization is preferred for analyzing the other land uses within the urban area, particularly vegetated areas. In recent years, object-oriented approaches have been investigated (Hofman, 2001; Thiel et al., 2008). These more sophisticated techniques make it possible to describe and utilise the geometric, textural and especially contextual properties of real-world objects in the classification process (Henderson and Xia, 1998). However, most of the earlier studies were experimental in nature and based on data available on opportunity. This paper presents the results of a study on the delineation of rural settlements/homesteads using multidate C band SAR data.

2. Study area and Data used:

The study area was spread over five blocks (Ketugram I, Ketugram II, Katwa I, Katwa II and Mangalkot) in rural Bardwan district, West Bengal, in eastern India. Radarsat-2 Wide 2 Synthetic Aperture Radar data acquired in HH and HV polarization were used (Table-1). Data from three dates, May 05, May 29 and June 22, 2010, were used.

3.
Methodology:

3.1 Data set preparation

The processing of multi-temporal Radarsat data for this study involved: i) data download; ii) speckle reduction; iii) data calibration; iv) image georeferencing; v) multidate image co-registration. Work was carried out using PCI Geomatica image processing software. The detailed steps are:

Data Download: The acquired SAR data was in 16-bit .tiff format. It was imported along with the ephemeris data. The Ground Control Points (GCPs) of the respective scenes are stored in a separate .xml file called Product.xml and the calibration coefficients are stored in another .xml file called lutSigma.xml. XML parsers were developed to extract the GCPs and the calibration coefficients from the respective files.

Speckle was suppressed using the Enhanced Lee filter with a 5 × 5 window. After speckle filtering, the gain scaling values provided across the range were used to calibrate the data to backscatter coefficient, so that the feature-wise backscatter could be quantified. The equation used for calibration is as follows:

$$\sigma^{0}\ (\mathrm{dB}) = 10 \log_{10}\left[\frac{DN^{2} + A_{0}}{A_{j}}\right]$$

where $DN$ is the amplitude value from the raw data, $A_{0}$ is the offset value (usually 0) and $A_{j}$ is the gain scaling coefficient. The incidence angle component is built into the gain scaling coefficient and hence is not computed separately. The calibrated data was georectified and used for analysis.

Image co-registration: The calibrated data need to be stacked together to generate multi-temporal data for use in the study. A second order polynomial with cubic convolution resampling was adopted for this. The output co-registered and georectified image contained the data of the three dates at 10 meter pixel spacing.

3.2. Classification

Rationale of Approach

Settlements are composed of combinations of natural elements and elements of the built environment of the cultural landscape, constructed from a variety of bio- and geophysical materials (wood, mud, concrete, metal and stone).
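For reference, the calibration equation of Section 3.1 can be sketched in code. The following is a minimal illustration in Python, not the processing chain used in the study (which relied on PCI Geomatica and purpose-built XML parsers). The function names `sigma0_db` and `calibrate_row` are hypothetical, and the gain values are assumed to have already been read from lutSigma.xml, one per range sample.

```python
import math


def sigma0_db(dn, gain, offset=0.0):
    """Convert one amplitude digital number (DN) to sigma nought in dB.

    Implements sigma0 = 10 * log10((DN^2 + A0) / Aj), where `offset` is A0
    and `gain` is the gain scaling coefficient Aj for this range sample.
    """
    if gain <= 0:
        raise ValueError("gain scaling coefficient must be positive")
    power = (dn ** 2 + offset) / gain
    if power <= 0:
        raise ValueError("non-positive power; the logarithm is undefined")
    return 10.0 * math.log10(power)


def calibrate_row(dn_row, gain_lut, offset=0.0):
    """Calibrate one image row (assumption: one gain value per range sample)."""
    if len(dn_row) != len(gain_lut):
        raise ValueError("row and gain table must have the same length")
    return [sigma0_db(dn, g, offset) for dn, g in zip(dn_row, gain_lut)]
```

With an offset of 0, for example, a DN of 100 and a gain of 1000 give a backscatter of 10 dB.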
These built landscapes give rise to more or less bright signal returns. This effect can be used to map settlements. However, many environmental variables affect this bright signal, viz. the type, amount and pattern of vegetation cover, adjoining land cover classes and patterns, and seasonal changes. This constraint can be overcome to some extent by using multi-date data to discriminate the invariant nature of settlements from the surrounding land cover (agricultural crops, water, fallow fields etc.). In addition, scale and image processing and enhancement techniques can also have significant effects on the detectability of settlements and the accuracy of mapping.

Formation of decision rules

Ground-truth locations of homesteads/settlements derived using high resolution optical data were used to form the decision rule. The backscatter signature of settlements/homesteads in single and multi-date data was analyzed. Single date and multi-date combinations were evaluated and the best set of data was used for classification. The accuracy of classification was further evaluated using a blind site approach.

3.3 Characterization based on vegetation proportion/density

The cross polarization ratio (XPR, i.e. HH/HV) gives an indication of the type of vegetation present in a given area. The XPR was computed for the first acquisition only, as vegetation under homesteads does not change over the short interval of around 50 days within which all the acquisitions took place. After demarcating the homesteads from the rest of the image, the XPR values were used to categorize the vegetation within these homesteads.

3.4 Final map and statistics

The settlements and homesteads were segregated based on area in hectares and 5 classes were generated. The classes catered to the small, medium and large size of homesteads and their individual components. The frequency of each class was also noted.

4. Results and discussion:

The response of homesteads to SAR could be put under two broad categories viz.
i) core man-made structural area and ii) vegetation interspersed with the man-made structures. This vegetation may be sparse (SV) or dense (DV). Preliminary analysis of the data set showed that in HH polarization, the settlements and the core area of homesteads showed high backscatter and were clearly identifiable from other classes. This was attributed to the double bounce effect caused by the dihedral corner reflector-like properties of the building materials of huts, farm sheds and other constructed objects. However, the same was not observed in HV data, as the backscatter of all land cover classes overlapped. Fig 1 shows the single date HH and HV data over a part of the study area, highlighting this difference. Further analysis using known homestead boundaries showed that areas surrounded by vegetation, or dwellings interspersed with home gardens/orchards, showed a lower backscatter range and were thus mixed with other classes such as agriculture and scrub land. In homestead areas with dense vegetation (sparse built-up), the signal is dominated by vegetation. The core area of settlements, composed mostly of built-up structures, gave a consistently high backscatter of > -1 dB on all dates (Fig.2). In the fringe areas interspersed with vegetation, or homesteads with sparse vegetation, the backscatter varied from -2 to -4 dB, and in homesteads with dense vegetation it varied from -4 to -7 dB. Table-2 shows the mean and standard deviation of backscatter for the different settlement categories in the three date SAR data. The core settlements appeared very bright in a two/three date color composite image (HH polarization) due to high backscatter on all dates (Fig. 3). HV data showed overlap of signatures in all classes and was therefore not found suitable for classification; only HH data was used. Use of any single date data resulted in poor classification accuracy for homesteads.
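The backscatter ranges reported above can be turned into a simple multi-date rule. The sketch below, in Python, only illustrates how such a rule might look. It is not the decision rule used in the study, which is not spelled out in the text. The function name `classify_pixel`, the use of the multi-date mean for the vegetated classes, and the treatment of values outside the reported ranges as "other" are all assumptions.

```python
def classify_pixel(hh_db):
    """Assign a settlement category from multi-date HH backscatter (dB).

    hh_db: sequence of HH backscatter values, one per acquisition date.
    Thresholds follow the ranges reported in the text; combining the dates
    through their mean is an assumption made for this illustration.
    """
    if not hh_db:
        raise ValueError("at least one acquisition is required")

    # Core settlement: high backscatter (> -1 dB) on every date.
    if all(value > -1.0 for value in hh_db):
        return "core"

    mean_db = sum(hh_db) / len(hh_db)

    # Fringe areas / homesteads with sparse vegetation: -2 to -4 dB.
    if -4.0 <= mean_db <= -2.0:
        return "sparse"

    # Homesteads with dense vegetation: -4 to -7 dB.
    if -7.0 <= mean_db < -4.0:
        return "dense"

    # Anything outside the reported ranges is left to other classes.
    return "other"
```

Applied to the class means of Table-2, this rule returns "core", "dense" and "sparse" for the three settlement categories. Since the vegetated homestead ranges overlap with agriculture and scrub, a rule of this kind cannot by itself separate those classes.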
Use of two date data (HH polarization) resulted in around 70 per cent classification accuracy, which improved to around 85 per cent with three date data (Fig.3, Table-3).

4.1 Improvement in mapping using GIS

The pattern of homesteads/settlements classified using SAR data consisted mainly of isolated clusters, as the continuity was disrupted by misclassification of vegetated areas. Similarly, the fringe areas of core settlements with homestead vegetation were mixed with adjoining agricultural fields. Thus, it was essential to include these areas while mapping. This was carried out using a GIS buffer approach through proximity analysis. From a large sample of training sites, it was found that the mean distance of the fringe area of homesteads from core settlements was around 200 meters. This was taken as the threshold distance to buffer the core clusters. The overlapping buffers were then merged into a single cluster. After merging, the threshold distance was subtracted from the single cluster. This resulted in an irregular cluster that retained the peripheral areas, and its boundary matched well with that derived using high resolution optical data (Fig. 4, 5).

Mean and standard deviation of backscatter from homesteads were compared for the three acquisitions, and similar values were observed for all three. Therefore, to characterize homesteads based on vegetation density, HV data of a single date was used. The difference between HH and HV backscatter values in dB, which is equivalent to the cross polarization ratio (HH/HV) in linear units, was used to characterise vegetation density. Dense vegetation was assigned limits of 6 dB to 10 dB. Medium dense vegetation falls into two ranges, with limits of 2 to 6 dB and 10 to 15 dB. Sparse vegetation was segregated using limits of 15 to 23 dB. These limits were applied within the settlement boundary, giving a qualitative categorization of homestead vegetation.
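The XPR limits given above amount to a small lookup. A minimal sketch in Python follows. It assumes the XPR is taken as the HH minus HV difference in dB and that each range includes its lower limit but not its upper limit, since the text does not state how boundary values were assigned. The function names are hypothetical.

```python
def xpr_db(hh_db, hv_db):
    """Cross polarization ratio in dB: HH/HV in linear units, HH - HV in dB."""
    return hh_db - hv_db


def vegetation_class(xpr):
    """Map an XPR value (dB) to a vegetation density category.

    Limits are those quoted in the text; the half-open boundaries and the
    'unclassified' fallback are assumptions made for this illustration.
    """
    if 6.0 <= xpr < 10.0:
        return "dense"
    if 2.0 <= xpr < 6.0 or 10.0 <= xpr < 15.0:
        return "medium"
    if 15.0 <= xpr <= 23.0:
        return "sparse"
    return "unclassified"
```

For instance, a homestead pixel with HH of -5 dB and HV of -13 dB has an XPR of 8 dB and falls in the dense category under these assumptions.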
Analysis showed that around 25 per cent of the area was under sparse vegetation and 12 per cent under dense vegetation; the remainder fell into the core area.

5. Conclusion

The rural settlements in India are complex in composition, consisting of built-up area, home gardens with miscellaneous trees, small orchards etc., and are referred to as homesteads. The core settlement area, with built-up structures causing double bounce scattering in HH polarization data, gave high backscatter on all dates. However, no such specific signature was observed in HV polarization data. The accuracy of identifying settlements improved with two or three date data (HH) by including settlements with sparse vegetation. However, densely vegetated settlement areas were found to have poor classification accuracy. This resulted in isolated clusters of settlements. Thus, a GIS based buffer technique was developed to include these areas and merge isolated core settlement patches to reconstitute the homestead boundary. A buffer of 200 meters was found to produce the best result when compared with the base map prepared using high resolution optical data. This study shows that Radarsat SAR Wide 2 beam data, with its large swath, has the potential to be used for large area mapping of rural settlements/homesteads in India. It has an advantage over high resolution optical data in terms of its large swath and calibrated backscatter signature, which allows a common algorithm to be used to detect these features. The HV data, though not suitable for identification of the settlements, was found to be useful for characterizing the homesteads on the basis of vegetation density, by using the cross polarisation ratio. However, further investigation is required to quantify this aspect.

6. Acknowledgements:

Authors are grateful to Dr M Chakraborty, Group Director, ATDG/SAC for his guidance. The Radarsat data used in this study were made available courtesy of the national project FASAL.

7.
References:

Ahmad N, 1956, The pattern of rural settlement in east Pakistan, Geographical Review, 48 (3)

Dey, N. K., 1993, Homesteads and Orchards in India, Mittal Publications, New Delhi (ISBN 81-7099-510-8), pp 3-5.

Fung A K and F. T. Ulaby, “Matter-energy interaction in the microwave region,” Manual of Remote Sensing, 2nd Edition, D. S. Simonett and F. T. Ulaby, Eds., American Society for Photogrammetry, Bethesda, Maryland, 1983, ch. 4, pp. 115–164.

Haack, B.N., 1984. L- and X-Band Like- and Cross-Polarized Synthetic Aperture Radar for Investigating Urban Environments. Photogrammetric Engineering and Remote Sensing, 50(3), pp. 331-340.

Hanstad T and Lokesh S.B., 2002, Homestead Plots as Land Reform: Analysis from West Bengal, RDI (Rural Development Institute) Reports on Foreign Aid and Development, No. 115, 10-11

Henderson, F. M., 1995, “An analysis of settlement characterization in central Europe using SIR-B radar imagery,” Remote Sensing of Environment, vol. 54, no. 1, pp. 61–70.

Henderson, F.M. & Xia, Z., 1997. SAR applications in human settlement detection, population estimation and urban land use pattern analysis: a status report. IEEE Transactions on Geoscience and Remote Sensing, 35(1), pp. 79-85.

Henderson, F.M., Xia, Z.G., 1998. Radar Applications in Urban Analysis, Settlement Detection and Population Analysis. Principles and Applications of Imaging Radar (F.M. Henderson and A.J. Lewis, eds.), Chapter 15. New York, pp. 733-768.

Hofman, P., 2001. Detecting urban features from IKONOS data using an object oriented approach. In: Remote Sensing & Photogrammetry Society (Ed.): Proceedings of the First Annual Conference of the Remote Sensing & Photogrammetry Society, 28–33.

Marsh, R, 1998, Building on Traditional Gardening to Improve Household Food Security, Food, Nutrition and Agriculture No. 22, at 11 (Food and Agriculture Organization 1998), available at tp://ftp.fao.org/docrep/fao/X0051t/X0051t02.pdf6

Simonett D. S and R. E.
Davis, “Image analysis—Active microwave,” Manual of Remote Sensing, D. S. Simonett and F. T. Ulaby, Eds., 2nd Edition, American Society for Photogrammetry, Bethesda, Maryland, ch. 24, 1983, pp. 1125–1181.

Thiel, M., Esch, T. & Schenk, A., 2008. Object-Oriented Detection of Urban Areas from TerraSAR-X Data. In The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences.

Table-1 Specifications of SAR data used

Beam Position            W2
Data type                16 bits
Beam mode                Descending (0600 AM local time)
Incidence angle range    30-39 (degrees)
Polarisation             HH, HV
Pixel Spacing            12 m
Product type             SGF
Swath                    150*150 km

Table-2 Mean backscatter of different settlement classes in three date SAR data

Site no.  Date of Acquisition (2010)  Feature                           Mean (dB)  Std. Devn (dB)
1         May 05                      Core Settlement Area                0.54       1.52
1         May 29                      Core Settlement Area                2.22       1.77
1         June 22                     Core Settlement Area                2.61       1.93
2         May 05                      Homestead with dense Vegetation    -5.2        0.93
2         May 29                      Homestead with dense Vegetation    -4.26       0.94
2         June 22                     Homestead with dense Vegetation    -5.39       0.59
3         May 05                      Homestead with sparse vegetation   -3.57       1.26
3         May 29                      Homestead with sparse vegetation   -2.43       1.49
3         June 22                     Homestead with sparse vegetation   -2.67       2.16

Table-3 Confusion matrix of training site pixels showing the classification accuracy of the homestead classes in three date HH polarization data

Feature             Null  Core Settlement  Homestead sparse  Homestead dense  Water  Agriculture  Scrub
Settlement           2.0       90.6              6.0               1.4          0.0      0.0        0.0
Homestead – sparse   4.9       22.4             27.0              36.3          0.0      4.9        4.5
Homestead – dense    0.4        0.9             11.6              72.5          0.0      1.7       12.8
Water                1.4        0.0              0.0               0.0         98.6      0.0        0.0
Agriculture          1.0        0.0              1.0               1.9          0.0     88.9        7.3
Scrub                1.4        0.0              1.4               5.0          0.0      5.3       86.8

Table- 4: Size-wise distribution pattern of settlement/homesteads in the study area derived using SAR data

Area (Ha)    <1     1-10    10-50   50-100   >100
Frequency    53     147     96      13       2
%            17.0   47.3    30.8    4.2      0.7

Fig.1. Single date HH and HV polarization data over a part of the study area showing backscatter of settlement/homesteads. [Panels: HH polarisation and HV polarisation, with settlements marked.]

Fig.2 Mean backscatter of settlement (core), homesteads (dense vegetation-DVA) and sparse vegetation (SVA) in three date SAR data. [Plot: backscatter in dB against acquisition number for Core, DVA and SVA.]

Fig.3. (a) and (b) showing two and three date color composite of HH data. The core settlement area appear white due to high backscatter in all the dates, (c) three date classified image; the magenta color indicates the pixels that were not classified in two date data, while present in three date data.

Fig. 4: (a) Three date FCC of HH polarization data and (b) the homestead map derived before and after application of buffer (green and red color).

Fig 5: The final settlement boundary (red color) obtained using GIS approach and the reference boundary (blue color) derived using high resolution optical remote sensing data along with (a) three date HH SAR and (c) true color composite optical data.

Fig. 6. Three date HH polarization SAR data FCC showing the settlement/homestead class in Yellow color and the final homestead/settlement map showing size-wise distribution pattern for the study area.