Evolutionary Approach for Land Cover Classification using GA based Fuzzy Clustering Techniques

Final Report for the Minor Research Project submitted to the
Tamil Nadu State Council for Higher Education, Chennai-600 005.

by
Dr. N. Sujatha
Assistant Professor
PG & Research Department of Computer Science
Raja Dorai Singam Govt Arts College
Sivagangai-630 561, Tamil Nadu.

TAMIL NADU STATE COUNCIL FOR HIGHER EDUCATION
FORMAT FOR SUBMISSION OF FINAL REPORT FOR MINOR RESEARCH PROJECT

1. Name of the Teacher: Dr. N. Sujatha
2. Designation: Assistant Professor
3. Communication Address: Dr. N. Sujatha, Assistant Professor, PG & Research Department of Computer Science, Raja Dorai Singam Govt Arts College, Sivagangai-630 561.
   Contact No: 9443093017
   E-Mail Id: sujamurugan@gmail.com
4. Institutional Address: Raja Dorai Singam Govt Arts College, Sivagangai-630 561.
5. Project Title: "Evolutionary approach for Land cover classification using GA based Fuzzy clustering techniques"
6. Sector: Computer Science (Science)
7. Grant approved and expenditure incurred during the project:
   Total amount approved (Rs.): 1,00,000.00
   Total Expenditure (Rs.): 1,00,103.00

CONTENTS

Chapter 1: INTRODUCTION
   1.1 Data Mining
   1.2 Introduction to Image Processing
   1.3 Introduction to Remote Sensing
   1.4 Introduction to Genetic Algorithm
Chapter 2: LITERATURE SURVEY
   2.1 Unsupervised Classification
   2.2 Supervised Classification
   2.3 Classification using Genetic Algorithm
Chapter 3: PREPROCESSING
   3.1 Preprocessing Image
   3.2 Median Filters
Chapter 4: LAND COVER CLASSIFICATION
   4.1 Introduction
   4.2 Cluster Analysis
   4.3 Median Filters
   4.4 Fuzzy C Means
   4.5 Genetic Algorithm
   4.6 Proposed Work
Chapter 5: CONCLUSION

Chapter 1
Introduction

1.1 Data Mining

Progress in digital data acquisition and storage technology has resulted in the growth of huge databases. This has occurred in all areas of human endeavor, from the mundane (such as supermarket transaction data, credit card usage records, telephone call details, and government statistics) to the more exotic (such as images of astronomical bodies, molecular databases, and medical records). Interest has grown in the possibility of tapping these data, of extracting from them information that might be of value to the owner of the database. The discipline concerned with this task has become known as data mining.

Data mining [1] refers to extracting or "mining" knowledge from large amounts of data. Many people treat data mining as a synonym for another popularly used term, Knowledge Discovery from Data, or KDD. Knowledge discovery as a process consists of an iterative sequence of the following steps:

1. Data cleaning (to remove noise and inconsistent data)
2. Data integration (where multiple data sources may be combined)
3. Data selection (where data relevant to the analysis task are retrieved from the database)
4. Data transformation (where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations)
5. Data mining (an essential process where intelligent methods are applied in order to extract data patterns)
6. Pattern evaluation (to identify the truly interesting patterns representing knowledge based on some interestingness measures)
7. Knowledge presentation (where visualization and knowledge representation techniques are used to present the mined knowledge to the user)

Frequently, the data to be mined is first extracted from an enterprise data warehouse [2] into a data mining database or data mart.
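Viewed end to end, the KDD steps above form a pipeline in which the output of each stage feeds the next. The following is a minimal Python sketch of that idea; every function name and the toy records are hypothetical placeholders standing in for whole families of techniques, not a real API:

    def clean(records):
        # Data cleaning: drop records with missing values.
        return [r for r in records if None not in r.values()]

    def select(records, attributes):
        # Data selection: keep only the attributes relevant to the task.
        return [{a: r[a] for a in attributes} for r in records]

    def mine(records):
        # Data mining: a trivial frequency count stands in for
        # intelligent pattern-extraction methods.
        patterns = {}
        for r in records:
            key = tuple(sorted(r.items()))
            patterns[key] = patterns.get(key, 0) + 1
        return patterns

    transactions = [{"item": "milk", "qty": 1},
                    {"item": "milk", "qty": 2},
                    {"item": "bread", "qty": None}]
    print(mine(select(clean(transactions), ["item"])))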
OLAP (On-Line Analytical Processing) is part of the spectrum of decision support tools. Traditional query and report tools describe what is in a database. OLAP goes further; it is used to answer why certain things are true. The user forms a hypothesis about a relationship and verifies it with a series of queries against the data. The OLAP analyst generates a series of hypothetical patterns and relationships and uses queries against the database to verify or disprove them. OLAP analysis is essentially a deductive process. Data mining is different from OLAP because, rather than verifying hypothetical patterns, it uses the data itself to uncover such patterns. It is essentially an inductive process.

Tasks of Data Mining

Data mining is a term used for a specific set of six activities or tasks [3]:

1. Classification
2. Estimation
3. Prediction
4. Affinity grouping or Association Rules
5. Clustering
6. Description and Visualization

The first three tasks (classification, estimation and prediction) are all examples of directed data mining or supervised learning. In directed data mining, the goal is to use the available data to build a model that describes one or more particular attribute(s) of interest (target attributes or class attributes) in terms of the rest of the available attributes. The next three tasks (association rules, clustering and description) are examples of undirected data mining, i.e., no attribute is singled out as the target; the goal is to establish some relationship among all the attributes.

Clustering

During a cholera outbreak in London in 1854, John Snow used a special map to plot the cases of the disease that were reported. A key observation, after the creation of the map, was the close association between the density of disease cases and a single well located at a central street. After this, the well pump was removed, putting an end to the epidemic. Associations between phenomena are usually harder to detect, but the above is a very simple and, for many researchers, the first known application of cluster analysis. Since then, cluster analysis has been widely used in several disciplines, such as statistics, software engineering, biology, psychology and other social sciences, in order to identify natural groups in large amounts of data. These data sets are constantly becoming larger, and their dimensionality prevents easy analysis and validation of the results. Clustering has also been widely adopted by researchers within computer science and especially the database community, as indicated by the increase in the number of publications involving this subject in major conferences.

Clustering [4] [5] is a basic tool used in data analysis, pattern recognition and data mining for finding unknown groups in data. It can be considered the most important unsupervised learning problem; so, as with every other problem of this kind, it deals with finding a structure in a collection of unlabeled data. A cluster is therefore a collection of objects which are "similar" to one another and "dissimilar" to the objects belonging to other clusters. Besides the term data clustering, synonyms such as cluster analysis, automatic classification, numerical taxonomy, botryology and typological analysis are also in use. Cluster analysis [8] is a difficult problem because many factors (effective similarity measures, criterion functions and algorithms) come into play in devising a suitable clustering technique for a given clustering problem. Also, no clustering method can effectively handle all sorts of cluster structures, i.e., shape, size and density.
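One of the factors just mentioned is the choice of an effective similarity measure. As a small illustration, the sketch below computes two common dissimilarity measures between feature vectors; the pixel feature values are invented for illustration:

    import math

    def euclidean(a, b):
        # Straight-line distance between two feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def manhattan(a, b):
        # Sum of absolute coordinate differences.
        return sum(abs(x - y) for x, y in zip(a, b))

    # Two samples described by three-band reflectance values (synthetic).
    p, q = (0.21, 0.35, 0.50), (0.25, 0.30, 0.55)
    print(euclidean(p, q), manhattan(p, q))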
Sometimes the quality of the clusters that are found can be improved by preprocessing the given data; it is common to try to find noisy values and eliminate them in a preprocessing step. The input for a system of cluster analysis is a set of samples and a measure of similarity (or dissimilarity) between two samples. The output from cluster analysis is a number of groups/clusters that form a partition, or a structure of partitions, of the data set. The final goal of clustering can be mathematically described as follows:

\[ \bigcup_{i=1}^{n} C_i = X, \qquad C_i \cap C_j = \emptyset \ \text{ for } i \neq j, \]

where X denotes the original data set, C_i, C_j are clusters of X, and n is the number of clusters.

Data pre-processing [9] describes any kind of processing performed on raw data to prepare it for a further processing method. This process includes data cleaning, which fills in missing values, smooths noisy data, identifies or removes outliers and resolves inconsistencies; data integration, which combines multiple databases, data cubes, or files; and data transformation, which covers normalization and aggregation.

Clustering has wide applications in:

- Image Processing
- Document Classification
- Pattern Recognition
- Spatial Data Analysis
- Economic Science
- Clustering Web log data to discover groups of similar access patterns

Types of Clustering

There are four types of clustering: Partitional, Hierarchical, Density-Based and Grid-Based clustering.

Partitional Clustering: The partitional clustering algorithm [10] splits the data points into k partitions, where each partition represents a cluster. Given D, a data set of n objects, and k, the number of clusters to form, a partitioning algorithm organizes the objects into k partitions (k ≤ n), where each partition represents a cluster. The clusters are formed to optimize an objective partitioning criterion, such as a dissimilarity function based on distance, so that the objects within a cluster are "similar," whereas the objects of different clusters are "dissimilar" in terms of the data set attributes. The clusters should exhibit two properties: (1) each group must contain at least one object, and (2) each object must belong to exactly one group. In this type of clustering, the familiar algorithms are K-Means, K-Medoids, CLARANS, Fuzzy K-Means and K-Modes.

Hierarchical Clustering: In hierarchical clustering [11], a treelike cluster structure (dendrogram) is created through recursive partitioning (divisive methods) or combining (agglomerative methods) of existing clusters. Agglomerative clustering methods initialize each observation to be a tiny cluster of its own. Then, in succeeding steps, the two closest clusters are aggregated into a new combined cluster; in this way, the number of clusters in the data set is reduced by one at each step. Eventually, all records are combined into a single huge cluster. Divisive clustering methods begin with all the records in one big cluster, with the most dissimilar records being split off recursively into separate clusters, until each record represents its own cluster. In hierarchical clustering, the familiar algorithms are AGNES, DIANA, CURE, CHAMELEON, BIRCH and ROCK.

Density-Based Clustering: The density-based clustering [12] methods group objects according to specific density objective functions. Density is usually defined as the number of objects in a particular neighborhood of a data object. In these approaches a given cluster continues growing as long as the number of objects in the neighborhood exceeds some parameter.
This is considered to be different from the idea in partitional algorithms, which use iterative relocation of points given a certain number of clusters. The algorithms in this method include DBSCAN, DENCLUE and OPTICS.

Grid-Based Clustering: Grid-based methods [13] quantize the object space into a finite number of cells (hyperrectangles) and then perform the required operations on the quantized space. They have a fast processing time, which depends on the number of cells in each dimension of the quantized space, and use a multi-resolution grid data structure. The grid-based clustering approach differs from the conventional clustering algorithms in that it is concerned not with the data points but with the value space that surrounds the data points. The important algorithms in this method include STING, WaveCluster and CLIQUE.

1.2 Introduction to Image Processing

Origin of Image Processing: The basic techniques [7] used to generate high-quality images from digital data were originally developed to process spacecraft images of Mars. These images represented one experiment that was included among others on the Mariner 4, 6, 7 and 9 missions. These digital images have a dynamic range (sensitivity) from 10 to 50 times that of the eye; thus, in raw format, only a small part of the data is available to the eye. In order to derive the maximum information available from the digital images, techniques were developed first at IPL/JPL and later at the U.S. Geological Survey to extract these data and put them into an optimally interpretable format for the human eye (Levinthal and others, 1973; Batson, 1973). One of the basic problems that arose in processing the Mariner images was caused by coherent and random noise introduced by the detector, by digital recording, and by data transmission, reception, and reduction systems. Most of the processes used to enhance digital images will also enhance noise, thereby seriously degrading the quality of the final image. Therefore, major efforts were undertaken to develop image processing techniques to remove the noise (Rindfleisch and others, 1971; Chavez and Soderblom, 1975). After these clean-up procedures were applied to create noise-free data bases, other techniques were used to improve the images further, including techniques for removing effects of variation in solar illumination angle and for correcting geometry. The image was then processed to enhance fine detail (high-pass filtering) or albedo variations (low-pass filtering) and to enhance contrast (stretching). Most of the image-processing techniques developed for Mariner 9 images were easily modified for application to Landsat (formerly ERTS, or Earth Resources Technology Satellite) data, although some alteration was necessary because of the much larger image data sets. Landsat data were acquired in four spectral bands (bands 4, 5, 6 and 7: respectively 0.5 to 0.6, 0.6 to 0.7, 0.7 to 0.8, and 0.8 to 1.1 µm). Each Landsat image contains roughly 60 times more data than a Mariner image. Although new problems were introduced because of this larger data set, the final product contains much more information than that available from Mariner 9 data.

An image [6] may be defined as a two-dimensional function f(x,y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x,y) is called the intensity or gray level of the image at that point. When x, y and the intensity values of f are all finite, discrete quantities, we call the image a digital image.
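As a concrete illustration of this definition, a digital image can be held as a two-dimensional array of intensity values. The sketch below builds a tiny synthetic 4x4 gray-level image with NumPy; the values are invented for illustration:

    import numpy as np

    # A digital image as a finite, discrete 2-D function f(x, y):
    # a synthetic 4x4 gray-level image with 8-bit intensities.
    f = np.array([[ 12,  40,  41,  10],
                  [ 38, 200, 210,  35],
                  [ 36, 205, 198,  40],
                  [ 11,  39,  37,  13]], dtype=np.uint8)

    x, y = 1, 2                       # a pair of spatial coordinates
    print(f[x, y])                    # intensity (gray level) at (x, y)
    print(f.shape, f.min(), f.max())  # size and dynamic range of the image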
The field of digital image processing refers to processing digital images by means of a digital computer. Image processing [14] is a technique to enhance raw images received from cameras/sensors placed on satellites, space probes and aircraft, or pictures taken in normal day-to-day life, for various applications. Various techniques have been developed in image processing during the last four to five decades. Most of the techniques were developed for enhancing images obtained from unmanned spacecraft, space probes and military reconnaissance flights. Image processing systems are becoming popular due to the easy availability of powerful personal computers, and are used in various applications such as:

- Remote Sensing
- Medical Imaging
- Non-destructive Evaluation
- Forensic Studies
- Textiles
- Material Science
- Military
- Film Industry
- Document Processing
- Graphic Arts
- Printing Industry

1.3 Introduction to Remote Sensing and Image Processing

Of all the various data sources used in GIS [15], one of the most important is undoubtedly that provided by remote sensing. Through the use of satellites, we now have a continuing program of data acquisition for the entire world with time frames ranging from a couple of weeks to a matter of hours. Very importantly, we also now have access to remotely sensed images in digital form, allowing rapid integration of the results of remote sensing analysis into a GIS. The development of digital techniques for the restoration, enhancement and computer-assisted interpretation of remotely sensed images initially proceeded independently of, and somewhat ahead of, GIS. However, the raster data structure and many of the procedures involved in these Image Processing Systems (IPS) were identical to those involved in raster GIS. As a result, it has become common to see IPS software packages add general capabilities for GIS, and GIS software systems add at least a fundamental suite of IPS tools. IDRISI is a combined GIS and image processing system that offers advanced capabilities in both areas. Because of the extreme importance of remote sensing as a data input to GIS, it has become necessary for GIS analysts (particularly those involved in natural resource applications) to gain a strong familiarity with IPS.

1.4 Introduction to Genetic Algorithm

Genetic algorithms (GAs) [16] were invented by John Holland in the 1960s and were developed by Holland and his students and colleagues at the University of Michigan in the 1960s and the 1970s. In contrast with evolution strategies and evolutionary programming, Holland's original goal was not to design algorithms to solve specific problems, but rather to formally study the phenomenon of adaptation as it occurs in nature and to develop ways in which the mechanisms of natural adaptation might be imported into computer systems. Holland's 1975 book Adaptation in Natural and Artificial Systems presented the genetic algorithm as an abstraction of biological evolution and gave a theoretical framework for adaptation under the GA. Holland's GA is a method for moving from one population of "chromosomes" (e.g., strings of ones and zeros, or "bits") to a new population by using a kind of "natural selection" together with the genetics-inspired operators of crossover, mutation, and inversion. Each chromosome consists of "genes" (e.g., bits), each gene being an instance of a particular "allele" (e.g., 0 or 1).
The selection operator chooses those chromosomes in the population that will be allowed to reproduce, and on average the fitter chromosomes produce more offspring than the less fit ones. Crossover exchanges subparts of two chromosomes, roughly mimicking biological recombination between two single-chromosome ("haploid") organisms; mutation randomly changes the allele values of some locations in the chromosome; and inversion reverses the order of a contiguous section of the chromosome, thus rearranging the order in which genes are arrayed. (Here, as in most of the GA literature, "crossover" and "recombination" will mean the same thing.) Holland's introduction of a population-based algorithm with crossover, inversion, and mutation was a major innovation. (Rechenberg's evolution strategies started with a "population" of two individuals, one parent and one offspring, the offspring being a mutated version of the parent; many-individual populations and crossover were not incorporated until later. Fogel, Owens, and Walsh's evolutionary programming likewise used only mutation to provide variation.) Moreover, Holland was the first to attempt to put computational evolution on a firm theoretical footing (see Holland 1975). Until recently this theoretical foundation, based on the notion of "schemas," was the basis of almost all subsequent theoretical work on genetic algorithms.

In the last several years there has been widespread interaction among researchers studying various evolutionary computation methods, and the boundaries between GAs, evolution strategies, evolutionary programming, and other evolutionary approaches have broken down to some extent. Today, researchers often use the term "genetic algorithm" to describe something very far from Holland's original conception; many projects are referred to by their originators as GAs, and others, while not so named, have enough of a "family resemblance" to be included under the same rubric. In 1992 John Koza used genetic algorithms to evolve programs to perform certain tasks; he called his method "genetic programming" (GP).

A genetic algorithm (GA) is a search technique used in computing to find true or approximate solutions to optimization and search problems. GAs are categorized as global search heuristics. They are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination). The evolution usually starts from a population of randomly generated individuals and proceeds in generations. In each generation, the fitness of every individual in the population is evaluated, and multiple individuals are selected from the current population (based on their fitness) and modified to form a new population. The new population is used in the next iteration of the algorithm. The algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population.

Genetic algorithms [17] are a family of computational models belonging to the class of evolutionary algorithms, part of artificial intelligence. These algorithms encode a potential solution to a specific problem on a simple chromosome-like data structure, use techniques inspired by natural evolution such as inheritance, mutation, selection and crossover, and are often viewed as function optimizers.
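The generational loop just described is compact enough to sketch directly. In the toy example below the fitness function simply counts the ones in a bit string; it is a placeholder chosen only for illustration, not the fitness used in this project:

    import random

    def fitness(chrom):
        # Toy fitness: number of ones in the bit-string chromosome.
        return sum(chrom)

    def select(pop):
        # Tournament selection: the fitter of two random individuals reproduces.
        a, b = random.choice(pop), random.choice(pop)
        return a if fitness(a) >= fitness(b) else b

    def crossover(p1, p2):
        # One-point crossover exchanges subparts of two chromosomes.
        cut = random.randint(1, len(p1) - 1)
        return p1[:cut] + p2[cut:]

    def mutate(chrom, rate=0.01):
        # Mutation randomly flips allele values (0 <-> 1).
        return [1 - g if random.random() < rate else g for g in chrom]

    pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
    for generation in range(50):      # terminate after a fixed number of generations
        pop = [mutate(crossover(select(pop), select(pop))) for _ in pop]
    print(max(fitness(c) for c in pop))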
Chapter 2
Literature Survey

Remote sensing involves gathering information about the earth's surface remotely, and generally encompasses acquiring this data from aircraft or satellites. Remote sensing is very much an interdisciplinary area of scientific investigation, and relies in large part on knowledge of physics, mathematics, computer science and geography. The land cover pattern of a region is an outcome of natural and socio-economic factors and their utilization by man in time and space. Land is becoming a scarce resource due to immense agricultural and demographic pressure. Hence, information on land cover is essential for the selection, planning and implementation of land use schemes to meet the increasing demands for basic human needs and welfare. This information also assists in monitoring the dynamics of changes in land cover. Land cover change has become a central component in current strategies for managing natural resources and monitoring environmental changes. Advances in satellite image processing have greatly influenced land cover mapping; accurate evaluation of the spread and health of the world's forest, grassland and agricultural resources has thus become an important priority.

Image Classification in Remote Sensing

Digital image classification techniques group pixels to represent land cover features. Land cover could be forested, urban, agricultural and other types of features. There are three main image classification techniques: unsupervised image classification, supervised image classification, and object-based image analysis. Pixels are the smallest unit represented in an image. Image classification uses the reflectance statistics of individual pixels. Unsupervised and supervised image classification techniques are the two most common approaches; however, object-based classification has been breaking more ground as of late.

2.1 Unsupervised Classification

Pixels are grouped based on their reflectance properties. These groupings are called "clusters". The user identifies the number of clusters to generate and which bands to use; with this information, the image classification software generates the clusters. There are different image clustering algorithms, such as K-means and ISODATA. The user then manually identifies each cluster with a land cover class. It is often the case that multiple clusters represent a single land cover class, in which case the user merges those clusters into one land cover type. Unsupervised classification is commonly used when no sample sites exist.

Unsupervised Classification Steps:
1. Generate clusters
2. Assign classes
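The K-means algorithm mentioned above is simple enough to sketch in a few lines. The following minimal Python/NumPy version clusters pixel reflectance vectors; the synthetic data and the choice of k = 3 are illustrative assumptions, not values used in this project:

    import numpy as np

    def kmeans(pixels, k, iters=20):
        # pixels: (n, bands) array of per-pixel reflectance vectors.
        centers = pixels[np.random.choice(len(pixels), k, replace=False)]
        for _ in range(iters):
            # Assign each pixel to its nearest cluster center.
            d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # Move each center to the mean of the pixels assigned to it.
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = pixels[labels == j].mean(axis=0)
        return labels, centers

    pixels = np.random.rand(1000, 4)   # 1000 synthetic pixels, 4 spectral bands
    labels, centers = kmeans(pixels, k=3)
    print(np.bincount(labels))         # pixels per cluster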
Xingping Wen et al. [21] proposed an unsupervised classification method. Firstly, the hyperspectral remote sensing image was atmospherically corrected; accurate atmospheric correction is key to the classification. Then, endmember spectra were extracted using the PPI algorithm, and the image was classified using SAM. Traditionally the SAM algorithm used a constant threshold; they improved it by using an adjustable threshold, assigning each pixel to the class with the smallest spectral angle. Finally, the endmember spectra were clustered with the K-means algorithm and classes were combined according to the K-means result. The final classification map was projected and output. It is an effective classification method, especially for hyperspectral remote sensing images, and users can also adjust the number of endmembers and classes according to their applications.

Gaussian mixture models (GMM) are widely used for unsupervised classification applications in remote sensing. Expectation-Maximization (EM) is the standard algorithm employed to estimate the parameters of these models. However, such iterative optimization methods can easily get trapped in local maxima, so researchers use population-based stochastic search algorithms to obtain better estimates. Caglar Art et al. [22] presented a novel particle swarm optimization-based algorithm for maximum likelihood estimation of Gaussian mixture models. The proposed approach provides solutions for important problems in the effective application of population-based algorithms to the clustering problem. They presented a new parametrization for arbitrary covariance matrices that allows independent updating of individual parameters during the search process, and described an optimization formulation for identifying the correspondence relations between different parameter orderings of candidate solutions. Experiments on a hyperspectral image showed better clustering results compared to the commonly used EM algorithm for estimating GMMs.

Mohd Hasmadi et al. [23] described a study carried out to perform supervised and unsupervised classification on remote sensing data for land cover mapping and to evaluate the accuracy of both techniques. The study used a SPOT 5 satellite image taken in January 2007 for path/row 270/343 as primary data, with a topographical map and land cover maps as supporting data. The land cover classes for the study area were grouped into 5 themes, namely vegetation, urban area, water body, grassland and barren land. Ground verification was carried out to verify and assess the accuracy of the classification; a total of 72 sample points, representing 25% of the total study area, were collected using systematic random sampling. The results showed that the overall accuracy of the supervised classification was 90.28% with a Kappa statistic of 0.86, while the unsupervised classification was 80.56% accurate with a Kappa statistic of 0.73. In conclusion, they found that the supervised classification technique appears more accurate than the unsupervised one.

Xiong Liu [24] used migrating means clustering unsupervised classification (MMC), maximum likelihood classification (MLC) trained on picked training samples, and MLC trained on the results of unsupervised classification (hybrid classification) to classify a 512 pixel by 512 line NOAA-14 AVHRR Local Area Coverage (LAC) image. All the channels, including ch3 and ch3t, were used in this project.
The image was classified into six classes, namely water, vegetation, thin partial clouds over ground, thin clouds, low/middle thick clouds and high thick clouds, plus an unknown class for supervised classification. Overall, the results of these three methods were very consistent with the original three-band overlay color composite image, and the statistical mean vectors for each class were consistent across the different methods and reasonable. He also noted that the ch3t temperature is usually much larger than the thermal channel-measured temperature for clouds: the colder the thermal temperature, the larger the difference. The ch3 reflectance is anti-correlated with the ch1 and ch2 reflectance, because high-reflectance ice clouds absorb most of the energy in this channel. The results of MMC, and of MLC trained on the results of MMC, were better than those of MLC trained on picked samples. The MLC trained on picked samples produced more unknown classes than that trained by MMC, probably because the standard deviation (multivariate spread) of each class generated by MMC is usually larger than that of picked training samples. It takes more computation time to run MMC (5 iterations) than MLC if the classes are the same, but picking samples over and over to get comparable results takes even more time. The results of MLC trained on picked samples were worse than those of the other two methods due to the difficulty of picking representative training samples. The hybrid supervised/unsupervised classification combines the advantages of both supervised and unsupervised classification: it does not require the user to have foreknowledge of each class, yet it can still account for the multivariate spreads and obtain accurate mean vectors and covariance matrices for each spectral class by using the entire image as training samples.

In Puerto Rico the land use has been changing; every day new developments (urban, industrial, commercial and agricultural) are emerging. The purpose of Edwin Martínez Martínez's [25] work was to map the land use of Río Jauca, a sub-basin of the Río Grande de Arecibo watershed, which is an important natural resource and supplies water to the metropolitan area. Remote sensing techniques can be used to assess several water quality parameters and also for land use classification. For his work, the ERDAS Imagine V8.5 software was used to develop a land use classification from IKONOS images. The generated land use classification was compared with a land use map generated using ArcView, to decide which method provides the better land use classification.

An unsupervised classification method provides interpretation, feature extraction and endmember estimation for remote sensing image data without any prior knowledge of the ground truth. Cheng-Yuan Liou [26] explored such a method and constructed an algorithm based on non-negative matrix factorization (NMF); NMF is used to match the non-negative property of sensed spectrum data. The data dimensionality is estimated using partitioned noise-adjusted principal component analysis (PNAPCA), and the initial matrix used to start the NMF is obtained using fuzzy c-means (FCM). This algorithm is capable of producing a region- or part-based representation of objects in images. Both simulated and real sensing data were used to test the algorithm.

Traditionally, classification approaches have focused on per-pixel technologies.
Pixels within areas assumed to be homogeneous are analyzed independently. New sources of high spatial resolution imagery will increase the amount of information attainable on land cover. Their significance is that the data resemble what can be acquired by our eyes, so the energy can be analyzed directly; satellites, however, are also capable of collecting data beyond the visible band. The traditional methods of change detection are not suitable for high resolution remote sensing images. To overcome the limitations of traditional pixel-level change detection for high resolution remote sensing images, Babykalpana et al. [27] presented a new way of multi-scale fusion for high resolution remote sensing image change detection, based on a georeferencing and analysis method. Their experiment showed that this method has a stronger advantage than the traditional pixel-level method of high resolution remote sensing image change detection. It is widely used for crop classification around the world, and this classification method is applied to land cover and land use because vegetation components are important in the images; a basic aim is also to preserve the greenery of the city for a healthy environment.

Ratika Pradhan et al. [28] developed a classification algorithm for remotely sensed satellite data using Bayesian and hybrid classification approaches. To test and validate the algorithm, the sample image taken into consideration was a multi-spectral IRS-1C/LISS III image of East Sikkim, India. The proposed classifier can also be used for hyperspectral remote sensing data, considering the best bands as input for preparing the spectral class distribution. The sample image was classified by both the Bayesian and the hybrid classifier, and the overall accuracy was then calculated: 90.53% using the Bayesian classifier and 91.59% using the hybrid classifier. The high accuracy may to some extent be attributed to part of the training set being treated as ground truth instead of actual data. Since the accuracy of the results depends only upon the test set chosen, the efficiency of an algorithm should not be judged on the accuracy measure alone; the classified images should also be compared physically with ground truth information. From the comparison, it was found that both methods are equally efficient.

2.2 Supervised Classification

The user selects representative samples for each land cover class in the digital image. These sample land cover classes are called "training sites". The image classification software uses the training sites to identify the land cover classes in the entire image. The classification of land cover is based on the spectral signatures defined in the training set; the software assigns each pixel to the class it resembles most in the training set. The common supervised classification algorithms are maximum likelihood and minimum-distance classification (a sketch of the latter follows the steps below).

Supervised Classification Steps:
1. Select training areas
2. Generate signature file
3. Classify
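As a concrete illustration, here is a minimal Python/NumPy sketch of a minimum-distance classifier: each class is summarized by the mean vector (signature) of its training pixels, and every pixel is assigned to the class whose mean is nearest. The class names and all numbers are invented for illustration and are not data from this project:

    import numpy as np

    def minimum_distance_classify(pixels, signatures):
        # signatures: (classes, bands) array of class mean vectors,
        # one per training area (the "signature file").
        d = np.linalg.norm(pixels[:, None, :] - signatures[None, :, :], axis=2)
        return d.argmin(axis=1)          # nearest class mean per pixel

    # Hypothetical mean signatures for 3 classes over 4 spectral bands.
    signatures = np.array([[0.10, 0.20, 0.10, 0.60],    # vegetation-like
                           [0.40, 0.40, 0.40, 0.40],    # urban-like
                           [0.05, 0.10, 0.20, 0.05]])   # water-like
    pixels = np.random.rand(10, 4)       # 10 synthetic pixels
    print(minimum_distance_classify(pixels, signatures))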
Nearest neighbor techniques are commonly used in remote sensing, pattern recognition and statistics to classify objects into a predefined number of categories based on a given set of predictors. These techniques are especially useful when the relationship between the variables is highly nonlinear. In most studies the distance measure is adopted a priori. In contrast, Luis Samaniego et al. [29] proposed a general procedure to find an adaptive metric that combines a local variance reducing technique and a linear embedding of the observation space into an appropriate Euclidean space. To illustrate the application of this technique, two agricultural land cover classifications using mono-temporal and multi-temporal Landsat scenes were presented. The results of the study, compared with standard approaches used in remote sensing such as maximum likelihood (ML) or k-Nearest Neighbor (k-NN), indicate substantial improvement with regard to the overall accuracy and the cardinality of the calibration data set. Also, using MNN in a soft/fuzzy classification framework proved to be a very useful tool for deriving critical areas that need further attention and investment concerning additional calibration data.

A new approach to the classification of hyperspectral images has also been proposed. The main problem with supervised methods is that the learning process heavily depends on the quality of the training data set. In remote sensing, the training set is useful only for simultaneous images or for images with the same classes taken under the same conditions; even worse, the training set is frequently not available. On the other hand, unsupervised methods are not sensitive to the number of labelled samples since they work on the whole image; nevertheless, the relationship between clusters and classes is not ensured. In this context, L. Gomez-Chova et al. [30] proposed a combined strategy of supervised and unsupervised learning methods that avoids these drawbacks and automates the classification process. The method is based on the general formulation of the expectation-maximization (EM) algorithm, and was applied to crop cover recognition of six hyperspectral images from the same area acquired with the HyMap spectrometer during the DAISEX-99 campaign. For classification purposes, six different classes were considered. Classification accuracy results were compared to common methods: ISODATA, Learning Vector Quantization, Gaussian Maximum Likelihood, Expectation-Maximization, and Neural Networks. The good performance confirms the validity of the proposed approach in terms of accuracy and robustness.

Benqin Song et al. [31] developed a new classification strategy that integrates sparse representations and EMAPs for spatial-spectral classification of remote sensing data. Their experiments reveal that the proposed approach, which combines the advantages of sparse representation and the rich structural information provided by EMAPs, can appropriately exploit the inherent sparsity present in EMAPs in order to provide state-of-the-art classification results. This is mainly due to the fact that the samples in EMAP space can be approximately represented by a small number of atoms in the training dictionary after solving the optimization problem, whereas the same samples could not be represented in the original spectral space with the same level of sparsity. The proposed strategy was tested on both simulated and real multi/hyperspectral data sets. A comparison with state-of-the-art classifiers shows very promising results for the proposed approach, particularly when a very limited number of training samples is available. In this context, the SUnSAL algorithm provided excellent classification performance as compared with other techniques.
Remote sensing image segmentation requires multi-category classification, typically with a limited number of labeled training samples. While semi-supervised learning (SSL) has emerged as a sub-field of machine learning to tackle the scarcity of labeled samples, most SSL algorithms to date have had trade-offs in terms of scalability and/or applicability to multi-categorical data. Ayse Naz Erkan et al. [32] evaluated semi-supervised logistic regression (SLR), a recent information theoretic semi-supervised algorithm, for remote sensing image classification problems. SLR is a probabilistic discriminative classifier and a specific instance of the generalized maximum entropy framework with a convex loss function. Moreover, the method is inherently multi-class and easy to implement. These characteristics make SLR a strong alternative to the widely used semi-supervised variants of SVM for the segmentation of remote sensing images. They demonstrated the competitiveness of SLR in multispectral, hyperspectral and radar image classification.

Zhang Ming et al. [33] proposed to make maximum use of remote sensing data and GIS techniques to assess land use and soil classification in the rolling hilly area around Nanjing, in the eastern (sub-tropical) part of China. Landsat MSS data were used to perform a classification with the ILWIS software. The supervised classification was based upon a multispectral analysis and on "ground truth". The minimum distance to mean classifier, the maximum likelihood classifier and the box classifier were used. According to the reflectance characteristics of the surface material, the soil classification indicated 7 classes belonging to 4 units and 10 subunits; the land use map contains 5 classes. It was shown that, within limitations, classification algorithms and threshold parameters have an important influence on the classification result and should be selected carefully based on the training area.

2.3 Classification using Genetic Algorithm

Taking the Jiading district of Shanghai as the study area, one of the most typical urbanization fringe regions in Shanghai during the past decades, the urban sprawl in this region from 1989 to 2006 was studied on the basis of multi-spectral remotely sensed images. Multi-source data, including four epochs of representative TM images (1989, 1995, 2001 and 2006) and the vector topographic map, were used in the study. A genetic algorithm optimized back propagation neural network approach was first proposed by X. Zhang et al. [34] to classify land use types from the four epochs of remotely sensed images. The accuracy of the classification was assessed for the four remotely sensed images. The urban land use type was then extracted based on the land use classification result, and three urban land use change images were correspondingly derived from the extracted urban land use type. Based on this, the study area was divided into four plates (the central, southern, northern and western plates), and the detailed temporal-spatial urban land use changes in the study area were investigated in each plate by using an overlapping pixel comparison based change detection method over the three time intervals 1989-1995, 1995-2001 and 2001-2006. From the change detection analysis, the process and pattern of the urban land use change in the Jiading district over nearly 20 years was finally revealed.
The use of remote sensing images as a source of information in agribusiness applications is very common. In those applications, it is fundamental to know how the space is occupied. However, identification and recognition of crop regions in remote sensing images are not trivial tasks yet. Although automatic methods have been proposed for this, users very often prefer to identify regions manually. That happens because these methods are usually developed to solve specific problems, or, when they are of general purpose, they do not yield satisfying results. Jefersson Alex dos Santos [35] presented a new interactive approach based on relevance feedback to recognize regions in remote sensing images. Relevance feedback is a technique used in content-based image retrieval (CBIR) tasks; its objective is to incorporate user preferences into the search process. The proposed solution combines the Optimum-Path Forest (OPF) classifier with composite descriptors obtained by a Genetic Programming (GP) framework. The new approach presented good results with respect to the identification of pasture and coffee crops, surpassing the results obtained by a recently proposed method and the traditional Maximum Likelihood algorithm.

The support vector machine (SVM) was originally developed for linear two-class classification via construction of an optimal separating hyperplane, where the margin is maximal. In the case of training data that are not linearly separable, SVM uses the kernel trick to map the original input space into a high dimensional feature space to enhance the classifier's generalization ability. The Genetic Algorithm (GA) is a stochastic and heuristic search algorithm inspired by natural evolution. Using GA along with SVM, K. Moje Ravindra et al. [36] tried to classify objects such that the classification is closer to the original image. Image features are important to consider since images are retrieved based on these features; the extraction of suitable features from the image is the basic step by which the query image and the database images can be compared. The commonest features in an image are color, shape and texture. Generally, in the classification of object-based high resolution remote sensing images, numerous object features, such as spectral, texture, shape and contextual features, are calculated after segmentation. However, determining the most appropriate feature subset not only reduces the computational complexity but can also yield a higher classification rate. The SVM classifier, being non-parametric, can classify the objects more accurately. They also considered feature extraction from invisible bands for hyperspectral image analysis, where there is no input to the human perception system at all. The perception-centered approach has the obvious advantage of serving the human "master" or "user" of the image retrieval system, given that matching human performance is the ultimate goal of the system. They concluded that whenever confusion occurs near the boundary, genetic parameters are best for image classification.

Chapter 3
Preprocessing

According to Mather (1999), the term pre-processing is understood as the correction of geometric and radiometric deficiencies and the removal of data errors. It seems natural that errors within the data are removed, if possible, before image interpretation starts. The choice of methods to do so is always purpose dependent.
If, for instance, the purpose is a check of a certain land cover or object against a satellite image, visual interpretation might be sufficient and even geometric correction unnecessary (Jensen, 1996b). In our opinion the operator should define precisely the demands on the data and choose the necessary processing steps to achieve the specific task. The importance of pre-processing methods becomes obvious in change detection or monitoring applications, where the operator must be able to distinguish data noise, pre-processing and data handling errors from real changes.

3.1 Preprocessing in Image Processing

Preprocessing images commonly involves removing low-frequency background noise, normalizing the intensity of individual particles of images, removing reflections, and masking portions of images. Image preprocessing [18] is the technique of enhancing data images prior to computational processing. It is a common name for operations with images at the lowest level of abstraction, where both input and output are intensity images. The aim of preprocessing is an improvement of the image data that suppresses unwanted distortions or enhances some image features important for further processing. There are four categories of image preprocessing methods, according to the size of the pixel neighborhood that is used for the calculation of a new pixel brightness:

1. Pixel brightness transformations
2. Geometric transformations
3. Pre-processing methods that use a local neighborhood of the processed pixel
4. Image restoration that requires knowledge about the entire image

Image pre-processing methods exploit the considerable redundancy in images. Neighboring pixels corresponding to one object in real images have essentially the same or similar brightness value; thus, a distorted pixel can often be restored as an average value of neighboring pixels. One example of this is the filtering of impulse noise. If pre-processing aims to correct some degradation in the image, the nature of the a priori information is important: knowledge about the nature of the degradation, knowledge about the properties of the image acquisition device, and the conditions under which the image was obtained. The nature of the noise (usually its spectral characteristics) is sometimes known, as is knowledge about the objects being searched for in the image; if knowledge about objects is not available in advance, it can be estimated during the processing.

Pixel brightness transformations

Brightness transformations modify pixel brightness; the transformation depends on the properties of the pixel itself. There are two classes: position dependent brightness corrections and gray-scale transformations. Position dependent brightness correction considers the original brightness and the pixel position in the image. Gray-scale transformations change brightness without regard to position in the image.

Geometric transformations

Geometric transforms permit the elimination of the geometric distortion that occurs when an image is captured. An example is an attempt to match remotely sensed images of the same area taken one year apart, when the more recent image was not taken from precisely the same position. To inspect changes over the year, it is necessary first to execute a geometric transformation and then subtract one image from the other. A geometric transform is a vector function T that maps the pixel (x,y) to a new position (x',y').

Figure 3.1: Geometric transform on a plane

The transformation equations are either known in advance or can be determined from known original and transformed images.
Several pixels in both images with known correspondence are used to derive the unknown transformation. A geometric transform consists of two basic steps:

1. Determining the pixel co-ordinate transformation, i.e., mapping the co-ordinates of the input image pixel to a point in the output image. The output point co-ordinates should be computed as continuous values (real numbers), as the position does not necessarily match the digital grid after the transform.
2. Finding the point in the digital raster which matches the transformed point and determining its brightness. Brightness is usually computed as an interpolation of the brightnesses of several points in the neighborhood.

Typical geometric distortions which have to be overcome in remote sensing are distortion of the optical system, nonlinearities in row-by-row scanning, and a non-constant sampling period.

Pixel co-ordinate transformations

In the general case of finding the co-ordinates of a point in the output image after a geometric transform, the transform is usually approximated by a polynomial equation:

\[ x' = \sum_{r=0}^{m} \sum_{k=0}^{m-r} a_{rk} x^r y^k, \qquad y' = \sum_{r=0}^{m} \sum_{k=0}^{m-r} b_{rk} x^r y^k. \]

This transform is linear with respect to the coefficients a_{rk}, b_{rk}. Hence, if enough pairs of corresponding points (x,y), (x',y') in both images are known, it is possible to determine a_{rk}, b_{rk} by solving a set of linear equations. More points than coefficients are usually used to improve the estimate of the coefficients. If the geometric transform does not change rapidly depending on position in the image, low order approximating polynomials, m = 2 or m = 3, are used, needing at least 6 or 10 pairs of corresponding points. The corresponding points should be distributed in the image in a way that can express the geometric transformation; usually they are spread uniformly. The higher the degree of the approximating polynomial, the more sensitive the geometric transform is to the distribution of the pairs of corresponding points.

In practice, the geometric transform is often approximated by the bilinear transformation, for which 4 pairs of corresponding points are sufficient to find the transformation coefficients:

\[ x' = a_0 + a_1 x + a_2 y + a_3 x y, \qquad y' = b_0 + b_1 x + b_2 y + b_3 x y. \]

Even simpler is the affine transformation, for which three pairs of corresponding points are sufficient to find the coefficients:

\[ x' = a_0 + a_1 x + a_2 y, \qquad y' = b_0 + b_1 x + b_2 y. \]

Brightness interpolation

Assume that the planar transformation has been accomplished and new point co-ordinates (x',y') have been obtained. The position of the point does not in general fit the discrete raster of the output image, but values on the integer grid are needed; each pixel value in the output image raster can be obtained by interpolating some neighboring non-integer samples. The brightness interpolation problem is usually expressed in a dual way, by determining the brightness of the original point in the input image that corresponds to the point in the output image lying on the discrete raster.
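A minimal sketch of these two steps, assuming an affine transform whose coefficients are estimated from control-point pairs by least squares, followed by bilinear brightness interpolation (the control points and the image below are synthetic, chosen only for illustration):

    import numpy as np

    def fit_affine(src, dst):
        # Least-squares estimate of the affine coefficients a0..a2, b0..b2
        # from pairs of corresponding points (x, y) -> (x', y').
        A = np.column_stack([np.ones(len(src)), src[:, 0], src[:, 1]])
        a = np.linalg.lstsq(A, dst[:, 0], rcond=None)[0]
        b = np.linalg.lstsq(A, dst[:, 1], rcond=None)[0]
        return a, b

    def bilinear(img, x, y):
        # Brightness at a non-integer position, interpolated from the
        # four surrounding grid points.
        i, j = int(np.floor(x)), int(np.floor(y))
        u, v = x - i, y - j
        return ((1 - u) * (1 - v) * img[i, j] + u * (1 - v) * img[i + 1, j]
                + (1 - u) * v * img[i, j + 1] + u * v * img[i + 1, j + 1])

    src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    dst = src + 0.5                       # synthetic control points: a pure shift
    a, b = fit_affine(src, dst)           # recovers a = [0.5, 1, 0], b = [0.5, 0, 1]
    img = np.arange(16.0).reshape(4, 4)   # tiny synthetic image
    print(a, b, bilinear(img, 1.5, 2.5))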
Local pre-processing

Local pre-processing methods use a small neighborhood of a pixel in an input image to produce a new brightness value in the output image; such pre-processing operations are also called filtration. Local pre-processing methods can be divided into two groups according to the goal of the processing. Smoothing suppresses noise or other small fluctuations in the image and is equivalent to the suppression of high frequencies in the frequency domain; unfortunately, smoothing also blurs the sharp edges that carry important information about the image. Gradient operators are based on local derivatives of the image function. Derivatives are larger at locations where the image function undergoes rapid changes, and the aim of gradient operators is to indicate such locations in the image. Gradient operators suppress low frequencies in the frequency domain (i.e., they act as high-pass filters). Noise is often high-frequency in nature; unfortunately, if a gradient operator is applied to an image, the noise level increases simultaneously. Clearly, smoothing and gradient operators have conflicting aims; some preprocessing algorithms solve this problem and permit smoothing and edge enhancement simultaneously.

Averaging using a rotating mask

Averaging with a rotating mask avoids edge blurring by searching for the homogeneous part of the current pixel's neighborhood; the resulting image is in fact sharpened. The brightness average is calculated only within this homogeneous region, and a brightness dispersion σ² is used as the region homogeneity measure. Let n be the number of pixels in a region R and g(i,j) be the input image. The dispersion σ² is calculated as

\[ \sigma^2 = \frac{1}{n} \sum_{(i,j) \in R} \left( g(i,j) - \frac{1}{n} \sum_{(i,j) \in R} g(i,j) \right)^{2} . \]

The computational complexity (number of multiplications) of the dispersion calculation can be reduced if it is expressed as follows:

\[ \sigma^2 = \frac{1}{n} \left[ \sum_{(i,j) \in R} g^2(i,j) - \frac{1}{n} \left( \sum_{(i,j) \in R} g(i,j) \right)^{2} \right] . \]

Median smoothing

In a set of ordered values, the median is the central value. Median filtering reduces the blurring of edges: the idea is to replace the current point in the image by the median of the brightnesses in its neighborhood. The advantages of median filtering are that it is not affected by individual noise spikes, eliminates impulsive noise quite well, does not blur edges much, and can be applied iteratively. The main disadvantage of median filtering in a rectangular neighborhood is its damaging of thin lines and sharp corners in the image; this can be avoided if another shape of neighborhood is used.

Edge detectors

Edges are pixels where the brightness changes abruptly. Calculus describes changes of continuous functions using derivatives; since an image function depends on two variables, partial derivatives are used. A change of the image function can be described by a gradient that points in the direction of the largest growth of the image function. An edge is a property attached to an individual pixel and is calculated from the image function behavior in a neighborhood of the pixel. It is a vector variable with magnitude and direction. The gradient magnitude and gradient direction are continuous image functions:

\[ \left| \operatorname{grad} g(x,y) \right| = \sqrt{ \left( \frac{\partial g}{\partial x} \right)^{2} + \left( \frac{\partial g}{\partial y} \right)^{2} }, \qquad \psi = \arg\!\left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right), \]

where arg(x,y) is the angle (in radians) from the x-axis to the point (x,y). The gradient direction gives the direction of maximal growth of the function, e.g., from black (f(i,j) = 0) to white (f(i,j) = 255). There are three classes of gradient operators:

1. Operators which approximate derivatives of the image using differences, such as the Laplacian, Roberts or Prewitt operators.
2. Operators based on zero-crossings of the second derivative of the image function, such as the Marr-Hildreth or Canny edge detectors.
3. Operators which match the image to a parametric model of the edges.

Laplace operator

The Laplace operator is a very popular operator approximating the second derivative; it gives the gradient magnitude only. The Laplacian is approximated in digital images by a convolution sum, using a 3 x 3 mask for the 4-neighborhood or the 8-neighborhood (a worked example follows at the end of this subsection):

\[ h_4 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad h_8 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \]

The Laplacian operator has a disadvantage: it responds doubly to some edges in the image.

Canny edge detection

The Canny detector is optimal for step edges corrupted by white noise, where optimality is related to three criteria:

1. Detection criterion: important edges should not be missed, and there should be no spurious responses.
2. Localization criterion: the distance between the actual and located position of the edge should be minimal.
3. One response criterion: multiple responses to a single edge are minimized (partly covered by the first criterion, since when there are two responses to a single edge one of them should be considered false).
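To make the convolution-mask idea concrete, here is a small sketch applying the 4-neighborhood Laplacian mask shown above to a synthetic step-edge image (plain Python/NumPy; the mask is symmetric, so correlation and convolution coincide):

    import numpy as np

    # 3x3 Laplacian mask for the 4-neighborhood.
    h4 = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]])

    def convolve3x3(img, mask):
        # Convolution sum over every interior pixel (borders left at zero).
        out = np.zeros_like(img)
        for i in range(1, img.shape[0] - 1):
            for j in range(1, img.shape[1] - 1):
                out[i, j] = np.sum(img[i-1:i+2, j-1:j+2] * mask)
        return out

    img = np.zeros((5, 6), dtype=int)
    img[:, 3:] = 255                     # vertical step edge
    print(convolve3x3(img, h4))          # double response on both sides of the edge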
Other local pre-processing operators

Several other local operations exist which are used for different purposes: line finding, line thinning, line filling and interest point operators are among them.

Adaptive neighborhood pre-processing

The majority of pre-processing operators work in neighborhoods of fixed size across the whole image, of which square windows (3x3, 5x5 or 7x7) are the most common. Pre-processing operators of variable size and shape also exist and bring improved pre-processing results. They are based on the detection of the most homogeneous neighborhood of each pixel; however, they are not widely used, mostly because of their computational demands and the lack of a unifying approach. A recent approach to image pre-processing introduces the concept of an adaptive neighborhood, which is determined for each image pixel. The neighborhood size and shape depend on the image data and on parameters which define measures of homogeneity of a pixel neighborhood. A significant property of the neighborhood of each pixel is its ability to self-tune to contextual details in the image.

Image restoration

Image restoration is the suppression of image degradation using knowledge about its nature. Most image restoration methods are based on convolution applied globally to the whole image. Causes of degradation include:

- defects of optical lenses,
- nonlinearity of the electro-optical sensor,
- graininess of the film material,
- relative motion between an object and the camera,
- wrong focus,
- atmospheric turbulence in remote sensing or astronomy.

The objective of image restoration is to reconstruct the original image from its degraded version. Image restoration techniques fall into two groups. Deterministic methods are applicable to images with little noise and a known degradation function; the original image is obtained from the degraded one by a transformation inverse to the degradation. Stochastic techniques seek the best restoration according to some stochastic criterion, e.g., a least squares method. In some cases the degradation transformation must be estimated first.

3.2 Median Filters

Of the many preprocessing techniques discussed above, the preprocessing in this work is done by removing noise using median filters. The best-known order-statistic filter is the median filter [6], which, as its name implies, replaces the value of a pixel by the median of the intensity levels in the neighborhood of that pixel:

\[ \hat{f}(x,y) = \underset{(s,t) \in S_{xy}}{\operatorname{median}} \{ g(s,t) \}, \]

where S_{xy} denotes the neighborhood window centered at (x,y). The value of the pixel at (x,y) is included in the computation of the median. Median filters are quite popular because, for certain types of random noise, they provide excellent noise-reduction capabilities, with considerably less blurring than linear smoothing filters of similar size. Median filters are particularly effective in the presence of both bipolar and unipolar impulse noise. Median filtering [19] is a nonlinear process useful in reducing impulsive, or salt-and-pepper, noise. It is also useful in preserving edges in an image while reducing random noise; impulsive or salt-and-pepper noise can occur due to a random bit error in a communication channel. In a median filter, a window slides along the image, and the median intensity value of the pixels within the window becomes the output intensity of the pixel being processed.
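A minimal sketch of this sliding-window operation in plain Python/NumPy (window size 3x3; border pixels are simply skipped, which is one of several possible edge-handling conventions):

    import numpy as np

    def median_filter(img, size=3):
        # Slide a size x size window over the image; the median of the
        # intensities inside the window becomes the output pixel value.
        k = size // 2
        out = img.copy()
        for i in range(k, img.shape[0] - k):
            for j in range(k, img.shape[1] - k):
                out[i, j] = np.median(img[i-k:i+k+1, j-k:j+k+1])
        return out

    # Flat synthetic image corrupted by one salt-and-pepper "shot" pixel.
    img = np.full((5, 5), 100, dtype=np.uint8)
    img[2, 2] = 255
    print(median_filter(img))    # the impulse is replaced by the local median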
Like lowpass filtering, median filtering smooths the image and is thus useful in reducing noise. Unlike lowpass filtering, median filtering can preserve discontinuities in a step function and can smooth a few pixels whose values differ significantly from their surroundings without affecting the other pixels. An important parameter in using a median filter is the size of the window, and the choice of window size depends on the context. Because it is difficult to choose the optimum window size in advance, it may be useful to try several median filters of different window sizes and choose the best of the resulting images.

In median filtering [20], the neighboring pixels are ranked according to brightness (intensity) and the median value becomes the new value for the central pixel. It can do an excellent job of rejecting certain types of noise, in particular "shot" or impulse noise, in which some individual pixels have extreme values. In its operation, the pixel values in the neighborhood window are ranked according to intensity, and the middle value (the median) becomes the output value for the pixel under evaluation.

Figure 3.2 Median Filtering Operations

In particular, compared to the smoothing filters examined thus far, median filters offer three advantages:
- There is no reduction in contrast across steps, since the output values available consist only of those present in the neighborhood (no averages).
- Median filtering does not shift boundaries, as can happen with conventional smoothing filters (a contrast-dependent problem).
- Since the median is less sensitive than the mean to extreme values (outliers), those extreme values are more effectively removed.
The median is, in a sense, a more robust "average" than the mean, as it is not affected by outliers (extreme values). Since the output pixel value is one of the neighboring values, new "unrealistic" values are not created near edges. Since edges are minimally degraded, median filters can be applied repeatedly, if necessary.

Advantages
- It is simple to understand.
- The median filter preserves brightness differences, resulting in minimal blurring of regional boundaries.
- It preserves the positions of boundaries in an image, making this method useful for visual examination and measurement.
- The median computation can be customized in software.
- The median filter can be applied repeatedly to the image until there are no further changes; in this way it behaves like a maximum-expectation restoration.

Chapter 4
Land Cover Classification using GA based Fuzzy Clustering Techniques for Remotely Sensed Data

4.1 Introduction
A digital remotely sensed image is typically composed of picture elements (pixels) located at the intersection of each row i and column j in each of the K bands of imagery. Associated with each pixel is a number known as the Digital Number (DN) or Brightness Value (BV) that depicts the average radiance of a relatively small area within a scene. A smaller number indicates low average radiance from the area, and a high number is an indicator of high radiant properties of the area. The size of this area affects the reproduction of detail within the scene: as pixel size is reduced, more scene detail is presented in the digital representation. When the different bands of a multispectral data set are displayed in image planes other than their own, the resulting color composite is regarded as a False Color Composite (FCC). High spectral resolution is important when producing color composites.
For a true color composite, the image data used in the red, green and blue spectral regions must be assigned to the red, green and blue bits of the image processor frame buffer memory. A color infrared composite, the 'standard false color composite', is displayed by placing the infrared, red and green bands in the red, green and blue frame buffer memory, respectively. In this composite, healthy vegetation shows up in shades of red, because vegetation absorbs most of the green and red energy but reflects approximately half of the incident infrared energy. Urban areas reflect equal portions of NIR, R and G, and therefore they appear steel grey.

Geometric distortions manifest themselves as errors in the position of a pixel relative to other pixels in the scene and with respect to its absolute position within some defined map projection. If left uncorrected, these geometric distortions render any data extracted from the image useless. This is particularly so if the information is to be compared to other data sets, be it from another image or from a GIS data set. Distortions occur for many reasons. Rectification is the process of geometrically correcting an image so that it can be represented on a planar surface and conform to other images or to a map; that is, it is the process by which the geometry of an image is made planimetric. It is necessary whenever accurate area, distance and direction measurements are required to be made from the imagery. It is achieved by transforming the data from one grid system into another using a geometric transformation. Ground Control Points (GCPs) are specific pixels in the input image for which the output map coordinates are known. By using more points than are necessary to solve the transformation equations, a least squares solution may be found that minimizes the sum of the squares of the errors. Care should be exercised when selecting ground control points, as their number, quality and distribution affect the result of the rectification.

Remote sensing can be defined as any process whereby information is gathered about an object, area or phenomenon without being in contact with it. The output of a remote sensing system is usually an image representing the scene being observed. Image classification is an important part of remote sensing, image analysis and pattern recognition. In some instances, the classification itself may be the object of the analysis. Image classification therefore forms an important tool for the examination of digital images. Classification strategies are basically divided into three types. Supervised classification techniques require training areas to be defined by the analyst in order to determine the characteristics of each category. Unsupervised classification searches for natural groups of pixels, called clusters, present within the data by means of assessing the relative locations of the pixels in the feature space. Hybrid classification takes advantage of both supervised and unsupervised classification.

4.2 Cluster Analysis
Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields including machine learning, pattern recognition, image analysis, information retrieval and bioinformatics.
Cluster analysis itself is not one specific algorithm but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and in how to find clusters efficiently. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals, or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and the intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and error. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Cluster analysis classifies samples according to their similarity by means of unsupervised training: samples with greater mutual similarity are grouped into a class, and each class occupies a partial region of the feature space. The cluster center of each partial region acts as a representative of the corresponding type. There are various types of clustering; one of them is partitional clustering, on which many algorithms are based. One of the most familiar is the Fuzzy C-means clustering algorithm.

4.3 Median Filters
The median filter is a nonlinear digital filtering technique, often used to remove noise. Such noise reduction is a typical pre-processing step to improve the results of later processing. Median filtering is very widely used in digital image processing because, under certain conditions, it preserves edges while removing noise. The main idea of the median filter is to run through the signal entry by entry, replacing each entry with the median of neighboring entries. The pattern of neighbors is called the "window", which slides, entry by entry, over the entire signal. For a 1-D signal, the most obvious window is just the first few preceding and following entries, whereas for 2-D (or higher-dimensional) signals such as images, more complex window patterns are possible (such as "box" or "cross" patterns). Note that if the window has an odd number of entries, then the median is simple to define: it is just the middle value after all the entries in the window are sorted numerically.

Given a set of random variables X = (X1, X2, …, XN), the order statistics X(1) ≤ X(2) ≤ … ≤ X(N) are random variables defined by sorting the values of Xi in increasing order. For a window of odd length N = 2K+1, the median value is then given as

X_med = X(K+1)

where K+1 is the median rank. The median is considered to be a robust estimator of the location parameter of a distribution and has found numerous applications in smoothing and denoising, especially for signals contaminated by impulsive noise. For a grayscale input image with intensity values x_{i,j}, the two-dimensional median filter is defined as

y_{i,j} = median{ x_{i+r,j+s} : (r,s) ∈ W }

where W is the window over which the filter is applied.

Median Filter Algorithm
The median filter is performed by taking the magnitudes of all of the vectors within a mask and sorting them according to magnitude. The pixel with the median magnitude is then used to replace the pixel studied.
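The magnitude-ordering step described above can be sketched for color (vector-valued) pixels as follows; the helper name vector_median and the flattened-window representation are illustrative assumptions, not the implementation used in this study:

import numpy as np

def vector_median(window_pixels):
    # window_pixels: array of shape (n, 3) holding the RGB vectors inside the mask.
    norms = np.linalg.norm(window_pixels, axis=1)  # ||x_i||_2 for every pixel vector
    order = np.argsort(norms)                      # rank the vectors by magnitude
    return window_pixels[order[len(order) // 2]]   # vector with the median magnitude

# Example: the studied pixel is replaced by one of the nine RGB vectors in a 3x3 mask.
mask = np.random.randint(0, 256, size=(9, 3))
print(vector_median(mask))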
The Simple Median Filter has an advantage over the Mean filter in that the median of the data is taken instead of the mean, and the pixel with the median magnitude is used to replace the pixel studied; the median of a set is more robust with respect to the presence of noise. The median filter is given by

MedianFilter(x1, …, xN) = the x_k such that ||x_k||_2 = median( ||x1||_2, …, ||xN||_2 )

When filtering using the Simple Median Filter, an original pixel and the resulting filtered pixel may have the same value. A pixel that does not change due to filtering is known as the root of the mask.

4.4 Fuzzy C-Means
The concept of a fuzzy partition is essential for cluster analysis, and consequently also for the identification techniques that are based on fuzzy clustering. Fuzzy and possibilistic partitions can be seen as generalizations of the hard partition, which is formulated in terms of classical subsets. Generalization of the hard partition to the fuzzy case follows directly by allowing μ_ik to attain real values in [0, 1]. Conditions for a fuzzy partition matrix are given by (Ruspini, 1970):

μ_ik ∈ [0, 1],  1 ≤ i ≤ c,  1 ≤ k ≤ N;
Σ_{i=1}^{c} μ_ik = 1,  1 ≤ k ≤ N;
0 < Σ_{k=1}^{N} μ_ik < N,  1 ≤ i ≤ c.

The i-th row of the fuzzy partition matrix U contains the values of the i-th membership function of the fuzzy subset Ai of Z. The second equation constrains the sum of each column to 1, and thus the total membership of each z_k in Z equals one. The fuzzy partitioning space for Z is the set

M_fc = { U ∈ R^{c×N} | μ_ik ∈ [0,1], ∀i,k;  Σ_{i=1}^{c} μ_ik = 1, ∀k;  0 < Σ_{k=1}^{N} μ_ik < N, ∀i }

Most analytical fuzzy clustering algorithms (and also all the algorithms presented in this chapter) are based on optimization of the basic c-means objective function, or some modification of it. Hence we start our discussion by presenting the fuzzy c-means functional. A large family of fuzzy clustering algorithms is based on minimization of the fuzzy c-means functional formulated as (Dunn, 1974; Bezdek, 1981):

J(Z; U, V) = Σ_{i=1}^{c} Σ_{k=1}^{N} (μ_ik)^m ||z_k − v_i||²_A

where U = [μ_ik] ∈ M_fc is a fuzzy partition matrix of Z; V = (v1, v2, …, vc), vi ∈ R^n, is a vector of cluster prototypes (centers) which have to be determined; ||z_k − v_i||²_A = (z_k − v_i)^T A (z_k − v_i) is a squared inner-product distance norm; and m ∈ (1, ∞) is a parameter which determines the fuzziness of the resulting clusters. The value of the cost function can be seen as a measure of the total variance of z_k from v_i. The algorithm steps are as follows:
1. Select m (m > 1); initialize the membership function values μij, i = 1, 2, …, n; j = 1, 2, …, c.
2. Compute the cluster centers zj, j = 1, 2, …, c.
3. Compute the Euclidean distances dij, i = 1, 2, …, n; j = 1, 2, …, c.
4. Update the membership functions μij, i = 1, 2, …, n; j = 1, 2, …, c.
5. If not converged, go to step 2.
Several stopping rules can be used. One is to terminate the algorithm when the relative change in the centroid values becomes small or when the objective function can no longer be reduced. The FCM algorithm is sensitive to initial values, and it is likely to fall into local optima.

4.5 Genetic Algorithm
The American scholar J. Holland first proposed the Genetic Algorithm (GA) concept in 1975. It is based on "survival of the fittest" in Darwin's theory of evolution. The basic genetic operations are applied repeatedly to populations that may contain a solution, generating new populations that evolve continually; at the same time, the best individuals are sought through a globally parallel search so as to obtain a global optimum that satisfies the requirements. A GA generates valuable solutions for hard optimization problems using techniques that are inspired by natural evolutionary operators such as inheritance, mutation, selection and crossover.
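Before turning to the genetic operators in detail, the FCM update loop listed above can be made concrete with a minimal NumPy sketch; the random initialization, the Euclidean (A = I) distance and the stopping threshold are assumptions of the sketch rather than prescriptions of this report.

import numpy as np

def fcm(Z, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    # Z: (N, n) data matrix; c: number of clusters; m > 1: fuzziness exponent.
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    U = rng.random((c, N))
    U /= U.sum(axis=0)                                # each column sums to 1 (fuzzy partition)
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)  # cluster centers v_i
        D = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2)
        D = np.fmax(D, 1e-10)                         # guard against division by zero
        inv = D ** (-2.0 / (m - 1))                   # mu_ik proportional to d_ik^(-2/(m-1))
        U_new = inv / inv.sum(axis=0)
        if np.abs(U_new - U).max() < tol:             # stop on small membership change
            return U_new, V
        U = U_new
    return U, V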
In a genetic algorithm, a population of strings (called chromosomes, or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and proceeds in generations. In each generation, the fitness of every individual in the population is evaluated, and multiple individuals are stochastically selected from the current population (based on their fitness) and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached. A typical genetic algorithm requires:
- a genetic representation for all solutions (in the shape of a chromosome);
- a fitness function to assess the solutions.

Chromosomes
A chromosome represents a set of genes, which code the independent variables. Every chromosome represents a solution of the given problem. Individual and vector of variables will be used as other words for chromosome. A set of different chromosomes (individuals) forms a generation. By means of evolutionary operators like selection, recombination and mutation, an offspring population is created.

Selection
The selection of the best individuals is based on an evaluation of one or more fitness functions. The fitness value is a quality measure of each solution: better fitness values belong to better individuals in each population. When the termination criteria are satisfied, the algorithm has reached a better fitness value, and in the final generation a solution with the best fitness value among the others is taken as the desired solution.

Crossover
The first step in the reproduction process is recombination (crossover). In it, the genes of the parents are used to form an entirely new chromosome. The typical recombination for the GA is an operation requiring two parents, but schemes with more parents are also possible. Two of the most widely used algorithms are Conventional (Scattered) Crossover and Blending (Intermediate) Crossover.

Mutation
The population newly created by means of selection and crossover can further undergo mutation. Mutation means a random change of the value of a gene in the population. The overall procedure is:
- Generate the initial population.
- Calculate the values of the function that we want to minimize or maximize.
- Check for termination of the algorithm.
- Selection – from all individuals in the current population, those who will continue are chosen; by means of crossover and mutation they will produce the offspring population.
- Crossover – the individuals chosen by selection recombine with each other, and new individuals are created. The aim is to obtain offspring individuals that inherit the best possible combination of the characteristics (genes) of their parents.
- Mutation – by means of a random change of some of the genes, it is guaranteed that even if none of the individuals contains the gene value necessary for the extremum, it is still possible to reach the extremum.
- New generation – the best individuals chosen by selection are combined with those who passed crossover and mutation, and together they form the next generation.

Figure 4.1. The GA Flowchart

4.6 Proposed Work
Land cover is an important component in understanding the interactions of human activities with the environment, and it is thus necessary to be able to simulate changes. This work aims at developing a novel land cover clustering method using GA based clustering techniques. The proposed method has two phases: the first step computes a refined starting condition from a given initial one, based on an efficient technique for estimating the modes of a distribution. The refined initial starting condition allows the iterative algorithm to converge to a "better" local minimum. In the second step, a novel method is proposed to improve the cluster quality by a GA based refinement algorithm.

Figure 4.2: Methodology (Input Image → Pre-processing → Fuzzy C-means Clustering → GA based Tuning → Ground truth verification → Classified map)

Genetic algorithms (GAs) are randomized search and optimization techniques guided by the principles of evolution and natural genetics, and they possess a large amount of implicit parallelism. The basic reason for the refinement step is that, in any clustering algorithm, the obtained clusters will never give 100% quality: there will be some errors, known as misclusterings, where a data item is wrongly clustered. These kinds of errors can be reduced by using the improvement algorithm. Genetic algorithm based clustering alone may not be able to handle large amounts of data, and the Fuzzy C-means algorithm does not lend itself well to adaptive clustering. An important point is that, so far, researchers have not contributed to improving the cluster quality after grouping. In this proposed method, a new framework is introduced to improve the cluster quality obtained from the Fuzzy C-means algorithm. The proposed algorithm is applied to the remotely sensed data (Survey of India toposheets and IRS-1C satellite imageries) of the Theni region.

Preprocessing
Satellite images cannot be given directly as input to the proposed technique. It is therefore indispensable to perform pre-processing on the input image, so that the image is transformed into a form suitable for further processing. In the proposed technique, a median filter, which is a nonlinear filter, is applied to the R, G and B layers to filter noise. It is used because, under certain conditions, it preserves edges while removing noise.

GA Based Clustering
A major problem with the Fuzzy C-means algorithm is that it is sensitive to the selection of the initial partition and may converge to a local minimum of the objective if the initial partition is not properly chosen. In the proposed method, we therefore estimate the mode value as the initial partition. Performance optimization using the genetic algorithm follows a sequence of steps:
1. Generate the initial population.
2. Evaluate the population.
3. Selection.
4. Crossover.
5. Mutation.
6. Reinsertion of new individuals into the population.
Steps 2 to 6 are performed iteratively until a stopping criterion is met; Fig. 4.3 shows the scheme of the GA for optimization of the Fuzzy C-Means algorithm (FCM).
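This loop can be sketched schematically as follows, reusing the fcm function sketched in Section 4.4. Chromosomes hold the pair (number of clusters, weighting exponent); the validity_index shown here (the standard Xie-Beni index) is purely an illustrative stand-in for the validation index of the proposed method, and all GA settings are assumed values:

import random
import numpy as np

def validity_index(U, V, Z):
    # Xie-Beni index: compactness / separation; lower is better.
    compact = np.sum((U ** 2) * np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2) ** 2)
    sep = min(np.linalg.norm(vi - vj) ** 2
              for i, vi in enumerate(V) for j, vj in enumerate(V) if i < j)
    return compact / (len(Z) * sep)

def ga_tune_fcm(Z, pop_size=10, generations=20):
    # Each chromosome encodes (c, m): cluster count and weighting exponent.
    def fitness(g):
        U, V = fcm(Z, g[0], m=g[1])
        return validity_index(U, V, Z)
    pop = [(random.randint(2, 8), random.uniform(1.5, 3.0)) for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness)[:pop_size // 2]  # selection: fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = (a[0], b[1])                            # crossover: c from one parent, m from the other
            if random.random() < 0.1:                       # mutation: random change of one gene
                child = (random.randint(2, 8), child[1])
            children.append(child)
        pop = parents + children                            # reinsertion: next generation
    return min(pop, key=fitness)                            # best (c, m) found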
In this figure we can observe that the population evaluation is done by the FCM algorithm; however, FCM alone does not indicate the fitness of the individuals, so to measure the aptitude of the individuals evaluated by FCM we use the proposed validation index. The individuals evaluated by the FCM algorithm are formed by only two parameters: the number of clusters and the weighting exponent.

Figure 4.3. Fuzzy C-Means with GA

Results
The proposed algorithm is applied to the remotely sensed data (Survey of India toposheets and IRS-1C satellite imageries) of the Theni region. Two Theni region images are used to implement the proposed algorithm: the first is a 1152x1152 TIFF image and the second is a 1153x1153 TIFF image; both are color images. The original and the clustered images are shown.

Figure 4.4. Original & Clustered Theni Region Image 1
Figure 4.5. Original & Clustered Theni Region Image 2

Even though the visual comparison gives detailed information about the Fuzzy C-means clustering, an accuracy assessment has been done to further evaluate the performance of the proposed work. The confusion matrix in terms of pixels and in terms of percentages is given in Table 4.1 and Table 4.2. The overall classification accuracy is 96.04%.

Table 4.1. Confusion Matrix (Pixels)

Class          Urban   Vegetation   Hilly Region   Total
Urban           2239        21             4        2264
Vegetation        20      1680           100        1800
Hilly Region       7        90          1963        2060
Total           2266      1791          2067        6124

From the confusion matrix, it is clear that Urban yields the maximum classification accuracy of 98.81% when compared to Vegetation and Hilly Region.

Table 4.2. Confusion Matrix (Percentage)

Class          Urban   Vegetation   Hilly Region   Total
Urban          98.81       1.17          0.19       36.97
Vegetation      0.88      93.80          4.84       29.39
Hilly Region    0.31       5.03         94.97       33.64
Total         100.00     100.00        100.00      100.00

Table 4.3 gives the producer and user accuracy for the individual classes. By reducing the misclassification between the Vegetation and Hilly Region classes, the overall accuracy could be further improved.

Table 4.3. Accuracy Assessment

Class          Producer accuracy (%)   User accuracy (%)
Urban                  98.81                 98.90
Vegetation             93.80                 93.33
Hilly Region           94.97                 95.29

Chapter 5
Conclusion
Remote sensing gathers information about an object, area or phenomenon without being in contact with it, and its output is usually an image of the scene being observed; image classification, whether supervised, unsupervised or hybrid, is therefore an important tool for the examination of such digital images. Land cover is an important component in understanding the interactions of human activities with the environment, and it is thus necessary to be able to simulate changes. It is an essential component into which other parameters are integrated, on a requirement basis, to derive various developmental indices for earth resources.
Digital classification methods for remotely sensed images have acquired growing importance in the automatic recognition of land cover patterns. In particular, unsupervised classification methods have traditionally been considered an important approach for the interpretation of remotely sensed images. Unsupervised classification is frequently performed through clustering methods. These methods examine the unknown pixels in an image and incorporate them into a set of classes defined through the natural clusters of the gray levels of the pixels. Cluster analysis provides a practical method for organizing a large set of data so that the retrieval of information may be made more efficient. However, although there is a large number of different clustering methods in the pattern recognition area, only a limited number of them can be used in remote sensing applications. Two major limitations exist in these methods: the first is that a predefined number of clusters must be given in advance, and the second is that the FCM technique can get stuck in sub-optimal solutions. Recent attempts have adapted the K-means clustering algorithm, as well as genetic algorithms based on rough sets, to find interval sets of clusters. An important point is that, so far, researchers have not contributed to improving the cluster quality once the data are clustered. The present method enables the clustering to be performed by taking the initial centroid using the mode function, which allows the iterative algorithm to converge to a "better" local minimum; a GA based refinement algorithm then improves the cluster quality. The outcome of this work is the land use / land cover map of the study area, namely the Theni region, Tamil Nadu.

The proposed method has two phases: the first step computes a refined starting condition from a given initial one, based on an efficient technique for estimating the modes of a distribution; this refined initial starting condition allows the iterative algorithm to converge to a "better" local minimum. In the second step, a novel method has been proposed to improve the cluster quality by a GA based improvement algorithm. Even though the visual comparison gives detailed information about the Fuzzy C-means clustering, the accuracy assessment has been carried out to further evaluate the performance of the proposed work.

REFERENCES
[1]. Jiawei Han and M. Kamber, "Data Mining: Concepts and Techniques", Second Edition.
[2]. Two Crows Corporation, "Introduction to Data Mining and Knowledge Discovery", Third Edition.
[3]. Rajni Jain, "Introduction to Data Mining Techniques".
[4]. Pasi Franti, "Clustering Methods".
[5]. T. Soni Madhulatha, "An Overview on Clustering Methods", IOSR Journal of Engineering, Vol. 2(4), Apr. 2012, pp. 719-725.
[6]. Rafael C. Gonzalez and Richard E. Woods, "Digital Image Processing", Third Edition.
[7]. http://en.wikipedia.org/wiki/Image_processing
[8]. B. G. Obula Reddy and Maligela Ussenaiah, "Literature Survey on Clustering Techniques", IOSR Journal of Computer Engineering.
[9]. M. Kuchaki Rafsanjani, Z. Asghari Varzaneh and N. Emami Chukanlo, "A Survey of Hierarchical Clustering Algorithms", TJMCS, Vol. 5, No. 3, 2012, pp. 229-240.
[10]. P. Indira Priya and D. K. Ghosh, "A Survey on Different Clustering Algorithms in Data Mining Technique", International Journal of Modern Engineering Research, Vol. 3, Issue 1, Jan-Feb 2013, pp. 267-274.
[11]. Daniel T. Larose, "Discovering Knowledge in Data".
[12]. Periklis Andritsos, "Data Clustering Techniques".
[13]. Ilango and V. Mohan, "A Survey of Grid Based Clustering Algorithms", International Journal of Engineering Science & Technology, Vol. 2(8), 2010, pp. 3441-3446.
[14]. B. Eshwar Reddy, M. Veeresha and Nagaraja Ra, "Image Processing: A Survey".
[15]. https://www.mtholyoke.edu/courses/tmillett/course/geog205/files/remote_sensing.pdf
[16]. Melanie Mitchell, "An Introduction to Genetic Algorithms".
[17]. http://www.cs.cmu.edu/~02317/slides/lec_9.pdf
[18]. http://www.eng.iastate.edu/ee528/sonkamaterial/chapter_4.htm#4.1
[19]. http://nptel.iitk.ac.in/courses/Webcourse-contents/IITKANPUR/Digi_Img_Pro/chapter_8/8_16.html
[20]. http://medim.sth.kth.se/6l2872/F/F7-1.pdf
[21]. Xingping Wen and Xiaofeng Yang, "An Unsupervised Classification Method for Hyperspectral Remote Sensing Image Based on Spectral Data Mining", INTECH.
[22]. Caglar Ari et al., "Unsupervised Classification of Remotely Sensed Images Using Gaussian Mixture Models and Particle Swarm Optimization".
[23]. Mohd Hasmadi et al., "Evaluating Supervised and Unsupervised Techniques for Land Cover Mapping Using Remote Sensing Data", GEOGRAFIA, Malaysian Journal of Society and Space, Vol. 5, Issue 1, 2009, pp. 1-10, ISSN 2180-2491.
[24]. Xiong Liu, "Supervised Classification and Unsupervised Classification", ATS 670 Class Project.
[25]. Edwin Martínez Martínez, "Remote Sensing Techniques for Land Use Classification of Rio Jauca Watershed Using IKONOS Images".
[26]. Cheng-Yuan Liou et al., "Unsupervised Classification of Remote Sensing Imagery with Non-negative Matrix Factorization", National Science Council project NSC 93-2213-E-002-008.
[27]. Y. Babykalpana et al., "Supervised/Unsupervised Classification of LULC Using Remotely Sensed Data for Coimbatore City, India", International Journal of Computer Applications (0975-8887), Vol. 2, No. 7, June 2010.
[28]. Ratika Pradhan et al., "Land Cover Classification of Remotely Sensed Satellite Data Using Bayesian and Hybrid Classifier", International Journal of Computer Applications (0975-8887), Vol. 7, No. 11, October 2010.
[29]. Luis Samaniego et al., "Supervised Classification of Agricultural Land Cover Using a Modified k-NN Technique (MNN) and Landsat Remote Sensing Imagery", Remote Sensing, Vol. 1, 2009, pp. 875-895, ISSN 2072-4292.
[30]. L. Gomez-Chova et al., "Semi-Supervised Classification Method for Hyperspectral Remote Sensing Images", Information Society Technologies (IST) programme of the European Community.
[31]. Benqin Song et al., "Remotely Sensed Image Classification Using Sparse Representations of APs", IEEE Transactions on Geoscience and Remote Sensing, Vol. 52, No. 8, August 2014.
[32]. Ayşe Naz Erkan et al., "Semi-Supervised Remote Sensing Image Classification via Maximum Entropy", Spanish Ministry of Innovation and Science projects.
[33]. Zhang Ming et al., "Application of Satellite Remote Sensing to Soil and Land Use Mapping in the Rolling Hilly Areas of Nanjing, Eastern China", EARSeL Advances in Remote Sensing, Vol. 2, No. 3, 1993.
[34]. X. Zhang et al., "Genetic Algorithm Optimized Neural Network Classification and Overlapping Pixel Change Detection Based on Remote Sensing for Urban Sprawl: A Case Study in Jiading District of Shanghai, China", The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. XXXVII, Part B7, Beijing, 2008.
[35]. Jefersson Alex dos Santos et al., "Interactive Classification of Remote Sensing Images by Using Optimum-Path Forest and Genetic Programming".
[36]. K. Moje Ravindra et al., "Classification of Satellite Images Based on SVM Classifier Using Genetic Algorithm", International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, Vol. 2, Issue 5, May 2014.