APPENDIX A: Self-Organizing Map

The Self-Organizing Map (SOM) is a pattern-recognition technique, or cluster algorithm, based on an unsupervised competitive learning process. The output neurons of the network compete among themselves to be activated or fired, with the result that only one output neuron, known as the 'winner' neuron or node, wins the competition. It is an effective method for feature extraction and classification and, within limits, can be used to identify as many distinct patterns as we wish. SOM is characterized by the formation of a topographic map of the input patterns, in which the spatial locations (coordinates) of the neurons in the lattice indicate the intrinsic statistical features contained in the input patterns, thereby facilitating data compression and easy visualization. Mathematically, this is a topology-conserving projection from an original higher-dimensional data space onto a lower-dimensional lattice [Haykin, 1999]. The main advantage of SOM is that it requires neither prior knowledge of the data domain nor human intervention. The method is similar to standard iterative clustering algorithms such as k-means clustering [Gutierrez et al., 2004]. The SOM algorithm has been used extensively in different disciplines of science, including meteorology [Cavazos, 1999; Hewitson and Crane, 2002; Gutierrez et al., 2005; Ambroise et al., 2000; Leloup et al., 2007; Tozuka et al., 2008; CSG08; Morioka et al., 2010; Joseph et al., 2011; among others]. The technique differs from other statistical analyses in that it uses the minimum Euclidean distance between the reference vector associated with a node and the input data vector to cluster the data at each node. The main emphasis of the SOM algorithm is on placing similar vectors close to one another in a low-dimensional map: the greater the distance between any two reference vectors, the more different the two nodes, and so the patterns associated with them. Heskes and Kappen [1995] showed that, in a way, the SOM technique is analogous to (but more complex than) nonparametric regression. We have used the Kohonen model [Kohonen, 1990] of SOM in this study; its implementation resembles the vector-quantization method. Given an N-dimensional (ND) data space consisting of a cloud of data points (input variables), the SOM algorithm distributes an arbitrary number of nodes (or cluster centres) in the form of a 1D or 2D regular lattice, in such a way that each node is uniquely defined by a reference vector (or code vector) consisting of weights, each weight being associated with a particular input variable. SOM adjusts the reference vectors to the ND data cloud through a user-defined iterative cycle that minimizes the Euclidean distance between the reference vector of any $j$-th node, $W_j$, and the input data vector, $X$. Mathematically,

$$ \| X - W_c \| = \min_j \| X - W_j \|, $$

where $W_c$ is the reference vector closest to $X$. As mentioned earlier, for a particular data record only one node wins, and the winner node locates the centre of the topological neighbourhood of the cooperating nodes. An optimal mapping is one in which the winner node also adjusts the neighbouring nodes, as defined by the user. This inclusion of the neighbourhood makes the SOM classification nonlinear, since each node has to be adjusted relative to its neighbours.
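To make the winner-selection step concrete, here is a minimal sketch in Python/NumPy (our own illustration, not code from the cited SOM software; the function name and array shapes are assumptions):

```python
import numpy as np

def find_winner(W, x):
    """Index of the winning node for input vector x.

    W : (num_nodes, num_inputs) array of reference (code) vectors,
        one row of weights per lattice node (lattice flattened to 1D).
    x : (num_inputs,) input data vector.
    """
    # Euclidean distance between x and every reference vector
    distances = np.linalg.norm(W - x, axis=1)
    # The winner W_c is the node at minimum distance to x
    return int(np.argmin(distances))
```

The winner then defines the centre of the topological neighbourhood within which the neighbouring reference vectors are also updated, as described next.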
This training cycle may be continued $n$ times and may be described mathematically as:

$$
W_j(n+1) =
\begin{cases}
W_j(n) + c(n)\,[\,x(n) - W_j(n)\,], & j \in R_j(n) \\
W_j(n), & \text{otherwise}
\end{cases}
\qquad \text{(A1)}
$$

Here $W_j(n)$ is the reference vector of the $j$-th node at the $n$-th training cycle, $x(n)$ is the input vector, $R_j(n)$ is the predefined neighbourhood around node $j$, and $c(n)$ is the neighbourhood kernel that defines the neighbourhood. The kernel attains its maximum at the winning node, and its amplitude decreases monotonically with increasing lateral distance, decaying to zero as the lateral distance tends to infinity; this is a necessary condition for convergence. A Gaussian kernel satisfies these requirements and produces a smoother neighbourhood, and hence in this study we have used the Gaussian neighbourhood:

$$
c(n) = \alpha(n)\,\exp\!\left( -\frac{\| r_i - r_j \|^2}{2\,\sigma^2(n)} \right)
\qquad \text{(A2)}
$$

where $\alpha(n)$ and $\sigma(n)$ are constants that decrease monotonically with $n$: $\alpha(n)$ is the learning rate, which determines the 'velocity' of the learning process, and $\sigma(n)$ is the amplitude that determines the width of the neighbourhood kernel. $r_j$ and $r_i$ are the coordinates of the nodes $j$ and $i$ over which the neighbourhood kernel is defined. Free SOM software is available at http://www.cis.hut.fi/research/somresearch/. The SOM reference vectors span the data space, and each node represents a position that approximates the mean of the nearby samples in the data space. Another important advantage of SOM is that a smaller (larger) number of SOM nodes can be allocated for a sparse (dense) dataset [Hewitson and Crane, 2002]. Hewitson and Crane [2002] demonstrated the advantages of SOM using a simple artificial 2D data set, CSG08 explained the same for a 2D meteorological data set, and Joseph et al. [2011] used SOM for a 3D atmospheric dataset. The experimental details and the schematic figure of SOM (Fig. 1) are given in Section 2.1.
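For concreteness, the following minimal Python/NumPy sketch (again our own illustration, with assumed names and array shapes) combines winner selection with the update rule (A1) and the Gaussian kernel (A2). With a Gaussian kernel the predefined neighbourhood $R_j(n)$ is implemented implicitly, since $c(n)$ decays towards zero away from the winner:

```python
import numpy as np

def som_training_cycle(W, coords, x, alpha, sigma):
    """One SOM training cycle, cf. Eqs. (A1)-(A2).

    W      : (num_nodes, num_inputs) reference vectors W_j(n).
    coords : (num_nodes, 2) lattice coordinates r_j of each node.
    x      : (num_inputs,) input data vector x(n).
    alpha  : learning rate alpha(n), decreasing with n.
    sigma  : kernel width sigma(n), decreasing with n.
    """
    # Winner node: minimum Euclidean distance to x
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    # Gaussian neighbourhood kernel c(n), Eq. (A2): maximal at the
    # winner, decaying monotonically with lateral lattice distance
    lateral_sq = np.sum((coords - coords[winner]) ** 2, axis=1)
    c = alpha * np.exp(-lateral_sq / (2.0 * sigma ** 2))
    # Update rule, Eq. (A1): move each reference vector towards x,
    # weighted by the neighbourhood kernel (which here plays the
    # role of the neighbourhood R_j(n))
    return W + c[:, None] * (x - W)
```

Repeating this cycle over the data records, while alpha and sigma decrease monotonically, adjusts the reference vectors to the ND data cloud as described above.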