International Journal of Science, Engineering and Technology Research (IJSETR)

Application of Cluster and Factor Analysis in Groundwater Quality Monitoring – A Case Study

S. Krishnaraj, Sanjiv Kumar and K. P. Elango

Abstract: In this case study, chemometric techniques such as factor analysis and cluster analysis were applied to the assessment and monitoring of groundwater quality. These techniques were employed for better interpretation of a large, complex water quality data set collected over four seasons from twenty-five groundwater locations in K.Paramathi block, Karur district, Tamil Nadu, during the year 2012. The water samples were characterized for physico-chemical parameters such as temperature, pH, total alkalinity, electrical conductivity, total hardness, calcium ions, magnesium ions, total dissolved solids, fluorides, chlorides and sulphates. Hierarchical cluster analysis grouped the twenty-five sampling stations into seven clusters, which were in turn classified into three categories, less polluted (LP), moderately polluted (MP) and highly polluted (HP) sites, based on the similarity of water quality characteristics and pollution load. Factor analysis indicated four factors initially; after rotation of the factor axes it yielded two factors, with high loadings for some variables and low loadings for others, which facilitated data interpretation in terms of the original variables. The calculation of factor scores facilitated the identification of sampling sites with high pollution load. Together, the two analyses confirm the sampling sites with high pollution load and the marker variables to be considered for treatment. Thus, this study demonstrates the usefulness of chemometric techniques for effective groundwater quality management.

Index Terms: cluster analysis, factor analysis, factor score and marker variable.

I. INTRODUCTION

Groundwater quality parameters are controlled by many factors such as rainfall, composition of aquifer material, topography, hydrologic fluctuation and climate. These factors interact in a complex way and result in spatial and temporal variation in water quality parameters. The application of different chemometric techniques helps in the interpretation of complex data matrices to understand the water quality, and offers a valuable tool for reliable management of water resources as well as rapid solutions to problems [1,2]. Thus, the main objective of this study is to assess the groundwater quality and its suitability for drinking and domestic purposes using chemometric techniques. Chemometric techniques such as factor analysis (FA), cluster analysis (CA) and principal component analysis (PCA) are used for better assessment of water quality and for pollution source apportionment [3,4]. They are designed to reduce the number of variables to a small number of indices while attempting to preserve the relationships present in the original data.

With this background, an attempt was made to implement the factor analysis method in order to reduce the number of factors and identify practical pollution indicators in the study area. Furthermore, this study also intends to provide a basis for developing realistic tools that could help local decision-makers with suitable management of the groundwater in the area.

II. MATERIALS AND METHODS

2.1. Study area

The K.Paramathi block of Tamil Nadu, India is located at 10.95º N and 78.08º E with a mean elevation of 122 m. The average annual rainfall is about 855 mm. The area receives most of its seasonal rainfall from the north-east monsoon between late September and mid-November. Vast mineral deposits, availability of water and good infrastructure are conducive to industrialization, and this has resulted in heavy textile-based industrialization in the Amravati river basin of Karur. Many small, medium and large scale textile industries are situated in the region, and these establishments have adversely affected the groundwater quality. Besides, the increased population and improper drainage system in the study area have immense potential to influence the groundwater quality.

2.2. Sample collection and monitoring parameters

A total of 25 water quality monitoring stations were identified, and water samples were collected in the middle month of each of the four seasons, namely post-monsoon (January to March), summer (April to June), pre-monsoon (July to September) and monsoon (October to December), of the year 2012. The groundwater samples were analyzed for pH, electrical conductivity, total dissolved solids, total alkalinity, total hardness, Ca (II), Mg (II), Na, K, fluorides, sulphates and chlorides using standard protocols [5], and the quality of the data was ensured through careful standardization.

2.3. Multivariate statistical techniques

2.3.1. Factor Analysis

The main purpose of factor analysis (FA) is to reduce the contribution of less significant variables in order to simplify even further the data structure coming from PCA. This purpose can be achieved by rotating the axes defined by PCA, according to well established rules, and constructing new variables, also called varifactors (VF). A PC is a linear combination of observable water quality variables, whereas a VF can include unobservable, hypothetical, latent variables [7,8]. The FA is expressed as

$$ z_{ji} = a_{f1} f_{1i} + a_{f2} f_{2i} + \cdots + a_{fm} f_{mi} + e_{fi} $$

where z is the measured variable, a is the factor loading, f is the factor score, e is the residual term accounting for errors or other sources of variation, i is the sample number and m is the total number of factors.
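As a small numerical illustration of the factor model above, the following Python sketch evaluates z for one variable and one sample from m = 2 factors. The loadings, scores and residual are made-up values chosen only for illustration (the paper does not report individual loadings), and the function name `reconstruct` is hypothetical.

```python
def reconstruct(loadings, scores, residual=0.0):
    """Return z_ji = sum over k of a_fk * f_ki, plus the residual e_fi."""
    if len(loadings) != len(scores):
        raise ValueError("need one score per factor loading")
    return sum(a * f for a, f in zip(loadings, scores)) + residual


# Made-up values for one standardised variable and one sample, m = 2 factors.
loadings = [0.9, 0.1]    # a_f1, a_f2 (hypothetical)
scores = [1.5, -0.4]     # f_1i, f_2i (hypothetical)
residual = 0.05          # e_fi (hypothetical)

z = reconstruct(loadings, scores, residual)
print(round(z, 2))
```

With these invented numbers the first factor contributes 1.35 and the second contributes -0.04, so z is approximately 1.36; a variable with a high loading on a factor is dominated by that factor's score.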
All Rights Reserved © 2012 IJSETR

Factor analysis attempts to explain the correlations between the observations in terms of underlying factors, which are not directly observable [9]. There are three stages in factor analysis [10]:

Step 1 (Number of factors): The number of factors needed to explain the correlations among the variables can be determined by methods similar to those used to determine the number of principal components that should be retained in PCA. The most popular heuristics are the eigenvalue-greater-than-one rule and the scree plot.

Step 2 (Factor solution): The factor pattern matrix gives the pattern or structure loadings. The sum of the squared pattern loadings for a given factor is the total communality of all the variables with that factor, and it equals the eigenvalue of the factor. The main objective of factor analysis is to explain the intercorrelation among the variables, not to account for the total variation in the data.

Step 3 (Assessing the estimated factor solution): One can examine the correlations among the indicators after the effect of the factors has been partialled out. For a good factor solution the resulting partial correlations should be close to zero, because once the effect of the common factors has been removed there is nothing left to link the indicators.

2.3.2. Cluster Analysis

Cluster analysis is a technique used for combining observations into groups or clusters such that
i) each group or cluster is homogeneous or compact with respect to certain characteristics, that is, observations in each group are similar to each other;
ii) each group is different from the other groups with respect to the same characteristics, that is, observations of one group differ from the observations of other groups.

Note: The objective of cluster analysis appears to be similar to that of factor analysis.
It is therefore possible to use FA to cluster observations, and to use cluster analysis to cluster variables. The FA technique used to cluster observations is known as Q-factor analysis.

2.3.2.1. Objective of Cluster Analysis

The objective of cluster analysis is to group observations into clusters such that each cluster is as homogeneous as possible with respect to the clustering variables. The analysis proceeds in five steps:
Step 1: Select a measure of similarity.
Step 2: Decide on the type of clustering technique to be used (e.g. hierarchical or non-hierarchical).
Step 3: Select the clustering method for the chosen technique.
Step 4: Decide on the number of clusters.
Step 5: Interpret the cluster solution.

III. RESULTS AND DISCUSSION

The groundwater samples collected during the four seasons were analyzed using standard methods, and their descriptive statistics are presented in Tables 1, 2, 3 and 4. The data obtained from the laboratory analysis were used as inputs for factor analysis (FA) as well as cluster analysis (CA). The analysis was performed using the XLSTAT 2013 software.

3.1 Factor reduction and marker variables

The results of the present study indicated that the water is alkaline in nature. The pH values varied from 7.0 to 8.1 (post-monsoon), 7.3 to 8.4 (summer), 7.1 to 8.2 (pre-monsoon) and 7.3 to 8.5 (monsoon). The EC values varied between 1168 and 4120, 1068 and 3851, 1089 and 4098, and 1210 and 4532 µS during the post-monsoon, summer, pre-monsoon and monsoon seasons, respectively. The total dissolved solids (TDS) values varied from 138 to 2945 mg L-1 (post-monsoon), 792 to 2852 mg L-1 (summer), 856 to 3026 mg L-1 (pre-monsoon) and 941 to 3208 mg L-1 (monsoon). The total hardness (TH) and the other parameters were found to be within the permissible limits, except fluoride, which exceeded the limit at certain sampling points and varied slightly with the seasons.
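The five clustering steps listed in Section 2.3.2.1 can be sketched in a few lines of Python. This is only an illustrative sketch and not the XLSTAT procedure used in the study: it assumes Euclidean distance on z-scored data as the (dis)similarity measure, agglomerative hierarchical clustering with average linkage, and a number of clusters fixed in advance. The six stations and their EC and TDS values are invented, the function names are hypothetical, and labelling clusters LP/MP/HP by their mean EC is an assumption made here for illustration.

```python
from math import dist
from statistics import mean, pstdev

# Invented EC and TDS readings for six hypothetical stations (not study data).
stations = [(1200, 800), (1250, 850), (2400, 1700),
            (2500, 1750), (4100, 2900), (3950, 2800)]


def zscore_columns(rows):
    """Standardise each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def average_linkage(rows, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(rows))]

    def gap(a, b):
        # Steps 1 to 3: mean Euclidean distance between members of a and b.
        return mean(dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > n_clusters:  # Step 4: stop at the chosen count
        _, p, q = min((gap(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


clusters = average_linkage(zscore_columns(stations), n_clusters=3)

# Step 5 (interpretation): rank clusters by mean EC, lowest first.
ranked = sorted(clusters, key=lambda c: mean(stations[i][0] for i in c))
labels = dict(zip(("LP", "MP", "HP"), ranked))
print(labels)
```

Standardising the columns first keeps a parameter with large numerical values, such as EC, from dominating the distance; a different similarity measure or linkage rule could give a different grouping.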
The eigenvalues, percentage of variance and cumulative percent are shown in Tables 5 and 6. The analysis generated four factors (eigenvalue greater than unity), which accounted for 63.6 %, 59.5 %, 57.5 % and 56.6 % of the total variance in the post-monsoon, summer, pre-monsoon and monsoon seasons, respectively. After varimax rotation, each original variable tends to be associated with one (or a small number) of the factors, and each factor represents only a small number of variables. Table 4 shows the factor pattern of the water quality parameters after varimax rotation for the four seasons (2012). The parameters are grouped based on the factor loadings.

Factor 1 showed strong positive loadings for EC and TDS in all four seasons. It accounted for 26.9 % of the variance (out of the 52.1 % explained by the two rotated factors) in the post-monsoon season, 26.4 % (out of 48.1 %) in summer, 24.7 % (out of 47 %) in pre-monsoon and 25.1 % (out of 46.4 %) in the monsoon season. These high loadings indicate a relatively high correlation between EC and TDS. Factor 2 accounted for 25.1 % of the variance, with a strong loading for SO4 and a moderate positive loading for pH, in post-monsoon; 21.7 %, with a strong loading for SO4 and a moderate positive loading for pH, in summer; 22.2 %, with a strong loading for SO4 and a moderate positive loading for pH, in pre-monsoon; and 21.2 %, with moderate positive loadings for pH and SO4, in the monsoon season. These results indicate that in all four seasons factor 1 exhibits strong positive loadings for EC and TDS, and factor 2 exhibits strong positive loadings for SO4.

The factor scores were then calculated for all 25 monitoring stations and are shown in Tables 7 and 8. In the post-monsoon season, high scores for Factor 1 were observed at stations 1, 3, 8, 10 and 19, and high scores for Factor 2 at stations 3, 5, 8, 19 and 24.
In the summer season, high scores for Factor 1 were observed at stations 1, 2, 4, 10 and 15, and high scores for Factor 2 at stations 2, 11, 17, 21 and 22. In the pre-monsoon season, high scores for Factor 1 were observed at stations 1, 2, 4, 7, 10 and 15, and high scores for Factor 2 at stations 1, 2 and 17. During the monsoon season, high scores for Factor 1 were observed at stations 1, 2, 4, 10, 15, 16 and 25, and high scores for Factor 2 at stations 3, 8 and 19.

These findings reveal that the twelve water quality parameters listed in the present study can be represented by factors 1 and 2, and that these could be used as indicators of potential contamination. In other words, any arbitrarily selected parameter from factor 1 and factor 2 could be used as a 'marker' variable to detect potential contamination. For the present investigation, the probable candidates for this purpose are any of the easily measured parameters, such as EC or TDS for factor 1 and total hardness (TH) or SO4 for factor 2, in all four seasons. A similar approach of deriving a marker variable through factor analysis has been reported [8].

3.2 Spatial similarity and site grouping

Cluster analysis was used to detect similarity groups among the sampling sites. It yielded dendrograms (Fig. 1, 2, 3 and 4) and profile plots (Fig. 1A, 2A, 3A and 4A), grouping all 25 sampling sites of K.Paramathi block into seven statistically significant clusters during the post-monsoon season. Cluster 1, comprising stations 1, 4, 7, 9, 10, 11 and 15 (central object 11; 28 % of the total water samples), Cluster 2, comprising stations 2, 13 and 14 (central object 14; 12 %), and Cluster 3, comprising stations 3, 8 and 19 (central object 8), correspond to highly polluted (HP) sites.
Cluster 4, comprising station 5 (4 % of the total water samples), and Cluster 5, comprising stations 6, 12, 16, 20, 21, 23, 24 and 25 (central object 20; 32 %), correspond to moderately polluted (MP) sites. Cluster 6, comprising stations 17 and 22 (central object 17; 8 %), and Cluster 7, comprising station 18 (4 %), correspond to less polluted (LP) sites.

Summer season: Cluster 1, comprising stations 1, 2, 4, 6, 7, 9, 10, 11, 12, 15, 16 and 23 (central object 15; 48 % of the total water samples), Cluster 4, comprising stations 13, 14, 18, 20, 21 and 25 (central object 21; 24 %), and Cluster 5, comprising stations 17 and 22 (central object 17; 8 %), correspond to highly polluted (HP) sites. Cluster 2, comprising stations 3, 8 and 19 (central object 3; 12 %), and Cluster 6, comprising station 24 (4 %), correspond to moderately polluted (MP) sites. Cluster 3, comprising station 5 (4 %), corresponds to less polluted (LP) sites.

Pre-monsoon season: Cluster 1, comprising stations 1, 4, 6, 7, 9, 10, 11, 15 and 23 (central object 15; 36 % of the total water samples), Cluster 2, comprising stations 2, 13, 14, 18, 20, 21 and 25 (central object 14; 28 %), Cluster 6, comprising station 16 (4 %), and Cluster 7, comprising stations 17 and 22 (central object 17), correspond to highly polluted (HP) sites.
Cluster 3, comprising stations 3, 8 and 19 (central object 3; 12 % of the total water samples), and Cluster 5, comprising station 12 (4 %), correspond to moderately polluted (MP) sites. Cluster 4, comprising station 5 (4 %), and Cluster 8, comprising station 24 (4 %), correspond to less polluted (LP) sites.

Monsoon season: Cluster 1, comprising stations 1, 4, 6, 7, 9, 10, 11 and 15 (central object 11; 32 % of the total water samples), Cluster 2, comprising stations 2, 8, 13, 14, 18, 20, 21 and 25 (central object 25; 32 %), Cluster 3, comprising stations 3 and 19 (central object 3; 8 %), Cluster 5, comprising stations 12 and 16 (central object 12; 8 %), and Cluster 6, comprising stations 17 and 22 (central object 17; 8 %), correspond to highly polluted (HP) sites. Cluster 7, comprising station 23 (4 %), and Cluster 8, comprising station 24 (4 %), correspond to moderately polluted (MP) sites. Cluster 4, comprising station 5 (4 %), corresponds to less polluted (LP) sites.

From the outcome of the factor analysis and cluster analysis, sampling sites 1, 3, 8, 10 and 19 in the post-monsoon season, 1, 2, 4, 10, 11, 15, 17, 21 and 22 in the summer season, 1, 2, 4, 7 and 10 in the pre-monsoon season, and 1, 2, 3, 4, 8, 10, 15, 16, 19 and 25 in the monsoon season were identified as highly polluted.

IV. CONCLUSION

This investigation employed two important chemometric techniques and evaluated spatial and temporal variations in the groundwater quality of K.Paramathi block, Tamil Nadu. The case study revealed that factor analysis helped to extract factors and identify the marker variables responsible for variations in groundwater quality at different sampling sites. Besides, cluster analysis grouped the 25 sampling sites into seven clusters of similar water quality characteristics, which were further classified into three categories: highly polluted, moderately polluted and less polluted sampling sites. Based on this outcome, it is possible to design a future optimal sampling strategy, which could reduce the number of sampling stations and the associated costs. This case study, with K.Paramathi block as a model system, demonstrated the scope of chemometric techniques for the analysis and interpretation of complex data sets in support of meaningful decisions for effective management of groundwater quality. Such techniques need to be explored to save crucial response time to potential contamination risks.

V. TABLES AND FIGURES

Table 1. Descriptive statistical data of groundwater samples during post-monsoon

Variable   Minimum   Maximum   Mean    Std. Deviation
EC         1168      4120      2425    734.5
TDS        138       2945      1729    617.6
TA         193       495       303     75.2
TH         142       810       362     170.3
pH         7.0       8.1       7.3     0.3
Ca (II)    82        234       135     41.9
Mg (II)    42        126       76      23.6
Na         132       450       249     101.4
K          20        49        35      7.1
F          0.3       48        2.4     9.4
SO4        54        195       104     44
Cl         125       575       278     115

Table 2. Descriptive statistical data of groundwater samples during summer

Variable   Minimum   Maximum   Mean    Std. Deviation
EC         1068      3851      2181    712.5
TDS        792       2852      1617    516.0
TA         208       562       344     78.5
TH         102       698       358     143.1
pH         7.3       8.4       7.7     0.2
Ca (II)    86        295       155     47.9
Mg (II)    22        128       60.2    30.5
Na         128       454       246     102.7
K          23        52        35      7.1
F          0.25      0.93      0.51    0.17
SO4        52        178       107     38.8
Cl         138       572       295     117.2

Table 3. Descriptive statistical data of groundwater samples during pre-monsoon

Variable   Minimum   Maximum   Mean    Std. Deviation
EC         1089      4078      2372    741.2
TDS        856       3026      1755    540.9
TA         184       496       301     75.5
TH         96        640       319     139.9
pH         7.1       8.2       7.5     0.27
Ca (II)    56        229       129     40.6
Mg (II)    24        113       56      22.8
Na         102       432       223     101.0
K          23        46        33.3    6.5
F          0.3       0.82      0.503   0.14
SO4        49        165       99      33.7
Cl         106       532       273     109.3

Table 4. Descriptive statistical data of groundwater samples during monsoon

Variable   Minimum   Maximum   Mean    Std. Deviation
EC         1210      4532      2608    808.5
TDS        941       3208      1886    577.05
TA         22        545       326     103.6
TH         96        718       351     154.4
pH         7.3       8.5       7.7     0.316
Ca (II)    64        252       144     44.2
Mg (II)    22        101       49      20.3
Na         112       451       234     97.5
K          21        43        31      5.9
F          0.3       0.9       0.57    0.17
SO4        52        182       109     37.1
Cl         111       585       302     119.9

Table 5. Eigenvalue (EV), percentage of variance (V) and cumulative percent (C)

          Post-monsoon               Summer
Factor    EV     V (%)    C (%)      EV     V (%)    C (%)
1         3.52   29.4     29.4       3.29   27.47    27.47
2         2.73   22.74    52.14      2.49   20.77    48.24
3         1.38   11.53    63.67      1.36   11.34    59.58
4         0.77   6.47     70.15      0.98   8.20     67.78
5         0.36   3.06     73.21      0.29   2.42     70.20
6         0.18   1.53     74.74      0.16   1.35     71.56

Table 6. Eigenvalue (EV), percentage of variance (V) and cumulative percent (C)

          Pre-monsoon                Monsoon
Factor    EV     V (%)    C (%)      EV     V (%)    C (%)
1         3.12   26.02    26.02      3.04   25.36    25.36
2         2.52   21.06    47.08      2.52   21.03    46.40
3         1.26   10.49    57.57      1.22   10.23    56.63
4         0.94   7.83     65.41      1.05   8.78     65.42
5         0.28   2.37     67.79      0.27   2.25     67.67
6         0.15   1.30     69.08      0.07   1.89     69.56

Table 7. Factor scores after varimax rotation

Sampling   Post-monsoon           Summer
Station    Factor 1   Factor 2    Factor 1   Factor 2
1          -1.617     1.823       0.486      2.081
2          -0.059     -1.250      1.563      1.913
3          1.408      1.629       0.393      -2.261
4          1.756      -1.797      0.097      2.193
5          -0.274     1.522       -0.735     -1.041
6          -0.158     -0.285      -0.560     0.084
7          0.093      0.109       0.117      -0.534
8          1.461      1.556       0.409      -2.092
9          0.057      0.368       -0.667     -0.397
10         -0.195     0.926       1.115      -0.483
11         0.295      -0.727      -0.103     0.968
12         -0.292     0.332       -0.858     0.034
13         -0.504     0.150       -0.323     -0.389
14         -0.057     -0.694      0.157      -0.086
15         0.329      -1.103      0.570      1.163
16         -0.143     -0.612      0.385      0.321
17         -1.470     -1.110      -0.330     1.465
18         -0.091     -1.109      -0.917     0.306
19         1.443      1.555       0.230      -1.741
20         0.688      -0.952      0.403      -0.686
21         0.659      -0.882      -0.712     0.851
22         -1.526     -1.057      -0.698     1.058
23         -0.711     0.797       -0.395     -1.044
24         0.992      -1.105      0.740      -1.298
25         0.609      -0.777      0.408      -1.160

Table 8. Factor scores after varimax rotation

Sampling   Pre-monsoon            Monsoon
Station    Factor 1   Factor 2    Factor 1   Factor 2
1          1.128      2.390       0.434      2.028
2          0.854      1.325       -0.365     1.127
3          0.642      -2.223      0.103      2.120
4          0.459      1.908       0.255      2.647
5          -1.022     -0.972      0.268      -0.912
6          -0.102     0.144       0.068      0.142
7          -0.220     0.980       -0.250     0.454
8          0.374      -2.142      -0.317     1.571
9          -0.469     -0.657      0.393      -1.184
10         1.413      0.120       0.635      1.267
11         0.361      0.471       -0.224     -0.448
12         -0.981     0.508       -0.920     -1.524
13         -0.535     -0.607      0.168      -1.032
14         -0.086     0.054       -0.059     0.224
15         0.712      0.942       -0.148     1.502
16         0.028      0.409       -0.506     0.829
17         -0.669     1.196       -0.340     -1.814
18         -0.763     0.380       -0.493     -1.527
19         0.500      -1.941      -1.106     1.953
20         0.330      -0.476      0.036      -0.395
21         -0.690     0.877       -0.653     -0.744
22         -0.682     0.954       0.617      -1.790
23         -0.790     -1.179      0.094      -0.503
24         0.371      -1.359      -0.485     -1.857
25         0.113      -1.381      -0.559     1.221

Fig. 1: Dendrogram showing clustering of sampling sites in post-monsoon season (axis: similarity)

Fig. 2: Dendrogram showing clustering of sampling sites in summer season (axis: similarity)

Fig. 3: Dendrogram showing clustering of sampling sites in pre-monsoon season (axis: similarity)

Fig. 4: Dendrogram showing clustering of sampling sites in monsoon season (axis: similarity)

REFERENCES

[1] V. Simeonov, P. Simeonova, and R. Tsitouridou, Chemometric quality assessment of surface waters: two case studies, Chem. Eng. Ecol., 11, 449-469, 2004.
[2] S. Shrestha and F. Kazama, Assessment of surface water quality using multivariate statistical techniques: A case study of the Fuji river basin, Japan, Environmental Modeling and Software, 22, 464-475, 2007.
[3] T.G. Kazi, M.B. Arain, M.K. Jamali, N. Jalbani, H.I. Afridi, R.A. Sarfraz, J.A. Baig, and A.Q. Shah, Assessment of water quality of polluted lake using multivariate statistical techniques: A case study, Ecotoxicology and Environmental Safety, 72, 301-309, 2009.
[4] L. Belkhiri, A. Boudoukha, L. Mouni, and T. Baouz, Multivariate statistical characterization of groundwater quality in Ain Azel plain, Algeria, African Journal of Environmental Science and Technology, 4(8), 526-534, 2010.
[5] APHA, Standard methods for the examination of water and wastewater (American Water Works Association, Washington DC, 1985).
[6] B. Helena, R. Pardo, M. Vega, E. Barrado, J.M. Fernandez, and L. Fernandez, Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga river, Spain) by principal component analysis, Water Res., 34, 807-816, 2000.
[7] M. Vega, R. Pardo, E. Barrado, and L. Deban, Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis, Water Res., 32, 3581-3592, 1998.
[8] S. Yu, J. Shang, and J. Zhao, Factor analysis and dynamics of water quality of the Songhua River Northeast China, Water, Air & Soil Pollution, 144, 159-169, 2003.
[9] A.K. Gupta, S.K. Gupta, and R.S. Patil, Statistical analyses of coastal water quality for a port and harbor region in India, Environ. Monit. Assess., 179-200, 2005.
[10] C.W. Liu, K.H. Lin, and Y.M. Kuo, Application of factor analysis in the assessment of groundwater quality in a blackfoot disease area in Taiwan, Sci. Total Environ., 313, 77-89, 2003.
[11] K. Bengraine and T.F. Marhaba, Using principal component analysis to monitor spatial and temporal changes in water quality, Journal of Hazardous Materials, B100, 179-195, 2003.
[12] M. Karthikeyan, Chemometric analysis of water quality parameters in Dindigul district and removal of fluoride using polymeric materials, doctoral diss., Gandhigram Rural Institute – Deemed University, Gandhigram, 2010.