Evaluation of Groundwater Quality Parameters Using Multivariate Statistics Methods
A Case Study of Majmaah, KSA

Students: Hussam Khaled Almubark, Abdullah A. Alzeer, Majid Mufleh Almotairi
Supervisor: Dr. Sameh S. Ahmed
Civil and Environmental Engineering Department
College of Engineering – Majmaah University
A presentation prepared for:

Outline
• Introduction
• Objectives and Problems
• Water Quality Data
• Methodology
• Statistical Analysis
• Multivariate Statistical Analysis
• Geostatistics
• Conclusions & Recommendations

Introduction
• Groundwater is the main source (95%) of fresh water in Saudi Arabia.
• Good, safe water quality must be ensured for drinking and other purposes.
• Monitoring water quality is important, and any contribution that assists the monitoring program is welcome.
• Characterising the water quality parameters means defining their levels, distribution, changes with time, etc.

Objectives
The main objectives of this project are:
• To evaluate and map the groundwater quality characteristics of a selected area in Majmaah.
• To assist the monitoring program by reducing the number of measured variables, using advanced statistical methods (multivariate statistics).
• To classify groundwater in the area into spatial water quality types using modern techniques (geostatistics).

Problems
• Problem 1: Wells without coordinates.
• Problem 2: No previous water quality analysis has been conducted in the study area.
• Problem 3: The number of water quality parameters (WQP) needs to be reduced to save cost and time.
• Problem 4: The monitoring program needs support from dynamic maps showing the WQP across the area.

Software and Analyses
• MS Office: Word; PowerPoint; Excel (descriptive statistics, correlation matrix, graphs, coordinates, calculations)
• StatGraph: variable analysis, scatter plots, correlations, PCA, FA
• SPSS: correlations, PCA, rotation methods
• Surfer: Kriging, contour maps, 3D figures

Water Quality Data
Field, Lab., and Analysis

Study Area Info.
                Values
Latitude        N 25°
Longitude       E 46°
Area            21 km²
Distance        12 km
No. of wells    15
Population      60,000

Sampling in the Field

Laboratory Analyses

Results of Lab. Analysis
Well   pH    EC (µS/cm)   TDS (mg/l)   S (mg/l)   Ag (mg/l)   NO3 (mg/l)   Hg (mg/l)
1      8.4   383.10       245.184      3          0.03        0.019        0.4
2      8.1   341.80       218.752      2          0.03        0.017        0.4
3      8     334.20       213.888      3          0.03        0.019        0.4
4      7.9   333.60       213.504      3          0.03        0.019        0.4
5      7.9   336.80       215.552      2          0.02        0.016        0.3
6      8.2   383.30       245.312      2          0.02        0.015        0.3
7      8.3   383.80       245.632      2          0.02        0.015        0.3
8      8.2   385.10       246.464      2          0.03        0.018        0.4
9      8.3   383.50       245.440      3          0.03        0.023        0.4

GPS Data
Well Locations

Well Locations

GPS
\[
\begin{aligned}
x &= (N + H)\cos\phi\,\cos\lambda \\
y &= (N + H)\cos\phi\,\sin\lambda \\
z &= \left[N\,(1 - e^{2}) + H\right]\sin\phi
\end{aligned}
\]
Here \phi is the latitude, \lambda the longitude, H the ellipsoidal height, N the prime-vertical radius of curvature and e the eccentricity of the reference ellipsoid.

Coordinates of the Wells
Well No.   N              E              X, m      Y, m
1          25° 53.411'    45° 21.375'    464314    2863571
2          25° 48.773'    45° 24.483'    459099    2855056
3          25° 48.769'    45° 24.463'    459132    2855018
4          25° 48.456'    45° 24.109'    459722    2854439
5          25° 48.956'    45° 25.790'    456916    2855370
6          25° 48.684'    45° 25.564'    457292    2854867
7          25° 48.836'    45° 25.951'    456647    2855150
8          25° 49.269'    45° 25.951'    456649    2855999
9          25° 49.087'    45° 25.792'    456917    2855612
10         25° 49.183'    45° 26.067'    456455    2855791
11         25° 48.622'    45° 26.468'    455781    2854758
12         25° 48.563'    45° 26.572'    455607    2854649
13         25° 48.371'    45° 26.776'    455606    2854295
14         25° 48.298'    45° 26.916'    455031    2854162
15         25° 47.930'    45° 27.306'    454377    2853485

Locations of Tested Wells

Preliminary Statistical Analysis
Mean, Range, Extremes, Std. Dev.

Summary Statistics for WQP
Summary Statistics for “EC”
Count = 15
Average = 363.013
Variance = 454.537
Standard deviation = 21.3199
Minimum = 333.6
Maximum = 385.7
Stnd. skewness = -0.232746
Stnd. kurtosis = -1.49015

Percentiles for “EC”
1.0% = 333.6
5.0% = 333.6
10.0% = 334.2
25.0% = 341.8
50.0% = 356.3
75.0% = 383.5
90.0% = 385.1
95.0% = 385.7
99.0% = 385.7

In this case, the standardized skewness value is within the range expected for data from a normal distribution.
The standardized kurtosis value is also within the range expected for data from a normal distribution.

[Figures: Box-and-Whisker Plot for EC; Normal Probability Plot for EC (proportion vs. EC); Histogram for EC (frequency vs. EC); Histogram for Zn (frequency vs. Zn, × 0.001)]

Summary Statistics of the Wells
The preliminary statistics led to the exclusion of the following parameters from the multivariate analysis: F, S, Cr, Color and Ba. Their normal probability plots and box-and-whisker plots show that they are not normally distributed.

Comparison with Standards
SAS, WHO, EPA

Comparing the WQP with the Standards
No.   Variable                 Av. measured value
1     pH                       8.153
2     Zinc (Zn)                0.072 mg/l
3     TDS                      232.3 mg/l
4     Sulfate (S)              2.53 mg/l
5     Silver (Ag)              0.027 mg/l
6     Nitrite (NO3)            0.018 mg/l
7     Mercury (Hg)             0.373 µg/l
8     Lead (Pb)                10.67 µg/l
9     Iron (Fe)                0.130 mg/l
10    Cyanide (Cn)             0.008 mg/l
11    Copper (Cu)              0.200 mg/l
12    Chromium (Cr)            0.031 mg/l
13    Chloride (Cl)            0.433 mg/l
14    Cadmium (Cd)             4.72 µg/l
15    Aluminum (Al)            0.05 mg/l
16    Dissolved Oxygen (DO)    5.53 mg/l
17    Calcium (Ca)             0.33 mg/l
18    Magnesium (Mg)           0.143 mg/l
19    EC                       363.0 µS/mm
20    Color                    103.80 TCU

Standard limits (values in slide order; blank cells omitted):
SAS standards: 6.5-8.5; 5000 µg/l; 1500 mg/l; 400 mg/l; 1 µg/l; 10 µg/l; 1.0 mg/l; 70 µg/l; 1000 µg/l; 0.05; 600 mg/l; 3.0 µg/l; 100-200 µg/l; -
EPA standards: 6.5-8.5; 5 mg/l; 500 mg/l; 250 mg/l; 0.1 mg/l; 1.0 mg/l; 0.002 mg/l; 0.015 mg/l; 0.3 mg/l; 0.2 mg/l; 1-1.13 mg/l; 0.05; 250 mg/l; 0.005 mg/l; 0.05-0.2 mg/l; -
WHO standards: 6.5-8.5; 3 mg/l; 1000 mg/l; 400 mg/l; 0.1 mg/l; 0.001 mg/l; 0.01 mg/l; 0.3 mg/l; 0.1 mg/l; 2.0 mg/l; 0.05; 250 mg/l; 0.003 mg/l; 0.2 mg/l; -
Limits listed for Ca, Mg and Color: 15 TCU; 200 mg/l*, 50 mg/l*, 15 TCU; 75 mg/l, 50 mg/l, 15 TCU
Notes: > MPL; > MPL; > MPL

Multivariate Statistical Methods
Principal component analysis (PCA) can be used to reduce the number of variables and to detect relationships between them.
The large set of water quality parameters can be studied further with this method to determine the interrelationships among the parameters.

Why Multivariate Statistics?
1. To identify hidden dimensions or constructs that may not be apparent from direct analysis.
2. To identify relationships between variables, which helps in data reduction.
3. To help the researcher cluster the products and populations being analysed.

Correlation Matrix - Excel
Table: Correlation matrix of all variables, computed with Excel
        pH     Zn    TDS     Ag    NO3     Hg     Pb     Fe     Cn     Cu     Cl     Cd     Al     DO     Ca     Mg     Ec
pH   1.000
Zn   0.060  1.000
TDS -0.033  0.221  1.000
Ag  -0.119  0.645  0.245  1.000
NO3  0.062  0.835  0.360  0.767  1.000
Hg  -0.095  0.657  0.238  0.921  0.770  1.000
Pb   0.119  0.840  0.411  0.664  0.899  0.705  1.000
Fe   0.230  0.664  0.335  0.519  0.762  0.547  0.889  1.000
Cn   0.093  0.773  0.267  0.627  0.799  0.656  0.804  0.661  1.000
Cu   0.154  0.794  0.276  0.567  0.747  0.597  0.841  0.728  0.900  1.000
Cl   0.281  0.677  0.252  0.498  0.696  0.492  0.781  0.754  0.784  0.815  1.000
Cd   0.266  0.808  0.217  0.544  0.767  0.574  0.834  0.735  0.893  0.923  0.839  1.000
Al   0.335  0.706  0.194  0.448  0.683  0.456  0.778  0.773  0.816  0.816  0.853  0.934  1.000
DO   0.294 -0.109 -0.414 -0.145 -0.107 -0.222 -0.168 -0.106 -0.117 -0.057  0.031  0.007  0.108  1.000
Ca   0.192  0.286 -0.110  0.145  0.225  0.135  0.266  0.221  0.231  0.309  0.321  0.280  0.257  0.195  1.000
Mg   0.072  0.217  0.083  0.098  0.188  0.090  0.235  0.207  0.225  0.248  0.129  0.185  0.200 -0.088  0.650  1.000
Ec  -0.033  0.221  1.000  0.245  0.360  0.238  0.411  0.335  0.267  0.276  0.252  0.217  0.194 -0.414 -0.110  0.083  1.000

Assumptions:
1) Correlation values between 0.25 and 0.50 indicate weak correlation.
2) Correlation values between 0.50 and 0.74 indicate good correlation.
3) Correlation values > 0.75 indicate strong correlation.
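The three bands in the assumptions can be applied mechanically to any coefficient in the matrix. The following is only a minimal Python sketch of that idea, not the procedure used in the project. The function name `classify_correlation` is hypothetical, and three choices are assumptions that the slides do not state: the sign of the coefficient is ignored, values below 0.25 are labelled "negligible", and the band edges are closed as 0.25 <= weak < 0.50 <= good < 0.75 <= strong.

```python
def classify_correlation(r):
    """Label a correlation coefficient using the bands listed on the slide.

    Assumptions (not stated in the slides): the sign is ignored, values
    below 0.25 are labelled "negligible", and the band edges are treated
    as 0.25 <= weak < 0.50 <= good < 0.75 <= strong.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude >= 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "good"
    if magnitude >= 0.25:
        return "weak"
    return "negligible"


# A few coefficients copied from the correlation matrix above.
pairs = {
    ("Ag", "Hg"): 0.921,
    ("Zn", "Ag"): 0.645,
    ("TDS", "NO3"): 0.360,
    ("TDS", "DO"): -0.414,
    ("pH", "Zn"): 0.060,
}

# Classify every pair and print one line per pair.
labels = {pair: classify_correlation(r) for pair, r in pairs.items()}
for (a, b), label in labels.items():
    print(f"{a}-{b}: r = {pairs[(a, b)]:+.3f} -> {label}")
```

Under these assumptions the Ag-Hg pair falls in the strong band, Zn-Ag in the good band, and TDS-DO in the weak band despite its negative sign.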
Correlation Matrix - SPSS
Table: Correlation matrix of all variables, computed with SPSS
       pH    Zn   TDS    Ag   NO3    Hg    Pb    Fe    Cn    Cu    Cl    Cd    Al    DO    Ca    Mg    Ec
pH      1
Zn   .060     1
TDS -.033  .221     1
Ag  -.119  .645  .245     1
NO3  .062  .835  .360  .767     1
Hg  -.095  .657  .238  .921  .770     1
Pb   .119  .840  .411  .664  .899  .705     1
Fe   .230  .664  .335  .519  .762  .547  .889     1
Cn   .093  .773  .267  .627  .799  .656  .804  .661     1
Cu   .154  .794  .276  .567  .747  .597  .841  .728  .900     1
Cl   .281  .677  .252  .498  .696  .492  .781  .754  .784  .815     1
Cd   .266  .808  .217  .544  .767  .574  .834  .735  .893  .923  .839     1
Al   .335  .706  .194  .448  .683  .456  .778  .773  .816  .816  .853  .934     1
DO   .294 -.109 -.414 -.145 -.107 -.222 -.168 -.106 -.117 -.057  .031  .007  .108     1
Ca   .192  .286 -.110  .145  .225  .135  .266  .221  .231  .309  .321  .280  .257  .195     1
Mg   .072  .217  .083  .098  .188  .090  .235  .207  .225  .248  .129  .185  .200 -.088  .650     1
Ec  -.033  .221 1.000  .245  .360  .238  .411  .335  .267  .276  .252  .217  .194 -.414 -.110  .083     1

Results of PCA
Data input: observations
Standardized: yes
Number of complete cases: 45
Number of components extracted: 4

Component Number   Eigenvalue   Percent of Variance   Cumulative Percentage
1                  8.79189      51.717                51.717
2                  2.36680      13.922                65.639
3                  1.49645       8.803                74.442
4                  1.38797       8.165                82.607
5                  0.72077       4.240                86.846
6                  0.58768       3.457                90.303
7                  0.41371       2.434                92.737
8                  0.33459       1.968                94.705
9                  0.30356       1.786                96.491
10                 0.15602       0.918                97.408
11                 0.14240       0.838                98.246
12                 0.12077       0.710                98.957
13                 0.06893       0.405                99.362
14                 0.05797       0.341                99.703
15                 0.03654       0.215                99.918
16                 0.01391       0.082               100.000
17                 0.00003       0.000               100.000

Component Weights of PCA
Variable   Component (1)   Component (2)   Component (3)   Component (4)
Ag          0.2458         -0.1155          0.3487          0.2098
Al          0.2900          0.1774         -0.0765         -0.2171
Ca          0.1057          0.3283         -0.3222          0.4832
Cd          0.3101          0.1327         -0.0011         -0.1385
Cl          0.2887          0.1226         -0.0677         -0.1803
Cn          0.3056          0.0364          0.0684         -0.0140
Cu          0.3083          0.0756         -0.0141         -0.0370
DO         -0.0426          0.4481          0.0318         -0.2175
EC          0.1322         -0.4898         -0.3725         -0.1096
Fe          0.2854          0.0106         -0.0820         -0.1072
Hg          0.2521         -0.1233          0.3498          0.2028
Mg          0.0897          0.1348         -0.4391          0.5829
NO3         0.3066         -0.0544          0.1192          0.0529
Pb          0.3213         -0.0438         -0.0076         -0.0005
pH          0.0551          0.2949         -0.3630         -0.3868
TDS         0.1321         -0.4904         -0.3717         -0.1089
Zn          0.2936          0.0354          0.1159          0.0786

The previous table gives the equations of the principal components. For example, the first principal component has the equation:

PC1 = 0.24584*Ag + 0.289998*Al + 0.105667*Ca + 0.310126*Cd + 0.288733*Cl + 0.305632*Cn
    + 0.308319*Cu - 0.042581*DO + 0.132234*EC + 0.285392*Fe + 0.252125*Hg + 0.0897126*Mg
    + 0.306629*NO3 + 0.321259*Pb + 0.0551099*pH + 0.132119*TDS + 0.293593*Zn

Because the input data were standardized, each variable in this equation is a standardized value.

Plots of PCA
[Figures: Scree plot (eigenvalue vs. component) using STAT Graph; scree plot using SPSS; scatterplot of Component 2 vs. Component 1]

Plots of PCA
[Figures: Plots of component weights using STAT Graph (Component 2 vs. Component 1, and Component 3 vs. Component 1); component plot in rotated space using SPSS]

Interpretation of Multivariate Statistics Results
The data reduction by PCA shows that the first component (factor) groups Cd, Cn, Cu, NO3 and Pb together. Ca, DO, EC and TDS are the main variables in group two. Group three consists of seven variables that might be interrelated: Ag, Ca, EC, Hg, Mg, pH and TDS. The fourth group has two main variables: Mg and pH.

Geostatistics Techniques
Contour Maps

Geostatistics and Kriging Techniques
Geostatistics is the statistics of spatially or temporally correlated data. The technique has been used as a practical approach to the problems of ore reserve estimation and mine planning.
It has also been used in other applications, such as petroleum and gas resource estimation. Kriging is the best-known geostatistical technique and is now used in many applications. In this project, Kriging is performed using the Surfer software.

pH - Contour Map
Contour map showing the distribution of “pH” over the study area

TDS - Contour Map
Contour map showing the distribution of “TDS” over the study area

EC - Contour Map
Contour map showing the distribution of “EC” over the study area

Contour Maps - Comparison
[Figure panels: pH, EC, TDS, Zn]

pH - 3D Representation
3D representation of “pH” over the study area

TDS - 3D Representation
3D representation of “TDS” over the study area

EC - 3D Representation
3D representation of “EC” over the study area

3D - Comparison
[Figure panels: pH, EC, TDS, Zn]

Conclusions
• The water quality parameters of groundwater wells in part of Majmaah city have been characterised using intensive descriptive statistics and multivariate statistical analysis.
• Fifteen groundwater wells in a farming area near Majmaah were sampled. Three samples were collected from each well and sent to the environmental engineering lab for chemical analysis.
• GPS instruments were used to determine the X and Y coordinates of the tested wells, as no coordinates were available for those points.

Conclusions
• Laboratory analysis was conducted on the three samples for 22 water quality parameters, and the results were stored together with the GPS data in one database for further statistical and geostatistical analysis.
• The results of the preliminary statistical analysis reveal that only 17 variables are suitable for multivariate statistical analysis aimed at reducing the number of measured variables.
• A comparison of the WQP with the SAS, WHO and EPA standards is presented in a table. Most of the recorded parameters were below the standard limits, except lead (Pb) and cadmium (Cd).
Conclusions
• An attempt was made to interpret the resulting groups, although the physical interpretation is not deep.
• The X and Y coordinates and the measured values of each WQP were used with the Surfer software to generate contour maps and 3D representations of each variable within the study area.
• Although the study was carried out over a small area, the steps and the procedure introduced can easily be applied elsewhere for similar purposes.

Recommendations
• The project examined the calculation methodology over an area of 21 sq. km with 15 water wells. Including more wells would make the geostatistical studies more reliable.
• Additional WQP such as temperature, turbidity, K, BOD, Na, P, etc. should be investigated.
• The WQP could be grouped into 1) field parameters and 2) laboratory parameters, and PCA or FA then conducted to correlate the two groups and find the intercorrelations between them. Such a study needs intensive data and reliable WQP analysis.

References
[1] EPA (2001), “Parameters of Water Quality – Interpretation and Standards”, Report, 133 p., ISBN 1-84096-015-3.
[2] Mohammed Al-Saud et al. (2011), “Challenges for an Integrated Groundwater Management in the Kingdom of Saudi Arabia”, International Journal of Water Resources and Arid Environments 1(1): 65-70.
[3] Walid Abdelrahamn (2006), “Groundwater Resources Management in Saudi Arabia”, special presentation at the water conservation workshop, Khoper, KSA, December 2006.
[4] FAO (2009), “Groundwater Management in Saudi Arabia”, a report by FAO, 14 p.
[5] Book: “Marqab Jabal Munikh in Majmaah Governorate”, historical and archaeological report issued by the Agency for Antiquities and Museums.
[6] Sameh S. Ahmed (2014), “Surveying 1”, lecture notes of the surveying course at the Civil and Environmental Engineering Department, MU, KSA.
[7] Gregory T. French (1997), “Understanding the GPS: An Introduction to the Global Positioning System”, first edition, April 1997.
[8] GPS Coordinate Converter, Maps and Info, http://boulter.com/gps/
[9] Excel 2010, “Microsoft Office 2010”.
[10] STATGRAPHICS Plus (1996), Statistical Graphics Corp.
[11] EPA (2012), “Drinking Water Standards and Health Advisories”, 2012 Edition, EPA 822-S-12-001, Office of Water, U.S. Environmental Protection Agency, April 2012.
[12] World Health Organization (2004), “Guidelines for Drinking-water Quality”, Vol. 1, 3rd Edition, Geneva, ISBN 9241546387.
[15] SPSS 16.0 for Windows (Release 16.0.0, Sept 2007), http://www.winwarp.com

Thanks for your attention