Evaluation of Groundwater Quality Parameters
Using Multivariate Statistics Methods
A case Study of Majmaah, KSA
Students
Hussam Khaled Almubark
Abdullah A. Alzeer
Majid Mufleh Almotairi
Supervisor
Dr. Sameh S. Ahmed
Civil and Environmental Engineering Department
College of Engineering – Majmaah University
A presentation prepared for:
1
Outlines
• Introduction
• Objectives and Problems
• Water Quality Data
• Methodology
• Statistical Analysis
• Multivariate Statistical Analysis
• Geostatistics
• Conclusions & Recommendations
2
Introduction
• Groundwater is the main source (95 %) of fresh
water in Saudi Arabia.
• It is important to ensure good and safe water quality
for drinking and other purposes.
• Monitoring water quality is important and any
contribution to assist this program is always
welcome.
• Characterisation of the water quality parameters
means defining levels, distribution, changes with
time, etc.
3
Objectives
The main objectives of this project are:
• To characterise and map the groundwater quality in a selected area of Majmaah.
• To assist the monitoring program by reducing the number of measured variables, using advanced statistical methods (multivariate statistics).
• To classify groundwater in the area into spatial water quality types using modern techniques (geostatistics).
4
Problems
Problem 1
The wells have no recorded coordinates.
Problem 2
No previous water quality analysis has been conducted in the study area.
Problem 3
The number of water quality parameters (WQP) needs to be reduced to save cost and time.
Problem 4
The monitoring program needs support from dynamic maps that show the WQP across the area.
Software and Analyses
• MS Office: Word, PowerPoint, Excel (descriptive statistics, correlation matrix, graphs, coordinates, calculations)
• StatGraph: variable analysis, scatter plot, correlations, PCA, FA
• SPSS: correlations, PCA, rotation methods
• Surfer: kriging, contour maps, 3D figures
6
Water Quality Data
Field, lab., and Analysis
7
Study Area
| Info. | Values |
|---|---|
| Latitude | N 25° |
| Longitude | E 46° |
| Area | 21 km² |
| Distance | 12 km |
| No. of wells | 15 |
| Population | 60,000 |
8
Sampling in the field
9
Laboratory Analyses
10
Results of lab. Analysis
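| Well | pH | EC (µS/cm) | TDS (mg/l) | S (mg/l) | Ag (mg/l) | NO3 (mg/l) | Hg (mg/l) |
|---|---|---|---|---|---|---|---|
| 1 | 8.4 | 383.10 | 245.184 | 3 | 0.03 | 0.019 | 0.4 |
| 2 | 8.1 | 341.80 | 218.752 | 2 | 0.03 | 0.017 | 0.4 |
| 3 | 8 | 334.20 | 213.888 | 3 | 0.03 | 0.019 | 0.4 |
| 4 | 7.9 | 333.60 | 213.504 | 3 | 0.03 | 0.019 | 0.4 |
| 5 | 7.9 | 336.80 | 215.552 | 2 | 0.02 | 0.016 | 0.3 |
| 6 | 8.2 | 383.30 | 245.312 | 2 | 0.02 | 0.015 | 0.3 |
| 7 | 8.3 | 383.80 | 245.632 | 2 | 0.02 | 0.015 | 0.3 |
| 8 | 8.2 | 385.10 | 246.464 | 2 | 0.03 | 0.018 | 0.4 |
| 9 | 8.3 | 383.50 | 245.440 | 3 | 0.03 | 0.023 | 0.4 |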
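The EC and TDS columns of the lab results appear to be linked by a constant factor: each TDS value equals 0.64 times the EC value of the same well. The short Python sketch below checks this on the nine tabulated wells. The factor 0.64 is inferred from the numbers and is not stated on the slide; the variable names are illustrative.

```python
# EC (µS/cm) and TDS (mg/l) for wells 1-9, copied from the lab results table.
ec = [383.10, 341.80, 334.20, 333.60, 336.80, 383.30, 383.80, 385.10, 383.50]
tds = [245.184, 218.752, 213.888, 213.504, 215.552, 245.312, 245.632, 246.464, 245.440]

# Ratio TDS/EC per well; a constant ratio suggests TDS was derived from EC.
ratios = [t / e for e, t in zip(ec, tds)]
FACTOR = 0.64  # assumption: inferred from the data, not given in the source

all_match = all(abs(r - FACTOR) < 1e-6 for r in ratios)
print("TDS = 0.64 * EC for every well:", all_match)
```

A constant ratio of this kind is consistent with the EC-TDS correlation of 1.000 reported in the correlation matrices.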
11
GPS Data
Well Locations
12
Well locations
13
GPS
$x = (N + H)\cos\phi\,\cos\lambda$
$y = (N + H)\cos\phi\,\sin\lambda$
$z = \left[N(1 - e^{2}) + H\right]\sin\phi$
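These are the standard formulas for converting geodetic latitude (φ), longitude (λ) and ellipsoidal height (H) into Earth-centred Cartesian coordinates, where N is the radius of curvature in the prime vertical and e is the first eccentricity of the ellipsoid. A minimal Python sketch follows. It assumes the WGS84 ellipsoid, which the slide does not name, and the function name `geodetic_to_ecef` is hypothetical.

```python
import math

# WGS84 ellipsoid constants (assumption: the slide does not name the ellipsoid).
A = 6378137.0                 # semi-major axis, m
F = 1 / 298.257223563         # flattening
E2 = F * (2 - F)              # first eccentricity squared, e^2

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to x, y, z (m)."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    # N: radius of curvature in the prime vertical at latitude phi.
    n = A / math.sqrt(1 - E2 * math.sin(phi) ** 2)
    x = (n + h) * math.cos(phi) * math.cos(lam)
    y = (n + h) * math.cos(phi) * math.sin(lam)
    z = (n * (1 - E2) + h) * math.sin(phi)
    return x, y, z

# Well 1 (25° 53.411' N, 45° 21.375' E); the height of 0 m is a placeholder.
x, y, z = geodetic_to_ecef(25 + 53.411 / 60, 45 + 21.375 / 60, 0.0)
print(round(x), round(y), round(z))
```

The x, y, z values here are Earth-centred coordinates. The planar X, Y values listed for the wells would additionally require a map projection, which this sketch does not cover.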
14
Coordinates of the Wells
| Well No. | N | E | X, m | Y, m |
|---|---|---|---|---|
| 1 | 25° 53.411' | 45° 21.375' | 464314 | 2863571 |
| 2 | 25° 48.773' | 45° 24.483' | 459099 | 2855056 |
| 3 | 25° 48.769' | 45° 24.463' | 459132 | 2855018 |
| 4 | 25° 48.456' | 45° 24.109' | 459722 | 2854439 |
| 5 | 25° 48.956' | 45° 25.790' | 456916 | 2855370 |
| 6 | 25° 48.684' | 45° 25.564' | 457292 | 2854867 |
| 7 | 25° 48.836' | 45° 25.951' | 456647 | 2855150 |
| 8 | 25° 49.269' | 45° 25.951' | 456649 | 2855999 |
| 9 | 25° 49.087' | 45° 25.792' | 456917 | 2855612 |
| 10 | 25° 49.183' | 45° 26.067' | 456455 | 2855791 |
| 11 | 25° 48.622' | 45° 26.468' | 455781 | 2854758 |
| 12 | 25° 48.563' | 45° 26.572' | 455607 | 2854649 |
| 13 | 25° 48.371' | 45° 26.776' | 455606 | 2854295 |
| 14 | 25° 48.298' | 45° 26.916' | 455031 | 2854162 |
| 15 | 25° 47.930' | 45° 27.306' | 454377 | 2853485 |
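The N and E readings are given in degrees and decimal minutes. As a small illustration, the Python sketch below converts the first three wells to decimal degrees, a common first step before any further coordinate conversion. The helper name `dm_to_decimal` is hypothetical and is not taken from the project.

```python
def dm_to_decimal(degrees, minutes):
    """Convert degrees and decimal minutes to decimal degrees."""
    return degrees + minutes / 60.0

# First three wells: (well, lat deg, lat min, lon deg, lon min).
wells = [
    (1, 25, 53.411, 45, 21.375),
    (2, 25, 48.773, 45, 24.483),
    (3, 25, 48.769, 45, 24.463),
]

decimal = {
    w: (dm_to_decimal(ld, lm), dm_to_decimal(od, om))
    for w, ld, lm, od, om in wells
}
for w, (lat, lon) in decimal.items():
    print(f"Well {w}: {lat:.6f} N, {lon:.6f} E")
```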
15
Locations of tested wells
16
Preliminary Statistical
Analysis
Mean, Range, Extremes, Std. Dev.
17
Summary Statistics for WQP
Summary Statistics for “EC”
Count = 15
Average = 363.013
Variance = 454.537
Standard deviation = 21.3199
Minimum = 333.6
Maximum = 385.7
Stnd. skewness = -0.232746
Stnd. kurtosis = -1.49015
Percentiles for “EC”
1.0% = 333.6
5.0% = 333.6
10.0% = 334.2
25.0% = 341.8
50.0% = 356.3
75.0% = 383.5
90.0% = 385.1
95.0% = 385.7
99.0% = 385.7
In this case, the standardized skewness value is within the range expected for data
from a normal distribution. The standardized kurtosis value is within the range
expected for data from a normal distribution.
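The summary statistics can be sketched in Python as follows. This is an illustration under stated assumptions, not the software's own routine: it assumes the standardized skewness and kurtosis are the bias-corrected sample skewness and excess kurtosis divided by sqrt(6/n) and sqrt(24/n), and that values within -2 to +2 are treated as compatible with a normal distribution. Only the nine EC values from the lab results table are used, so the output will not reproduce the 15-value statistics shown on the slide.

```python
import math
import statistics

# EC (µS/cm) for wells 1-9 from the lab results table; the slide uses 15 values.
ec = [383.10, 341.80, 334.20, 333.60, 336.80, 383.30, 383.80, 385.10, 383.50]

n = len(ec)
mean = statistics.mean(ec)
var = statistics.variance(ec)      # sample variance (n - 1 denominator)
sd = math.sqrt(var)

# Bias-corrected sample skewness and excess kurtosis.
z = [(x - mean) / sd for x in ec]
skew = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Assumed standardization: divide by the large-sample standard errors.
std_skew = skew / math.sqrt(6 / n)
std_kurt = kurt / math.sqrt(24 / n)

# Assumed rule of thumb: both values within [-2, 2].
looks_normal = abs(std_skew) <= 2 and abs(std_kurt) <= 2
print(n, round(mean, 3), round(var, 3), round(sd, 4), min(ec), max(ec))
print(round(std_skew, 4), round(std_kurt, 4), looks_normal)
```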
18
[Figures: box-and-whisker plot of EC; normal probability plot of EC (vertical axis: proportion); histogram of EC and histogram of Zn (vertical axis: frequency; Zn axis in units of 0.001)]
19
Summary Statistics of the Wells
The preliminary statistics led to the exclusion of the following parameters from the multivariate analysis: F, S, Cr, Color and Ba. Their distributions depart from normality, as shown by the normal probability plots and the box-and-whisker plots.
20
Comparison with
Standards
SAS, WHO, EPA
21
Comparing the WQP with the Standards
| No | Variable | Av. measured value | SAS standard | EPA standard | WHO standard | Notes |
|---|---|---|---|---|---|---|
| 1 | pH | 8.153 | 6.5-8.5 | 6.5-8.5 | 6.5-8.5 | |
| 2 | Zinc (Zn) | 0.072 mg/l | 5000 µg/l | 5 mg/l | 3 mg/l | |
| 3 | TDS | 232.3 mg/l | 1500 mg/l | 500 mg/l | 1000 mg/l | |
| 4 | Sulfate (S) | 2.53 mg/l | 400 mg/l | 250 mg/l | 400 mg/l | |
| 5 | Silver (Ag) | 0.027 mg/l | | 0.1 mg/l | 0.1 mg/l | |
| 6 | Nitrite (NO3) | 0.018 mg/l | | 1.0 mg/l | | |
| 7 | Mercury (Hg) | 0.373 µg/l | 1 µg/l | 0.002 mg/l | 0.001 mg/l | |
| 8 | Lead (Pb) | 10.67 µg/l | 10 µg/l | 0.015 mg/l | 0.01 mg/l | > MPL |
| 9 | Iron (Fe) | 0.130 mg/l | 1.0 mg/l | 0.3 mg/l | 0.3 mg/l | |
| 10 | Cyanide (Cn) | 0.008 mg/l | 70 µg/l | 0.2 mg/l | 0.1 mg/l | |
| 11 | Copper (Cu) | 0.200 mg/l | 1000 µg/l | 1-1.13 mg/l | 2.0 mg/l | |
| 12 | Chromium (Cr) | 0.031 mg/l | 0.05 | 0.05 | 0.05 | |
| 13 | Chloride (Cl) | 0.433 mg/l | 600 mg/l | 250 mg/l | 250 mg/l | |
| 14 | Cadmium (Cd) | 4.72 µg/l | 3.0 µg/l | 0.005 mg/l | 0.003 mg/l | > MPL |
| 15 | Aluminum (Al) | 0.05 mg/l | 100-200 µg/l | 0.05-0.2 mg/l | 0.2 mg/l | |
| 16 | Dissolved Oxygen (DO) | 5.53 mg/l | - | - | - | |
| 17 | Calcium (Ca) | 0.33 mg/l | | | | |
| 18 | Magnesium (Mg) | 0.143 mg/l | | | | |
| 19 | EC | 363.0 µS/cm | | | | |
| 20 | Color | 103.80 TCU | 15 TCU | 15 TCU | 15 TCU | > MPL |
Limits listed for Ca and Mg: 200 mg/l* and 50 mg/l*; 75 mg/l and 50 mg/l.
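The "> MPL" notes mark parameters whose average exceeds a limit (MPL presumably stands for maximum permissible limit). The Python sketch below shows how such a check can be made once all values are expressed in the same unit. It assumes the limits map to lead and cadmium as read from the table (SAS 10 and 3.0 µg/l, EPA 0.015 and 0.005 mg/l, WHO 0.01 and 0.003 mg/l); the function name `exceedances` is illustrative.

```python
# Average measured values and limits, all in µg/l (1 mg/l = 1000 µg/l).
# Assumption: the limits are matched to Pb and Cd as read from the table.
measured = {"Pb": 10.67, "Cd": 4.72}
limits = {
    "Pb": {"SAS": 10.0, "EPA": 0.015 * 1000, "WHO": 0.01 * 1000},
    "Cd": {"SAS": 3.0, "EPA": 0.005 * 1000, "WHO": 0.003 * 1000},
}

def exceedances(measured, limits):
    """Return {variable: [standards whose limit the measured value exceeds]}."""
    return {
        var: [std for std, lim in limits[var].items() if value > lim]
        for var, value in measured.items()
    }

result = exceedances(measured, limits)
for var, stds in result.items():
    print(var, "> MPL for:", ", ".join(stds) if stds else "none")
```

Mixing µg/l and mg/l is the main pitfall in this comparison, so the sketch converts everything to µg/l before comparing.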
22
Multivariate Statistical
Methods
23
Multivariate Statistical Methods
The PCA method can be used to:
• reduce the number of variables, and
• detect relationships between them.
The large set of water quality parameters can be studied further with this method to determine the interrelationships between the parameters.
24
Why Multivariate Statistics?
1. To identify hidden dimensions or constructs that may not be apparent from direct analysis.
2. To identify relationships between variables, which helps in data reduction.
3. To help the researcher cluster the product and population being analysed.
25
Correlation Matrix- Excel
Table: Correlation matrix of all variables, using Excel
| | pH | Zn | TDS | Ag | NO3 | Hg | Pb | Fe | Cn | Cu | Cl | Cd | Al | DO | Ca | Mg | Ec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pH | 1.000 |
| Zn | 0.060 | 1.000 |
| TDS | -0.033 | 0.221 | 1.000 |
| Ag | -0.119 | 0.645 | 0.245 | 1.000 |
| NO3 | 0.062 | 0.835 | 0.360 | 0.767 | 1.000 |
| Hg | -0.095 | 0.657 | 0.238 | 0.921 | 0.770 | 1.000 |
| Pb | 0.119 | 0.840 | 0.411 | 0.664 | 0.899 | 0.705 | 1.000 |
| Fe | 0.230 | 0.664 | 0.335 | 0.519 | 0.762 | 0.547 | 0.889 | 1.000 |
| Cn | 0.093 | 0.773 | 0.267 | 0.627 | 0.799 | 0.656 | 0.804 | 0.661 | 1.000 |
| Cu | 0.154 | 0.794 | 0.276 | 0.567 | 0.747 | 0.597 | 0.841 | 0.728 | 0.900 | 1.000 |
| Cl | 0.281 | 0.677 | 0.252 | 0.498 | 0.696 | 0.492 | 0.781 | 0.754 | 0.784 | 0.815 | 1.000 |
| Cd | 0.266 | 0.808 | 0.217 | 0.544 | 0.767 | 0.574 | 0.834 | 0.735 | 0.893 | 0.923 | 0.839 | 1.000 |
| Al | 0.335 | 0.706 | 0.194 | 0.448 | 0.683 | 0.456 | 0.778 | 0.773 | 0.816 | 0.816 | 0.853 | 0.934 | 1.000 |
| DO | 0.294 | -0.109 | -0.414 | -0.145 | -0.107 | -0.222 | -0.168 | -0.106 | -0.117 | -0.057 | 0.031 | 0.007 | 0.108 | 1.000 |
| Ca | 0.192 | 0.286 | -0.110 | 0.145 | 0.225 | 0.135 | 0.266 | 0.221 | 0.231 | 0.309 | 0.321 | 0.280 | 0.257 | 0.195 | 1.000 |
| Mg | 0.072 | 0.217 | 0.083 | 0.098 | 0.188 | 0.090 | 0.235 | 0.207 | 0.225 | 0.248 | 0.129 | 0.185 | 0.200 | -0.088 | 0.650 | 1.000 |
| Ec | -0.033 | 0.221 | 1.000 | 0.245 | 0.360 | 0.238 | 0.411 | 0.335 | 0.267 | 0.276 | 0.252 | 0.217 | 0.194 | -0.414 | -0.110 | 0.083 | 1.000 |
Assumptions:
1) Correlation values between (0.25 – 0.50) indicate weak correlation.
2) Correlation values between (0.50 – 0.74) indicate good correlation.
3) Correlation values > 0.75 indicate strong correlation.
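The thresholds above can be applied programmatically. The Python sketch below computes a Pearson correlation from scratch and labels it with the slide's categories. Two choices are assumptions and are not stated in the source: the thresholds are applied to the absolute value of r, and values below 0.25 are labelled "negligible". The sample data are the pH, EC and TDS readings of wells 1 to 9 from the lab results table.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def classify(r):
    """Label a correlation using the slide's thresholds on |r| (assumption)."""
    a = abs(r)
    if a >= 0.75:
        return "strong"
    if a >= 0.50:
        return "good"
    if a >= 0.25:
        return "weak"
    return "negligible"  # label for |r| < 0.25 is not given in the source

# Wells 1-9 from the lab results table.
ph = [8.4, 8.1, 8.0, 7.9, 7.9, 8.2, 8.3, 8.2, 8.3]
ec = [383.10, 341.80, 334.20, 333.60, 336.80, 383.30, 383.80, 385.10, 383.50]
tds = [245.184, 218.752, 213.888, 213.504, 215.552, 245.312, 245.632, 246.464, 245.440]

r_ec_tds = pearson(ec, tds)
r_ph_ec = pearson(ph, ec)
print("EC-TDS:", round(r_ec_tds, 3), classify(r_ec_tds))
print("pH-EC:", round(r_ph_ec, 3), classify(r_ph_ec))
```

Because only nine wells are used here, the pH-EC value will differ from the coefficient in the matrices, which were computed from the full data set.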
26
Correlation Matrix- SPSS
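Table: Correlation matrix of all variables, using SPSS
| | pH | Zn | TDS | Ag | NO3 | Hg | Pb | Fe | Cn | Cu | Cl | Cd | Al | DO | Ca | Mg | Ec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pH | 1 |
| Zn | .060 | 1 |
| TDS | -.033 | .221 | 1 |
| Ag | -.119 | .645 | .245 | 1 |
| NO3 | .062 | .835 | .360 | .767 | 1 |
| Hg | -.095 | .657 | .238 | .921 | .770 | 1 |
| Pb | .119 | .840 | .411 | .664 | .899 | .705 | 1 |
| Fe | .230 | .664 | .335 | .519 | .762 | .547 | .889 | 1 |
| Cn | .093 | .773 | .267 | .627 | .799 | .656 | .804 | .661 | 1 |
| Cu | .154 | .794 | .276 | .567 | .747 | .597 | .841 | .728 | .900 | 1 |
| Cl | .281 | .677 | .252 | .498 | .696 | .492 | .781 | .754 | .784 | .815 | 1 |
| Cd | .266 | .808 | .217 | .544 | .767 | .574 | .834 | .735 | .893 | .923 | .839 | 1 |
| Al | .335 | .706 | .194 | .448 | .683 | .456 | .778 | .773 | .816 | .816 | .853 | .934 | 1 |
| DO | .294 | -.109 | -.414 | -.145 | -.107 | -.222 | -.168 | -.106 | -.117 | -.057 | .031 | .007 | .108 | 1 |
| Ca | .192 | .286 | -.110 | .145 | .225 | .135 | .266 | .221 | .231 | .309 | .321 | .280 | .257 | .195 | 1 |
| Mg | .072 | .217 | .083 | .098 | .188 | .090 | .235 | .207 | .225 | .248 | .129 | .185 | .200 | -.088 | .650 | 1 |
| Ec | -.033 | .221 | 1.000 | .245 | .360 | .238 | .411 | .335 | .267 | .276 | .252 | .217 | .194 | -.414 | -.110 | .083 | 1 |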
27
Results of PCA
Data input: observations
Standardized: yes
Number of complete cases: 45
Number of components extracted: 4
| Component Number | Eigenvalue | Percent of Variance | Cumulative Percentage |
|---|---|---|---|
| 1 | 8.79189 | 51.717 | 51.717 |
| 2 | 2.36680 | 13.922 | 65.639 |
| 3 | 1.49645 | 8.803 | 74.442 |
| 4 | 1.38797 | 8.165 | 82.607 |
| 5 | 0.72077 | 4.240 | 86.846 |
| 6 | 0.58768 | 3.457 | 90.303 |
| 7 | 0.41371 | 2.434 | 92.737 |
| 8 | 0.33459 | 1.968 | 94.705 |
| 9 | 0.30356 | 1.786 | 96.491 |
| 10 | 0.15602 | 0.918 | 97.408 |
| 11 | 0.14240 | 0.838 | 98.246 |
| 12 | 0.12077 | 0.710 | 98.957 |
| 13 | 0.06893 | 0.405 | 99.362 |
| 14 | 0.05797 | 0.341 | 99.703 |
| 15 | 0.03654 | 0.215 | 99.918 |
| 16 | 0.01391 | 0.082 | 100.000 |
| 17 | 0.00003 | 0.000 | 100.000 |
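The percentage columns follow directly from the eigenvalues: with 17 standardized variables the eigenvalues sum to 17, so each percentage is the eigenvalue divided by that total. The Python sketch below reproduces this arithmetic. The retention rule it uses, keeping components with an eigenvalue above 1 (the Kaiser criterion), is an assumption; it happens to give the four extracted components, but the slides do not state which rule was applied.

```python
# Eigenvalues of the 17 components, copied from the PCA results.
eigenvalues = [8.79189, 2.36680, 1.49645, 1.38797, 0.72077, 0.58768, 0.41371,
               0.33459, 0.30356, 0.15602, 0.14240, 0.12077, 0.06893, 0.05797,
               0.03654, 0.01391, 0.00003]

total = sum(eigenvalues)  # close to 17, the number of standardized variables
percent = [100 * ev / total for ev in eigenvalues]

# Running total of the explained variance.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

# Assumed retention rule (Kaiser criterion): eigenvalue > 1.
n_retained = sum(1 for ev in eigenvalues if ev > 1.0)

print("components retained:", n_retained)
print("variance explained by retained components: %.3f %%" % cumulative[n_retained - 1])
```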
28
Component Weights of PCA
| Variable | Component (1) | Component (2) | Component (3) | Component (4) |
|---|---|---|---|---|
| Ag | 0.2458 | -0.1155 | 0.3487 | 0.2098 |
| Al | 0.2900 | 0.1774 | -0.0765 | -0.2171 |
| Ca | 0.1057 | 0.3283 | -0.3222 | 0.4832 |
| Cd | 0.3101 | 0.1327 | -0.0011 | -0.1385 |
| Cl | 0.2887 | 0.1226 | -0.0677 | -0.1803 |
| Cn | 0.3056 | 0.0364 | 0.0684 | -0.0140 |
| Cu | 0.3083 | 0.0756 | -0.0141 | -0.0370 |
| DO | -0.0426 | 0.4481 | 0.0318 | -0.2175 |
| EC | 0.1322 | -0.4898 | -0.3725 | -0.1096 |
| Fe | 0.2854 | 0.0106 | -0.0820 | -0.1072 |
| Hg | 0.2521 | -0.1233 | 0.3498 | 0.2028 |
| Mg | 0.0897 | 0.1348 | -0.4391 | 0.5829 |
| NO3 | 0.3066 | -0.0544 | 0.1192 | 0.0529 |
| Pb | 0.3213 | -0.0438 | -0.0076 | -0.0005 |
| pH | 0.0551 | 0.2949 | -0.3630 | -0.3868 |
| TDS | 0.1321 | -0.4904 | -0.3717 | -0.1089 |
| Zn | 0.2936 | 0.0354 | 0.1159 | 0.0786 |
29
The previous table shows the equations of the principal components. For example, the first principal component has the equation:
PC1 = 0.24584*Ag + 0.289998*Al + 0.105667*Ca + 0.310126*Cd + 0.288733*Cl
+ 0.305632*Cn + 0.308319*Cu - 0.042581*DO + 0.132234*EC + 0.285392*Fe
+ 0.252125*Hg + 0.0897126*Mg + 0.306629*NO3 + 0.321259*Pb + 0.0551099*pH
+ 0.132119*TDS + 0.293593*Zn
The variables in the equation are standardized, since the PCA was run on standardized data.
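As an illustration of how such an equation is used, the Python sketch below evaluates the first principal component for one sample. It assumes each variable is standardized (mean subtracted, divided by the standard deviation) before weighting, in line with the "Standardized: yes" setting. The function name `pc_score` and the placeholder means and standard deviations are hypothetical; the slides do not list the per-variable statistics.

```python
# Weights of the first principal component, copied from the equation.
pc1_weights = {
    "Ag": 0.24584, "Al": 0.289998, "Ca": 0.105667, "Cd": 0.310126,
    "Cl": 0.288733, "Cn": 0.305632, "Cu": 0.308319, "DO": -0.042581,
    "EC": 0.132234, "Fe": 0.285392, "Hg": 0.252125, "Mg": 0.0897126,
    "NO3": 0.306629, "Pb": 0.321259, "pH": 0.0551099, "TDS": 0.132119,
    "Zn": 0.293593,
}

def pc_score(weights, values, means, sds):
    """Weighted sum of standardized values: sum of w * (x - mean) / sd."""
    return sum(w * (values[k] - means[k]) / sds[k] for k, w in weights.items())

# The weight vector of a principal component should have unit length.
norm_sq = sum(w * w for w in pc1_weights.values())

# Placeholder statistics (not from the source) to exercise the function:
# a sample equal to the mean of every variable must score exactly 0.
means = {k: 1.0 for k in pc1_weights}
sds = {k: 1.0 for k in pc1_weights}
score_at_mean = pc_score(pc1_weights, means, means, sds)

print("squared length of weight vector:", round(norm_sq, 4))
print("score of the mean sample:", score_at_mean)
```

The largest weights (Pb, Cd, Cu, NO3, Cn) indicate which variables contribute most to the first component.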
30
Plots of PCA
[Figures: scree plot (eigenvalue against component number) using STAT Graph; scree plot using SPSS; scatterplot of Component 2 against Component 1]
31
Plots of PCA
[Figures: plots of component weights using STAT Graph (Component 2 against Component 1; Component 3 against Component 1); plot of components in rotated space using SPSS]
32
Interpretation of Multivariate
Statistics Results
The data reduction obtained with the PCA method reveals that the first component (factor) groups Cd, Cn, Cu, NO3 and Pb. The variables Ca, DO, EC and TDS are the main variables in group two. Group three consists of seven variables that might be interrelated: Ag, Ca, EC, Hg, Mg, pH and TDS. The fourth group has two main variables: Mg and pH.
33
Geostatistics Techniques
Contour Maps
34
Geostatistics and Kriging
Techniques
Geostatistics is the statistics of spatially or temporally correlated data. The technique has proved to be a practical approach to the problems of ore reserve estimation and mine planning. It has also been used in other applications concerned with the estimation of petroleum and gas resources.
Kriging is the best-known geostatistical technique and is now used in several applications. In this project, kriging is performed using the SURFER software.
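To make the idea concrete, the following Python sketch implements ordinary kriging for a single target point. It is an illustration under stated assumptions and not the SURFER procedure: the variogram is a linear model with unit slope, which is assumed and not fitted to the data, and the example pairs the X, Y coordinates of wells 2, 4, 5 and 6 with their tabulated pH values on the assumption that the well numbering is the same in both tables. The target location is arbitrary.

```python
import math

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def variogram(h):
    """Linear variogram with unit slope (assumption, not fitted to the data)."""
    return h

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def ordinary_kriging(points, values, target):
    """Return (estimate, weights) at `target` from (x, y) points and values."""
    n = len(points)
    # Variogram between data points, plus the unbiasedness constraint
    # (weights sum to 1) enforced through a Lagrange multiplier.
    a = [[variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights

# Wells 2, 4, 5 and 6: X, Y in metres and pH.
points = [(459099, 2855056), (459722, 2854439), (456916, 2855370), (457292, 2854867)]
ph = [8.1, 7.9, 7.9, 8.2]

estimate, weights = ordinary_kriging(points, ph, (458000, 2855000))
print("estimated pH:", round(estimate, 3))
print("weights sum to:", round(sum(weights), 6))
```

Evaluating such an estimate on a regular grid of target points is what produces the input for contour maps. With a real data set the variogram model would be chosen from the experimental variogram and not assumed.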
35
pH - Contour Map
Contour map shows the distribution of “pH” at the study area
36
TDS - Contour Map
Contour map shows the distribution of “TDS” at the study area
37
EC - Contour Map
Contour map shows the distribution of “EC” at the study area
38
Contour Maps - Comparison
pH
EC
TDS
Zn
39
pH – Representation
3D representation of “pH” at the study area
40
TDS – Representation
3D representation of “TDS” at the study area
41
EC – Representation
3D representation of “EC” at the study area
42
3D- Comparison
pH
EC
TDS
Zn
43
Conclusions
44
Conclusions
• The water quality parameters of groundwater wells in part of Majmaah city have been characterised using intensive descriptive statistics and multivariate statistical analysis.
• Three samples were gathered from each of 15 groundwater wells in a farming area near Majmaah and sent to the environmental engineering lab for chemical analysis.
• GPS instruments were used to determine the X, Y
and coordinates of the tested wells, as there were no
coordinates available for those points.
45
Conclusions
• Laboratory analysis was conducted on the three samples from each well for 22 water quality parameters, and the results were sorted, together with the GPS data, into one database for further statistical and geostatistical analysis.
• The results of the preliminary statistical analysis reveal that only 17 variables are suitable for the multivariate statistical analysis used to reduce the number of measured variables.
• A comparison between the WQP and the SAS, WHO and EPA standards is presented in tables. Most of the recorded parameters were below the standards, except lead (Pb) and cadmium (Cd).
46
Conclusions
• An attempt was made to interpret the resulting groups, although the physical interpretation is not deep.
• The X, Y and measured values of each WQP were used with Surfer software to generate contour maps and 3D representations of each variable within the study area.
• Although the study was carried out over a small area, the steps and the procedure introduced can easily be applied elsewhere for similar purposes.
47
Recommendations
48
Recommendations
• The project examined the calculation methodology over an area of 21 sq. km with 15 water wells. Including more wells would make the geostatistical studies more reliable.
• Additional WQP such as temperature, turbidity, K, BOD, Na and P should be investigated.
• The WQP could be grouped into 1) field parameters and 2) laboratory parameters, and PCA or FA then conducted to correlate the two groups and find the intercorrelations between them. Such a study needs intensive data and reliable WQP analysis.
50
Thanks for your attention