Analytical Chemistry CHE 226

9/23/2009
Systematic Error
Illustration of Bias
Sources of Systematic Errors



Instrument Errors
Method Errors
Personal
– Prejudice
– Preconceived notion of “true” value
– Number bias
  Prefer the digits 0 and 5
  Prefer small digits over large
  Prefer even digits over odd


Effects of Systematic Errors

Constant Errors
– Become more serious as the size of the
measurement gets smaller

Proportional Errors
– Interfering contaminants

If the amount of contaminant becomes larger, the
signal becomes larger.
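A small numeric sketch may make the contrast between constant and proportional errors concrete. The 0.50 mg constant error, the 1% proportional error, and the sample masses are made-up values chosen only for illustration; they do not come from the lecture.

```python
# Hypothetical values, chosen only to illustrate the two kinds of error.
CONSTANT_ERROR_MG = 0.50      # same absolute error regardless of sample size
PROPORTIONAL_FRACTION = 0.01  # error that grows with the sample (1% of it)


def relative_error_percent(sample_mg, absolute_error_mg):
    """Return the relative error, in percent, for a given absolute error."""
    return 100.0 * absolute_error_mg / sample_mg


sample_masses_mg = [500.0, 50.0, 5.0]

# Constant error: the absolute error is fixed, so the relative error grows
# as the sample shrinks.
constant_case = [
    relative_error_percent(m, CONSTANT_ERROR_MG) for m in sample_masses_mg
]

# Proportional error: the absolute error scales with the sample, so the
# relative error stays the same.
proportional_case = [
    relative_error_percent(m, PROPORTIONAL_FRACTION * m) for m in sample_masses_mg
]

for mass, const_err, prop_err in zip(sample_masses_mg, constant_case, proportional_case):
    print(f"{mass:6.1f} mg  constant: {const_err:5.2f}%  proportional: {prop_err:5.2f}%")
```

This is also why varying the sample size is listed as a way to detect a constant error: the relative error changes with sample size only in the constant case.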
Detection of Systematic and Personal Errors


Calibration
Care and Self-discipline
– Instrument readings
– Notebook entries
– Calculations
– Physical disabilities, such as color blindness
Bias

Difficult to detect
– Analyze standard samples
– Do an independent analysis
– Determine a blank
– Vary the sample size
Applying Statistics to Data Evaluation

Gross error or segment of population?
Define the confidence interval.
Find the number of replicates necessary to
ensure that the mean falls within a
predetermined interval.
What is the probability that an experimental
mean and a “true” value, or two
experimental means, are different?
Calibrate
Gross Errors
The Q-test: rejecting outliers
$$Q_{exp} = \frac{|x_q - x_n|}{|x_q - x_1|} = \frac{d}{w} = \frac{|\text{questionable result} - \text{nearest neighbor}|}{\text{range}}$$
The Q-test: An Example
A calcite sample yields the following data
for the determination of calcium as CaO:
55.95, 56.00, 56.04, 56.08, and 56.23.
Should we reject 56.23?
$$Q_{exp} = \frac{|x_q - x_n|}{|x_q - x_1|} = \frac{56.23 - 56.08}{56.23 - 55.95} = 0.54$$
The Q-test: The Q-Table (5-1)

Number of               Qcrit
Observations     90%      95%      99%
     3          0.941    0.970    0.994
     4          0.765    0.829    0.926
     5          0.642    0.710    0.821
     6          0.560    0.625    0.740
     7          0.507    0.568    0.680
What’s the criterion?
If Qexp > Qcrit, reject.
If Qexp < Qcrit, accept.
Here Qexp = 0.54; Qcrit = 0.64 (N = 5, 90%), so
accept.
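The Q-test on the calcite data can be sketched in a few lines of Python. The function name `q_test` and its return layout are illustrative choices, not part of the lecture; the critical values are copied from the Q-table in these notes, so only N = 3 to 7 is supported.

```python
# Critical values copied from the Q-table in these notes, keyed by the
# number of observations and then by confidence level in percent.
Q_CRIT = {
    3: {90: 0.941, 95: 0.970, 99: 0.994},
    4: {90: 0.765, 95: 0.829, 99: 0.926},
    5: {90: 0.642, 95: 0.710, 99: 0.821},
    6: {90: 0.560, 95: 0.625, 99: 0.740},
    7: {90: 0.507, 95: 0.568, 99: 0.680},
}


def q_test(values, confidence=90):
    """Apply the Q-test to the most extreme value of a small data set.

    Returns (suspect, q_exp, q_crit, reject).
    """
    data = sorted(values)
    n = len(data)
    if n not in Q_CRIT:
        raise ValueError("the table in these notes covers only N = 3 to 7")
    spread = data[-1] - data[0]          # w, the range
    if spread == 0:
        raise ValueError("all values are identical; Q is undefined")

    # The suspect is whichever end value lies farther from its neighbor.
    gap_low = data[1] - data[0]
    gap_high = data[-1] - data[-2]
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low

    q_exp = gap / spread                 # d / w
    q_crit = Q_CRIT[n][confidence]
    return suspect, q_exp, q_crit, q_exp > q_crit


calcite = [55.95, 56.00, 56.04, 56.08, 56.23]
suspect, q_exp, q_crit, reject = q_test(calcite, confidence=90)
print(f"suspect={suspect}, Qexp={q_exp:.2f}, Qcrit={q_crit}, reject={reject}")
```

For the calcite data this reproduces Qexp = 0.54 against Qcrit = 0.642, so the value 56.23 is retained.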
Can we reject data?




Blind application of statistical tests is no
better than doing nothing.
Use good judgement based on experience.
If you know that something went wrong
with a sample and the sample produces an
outlier, then rejection may be warranted.
Be cautious about rejecting data for any
reason.
Recommendations




Keep good records and examine the
data carefully.
If possible, estimate the precision of
the method.
Repeat the analysis if time and sample
are available. Compare with first data.
If not feasible, apply the Q-test.
Recommendations



If the Q-test indicates retention, consider
reporting the median.
The median allows inclusion of all of the
data without undue influence from the
outlier.
The median of a set of 3 measurements
from a normal distribution gives a better
estimate than the mean of the remaining 2
values after an outlier is rejected.
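As a rough illustration of how little the suspect value moves the median, the calcite data from the Q-test example can be summarized three ways. This is only a sketch using Python's standard `statistics` module; it does not demonstrate the claim about sets of 3 measurements, which would need a simulation.

```python
from statistics import mean, median

# Calcite data from the Q-test example; 56.23 is the suspect value.
calcite = [55.95, 56.00, 56.04, 56.08, 56.23]

mean_all = mean(calcite)                   # pulled upward by 56.23
median_all = median(calcite)               # middle value, barely affected
mean_without_suspect = mean(calcite[:-1])  # mean if 56.23 were rejected

print(f"mean of all five values:    {mean_all:.3f}")
print(f"median of all five values:  {median_all:.3f}")
print(f"mean without suspect value: {mean_without_suspect:.3f}")
```

The median keeps all five results in the data set yet lands between the two means.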
Confidence Limits and Intervals

Confidence limits are limits around an
experimentally determined mean
within which the true mean μ lies with
a given degree of probability.
The confidence interval is the interval
around the mean defined by the
confidence limits.
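Confidence limits if s is a good estimate of σ
$$\text{CL for } \mu = x \pm z\sigma \quad \text{(single measurement)}$$
$$\text{CL for } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} \quad \text{(mean } \bar{x} \text{ of } N \text{ measurements)}$$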
[Figures: 50%, 80%, 90%, 95%, and 99% Confidence Limits]
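Confidence limits if s is not a good estimate of σ
$$t = \frac{x - \mu}{s} \quad \text{(Student's } t \text{, analogous to } z \text{)}$$
$$\text{CL for } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad \text{(mean } \bar{x} \text{ of } N \text{ measurements)}$$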
Values of Student's t

Degrees of           Probability Level
Freedom        90%      95%      99%      99.8%
   1           6.31     12.7     63.7     318
   2           2.92     4.30     9.92     22.3
   3           2.35     3.18     5.84     10.2
   4           2.13     2.78     4.60     7.17
   5           2.02     2.57     4.03     5.89
  (z)          1.64     1.96     2.58     3.09
Finding the Confidence Interval: An Example
Determination of the alcohol content in blood
gives the following data: % C2H5OH: 0.084,
0.089, and 0.079.
(a) If the precision of the method is unknown,
find the 95% confidence limits of the mean.
(b) Perform the same calculation if the
standard deviation s → σ = 0.0050% C2H5OH.
(How could we determine σ?)
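(a) σ unknown (use t)
$$\bar{x} = 0.084\ \%\ \mathrm{C_2H_5OH}, \qquad s = 0.0050\ \%\ \mathrm{C_2H_5OH}$$
$$95\%\ \text{CL} = \bar{x} \pm \frac{ts}{\sqrt{N}} = 0.084 \pm \frac{(4.30)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.012\ \%\ \mathrm{C_2H_5OH}$$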
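(b) s → σ = 0.0050 % (use z)
$$\bar{x} = 0.084\ \%\ \mathrm{C_2H_5OH}, \qquad \sigma = 0.0050\ \%\ \mathrm{C_2H_5OH}$$
$$95\%\ \text{CL} = \bar{x} \pm \frac{z\sigma}{\sqrt{N}} = 0.084 \pm \frac{(1.96)(0.0050)}{\sqrt{3}} = 0.084 \pm 0.006\ \%\ \mathrm{C_2H_5OH}$$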
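Both parts of the blood-alcohol example can be checked with a short Python sketch. The helper names `cl_half_width_t` and `cl_half_width_z` are hypothetical; the 95% values of t and z are copied from the table of Student's t in these notes rather than computed.

```python
import math
from statistics import mean, stdev

# 95% values copied from the Student's t table in these notes,
# keyed by degrees of freedom (N - 1).
T_95 = {1: 12.7, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57}
Z_95 = 1.96


def cl_half_width_t(data):
    """Half-width t*s/sqrt(N) when s must be estimated from the data."""
    n = len(data)
    return T_95[n - 1] * stdev(data) / math.sqrt(n)


def cl_half_width_z(sigma, n):
    """Half-width z*sigma/sqrt(N) when sigma is known independently."""
    return Z_95 * sigma / math.sqrt(n)


ethanol = [0.084, 0.089, 0.079]  # % C2H5OH, from the example

x_bar = mean(ethanol)
half_t = cl_half_width_t(ethanol)               # part (a)
half_z = cl_half_width_z(0.0050, len(ethanol))  # part (b)

print(f"(a) {x_bar:.3f} +/- {half_t:.3f} % C2H5OH")
print(f"(b) {x_bar:.3f} +/- {half_z:.3f} % C2H5OH")
```

The interval in (a) is about twice as wide as in (b) because, with only 2 degrees of freedom, t (4.30) is much larger than z (1.96).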
Comparing a mean to the true value: The Null Hypothesis

The null hypothesis assumes that the two
quantities being compared are the same.
Any numerical difference is assumed
to be due to random error.
If the observed difference is greater
than or equal to the difference that
would occur 5% of the time by chance, the null
hypothesis is rejected, and the
difference is judged significant.
The Critical Value
We rearrange the equation
for the confidence interval.
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \quad \Rightarrow \quad \bar{x} - \mu = \pm \frac{ts}{\sqrt{N}}$$
Compare the difference to
the critical value

The difference $\bar{x} - \mu$ is
compared to the critical value
$ts/\sqrt{N}$ at
the desired probability level.
If $|\bar{x} - \mu|$ is greater than the
critical value, the null hypothesis is
rejected.
An Example: The Determination of Sulfur in Kerosenes
A known sample containing
0.123% sulfur was analyzed, and
the results for four replicates were
0.112, 0.118, 0.115, and 0.119
% S. Is there bias in the method?
Let’s do a spreadsheet.
The Spreadsheet (5%)

True Val.    0.123      Difference      -0.007
Data         0.112      t(95%, 3 df)    3.18
             0.118
             0.115      ts/sqrt(N)      0.0050
             0.119
Mean         0.116
Std. Dev.    0.0032

Here |Difference| = 0.007 > ts/sqrt(N) = 0.0050.
If we wish to be wrong no more than 5%
of the time, we must reject the null
hypothesis, and there is systematic error.
What about 1%?

True Val.    0.123      Difference      -0.007
Data         0.112      t(99%, 3 df)    5.84
             0.118
             0.115      ts/sqrt(N)      0.0092
             0.119
Mean         0.116
Std. Dev.    0.0032

Here |Difference| = 0.007 < ts/sqrt(N) = 0.0092.
If we wish to be wrong no more than 1% of
the time, we must accept the null hypothesis,
and no systematic error is demonstrated at this level.
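The two spreadsheets can be reproduced with a short Python sketch. The function name `bias_test` is an illustrative choice; the t values for 3 degrees of freedom are copied from the table of Student's t in these notes.

```python
import math
from statistics import mean, stdev

# t values for 3 degrees of freedom, copied from the table in these notes.
T_CRIT_3DF = {95: 3.18, 99: 5.84}


def bias_test(data, true_value, t_value):
    """Compare (mean - true value) with the critical value t*s/sqrt(N).

    Returns (difference, critical_value, significant).
    """
    n = len(data)
    difference = mean(data) - true_value
    critical_value = t_value * stdev(data) / math.sqrt(n)
    return difference, critical_value, abs(difference) > critical_value


sulfur = [0.112, 0.118, 0.115, 0.119]  # % S, four replicates
TRUE_VALUE = 0.123                      # % S in the known sample

results = {
    level: bias_test(sulfur, TRUE_VALUE, t_value)
    for level, t_value in T_CRIT_3DF.items()
}

for level, (diff, crit, significant) in results.items():
    verdict = "reject" if significant else "accept"
    print(f"{level}%: difference={diff:.3f}, ts/sqrt(N)={crit:.4f} -> {verdict} null hypothesis")
```

The same data lead to opposite verdicts at the two levels, which is why the probability level should be chosen before the comparison is made.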
Comparing Two Experimental Means
$$\bar{x}_1 - \bar{x}_2 = \pm\, t\, s_{pooled} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$
$$\text{d.f.} = N_1 + N_2 - 2$$
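A minimal sketch of the two-mean comparison follows. The slide does not show how the pooled standard deviation is obtained; the sketch assumes the usual definition, in which the squared deviations of both sets about their own means are summed and divided by N1 + N2 - 2. The two data sets are made-up numbers for illustration only, and the function name `compare_means` is hypothetical.

```python
import math
from statistics import mean

# 95% t values copied from the table in these notes, keyed by d.f.
T_95 = {1: 12.7, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57}


def compare_means(set_1, set_2, t_table):
    """Test whether two experimental means differ significantly.

    Returns (difference, critical_value, degrees_of_freedom, significant).
    """
    n1, n2 = len(set_1), len(set_2)
    mean_1, mean_2 = mean(set_1), mean(set_2)
    dof = n1 + n2 - 2

    # Assumed definition of the pooled standard deviation.
    sum_sq = sum((x - mean_1) ** 2 for x in set_1) + sum(
        (x - mean_2) ** 2 for x in set_2
    )
    s_pooled = math.sqrt(sum_sq / dof)

    critical_value = t_table[dof] * s_pooled * math.sqrt((n1 + n2) / (n1 * n2))
    difference = mean_1 - mean_2
    return difference, critical_value, dof, abs(difference) > critical_value


# Hypothetical replicate results, not data from the lecture.
set_a = [3.08, 3.12, 3.10]
set_b = [3.20, 3.22, 3.18, 3.24]

difference, critical_value, dof, significant = compare_means(set_a, set_b, T_95)
print(f"difference={difference:.3f}, critical={critical_value:.4f}, d.f.={dof}, significant={significant}")
```

With N1 = 3 and N2 = 4 there are 5 degrees of freedom, which is the largest entry in the table above; larger data sets would need a fuller table of t.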
Least-Squares for Analyzing Linear Calibrations: y = mx + b

Least-squares assumes that there is relatively
little error in the x measurement.
The mathematics of the derivation of the
equations minimizes the sum of the squares
of the deviations (the residuals) of the points
from the best line in the y direction only.
From calculus, take the partial derivatives of
the equation for the sum of squares with
respect to m and b, set each equal to zero, and
solve for m and b.
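The Intermediate Equations (See pp. 161-2)
$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}$$
$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{N}$$
$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{N}$$
$$\bar{x} = \frac{\sum x_i}{N} \quad \text{and} \quad \bar{y} = \frac{\sum y_i}{N}$$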
The Results
1. Slope: $m = \dfrac{S_{xy}}{S_{xx}}$
2. Intercept: $b = \bar{y} - m\bar{x}$
3. The standard deviation about regression,
or the standard error of the estimate:
$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} \quad \text{where } N - 2 = \text{d.f.}$$
4. The standard deviation of the slope: $s_m = \sqrt{\dfrac{s_r^2}{S_{xx}}}$
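The Standard Deviation about Regression

Analogous to the standard deviation
Measure of the scatter of points
Precision similar to individual data
$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{N - 2}} = \sqrt{\frac{\sum \left[ y_i - (m x_i + b) \right]^2}{N - 2}} = \sqrt{\frac{\sum (y_i - y_{line})^2}{N - 2}}$$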
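The slope, intercept, and standard deviation about regression can be computed directly from the sums of squares, as in this sketch. The calibration data are made-up values for illustration, and the function name `least_squares` and its dictionary keys are hypothetical. The last lines check that the two forms of $s_r$ agree, one from $S_{yy} - m^2 S_{xx}$ and one from the residuals.

```python
import math


def least_squares(xs, ys):
    """Fit y = m*x + b using the sums of squares S_xx, S_yy, and S_xy."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    m = s_xy / s_xx                      # slope
    b = y_bar - m * x_bar                # intercept
    # max() guards against a tiny negative value from rounding when the
    # points lie almost exactly on the line.
    s_r = math.sqrt(max(s_yy - m ** 2 * s_xx, 0.0) / (n - 2))
    s_m = math.sqrt(s_r ** 2 / s_xx)     # standard deviation of the slope
    return {"m": m, "b": b, "s_r": s_r, "s_m": s_m, "s_xx": s_xx, "y_bar": y_bar}


# Hypothetical calibration data: concentration (x) against signal (y).
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.0, 3.9, 6.1, 8.0]

fit = least_squares(conc, signal)

# Same quantity computed from the residuals in the y direction.
residual_ss = sum((y - (fit["m"] * x + fit["b"])) ** 2 for x, y in zip(conc, signal))
s_r_from_residuals = math.sqrt(residual_ss / (len(conc) - 2))

print(f"m={fit['m']:.3f}, b={fit['b']:.3f}, s_r={fit['s_r']:.4f}, s_m={fit['s_m']:.4f}")
print(f"s_r from residuals: {s_r_from_residuals:.4f}")
```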
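More Results
5. The standard deviation of the intercept:
$$s_b = s_r \sqrt{\frac{\sum x_i^2}{N \sum x_i^2 - \left(\sum x_i\right)^2}}$$
6. The standard deviation of results from the calibration curve:
$$s_c = \frac{s_r}{m} \sqrt{\frac{1}{M} + \frac{1}{N} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$
where
$$\bar{y}_c = \frac{\sum_{i=1}^{M} y_i}{M}, \qquad M = \text{no. replicates of the unknown}$$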
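The last two results can be sketched the same way. The calibration points and the three replicate signals for the unknown are made-up values, and the function name `calibration_uncertainty` is hypothetical. Reading the unknown off the line as $x_c = (\bar{y}_c - b)/m$ is an assumption here, since the slide gives only its standard deviation.

```python
import math


def calibration_uncertainty(xs, ys, unknown_signals):
    """Return (s_b, x_c, s_c) for a linear calibration y = m*x + b."""
    n = len(xs)
    m_reps = len(unknown_signals)        # M, replicates of the unknown
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    m = s_xy / s_xx
    b = y_bar - m * x_bar
    s_r = math.sqrt(max(s_yy - m ** 2 * s_xx, 0.0) / (n - 2))

    # 5. Standard deviation of the intercept.
    sum_x = sum(xs)
    sum_x_sq = sum(x ** 2 for x in xs)
    s_b = s_r * math.sqrt(sum_x_sq / (n * sum_x_sq - sum_x ** 2))

    # 6. Standard deviation of a result read from the calibration curve.
    y_c_bar = sum(unknown_signals) / m_reps
    x_c = (y_c_bar - b) / m              # assumed inversion of y = m*x + b
    s_c = (s_r / m) * math.sqrt(
        1 / m_reps + 1 / n + (y_c_bar - y_bar) ** 2 / (m ** 2 * s_xx)
    )
    return s_b, x_c, s_c


# Hypothetical calibration data and unknown replicates.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.0, 3.9, 6.1, 8.0]
unknown = [5.0, 5.1, 4.9]

s_b, x_c, s_c = calibration_uncertainty(conc, signal, unknown)
print(f"s_b={s_b:.4f}, x_c={x_c:.3f}, s_c={s_c:.4f}")
```

Because of the 1/M term, measuring more replicates of the unknown reduces $s_c$, and the last term shows that $s_c$ is smallest when the unknown's signal falls near the mean signal of the standards.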
Assignment 2


7-2, 7-4, 7-6, 7-11, 7-16, 7-19
SS p. 164