Fundamentals of Analytical Chemistry

Chapters 4, 5, 7, 11, 15
Chapter 7
Statistical Data Treatment and Evaluation
Only starred (*) parts!
We will only discuss the following sections
from Chapter 7
D – Detection of Gross Errors
A – Confidence Intervals
B – Statistical Aids to Hypothesis Testing
Detection of Gross Errors
The only 100% guaranteed justification for
rejecting a point because of gross error is
KNOWING that you made a mistake!
Q – test
Statistical test based on Gaussian distribution
of data
Assumption that gross error leads to an outlier
Largest or smallest point in a data set
Qexp = |xq − xn| / w
xq = outlier (largest or smallest value)
xn = nearest value to outlier
w = range (largest − smallest)
Qexp is then a value for comparison
Q – test is a null hypothesis test
Assume point is valid unless shown otherwise
To do the Q – test we must compare Qexp
to Qcrit
Found in table of Qcrit values (Table 7-5)
Value depends on number of measurements
AND the confidence level
If Qexp > Qcrit then reject the outlier
Treat data as if the point never existed
Do not use for mean, standard deviation, or any
other calculation for the data set!
Trends in Qcrit
As number of observations increases, Qcrit
decreases
More measurements – the more reliable the data
As confidence level increases, Qcrit increases
Interval must contain more points to ensure that
we do not reject a ‘good’ point (one for which
there was not a gross error)
Use extreme caution when rejecting data with
the Q – test
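The Q – test steps can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names and the sample data are hypothetical, and the suspect point is assumed to be whichever end of the sorted data lies farther from its nearest neighbor. Qcrit is passed in by the caller because it must be read from the table for the chosen number of measurements and confidence level. The 0.642 used here is the commonly tabulated 90% value for 5 measurements and should be checked against the table.

```python
def q_exp(values):
    """Return (Qexp, suspect point) for the most extreme value in a data set."""
    data = sorted(values)
    if len(data) < 3:
        raise ValueError("the Q test needs at least 3 measurements")
    w = data[-1] - data[0]               # range: largest - smallest
    if w == 0:
        raise ValueError("all values are identical, so the range is zero")
    gap_low = data[1] - data[0]          # |xq - xn| if the smallest point is suspect
    gap_high = data[-1] - data[-2]       # |xq - xn| if the largest point is suspect
    if gap_low > gap_high:
        return gap_low / w, data[0]
    return gap_high / w, data[-1]


def reject_outlier(values, q_crit):
    """Return (reject?, Qexp, suspect point); q_crit comes from the Qcrit table."""
    q, suspect = q_exp(values)
    return q > q_crit, q, suspect


# Hypothetical data set of 5 measurements.
data = [55.95, 56.00, 56.04, 56.08, 56.23]
# Assumed Qcrit for N = 5 at 90% confidence; verify against the table.
reject, q, suspect = reject_outlier(data, q_crit=0.642)
print(f"suspect = {suspect}, Qexp = {q:.3f}, reject = {reject}")
```

If the point is rejected, drop it from the list before computing the mean, standard deviation, or anything else for the data set.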
Q – test Limitations
Statistical test
90% confidence level for rejection still means
a 10% chance that a good point was rejected
Mathematical limitation
3 data points with 2 points the same value
Q – test will always predict rejection
Confidence Intervals
Goal for sample statistics is to determine
population values
We’ve seen how σ can be approximated by s
More difficult to determine µ
Impossible would be a better term
We can define a range of values which will
‘probably’ include µ
Probably in a statistical sense
Based on Gaussian distribution of data
Called the ‘confidence interval’ (CI)
CI when σ is known
For a single measurement
CI for µ = x ± zσ
z comes from area under Gaussian curve
% confidence level is % area defined by ±z
Very unusual to use a single measurement
For a series of measurements
CI for µ = x̄ ± zσ/√N
We assume no bias (systematic error)
Assume that s is a good approximation
of σ
Symbolized by s → σ
Not usually true
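The known-σ interval can be sketched in Python using only the standard library. This is an illustration under stated assumptions: the function name and the sample numbers are hypothetical, σ is supplied by the caller, and z is obtained from the area under the standard Gaussian curve rather than from a printed table. With a single measurement (N = 1) the expression reduces to x ± zσ.

```python
from math import sqrt
from statistics import NormalDist, mean


def ci_known_sigma(values, sigma, confidence=0.95):
    """Return (low, high) for the mean of `values` when sigma is known."""
    n = len(values)
    # z such that the area between -z and +z equals the confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / sqrt(n)
    center = mean(values)
    return center - half_width, center + half_width


# Hypothetical measurements with an assumed known sigma of 0.2
low, high = ci_known_sigma([10.0, 10.2, 9.8, 10.0], sigma=0.2, confidence=0.95)
print(f"95% CI for mu: {low:.3f} to {high:.3f}")
```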
CI when σ is not known
Must have a greater interval
Equation for t in book
We will use t-table (Table 7-3) to determine
the appropriate value for t
Confidence interval
CI for µ = x̄ ± ts/√N
t is a function of the % confidence level (like z) and
the number of degrees of freedom (unlike z)
Possible from pooled data
Not always!
Note similarity to equation for when σ is known
At a given probability level, t is always equal
to or larger than z
t = z ONLY when the degrees of freedom = ∞
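A short Python sketch of the interval when σ is not known. The function name and the data are hypothetical, and the t value is passed in by the caller because it must be looked up in the t-table for the chosen confidence level and N − 1 degrees of freedom. The 2.776 used here is the commonly tabulated 95% value for 4 degrees of freedom and should be checked against the table.

```python
from math import sqrt
from statistics import mean, stdev


def ci_unknown_sigma(values, t):
    """Return (low, high) for the mean of `values`, using s in place of sigma."""
    n = len(values)
    s = stdev(values)                 # sample standard deviation, N - 1 in the denominator
    half_width = t * s / sqrt(n)
    center = mean(values)
    return center - half_width, center + half_width


# Hypothetical data: N = 5, so 4 degrees of freedom.
# Assumed t for 95% confidence and 4 degrees of freedom; verify against the table.
low, high = ci_unknown_sigma([1.0, 2.0, 3.0, 4.0, 5.0], t=2.776)
print(f"95% CI for mu: {low:.3f} to {high:.3f}")
```

For the same confidence level z is about 1.96, so the interval built from t is wider, as the slide states.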
Comparison of Means
Two possibilities
True value is known
Only possibility of (significant) error is for
the sample
Comparison of measured values
The question is if any difference in two
numbers is attributable to random error.
Are the numbers ‘significantly’ different?
Comparison to a True Value
texp = |x̄ − µ| √N / s
If texp > ttable then there IS a demonstrated
difference at the XX% probability level
XX depends on how you chose your t value
Comparison of Two Means
When both values for comparison are
measured, then you must account for
random error in both means!
texp = (|x̄1 − x̄2| / spooled) √(N1 N2 / (N1 + N2))
Again, if texp > ttable there IS a significant
difference at the XX% confidence level.
Comparison of Standard Deviations
F – test
Fexp = s2² / s1²
For this test, s2 > s1
Fcrit from Table 7-4
Note must use degrees of freedom for both
numerator and denominator
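The three test statistics can be sketched in Python as follows. This is an illustration, not a definitive implementation: the function names and data are hypothetical, and the pooled standard deviation uses the standard definition (variances weighted by their degrees of freedom), which the slides do not spell out. The critical values (ttable, Fcrit) are not computed here and must be read from the tables.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_vs_true_value(values, mu):
    """texp for comparing a sample mean with a known true value mu."""
    return abs(mean(values) - mu) * sqrt(len(values)) / stdev(values)


def pooled_std(a, b):
    """Pooled standard deviation of two data sets (standard definition, assumed)."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def t_two_means(a, b):
    """texp for comparing two measured means using the pooled standard deviation."""
    na, nb = len(a), len(b)
    return abs(mean(a) - mean(b)) / pooled_std(a, b) * sqrt(na * nb / (na + nb))


def f_exp(a, b):
    """Fexp with the larger variance in the numerator, so Fexp >= 1."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


# Hypothetical data sets
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(f"texp vs true value 2.0: {t_vs_true_value(a, 2.0):.3f}")
print(f"texp for two means:     {t_two_means(a, b):.3f}")
print(f"Fexp:                   {f_exp(a, b):.3f}")
```

Compare each result with the tabulated value: ttable for N − 1 degrees of freedom (true value) or N1 + N2 − 2 (two means), and Fcrit for the numerator and denominator degrees of freedom.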