STATISTICAL TREATMENT OF EXPERIMENTAL DATA
S-1. INTRODUCTION ...................................................................................................................................................... 1
S-1.1. TYPES OF EXPERIMENTAL ERRORS............................................................................................................................ 1
S-1.2. RULES FOR TREATMENT OF DATA ............................................................................................................................. 3
S-2. PROPERTIES OF THE SAMPLING UNIVERSE............................................................................................... 4
S-2.1. CENTRAL VALUES ...................................................................................................................................................... 4
S-2.2. DISPERSION ................................................................................................................................................................. 4
S-3. REPEATED MEASUREMENTS OF SINGLE QUANTITY ............................................................................. 5
S-3.1. ESTIMATE OF CENTRAL VALUES ............................................................................................................................... 5
S-3.2. ESTIMATE OF DISPERSION VALUES ........................................................................................................................... 5
S-4. MEASUREMENTS OF LINEAR RELATIONSHIPS......................................................................................... 7
S-4.1. LEAST-SQUARES FIT OF Y = MX + B .......................................................................................................................... 7
S-4.2. PROPER APPLICATION OF LEAST-SQUARES FITS ...................................................................................................... 8
S-5. QUALITY OF RESULTS.......................................................................................................................................... 2
S-5.1. REJECTION OF DATA .................................................................................................................................................. 2
S-5.2. CONFIDENCE INTERVALS ........................................................................................................................................... 5
S-5.3. SIGNIFICANT FIGURES AND ROUNDING ERRORS ...................................................................................................... 6
S-6. RESULTS DERIVED FROM MEASURED QUANTITIES............................................................................... 7
S-6.1. ERROR PROPAGATION ................................................................................................................................................ 7
S-6.2. ESTIMATES OF PRECISION .......................................................................................................................................... 7
S-6.3. ESTIMATES OF ACCURACY ......................................................................................................................................... 8
S-7. TABLES FOR STATISTICAL TREATMENT OF DATA ............................................................................... 10
S-7.1. VALUES OF T FOR 95% CONFIDENCE INTERVALS ................................................................................................... 10
S-7.2. VALUES OF Q FOR DATA REJECTION ...................................................................................................................... 11
S-7.3. VALUES OF TC FOR DATA REJECTION -- CHAUVENET'S CRITERION ....................................................................... 12
S-7.4. PRECISION AND ACCURACY OF VOLUMETRIC GLASSWARE ................................................................................... 13
S-7.5. MEASURED PRECISION OF LABORATORY BALANCES ............................................................................................. 13
S-7.6. TABLE OF ATOMIC WEIGHTS WITH UNCERTAINTIES .............................................................................................. 14
S-7.7. TABLE OF CONSTANTS AND CONVERSION FACTORS WITH UNCERTAINTIES ......................................................... 16
S-7.8. SUMMARY OF COMPUTATIONAL FORMULAS .......................................................................................................... 17
S-1. Introduction
It is common experience that repeated laboratory measurements of any quantity yield numerical
results that vary from one time to another. Similarly, values of physical properties that are derived
computationally from directly measured quantities usually vary from one determination to another.
Finally, values of properties obtained by one observer commonly vary from those of other observers or
from accepted (literature) values where the latter exist. All of these different types of variations of
physical values are known collectively as "experimental errors."
Experimental errors can never be totally eliminated, but their effect can be minimized by proper
application of statistics. While it is desirable for any scientist or engineer to understand the mathematics
of probability and statistics, it is possible to use statistical analysis correctly even without detailed
understanding. Thus, statistics can be applied just like any other tool.
S-1.1. Types of Experimental Errors
There are basically three types of experimental errors: 1) blunders; 2) systematic, or determinate
errors; and 3) random, or indeterminate errors. The first two types of errors can in principle be nearly
eliminated. The third type can never be eliminated, but its influence can be quantitatively evaluated to
yield the greatest possible information about the value that is sought.
Systematic errors cause inaccurate results, even when the precision of the measurement is excellent.
Random errors reduce the precision of the results, but the accuracy may still be perfect within the
confidence limits of precision. Undetected blunders may contribute to both inaccuracy and imprecision.
S-1.1.1. Blunders:
Under the heading of blunders we can include such physical errors as spilling or splashing a portion
of the sample being measured, contaminating the sample, using the wrong sample (because of labelling
error, carelessness in reading labels, or losing the label), misreading an instrument or other apparatus, etc.
Also included in this type of error would be arithmetic or algebraic errors involved in calculations of
derived values (including entering the wrong value in a calculator or computer by mistakenly pressing a
wrong key). Whenever any of these types of blunders are noticed, they must be corrected if possible, or
the particular sample must be eliminated from consideration if the error cannot be rectified.
Sometimes a blunder goes unnoticed but its effect becomes evident during statistical analysis of the
data. At that time, the particular value(s) so affected can be eliminated from further consideration.
S-1.1.2. Systematic Errors:
Systematic, or determinate, errors are most commonly thought of as involving inaccurate calibrations
of equipment. When such errors exist, the most precise measurements imaginable will result in incorrect
results which are not detectable or correctable by statistical methods. For example, if a 100 ml volumetric
flask actually has a volume of 100.1 ml, then the use of this flask will always give erroneous results unless
either the flask is accurately calibrated to learn its true volume, or some other error compensates for the
erroneous volume. All volumetric glassware, balances, electrical components, etc. are calibrated by their
manufacturers, but only to within a certain "tolerance" range of a truly accurate value. For very exact
work, it is important that the experimenter calibrate his or her own apparatus.
Assumptions regarding the purity of chemical reagents introduce another type of systematic error
which can be minimized only by assaying the purity of the compound or by rigorously purifying it, or
both. Sometimes the manufacturer supplies an assay of the material; in such cases the actual value of the
concentration should be used.
The use of obsolete values of physical constants, conversion factors, atomic masses, etc. introduces
additional systematic errors into derived values, just as does the use of improperly calibrated standards.
Tables of constants and conversion factors are given in Section S-7.7. Numerical values of many of these
factors are periodically reviewed and revised using more and more refined measurement techniques as
they become available. Sometimes definitions (e.g., atomic mass scale, conversion factor from liters to
cubic centimeters, etc.) are changed by international agreement, requiring revision of values of some
other physical constants or conversion factors. Only by the use of the most current values can the
contribution of this type of systematic error be minimized.
Finally, rounding off of numerical values during computation introduces an additional systematic
error which in its effect is equivalent to reducing the precision of calibration of an instrument. Though
the influence of round-off error can in some situations be estimated statistically, it actually is a systematic
error. It can be minimized, or effectively eliminated, by retaining extra figures during computation, but
usually its effect cannot be corrected or even discovered after the fact. This problem is discussed more
fully in Sec. S-1.2.3.
S-1.1.3. Random Errors:
The third category of experimental error, i.e., random or indeterminate error, includes a great number
of types of phenomena or conditions normally external to the experimental system, each of which can
have (usually) small influences on the results of the experiment. Generally it is considered impossible to
determine even what all the influences are and certainly impossible to determine their individual or
collective effects.
Just a few of the types of possible influences here might include minor fluctuations in atmospheric
pressure or temperature, color and intensity of incident light, voltage of electric supply, electromagnetic
fields, gravitational field (as influenced by the phase of the moon, for instance), sunspots, physical and
emotional conditions of the observer, etc. Fortunately, if there are enough different contributors to
random experimental error (as there almost certainly are), statistical theory tells us a great deal about the
overall effect of such errors, and this effect is easily quantified.
S-1.2. Rules for Treatment of Data
Unless you are specifically instructed to do otherwise, you are to follow the data treatment rules
given below for every experiment you perform in the physical chemistry laboratory. The application of
some of these rules is self-evident and requires no knowledge of statistics. For others, the required
knowledge of statistical computations and analysis can be obtained from the following pages. Even if
you have had some prior involvement with statistical treatment of data, you should at least peruse the
following pages. It is your responsibility to use the formulas and terms as they are described here.
Failure to follow any of these rules faithfully will be penalized (in terms of your grade) just as if you
had used an incorrect thermodynamic function or had made a serious error in arithmetic.
S-1.2.1. Data Entry.
Always enter data directly into your laboratory notebook (never on a separate piece of paper first),
preferably with ball point pen so a good carbon copy will result. If you enter data first on a separate
piece of paper, to be transferred into your notebook later, your valuable data may be confiscated and
destroyed by your instructor.
If a value includes a decimal fraction, be sure to make the decimal point very distinct on both the
original and the carbon copy. If the value is less than unity, either place a zero before the decimal point
or use scientific notation with a non-zero integer preceding the decimal point. (Follow these rules for
intermediate and final values in your lab report as well as for original data values.)
If you make a "blunder" type of error and you realize it at the time, cross out the erroneous value
with a single line (don't obliterate). It is a good idea to append a notation indicating why the value was
crossed out.
If, while working in the laboratory you suspect you might have made some kind of blunder in
obtaining or recording a particular data value but you aren't certain, proceed as follows. Indicate the
questionable value, but don't cross it out. Indicate, by a written notation, your reason for questioning the
value. When you perform the appropriate calculations with the data, if the result from the questionable
value does not appear "out of line" with other values, retain it; otherwise eliminate it from further
consideration.
S-1.2.2. Statistical Rejection of Data.
When you have completed all, or an appropriate portion, of the calculations with your data,
determine by the following means whether any individual values are outlying points. If no more than
ten individual data values are involved, reject any that fail the Q-test. If more than ten values are
involved, eliminate those data which are indicated to be outlying points by Chauvenet's criterion.
Detailed instructions for the use of both of these rejection decision methods are provided in Sec. S-5.1.
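The Q statistic itself is simple to compute: it is the gap between the suspect value and its nearest neighbor, divided by the total range of the data. A brief sketch in Python (the critical values listed are the widely tabulated 90%-confidence Dixon Q values, assumed here for illustration; for your lab work, use the table in Sec. S-7.2):

```python
# Sketch of the Q-test for a single suspected outlier in a small data set.
# ASSUMPTION: Q_CRIT_90 holds the widely used 90%-confidence Dixon Q values;
# for lab work, substitute the critical values tabulated in Sec. S-7.2.
Q_CRIT_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(values):
    """Return (suspect, Q, reject) for the most extreme value in `values`."""
    data = sorted(values)
    spread = data[-1] - data[0]                 # total range of the data
    q_low = (data[1] - data[0]) / spread        # gap test for the lowest point
    q_high = (data[-1] - data[-2]) / spread     # gap test for the highest point
    if q_low >= q_high:
        suspect, q = data[0], q_low
    else:
        suspect, q = data[-1], q_high
    return suspect, q, q > Q_CRIT_90[len(data)]

# Is 5.40 an outlier among these five measurements?
suspect, q, reject = q_test([5.31, 5.28, 5.34, 5.30, 5.40])
print(suspect, round(q, 2), reject)
```

Here Q = 0.50, below the n = 5 critical value of 0.64, so the point is retained.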
S-1.2.3. Rounding Off.
During computations, always retain at least two more significant figures than those which you would
have retained according to the rules you learned (reviewed in Sec. S-5.3) in your introductory chemistry
course. When you have completed your computations, you are to round off the final results to two
significant figures in the confidence interval limits.
S-1.2.4. Use of Averages.
Values which represent replicate measurements of the same property of the same sample may be
averaged, and the average value is then used in further computations. The standard error of the average
is used in computing confidence limits in the final results.
S-1.2.5. Linear Functions.
When either raw data or intermediate computed values are to be fitted to a straight line, the values of
the slope and intercept are to be obtained by a least-squares fit. In the process of fitting you will also
obtain the standard error of the y-values (actually the estimated standard deviation of residuals) as well as
the slope and intercept. Apply the appropriate criterion to the residuals to decide whether to reject any
data points as outliers. If you do reject any, recompute the least-squares fit with the remaining data.
Use the standard errors of the slope and/or intercept in computing confidence limits of any final results
derived therefrom.
S-1.2.6. Confidence Limits of Results.
Final results of measurements are to include confidence limits for, and physical units of, the values
obtained. This procedure is illustrated in more detail in Sec. S-6.3. If your final answer contains more than
two "significant figures" in the confidence limits, it will be graded as a computational error.
Along with the final results, report a "literature value" (a table of results is an appropriate way to
report these) if one can be found. Give a complete reference to the source of your literature value, even if
it was obtained from a textbook or handbook. If the literature value falls outside the confidence limit
range of the value you obtained, a systematic error is implied. In your discussion, consider possible
systematic errors that could result in the observed discrepancy. This is discussed further in Sec. S-6.3.
S-1.2.7. Expected Errors.
For one data point (of each kind, if more than one kind) you are to estimate the precision involved in
each measurement or observation. Estimates of errors involved in measuring volumes and masses are
tabulated in Secs. S-7.4 and S-7.5. Errors in other types of measurements must be estimated.
Counting numbers (e.g., the charge on an ion, the exponents in an equilibrium constant expression,
etc.) are considered to contain absolutely no error.
When all expected errors have been assessed, the expected errors in derived quantities are to be
computed. The expected errors (precision) are to be compared with the standard error determined from
experimental results. If the standard error is larger than the expected error by a factor of two or more,
include an analysis of why your experimental results show poorer precision than anticipated. An
example is given in Sec. S-6.2.
S-1.2.8. Validity of Least-Squares Fit.
If your computation involves a linear fit of data, prepare a graph of the residuals in the least-squares
fit. The appearance of the plot of residuals is to be discussed briefly in terms of the applicability of linear
least-squares fit of the experimental data. This type of analysis is described in Sec. S-4.2.
S-2. Properties of the Sampling Universe
If an infinite number of measurements of a quantity were made, they would be "distributed normally,"
i.e., the most common value would be the true value (assuming no systematic error). Values increasingly
far from the "central" (true, or most common) value would be less frequently observed. If a graph were
plotted with measured value as the abscissa and the frequency of observation of that value as the
ordinate, the result would be the bell-shaped Gaussian, or normal, distribution curve. This distribution
of infinitely many (all possible) measurable values is known as the "sampling universe" or frequently
simply as the "universe". It is from this universe that our sample of a few measurements is drawn. We
assume that our sample is "representative" of the universe and use it to estimate properties of the
sampling universe itself.
S-2.1. Central Values
We shall be especially concerned with two particular features of the normal distribution. The value
having the maximum frequency (corresponding to the peak of the curve) is the "mode" or "most probable
value," and for the normal distribution it is identical to the "mean" or "average" value of the distribution.
The mean of a sampling universe will be denoted by µ, and the average of a measured sample will be
denoted by a bar over the symbol for the quantity being measured. Thus, µx is the mean of the sampling
universe for the quantity x, and x̄ is the average of an actual group of measurements of x.
Ideally, x̄ = µx, but this is not normally encountered, especially if the sample of measurements is
small. However, statistical treatment of the data allows us to estimate the probability, or likelihood, that
the value of µx deviates from the value of x̄ by a given amount.
S-2.2. Dispersion
The second quantity of the distribution that is significant to us is a measure of the dispersion, or
breadth of the distribution curve. As an example, we might have two sets of measurements of some
property each with an average of 100. If one set had all values between 99 and 101, and the other had all
values between 90 and 110, we would say the first was a narrower distribution. We would probably have
more confidence in the value of x̄ as a representation of µ in the first case, even though x̄ is the same in
both cases.
Conceivably we could use the total range of values of a distribution as a measure of its breadth, but a
normal distribution (of an infinite sampling universe) extends infinitely in both directions from the
central value, even though finite samples have finite ranges. It turns out that a quantity σ, the "standard
deviation" of the sampling universe, is easily estimated and has much utility. We can compute a
standard deviation of a finite sample (denoted s) and from this we can estimate the value of σ. Much
useful information can be obtained from a knowledge of the estimated values of µ and σ.
S-3. Repeated Measurements of Single Quantity
S-3.1. Estimate of Central Values
Suppose we measure the height of a column of mercury five times and obtain values of 5.31 cm, 5.28
cm, 5.34 cm, 5.30 cm, and 5.27 cm, and we wish to obtain an estimate of the "true" height of the mercury
column together with a measure of the amount of confidence we can place in the value. It seems natural
to use x̄ as a measure of µ. We can calculate its value the same way we have since grade school, i.e., add
up all the values and divide by the number of values. However, we will soon find such word-based
definitions too cumbersome to be useful and we desire a more compact, efficient notation.
The "sigma" notation is universally used in statistics, and it is mandatory that you master its use. In
this notation each value is assigned an index number which is carried as a subscript. The index numbers
used in any given set of measurements are the consecutive counting numbers beginning either with 0 or
with 1 (we shall always begin with 1 in this work). These might be applied to our set of data for the
height of the mercury column in the following way:
  Index number i      Measured value xi
        1             x1 = 5.31 cm
        2             x2 = 5.28 cm
        3             x3 = 5.34 cm
        4             x4 = 5.30 cm
        5             x5 = 5.27 cm
Translated into sigma notation, our formula for computing the mean of the sample becomes:
  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \left( x_1 + x_2 + \cdots + x_n \right)
n
The summation symbol, \sum_{i=1}^{n}, means "add up the quantities that follow, with i running sequentially from
1 to n." The value of n is simply the total number of sample points. Frequently, the limits of a summation
are simply understood to include all the sample points and the index is not actually indicated:

  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \sum_i x_i = \frac{1}{n} \sum x
S-3.2. Estimate of Dispersion Values
We say that x̄ is the best estimate we can get of µ without making additional measurements, but we
can say something about how good the estimate is if we know the value of the standard deviation, σ. The
actual use of the standard deviation for this purpose is treated in Sec. S-5.2.
The definition of the standard deviation of the sampling universe is stated mathematically as:
  \sigma = \lim_{n \to \infty} \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \right]^{1/2}
It seems natural to define the standard deviation of a finite sample in an analogous way as:
  s = \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{1/2}
Although we can compute s directly, we would really like to have a value for σ. Since it cannot be
computed directly, we need a best estimate, σ̂, of the desired value σ. (In statistics, a caret is placed
above a symbol to indicate that the value to which it refers is an estimate rather than a true value.
For example, σ̂ is an estimate of the standard deviation, and µ̂ = x̄ is an estimate of the mean.) The
relationship we seek is:

  \hat{\sigma} = \left[ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{1/2} = s \left( \frac{n}{n-1} \right)^{1/2}

Note that on many handheld calculators the symbol s appears as part of the statistical functions;
however, the value that is calculated is σ̂.
Now let us apply these definitions and formulas to our measurements of the height of a column of
mercury.
  \bar{x} = \frac{1}{n} \sum x_i = \frac{1}{5} (5.31 + 5.28 + 5.34 + 5.30 + 5.27)\ \mathrm{cm} = \frac{26.5\ \mathrm{cm}}{5} = 5.30\ \mathrm{cm}

  s = \left[ \frac{1}{n} \sum (x_i - \bar{x})^2 \right]^{1/2}
    = \left[ \frac{1}{5} \left( (5.31-5.30)^2 + (5.28-5.30)^2 + (5.34-5.30)^2 + (5.30-5.30)^2 + (5.27-5.30)^2 \right) \right]^{1/2} \mathrm{cm}
    = \left[ \frac{1}{5} \left( 0.01^2 + 0.02^2 + 0.04^2 + 0.00^2 + 0.03^2 \right) \right]^{1/2} \mathrm{cm}
    = 0.0245\ \mathrm{cm}

  \hat{\sigma} = s \left( \frac{n}{n-1} \right)^{1/2} = 0.0245\ \mathrm{cm} \times \left( \frac{5}{4} \right)^{1/2} = 0.0274\ \mathrm{cm}
We feel "intuitively" (i.e., on a basis of experience) that by taking the average of the measurements,
the result probably will be closer to the mean than would any single measurement chosen at random.
Indeed, mathematical statistics tells us that the standard deviation of the means, known to statisticians as
the standard error of the mean, of all samples of size n is:
  \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
Though we lack absolute knowledge of the value of σ, we have our best estimate, σ̂. Therefore, to a good
approximation, we have:
  \hat{\sigma}_{\bar{x}} \approx \frac{\hat{\sigma}}{\sqrt{n}} = \frac{0.0274\ \mathrm{cm}}{\sqrt{5}} = 0.0123\ \mathrm{cm}

  \hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n-1}} = \frac{0.0245\ \mathrm{cm}}{\sqrt{4}} = 0.0122\ \mathrm{cm}
Note that even though these two sequences of computation are algebraically equivalent, the results differ
by 1 in the third significant figure. The reason for this is that intermediate results were rounded off,
introducing round-off error. This type of error in statistical computations is discussed further in Sec. S-5.3.
Computation of such statistical parameters as x̄, σ̂, and the standard error of the mean can be tedious
and is subject to errors simply because of the length of the procedure. Careful use of the statistical
functions available on most modern scientific calculators significantly reduces this problem.
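These computations are also easily checked with a few lines of code. The sketch below (plain Python, no statistics library assumed) reproduces the mercury-column example above:

```python
import math

# Repeated measurements of the mercury-column height (cm) from Sec. S-3.1.
x = [5.31, 5.28, 5.34, 5.30, 5.27]
n = len(x)

mean = sum(x) / n                        # x-bar
ss = sum((xi - mean) ** 2 for xi in x)   # sum of squared deviations
s = math.sqrt(ss / n)                    # sample standard deviation s
sigma_hat = math.sqrt(ss / (n - 1))      # best estimate of sigma
sem = sigma_hat / math.sqrt(n)           # standard error of the mean

print(f"x-bar      = {mean:.2f} cm")
print(f"s          = {s:.4f} cm")
print(f"sigma-hat  = {sigma_hat:.4f} cm")
print(f"std. error = {sem:.4f} cm")
```

Carrying full precision throughout gives a standard error of 0.0122 cm, matching the unrounded path of the hand computation and avoiding the round-off discrepancy noted above.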
S-4. Measurements of Linear Relationships
Suppose a functional relationship between x and y exists such that y = f(x, a, b, c, ...), where values of y are
measured for fixed, precisely known values of x, and the best estimates of the constants a, b, c, ... are to be
determined for the particular relationship. Then if the measured values of y are normally distributed
about the values of the function calculated from the mathematical relation, it can be shown that the best
estimates of the constants are those for which the sum of the squares of the differences between measured
and calculated values of y is a minimum. This is known as the "least-squares criterion of fit." Note that
this criterion is valid only if the random errors are assumed to exist solely in the values of y, and not in
those of x.
It is possible to obtain standard deviations of the various estimated values of parameters also.
Results of application of the method to the equation of a straight line are as follows.
S-4.1. Least-Squares Fit of y = mx + b
This is probably the best known of all least-squares applications. The computational formulas for
obtaining the best estimates of the values of the slope and intercept are as follows:
  \hat{m} = \frac{n \sum xy - \sum x \sum y}{D}

  \hat{b} = \frac{\sum x^2 \sum y - \sum x \sum xy}{D} = \bar{y} - \hat{m} \bar{x}

where D = n \sum x^2 - \left( \sum x \right)^2 = the denominator.
The standard error of estimate of the y values relative to the least-squares fitted line is given by:
  \hat{\sigma}_{\hat{y}} = \left[ \frac{\sum \left( y - \hat{m} x - \hat{b} \right)^2}{n-2} \right]^{1/2}
and the standard errors of estimate of the slope and intercept are given by:
  \hat{\sigma}_{\hat{m}} = \hat{\sigma}_{\hat{y}} \left( \frac{n}{D} \right)^{1/2} \qquad \hat{\sigma}_{\hat{b}} = \hat{\sigma}_{\hat{y}} \left( \frac{\sum x^2}{D} \right)^{1/2}
As with the mean and standard deviation, many scientific calculators and most computer data
analysis programs can perform linear (and often non-linear) least-squares fitting.
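The formulas above translate directly into code. The sketch below (plain Python; the x, y values are illustrative only, not data from any experiment in this manual) computes the slope, intercept, and their standard errors:

```python
import math

# Illustrative data only; replace with your own measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
Sx, Sy = sum(x), sum(y)
Sxx = sum(xi * xi for xi in x)
Sxy = sum(xi * yi for xi, yi in zip(x, y))
D = n * Sxx - Sx ** 2                  # the denominator D

m_hat = (n * Sxy - Sx * Sy) / D        # slope estimate
b_hat = (Sxx * Sy - Sx * Sxy) / D      # intercept estimate

# Standard error of estimate of the y values about the fitted line:
ss_res = sum((yi - m_hat * xi - b_hat) ** 2 for xi, yi in zip(x, y))
sigma_y = math.sqrt(ss_res / (n - 2))
sigma_m = sigma_y * math.sqrt(n / D)   # standard error of the slope
sigma_b = sigma_y * math.sqrt(Sxx / D) # standard error of the intercept

print(f"m = {m_hat:.3f} +/- {sigma_m:.3f}")
print(f"b = {b_hat:.3f} +/- {sigma_b:.3f}")
```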
S-4.2. Proper Application of Least-Squares Fits
The first assumption in the derivation and use of a least-squares fit is simply that the equation of the
line is exactly the same mathematical form as the relation between the variables in the actual physical
system or process, and that the purpose of using the fitting technique is to obtain the best estimate of the
numerical values of the parameters that are involved.
The second assumption is that the errors in the measured values of the dependent variable (y) are
normally distributed with zero mean, and that the distribution of these errors is the same over the whole
data set. That is, if a sufficiently large number of data points were available, and if they were divided
into groups (clusters) of adjacent points, the means and standard deviations of errors theoretically
should be the same for all groups.
The third assumption is that errors in the measured values of the independent variable (x) are
nonexistent or, in practical terms, negligible relative to the errors in the measured values of the
dependent variable.
If any of the above assumptions is invalid for a particular set of data, then a least-squares fit does not
provide the best estimate of the values of the desired parameters. It may give better estimates than any
other conveniently available method but it certainly cannot be considered to have the same reliability as if
the above assumptions were valid. There is a sensitive test, to be described below, which should be applied
whenever fitting data to a line under conditions such that the validity of the assumptions is not known
for certain.
When some or all of the measurements involve replicate determinations of y at each of several values
of x, the following considerations should be observed carefully. Unless the standard error of estimate of
the y values is exactly the same for each value of x, do not apply the above formulas to the xi's and the
corresponding average values of yi. To do so would result in obtaining estimates of the desired
parameters that may be considerably poorer than the desired best estimates. (It would violate the second
assumption stated above.) The proper procedure is to apply the appropriate formulas to all the measured
values individually. Though this may seem to be an excessive amount of computation, it actually
involves about the same number of data entry operations when using a programmed computer or
calculator as it does if the data are first averaged and then subjected to a least-squares fit.
In addition to consideration of whether the equations are properly applied to the data, it is
appropriate to consider the question of whether the least-squares technique should be used at all.
Once the fitted values have been obtained for the parameters of the model equation, it is a simple
matter to compute the fitted value of y corresponding to each experimental value of x. From this, the
deviation of each experimental value of y from its corresponding fitted value, known as the "residual" of
y, is obtained by subtraction of the latter from the former. If a computer data analysis program is used
for the least-squares computations, it is generally a simple matter to calculate the residual values in
addition to the fitted values of y.
Then a graph is prepared using fitted value of y as the abscissa and the corresponding residual value
(with algebraic sign) as the ordinate. From a simple examination of such a scatter diagram of residuals,
qualitative information about the appropriateness of fit is obtained. If the field of residuals seems to
represent a straight line at an angle to the horizontal axis, as in Fig. 1a, it almost certainly indicates an
error either in the least-squares computations or in entering the data into the computations. This is a
more sensitive test than a simple plot of the computed line through the field of actual data points, as in
Fig. 1b. Also, if the mean value of the y-residuals (with algebraic sign) is significantly different from zero,
it certainly indicates some error in fitting the data. This case is illustrated in Fig. 2.
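The residual computation itself is straightforward once a fit is in hand. A minimal sketch (the data and the fitted slope and intercept are illustrative assumptions, not results from any experiment in this manual):

```python
# Compute residuals about an assumed fitted line and check that they
# average to (essentially) zero, as a correct least-squares fit requires.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m_hat, b_hat = 1.99, 0.05    # assumed results of a prior least-squares fit

fitted = [m_hat * xi + b_hat for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # measured minus fitted

mean_residual = sum(residuals) / len(residuals)
print([round(r, 2) for r in residuals])   # the values you would plot
print(f"mean residual = {mean_residual:.3g}")
```

Plotting these residuals against the fitted y values gives the scatter diagram discussed above.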
[Figures 1a, 1b, 2a, and 2b: scatter plots of data and of data residuals against the variable, illustrating the cases discussed above.]
If the field of points is wider at one side of the diagram and narrower at the other, or if it is wider or
narrower in the center than at the sides, the second assumption behind the least-squares method is not
valid. This occurs frequently when a nonlinear functional relation between x and y is "linearized" by
algebraic manipulation. A common example of this is the case of an exponential function. For example,
the vapor pressure of a liquid is an exponential function of the reciprocal of temperature. However, we
do not have a convenient least-squares equation for an exponential relation, so we linearize it by taking
the logarithm of both sides of the equation. The result is a function of the type
  \log p = m \left( \frac{1}{T} \right) + b
If we let log pi = yi and 1/Ti = xi, we have a general linear equation and it is a simple matter to apply a
least-squares analysis to evaluate m and b. However, it is common experience that errors in measured
values of pi form a single normal distribution, not the errors in log pi. Stated another way, the leastsquares fitting method in this case assumes a single distribution of the relative errors in p, where the
usual physical situation results in a single distribution of the absolute errors in p. For this reason, the
scatter diagram of y-residuals obtained from such an experimental analysis almost invariably is wider for
small values of log p than for large values. Thus, the line so obtained is unlikely to be the best possible fit
of the data, but it may not be greatly in error, so the linearizing technique is commonly used.
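The linearizing technique just described is easy to sketch numerically. The vapor-pressure values below are purely hypothetical (not measurements of any real substance); the sketch fits the linearized form by least squares and extracts the y-residuals whose scatter plot serves as the diagnostic discussed above.

```python
import math

# Hypothetical vapor-pressure data (T in K, p in torr) -- illustrative values
# only, not measurements of any real substance.
T = [283.0, 293.0, 303.0, 313.0, 323.0]
p = [9.2, 17.5, 31.8, 55.3, 92.5]

# Linearize: x = 1/T, y = log10(p), then fit y = m*x + b by least squares.
x = [1.0 / Ti for Ti in T]
y = [math.log10(pi) for pi in p]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
m = sxy / sxx        # slope (negative here: p rises with T, so falls with 1/T)
b = ybar - m * xbar  # intercept

# The y-residuals; plotting these against x is the sensitive test described
# in the text.
residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
print(m, b, residuals)
```

Because the fit is performed on log p, equal absolute errors in p produce larger residuals at small p, which is exactly the widening of the residual scatter the text describes.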
This type of fitting problem is illustrated by Fig. 3, which involves a least-squares fit of calibration
data for a thermistor, whose resistance is an exponential function of the reciprocal of Kelvin temperature.
In Fig. 3b, ln R is the ordinate and 1/T is the abscissa. Errors in resistance are approximately normally
distributed.
[Figures 3a and 3b: thermistor calibration data, ln R versus 1/T, with the fitted line, and the corresponding plot of y-residuals.]
If the field of points in the graph of residuals seems to represent a curve rather than a horizontal
straight line of zero y-deviation, it indicates that the mathematical model does not adequately represent
the experimental data. This is almost always the case when the logarithm of vapor pressure is fitted to a
linear function of reciprocal of absolute temperature for careful experimental work. The reason is found
by referring to the basis for the log P vs 1/T relation, i.e. the Clausius-Clapeyron equation. The
derivation of the latter equation involves several assumptions that are not adequately correct for a
temperature range of more than a few degrees. Thus if high quality experimental data are plotted in the
appropriate form for the Clausius-Clapeyron equation, they lie on a curve rather than a straight line. The
curvature may not be great, however, and probably would not be noticed, but the scatter diagram of y-residuals again provides a very sensitive test for the appropriateness of the model.
As a specific example, if a least-squares analysis is applied to vapor pressure data for water from a
handbook, as in Fig. 4, this curvature is very noticeable. Furthermore, the normal boiling point will not
be calculated (from the fitting equation) to be 100.0 °C, nor will the calculated heat of vaporization agree
with the accepted value. In addition, the actual calculated values of the boiling point and of the heat of
vaporization will depend on how many points, and which ones, are used in the computation.
[Figures 4a and 4b: least-squares fit of handbook vapor-pressure data for water, and the corresponding y-residual plot, which is distinctly curved.]
Finally, it should be reiterated that the test described here is very sensitive, and should not negate the
reasonableness of using approximate models for physical systems. While the graph of residuals serves to
identify many situations in which the residuals do not belong to a single distribution (at least
approximately), it does not in any way tell us whether the distribution is reasonably normal. For
example, the residuals involved in Fig. 5 are normally distributed, but this fact cannot be discerned by
simple visual inspection. There are rather sophisticated tests available to answer this question, but they
will not be considered here because the question is not usually of serious concern when dealing with
analysis of experimental data.
[Figures 5a and 5b: data and y-residual plots for a case in which the residuals are normally distributed.]
Whether errors associated with the measurement of x values are non-existent or negligible, as
required for a least-squares fit to be the best possible, is usually judged by consideration of the physical
system. Thus, the effectiveness of control of the independent variable and the precision of the method
used for measuring its value (time, temperature, composition, etc.) will determine the relative precision
involved. If this is not already known for a given system or apparatus, it is subject to independent
experimental evaluation. Then the magnitude of σ̂_x can be compared with the magnitude of σ̂_y
obtained in the least-squares fit of the data. If σ̂_x/x is not less than about one percent of σ̂_y/y, the
question of whether the least-squares estimate of parameters is truly the best estimate becomes pertinent.
Though it is possible to obtain an even better fit using the experimentally determined value of σ̂_x, such
refinements are somewhat complicated and will not be treated here.
S-5. Quality of Results
By combining estimates of dispersion of the errors in experimental data with estimates of mean
values or of least-squares fitted values, it is possible to deduce additional information about the quality of
results. In fact, when dealing with experimental data there probably is no other reasonable justification
for performing the calculations to obtain estimates of the standard deviation.
In most of the cases presented here, the recommended procedures can be justified by mathematical
deduction that is rigorous and is indisputably correct provided the basic assumption of normal
distribution of errors is correct. This basic assumption is justified by the Central Limit Theorem, which is
mathematically provable, but an absolute proof of the assumption is unavailable because there is no
way of deciding how nearly the limiting conditions inherent in the theorem are met by the actual
experimental situation. In a few of the cases discussed, no treatment is available which could be proven
rigorously. Such cases will be identified, and a justification will be presented for recommending them.
The user is generally free to make a choice of how to deal with such situations without fear of being
proved wrong (or right).
S-5.1. Rejection of Data
One question that frequently arises in the analysis of sets of measurements regards criteria for
rejection of data that seem "out of line" with the remaining data. This must always involve a subjective
judgement--there is no absolute, rigorously provable, basis for rejection. In fact, an experimental purist
might say that a data point must never be rejected unless it is known to have been the result of something
faulty. However, if the third measurement in our mercury column example had been 15.34 cm rather
than 5.34, few if any persons would object to its rejection. The question then becomes, for practical
purposes, at which point do we reject? Many people use one or another of the many rejection criteria that
have been proposed, and we shall comment on a few of them.
S-5.1.1. Probability Distributions for Small Sample Size.
All statistical, "objective" criteria for rejection are based on a single concept. A decision is made, a priori,
to reject any sample points that have less than a chosen probability of belonging to the same distribution as the rest of the data points. Once the probability level for
rejection has been chosen (a subjective matter), then statistical theory can be applied objectively to
determine which data points meet or fail the established criterion for retention, and all other data points
are rejected. After such a technique has been applied to the original set of data, the statistical properties
of the remaining values are computed, and a rejection criterion is never applied a second time to the
remaining values.
From the central limit theorem it can be deduced that the measurements of a single physical quantity
can be described in terms of a statistical distribution which becomes asymptotically (in the limit of
infinitely many measurements) indistinguishable from a normal (Gaussian) distribution with x̄ = µ and
s = σ. From this, it is easily demonstrated that the statistic Z = (x_i - µ)/σ can be described in terms of a
standard normal distribution, i.e., a normal distribution with mean = 0 and standard deviation = 1. As
the total area under the standard normal distribution curve is unity, the area under the curve between -Z
and +Z is equal to the probability that an individual value x_i lies within that range. The probability of a data point
lying between Z = -1 and Z = +1 is 0.68268, and the probability of it being between Z = -2 and Z = +2 is
0.95450. However, you should note carefully one feature of this discussion that is frequently overlooked
in the establishment of rejection criteria and in the reporting of confidence intervals for measured data.
The statistic Z is stated in terms of σ and of µ, for which we do not know the actual values. It is true
that we have best estimates, σ̂ for σ, and x̄ for µ. However, these are but estimates and we know only
that they are the best we can obtain in the absence of actual knowledge of σ and µ. We don't even know
how good the estimates are. This fact shakes our confidence somewhat in the use of Z, as we have only
estimates for its value.
A statistician by the name of W.S. Gosset, who published under the pen name of "Student", came to
our rescue with the t-distribution. It can be shown that the statistic

t = (x_i - x̄) / σ̂

is not distributed according to a standard normal distribution, but rather according to Student's t-distribution, for which extensive tables exist. Unlike the normal distribution, the t-distribution is a
function of the sample size.
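The practical difference between Z and t is visible directly from the table in Sec. S-7.1. A minimal sketch, using the standard library's NormalDist for the standard normal and a few t values copied from that table:

```python
from statistics import NormalDist

# Two-sided 95% t multipliers from the table in Sec. S-7.1 (one parameter),
# indexed by number of data points n (degrees of freedom = n - 1).
t95 = {2: 12.706, 3: 4.303, 5: 2.776, 15: 2.145, 50: 2.010}

# The corresponding standard-normal (Z) multiplier for 95% two-sided limits:
z95 = NormalDist().inv_cdf(0.975)   # about 1.96

for n_points, t in sorted(t95.items()):
    # For small samples the t multiplier far exceeds the Z multiplier, so
    # Z-based intervals badly understate the uncertainty.
    print(n_points, t, round(t / z95, 2))
```

For large samples the two multipliers converge, which is why the distinction matters most for the small data sets typical of laboratory work.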
S-5.1.2. "2σ" Rejection Criterion.
In this approach, it is proposed that any measurement that has less than a 5% chance of belonging to the
true sampling universe (i.e., it is a mistake, with high probability) is rejected. If x̄ is a reasonable estimate
of µ, then any individual measurement lying outside the range x̄ ± 2σ̂ approximately meets this criterion
and should be rejected. However, this basis has two serious problems, one for small samples and one for
large samples.
For small samples, the range of ±2σ̂ encompasses considerably less than the specified 95%
probability range. This shortcoming can be overcome by stating the criterion in terms of the range
x̄ ± t_0.95 σ̂, where t_0.95 is a function of sample size.
The problem with this criterion for large samples is as follows. Suppose we have a sample of 100
measurements, and that it is a true sample of the universe (i.e., no values represent systematic errors, and
it has a distribution of the same form as the distribution of the universe). Under these conditions we
might expect 5%, or 5, of the measurements to lie outside x̄ ± t_0.95 σ̂ and hence they would be rejected
mistakenly. It may be just as serious an error to reject a good point as to accept a bad one, so some other
basis for decision is needed.
S-5.1.3. Chauvenet's Rejection Criterion.
Chauvenet's criterion of rejection, which seems more reasonable for large samples than does the 2σ or
95% criterion, is that a measurement should be rejected if the probability of obtaining it in a single
measurement taken from the sampling universe is less than 1/(2n). Thus, for our sample of size 5, we
should reject any value whose probability of occurring is 1/10 or less. This turns out to be any point
outside the range of x̄ ± 2.13σ̂. For a sample of 100, the probability required is 0.005 or less, which
corresponds to the range x̄ ± 2.82σ̂.
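A minimal sketch of Chauvenet's criterion follows. It uses the standard normal distribution for the tail probability (the text's table in Sec. S-7.3 replaces the normal quantile with a t-based value, which is the more correct choice); the function name and the readings are hypothetical:

```python
from statistics import NormalDist, mean, stdev

def chauvenet_outliers(data):
    """Flag points whose two-sided tail probability falls below 1/(2n).

    A sketch using the standard normal distribution; the text's table in
    Sec. S-7.3 substitutes a t-based quantile for the normal one.
    """
    n = len(data)
    xbar, s = mean(data), stdev(data)
    nd = NormalDist()
    threshold = 1.0 / (2 * n)
    flagged = []
    for xi in data:
        z = abs(xi - xbar) / s
        tail = 2.0 * (1.0 - nd.cdf(z))   # probability of so large a deviation
        if tail < threshold:
            flagged.append(xi)
    return flagged

# Hypothetical readings echoing the mercury-column example, with one wild value:
print(chauvenet_outliers([5.30, 5.32, 5.29, 5.31, 15.34]))   # [15.34]
```

Note that the flagged point itself inflates the mean and standard deviation used to test it, which is the internal inconsistency discussed under the Q-test below.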
S–
3
Values to be used in application of Chauvenet's criterion are given in Sec. S-7.3. Note that the values
in the first column are those to be used with an average value or with a fitted function that involves only
one estimated parameter. Values in the second column are to be used with linear least-squares fit (two
parameters). Values in the third column are to be used with (least-squares) fitted functions that involve
three fitted parameters, such as a, b, and c in y = a + bx + cx².
Tables usually published for this purpose are incorrect in terms of the basic statement of Chauvenet's
criterion. They are based on the standard normal distribution rather than on the Student's t-distribution.
The table in Sec. S-7.3 is correct in this respect.
In application of Chauvenet's criterion, values of t_c from the table in Sec. S-7.3 are used in conjunction with the
estimate of the universe standard deviation (for single-parameter cases) or with the standard error of
estimate of y-values (for two- and three-parameter cases). Any values whose residuals lie outside the
range of x̄ ± t_c σ̂ or ± t_c σ̂_ŷ are rejected.
S-5.1.4. Q-Test Rejection Criterion.
Even Chauvenet's criterion as it is usually applied has a difficulty that is overcome in part by the Q-test method. This method is widely promulgated in textbooks of analytical chemistry, but it, too, has a
minor flaw (which is corrected here).
In the application of Chauvenet's criterion, values of the estimates of the mean and the standard
deviation are computed for the entire sample. The critical t-value calculated from these results is then
based on the statistical assumption that all the data points belong to the same normal distribution.
Those whose probabilities of belonging to that distribution are too low are rejected as not belonging to
the distribution for which the computations assumed they did belong. The internal inconsistency should
be evident here, even though it has been ignored in textbook approaches to the subject. For large enough
samples, the errors introduced by including points in the computations which perhaps don't belong there
are small enough not to be important. Also, the logic flaw results in a conservative criterion: points are
less likely to be rejected falsely than even the criterion implies.
For small sample sizes, though, this defect becomes serious. For example, for samples of fewer than
seven values, application of Chauvenet's criterion as described above will never result in the rejection of
any outlying data points, even if infinitely far removed from the mean value. This problem could be
overcome in the following way. First, tentatively eliminate the outlying data point(s), compute the
statistics of the remaining sample, and then apply Chauvenet's criterion to determine whether the outliers
should indeed be rejected. If a sample is large, there may be more than one possible outlier, and the
computations should be applied to all combinations of possible outliers. The computational work could
quickly become prohibitive in such a technique, so it is not used for large samples.
For small samples, the Q-test has been widely used. Its basis is as follows. First, assume that no more
than one data point is likely to be an outlier for any given sample. Then reject an outlier if its probability
(using the t-distribution) of being a member of the same distribution as the remaining values is less than
10%. To simplify computations, the usual Q-test tables use the range of the data values (other than the
outlier) as a means of estimating the universe standard deviation, rather than using the sample standard
deviation for this purpose. This approach has been well justified by extensive studies of the relationship
between the range and the universe standard deviation.
The Q statistic is computed in the following way. First, arrange all the data points (or residuals in the
case of a 2- or 3-parameter fitted function) in order of increasing values. Assign serially ordered indices
(from 1 to n, for n values) to these ordered values. We then compute
Q_1 = (x_2 - x_1)/(x_n - x_1)   and   Q_n = (x_n - x_{n-1})/(x_n - x_1)
If either of these values exceeds the critical value of Q for n points (as given in Table S-7.2) then point
number 1, point number n, or possibly both, may be rejected as being an outlier.
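The Q statistics are simple to compute. A short sketch (the readings are hypothetical) shows the arithmetic:

```python
def q_statistics(data):
    """Compute Q_1 and Q_n for a sample, per the definitions in the text."""
    xs = sorted(data)                   # arrange in order of increasing value
    span = xs[-1] - xs[0]
    q_low = (xs[1] - xs[0]) / span      # Q_1: gap at the low end
    q_high = (xs[-1] - xs[-2]) / span   # Q_n: gap at the high end
    return q_low, q_high

# Five hypothetical readings with one suspect high value:
q1, qn = q_statistics([5.30, 5.32, 5.29, 5.31, 15.34])
print(round(q1, 3), round(qn, 3))
# A Q near 1 marks that end's extreme value as a candidate outlier; compare
# against the critical Q for n = 5 in the table of Sec. S-7.2 before rejecting.
```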
The 90% retention level of the Q-test suffers from the same problems as does the 2σ criterion. Thus, it
is expected that for every sample of ten values, one value is likely to be rejected by the Q-test even though
that value may be a legitimate member of the sample universe. The test is, according to Chauvenet's
criterion, overly conservative for samples of fewer than five points (rejection especially unlikely).
However, our intuitive confidence in statistics, even in the t-distribution, is not good for very small
samples, so this probably is not a serious criticism.
S-5.1.5. Summary of Rejection Criteria.
For your work, use the Q-test in conjunction with Sec. S-7.2 to decide whether to reject values when there
are not more than ten values in all. In all other cases, use Chauvenet's method in conjunction with Sec. S-7.3. If a blank appears in the table of Sec. S-7.2 for your situation, never reject any data. Regardless of
which criterion is used for rejection of data, the remaining values should be used to recalculate the
desired parameters, but a rejection criterion should never be applied a second time to any given set of
measurements.
S-5.2. Confidence Intervals
The publication of confidence limits with experimental values has become a common procedure and
is familiar to most chemists. In the following paragraphs the statistical basis for evaluations of confidence
intervals will be presented. In the process it will be seen that many (probably most) published confidence
limits are either incorrect or misleading. It requires almost no additional effort to evaluate the limits
correctly, so the procedure will be described.
When experimental data are reported, frequently the mean value (or fitted value) is given together
with the value of the "standard deviation". Unfortunately, the meaning of "standard deviation" is not
always made clear, so there is no way to evaluate it. Sometimes, though probably seldom, it refers to s,
the standard deviation of the sample. More commonly it refers either to σ̂, the best estimate of the
universe standard deviation, or to σ̂_x̄, the standard error of the mean. Sometimes the results are reported
as x̄ ± σ or as x̄ ± 2σ and are identified as 68% or 95% confidence limits respectively. Now let us
examine the basis for such an identification and establish the mathematically correct way of computing
confidence limits.
It can be shown rigorously that the average values of groups of measurements of a single quantity
can be described in terms of a statistical distribution which becomes asymptotically indistinguishable
from a normal distribution with x̄ = µ and σ̂_x̄ = σ_x̄. From this it is demonstrable that the statistic
Z_m = (x̄ - µ)/σ_x̄ can be described in terms of a standard normal distribution. We then define Z_ω as the
value of Z such that the area under the normal curve between -Z_ω and +Z_ω has the value ω. From this it
can be proven that µ = x̄ ± σ̂_x̄ Z_ω with probability ω, or with percent confidence of 100ω. If Z_ω = 1, the
confidence is 68%, if Z_ω = 2, the confidence is 95%, and if Z_ω = 3, the confidence is about 99%.
As with the rejection of data, we note that x̄ and σ̂_x̄ are only estimates of the true mean and
standard deviation of means (thus σ̂_x̄ is known as the standard error of the mean). This problem, as
before, is handled by using t values rather than Z values in the computation, e.g.,

µ = x̄ ± σ̂_x̄ t_ω

This is the only correct form for description of confidence intervals for mean values, as it accounts for the
uncertainty in the value of σ̂_x̄. The value of t_ω depends on the number of data points obtained--the more
points, the more confidence we have in the estimate σ̂. Values of t_ω are given in the table in Sec. S-7.1 for
ω = 0.95, i.e., for determining 95% confidence limits. The value of ω = 0.95 seems to be evolving as a
"standard" basis for reporting data.
Thus, to determine the value of t_ω to specify 95% confidence limits for a sample of 15 measurements,
we enter the table under the column for 1 parameter and read, opposite the value 15 in the data-point
column, the value t_ω = 2.145. To illustrate the application of confidence limits we turn to our earlier
example of the height of a column of mercury. In that case, we found x̄ = 5.30 cm and σ̂_x̄ = 0.0123 cm.
As there were five data points in the example, from the table we find that t_0.95 = 2.776, so that the
confidence range on each side of the mean value is 2.776 × 0.0123 cm = 0.0341 cm. Thus, we state the 95%
confidence limit for the measurement as 5.300 ± 0.034 cm. (The use of significant figures in reporting
confidence intervals will be considered later.) Note that if we had used the Z-distribution rather than the
t-distribution the interval would have been given as 5.300 ± 0.024 cm. The difference in the two values is
marked, and the latter value is in error.
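The arithmetic of this interval is easy to reproduce; the value 1.96 below is the standard-normal 95% multiplier used for the comparison the text draws:

```python
# Mercury-column example from the text: xbar = 5.300 cm, standard error of
# the mean = 0.0123 cm, n = 5 points, so t_0.95 = 2.776 (Sec. S-7.1, df = 4).
xbar, sem, t95 = 5.300, 0.0123, 2.776

print(f"{xbar:.3f} ± {t95 * sem:.3f} cm")   # 5.300 ± 0.034 cm

# The Z-based interval (standard-normal multiplier 1.96) is too narrow:
print(f"{xbar:.3f} ± {1.96 * sem:.3f} cm")  # 5.300 ± 0.024 cm
```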
As tables of the t-distribution appear in most statistics books as well as here, it seems pointless not to
use them in reporting experimental results. Note also that both σ̂_x̄ and t_ω decrease as the number of data
points increases, giving narrower confidence intervals, as is intuitively expected. This is why results are
more reliable for larger samples (other things being equal). However, since the work required (in
obtaining more sample points) increases approximately as the square of the improvement in σ̂_x̄,
a condition of diminishing returns is involved.
Confidence intervals can also be constructed for the parameters arising from a least-squares fit. The
procedure for developing them is rather complicated, so only the results will be given here, and those for
the general linear relationship. In obtaining t_ω for use in this case, remember that two parameters are
obtained in the fit. Then it can be shown that

m = m̂ ± t_ω σ̂_ŷ / (√n σ̂_x)

b = b̂ ± t_ω σ̂_ŷ √(Σx_i²/n) / (√n σ̂_x)

Note that in both these expressions the confidence interval becomes smaller not only with increasing
number of data points, as would be expected, but also with increasing range of x values (as indicated by
σ̂_x), as also seems reasonable.
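The confidence expressions for the slope and intercept reduce to the standard two-parameter least-squares results, which can be sketched directly; the calibration numbers below are hypothetical:

```python
import math

def linear_fit_ci(x, y, t):
    """Least-squares fit y = m x + b with t-based confidence half-widths.

    A sketch of the standard two-parameter results: SE(m) = s / sqrt(Sxx) and
    SE(b) = s * sqrt(sum(x^2) / (n * Sxx)), where s is the standard error of
    estimate on n - 2 degrees of freedom and Sxx = sum((x - xbar)^2).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    m = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b = ybar - m * xbar
    sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                      # standard error of estimate
    half_m = t * s / math.sqrt(sxx)
    half_b = t * s * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))
    return (m, half_m), (b, half_b)

# Hypothetical calibration data; for n = 5 points and two parameters,
# t = 3.182 (Sec. S-7.1, df = 3).
(m, dm), (b, db) = linear_fit_ci([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1], 3.182)
print(m, dm, b, db)
```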
S-5.3. Significant Figures and Rounding Errors
It is critically important that in all statistical computations, no values are to be rounded off any more
than dictated by the limitations of computing equipment until the computations are completed. In line
with this, computations done in FORTRAN, Pascal, C, or any other programming language should be
performed with double precision arithmetic (16 significant figures rather than the usual eight). Statistical
computations frequently involve small differences between very large numbers, and round off errors can
affect the results seriously, whether numbers are rounded off by limitations of computing devices or by
scientists who apply rules of significant figures which should not be applied in statistical computations.
Conceptually, it may be said that a statistical analysis is performed to determine the properties of a
hypothetical set of numbers of which the numerical values of data constitute a subset, presumably a
representative subset. At this point, the fact that the numbers correspond to physical measurements of
limited precision is of no consequence whatsoever. Thus, for purposes of the statistical analysis all data
are considered to be known to an infinite number of significant figures, i.e., an endless string of zeros is
assumed to follow the last non-zero digit recorded. (Thus, if we compute the mean of the numbers 5 and
6, it is 5.5, not 5, or 6.) Once the computations have been completed, the results may be rounded off in
any way desired.
An illustration of the effect of round off errors in statistical computations was seen in Sec. S-3.2. That
example showed that it is sometimes necessary to avoid round off errors in statistical computations and
since the actual result of rounding off in any particular case can be determined only by doing the
computation both ways, it should always be done with a minimum of rounding off.
At this point it is appropriate to discuss the question of significant figures in reporting final results. It
is sometimes stated that the confidence interval values should be reported only to one significant figure
and that the mean or fitted values should be reported in a way that is consistent with the interval values.
In the case of the mercury column example, 95% confidence limits as rounded off to 1, 2, and 3 significant
figures are
x = 5.30 ± 0.03 cm
x = 5.300 ± 0.034 cm
x = 5.3000 ± 0.0341 cm
As the three-significant-figure form is the most nearly correct of these three, it is seen that rounding off to 1
significant figure understates the breadth of the confidence interval by 12%, giving the impression that the
results are somewhat better than is actually the case. In fact, rounding off to one significant figure can
introduce errors ranging up to 50%, so it would seem appropriate to use at least two significant figures.
Rounding off to two significant figures introduces errors ranging only up to 5%, and it is likely that for
most purposes this is adequate. In any case, the last significant figure retained in the estimated value
S–
6
itself must be in the same decimal position as the last significant figure retained in the confidence limits.
This rule applies whether the last digit is zero or non-zero.
For rounding off of values, observe the following rules. If the leftmost digit of those to be eliminated
by rounding off is less than 5, the last retained digit is left unchanged. If the leftmost of the digits to be
eliminated is 5 or greater, the last retained digit is increased by one. To illustrate, we shall round off each
of four different values to three significant figures.
1.5550000 becomes 1.56
1.5650000 becomes 1.57
1.5650001 becomes 1.57
1.5549999 becomes 1.55
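These rules are "round half up", which differs from the round-half-even behavior of Python's built-in round(). A small sketch with the decimal module reproduces the four examples; the helper name is ours:

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, sig_figs):
    """Round to sig_figs significant figures by the text's rule: if the
    leftmost dropped digit is 5 or greater, the last kept digit goes up."""
    d = Decimal(value)
    # d.adjusted() is the exponent of the leading significant digit.
    quantum = Decimal(1).scaleb(d.adjusted() - sig_figs + 1)
    return d.quantize(quantum, rounding=ROUND_HALF_UP)

# The four examples from the text, rounded to three significant figures:
for v in ("1.5550000", "1.5650000", "1.5650001", "1.5549999"):
    print(v, "becomes", round_half_up(v, 3))
```

Values are passed as strings so that no binary floating-point error creeps in before the decimal rounding is applied.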
S-6. Results Derived from Measured Quantities
S-6.1. Error Propagation
It is common in scientific work to compute a value of a function, e.g. f(x, y), from independent values,
e.g. x and y, each of which has a certain degree of uncertainty attached. The uncertainties, which we shall
denote εx, εy, etc., may be standard errors of estimate, or they may be stated or estimated uncertainties or
tolerance limits. It should be obvious that there will also be some degree of uncertainty in the computed
value of f, and it is desirable to estimate this uncertainty that is a result of "propagating" errors through
the computation of f, with the maximum possible degree of precision. If εf is estimated too small, then a
higher degree of precision is implied than is justified. If εf is estimated too large, then the precision of f is
understated, and its value may not be accorded the confidence it deserves.
Without offering any proof, or even a plausibility argument, we state that the desired computation of
the uncertainty in f(x1, x2, …, xn) is according to the following equation:
ε_f = [ Σ_i (∂f/∂x_i)² ε_{x_i}² ]^(1/2)
In elementary science courses students commonly are instructed to use the relationships that the
"error" in a sum or difference is the sum of the "errors" and that the relative "error" in a product or
quotient is the sum of the relative "errors". The usual rules for use of significant figures in computations
are derived from this. That is, when adding or subtracting numbers, discard figures to the right of the last
significant figure retained in the least precise value involved in the calculation. This assumes the error to
be ±1 in the last retained figure, or ±5 in the first omitted figure, or something similar. Similarly, in
multiplying or dividing, retain the number of significant figures equal to that of the least precise figure
used in the computations. Though these rules are but crude approximations to the propagated error
rules stated at the beginning of this paragraph, they are satisfactory for computations in beginning
science classes.
However, from the more correct relationship for computing propagated errors, we find that the error
in a sum or difference is the square root of the sum of squares of the errors of the individual quantities,
and the relative error of a product or quotient is the square root of the sum of squares of the relative
errors of the individual quantities. If all the errors (or relative errors as the case may be) are the same in a
given case, the simpler estimates are in error by about 40%--they always overestimate the magnitude of
the true "error" and hence they should never be applied to results of high quality experimental work. The
purpose of statistical analysis of experimental data is to obtain the maximum information from the data.
Use of error estimates that are too large negates much of the purpose of the calculations. To perform
meaningless computations is little better than to perform no computations.
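The quadrature rules for sums and products take only a few lines, and the roughly 40% overestimate of the elementary rule for two equal errors appears directly; the helper names are ours:

```python
import math

def err_sum(*errs):
    """Propagated absolute error of a sum or difference (quadrature)."""
    return math.sqrt(sum(e * e for e in errs))

def relerr_product(*rel_errs):
    """Propagated relative error of a product or quotient (quadrature)."""
    return math.sqrt(sum(r * r for r in rel_errs))

# Two equal errors: the elementary "add the errors" rule gives 2e, while the
# quadrature rule gives sqrt(2)*e -- the simple rule overstates the error by
# a factor of sqrt(2), i.e. about 40%.
e = 0.01
print(2 * e / err_sum(e, e))   # about 1.41
```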
S-6.2. Estimates of Precision
Whenever you determine a property of a system by a method that involves several different
measurements, either under the same or different conditions, you will obtain a numerical evaluation of
the precision of your work in the form of the 95% confidence interval of the result. In addition, however,
you are to estimate the precision to be expected for the method. If the expected precision is very much
less than the experimentally observed precision, try to find the reason for the discrepancy.
In the calculation of expected precision, apply the appropriate propagated error formulas to
uncertainties in reading the various types of quantities involved. In the case of volumes or masses, you
may use the uncertainties indicated in Sec. S-7.4 and 0. For other types of measurements, you will have to
estimate the uncertainties. For example, in timing the efflux period in a viscometer, your uncertainty
would include your own reflex time in operating the switch as well as the uncertainty in reading the
timer. In using electric meters of any type, the uncertainty would include an estimate of the fraction of a
scale division in which you have confidence of your ability always to read the same value, or the range of
observed needle fluctuations, or the larger of the two. The same applies to thermometer readings, in which the
mercury height might be seen to fluctuate if viewed with a magnifier.
As an example of the calculation of expected precision, let us assume the titration of potassium
hydrogen phthalate (KHP) with a dilute solution of NaOH to standardize the latter. Assume the
following values are obtained:
Volume of solution = approx. 40.00 mL
Mass of KHP = approx. 0.816 g
M.W. of KHP = 204.2 g/mole
Though the value of the molecular weight has an uncertainty because of uncertainties in the atomic
weights, this will contribute only to the accuracy of the results, and not to their precision.
From Sec. S-7.4, we find the estimated precision of reading a 50 mL buret to be 0.025 mL, but there are
two readings involved in a volume measurement (the two reading errors combine in quadrature, giving
√2 × 0.025 mL ≈ 0.035 mL), so the value of the volume is stated to be

V = (40.000 ± 0.035) mL
From Sec. 0, the measured precision of a weighing on the analytical balance is 0.00057 g, but there are two
readings involved in a mass measurement (√2 × 0.00057 g ≈ 0.00081 g), so the mass of KHP is stated to be

m = (0.81600 ± 0.00081) g
The molar concentration is then found to be

C = 1000m/(MV) = 1000 × (0.81600 ± 0.00081) / [204.2 × (40.000 ± 0.035)] = (0.09990 ± 0.00013) mole/L
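The propagated-precision arithmetic can be checked in a few lines (the molecular weight is treated as exact here, since it affects accuracy rather than precision):

```python
import math

# Expected precision of the NaOH concentration, C = 1000 m / (M V):
m_khp, dm = 0.81600, 0.00081   # g, two balance readings combined
V, dV = 40.000, 0.035          # mL, two buret readings combined
M = 204.2                      # g/mol, treated as exact

C = 1000.0 * m_khp / (M * V)
# Relative errors combine in quadrature for a product/quotient:
dC = C * math.sqrt((dm / m_khp) ** 2 + (dV / V) ** 2)
print(f"C = {C:.5f} ± {dC:.5f} mol/L")   # C = 0.09990 ± 0.00013 mol/L
```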
If, as a result of several titrations, you obtain a 95% confidence interval for the concentration of the
base that is much larger than ± 0.00013 molar, you should examine your titration technique. Perhaps you
are unable to determine the endpoint reproducibly enough, but at least the computed precision tells you
what you should be able to accomplish with good lab technique.
S-6.3. Estimates of Accuracy
Estimation of accuracy of a result is done in a manner similar to that of precision, except that now we
include uncertainties in molecular weights, constants, and conversion factors, as well as tolerance limits
(accuracy) rather than reproducibility (precision) of quantitative apparatus.
Using the titration example again, suppose the actual measured quantities were
V = 39.97 mL
m = 0.8154 g
From Sec. S-7.4, the stated tolerance of a 50 mL buret is 0.05 mL, so the value would be stated as
V = (39.970 ± 0.050) mL.
We do not have a table of tolerances for the masses obtained from the balance; such a table would be
cumbersome. However, assume the accuracy of weighing to be about the same as the precision in this
case. Thus,
m = (0.81540 ± 0.00081) g
Referring to Sec. S-7.6 we obtain the molecular weight of KHP (KC8H5O4) to be

M = 1 × (39.098 ± 0.003) + 8 × (12.011 ± 0.001) + 5 × (1.0079 ± 0.0001) + 4 × (15.9994 ± 0.0003)
  = (39.098 ± 0.003) + (96.088 ± 0.008) + (5.0395 ± 0.0005) + (63.9976 ± 0.0012)
  = (204.2231 ± 0.0086) g/mole
Further, the KHP bottle carries an assay value of 99.99%, so the effective molecular weight is 204.2231/0.9999, or

M = (204.2435 ± 0.0086) g/mole
Finally,

C = 1000m/(MV) = 1000 × (0.81540 ± 0.00081) / [(204.2435 ± 0.0086) × (39.970 ± 0.050)] = (0.09988 ± 0.00016) mole/L
In the process of calculating the final concentration, 95% confidence limits would be obtained, as
titrations are always done in duplicate or triplicate, at least. The reported value would indicate an
uncertainty that is the greater of the estimated uncertainty (as above) or the 95% confidence limits. Thus,
if your precision is better than some of the uncertainties in calibration, then the systematic error is likely
larger than the random error, and the final value can be no better than the certainty with which we know
calibration values, molecular weights, constants, conversion factors, etc. Alternatively, if your precision
is poorer than the computed uncertainties, then there is no reason to assume that the true value lies
outside your 95% confidence limit range with probability greater than 0.05.
Finally, if your result is a property (of a system) for which a "literature value" can be located for
comparison, you must do so. Not to do so is very unscientific--it borders on dishonesty. If the literature
value does not lie within your final range of uncertainty, you must examine possible causes of, and
means of correction of, the discrepancy.
S-7. Tables for Statistical Treatment of Data

S-7.1. Values of t for 95% Confidence Intervals

                       Number of Parameters
Data Points        1          2          3
     2          12.706        —          —
     3           4.303     12.706        —
     4           3.182      4.303     12.706
     5           2.776      3.182      4.303
     6           2.571      2.776      3.182
     7           2.447      2.571      2.776
     8           2.365      2.447      2.571
     9           2.306      2.365      2.447
    10           2.262      2.306      2.365
    11           2.228      2.262      2.306
    12           2.201      2.228      2.262
    13           2.179      2.201      2.228
    14           2.160      2.179      2.201
    15           2.145      2.160      2.179
    16           2.131      2.145      2.160
    17           2.120      2.131      2.145
    18           2.110      2.120      2.131
    19           2.101      2.110      2.120
    20           2.093      2.101      2.110
    21           2.086      2.093      2.101
    22           2.080      2.086      2.093
    23           2.074      2.080      2.086
    24           2.069      2.074      2.080
    25           2.064      2.069      2.074
    26           2.060      2.064      2.069
    27           2.056      2.060      2.064
    28           2.052      2.056      2.060
    29           2.048      2.052      2.056
    30           2.045      2.048      2.052
    31           2.042      2.045      2.048
    32           2.040      2.042      2.045
    33           2.037      2.040      2.042
    34           2.035      2.037      2.040
    35           2.032      2.035      2.037
    36           2.030      2.032      2.035
    37           2.028      2.030      2.032
    38           2.026      2.028      2.030
    39           2.024      2.026      2.028
    40           2.023      2.024      2.026
    41           2.021      2.023      2.024
    42           2.020      2.021      2.023
    43           2.018      2.020      2.021
    44           2.017      2.018      2.020
    45           2.015      2.017      2.018
    46           2.014      2.015      2.017
    47           2.013      2.014      2.015
    48           2.012      2.013      2.014
    49           2.011      2.012      2.013
    50           2.010      2.011      2.012
S-7.2. Values of Q for Data Rejection

1. Arrange the values to be tested in order of increasing value.
2. Assign ordinal indices to the values, i.e., x1, x2, ..., xn.
3. Compute
      Q1 = (x2 - x1)/(xn - x1)
      Qn = (xn - xn-1)/(xn - x1)
4. If either Q1 or Qn exceeds the value of Q in the table, reject x1 or xn, respectively.
                       Number of Parameters
Data Points        1          2          3
     3           0.94         —          —
     4           0.76       0.94         —
     5           0.64       0.76       0.94
     6           0.56       0.64       0.76
     7           0.51       0.56       0.64
     8           0.47       0.51       0.56
     9           0.44       0.47       0.51
    10           0.41       0.44       0.47
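The four-step rejection procedure above is easy to automate. The sketch below (plain Python; the function name q_test is my own) applies steps 1-4 for a single pass, taking the critical Q from the table for the appropriate number of data points.

```python
def q_test(values, q_crit):
    """Apply the Q test once: return (reject_low, reject_high) flags.

    values: the replicate measurements.
    q_crit: critical Q from the Sec. S-7.2 table for this number of points.
    """
    xs = sorted(values)                   # step 1: increasing order
    spread = xs[-1] - xs[0]               # x_n - x_1
    q1 = (xs[1] - xs[0]) / spread         # step 3: gap at the low end
    qn = (xs[-1] - xs[-2]) / spread       # step 3: gap at the high end
    return q1 > q_crit, qn > q_crit       # step 4

# Five points, one parameter, so q_crit = 0.64; the high value 11.5
# sits far from the rest and is rejected.
lo, hi = q_test([10.1, 10.2, 10.3, 10.2, 11.5], 0.64)
print(lo, hi)   # False True
```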
S-7.3. Values of tc for Data Rejection -- Chauvenet's Criterion

                       Number of Parameters
Data Points        1          2          3
    11           2.284      2.320      2.367
    12           2.305      2.335      2.374
    13           2.324      2.350      2.382
    14           2.343      2.365      2.392
    15           2.360      2.380      2.403
    16           2.376      2.394      2.414
    17           2.392      2.407      2.425
    18           2.406      2.420      2.436
    19           2.420      2.433      2.447
    20           2.433      2.445      2.458
    21           2.446      2.457      2.469
    22           2.458      2.468      2.479
    23           2.470      2.479      2.489
    24           2.481      2.490      2.499
    25           2.492      2.500      2.508
    26           2.503      2.510      2.518
    27           2.513      2.519      2.527
    28           2.522      2.529      2.535
    29           2.532      2.538      2.544
    30           2.546      2.546      2.552
    31           2.550      2.555      2.561
    32           2.558      2.563      2.569
    33           2.567      2.571      2.576
    34           2.575      2.579      2.584
    35           2.583      2.587      2.591
    36           2.590      2.594      2.598
    37           2.598      2.601      2.606
    38           2.605      2.609      2.612
    39           2.612      2.615      2.619
    40           2.619      2.622      2.626
    41           2.626      2.629      2.632
    42           2.632      2.635      2.638
    43           2.639      2.642      2.645
    44           2.645      2.648      2.651
    45           2.651      2.654      2.657
    46           2.657      2.660      2.662
    47           2.663      2.665      2.668
    48           2.669      2.671      2.674
    49           2.674      2.677      2.679
    50           2.680      2.682      2.685
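One way to apply these tc values in code is sketched below (plain Python; the function name chauvenet_reject is my own, and the use of the sample standard deviation s of Sec. S-7.8.2 is an assumption — the criterion itself, rejecting a point whose deviation from the mean exceeds tc times the standard deviation, is described earlier in this handout).

```python
import math

def chauvenet_reject(values, tc):
    """Return the values whose deviation from the mean exceeds tc * s.

    tc comes from the Sec. S-7.3 table for this number of points;
    s here is the sample standard deviation of Sec. S-7.8.2 (divisor n).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [x for x in values if abs(x - mean) > tc * s]

# Eleven points, one parameter: tc = 2.284; the stray 15.0 is flagged.
data = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 15.0]
print(chauvenet_reject(data, 2.284))   # [15.0]
```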
S-7.4. Precision and Accuracy of Volumetric Glassware

Item                              Accuracy                    Precision
(Total Capacity)     Class A (NBS Tolerance)a    Other       (Estimated)
Burets
  100 ml                  0.10 ml                0.20 ml      0.05 ml
   50 ml                  0.05 ml                0.10 ml      0.025 ml
   25 ml                  0.03 ml                0.06 ml      0.015 ml
   10 ml                  0.02 ml                0.04 ml      0.01 ml
Measuring Pipets
   10 ml                  0.03 ml                0.06 ml      0.06 ml
    5 ml                  0.02 ml                0.04 ml      0.04 ml
    2 ml                  0.01 ml                0.02 ml      0.02 ml
Transfer Pipets
  100 ml                  0.08 ml                0.16 ml      0.16 ml
   50 ml                  0.05 ml                0.1 ml       0.1 ml
   25 ml                  0.025 ml               0.05 ml      0.05 ml
   10 ml                  0.02 ml                0.04 ml      0.04 ml
    5 ml                  0.01 ml                0.02 ml      0.02 ml
    2 ml                  0.006 ml               0.012 ml     0.012 ml
Volumetric Flasks, TC
 2000 ml                  0.5 ml                 1.0 ml       1.0 ml
 1000 ml                  0.3 ml                 0.6 ml       0.6 ml
  500 ml                  0.2 ml                 0.4 ml       0.4 ml
  250 ml                  0.1 ml                 0.2 ml       0.2 ml
  100 ml                  0.08 ml                0.16 ml      0.16 ml
   50 ml                  0.05 ml                0.1 ml       0.1 ml
   25 ml                  0.03 ml                0.06 ml      0.06 ml
   10 ml                  0.02 ml                0.04 ml      0.04 ml
    5 ml                  0.02 ml                0.04 ml      0.04 ml
Volumetric Flasks, TD
 2000 ml                  1.0 ml                 2.0 ml       2.0 ml
 1000 ml                  0.6 ml                 1.2 ml       1.2 ml
  500 ml                  0.4 ml                 0.8 ml       0.8 ml
  250 ml                  0.2 ml                 0.4 ml       0.4 ml
  100 ml                  0.16 ml                0.32 ml      0.32 ml
   50 ml                  0.1 ml                 0.2 ml       0.2 ml
   25 ml                  0.06 ml                0.12 ml      0.12 ml
   10 ml                  0.04 ml                0.08 ml      0.08 ml
    5 ml                  0.04 ml                0.08 ml      0.08 ml

a SOURCE: NBS Circular 602, 1957.
S-7.5. Measured Precision of Laboratory Balances

Triple-beam Platform Balance    ± 0.28 g
Chainomatic Balance             ± 0.0021 g
Single-pan Analytic Balance     ± 0.00057 g
S-7.6. Table of Atomic Weights with Uncertainties

Name            Symbol   Atomic #   Atomic Weight   Uncertainty
Aluminum          Al        13        26.98154       0.00001
Antimony          Sb        51       121.75          0.03
Argon             Ar        18        39.948         0.001
Arsenic           As        33        74.9216        0.0001
Barium            Ba        56       137.34          0.03
Beryllium         Be         4         9.01218       0.00001
Bismuth           Bi        83       208.9808        0.0001
Boron             B          5        10.81          0.01
Bromine           Br        35        79.904         0.001
Cadmium           Cd        48       112.40          0.01
Calcium           Ca        20        40.08          0.01
Carbon            C          6        12.011         0.001
Cerium            Ce        58       140.12          0.01
Cesium            Cs        55       132.9054        0.0001
Chlorine          Cl        17        35.453         0.001
Chromium          Cr        24        51.996         0.001
Cobalt            Co        27        58.9332        0.0001
Copper            Cu        29        63.546         0.001
Dysprosium        Dy        66       162.50          0.01
Erbium            Er        68       167.26          0.01
Europium          Eu        63       151.96          0.01
Fluorine          F          9        18.99840       0.00001
Gadolinium        Gd        64       157.25          0.03
Gallium           Ga        31        69.72          0.01
Germanium         Ge        32        72.59          0.03
Gold              Au        79       196.9665        0.0001
Hafnium           Hf        72       178.49          0.03
Helium            He         2         4.00260       0.00001
Holmium           Ho        67       164.9304        0.0001
Hydrogen          H          1         1.0079        0.0001
Indium            In        49       114.82          0.01
Iodine            I         53       126.9045        0.001
Iridium           Ir        77       192.22          0.03
Iron              Fe        26        55.847         0.003
Krypton           Kr        36        83.80          0.01
Lanthanum         La        57       138.9055        0.0003
Lead              Pb        82       207.2           0.1
Lithium           Li         3         6.941         0.001
Lutetium          Lu        71       174.97          0.01
Magnesium         Mg        12        24.305         0.001
Manganese         Mn        25        54.9380        0.0001
Mercury           Hg        80       200.59          0.03
Molybdenum        Mo        42        95.94          0.03
Neodymium         Nd        60       144.24          0.01
Neon              Ne        10        20.170         0.003
Neptunium         Np        93       237.0482        0.0001
Nickel            Ni        28        58.71          0.03
Niobium           Nb        41        92.9064        0.0001
Nitrogen          N          7        14.0067        0.0001
Osmium            Os        76       190.2           0.1
Oxygen            O          8        15.9994        0.0003
Palladium         Pd        46       106.4           0.1
Phosphorus        P         15        30.97376       0.00001
Platinum          Pt        78       195.09          0.03
Potassium         K         19        39.098         0.003
Praseodymium      Pr        59       140.9077        0.0003
Protactinium      Pa        91       231.0359        0.0001
Radium            Ra        88       226.0254        0.0001
Rhenium           Re        75       186.2           0.1
Rhodium           Rh        45       102.9055        0.0001
Rubidium          Rb        37        85.4678        0.0003
Ruthenium         Ru        44       101.07          0.03
Samarium          Sm        62       150.4           0.1
Scandium          Sc        21        44.9559        0.0001
Selenium          Se        34        78.96          0.03
Silicon           Si        14        28.086         0.003
Silver            Ag        47       107.868         0.001
Sodium            Na        11        22.9898        0.0001
Strontium         Sr        38        87.62          0.01
Sulfur            S         16        32.06          0.01
Tantalum          Ta        73       180.9479        0.0003
Technetium        Tc        43        98.9062        0.0001
Tellurium         Te        52       127.60          0.03
Terbium           Tb        65       158.9254        0.0001
Thallium          Tl        81       204.37          0.03
Thorium           Th        90       232.0381        0.0001
Thulium           Tm        69       168.9342        0.0001
Tin               Sn        50       118.69          0.03
Titanium          Ti        22        47.90          0.03
Tungsten          W         74       183.85          0.03
Uranium           U         92       238.029         0.001
Vanadium          V         23        50.9414        0.0003
Xenon             Xe        54       131.30          0.01
Ytterbium         Yb        70       173.04          0.03
Yttrium           Y         39        88.9059        0.0001
Zinc              Zn        30        65.38          0.01
Zirconium         Zr        40        91.22          0.01

SOURCE: R.C. Weast, Ed., "Handbook of Chemistry and Physics," 56th Ed., CRC Press, Cleveland,
Ohio, 1975, inside back cover.
NOTE: This table includes all the known elements for which a reasonable atomic weight value is
assignable. All other elements are man-made, or their natural abundance is such that a
reasonable assessment is unavailable.
S-7.7. Table of Constants and Conversion Factors with Uncertainties

Fundamental and Derived Constants:
Avogadro Number             NA     (6.022 045 ± 0.000 031) x 10^23 mol^-1
Gas Constant                R      (8.314 41 ± 0.000 26) J K^-1 mol^-1
                            R      (1.987 192 ± 0.000 062) cal K^-1 mol^-1
                            R      (8.205 68 ± 0.000 26) x 10^-2 L atm K^-1 mol^-1
Boltzmann Constant          kB     (1.380 662 ± 0.000 044) x 10^-23 J K^-1
Faraday Constant            F      (9.648 456 ± 0.000 027) x 10^4 C mol^-1
Electronic Charge           e      (1.602 189 ± 0.000 005) x 10^-19 C
Planck Constant             h      (6.626 176 ± 0.000 036) x 10^-34 J s
                            h/2π   (1.054 5887 ± 0.000 0057) x 10^-34 J s
Speed of Light in Vacuum    c      (2.997 924 58 ± 0.000 000 12) x 10^8 m s^-1

Conversion Factors:
T (K) = t (°C) + (273.1500 ± 0.0002)
1 atm = (7.60 ± 0.00) x 10^2 torr = (7.60 ± 0.00) x 10^2 mm Hg
1 cal = (4.184 ± 0) J
1 J   = (1.0 ± 0) x 10^7 erg
1 erg = (1.0 ± 0) dyne cm
1 L   = (1.0 ± 0) x 10^3 cm^3

SOURCE: J.A. Dean, Ed., "Lange's Handbook of Chemistry," 13th Ed., McGraw-Hill, New York,
NY, 1985, pp. 2-3ff.
S-7.8. Summary of Computational Formulas

S-7.8.1. Mean
x̄ = (1/n) Σ(i=1 to n) xi = (1/n)(x1 + x2 + ... + xn)

S-7.8.2. Sample Standard Deviation
s = [ (1/n) Σ(i=1 to n) (xi - x̄)² ]^(1/2)

S-7.8.3. Estimated Standard Deviation of the Universe
σ = lim(n→∞) [ (1/n) Σ(i=1 to n) (xi - µ)² ]^(1/2)

S-7.8.4. Standard Error of Estimate of the Mean
σ̂x̄ = σ̂/n^(1/2) = s/(n-1)^(1/2)
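These estimators translate directly into code. The sketch below (plain Python, hypothetical data) follows the conventions of this handout: s is computed with divisor n, and the standard error of the mean is s/√(n-1).

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sample_std(xs):
    # Sec. S-7.8.2: divisor n (not n - 1)
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def std_error_of_mean(xs):
    # Sec. S-7.8.4: s / sqrt(n - 1)
    return sample_std(xs) / math.sqrt(len(xs) - 1)

data = [9.8, 10.0, 10.2, 10.0]            # hypothetical replicates
print(round(mean(data), 4))               # 10.0
print(round(sample_std(data), 4))         # 0.1414
print(round(std_error_of_mean(data), 4))  # 0.0816
```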
S-7.8.5. Least Squares Fitting
If y = mx + b,
m̂ = (n Σxy - Σx Σy)/D
b̂ = (Σx² Σy - Σx Σxy)/D = ȳ - m̂x̄
where D = n Σx² - (Σx)²

Standard error of estimate of the y values:
σ̂ŷ = [ Σ(y - m̂x - b̂)² / (n - 2) ]^(1/2)

Standard errors of estimate of the slope:
σ̂m̂ = σ̂ŷ (n/D)^(1/2)
and intercept:
σ̂b̂ = σ̂ŷ (Σx²/D)^(1/2)
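The fitting formulas above can be checked with a short sketch (plain Python, hypothetical data; the function name least_squares is my own). For points lying exactly on y = 2x + 1 the fit must return m̂ = 2, b̂ = 1, and a zero standard error of estimate.

```python
import math

def least_squares(xs, ys):
    """Return (m, b, s_y, s_m, s_b) per the formulas of Sec. S-7.8.5."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    D = n * sxx - sx ** 2
    m = (n * sxy - sx * sy) / D
    b = (sxx * sy - sx * sxy) / D
    # standard error of estimate of the y values (n - 2 degrees of freedom)
    s_y = math.sqrt(sum((y - m * x - b) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    s_m = s_y * math.sqrt(n / D)       # standard error of the slope
    s_b = s_y * math.sqrt(sxx / D)     # standard error of the intercept
    return m, b, s_y, s_m, s_b

m, b, s_y, s_m, s_b = least_squares([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, b, s_y)   # 2.0 1.0 0.0
```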
S-7.8.6. Confidence Limits
µ = x̄ ± σ̂x̄ t95
S-7.8.7. Propagation of Errors
For f(x1, x2, ..., xn):
σf = [ Σi (∂f/∂xi)² σxi² ]^(1/2)
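When the partial derivatives are tedious to write out by hand, they can be approximated numerically. The sketch below (plain Python; the helper name propagate is my own) estimates each ∂f/∂xi by a central difference and combines the terms in quadrature as in the formula above.

```python
import math

def propagate(f, xs, sigmas, h=1e-6):
    """Estimate sigma_f for f(x1, ..., xn) via central differences."""
    total = 0.0
    for i, sigma in enumerate(sigmas):
        xp = list(xs); xp[i] += h
        xm = list(xs); xm[i] -= h
        dfdx = (f(xp) - f(xm)) / (2 * h)   # ~ partial f / partial x_i
        total += (dfdx * sigma) ** 2
    return math.sqrt(total)

# For a simple sum, sigma_f = sqrt(sigma1^2 + sigma2^2) = 5 here.
sf = propagate(lambda x: x[0] + x[1], [1.0, 2.0], [3.0, 4.0])
print(round(sf, 3))   # 5.0
```

The step size h trades truncation error against round-off; for smooth functions of well-scaled variables the default is adequate for the two or three significant figures an uncertainty estimate needs.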