chemistry statistics - Seattle Central College

CHEMISTRY STATISTICS
The reliability of experimental results in the chemical laboratory depends directly on the quality of the measurements. Typically, the quality
of a measurement is related to the number of significant figures read from the measuring device. These measurements are then
manipulated in such a way as to maximize the reliability of the experimental result. Besides significant
figures, there are other ways to assess the reliability of a result, which are discussed below.
ACCURACY AND PRECISION
All experimental measurements are subject to error. Error is defined as the difference between a measured value and the “true” value
of a property. Expressing the reliability of a measurement in terms of its accuracy is not often possible, since there are relatively few
instances where the true value of a property is known. Counted numbers of objects or events are true values; so are the rational or
irrational numbers which appear in mathematical formulas. For example, the numbers 1 and π in the formula for the area of a circle
(Area = πr²) are known to any desired accuracy.
Since an experimental measurement is subject to error, a property determined by that measurement can never have a true value in the
same sense that a counted number does. However, in some cases a property does have a value which is accepted as true by the
scientific community. An accepted value may be defined. For example, the atomic weight of the ¹²C isotope is assigned a value of 12
followed by a decimal point and then an infinite number of zeros. An accepted value may also be the most probable value derived
from repeated and careful measurements. For instance, the accepted value for the density of liquid ethyl alcohol is 0.7852 g/mL at a
temperature of 25 °C.
True or accepted values for most experimentally measured properties are unknown. When dealing with such a property, an
experimenter is unable to evaluate the accuracy of the measurement and must express its reliability in terms of precision. The
precision of a measurement is obtained by repeating that measurement several times. If the several measured values show a
reasonable agreement with one another, then the measurement is said to be precise. Consider four values for the density of a liquid
measured by one experimenter: 0.7854, 0.7850, 0.7847, and 0.7849 g/mL; and four values determined by a second experimenter:
0.7856, 0.7850, 0.7844, and 0.7830 g/mL. The first set spans only 0.0007 g/mL while the second spans 0.0026 g/mL, so the first set of measurements is more precise than the second.
Repeating a measurement several times is often impractical or inefficient and it is sometimes necessary to estimate the precision.
Frequently, this estimate is based on the limiting precision of some instrument or other apparatus used in the measurement. When the
limiting precision of an instrument is not specified, it must be estimated by the experimenter.
Measurements may be precise without necessarily being accurate. This situation arises when there is a constant source of error which
affects each measurement of a particular property in the same way. Consider the measurements of liquid density mentioned before. If
the volume of the density bottle used in each measurement is 2% high, then each density value (mass/volume) would be too low by
about 2%, no matter how closely the replicate values agree. In the absence of such a consistent error, however, an experimenter generally assumes that the more precise the
measurement of some property, the greater the chance of its being reliable.
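As a worked check of the density-bottle example, the following sketch assumes that the true volume is V and that the volume used in the calculation is 1.02V (2% high); the symbols are introduced here for illustration only:

```latex
\rho_{\text{calc}} = \frac{m}{1.02\,V} = \frac{1}{1.02} \cdot \frac{m}{V} \approx 0.980\,\rho_{\text{true}}
```

Under that assumption every calculated density is low by about 2% (more precisely, 1.96%), even if the replicate values agree closely with one another.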
STATISTICAL EVALUATION OF DATA AND RESULTS
Many of the experiments in this laboratory require two or more measurements of the same property. The average value derived from
these replicate measurements is taken as the best value of the property. Statistical methods are then employed to evaluate the
reliability of this best value as well as the reliability of each individual value. The following presentation outlines certain concepts and
definitions used in the statistical treatment of experimental data and results.
If you wanted to quantitatively describe the composition of a forest (What percentage evergreen trees? What percentage deciduous trees?), you probably would not count every tree in the
forest, since doing so is both impractical and inefficient. Likewise, chemists often need to characterize the composition of samples (How much
lead contaminant is in this lake?) but it is impractical to count every atom and molecule. Therefore, it is standard practice to use an
analytical sample (some small volume of lake water) that is representative of the bulk material (the whole lake).
How can an investigator be confident that their sample is really representative of the bulk material? In this experiment we will explore
some sampling issues that can impact the reliability of an analysis, and some statistical analysis techniques that can be used to
determine the reliability of a sample.
It is important to point out that sampling and statistics are not simply chemistry problems; they are relevant throughout science, and
beyond. Election polling is a great example of this: how can one poll indicate one candidate is ahead by 8% and another poll
indicate the two candidates are even? The outcome is significantly influenced by the demographics of the people polled, and the
number of people polled. It is generally true that the larger a sample is, the more reliable it will be. But, for the sake of efficiency,
polling organizations need to use statistical analysis to determine when they can stop collecting data.
The first step in a quantitative chemical investigation is to obtain an analytical sample of the bulk material. The sample must have the
same chemical and physical properties as the bulk material in order to be a good representative. The best way to collect material
would be to obtain several samples at random from the total population, using the idea that as the sample size approaches the
population size the sampling error decreases to zero. In most investigations, it is impractical to select very large samples. A typical random
sample is usually far smaller than desired, raising concerns about how accurately the sample really represents the bulk material. This
concern can be addressed by statistical analysis of the data. An answer that is ‘good enough’ can usually be obtained with a sample size
much smaller than the population size.
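The effect of sample size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the forest of 1000 trees with 60% evergreens, the function name `sample_fraction_error`, and the random seed are all hypothetical choices made for illustration.

```python
import random

def sample_fraction_error(population, sample_size, rng):
    """Absolute error in the evergreen fraction estimated from one random sample."""
    true_fraction = sum(population) / len(population)
    sample = rng.sample(population, sample_size)  # sampling without replacement
    return abs(sum(sample) / sample_size - true_fraction)

# Hypothetical forest: 1 = evergreen, 0 = deciduous (60% evergreen).
forest = [1] * 600 + [0] * 400
rng = random.Random(0)  # fixed seed so the run is repeatable

for n in (10, 100, 1000):
    print(n, round(sample_fraction_error(forest, n, rng), 3))
```

When the sample size equals the population size (n = 1000 here), the error is exactly zero. Smaller samples give errors that vary from one draw to the next.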
There are different kinds of errors that can affect a sample or its measurement. A gross error could occur due to loss of sample or
sabotage. A systematic error has an assignable cause (for example, the “zero point” on your measuring instrument is not calibrated),
and it biases every measurement in the same way. A random error appears as “scatter” within sampling, and is
reflected in the imprecision of the data. Random errors are analyzed by the techniques of statistics.
Arithmetic mean: The mean or average for a set of measured values of some property is the sum of the individual values, $x_i$, divided
by the total number of values, N:
$$ \bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + \dots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N} $$
Deviation:
The deviation, $d_i$, of an individual value is the absolute value of the difference between that value and the mean:
$$ d_i = \left| x_i - \bar{x} \right| $$
Average Deviation: The average deviation, $\bar{d}$, is the sum of the deviations for the individual values (without regard to sign) divided
by the total number of values:
$$ \bar{d} = \frac{d_1 + d_2 + d_3 + \dots + d_N}{N} $$
Relative average deviation: Relative average deviation is expressed as the ratio of the average deviation of the individual values to
the arithmetic mean:
$$ \frac{\bar{d}}{\bar{x}} \times 100 = \text{relative average deviation (\%, or pph)} $$
$$ \frac{\bar{d}}{\bar{x}} \times 1000 = \text{relative average deviation (ppt)} $$
Percentage relative error: Percentage relative error is an accuracy index in which the ratio of the absolute error in a measured value
(or mean) to the true or accepted value of a property is multiplied by 100:
$$ \%\ \text{relative error} = \frac{\text{measured value} - \text{true value}}{\text{true value}} \times 100 $$
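A minimal sketch of this calculation follows. The function name `percent_relative_error` is hypothetical. Pairing a mean density of 0.7850 g/mL with the accepted value for ethyl alcohol (0.7852 g/mL at 25 °C) is an assumption made only for illustration, since the handout does not identify the liquid that was measured.

```python
def percent_relative_error(measured, accepted):
    """Signed percentage relative error of a measured value (or mean)."""
    return (measured - accepted) / accepted * 100

# Assumption for illustration: the measured liquid is ethyl alcohol,
# so the accepted density of 0.7852 g/mL applies.
error = percent_relative_error(0.7850, 0.7852)
print(f"{error:.3f} %")  # prints -0.025 %
```

The negative sign shows that the measured value lies below the accepted value.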
Absolute error is of little statistical importance, but relative error does have significance in those situations where the true or accepted
value of a property is known. The statistical concepts described are illustrated here using the values of the liquid density previously
mentioned.
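Density (g/mL)      Deviation, d_i (g/mL)
0.7854              0.0004
0.7850              0.0000
0.7847              0.0003
0.7849              0.0001
3.1400 (sum)        0.0008 (sum)
$$ \bar{x} = \frac{3.1400}{4} = 0.7850 \text{ g/mL} $$
$$ \bar{d} = \frac{0.0008}{4} = 0.0002 \text{ g/mL} $$
$$ \text{Relative average deviation} = \frac{\bar{d}}{\bar{x}} = \frac{0.0002}{0.7850} = 0.0003 $$
$$ \text{Relative average deviation (\%, or pph)} = \frac{\bar{d}}{\bar{x}} \times 100 = 0.03\% $$
$$ \text{Relative average deviation (ppt)} = \frac{\bar{d}}{\bar{x}} \times 1000 = 0.3 \text{ ppt} $$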
From the above calculations, one would report that the density is equal to 0.7850 g/mL. One should also indicate how reliably this
answer is known. In other words, an index of precision is needed to indicate the degree of uncertainty in the calculated result. In this
laboratory, the recommended indices of precision are the average deviation and the relative average deviation. Therefore, the
experimental density can be reported in two ways:
1) Using the average deviation, the value reported is 0.7850 ± 0.0002 g/mL. This
value indicates that the density is between 0.7848 and 0.7852 g/mL.
2) Using the relative average deviation as an indication of precision, one would
report the density as 0.7850 g/mL with a relative average deviation of 0.3 units
for each 1000 units reported.
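The worked density example can be checked with a short script. This is a sketch only: the function names are hypothetical, and the four density values are the ones tabulated in the example.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def average_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

def relative_average_deviation(values, scale=100):
    """scale=100 gives percent (pph); scale=1000 gives parts per thousand."""
    return average_deviation(values) / mean(values) * scale

densities = [0.7854, 0.7850, 0.7847, 0.7849]  # g/mL, from the table in the example
x_bar = mean(densities)
d_bar = average_deviation(densities)

print(f"mean = {x_bar:.4f} g/mL")                # mean = 0.7850 g/mL
print(f"average deviation = {d_bar:.4f} g/mL")   # average deviation = 0.0002 g/mL
print(f"relative average deviation = "
      f"{relative_average_deviation(densities, 1000):.1f} ppt")  # 0.3 ppt
```

The script carries full precision until the final formatting step. Its unrounded relative average deviation is about 0.25 ppt, which rounds to the 0.3 ppt reported above.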
Standard deviation:
The accuracy of a measurement refers to how close it is to the true value. We will be calculating mean (average) values, expressing
the accuracy in terms of the relative error, and specifying the precision of the values by calculating the standard deviation (spread)
about the mean. (If you don’t remember the difference between accuracy and precision, now would be a good time to review these
terms).
Standard deviation for N measurements:
$$ s = \sqrt{\frac{\sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2}{N - 1}} $$
In this equation, xi represents one sample value, and x-bar represents the mean (average). The s value should have the same number of
significant figures as your data values. Standard deviation is used to indicate how “spread out” the measurements are. For example,
if someone does an experiment to determine the percent sugar in apple juice, and the trial measurements are 13%, 14%, 15%, the data
set will have a much smaller standard deviation than a data set of 10%, 13%, 19%. What is interesting to note about both of these data
sets is that they have the same mean value! (Confirm this for yourself.) Can you be confident that one or both of these data sets is a
good predictor of the % sugar in apple juice?
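The apple juice comparison can be confirmed with a few lines of code. This is a minimal sketch; the function name `sample_std` and the variable names are hypothetical.

```python
import math

def sample_std(values):
    """Sample standard deviation with the N - 1 denominator."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

tight = [13, 14, 15]    # percent sugar, first data set
spread = [10, 13, 19]   # percent sugar, second data set

for data in (tight, spread):
    # prints the mean followed by the standard deviation
    print(sum(data) / len(data), round(sample_std(data), 2))
# prints:
# 14.0 1.0
# 14.0 4.58
```

Both sets share a mean of 14%, but the second has a standard deviation more than four times larger.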
Confidence intervals in data sets with a large number of samples:
For large numbers of measurements, the standard
deviation represents the 68% confidence interval. This means that for a large sample, we can expect that 68% of any new
measurements would be in the $\bar{x} \pm s$ range. The 95% confidence interval is obtained within 2 standard deviations of the mean:
$\bar{x} \pm 2s$. A data set with a small standard deviation indicates that the data points have high precision (reproducibility); a data set with
a large standard deviation indicates that there is low precision (a lot of scatter) in the data.
Confidence intervals in data sets with a small number of samples: In a typical laboratory setting it is not practical to make large
numbers of measurements; 5-10 samples is normal. In general, the confidence interval (CI) for a single measurement is given by
$\text{CI} = \bar{x} \pm t \cdot s$, where the value t depends on the number of measurements N and the % confidence desired. What this CI means is
that any single measurement would fall within $\pm t \cdot s$ of $\bar{x}$ with the % probability used to look up t. The smaller the value of N, the
larger the t. The values of t for the 95% confidence interval are:
N      t (CI = 95%)
5      2.776
6      2.571
7      2.447
8      2.365
9      2.306
10     2.262
11     2.228
12     2.201
Example: Calculate the 95% CI for the values 9.990, 9.982, 9.977, 9.990, 9.978.
$$ \bar{x} = 9.9834, \quad s = 0.006309, \quad t = 2.776 $$
$$ \bar{x} \pm t \cdot s \ (95\%\ \text{CI}) = 9.9834 \pm 2.776 \times 0.006309 = 9.983 \pm 0.018 $$
Note that both the average value and the interval boundaries have the same number of decimal places (level of precision) as the
individual measurement values.
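The example can be reproduced with Python's standard `statistics` module. This is a sketch under stated assumptions: the dictionary `T_95` simply copies the t values tabulated above, and the function name `confidence_interval_95` is hypothetical. The half-width is computed as t·s, following the definition used in this handout.

```python
import statistics

# 95% t values copied from the table above, keyed by the number of measurements N.
T_95 = {5: 2.776, 6: 2.571, 7: 2.447, 8: 2.365, 9: 2.306,
        10: 2.262, 11: 2.228, 12: 2.201}

def confidence_interval_95(values):
    """Return (mean, half_width), with half_width = t * s as defined in this handout."""
    t = T_95[len(values)]  # raises KeyError if N is outside the table
    return statistics.mean(values), t * statistics.stdev(values)

x_bar, half_width = confidence_interval_95([9.990, 9.982, 9.977, 9.990, 9.978])
print(f"{x_bar:.3f} ± {half_width:.3f}")  # prints 9.983 ± 0.018
```

`statistics.stdev` uses the N − 1 denominator, so it matches the standard deviation formula given earlier.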
Dixon’s Q-test:
In some instances one measurement in a set apparently lies an abnormal distance from the other values. Such
measurements, called outliers, may be related to human errors and may be removed or corrected because they interfere with the
precision and accuracy of the results. Because unfounded rejection of data is a form of scientific misconduct, data points should
be rejected only with the utmost caution and only if the situation warrants it. Before abnormal observations can be singled out, it is
necessary to characterize normal observations by statistical validation. One of the most-used methods to legitimately eliminate
outliers in chemistry is called Dixon’s Q-test. This test allows us to examine whether one observation from a small set of observations can
be “legitimately” rejected. The test is applied as follows:
1. The N values comprising the set of observations are arranged in ascending order.
2. The Q-value is calculated. This is a ratio defined as the difference of the suspect value from its nearest neighbor divided by the
range of values. For example, when the largest value, $x_N$, is the suspect value:
$$ Q = \frac{\left| \text{suspect value} - \text{nearest value} \right|}{\text{largest value} - \text{smallest value}} = \frac{x_N - x_{N-1}}{x_N - x_1} $$
3. The obtained Q value is compared to a critical Q-value found in tables. If the obtained Q exceeds the critical value, the suspect value may be rejected. The critical
values of Q for the 95% confidence level are listed below:
95% confidence level critical Q values:
N      Q
5      0.710
6      0.625
7      0.568
8      0.526
9      0.493
10     0.466
11     0.444
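The three steps of the Q-test can be sketched as follows. The function names are hypothetical, and the first data set is invented purely for illustration. The decision rule (reject when Q exceeds the critical value) is the usual convention for this test. `Q_CRIT_95` copies the critical values tabulated above.

```python
# 95% critical Q values copied from the table above, keyed by N.
Q_CRIT_95 = {5: 0.710, 6: 0.625, 7: 0.568, 8: 0.526,
             9: 0.493, 10: 0.466, 11: 0.444}

def dixon_q(values):
    """Return (suspect, Q) for the extreme value with the larger gap to its neighbor."""
    data = sorted(values)                      # step 1: ascending order
    value_range = data[-1] - data[0]
    if value_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = data[1] - data[0]                # gap if the smallest value is suspect
    gap_high = data[-1] - data[-2]             # gap if the largest value is suspect
    if gap_high >= gap_low:
        return data[-1], gap_high / value_range    # step 2: Q for the largest value
    return data[0], gap_low / value_range          # step 2: Q for the smallest value

def can_reject(values):
    """Step 3: compare Q with the 95% critical value for this N."""
    _, q = dixon_q(values)
    return q > Q_CRIT_95[len(values)]          # KeyError if N is outside the table

# Hypothetical data with one obviously high reading.
hypothetical = [10.12, 10.15, 10.13, 10.14, 10.52]
suspect, q = dixon_q(hypothetical)
print(suspect, round(q, 3), can_reject(hypothetical))  # prints 10.52 0.925 True
```

Applied to the five values from the confidence interval example (9.990, 9.982, 9.977, 9.990, 9.978), the same function gives Q of about 0.08, well below 0.710, so no value would be rejected.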