GENERAL CHEMISTRY EXPERIMENTAL MEASUREMENT AND RELIABILITY

The quality of measurements in the chemical laboratory is directly related to the reliability of experimental results. Typically, the quality of a measurement is related to the number of significant figures read from the measuring device. These measurements are then manipulated in such a way as to maximize the reliability of the experimental result. There are means other than significant figures to determine the reliability of a result; these are discussed below.

ACCURACY AND PRECISION

All experimental measurements are subject to error. Error is defined as the difference between a measured value and the “true” value of a property. Expressing the reliability of a measurement in terms of its accuracy is not often possible, since there are relatively few instances where the true value of a property is known. Counted numbers of objects or events are true values; so are the rational or irrational numbers which appear in mathematical formulas. For example, the number π in the formula for the area of a circle (Area = πr²) is known to any desired accuracy.

Since an experimental measurement is subject to error, a property determined by that measurement can never have a true value in the same sense that a counted number does. However, in some cases a property does have a value which is accepted as true by the scientific community. An accepted value may be defined. For example, the atomic weight of the ¹²C isotope is assigned a value of exactly 12, that is, 12 followed by a decimal point and then an infinite number of zeros. An accepted value may also be the most probable value derived from repeated and careful measurements. For instance, the accepted value for the density of liquid ethyl alcohol is 0.7852 g/mL at a temperature of 25 °C.

True or accepted values for most experimentally measured properties are unknown.
When dealing with such a property, an experimenter is unable to evaluate the accuracy of a measurement and must express its reliability in terms of precision. The precision of a measurement is obtained by repeating that measurement several times. If the several measured values show reasonable agreement with one another, then the measurement is said to be precise. Consider four values for the density of a liquid measured by one experimenter: 0.7854, 0.7850, 0.7847, and 0.7849 g/mL; and four values determined by a second experimenter: 0.7856, 0.7850, 0.7844, and 0.7830 g/mL. Clearly, the first set of measurements is more precise than the second.

Repeating a measurement several times is often impractical or inefficient, and it is sometimes necessary to estimate the precision. Frequently, this estimate is based on the limiting precision of some instrument or other apparatus used in the measurement. When the limiting precision of an instrument is not specified, it must be estimated by the experimenter.

Measurements may be precise without necessarily being accurate. This situation arises when there is a constant source of error which affects each measurement of a particular property in the same way. Consider the measurements of liquid density mentioned before. If the volume assumed for the density bottle used in each measurement is 2% high, then each density value (mass/volume) would be too low by about 2%. However, in the absence of such a consistent error, an experimenter generally assumes that the more precise the measurement of some property, the greater the chance of its being reliable.

STATISTICAL EVALUATION OF DATA AND RESULTS

Many of the experiments in this laboratory require two or more measurements of the same property. The average value derived from these replicate measurements is taken as the best value of the property. Statistical methods are then employed to evaluate the reliability of this best value as well as the reliability of each individual value.
The following presentation outlines certain concepts and definitions used in the statistical treatment of experimental data and results.

Arithmetic mean: The mean or average for a set of measured values of some property is the sum of the individual values, x_i, divided by the total number of values, N:

\[ \bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N} \]

Deviation: The deviation, d_i, of an individual value is the absolute value of the difference between that value and the mean:

\[ d_i = \lvert x_i - \bar{x} \rvert \]

Average deviation: The average deviation, \bar{d}, is the sum of the deviations for the individual values (without regard to sign) divided by the total number of values:

\[ \bar{d} = \frac{d_1 + d_2 + d_3 + \dots + d_N}{N} \]

Relative average deviation: Relative average deviation is expressed as the ratio of the average deviation of the individual values to the arithmetic mean:

\[ \frac{\bar{d}}{\bar{x}} \times 100 = \text{relative average deviation (\%, or pph)} \]

\[ \frac{\bar{d}}{\bar{x}} \times 1000 = \text{relative average deviation (ppt)} \]

Percentage relative error: Percentage relative error is an accuracy index in which the ratio of the absolute error in a measured value (or mean) to the true or accepted value of a property is multiplied by 100:

\[ \%\ \text{relative error} = \frac{\text{measured value} - \text{true value}}{\text{true value}} \times 100 \]

Absolute error is of little statistical importance; relative error does have significance in those situations where the true or accepted value of a property is known.

The statistical concepts described are illustrated here using the first experimenter's values of the liquid density mentioned previously.

Density (g/mL)    Deviation, d_i (g/mL)
0.7854            0.0004
0.7850            0.0000
0.7847            0.0003
0.7849            0.0001
Sum: 3.1400       Sum: 0.0008

\[ \bar{x} = \frac{3.1400}{4} = 0.7850\ \text{g/mL} \]

\[ \bar{d} = \frac{0.0008}{4} = 0.0002\ \text{g/mL} \]

\[ \frac{\bar{d}}{\bar{x}} = \frac{0.0002}{0.7850} = 0.0003 \]

\[ \text{relative average deviation (\%, or pph)} = 0.0003 \times 100 = 0.03\ \% \]

\[ \text{relative average deviation (ppt)} = 0.0003 \times 1000 = 0.3\ \text{ppt} \]

From the above calculations, one would report that the density is equal to 0.7850 g/mL. One should also indicate how reliably this answer is known. In other words, an index of precision is needed to indicate the degree of uncertainty in the calculated result.
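The worked density example above can be reproduced in a few lines of Python. This is only a minimal sketch: the function names are illustrative and are not part of the handout. The final line assumes, purely for illustration, that the liquid is ethyl alcohol with the accepted density of 0.7852 g/mL quoted earlier.

```python
# Density values (g/mL) from the worked example in the text.
densities = [0.7854, 0.7850, 0.7847, 0.7849]


def mean(values):
    """Arithmetic mean: sum of the values divided by their number N."""
    return sum(values) / len(values)


def average_deviation(values):
    """Mean of the absolute deviations |x_i - mean|."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def relative_average_deviation(values, scale=100):
    """Average deviation over the mean; scale=100 gives %, scale=1000 gives ppt."""
    return average_deviation(values) / mean(values) * scale


def percent_relative_error(measured, accepted):
    """Signed error relative to the accepted value, in percent."""
    return (measured - accepted) / accepted * 100


x_bar = mean(densities)
d_bar = average_deviation(densities)

print(f"mean                   = {x_bar:.4f} g/mL")   # 0.7850 g/mL
print(f"average deviation      = {d_bar:.4f} g/mL")   # 0.0002 g/mL
print(f"relative avg deviation = {relative_average_deviation(densities, 100):.2f} %")
print(f"relative avg deviation = {relative_average_deviation(densities, 1000):.1f} ppt")

# ASSUMPTION (illustration only): the accepted value is that of ethyl alcohol.
print(f"% relative error       = {percent_relative_error(x_bar, 0.7852):.3f} %")
```

Keeping the scale factor as a parameter makes it harder to mix up the percent (pph) and ppt forms, which differ only by a factor of ten.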
In this laboratory, the recommended indices of precision are the average deviation and the relative average deviation. Therefore, the experimental density can be reported in two ways:

1) Using the average deviation, the value reported is 0.7850 ± 0.0002 g/mL. This value indicates that the density is between 0.7848 and 0.7852 g/mL.
2) Using the relative average deviation as an indication of precision, one would report the density as 0.7850 g/mL with a relative average deviation of 0.3 ppt, that is, 0.3 units for each 1000 units reported.

Standard deviation: Statistics gives another common measure of the quality of experimental data or results, called the standard deviation. The standard deviation, s, is calculated by taking the sum of the squared deviations, dividing by one less than the number of deviations, and taking the square root of the quotient:

\[ s = \left( \frac{d_1^2 + d_2^2 + \dots + d_n^2}{n - 1} \right)^{1/2} \]

In this equation, d_i represents the deviation of an individual value from the mean (see above) and n is the number of values. The standard deviation means that another measurement or result would have a 68% chance of falling within ±s of the mean value, provided the statistics are based on a thousand or more trials. Using a range of two standard deviation units, 2s, the next measurement or result should have a 95% chance of being within two standard deviations of the mean, again for thousands of trials. The size of s is related to the precision of the results or data; a larger value of s indicates less precise data than a smaller value of s. The s value should have the same number of significant figures as your data values.

Standard deviation is used to indicate how “spread out” the measurements are. For example, if someone does an experiment to determine the percent sugar in apple juice, and the trial measurements are 13%, 14%, 15%, the data set will have a much smaller standard deviation than a data set of 10%, 13%, 19%. What is interesting to note about both of these data sets is that they have the same mean value! (Confirm this for yourself.)
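One way to confirm this is a short Python check. It is a sketch only: `sample_std` is an illustrative helper written for this example, using one less than the number of values in the denominator, and it is cross-checked against `statistics.stdev` from the Python standard library.

```python
import math
import statistics


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_std(values):
    """Square root of (sum of squared deviations) / (n - 1)."""
    m = mean(values)
    squared = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared / (len(values) - 1))


# Percent sugar in apple juice, the two data sets from the text.
tight = [13, 14, 15]
spread = [10, 13, 19]

for name, data in (("13, 14, 15", tight), ("10, 13, 19", spread)):
    s = sample_std(data)
    # The library routine should agree with the hand-written formula.
    assert math.isclose(s, statistics.stdev(data))
    print(f"{name}: mean = {mean(data):.1f} %, s = {s:.1f} %")
```

Both sets give the same mean, but the second has a standard deviation several times larger, which is exactly the "spread" that the mean alone hides.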
Can you be confident that one or both of these data sets is a good predictor of the % sugar in apple juice?

Confidence intervals in data sets with a large number of samples: For large numbers of measurements, one standard deviation on either side of the mean represents the 68% confidence interval. This means that for a large sample, we can expect that 68% of any new measurements would fall in the range $\bar{x} \pm s$. The 95% confidence interval lies within 2 standard deviations of the mean: $\bar{x} \pm 2s$. A data set with a small standard deviation indicates that the data points have high precision (reproducibility); a data set with a large standard deviation indicates that there is low precision (a lot of scatter) in the data.

Confidence intervals in data sets with a small number of samples: In a typical laboratory setting it is not practical to make large numbers of measurements; 5-10 samples is normal. In general, the confidence interval (CI) for a single measurement is given by

\[ \mathrm{CI} = \bar{x} \pm t \cdot s \]

where the value t depends on the number of measurements N and the % confidence desired. What this CI means is that any single measurement would fall within t·s of \bar{x} with the % probability used to look up t. The smaller the value of N, the larger the t. The values of t for the 95% confidence interval are:

N     t (CI = 95%)
5     2.776
6     2.571
7     2.447
8     2.365
9     2.306
10    2.262
11    2.228
12    2.201

Example: Calculate the 95% CI for the values 9.990, 9.982, 9.977, 9.990, 9.978.

\[ \bar{x} = 9.9834, \qquad s = 0.006308, \qquad t = 2.776 \quad (N = 5) \]

\[ \bar{x} \pm t \cdot s\ (95\%\ \mathrm{CI}) = 9.9834 \pm (2.776)(0.006308) = 9.983 \pm 0.018 \]

Note that both the average value and the interval boundaries have the same number of decimal places (level of precision) as the individual measurement values.

Dixon’s Q-test: In some instances one measurement in a set apparently lies an abnormal distance from the other values. Such measurements, called outliers, may be related to human errors and may be removed or corrected because they interfere with the precision and accuracy of the results.
Because unfounded rejection of data is a form of scientific misconduct, data points should be rejected only with the utmost caution and only if the situation warrants it. Before abnormal observations can be singled out, it is necessary to characterize normal observations by statistical validation. One of the most-used methods to legitimately eliminate outliers in chemistry is called Dixon’s Q-test. This test allows us to examine whether one observation from a small set of observations can be “legitimately” rejected. The test is applied as follows:

1. The N values comprising the set of observations are arranged in ascending order.
2. The Q-value is calculated. This is a ratio defined as the difference between the suspect value and its nearest neighbor, divided by the range of values. Written for the case where the suspect value is the largest value, x_N:

\[ Q = \frac{\lvert \text{suspect value} - \text{nearest value} \rvert}{\text{largest value} - \text{smallest value}} = \frac{x_N - x_{N-1}}{x_N - x_1} \]

3. The obtained Q value is compared to a critical Q-value found in tables. For the 95% confidence level, the critical values of Q are:

N     Q (95% confidence level)
5     0.710
6     0.625
7     0.568
8     0.526
9     0.493
10    0.466
11    0.444
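The small-sample confidence interval and the Q-test steps above can be sketched together in Python. This is an illustration under stated assumptions, not a prescribed procedure. The function names are hypothetical, the two lookup tables are copied from the tables in this handout, the second data set is invented for the example, and the final comparison uses the conventional reading of the test (a suspect value may be rejected only when Q exceeds the critical value).

```python
import math

# t values and critical Q values at the 95 % level, keyed by the number of
# measurements N (copied from the tables in the text).
T_95 = {5: 2.776, 6: 2.571, 7: 2.447, 8: 2.365,
        9: 2.306, 10: 2.262, 11: 2.228, 12: 2.201}
Q_CRIT_95 = {5: 0.710, 6: 0.625, 7: 0.568, 8: 0.526,
             9: 0.493, 10: 0.466, 11: 0.444}


def confidence_interval_95(values):
    """Return (mean, t*s) for a set of 5 to 12 measurements."""
    n = len(values)
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    return m, T_95[n] * s


def dixon_q(values):
    """Return (suspect value, Q) for the end value with the larger gap."""
    ordered = sorted(values)                      # step 1: ascending order
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = ordered[1] - ordered[0]
    gap_high = ordered[-1] - ordered[-2]
    if gap_high >= gap_low:                       # step 2: gap / range
        return ordered[-1], gap_high / data_range
    return ordered[0], gap_low / data_range


# The 95 % CI example from the text.
m, half_width = confidence_interval_95([9.990, 9.982, 9.977, 9.990, 9.978])
print(f"95% CI: {m:.3f} ± {half_width:.3f}")

# HYPOTHETICAL data, invented for illustration only.
trial = [10.12, 10.15, 10.13, 10.14, 10.45]
suspect, q = dixon_q(trial)
q_crit = Q_CRIT_95[len(trial)]                    # step 3: compare with table
print(f"suspect = {suspect}, Q = {q:.3f}, critical Q = {q_crit}")
print("may be rejected" if q > q_crit else "must be retained")
```

Looking up t and the critical Q by N keeps the sketch limited to the sample sizes the tables cover; any other N raises a `KeyError` rather than silently extrapolating.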