Lecture 4
ERT 207 ANALYTICAL CHEMISTRY
13 JAN 2011

TOPICS TO BE COVERED:
UTILIZATION OF STATISTICS IN DATA ANALYSIS:
* SIGNIFICANCE TESTING
* THE STUDENT t-TEST
* THE Q TEST: REJECTION OF A RESULT
CALIBRATION CURVE
* SLOPE, INTERCEPT AND COEFFICIENT OF DETERMINATION

3.1 SIGNIFICANCE TESTING
• Why do we apply tests to our experimental data?
• 1. To compare data with those obtained by others, in order to gain confidence that the observed data can be accepted or should be rejected.
• 2. To decide whether there is a difference between the results obtained using two different methods.
Both questions can be answered by carrying out significance tests.

3.2 THE STUDENT t-TEST
• The t-test is used to decide whether there is a significant difference between two means. Two cases are considered:
• True or accepted value and confidence limit known
• Comparing the means of two sets of data
The calculated t value is compared with the t value from the table (Table 3.1 from Gary D. Christian) for the number of measurements in the data sets and at the required level of confidence.
Select a confidence level (95% is good) for the number of samples analyzed (= degrees of freedom + 1).
Confidence limit = x̄ ± ts/√N. It depends on the precision, s, and the confidence level you select.

(a) True or accepted value and confidence limit known.
Setting the confidence limit equal to the true value, μ = x̄ ± ts/√N, and rearranging gives
±t = (x̄ − μ)√N/s
The calculated t value (|t|) is then compared with the t value from the table (Table 3.1).
If t calculated > t table, then there is a significant difference between the measured mean and the accepted value at that level of confidence.
If t calculated < t table, then there is no significant difference between the measured mean and the accepted value at that level of confidence.

Example 1: The true value (μ) for Cl⁻ in a standard sample is 34.63%. After three analyses, the mean is found to be 34.71% and s is 0.04%. Do these results indicate a determinate error in the method used?
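As a numerical aid for this example, the test statistic can be evaluated with a short script. This is a minimal sketch, assuming the one-sample form |t| = |x̄ − μ|√N/s; the function name `t_calculated` and the dictionary of critical values (standard two-sided t values for 2 degrees of freedom, assumed to agree with Table 3.1) are illustrative and not part of the lecture.

```python
import math


def t_calculated(mean, mu, s, n):
    """Return |t| = |mean - mu| * sqrt(n) / s for n replicate measurements."""
    return abs(mean - mu) * math.sqrt(n) / s


# Two-sided critical t values for 2 degrees of freedom.
# Assumption: these standard values agree with Table 3.1 of the textbook.
T_TABLE_DF2 = {90: 2.920, 95: 4.303, 99: 9.925}

# Chloride example: mu = 34.63 %, mean = 34.71 %, s = 0.04 %, N = 3.
t = t_calculated(mean=34.71, mu=34.63, s=0.04, n=3)

for level, t_table in T_TABLE_DF2.items():
    verdict = "significant" if t > t_table else "not significant"
    print(f"{level}% confidence: |t| = {t:.2f}, t_table = {t_table} -> {verdict}")
```

With the numbers of this example the script gives |t| of about 3.46, which lies between the 90% and 95% table values.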
Solution:
t calculated = (34.71 − 34.63)√3/0.04 = 3.46
For three analyses, N − 1 = 2 degrees of freedom (Table 3.1).
t calculated > t table at the 90% confidence level, and t calculated < t table at the 95% confidence level and above.
Therefore there is no significant difference at confidence levels of 95% and above, and we can conclude that there is no determinate error at those confidence levels.

• Example 2: You are developing a procedure for determining traces of copper in biological materials using a wet digestion followed by measurement by atomic absorption spectrophotometry. In order to test the validity of the method, you obtain an NIST orchard leaves standard reference material and analyze this material. Five replicates are sampled and analyzed, and the mean of the results is found to be 10.8 ppm with a standard deviation of ±0.7 ppm. The listed value is 11.7 ppm. Does your method give a statistically correct value at the 95% confidence level?
• Solution:
• |t| = |10.8 − 11.7|√5/0.7 = 2.9
• There are five measurements, so there are 4 degrees of freedom (N − 1).
• From Table 3.1, the tabulated value of t at the 95% confidence level is 2.776.
• This is less than the calculated value, so there is a determinate error in the new procedure.
• That is, there is a 95% probability that the difference between the reference value and the measured value is not due to chance.

(b) Comparing the means of two sets of data.
• See Gary D. Christian (page 95).
The pooled standard deviation, sp, is given by:
sp = √{[Σ(xi1 − x̄1)² + Σ(xi2 − x̄2)² + … + Σ(xik − x̄k)²]/(N − k)}
xi1, xi2, …, xik => individual values in each set
N => total number of measurements (N1 + N2 + … + Nk)
x̄1, x̄2, …, x̄k => means of each of the k sets of analyses
N − k => degrees of freedom, from (N1 − 1) + (N2 − 1) + … + (Nk − 1)
The t value for two means is then calculated from:
t = [(x̄1 − x̄2)/sp]√[N1N2/(N1 + N2)]

• Example 1: Two bottles of beer were analyzed for their alcohol content. Four samples from the first bottle gave a mean value of 12.61% alcohol and six samples from the second bottle gave a mean value of 12.39% alcohol. The pooled standard deviation is 0.07%. Is there a significant difference between the contents of the two bottles?
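The quantities in this example can be checked with a few lines of Python. This is a sketch only, assuming the usual two-mean form t = (|x̄1 − x̄2|/sp)·√(N1N2/(N1 + N2)); the helper names `pooled_std` and `t_two_means` are hypothetical, and the critical values are standard two-sided t values for 8 degrees of freedom, assumed to agree with Table 3.1.

```python
import math


def pooled_std(*groups):
    """Pooled standard deviation of k sets of results (N - k degrees of freedom)."""
    total_n = sum(len(g) for g in groups)
    sum_sq = 0.0
    for g in groups:
        g_mean = sum(g) / len(g)
        sum_sq += sum((x - g_mean) ** 2 for x in g)
    return math.sqrt(sum_sq / (total_n - len(groups)))


def t_two_means(mean1, mean2, sp, n1, n2):
    """Return |t| for the difference between two means with pooled std sp."""
    return abs(mean1 - mean2) / sp * math.sqrt(n1 * n2 / (n1 + n2))


# Two-sided critical t values for 8 degrees of freedom.
# Assumption: these standard values agree with Table 3.1 of the textbook.
T_TABLE_DF8 = {80: 1.397, 90: 1.860, 95: 2.306, 99: 3.355}

# Beer example: the pooled standard deviation (0.07 %) is given directly.
n1, n2 = 4, 6
t_beer = t_two_means(12.61, 12.39, sp=0.07, n1=n1, n2=n2)
df = n1 + n2 - 2

for level, t_table in T_TABLE_DF8.items():
    print(f"{level}%: |t| = {t_beer:.2f} (df = {df}), t_table = {t_table}, "
          f"significant = {t_beer > t_table}")
```

`pooled_std` is included for cases where only the raw replicate values are available; in the beer example it is not needed because sp is stated in the problem.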
• Solution:
• t calculated = [(12.61 − 12.39)/0.07]√[(4 × 6)/(4 + 6)] = 4.9, with N1 + N2 − 2 = 8 degrees of freedom.
• t calculated > t table at the 80, 90, 95 and 99% levels of confidence (Table 3.1).
• This shows that there is a significant difference in the alcohol content of the two bottles.

• Example: A new gravimetric method is developed for iron(III) in which the iron is precipitated in crystalline form with an organoboron "cage" compound. The accuracy of the method is checked by analyzing the iron in an ore sample and comparing with the results obtained using the standard precipitation with ammonia and weighing of Fe2O3. The results, reported as % Fe for each analysis, were as follows:
• Is there a significant difference between the two methods?
• The tabulated t for nine degrees of freedom (N1 + N2 − 2) at the 95% confidence level is 2.262. The calculated t is smaller than this, so there is no statistical difference in the results by the two methods.

(c) For the special case M = N.

3.3 THE Q TEST: REJECTION OF A RESULT
• Not all data obtained from an experiment can be used.
• Some values may differ greatly from the rest of the data and may therefore need to be removed.
• The rule for data rejection is known as the Q test.
• The Q test is used to decide whether a value that lies far from the other values should be rejected.
• The Q value is obtained as follows:
• (i) Arrange the values in increasing or decreasing order.
• (ii) Calculate the difference between the highest and lowest values.
• (iii) Calculate the difference between the doubtful value and the value nearest to it. Divide this difference by the value calculated in (ii) to obtain Q calculated.
• (iv) Compare Q calculated with the Q value from the Q table (Table 3.3). The doubtful value is rejected at the 90% confidence level if Q calculated is greater than or equal to the Q value from the table.

• Example: An analysis of iron ore gives the following results: 33.78%, 33.84%, 33.60% and 33.15%. Which of the values should be removed?
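Steps (i) to (iv) of the Q test can be sketched in Python as below and applied to the iron ore results. The function name `q_calculated` is hypothetical, and the small lookup table holds only the two 90% Q values quoted in this lecture's worked solutions (0.76 for 4 observations, 0.56 for 6); any other N has to be read from Table 3.3.

```python
def q_calculated(values, suspect):
    """Return Q = |suspect - nearest neighbour| / (highest - lowest).

    The suspect value must be the highest or the lowest result, and the
    results must not all be identical (the range would be zero).
    """
    ordered = sorted(values)                      # step (i)
    if suspect not in (ordered[0], ordered[-1]):
        raise ValueError("suspect must be the highest or lowest value")
    spread = ordered[-1] - ordered[0]             # step (ii)
    nearest = ordered[1] if suspect == ordered[0] else ordered[-2]
    return abs(suspect - nearest) / spread        # step (iii)


# 90 % confidence Q values as quoted in this lecture (keyed by N).
Q_TABLE_90 = {4: 0.76, 6: 0.56}

iron = [33.78, 33.84, 33.60, 33.15]
for suspect in (33.15, 33.84):
    q = q_calculated(iron, suspect)
    q_table = Q_TABLE_90[len(iron)]               # step (iv)
    decision = "reject" if q >= q_table else "retain"
    print(f"{suspect}: Q = {q:.2f}, Q_table = {q_table} -> {decision}")
```

For the iron ore data this gives Q of about 0.65 for 33.15 and about 0.09 for 33.84, both below 0.76.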
• Solution:
• Doubtful values are 33.15% and 33.84%.
• Arrange the values: 33.84, 33.78, 33.60, 33.15
• Difference between the highest and the lowest value = 33.84 − 33.15 = 0.69%
• Testing for 33.15%:
• Difference between the doubtful value and the one nearest to it = 33.60 − 33.15 = 0.45%
• Q calculated = 0.45/0.69 = 0.65
• From the Q table, Q table = 0.76 (for 4 observations).
• Q calculated < Q table, so 33.15% is not rejected at the 90% confidence level.
• The highest value (33.84%) can also be tested using the same method as above.

• Using the Q test for the following results, determine whether the value 8.75 should be rejected as an outlier at the 90% confidence level.
• 8.20, 8.35, 8.64, 8.25, 8.75, 8.45
• Answer:
• a) Arrange the values in descending order: 8.75, 8.64, 8.45, 8.35, 8.25, 8.20
• b) Calculate the difference between the highest and lowest values: 8.75 − 8.20 = 0.55
• c) Calculate the difference between the doubtful value and the value nearest to it: 8.75 − 8.64 = 0.11
• d) Divide this value by the value calculated in (b) to obtain Q calculated: 0.11/0.55 = 0.20
• e) Compare Q calculated with Q table at the 90% confidence level for N = 6:
• Q calculated = 0.20
• Q table = 0.56
• Q calculated < Q table (0.20 < 0.56)
• Therefore, the value 8.75 should be accepted at the 90% confidence level (8.75 cannot be removed).

• When the same samples are analyzed by two methods (paired t-test), the difference Di between each pair of results is taken. The average difference D̄ is calculated, and the individual deviations of each Di from D̄ are used to compute a standard deviation, sd:
sd = √[Σ(Di − D̄)²/(N − 1)]
• The t value is calculated from:
t = (D̄/sd)√N

Example: You are developing a new analytical method for the determination of blood urea nitrogen (BUN). You want to determine whether your method differs significantly from a standard one for analyzing a range of sample concentrations expected to be found in the routine laboratory. It has been ascertained that the two methods have comparable precisions. Following are two sets of results for a number of individual samples.
Sample   Your Method (mg/dL), x   Standard Method (mg/dL), y
A        10.2                     10.5
B        12.7                     11.9
C         8.6                      8.7
D        17.5                     16.9
E        11.2                     10.9
F        11.5                     11.1

Solution:
Sample   Your Method, mg/dL   Standard Method, mg/dL   Di      Di − D̄   (Di − D̄)²
A        10.2                 10.5                     −0.3    −0.6      0.36
B        12.7                 11.9                      0.8     0.5      0.25
C         8.6                  8.7                     −0.1    −0.4      0.16
D        17.5                 16.9                      0.6     0.3      0.09
E        11.2                 10.9                      0.3     0.0      0.00
F        11.5                 11.1                      0.4     0.1      0.01
                                                     ∑ 1.7              ∑ 0.87
                                                     D̄ = 0.28

• sd = √(0.87/5) = 0.42
• t = (0.28/0.42)√6 = 1.63
• The tabulated t value at the 95% confidence level for 5 degrees of freedom is 2.571.
• Therefore, t calculated < t table.
• So there is no significant difference between the two methods at the 95% confidence level.

• A difference at the 95% confidence level is considered significant.
• A difference at the 99% level is highly significant.
• The smaller the calculated t value, the more confident we can be that there is no significant difference between the two methods.
• If the confidence level is too low (e.g. 80%), we are likely to conclude erroneously that there is a significant difference between the two methods (type I error).
• On the other hand, too high a confidence level requires too large a difference before it is detected, so a real difference may be missed (type II error).
• If a calculated t value is near the tabulated value at the 95% confidence level, more tests should be run to establish definitely whether the two methods are significantly different.

Thank you for your attention
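As a supplementary check of the BUN example above, the paired calculation can be reproduced in Python. This is a minimal sketch assuming t = (|D̄|/sd)√N with sd = √[Σ(Di − D̄)²/(N − 1)]; the function name `paired_t` is hypothetical, and sample D is entered as 17.5 mg/dL, the value used in the solution table.

```python
import math


def paired_t(x, y):
    """Return (mean difference, std of differences, |t|) for paired results."""
    n = len(x)
    diffs = [xi - yi for xi, yi in zip(x, y)]
    d_mean = sum(diffs) / n
    s_d = math.sqrt(sum((d - d_mean) ** 2 for d in diffs) / (n - 1))
    t = abs(d_mean) / s_d * math.sqrt(n)
    return d_mean, s_d, t


# BUN results in mg/dL for samples A to F.
your_method = [10.2, 12.7, 8.6, 17.5, 11.2, 11.5]   # D taken as 17.5 (solution table)
standard_method = [10.5, 11.9, 8.7, 16.9, 10.9, 11.1]

d_mean, s_d, t = paired_t(your_method, standard_method)
T_TABLE_95_DF5 = 2.571   # tabulated value quoted in the lecture

print(f"D_mean = {d_mean:.3f}, s_d = {s_d:.3f}, |t| = {t:.2f}")
print("significant at 95%:", t > T_TABLE_95_DF5)
```

Carrying full precision through the calculation gives |t| of about 1.67 instead of the 1.63 obtained from the rounded values 0.28 and 0.42; either way it is below 2.571, so the conclusion is unchanged.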