CHM 235 Quantitative Analysis
Spring 2007
Dr. S.A. Skrabal

SOLUTIONS TO PROBLEM SET 3
Statistics
30 January 2007

1. A researcher performs an analysis for total PCB concentration in a contaminated soil. Seven analyses of the same soil sample yielded the following concentrations (in µg PCBs per g soil): 14.5, 17.8, 15.9, 14.6, 15.0, 16.6, and 16.2. (A) Calculate the mean, standard deviation, relative standard deviation (rsd), average deviation, and relative average deviation (rad) of these results, using the correct number of significant figures for each. (B) Calculate the 50% and 95% confidence intervals for these data.

(A) Using calculator:

$\bar{x} = 15.8$ µg or 16 µg

$s = 1.1$ µg or 1 µg

\[ \text{rsd} = \frac{1.1}{15.8}(100) = 6.9\% \text{ or } 7\% \]

\[ \bar{d} = \frac{|15.8 - 14.5| + |15.8 - 17.8| + |15.8 - 15.9| + |15.8 - 14.6| + |15.8 - 15.0| + |15.8 - 16.6| + |15.8 - 16.2|}{7} = 0.9_4 \text{ µg or } 0.9 \text{ µg} \]

\[ \text{rad} = \frac{0.9_4}{15.8}(100) = 5.9\% \text{ or } 6\% \]

(B) $t_{\text{table}}$(df = 6, 50% CL) = 0.718
$t_{\text{table}}$(df = 6, 95% CL) = 2.447

The confidence interval is $\bar{x} \pm \dfrac{ts}{\sqrt{n}}$:

\[ 15.8 \pm \frac{(0.718)(1.1)}{\sqrt{7}} = 15.8 \pm 0.30 \text{ µg} \quad \text{(50\% confidence interval)} \]

\[ 15.8 \pm \frac{(2.447)(1.1)}{\sqrt{7}} = 15.8 \pm 1.0 \text{ µg} \quad \text{(95\% confidence interval)} \]

2. A beginning graduate student has analyzed a biological standard reference material (SRM), lobster hepatopancreas, for the concentration of chromium. Six analyses gave the following results (in mg/kg dry weight): 0.544, 0.509, 0.522, 0.593, 0.537, 0.545. (A) Can any of these values be rejected using the Q-test at the 90% confidence level (CL)? (B) What are the mean, standard deviation, and rsd of the non-rejected values?

(A) Put in order (low to high or high to low): 0.509, 0.522, 0.537, 0.544, 0.545, 0.593

Test 0.593 as an outlier:

\[ Q_{\text{calc}} = \frac{\text{gap}}{\text{range}} = \frac{0.593 - 0.545}{0.593 - 0.509} = \frac{0.048}{0.084} = 0.571 \]

$Q_{\text{table}}$(n = 6, 90% CL) = 0.56

$Q_{\text{calc}} > Q_{\text{table}}$, so the value is rejected.

(B) Mean, sd, and rsd (using calculator) of the remaining values are 0.531 (or 0.53) mg/kg, 0.015 (or 0.02) mg/kg, and 2.8% (or 3%), respectively.

3. Another student is analyzing the lobster hepatopancreas SRM for silver, which is a potentially toxic heavy metal.
From seven analyses, the student found the following concentrations (in mg/kg dry weight): 3.32, 3.54, 3.67, 3.62, 3.30, 3.79, 3.40. The certified concentration of silver in the SRM is 3.89 mg/kg dry weight. Is the student obtaining the “correct” value (at the 95% CL) for silver in the SRM?

Mean and sd of student’s results (from calculator) are: 3.52 ± 0.18 mg/kg

\[ t_{\text{calc}} = \frac{|\text{known value} - \bar{x}|}{s}\sqrt{n} = \frac{|3.89 - 3.5_2|}{0.18}\sqrt{7} = 5.438 \]

$t_{\text{table}}$(df = 6, 95% CL) = 2.447. Since $t_{\text{calc}} > t_{\text{table}}$, the student’s results are significantly different from the certified value at the 95% CL.

4. The quality control department of a vitamin C tablet manufacturing facility analyzed 121 tablets and found the average mass of vitamin C in each tablet to be 200.9 mg with a standard deviation of 0.9 mg. Calculate the 90% and 99.9% confidence intervals for these data. Keep two decimal places for the confidence intervals.

$t_{\text{table}}$(df = 120, 90% CL) = 1.658
$t_{\text{table}}$(df = 120, 99.9% CL) = 3.373

The confidence interval is $\bar{x} \pm \dfrac{ts}{\sqrt{n}}$:

\[ 200.9 \pm \frac{(1.658)(0.9)}{\sqrt{121}} = 200.9 \pm 0.14 \text{ mg} \quad \text{(90\% confidence interval)} \]

\[ 200.9 \pm \frac{(3.373)(0.9)}{\sqrt{121}} = 200.9 \pm 0.28 \text{ mg} \quad \text{(99.9\% confidence interval)} \]

5. A large sample of groundwater from the western United States was divided into two equal portions, and sent to two different laboratories for the analysis of dissolved arsenic. The two laboratories followed the exact same processing and analytical procedures. The first laboratory, Gabfest Analytical Labs of Lake Lahala, MI, obtained arsenic concentrations (for 5 analyses) of 42.5, 39.5, 39.9, 40.0, and 43.9 ppb. Porkmeister Environmental Co. of Melrose Place, IA obtained arsenic concentrations (for 7 analyses) of 49.1, 45.5, 46.7, 47.3, 48.8, 48.2, and 47.9 ppb. First determine whether or not the standard deviations of the two methods are statistically the same at the 95% confidence level, then apply the correct t-test to determine whether or not the two labs obtained results that are significantly different at the 95% confidence level. What is the significance at the 99% CL?
First apply the F-test to determine if the standard deviations are significantly different.

Gabfest data: $\bar{x}_1 = 41.1$ ppb, $s_1 = 1.9$ ppb
Porkmeister data: $\bar{x}_2 = 47.6$ ppb, $s_2 = 1.2$ ppb

\[ F_{\text{calc}} = \frac{s_1^2}{s_2^2} = \frac{(1.9)^2}{(1.2)^2} = 2.51 \]

(Remember $s_1$ is always the larger standard deviation, so F is always > 1.)

$F_{\text{table}}$(df = 4,6; CL = 95%) = 4.53

Since $F_{\text{calc}} < F_{\text{table}}$, the standard deviations are not significantly different at the 95% CL. Therefore, use the “regular” formulas for the t-test for comparison of means.

\[ s_{\text{pooled}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \qquad t_{\text{calc}} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{\text{pooled}}}\sqrt{\frac{n_1 n_2}{n_1 + n_2}} \]

df for t-test = $n_1 + n_2 - 2$ = 10

\[ s_{\text{pooled}} = \sqrt{\frac{(1.9)^2 (5 - 1) + (1.2)^2 (7 - 1)}{5 + 7 - 2}} = 1.519 \quad \text{(ignoring sf)} \]

\[ t_{\text{calc}} = \frac{|41.1 - 47.6|}{1.519}\sqrt{\frac{(5)(7)}{5 + 7}} = 7.308 \quad \text{(keeping 4 sf for test)} \]

$t_{\text{table}}$(df = 10, 95% CL) = 2.228. Since $t_{\text{calc}} > t_{\text{table}}$, results are significantly different at the 95% CL.

$t_{\text{table}}$(df = 10, 99% CL) = 3.169. Since $t_{\text{calc}} > t_{\text{table}}$, results are significantly different at the 99% CL.

6. A blood sample was analyzed for calcium using two different methods: a colorimetric method and an atomic absorption spectrophotometric (AAS) method. Six analyses of the sample using the colorimetric method yielded the following results: 10.8, 10.4, 10.5, 10.9, 9.9, and 9.9 mg/dL. Eight analyses of the blood using the AAS method gave the following results: 8.9, 10.3, 9.4, 11.4, 11.7, 9.4, 11.9, 10.9 mg/dL. (A) Apply the Q-test to each set of data to determine whether or not any data points should be rejected as outliers at the 90% confidence level. (B) Using all acceptable data points, use the F-test at the 95% CL to determine whether or not there is a significant difference in the precision of the two methods. That is, are the standard deviations significantly different between the two methods? (C) Using the correct t-test, determine whether or not the means obtained by the two methods are significantly different at the 95% CL.

(A) Colorimetric data: 9.9, 9.9, 10.4, 10.5, 10.8, 10.9
No obvious outliers. Can test 10.9 to demonstrate this.
Q = gap/range = (10.9 - 10.8) / (10.9 - 9.9) = 0.1
$Q_{\text{table}}$(90% CL, n = 6) = 0.56. Since $Q_{\text{calc}} < Q_{\text{table}}$, do not reject the data point at the 90% CL.

AAS data: 8.9, 9.4, 9.4, 10.3, 10.9, 11.4, 11.7, 11.9
Possible outlier is 8.9.
Q = (9.4 - 8.9) / (11.9 - 8.9) = 0.17
$Q_{\text{table}}$(90% CL, n = 8) = 0.47. Since $Q_{\text{calc}} < Q_{\text{table}}$, do not reject the data point at the 90% CL.

Keep all data from both data sets when calculating the means and standard deviations. Using calculator:

Colorimetric data: $\bar{x}_1 = 10.4_0$ mg/dL, $s_1 = 0.4_2$ mg/dL
AAS data: $\bar{x}_2 = 10.4$ mg/dL, $s_2 = 1.1$ mg/dL

(B) $F_{\text{calc}} = (1.1)^2/(0.42)^2 = 6.86$ (Note the AAS data has the larger s.d., so it becomes $s_1$ for the F-test.)

Degrees of freedom are 7 for the AAS data and 5 for the colorimetric data.
$F_{\text{table}}$(df = 7,5; 95% CL) = 4.88. Since $F_{\text{calc}} > F_{\text{table}}$, the standard deviations are significantly different at the 95% CL. Precision is not the same for the two methods.

(C) Must use equations 4-8a and 4-9a in the textbook to perform the t-test for comparison of means:

\[ t_{\text{calc}} = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{|10.4_0 - 10.4|}{\sqrt{(0.4_2)^2/6 + (1.1)^2/8}} \approx 0 \]

(Note: You could stop here, since it is clear that $t_{\text{calc}}$ will be less than $t_{\text{table}}$. Therefore, there is no significant difference between the means at the 95% CL. But, continuing on so you see how it’s done…)

\[ \text{DF} = \left\{ \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\dfrac{\left( s_1^2/n_1 \right)^2}{n_1 + 1} + \dfrac{\left( s_2^2/n_2 \right)^2}{n_2 + 1}} \right\} - 2 \]

\[ \text{DF} = \left\{ \frac{\left( (0.4_2)^2/6 + (1.1)^2/8 \right)^2}{\dfrac{\left( (0.4_2)^2/6 \right)^2}{6 + 1} + \dfrac{\left( (1.1)^2/8 \right)^2}{8 + 1}} \right\} - 2 = \frac{0.03263}{0.002665} - 2 = 10.2 \approx 10 \]

$t_{\text{table}}$(df = 10, 95% CL) = 2.228. Since $t_{\text{calc}} < t_{\text{table}}$, there is no significant difference between the means for the two sets of data at the 95% CL.
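
The Problem 6 workflow (Q-test, then F-test, then the unequal-variance t-test) can be checked numerically with a short script. The following is only a sketch under stated assumptions, not part of the original solution: the function names (q_calc, f_calc, t_unequal) are hypothetical, the table values are copied from the solution rather than computed, and the degrees-of-freedom expression is the "n + 1, minus 2" form used in the hand calculation. Because the script carries unrounded means and standard deviations, its F and t values differ slightly from the hand results (6.86 and approximately 0), although the conclusions are the same.

```python
from math import sqrt
from statistics import mean, stdev  # stdev uses the n - 1 (sample) formula


def q_calc(values, suspect):
    """Q = gap / range for a suspect value at either end of the sorted data."""
    data = sorted(values)
    if suspect == data[0]:
        gap = data[1] - data[0]
    elif suspect == data[-1]:
        gap = data[-1] - data[-2]
    else:
        raise ValueError("suspect must be the smallest or largest value")
    return gap / (data[-1] - data[0])


def f_calc(a, b):
    """F = (larger s)^2 / (smaller s)^2, so F is never below 1."""
    s_hi, s_lo = sorted((stdev(a), stdev(b)), reverse=True)
    return s_hi ** 2 / s_lo ** 2


def t_unequal(a, b):
    """t and DF for comparing means when the standard deviations differ.

    DF uses the same form as the hand calculation (n + 1 in the
    denominators, then subtract 2); other references use a different form.
    """
    u1 = stdev(a) ** 2 / len(a)
    u2 = stdev(b) ** 2 / len(b)
    t = abs(mean(a) - mean(b)) / sqrt(u1 + u2)
    df = (u1 + u2) ** 2 / (u1 ** 2 / (len(a) + 1) + u2 ** 2 / (len(b) + 1)) - 2
    return t, df


colorimetric = [10.8, 10.4, 10.5, 10.9, 9.9, 9.9]
aas = [8.9, 10.3, 9.4, 11.4, 11.7, 9.4, 11.9, 10.9]

# Table values copied from the solution above (not computed here).
Q_TABLE_90 = {6: 0.56, 8: 0.47}
F_TABLE_95 = 4.88   # df = 7, 5
T_TABLE_95 = 2.228  # df = 10

for name, data, suspect in [("colorimetric", colorimetric, 10.9),
                            ("AAS", aas, 8.9)]:
    q = q_calc(data, suspect)
    verdict = "reject" if q > Q_TABLE_90[len(data)] else "keep"
    print(f"{name}: Q = {q:.2f}, {verdict} {suspect}")

f = f_calc(colorimetric, aas)
print(f"F = {f:.2f}; s values differ at 95% CL: {f > F_TABLE_95}")

t, df = t_unequal(colorimetric, aas)
print(f"t = {t:.3f}, DF = {df:.1f}; means differ at 95% CL: {t > T_TABLE_95}")
```

Keeping the table values as named constants makes it explicit that the script only reproduces the comparison step; a different confidence level would need different table entries.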