Chem. 31 – 9/21 Lecture
Guest Lecture: Dr. Roy Dixon

Announcements I
• Due on Wednesday – Pipet/Buret Calibration Lab Report
  – Format: Pipet Report Form, Buret Calibration Plot, and the data for these measurements (use Lab Manual pages or photocopy your lab notebook pages if neat/organized)
• Last Week's Additional Problem – returned in labs
  – Remember to put your LAB SECTION NUMBER on all assignments turned in (grading is by lab section)

Announcements II
• Today's Lecture – Error and Uncertainty
  – Finish up Gaussian distribution problems
  – t- and Z-based confidence intervals
  – Statistical tests
• Lecture will be posted on my faculty web page (but I will also send it to Dr. Miller-Schulze for his posting method)

Example Problems
• Text Problems 4-2 (a), (d), (e) and 4-4 (a), (b) – done already

Chapter 4 – A Little More on Distributions
• Measurements can be of a naturally varying quantity (e.g. student heights, student test scores, Hg levels in fish in a lake)
• Additionally, a single quantity measured multiple times will typically give different values each time (example: real distribution of measurements of the mass of an ion)
• [Figure: mass spectrum of an ion near m/z 1344; peaks at 1343.9877 and 1344.9770; 2s ~ 0.2 amu; y axis is % intensity, x axis is mass (m/z)]
• Note: to be considered an "accurate mass", an ion's mass error must be less than 5 ppm (0.007 amu in the above spectrum). This is only possible by averaging measurements so that the average mass meets the requirement.

Chapter 4 – A Little More on Distributions
• Reasons for making multiple measurements:
  – So one has information on the variability of the measurement (e.g. can calculate the standard deviation and uncertainty)
  – Average values show less deviation than single measurements
  – Mass spectrometer example: standard deviation in a single measurement ~0.1 amu, but standard deviation in 4 s averages ~0.005 amu

Chapter 4 – Calculation of Confidence Interval
1. Confidence interval = x̄ ± uncertainty
2. Calculation of the uncertainty depends on whether σ is "well known"
3. If σ is not well known: use the t-based interval (covered later)
4. When σ is well known (not in text): value ± uncertainty = x̄ ± Zσ/√n
• Z depends on the area (desired probability) under the normal distribution of frequency vs. Z value:
  – At area = 0.45 on each side of the mean (90% both sides), Z = 1.65
  – At area = 0.475 on each side (95% both sides), Z = 1.96 => larger confidence interval

Chapter 4 – Calculation of Uncertainty
• Example: The concentration of NO3- in a sample is measured many times. If the mean value and standard deviation (assume it is the population standard deviation) are 14.81 and 0.62 ppm, respectively, what would be the expected 95% confidence interval for a 4-measurement average value? (Z for a 95% CI = 1.96)
• What is the probability that a new measurement would exceed the upper 95% confidence limit?

Chapter 4 – Calculation of Confidence Interval with σ Not Known
• Value ± uncertainty = x̄ ± tS/√n
• t = Student's t value; t depends on:
  – the number of samples (more samples => smaller t)
  – the probability of including the true value (larger probability => larger t)

Chapter 4 – Calculation of Uncertainties Example
• Measurement of lead in a drinking water sample:
  – values = 12.3, 9.8, 11.4, and 13.0 ppb
• What is the 95% confidence interval? (A calculation sketch for this example and the NO3- example follows the next slide.)

Chapter 4 – Ways to Reduce Uncertainty
1. Decrease the standard deviation in measurements (usually requires more skill in analysis or better equipment)
2. Analyze each sample more times (this increases n and decreases t)
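A minimal sketch, not part of the original lecture, of how the two confidence-interval examples above could be computed, assuming Python with scipy available for the normal and Student's t distributions; variable names are illustrative only.

```python
# Sketch (not from the lecture) of the Z- and t-based confidence
# intervals for the NO3- and lead-in-water examples above.
from math import sqrt
from statistics import stdev
from scipy import stats

# Z-based interval: sigma treated as "well known" (NO3- example)
xbar, sigma, n, Z = 14.81, 0.62, 4, 1.96
half_width = Z * sigma / sqrt(n)             # 1.96 * 0.62 / 2 ~ 0.61 ppm
print(f"95% CI: {xbar:.2f} +/- {half_width:.2f} ppm")

# Probability that one new measurement exceeds the upper confidence limit:
# a single measurement varies with sigma itself, not sigma/sqrt(n)
z_single = half_width / sigma                # ~0.98
print(f"P(exceed upper limit) ~ {1 - stats.norm.cdf(z_single):.2f}")   # ~0.16

# t-based interval: sigma not well known (lead-in-water example)
values = [12.3, 9.8, 11.4, 13.0]             # ppb
n = len(values)
xbar = sum(values) / n                       # ~11.6 ppb
t95 = stats.t.ppf(0.975, df=n - 1)           # ~3.18 for 3 degrees of freedom
print(f"95% CI: {xbar:.1f} +/- {t95 * stdev(values) / sqrt(n):.1f} ppb")
```

Note that the chance a single new measurement lands above the upper limit (~16%) is much larger than 2.5%, because one measurement spreads with σ rather than σ/√n.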
Overview of Statistical Tests
• t-Tests: determine if a systematic error exists in a method or between methods, or if a difference exists between sample sets
• F-Test: determine if there is a significant difference in standard deviations between two methods (which method is more precise)
• Grubbs Test: determine if a data point can be excluded on a statistical basis

Statistical Tests – Possible Outcomes
• Outcome #1 – There is a statistically significant result (e.g. a systematic error)
  – this is at some probability (e.g. 95%)
  – can occasionally be wrong (possible 5% of the time if the test is barely valid at 95% confidence)
• Outcome #2 – No significant result can be detected
  – this doesn't mean there is no systematic error or difference in averages
  – it does mean that the systematic error, if it exists, is not detectable (e.g. not observable due to larger random errors)
  – it is not possible to prove a null hypothesis beyond any doubt

Statistical Tests – t-Tests
• Case 1 – used to determine if there is a significant bias, by measuring a test standard and determining if there is a significant difference between the known and measured concentration
• Case 2 – used to determine if there is a significant difference between two methods (or samples), by measuring one sample multiple times by each method (or each sample multiple times)
• Case 3 – used to determine if there is a significant difference between two methods (or sample sets), by measuring multiple samples once by each method (or each sample in each set once)

Case 1 t-Test Example
• A new method for determining sulfur content in kerosene was tested on a sample known to contain 0.123% S.
• The measured %S values were: 0.112%, 0.118%, 0.115%, and 0.119%
• Do the data show a significant bias at the 95% confidence level? Clearly lower, but is it significant? (sketched in the code after these t-test slides)

Case 2 t-Test Example
• A winemaker found a barrel of wine that was labeled as a merlot, but was suspected of being part of a chardonnay wine batch and thus of being mislabeled. To see if it was part of the chardonnay batch, the mislabeled barrel wine and the chardonnay batch were analyzed for alcohol content. The results were as follows:
  – Mislabeled wine: n = 6, mean = 12.61%, S = 0.52%
  – Chardonnay wine: n = 4, mean = 12.53%, S = 0.48%
• Determine if there is a statistically significant difference in the ethanol content (see the same sketch below).

Case 3 t-Test Example
• The Case 3 t test is used when multiple samples are analyzed by two different methods (only once by each method)
• Useful for establishing if there is a constant systematic error
• Example: Cl- in Ohio rainwater measured by Dixon and PNL (14 samples)

Case 3 t-Test Example – Data Set and Calculations
Conc. of Cl- in rainwater (units = uM); difference = PNL Cl- minus Dixon Cl-

Sample #   Dixon Cl-   PNL Cl-   Difference
1          9.9         17.0      7.1
2          2.3         11.0      8.7
3          23.8        28.0      4.2
4          8.0         13.0      5.0
5          1.7         7.9       6.2
6          2.3         11.0      8.7
7          1.9         9.9       8.0
8          4.2         11.0      6.8
9          3.2         13.0      9.8
10         3.9         10.0      6.1
11         2.7         9.7       7.0
12         3.8         8.2       4.4
13         2.4         10.0      7.6
14         2.2         11.0      8.8

Step 1 – Calculate the difference for each sample (last column above)
Step 2 – Calculate the mean and standard deviation of the differences: ave d = (7.1 + 8.7 + ...)/14; ave d = 7.49; Sd = 2.44
Step 3 – Calculate the t value: tCalc = (ave d / Sd)·√n; tCalc = 11.5

Case 3 t-Test Example – Rest of Calculations
• Step 4 – Look up tTable (t(95%, 13 degrees of freedom) = 2.17)
• Step 5 – Compare tCalc with tTable and draw a conclusion: tCalc >> tTable, so the difference is significant

t-Tests
• Note: these (Case 2 and Case 3) can be applied to two different scenarios:
  – samples (e.g. sample A and sample B – do they have the same % Ca?)
  – methods (analysis method A vs. analysis method B)
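A minimal sketch, not from the lecture, of how the Case 1 (bias against a known standard) and Case 2 (pooled two-sample comparison) examples above could be computed, assuming Python with scipy; variable names and the pooled-standard-deviation form of Case 2 are illustrative of the standard approach, not the instructor's own worksheet.

```python
# Sketch (not from the lecture): Case 1 and Case 2 t tests for the
# sulfur-in-kerosene and mislabeled-wine examples above.
from math import sqrt
from statistics import mean, stdev
from scipy import stats

# Case 1: bias test against a known standard (0.123% S)
known = 0.123
x = [0.112, 0.118, 0.115, 0.119]
n = len(x)
t_calc = abs(mean(x) - known) * sqrt(n) / stdev(x)   # ~4.4
t_table = stats.t.ppf(0.975, df=n - 1)               # ~3.18 (95%, 3 d.o.f.)
print("Case 1: significant bias" if t_calc > t_table else "Case 1: no significant bias")

# Case 2: compare two means using a pooled standard deviation
n1, x1, s1 = 6, 12.61, 0.52   # mislabeled barrel
n2, x2, s2 = 4, 12.53, 0.48   # chardonnay batch
s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_calc2 = abs(x1 - x2) / (s_pooled * sqrt(1 / n1 + 1 / n2))   # ~0.25
t_table2 = stats.t.ppf(0.975, df=n1 + n2 - 2)                 # ~2.31 (95%, 8 d.o.f.)
print("Case 2: significant difference" if t_calc2 > t_table2 else "Case 2: no significant difference")
```

With these data the sketch prints a significant bias for Case 1 (tCalc ~ 4.4 > 3.18) and no detectable difference in ethanol content for Case 2 (tCalc ~ 0.25 < 2.31).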
F-Test
• Similar methodology to the t tests, but used to compare standard deviations between two methods, to determine if there is a statistical difference in precision between the two methods (or in variability between two sample sets)
• FCalc = S1²/S2², where S1 > S2
• As with the t tests, if FCalc > FTable, the difference is statistically significant

Grubbs Test Example
• Purpose: to determine if an "outlier" data point can be removed from a data set
• Data points can also be removed if observations suggest systematic errors
• Example: Cl lab – 4 trials with values of 30.98%, 30.87%, 31.05%, and 31.00%
  – Student would like less variability (to get full points for precision)
  – The data point farthest from the others is the most suspicious (so 30.87%)
  – Demonstrate calculations (sketched in the code below)

Dealing with Poor Quality Data
• If the Grubbs test fails, what can be done to improve precision?
  – design the study to reduce standard deviations (e.g. use more precise tools)
  – make more measurements (this may make an outlier more extreme and should decrease the confidence interval)
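A sketch of the F-test ratio and the Grubbs calculation requested above, again assuming Python. The Grubbs critical value (~1.463 for n = 4 at 95% confidence) is taken from standard analytical-chemistry tables and should be checked against the table used in this course, and using the wine standard deviations as F-test inputs is purely illustrative; neither value is given in the lecture.

```python
# Sketch (not from the lecture): F test and Grubbs test.
from statistics import mean, stdev

def f_calc(s_a, s_b):
    """F test statistic: larger variance over smaller, so F >= 1."""
    s1, s2 = max(s_a, s_b), min(s_a, s_b)
    return s1**2 / s2**2

# Illustration: compare the precision of the two wine data sets above
print(f"F_calc = {f_calc(0.52, 0.48):.2f}")   # compare with F_table(95%, 5 and 3 d.o.f.)

# Grubbs test on the Cl lab data; the suspect point is the one farthest from the mean
data = [30.98, 30.87, 31.05, 31.00]
suspect = max(data, key=lambda v: abs(v - mean(data)))   # 30.87
g_calc = abs(suspect - mean(data)) / stdev(data)         # ~1.38
g_crit = 1.463                                           # assumed 95% value for n = 4
print("reject outlier" if g_calc > g_crit else "keep the point (cannot reject)")
```

Here GCalc ~ 1.38 falls below the assumed critical value, so the suspect 30.87% point cannot be rejected, which is why the "Dealing with Poor Quality Data" slide discusses what to do when the Grubbs test fails.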