Analytical Chemistry

WHAT IS ANALYTICAL CHEMISTRY
• The "science of chemical measurements"
- The qualitative and quantitative characterization of matter
- The scope is very wide, and it is critical to our understanding of almost all scientific disciplines

Characterization
- The identification of chemical compounds or elements present in a sample (qualitative)
- The determination of the amount of compound or element present in a sample (quantitative)

Four Primary Areas of Analytical Chemistry
• Detection: Does the sample contain substance X?
• Identification: What is the identity of substance X in the sample?
• Separation: How can the species of interest be separated from the sample matrix for better quantitation and identification?
• Quantitation: How much of substance X is in the sample?

What are the roles of Analytical Chemists?
An analytical chemist:
1. Applies known measurement techniques to well-defined compositional or characterization questions.
2. Develops new measurement methods on existing principles to solve new analysis problems.

What is Analytical Science?
• Analytical chemistry provides the methods and tools needed for insight into our material world, and for answering four basic questions about a material sample:
• What?
• Where?
• How much?
• What arrangement, structure or form?

Different methods provide a range of precision, sensitivity, selectivity, and speed capabilities. Sample size is very important when choosing a particular analytical technique.
The Analytical Chemistry Language

Analyte
- A substance to be measured in a given sample
Matrix
- Everything else in the sample
Interferences
- Other compounds in the sample matrix that interfere with the measurement of the analyte

Homogeneous Sample
- Same chemical composition throughout (steel, sugar water, juice with no pulp, alcoholic beverages)
Heterogeneous Sample
- Composition varies from region to region within the sample (pudding with raisins, granola bars with peanuts)
- Differences in composition may be visible or invisible to the human eye (in most real samples they are invisible)
- Variation of composition may be random or segregated

Analyze/Analysis
- Experimental evaluation of the sample under study
Determine/Determination
- Measurement of the analyte in the sample
Multiple Samples
- Identical samples prepared from another source
Replicate Samples
- Splits of sample from the same source

General Steps in Chemical Analysis (the analytical chemistry approach for sample analysis)
1. Formulating the question or defining the problem
- To be answered through chemical measurements
2. Designing the analytical method/protocol (selecting techniques)
- Find appropriate analytical procedures
3. Sampling and sample storage
- Select representative material to be analyzed
4. Sample preparation
- Convert representative material into a suitable form for analysis
5. Analysis (performing the measurement)
- Measure the concentration of analyte in several identical portions
6. Assessing the data
7. Method validation
8. Documentation

Basic steps to develop an analytical procedure
An analysis involves several steps and operations which depend on:
• the particular problem
• your expertise
• the apparatus or equipment available.
The analyst should be involved in every step.
Integrity of Analytical Method
Once an analytical method is conducted, statistical operations are used to determine the integrity of the test method and results. Every measurement provides information about both its magnitude and its uncertainty.

Statistical Operations
- Statistics are needed in designing the correct experiment
An analyst must
- select the required size of sample
- select the number of samples
- select the number of replicates
- obtain the required accuracy and precision
An analyst must also express the uncertainty in measured values to
- understand any associated limitations
- know significant figures

Statistical Operations: Rules For Reporting Results
Significant figures = digits known with certainty + first uncertain digit
- The last sig. fig. reflects the precision of the measurement
- Report all sig. figs such that only the last figure is uncertain
- Round off appropriately (round down, round up, round even)
- For multiplication and division of measurements, report the result to the least number of sig. figs among the measurements (the one with the greatest relative uncertainty)
- For addition and subtraction of measurements, report the result to the least number of decimal places among the measurements (the one with the greatest absolute uncertainty)
- The characteristic of a logarithm has no uncertainty
- It does not affect the number of sig. figs.
- Discrete objects (absolute numbers) have no uncertainty
- They are considered to have an infinite number of sig. figs.
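The reporting rules above can be sketched in a few lines of Python. This is only an illustration: the helper name `round_sig` and the sample numbers are hypothetical, and "round even" is assumed to mean that an exact tie (a trailing 5) rounds to the nearest even digit. Values are passed as strings so that the decimal digits are exactly the ones written down.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_sig(value, sig_figs):
    """Round a decimal string (or Decimal) to sig_figs significant figures.

    Ties are rounded to the even digit ("round even").
    """
    d = Decimal(value)
    if d == 0:
        return d
    # adjusted() is the power of ten of the most significant digit
    exponent = d.adjusted() - sig_figs + 1
    return d.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_EVEN)


# Round even: a trailing 5 goes to the nearest even digit
print(round_sig("12.345", 4))  # 12.34
print(round_sig("12.355", 4))  # 12.36

# Multiplication: 4.52 (3 sig. figs) x 1.2 (2 sig. figs) -> report 2 sig. figs
product = Decimal("4.52") * Decimal("1.2")
print(round_sig(product, 2))  # 5.4

# Addition: 12.11 (2 decimal places) + 0.3 (1 decimal place) -> 1 decimal place
total = Decimal("12.11") + Decimal("0.3")
print(total.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN))  # 12.4
```

Using `Decimal` rather than binary floats avoids surprises such as 12.345 being stored as a value slightly below or above the tie.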
Accuracy and Precision
- Accuracy is how close a measurement is to the true (accepted) value
- The true value is evaluated by analyzing known standard samples
- Precision is how close replicate measurements on the same sample are to each other
- Precision is required for accuracy but does not guarantee accuracy
- Results should be accurate and precise (reproducible, reliable, truly representative of the sample)

Errors
- Two principal types of errors
- Determinate (systematic) and indeterminate (random)

Determinate (Systematic) Errors
- Caused by faults in procedure or instrument
- The fault can be determined and corrected
- Result in good precision but poor accuracy
Examples:
- constant (incorrect calibration of pH meter or mass balance)
- variable (change in volume due to temperature changes)
- additive or multiplicative

Examples of Determinate (Systematic) Errors
- Improperly calibrated volumetric flasks and pipettes
- Analyst error (misreading or inexperience)
- Incorrect technique
- Malfunctioning instrument (voltage fluctuations, alignment, etc.)
- Contaminated or impure or decomposed reagents
- Interferences

To Identify Determinate (Systematic) Errors
- Use standard methods with known accuracy and precision to analyze samples
- Run several analyses of a reference analyte whose concentration is known and accepted
- Run Standard Operating Procedures (SOPs)

Indeterminate (Random) Errors
- Sources cannot be identified, avoided, or corrected
- Not constant or biased in one direction
Examples
- Limitations of reading mass balances
- Electrical noise in instruments

- Random errors are always associated with measurements
- No conclusion can be drawn with complete certainty
- Scientists use statistics to accept conclusions that have a high probability of being correct and to reject conclusions that have a low probability of being correct

Efforts to Eliminate Errors
- Random errors follow a random distribution and are analyzed using the laws of probability (and statistics)
- Statistics deals with only random errors
- Systematic errors should be detected and eliminated

Review
Please see calculation examples in text.

Statistical Operations

Sample MEAN
- Arithmetic mean of a finite number of observations
- Also known as the average
- Is the sum of the measured values divided by the number of measurements

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N}$$

∑x_i = sum of all individual measurements x_i
x_i = a measured value
N = number of observations

Population MEAN (µ)
- The limit of the sample mean as N approaches infinity

$$\mu = \lim_{N \to \infty} \frac{\sum_{i=1}^{N} x_i}{N}$$

Population Mean versus Sample Mean
The sample mean is the mean of the sample values collected. The population mean is the mean of all the values in the population. If the sample is random and the sample size is large, then the sample mean is a good estimate of the population mean.
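The relationship between the sample mean and the population mean can be illustrated with a small simulation. This is a sketch under stated assumptions: the "population" is a hypothetical Gaussian with µ = 10.0 and σ = 0.5 (values chosen only for illustration), and `sample_mean` is a helper name introduced here, implementing the sum-divided-by-N definition above.

```python
import random


def sample_mean(values):
    """Sum of the measured values divided by the number of measurements N."""
    return sum(values) / len(values)


# Hypothetical population parameters, chosen only for illustration
MU, SIGMA = 10.0, 0.5
rng = random.Random(42)  # fixed seed so the run is repeatable

means_by_n = {}
for n in (3, 30, 30000):
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    means_by_n[n] = sample_mean(sample)
    print(f"N = {n:>5}: sample mean = {means_by_n[n]:.3f}")
```

With only a few replicates the sample mean can sit noticeably away from µ; with a large random sample it is expected to settle close to µ, which is the sense in which the population mean is the limit of the sample mean.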
Quantifying Random Error

Median
- The middle number in a series of measurements arranged in increasing order
- The average of the two middle numbers if the number of measurements is even
Mode
- The value that occurs the most frequently
Range
- The difference between the highest and the lowest values

Error
Error (E): the difference between the true value T and either an individual measurement x_i or the mean
$$E = x_i - T \quad \text{or} \quad E = \bar{x} - T$$
Absolute error: the absolute value of E
$$E_{abs} = |x_i - T| \quad \text{or} \quad E_{abs} = |\bar{x} - T|$$
Total error = sum of all systematic and random errors
Relative error = absolute error divided by the true value
$$E_{rel} = \frac{E_{abs}}{T} \qquad \%E_{rel} = \frac{E_{abs}}{T} \times 100\%$$

Standard Deviation
Absolute deviation
$$d_i = |x_i - \bar{x}|$$
Relative deviation (D) = absolute deviation divided by the mean
$$D = \frac{d_i}{\bar{x}}$$
Percent relative deviation [D(%)]
$$D(\%) = \frac{d_i}{\bar{x}} \times 100\% = D \times 100\%$$

Sample Standard Deviation (s)
- A measure of the width of the distribution
- A small standard deviation gives a narrow distribution curve
For a finite number of observations, N
$$s = \sqrt{\frac{\sum_{i=1}^{N} d_i^2}{N - 1}} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$
x_i = a measured value
N = number of observations
N - 1 = degrees of freedom

Standard Deviation of the Mean (s_m)
- Standard deviation associated with the mean of N measurements
$$s_m = \frac{s}{\sqrt{N}}$$

Population Standard Deviation (σ)
- For an infinite number of measurements
$$\sigma = \lim_{N \to \infty} \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Percent Relative Standard Deviation (%RSD)
$$\%RSD = \frac{s}{\bar{x}} \times 100$$

Variance
- Is the square of the standard deviation
- Variance = σ² or s²
- Is a measure of precision
- Variance is additive but standard deviation is not additive
- Total variance is the sum of independent variances

Quantifying Random Error (using Gaussian or Bell curve distribution)
- The Gaussian distribution and statistics are used to determine how close the average value of measurements is to the true value
- The Gaussian distribution assumes an infinite number of measurements
- As N increases, $\bar{x} - \mu$ approaches zero
- $\bar{x} \approx \mu$ for N > 20
- Random error = $\bar{x} - \mu$
- The standard deviation coincides with the point of inflection of
the curve (2 inflection points since the curve is symmetrical)

Quantifying Random Error (Standard Bell Curve Method)
Population mean (µ) = true value (T or x_t)
[Figure: Gaussian curve, f(x) versus x, centered at x = µ, with the points of inflection at µ ± σ and the x-axis marked at -3σ, -2σ, -σ, µ, σ, 2σ, 3σ]

Probability
- Range of measurements for an ideal Gaussian distribution
- The percentage of measurements lying within the given range (one, two, or three standard deviations on either side of the mean)

Range        Gaussian distribution (%)
µ ± 1σ       68.3
µ ± 2σ       95.5
µ ± 3σ       99.7

- The average measurement is reported as: mean ± standard deviation
- Mean and standard deviation should have the same number of decimal places
In the absence of determinate error and if N > 20
- 68.3% of measurements of x_i will fall within x = µ ± σ (68.3% of the area under the curve lies in this range of x)
- 95.5% of measurements of x_i will fall within x = µ ± 2σ
- 99.7% of measurements of x_i will fall within x = µ ± 3σ

[Figures: the same Gaussian curve, f(x) versus x, with the area within x = µ ± σ (68.3%), x = µ ± 2σ (95.5%), and x = µ ± 3σ (99.7%) marked; each percentage is known as the confidence level (CL)]

CONFIDENCE LIMITS
- Refers to the extremes of the confidence interval (the range)
- Range of values within which there is a specified probability of finding the true mean (µ) at a given CL
- CL is an indicator of how close the sample mean lies to the population mean
$$\mu = \bar{x} \pm z\sigma$$
If z = 1 we are 68.3% confident that $\bar{x}$ lies within ±σ of the true value
If z = 2 we are 95.5% confident that $\bar{x}$ lies within ±2σ of the true value
If z = 3 we are 99.7% confident that $\bar{x}$ lies within ±3σ of the true value

CONTROL CHARTS ASSESS DATA INTEGRITY
[Figure source: https://www.qimacros.com/free-excel-tips/control-chart-limits/]
[Figure source: https://www.flickr.com/photos/93642218@N07/8637804092]

Tests to
further evaluate statistical data
1. F-test
2. t-test
3. Q-test

CONFIDENCE LIMITS
- For N measurements, the CL for µ is
$$\mu = \bar{x} \pm z s_m$$
- s is not a good estimate of σ when insufficient replicates are made
- The Student's t-test is used to express the CL
- The t-test is also used to compare results from different experiments
$$t = \frac{x - \mu}{s} \quad \text{(for a single measurement } x\text{)}$$
For the mean of N measurements:
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$$
- That is, the confidence interval extends from ts/√N below the mean to ts/√N above the mean
- For better precision, reduce the confidence interval by increasing the number of measurements

Example: A soda ash sample is analyzed in the analytical chemistry laboratory by titration with standard hydrochloric acid. The analysis is performed in triplicate with the following results: 43.51, 43.58, and 43.43% Na2CO3. Within what range are you 95% confident that the true value lies?
$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$$
From the t-table: $\bar{x}$ = 43.51, s = 0.0751 (N - 1 formula), DOF = 2, so using a two-tailed test t = 4.303 at 95% CL. This gives µ = 43.51 ± (4.303)(0.0751)/√3 = 43.51 ± 0.19% Na2CO3.

Using the Null Hypothesis to evaluate experimental results
A hypothesis is proposed for the statistical relationship between the two data sets: the null hypothesis states that there is no significant difference between them.
If the null hypothesis is accepted, the difference is attributed to random error.
If the null hypothesis is rejected, the difference is attributed to systematic error.
The null hypothesis is evaluated with three statistical tests:
1. F-test
2. t-test
3. Q-test

t-Test
To test for comparison of means. Typically there are three different ways or scenarios.
- Calculate the standard deviation or the pooled standard deviation (s_pooled), depending on the test
- Calculate t
- Compare the calculated t to the value of t from the table
- The two results are significantly different if the calculated t is greater than the tabulated t at a particular confidence level (that is, t_cal > t_tab at the CL chosen)

t-Test Scenario 1. t test when an accepted (true) value is known.
- Testing of your sample mean against the population or accepted mean (true value).
$$t = \frac{|\bar{x} - \mu|}{s}\sqrt{N}$$
- A known valid method is used to determine µ for a known sample.
- The mean and standard deviation of your test are then determined.
- The t value is calculated for a given CL
t_cal > t_tab for the given CL: null hypothesis REJECTED
- Systematic error exists in your method (the new method).
- Systematic errors are involved in your results.
- GROSS ERROR EXISTS IN DATA SET.
t_cal < t_tab for the given CL: null hypothesis ACCEPTED
- The difference between the sample mean and the population mean is purely RANDOM and can be ignored.
- GROSS ERROR NOT MANIFESTED IN DATA SET.

t-Test Scenario 2. t test for two sets of data with
- N_1 and N_2 measurements
- averages of $\bar{x}_1$ and $\bar{x}_2$
- standard deviations of s_1 and s_2
$$s_{pooled} = \sqrt{\frac{s_1^2 (N_1 - 1) + s_2^2 (N_2 - 1)}{N_1 + N_2 - 2}}$$
$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$
Degrees of freedom = N_1 + N_2 - 2

t-Test Scenario 3. Paired t test
- Testing of your sample mean against an accepted mean (from another method) by analyzing several different samples of slightly varying composition (within physiological range).
$$t = \frac{|\bar{d}|}{s_d}\sqrt{N} \qquad s_d = \sqrt{\frac{\sum_{i=1}^{N} (d_i - \bar{d})^2}{N - 1}}$$
- Where d_i is the individual difference between the two methods for each sample, with regard to sign, and $\bar{d}$ is the mean of all the individual differences.
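The confidence-limit formula and the first two t-test scenarios can be sketched in Python. This is a minimal illustration, not a replacement for the worked examples in the text: the function names are hypothetical, the tabulated t (4.303 for 2 degrees of freedom at 95% CL) is taken from the soda ash example rather than computed, and s is the sample standard deviation with N - 1 in the denominator.

```python
import math
import statistics


def confidence_limits(values, t_tab):
    """Return (mean, s, half_width) where mu = mean +/- t*s/sqrt(N)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1)
    return mean, s, t_tab * s / math.sqrt(n)


def t_vs_accepted(values, mu):
    """Scenario 1: t = |mean - mu| * sqrt(N) / s against an accepted value."""
    n = len(values)
    return abs(statistics.mean(values) - mu) * math.sqrt(n) / statistics.stdev(values)


def t_pooled(a, b):
    """Scenario 2: pooled-standard-deviation t for two data sets.

    Returns (t_cal, degrees_of_freedom).
    """
    n1, n2 = len(a), len(b)
    s1, s2 = statistics.stdev(a), statistics.stdev(b)
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
    t_cal = abs(statistics.mean(a) - statistics.mean(b)) / s_pooled
    t_cal *= math.sqrt(n1 * n2 / (n1 + n2))
    return t_cal, n1 + n2 - 2


# Soda ash triplicate from the example (% Na2CO3)
soda_ash = [43.51, 43.58, 43.43]
mean, s, half_width = confidence_limits(soda_ash, t_tab=4.303)
print(f"mean = {mean:.2f}, s = {s:.4f}")
print(f"95% CL: {mean:.2f} +/- {half_width:.2f} % Na2CO3")
```

In either t-test function, the returned t_cal is then compared by hand with t_tab at the chosen CL: t_cal > t_tab rejects the null hypothesis.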
F-TEST
- Used to compare two methods (method 1 and method 2)
- Determines if the two methods are statistically different in terms of precision
- The two variances (σ_1² and σ_2²) are compared
F-function = the ratio of the variances of the two sets of numbers
$$F = \frac{\sigma_1^2}{\sigma_2^2}$$
- The ratio should be greater than 1 (i.e. σ_1² > σ_2²)
- F values are found in tables (make use of two degrees of freedom)
F_cal > F_tab implies there is a significant difference between the two methods
F_cal = calculated F value
F_tab = tabulated F value

REJECTION OF RESULTS
Outlier
- A replicate result that is out of line
- A result that is far from the other results
- Is either the highest value or the lowest value in a set of data
- There should be a justification for discarding the outlier
- The outlier is rejected if it lies more than 4σ from the mean
- The outlier is not included in calculating the mean and standard deviation
- A new σ that includes the outlier should be calculated if it lies within 4σ of the mean

Q-Test
- Used for small data sets
- Arrange data in increasing order
- Calculate range = highest value - lowest value
- Calculate gap = |suspected value - nearest value|
- Calculate Q ratio = gap/range
- Reject outlier if Q_cal > Q_tab
- Q tables are available

Performing The Experiment
Two types of Analytical Methods
1. Classical
2.
Non-classical or Instrumental

PERFORMING THE EXPERIMENT
Detector
- Records the signal (change in the system that is related to the magnitude of the physical parameter being measured)
- Can measure physical, chemical or electrical changes
Transducer (Sensor)
- Detector that converts nonelectrical signals to electrical signals and vice versa

Signals and Noise
- A detector makes measurements and the detector response is converted to an electrical signal
- The electrical signal is related to the chemical or physical property being measured, which is related to the amount of analyte
- There should be no signal when no analyte is present
- Signals should be smooth but in practice are not smooth due to noise
Noise can originate from
- Power fluctuations
- Radio stations
- Electrical motors
- Building vibrations
- Other instruments nearby

- Signal-to-noise ratio (S/N) is a useful tool for comparing methods or instruments
- Noise is random and can be treated statistically
- Signal can be defined as the average value of measurements
- Noise can be defined as the standard deviation
$$\frac{S}{N} = \frac{\bar{x}}{s} = \frac{\text{mean}}{\text{standard deviation}}$$

Types of Noise
1. White Noise
- Two types
Thermal Noise
- Due to random motions of charge carriers (electrons) which result in voltage fluctuations
Shot Noise
- When charge carriers cross a junction in an electrical circuit
2. Drift (Flicker) Noise (origin is not well understood)
3.
Noise due to surroundings (vibrations)

Improving S/N
- Signal is enhanced or noise is reduced for better results
- Hardware and software approaches are available to improve S/N
- Another approach is the use of Fourier Transform (FT) or Fast Fourier Transform (FFT), which discriminates signals from noise (FT-IR, FT-NMR, FT-MS)

CALIBRATION CURVES
Calibration
- The process of establishing the relationship between the measured signals and known concentrations of analyte
- Calibration standards: known concentrations of analyte
- Calibration standards at different concentrations are prepared and measured
- Magnitudes of signals are plotted against concentration
- An equation relating signal and concentration is obtained and can be used to determine the concentration of unknown analyte after measuring its signal

- Many calibration curves have a linear range with the relation equation in the form y = mx + b
- The method of least squares or a spreadsheet may be used
- m is the slope and b is the vertical (signal) intercept
- The slope is usually the sensitivity of the analytical method
- R = correlation coefficient (R² is between 0 and 1)
- The closer R² is to 1, the better the fit of the data (R² = 1 is a perfect fit)

BEST STRAIGHT LINE (METHOD OF LEAST SQUARES)
The equation of a straight line
y = mx + b
m is the slope (Δy/Δx)
b is the y-intercept (where the line crosses the y-axis)

The method of least squares
- finds the best straight line
- adjusts the line to minimize the vertical deviations
Only vertical deviations are adjusted because
- experimental uncertainties in y values are greater than those in x values
- calculations for minimizing vertical deviations are easier

Worksheet columns for the calculation:
x_i        y_i        x_i y_i        x_i²
∑x_i =     ∑y_i =     ∑(x_i y_i) =   ∑(x_i²) =

$$m = \frac{N \sum (x_i y_i) - \sum x_i \sum y_i}{D}$$
$$b = \frac{\sum (x_i^2) \sum y_i - \sum (x_i y_i) \sum x_i}{D}$$
$$D = N \sum (x_i^2) - \left(\sum x_i\right)^2$$
- N is the number of data points
Knowing m and b, the equation of the best straight line
can be determined and the best straight line can be constructed

ASSESSING THE DATA
A good analytical method should be
- both accurate and precise
- reliable and robust
Working range
- It is not good practice to extrapolate above the highest standard or below the lowest standard
- These regions may not be in the linear range
- Dilute higher concentrations and concentrate lower concentrations of analyte to bring them into the working range

Limit of Detection (LOD)
- The lowest concentration of an analyte that can be detected
- Decreasing the concentration of analyte decreases the signal relative to the noise
- At some point the signal can no longer be distinguished from noise
- LOD does not necessarily mean the concentration can be measured and quantified
- Can be considered to be the concentration of analyte that gives a signal equal to 2 or 3 times the standard deviation of the blank
- Concentration at which S/N = 2 at 95% CL or S/N = 3 at 99% CL
- 3σ is more common and is used by regulatory methods (e.g. EPA)

Limit of Quantification (LOQ)
- The lowest concentration of an analyte in a sample that can be determined quantitatively with a given accuracy and precision
- Precision is poor at or near the LOD
- LOQ is higher than LOD and has better precision
- LOQ is the concentration equivalent to S/N = 10/1
- LOQ is also defined as 10 × σ_blank
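The least-squares equations and the LOD/LOQ definitions above can be tied together in a short Python sketch. Everything numeric here is hypothetical: the calibration standards, the blank standard deviation, and the unknown's signal are invented for illustration. One assumption is labeled in the code: the 3σ and 10σ signal criteria are converted to concentration units by dividing by the slope m, which is a common convention but is not stated explicitly in these notes.

```python
def least_squares(x, y):
    """Slope m and intercept b of the best straight line y = m*x + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    d = n * sum_xx - sum_x**2
    m = (n * sum_xy - sum_x * sum_y) / d
    b = (sum_xx * sum_y - sum_xy * sum_x) / d
    return m, b


def concentration(signal, m, b):
    """Invert the calibration line to get concentration from a signal."""
    return (signal - b) / m


# Hypothetical calibration standards (concentration, signal)
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 4.1, 6.1, 8.1]
m, b = least_squares(conc, signal)

# Hypothetical standard deviation of replicate blank signals
sigma_blank = 0.02

# Assumption: signal criteria converted to concentration via the slope
lod = 3 * sigma_blank / m
loq = 10 * sigma_blank / m

unknown = concentration(5.3, m, b)  # 5.3 is a hypothetical measured signal
print(f"m = {m:.3f}, b = {b:.3f}")
print(f"LOD = {lod:.3f}, LOQ = {loq:.3f}, unknown = {unknown:.2f}")
```

The unknown's signal in this sketch falls between the lowest and highest standards, in line with the advice not to extrapolate outside the working range.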