Chemometrics
• "Chemometrics has been defined as the application of mathematical and statistical methods to chemical measurements." B. Kowalski, Anal. Chem. 1980, 52, 112R-122R.
• "Chemometrics is the chemical discipline that uses mathematical and statistical methods for the obtention in the optimal way of relevant information on material systems." I. Frank and B. Kowalski, Anal. Chem. 1982, 54, 232R-243R.
• "Chemometrics developments and the accompanying realization of these developments as computer software provide the means to convert raw data into information, information into knowledge and finally knowledge into intelligence." M. Delaney, Anal. Chem. 1984, 56, 261R-277R.
• "...research in chemometrics will contribute to the design of new types of instruments, generate optimal experiments that yield maximum information, and catalog and solve calibration and signal resolution problems. All this while quantitatively specifying the limitations of each instrument as well as the quality of the data it generates." L. S. Ramos et al., Anal. Chem. 1986, 58, 294R-315R.
• "Chemometrics, the application of statistical and mathematical methods to chemistry..." S. Brown, Anal. Chem. 1988, 60, 252R-273R.
• "Chemometrics is the discipline concerned with the application of statistics and mathematical methods, as well as those methods based on mathematical logic, to chemistry." S. Brown, Anal. Chem. 1990, 62, 84R-101R.
• "Chemometrics is the use of mathematical and statistical methods for handling, interpreting, and predicting chemical data." Malinowski, E. R. (1991) Factor Analysis in Chemistry, Second Edition, page 1.
• "Chemometrics is the discipline concerned with the application of statistical and mathematical methods, as well as those methods based on mathematical logic, to chemistry." S. Brown et al., Anal. Chem. 1992, 64, 22R-49R.
• "Chemometrics can generally be described as the application of mathematical and statistical methods to 1) improve chemical measurement processes, and 2) extract more useful information from chemical and physical measurement data." J. Workman, P. Mobley, B. Kowalski, R. Bro, Appl. Spectrosc. Revs. 1996, 31, 73-124.
• "Chemometrics is an approach to analytical and measurement science based on the idea of indirect observation. Measurements related to the chemical composition of a substance are taken, and the value of a property of interest is inferred from them through some mathematical relation." B. Lavine, Anal. Chem. 1998, 70, 209R-228R.
• "Chemometrics is a chemical discipline that uses mathematics, statistics and formal logic (a) to design or select optimal experimental procedures; (b) to provide maximum relevant chemical information by analyzing chemical data; and (c) to obtain knowledge about chemical systems." Massart, D. L., et al. (1997) Data Handling in Science and Technology 20A: Handbook of Chemometrics and Qualimetrics Part A, page 1.
• "The entire process whereby data (e.g., numbers in a table) are transformed into information used for decision making." Beebe, K. R., Pell, R. J., and M. B. Seasholtz. (1998) Chemometrics: A Practical Guide, page 1.
• "Chemometrics (this is an international definition) is the chemical discipline that uses mathematical and statistical methods, (a) to design or select optimal measurement procedures and experiments; and (b) to provide maximum chemical information by analyzing chemical data." Bruce Kowalski, in a formal CPAC presentation, December 1997
CHEMOMETRICS IS NOT A UNITARY SUBJECT LIKE ORGANIC CHEMISTRY
ORGANIC CHEMISTRY IS BASICALLY A KNOWLEDGE BASED SUBJECT – certain basic skills and then increase the knowledge.
CHEMOMETRICS IS MORE A SKILL-BASED SUBJECT – it is not necessary to have a huge knowledge of named methods; there are very few basic principles, but one must have hands-on experience to expand one's problem-solving ability.
DIFFERENT GROUPS HAVE DIFFERENT BACKGROUNDS AND EXPECTATIONS AS TO HOW CHEMOMETRICS SHOULD BE INTRODUCED
Statisticians want to start with distributions, hypothesis tests etc. and build up from there. They are dissatisfied if the maths is not explained.
Chemical engineers like to start with linear algebra such as matrices, and expect a mathematical approach but are not always so interested in distributions etc.
Computer scientists are often most interested in algorithms.
Analytical chemists often know a little statistics but are not necessarily very confident in maths and algorithms, so they like to approach the subject via statistical analytical chemistry. This is a difficult group because the ability to run instruments is not necessarily an ability in maths and computing.
Organic chemists do not like maths and want automated packages they can use. They often require elaborate courses that avoid matrices.
The course an organic chemist would regard as good is one a statistician would regard as bad.
Errors in quantitative analysis
• No quantitative results are of any value unless they are accompanied by some estimate of the errors inherent in them
• 24.69 24.73 24.77 25.39 (outlier) X error%
Types of errors
• Based on laboratory measurements:
– Instrumental
– Methodology
– Theoretical
– Data treatment
• Based on their effect on the evaluation of the result:
– Systematic (mostly instrumental)
– Random
– Personal
– Gross
• Random errors cause replicate results to differ from one another so that the individual results fall on both sides of the average value even when all other errors are allowed for.
– The deviation from each individual cause is slight; otherwise it would have been investigated
– The total effect of all the causes can yield a significant deviation
• Systematic errors cause all the results to be in error in the same sense
– Instrumental errors are the most important
– Insufficient chemical purity
– Imperfect calibration and standardization
– Bias of the measurement is the total systematic error (some sources cause positive and others negative errors)
• Personal errors: the results depend to some extent on the physical peculiarities of the observer (under otherwise equal conditions). These can be both systematic and random.
• Gross errors: errors that are so serious that there is no real alternative to abandoning the experiment and making a completely fresh start (external influences that cause completely inaccurate results, such as reading 20.0 and writing 30.0).
Absolute and relative errors
• Absolute error: $\Delta = x_i - x$
• Relative error: $\delta = \frac{\Delta}{x} \cdot 100\ [\%]$
• Reduced relative error: $\delta_R = \frac{\Delta}{R} \cdot 100 = \frac{\Delta}{x_{\max} - x_{\min}} \cdot 100\ [\%]$
Here $x_i$ is the measured value, $x$ the true (accepted) value, and $R = x_{\max} - x_{\min}$ the measuring range.
• Accuracy (according to ISO, the International Organization for Standardization): the closeness of agreement between a test result and the accepted reference value of the analyte
• Precision = reproducibility and repeatability
• Precision describes random error, bias describes systematic error, and accuracy incorporates both types of error.
• Repeatability: within-run precision
• Reproducibility: between-run precision
Random and systematic errors in titrimetric analysis
A titrimetric analysis involves about 10 separate steps:
1. Making up a standard solution of one of the reactants. This involves (a) weighing a weighing bottle or similar vessel containing some solid material, (b) transferring the solid material to a standard flask and weighing the bottle again to obtain by subtraction the weight of solid transferred (weighing by difference), and (c) filling the flask up to the mark with water (assuming that an aqueous titration is to be used).
2. Transferring an aliquot of the standard material to a titration flask with the aid of a pipette. This involves (a) filling the pipette to the appropriate mark, and (b) draining it in a specified manner into the titration flask.
3. Titrating the liquid in the flask with a solution of the other reactant, added from a burette. This involves (a) filling the burette and allowing the liquid in it to drain until the meniscus is at a constant level, (b) adding a few drops of indicator solution to the titration flask, (c) reading the initial burette volume, (d) adding liquid to the titration flask from the burette a little at a time until the end-point is adjudged to have been reached, and (e) measuring the final level of liquid in the burette.
• In principle, we should examine each step to evaluate the random and systematic errors that might occur.
• Amongst the contributions to the errors are the tolerances of the weights used in the gravimetric steps, and of the volumetric glassware.
• Standard specifications for these tolerances are issued by such bodies as the British Standards Institution (BSI) and the American Society for Testing and Materials (ASTM).
• The tolerance for a grade A 250-ml standard flask is ±0.12 ml; grade B glassware generally has tolerances twice as large as grade A glassware.
Handling systematic errors
• Much of the remaining material deals with the evaluation of random errors, which can be studied by a wide range of statistical methods.
• In most cases we shall assume for convenience that systematic errors are absent.
• Many determinations have been made of the levels of (for example) chromium in serum
– Different workers, all studying pooled serum samples from healthy subjects, have obtained chromium concentrations varying from < 1 to ca. 200 ng/ml.
In general the lower results have been obtained more recently, and it has gradually become apparent that the earlier, higher values were due at least in part to contamination of the samples by chromium from stainless-steel syringes, tube caps, and so on.
• Methodological systematic errors of this kind are extremely common, for example incomplete washing of a precipitate in gravimetric analysis, and the indicator error in volumetric analysis.
• Another class of systematic error that occurs widely arises when false assumptions are made about the accuracy of an analytical instrument.
• Experienced analysts know only too well that the monochromators in spectrometers gradually go out of adjustment, so that errors of several nanometres in wavelength settings are not uncommon, yet many photometric analyses are undertaken without appropriate checks being made.
• Very simple devices such as volumetric glassware, stop-watches, pH meters and thermometers can all show substantial systematic errors, but many laboratory workers regularly use these instruments as though they are always completely without bias.
• Instruments controlled by microprocessors or microcomputers have reduced to a minimum the number of operations and the level of skill required of their operators. Yet such instruments are still subject to systematic errors.
• Systematic errors arise not only from procedures or apparatus; they can also arise from human bias.
– Some chemists suffer from astigmatism or colorblindness (the latter is more common amongst men than women) which might introduce errors into their readings of instruments and other observations.
– A number of authors have reported various types of number bias, for example a tendency to favour even over odd numbers, or 0 and 5 over other digits, in the reporting of results.
Approaches to avoid systematic errors
• The analyst should be vigilant concerning the instruments' functions, calibrations, analytical procedures, and so on.
• The design of the experiment should be handled carefully at every stage.
– Weighing by difference can remove some systematic gravimetric errors.
– If the concentration of a sample of a single material is to be determined by absorption spectrometry, two procedures are possible. In the first, the sample is studied in a 1-cm pathlength spectrometer cell at a single wavelength, say 400 nm, and the concentration of the test component is determined from the relation $A = \varepsilon b c$.
– Several systematic errors can arise here. The wavelength might be (say) 405 nm rather than 400 nm, thus rendering the reference value of $\varepsilon$ inappropriate; this reference value might in any case be wrong; the absorbance scale of the spectrometer might exhibit a systematic error; and the pathlength of the cell might not be exactly 1 cm. Alternatively, the analyst might take a series of solutions of the test substance of known concentration, and measure the absorbance of each at 400 nm.
Planning and design of experiments
• Statistical tests are used not only to assess the results of completed experiments; they are also crucial in the planning and design of experiments.
• In practice, the overall error is often dominated by the error in just one stage of the experiment, other errors having negligible effects when all the errors are combined correctly. It is obviously desirable to try to identify, before the experiment begins, where this single dominant error is likely to arise, and then to try to minimize it.
• Although random errors can never be eliminated, they can certainly be minimized by particular attention to experimental techniques: improving the precision of a spectrometric experiment by using a constant temperature sample cell would be a simple instance of such a precaution.
• Sometimes many experimental parameters must be taken into consideration, such as sensitivity, selectivity, sampling rate, and cost, so the experiment should be designed to optimize all of them.
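The remark above that the overall error is often dominated by a single stage can be illustrated with a short calculation. The sketch below assumes that the stages contribute independent random errors whose relative standard deviations combine in quadrature (square root of the sum of squares). The stage names and percentage values are hypothetical, chosen only for illustration, and are not taken from a real analysis.

```python
import math


def combine_relative_errors(rel_sds):
    """Combine independent relative standard deviations in quadrature."""
    return math.sqrt(sum(s ** 2 for s in rel_sds))


# Hypothetical relative standard deviations (%) for three stages of an analysis.
stages = {"weighing": 0.02, "pipetting": 0.05, "burette reading": 0.50}

total = combine_relative_errors(stages.values())

for name, s in stages.items():
    # Fraction of the combined variance contributed by this stage.
    share = 100 * s ** 2 / total ** 2
    print(f"{name:16s} RSD = {s:.2f}%  share of variance = {share:.1f}%")

print(f"combined RSD = {total:.3f}%")
```

With these assumed values the largest contribution accounts for nearly all of the combined variance, so reducing the two smaller contributions would barely change the overall error.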
Calculators and computers in statistical calculations
• The rapid growth of chemometrics is due to the ease with which large quantities of data can be handled, and complex calculations done, with calculators and computers.
• Personal computers (PCs) are now found in all chemical laboratories. Most modern instruments are controlled by PCs, which also handle and report the analytical data obtained.
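As a small illustration of the routine calculations a PC takes over, the sketch below summarizes the four replicate results quoted earlier (24.69, 24.73, 24.77, 25.39) using Python's standard statistics module. It reports the mean, sample standard deviation and relative standard deviation with and without the value flagged as an outlier. The function name summarize is an illustrative choice, and dropping the last value is shown only to compare the two summaries, not as a recommended outlier test.

```python
import statistics

# Replicate results quoted earlier; the last value was flagged as an outlier.
replicates = [24.69, 24.73, 24.77, 25.39]


def summarize(values):
    """Return mean, sample standard deviation (n - 1) and relative SD in %."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    rsd = 100 * sd / mean
    return mean, sd, rsd


all_mean, all_sd, all_rsd = summarize(replicates)
trim_mean, trim_sd, trim_rsd = summarize(replicates[:-1])

print(f"all four values : mean = {all_mean:.3f}, s = {all_sd:.3f}, RSD = {all_rsd:.2f}%")
print(f"without 25.39   : mean = {trim_mean:.3f}, s = {trim_sd:.3f}, RSD = {trim_rsd:.2f}%")
```

Including the suspect value shifts the mean from about 24.73 to about 24.90 and increases the standard deviation from about 0.04 to about 0.33, which shows how strongly a single result can affect a small set of replicates.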