7th Iranian Workshop on Chemometrics, 3-5 February 2008
Initial estimates for MCR-ALS method: EFA and SIMPLISMA
Bahram Hemmateenejad
Chemistry Department, Shiraz University, Shiraz, Iran
E-mail: hemmatb@sums.ac.ir

Chemical modeling
• Fitting data to model (hard model)
• Fitting model to data (soft model)

Multicomponent Curve Resolution
• Goal: Given an I x J data matrix, D, of N species, determine N and the pure spectrum of each species.
• Model: $D_{I \times J} = C_{I \times N} S_{N \times J}$
• Common assumptions:
– Non-negative spectra and concentrations
– Unimodal concentrations
– Kinetic profiles
[Figure: D (I x J, samples x sensors) = C (I x N) S (N x J)]

Basic Principles of MCR methods
PCA: D = TP
Beer-Lambert: D = CS
In MCR we want to get from the PCA solution to the Beer-Lambert solution
• $D = TP = TRR^{-1}P$, R: rotation matrix
• $D = (TR)(R^{-1}P)$
• $C = TR$, $S = R^{-1}P$
• The critical step is the calculation of R

Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS)
• Developed by R. Tauler and A. de Juan
• Fully soft modeling method
• Chemical and physical constraints
• Data augmentation
• Combined hard model
• Tauler R, Kowalski B, Fleming S, ANALYTICAL CHEMISTRY 65 (15): 2040-2047, 1993.
• de Juan A, Tauler R, CRITICAL REVIEWS IN ANALYTICAL CHEMISTRY 36 (3-4): 163-176, 2006.

MCR-ALS Theory
• Widely applied to spectroscopic methods
– UV/Vis absorbance spectra
– UV/Vis luminescence spectra
– Vibrational spectra
– NMR spectra
– Circular dichroism
– …
• Electrochemical data are also analyzed

MCR-ALS Theory
• In the case of spectroscopic data
• Beer-Lambert law for a mixture: D = CS
• D (m x n): absorbance data of k absorbing species
• C (m x k): concentration profiles
• S (k x n): pure spectra

MCR-ALS Theory
• Initial estimate of C or S
• Evolving Factor Analysis (EFA) → C
• Simple-to-use Interactive Self-Modeling Mixture Analysis (SIMPLISMA) → S

MCR-ALS Theory
• Assume an initial estimate of C ($C_{int}$) is available
1. Determination of the chemical rank
2. Least-squares solution for S: $S = C_{int}^{+} D$
3. Least-squares solution for C: $C = D S^{+}$
4. Reproduction of the data, $D_c$: $D_c = CS$
5. Calculation of the lack-of-fit error (LOF), then go to step 2

Constraints in MCR-ALS
• Non-negativity (non-negative concentrations and absorbances)
• Unimodality (unimodal concentration profiles). It is rarely applied to pure spectra
• Closure (the law of mass conservation, or mass balance equation, for a closed system)
• Selectivity in concentration profiles (if some selective zones are available)
• Selectivity in pure spectra (if the pure spectra of a chemical species, i.e. reactant or product, are known)

Constraints in MCR-ALS
• Peak shape constraint
• Hard model constraint (combined hard model MCR-ALS)

• Rotational ambiguity
• Rank deficiency

Evolving Factor Analysis (EFA)
• Gives a rough estimate of concentration profiles
• Repeated factor analysis on evolving submatrices
• Gampp H, Maeder M, Meyer CJ, Zuberbuhler AD, CHIMIA 39 (10): 315-317, 1985
• Maeder M, Zuberbuhler AD, ANALYTICA CHIMICA ACTA 181: 287-291, 1986
• Gampp H, Maeder M, Meyer CJ, Zuberbuhler AD, TALANTA 33 (12): 943-951, 1986

Basic EFA Example
Calculate forward singular values
[Figure: 1st (solid), 2nd (dashed) and 3rd (dotted) forward singular values from SVD of the evolving submatrix; x-axis: samples]

Basic EFA Example
Calculate backward singular values
[Figure: 1st (solid), 2nd (dashed) and 3rd (dotted) backward singular values from SVD of the evolving submatrix; x-axis: samples]

Basic EFA
• Use the 'forward' and 'backward' singular values to estimate initial concentration profiles
• The area under both the nth forward and the (K-n+1)th backward singular value curves is the estimate for the initial concentration profile of the nth component.
[Figure: forward and backward singular value curves; x-axis: samples]

Basic EFA
• First estimated concentration profile: area under the 1st forward and 3rd backward singular value plots.
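As a rough illustration of the forward/backward EFA procedure described on these slides, the sketch below computes both sets of singular values and combines them into initial concentration profiles. It is not the m-file from the MCR-ALS home page. The function name `efa_initial_profiles`, the Gaussian spectra and the time axis are assumptions made for the demo, and "area under both curves" is interpreted here as the pointwise minimum of the two curves.

```python
import numpy as np

def efa_initial_profiles(D, n_components):
    """Forward/backward EFA singular values and rough initial profiles.

    Hypothetical helper for illustration only.
    """
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    K = n_components
    forward = np.zeros((m, K))
    backward = np.zeros((m, K))
    for i in range(1, m + 1):
        # Forward: SVD of the first i rows; backward: SVD of the last i rows.
        sv_f = np.linalg.svd(D[:i], compute_uv=False)
        sv_b = np.linalg.svd(D[m - i:], compute_uv=False)
        k = min(K, sv_f.size)
        forward[i - 1, :k] = sv_f[:k]
        backward[m - i, :k] = sv_b[:k]
    # Profile n pairs the nth forward curve with the (K-n+1)th backward curve.
    # Assumption: "area under both" is taken as the pointwise minimum.
    C0 = np.minimum(forward, backward[:, ::-1])
    return forward, backward, C0

# Demo on simulated data for A -> B -> C with k1 = 0.30, k2 = 0.08.
t = np.arange(0.0, 26.0)
k1, k2 = 0.30, 0.08
A = np.exp(-k1 * t)
B = k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
conc = np.column_stack([A, B, 1.0 - A - B])

# Assumed pure spectra: three Gaussian bands on an arbitrary wavelength axis.
wl = np.linspace(0.0, 1.0, 40)
spectra = np.array([np.exp(-((wl - c) / 0.15) ** 2) for c in (0.25, 0.5, 0.75)])

D = conc @ spectra
fwd, bwd, C0 = efa_initial_profiles(D, n_components=3)
print(C0.shape)
```

The columns of `C0` are only rough, unscaled profiles; in an MCR-ALS run they would serve as the initial estimate of C and be refined under the chosen constraints.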
[Figure: estimate (blue) compared with the true component (black); x-axis: samples]

Basic EFA
• Second estimated concentration profile: area under the 2nd forward and 2nd backward singular value plots (red).
• Compare with the true component (black)
[Figure: estimate (red) and true component (black); x-axis: samples]

Basic EFA
• Third estimated concentration profile: area under the 3rd forward and 1st backward singular value plots (green).
• Compare with the true component (black)
[Figure: estimate (green) and true component (black); x-axis: samples]

Example data
• Spectrophotometric monitoring of the kinetics of a consecutive first-order reaction of the form $A \xrightarrow{k_1} B \xrightarrow{k_2} C$

• Pseudo first-order reaction with respect to A: $A + R \xrightarrow{k_1} B \xrightarrow{k_2} C$
• [R]1: k1 = 0.20, k2 = 0.02
• [R]2: k1 = 0.30, k2 = 0.08
• [R]3: k1 = 0.45, k2 = 0.32

[Figure panels: k1 = 0.20, k2 = 0.02; k1 = 0.30, k2 = 0.08; k1 = 0.45, k2 = 0.32]

[Figure panels: k1 = 0.20, k2 = 0.02; k1 = 0.30, k2 = 0.08; k1 = 0.45, k2 = 0.32]

Noisy data

EFA Analysis
• The MATLAB m-file can be downloaded from the MCR-ALS home page: http://www.ub.edu/mcr/welcome.html

Simple-to-use Interactive Self-Modeling Mixture Analysis (SIMPLISMA)
W. Windig, J. Guilment, Anal. Chem. 1991, 63, 1425-1432.
F.C. Sanchez, D.L. Massart, Anal. Chim. Acta 1994, 298, 331-339.

SIMPLISMA is based on the selection of what are called pure variables or pure objects.
A pure variable is a wavelength at which only one of the compounds in the system is absorbing.
A pure object is an analysis time at which only one compound is eluting.
[Figure: data matrix; objects (i.e. time or sample), variables (i.e. wavelength)]

[Figure: chromatographic profile with a pure object; absorbance spectra with a pure variable]

[Figure: spectra from t = 0 to t = m with the mean vector (µ0 to µm) and the standard deviation vector (σ0 to σm)]
[Figure: mean vector and standard deviation vector for the spectra from t = 0 to t = m]

[Figure: chromatogram and pure spectra]

[Figure: pure spectra with mean and standard deviation]

[Figure: chromatogram with mean and standard deviation]

[Figure: spectrum $x_i$ in the plane of axes λ1 and λ2, decomposed into the mean vector $\mu_i$ and the deviation vector $v_i$, with angle $\theta_i$ between $x_i$ and $\mu_i$]
$\|\mu_i\| = \|x_i\| \cos\theta_i$
$\|v_i\| = \|x_i\| \sin\theta_i$
$\|x_i\|^2 = \|v_i\|^2 + \|\mu_i\|^2 = n(\sigma_i^2 + \mu_i^2)$
$\sigma_i = \|x_i\| \sin\theta_i / \sqrt{n}$
$\mu_i = \|x_i\| \cos\theta_i / \sqrt{n}$
$p_i = \sigma_i / \mu_i = \tan\theta_i$, $0 \le \theta_i \le \pi/2$

SIMPLISMA steps
1) The ratio between the standard deviation, $\sigma_i$, and the mean, $\mu_i$, of each spectrum is determined:
$\sigma_i = \sqrt{\frac{\sum_{j=1}^{n} (x_{ij} - \mu_i)^2}{n}}$
$\mu_i = \frac{\sum_{j=1}^{n} x_{ij}}{n}$
$p_i = w_i \frac{\sigma_i}{\mu_i}$

To avoid attributing a high purity value to spectra with low mean absorbances, i.e., to noise spectra, an offset is included in the denominator:
$p_i = w_i \frac{\sigma_i}{\mu_i'}$
$\mu_i' = \mu_i + (\mathrm{offset}/100) \cdot \max(\mu_i)$
0 < offset < 3

2) Normalisation of the data matrix: each spectrum $x_i$ is normalised by dividing each element $x_{ij}$ of a row by the length of the row, $\|x_i\|$:
$z_{ij} = \frac{x_{ij}}{\|x_i\|}$
$\|x_i\| = \sqrt{\sum_{j=1}^{n} x_{ij}^2} = \sqrt{n(\mu_i^2 + \sigma_i^2)}$
When an offset is added, the same offset is also included in the normalisation of the spectra:
$z_{ij} = \frac{x_{ij}}{\sqrt{n(\sigma_i^2 + \mu_i'^2)}}$

3) Determination of the weight of each spectrum, $w_i$. The weight is defined as the determinant of the dispersion matrix of $Y_i$, which contains the normalised spectra that have already been selected and each individual normalised spectrum $z_i$ of the complete data matrix.
$Y_i = [z_i \; H]$
$w_i = \det(Y_i^T Y_i)$
Initially, when no spectrum has been selected, each $Y_i$ contains only one column, $z_i$ (H = 1), and the weight of each spectrum is equal to the square of the length of the normalised spectrum:
$w_i = \det(z_i^T z_i) = \|z_i\|^2$

When the first spectrum, $p_1$, has been selected, each matrix $Y_i$ consists of two columns, $p_1$ and each individual spectrum $z_i$, and the weight is equal to
$Y_i = [z_i \; p_1]$
$w_i = \det(Y_i^T Y_i) = (\|p_1\| \cdot \|z_i\| \cdot \sin\theta_i)^2$
When two spectra have been selected, $p_1$ and $p_2$, each $Y_i$ consists of those two selected spectra and each individual $z_i$, and so on.
$Y_i = [z_i \; [p_1 \; p_2]]$

$p_0 = \sigma_0 / \mu_0$
$p_i = w_i \, p_0$
$Y_i = [z_i \; H]$
$w_i = \det(Y_i^T Y_i)$
• i = 1: H = I
• i = 2: H = p1
• i = 3: H = [p1 p2]
• i = 4: H = [p1 p2 p3]
• …

[Figure] Offset = 0

[Figure] Offset = 1

Example data
• HPLC-DAD data of a binary mixture

[Figure: chromatogram]

[Figure: pure spectra]
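The SIMPLISMA steps described earlier (purity ratio with offset, normalisation, determinant-based weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the function name `simplisma_pure_rows`, the default `offset=1.0` and the small demo matrices `C` and `S` are hypothetical, and rows of the data matrix are taken to be the spectra $x_i$, so the function selects pure objects.

```python
import numpy as np

def simplisma_pure_rows(X, n_components, offset=1.0):
    """Return indices of the rows (spectra) selected as purest.

    Hypothetical helper for illustration; assumes non-negative data
    and offset > 0 so that the corrected mean is never zero.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    mu = X.mean(axis=1)
    sigma = X.std(axis=1)  # population standard deviation (divides by n)
    # Offset-corrected mean: mu' = mu + (offset / 100) * max(mu)
    mu_off = mu + (offset / 100.0) * mu.max()
    # Normalised spectra, with the same offset in the row length
    Z = X / np.sqrt(n * (sigma ** 2 + mu_off ** 2))[:, None]
    base_purity = sigma / mu_off

    selected = []
    for _ in range(n_components):
        weights = np.empty(X.shape[0])
        for i, z in enumerate(Z):
            # Y_i = [z_i H], H holding the already selected normalised spectra
            Y = np.column_stack([z] + [Z[s] for s in selected])
            weights[i] = np.linalg.det(Y.T @ Y)
        purity = weights * base_purity
        selected.append(int(np.argmax(purity)))
    return selected

# Demo: binary mixture with non-overlapping (assumed) pure spectra.
S = np.array([[4.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 4.0]])
# Rows 0-1 contain only the first compound, rows 4-5 only the second.
C = np.array([[1, 0], [3, 0], [2, 1], [1, 2], [0, 3], [0, 1]], dtype=float)
D = C @ S

pure = simplisma_pure_rows(D, n_components=2)
print(pure)
```

In this toy case the two selected rows should each contain a single compound. The selected spectra would then serve as the initial estimate of S for MCR-ALS.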