7th Iranian Workshop on
Chemometrics
3-5 February 2008
Initial estimates for MCR-ALS
method: EFA and SIMPLISMA
Bahram Hemmateenejad
Chemistry Department, Shiraz University, Shiraz, Iran
E-mail: hemmatb@sums.ac.ir
1
Chemical modeling
Fitting data to model (Hard model)
Fitting model to data (soft model)
2
Multicomponent Curve Resolution
• Goal: Given an I x J matrix, D, of N species, determine N and the pure spectrum of each species.
• Model: $D_{I \times J} = C_{I \times N} S_{N \times J}$
• Common assumptions:
– Non-negative spectra and concentrations
– Unimodal concentrations
– Kinetic profiles
[Figure: the data matrix $D_{I \times J}$ (samples by sensors) shown as the product of $C_{I \times N}$ and $S_{N \times J}$]
3
Basic Principles of MCR methods
PCA: D = TP
Beer-Lambert: D = CS
In MCR we want to go from the PCA solution to the Beer-Lambert solution
• $D = TP = T R R^{-1} P$, R: rotation matrix
• $D = (TR)(R^{-1}P)$
• $C = TR$, $S = R^{-1}P$
• The critical step is the calculation of R
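The rotation step can be checked numerically. The sketch below uses Python with NumPy, which is not part of the original slides. It builds a hypothetical two-component data set, decomposes it by truncated SVD as a stand-in for PCA, and shows that an arbitrary invertible R leaves the product unchanged. This is why finding a chemically meaningful R is the critical step. All array names and the matrix R are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical two-component system (random profiles, illustration only)
rng = np.random.default_rng(0)
C_true = rng.random((20, 2))   # 20 samples x 2 species
S_true = rng.random((2, 30))   # 2 species x 30 sensors
D = C_true @ S_true            # Beer-Lambert: D = CS

# PCA-like decomposition D = TP via truncated SVD (2 factors)
U, s, Vt = np.linalg.svd(D, full_matrices=False)
T = U[:, :2] * s[:2]           # scores
P = Vt[:2, :]                  # loadings

# Any invertible R gives another exact factorisation of D
R = np.array([[2.0, 1.0],
              [0.5, 3.0]])     # arbitrary, not chemically meaningful
C_rot = T @ R                  # C = TR
S_rot = np.linalg.inv(R) @ P   # S = R^-1 P

print(np.allclose(T @ P, D))          # TP reproduces D
print(np.allclose(C_rot @ S_rot, D))  # so does (TR)(R^-1 P)
```

Both products reproduce D, so the data alone do not fix R. Constraints are what narrow the choice.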
4
Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS)
• Developed by R. Tauler and A. de Juan
• Fully soft-modeling method
• Chemical and physical constraints
• Data augmentation
• Combined hard model
• Tauler R, Kowalski B, Fleming S, ANALYTICAL CHEMISTRY 65 (15): 2040-2047, 1993.
• de Juan A, Tauler R, CRITICAL REVIEWS IN ANALYTICAL CHEMISTRY 36 (3-4): 163-176, 2006.
5
MCR-ALS Theory
• Widely applied to spectroscopic methods
– UV/Vis absorbance spectra
– UV/Vis luminescence spectra
– Vibrational spectra
– NMR spectra
– Circular dichroism
– …
• Electrochemical data are also analyzed
6
MCR-ALS Theory
• In the case of spectroscopic data, the Beer-Lambert law for a mixture gives D = CS
• D (m x n): absorbance data of k absorbing species
• C (m x k): concentration profiles
• S (k x n): pure spectra
7
MCR-ALS Theory
• An initial estimate of C or S is required
• Evolving Factor Analysis (EFA): initial estimate of C
• Simple-to-use Interactive Self-Modeling Mixture Analysis (SIMPLISMA): initial estimate of S
8
MCR-ALS Theory
• Assume we have an initial estimate of C (Cint)
1. Determine the chemical rank
2. Least-squares solution for S: S = Cint+ D (+ denotes the pseudoinverse)
3. Least-squares solution for C: C = D S+
4. Reproduce the data matrix: Dc = CS
5. Calculate the lack-of-fit error (LOF)
Go to step 2
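The alternating steps can be sketched as follows. This is a minimal, unconstrained illustration in Python with NumPy, not the MCR-ALS toolbox code. It assumes the chemical rank is already fixed by the number of columns of the initial estimate, and it uses the commonly quoted lack-of-fit definition, 100 * sqrt(sum of squared residuals / sum of squared data). The function names, the stopping rule and the synthetic test data are assumptions.

```python
import numpy as np

def lack_of_fit(D, D_calc):
    """Lack of fit in percent: 100 * sqrt(sum(res^2) / sum(D^2))."""
    return 100.0 * np.sqrt(np.sum((D - D_calc) ** 2) / np.sum(D ** 2))

def als(D, C_init, max_iter=50, tol=1e-10):
    """Unconstrained alternating least squares starting from C_init."""
    C = np.asarray(C_init, dtype=float)
    lof_old = np.inf
    for _ in range(max_iter):
        S = np.linalg.pinv(C) @ D      # S = C+ D
        C = D @ np.linalg.pinv(S)      # C = D S+
        lof = lack_of_fit(D, C @ S)    # Dc = CS, then LOF
        if abs(lof_old - lof) < tol:   # assumed stopping rule
            break
        lof_old = lof
    return C, S, lof

# Synthetic noise-free example: two Gaussian profiles and two Gaussian spectra
t = np.linspace(0, 10, 40)[:, None]
C_true = np.hstack([np.exp(-0.5 * (t - 3) ** 2), np.exp(-0.5 * (t - 6) ** 2)])
w = np.linspace(0, 1, 25)[None, :]
S_true = np.vstack([np.exp(-50 * (w - 0.3) ** 2), np.exp(-50 * (w - 0.7) ** 2)])
D = C_true @ S_true

# Perturbed starting guess for C
C_init = C_true + 0.1 * np.random.default_rng(1).random(C_true.shape)
C, S, lof = als(D, C_init)
print(f"LOF = {lof:.2e} %")
```

Without constraints the loop reproduces D almost exactly, but the recovered C and S are only one of many equivalent solutions. In practice the constraints are applied inside the loop after each least-squares step.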
9
Constraints in MCR-ALS
• Non-negativity (non-negative concentrations and absorbances)
• Unimodality (unimodal concentration profiles). It is rarely applied to pure spectra
• Closure (the law of mass conservation or mass balance equation for a closed system)
• Selectivity in concentration profiles (if some selective zones are available)
• Selectivity in pure spectra (if the pure spectra of a chemical species, e.g. reactant or product, are known)
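The first three constraints can be illustrated with simple projections. The sketch below, in Python with NumPy, is one straightforward way to impose them and is not necessarily how any particular MCR-ALS implementation does it. Clipping is used for non-negativity (non-negative least squares is a common alternative), and the unimodality rule simply flattens secondary maxima. All function names are hypothetical.

```python
import numpy as np

def apply_nonnegativity(M):
    """Set negative entries to zero (simple clipping)."""
    return np.clip(M, 0.0, None)

def apply_unimodality(c):
    """Force one profile to rise to its maximum and then fall."""
    c = np.asarray(c, dtype=float).copy()
    k = int(np.argmax(c))
    # Right of the peak: values may not increase again
    c[k:] = np.minimum.accumulate(c[k:])
    # Left of the peak: same rule, walking away from the peak
    c[:k + 1] = np.minimum.accumulate(c[:k + 1][::-1])[::-1]
    return c

def apply_closure(C, total=1.0):
    """Rescale each row of C so the concentrations sum to `total`."""
    row_sums = C.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0      # leave all-zero rows untouched
    return C * (total / row_sums)

C_raw = np.array([[0.2, -0.1],
                  [0.5,  0.5],
                  [0.0,  0.3]])
C_pos = apply_nonnegativity(C_raw)
C_closed = apply_closure(C_pos)
profile = apply_unimodality([0.0, 1.0, 0.5, 2.0, 1.0, 1.5, 0.0])
print(C_closed)
print(profile)
```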
10
Constraints in MCR-ALS
• Peak shape constraint
• Hard model constraint (combined hard
model MCR-ALS)
11
• Rotational Ambiguity
• Rank Deficiency
12
Evolving Factor Analysis (EFA)
• Gives a rough estimate of the concentration profiles
• Repeated factor analysis on evolving submatrices
• Gampp H, Maeder M, Meyer CJ, Zuberbuhler AD, CHIMIA 39 (10): 315-317, 1985
• Maeder M, Zuberbuhler AD, ANALYTICA CHIMICA ACTA 181: 287-291, 1986
• Gampp H, Maeder M, Meyer CJ, Zuberbuhler AD, TALANTA 33 (12): 943-951, 1986
13
Basic EFA Example
Calculate Forward Singular Values
[Figure: 1st (solid), 2nd (dashed) and 3rd (dotted) forward singular values plotted against I samples; schematic: SVD of a submatrix of i rows (out of I) of R gives Si]
14
Basic EFA Example
Calculate Backward Singular Values
[Figure: 1st (solid), 2nd (dashed) and 3rd (dotted) backward singular values plotted against I samples; schematic: SVD of a submatrix of i rows (out of I) of R gives Si]
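The forward and backward calculations amount to repeated SVDs of growing submatrices. The sketch below, in Python with NumPy, is an illustration only and is not the downloadable m-file. The function name, the number of singular values kept and the synthetic three-component data are assumptions.

```python
import numpy as np

def efa_singular_values(D, n_sv=3):
    """Forward and backward evolving singular values of D (rows = samples)."""
    D = np.asarray(D, dtype=float)
    n_rows = D.shape[0]
    forward = np.zeros((n_rows, n_sv))
    backward = np.zeros((n_rows, n_sv))
    for i in range(1, n_rows + 1):
        sv_f = np.linalg.svd(D[:i], compute_uv=False)           # first i rows
        sv_b = np.linalg.svd(D[n_rows - i:], compute_uv=False)  # last i rows
        k = min(n_sv, sv_f.size)
        forward[i - 1, :k] = sv_f[:k]        # stored at the last row used
        backward[n_rows - i, :k] = sv_b[:k]  # stored at the first row used
    return forward, backward

# Synthetic data: three Gaussian concentration profiles over 25 samples
t = np.arange(25.0)[:, None]
C_sim = np.hstack([np.exp(-0.5 * ((t - c) / 3.0) ** 2) for c in (6, 12, 18)])
w = np.linspace(0, 1, 40)[None, :]
S_sim = np.vstack([np.exp(-40 * (w - m) ** 2) for m in (0.25, 0.5, 0.75)])
D_sim = C_sim @ S_sim

forward, backward = efa_singular_values(D_sim, n_sv=3)
print(forward[-1])   # singular values of the full matrix
```

Plotting each column of `forward` and `backward` against the sample index gives curves of the kind shown on these slides.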
15
Basic EFA
• Use the ‘forward’ and ‘backward’ singular values to estimate the initial concentration profiles
• The area under both the nth forward and the (K-n+1)th backward singular value curves is the estimate of the initial concentration profile of the nth component (K: number of components).
[Figure: forward and backward singular values plotted against I samples]
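Taking the area under both curves is the same as taking the pointwise minimum of the two. The sketch below, in Python with NumPy, shows that pairing. It assumes the components appear and disappear in the same order (first in, first out), which is what the pairing of the nth forward with the (K-n+1)th backward curve implies. The function name and the toy arrays are hypothetical.

```python
import numpy as np

def efa_initial_profiles(forward, backward):
    """Pair the nth forward with the (K-n+1)th backward singular value curve.

    forward, backward: arrays of shape (samples, K), one column per
    singular value. Returns the pointwise minimum of each pair.
    """
    K = forward.shape[1]
    return np.column_stack(
        [np.minimum(forward[:, n], backward[:, K - 1 - n]) for n in range(K)]
    )

# Toy curves for K = 2 (hand-made numbers, illustration only)
forward = np.array([[1.0, 0.0],
                    [2.0, 1.0],
                    [3.0, 2.0]])
backward = np.array([[3.0, 2.0],
                     [2.0, 1.0],
                     [1.0, 0.0]])
C_est = efa_initial_profiles(forward, backward)
print(C_est)
```

The result is only a rough, unscaled starting point for C and is refined by the ALS iterations.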
16
Basic EFA
First estimated concentration profile
Area under the 1st forward and 3rd backward singular value plots (blue)
Compare to the true component (black)
[Figure: estimated and true profiles plotted against I samples]
17
Basic EFA
Second estimated concentration profile
Area under the 2nd forward and 2nd backward singular value plots (red)
Compare to the true component (black)
[Figure: estimated and true profiles plotted against I samples]
18
Basic EFA
Third estimated concentration profile
Area under the 3rd forward and 1st backward singular value plots (green)
Compare to the true component (black)
[Figure: estimated and true profiles plotted against I samples]
19
Example data
• Spectrophotometric monitoring of the kinetics of a consecutive first-order reaction of the form
A → B → C (rate constants k1 and k2)
20
21
• Pseudo-first-order reaction with respect to A
• A + R → B → C (rate constants k1 and k2)
• [R]1: k1 = 0.20, k2 = 0.02
• [R]2: k1 = 0.30, k2 = 0.08
• [R]3: k1 = 0.45, k2 = 0.32
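Concentration profiles for this scheme follow from the standard integrated rate laws for two consecutive first-order steps. The sketch below, in Python with NumPy, evaluates them for the three pairs of rate constants listed above. The time grid, the initial concentration of A and the function name are assumptions, and the closed form used here requires k1 ≠ k2.

```python
import numpy as np

def consecutive_first_order(t, k1, k2, a0=1.0):
    """Concentrations of A, B, C for A -> B -> C (requires k1 != k2)."""
    t = np.asarray(t, dtype=float)
    a = a0 * np.exp(-k1 * t)
    b = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    c = a0 - a - b                     # mass balance
    return np.column_stack([a, b, c])

t = np.arange(0.0, 50.0, 1.0)          # assumed time axis
rate_constants = [(0.20, 0.02), (0.30, 0.08), (0.45, 0.32)]
profiles = [consecutive_first_order(t, k1, k2) for k1, k2 in rate_constants]

for (k1, k2), C_kin in zip(rate_constants, profiles):
    print(k1, k2, "max [B] at t =", t[np.argmax(C_kin[:, 1])])
```

Multiplying each profile matrix by a matrix of pure spectra gives a simulated data matrix D = CS of the kind analysed on the following slides.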
22
[Figure panels: k1 = 0.2, k2 = 0.02; k1 = 0.3, k2 = 0.08; k1 = 0.45, k2 = 0.32]
29
[Figure panels: k1 = 0.20, k2 = 0.02; k1 = 0.30, k2 = 0.08; k1 = 0.45, k2 = 0.32]
33
Noisy data
34
EFA Analysis
• The MATLAB m-file can be downloaded from the MCR-ALS home page:
http://www.ub.edu/mcr/welcome.html
35
Simple-to-use Interactive Self-Modeling Mixture Analysis (SIMPLISMA)
W. Windig, J. Guilment, Anal. Chem. 1991, 63, 1425-1432.
F.C. Sanchez, D.L. Massart, Anal. Chim. Acta 1994, 298, 331-339.
51
• SIMPLISMA is based on the selection of what are called pure variables or pure objects.
[Figure: data matrix; rows are objects (e.g. time or sample), columns are variables (e.g. wavelength)]
• A pure variable is a wavelength at which only one of the compounds in the system is absorbing.
• A pure object is an analysis time at which only one compound is eluting.
52
[Figure: chromatographic profile with a pure object marked; absorbance spectra with a pure variable marked]
53
1
2
54
35
20
55
[Figure: data matrix with rows t = 0 … t = m; each row gives one element of the mean vector (μ0 … μm) and one element of the standard deviation vector (σ0 … σm)]
56
[Figure: data matrix from t = 0 to t = m with its mean vector (μ0 … μm) and standard deviation vector (σ0 … σm)]
57
chromatogram
Pure spectra
58
Pure spectra
Mean
Standard deviation
59
chromatogram
Mean
Standard deviation
60
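[Figure: in the plane of two variables, λ1 and λ2, the spectrum vector $\mathbf{x}_i$ is the sum of the mean vector $\boldsymbol{\mu}_i$ and the deviation vector $\mathbf{v}_i$; $\theta_i$ is the angle between $\mathbf{x}_i$ and $\boldsymbol{\mu}_i$]
$$\|\boldsymbol{\mu}_i\| = \|\mathbf{x}_i\| \cos\theta_i, \qquad \|\mathbf{v}_i\| = \|\mathbf{x}_i\| \sin\theta_i$$
$$\|\mathbf{x}_i\|^2 = \|\mathbf{v}_i\|^2 + \|\boldsymbol{\mu}_i\|^2 = n(\sigma_i^2 + \mu_i^2)$$
$$p = \frac{\sigma_i}{\mu_i} = \frac{\|\mathbf{x}_i\| \sin\theta_i / \sqrt{n}}{\|\mathbf{x}_i\| \cos\theta_i / \sqrt{n}} = \tan\theta_i$$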
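The relation between the length of a spectrum and its mean and standard deviation follows directly from the definitions. The short derivation below is added for clarity and assumes that the standard deviation is computed with divisor n and that the mean satisfies $\sum_j x_{ij} = n\mu_i$:

```latex
\begin{aligned}
n\sigma_i^2 &= \sum_{j=1}^{n} (x_{ij} - \mu_i)^2
             = \sum_{j=1}^{n} x_{ij}^2 - 2\mu_i \sum_{j=1}^{n} x_{ij} + n\mu_i^2
             = \|\mathbf{x}_i\|^2 - n\mu_i^2, \\
\|\mathbf{x}_i\|^2 &= n(\sigma_i^2 + \mu_i^2).
\end{aligned}
```

With divisor n - 1 instead of n the identity would hold only approximately.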
61
SIMPLISMA steps
1) The ratio between the standard deviation, σi, and the mean, μi, of each spectrum is determined:
$$\sigma_i = \sqrt{\frac{\sum_{j=1}^{n} (x_{ij} - \mu_i)^2}{n}}, \qquad \mu_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}, \qquad p_i = w_i \frac{\sigma_i}{\mu_i}$$
62
To avoid attributing a high purity value to spectra with low mean absorbances, i.e., to noise spectra, an offset is included in the denominator:
$$p_i = w_i \frac{\sigma_i}{\mu_i'}, \qquad \mu_i' = \mu_i + \frac{\text{offset}}{100} \max_i(\mu_i), \qquad 0 < \text{offset} < 3$$
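The purity calculation with offset can be sketched in a few lines. The example below, in Python with NumPy, treats each row of X as one spectrum, uses the standard deviation with divisor n, and sets all weights to 1 unless given. The function name and the small test matrix are assumptions for illustration.

```python
import numpy as np

def purity(X, offset=1.0, weights=None):
    """Purity p_i = w_i * sigma_i / (mu_i + offset/100 * max(mu))."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1)                 # mean of each spectrum (row)
    sigma = X.std(axis=1)               # standard deviation, divisor n
    mu_offset = mu + (offset / 100.0) * mu.max()
    w = np.ones_like(mu) if weights is None else np.asarray(weights, float)
    return w * sigma / mu_offset

# Hand-made spectra: row 0 and row 2 are "pure", row 1 is their 1:1 mixture
X_demo = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.5, 0.0, 0.5, 0.5],
                   [0.0, 0.0, 1.0, 1.0]])
p_no_offset = purity(X_demo, offset=0.0)
p_offset = purity(X_demo, offset=1.0)
print(p_no_offset)
print(p_offset)
```

In this toy case the mixture row has the lowest purity, and adding the offset lowers every value slightly. The reduction is proportionally largest for rows with a small mean.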
63
2) Normalisation of the data matrix: each spectrum xi is normalised by dividing each element xij of the row by the length of the row, ||xi||:
$$z_{ij} = \frac{x_{ij}}{\|\mathbf{x}_i\|}, \qquad \|\mathbf{x}_i\| = \sqrt{\sum_{j=1}^{n} x_{ij}^2} = \sqrt{n(\sigma_i^2 + \mu_i^2)}$$
When an offset is added, the same offset is also included in the normalisation of the spectra:
$$z_{ij} = \frac{x_{ij}}{\sqrt{n(\sigma_i^2 + \mu_i'^2)}}$$
64
3) Determination of the weight of each spectrum, wi. The weight is defined as the determinant of the dispersion matrix of Yi, which contains the normalised spectra that have already been selected, H, and each individual normalised spectrum zi of the complete data matrix:
$$Y_i = [\mathbf{z}_i \; H], \qquad w_i = \det(Y_i^T Y_i)$$
Initially, when no spectrum has been selected, each Yi contains only one column, zi (H=1), and the weight of each spectrum is equal to the square of the length of the normalised spectrum:
$$w_i = \det(\mathbf{z}_i^T \mathbf{z}_i) = \|\mathbf{z}_i\|^2$$
65
When the first spectrum, p1, has been selected, each matrix Yi consists of two columns, p1 and the individual spectrum zi, and the weight is equal to
$$Y_i = [\mathbf{z}_i \; \mathbf{p}_1], \qquad w_i = \det(Y_i^T Y_i) = (\|\mathbf{p}_1\| \, \|\mathbf{z}_i\| \sin\varphi_i)^2$$
where φi is the angle between p1 and zi.
When two spectra have been selected, p1 and p2, each Yi consists of those two selected spectra and each individual zi, and so on:
$$Y_i = [\mathbf{z}_i \; \mathbf{p}_1 \; \mathbf{p}_2]$$
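The determinant weight is easy to verify numerically. The sketch below, in Python with NumPy, builds Yi from a candidate spectrum and the already selected spectra and compares the determinant with the sine formula for the two-column case. The function name and the small vectors are hypothetical.

```python
import numpy as np

def weight(z, selected=()):
    """w = det(Y^T Y) with Y = [z, selected spectra...] as columns."""
    Y = np.column_stack([z, *selected])
    return float(np.linalg.det(Y.T @ Y))

z = np.array([3.0, 4.0, 0.0])    # candidate spectrum (not normalised here)
p1 = np.array([1.0, 0.0, 0.0])   # first selected spectrum

w_first = weight(z)              # no spectrum selected yet: ||z||^2
w_second = weight(z, [p1])       # one spectrum selected

# Compare with (||p1|| ||z|| sin(angle))^2
cos_angle = z @ p1 / (np.linalg.norm(z) * np.linalg.norm(p1))
w_formula = (np.linalg.norm(p1) * np.linalg.norm(z)) ** 2 * (1.0 - cos_angle ** 2)

print(w_first, w_second, w_formula)
print(weight(p1, [p1]))          # a spectrum identical to p1 gets weight 0
```

The weight drops to zero for any spectrum that is a multiple of one already selected, which is what stops the same component being picked twice.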
66
$$p_0 = \frac{\sigma_0}{\mu_0}, \qquad p_i = w_i \, p_0, \qquad Y_i = [\mathbf{z}_i \; H], \qquad w_i = \det(Y_i^T Y_i)$$
i=1: H=I
i=2: H=p1
i=3: H=[p1 p2]
i=4: H=[p1 p2 p3]
…
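Putting the steps together gives the selection loop sketched below in Python with NumPy. It is an illustrative reading of the steps in this presentation and not the published SIMPLISMA code. Rows of X are treated as spectra (objects), the unweighted ratio is computed once with the offset-corrected mean, and at each step it is multiplied by the determinant weight. The function name and the toy data are assumptions.

```python
import numpy as np

def simplisma_select(X, n_pure, offset=1.0):
    """Return the row indices of the n_pure spectra selected as purest."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    mu = X.mean(axis=1)
    sigma = X.std(axis=1)                          # divisor n
    mu_offset = mu + (offset / 100.0) * mu.max()
    p0 = sigma / mu_offset                         # unweighted purity
    # Normalisation including the same offset
    Z = X / np.sqrt(n * (sigma ** 2 + mu_offset ** 2))[:, None]

    selected = []
    for _ in range(n_pure):
        w = np.empty(X.shape[0])
        for i in range(X.shape[0]):
            # Y_i = [z_i, already selected normalised spectra]
            Y = np.column_stack([Z[i]] + [Z[s] for s in selected])
            w[i] = np.linalg.det(Y.T @ Y)
        selected.append(int(np.argmax(w * p0)))    # p_i = w_i * p0_i
    return selected

# Toy data: rows 0 and 2 are pure spectra, rows 1 and 3 are mixtures
A = np.array([1.0, 0.0, 0.0, 0.0])
B = np.array([0.0, 0.0, 1.0, 1.0])
X_mix = np.vstack([A, 0.5 * A + 0.5 * B, B, 0.2 * A + 0.8 * B])
picked = simplisma_select(X_mix, n_pure=2)
print(picked)
```

For this noise-free toy matrix the two pure rows are selected. Applying the same function to the transposed matrix selects pure variables instead of pure objects.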
67
Offset=0
72
Offset=1
73
Example data
HPLC-DAD data of a binary mixture
77
chromatogram
78
Pure spectra
79