PLS Methods: Herman and Svante Wold

PLS Path Modeling
Michel Tenenhaus
(tenenhaus@hec.fr)
PLS Methods
initiated by Herman Wold, Svante Wold, Harald Martens and Jan-Bernd Lohmöller
1. NIPALS (Nonlinear Iterative Partial Least Squares)
2. PLS Regression (Partial Least Squares Regression)
3. PLS Discriminant Analysis
4. SIMCA (Soft Independent Modeling by Class Analogy)
5. PLS Approach to Structural Equation Modeling
6. N-way PLS
7. PLS Logistic Regression
8. PLS Generalized Linear Model
7
PLS Methods
PLS Path Modeling:
PLS Approach to Structural Equation
Modeling
8
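ECSI Path model for a "Mobile phone provider"
[Figure: path diagram linking the latent variables Image, Customer Expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Path coefficients printed on the arrows, with the value in parentheses as shown on the slide: .212 (.002), .493 (.000), .153 (.006), .037 (.406), .466 (.000), .066 (.314), .545 (.000), .540 (.000), .200 (.000), .05 (.399), .540 (.000), .544 (.000). R2 values printed on the diagram: .243, .432, .335, .672, .297, .292.]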
9
Structural Equation Modeling
The PLS approach of Herman WOLD
• Study of a system of linear relationships between
latent variables.
• Each latent variable is described by a set of
manifest variables, or summarizes them.
• Variables can be numerical, ordinal or nominal (no
need for normality assumptions).
• The number of observations can be small compared
to the number of variables.
10
Economic inequality and political instability
Data from Russett (1964), in GIFI

Economic inequality
  Agricultural inequality
    GINI : Inequality of land distributions
    FARM : % farmers that own half of the land (> 50)
    RENT : % farmers that rent all their land
  Industrial development
    GNPR : Gross national product per capita ($ 1955)
    LABO : % of labor force employed in agriculture

Political instability
    INST : Instability of executive (45-61)
    ECKS : Number of violent internal war incidents (46-61)
    DEAT : Number of people killed as a result of civic group violence (50-62)
    D-STAB : Stable democracy
    D-UNST : Unstable democracy
    DICT : Dictatorship
11
Economic inequality and political instability
(Data from Russett, 1964)
47 countries

Country      Gini  Farm  Rent  Gnpr  Labo  Inst  Ecks  Deat  Demo
Argentina    86.3  98.2  32.9   374    25  13.6    57   217     2
Australia    92.9  99.6     *  1215    14  11.3     0     0     1
Austria      74.0  97.4  10.7   532    32  12.8     4     0     2
...
France       58.3  86.1  26.0  1046    26  16.3    46     1     2
...
Yugoslavia   43.7  79.8   0.0   297    67   0.0     9     0     3

Demo: 1 = Stable democracy, 2 = Unstable democracy, 3 = Dictatorship
12
Economic inequality and political instability
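[Figure: path diagram. Agricultural inequality (X1, latent variable 1; manifest variables GINI, FARM, RENT) and Industrial development (X2, latent variable 2; manifest variables GNPR, LABO) both point to Political instability (X3, latent variable 3; manifest variables INST, ECKS, DEAT, D-STB, D-INS, DICT). Expected signs (+ or -) are marked on the arrows.]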
13
Modeling
• Reflective model (the block is assumed to be unidimensional)
Each manifest variable $X_{jh}$ is written as: $X_{jh} = \lambda_{jh}\,\xi_h + \varepsilon_{jh}$
• Formative model (the block can be multidimensional)
The latent variable $\xi_h$ is a function of the manifest variables of its block $X_h$:
$\xi_h = \sum_j \omega_j X_{hj} + \delta_h$
• There exists a linear structural relationship between the latent
variables:
Political instability ($\xi_3$)
= $\beta_1$ Agri. inequality ($\xi_1$) + $\beta_2$ Ind. development ($\xi_2$) + residual
14
Estimation of latent variables
using the PLS approach
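(1) External (outer) estimation $Y_h$ of $\xi_h$
$Y_h = X_h w_h$
(2) Internal (inner) estimation $Z_h$ of $\xi_h$
$Z_h \propto \sum_{j \ne h,\ \xi_j \text{ related with } \xi_h} \mathrm{sign}\big(\mathrm{cor}(\xi_j, \xi_h)\big)\, Y_j$
(3) Calculation of $w_h$
$w_{hj} = \mathrm{cor}(Z_h, X_{hj})$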
15
Estimation of latent variables
using the PLS approach
(1) External estimation Yh of ξh
Yh = Xhwh
Y1 = w11Gini + w12Farm + w13Rent
Y2 = X2w2
Y3 = X3w3
16
1
Estimation of latent variables
+
using the PLS approach
3
(2)
2
Internal estimation Zh of h (Centroid scheme)
-
Zh 
[sign( cor( j , h ))]Yj
jh
 j related with ξ h
Z1 = sign(cor(1, 3)Y3 = (+1)Y3
Z2 = sign(cor(2, 3)Y3 = (-1)Y3
Z3 = sign(cor(3, 1)Y1 + sign(cor(3, 2)Y2
= (+1)Y1 + (-1)Y2
17
Estimation of latent variables
using the PLS approach
(3) Calculation of the weights wh
whj = cor(Xhj , Zh)
w11 = cor(Gini , Z1)
w12 = cor(Farm , Z1)
w13 = cor(Rent , Z1)
And the same way for the other whj.
18
Weight initialization in PLS-graph
Option “1” : All weights are equal to 1.
Option “–1” : All weights equal to 1,
except the last one put to –1.
w11,initial = 1
w12,initial = 1
w13,initial = -1
This choice allows some sign control:
if the variable with the largest weight is placed in the last position,
that weight has a good chance of being negative.
19
Economic inequality and political instability
Estimation of latent variables with PLS Approach
(1) External estimation
Y1 = X1w1
Y2 = X2w2
Y3 = X3w3
(2) Internal estimation
Z1 = Y3
Z2 = -Y3
Z3 = Y1 - Y2
(3) Calculation of wh
w1j = cor(X1j , Z1)
w2j = cor(X2j , Z2)
w3j = cor(X3j , Z3)
Algorithm
• Begin with arbitrary
weights w1, w2, w3.
• Get new weights wh by
using (1) to (3).
• Iterate until convergence.
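The three steps above can be sketched in Python with NumPy. This is a minimal illustration, not the PLS-Graph implementation: it assumes Mode A weights, the centroid scheme, standardized manifest variables and outer estimates rescaled to unit variance at each pass. The function names, the stopping rule and the synthetic data are all hypothetical.

```python
import numpy as np

def standardize(a):
    """Center and scale a vector or the columns of a matrix to unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def pls_path_mode_a_centroid(blocks, links, max_iter=500, tol=1e-10):
    """Iterate steps (1)-(3) with Mode A weights and the centroid scheme.

    blocks: list of (n, p_h) arrays of manifest variables.
    links:  symmetric 0/1 matrix, links[h][j] = 1 if LVs h and j are related.
    """
    links = np.asarray(links, dtype=float)
    X = [standardize(np.asarray(b, dtype=float)) for b in blocks]
    w = [np.ones(x.shape[1]) for x in X]            # arbitrary starting weights
    for _ in range(max_iter):
        # (1) external estimation Yh = Xh wh, rescaled to unit variance
        Y = np.column_stack([standardize(x @ wh) for x, wh in zip(X, w)])
        # (2) internal estimation: signed sum of the related outer estimates
        E = np.sign(np.corrcoef(Y, rowvar=False)) * links
        Z = Y @ E.T
        # (3) new weights whj = cor(Xhj, Zh)
        w_new = [np.array([np.corrcoef(x[:, j], Z[:, h])[0, 1]
                           for j in range(x.shape[1])])
                 for h, x in enumerate(X)]
        change = max(np.abs(a - b).max() for a, b in zip(w_new, w))
        w = w_new
        if change < tol:
            break
    Y = np.column_stack([standardize(x @ wh) for x, wh in zip(X, w)])
    return w, Y

# Synthetic data with the same layout as the example: blocks of 3, 2 and 6
# manifest variables, and LV 3 related to LV 1 and LV 2.
rng = np.random.default_rng(0)
n = 200
xi1 = rng.normal(size=n)
xi2 = rng.normal(size=n)
xi3 = 0.3 * xi1 - 0.7 * xi2 + 0.5 * rng.normal(size=n)

def make_block(xi, p):
    return np.column_stack([xi + 0.5 * rng.normal(size=n) for _ in range(p)])

blocks = [make_block(xi1, 3), make_block(xi2, 2), make_block(xi3, 6)]
links = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
w, Y = pls_path_mode_a_centroid(blocks, links)
print([np.round(wh, 3) for wh in w])
```

Because the data are simulated, the printed weights have no connection with the Russett results; the sketch only shows how steps (1) to (3) chain together.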
20
Use of PLS-Graph (Wynne Chin)
21
Results
Outer Model
==============================================
Variable      Weight     Loading (correlation)
----------------------------------------------
Ineg_agri (outward)
  gini        0.4567      0.9745
  farm        0.5125      0.9857
  rent        0.1018      0.5156
----------------------------------------------
Dev_ind (outward)
  gnpr        0.5113      0.9501
  labo       -0.5384     -0.9551
----------------------------------------------
Inst_pol (outward)
  inst        0.1187      0.3676
  ecks        0.2855      0.8241
  death       0.2977      0.7910
  demostab   -0.3271     -0.8635
  demoinst    0.0370      0.1037
  dictatur    0.2758      0.7227
==============================================
Loading = regression coefficient of Xhj on Yh
        = cor(Xhj, Yh) if the X are standardized
22
Results
Eta .. Latent variables
========================================
        ineg_agr   dev_indu   inst_pol
----------------------------------------
arg        .964       .238       .755
aus       1.204      1.371     -1.617
aut        .397       .253      -.480
bel       -.812      1.530      -.846
bol       1.115     -1.584      1.505
bré        .778      -.654       .302
...
tai       -.009      -.898      -.068
ru         .134      2.059     -1.046
eu         .193      2.016      -.942
uru        .699       .179     -1.298
ven       1.149       .252      1.135
rfa       -.212      1.104      -.494
you      -2.189      -.654       .125
========================================
23
PLS results
Latent variable estimation

Country        Y1      Y2      Y3
Argentina     0.96    0.24    0.75
Australia     1.20    1.37   -1.62
Austria       0.39    0.25   -0.48
...
France       -0.88    0.80    0.56
...
Yugoslavia   -2.19   -0.65    0.13

Multiple regression of Y3 on Y1 and Y2
R2 = 0.618
Political instability = 0.217 × Agricultural inequality - 0.692 × Industrial development
                        (t = 2.24)                        (t = -7.22)
Student t coming from multiple regression results
24
Economic inequality and political instability
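[Figure: path diagram annotated with the estimates. Loadings: GINI .974, FARM .986, RENT .516 on Agricultural inequality (X1); GNPR .950, LABO -.955 on Industrial development (X2); INST .368, ECKS .824, DEAT .791, D-STB -.864, D-UNS .104, DICT .723 on Political instability (X3). Path coefficients: .217 from Agricultural inequality and -.692 from Industrial development to Political instability, with R2 = 0.618.]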
25
Map of countries : Y1 = agricultural inequality , Y2 = industrial development
[Figure: scatter plot of the 47 countries, with Y1 (agricultural inequality) on the horizontal axis, from -2.5 to 1.5, and Y2 (industrial development) on the vertical axis, from -2.0 to 2.0. Each country is labelled with its regime: (1) stable democracy, (2) unstable democracy, (3) dictatorship.]
Countries above the horizontal axis (Y2 > 0): United Kingdom (1), United States (1), Canada (1), Switzerland (1), Belgium (1), Sweden (1), Australia (1), New Zealand (1), Netherlands (1), West Germany (2), Luxembourg (1), France (2), Denmark (1), Norway (1), Finland (2), Austria (2), Italy (2), Argentina (2), Ireland (1), Uruguay (1), Venezuela (3).
Countries below the horizontal axis (Y2 < 0): Cuba (3), Poland (3), Chile (2), Japan (2), Panama (3), Colombia (2), Greece (2), Costa Rica (2), Yugoslavia (3), Nicaragua (3), Spain (3), Brazil (2), El Salvador (3), Ecuador (3), Philippines (3), Dominican Republic (3), Taiwan (3), Guatemala (3), Peru (3), Iraq (3), South Vietnam (3), Honduras (3), Egypt (3), Libya (3), India (1), Bolivia (3).
26
Results
Inner Model
========================
Block        Mult.RSq
------------------------
Inégalit      0.0000
Développ      0.0000
Instabil      0.6180
========================

Path coefficients
========================================
           Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr     0.000      0.000      0.000
Dev_indu     0.000      0.000      0.000
Inst_pol     0.217     -0.692      0.000
========================================

Correlations of latent variables
========================================
           Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr     1.000
Dev_indu    -0.309      1.000
Inst_pol     0.431     -0.759      1.000
========================================
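The path coefficients and R2 can be recovered, up to rounding, from the correlations of the latent variables alone, because a regression on standardized variables only needs the correlation matrix. The sketch below is an illustration with NumPy; the variable names are hypothetical and the inputs are the three rounded correlations printed in the table.

```python
import numpy as np

# Correlations between the latent variable estimates (rounded, from the table).
r12 = -0.309   # Ineg_agr vs Dev_indu
r13 = 0.431    # Ineg_agr vs Inst_pol
r23 = -0.759   # Dev_indu vs Inst_pol

R_xx = np.array([[1.0, r12],
                 [r12, 1.0]])        # correlations between the two predictors
r_xy = np.array([r13, r23])          # correlations of predictors with Inst_pol

beta = np.linalg.solve(R_xx, r_xy)   # standardized regression coefficients
r_squared = float(beta @ r_xy)       # R2 of the regression

print(np.round(beta, 3), round(r_squared, 3))
```

The result agrees with the reported 0.217, -0.692 and 0.618 to within the rounding of the input correlations.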
27
Results
Outer Model
===============================================================
Variable      Weight    Loading   Communality   Redundancy
---------------------------------------------------------------
Ineg_agri (outward)
  gini        0.4567    0.9745      0.9496        0.0000
  farm        0.5125    0.9857      0.9716        0.0000
  rent        0.1018    0.5156      0.2659        0.0000
---------------------------------------------------------------
Dev_indu (outward)
  gnpr        0.5113    0.9501      0.9027        0.0000
  labo       -0.5384   -0.9551      0.9123        0.0000
---------------------------------------------------------------
Inst_pol (outward)
  inst        0.1187    0.3676      0.1352        0.0835
  ecks        0.2855    0.8241      0.6792        0.4197
  death       0.2977    0.7910      0.6257        0.3867
  demostab   -0.3271   -0.8635      0.7457        0.4608
  demoinst    0.0370    0.1037      0.0107        0.0066
  dictatur    0.2758    0.7227      0.5223        0.3228
===============================================================
Average redundancy (Inst_pol) = 0.28
Communality = Cor2(Xhj, Yh) = Loading2
For an endogenous LV: Redundancy = Cor2(Xhj, Yh) × R2(Yh, LVs explaining Yh)
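As a check on the two definitions, the communality and redundancy columns of the Inst_pol block can be recomputed from the loadings and the R2 of 0.618. This is a small illustrative sketch in plain Python; the dictionary names are hypothetical, and small differences in the fourth decimal come from the rounding of the printed loadings.

```python
# Loadings of the Inst_pol block and its R2, as printed in the tables.
loadings = {
    "inst": 0.3676, "ecks": 0.8241, "death": 0.7910,
    "demostab": -0.8635, "demoinst": 0.1037, "dictatur": 0.7227,
}
r2_inst_pol = 0.6180

# Communality = squared loading; redundancy = communality x R2.
communality = {name: value ** 2 for name, value in loadings.items()}
redundancy = {name: c * r2_inst_pol for name, c in communality.items()}
average_redundancy = sum(redundancy.values()) / len(redundancy)

for name in loadings:
    print(f"{name:10s} {communality[name]:.4f} {redundancy[name]:.4f}")
print(f"average redundancy = {average_redundancy:.2f}")
```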
28
Results
Inner Model
====================================================================
Block        Mult.RSq   AvCommun   AvRedund   Goodness of Fit
--------------------------------------------------------------------
Ineg_agri     0.0000     0.7290     0.0000
Dev_indu      0.0000     0.9075     0.0000
Inst_pol      0.6180     0.4531     0.2800
--------------------------------------------------------------------
Average       0.6180     0.6110     0.2800        .614
====================================================================
$\mathrm{GoF} = \sqrt{.618 \times .611}$
where .618 is the value of the internal model and .611 is the value of the external model.
$(\text{Average communality})_h = \frac{1}{p_h} \sum_{j=1}^{p_h} \mathrm{Cor}^2(X_{hj}, Y_h)$
= Average variance of Xh explained by Yh
= AVEh
A latent variable must explain at least 50% of its block variance.
Average Communality = (3*AvCommun1 + 2*AvCommun2 + 6*AvCommun3)/11
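The arithmetic behind the .611 and .614 figures can be reproduced directly from the table. The following is a short sketch in plain Python, with hypothetical variable names; the block sizes 3, 2 and 6 are the numbers of manifest variables per block.

```python
import math

# Number of manifest variables and average communality per block (from the table).
block_sizes = {"Ineg_agri": 3, "Dev_indu": 2, "Inst_pol": 6}
av_commun = {"Ineg_agri": 0.7290, "Dev_indu": 0.9075, "Inst_pol": 0.4531}

# R2 of the endogenous latent variables (only Inst_pol here).
r2_endogenous = [0.6180]

total_mvs = sum(block_sizes.values())
average_communality = sum(
    block_sizes[b] * av_commun[b] for b in block_sizes
) / total_mvs
average_r2 = sum(r2_endogenous) / len(r2_endogenous)

gof = math.sqrt(average_r2 * average_communality)
print(round(average_communality, 3), round(gof, 3))
```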
29
A global index of model fit
PLS Goodness of Fit
Inner Model
===========================================================
Block        Mult.RSq   AvCommun   AvRedund   GoF
-----------------------------------------------------------
Ineg_agri     0.0000     0.7290     0.0000
Dev_indu      0.0000     0.9075     0.0000
Inst_pol      0.6180     0.4531     0.2800
-----------------------------------------------------------
Average       0.618      0.6110     0.2800    .614
===========================================================
$\mathrm{GoF} = \sqrt{\left[\frac{1}{\text{Nb of endogenous LVs}} \sum_{h} R^2(Y_h, \text{other LVs explaining } Y_h)\right] \times \left[\frac{1}{\text{Nb of MVs}} \sum_{h} \sum_{j=1}^{p_h} \mathrm{Cor}^2(X_{hj}, Y_h)\right]}$
The first factor measures the inner model, the second the outer model.
30
Discriminant validity
An LV should share more variance with its own MVs than with the other LVs.
AVE (diagonal) and squared correlations (off-diagonal)
========================================
           Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr     0.729
Dev_indu     0.095      0.907
Inst_pol     0.186      0.576      0.453 ????
========================================
AVE(Yj) must be larger than cor2(Yj, Yh) for all h.
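The rule can be checked mechanically: for each LV, compare its AVE with its largest squared correlation with another LV. Below is a minimal NumPy sketch using the values of the table; the array names are hypothetical. It flags Inst_pol, whose AVE of 0.453 is below its squared correlation of 0.576 with Dev_indu, which is what the "????" in the table points at.

```python
import numpy as np

names = ["Ineg_agr", "Dev_indu", "Inst_pol"]
ave = np.array([0.729, 0.907, 0.453])

# Squared correlations between latent variables (symmetric, zero diagonal).
sq_cor = np.array([
    [0.000, 0.095, 0.186],
    [0.095, 0.000, 0.576],
    [0.186, 0.576, 0.000],
])

largest_sq_cor = sq_cor.max(axis=1)   # worst case for each latent variable
passes = ave > largest_sq_cor

for name, a, c, ok in zip(names, ave, largest_sq_cor, passes):
    print(f"{name}: AVE = {a:.3f}, max cor2 = {c:.3f}, ok = {bool(ok)}")
```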
31
Using PLS-Graph
(t=1.705)
(t=-7.685)
t coming from bootstrap re-sampling
32
Bootstrap validation in PLS-Graph
Sign control: Individual sign changes / Construct level changes*
Outer Model Loadings:
====================================================================
                Entire sample   Mean of      Standard   T-Statistic
                estimate        subsamples   error
--------------------------------------------------------------------
Agricultural inequality:
  gini            0.9745         0.9584       0.0336      28.9616
  farm            0.9857         0.9689       0.0329      29.9339
  rent            0.5156         0.4204       0.2462       2.0946
Industrial development:
  gnpr            0.9501         0.9489       0.0121      78.3692
  labo           -0.9551        -0.9536       0.0107     -89.1493
Political instability:
  inst            0.3676         0.3347       0.1756       2.0932
  ecks            0.8241         0.8138       0.0699      11.7920
  demostab       -0.8635        -0.8520       0.0667     -12.9419
  demoinst        0.1037         0.0955       0.1611       0.6438
  dictatur        0.7227         0.7195       0.0841       8.5915
  death           0.7910         0.7977       0.0528      14.9773
====================================================================
(*) used here
33
Bootstrap validation in PLS-Graph
Sign control
Individual sign changes
The sign of each bootstrapped weight is automatically set equal
to the sign of the corresponding full-sample weight.
Construct level changes (Default)
For each LV (construct), the signs of all its weights are reversed together if
the new loadings (after reversal) are closer to the full-sample
loadings than the bootstrapped loadings (before reversal).
34
PLS-Graph : Bootstrap Validation
Path Coefficients Table (Entire Sample Estimate):
====================================================================
                Inég. Agric.   Dev. Indust.   Instab. Pol.
Inég. Agric.      0.0000         0.0000         0.0000
Dev. Indust.      0.0000         0.0000         0.0000
Inst. Pol.        0.2170        -0.6920         0.0000
====================================================================
Path Coefficients Table (Mean of Subsamples):
====================================================================
                Inég. Agric.   Dev. Indust.   Instab. Pol.
Inég. Agric.      0.0000         0.0000         0.0000
Dev. Indust.      0.0000         0.0000         0.0000
Inst. Pol.        0.2328        -0.6743         0.0000
====================================================================
Path Coefficients Table (Standard Error):
====================================================================
                Inég. Agric.   Dev. Indust.   Instab. Pol.
Inég. Agric.      0.0000         0.0000         0.0000
Dev. Indust.      0.0000         0.0000         0.0000
Instabil          0.1272         0.0900         0.0000
====================================================================
Path Coefficients Table (T-Statistic):
====================================================================
                Inég. Agric.   Dev. Indust.   Instab. Pol.
Inég. Agric.      0.0000         0.0000         0.0000
Dev. Indust.      0.0000         0.0000         0.0000
Inst. Pol.        1.7054        -7.6855         0.0000
====================================================================
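The printed T-statistics are consistent with dividing the entire-sample estimate by the bootstrap standard error. The plain-Python sketch below checks this on the two path coefficients; it is an inference from the printed numbers rather than a documented description of PLS-Graph, and the dictionary names are hypothetical.

```python
# Entire-sample estimates and bootstrap standard errors, from the tables.
estimates = {"Ineg_agr -> Inst_pol": 0.2170, "Dev_indu -> Inst_pol": -0.6920}
std_errors = {"Ineg_agr -> Inst_pol": 0.1272, "Dev_indu -> Inst_pol": 0.0900}

# t = estimate / bootstrap standard error
t_stats = {path: estimates[path] / std_errors[path] for path in estimates}

for path, t in t_stats.items():
    print(f"{path}: t = {t:.2f}")
```

The ratios match the reported 1.7054 and -7.6855 up to the rounding of the printed standard errors.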
35
SPECIAL CASES OF
PLS PATH MODELLING
• Principal component analysis
• Multiple factor analysis
• Canonical correlation analysis
• Redundancy analysis
• PLS Regression
• Generalized canonical correlation analysis (Horst)
• Generalized canonical correlation analysis (Carroll)
36
Options of the PLS algorithm
External estimation
  Yj = Xjwj
  Mode A (for reflective): wjh = cor(Xjh, Zj)
  Mode B (for formative): $w_j = (X_j' X_j)^{-1} X_j' Z_j$
Internal estimation
  $Z_j \propto \sum_i e_{ji} Y_i$
  Centroid scheme: eji = sign of cor(Yi, Yj)
  Factorial scheme: eji = cor(Yi, Yj)
  Path weighting scheme: eji = regression coeff. in the regression of Yj on the Yi's
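The centroid and factorial schemes differ only in how the inner weights eji are filled in. Here is a minimal NumPy sketch, assuming a hypothetical 0/1 `links` matrix that says which latent variables are related; the path weighting scheme is left out, and the function name and random data are illustrative only.

```python
import numpy as np

def inner_weights(Y, links, scheme="centroid"):
    """Inner weights e_ji for latent variable estimates Y (one column per LV)."""
    R = np.corrcoef(Y, rowvar=False)
    links = np.asarray(links, dtype=float)
    if scheme == "centroid":
        return np.sign(R) * links      # e_ji = sign of cor(Yi, Yj)
    if scheme == "factorial":
        return R * links               # e_ji = cor(Yi, Yj)
    raise ValueError("scheme must be 'centroid' or 'factorial'")

# Illustrative data: three outer estimates, LV 3 related to LV 1 and LV 2.
rng = np.random.default_rng(2)
Y = rng.normal(size=(50, 3))
links = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]

E_centroid = inner_weights(Y, links, "centroid")
E_factorial = inner_weights(Y, links, "factorial")

# Inner estimates, up to scale: Z_j = sum_i e_ji Y_i
Z = Y @ E_centroid.T
print(E_centroid)
```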
37
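The general PLS algorithm
Initial step: choose starting weights $w_j$.
Outer estimation (standardized): $Y_j \propto X_j w_j$
Inner estimation: $Z_j = e_{j1} Y_{j1} + e_{j2} Y_{j2} + \dots + e_{jm} Y_{jm}$
  Choice of weights e: Centroid, Factorial or Path weighting scheme
Weight update:
  Mode A: $w_j = \frac{1}{n} X_j' Z_j$, i.e. $w = \mathrm{cor}(X, Z)$
  Mode B: $w_j = \left(\frac{1}{n} X_j' X_j\right)^{-1} \left(\frac{1}{n} X_j' Z_j\right)$
The outer and inner steps are repeated with the updated weights.
Look at the loading, not at the w.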
38
Some modified multi-block methods for SEM
$c_{jk} = 1$ if blocks are linked, 0 otherwise

SUMCOR (Horst, 1961):
  $\max \sum_{j,k} c_{jk}\, \mathrm{Cor}(F_j, F_k)$
  PLS: B, Horst (New)
Mathes (1993), Hanafi (2004):
  $\max \sum_{j,k} c_{jk}\, \mathrm{Cor}^2(F_j, F_k)$
  PLS: B, Factorial
Mathes (1993), Hanafi (2004):
  $\max \sum_{j,k} c_{jk}\, |\mathrm{Cor}(F_j, F_k)|$
  PLS: B, Centroid
MAXBET (Van de Geer, 1984):
  $\max_{\text{all } \|w_j\| = 1} \left[ \sum_j \mathrm{Var}(X_j w_j) + \sum_{j \ne k} c_{jk}\, \mathrm{Cov}(X_j w_j, X_k w_k) \right]$
MAXDIFF (Van de Geer, 1984):
  $\max_{\text{all } \|w_j\| = 1} \left[ \sum_{j \ne k} c_{jk}\, \mathrm{Cov}(X_j w_j, X_k w_k) \right]$
  PLS: A, Horst (NEW APPROACH)
MAXDIFF B (Hanafi & Kiers, 2006):
  $\max_{\text{all } \|w_i\| = 1} \sum_{i \ne j} c_{ij}\, \mathrm{Cov}^2(X_i w_i, X_j w_j)$
  PLS: A, Factorial (NEW APPROACH)
PLS approach : 2 blocks
[Figure: two blocks X1 and X2, each with its own latent variable (1 and 2).]

Mode for weight calculation
Y1 = X1w1   Y2 = X2w2   Method                                          Deflation (*)
A           A           PLS regression of X2 on X1                      On X1 only
B           A           Redundancy analysis of X2 with respect to X1    On X1 only
A           A           Tucker Inter-Battery Factor Analysis            On X1 and X2
B           B           Canonical correlation analysis                  On X1 and X2

(*) Deflation: working on the residuals of the regression of X on the previous
LVs in order to obtain orthogonal LVs.
40
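PLS regression (2 components)
- Mode A for X
- Mode A for Y
- Deflate only X
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = \|b\| = 1} \mathrm{Cov}(Xa, Yb)$
$\Leftrightarrow \max_{\|a\| = \|b\| = 1} \mathrm{Cor}(Xa, Yb) \sqrt{\mathrm{Var}(Xa)} \sqrt{\mathrm{Var}(Yb)}$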
41
PLS Regression in SIMCA-P : PLS Scores
[Figure: scatter plot of the countries on the first two PLS score vectors, t[1] (horizontal axis, -4 to 4) and t[2] (vertical axis, -4 to 3).]
42
Correlation loadings
[Figure: correlation loading plot of the variables RENT, GINI, FARM, GNPR, LABO, INST, ECKS, DEAT, DEMOSTB, DEMOINST and DICTATURE on pc(corr)[Comp. 1] (horizontal axis) and pc(corr)[Comp. 2] (vertical axis), both from -1.00 to 1.00.]
43
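Redundancy analysis of X on Y
(2 components)
- Mode A for X
- Mode B for Y
- Deflate only X
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = \mathrm{Var}(Yb) = 1} \mathrm{Cov}(Xa, Yb)$
$\Leftrightarrow \max_{\|a\| = \mathrm{Var}(Yb) = 1} \mathrm{Cor}(Xa, Yb) \sqrt{\mathrm{Var}(Xa)}$
$\Leftrightarrow \max_{\mathrm{Var}(Yb) = 1} \sum_j \mathrm{Cor}^2(x_j, Yb)$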
44
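Inter-battery factor analysis (2 components)
- Mode A for X
- Mode A for Y
- Deflate both X and Y
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = \|b\| = 1} \mathrm{Cov}(Xa, Yb)$
$\Leftrightarrow \max_{\|a\| = \|b\| = 1} \mathrm{Cor}(Xa, Yb) \sqrt{\mathrm{Var}(Xa)} \sqrt{\mathrm{Var}(Yb)}$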
45
Canonical correlation analysis (2 components)
- Mode B for X
- Mode B for Y
- Deflate both X and Y
Criterion for each dimension (dim 1, dim 2):
$\max_{\mathrm{Var}(Xa) = \mathrm{Var}(Yb) = 1} \mathrm{Cov}(Xa, Yb)$
46
PLS approach : K blocks
[Figure: K blocks X1, ..., XK, each with its own latent variable, linked to a super-block X made of X1, ..., XK.]

Mode for weight   Scheme for internal estimation calculation
calculation       Centroid                 Factorial                Structural
A                 NEW !                    NEW !                    PCA of X; Multiple Factor Analysis
                                                                    of the Xj's; ACOM (Chessel & Hanafi)
B                 Generalized Canonical    Generalized Canonical    NEW !
                  Correlation Analysis     Correlation Analysis
                  (Horst)                  (Carroll)

Deflation: on the super-block only
47
A new PLS algorithm
Arthur & Michel Tenenhaus
$c_{ij} = 1$ if blocks are linked, 0 otherwise
Horst scheme: Maximize $\sum_{i,j} c_{ij}\, \mathrm{Cov}(X_i a_i, X_j a_j)$
Centroid scheme: Maximize $\sum_{i,j} c_{ij}\, |\mathrm{Cov}(X_i a_i, X_j a_j)|$
Factorial scheme: Maximize $\sum_{i,j} c_{ij}\, \mathrm{Cov}^2(X_i a_i, X_j a_j)$
subject to the following constraints:
$\tau_i \|a_i\|^2 + (1 - \tau_i)\, \mathrm{Var}(X_i a_i) = 1$
For $\tau_i = 1$ ⇒ Mode A: $\|a_i\|^2 = 1$
For $\tau_i = 0$ ⇒ Mode B: $\mathrm{Var}(X_i a_i) = 1$
48
CONCLUSION
• PLS IS TO COVARIANCE-BASED SEM AS
PRINCIPAL COMPONENT ANALYSIS IS TO
FACTOR ANALYSIS.
• WHEN INDIVIDUAL DATA ARE AVAILABLE,
SIGNIFICANCE TESTS CAN BE CARRIED
OUT WITH PLS BY CROSS VALIDATION
METHODS.
49
European Customer Satisfaction Index
PLS Path Modelling versus LISREL
Michel Tenenhaus
tenenhaus@hec.fr
50
The European Customer Satisfaction Index
(ECSI)
• ECSI is an economic indicator that
measures customer satisfaction.
• It is an adaptation of the Swedish Customer
Satisfaction Barometer and the American
Customer Satisfaction Index (ACSI)
proposed by Claes Fornell.
• Fornell’s methodology is presented.
51
Path model describing causes and consequences
of Customer Satisfaction
[Figure: path diagram linking Image, Customer Expectation, Perceived quality, Perceived value, Customer satisfaction, Complaints and Loyalty.]
Full model in red and blue, Reduced model in red
52
Content of the presentation
• Use of Fornell’s methodology on the full ECSI
model
• Use of Fornell’s methodology on the reduced
model
• Use of SEM-ML on the reduced model
(SEM-ML did not work on the full model)
• Comparison between PLS and SEM-ML results on
the reduced model
53
Measurement Instrument for the Mobile Phone Industry :
Examples of latent and manifest variables
Customer expectation
a) Expectations for the overall quality of "your mobile phone provider" at the
moment you became customer of this provider.
b) Expectations for "your mobile phone provider" to provide products and
services to meet your personal need.
c) How often did you expect that things could go wrong at "your mobile phone
provider" ?
Customer satisfaction
a) Overall satisfaction
b) Fulfilment of expectations
c) How well do you think "your mobile phone provider" compares with your
ideal mobile phone provider ?
54
Measurement Instrument for the Mobile Phone Industry :
Examples of latent and manifest variables
Customer loyalty
a) If you would need to choose a new mobile phone provider how
likely is it that you would choose “your provider” again ?
b) Let us now suppose that other mobile phone providers decide
to lower fees and prices, but “your mobile phone provider”
stays at the same level as today. At which level of difference (in %)
would you choose another phone provider ?
c) If a friend or colleague asks you for advice, how likely is it that
you would recommend “your mobile phone provider” ?
And so on for the other latent variables ...
55
I. Study of the complete model using
Fornell's approach
• Manifest variables V are transformed from a "1-10" scale to a "0-100" scale:
$x = \frac{V - 1}{9} \times 100$
• Each latent variable is estimated as a weighted average of
its manifest variables.
• PLS Path modeling is used to estimate the weights with
the Mode A and Centroid scheme options.
• Path coefficients are computed by multiple regression on
the estimated latent variables, and t-statistics by cross-validation (bootstrap).
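The rescaling in the first bullet is a one-line transformation. A minimal Python sketch follows; the function name is hypothetical.

```python
def rescale_1_10_to_0_100(v):
    """Map an answer V on the 1-10 scale to x = (V - 1) / 9 * 100."""
    return (v - 1) / 9 * 100

# The end points of the questionnaire scale map to 0 and 100.
for answer in (1, 5.5, 10):
    print(answer, rescale_1_10_to_0_100(answer))
```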
56
Use of PLS-Graph (Wynne Chin)
57
Results : The weights
Outer Model
========================
Variable       Weight
------------------------
Image (outward)
  IMAG1        0.0147
  IMAG2        0.0127
  IMAG3        0.0137
  IMAG4        0.0177
  IMAG5        0.0143
------------------------
Expectat (outward)
  CUEX1        0.0232
  CUEX2        0.0224
  CUEX3        0.0252
------------------------
Per_Qual (outward)
  PERQ1        0.0098
  PERQ2        0.0085
  PERQ3        0.0118
  PERQ4        0.0094
  PERQ5        0.0084
  PERQ6        0.0095
  PERQ7        0.0129
------------------------
Per_Valu (outward)
  PERV1        0.0239
  PERV2        0.0247
------------------------
Satisfac (outward)
  CUSA1        0.0158
  CUSA2        0.0231
  CUSA3        0.0264
------------------------
Complain (outward)
  CUSCO        0.0397
------------------------
Loyalty (outward)
  CUSL1        0.0185
  CUSL2        0.0061
  CUSL3        0.0225
========================
58
Fornell’s computation of the latent variables
Example : Customer Satisfaction Index
$\mathrm{CSI} = \frac{0.0158 \times \mathrm{CUSA1} + 0.0231 \times \mathrm{CUSA2} + 0.0264 \times \mathrm{CUSA3}}{0.0158 + 0.0231 + 0.0264}$
Mean and standard deviation of the latent variables

                          N    Minimum   Maximum     Mean     Std. Deviation
IMAGE                    250    26.49    100.00    72.6878      13.7660
CUSTOMER EXPECTATION     250    25.85    100.00    72.3198      14.1259
PERCEIVED QUALITY        250    23.95    100.00    74.5765      14.2573
PERCEIVED VALUE          250      .00    100.00    61.5887      20.5987
CUSTOMER SATISFACTION    250    23.68    100.00    71.2876      15.3417
COMPLAINT                250      .00    100.00    67.4704      25.2684
LOYALTY                  250     1.29    100.00    69.1757      21.2668
59
Correlations between manifest variables and latent variables
Image1
Image2
Image3
Image4
Image5
C_exp1
C_exp2
C_exp3
P_qual1
P_qual2
P_qual3
P_qual4
P_qual5
P_qual6
P_qual7
P_val1
P_val2
C_sat1
C_sat2
C_sat3
Complaint
Loyalty1
Loyalty2
Loyalty3
Image
.717
.565
.657
.791
.698
.622
.
.621
.
.599
.551
.596
.541
.558
.524
.613
Customer
expectation
.689
.644
.724
.537
Perceived
quality
.571
Perceived
value
Customer
satisfaction
.539
.571
.544
.543
.500
.778
.651
.801
.760
.732
.766
.803
.661
.594
.638
.672
.684
.537
.547
.933
.911
.588
Complaint
Loyalty
.651
.587
.516
.539
.707
.631
.711
.872
.884
.540
.524
.547
1
.610
.854
.528
.537
.659
Correlations below 0.5 in absolute value are not shown.
.869
60
ECSI Path model for a "Mobile phone provider"
Regression on standardized variables and t-statistics provided
by PLS-Graph bootstrap, construct level change option
[Figure: path diagram linking Image, Customer Expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Path coefficients with t-statistics in parentheses, as printed on the arrows: .211 (2.54), .492 (7.67), .153 (3.07), .037 (1.14), .468 (5.18), .066 (1.10), .544 (10.71), .541 (6.93), .201 (3.59), .049 (1.11), .540 (11.08), .543 (8.62). R2 values printed on the diagram: .242, .432, .335, .672, .296, .292.]
61
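II. Study of the reduced model
using Fornell's approach
[Figure: path diagram of the reduced model linking Customer Expectation, Perceived quality, Perceived value, Customer satisfaction and Loyalty. Path coefficients with t-statistics in parentheses, as printed on the arrows: .053 (1.20), .070 (1.08), .545 (8.92), .538 (6.59), .216 (12.35), .634 (11.50), .638 (3.70). R2 values printed on the diagram: .335, .660, .402, .297.]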
62
The new PLS weights

Variable    Weight    Relative weight
CE1          .0237        .336
CE2          .0206        .292
CE3          .0262        .372
PQ1          .0098        .139
PQ2          .0085        .121
PQ3          .0118        .168
PQ4          .0094        .134
PQ5          .0084        .119
PQ6          .0095        .135
PQ7          .0129        .183
PV1         -.0239        .492
PV2         -.0247        .508
CS1          .0157        .241
CS2          .0240        .368
CS3          .0256        .392
CL1         -.0188        .405
CL2         -.0050        .108
CL3         -.0226        .487

Within each latent variable's block, the relative weights sum to 1.
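The relative weights are the raw weights divided by their sum within each block, which also removes the common negative sign in the PV and CL blocks. A short Python sketch for three of the blocks follows; the dictionary layout is hypothetical.

```python
# Raw PLS weights per block, from the table.
weights = {
    "CE": [0.0237, 0.0206, 0.0262],
    "PV": [-0.0239, -0.0247],
    "CL": [-0.0188, -0.0050, -0.0226],
}

# Relative weight = weight / sum of the weights of the same block.
relative = {
    block: [w / sum(ws) for w in ws]
    for block, ws in weights.items()
}

for block, values in relative.items():
    print(block, [round(v, 3) for v in values])
```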
63
III. Study of the reduced model using AMOS:
Model 1 (Standardized Results)
[Figure: AMOS path diagram with standardized estimates for the latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, their manifest variables (CE1-CE3, PQ1-PQ7, PV1-PV2, CSI1, CSI2, CS3, CL1-CL3), error terms e1-e18 and disturbances d1-d4.]
Chi-Square = 271
DF = 128
Chi-Square/DF = 2.12
64
Reduced model 2 (Standardized results)
[Figure: AMOS path diagram with standardized estimates for the latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, their manifest variables (CE1-CE3, PQ1-PQ7, PV1-PV2, CSI1, CSI2, CS3, CL1-CL3), error terms e1-e18 and disturbances d1-d4.]
Chi-Square = 271
DF = 130
Chi-Square/DF = 2.08
RMSEA = 0.066
H0: RMSEA ≤ 0.05 : p-value = 0.01
65
Reduced model 2 (Unstandardized results)
[Figure: AMOS path diagram with unstandardized estimates (loadings, path coefficients and variances) for the latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, their manifest variables, error terms e1-e18 and disturbances d1-d4.]
Chi-Square = 271.118
df = 130
Chi-Square/df = 2.086
rmsea = .066
p-value (rmsea =< .05) = .010
66
Specific estimation of the latent variables
• Each latent variable is estimated as a weighted average of its
own manifest variables, using the loadings $\lambda_{hj}$.
• For example
$Y_4 = \frac{\lambda_{41}\,\mathrm{CUSA1} + \lambda_{42}\,\mathrm{CUSA2} + \lambda_{43}\,\mathrm{CUSA3}}{\lambda_{41} + \lambda_{42} + \lambda_{43}}$
is the Customer Satisfaction Index score.
• Each coefficient $\lambda_{4j}$ is the regression coefficient of $\xi_4$ in the
regression relating the manifest variable $X_{4j}$ to its latent
variable $\xi_4$ (similar to PLS weight estimation when mode A is
used).
67
Loadings and LISREL weights

Variable    Loading    Weight
CE1          1.000      .343
CE2          0.913      .313
CE3          1.004      .344
PQ1          1.000      .133
PQ2          0.988      .132
PQ3          1.241      .165
PQ4          1.061      .141
PQ5          0.911      .121
PQ6          1.045      .139
PQ7          1.265      .168
PV1          1.000      .483
PV2          1.069      .517
CS1          1.000      .239
CS2          1.549      .370
CS3          1.634      .391
CL1          1.000      .424
CL2          0.202      .086
CL3          1.155      .490
68
Comparison between the PLS and LISREL
weights
[Figure: scatter plot of the LISREL weight (vertical axis, 0.0 to .6) against the PLS relative weight (horizontal axis, .1 to .6).]
69
Correlations between the PLS latent variables
and the specific LISREL latent variables
[Figure: five scatter plots of the PLS latent variable against the corresponding specific LISREL latent variable, for CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY.]
All the correlations are above .998
70
First conclusions
• If COV-BASED SEM works, the PLS results
can be derived from the COV-BASED SEM
results.
• If COV-BASED SEM does not work, PLS is
still an alternative.
• If COV-BASED SEM is not adequate (small
number of observations and/or large number
of variables) PLS can be used for exploratory
purposes.
71
Usual estimation of latent variables in LISREL
proc calis covariance modification data=ecsi outstat=a;
  lineqs
    CUEX1 = 1 f1 + e1,
    CUEX2 = Lambda12 f1 + e2,
    CUEX3 = Lambda13 f1 + e3,
    .
    .
    .
    CUSL1 = 1 f5 + e16,
    CUSL2 = Lambda52 f5 + e17,
    CUSL3 = Lambda53 f5 + e18,
    f2 = beta21 f1 + d2,
    f3 = beta31 f1 + beta32 f2 + d3,
    f4 = beta41 f1 + beta42 f2 + beta43 f3 + d4,
    f5 = beta54 f4 + d5;
  std
    e1-e18 = vare1-vare18,
    d2-d5 = vard2-vard5,
    f1 = varf1;
  var
    CUEX1 CUEX2 CUEX3 PERQ1 PERQ2 PERQ3 PERQ4 PERQ5
    PERQ6 PERQ7 PERV1 PERV2 CUSA1 CUSA2 CUSA3 CUSL1
    CUSL2 CUSL3;
run;
proc print data=a (where = (_type_="SCORE"));
run;
72
Variable weights for the usual estimation of the latent variables
in LISREL
CUEX1
f1
f2
f3
f4
f5
PERQ4
0.046274
0.081633
0.007765
0.028778
0.019737
CUSA2
0.027752
0.042369
0.024356
0.083861
0.057516
0.11102
0.03242
-0.00083
0.01321
0.00906
CUEX2
0.074334
0.021705
-0.000558
0.008842
0.006064
CUEX3
0.055776
0.016287
-0.000418
0.006634
0.004550
PERQ7
PERQ1
PERQ2
PERQ3
0.07362
0.12987
0.01235
0.04578
0.03140
0.024507
0.043233
0.004112
0.015241
0.010453
0.050785
0.089590
0.008522
0.031583
0.021661
PERV1
PERV2
CUSA1
PERQ5
PERQ6
0.049023
0.086483
0.008227
0.030488
0.020910
0.046702
0.082388
0.007837
0.029044
0.019920
0.051524
0.090894
0.008646
0.032043
0.021977
CUSA3
CUSL1
CUSL2
CUSL3
0.03606
0.05506
0.03165
0.10898
0.07474
0.00387
0.00591
0.00340
0.01170
0.10838
0.000423
0.000645
0.000371
0.001277
0.011830
0.01533
0.02340
0.01345
0.04632
0.42895
-0.00071
0.00466
0.10833
0.00992
0.00680
-0.00440
0.02873
0.66864
0.06122
0.04199
0.030850
0.047098
0.027074
0.093221
0.063936
73
Correlations between the PLS latent variables and
the usual estimated LISREL latent variables
[Figure: five scatter plots of the PLS latent variables (CUS_EXP, PER_QUAL, PER_VAL, CSI, CUS_LOY) against the usual LISREL estimates. Rsq values printed on the panels: 0.6227, 0.8981, 0.9010, 0.9599, 0.8592.]
74
Final conclusions
• COV-BASED SEM did not work on the full model.
• COV-BASED SEM gives better results for the inner model
(relating the latent variables to one another) because the
latent variables are space-free.
• PLS gives better results for the outer model (relating the
manifest variables to their latent variables) because each
latent variable is constrained to be in its own manifest
variables space.
• If each COV-BASED SEM latent variable is estimated as
a weighted average of its own manifest variables, then
COV-BASED SEM and PLS give almost identical latent
variable estimates (at least on the examples we have
studied).
75
PLS Path Modeling and Multiple
Table Analysis
Application to the study of the cosmetic
habits of women in Ile-de-France
Christiane Guinot (CERIES/CHANEL)
& Michel Tenenhaus (HEC)
76
Objective of the analysis
We have applied the PLS approach to a study
of the cosmetic habits of women living in
Ile-de-France
The aim of the project was to obtain a global score
describing the propensity to use cosmetic products
in this sample
Then, we used behavioural and skin characteristic
variables, which are known to account for the
variation in use of cosmetic products, to check on
the relevance of this score
77
Data
The cosmetic products were divided into four blocks corresponding to different cosmetic practices:
Body care: soap, liquid soap, moisturising body cream, hand creams
Face care: make-up and eye make-up removers, tonic lotions, day creams, night creams, exfoliation products
Make-up: blushers, mascaras, eye shadows, eye pencils, lipsticks, lip shiners and nail polish
Sun care: sun protection products for face and for body, after-sun products for face and for body
78
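Construction of a global score
[Figure: hierarchical path model. The manifest variables of the four cosmetic practices (Body care, Face care, Make-up, Sun care) define four partial scores, latent variables 1 to 4 (B). The same four blocks of manifest variables taken together define the global score, latent variable (A), to which the four partial scores are linked.]
Inner model : centroid scheme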
79
Results
Correlations
Body
care
Face
care
Make
-up
Sun
care
soap body
liquid soap
body cream
hand cream
make-up rem.
tonic lotion
eye m.up rem.
day cream
night cream
exfoliation pdt
blusher
mascara
eye shadow
eye pencil
lipstick
nail polish
protec face
after sun face
protec body
after sun body
-.24
.47
.80
.56
Propensity to use
cosmetic products
1
.27
.44
.56
.56
.46
.55
.57
.57
.72
.58
.43
.53
.47
.64
.73
.71
.78
2
.42
.43
3
4
Global
score

.44
Regression coefficients
soap body            -.12
liquid soap           .23
body cream            .40
hand cream            .28
make-up remover       .32
tonic lotion          .40
eye m.up remover      .40
day cream             .33
night cream           .39
exfoliation pdt       .41
blusher               .39
mascara               .49
eye shadow            .39
eye pencil            .29
lipstick              .36
nail polish           .31
sun protec. face      .39
after sun face        .45
sun protec. body      .44
after sun body        .49
80
Result : Global score
Global score
=
-3.40
- .11 * soaps and toilet soaps for body care
+.20 * liquid soaps for body care
+.38 * moisturising body creams and milks
+.25 * hand creams and milks
+.21 * make-up removers
+.26 * tonic lotions
+.30 * eye make-up removers
+.39 * moisturising day creams
+.30 * moisturising night creams
+.30 * exfoliation products
+.26 * blushers
+.41 * mascara
+.26 * eye shadows
+.20 * eye pencils
+.33 * lipsticks and lip shiners
+.20 * nail polish
+.36 * sun protection products for the face
+.31 * moisturising after sun products for the face
+.38 * sun protection products for the body
+.34 * moisturising after sun products for the body
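The global score above is a plain linear combination, so it can be evaluated with a few lines of code. The sketch below copies the intercept and the 20 coefficients from the slide. The short variable names and the function `global_score` are hypothetical, and the slides do not say how the usage variables are coded (for example as frequencies of use), so any input values are an assumption on the reader's side.

```python
INTERCEPT = -3.40

# Coefficients copied from the slide; the keys are hypothetical short names.
COEFFICIENTS = {
    "soap_body": -0.11,
    "liquid_soap_body": 0.20,
    "body_cream": 0.38,
    "hand_cream": 0.25,
    "makeup_remover": 0.21,
    "tonic_lotion": 0.26,
    "eye_makeup_remover": 0.30,
    "day_cream": 0.39,
    "night_cream": 0.30,
    "exfoliation_product": 0.30,
    "blusher": 0.26,
    "mascara": 0.41,
    "eye_shadow": 0.26,
    "eye_pencil": 0.20,
    "lipstick": 0.33,
    "nail_polish": 0.20,
    "sun_protection_face": 0.36,
    "after_sun_face": 0.31,
    "sun_protection_body": 0.38,
    "after_sun_body": 0.34,
}


def global_score(usage):
    """Return intercept + sum(coefficient * usage value) over the 20 products."""
    missing = set(COEFFICIENTS) - set(usage)
    if missing:
        raise ValueError(f"missing usage values for: {sorted(missing)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * usage[name] for name in COEFFICIENTS)


# With every usage variable at zero, the score reduces to the intercept.
no_use = {name: 0.0 for name in COEFFICIENTS}
print(round(global_score(no_use), 2))
```

Soap for body care is the only product with a negative coefficient, so higher values of that variable lower the score while every other product raises it.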
81
Results: partial scores
Score body-care
Score facial-care
Score make-up
Score sun-care
82
Result : global score
               S_body-care  S_facial-care  S_make-up  S_sun-care
S_facial-care      0.24001
S_make-up          0.13462        0.35035
S_sun-care         0.16500        0.19075    0.14273
S_global           0.50263        0.71846    0.67347     0.62071
83
Relevance of the global score
Factors influencing the use of cosmetic products
To identify behavioural and skin characteristic
variables which best account for the variation in the
use of cosmetic products, we can relate the global
score to the following variables:
• Professional activity & socio-professional category
• Children
• Sun exposure habits
• Practice of sport
• Importance of physical appearance
• Type of facial skin & type of body skin
• Age
84
Relevance of the global score
E(Global score) = -1.02
+.21 * professional activity
+.07 * housewife or student
+.00 * retired
+.27 * CSP A (craftsmen, trades people, business managers,
managerial staff, academics and professionals)
+.09 * CSP B (farmers and intermediary professions)
+.05 * CSP C (employees and working class people)
+.00 * CSP D (retired and non working people)
-.21 * without child
+.00 * with child
+.40 * habits of deliberate exposure to sunlight
+.09 * previous habits of deliberate exposure to sunlight
+.00 * no habits of deliberate exposure to sunlight
-.17 * no sport practised
+.00 * sport practised
+1.04 * physical appearance is of extreme importance
+.89 * physical appearance is of high importance
+.50 * physical appearance is of some importance
+.00 * physical appearance is of little importance
-.06 * oily facial skin
+.16 * combination facial skin
-.20 * normal facial skin
+.00 * dry facial skin
-.32 * oily body skin
-.57 * combination body skin
-.32 * normal body skin
+.00 * dry body skin
-.00 * age
85
A good profile
E(Global score) = -1.02
+.21 * professional activity
+.07 * housewife or student
+.00 * retired
+.27 * CSP A (craftsmen, trades people, business managers, managerial
staff, academics and professionals)
+.09 * CSP B (farmers and intermediary professions)
+.05 * CSP C (employees and working class people)
+.00 * CSP D (retired and non working people)
-.21 * without child
+.00 * with child
+.40 * habits of deliberate exposure to sunlight
+.09 * previous habits of deliberate exposure to sunlight
+.00 * no habits of deliberate exposure to sunlight
-.17 * no sport practised
+.00 * sport practised
+1.04 * physical appearance is of extreme importance
+.89 * physical appearance is of high importance
+.50 * physical appearance is of some importance
+.00 * physical appearance is of little importance
-.06 * oily facial skin
+.16 * combination facial skin
-.20 * normal facial skin
+.00 * dry facial skin
-.32 * oily body skin
-.57 * combination body skin
-.32 * normal body skin
+.00 * dry body skin
-.00 * age
Expected global score for this profile: 1.06
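The expected score of a profile is the intercept plus one coefficient per factor, namely the coefficient of the category the person belongs to. The sketch below copies the coefficients from the slide. The factor and category keys and the function `expected_global_score` are hypothetical. The categories highlighted on the original slide are not recoverable from this text, so the profile used here is an assumption: it is one combination of categories that reproduces the value 1.06 shown on the slide. The age term is left out because its coefficient is printed as .00.

```python
INTERCEPT = -1.02

# One coefficient per category; the reference category of each factor is 0.00.
EFFECTS = {
    "activity": {"professional activity": 0.21, "housewife or student": 0.07, "retired": 0.00},
    "csp": {"A": 0.27, "B": 0.09, "C": 0.05, "D": 0.00},
    "children": {"without child": -0.21, "with child": 0.00},
    "sun_exposure": {"current habits": 0.40, "previous habits": 0.09, "no habits": 0.00},
    "sport": {"no sport practised": -0.17, "sport practised": 0.00},
    "appearance": {"extreme": 1.04, "high": 0.89, "some": 0.50, "little": 0.00},
    "facial_skin": {"oily": -0.06, "combination": 0.16, "normal": -0.20, "dry": 0.00},
    "body_skin": {"oily": -0.32, "combination": -0.57, "normal": -0.32, "dry": 0.00},
}


def expected_global_score(profile):
    """Intercept plus the coefficient of the chosen category for each factor.

    The age term is omitted: its coefficient is printed as .00 on the slide.
    """
    return INTERCEPT + sum(EFFECTS[factor][profile[factor]] for factor in EFFECTS)


# Assumed profile: one combination of categories that reproduces 1.06.
good_profile = {
    "activity": "professional activity",
    "csp": "A",
    "children": "with child",
    "sun_exposure": "current habits",
    "sport": "sport practised",
    "appearance": "extreme",
    "facial_skin": "combination",
    "body_skin": "dry",
}
print(round(expected_global_score(good_profile), 2))  # 1.06, the value shown on the slide
```

The importance attached to physical appearance carries the largest coefficients (up to +1.04), so it moves the expected score more than any other factor in this model.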
86
A bad profile
E(Global score)= -1.02
+.21 * professional activity
+.07 * housewife or student
+.00 * retired
+.27 * CSP A (craftsmen, trades people, business managers,
managerial staff, academics and professionals)
+.09 * CSP B (farmers and intermediary professions)
+.05 * CSP C (employees and working class people)
+.00 * CSP D (retired and non working people)
- .21 * without child
+.00 * with child
+.40 * habits of deliberate exposure to sunlight
+.09 * previous habits of deliberate exposure to sunlight
+.00 * no habits of deliberate exposure to sunlight
- .17 * no sport practised
+.00 * sport practised
+1.04 * physical appearance is of extreme importance
+.89 * physical appearance is of high importance
+.50 * physical appearance is of some importance
+.00 * physical appearance is of little importance
- .06 * oily facial skin
+.16 * combination facial skin
- .20 * normal facial skin
+.00 * dry facial skin
- .32 * oily body skin
- .57 * combination body skin
- .32 * normal body skin
+.00 * dry body skin
- .00 * age
Expected global score for this profile: -2.00
87
Conclusion
Using the PLS approach, we obtain a score
representing the propensity to use cosmetic
products that balances the different types
of cosmetic products better than a score obtained
by principal component analysis.
88
Final conclusion
« All the proofs of a pudding are in the eating, not
in the cooking ».
William Camden (1623)
89
Some references on PLS Path Modeling
• CHIN W.W. (2001) : PLS-Graph User’s Guide, C.T. Bauer College of Business,
University of Houston, Houston.
• CHIN W.W. (1998) : “The partial least squares approach for structural equation
modeling”, in: G.A. Marcoulides (Ed.), Modern Methods for Business Research,
Lawrence Erlbaum Associates, pp. 295-336.
• FORNELL C. (1992) : “A National Customer Satisfaction Barometer: The
Swedish Experience”, Journal of Marketing, vol. 56, pp. 6-21.
• FORNELL C. & CHA J. (1994) : “Partial Least Squares”, in: R.P. Bagozzi (Ed.),
Advanced Methods of Marketing Research, Basil Blackwell, Cambridge, MA,
pp. 52-78.
• GUINOT C., LATREILLE J. & TENENHAUS M. (2001) : “PLS Path Modeling and
Analysis of Multiple Tables”, Chemometrics and Intelligent Laboratory Systems,
Special issue on PLS methods, vol. 58.
• LOHMÖLLER J.-B. (1987) : LVPLS Program Manual, Version 1.8,
Zentralarchiv für Empirische Sozialforschung, Köln.
90
• LOHMÖLLER J.-B. (1989) : Latent Variables Path Modeling with Partial Least Squares,
Physica-Verlag, Heidelberg.
• PAGÈS J. & TENENHAUS M. (2001) : “Multiple Factor Analysis and
PLS Path Modeling”, Chemometrics and Intelligent Laboratory Systems,
vol. 58, pp. 261-273.
• TENENHAUS M. (1998) : La Régression PLS, Éditions Technip, Paris.
• TENENHAUS M. (1999) : “L’approche PLS”, Revue de Statistique Appliquée,
vol. 47, n°2, pp. 5-40.
• TENENHAUS M., ESPOSITO VINZI V., CHATELIN Y.-M. & LAURO C. (2005) :
“PLS Path Modeling”, Computational Statistics and Data Analysis.
• WOLD H. (1985) : “Partial Least Squares”, in: S. Kotz & N.L. Johnson (Eds),
Encyclopedia of Statistical Sciences, vol. 6, John Wiley & Sons, New York,
pp. 581-591.
91