Factor Analysis
Assistant Professor Dr. Sangdao Wongsai
Faculty of Science and Technology
Thammasat University, Rangsit campus
Handouts today
• What is factor analysis?
• Applications of factor analysis
• Steps in factor analysis
• An example
What is factor analysis?

Factor analysis is a data reduction tool that
• describes the interrelationships among the variables;
• represents the correlated variables with a smaller set of "natural grouping" variables (i.e., derived factors) that are relatively independent of one another;
• removes redundancy or duplication from a set of correlated variables;
• assists theoretical interpretation of complex datasets.
Differences between MLR, PCA and FA

• Multiple regression analysis is a method that explains the total variability of a dependent (response) variable using independent (predictor) variables.
• Principal component analysis is a technique that selects the m components that explain as much of the total variance in a set of high-dimensional variables as possible.
• Factor analysis is a method that selects the m underlying factors that capture the common variance shared among the original (measured) variables.
Applications of factor analysis

• Identification of underlying factors
  • create new variables (i.e., factors) that describe the correlations among observed variables.
• Sampling of variables
  • select a small set of representative variables from a larger set.
• Uses in further analyses
  • reduce the multicollinearity problem in regression analysis.
  • group variables into homogeneous sets in cluster analysis.
Steps in factor analysis
1. Collect data or obtain them from a database
2. Explore the data
3. Obtain the correlation matrix
4. Select an appropriate number of factors
5. Select an estimation method
6. If necessary, drop variable(s) and repeat steps 3 to 6
7. If necessary, rotate the factors
8. Interpret the (rotated) factors
Steps in factor analysis (flowchart)

Collect data → Explore data → Obtain correlation matrix → Select an appropriate number of factors → Select an appropriate estimation method → drop variable(s) and loop back, if needed → Rotate factors, if needed → Interpret factors
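A minimal sketch of this pipeline, assuming nothing beyond numpy; the data are a simulated stand-in (not the reservoir data), and the Kaiser eigenvalue rule introduced later in the deck stands in for the "number of factors" step:

```python
# Sketch of the flowchart above; X is a hypothetical n-by-p data matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(71, 12))             # steps 1-2: collect/explore data

R = np.corrcoef(X, rowvar=False)          # step 3: correlation matrix

eigvals = np.linalg.eigvalsh(R)[::-1]     # step 4: eigenvalues, largest first
m = int(np.sum(eigvals > 1))              # Kaiser rule: keep eigenvalues > 1
print("number of factors retained:", m)
```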
An example: Water quality

• Twelve water quality variables were measured monthly between 2001 and 2007 from three reservoirs in NSW, Australia. Seventy-one observations were recorded.
• Research question: Are there any underlying natural groupings (i.e., factors) in the water quality dataset?
Correlation matrix of 12 variables
Factor analysis

• The pattern of high and low correlations in the correlation matrix is such that the variables in a particular subset have high correlations among themselves but low correlations with all the other variables.
• Then there may be a single underlying factor that gives rise to the variables in that subset.
• If the other variables can be similarly grouped into subsets with a like pattern of correlations, then a few factors can represent these groups of variables.
Factor analysis

• Three factors are retained.

variable  factor1  factor2  factor3  uniqueness
Sec         0.07    -0.21     0.02      0.95
SiO2        0.11     0.93    -0.14      0.15
Con        -0.20     0.16    -0.38      0.74
Tem        -0.96    -0.01    -0.02      0.07
DO          0.93     0.10    -0.09      0.19
pH          0.55    -0.42    -0.06      0.46
NO3         0.41     0.49     0.41      0.40
NH4        -0.11    -0.10     0.38      0.86
TP         -0.16     0.23     0.36      0.81
TN         -0.02    -0.22     0.98      0.01
Fe          0.02     0.68    -0.30      0.45
OC         -0.12    -0.01     0.20      0.96
The factor1 to factor3 columns above are the factor loadings. If any variable has a uniqueness > 0.7, it will be dropped from the factor analysis.

• How many factors?
• What is a factor loading?
• What is uniqueness?
• How do we calculate these values?
Factor analysis

• Five variables with high uniqueness are dropped from the factor analysis.
• Then, factor analysis is performed on the correlation matrix of the remaining variables.

variable  factor1  factor2  factor3  uniqueness
Tem        -0.89     0.00    -0.18      0.09
DO          0.92     0.13     0.04      0.16
pH          0.62    -0.38    -0.19      0.41
SiO2        0.03     0.87     0.18      0.24
Fe          0.03     0.77    -0.07      0.41
NO3         0.19     0.34     0.82      0.16
TN         -0.08    -0.46     0.65      0.36
Each variable now has a uniqueness below 0.7.
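A hedged sketch of the drop-and-refit rule, using numpy and the uniqueness values reported in the twelve-variable table above; the variable names are the deck's:

```python
# Drop any variable whose uniqueness exceeds 0.7, then refit on the rest.
import numpy as np

names = np.array(["Sec", "SiO2", "Con", "Tem", "DO", "pH",
                  "NO3", "NH4", "TP", "TN", "Fe", "OC"])
uniqueness = np.array([0.95, 0.15, 0.74, 0.07, 0.19, 0.46,
                       0.40, 0.86, 0.81, 0.01, 0.45, 0.96])

keep = uniqueness <= 0.7
print("dropped:", names[~keep])   # Sec, Con, NH4, TP, OC (five variables)
print("kept:   ", names[keep])    # SiO2, Tem, DO, pH, NO3, TN, Fe
```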
Interpretation of derived factors

• Factor 1 accounts for physical characteristics, comprising positive loadings for water pH and dissolved oxygen, and a negative loading for temperature.
• Factor 2 reflects mineral elements, consisting of positive loadings for silicate and filterable iron.
• Factor 3 characterises nitrogen elements, comprising positive loadings for total nitrogen and nitrates.
FA: Shared variance and error

Total variance of Xj = communality + uniqueness

Assume we fit the factor analysis model to the correlation matrix of a data set of 5 variables: X1, X2, X3, X4 and X5. Two factors, F1 and F2, are created by maximum likelihood estimation.

$$\mathrm{Var}(X_j) = h_j^2 + u_j^2, \qquad h_j^2 = (l_j^1)^2 + (l_j^2)^2$$

Factor 1 represents the correlation of the variables X1, X4 and X5, since they have higher loadings on this factor than on the other factor. Factor 2 comprises the correlation of the variables X2 and X3.
Factor analysis model

• A data matrix X comprises n observations in rows and p variables in columns, where n > p.
• Each element xij represents observation i for variable j.

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}$$
Factor analysis model

• Then, a correlation matrix of the original variables is generated.
• The correlation matrix can also be interpreted as the covariance matrix of the standardized variables.
• It is scale invariant; that is, it does not change if we change the units of measurement.
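A minimal numpy check of this point, on simulated data rather than the reservoir data: the correlation matrix coincides with the covariance matrix of the standardized variables.

```python
# Correlation matrix == covariance matrix of standardized data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))

Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize: mean 0, variance 1
R = np.corrcoef(X, rowvar=False)
S = np.cov(Z, rowvar=False, ddof=0)        # covariance of standardized data

print(np.allclose(R, S))                   # True
```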
Factor analysis model

• The correlation coefficient measures both the direction and the strength of a relationship.
• Its magnitude measures the strength: the closer it is to 1 or -1, the stronger the relationship.
• If the correlation coefficient is > 0, there is a positive relationship between the two variables; if it is < 0, the relationship is negative.
Factor analysis model

• The factor analysis model expresses each variable as a linear combination of underlying common factors with an accompanying error term.

X = ΛF + ε

where
X is the vector of observed variables Xj, j = 1, 2, …, p
F is the vector of factors Fk, k = 1, 2, …, m (m < p)
Λ is the matrix of coefficients of the factors Fk on the variables Xj, the so-called factor loadings
ε is the vector of random errors of the variables Xj.
Factor analysis model

• The factor analysis model expresses each variable as a linear combination of underlying common factors with an accompanying error term, X = ΛF + ε:

$$\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix} = \begin{bmatrix} l_1^1 & l_1^2 & \cdots & l_1^m \\ l_2^1 & l_2^2 & \cdots & l_2^m \\ \vdots & \vdots & \ddots & \vdots \\ l_p^1 & l_p^2 & \cdots & l_p^m \end{bmatrix} \begin{bmatrix} F^1 \\ F^2 \\ \vdots \\ F^m \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_p \end{bmatrix}$$

where
Xj is the jth observed variable, j = 1, 2, …, p
$l_j^k$ is the loading of the jth variable on the kth factor, k = 1, 2, …, m (m < p)
$F^k$ is the kth factor, k = 1, 2, …, m
$e_j$ is the random error of the jth variable.
Factor analysis model

• For p variables and m common factors, the model X = ΛF + ε is

$$\begin{aligned} X_1 &= l_1^1 F^1 + l_1^2 F^2 + \cdots + l_1^m F^m + e_1 \\ X_2 &= l_2^1 F^1 + l_2^2 F^2 + \cdots + l_2^m F^m + e_2 \\ &\;\;\vdots \\ X_p &= l_p^1 F^1 + l_p^2 F^2 + \cdots + l_p^m F^m + e_p \end{aligned}$$

where
Xj is the jth observed variable, j = 1, 2, …, p
$l_j^k$ is the loading of the jth variable on the kth factor, k = 1, 2, …, m (m < p)
$F^k$ is the kth factor, k = 1, 2, …, m
$e_j$ is the random error of the jth variable.
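To make the matrix equation concrete, here is a small simulation of X = ΛF + ε in numpy; all loading values are hypothetical:

```python
# Simulate the factor model for p = 4 variables and m = 2 factors.
import numpy as np

rng = np.random.default_rng(2)
n, p, m = 1000, 4, 2

L = np.array([[0.9, 0.0],               # Lambda: loadings of X1..X4 on F1, F2
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.8]])
F = rng.normal(size=(n, m))             # factors: mean 0, variance 1
E = rng.normal(scale=0.3, size=(n, p))  # random errors

X = F @ L.T + E                         # each row is one observation of X1..X4
print(np.round(np.corrcoef(X, rowvar=False), 2))  # block structure appears
```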
Factor analysis model

• Note that the number of factors, m, is less than the number of original variables, p.
• The m common factors are assumed to have zero means and unit variances (variance = 1).
• This is because the factor analysis model is usually fitted to the correlation matrix of the original variables.
Factor analysis model

• Each variable is standardized to have a mean of 0.
• The variance of each standardized variable is 1.
• Thus, the total variance of all standardized variables is the sum of their variances, which equals p, the number of original variables.
• The variance of each standardized variable is composed of a part due to the m common factors and a part due to its own random error.
• The total variance of all the standardized variables is therefore equal to the sum of the part of variance explained by the m common factors and the part of variance from the random errors of the p original variables.
Factor analysis model

• The factor loading $l_j^k$ is the loading of Xj on $F^k$.
• For the kth factor, the sum of squared (ss) factor loadings over all p variables is the variance of the kth factor. It specifies the amount of variance in the standardized data that can be explained by the kth factor. It is called the common variance of the kth factor (column sums below).
• For the jth variable, the sum of squared (ss) factor loadings over the m common factors is the variance of the jth variable accounted for by the m factors. It is called the communality of the jth variable, $h_j^2$ (row sums below).

$$\begin{array}{c|cccc|c}
 & F^1 & F^2 & \cdots & F^m & \\ \hline
X_1 & (l_1^1)^2 & (l_1^2)^2 & \cdots & (l_1^m)^2 & h_1^2 \\
X_2 & (l_2^1)^2 & (l_2^2)^2 & \cdots & (l_2^m)^2 & h_2^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
X_p & (l_p^1)^2 & (l_p^2)^2 & \cdots & (l_p^m)^2 & h_p^2 \\ \hline
 & ss_1 & ss_2 & \cdots & ss_m &
\end{array}$$
Communality and uniqueness

• The total variance of each Xj can be decomposed into two parts corresponding to communality and uniqueness:
  total variance = communality + uniqueness.
• Communality ($h_j^2$) is the proportion of the variance of each Xj accounted for by the m common factors. It is the sum of squares of the loadings of the jth variable contributed by the m factors:

$$h_j^2 = \sum_{k=1}^{m} (l_j^k)^2 = (l_j^1)^2 + (l_j^2)^2 + \cdots + (l_j^m)^2$$

• Uniqueness ($u_j^2$) is the unexplained variance of each Xj due to the random errors, indicating how distinctive (specific) the measure of each Xj is from the remaining variables:

$$u_j^2 = 1 - h_j^2 = 1 - \left[ (l_j^1)^2 + (l_j^2)^2 + \cdots + (l_j^m)^2 \right]$$
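A short numpy sketch of these two formulas, using a hypothetical 3 × 2 loading matrix (not the water quality results):

```python
# Communality and uniqueness from a loading matrix.
import numpy as np

L = np.array([[0.9, 0.1],    # hypothetical loadings of three variables
              [0.8, 0.3],    # on two factors
              [0.2, 0.7]])

h2 = (L ** 2).sum(axis=1)    # communality: row sums of squared loadings
u2 = 1 - h2                  # uniqueness: what the m factors leave unexplained
print(np.round(h2, 2))       # [0.82, 0.73, 0.53]
print(np.round(u2, 2))       # [0.18, 0.27, 0.47]
```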
Variance of the jth variable

• The communality of the jth variable is the sum of ss loadings by row (see the squared-loading table above).
• The sum of the communalities equals the cumulative part of the total variance in the standardized variables that can be accounted for by the m common factors:

$$\sum_{j=1}^{p} h_j^2 = \sum_{j=1}^{p} \sum_{k=1}^{m} (l_j^k)^2, \qquad j = 1, 2, \ldots, p; \; k = 1, 2, \ldots, m$$

$$\text{for } j = 1:\; \sum_{k=1}^{m} (l_1^k)^2 = (l_1^1)^2 + (l_1^2)^2 + \cdots + (l_1^m)^2$$

$$\text{for } j = 2:\; \sum_{k=1}^{m} (l_2^k)^2 = (l_2^1)^2 + (l_2^2)^2 + \cdots + (l_2^m)^2$$

$$\vdots$$

$$\text{for } j = p:\; \sum_{k=1}^{m} (l_p^k)^2 = (l_p^1)^2 + (l_p^2)^2 + \cdots + (l_p^m)^2$$
Variance of the jth variable

• The proportion of the total variance in the standardized variables accounted for by the m common factors is equal to the sum of the communalities divided by the total variance of the standardized data (which equals p, the number of variables):

$$\frac{\sum_{j=1}^{p} h_j^2}{p}$$
Variance of the kth factor

• The common variance of the kth factor is the sum of ss loadings by column (see the squared-loading table above).
• The sum of the common variances equals the cumulative part of the total variance in the standardized variables that can be explained by the m common factors:

$$\sum_{k=1}^{m} \sum_{j=1}^{p} (l_j^k)^2, \qquad j = 1, 2, \ldots, p; \; k = 1, 2, \ldots, m$$

$$\text{for } k = 1:\; \sum_{j=1}^{p} (l_j^1)^2 = (l_1^1)^2 + (l_2^1)^2 + \cdots + (l_p^1)^2$$

$$\text{for } k = 2:\; \sum_{j=1}^{p} (l_j^2)^2 = (l_1^2)^2 + (l_2^2)^2 + \cdots + (l_p^2)^2$$

$$\vdots$$

$$\text{for } k = m:\; \sum_{j=1}^{p} (l_j^m)^2 = (l_1^m)^2 + (l_2^m)^2 + \cdots + (l_p^m)^2$$
Variance of the kth factor

• The proportion of the total variance in the standardized variables explained by the m common factors is equal to the sum of the common variances divided by the total variance of the standardized data (which equals p):

$$\frac{\sum_{k=1}^{m} \sum_{j=1}^{p} (l_j^k)^2}{p}$$
The shared variance of the m factors

• The sum of the common variances explained by the m factors is equal to the sum of the communalities of the standardized variables.
• These sums are called the shared variance of the m factors:

$$\sum_{j=1}^{p} h_j^2 = \sum_{k=1}^{m} \sum_{j=1}^{p} (l_j^k)^2$$
The shared variance of the m factors

• The proportion of the shared variance due to the kth factor is equal to the common variance explained by the kth factor divided by the shared variance of the m factors:

$$\frac{\sum_{j=1}^{p} (l_j^k)^2}{\sum_{j=1}^{p} h_j^2} = \frac{\sum_{j=1}^{p} (l_j^k)^2}{\sum_{k=1}^{m} \sum_{j=1}^{p} (l_j^k)^2}$$
The total variance of standardized data

• The proportion of the total variance in the standardized variables accounted for by the m common factors is equal to the shared variance from the m common factors divided by the total variance of the standardized data (which equals p):

$$\frac{\sum_{j=1}^{p} h_j^2}{p} = \frac{\sum_{k=1}^{m} \sum_{j=1}^{p} (l_j^k)^2}{p}$$
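A numpy sketch tying the last few variance slides together, on a hypothetical 4 × 2 loading matrix: row sums of squared loadings give the communalities, column sums give each factor's common variance, and the two totals agree.

```python
# Variance bookkeeping for a hypothetical two-factor solution.
import numpy as np

L = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.8]])
p = L.shape[0]

h2 = (L ** 2).sum(axis=1)              # communalities (by row)
ss = (L ** 2).sum(axis=0)              # common variance of each factor (by column)

print(np.isclose(h2.sum(), ss.sum()))  # True: both equal the shared variance
print(ss / p)                          # proportion of total variance per factor
print(ss / ss.sum())                   # proportion of shared variance per factor
print(h2.sum() / p)                    # total variance explained by both factors
```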
Uniqueness

• If the communality of a variable is less than the total variance of that variable, then the discrepancy between these two quantities is the uniqueness.
• If the communality of a variable is equal to its total variance, then there is no part of the total variance that cannot be accounted for by the factor model (uniqueness = 0).
• The uniqueness is separated into two parts: the specificity and the measurement error.
  • The specificity is the variance that is specific to a particular variable. It is a systematic variance that is unshared with other variables.
  • The error comes from errors of measurement and basically anything unexplained by common or specific variance.
Uniqueness

• The uniqueness is the part of the variance that is due to the unique factor ej, the random error of the jth variable.
• Therefore, any variable with high uniqueness should be dropped from the factor analysis.

• communality > uniqueness: X should be considered a linear combination of underlying factors. It is more likely to be highly correlated with other variables in the data set.
• communality < uniqueness: X has its own specific contribution. It is less likely to be correlated with other variables in the data set.
Shared variance and error

Total variance of Xj = communality + uniqueness

Assume we fit the factor analysis model to the correlation matrix of a data set of 5 variables: X1, X2, X3, X4 and X5. Two factors, F1 and F2, are created by maximum likelihood estimation.

$$\mathrm{Var}(X_j) = h_j^2 + u_j^2, \qquad h_j^2 = (l_j^1)^2 + (l_j^2)^2$$

Factor 1 represents the correlation of the variables X1, X4 and X5, since they have higher loadings on this factor than on the other factor. Factor 2 comprises the correlation of the variables X2 and X3.
How do we estimate factor loadings?

• The factor loading $l_j^k$ is the loading of Xj on $F^k$.
• The factor loading turns out to be the correlation between the standardized variables and the underlying factors.
• The matrix of factor loadings is called the pattern matrix.
• Estimation method options for the factor loadings are:
  • principal component model (analysis, extraction; it may be called by different names)
  • principal factor model (analysis, principal axis factor analysis)
  • weighted least squares
  • maximum likelihood method
Estimation: PCA

• Principal component analysis (PCA) estimation is computed from the eigenvalues and eigenvectors of the correlation matrix of the original variables.
• The first component is generated to achieve the maximum (largest) variance of the measured variables.
• The relationship between the factors and the components is expressed as:

$$F^k = \frac{C^k}{\sqrt{\lambda_k}}, \qquad C^k = \sqrt{\lambda_k}\, F^k; \quad k = 1, 2, \ldots, m$$
Estimation: PCA

• Recall that the equation of a linear combination of variables for the kth component can be inverted to express the variable Xj as a function of the components, so that a factor model can be written as follows:

$$C^k = a_1^k X_1 + a_2^k X_2 + \cdots + a_p^k X_p$$

$$X_j = a_j^1 C^1 + a_j^2 C^2 + \cdots + a_j^p C^p$$

$$C^k = \sqrt{\lambda_k}\, F^k$$

$$X_j = a_j^1 \sqrt{\lambda_1}\, F^1 + a_j^2 \sqrt{\lambda_2}\, F^2 + \cdots + a_j^p \sqrt{\lambda_p}\, F^p$$
Estimation: PCA

• In the factor analysis model, only m factors are selected. Thus, the model is composed of two terms: the underlying unobserved factors weighted by the loadings, and the random errors.

$$X_j = l_j^1 F^1 + l_j^2 F^2 + \cdots + l_j^m F^m + e_j, \qquad l_j^k = a_j^k \sqrt{\lambda_k}$$

where

$$e_j = a_j^{m+1} \sqrt{\lambda_{m+1}}\, F^{m+1} + \cdots + a_j^p \sqrt{\lambda_p}\, F^p$$
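A minimal numpy sketch of this PCA-style extraction on simulated data: the loadings are eigenvector entries scaled by the square roots of the eigenvalues.

```python
# PCA-based loading estimates: l_j^k = a_j^k * sqrt(lambda_k).
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
R = np.corrcoef(X, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(R)        # returned in ascending order
order = np.argsort(eigvals)[::-1]           # sort largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

m = 2
L = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # p x m loading matrix
print(np.round(L, 2))
print(np.round((L ** 2).sum(axis=0), 2))    # ss loadings = first m eigenvalues
```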
Estimation: PCA

• The individual communalities tell how well the model is working for the individual variables.
• The total communality gives an overall assessment of the performance of the factor model.
• The communality for a given variable can be interpreted as the proportion of variation in that variable explained by the m factors.
• In other words, if we regress the jth variable on the m common factors, the R² will be equal to that variable's communality.
• The communalities and the specific variances depend on the number of factors in the model.
Estimation: PFA

• Principal factor analysis (PFA) treats maximizing the total communality as a more attractive objective than maximizing the total proportion of explained variance (as is done in PCA).
• The estimation uses the communalities in place of the original variances and proceeds over many iterations.
• However, the initial communalities may be obtained from other estimation methods, since they are not known prior to the PFA.
Estimation: Maximum Likelihood

• Maximum likelihood estimation requires that the data are sampled from a multivariate normal distribution.
• This is a drawback of the method: some data may not have such a distribution.
• Note that the normality assumption is important only if you wish to generalize the results of your analysis beyond the sample collected.
• Computationally, this process is complex. In general, there is no closed-form solution to the maximization problem, so iterative methods are applied.
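In practice, one common route (assuming scikit-learn is available; the deck does not prescribe a package) is sklearn.decomposition.FactorAnalysis, which fits this Gaussian latent-variable model iteratively. A minimal sketch on simulated data:

```python
# Maximum-likelihood-style factor fit via scikit-learn (an assumed dependency).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 6))

fa = FactorAnalysis(n_components=2)  # m = 2 factors
fa.fit(X)
print(fa.components_.T)              # p x m loadings (covariance scale)
print(fa.noise_variance_)            # per-variable error variances
```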
How many factors?

• The number m of common factors is chosen prior to the analysis and is smaller than the number p of original variables.
• It is often not the case that the number of common factors is known in advance. It is thus possible to let the data themselves determine this number.
• Unlike factor analysis, the number of principal components created is equal to the number p of original variables, although only m components are typically retained for further applications.
• Therefore, the m retained components are selected after the analysis, according to their proportion of the total variability in the original data.
How many factors?

• Methods of choosing the initial number of factors:
  • Kaiser's criterion (1970)
    • Principal component analysis is chosen as the estimation method for finding the initial number of factors.
    • The Kaiser rule is thus based on the eigenvalues and eigenvectors of the correlation matrix of the measured variables.
    • The rule retains as many factors as there are eigenvalues greater than 1.
How many factors?

• Methods of choosing the initial number of factors:
  • Scree plot
    • Principal component analysis is chosen as the estimation method for finding the initial number of factors.
    • It is a plot of the eigenvalues versus their index (Cattell, 1966).
    • Look for the elbow between the cliff and the scree (the run of nearly constant eigenvalues).
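A minimal numpy sketch combining the Kaiser rule with the scree values it would be plotted from, again on simulated data:

```python
# Kaiser rule and scree values from a correlation matrix.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(150, 8))
R = np.corrcoef(X, rowvar=False)

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first
print(np.round(eigvals, 2))                      # scree values: plot vs 1..p
print("Kaiser rule m =", int((eigvals > 1).sum()))
```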
How many factors?

• Methods of choosing the initial number of factors:
  • Cumulative percentage of total variation
    • A criterion is set between 60% and 90%; m factors are retained when the cumulative percentage exceeds the criterion.
  • Chi-squared test statistic
    • Tests the hypothesis that m factors are adequate to fit the model.
Criteria for selecting variables

• Communality indicates the variance in each variable explained by the extracted factors; ideally above 0.5 for each variable.
• Factor loading indicates how strongly each variable loads on each factor. It should generally be above 0.5 in absolute value for each variable.
• A reliability measure checks the internal consistency of the variables included in each factor using Cronbach's alpha, which should be above 0.7 for each factor.
• Residual correlation matrix: the closer to zero, the better the model fits the data. Values on the main diagonal are the common variances of the measured variables on the m factors.
Interpretation about the total variability

• The proportion of the total variance in the original set of p standardized variables that can be accounted for by the kth factor is equal to the ss loading (common variance) for the kth factor divided by the total variance of all p standardized variables.
• The proportion of the variance in the m common factors from the factor analysis model is equal to the ss loading (common variance) for the kth factor divided by the total variance from the m factors in the factor model (that is, the sum of the communalities).
Interpretation about the new factors

• Give an appropriate name (label) to each of the factors, as it represents a mutually correlated effect of many variables that is not seen before performing the factor analysis.
• The interpretation of each factor depends on the loading (weight) of each variable on that factor.
• Then, we determine what those variables have in common. Whatever the variables have in common will indicate the meaning of the factor.
Interpretation about the new factors

• Ideally, any given variable has a high loading (correlation) on only one factor.
• A factor loading is a correlation between a variable and a factor, indicating the strength (how strongly a factor influences a measured variable) and direction of a factor's effect on an observed variable.
• In practice, the loadings are often difficult to interpret. For example, a particular variable may have an equal weight on all m factors.
• It is recommended to rotate the new factors to ease the interpretation; the results are called rotated factors.
Factor rotations

• Theoretically, factor rotations can be done in an infinite number of ways.
• Factor rotation consists of finding new axes to represent the factors. These new axes are selected so that they go through clusters or subgroups of the points representing the data variables in a plot of the 2-dimensional factor axes.
• Two typical options for factor rotation are:
  • orthogonal rotation (e.g., varimax)
  • oblique rotation (e.g., promax, oblimin, quartimin)
Factor rotations

[Figure: orthogonal rotation vs. oblique rotation]
Source: https://www.slideshare.net/ssuser1ab9f7/factor-analysis-7113647
Factor rotations

• Varimax rotation creates uncorrelated rotated factors. It restricts the new axes to being orthogonal (perpendicular) to each other: the angle between the axes representing two rotated factors is 90°.
• Variables that go with rotated factor 1 have high loadings on axis 1 and nearly zero loadings on rotated factor 2 on axis 2.
• Variables that go with rotated factor 2 have high loadings on axis 2 and nearly zero loadings on rotated factor 1 on axis 1, and so on.
Factor rotations

• The varimax rotation is achieved by maximizing the variance of the squared loadings within each factor, making the large loadings larger and the small loadings smaller.
• That simplifies the columns of the factor loading matrix by minimizing the number of variables that have high loadings on each factor.
• It helps in the interpretation of the factors by highlighting the particular variables appearing in each factor.
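A compact sketch of the classic SVD-based varimax iteration; the deck does not prescribe an algorithm, so this particular implementation is an assumption. Note that the communalities are unchanged by the rotation, as the next slide states.

```python
# Varimax rotation via the standard SVD-based iteration.
import numpy as np

def varimax(L, tol=1e-8, max_iter=500):
    p, m = L.shape
    T = np.eye(m)                 # accumulated orthogonal rotation
    d = 0.0
    for _ in range(max_iter):
        A = L @ T
        # gradient of the varimax criterion
        B = L.T @ (A ** 3 - A @ np.diag((A ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(B)
        T = U @ Vt
        d_old, d = d, s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return L @ T

L = np.array([[0.6, 0.6], [0.7, 0.5], [0.5, -0.5], [0.6, -0.6]])
Lr = varimax(L)
print(np.round(Lr, 2))                                # simpler structure
print(np.allclose((L**2).sum(1), (Lr**2).sum(1)))     # communalities kept: True
```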
Factor rotations

• The factor loadings for each factor change after the rotation.
• Note that the communality of each variable is unchanged, but the common variance of each factor changes to achieve the maximum variance criterion for a given factor.
• In other words, the sum of ss loadings by row is not altered, but the sum of ss loadings by column changes to satisfy the conditions of the orthogonal rotation.
• The pattern matrix is composed of the correlations between the variables and the factors, that is, the factor loadings.
Factor rotations

• Oblique rotation allows the rotated factors to be correlated, that is, nonorthogonal (not perpendicular) to each other.
• It produces estimates of the correlations among the rotated factors.
• The factor structure matrix comprises the variances and correlations between the rotated factors.
• The pattern matrix comprises the correlations between the variables and the factors.
• Several oblique rotation procedures are commonly used: oblimin, promax, equamax, quartimin, etc.
Assumptions for Factor Analysis

• Normality
  • The normality assumption is important only if you wish to generalize the results of your analysis beyond the sample collected.
  • Check the skewness and kurtosis statistics (Kline, 2011):
    • skewness statistic < 3, and
    • kurtosis statistic < 10.
  • Whether normality is required depends on the estimation method used:
    • principal factor analysis does not assume multivariate normality;
    • maximum likelihood estimation assumes multivariate normality.
Assumptions for Factor Analysis

• Linear relations
  • Before conducting a factor analysis, we should check the inter-correlations between variables. If any variables do not correlate with any other variables, they should be excluded before performing the factor analysis. Also, if the variables correlate too highly (extreme multicollinearity) or are perfectly correlated (singularity), then factor analysis may not be suitable for such data.
  • A simple method is to calculate the correlation matrix of coefficients r, with -1 < r < +1:
    • if the absolute value of r is very close to zero, there is little relationship among the variables;
    • if the absolute value of r is very close to 1, there is a strong relationship among the variables;
    • a positive sign indicates a directly proportional relationship;
    • a negative sign indicates an inversely proportional relationship.
Assumptions for Factor Analysis

• Factorability: a degree of collinearity among the variables
  • Bartlett's test of sphericity
    • Null hypothesis: the correlation matrix of the data is an identity matrix (there are no relationships between the variables).
    • The identity matrix of size n is the n × n square matrix with ones on the main diagonal and zeros elsewhere.
    • Bartlett's test statistic approximately follows a chi-squared distribution.
    • Very small significance values (below 0.05) indicate the data are appropriate for factor analysis.
    • Reference: Bartlett, M. S. (1954). A note on the multiplying factors for various chi-squared approximations. Journal of the Royal Statistical Society, Series B, 16, 296-298.
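A hedged sketch of the test, using the common textbook form of the statistic, chi-squared = -(n - 1 - (2p + 5)/6) ln|R| on p(p - 1)/2 degrees of freedom; scipy is assumed to be available for the p-value.

```python
# Bartlett's test of sphericity (textbook approximation).
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)      # statistic and p-value

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 5))           # independent columns: expect a large p-value
stat, pval = bartlett_sphericity(X)
print(round(stat, 2), round(pval, 3))
```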
Assumptions for Factor Analysis

• Factorability: a degree of collinearity among the variables
  • Kaiser-Meyer-Olkin (KMO) is a measure of sampling adequacy (MSA).
    • It measures how suited your data are to factor analysis, for each variable in the model and for the complete model.
    • The statistic is a measure of the proportion of variance among the variables that might be common variance, which might be indicative of underlying or latent common factors.
    • The higher the proportion, the more suited your data are to factor analysis.
Assumptions for Factor Analysis

• Factorability: a degree of collinearity among the variables
  • Kaiser-Meyer-Olkin (KMO) is a measure of sampling adequacy (MSA).
    • KMO ranges between 0 and 1. A rule of thumb for interpreting the statistic:
      • KMO values between 0.8 and 1 indicate the sampling is highly adequate;
      • KMO values between 0.5 and 0.7 indicate the sampling is moderately adequate;
      • KMO values less than 0.5 indicate the sampling is not adequate (consider collecting more data or rethinking which variables to include).
    • A value of 0 indicates that the sum of partial correlations is large relative to the sum of correlations, indicating dispersion in the pattern of correlations; factor analysis is thus likely to be inappropriate.
    • A value of 1 indicates that the patterns of correlations are relatively compact, so factor analysis should yield distinct and reliable factors.
    • Reference: Kaiser, H. (1974). An index of factorial simplicity. Psychometrika, 39, 31-36.
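A sketch of the KMO statistic from its definition: squared correlations weighed against squared partial correlations, the latter obtained from the inverse of the correlation matrix. The data and the kmo helper are illustrative, not part of the deck.

```python
# Overall KMO measure of sampling adequacy.
import numpy as np

def kmo(X):
    R = np.corrcoef(X, rowvar=False)
    Rinv = np.linalg.inv(R)
    d = np.sqrt(np.diag(Rinv))
    P = -Rinv / np.outer(d, d)          # partial correlation matrix
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(R, 0.0)            # keep off-diagonal terms only
    r2, p2 = (R ** 2).sum(), (P ** 2).sum()
    return r2 / (r2 + p2)

rng = np.random.default_rng(7)
F = rng.normal(size=(200, 1))           # one common factor
X = F @ np.ones((1, 4)) * 0.8 + rng.normal(scale=0.6, size=(200, 4))
print(round(kmo(X), 2))                 # strongly correlated variables: high KMO
```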