Uploaded by writesayeed

Factor Analysis & PCA: Introduction, Assumptions, & Sample Size

Introduction
Factor analysis (FA) and principal component analysis (PCA) are statistical techniques for
reducing a large number of variables to a smaller number of factors.
These techniques are used for construct validation, creating measurement indices, scale
construction, and data reduction.
Factor analysis
FA and PCA are used to identify one or more common domains for groups of correlated variables.
Factor analysis is a technique used to identify the pattern of correlations (or covariances)
between the observed measures.
For example, it can be applied to indicators of job satisfaction, political attitudes,
self-esteem, socioeconomic status, health, or family values.
Assumptions
1) Measurement level: a classical factor analysis takes interval or ratio variables as input.
2) Number of observations: as a rule of thumb, a factor analysis needs at least ten times as
many observations (respondents) as there are variables.
Sample Size
Many "rules" exist (in order of popularity):
10 cases per item in the instrument
a subjects-to-variables (STV) ratio of no less than 5
5 times the number of variables, or 100
a minimum of 200 cases, regardless of the STV ratio
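The rules above can be checked mechanically for a given data set. The following is only a minimal sketch: the function name `sample_size_checks` and the example figures (150 cases, 23 variables) are hypothetical, and the third rule is read here as "whichever is larger", which is an assumption about the original wording.

```python
def sample_size_checks(n_cases: int, n_variables: int) -> dict:
    """Evaluate the sample-size rules of thumb for one data set."""
    stv_ratio = n_cases / n_variables  # subjects-to-variables (STV) ratio
    return {
        "10 cases per item": n_cases >= 10 * n_variables,
        "STV ratio of at least 5": stv_ratio >= 5,
        # Assumption: "5 times the number of variables or 100" is taken
        # to mean the larger of the two thresholds.
        "5 x variables or 100": n_cases >= max(5 * n_variables, 100),
        "at least 200 cases": n_cases >= 200,
    }


# Hypothetical example: 150 respondents answering 23 items.
checks = sample_size_checks(n_cases=150, n_variables=23)
for rule, satisfied in checks.items():
    print(f"{rule}: {'met' if satisfied else 'not met'}")
```

Because the rules disagree with each other, the same data set can pass some and fail others, so it is worth reporting which rule was applied.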
Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy
KMO is a test that examines the strength of the partial correlations between the variables,
that is, how much of the correlation between two variables remains once the other variables
are accounted for.
KMO values closer to 1.0 are considered ideal, while values less than 0.5 are unacceptable.
Recently, most scholars have argued that a KMO of at least 0.80 is good enough for factor
analysis to commence. Below is a tabular chart for your perusal.
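To make the KMO statistic concrete, here is a minimal pure-Python sketch of the overall KMO computed from a correlation matrix. It uses the standard definition (sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, off-diagonal entries only). The helper names `invert` and `kmo` and the example matrix are hypothetical, not part of the original material.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall KMO measure for a correlation matrix given as a list of lists."""
    inv = invert(corr)
    n = len(corr)
    sum_r2 = 0.0  # squared observed correlations
    sum_a2 = 0.0  # squared partial correlations
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_a2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_a2)


# Hypothetical example: three variables that all correlate 0.6 with each other.
corr = [[1.0, 0.6, 0.6],
        [0.6, 1.0, 0.6],
        [0.6, 0.6, 1.0]]
kmo_value = kmo(corr)
print(round(kmo_value, 3))
```

For this small example the KMO comes out at roughly 0.72. Small partial correlations relative to the observed correlations push the value toward 1.0.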
Bartlett Test of Sphericity
Bartlett's test of sphericity tests the null hypothesis that the correlation matrix is an
identity matrix. An identity correlation matrix means your variables are unrelated and
therefore not suited to factor analysis.
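As an illustration, the test statistic can be sketched in plain Python using the usual formula chi-square = -((n - 1) - (2p + 5) / 6) * ln|R| with p(p - 1) / 2 degrees of freedom, where R is the correlation matrix, p the number of variables, and n the number of cases. The function names and the example values below are hypothetical. The sketch stops at the statistic; a p-value would come from the chi-square distribution (for instance `scipy.stats.chi2.sf`, if SciPy is available).

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det


def bartlett_sphericity(corr, n_cases):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -((n_cases - 1) - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    return chi_square, dof


# Hypothetical example: three variables correlating 0.6, observed on 100 cases.
corr = [[1.0, 0.6, 0.6],
        [0.6, 1.0, 0.6],
        [0.6, 0.6, 1.0]]
chi_square, dof = bartlett_sphericity(corr, n_cases=100)
print(round(chi_square, 2), dof)
```

With 3 degrees of freedom the 5% critical value of the chi-square distribution is about 7.81, so a statistic far above that leads to rejecting the identity-matrix hypothesis. An identity matrix has determinant 1, which makes the statistic zero.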
Data Source
1. Link for SAQ data: http://commres.net/wiki/saq_dataset
2. Link for SAQ data: http://staff.bath.ac.uk/pssiw/stats2/page16/page16.html
If any statement has a mean < 0.5 or > 4.5 and a standard deviation of zero, respondents
have all ticked an extreme (either Strongly Agree or Strongly Disagree); because there is no
variation in the statement, it is not useful.
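This screening step can be sketched as follows. It is a rough illustration only: the function name `screen_items`, the item labels, and the toy responses are hypothetical, the cut-offs are taken from the text as defaults, and the two conditions are reported separately so the reader can decide how to combine them.

```python
from statistics import mean, pstdev


def screen_items(responses, low=0.5, high=4.5):
    """Flag items whose mean is extreme or whose responses do not vary.

    `responses` maps an item label to the list of scores given by respondents.
    """
    report = {}
    for item, scores in responses.items():
        item_mean = mean(scores)
        item_sd = pstdev(scores)  # population standard deviation
        report[item] = {
            "mean": item_mean,
            "sd": item_sd,
            "extreme_mean": item_mean < low or item_mean > high,
            "no_variation": item_sd == 0,
        }
    return report


# Hypothetical toy data: Q1 was answered identically by everyone.
responses = {
    "Q1": [5, 5, 5, 5, 5],
    "Q2": [1, 3, 4, 2, 5],
}
report = screen_items(responses)
for item, stats in report.items():
    flagged = stats["extreme_mean"] or stats["no_variation"]
    print(item, "drop" if flagged else "keep")
```

Here Q1 would be flagged on both counts, while Q2 shows enough spread to be kept.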
Hypothesis
Null: Correlation Matrix is an identity matrix
Alt: Correlation Matrix is not an identity matrix (desirable)
Because the p-value of Bartlett's Test of Sphericity is reported as 0.000, which is below the
5% level of significance, we reject the null hypothesis: the correlation matrix is not an
identity matrix. The Kaiser-Meyer-Olkin Measure of Sampling Adequacy should be greater than
0.70, indicating sufficient items for each factor. Here, the KMO is 0.930, which is greater
than 0.7.