PLS-SEM: Introduction and Overview
Joe F. Hair, Jr., Founder & Senior Scholar

The greatest interest in any factor solution centers on the correlations between the original variables and the factors. The matrix of such test-factor correlations is called the factor structure, and it is the primary interpretative device in principal components analysis. In the factor structure the element $r_{jk}$ gives the correlation of the jth test with the kth factor. Assuming that the content of the observation variables is well known, the correlations in the kth column of the structure help in interpreting, and perhaps naming, the kth factor. Also, the coefficients in the jth row give the best view of the factor composition of the jth test. The derivation of the factor structure S is as follows:

$$
S = \frac{1}{N}\sum_{i=1}^{N}(z_i - m_z)(f_i - m_f)'
  = \frac{1}{N}\sum_{i=1}^{N} z_i f_i'
  = \frac{1}{N}\sum_{i=1}^{N} z_i \left(L^{-1/2} V' z_i\right)'
  = \frac{1}{N}\sum_{i=1}^{N} (z_i z_i')\, V L^{-1/2}
  = R V L^{-1/2}
$$

and since $RV = VL$,

$$
S = V L L^{-1/2} = V L^{1/2}
$$

Another set of coefficients of interest in factor analysis is the weights that compound predicted observations z from factor scores f. These regression coefficients for the multiple regression of each element of the observation vector z on the factor f are called factor loadings and the matrix A that contains them as its rows is . . .

Source: Cooley, William W., and Paul R. Lohnes, Multivariate Data Analysis, John Wiley & Sons, Inc., New York, 1971, page 106.

SEM Model: Predicting the Birth Weight of Guinea Pigs
• X & Y = different outcomes
• B, C & D = common causes
• A & E = independent causes
Sewall Wright, Correlation and Causation, Journal of Agricultural Research, Vol. XX, No. 7, 1921.

Structural Equations Modeling
What comes to mind?

Structural Equations Modeling
[Figure: path diagram. Labels: Wireless Phone Service, Advertising Budget, Brand Attitudes, Purchase Likelihood, Experience, Information Search, Risk]

Structural Equations Modeling (SEM)
Two Steps:
1. Confirm measurement model (CFA) = CFA assesses reliability and validity of the model’s constructs.
• CB-SEM – must achieve fit to move to 2nd step.
• PLS-SEM – confirm measurement before examining structural model (2nd step).
2. Evaluate structural model (SEM) = SEM determines whether hypothesized relationships exist between the constructs.

In developing models to test using CFA/SEM, researchers draw upon theory, prior experience, expert judgment, and research objectives to identify and develop hypotheses about relationships between multiple independent and dependent variables.

CB-SEM (Covariance-based SEM) – statistical objective = to reproduce the theoretical covariance matrix, without focusing on explained variance.
PLS-SEM (Partial Least Squares SEM) – statistical objective = to maximize the explained variance of the endogenous latent constructs (dependent variables).

CB-SEM – statistical objective (goodness of fit) = minimize the differences between the observed covariance matrix and the estimated covariance matrix.
Research objective: testing and confirmation where prior theory is strong.
• Assumes normality of data distribution, homoscedasticity, large sample size, etc.
• A “full information approach,” which means small changes in model specification can result in substantial changes in model fit.

PLS-SEM – statistical objective = maximize the explained (predicted) variance of the dependent variables.
Research objective: theory development and prediction.
• Normality of data distribution not assumed.
• Good solutions with smaller sample sizes.
• Measurement models: Can be used with fewer indicator variables (1 or 2) per construct. OK to have ordinal scaled questions. Can include a larger number of indicator variables (CB-SEM = solution unlikely with 50+ items).
• Preferred alternative with formative constructs.

PLS Path Model
Steps 1 & 2 are combined, but still look at measurement theory first before moving to structural model assessment.

Which SEM Approach Should Be Used?
Rules of Thumb: PLS-SEM or CB-SEM?

Use CB-SEM when:
• The goal is theory testing, theory confirmation, or the comparison of alternative theories.
• The structural model has non-recursive relationships.
• The research requires a global goodness of fit criterion.

Use PLS-SEM when:
• The goal is predicting key target constructs.
• Formative constructs are included in the structural model. Note that formative measures can also be used with CB-SEM, but doing so requires construct specification modifications (e.g., the construct must include both formative and reflective indicators to meet identification requirements = MIMIC measurement model).
• The structural model is complex (many constructs and many indicators).
• The sample size is small and/or the data are not normally distributed.
• The plan is to use latent variable scores in subsequent analyses.

Should You Use SEM In Your Research?
Journal reviewers rate SEM papers more favorably on key manuscript attributes . . .

Mean Score
Attribute           SEM   No SEM   p-value
Topic Relevance     4.2   3.8      .182
Research Methods    3.5   2.7      .006
Data Analysis       3.5   2.8      .025
Conceptualization   3.1   2.5      .018
Writing Quality     3.9   3.0      .006
Contribution        3.1   2.8      .328

Note: scores based on 5-point scale, with 5 = more favorable.
Source: Babin, Hair & Boles, Publishing Research in Marketing Journals Using Structural Equation Modeling, Journal of Marketing Theory and Practice, Vol. 16, No. 4, 2008, pp. 281-288.

The use of PLS-SEM is increasing in different fields
[Figure: cumulative number of articles (0 to 200) by year (1980 to 2010) for Marketing (Hair et al. 2012a), MISQ (Ringle et al. 2012), Strategic Mgmt. (Hair et al. 2012b), and Mgmt. Accounting (Nitzl 2012)]

Hair, J. F., M. Sarstedt, C. M. Ringle, and J. A. Mena (2012a). An Assessment of the Use of Partial Least Squares Structural Equation Modeling in Marketing Research, Journal of the Academy of Marketing Science, 40(3), 414-433.
Hair, J. F., M. Sarstedt, T. Pieper, and C.
M. Ringle (2012b). The Use of Partial Least Squares Structural Equation Modeling in Strategic Management Research: A Review of Past Practices and Recommendations for Future Applications, Long Range Planning, 45(5/6), 320-340.
Nitzl, C. (2012). The Use of Partial Least Squares Path Modeling in Management Accounting, White Paper.
Ringle, C. M., M. Sarstedt, and D. Straub (2012). A Critical Look at the Application of PLS-SEM in MIS Quarterly, MIS Quarterly, 36(1), iii-xiv.

All rights reserved ©. Cannot be reproduced or distributed without express written permission from Prentice-Hall, McGraw-Hill, Sage, SmartPLS, and session presenters.

A PLS path model consists of two elements:
• First, there is a structural model (also referred to as the inner model in the context of PLS-SEM) that represents the constructs (circles or ovals). The structural model also displays the relationships (paths) between the constructs.
• Second, there are the measurement models (also referred to as the outer models in PLS-SEM) of the constructs that display the relationships between the constructs and the indicator variables (rectangles).

There are two types of constructs in an SEM: the exogenous latent variables (i.e., those constructs that explain other constructs in the model) and the endogenous latent variables (i.e., those constructs that are being explained in the model).

Path models = diagrams used to visually display the hypotheses and variable relationships that are examined when SEM is applied.
Constructs = variables that are not directly measured. They are represented in path models as circles or ovals (Y1 to Y4).
Indicators = the directly measured proxy variables that contain the raw data, also referred to as items or manifest variables. They are represented in path models as rectangles (x1 to x10).
Paths = relationships between constructs, and between constructs and their assigned indicators, shown as arrows. In PLS-SEM, the arrows are always single-headed, thus representing directional relationships. Single-headed arrows represent predictive relationships and, with strong theoretical support, can be interpreted as causal relationships.
Error terms = the error terms (e.g., e7 or e8; Exhibit 1.4) are connected to the (endogenous) constructs and (reflectively) measured variables by single-headed arrows. Error terms represent the unexplained variance when path models are estimated. In Exhibit 1.4, error terms e7 to e10 are on those indicators whose relationships go from the construct to the indicator (i.e., reflectively measured indicators). In contrast, the formatively measured indicators x1 to x6, where the relationship goes from the indicator to the construct, do not have error terms.

The structural model also contains error terms. In Exhibit 1.4, z3 and z4 are associated with the endogenous latent variables Y3 and Y4 (note that error terms on constructs and measured variables are labeled differently). In contrast, the exogenous latent variables that only explain other latent variables in the structural model do not have an error term.
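The path model vocabulary above can be sketched as a small data structure. This is an illustration only, not how SmartPLS or any other package represents a model. The structural paths and the assignment of indicators to constructs below are assumptions made for the example (the text states only that x1 to x6 are formative, that e7 to e10 sit on reflective indicators, and that Y3 and Y4 are endogenous), and the helper names `classify_constructs` and `error_terms` are hypothetical.

```python
# Hypothetical path model patterned on the Exhibit 1.4 description.
# ASSUMPTION: the structural paths and the indicator-to-construct assignment
# are invented for illustration; only the roles (formative x1-x6,
# reflective x7-x10, endogenous Y3 and Y4) come from the text.
structural_paths = [("Y1", "Y3"), ("Y2", "Y3"), ("Y2", "Y4"), ("Y3", "Y4")]

measurement_models = {
    "Y1": {"mode": "formative", "indicators": ["x1", "x2", "x3"]},
    "Y2": {"mode": "formative", "indicators": ["x4", "x5", "x6"]},
    "Y3": {"mode": "reflective", "indicators": ["x7", "x8", "x9"]},
    "Y4": {"mode": "reflective", "indicators": ["x10"]},
}


def classify_constructs(constructs, paths):
    """Split constructs into exogenous (no incoming path) and endogenous."""
    targets = {target for _, target in paths}
    exogenous = sorted(c for c in constructs if c not in targets)
    endogenous = sorted(c for c in constructs if c in targets)
    return exogenous, endogenous


def error_terms(measurement, endogenous):
    """List error terms: e_k on reflective indicators, z_k on endogenous constructs."""
    indicator_errors = [
        "e" + indicator[1:]
        for spec in measurement.values()
        if spec["mode"] == "reflective"
        for indicator in spec["indicators"]
    ]
    construct_errors = ["z" + construct[1:] for construct in endogenous]
    return indicator_errors, construct_errors


exogenous, endogenous = classify_constructs(measurement_models, structural_paths)
indicator_errors, construct_errors = error_terms(measurement_models, endogenous)

print("Exogenous:", exogenous)
print("Endogenous:", endogenous)
print("Indicator error terms:", indicator_errors)
print("Construct error terms:", construct_errors)
```

Under these assumed paths, the exogenous constructs carry no error term and the formative indicators carry none either, which mirrors the rules stated above.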
Reflective (Scale) Versus Formative (Index) Operationalization of Constructs
A central research question in social science research, particularly marketing, management & MIS, focuses on the operationalization of complex constructs: Are indicators causing or being caused by the latent variable/construct measured by them?
• Reflective: Construct → Indicator 1, Indicator 2, Indicator 3. Changes in the latent variable directly cause changes in the assigned indicators.
• Formative: Indicator 1, Indicator 2, Indicator 3 → Construct. Changes in one or more of the indicators cause changes in the latent variable.

Example: Reflective vs. Formative World View
Reflective view of Drunkenness:
• Can’t walk a straight line
• Smells of alcohol
• Slurred speech
Formative view of Drunkenness:
• Consumption of beer
• Consumption of wine
• Consumption of hard liquor

Basic Difference Between Reflective and Formative Measurement Approaches
“Whereas reflective indicators are essentially interchangeable (and therefore the removal of an item does not change the essential nature of the underlying construct), with formative indicators ‘omitting an indicator is omitting a part of the construct’.” (Diamantopoulos/Winklhofer, 2001, p. 271)
• The formative measurement approach generally minimizes the overlap between complementary indicators within the construct domain.
• The reflective measurement approach focuses on maximizing the overlap between interchangeable indicators within the construct domain.

Exercise: Satisfaction in Hotels as Formative and Reflective Operationalized Constructs
Construct: Satisfaction with Hotels
• The rooms’ furnishings are good
• The hotel’s recreation offerings are good
• Taking everything into account, I am satisfied with this hotel
• The hotel’s personnel are friendly
• I appreciate this hotel
• The hotel is low-priced
• I am looking forward to staying overnight in this hotel
• The rooms are quiet
• I am comfortable with this hotel
• The rooms are clean
• The hotel’s service is good
• The hotel’s cuisine is good

Formative Constructs – Two Types
1.
Composite (formative) constructs – indicators completely determine the “latent” construct. They share similarities because they define a composite variable but may or may not have conceptual unity. In assessing validity, indicators are not interchangeable and should not be eliminated, because removing an indicator will likely change the nature of the latent construct.
2. Causal constructs – indicators have conceptual unity in that all variables should correspond to the definition of the concept. In assessing validity, some of the indicators may be interchangeable and can also be eliminated.
Bollen, K.A. (2011), Evaluating Effect, Composite, and Causal Indicators in Structural Equations Models, MIS Quarterly, Vol. 35, No. 2, pp. 359-372.

Sample Size
The overall complexity of a structural model has little influence on the sample size requirements for PLS-SEM. The reason is the algorithm does not compute all relationships in the structural model at the same time. Instead, it uses OLS regressions to estimate the model’s partial regression relationships.

The 10 times rule indicates the sample size should be equal to the larger of:
(1) 10 times the largest number of formative indicators used to measure a single construct, or
(2) 10 times the largest number of structural paths directed at a particular latent construct in the structural model.
This rule of thumb is equivalent to saying that the minimum sample size should be 10 times the maximum number of arrowheads pointing at a latent variable anywhere in the path model.

While the 10 times rule offers a rough guideline for minimum sample size requirements, PLS-SEM, like any statistical technique, requires researchers to consider the sample size against the background of the model and data characteristics. Specifically, the required sample size should be determined using power analyses based on the part of the model with the largest number of predictors.
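The 10 times rule stated above reduces to a small calculation. The sketch below is a minimal illustration, assuming a hypothetical model with four constructs. The function name `ten_times_rule` and the counts are invented for the example. It covers only the rough rule of thumb and does not replace the power analysis recommended in the text.

```python
def ten_times_rule(formative_indicators, structural_paths_in):
    """Rough minimum sample size under the 10 times rule.

    formative_indicators: dict mapping construct -> number of formative
        indicators measuring it (0 for reflectively measured constructs).
    structural_paths_in: dict mapping construct -> number of structural
        paths directed at it.
    """
    max_formative = max(formative_indicators.values(), default=0)
    max_paths = max(structural_paths_in.values(), default=0)
    # The rule takes the larger of the two counts and multiplies it by 10.
    return 10 * max(max_formative, max_paths)


# Hypothetical model (counts are assumptions, not taken from the slides):
# Y1 and Y2 each have 3 formative indicators; Y3 receives 2 paths, Y4 receives 3.
formative = {"Y1": 3, "Y2": 3, "Y3": 0, "Y4": 0}
paths_in = {"Y1": 0, "Y2": 0, "Y3": 2, "Y4": 3}

minimum_n = ten_times_rule(formative, paths_in)
print(minimum_n)  # 30
```

Here the largest number of arrowheads pointing at any single construct is 3, so the rule suggests at least 30 observations. A model with a five-indicator formative construct would raise that floor to 50.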
Statistical power assumed = 80%

Indicators for SEM Model Constructs

Competence (COMP)
  comp_1  [company] is a top competitor in its market.
  comp_2  As far as I know, [company] is recognized world-wide.
  comp_3  I believe that [company] performs at a premium level.

Likeability (LIKE)
  like_1  [company] is a company that I can better identify with than other companies.
  like_2  [company] is a company that I would regret more not having if it no longer existed than I would other companies.
  like_3  I regard [company] as a likeable company.

Customer Loyalty (CUSL)
  cusl_1  I would recommend [company] to friends and relatives.
  cusl_2  If I had to choose again, I would choose [company] as my mobile phone services provider.
  cusl_3  I will remain a customer of [company] in the future.

Satisfaction (CUSA)
  cusa    If you consider your experiences with [company], how satisfied are you with [company]?

Extended Reputation Model Constructs