Model Fit

Note of Caution

Researchers should be very cautious when reporting and using model fit in PLS-SEM (Hair et al., 2017). The proposed criteria are in an early stage of research, are not fully understood (e.g., the critical threshold values), and are often not useful for PLS-SEM. Even so, some researchers have started requesting that these new model fit indices be reported for PLS-SEM. SmartPLS provides them but maintains that much more research is necessary to apply them appropriately. So far, these criteria should usually not be reported or used for the assessment of PLS-SEM results.

Lohmöller (1989) already offers a set of fit measures. However, he states that they were introduced to provide a comparison with LISREL results rather than to represent appropriate PLS-SEM indices. More specifically, Lohmöller (1989) notes that some fit measures imply restrictive assumptions on the residual covariances, which PLS-SEM does not impose when estimating the model. For example, certain fit measures assume a common factor model, which requires uncorrelated outer residuals. In contrast, the outer residuals of composite models are not required to be uncorrelated. Hence, such fit measures are inappropriate for PLS-SEM.

However, when mimicking CB-SEM with the consistent PLS (PLSc-SEM) approach, one also mimics common factor models with the PLS-SEM approach. Hence, when using PLSc-SEM for a path model that includes only reflectively measured constructs (i.e., common factor models), one may be interested in the model fit, for example, to mimic CB-SEM more comprehensively via PLSc-SEM or to compare the results of the two approaches.

Fit Measures in SmartPLS

SmartPLS offers the following fit measures:

  • SRMR
  • Exact fit criteria d_ULS and d_G
  • NFI
  • Chi²
  • RMS_theta

Note: These fit measures still need to be clearly defined and better explained in the PLS-SEM literature. Currently, most readers are unlikely to know what they are, what they mean, or how they are calculated.

For the approximate fit indices such as SRMR and NFI, you may directly look at the outcomes of a PLS-SEM or PLSc-SEM model estimation (i.e., the results report) and compare these criteria's values against a certain threshold (e.g., SRMR < 0.08 and NFI > 0.90).

For the exact fit measures d_ULS and d_G, you may consider inference statistics for the assessment. For this purpose, you need to run the bootstrap procedure and use the “complete bootstrap” option in SmartPLS 3. When running the bootstrap procedure, you will notice that it counts up to the specified number of bootstrap samples twice:

  • In the first round, SmartPLS uses the standard bootstrapping procedure to obtain the inference statistics for the model parameters (e.g., path coefficients and weights).
  • In the second round, SmartPLS uses an adapted Bollen-Stine bootstrapping procedure as described in Dijkstra and Henseler (2015; also see Bollen and Stine, 1992; Yuan and Hayashi, 2003) to create confidence intervals for the d_ULS, d_G, and SRMR criteria; a sketch of the underlying data transformation follows this list (note that SmartPLS has two computation runs in the second round: one for the saturated model and one for the estimated model).
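
The second round builds on the standard Bollen-Stine idea of transforming the data so that the model-implied covariance matrix holds exactly in the resampling population before bootstrap samples are drawn. The following Python sketch illustrates that transformation under common assumptions; the variable names are illustrative and do not correspond to SmartPLS internals:

    import numpy as np
    from scipy.linalg import sqrtm

    def bollen_stine_transform(X, S, Sigma_hat):
        """Transform the mean-centered indicator data X so that the model-implied
        covariance matrix Sigma_hat holds exactly in the resampling population;
        S is the sample covariance matrix of X. Bootstrap samples for the exact
        fit criteria are then drawn from the transformed data Z."""
        Z = X @ np.linalg.inv(sqrtm(S)) @ sqrtm(Sigma_hat)
        return np.real(Z)  # sqrtm may return a complex-typed result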

Standardized Root Mean Square Residual (SRMR)

While the root mean square residual (RMSR) is a measure of the mean absolute value of the covariance residuals, the standardized root mean square residual (SRMR) is based on transforming both the sample covariance matrix and the predicted covariance matrix into correlation matrices. Note: The literature on PLS-SEM needs to better explain where and how the covariance matrix is derived in PLS-SEM (since PLS-SEM, unlike CB-SEM, is not a full information method). Most importantly, it remains open whether the researcher should use the estimated model (the most reasonable choice) or the saturated model to obtain the covariance matrix.

The SRMR is defined on the basis of the differences between the observed correlation matrix and the model-implied correlation matrix. Thus, it allows assessing the average magnitude of the discrepancies between observed and expected correlations as an absolute measure of (model) fit.

A value of less than 0.10, or of 0.08 in a more conservative version (see Hu and Bentler, 1998), is considered a good fit. Henseler et al. (2014) introduce the SRMR as a goodness-of-fit measure for PLS-SEM that can be used to avoid model misspecification.
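
As an illustration, the following Python sketch computes the SRMR from an empirical correlation matrix R and a model-implied correlation matrix R_hat; both matrices are assumed to be given, and the code illustrates the idea rather than SmartPLS's implementation:

    import numpy as np

    def srmr(R, R_hat):
        """Standardized root mean square residual between the empirical
        correlation matrix R and the model-implied correlation matrix R_hat."""
        k = R.shape[0]
        # Residuals of the lower triangle, including the diagonal (which is zero
        # for correlation matrices but counted in the usual k(k+1)/2 denominator).
        idx = np.tril_indices(k)
        residuals = R[idx] - R_hat[idx]
        return float(np.sqrt(np.sum(residuals ** 2) / (k * (k + 1) / 2)))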

SmartPLS also provides bootstrap-based inference statistics for the SRMR criterion. For the interpretation of the SRMR bootstrap confidence interval results, see the section on the exact model fit.

Exact Model Fit

Thus far, the PLS-SEM literature offers little knowledge and information on exact fit measures, their usefulness, behavior, relevance, and proper application. The exact model fit tests the statistical (bootstrap-based) inference of the discrepancy between the empirical covariance matrix and the covariance matrix implied by the composite factor model. Note: The literature on PLS-SEM needs to better explain where and how the covariance matrix is derived in PLS-SEM (since PLS-SEM, unlike CB-SEM, is not a full information method). Most importantly, it remains open whether the researcher should use the estimated model (the most reasonable choice) or the saturated model to obtain the covariance matrix.

As defined by Dijkstra and Henseler (2015), d_ULS (i.e., the squared Euclidean distance) and d_G (i.e., the geodesic distance) represent two different ways to compute this discrepancy. The bootstrap routine provides the confidence intervals of these discrepancy values. The d_G criterion builds on PLS-SEM eigenvalue computations. However, the question remains how these eigenvalues differ from those in CB-SEM.
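
The following Python sketch illustrates the two discrepancies in their commonly used formulations: d_ULS as half the squared Euclidean (Frobenius) distance between the empirical matrix S and the model-implied matrix Sigma_hat, and d_G as half the sum of squared logarithms of the eigenvalues of the product of the inverse of S and Sigma_hat. Both matrices are assumed to be given; the code is illustrative rather than SmartPLS's implementation:

    import numpy as np

    def d_uls(S, Sigma_hat):
        """Squared Euclidean distance between the two matrices."""
        return 0.5 * float(np.sum((S - Sigma_hat) ** 2))

    def d_g(S, Sigma_hat):
        """Geodesic distance, based on the eigenvalues of S^-1 * Sigma_hat."""
        eigenvalues = np.real(np.linalg.eigvals(np.linalg.solve(S, Sigma_hat)))
        return 0.5 * float(np.sum(np.log(eigenvalues) ** 2))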

Note: The values of d_ULS and d_G are not meaningful in themselves. Only the bootstrap results of the exact model fit measures allow an interpretation. More specifically, since the d_ULS and d_G (and SRMR) confidence intervals are obtained not by running the “normal” bootstrapping procedure but by the adapted Bollen-Stine bootstrapping procedure, the interpretation of their results differs somewhat from that of the “normal” bootstrap outcomes.

For the exact fit criteria (i.e., d_ULS and d_G), you compare their original value against the confidence interval created from the sampling distribution. The confidence interval should include the original value. Hence, the upper bound of the confidence interval should be larger than the original value of the d_ULS and d_G fit criteria to indicate that the model has a “good fit”. Choose the confidence interval such that its upper bound corresponds to the 95% or 99% point.

In other words, a model fits well if the difference between the correlation matrix implied by your model and the empirical correlation matrix is so small that it can be purely attributed to sampling error. Hence, the difference between the correlation matrix implied by your model and the empirical correlation matrix should be non-significant (p > 0.05). Otherwise, if the discrepancy is significant (p < 0.05), model fit has not been established.
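
In code, this decision rule can be sketched as follows, where d_orig is the criterion value computed from the original sample and d_boot holds the values from the adapted Bollen-Stine bootstrap samples (both names are hypothetical, not SmartPLS output fields):

    import numpy as np

    def exact_fit_supported(d_orig, d_boot, alpha=0.05):
        """Model fit is supported if the original discrepancy does not exceed the
        upper bound of the bootstrap confidence interval (e.g., the 95% quantile)."""
        upper_bound = np.quantile(d_boot, 1 - alpha)
        return d_orig <= upper_bound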

Normed Fit Index (NFI) or Bentler and Bonett Index

One of the first fit measures proposed in the SEM literature is the normed fit index by Bentler and Bonett (1980). It computes the Chi² value of the proposed model and compares it against a meaningful benchmark. Since the Chi² value of the proposed model does not in itself provide sufficient information to judge model fit, the NFI uses the Chi² value of the null model as a yardstick. The literature, however, does not explain how the PLS-SEM Chi² value differs from the one obtained in CB-SEM.

The NFI is then defined as 1 minus the Chi² value of the proposed model divided by the Chi² value of the null model. Consequently, the NFI results in values between 0 and 1. The closer the NFI is to 1, the better the fit; NFI values above 0.90 usually represent acceptable fit. Lohmöller (1989) provides detailed information on the NFI computation for PLS path models. For the applied user, however, these explications are quite difficult to comprehend.
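
As a simple illustration, assuming the two Chi² values are already available:

    def nfi(chi2_model, chi2_null):
        """Normed fit index: 1 minus the ratio of the Chi-square value of the
        proposed model to the Chi-square value of the null model."""
        return 1.0 - chi2_model / chi2_null

    nfi(120.0, 1500.0)  # 0.92, i.e., acceptable fit under the 0.90 rule of thumb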

The NFI represents an incremental fit measure. As such, a major disadvantage is that it does not penalize model complexity: the more parameters in the model, the larger (i.e., better) the NFI result. For this reason, this measure is not recommended; preferable alternatives are the non-normed fit index (NNFI), or Tucker-Lewis index, which penalizes the Chi² value by the degrees of freedom (df). Lohmöller (1989) suggests computing the NNFI for PLS path models. However, the NNFI has not been implemented in SmartPLS yet.

Chi² and Degrees of Freedom

Assuming a multinormal distribution, the Chi² value of a PLS path model with df degrees of freedom is approximately (N-1)*L, whereby N is the number of observations and L the fitted value of the maximum likelihood function as defined by Lohmöller (1989). The degrees of freedom are defined as df = (K² + K) / 2 - t, whereby K is the number of manifest variables in the PLS path model and t the number of independent parameters used to estimate the model-implied covariance matrix. However, future research must clearly define how to determine the degrees of freedom of composite models, common factor models, and mixed models when using PLS-SEM.
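
A small worked sketch of these two quantities, assuming the fitted value of the maximum likelihood function L is given:

    def chi_square(n_observations, L):
        """Approximate Chi-square value: (N - 1) * L (Lohmöller, 1989)."""
        return (n_observations - 1) * L

    def degrees_of_freedom(k_indicators, t_parameters):
        """df = (K^2 + K) / 2 - t, with K manifest variables and t independent
        parameters used to estimate the model-implied covariance matrix."""
        return (k_indicators ** 2 + k_indicators) // 2 - t_parameters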

RMS_theta

The RMS_theta is the root mean squared residual based on the covariance matrix of the outer model residuals (Lohmöller, 1989). This fit measure is only useful for assessing purely reflective models, because outer model residuals are not meaningful for formative measurement models.

The RMS_theta assesses the degree to which the outer model residuals correlate. The measure should be close to zero to indicate good model fit, because it would imply that the correlations between the outer model residuals are very small (close to zero).

The RMS_theta builds on the outer model residuals, which are the differences between the predicted and the observed indicator values. Predicting the indicator values in PLS-SEM requires the latent variable scores. However, PLSc-SEM assumes common factors, which are subject to factor indeterminacy; thus, determinate latent variable scores do not exist. Hence, even though the RMS_theta would be particularly relevant for assessing common factor models computed by PLSc-SEM, it is only available for composite models computed by PLS-SEM. This discussion needs to be further differentiated between PLS-SEM and PLSc-SEM. RMS_theta values below 0.12 indicate a well-fitting model, whereas higher values indicate a lack of fit (Henseler et al., 2014).
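
A minimal sketch of the computation, assuming Theta is the correlation matrix of the outer model residuals (only the off-diagonal elements enter the root mean square); as with the other sketches, this illustrates the idea rather than SmartPLS's implementation:

    import numpy as np

    def rms_theta(Theta):
        """Root mean square of the off-diagonal residual correlations."""
        k = Theta.shape[0]
        off_diagonal = Theta[~np.eye(k, dtype=bool)]
        return float(np.sqrt(np.mean(off_diagonal ** 2)))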

Estimated and Saturated Model

The distinction between estimated and saturated models in PLS-SEM is at a very early stage. Future research must provide detailed explanations and recommendations on the computation, usage, and interpretation of these outcomes.

The saturated model assesses the correlations between all constructs. The estimated model, in contrast, is based on a total effect scheme and takes the model structure into account. It is hence a more restricted version of the fit assessment.

Researchers often struggle to choose between the estimated and the saturated model when trying to report the fit of a PLS path model. At this stage, the PLS-SEM literature is very vague on the use of fit criteria in general and, specifically, on the choice between the estimated and the saturated model. However, the estimated model seems to be a reasonable choice if a researcher makes the questionable decision to report the fit results of the PLS path model.

Composite Model Fit Measures

If you would like to obtain the composite model fit measures, use formative measurement models for all constructs in the PLS path model. After model estimation, refer to the estimated (or saturated?) model outcomes.

Common Factor Model Fit Measures

If you would like to obtain the common factor model fit measures, use reflective measurement models for all constructs in the PLS path model. After model estimation, refer to the estimated (or saturated?) model SRMR outcomes. However, when assuming common factor models for all constructs in the PLS path model, the question remains why the researcher does not use CB-SEM in the first place to estimate and evaluate such a model.

Mixed Model Fit Measures

If you use both reflective and formative measurement models, SmartPLS 3.2.4 (and subsequent versions) provides the mixed model fit measures, assuming common factor models for reflective measurement models and composite models for formative measurement models. However, at this stage, the PLS-SEM literature does not provide much guidance on why and how a researcher would theoretically distinguish between constructs represented by common factors and composites in the same model, on the need for their mixed use, or on the necessity of reporting the model fit.

References

  • Bentler, P. M., and Bonett, D. G. (1980). Significance Tests and Goodness-of-Fit in the Analysis of Covariance Structures, Psychological Bulletin, 88: 588-600.

  • Dijkstra, T. K. and Henseler, J. (2015). Consistent and Asymptotically Normal PLS Estimators for Linear Structural Equations, Computational Statistics & Data Analysis, 81(1): 10-23.

  • Hair, J. F., Hollingsworth, C. L., Randolph, A. B., and Chong, A. Y. L. (2017). An Updated and Expanded Assessment of PLS-SEM in Information Systems Research, Industrial Management & Data Systems, 117(3): 442-458.

  • Hair, J. F., Hult, G. T. M., Ringle, C. M., and Sarstedt, M. (2017). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 2nd Ed., Sage: Thousand Oaks.

  • Henseler, J., Dijkstra, T. K., Sarstedt, M., Ringle, C. M., Diamantopoulos, A., Straub, D. W., Ketchen, D. J., Hair, J. F., Hult, G. T. M., and Calantone, R. J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013), Organizational Research Methods, 17(2): 182-209.

  • Hu, L.-t., and Bentler, P. M. (1998). Fit Indices in Covariance Structure Modeling: Sensitivity to Underparameterized Model Misspecification, Psychological Methods, 3(4): 424-453.

  • Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares, Physica: Heidelberg.
