Model Fit

Fit Measures in SmartPLS

SmartPLS offers the following fit measures:

  • SRMR
  • Exact fit criteria d_ULS and d_G
  • NFI
  • Chi²
  • RMS_theta

For approximate fit indices such as SRMR and NFI, you can read the values directly from the results report of a PLS or PLSc model estimation and compare them against a threshold (e.g., SRMR < 0.08 and NFI > 0.90).

For the exact fit measures d_ULS and d_G, the assessment relies on inference statistics. To obtain them, run the bootstrap procedure with the “complete bootstrap” option in SmartPLS 3. When running the bootstrap procedure, you will notice that it counts up to the specified number of bootstrap samples twice:

  • In the first round, SmartPLS uses the standard bootstrapping procedure to get the inference statistics for the model parameters (e.g., path coefficients, weights, etc.).
  • In the second round, SmartPLS uses an adapted Bollen-Stine bootstrapping procedure as described in Dijkstra and Henseler (2015; also see Bollen and Stine, 1992; Yuan and Hayashi, 2003) to create confidence intervals for the d_ULS, d_G, and SRMR criteria (note that SmartPLS has two computation runs in the second round: one for the saturated model and one for the estimated model).
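The idea behind a Bollen-Stine bootstrap can be sketched as follows. This is a minimal illustration of the classic transformation (Bollen and Stine, 1992), not the adapted procedure SmartPLS actually runs. The data are rescaled so that their sample covariance matrix equals the model-implied matrix, and bootstrap samples are then drawn from the rescaled data. The function names, the simulated data, and the matrix `sigma_implied` are hypothetical.

```python
import numpy as np

def sym_matrix_power(m, power):
    """Power of a symmetric positive-definite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(m)
    return (eigvecs * eigvals**power) @ eigvecs.T

def bollen_stine_transform(data, sigma_implied):
    """Rescale centered data so that its sample covariance equals sigma_implied."""
    centered = data - data.mean(axis=0)
    s = np.cov(centered, rowvar=False)
    return centered @ sym_matrix_power(s, -0.5) @ sym_matrix_power(sigma_implied, 0.5)

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))                # hypothetical indicator data
sigma_implied = np.array([[1.0, 0.3, 0.3],
                          [0.3, 1.0, 0.3],
                          [0.3, 0.3, 1.0]])     # hypothetical model-implied matrix

transformed = bollen_stine_transform(data, sigma_implied)

# One bootstrap sample: rows drawn with replacement from the transformed data.
rows = rng.integers(0, transformed.shape[0], size=transformed.shape[0])
bootstrap_sample = transformed[rows]
```

Because the transformed data satisfy the model exactly, the discrepancies computed on such bootstrap samples describe what sampling error alone would produce.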

Standardized Root Mean Square Residual (SRMR)

While the root mean square residual (RMSR) is a measure of the mean absolute value of the covariance residuals, the standardized root mean square residual (SRMR) is based on transforming both the sample covariance matrix and the predicted covariance matrix into correlation matrices. The SRMR is defined as the difference between the observed correlation matrix and the model-implied correlation matrix. Thus, it allows assessing the average magnitude of the discrepancies between observed and expected correlations as an absolute measure of (model) fit.
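As a rough sketch of this definition, the SRMR can be computed as the root mean square of the residuals between two correlation matrices. The version below averages over the lower triangle including the diagonal, which is an assumption; implementations differ on this detail, and the exact SmartPLS computation is not documented here. The function name and example matrices are hypothetical.

```python
import numpy as np

def srmr(observed_corr, implied_corr):
    """Root mean square of the correlation residuals (lower triangle incl. diagonal)."""
    observed_corr = np.asarray(observed_corr, dtype=float)
    implied_corr = np.asarray(implied_corr, dtype=float)
    rows, cols = np.tril_indices(observed_corr.shape[0])
    residuals = observed_corr[rows, cols] - implied_corr[rows, cols]
    return float(np.sqrt(np.mean(residuals**2)))

observed = [[1.00, 0.50, 0.40],
            [0.50, 1.00, 0.30],
            [0.40, 0.30, 1.00]]   # hypothetical observed correlations
implied = [[1.00, 0.45, 0.45],
           [0.45, 1.00, 0.30],
           [0.45, 0.30, 1.00]]    # hypothetical model-implied correlations

value = srmr(observed, implied)
```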

A value less than 0.10, or less than 0.08 in a more conservative version (see Hu and Bentler, 1999), is considered a good fit. Henseler et al. (2014) introduce the SRMR as a goodness of fit measure for PLS-SEM that can be used to avoid model misspecification.

SmartPLS also provides bootstrap-based inference statistics of the SRMR criterion. For the interpretation of the SRMR bootstrap confidence intervals, see the Exact Model Fit section.

Exact Model Fit

The exact model fit tests the statistical (bootstrap-based) inference of the discrepancy between the empirical covariance matrix and the covariance matrix implied by the composite factor model. As defined by Dijkstra and Henseler (2015), d_ULS (i.e., the squared Euclidean distance) and d_G (i.e., the geodesic distance) represent two different ways to compute this discrepancy. The bootstrap routine provides the confidence intervals of these discrepancy values.
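The two discrepancies can be sketched as below. The formulas are assumptions, since this page does not spell them out: the squared Euclidean distance is taken as half the sum of squared differences between the two matrices, and the geodesic distance as half the sum of squared log-eigenvalues of S⁻¹Σ. The function names and example matrices are hypothetical.

```python
import numpy as np

def d_uls(s, sigma):
    """Squared Euclidean distance, assumed here as 0.5 * trace((S - Sigma)^2)."""
    diff = np.asarray(s, dtype=float) - np.asarray(sigma, dtype=float)
    # For a symmetric matrix, trace(diff @ diff) equals the sum of squared entries.
    return 0.5 * float(np.sum(diff**2))

def d_g(s, sigma):
    """Geodesic distance, assumed here as 0.5 * sum(log(phi_k)^2),
    where phi_k are the eigenvalues of S^{-1} Sigma."""
    s = np.asarray(s, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    eigvals = np.linalg.eigvals(np.linalg.solve(s, sigma))
    return 0.5 * float(np.sum(np.log(eigvals.real) ** 2))

s = np.eye(2)                                  # hypothetical sample covariance matrix
sigma = np.array([[1.0, 0.2], [0.2, 1.0]])     # hypothetical model-implied matrix

print(d_uls(s, sigma), d_g(s, sigma))
```

Both functions return zero when the two matrices coincide and grow as they diverge.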

Note: The values of d_ULS and d_G are not interpretable by themselves. Only the bootstrap results of the exact model fit measures allow an interpretation. More specifically, since the d_ULS and d_G (and SRMR) confidence intervals are not obtained by running the “normal” bootstrapping procedure but the adapted Bollen-Stine bootstrapping procedure, their interpretation differs somewhat from that of the “normal” bootstrap outcomes.

For the exact fit criteria (i.e., d_ULS and d_G), you compare their original value against the confidence interval created from the sampling distribution. The confidence interval should include the original value. Hence, the upper bound of the confidence interval should be larger than the original value of the exact d_ULS and d_G fit criteria to indicate that the model has a “good fit”. Choose the confidence interval in a way that the upper bound is at the 95% or 99% point.

In other words, a model fits well if the difference between the correlation matrix implied by your model and the empirical correlation matrix is so small that it can be purely attributed to sampling error. Hence, the difference between the correlation matrix implied by your model and the empirical correlation matrix should be non-significant (p > 0.05). Otherwise, if the discrepancy is significant (p < 0.05), model fit has not been established.
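The decision rule can be illustrated with a short sketch. It assumes you already have the original discrepancy value and a vector of discrepancy values from the Bollen-Stine bootstrap samples. The function name, the returned keys, and the example numbers are hypothetical, and the p-value is approximated as the share of bootstrap values at least as large as the original value.

```python
import numpy as np

def exact_fit_check(original_value, bootstrap_values, level=0.95):
    """Compare the original discrepancy with the upper bound of the bootstrap distribution."""
    bootstrap_values = np.asarray(bootstrap_values, dtype=float)
    upper_bound = float(np.quantile(bootstrap_values, level))
    p_value = float(np.mean(bootstrap_values >= original_value))
    return {
        "upper_bound": upper_bound,
        "p_value": p_value,
        "fit_not_rejected": bool(original_value <= upper_bound),
    }

# Hypothetical bootstrap distribution of a discrepancy measure (e.g., d_ULS).
bootstrap_values = np.linspace(0.01, 1.0, 100)

good = exact_fit_check(0.505, bootstrap_values)   # original value inside the interval
poor = exact_fit_check(0.995, bootstrap_values)   # original value above the upper bound
```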

Note: SmartPLS 3.2.7 (and later versions) returns results of d_G1 and d_G2. In line with the publication by Dijkstra and Henseler (2015), d_G1 calculates the eigenvalues based on \(S^{-1}Σ\), whereby \(S\) represents the sample covariance matrix and \(Σ\) the model-implied covariance matrix. In contrast, d_G2 uses a corrected eigenvalue computation based on \(S^{-1/2}ΣS^{-1/2}\).

Normed Fit Index (NFI) or Bentler and Bonett Index

One of the first fit measures proposed in the SEM literature is the normed fit index by Bentler and Bonett (1980). It computes the Chi² value of the proposed model and compares it against a meaningful benchmark. Since the Chi² value of the proposed model in itself does not provide sufficient information to judge model fit, the NFI uses the Chi² value from the null model as a yardstick.

The NFI is then defined as 1 minus the Chi² value of the proposed model divided by the Chi² value of the null model. Consequently, the NFI results in values between 0 and 1. The closer the NFI to 1, the better the fit. NFI values above 0.9 usually represent acceptable fit. Lohmöller (1989) provides detailed information on the NFI computation of PLS path models.
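The definition translates directly into a one-line computation. The sketch below uses a hypothetical function name and made-up Chi² values purely for illustration.

```python
def normed_fit_index(chi2_model, chi2_null):
    """NFI = 1 - Chi²(proposed model) / Chi²(null model)."""
    if chi2_null <= 0:
        raise ValueError("The null model Chi² value must be positive.")
    return 1.0 - chi2_model / chi2_null

# Hypothetical values: proposed model Chi² = 120, null model Chi² = 1500.
nfi = normed_fit_index(120.0, 1500.0)
acceptable = nfi > 0.9
```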

The NFI represents an incremental fit measure. As such, a major disadvantage is that it does not penalize for model complexity. The more parameters in the model, the larger (i.e., better) the NFI result. For this reason, the measure is not recommended; alternatives such as the non-normed fit index (NNFI) or Tucker-Lewis index, which penalizes the Chi² value by the degrees of freedom (df), are preferable. Lohmöller (1989) suggests computing the NNFI of PLS path models. However, the NNFI has not yet been implemented in SmartPLS.

Chi² and Degrees of Freedom

Assuming a multinormal distribution, the Chi² value of a PLS path model with df degrees of freedom is approximately (N-1)*L, whereby N is the number of observations and L the maximum likelihood function as defined by Lohmöller (1989). The degrees of freedom (df) are defined as (K² + K)/2 - t, whereby K is the number of manifest variables in the PLS path model and t the number of independent variables to estimate the model-implied covariance matrix. However, future research must clearly define how to determine the degrees of freedom of composite models, common factor models, and mixed models when using PLS.
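A small worked example of the two formulas follows. The function names and the input values (K = 10, t = 23, N = 301, L = 0.4) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def degrees_of_freedom(k, t):
    """df = (K² + K) / 2 - t, with K manifest variables and t as defined in the text."""
    return (k * k + k) // 2 - t

def approximate_chi_square(n_obs, l_value):
    """Chi² is approximately (N - 1) * L, with L the maximum likelihood function value."""
    return (n_obs - 1) * l_value

df = degrees_of_freedom(k=10, t=23)                    # (100 + 10) / 2 - 23 = 32
chi2 = approximate_chi_square(n_obs=301, l_value=0.4)  # 300 * 0.4, about 120
```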

RMS_theta

The RMS_theta is the root mean squared residual covariance of the outer model residuals (Lohmöller, 1989). This fit measure is only useful to assess purely reflective models, because outer model residuals for formative measurement models are not meaningful.

The RMS_theta assesses the degree to which the outer model residuals correlate. The measure should be close to zero to indicate good model fit, because it would imply that the correlations between the outer model residuals are very small (close to zero).

The RMS_theta builds on the outer model residuals, which are the differences between predicted indicator values and the observed indicator values. Predicting the indicator values in PLS requires the latent variable scores. However, PLSc assumes common factors, which are subject to factor indeterminacy and, thus, determinate latent variable scores do not exist. Hence, even though the RMS_theta should be used for assessing common factor models computed by PLSc, it exists only for composite models computed by PLS.

RMS_theta values below 0.12 indicate a well-fitting model, whereas higher values indicate a lack of fit (Henseler et al., 2014).
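A minimal sketch of the idea, assuming the outer model residuals are available as a matrix with one column per indicator and that only the off-diagonal residual covariances enter the measure. This is an assumption about the computation, not a documented SmartPLS formula. The function name and example data are hypothetical.

```python
import numpy as np

def rms_theta(outer_residuals):
    """Root mean square of the off-diagonal covariances of the outer model residuals."""
    residual_cov = np.cov(np.asarray(outer_residuals, dtype=float), rowvar=False)
    rows, cols = np.tril_indices(residual_cov.shape[0], k=-1)  # strictly below the diagonal
    return float(np.sqrt(np.mean(residual_cov[rows, cols] ** 2)))

# Hypothetical residuals of two indicators that do not covary at all.
uncorrelated = [[1, 1], [-1, -1], [1, -1], [-1, 1]]
# Hypothetical residuals of two indicators that move together perfectly.
identical = [[1, 1], [-1, -1], [1, 1], [-1, -1]]

print(rms_theta(uncorrelated), rms_theta(identical))
```

With standardized indicators, a result below the 0.12 threshold would count as a well-fitting model under the rule stated above.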

Estimated and Saturated Model

The distinction of estimated and saturated models in PLS-SEM is in its very early stages. Future research must provide detailed explanations and recommendations on the computation, usage and interpretation of these outcomes.

The saturated model is the model that we assumed in the previous version. It assesses correlation between all constructs. The estimated model is a model which is based on a total effect scheme and takes the model structure into account. It is hence a more restricted version of the fit measure.

Composite Model Fit Measures

To obtain the composite model fit measures, use formative measurement models for all constructs in the PLS path model. After model estimation, in the SmartPLS report, the saturated model SRMR outcome is similar to the one reported for the composite model SRMR provided by SmartPLS 3.2.3 and earlier versions.

Common Factor Model Fit Measures

To obtain the common factor model fit measures, use reflective measurement models for all constructs in the PLS path model. After model estimation, in the SmartPLS report, the saturated model SRMR outcome is similar to the one reported for the common factor model SRMR provided by SmartPLS 3.2.3 and earlier versions.

Mixed Model Fit Measures

If you use both reflective and formative measurement models, SmartPLS 3.2.4 (and subsequent versions) provides the mixed model fit measures considering common factor models for reflective measurement models and composite models for formative measurement models.

Note of Caution

Even though SmartPLS includes some model fit assessment criteria, it is important to note that they may often not be useful for PLS-SEM and must be used with caution (Hair et al. 2017). In addition, these criteria are in their very early stage of research and not fully understood (e.g., the critical threshold values). However, some researchers started to request these new model fit indices for PLS-SEM. Thus, SmartPLS provides them even though we believe that there is much more research necessary to apply them appropriately.

Lohmöller (1989) already offers a set of fit measures, but he states that they have been introduced to provide a comparison to LISREL results rather than to represent an appropriate PLS-SEM index. More specifically, Lohmöller (1989) states that some fit measures imply restrictive assumptions on the residual covariances, which PLS-SEM does not imply when estimating the model. For example, certain fit measures assume a common factor model, which requires uncorrelated outer residuals. In contrast, the outer residuals of composite models are not required to be uncorrelated. Hence, such fit measures are inappropriate for PLS-SEM.

However, when mimicking CB-SEM models with the consistent PLS (PLSc) approach, one also mimics common factor models with the PLS-SEM approach. Hence, when using PLSc for a path model that only includes reflectively measured constructs (i.e., common factor models), one may be interested in the model fit. Thereby, it is more comprehensively possible to mimic CB-SEM via the PLSc approach or to compare the results from the two approaches.

In addition, consider the note of caution presented in Chapter 6 of the book on PLS-SEM (Hair et al., 2017): “While PLS-SEM was originally designed for prediction purposes, research has sought to extend its capabilities for theory testing by developing model fit measures. Model fit indices enable judging how well a hypothesized model structure fits the empirical data and, thus, help to identify model misspecifications. […] Initial simulation results suggest that the SRMR, RMStheta, and exact fit test are capable of identifying a range of model misspecifications (Dijkstra & Henseler, 2015a; Henseler et al., 2014). At this time, however, too little is known about these measures’ behavior across a range of data and model constellations, so more research is needed. Furthermore, these criteria are not readily implemented in standard PLS-SEM software. […] Apart from these developments, it is an open question whether fit measured as described above adds any value to PLS-SEM analyses in general. PLS-SEM focuses on prediction rather than on explanatory modeling and therefore requires a different type of validation. More precisely, validation PLS-SEM is concerned with generalization, which is the ability to predict sample data, or, preferably, out-of-sample data—see Shmueli (2010) for details. Against this background researchers increasingly call for the development of evaluation criteria that better support the prediction-oriented nature of PLS-SEM (e.g., Rigdon, 2012, 2014a) and for an emancipation of PLS-SEM from its CB-SEM sibling (Sarstedt, Ringle, Henseler, & Hair, 2014). In this context, fit (as put into effect by SRMS), RMStheta, and the exact fit test offer little value. In fact, their use can even be harmful as researchers may be tempted to sacrifice predictive power to achieve better “fit.” […].”

References

  • Bentler, P. M., & Bonett, D. G. (1980). Significance Tests and Goodness-of-Fit in the Analysis of Covariance Structures, Psychological Bulletin, 88: 588-600.

  • Dijkstra, T. K. and Henseler, J. (2015). Consistent and Asymptotically Normal PLS Estimators for Linear Structural Equations, Computational Statistics & Data Analysis, 81(1): 10-23.

  • Hair, J. F., Hollingsworth, C. L., Randolph, A. B., and Chong, A. Y. L. (2017). An Updated and Expanded Assessment of PLS-SEM in Information Systems Research, Industrial Management & Data Systems, 117(3): 442-458.

  • Hair, J. F., Hult, G. T. M., Ringle, C. M., and Sarstedt, M. (2017). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 2nd Ed., Sage: Thousand Oaks.

  • Henseler, J., Dijkstra, T. K., Sarstedt, M., Ringle, C. M., Diamantopoulos, A., Straub, D. W., Ketchen, D. J., Hair, J. F., Hult, G. T. M., and Calantone, R. J. (2014). Common Beliefs and Reality about Partial Least Squares: Comments on Rönkkö & Evermann (2013), Organizational Research Methods, 17(2): 182-209.

  • Hu, L.-t., and Bentler, P. M. (1998). Fit Indices in Covariance Structure Modeling: Sensitivity to Underparameterized Model Misspecification, Psychological Methods, 3(4): 424-453.

  • Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares, Physica: Heidelberg.