PLS and Bootstrapping Problems

Why do I get n/a as a result for some or all parameters in SmartPLS?

You receive an n/a for a parameter (e.g., a path coefficient, loading, or weight) if no solution can be computed for it. This problem usually occurs during bootstrapping, when one or more bootstrap samples lead to an invalid solution (i.e., it was not possible to calculate a solution for the model or some of its parts). The result table then displays an n/a for the problematic parameter. As soon as an n/a result occurs among the bootstrap samples, it is no longer possible to calculate the mean and standard deviation of the bootstrap distribution. Therefore, the result table shows an n/a for these outcomes as well.

The following screenshot illustrates the problem:

[Screenshot: Consistent PLS Bootstrapping Results Table]

Note: To replicate the problem shown above, run SmartPLS 3, open the ECSI model, select the data set with 98 observations, and start the consistent bootstrapping with the default settings.

1) Why are invalid bootstrapping solutions not filtered out?

Inadmissible solutions always indicate a problem with the data or the model. The lack of results makes the researchers aware of the problem. With this information, the user can address the problem in order to create an adequate solution.

In addition, filtering out inadmissible solutions would distort the estimate of the sampling distribution generated by bootstrapping. The result would only consider the "good" bootstrap solutions. Consequently, the results of the bootstrapping routine may underestimate the actual variability of the parameter estimate (e.g., its standard error). This can lead to incorrect inferences from the model and the data (e.g., errors in the significance assessment).
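The difference between reporting n/a and silently filtering can be sketched in a few lines of Python. This is only an illustration and not SmartPLS code: the function names are hypothetical, n/a is represented by NaN, and the sketch assumes that a single n/a estimate is enough to make the summary statistics unavailable.

```python
import math
import statistics

def summarize(estimates):
    """Mean and standard deviation of bootstrap estimates.

    Assumption: if any estimate is n/a (NaN), both statistics are n/a.
    """
    if any(math.isnan(e) for e in estimates):
        return math.nan, math.nan
    return statistics.fmean(estimates), statistics.stdev(estimates)

def summarize_filtered(estimates):
    """Same summary, but computed only from the admissible estimates."""
    kept = [e for e in estimates if not math.isnan(e)]
    return statistics.fmean(kept), statistics.stdev(kept)

# Hypothetical estimates of one path coefficient; one sub-sample failed.
estimates = [0.4, math.nan, 0.6]
print(summarize(estimates))           # both values are NaN
print(summarize_filtered(estimates))  # looks fine, but hides the failure
```

The filtered summary returns ordinary numbers, so nothing in the output tells the user that some sub-samples could not be estimated. That is the behavior the unfiltered n/a is meant to avoid.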

2) What are potential problems?

When you use the consistent PLS (PLSc) bootstrapping:

The consistent PLS (PLSc) algorithm corrects the correlations of reflective constructs to make the results consistent with a factor model (Dijkstra and Henseler, 2015; Dijkstra and Schermelleh-Engel, 2014). To do so, it uses a reliability measure specific to the PLS context, termed rho_A (Dijkstra and Henseler, 2015). This reliability statistic is consistent in PLS (i.e., it approaches the true reliability under the usual regularity conditions of a common factor model as the sample size increases to infinity). However, it may produce inadmissible solutions with smaller sample sizes and when the common factor assumptions do not hold (e.g., the construct follows a composite model; see Rigdon, 2016; Rigdon, Sarstedt, and Ringle, 2017; Sarstedt et al., 2016). In these cases, the reliability estimate can fall outside the permissible range of 0 to 1 (Takane and Hwang, 2018). If it is negative, the correction does not work at all, because it requires taking the square root of rho_A, which is not defined for negative values. Extreme positive values can also cause estimation problems if the resulting corrected correlations fall outside the interval from -1 to 1.

The problem has also been compared to Heywood cases in CB-SEM, where the method estimates negative variances (which are, of course, impossible).
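The two failure modes of the correction can be sketched as follows. The sketch assumes the standard correction for attenuation, in which the correlation between two reflective constructs is divided by the square root of the product of their rho_A values. The function name and the example numbers are hypothetical, and the code is not taken from SmartPLS.

```python
import math

def corrected_correlation(r, rho_a_x, rho_a_y):
    """Correct a construct correlation r for attenuation.

    Returns None (standing in for n/a) when the result is inadmissible.
    """
    # A non-positive rho_A has no usable square root for the correction.
    if rho_a_x <= 0 or rho_a_y <= 0:
        return None
    corrected = r / (math.sqrt(rho_a_x) * math.sqrt(rho_a_y))
    # A corrected correlation outside [-1, 1] is not a valid correlation.
    if abs(corrected) > 1:
        return None
    return corrected

print(corrected_correlation(0.5, 0.8, 0.8))   # admissible
print(corrected_correlation(0.5, -0.1, 0.8))  # negative rho_A: None
print(corrected_correlation(0.9, 0.7, 0.8))   # corrected value above 1: None
```

In a bootstrap run, each sub-sample yields its own rho_A estimates, so a model that is admissible on the original data can still hit either branch in some sub-samples.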

PLSc is very strict about the common factor model assumptions. Deviations from these assumptions likely lead to estimation problems and thus inadmissible solutions. Hence, the researcher may revisit the common factor model assumption of reflective constructs. If appropriate, the constructs could be treated and estimated as composites. However, when assuming common factor models, a simple recommendation to avoid these problems is to use a larger sample size.

When you use the standard PLS bootstrapping:

Inadmissible solutions occur much less frequently with standard PLS algorithms than with PLSc. The occurrence of this problem is almost always associated with either (a) an almost perfect collinearity in the model or (b) a variable with zero variance.

Neither problem may appear in the original data, yet both can manifest during bootstrapping. Bootstrapping is a random process that draws observations from the original sample with replacement to create a bootstrap sub-sample. The parameter estimates across all sub-samples represent the sampling distribution. However, due to the random nature of bootstrapping, some sub-samples may show extreme characteristics.
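A small simulation illustrates how sub-samples can be more extreme than the original data. It assumes the usual bootstrap convention of drawing as many observations as the original sample contains, with replacement. The data set, the function names, and the number of sub-samples are hypothetical.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) observations from rows with replacement."""
    n = len(rows)
    return [rows[rng.randrange(n)] for _ in range(n)]

def constant_columns(rows):
    """Return the indices of columns that take a single value in rows."""
    columns = zip(*rows)
    return [j for j, column in enumerate(columns) if len(set(column)) == 1]

# Hypothetical data: column 1 is a dummy variable with one rare category.
data = [(3, 0), (4, 0), (5, 0), (2, 0), (4, 1), (1, 0)]
rng = random.Random(42)  # fixed seed so the run is repeatable

flagged = 0
for _ in range(1000):
    if constant_columns(bootstrap_sample(data, rng)):
        flagged += 1
print(f"{flagged} of 1000 sub-samples contain a zero-variance variable")
```

No column is constant in the original six observations, yet a substantial share of the sub-samples never draws the single observation with the rare category.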

For example, strong multi-collinearity problems on the overall sample may become perfect collinearity if only those observations are drawn that are perfectly collinear. In these cases you either need to cure the collinearity problem or try to increase the sample size.

Similarly, inadmissible solutions may occur if the model contains variables that have a variance close to zero (i.e., the same value for almost every respondent). In particular, if the responses to a variable are almost identical, it is possible to draw only observations that have the same value, which results in zero variance on that variable in this sub-sample. The standardization of the data in PLS-SEM includes dividing the values by the standard deviation of the variable, so a zero variance means a division by zero, which leads to an inadmissible solution. This event is likely to occur if you have a very homogeneous group in which some variables have little variance, or if you include the grouping variable as an indicator in the model and run a multi-group analysis. After grouping, this variable has the same value for every observation within a group and therefore no variance. In addition, dummy variables (i.e., zero-one variables) for which one of the categories is very rare can cause such bootstrapping problems. Moreover, the problem is more pronounced with small sample sizes, where the likelihood of drawing such a sample is higher than with large samples. In all of these cases, you should check your model for variables with low variance and, if possible, exclude them or increase the sample size.
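For a dummy variable, the risk can be worked out directly. The sketch below assumes that each sub-sample has the same size n as the original sample and is drawn with replacement, so a sub-sample is constant when it contains only ones or only zeros. The function names and the figure of 5000 sub-samples are assumptions made for illustration.

```python
def prob_zero_variance(n, k):
    """Probability that one bootstrap sub-sample of size n is constant,
    given a dummy variable with k ones among n original observations."""
    p = k / n
    return p ** n + (1 - p) ** n

def prob_any_failure(n, k, subsamples):
    """Probability that at least one of the sub-samples is constant."""
    return 1 - (1 - prob_zero_variance(n, k)) ** subsamples

# The rare category makes up 5% of the sample in each scenario.
for n, k in [(20, 1), (200, 10), (2000, 100)]:
    print(n, k, prob_zero_variance(n, k), prob_any_failure(n, k, 5000))
```

With 20 observations and a single rare case, roughly 36% of the sub-samples have zero variance on the dummy variable. With the same 5% share in 200 observations, a single sub-sample almost never fails, although the chance that at least one of 5000 sub-samples fails is still noticeable.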

When you use the PLS (or PLSc) fit-indices:

Some PLS-SEM fit indices have limitations in their general applicability for the evaluation of models. For example, they are not defined for models that use repeated indicators (e.g., when estimating higher-order models in PLS-SEM). This type of model constellation includes perfect correlations of 1 in the indicator correlation matrix (because the same indicator is used twice and correlates perfectly with itself). In this case, the calculation of some fit indices is not possible.
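The effect of a repeated indicator on the correlation matrix can be checked numerically. The sketch below uses hypothetical data and helper functions. It shows that the matrix becomes singular (determinant of zero), which is one plausible reason, assumed here rather than stated above, why calculations that rely on inverting the matrix break down.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix of a list of indicator columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

def determinant(m):
    """Determinant by Laplace expansion (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * determinant(minor)
    return total

# Hypothetical indicators; x1 enters the model twice (repeated indicator).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r_plain = correlation_matrix([x1, x2])
r_repeated = correlation_matrix([x1, x2, x1])
print(determinant(r_plain))     # clearly non-zero
print(determinant(r_repeated))  # zero up to rounding: singular matrix
```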

References

  • Dijkstra, T. K., & Henseler, J. (2015). Consistent Partial Least Squares Path Modeling. MIS Quarterly, 39(2), 297-316.

  • Dijkstra, T. K., & Schermelleh-Engel, K. (2014). Consistent Partial Least Squares for Nonlinear Structural Equation Models. Psychometrika, 79(4), 585-604.

  • Takane, Y., & Hwang, H. (2018). Comparisons Among Several Consistent Estimators of Structural Equation Models. Behaviormetrika, 45(1), 157-188.

  • Sarstedt, M., Hair, J. F., Ringle, C. M., Thiele, K. O., & Gudergan, S. P. (2016). Estimation Issues with PLS and CBSEM: Where the Bias Lies! Journal of Business Research, 69(10), 3998-4010.

  • Rigdon, E. E., Sarstedt, M., & Ringle, C. M. (2017). On Comparing Results from CB-SEM and PLS-SEM. Five Perspectives and Five Recommendations. Marketing ZFP, 39(3), 4-16.

  • Rigdon, E. E. (2016). Choosing PLS Path Modeling as Analytical Method in European Management Research: A Realist Perspective. European Management Journal, 34(6), 598-605.
