Bootstrapping

Abstract

Bootstrapping is a nonparametric procedure that allows testing the statistical significance of various PLS-SEM results such as path coefficients, Cronbach’s alpha, HTMT, and R² values.

Brief Description

PLS-SEM does not assume that the data are normally distributed, which implies that parametric significance tests (e.g., as used in regression analyses) cannot be applied to test whether coefficients such as outer weights, outer loadings, and path coefficients are significant. Instead, PLS-SEM relies on a nonparametric bootstrap procedure (Efron and Tibshirani, 1986; Davison and Hinkley, 1997) to test the significance of these estimated coefficients.

In bootstrapping, subsamples are created with randomly drawn observations from the original set of data (with replacement). Each subsample is then used to estimate the PLS path model. This process is repeated until a large number of random subsamples has been created, typically about 5,000.

The parameter estimates (e.g., outer weights, outer loadings, and path coefficients) obtained from the subsamples are used to derive standard errors for the estimates. With this information, t-values are calculated to assess each estimate's significance.
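The resampling logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an ordinary least squares slope stands in for a PLS path coefficient (the PLS-SEM algorithm itself is not implemented here), and the function names `slope` and `bootstrap_t` as well as the simulated data are hypothetical.

```python
import random
import statistics

def slope(xs, ys):
    """OLS slope of y on x; a simple stand-in for a PLS path coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_t(xs, ys, n_subsamples=500, seed=1):
    """Return (original estimate, bootstrap standard error, t-value)."""
    rng = random.Random(seed)
    n = len(xs)
    original = slope(xs, ys)
    estimates = []
    for _ in range(n_subsamples):
        # Draw n observations with replacement from the original data.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    # The standard deviation of the subsample estimates is the standard error.
    std_error = statistics.stdev(estimates)
    return original, std_error, original / std_error

# Hypothetical data: y depends on x with a true slope of 0.5 plus noise.
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + data_rng.gauss(0, 1) for x in xs]

estimate, std_error, t_value = bootstrap_t(xs, ys)
print(f"estimate={estimate:.3f}  SE={std_error:.3f}  t={t_value:.2f}")
```

The t-value is the original-sample estimate divided by the standard deviation of the subsample estimates; a larger `n_subsamples` makes that standard deviation more stable at the cost of computation time.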

Hair et al. (2017) explain bootstrapping in more detail.

Bootstrapping Settings in SmartPLS

Subsamples

In bootstrapping, subsamples are created with observations randomly drawn from the original set of data (with replacement). To ensure stability of results, the number of subsamples should be large.

For an initial assessment, one may wish to choose a smaller number of bootstrap subsamples (e.g., 500) to be randomly drawn and estimated with the PLS-SEM algorithm, since that requires less time. For the final results preparation, however, one should use a large number of bootstrap subsamples (e.g., 5,000).

Note: Larger numbers of bootstrap subsamples increase the computation time.

Do Parallel Processing

If chosen, the bootstrapping algorithm will be performed on multiple processor cores (if your computer offers more than one). As each subsample can be calculated independently of the others, subsamples can be computed in parallel. Using parallel computing will reduce computation time.
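The independence of the subsamples can be illustrated with a small Python sketch. It is not how SmartPLS distributes its work; the sample mean stands in for the model estimation, and all names (`DATA`, `estimate_subsample`, `run_serial`, `run_parallel`) are hypothetical. A thread pool is used so that the sketch runs anywhere; for CPU-bound work in CPython, a process pool would be the variant that actually spreads the load across cores.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data set with 100 observations.
_data_rng = random.Random(7)
DATA = [_data_rng.gauss(10, 2) for _ in range(100)]

def estimate_subsample(seed):
    """Draw one bootstrap subsample (seeded by its index) and estimate the mean."""
    rng = random.Random(seed)
    resample = rng.choices(DATA, k=len(DATA))  # sampling with replacement
    return statistics.fmean(resample)

def run_serial(n_subsamples):
    return [estimate_subsample(seed) for seed in range(n_subsamples)]

def run_parallel(n_subsamples, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with the serial run.
        return list(pool.map(estimate_subsample, range(n_subsamples)))

serial = run_serial(200)
parallel = run_parallel(200)
print(serial == parallel)  # True: each subsample depends only on its own seed
```

Because no subsample needs the result of another, the order and the number of workers do not affect the estimates, only the elapsed time.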

Sign Changes

Sets the method for dealing with sign changes during the bootstrapping iterations. The following options are available:

(1) No Sign Changes (default)

Sign changes in the resamples are ignored and the subsample estimates are taken as they are. This is the most conservative option: it results in larger standard errors and, consequently, lower t-values.

(2) Construct Level Changes

The signs of a group of coefficients (e.g., all outer loadings of a specific latent variable) in a bootstrapping subsample are compared with the signs of the original PLS path model estimation. If the majority of signs need to be reversed in a bootstrap run to match the signs of the model estimation using the original sample, all signs are reversed in that bootstrap run. Otherwise, no signs are changed.

(3) Individual Changes

This option reverses signs if an estimate for a bootstrap sample results in a different sign compared to that resulting from the original sample. Thus, the signs in the measurement and structural models of each bootstrap sample are made consistent with the signs in the original sample.
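The three options above can be compared on a single group of coefficients. The following Python sketch follows the descriptions in this section; the function names and the loading values are hypothetical, and it does not reproduce SmartPLS's internal code.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def no_sign_changes(original, subsample):
    """Option 1: take the subsample estimates as they are."""
    return list(subsample)

def construct_level_changes(original, subsample):
    """Option 2: reverse the whole group only if most signs differ."""
    differing = sum(1 for o, b in zip(original, subsample) if sign(o) * sign(b) < 0)
    if differing > len(original) / 2:
        return [-b for b in subsample]
    return list(subsample)

def individual_changes(original, subsample):
    """Option 3: reverse every estimate whose sign differs from the original."""
    return [-b if sign(o) * sign(b) < 0 else b for o, b in zip(original, subsample)]

# Hypothetical outer loadings of one latent variable.
original = [0.80, 0.70, 0.60]
subsample = [-0.75, -0.72, 0.10]  # two of three signs differ from the original

print(no_sign_changes(original, subsample))          # [-0.75, -0.72, 0.1]
print(construct_level_changes(original, subsample))  # [0.75, 0.72, -0.1]
print(individual_changes(original, subsample))       # [0.75, 0.72, 0.1]
```

Note that the construct-level option reverses the third loading as well, because the group is treated as a unit, whereas the individual option leaves it untouched.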

Test Type

Specifies if a one-sided or two-sided significance test is conducted.

Significance Level

Specifies the significance level of the test statistic.
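How the test type and the significance level interact can be shown with a short Python sketch. It assumes a standard normal approximation for the t-value, which is an illustrative simplification and not necessarily the reference distribution SmartPLS uses; the function names `p_value` and `is_significant` are hypothetical.

```python
from statistics import NormalDist

def p_value(t_value, test_type="two-sided"):
    """Approximate p-value of a t-value using the standard normal distribution.

    For the one-sided test, the estimate is assumed to point in the
    hypothesized direction.
    """
    upper_tail = 1.0 - NormalDist().cdf(abs(t_value))
    if test_type == "one-sided":
        return upper_tail
    if test_type == "two-sided":
        return 2.0 * upper_tail
    raise ValueError("test_type must be 'one-sided' or 'two-sided'")

def is_significant(t_value, significance_level=0.05, test_type="two-sided"):
    """Compare the p-value with the chosen significance level."""
    return p_value(t_value, test_type) < significance_level

t = 1.80
print(round(p_value(t, "one-sided"), 4), is_significant(t, 0.05, "one-sided"))
print(round(p_value(t, "two-sided"), 4), is_significant(t, 0.05, "two-sided"))
```

With a t-value of 1.80 and a 5% significance level, the estimate is significant in the one-sided test but not in the two-sided test, because the two-sided p-value is twice as large.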

Confidence Interval Method

Sets the bootstrapping method used for estimating nonparametric confidence intervals. The following bootstrapping procedures are available (see the Wikipedia article on bootstrapping):

  1. Percentile Bootstrap
  2. Studentized Bootstrap
  3. Bias-Corrected and Accelerated (BCa) Bootstrap (default)
  4. Davison and Hinkley's Double Bootstrap
  5. Shi's Double Bootstrap

By default, we advise using the "Bias-Corrected and Accelerated (BCa) Bootstrap", as it is the most stable method that does not need excessive computing time.
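To make the difference between the first and the third option more concrete, the following Python sketch computes a percentile interval and a BCa interval for the same bootstrap estimates. It follows the textbook BCa formulas (bias correction from the share of bootstrap estimates below the original estimate, acceleration from jackknife estimates) and is not SmartPLS's implementation; the sample mean stands in for a model parameter, and all names and data are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def quantile(ordered, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def percentile_ci(boot, alpha=0.05):
    """Percentile bootstrap: cut the bootstrap distribution at alpha/2 and 1-alpha/2."""
    ordered = sorted(boot)
    return quantile(ordered, alpha / 2), quantile(ordered, 1 - alpha / 2)

def bca_ci(data, estimator, boot, alpha=0.05):
    """BCa bootstrap: same distribution, but with adjusted cut points."""
    nd = NormalDist()
    theta_hat = estimator(data)
    # Bias correction: share of bootstrap estimates below the original estimate.
    share_below = sum(b < theta_hat for b in boot) / len(boot)
    z0 = nd.inv_cdf(share_below)
    # Acceleration: skewness of the leave-one-out (jackknife) estimates.
    jack = [estimator(data[:i] + data[i + 1:]) for i in range(len(data))]
    jack_mean = statistics.fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6.0 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den else 0.0

    def adjusted(q):
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1.0 - accel * (z0 + z)))

    ordered = sorted(boot)
    return (quantile(ordered, adjusted(alpha / 2)),
            quantile(ordered, adjusted(1 - alpha / 2)))

# Hypothetical right-skewed data and 2,000 bootstrap estimates of the mean.
rng = random.Random(3)
data = [rng.expovariate(1.0) for _ in range(80)]
boot = [statistics.fmean(rng.choices(data, k=len(data))) for _ in range(2000)]

pct_low, pct_high = percentile_ci(boot)
bca_low, bca_high = bca_ci(data, statistics.fmean, boot)
print(f"percentile: [{pct_low:.3f}, {pct_high:.3f}]")
print(f"BCa:        [{bca_low:.3f}, {bca_high:.3f}]")
```

Both intervals reuse the same set of bootstrap estimates; BCa only needs the additional jackknife pass, which is why its computing time stays moderate compared with the double bootstrap procedures.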

ATTENTION: The Double Bootstrap procedures need much more computing time than the standard Bootstrap because each resample is resampled again. Hence, a calculation that needs 100 seconds for the standard Bootstrap will need roughly 1,000 seconds for the Double Bootstrap, as these procedures estimate (number of subsamples × number of subsamples) PLS path models.

References

  • Hair, J. F., Hult, G. T. M., Ringle, C. M., and Sarstedt, M. (2017). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM), 2nd Ed., Sage: Thousand Oaks.

  • Davison, A. C., and Hinkley, D. V. (1997). Bootstrap Methods and Their Application, Cambridge University Press: Cambridge.

  • Efron, B., and Tibshirani, R. J. (1993). An Introduction to the Bootstrap, Chapman & Hall: New York.
