CB-SEM Bootstrapping

Abstract

Bootstrapping is a nonparametric procedure that can be used to test the statistical significance of various CB-SEM results (e.g., path coefficients).

This algorithm is in beta stage. Changes and additions are likely and feedback is welcome.

Bootstrapping Settings in SmartPLS

Subsamples

Bootstrapping randomly draws subsamples (with replacement) from the original data set. Each subsample contains the same number of observations as the original data set. To approximate the sampling distribution sufficiently well, the number of subsamples should be large. For an initial evaluation, a smaller number of bootstrap subsamples (e.g., 500) may be used. For the final results, however, one should use a large number of bootstrap subsamples (e.g., 5,000 or 10,000). Note that larger numbers of bootstrap subsamples increase calculation time.
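The resampling step can be illustrated with a minimal Python sketch. It is not the SmartPLS implementation: the function name bootstrap_estimates, the example data, and the use of the sample mean as the statistic are assumptions made for illustration. In CB-SEM, the statistic would be a model parameter (e.g., a path coefficient) re-estimated on each subsample.

```python
import random
import statistics

def bootstrap_estimates(data, statistic, n_subsamples, rng):
    """Return the statistic computed on each of n_subsamples bootstrap subsamples."""
    n = len(data)
    estimates = []
    for _ in range(n_subsamples):
        # Draw n observations with replacement, so each subsample
        # has the same size as the original data set.
        subsample = rng.choices(data, k=n)
        estimates.append(statistic(subsample))
    return estimates

# Hypothetical data; the mean stands in for a model parameter.
data = [2.1, 3.4, 1.9, 4.2, 3.3, 2.8, 3.9, 2.5]
rng = random.Random(42)
estimates = bootstrap_estimates(data, statistics.mean, 500, rng)

# The spread of the bootstrap estimates approximates the standard error.
standard_error = statistics.stdev(estimates)
print(f"bootstrap standard error: {standard_error:.3f}")
```

Raising n_subsamples from 500 to 5,000 or 10,000 lengthens the loop proportionally, which is why larger settings take more time.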

Amount of results

The Most important (faster) option returns a selection of the most important bootstrapping results.

The Complete (slower) option computes all available bootstrapping results and therefore takes more time to run.

Confidence interval method

This setting specifies the bootstrapping method used to estimate nonparametric confidence intervals. The following bootstrapping procedures are available: percentile (default), studentized, and bias-corrected and accelerated (BCa).
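As a rough illustration of the percentile method only, the following Python sketch reads the interval bounds directly from the sorted bootstrap estimates. The names quantile and percentile_interval, the linear-interpolation quantile convention, and the simulated estimates are assumptions; SmartPLS may use a different quantile rule, and the studentized and BCa methods are not shown.

```python
import random

def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def percentile_interval(estimates, significance_level=0.05):
    """Two-sided percentile interval: cut significance_level / 2 from each tail."""
    ordered = sorted(estimates)
    lower = quantile(ordered, significance_level / 2)
    upper = quantile(ordered, 1 - significance_level / 2)
    return lower, upper

# Simulated bootstrap estimates of a coefficient (hypothetical values).
rng = random.Random(1)
estimates = [rng.gauss(0.30, 0.10) for _ in range(5000)]
lower, upper = percentile_interval(estimates, 0.05)
print(f"95% percentile interval: [{lower:.3f}, {upper:.3f}]")
```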

Test type

This setting indicates whether significance is based on a one-sided or two-sided test. The choice has two effects: (1) It affects the width of the confidence interval. (2) It affects the calculation of p values.

For instance, in a two-sided test, the default specification of a 5% significance level corresponds to a 95% confidence interval, with a 2.5% probability of error in the left tail (i.e., below the lower bound) and a 2.5% probability of error in the right tail (i.e., above the upper bound).
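The effect of the test type on p values can be sketched in Python as follows. This is an illustration under stated assumptions, not the SmartPLS computation: the function name p_value is hypothetical, the p value is derived from a normal approximation using the bootstrap standard error, and the one-sided case assumes the estimate lies in the hypothesized direction.

```python
from statistics import NormalDist

def p_value(estimate, bootstrap_se, two_sided=True):
    """Normal-approximation p value for H0: parameter = 0 (assumed approach)."""
    z = abs(estimate) / bootstrap_se
    one_tail = 1.0 - NormalDist().cdf(z)
    # A two-sided test counts both tails; a one-sided test counts only one.
    return 2 * one_tail if two_sided else one_tail

# Hypothetical estimate and bootstrap standard error.
estimate, bootstrap_se = 0.27, 0.15
p_two = p_value(estimate, bootstrap_se, two_sided=True)
p_one = p_value(estimate, bootstrap_se, two_sided=False)
print(f"two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
```

With these example values, the same estimate falls below a 5% significance level in the one-sided test but not in the two-sided test.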

Significance level

This setting specifies the desired significance level for parameter tests and has two effects: (1) It determines the width of the confidence interval (e.g., the default specification of a 5% significance level corresponds to a 95% confidence interval); (2) It affects the highlighting of the p values in the results report (i.e., values below the specified significance level are displayed in green, while those above appear in red).

Random number generator

The algorithm randomly generates subsamples from the original data set, which requires a seed value for the random number generator. You have the option to choose between a random seed and a fixed seed.

The random seed produces different random numbers, and therefore different results, every time the algorithm is executed (this was the default and only option in SmartPLS 3).

The fixed seed uses a pre-specified seed value that is the same for every execution of the algorithm. Thus, it produces the same results if the same number of subsamples is drawn. It thereby addresses concerns about the replicability of research findings.
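The difference between the two seed options can be sketched in Python. The function draw_subsample_indices and the seed value 123 are hypothetical; Python's random module stands in for whatever generator SmartPLS uses internally.

```python
import random

def draw_subsample_indices(n_observations, n_subsamples, seed=None):
    """Return, for each subsample, the row indices drawn with replacement."""
    # seed=None lets Python pick a seed from the operating system,
    # which mimics the random seed option.
    rng = random.Random(seed)
    return [
        [rng.randrange(n_observations) for _ in range(n_observations)]
        for _ in range(n_subsamples)
    ]

# Fixed seed: repeated runs draw identical subsamples.
fixed_a = draw_subsample_indices(10, 3, seed=123)
fixed_b = draw_subsample_indices(10, 3, seed=123)
print(fixed_a == fixed_b)

# Random seed: repeated runs will almost certainly differ.
random_a = draw_subsample_indices(10, 3)
random_b = draw_subsample_indices(10, 3)
```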