Logistic Regression

Abstract

Logistic regression is used to predict the value of a binary dependent variable based on one or more independent variables measured on a metric or binary scale. It requires a dichotomous dependent variable (0/1), whereas linear regression assumes a dependent variable measured on a continuous metric scale.

This algorithm is currently in beta. Changes and additions are likely, and feedback is welcome.

Brief Description

As described by Hair, Black, Babin, and Anderson (2018), logistic regression is a specialized form of regression designed to predict and explain a binary dependent variable rather than a metric dependent variable. Its structure resembles that of multiple regression, representing a single multivariate relationship with coefficients that indicate the relative influence of each predictor. However, unlike in linear regression, the coefficients describe effects on the logit (the log odds of the outcome) rather than direct linear effects on the dependent variable. Logistic regression can accommodate both metric and non-metric (categorical) independent variables, with categorical variables typically represented using dummy-coded binary indicators.
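
The logit interpretation can be illustrated with a minimal Python sketch. It is not part of SmartPLS; the coefficient values and the function names `predicted_probability` and `odds_ratio` are hypothetical and chosen only for illustration. The sketch converts a logit into a predicted probability and shows that the exponentiated coefficient of a dummy-coded predictor equals the ratio of the odds between the two groups.

```python
import math

def predicted_probability(intercept, coefficients, values):
    """Convert the linear predictor (the logit) into a probability for y = 1."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coefficient)

# Hypothetical estimates (not SmartPLS output): an intercept, one metric
# predictor, and one dummy-coded (0/1) predictor.
b0 = -1.0
b = [0.5, 1.2]

p_reference = predicted_probability(b0, b, [2.0, 0.0])  # dummy = 0
p_dummy = predicted_probability(b0, b, [2.0, 1.0])      # dummy = 1

odds_reference = p_reference / (1.0 - p_reference)
odds_dummy = p_dummy / (1.0 - p_dummy)

print("P(y=1 | dummy=0):", round(p_reference, 3))
print("P(y=1 | dummy=1):", round(p_dummy, 3))
print("odds ratio from probabilities:", round(odds_dummy / odds_reference, 3))
print("odds ratio from coefficient:  ", round(odds_ratio(b[1]), 3))
```

The two odds ratios agree because a one-unit change in a predictor adds its coefficient to the logit, which multiplies the odds by the exponential of that coefficient. The change in probability, by contrast, depends on the values of the other predictors.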

Logistic regression in SmartPLS is based on the multiple regression framework used for linear regression but requires a binary dependent variable. The software provides estimates of the logistic regression coefficients, their significance, and various metrics for evaluating the model’s predictive accuracy. Estimation is performed using a maximum likelihood approach with Newton-Raphson iterations. Accordingly, the output also includes standard information on model fit.

An intercept is included in the model estimation when it is part of the graphical regression model. To estimate a regression model without an intercept in SmartPLS, remove the intercept from the graphical specification of the regression model (i.e., select the intercept in the graphic and delete it).

Logistic Regression Settings in SmartPLS

Test type

Specifies whether a one-sided or two-sided significance test is conducted.

Significance level

Specifies the significance level of the test statistic.
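
How the test type and the significance level interact can be sketched in Python. This is an illustration under stated assumptions, not a description of SmartPLS internals: it assumes a Wald-type z test against the standard normal distribution, and the coefficient, standard error, and function name `wald_p_value` are hypothetical.

```python
import math

def wald_p_value(coefficient, standard_error, two_sided=True):
    """p value of a Wald z test, assuming a standard normal reference distribution."""
    z = coefficient / standard_error
    # Upper-tail probability of |z| under the standard normal distribution.
    upper_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    # The one-sided variant assumes the hypothesized direction matches the
    # sign of the estimate.
    return 2.0 * upper_tail if two_sided else upper_tail

# Hypothetical estimate and standard error (z = 1.8).
coefficient, standard_error = 0.9, 0.5
significance_level = 0.05

p_two_sided = wald_p_value(coefficient, standard_error, two_sided=True)
p_one_sided = wald_p_value(coefficient, standard_error, two_sided=False)

print("two-sided significant:", p_two_sided < significance_level)
print("one-sided significant:", p_one_sided < significance_level)
```

With these hypothetical values, the same estimate is significant at the 5% level in the one-sided test but not in the two-sided test, which is why the test type should be chosen before inspecting the results.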

Maximum Iterations

The maximum number of iterations sets an upper limit on how many iterations the maximum likelihood (ML) estimation algorithm performs. This prevents the algorithm from running indefinitely in cases of nonconvergence. In most instances, the algorithm converges within a few iterations, depending on the precision specified by the stopping criterion. However, ML algorithms can occasionally encounter convergence issues, and in such cases, increasing the maximum number of iterations may help achieve convergence.

Stop Criterion

The algorithm terminates when the change in the log-likelihood (LnL) between two consecutive iterations falls below the specified stopping criterion or when the maximum number of iterations is reached.
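
The interplay of the two settings can be sketched with a minimal Newton-Raphson routine in plain Python. This is an illustrative sketch, not the SmartPLS implementation: the function names, the default values for `max_iterations` and `stop_criterion`, and the toy data are assumptions. The loop stops either when the change in the log-likelihood falls below the stop criterion or when the maximum number of iterations is reached.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def log_likelihood(rows, y, beta):
    """Binomial log-likelihood (LnL) of the logistic model."""
    total = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        # log(1 + exp(eta)), written in a numerically stable form.
        total += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def fit_logistic(rows, y, max_iterations=100, stop_criterion=1e-7):
    """Newton-Raphson ML estimation; returns (beta, lnl, iterations, converged)."""
    k = len(rows[0])
    beta = [0.0] * k
    lnl = log_likelihood(rows, y, beta)
    for iteration in range(1, max_iterations + 1):
        gradient = [0.0] * k
        information = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for i in range(k):
                gradient[i] += (yi - p) * x[i]
                for j in range(k):
                    information[i][j] += w * x[i] * x[j]
        # Newton-Raphson update: beta + information^{-1} * gradient.
        step = solve(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
        new_lnl = log_likelihood(rows, y, beta)
        if abs(new_lnl - lnl) < stop_criterion:
            return beta, new_lnl, iteration, True
        lnl = new_lnl
    return beta, lnl, max_iterations, False

# Toy data (hypothetical): the first column is the intercept, the second a metric predictor.
rows = [[1.0, float(v)] for v in range(1, 9)]
y = [0, 0, 0, 1, 0, 1, 1, 1]

beta, lnl, iterations, converged = fit_logistic(rows, y)
print("coefficients:", [round(b, 4) for b in beta])
print("LnL:", round(lnl, 4), "iterations:", iterations, "converged:", converged)
```

Calling `fit_logistic(rows, y, max_iterations=1)` returns with `converged` set to `False`, which mirrors the situation in which the iteration limit is reached before the stop criterion is met. The sketch omits safeguards that a production implementation would need, for example handling of perfectly separated data, where the ML estimates do not exist.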

Why beta?

SmartPLS has released the Logistic Regression algorithm as a beta version for the following reasons:

The current implementation should produce correct results and has undergone some basic testing, but extensive testing is not yet completed.
The current implementation is not yet finished and will include additional results and outputs in the future.
Considerable changes in the structure of the results reports are possible in the future.

References

Backhaus, K., Erichson, B., Gensler, S., Weiber, R., and Weiber, T. (2021). Multivariate Analysis: An Application-Oriented Introduction. Gabler: Wiesbaden.
Hair, J. F., Black, W. C., Babin, B. J., and Anderson, R. E. (2018). Multivariate Data Analysis (8th ed.). Cengage Learning: London.

Cite correctly

Please always cite the use of SmartPLS!

Ringle, Christian M., Wende, Sven, & Becker, Jan-Michael. (2024). SmartPLS 4. Bönningstedt: SmartPLS. Retrieved from https://www.smartpls.com