Logistic Regression

Abstract

Logistic regression is a model for predicting the value of a binary dependent variable based on one or more independent variables of metric or binary scale. It differs from linear regression in that it requires the dependent variable to be binary (i.e., coded 0/1), whereas linear regression requires a dependent variable of metric scale (i.e., continuous).

This algorithm is in the beta stage. Changes and additions are likely, and feedback is welcome.

Brief Description

As described by Hair, Black, Babin, and Anderson (2018), logistic regression analysis is a specialized form of regression that is designed to predict and explain a binary dependent variable rather than a metric dependent variable. The form of the logistic regression model is similar to multiple regression in that it represents a single multivariate relationship, with regression-like coefficients indicating the relative impact of each predictor variable. Unlike in linear regression, however, the coefficients do not represent linear relationships with the dependent variable itself; they describe a regression on logit values (log odds) and must be interpreted accordingly. For the independent variables, logistic regression allows both metric and nonmetric (categorical) variables, the latter in the form of dummy-coded binary variables.

Logistic regression in SmartPLS builds on the multiple regression model (i.e., the same model that is used for linear regression) but requires a binary dependent variable to be executable. SmartPLS reports the logistic regression coefficients, their significance, and several metrics for assessing the predictive accuracy of the model. For estimation, it uses a maximum likelihood approach with Newton-Raphson steps. Accordingly, the output also provides typical information regarding model fit.
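The interpretation of coefficients on the logit scale can be illustrated with a minimal Python sketch. The function names and the coefficient values below are hypothetical and chosen only for illustration; they are not SmartPLS output or part of any SmartPLS interface.

```python
import math

def predicted_probability(intercept, coefs, x):
    """Turn the linear predictor (log odds) into a probability for the outcome 1."""
    logit = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coef)

# Hypothetical coefficients for two predictors (illustration only).
intercept = -1.0
coefs = [0.8, -0.5]

# Probability of the outcome 1 for an observation with x1 = 1 and x2 = 0.
p = predicted_probability(intercept, coefs, [1.0, 0.0])

# Odds ratios: values above 1 raise the odds, values below 1 lower them.
ratios = [odds_ratio(b) for b in coefs]
print(round(p, 4), [round(r, 4) for r in ratios])
```

A coefficient of zero corresponds to an odds ratio of 1, that is, no change in the odds; the effect on the probability itself is not constant and depends on the values of all predictors.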

Logistic Regression Settings in SmartPLS

Test type

Specifies whether a one-sided or a two-sided significance test is conducted.

Significance level

Specifies the significance level (e.g., 0.05) against which the test statistic is evaluated.
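How the test type and the significance level interact can be sketched as follows. This sketch assumes a Wald z-test (coefficient divided by its standard error, compared with the standard normal distribution), which is a common choice for logistic regression; this page does not state which test statistic SmartPLS uses. The function names and the numbers are hypothetical.

```python
import math

def wald_p_value(coef, std_error, two_sided=True):
    """p-value of a Wald z-test for one coefficient (assumed test, see lead-in)."""
    z = coef / std_error
    # Upper-tail probability of the standard normal distribution at |z|.
    upper_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    if two_sided:
        return 2.0 * upper_tail
    # One-sided: assumes the hypothesized direction matches the estimated sign.
    return upper_tail

def is_significant(p_value, significance_level=0.05):
    """Compare the p-value with the chosen significance level."""
    return p_value < significance_level

# Hypothetical estimate and standard error (illustration only).
p_two = wald_p_value(0.45, 0.25, two_sided=True)
p_one = wald_p_value(0.45, 0.25, two_sided=False)
print(is_significant(p_two), is_significant(p_one))
```

With the same estimate, the one-sided p-value is half the two-sided one, so a coefficient can be significant under a one-sided test but not under a two-sided test at the same significance level.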

Maximum Iterations

The maximum number of iterations that the maximum likelihood (ML) estimation will perform. This setting ensures that the algorithm does not run forever in case of nonconvergence. In most cases, the algorithm converges within a few iterations (depending on the required precision of the stop criterion), but ML algorithms sometimes have difficulty converging. In such cases, allowing a higher number of iterations might be useful.

Stop Criterion

The algorithm stops when the change in the log-likelihood (LnL) between two consecutive iterations is less than this stop criterion value (or when the maximum number of iterations is reached).
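The roles of the maximum number of iterations and the stop criterion can be illustrated with a minimal Newton-Raphson sketch for a model with an intercept and one predictor. It is a simplified illustration under stated assumptions, not the SmartPLS implementation: the function names, starting values, default settings, and data are hypothetical.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood (LnL) of a logistic model with intercept b0 and slope b1."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        # Numerically stable form of y * eta - log(1 + exp(eta)).
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def fit_logistic(xs, ys, max_iterations=100, stop_criterion=1e-7):
    """Return (b0, b1, lnl, iterations, converged) from Newton-Raphson steps."""
    b0, b1 = 0.0, 0.0
    lnl_old = log_likelihood(b0, b1, xs, ys)
    for iteration in range(1, max_iterations + 1):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient with respect to b0
            g1 += (y - p) * x      # gradient with respect to b1
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add the inverse information matrix times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        lnl_new = log_likelihood(b0, b1, xs, ys)
        # Stop when the change in LnL falls below the stop criterion.
        if abs(lnl_new - lnl_old) < stop_criterion:
            return b0, b1, lnl_new, iteration, True
        lnl_old = lnl_new
    # The maximum number of iterations was reached without convergence.
    return b0, b1, lnl_old, max_iterations, False

# Hypothetical data with overlapping groups, so a finite ML solution exists.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, lnl, iterations, converged = fit_logistic(xs, ys)
print(converged, iterations)
```

If the groups were perfectly separated by the predictor, the coefficients would grow without bound and the loop would typically end only because the maximum number of iterations is reached; a looser stop criterion ends the loop earlier at the cost of precision.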

Why beta?

SmartPLS has released the Logistic Regression algorithm as a beta version for the following reasons:

  • The current implementation should produce correct results and has undergone some basic testing, but extensive testing is not yet completed.
  • The current implementation is not yet finished and will include additional results and outputs in the future.
  • Considerable changes in the structure of the results reports are possible in the future.

References

  • Backhaus, K., Erichson, B., Gensler, S., Weiber, R., and Weiber, T. (2021). Multivariate Analysis: An Application-Oriented Introduction. Gabler: Wiesbaden.

  • Hair, J. F., Black, W. C., Babin, B. J., and Anderson, R. E. (2018). Multivariate Data Analysis (8th ed.). Cengage Learning: London.

Please always cite the use of SmartPLS!

Ringle, Christian M., Wende, Sven, & Becker, Jan-Michael. (2024). SmartPLS 4. Bönningstedt: SmartPLS. Retrieved from https://www.smartpls.com