PLS-SEM Glossary

Adapted from the primer book on PLS-SEM by Hair, Hult, Ringle, and Sarstedt and the book on advanced PLS-SEM issues by Hair, Sarstedt, Ringle, and Gudergan.

10 times rule: one way to determine the minimum sample size specific to the PLS path model that one needs for model estimation (i.e., 10 times the number of independent variables of the most complex ordinary least squares regression in the structural model or any formative measurement model). The 10 times rule is not a reliable indication of sample size requirements in PLS-SEM and should at best be seen as a rough estimate. While statistical power analyses provide more reliable minimum sample size estimates, researchers should primarily draw on the inverse square root method, which stands out in terms of precision and ease of use.

Absolute contribution: is the information an indicator variable provides about the formatively measured construct, ignoring all other indicators. The absolute contribution is provided by the loading of the indicator (i.e., its bivariate correlation with the formatively measured construct).

Absolute importance: see Absolute contribution.

AIC: see Akaike's information criterion.

AIC3: see Modified AIC with factor 3.

AIC4: see Modified AIC with factor 4.

Akaike weights: are the weight of evidence in favor of a certain model being the best model for the situation at hand given a set of alternative models.
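
As a minimal sketch of how Akaike weights are commonly computed, the example below rescales each model's AIC relative to the lowest AIC in the set and normalizes the resulting relative likelihoods. The function name and the AIC values are hypothetical and serve illustration only.

```python
import math

def akaike_weights(aic_values):
    """Return one Akaike weight per candidate model (weights sum to 1)."""
    best = min(aic_values)
    # Difference between each model's AIC and the lowest AIC in the set.
    deltas = [aic - best for aic in aic_values]
    # Relative likelihood of each model given the data.
    relative_likelihoods = [math.exp(-0.5 * delta) for delta in deltas]
    total = sum(relative_likelihoods)
    return [value / total for value in relative_likelihoods]

# Hypothetical AIC values for three alternative models.
weights = akaike_weights([100.0, 102.0, 110.0])
```

The model with the lowest AIC receives the largest weight; the weights can be read as the relative evidence for each model within the compared set.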

Akaike's information criterion: an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller AIC value for a certain number of segments indicates a better fit. The AIC has a strong tendency to overestimate the correct number of segments.

Algorithmic options: offer different ways to run the PLS-SEM algorithm by, for example, selecting between alternative starting values, stop values, weighting schemes, and maximum number of iterations.

Alpha inflation: refers to the fact that the more tests you conduct at a certain significance level, the more likely you are to claim a significant result when this is not so (i.e., a Type I error).
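
The size of this inflation can be illustrated with a short calculation. The sketch below assumes that the tests are independent, which is a simplification; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Ten tests at the 5% level yield a familywise error rate of about 0.40.
rate = familywise_error_rate(0.05, 10)
```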

Alternating extreme pole responses: is a suspicious survey response pattern where a respondent uses only the extreme poles of the scale (e.g., a 7-point scale) in an alternating order to answer the questions.

Artifacts: are human-made concepts, which are typically measured with formative indicators.

Attribute: an element of the construct definition. It defines the general type of property to which the focal construct refers, such as an attitude.

AVE: see Average variance extracted.

Average variance extracted: the degree to which the construct explains the variance of its indicators.

Average variance extracted: a measure of convergent validity. It is the degree to which a latent construct explains the variance of its indicators; see Communality (construct).
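
For illustration, the AVE of a reflectively measured construct can be computed as the mean of its squared standardized outer loadings. The sketch below uses hypothetical loadings and a hypothetical function name.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized outer loadings of one construct."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)

# Hypothetical standardized loadings of three indicators.
ave = average_variance_extracted([0.7, 0.8, 0.9])
```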

Bandwidth-fidelity tradeoff: a practical dilemma resulting from the tradeoff between using measures that will cover the majority of variation in a trait (domain-level measurement) or measures that will assess a few specific traits (facet-level measurement) more precisely.

Bayesian information criterion: an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller BIC value for a certain number of segments indicates a better fit. The criterion generally performs well for identifying the number of segments in FIMIX-PLS.

Bayesian information criterion: a criterion for model selection among an alternative set of models. The model with the lowest BIC is preferred.

Bias-corrected and accelerated (BCa) bootstrap confidence intervals: is a method for constructing confidence intervals, which adjusts for biases and skewness in the bootstrap distribution. The method yields very low Type I errors but is limited in terms of statistical power.

Bias-corrected and Bonferroni-adjusted confidence intervals: a confidence interval type used for testing the significance of multiple tetrads considered per measurement model.

BIC: see Bayesian information criterion (BIC).

Blindfolding: is a sample reuse technique that omits singular elements of the data matrix and uses the model estimates to predict the omitted part. It is used to compute the Q² statistic.

Bonferroni correction: a method used to counteract the increase in the familywise error rate when performing multiple comparisons across several groups of data.
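
A minimal sketch of the correction: the significance level is divided by the number of comparisons, and each p value is compared against this adjusted level. The function name and p values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag which null hypotheses are rejected at the adjusted level."""
    adjusted_alpha = alpha / len(p_values)
    return [p <= adjusted_alpha for p in p_values]

# Three comparisons: the adjusted level is 0.05 / 3, so only the first
# p value leads to a rejection.
decisions = bonferroni_reject([0.001, 0.02, 0.04])
```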

Bootstrap cases: make up the number of observations drawn in every bootstrap run. The number is set equal to the number of valid observations in the original data set.

Bootstrap confidence interval: provides an estimated range of values that is likely to include an unknown population parameter. It is determined by its lower and upper bounds, which depend on a predefined probability of error and the standard error of the estimation for a given set of sample data. When zero does not fall into the confidence interval, an estimated parameter can be assumed to be significantly different from zero for the prespecified probability of error (e.g., 5%).
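
The following sketch shows one simple way to derive such an interval, the percentile method, from a list of bootstrap estimates, and to check whether zero falls into it. The index rule used for the bounds is one of several possible conventions, and the function names and estimates are hypothetical.

```python
def percentile_interval(bootstrap_estimates, error_probability=0.05):
    """Lower and upper bounds taken from the sorted bootstrap estimates."""
    ordered = sorted(bootstrap_estimates)
    n = len(ordered)
    lower_index = int(n * error_probability / 2)
    upper_index = int(n * (1 - error_probability / 2)) - 1
    return ordered[lower_index], ordered[upper_index]

def excludes_zero(interval):
    """True if zero lies outside the interval (significant parameter)."""
    lower, upper = interval
    return lower > 0 or upper < 0

# Hypothetical bootstrap estimates of a path coefficient.
estimates = [i / 100 for i in range(1, 101)]
interval = percentile_interval(estimates)
```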

Bootstrap MGA: compares the bootstrap estimates of a parameter across two groups. By counting the number of occurrences where the bootstrap estimate of the first group is larger than those of the second group, the approach derives a p value for a one-tailed test.

Bootstrap samples: are the number of samples drawn in the bootstrapping procedure. Generally, 10,000 or more samples are recommended.

Bootstrapping: is a resampling technique that draws a large number of subsamples from the original data (with replacement) and estimates models for each subsample. It is used to determine standard errors of coefficients to assess their statistical significance without relying on distributional assumptions.
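
The resampling logic can be sketched as follows. To keep the example self-contained, a sample mean stands in for a PLS path model coefficient; the function name, data, and default settings are hypothetical.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic, n_samples=1000, seed=42):
    """Estimate the standard error of a statistic by resampling."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_samples):
        # Draw n observations with replacement from the original data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The standard deviation of the estimates is the bootstrap standard error.
    return statistics.stdev(estimates), estimates

# Hypothetical data; every subsample has as many cases as the original.
data = [2.1, 3.4, 2.9, 4.0, 3.3, 2.7, 3.8, 3.1]
standard_error, estimates = bootstrap_standard_error(data, statistics.mean)
```

Dividing the original estimate by this standard error gives an empirical t value.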

Bottleneck table: a tabular representation of the ceiling lines (typically the CR-FDH ceiling line). It shows, row by row, how a certain outcome level of the dependent construct and the corresponding necessary minimum condition level of an independent construct have been achieved.

Bottom-up approach: a way to establish an HCM in which several latent variables (the LOCs) are combined into a single, more abstract construct (the HOC).

CAIC: see Consistent AIC.

Cascaded moderator analysis: is a type of moderator analysis in which the strength of a moderating effect is influenced by another variable (i.e., the moderating effect is again moderated).

Casewise deletion: an entire observation (i.e., a case or respondent) is removed from the data set because of missing data.

Categorical moderator variable: see Multigroup analysis.

Categorical scale: see Nominal scale.

Causal indicators: an indicator type used in formative measurement models. Constructs measured with causal indicators have an error term, which implies that the indicators do not fully form the construct.

Causal links: are predictive relationships in which the constructs on the left predict the constructs to the right.

CB-SEM: see Covariance-based structural equation modeling.

CCA: see Confirmatory composite analysis.

Ceiling envelopment free disposal hull (CE-FDH) line: given by the scatterplot of the values of the independent (x-axis) and dependent construct (y-axis). The outermost values of this relationship (lowest x-value and highest y-value) are associated with a step function that represents the CE-FDH line.

Ceiling regression free disposal hull (CR-FDH) line: given by the scatterplot of the values of the independent (x-axis) and dependent construct (y-axis). The outermost values of this relationship (lowest x-value and highest y-value) are associated with a linear regression function. The CR-FDH line is given by a simple linear regression line through the data points that constitute the ceiling envelopment free disposal hull (CE-FDH) line.

Centroid weighting scheme: uses in the first stage of the PLS-SEM algorithm a value of +1 or -1 for relationships between the constructs in the structural model depending on the sign of their correlations; see Weighting scheme.

Cluster analysis: a class of methods that partitions a set of objects with the goal of obtaining high similarity within the formed groups and high dissimilarity between the groups.

Clustering: is a class of methods that partition a set of objects with the goal of obtaining high similarity within the formed groups and high dissimilarity across groups.

Coding: is the assignment of numbers to scales in a manner that facilitates measurement.

Coefficient of determination: is a measure of the proportion of an endogenous construct's variance that is explained by its predictor constructs. It indicates a model's explanatory power with regard to a specific endogenous construct.
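
As an illustration, R² can be computed from the actual and predicted scores of an endogenous construct as one minus the ratio of unexplained to total variation. The function name and inputs are hypothetical.

```python
def r_squared(actual, predicted):
    """Share of the variance in the actual scores that is explained."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total
```

Perfect predictions give a value of 1, while always predicting the mean gives a value of 0.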

Collect model: an HCM type in which the HOC is a combination of several specific LOCs representing more concrete components that form the general concept. The relationship between HOC and LOCs is formative.

Collinearity: arises when two variables are highly correlated.

Common factor model: assumes that only the variance shared by the indicators used to measure a construct (i.e., the common variance) should be used to estimate the construct and its relationship with other constructs in a model. Exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and CB-SEM (also referred to as common factor-based SEM) are the three main types of analyses based on common factor models.

Common factor-based SEM: is a type of SEM method, which considers the constructs as common factors that explain the covariation between their associated indicators.

Common variance: the variance that an indicator shares with other indicators in the measurement model of a construct.

Communality (construct): see Average variance extracted.

Communality (item): see Indicator reliability.

Competitive mediation: a situation in mediation analysis that occurs when the indirect effect and the direct effect are both significant and point in opposite directions.

Complementary mediation: a situation in mediation analysis that occurs when the indirect effect and the direct effect are both significant and point in the same direction.

Composite indicators: an indicator type used in formative measurement models. Constructs measured with composite indicators have no error term, which implies that the indicators fully form the construct.

Composite indicators: are a type of indicator used in formative measurement models. Composite indicators form the construct (or composite) fully by means of linear combinations.

Composite model approach: an approach to estimating construct proxies. Its objective is to account for the total variance in the observed indicators rather than to explain the correlations between the indicators.

Composite reliability (rC): is a measure of internal consistency reliability, which, unlike Cronbach's alpha, does not assume equal indicator loadings. It should be above 0.70 (in exploratory research, 0.60 to 0.70 is considered acceptable).
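
A minimal sketch of the usual computation from standardized outer loadings, where each indicator's error variance is taken as one minus its squared loading. The function name and loadings are hypothetical.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized outer loadings."""
    squared_sum = sum(loadings) ** 2
    # Error variance of a standardized indicator: 1 - loading squared.
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)

# Hypothetical loadings of three indicators: yields roughly 0.84.
reliability = composite_reliability([0.7, 0.8, 0.9])
```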

Composite score: see Construct scores.

Composite variable: is a linear combination of several variables.

Composite-based SEM: is a type of SEM method, which represents the constructs as composites, formed by linear combinations of sets of indicator variables.

Compositional invariance: exists when the composite scores are equal across groups.

Conceptual variables: broad ideas or thoughts about abstract concepts that researchers establish and propose to measure in their research.

Conditional indirect effect: see Moderated mediation.

Conditional process models: combine mediation and moderation analysis. See mediated moderation and moderated mediation.

Confidence interval: provides the lower and upper limit of values within which a true population parameter will fall with a certain probability (e.g., 95%). In PLS-SEM, the construction of the interval relies on bootstrapping standard errors.

Configural invariance: exists when constructs are equally parameterized and estimated across groups.

Confirmatory composite analysis: is a set of analyses used to verify the quality of a composite measurement of a theoretical concept of interest.

Confirmatory tetrad analysis for PLS-SEM (CTA-PLS): allows statistical testing of whether a measurement model is reflective or formative.

Confirmatory tetrad analysis: is a statistical procedure that allows for empirically testing the measurement model setup (i.e., whether the measures should be specified reflectively or formatively).

Confirmatory: describes applications that aim at empirically testing theoretically developed models.

Consistency at large: describes an improvement of precision of PLS-SEM results when both the number of indicators per measurement model and the number of observations increase, assuming that the data stem from a common factor model.

Consistent AIC: an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller CAIC value for a certain number of segments indicates a better fit. The criterion should be used in conjunction with AIC3 to determine the number of segments in FIMIX-PLS.

Consistent PLS-SEM: a variant of the standard PLS-SEM algorithm, which provides consistent model estimates that disattenuate the correlations between pairs of latent variables, thereby mimicking CB-SEM results.

Construct definition: the specific way in which a conceptual variable is measured in a particular study, and could differ from one study to another.

Construct scores: are columns of data (vectors) for each latent variable that represent a key result of the PLS-SEM algorithm. The length of every vector equals the number of observations in the data set used.

Constructs: measure, by means of (multiple) items, theoretical concepts that are abstract, complex, and cannot be directly observed. Constructs are represented in path models as circles or ovals and are also referred to as latent variables.

Content specification: is the specification of the scope of the latent variable; that is, the domain of content the indicators are intended to capture.

Content validity: is a subjective but systematic evaluation of how well the domain content of a construct is captured by its indicators.

Continuous moderator variable: is a variable that affects the direction and / or strength of the relation between an exogenous latent variable and an endogenous latent variable. Continuous moderator variables can also be used to generate categories, which serve as basis for a subsequent multigroup analysis.

Control variables: are the variables that researchers seek to keep constant when conducting research.

Convergence: is reached when the results of the PLS-SEM algorithm do not change much. In that case, the PLS-SEM algorithm stops when a prespecified stop criterion (i.e., a small number such as 0.00001) that indicates the minimal changes of PLS-SEM computations has been reached. Thus, convergence has been accomplished when the PLS-SEM algorithm stops because the prespecified stop criterion has been reached and not the maximum number of iterations.

Convergent validity: is the degree to which a reflectively specified construct explains the variance of its indicators (see Average variance extracted). In formative measurement model evaluation, convergent validity refers to the degree to which the formatively measured construct correlates positively with an alternative (reflective or single-item) measure of the same construct (see Redundancy analysis).

Correlation weights: see Mode A.

Covariance-based structural equation modeling: is an approach for estimating structural equation models, which assumes that the concepts of interest can be represented by common factors. Covariance-based structural equation modeling (CB-SEM) can be used for theory testing but has clear limitations in terms of testing a model's predictive power.

Coverage error: occurs when the bootstrapping confidence interval of a parameter does not correspond to its empirical confidence interval.

Critical t value: is the cutoff or criterion on which the significance of a coefficient is determined. If the empirical t value is larger than the critical t value, the null hypothesis of no effect is rejected. Typical critical t values are 2.57, 1.96, and 1.65 for significance levels of 1%, 5%, and 10%, respectively (two-tailed tests).
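
The cutoffs listed above can be reproduced with the Python standard library. The sketch assumes a standard normal approximation, which is reasonable for large samples; the function name is hypothetical.

```python
from statistics import NormalDist

def critical_value(significance_level, two_tailed=True):
    """Critical value under a standard normal approximation."""
    tail = significance_level / 2 if two_tailed else significance_level
    return NormalDist().inv_cdf(1 - tail)

# Two-tailed cutoffs for significance levels of 1%, 5%, and 10%.
cutoffs = [critical_value(level) for level in (0.01, 0.05, 0.10)]
```

The results are approximately 2.58, 1.96, and 1.64, which match the rounded values given in the definition.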

Critical value: see Significance testing.

Cronbach's alpha: a measure of internal consistency reliability that assumes equal indicator loadings. Cronbach's alpha represents a conservative measure of internal consistency reliability.
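
As a sketch, the standardized form of Cronbach's alpha can be computed from the number of indicators and their average inter-item correlation. The function name and input values are hypothetical.

```python
def cronbach_alpha_standardized(n_items, mean_inter_item_correlation):
    """Standardized Cronbach's alpha from the average item correlation."""
    k = n_items
    r = mean_inter_item_correlation
    return (k * r) / (1 + (k - 1) * r)

# Three indicators with an average correlation of 0.5 give 0.75.
alpha = cronbach_alpha_standardized(3, 0.5)
```

Holding the average correlation constant, alpha increases with the number of indicators.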

Cross-loadings: an indicator's correlation with other constructs in the model.

Cross-validated communality: is used to obtain the Q² value based on the prediction of the omitted data points by means of the underlying measurement model (see Blindfolding).

Cross-validated redundancy: is used to obtain the Q² value based on the prediction of the omitted data points by means of the underlying structural model and measurement model (see Blindfolding).

CTA-PLS: see Confirmatory tetrad analysis.

Cubic effect: a nonlinear relationship represented by a polynomial of the degree 3; see also Nonlinear effect and Polynomial.

Data matrix: includes the empirical data that are needed to estimate the PLS path model. The data matrix must have one column for every indicator in the PLS path model. The rows represent the observations with their responses to every indicator on the PLS path model.

Degrees of freedom (df): is the number of values in the final calculation of the test statistic that are free to vary.

Diagonal lining: is a suspicious survey response pattern in which a respondent uses the available points on a scale (e.g., a 7-point scale) to place the answers to the different questions on a diagonal line.

Direct effect: is a relationship linking two constructs with a single arrow between the two.

Direct-only nonmediation: a situation in mediation analysis that occurs when the direct effect is significant but not the indirect effect.

Disattenuated correlation: the correlation between two constructs, if they were perfectly measured (i.e., if they were perfectly reliable).
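
The classical correction for attenuation illustrates the idea: the observed correlation is divided by the square root of the product of the two constructs' reliabilities. The function name and values below are hypothetical.

```python
from math import sqrt

def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Correlation corrected for measurement unreliability."""
    return observed_r / sqrt(reliability_x * reliability_y)

# An observed correlation of 0.6 with reliabilities of 0.8 and 0.9
# corresponds to a disattenuated correlation of about 0.71.
corrected = disattenuated_correlation(0.6, 0.8, 0.9)
```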

Discriminant validity: is the extent to which a construct is empirically distinct from other constructs in the model.

Disjoint two-stage approach: uses only the lower-order components of a higher-order construct in its first stage to compute the construct scores, which serve as indicators of the higher-order component in the second stage. The approach proves particularly useful for estimating formative (i.e., reflective-formative and formative-formative) higher-order constructs.

Effect indicators: see Reflective measurement.

Embedded two-stage approach: uses the entire higher-order construct in its first stage to compute the construct scores, which serve as indicators of the higher-order component in the second stage.

Embedded two-stage approach: uses the LOCs' construct scores estimated in the first stage via the repeated indicators approach as indicators of the HOC in the second stage. The approach proves particularly useful for estimating formative (i.e., reflective-formative and formative-formative) higher-order constructs.

Empirical t value: is the test statistic value obtained from the data set at hand (here: the bootstrapping results). See Significance testing.

EN: see Normed entropy statistic.

Endogeneity: occurs when a predictor construct is correlated with the error term of the dependent construct to which it is related.

Endogenous constructs: see Endogenous latent variables.

Endogenous latent variables: serve only as dependent variables or as both independent and dependent variables in a structural model.

Equality of composite mean values and variances: the final requirement for establishing full measurement invariance.

Equidistance: is given when the distance between data points of a scale is identical.

Equidistant scale: a scale in which the intervals are distributed in equal units. For example, a 5-point Likert scale with two negative categories (completely disagree and disagree), a neutral option, and two positive categories (agree and completely agree) can be considered an equidistant scale.

Error terms: capture the unexplained variance in constructs and indicators when path models are estimated.

Evaluation criteria: are used to evaluate the quality of the measurement models and the structural model results in PLS-SEM based on a set of nonparametric evaluation criteria and procedures such as bootstrapping and blindfolding.

Ex post analysis: aims at identifying one or more explanatory variable(s) that match the latent class segmentation results in the best possible way to facilitate a multigroup analysis.

Exact fit test: a model fit test, which applies bootstrapping to derive p values of the Euclidean distance (dL) or geodesic distance (dG) between the observed correlations and the model-implied correlations. Research has shown that these measures are largely unsuitable for detecting model misspecification in situations commonly encountered in applied research.

Exogenous constructs: see Exogenous latent variables.

Exogenous latent variables: are latent variables that serve only as independent variables in a structural model.

Expectation-maximization (EM) algorithm: an iterative algorithm for finding maximum likelihood estimates of parameters in a statistical model. The algorithm alternates between performing an expectation (E) step and a maximization (M) step. The E step creates a function for the expectation of the log-likelihood, which is evaluated using the current estimate of the parameters. The M step computes parameters by maximizing the expected log-likelihood found in the E step. The E and M steps are successively applied until the results stabilize.
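
The alternation of E and M steps can be sketched with a small example. The code below fits a two-component univariate Gaussian mixture, which is a deliberately simple stand-in and not the mixture regression setup of FIMIX-PLS; the function names, starting values, and data are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def em_two_gaussians(data, n_iter=50):
    """Fit a two-component Gaussian mixture with the EM algorithm."""
    # Crude starting values: smallest and largest observation as means.
    mu = [min(data), max(data)]
    sd = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(n_iter):
        # E step: membership probabilities given the current parameters.
        resp = []
        for x in data:
            dens = [pi[k] * normal_pdf(x, mu[k], sd[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M step: update the parameters using the membership probabilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            # Lower bound on the standard deviation avoids a collapse to zero.
            sd[k] = max(math.sqrt(var), 1e-3)
    return pi, mu, sd

# Hypothetical data with two clearly separated groups.
mixing, means, sds = em_two_gaussians([0.9, 1.0, 1.1, 4.9, 5.0, 5.1])
```

A fixed number of iterations replaces a proper stop criterion here to keep the sketch short.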

Explained variance: see Coefficient of determination (R²).

Explaining and predicting (EP) theories: are a type of theory that involves understanding underlying causes and prediction as well as describing theoretical constructs and the relationships between them.

Explanatory power: provides information about the strength of the assumed causal relationships in a PLS path model. The primary measure for assessing a PLS path model's explanatory power is the coefficient of determination (R²).

Exploratory: describes applications that focus on exploring data patterns and identifying relationships.

Extended repeated indicators approach: is a method for estimating formatively specified higher-order constructs whose higher-order component serves as an endogenous construct in the PLS path model.

Extended repeated indicators approach: links the antecedent construct of a higher-order component in a reflective-formative or formative-formative higher-order construct with the lower-order components. The effect of the antecedent construct on the higher-order component is equal to its direct effect plus the sum of all indirect effects via the lower-order components.

ƒ² effect size: is a measure used to assess the relative impact of a predictor construct on an endogenous construct in terms of its explanatory power.
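
A minimal sketch of the computation: the model is estimated once with and once without the predictor construct, and the change in R² is related to the unexplained variance of the full model. The function name and R² values are hypothetical.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size of a predictor based on the change in R-squared."""
    return (r2_included - r2_excluded) / (1 - r2_included)

# R-squared drops from 0.50 to 0.40 when the predictor is omitted,
# which corresponds to an effect size of 0.20.
effect = f_squared(0.50, 0.40)
```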

Factor (score) indeterminacy: means that one can compute an infinite number of sets of factor scores matching the specific requirements of a certain common factor model. In contrast to their explicit estimation in PLS-SEM, the scores of common factors as assumed in CB-SEM are indeterminate.

Factor weighting scheme: uses the correlations between constructs in the structural model to determine their relationships in the first stage of the PLS-SEM algorithm; see Weighting scheme.

Factor-based SEM: see Covariance-based structural equation modeling.

Familywise error rate: the probability of making one or more Type I errors when performing multiple comparisons across several groups of data.

FIMIX-PLS: see Finite mixture partial least squares.

Finite mixture partial least squares: is a latent class approach that allows for identifying and treating unobserved heterogeneity in PLS path models. The approach applies mixture regressions to simultaneously estimate group-specific parameters and observations' probabilities of segment membership.

First-generation techniques: are statistical methods traditionally used by researchers, such as regression and analysis of variance.

Focal object: an element of the construct definition. It refers to the entity to which an attribute is applied.

Forced-choice scale: a scale without a neutral category.

Formative measurement model: is a type of measurement model setup in which the indicators form the construct, and arrows point from the indicators to the construct. The outer weights estimation of formative measurement models usually uses Mode B in PLS-SEM.

Formative measurement: see Formative measurement model.

Formative measures: see Formative measurement model.

Formative-formative higher-order construct: has formatively measured lower-order components and path relationships from the lower-order components to the higher-order component.

Formative-reflective higher-order construct: has formatively measured lower-order components and path relationships from the higher-order component to the lower-order components.

Fornell-Larcker criterion: a measure of discriminant validity that compares the square root of each construct's average variance extracted with its correlations with all other constructs in the model. The Fornell-Larcker criterion is largely unsuitable for detecting discriminant validity problems.
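
For illustration only, given the limitations noted above, the comparison can be sketched as follows. The function name, construct names, and values are hypothetical.

```python
from math import sqrt

def fornell_larcker_check(ave, correlations):
    """Check per construct whether sqrt(AVE) exceeds all its correlations.

    ave: dict mapping construct name to AVE.
    correlations: dict mapping (construct, construct) pairs to correlations.
    """
    result = {}
    for construct, value in ave.items():
        related = [abs(r) for pair, r in correlations.items() if construct in pair]
        result[construct] = all(sqrt(value) > r for r in related)
    return result

# Hypothetical constructs A and B with a correlation of 0.75.
check = fornell_larcker_check({"A": 0.64, "B": 0.49}, {("A", "B"): 0.75})
```

In this example, construct A meets the criterion (0.80 > 0.75) while construct B does not (0.70 < 0.75).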

Full measurement invariance: confirmed when (1) configural invariance, (2) compositional invariance, and (3) equality of composite mean values and variances are demonstrated.

Full mediation: a situation in mediation analysis that occurs when the mediated effect is significant but not the direct effect. Hence, the mediator variable fully explains the relationship between an exogenous and an endogenous latent variable. Full mediation is also referred to as indirect-only mediation.

Gaussian copula approach: is a method for diagnosing and treating endogeneity, which directly models the correlation of an antecedent construct with its endogenous construct's error term.

Genetic algorithm segmentation in PLS-SEM: is a distance-based segmentation method in PLS-SEM that builds on genetic algorithms, a search heuristic, which aims to find a good (not necessarily the best) solution for the classification problem.

Geweke and Meese criterion: is a criterion for model selection among a set of alternative models. The model with the lowest GM is preferred.

GM: see Geweke and Meese criterion.

GoF: see Goodness-of-fit index.

Goodness-of-fit index: has been developed as an overall measure of model fit for PLS-SEM. However, as the goodness-of-fit index (GoF) cannot reliably distinguish valid from invalid models and since its applicability is limited to certain model setups, researchers should avoid its use.

Heterogeneity: occurs when the data comprise groups that are characterized by significant differences in terms of model parameters. Heterogeneity can be either observed or unobserved, depending on whether its source can be traced back to observable characteristics (e.g., demographic variables) or whether the sources of heterogeneity are not fully known.

Heterotrait-heteromethod correlations: are the correlations of the indicators across constructs measuring different constructs.

Heterotrait-monotrait ratio: is a measure of discriminant validity. The heterotrait-monotrait ratio (HTMT) is the mean of all correlations of indicators across constructs measuring different constructs (i.e., the heterotrait-heteromethod correlations) relative to the (geometric) mean of the average correlations of indicators measuring the same construct (i.e., the monotrait-heteromethod correlations).
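
The ratio described above can be sketched for two constructs as follows. The example assumes that each construct has at least two indicators and that the indicator correlation matrix is available as a nested list; the function name, index lists, and correlations are hypothetical.

```python
from itertools import combinations
from math import sqrt

def htmt(corr, items_a, items_b):
    """HTMT of two constructs from an indicator correlation matrix."""
    def mean(values):
        return sum(values) / len(values)

    # Correlations between indicators of different constructs.
    hetero = [corr[i][j] for i in items_a for j in items_b]
    # Correlations between indicators of the same construct.
    mono_a = [corr[i][j] for i, j in combinations(items_a, 2)]
    mono_b = [corr[i][j] for i, j in combinations(items_b, 2)]
    return mean(hetero) / sqrt(mean(mono_a) * mean(mono_b))

# Hypothetical correlations: indicators 0 and 1 measure construct A,
# indicators 2 and 3 measure construct B.
corr = [
    [1.00, 0.50, 0.25, 0.25],
    [0.50, 1.00, 0.25, 0.25],
    [0.25, 0.25, 1.00, 0.50],
    [0.25, 0.25, 0.50, 1.00],
]
ratio = htmt(corr, [0, 1], [2, 3])
```

Here the average heterotrait correlation of 0.25 relative to a geometric mean of 0.50 for the monotrait correlations gives a ratio of 0.5.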

Hierarchical common factor model: see Reflective-reflective HCM.

Hierarchical component model: see Higher-order construct.

Higher-order component: represents a more abstract dimension of a concept in a higher-order construct.

Higher-order construct: represents a higher-order structure (usually second-order) that contains several layers of constructs and involves a higher level of abstraction. Higher-order constructs involve a more abstract higher-order component related to two or more lower-order components in a reflective or formative way.

Higher-order model: see Higher-order construct.

Hill climbing: a heuristic optimization method that belongs to the family of local search procedures. The method starts with a random partition into a prespecified number of segments and iteratively improves this solution one element at a time until it arrives at a (locally) optimized solution.

HOC: see Higher-order construct.

Holdout sample: is a subset of a larger data set or a separate data set not used in model estimation.

HTMT: see Heterotrait-monotrait ratio.

Hypothesized relationships: are proposed explanations for constructs that define the path relationships in the structural model. The PLS-SEM results enable researchers to statistically test these hypotheses and thereby empirically substantiate the existence of the proposed path relationships.

Importance: is a term used in the context of the importance-performance map analysis (IPMA). It is equivalent to the unstandardized total effect of some latent variable on the target variable.

Importance-performance map: is a graphical representation of the importance-performance map analysis.

Importance-performance map analysis: extends the standard PLS-SEM results reporting of path coefficient estimates by adding a dimension to the analysis that considers the average values of the latent variable scores. More precisely, the importance-performance map analysis (IPMA) contrasts structural model total effects on a specific target construct with the average latent variable scores of this construct's predecessors.

Importance-performance matrix: see Importance-performance map analysis (IPMA).

Inconsistent mediation: see Competitive mediation.

Index of moderated mediation: quantifies the effect of a moderator on the indirect effect of an exogenous construct on an endogenous construct through a mediator.

Index: is a set of formative indicators used to measure a construct.

Indicator reliability: is the square of a standardized indicator's outer loading. It represents how much of the variation in an item is explained by the construct and is referred to as the variance extracted from the item; see Communality (item).

Indicators: are directly measured observations (raw data), also referred to as either items or manifest variables, which are represented in path models as rectangles. They are also available data (e.g., responses to survey questions or collected from company databases) used in measurement models to measure the latent variables.

Indirect effect: represents a relationship between two latent variables via a third (e.g., mediator) construct in the PLS path model. If p1 is the relationship between the exogenous latent variable and the mediator variable, and p2 is the relationship between the mediator variable and the endogenous latent variable, the indirect effect is the product of path p1 and path p2.

Indirect-only mediation: a situation in mediation analysis that occurs when the indirect effect is significant but not the direct effect. Hence, the mediator variable fully explains the relationship between an exogenous and an endogenous latent variable. Indirect-only mediation is also referred to as full mediation.
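
The mediation situations defined in this glossary can be combined into a simple decision rule. In the sketch below, the significance flags are assumed to come from a separate procedure such as bootstrapping; the function name is hypothetical, and the label for the case in which neither effect is significant follows common usage rather than an entry shown here.

```python
def classify_mediation(p1, p2, direct, indirect_significant, direct_significant):
    """Classify the mediation type from path estimates and significance."""
    # The indirect effect is the product of the two paths via the mediator.
    indirect = p1 * p2
    if indirect_significant and direct_significant:
        if indirect * direct > 0:
            return "complementary mediation"
        return "competitive mediation"
    if indirect_significant:
        return "indirect-only mediation"
    if direct_significant:
        return "direct-only nonmediation"
    # Assumed label; this case is not defined in the entries above.
    return "no-effect nonmediation"

# Hypothetical estimates: positive indirect effect, negative direct effect.
result = classify_mediation(0.4, 0.5, -0.3, True, True)
```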

Individual mediating effect: a type of mediating effect in a multiple mediation model that considers only one mediator.

Information criteria: statistical measures of the relative quality of a certain segment solution that contrast the fit (i.e., the likelihood) of a solution and the number of parameters used to achieve that fit, which increase with the number of segments. The smaller the value of a certain information criterion, the better the segmentation solution.
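
The trade-off between fit and number of parameters can be made concrete with the standard formulas of several criteria named in this glossary. The sketch below assumes that the log-likelihood, the number of parameters, and the sample size of a segment solution are known; the function name and input values are hypothetical.

```python
import math

def information_criteria(log_likelihood, n_parameters, n_observations):
    """Common information criteria for one segment solution."""
    fit = -2.0 * log_likelihood
    k = n_parameters
    n = n_observations
    return {
        "AIC": fit + 2 * k,
        "AIC3": fit + 3 * k,
        "AIC4": fit + 4 * k,
        "BIC": fit + k * math.log(n),
        "CAIC": fit + k * (math.log(n) + 1),
    }

# Hypothetical solution: log-likelihood of -100, 5 parameters, 100 cases.
criteria = information_criteria(-100.0, 5, 100)
```

The criteria differ only in how strongly they penalize additional parameters.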

Initial values: are the values for the relationships between the latent variables and the indicators in the first iteration of the PLS-SEM algorithm. Since the user typically has no information about which indicators are more important and which are less important in each measurement model, an equal weight for every indicator in the PLS path model serves well for the initialization of the PLS-SEM algorithm. Accordingly, all relationships in the measurement models have an initial value of +1.

Inner model: see Structural model.

In-sample predictive power: see Coefficient of determination.

Interaction effect: see Moderating effect.

Interaction term: is an auxiliary variable entered into the path model to account for the interaction of the moderator variable and the exogenous construct.

Internal consistency reliability: is a form of reliability used to judge the consistency of results across items on the same test. It determines whether the items measuring a construct are similar in their scores (i.e., if the correlations between the items are strong).

Interpretational confounding: is a situation in which the empirical meaning of a construct departs from the theoretically implied meaning.

Interval scale: can be used to provide a rating of objects and has a constant unit of measurement so the distance between the scale points is equal.

Inverse square root method: is a method for determining the minimum sample size requirement, which uses the value of the path coefficient with the minimum magnitude in the PLS path model as input.
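
The entry names only the input of the method. The sketch below assumes the commonly cited form of the rule, in which the minimum sample size must exceed ((z_(1-alpha) + z_power) / |p_min|)^2, with a one-tailed significance level alpha and a target statistical power. That formula, the default values, and the function name are assumptions that are not stated in this glossary.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(p_min, alpha=0.05, power=0.80):
    """Approximate minimum sample size for the smallest path coefficient.

    Assumed rule: n_min > ((z_(1 - alpha) + z_power) / |p_min|) ** 2.
    """
    if p_min == 0:
        raise ValueError("the minimum path coefficient must be non-zero")
    z = NormalDist()  # standard normal distribution
    constant = z.inv_cdf(1 - alpha) + z.inv_cdf(power)  # about 2.486 by default
    return ceil((constant / abs(p_min)) ** 2)

# Hypothetical minimum path coefficient of 0.2 (illustrative value only).
print(min_sample_size(0.2))
```

Smaller expected path coefficients raise the requirement quickly, because the coefficient enters the formula in squared form in the denominator.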

IPMA: see Importance-performance map analysis.

Items: see Indicators.

Iterative reweighted regressions segmentation: is a particularly fast and high-performing distance-based segmentation method for PLS-SEM.

Jangle fallacy: describes the inference that two measures (e.g., scales) with different names measure different constructs.

Joint mediating effect: A type of mediating effect in a multiple mediation model which considers the total indirect effect of an exogenous on an endogenous construct via all mediators.

k-fold cross-validation: is a model validation technique for assessing how the results of a PLS-SEM analysis will generalize to an independent data set. The technique splits the data into k subsets and combines k-1 subsets into a single training sample that is used to predict the remaining subset (the holdout sample). The procedure is repeated until each subset has served as the holdout sample once.
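
The splitting logic can be sketched as follows. This is a minimal illustration that only produces the training and holdout index sets; the model estimation and prediction steps are omitted, and the function name and the random seed are hypothetical.

```python
import random

def k_fold_splits(n_obs, k, seed=0):
    """Return k (training, holdout) pairs of observation indices."""
    if not 2 <= k <= n_obs:
        raise ValueError("k must be between 2 and the number of observations")
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)  # random assignment to subsets
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for holdout in range(k):
        # Combine the other k-1 subsets into a single training sample.
        training = [idx for f, fold in enumerate(folds)
                    if f != holdout for idx in fold]
        splits.append((sorted(training), sorted(folds[holdout])))
    return splits

for training, holdout in k_fold_splits(n_obs=10, k=5):
    print("train:", training, "holdout:", holdout)
```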

k-means clustering: a group of nonhierarchical clustering algorithms that work by partitioning observations into a predefined number of groups and then iteratively reassigning observations until some numeric goal related to cluster distinctiveness is met.

Kurtosis: is a measure of whether the distribution is too peaked (a very narrow distribution with most of the responses in the center).

Label switching: occurs when the label of a specific segment changes from one FIMIX-PLS run to the next.

Latent class techniques: are a class of approaches that facilitate uncovering unobserved heterogeneity. Different approaches have been proposed, which draw on, for example, finite mixture, genetic algorithm, or hill-climbing approaches to PLS-SEM.

Latent variable scores: see Construct scores.

Latent variable: see Constructs.

Linear effect: represented by a straight line when plotted as a graph.

Linear regression model (LM) benchmark: is a benchmark used in PLSpredict, derived from regressing an endogenous construct's indicators on the indicators of all exogenous constructs. The LM benchmark thereby neglects the measurement model and structural configurations. PLS-SEM results are assumed to outperform the LM benchmark.

Linear relationship: see Linear effect.

Listwise deletion: see Casewise deletion.

LOC: see Lower-order components.

Log transformation: a type of data transformation to account for nonlinear relationships, which applies a base 10 logarithm to every observation.

Lower-order components: represent the more concrete subdimensions of a concept in a higher-order construct.

MAE: see Mean absolute error.

Main effect: refers to the direct effect between an exogenous and an endogenous construct in the path model without the presence of a moderating effect. After inclusion of the moderator variable, the main effect typically changes in magnitude. Therefore, it is commonly referred to as simple effect in the context of a moderator model.

Manifest variables: see Indicators.

Maximum number of iterations: is needed to ensure that the PLS-SEM algorithm stops. The goal is to reach convergence. But if convergence cannot be reached, the algorithm should stop after a certain number of iterations. This maximum number of iterations (e.g., 300) should be sufficiently high to allow the PLS-SEM algorithm to converge.

Mean absolute error: is a metric used in PLSpredict, defined as the average of the absolute differences between the predictions and the actual observations, with all the individual differences having equal weight.
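
A minimal sketch of this computation is given below. The function name and the data are hypothetical; in practice the predictions would come from a holdout-sample procedure.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute prediction errors, all weighted equally."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observations and predictions (illustrative values only).
print(mean_absolute_error([3, 5, 4], [2, 5, 6]))
```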

Mean value replacement: inserts the sample mean for the missing data. Should only be used when indicators have less than 5% missing values.
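
The following sketch applies this treatment to a single indicator and enforces the 5% guideline from the entry. Representing missing values as None and raising an error when the guideline is violated are assumptions made for illustration.

```python
from statistics import mean

def mean_value_replacement(values, max_missing_share=0.05):
    """Replace missing entries (None) of one indicator with the mean."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("the indicator has no observed values")
    missing_share = 1 - len(observed) / len(values)
    if missing_share >= max_missing_share:
        raise ValueError("too many missing values for mean replacement")
    fill = mean(observed)  # mean of the observed values only
    return [fill if v is None else v for v in values]

# Hypothetical indicator with 1 missing value out of 41 (about 2.4%).
data = [1.0] * 20 + [3.0] * 20 + [None]
print(mean_value_replacement(data)[-1])
```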

Measurement equivalence: see Measurement invariance.

Measurement error: is the difference between the true value of a variable and the value obtained by a measurement.

Measurement invariance of composite models (MICOM) procedure: is a series of tests to assess invariance of measures (constructs) across multiple groups of data. The procedure comprises three steps that test different aspects of measurement invariance: (1) configural invariance (i.e., equal parameterization and way of estimation), (2) compositional invariance (i.e., equal indicator weights), and (3) equality of composite mean values and variances.

Measurement invariance: refers to whether or not, under different conditions of observing and studying phenomena (e.g., across different groups of respondents), measurement operations yield measures of the same attribute.

Measurement model misspecification: describes the use of a reflective measurement model when it should be formative or the use of a formative measurement model when it should be reflective. Measurement model misspecification usually yields invalid results and misleading conclusions.

Measurement model: is an element of a path model that contains the indicators and their relationships with the constructs and is also called the outer model in PLS-SEM.

Measurement scale: is a tool with a predetermined number of closed-ended responses that can be used to obtain an answer to a question.

Measurement theory: specifies how constructs should be measured with (a set of) indicators. It determines which indicators to use for construct measurement and the directional relationship between construct and indicators.

Measurement: is the process of assigning numbers to a variable based on a set of rules.

Mediated moderation: combines a moderator model with a mediation model in that the continuous moderating effect is mediated.

Mediating effect: occurs when a third construct intervenes between two other related constructs.

Mediation model: see Mediation.

Mediation: represents a situation in which a mediator variable to some extent absorbs the effect of a latent variable on an endogenous latent variable in the PLS path model.

Mediation: represents a situation in which one or more mediator construct(s) explain the processes through which an exogenous construct influences an endogenous construct.

Mediator construct: is a construct, which intervenes between two other directly related constructs.

Mediator model: see Mediation.

Metric scale: a type of measurement scale that has a constant unit of measurement so that the distance between the scale points is equal (interval scale), thereby allowing for the interpretation of the scale points' absolute differences. In case the scale additionally has an absolute zero point, ratios among the scale points can be interpreted (ratio scale).

Metric scale: represents data on a ratio scale and interval scale; see Ratio scale, Interval scale.

Metrological uncertainty: is the dispersion of the measurement values that can be attributed to the object or concept being measured.

MICOM: see Measurement invariance of composite models (MICOM) procedure.

MIMIC model: see Multiple indicators and multiple causes model.

Minimum Description Length 5 (MDL5): an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller MDL5 value for a certain number of segments indicates a better fit. The MDL5 has a strong tendency to underestimate the correct number of segments.

Minimum sample size requirements: is the number of observations needed to represent the underlying population and to meet the technical requirements of the multivariate analysis method used. See Inverse square root method.

Missing value treatment: can employ different methods such as mean replacement, EM (expectation-maximization algorithm), and nearest neighbor to obtain values for missing data points in the set of data used for the analysis. As an alternative, researchers may consider deleting cases with missing values (i.e., casewise deletion).

Mode A: uses correlation weights to compute composite scores from sets of indicators. More specifically, the outer weights are the correlation (or single regression) between the construct and each of its indicators. See Reflective measurement.

Mode B: uses regression weights to compute composite scores from sets of indicators. To obtain the weights, the construct is regressed on its indicators. Hence, the outer weights in Mode B are the coefficients of a multiple regression model. See Formative measurement.

Model comparisons: involve establishing and empirically comparing a set of theoretically justified competing models that represent alternative explanations of the phenomenon under research.

Model complexity: indicates how many latent variables, structural model relationships, and indicators exist in a PLS path model.

Model overfit: occurs when the model estimates fit the data set used for model estimation but do not generalize well to other data sets.

Model parsimony: see Parsimonious models.

Model selection criteria: see Information criteria.

Model-implied nonredundant vanishing tetrads: are tetrads considered for significance testing in CTA-PLS.

Moderated mediation: combines a mediation model with a moderator model in that the mediator relationship is moderated by a continuous moderator construct.

Moderating effect: see Moderation.

Moderation: occurs when the effect of a latent variable on an endogenous latent variable depends on the values of a third variable, referred to as a moderator variable.

Moderation: occurs when the effect of an exogenous latent variable on an endogenous latent variable depends on the values of a third variable, referred to as a moderator variable, which impacts the relationship.

Moderator effect: see Moderation.

Moderator variable: see Moderation.

Modified AIC with factor 3 (AIC3): an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller AIC3 value for a certain number of segments indicates a better fit. The criterion should be used in conjunction with CAIC to determine the number of segments in FIMIX-PLS.

Modified AIC with factor 4 (AIC4): an information criterion that allows assessing the relative goodness-of-fit of a segmentation solution produced by FIMIX-PLS. A smaller AIC4 value for a certain number of segments indicates a better fit. The criterion generally performs well for identifying the number of segments in FIMIX-PLS.

Monotrait-heteromethod correlations: are the correlations of indicators measuring the same construct.

Multicollinearity: see Collinearity.

Multigroup analysis (MGA): tests whether parameters (mostly path coefficients) differ significantly between two groups. Research has proposed a range of approaches to multigroup analysis, which rely on the bootstrapping or permutation procedure.

Multigroup analysis: is a type of moderator analysis where the moderator variable is categorical (usually with two categories) and is assumed to potentially affect all relationships in the structural model; it tests whether parameters (mostly path coefficients) differ significantly between two groups. Research has proposed a range of approaches to multigroup analysis, which rely on the bootstrapping or permutation procedure.

Multimethod MGA approach: involves applying different MGA approaches.

Multiple battery model: an HCM type in which the LOCs measure the same construct at different points of time using data from the same set of respondents.

Multiple indicators and multiple causes (MIMIC) model: a type of structural equation model that incorporates both formative and reflective indicators to measure latent constructs.

Multiple mediation analysis: describes a mediation analysis in which multiple mediator variables are being included in the model.

Multiple moderator model: describes a moderation analysis in which multiple moderators are being included in the model.

Multiple testing problem: occurs when the Type I error of a series of tests increases exponentially.

Multivariate analysis: is a statistical method that simultaneously analyzes multiple variables.

NCA: see Necessary condition analysis.

Necessary condition analysis: contrasts dependent and independent variables to identify necessary condition thresholds that have to be met to obtain a certain outcome. This analysis supports researchers in identifying necessary conditions for their outcomes.

Necessity effect size d: indicates whether a construct is necessary for achieving a certain outcome. A necessary condition should have at least a small effect size d (i.e., ≥ 0.1), which is significant.

Necessity logic: implies that an outcome - or a certain level of an outcome - can only be achieved if the necessary condition (e.g., a certain indicator or construct) is in place or has reached a certain level. If the necessary condition is not in place, the outcome will not materialize (i.e., the condition is necessary, but not sufficient for an outcome).

No-effect nonmediation: a situation in mediation analysis that occurs when neither the direct nor the indirect effect is significant.

Nominal scale: is a measurement scale in which numbers are assigned that can be used to identify and classify objects (e.g., people, companies, products, etc.).

Nomological validity: is the degree to which a construct behaves as it should in a system of related constructs.

Nonlinear effect: is not represented by a straight line when plotted on a graph but by a curve.

Nonlinear relationship: see Nonlinear effect.

Nonparametric distance-based test: allows testing whether the complete model is different across two (or more) groups. The test applies the permutation procedure to compare the average (squared Euclidean or geodesic) distance of the model-implied indicator correlation matrix across the groups.

Nonredundant tetrads: tetrads considered for significance testing in CTA-PLS.

Normed entropy statistic (EN): a statistical measure that uses the observations' probabilities of segment membership as an indication of how well segments are separated. Values above 0.50 permit a clear-cut classification of data into the segments.
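
The entry does not give the formula. The sketch below assumes the commonly used form EN = 1 - [sum over observations i and segments k of -P_ik ln(P_ik)] / (N ln K), where P_ik is the probability that observation i belongs to segment k. That formula, the function name, and the data are assumptions for illustration.

```python
from math import log

def normed_entropy(memberships):
    """Normed entropy from rows of segment membership probabilities."""
    n = len(memberships)
    k = len(memberships[0])
    if k < 2:
        raise ValueError("at least two segments are required")
    # Terms with P_ik = 0 contribute nothing, so they are skipped.
    entropy = -sum(p * log(p) for row in memberships for p in row if p > 0)
    return 1 - entropy / (n * log(k))

# Hypothetical membership probabilities for three observations, two segments.
print(round(normed_entropy([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]), 3))
```

Under this form, a value of 1 corresponds to a crisp assignment of every observation to one segment, and a value of 0 corresponds to equal membership probabilities across all segments.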

Observed heterogeneity: occurs when the sources of heterogeneity are known and can be traced back to observable characteristics such as demographics (e.g., gender, age, income).

Omission distance D: determines which data points are deleted when applying the blindfolding (see Blindfolding) procedure. An omission distance D of 9, for example, implies that every ninth data point, and thus 1/9 = 11.11% of the data in the original data set, is deleted during each blindfolding round. The omission distance should be chosen so that the number of observations used for model estimation divided by D is not an integer. Furthermore, D should be between 5 and 10.
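
The two rules in this entry can be checked mechanically, as in the sketch below. It assumes that the range of 5 to 10 is inclusive; the function name and the sample size are hypothetical.

```python
def valid_omission_distances(n_obs, lower=5, upper=10):
    """List values of D in [lower, upper] for which n_obs / D is not an integer."""
    return [d for d in range(lower, upper + 1) if n_obs % d != 0]

# Hypothetical sample of 100 observations (illustrative value only).
for d in valid_omission_distances(100):
    # Share of data points omitted in each blindfolding round is 1 / D.
    print(d, f"{100 / d:.2f}% omitted per round")
```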

One-tailed test: see Significance testing.

Optimization criterion: a certain measure whose value defines the quality of a tested set of parameters.

Ordinal scale: a measurement scale in which the assigned numbers indicate the relative positions of objects in an ordered series.

Orthogonalizing approach: is an approach to model the interaction term when including a moderator variable in the model. It creates an interaction term with orthogonal indicators, which are uncorrelated with the indicators of the independent variable and the moderator variable in the moderator model.

Outer loadings: are the bivariate correlations between a construct and the indicators. They determine an item's absolute contribution to its assigned construct. Loadings are of primary interest in the evaluation of reflective measurement models but are also interpreted when formative measures are involved.

Outer models: see Measurement model.

Outer weights: are the results of a multiple regression of a construct on its set of indicators. Weights are the primary criterion to assess each indicator's relative importance in formative measurement models.

Outlier: is an extreme response to a particular question or extreme responses to all questions.

Out-of-sample predictive power: see Predictive power.

p value: is, in the context of structural model assessment, the probability of error for assuming that a path coefficient is significantly different from zero. In applications, researchers compare the p value of a coefficient with a significance level selected prior to the analysis to decide whether the path coefficient is statistically significant.

Pairwise deletion: uses all observations with complete responses in the calculation of the model parameters. As a result, different calculations in the analysis may be based on different sample sizes, which can bias the results. The use of pairwise deletion should generally be avoided.

Parameter settings: see Algorithmic options.

Parametric approach: is a type of multigroup analysis, representing a modified version of a two independent samples t test.

Parametric MGA (equal): see Parametric MGA.

Parametric MGA (unequal): see Parametric MGA.

Parametric MGA: a multigroup variant, representing a modified version of a two-independent-samples t test. The test comes in two forms - parametric MGA (equal) or parametric MGA (unequal) - depending on whether the population variances of the groups are considered to be equal.

Parsimonious models: are models with as few parameters as possible for a given quality of model estimation results.

Partial least squares algorithm: allows estimating path models with latent variables.

Partial least squares k-means: is a clustering method, which maximizes group-specific latent variable score differences, while at the same time accounting for heterogeneity in the structural and measurement model relations.

Partial least squares path modeling: see Partial least squares structural equation modeling.

Partial least squares regression: an approach designed to reduce the problem of multicollinearity in regression models. It uses a principal components analysis that extracts linear composites of the independent variables and their respective scores, taking into consideration the relationship between the independent and dependent variables, and maximizing the explanation of the dependent variable.

Partial least squares structural equation modeling: is a composite-based method to estimate structural equation models. The goal is to maximize the explained variance of the endogenous latent variables.

Partial measurement invariance: is confirmed when both (1) configural invariance and (2) compositional invariance are demonstrated.

Partial mediation: occurs when a mediator variable partially explains the relationship between an exogenous and an endogenous construct. Partial mediation can come in the form of complementary and competitive mediation, depending on the relationship between the direct and indirect effects.

Path coefficients: are estimated path relationships in the structural model (i.e., between the constructs in the model). They correspond to standardized betas in a regression analysis.

Path model: is a diagram that visually displays the hypotheses and variable relationships that are examined when structural equation modeling is applied.

Path weighting scheme: uses the results of partial regression models to determine the relationships between the constructs in the structural model in the first stage of the PLS-SEM algorithm; see Weighting scheme.

Percentile method: is an approach for constructing bootstrap confidence intervals. Using the ordered set of parameter estimates obtained from bootstrapping, the lower and upper bounds are directly computed by excluding a certain percentage of lowest and highest values (e.g., 2.5% in the case of the 95% bootstrap confidence interval). The percentile method should be preferred when constructing confidence intervals.
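
A minimal sketch of the percentile step is shown below. It takes the bootstrap estimates as given and uses one simple convention for the cut-off (dropping the same whole number of values from each tail); other quantile conventions exist. The function name and the data are hypothetical.

```python
def percentile_interval(bootstrap_estimates, alpha=0.05):
    """Bootstrap confidence interval bounds via the percentile method."""
    ordered = sorted(bootstrap_estimates)
    # Number of values excluded from each tail (e.g., 2.5% for a 95% interval).
    cut = int(len(ordered) * alpha / 2)
    if 2 * cut >= len(ordered):
        raise ValueError("too few bootstrap estimates for this alpha")
    return ordered[cut], ordered[-cut - 1]

# Hypothetical bootstrap estimates: the values 1, 2, ..., 1000.
print(percentile_interval(range(1, 1001)))
```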

Performance: is a term used in the context of the importance-performance map analysis (IPMA). It is the mean value of the unstandardized (and rescaled) scores of a latent variable or an indicator.

Permutation MGA: randomly permutes observations between two groups and re-estimates the model to derive a test statistic for the group differences.

Permutation test: is a type of multigroup analysis. The test randomly permutes observations between the groups and re-estimates the model to derive a test statistic for the group differences.

Permutation: generates a reference distribution of some set of parameters from the actual data by randomly exchanging observations between the groups multiple times.
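
The logic of exchanging observations between groups can be sketched as follows. As a loudly stated simplification, the group mean stands in for the parameter of interest; in a PLS-SEM multigroup analysis the model would be re-estimated in each permuted group and the difference in, for example, path coefficients would be compared. All names and values are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, estimate=mean,
                        n_permutations=1000, seed=0):
    """Two-sided permutation p value for a difference between two groups."""
    rng = random.Random(seed)
    observed = estimate(group_a) - estimate(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly exchange observations between groups
        diff = estimate(pooled[:n_a]) - estimate(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical scores for two groups (illustrative values only).
print(permutation_p_value([4, 5, 6, 7, 8], [1, 2, 3, 4, 5]))
```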

PLS algorithm: see Partial least squares algorithm.

PLS path modeling: see Partial least squares structural equation modeling.

PLS regression: is an analysis technique that explores the linear relationships between multiple independent variables and a single or multiple dependent variable(s). In developing the regression model, it constructs composites from both the multiple independent variables and the dependent variable(s) by means of principal component analysis.

PLS typological path modeling: is the first distance-based segmentation method developed for PLS-SEM.

PLSc: see Consistent PLS-SEM.

PLSc-SEM: see Consistent PLS-SEM.

PLSe2: is a variant of the original PLS-SEM algorithm. Similar to PLSc-SEM, it makes the model estimates consistent in a common factor model sense.

PLS-GAS: see Genetic algorithm segmentation in PLS-SEM.

PLS-IRRS: see Iterative reweighted regressions segmentation.

PLS-MGA: is a bootstrap-based multigroup analysis technique that further improves Henseler's MGA.

PLS-POS: see Prediction-oriented segmentation in PLS-SEM.

PLSpredict procedure: is a holdout-sample-based procedure that generates case-level predictions on an item or a construct level to facilitate the assessment of a PLS path model's predictive power. The PLSpredict procedure relies on the concept of k-fold cross-validation.

PLS-R: see Partial least squares regression.

PLS-SEM algorithm: is the heart of the method. Based on the PLS path model and the indicator data available, the algorithm estimates the scores of all latent variables in the model, which in turn serve for estimating all path model relationships.

PLS-SEM bias: refers to PLS-SEM's property that structural model relationships are slightly underestimated and relationships in the measurement models are slightly overestimated compared to CB-SEM when using the method on common factor model data. This difference can be attributed to the methods' different handling of the latent variables in the model estimation but is negligible in most settings typically encountered in empirical research.

PLS-SEM: see Partial least squares structural equation modeling.

PLS-SEM-KM: see Partial least squares k-means.

PLS-TPM: see PLS typological path modeling.

Polynomial degree: determines the number of terms that are summed in a polynomial; see also Polynomial.

Polynomial: is a mathematical expression consisting of a sum of terms, whereby each term includes a variable raised to a power and multiplied by a coefficient.

Prediction error: is the difference between a variable's predicted and original value.

Prediction statistics: quantify the degree of prediction error.

Prediction: see Predictive power.

Prediction-oriented segmentation in PLS-SEM: is a distance-based segmentation method for PLS-SEM.

Predictive power: indicates a model's ability to predict new or future observations.

Presegmentation: an option for PLS-POS that, in the first round, assigns all observations at the same time to their closest segment. Then, in the subsequent iterations, PLS-POS reassigns only one observation per iteration.

Principal components regression: performs a principal components analysis on the independent variables, and the principal components are used as predictive/explanatory variables for the dependent variable. It focuses on reducing the dimensionality of the independent variables without taking into account the relationship between the independent and dependent variables.

Priority map analysis: see Importance-performance map analysis (IPMA).

Product indicator approach: is an approach to model the nonlinear (e.g., quadratic) term. It involves multiplying all indicators of the exogenous latent variable to establish a measurement model of the nonlinear (e.g., quadratic) term. The approach is only applicable for reflectively measured exogenous latent variables.

Product indicators: are indicators of an interaction term, generated by multiplication of each indicator of the exogenous construct with each indicator of the moderator variable. See Product indicator approach.
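
The pairwise multiplication described in this entry can be sketched as follows. The indicator names and values are hypothetical. The sketch multiplies the values as given; whether the indicators are standardized or mean-centered beforehand is left to the analyst and is not handled here.

```python
def product_indicators(exogenous, moderator):
    """Multiply each exogenous indicator with each moderator indicator.

    Both arguments map indicator names to equally long lists of values.
    """
    return {
        f"{x_name}*{m_name}": [x * m for x, m in zip(x_values, m_values)]
        for x_name, x_values in exogenous.items()
        for m_name, m_values in moderator.items()
    }

# Hypothetical data: two exogenous indicators, one moderator indicator.
exogenous = {"x1": [1, 2], "x2": [3, 4]}
moderator = {"m1": [5, 6]}
print(product_indicators(exogenous, moderator))
```

With two exogenous indicators and one moderator indicator, the interaction term receives two product indicators; in general, the count is the product of the two indicator counts.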

_Q_²predict: is a metric used in PLSpredict to assess the model's predictive power. The metric compares the prediction errors of the PLS path model with those of a naïve benchmark. Values greater than zero indicate that the PLS-SEM estimation beats the naïve benchmark in terms of prediction.
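
A minimal sketch of such a comparison is given below. It assumes that the naïve benchmark predicts every holdout observation with the indicator's mean in the training sample, and that the metric is one minus the ratio of the two sums of squared prediction errors. The formula, the function name, and the data are assumptions that are not stated in this glossary.

```python
def q2_predict(actual, predicted, training_mean):
    """One minus the ratio of model errors to naive-benchmark errors."""
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_naive = sum((a - training_mean) ** 2 for a in actual)
    if sse_naive == 0:
        raise ValueError("the naive benchmark has zero prediction error")
    return 1 - sse_model / sse_naive

# Hypothetical holdout values and predictions (illustrative values only).
print(round(q2_predict([1, 2, 3], [1.5, 2.0, 2.5], training_mean=2.0), 2))
```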

_Q_² statistic: is a measure for evaluating structural models. The computation of _Q_² draws on the blindfolding technique, which uses a subset of the available data to estimate model parameters and then predicts the omitted data. _Q_² examines whether a model accurately predicts data points not used in the estimation of model parameters. As the measure blends in-sample and out-of-sample predictive power assessment, we advise against its use.

Quadratic effect: represented by a curved nonlinear relationship characterized by a polynomial of the degree 2; see also Nonlinear effect and Polynomial.

R² value: see Coefficient of determination (R²).

Ratio scale: is a measurement scale, which has a constant unit of measurement and an absolute zero point; a ratio can be calculated using the scale points.

Raw data: are the unstandardized observations in the data matrix that is used for the PLS path model estimation.

REBUS-PLS: see Response-based procedure for detecting unit segments in PLS path modeling.

Redundancy analysis: is a method used to assess a formative construct's convergent validity. It tests whether a formatively measured construct is highly correlated with a reflective or single-item measure of the same construct.

Reflective indicators: indicators of a reflective measurement model.

Reflective measure: see Reflective measurement.

Reflective measurement model: see Reflective measurement.

Reflective measurement: is a type of measurement model setup in which measures represent the effects (or manifestations) of an underlying construct. Causality is from the construct to its measures (indicators). The outer loadings estimation of reflective measurement models usually uses Mode A in PLS-SEM.

Reflective-formative higher-order construct: has reflectively measured lower-order components and path relationships from the lower-order components to the higher-order component.

Reflective-reflective higher-order construct: has reflective measurement models of all LOCs in the HCM and reflective path relationships from the HOC to the LOCs; also referred to as Hierarchical common factor model.

Regression weights: see Mode B.

Relative contribution: is the unique importance of each indicator by partializing the variance of the formatively measured construct that is predicted by the other indicators. An item's relative contribution is provided by its weight.

Relevance of significant relationships: compares the relative importance of predictor constructs to explain endogenous latent constructs in the structural model. Significance is a prerequisite for the relevance, but not all constructs and their significant path coefficients are highly relevant to explain a selected target construct.

Reliability coefficient rA: is a measure of internal consistency reliability.

Reliability: is the consistency of a measure. A measure is reliable (in the sense of test-retest reliability) when it produces consistent outcomes under consistent conditions. The most commonly used measure of reliability is the internal consistency reliability.

Repeated indicators approach: is a type of measurement model setup in higher-order constructs that reuses the indicators of the lower-order components as indicators of the higher-order component to identify the higher-order construct.

Rescaling: is the act of changing the values of a variable's scale to fit a predefined range (e.g., 0 to 100).
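
One common way to do this is a linear (min-max) transformation, sketched below for a target range of 0 to 100. The formula, the use of the scale's theoretical endpoints rather than the observed minimum and maximum, and the function name are assumptions for illustration.

```python
def rescale(value, scale_min, scale_max, new_min=0.0, new_max=100.0):
    """Linearly map a value from [scale_min, scale_max] to [new_min, new_max]."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    share = (value - scale_min) / (scale_max - scale_min)
    return new_min + share * (new_max - new_min)

# Hypothetical response of 4 on a scale from 1 to 7 (the scale midpoint).
print(rescale(4, scale_min=1, scale_max=7))
```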

Response-based procedure for detecting unit segments in PLS path modeling: is a distance-based segmentation method for PLS-SEM that builds on the PLS-TPM method.

Response-based segmentation techniques: see Latent class techniques.

RMSE: see Root mean squared error.

RMStheta: see Root mean square residual covariance.

Root mean square residual covariance: is a model fit measure, which is based on the (root mean square) discrepancy between the observed covariance and the model-implied correlations. In CB-SEM, an SRMR value indicates good fit, but no threshold value has been introduced in a PLS-SEM context yet. Initial simulation results suggest a (conservative) threshold value for the root mean square residual covariance (RMStheta) of 0.12. That is, RMStheta values below 0.12 indicate a well-fitting model, whereas higher values indicate a lack of fit. However, model fit measures should generally be treated with extreme caution in PLS-SEM.

Root mean squared error: is a metric used in PLSpredict, defined as the square root of the average of the squared differences between the predictions and the actual observations.
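
A minimal sketch of this computation is given below. The function name and the data are hypothetical. Because the differences are squared before averaging, large prediction errors weigh more heavily than small ones.

```python
from math import sqrt

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared prediction error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical observations and predictions (illustrative values only).
print(round(root_mean_squared_error([3, 5, 4], [2, 5, 6]), 3))
```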

Sampling weights: assign the observations different importance in the parameter estimation in order to obtain unbiased estimates of the population effects.

Scale: is a set of reflective indicators used to measure a construct.

Search depth: a parameter in PLS-POS that defines the maximum number of observations considered for reassignment to another segment.

Secondary data: are data that have already been gathered, often for a different research purpose and some time ago.

Second-generation techniques: overcome the limitations of first-generation techniques, for example, in terms of accounting for measurement error. SEM is the most prominent second-generation data analysis technique.

Second-order construct: is a type of higher-order construct with two levels of abstraction.

Self-interaction: occurs when the effect of an exogenous latent variable on an endogenous latent variable depends on the values of the exogenous latent variable.

SEM: see Structural equation modeling.

Serial mediating effect: A type of mediating effect in a multiple mediation model which considers a sequence of effects via two or more mediators simultaneously.

Significance testing: is the process of testing whether a certain result likely has occurred by chance (i.e., whether an effect can be assumed to truly exist in the population).

Simple effect: is a cause-effect relationship in a moderator model. The parameter estimate represents the size of the relationship between the exogenous and endogenous latent variable when the moderator variable is included in the model. For this reason, the main effect and the simple effect usually have different sizes.

Single mediation analysis: describes a mediation analysis in which only one mediator variable is being included in the model.

Single-item measurement: uses only a single item to measure a construct. Since the construct is equal to its measure, the indicator loading is 1.00, making conventional reliability and convergent validity assessments inappropriate.

Singular data matrix: occurs when a variable in a measurement model is a linear combination of another variable in the same measurement model or when a variable has identical values for all observations (i.e., the variable has no variance). In either case, the PLS-SEM approach cannot estimate the PLS path model.

Skewness: is the extent to which a variable's distribution is symmetrical around its mean value.
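
One common way to quantify skewness is the moment-based coefficient (third central moment divided by the second central moment raised to the power 1.5). The sketch below assumes this particular variant; statistical software may apply a sample-size correction and report slightly different values. The function name is hypothetical.

```python
def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a longer right tail."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed")
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    if second_moment == 0:
        raise ValueError("skewness is undefined for a variable without variance")
    return third_moment / second_moment ** 1.5


print(skewness([1, 2, 3, 4, 5]))   # symmetric around the mean
print(skewness([1, 1, 1, 10]))     # longer right tail
```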

Slope plot: is a type of line chart used to detect changes in linear slopes between groups.

Sobel test: is a test that has been proposed to assess the significance of the indirect effect in a mediation model. Due to its parametric nature and reliance on unstandardized path coefficients, the test is not applicable in a PLS-SEM context.

Specific indirect effect: describes an indirect effect via one single mediator in a multiple mediation model.

Spread model: an HCM type in which the HOC is manifested in several more specific LOCs. The relationship between HOC and LOCs is reflective.

SRMR: see Standardized root mean square residual.

Stand-alone higher-order construct: a higher-order construct that is not embedded in a greater nomological net of constructs.

Standard error: is the standard deviation of the sampling distribution of a given statistic. Standard errors are important to show how much sampling fluctuation a statistic has.
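
When the sampling distribution is approximated by resampling, the standard error can be estimated as the standard deviation of the resampled estimates. The following is a minimal sketch of that idea for an arbitrary statistic. The function name, the number of resamples, and the seed are illustrative assumptions, not settings prescribed by any PLS-SEM software.

```python
import random
import statistics


def bootstrap_standard_error(sample, statistic, n_resamples=1000, seed=0):
    """Standard deviation of a statistic across resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


# Hypothetical data; the statistic here is the sample mean.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
print(bootstrap_standard_error(data, statistics.mean))
```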

Standardized data: have a mean value of 0 and a standard deviation of 1 (z-standardization). The PLS-SEM method usually uses standardized raw data. Most software tools automatically standardize the raw data when running the PLS-SEM algorithm.
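
A minimal sketch of z-standardization for a single variable is shown below. It assumes the sample standard deviation (divisor n - 1); some tools divide by n instead, which changes the values slightly. The function name is hypothetical.

```python
import statistics


def z_standardize(values):
    """Rescale values to a mean of 0 and a standard deviation of 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("a variable without variance cannot be standardized")
    return [(v - mean) / sd for v in values]


print(z_standardize([1, 2, 3]))
```

Each resulting value states how many standard deviations the observation lies above or below the mean.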

Standardized root mean square residual: is a model fit measure, which is defined as the root mean square discrepancy between the observed correlations and the model-implied correlations. Research has shown that the SRMR is largely unsuitable for detecting model misspecification in situations commonly encountered in applied research.
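
The definition above can be sketched as follows. This illustration assumes that the average is taken over the lower triangle of the correlation matrices including the diagonal, which is one common convention; software packages may differ in this detail. The function name and the matrices are hypothetical.

```python
import math


def srmr(observed, implied):
    """Root mean square discrepancy between observed and model-implied correlations."""
    p = len(observed)
    squared_sum = 0.0
    count = 0
    for i in range(p):
        for j in range(i + 1):  # lower triangle, diagonal included (assumption)
            squared_sum += (observed[i][j] - implied[i][j]) ** 2
            count += 1
    return math.sqrt(squared_sum / count)


# Hypothetical 2 x 2 correlation matrices.
observed_corr = [[1.0, 0.5], [0.5, 1.0]]
implied_corr = [[1.0, 0.3], [0.3, 1.0]]
print(srmr(observed_corr, implied_corr))
```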

Standardized values: indicate how many standard deviations an observation is above or below the mean.

Statistical power: is the probability of detecting a significant relationship when the relationship in fact exists in the population.

Stop criterion: see Convergence.

Straight lining: describes a situation in which a respondent marks the same response for a high proportion of the questions.

Structural equation modeling: is a set of statistical methods used to estimate relationships between constructs and indicators, while accounting for measurement error.

Structural model: includes the constructs and their relationships as derived from theory and logic.

Structural theory: specifies how the latent variables are related to each other. That is, it shows the constructs and the paths between them.

Studentized bootstrap method: computes confidence intervals similarly to a confidence interval based on the t distribution, except that the standard error is derived from the bootstrapping results.

Sufficiency logic: assumes that several factors contribute to the outcome and that one can compensate for another.

Sum scores: represent a naive way to determine the latent variable scores. Instead of estimating the relationships in the measurement models, sum scores use the same weight for each indicator per measurement model (equal weights) to determine the latent variable scores. As such, the sum scores approach does not account for measurement error.
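
A minimal sketch of the equal-weights idea: each observation's score is simply the unweighted sum of its indicator values. The function name and the response data are hypothetical.

```python
def sum_scores(indicator_rows):
    """Equal-weight construct score per observation (one row = one respondent)."""
    return [sum(row) for row in indicator_rows]


# Hypothetical responses of three respondents to three indicators of one construct.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 5],
]
print(sum_scores(responses))
```

No indicator weights are estimated here, which is exactly what distinguishes sum scores from the scores produced by the PLS-SEM algorithm.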

Suppressor variable: describes the mediator variable in competitive mediation, which absorbs a significant share of or the entire direct effect, thereby substantially decreasing the magnitude of the total effect.

Tetrad: is the difference of the product of a pair of covariances and the product of another pair of covariances. In reflective measurement models, this difference is assumed to be zero or at least close to zero; that is, the tetrads are expected to vanish. Nonvanishing tetrads in a latent variable's measurement model cast doubt on its reflective specification, suggesting a formative specification.
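
The following sketch computes a single tetrad from a covariance matrix and illustrates why tetrads vanish when four indicators reflect one common factor. The loadings are hypothetical, the function name is an assumption, and the sketch does not perform the statistical test of whether a tetrad differs significantly from zero.

```python
def tetrad(cov, g, h, i, j):
    """Tetrad of indicators g, h, i, j: cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]."""
    return cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]


# Hypothetical loadings of four reflective indicators on one common factor.
loadings = [0.8, 0.7, 0.6, 0.5]

# Under a one-factor model, the implied covariance of two different
# indicators is the product of their loadings (factor variance of 1 assumed).
implied_cov = [[a * b for b in loadings] for a in loadings]

print(tetrad(implied_cov, 0, 1, 2, 3))  # zero up to floating-point rounding
```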

Theoretical model: a set of equations with variables that formalize a theory.

Theoretical t value: see Critical t value.

Theory: is a set of systematically related hypotheses developed following the scientific method that can be used to explain and predict outcomes and can be tested empirically.

Three-way interaction: is an extension of two-way interaction where the moderator effect is again moderated by another moderator variable.

TOL: see Variance inflation factor.

Tolerance: see Variance inflation factor.

Top-down approach: a way to establish an HCM in which a more abstract construct (the HOC) is defined that consists of several components (the LOCs).

Total effect: is the sum of the direct effect and the indirect effect between an exogenous and an endogenous latent variable in the path model.

Total indirect effect: is the sum of all specific indirect effects in a multiple mediation model.
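
The relationships among specific indirect, total indirect, and total effects can be sketched with a small numerical example. Each specific indirect effect is taken as the product of the path coefficients along its route. All coefficient values and names below are hypothetical.

```python
import math


def specific_indirect_effect(path_coefficients):
    """Product of the path coefficients along one indirect route."""
    return math.prod(path_coefficients)


# Hypothetical multiple mediation model with two parallel mediators.
direct_effect = 0.20
routes = {
    "via M1": [0.50, 0.40],  # exogenous -> M1, M1 -> endogenous
    "via M2": [0.30, 0.50],  # exogenous -> M2, M2 -> endogenous
}

specific_effects = {name: specific_indirect_effect(c) for name, c in routes.items()}
total_indirect_effect = sum(specific_effects.values())
total_effect = direct_effect + total_indirect_effect

print(specific_effects, total_indirect_effect, total_effect)
```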

Training sample: is a subset of a larger data set used for model estimation.

Two-stage approach (higher-order constructs): is an approach to modeling and estimating higher-order constructs in PLS-SEM, which is particularly useful when a reflective-formative or formative-formative higher-order construct serves as an endogenous construct in the PLS path model.

Two-stage approach (moderation analysis): is an approach to model the interaction term when including a moderator variable in the model. The approach can be used when the exogenous construct and/or the moderator variable are measured formatively.

Two-stage approach (nonlinear effects): an approach to model the nonlinear (e.g., quadratic) term. The approach can be used for any kind of exogenous construct no matter whether it is measured reflectively, formatively, or represents a single item construct.

Two-tailed test: see Significance testing.

Two-way interaction: is the standard approach to moderator analysis where the moderator variable interacts with one other exogenous latent variable.

Type I higher-order construct: see Reflective-reflective higher-order construct.

Type II higher-order construct: see Reflective-formative higher-order construct.

Type III higher-order construct: see Formative-reflective higher-order construct.

Type IV higher-order construct: see Formative-formative higher-order construct.

Unobserved heterogeneity: occurs when the sources of heterogeneous data structures are not (fully) known.

Validity: is the extent to which a construct's indicators jointly measure what they are supposed to measure.

Vanishing tetrads: see Tetrad.

Variance inflation factor: quantifies the severity of collinearity among the indicators in a formative measurement model. The VIF is directly related to the tolerance value (VIF_i = 1/tolerance_i).
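
As a minimal sketch of this relationship, the example below assumes the usual definition of tolerance as 1 minus the R² obtained when indicator i is regressed on the other indicators of the same measurement model. The function names and the R² value are hypothetical.

```python
def tolerance(r_squared):
    """Share of an indicator's variance not explained by the other indicators."""
    return 1.0 - r_squared


def vif(r_squared):
    """Variance inflation factor as the reciprocal of the tolerance."""
    tol = tolerance(r_squared)
    if tol <= 0:
        raise ValueError("perfect collinearity: the VIF is undefined")
    return 1.0 / tol


# Hypothetical R² from regressing one indicator on the remaining indicators.
print(tolerance(0.8), vif(0.8))
```

The higher the R² of that auxiliary regression, the lower the tolerance and the higher the VIF.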

Variance-based SEM: see Partial least squares structural equation modeling.

Variate: see Composite variable.

VIF: see Variance inflation factor.

Weighted PLS-SEM: is a modified version of the original PLS-SEM algorithm that allows the researcher to incorporate sampling weights.

Weighting scheme: describes a particular method to determine the relationships in the structural model when running the PLS-SEM algorithm. Standard options are the centroid, factor, and path weighting schemes. The final results do not differ much, and one should use the path weighting scheme as a default option since it maximizes the R² values of the PLS path model estimation.

Welch-Satterthwaite t test: see Parametric MGA.

WPLS: see Weighted PLS-SEM.