Variation Sharing: A Novel Numeric Solution to the Path Bias Underestimation Problem of PLS-Based SEM

Ned Kock (Department of International Business and Technology Studies, Texas A&M International University, Laredo, TX, USA) and Shaun Sexton (Megabus USA, Chicago, IL, USA)
Copyright: © 2017 | Pages: 23
DOI: 10.4018/IJSDS.2017100102


The most fundamental problem currently associated with structural equation modeling employing the partial least squares method is that it does not properly account for measurement error, which often leads to path coefficient estimates that asymptotically converge to values of lower magnitude than the true values. This attenuation phenomenon affects applications in the field of business data analytics, and is in fact a characteristic of composite-based models in general, where latent variables are modeled as exact linear combinations of their indicators. The underestimation is often around 10% per path in models that meet generally accepted measurement quality assessment criteria. The authors propose a numeric solution to this problem, which they call the factor-based partial least squares regression (FPLSR) algorithm, whereby variation lost in composites is restored in proportion to measurement error and amount of attenuation. Six variations of the solution are developed based on different reliability measures, and contrasted in Monte Carlo simulations. The authors' solution is nonparametric and seems to perform generally well with small samples and severely non-normal data.
Article Preview


Structural equation modeling (SEM) is extensively used in many areas of research, including various business disciplines, as well as the social and behavioral sciences (Kline, 2010; Kock, 2014; Schumacker & Lomax, 2004). The techniques underlying SEM are relevant for the incipient field of business data analytics (Abdelhafez, 2014; Cech et al., 2014; Lee et al., 2014; Liu & Shi, 2015; Wang & Zhou, 2014). SEM employs latent variables, which are measured indirectly through sets of “observed” or “manifest” variables that are normally called “indicators”. This measurement includes error. Latent variables typically refer to perception-based constructs (e.g., satisfaction with one’s job). Indicators normally store numeric answers to sets of questions in questionnaires, each set designed to refer to a latent variable, and expected to measure it with a certain degree of imprecision.

Many SEM methods have been proposed over the years. Two main classes of methods have gained wider acceptance: covariance-based and PLS-based SEM (Hair et al., 2011; Kline, 2010; Kock, 2014; Kock & Lynn, 2012). Covariance-based SEM, often viewed as the classic form of SEM, builds on strong parametric assumptions (e.g., multivariate normality) and relies on the minimization of differences between model-implied and observed indicator covariance matrices.

PLS-based SEM is generally nonparametric in design, building largely on techniques that make no distributional assumptions. It has a few advantages over covariance-based SEM, such as virtually always converging to solutions, even in complex models, with small sample sizes, and with severely non-normal data (Hair et al., 2011; Tenenhaus et al., 2005). Also, PLS-based SEM generates latent variable scores, which can be used in further analyses – e.g., analyses that attempt to uncover and model nonlinear relationships among latent variables (Brewer et al., 2012; Guo et al., 2011; Kock, 2010). Finally, leading software tools for conducting PLS-based SEM (e.g., WarpPLS) tend to be viewed as fairly easy to use by a wide range of researchers.

However, PLS-based SEM builds latent variables as exact linear combinations of their indicators, without explicitly accounting for measurement error. Strictly speaking, these are not really latent variables, but “composites” (McDonald, 1996). Because of this, some argue that PLS-based SEM should not be referred to as an “SEM” technique, while others ignore this as just a semantic issue (Hair et al., 2011). This is one of the reasons why PLS-based SEM is sometimes referred to as “PLS path modeling” (Tenenhaus et al., 2005).
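The distinction between a latent variable and a composite can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not run PLS: it assumes a single standardized factor, three indicators with an arbitrary loading of 0.7, and an equally weighted composite (all names and parameter values are hypothetical choices for illustration). It shows that a composite built as an exact linear combination of error-laden indicators correlates with the underlying factor at less than 1.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_composite(n=10000, loading=0.7, n_indicators=3, seed=1):
    """Simulate a factor and an equally weighted composite of its indicators.

    Assumption: each indicator = loading * factor + error, with the error
    scaled so that every indicator has unit variance.
    """
    rng = random.Random(seed)
    error_sd = math.sqrt(1 - loading ** 2)
    factor = [rng.gauss(0, 1) for _ in range(n)]
    composite = []
    for f in factor:
        indicators = [loading * f + error_sd * rng.gauss(0, 1)
                      for _ in range(n_indicators)]
        # The composite is an exact linear combination of the indicators,
        # so it carries their measurement error along with it.
        composite.append(sum(indicators) / n_indicators)
    return factor, composite


factor, composite = simulate_composite()
r_fc = pearson(factor, composite)
print(f"correlation between factor and composite: {r_fc:.3f}")
```

Under these assumed settings the correlation falls clearly below 1, which is the sense in which a composite is only an imperfect stand-in for the latent variable it refers to.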

Because PLS-based SEM does not explicitly account for measurement error, it often yields path coefficient estimates that asymptotically converge to values of lower magnitude than the true values as sample sizes grow to infinity. Since path coefficients are proportional to correlations, the amount of underestimation for each path can be approximated through the correlation attenuation factor (Nunnally & Bernstein, 1994), expressed in (1). In this equation, $r(\hat{F}_i, \hat{F}_j)$ is the attenuated correlation between composites that refer to two correlated latent variables $F_i$ and $F_j$; $r(F_i, F_j)$ is the correlation between the latent variables; and $\alpha_i$ and $\alpha_j$ are the true reliabilities associated with the latent variables. We use the symbols $F$ and $\hat{F}$ throughout to refer to latent variables (or factors) and associated composites, respectively:

$$r(\hat{F}_i, \hat{F}_j) = r(F_i, F_j)\,\sqrt{\alpha_i \alpha_j} \qquad (1)$$
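As a worked illustration of the attenuation relation described above (the attenuated correlation equals the true correlation multiplied by the square root of the product of the two reliabilities), the following minimal sketch computes the attenuated value and the relative underestimation. The function names and the example values (a true correlation of 0.5 and reliabilities of 0.9) are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math


def attenuated_correlation(true_corr, rel_i, rel_j):
    """Correlation between two composites implied by the attenuation factor.

    true_corr: correlation between the two latent variables.
    rel_i, rel_j: true reliabilities of the two latent variables, in (0, 1].
    """
    if not (0 < rel_i <= 1 and 0 < rel_j <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return true_corr * math.sqrt(rel_i * rel_j)


def relative_underestimation(rel_i, rel_j):
    """Fraction of the true correlation lost to attenuation."""
    return 1 - math.sqrt(rel_i * rel_j)


# Hypothetical example values, not results reported in the article.
r_hat = attenuated_correlation(0.5, 0.9, 0.9)
loss = relative_underestimation(0.9, 0.9)
print(f"attenuated correlation: {r_hat:.3f}")
print(f"relative underestimation: {loss:.1%}")
```

With both reliabilities assumed to be 0.9, the attenuation factor is about 0.9, so roughly 10% of the true correlation is lost. This is of the same order as the per-path underestimation mentioned in the abstract, although the article may arrive at its figure differently.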
