{smcl} {* *! version 0.3.0 12Apr2019}{...} {vieweralsosee "plssem postestimation" "help plssem postestimation"}{...} {vieweralsosee "plssemplot" "help plssemplot"}{...} {vieweralsosee "plssemc" "help plssemc"}{...} {viewerjumpto "Syntax" "plssem##syntax"}{...} {viewerjumpto "Description" "plssem##description"}{...} {viewerjumpto "Options" "plssem##options"}{...} {viewerjumpto "Examples" "plssem##examples"}{...} {viewerjumpto "Authors" "plssem##authors"}{...} {viewerjumpto "Stored results" "plssem##results"}{...} {viewerjumpto "References" "plssem##references"}{...} {title:Title} {p 4 18 2} {hi:plssem} {hline 2} Partial least squares structural equation modelling (PLS-SEM) {marker syntax}{...} {title:Syntax} {pstd} Partial least squares structural equation modeling of data {p 8 12 2} {cmd:plssem} (LV1 > indblock1) (LV2 > indblock2) (...) {ifin} [{cmd:,} structural(LV2 LV1, ...) {it:{help plssem##plssemopts:options}}] {pstd} Partial least squares structural equation modeling of adjacency matrices {p 8 15 2} {cmd:plssemmat} {it:adjmeas_matname} {ifin} [{cmd:,} structural({it:adjstruc_matname}) {it:{help plssem##plssemopts:options}}] {phang} {it:adjmeas_matname} is a Q x P matrix providing the adjacency matrix for the measurement model, while {it:adjstruc_matname} is a P x P matrix providing the adjacency matrix for the structural model (Q denotes the number of indicators and P the number of latent variables in the model). {synoptset 19 tabbed}{...} {marker plssemopts}{...} {synopthdr} {synoptline} {synopt:{cmdab:w:scheme(centroid)}}use the centroid weighting scheme{p_end} {synopt:{cmdab:w:scheme(factorial)}}use the factorial weighting scheme{p_end} {synopt:{cmdab:w:scheme(path)}}use the path weighting scheme; the default{p_end} {synopt:{opth bin:ary(syntax##description_of_namelist:namelist)}}list of latent variables to fit using {helpb logit}{p_end} {synopt:{opth b:oot(numlist)}}number of bootstrap replications{p_end} {synopt:{opth s:eed(numlist)}}bootstrap seed number{p_end} {synopt:{opt t:ol(#)}}tolerance; default is {cmd:1e-7}{p_end} {synopt:{opt max:iter(#)}}maximum number of iterations; default is {cmd:100}{p_end} {synopt:{cmdab:miss:ing(mean)}}impute the indicator missing values using the mean of the available indicators{p_end} {synopt:{cmdab:miss:ing(knn)}}impute the indicator missing values using the k-th nearest neighbor method{p_end} {synopt:{opt k(#)}}number of nearest neighbors to use with {cmd:missing(knn)}; default is {cmd:5}{p_end} {synopt:{cmdab:init(eigen)}}initialize the latent variables using {helpb factor}{p_end} {synopt:{cmdab:init(indsum)}}initialize the latent variables using the sum of indicators; the default{p_end} {synopt:{opt dig:its(#)}}number of digits to display; default is {cmd:3}{p_end} {synopt:{cmd:no}{cmdab:head:er}}suppress display of output header{p_end} {synopt:{cmd:no}{cmdab:meas:table}}suppress display of measurement model estimates table{p_end} {synopt:{cmd:no}{cmdab:discrim:table}}suppress display of discriminant validity table{p_end} {synopt:{cmd:no}{cmdab:struct:table}}suppress display of structural model estimates table{p_end} {synopt:{opt loadp:val}}show the outer loadings' p-values{p_end} {synopt:{opt stat:s}}print a table of summary statistics for the indicators{p_end} {synopt:{opt gr:oup()}}perform multigroup analysis; see {help plssem##options:{it:Options}} for details{p_end} {synopt:{opt corr:elate()}}report the correlation among indicators, latent variables and cross loadings; see {help plssem##options:{it:Options}} for details{p_end} {synopt:{opt raw:sum}}estimate the latent scores as the raw sum of the indicators{p_end} {synopt:{cmd:no}{cmdab:sc:ale}}manifest variables are not standardized before running the algorithm{p_end} {synopt:{cmdab:conv:crit(relative)}}relative convergence criterion; the default{p_end} {synopt:{cmdab:conv:crit(square)}}square convergence criterion{p_end} {synoptline} {p 4 6 2} {cmd:by} is allowed with {cmd:plssem}; see {help prefix}. {p 4 6 2} See {helpb plssem_postestimation:plssem postestimation} and {helpb plssemplot:plssemplot} for features available after estimation.{p_end} {pstd}The syntax of {cmd:plssem} reflects the measurement and structural part of a PLS-SEM model, and accordingly requires the user to specify both of these parts simultaneously. Since a full PLS-SEM model would include a structural model, i.e., the relationship between latent variables (LV), one needs to have at least two latent variables specified in the measurement part. Each latent variable will be defined by a block of indicators (say, {cmd:indblock}). For example, if we have two latent variables in our PLS-SEM model, the {cmd:plssem} syntax requires to specify the measurement part by typing {cmd:(LV1 > indblock1) (LV2 > indblock2)} following the command name. Note that we can specify as many LVs as it is needed in the model. {pstd}Incidentally, when specifying reflective measures, one needs to use the greater-than sign between a latent variable and its associated indicators (e.g., {cmd:LV1 > indblock1}) and the less-than sign for formative measures (e.g., {cmd:LV1 < indblock1}). {pstd}To specify the structural part, one simply needs to type in the endogenous/dependent LV first and then the exogenous latent variable/s, e.g., {cmd:structural(LV2 LV1)}. One can specify more than one structural relationship following the same approach. Say that we have two further latent variables in the model, {cmd:LV3} and {cmd:LV4}; then, in the structural part of the syntax we would type in {cmd:structural(LV2 LV1, LV4 LV3)} indicating that {cmd:LV4} is another endogenous LV predicted by {cmd:LV3}. In addition, in line with most of the Stata commands, one can fit a full PLS-PM model by subsetting the data directly in the syntax using the {cmd:if} and {cmd:in} qualifers. {pstd}In {cmd:plssemmat} row and column names of the adjacency matrices provided are used in the output. Note that, no matter whether {cmd:plssem} or {cmd:plssemmat} is used, the raw data are still needed. The difference between the two commands is how the model is specified: through an equation-like style for {cmd:plssem} and with the adjacency matrices for {cmd:plssemmat}. {pstd}In {cmd:plssemmat} each column name of the measurement model adjacency matrix must specify either "Reflective:" or "Formative:" as the equation names of the columns (see the examples below). {marker description}{...} {title:Description} {pstd} {bf:plssem} fits partial least squares structural equation models (PLS-SEM), which is often considered as an alternative to the commonly known covariance-based structural equation modeling (COV-SEM). {bf:plssem} is developed in line with the algorithm provided by {help plssem##Wold1975:Wold (1975)} and {help plssem##Lohmoller1989:Lohmöller (1989)}. {bf:plssem} can be used for modeling the relationship among single-item observed variables too and not only for latent variable modeling. {pstd} The algorithm used to estimate a PLS-SEM model consists basically of three sequential stages of estimation (see {help plssem##Lohmoller1989:Lohmöller 1989}). In the first stage, latent variable scores are estimated for each case. Using these scores, in the second stage, measurement model parameters (weights/loadings) are estimated. In the same manner, in the third stage structural model parameters (path coefficients) are finally estimated. The first stage is what makes PLS-SEM a novel method in that the second and third stages are about conducting a series of regression analysis using the ordinary least squares method. {marker options}{...} {title:Options} {phang}{opt wscheme(weighting_scheme)} provides the choice of the weighting scheme. The default is {bf:path} for the path scheme. Alternative choices are {bf:factorial} or {bf:centroid} for the corresponding scheme. {phang}{opt binary(LV)} indicates the latent variables that are defined by a single binary variable. This allows essentially for estimating a model with a binary dependent variable using a logistic regression model. The {bf:LV} needs to be specified in the measurement part of the syntax at the same time (e.g., {bf:LV > binaryvar1)}. {phang}{opt boot(#)} sets the number of bootstrap replications. {phang}{opt seed(#)} sets the seed number for the bootstrap calculations. This option may be useful if reproducibility is the analyst's concern. {phang}{opt tol(#)} sets the tolerance value used for checking convergence attainment. The default tolerance value is 1e-7. {phang}{opt maxiter(#)} indicates the maximum number of iterations the algorithm runs. The default is 100 iterations. Note that usually the algorithm requires a very limited number of iterations to reach convergence, typically less than 10. {phang}{opt missing(imputation_method)} provides the choice for the method to use for imputing the indicator missing values. Possible choices are {bf:mean} (i.e. the mean of the available indicators) or {bf:knn} (i.e. the k-th nearest neighbor method). {phang}{opt k(#)} sets the number of nearest neighbors to use with {cmd:missing(knn)}. The default number of nearest neighbors is 5. {phang}{opt init(init_method)} lets the user choose between two options for initialization. These are {bf:indsum} (default) and {bf:eigen}. The {bf:eigen} option also allows the user estimate only the measurement part of the model. {phang}{opt digits(#)} sets the number of decimals to display the model estimates. The default is 3. {phang}{opt noheader} suppresses the output header. {phang}{opt nodiscrimtable} suppresses discriminant validity assessment section of the output. {phang}{opt nomeastable} suppresses measurement model section of the output. {phang}{opt nostructtable} suppresses structural model section of the output. {phang}{opt loadpval} shows the table of loadings' p-values. {phang}{opt stats} displays some summary statistics (mean, standard deviations, etc.) for the original indicators. {phang}{opt group(grouping_variable, [sub-options])} provides both the structural and the measurement part of the estimation results for each category of the grouping variable as well as the comparison between the categories based on normal-theory (default). As an alternative to normal-based theory estimations, the user can use two resampling techniques. More specifically, by adding the suboption {bf:method(permutation} or {bf:bootstrap)} one can get the results based on permutation or bootstrap resampling. The default number of replications for both permutation and bootstrap is 100. However, this can be changed by adding the suboption {bf:reps(#)}. Further, with the suboption {bf:groupseed(#)} one can also set a certain seed number to be able reproduce the bootstrap or permutation results. Finally, by using the suboption {bf:plot} we can get a graphical output showing the estimates differences between the groups based on alpha level of 0.05 (default). The significance level can also be changed by adding the suboption {bf:alpha(#)}. {phang}{opt correlate(mv lv cross [, cutoff(#)])} lets the user ask for correlations among the indicators or manifest variables ({bf:mv}), latent variables ({bf:lv}) as well as cross-loadings ({bf:cross}) between the indicators and latent variables. When doing so, the user can also set a certain cut-off value for the correlations to be displayed by using the suboption {bf:cutoff(#)}. For instance, {bf:cutoff(0.3)} will display the correlations above 0.3 in absolute terms. {phang}{opt rawsum} uses the sum of the raw indicators and the resulting aggregated scores (also called summated scales) are used directly for estimating the structural part. In this sense, {cmd:rawsum} is an alternative procedure to the PLS-algorithm for estimating the latent variable scores. {phang}{opt noscale} the manifest variables are not standardized before running the algorithm. {phang}{opt convcrit(convergence_criterion)} the convergence criterion to use. Alternative choices are {bf:relative} or {bf:square}. The default is {bf:relative}. {marker examples}{...} {title:Examples} {hline} {pstd}Setup{p_end} {phang2}{cmd:. sysuse workout2, clear}{p_end} {pstd}Model estimation{p_end} {phang2}{cmd:. plssem (Attractive > face sexy) (Appearance > body appear attract) (Muscle > muscle strength endur) (Weight > lweight calories cweight), structural(Appearance Attractive, Muscle Appearance, Weight Appearance)}{p_end} {pstd}Inner model graph{p_end} {phang2}{cmd:. plssemplot, innermodel}{p_end} {pstd}Outer weights evolution{p_end} {phang2}{cmd:. plssemplot, outerweights}{p_end} {pstd}Direct, indirect and total effects graph{p_end} {phang2}{cmd:. estat total, plot}{p_end} {pstd}Multicollinearity assessment{p_end} {phang2}{cmd:. estat vif}{p_end} {pstd}Multigroup analysis using bootstrap{p_end} {phang2}{cmd:. plssem (Attractive > face sexy) (Appearance > body appear attract) (Weight > lweight calories cweight), structural(Appearance Attractive, Weight Appearance) group(women, method(bootstrap) reps(50) plot)}{p_end} {hline} {pstd}Setup{p_end} {phang2}{cmd:. sysuse ecsimobi, clear}{p_end} {pstd}Model estimation{p_end} {phang2}{cmd:. plssem (Expectation > CUEX1-CUEX3) (Satisfaction > CUSA1-CUSA3) (Complaints > CUSCO) (Loyalty > CUSL1-CUSL3) (Image > IMAG1-IMAG5) (Quality > PERQ1-PERQ7) (Value > PERV1-PERV2), }{p_end} {p 12 12 2}{cmd: structural(Expectation Image, Quality Expectation, Value Expectation Quality, Satisfaction Value Quality Image Expectation, Complaints Satisfaction, Loyalty Complaints Satisfaction}{p_end} {p 12 12 2}{cmd: Image) wscheme(path) digits(4) correlate(mv lv cross, cutoff(.3))}{p_end} {pstd}Inner model graph{p_end} {phang2}{cmd:. plssemplot, innermodel}{p_end} {pstd}Outer weights evolution{p_end} {phang2}{cmd:. plssemplot, outerweights}{p_end} {pstd}Plot of loadings{p_end} {phang2}{cmd:. plssemplot, loadings}{p_end} {hline} {pstd}Setup{p_end} {phang2}{cmd:. matrix m = (2, 5, 3)}{p_end} {phang2}{cmd:. matrix sd = (.5, 1, 2)}{p_end} {phang2}{cmd:. matrix C = (1, .3, 1, .1, .5, 1)}{p_end} {phang2}{cmd:. drawnorm x1 x2 x3, n(300) means(m) sds(sd) corr(C) cstorage(lower) clear}{p_end} {p 12 12 2}{cmd: seed(101)}{p_end} {pstd}Model specification{p_end} {phang2}{cmd:. matrix M = (1, 0 \ 1, 0 \ 0, 1)}{p_end} {phang2}{cmd:. matrix rownames M = x1 x2 x3}{p_end} {phang2}{cmd:. matrix colnames M = y1 y2}{p_end} {phang2}{cmd:. matrix coleq M = Reflective Formative}{p_end} {phang2}{cmd:. matrix S = (0, 1 \ 0, 0)}{p_end} {phang2}{cmd:. matrix rownames S = y1 y2}{p_end} {phang2}{cmd:. matrix colnames S = y1 y2}{p_end} {pstd}Model estimation{p_end} {phang2}{cmd:. plssemmat M, structural(S) wscheme(path) digits(4)}{p_end} {hline} {marker authors}{...} {title:Authors} {pstd} Sergio Venturini{break} Department of Management{break} Università degli Studi di Torino, Italy{break} {browse "mailto:sergio.venturini@unito.it":sergio.venturini@unito.it}{break} {pstd} Mehmet Mehmetoglu{break} Department of Psychology{break} Norwegian University of Science and Technology{break} {browse "mailto:mehmetm@svt.ntnu.no":mehmetm@svt.ntnu.no}{break} {p_end} {marker results}{...} {title:Stored results} {pstd} {cmd:plssem} stores the following in {cmd:e()}: {synoptset 24 tabbed}{...} {p2col 5 24 28 2: Scalars}{p_end} {synopt:{cmd:e(N)}}number of observations{p_end} {synopt:{cmd:e(reps)}}number of bootstrap replications{p_end} {synopt:{cmd:e(iterations)}}number of iterations to reach convergence{p_end} {synopt:{cmd:e(tolerance)}}chosen tolerance value{p_end} {synopt:{cmd:e(maxiter)}}maximum number of iterations allowed{p_end} {synopt:{cmd:e(converged)}}equal to 1 if convergence is achieved; 0 otherwise{p_end} {synoptset 24 tabbed}{...} {p2col 5 24 28 2: Macros}{p_end} {synopt:{cmd:e(cmd)}}{cmd:plssem}{p_end} {synopt:{cmd:e(cmdline)}}command as typed{p_end} {synopt:{cmd:e(estat_cmd)}}program used to implement {cmd:estat}{p_end} {synopt:{cmd:e(predict)}}program used to implement {cmd:predict}{p_end} {synopt:{cmd:e(title)}}title in estimation output{p_end} {synopt:{cmd:e(mvs)}}list of manifest variables (indicators) used{p_end} {synopt:{cmd:e(lvs)}}list of latent variables used{p_end} {synopt:{cmd:e(binarylvs)}}sublist of binary latent variables only{p_end} {synopt:{cmd:e(datasignaturevars)}}variables used in calculation of checksum {p_end} {synopt:{cmd:e(datasignature)}}the checksum{p_end} {synopt:{cmd:e(reflective)}}list of latent variables measured in a reflective way{p_end} {synopt:{cmd:e(formative)}}list of latent variables measured in a formative way{p_end} {synopt:{cmd:e(struct_eqs)}}equations defining the structural model{p_end} {synopt:{cmd:e(properties)}}choices of initialization, weighting scheme, imputation method, whether the bootstrap has been used, whether the model has a structural part, whether the {cmd:rawsum} option has been used, and whether the manifest variables have been scaled or not{p_end} {synoptset 24 tabbed}{...} {p2col 5 24 28 2: Matrices}{p_end} {synopt:{cmd:e(loadings)}}outer loadings matrix{p_end} {synopt:{cmd:e(loadings_bs)}}bootstrap-based outer loadings matrix (available only if the {cmd:boot()} option is chosen){p_end} {synopt:{cmd:e(loadings_se)}}matrix of the outer loadings standard errors{p_end} {synopt:{cmd:e(cross_loadings)}}cross loadings matrix{p_end} {synopt:{cmd:e(cross_loadings_bs)}}bootstrap-based cross loadings matrix (available only if the {cmd:boot()} option is chosen){p_end} {synopt:{cmd:e(cross_loadings_se)}}matrix of the cross loadings standard errors{p_end} {synopt:{cmd:e(adj_meas)}}adjacency matrix for the measurement (outer) model{p_end} {synopt:{cmd:e(outerweights)}}matrix of outer weights{p_end} {synopt:{cmd:e(ow_history)}}matrix of outer weights evolution{p_end} {synopt:{cmd:e(relcoef)}}matrix of reliability coefficients{p_end} {synopt:{cmd:e(sqcorr)}}matrix of squared correlations among the latent variables{p_end} {synopt:{cmd:e(ave)}}vector of average variances extracted{p_end} {synopt:{cmd:e(struct_b)}}path coefficients matrix (short form){p_end} {synopt:{cmd:e(struct_se)}}matrix of path coefficients' standard errors (short form){p_end} {synopt:{cmd:e(struct_table)}}table combining estimation results for the structural (inner) model{p_end} {synopt:{cmd:e(pathcoef)}}path coefficients matrix (extended form){p_end} {synopt:{cmd:e(pathcoef_bs)}}bootstrap-based path coefficients matrix (extended form; available only if the {cmd:boot()} option is chosen){p_end} {synopt:{cmd:e(adj_struct)}}adjacency matrix for the structural (inner) model{p_end} {synopt:{cmd:e(total_effects)}}matrix of the structural (inner) model total effects{p_end} {synopt:{cmd:e(rsquared)}}vector of r-squared for reflective latent variables{p_end} {synopt:{cmd:e(redundancy)}}vector of redundancies{p_end} {synopt:{cmd:e(assessment)}}vector of model quality indices{p_end} {synopt:{cmd:e(reldiff)}}vector containing the history of weights' relative differences{p_end} {synopt:{cmd:e(imputed_data)}}matrix of imputed indicators; available only if the the {cmd:missing} option has been used{p_end} {synopt:{cmd:e(R)}}latent variable correlation matrix{p_end} {synoptset 24 tabbed}{...} {p2col 5 24 28 2: Functions}{p_end} {synopt:{cmd:e(sample)}}marks estimation sample{p_end} {p2colreset}{...} {marker references}{...} {title:References} {marker BaronKenny1986}{...} {phang} Baron, R. M., and Kenny, D. A. 1986. The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations. Journal of Personality and Social Psychology, 51, 1173-1182. {marker Hairetal2017}{...} {phang} Hair, J. F., Hult, G. T. M., Ringle, C. M., and Sarstedt, M. 2017. {it:A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM)}. Second edition. Sage. {marker Lohmoller1989}{...} {phang} Lohmöller, J. B. 1989. {it:Latent Variable Path Modeling with Partial Least Squares}. Heidelberg: Physica. {marker Sobel1982}{...} {phang} Sobel, M. N. 1982. Asymptotic Confidence Intervals for Indirect Effects in Structural Equations Models. In Leinhart, S. (ed.), {it:Sociological Methodology}, pp. 290-312. Jossey-Bass. {marker VanderWeele2015}{...} {phang} VanderWeele, T. J. 2015. {it:Explanation in Causal Inference}. Oxford University Press. {marker Wold1975}{...} {phang} Wold, H. O. A. 1975. Path Models with Latent Variables: The NIPALS Approach. In Blalock, H. M., Aganbegian, A., Borodkin, F. M., Boudon, R., and Cappecchi, V. (ed.), {it:Quantitative Sociology} (pp. 307-359). New York: Academic Press. {p_end}