help for ssm

Endogenous-Switch & Sample Selection Count, Binary & Ordinal Response Regression

ssm depvar [indepvars] [weight] [if exp] [in range] , switch(varname = varlist) family(familyname) link(linkname) [quadrature(#) from(matrix) selection noconstant adapt robust commands nolog trace]

The outcome model is specified by depvar and [indepvars], family(familyname), link(linkname), etc.

The endogenous switching equation is specified by switch(varname = varlist), where varname is the name of the endogenous dummy variable and varlist is a set of explanatory variables. Endogenous switching models are the default specification. Sample selection models are obtained if the selection option is used.

familyname is one of

binomial | poisson

linkname is one of

log | logit | probit | ologit | oprobit

fweights and pweights are allowed; see help weights.

ssm shares the features of all estimation commands; see help estcom.


ssm is a wrapper for gllamm to estimate Endogenous-Switch & Sample Selection Count, Binary & Ordinal Response Regression by maximum likelihood using adaptive quadrature. ssm interprets a simple syntax, prepares the data for gllamm, calls gllamm and produces tailor-made output. The commands option causes ssm to print out all data manipulation commands and the gllamm command. gllamm itself should be used to extend the model and for prediction and simulation using gllapred or gllasim. The Endogenous-Switch (Sample Selection) model comprises two submodels: the outcome model and the Switch (Selection) model.

The outcome model is a generalized linear model that contains an endogenous dummy variable among its observed covariates and an unobserved (latent) random term.

The Switch model is a binary variable model that determines the outcome of the endogenous dummy included in the outcome model. The Switch model contains an unobserved random (latent) term that is correlated with the unobserved random term included in the outcome model.

The Selection model is obtained when the outcome variable is only observed if a particular condition is met (selection = 1) and the selection dummy does not enter the outcome model.


family(familyname) specifies the distribution of depvar; family(binomial) is the default.

link(linkname) specifies the link function; the default is the canonical link for the family() specified.

selection specifies that a sample selection model be estimated instead of the default endogenous switching model.

quadrature(#) specifies the number of quadrature points to be used.

noconstant specifies that the linear predictor has no intercept term, thus forcing it through the origin on the scale defined by the link function.

adapt specifies that adaptive quadrature be used instead of the default ordinary quadrature.

robust specifies that the Huber/White/sandwich estimator of variance is to be used. If you specify pweights, robust is implied.

commands displays the commands necessary to prepare the data and estimate the model in gllamm instead of estimating the model. These commands can be copied into a do-file and should work without further editing. Note that the data will be changed by the do-file!

nolog suppresses the iteration log.

trace requests that the estimated coefficient vector be printed at each iteration. In addition, all the output produced by gllamm with the trace option is also produced.

from(matrix) specifies a matrix of starting values.
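
The options above can be combined on one command line. The lines below are only a syntax sketch: they reuse the variable names of the simulated example in this help file (y, x1-x4, sel) and assume those variables are in memory.

    * sample selection probit, 16-point adaptive quadrature, no iteration log
    . ssm y x1 x2, switch(sel = x1 x2 x3 x4) family(binomial) link(probit) quadrature(16) selection adapt nolog

    * same model with the Huber/White/sandwich variance estimator
    . ssm y x1 x2, switch(sel = x1 x2 x3 x4) family(binomial) link(probit) quadrature(16) selection adapt robust

    * print the data-preparation and gllamm commands instead of estimating
    . ssm y x1 x2, switch(sel = x1 x2 x3 x4) family(binomial) link(probit) quadrature(16) selection adapt commands

Because the do-file produced by commands changes the data, it is probably safest to run it on a copy of the dataset.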


The allowed link functions are

    Link function            ssm option
    ----------------------------------------
    log                      link(log)
    logit                    link(logit)
    probit                   link(probit)
    ordinal logit            link(ologit)
    ordinal probit           link(oprobit)

The allowed distribution families are

    Family                   ssm option
    ----------------------------------------
    Bernoulli/binomial       family(binomial)
    Poisson                  family(poisson)

If you specify family() but not link(), you obtain the canonical link for the family:

    family()                 default link()
    --------------------------------------
    family(binomial)         link(logit)
    family(poisson)          link(log)
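
As a sketch of the default-link rule, the two calls below should request the same endogenous switching model for a count outcome. The variable names (visits, treat, x1-x3) are hypothetical and do not come from this help file.

    * hypothetical variables: visits (count outcome), treat (endogenous dummy)
    * link() omitted, so the canonical link(log) is used with family(poisson)
    . ssm visits x1 x2, switch(treat = x1 x2 x3) family(poisson)

    * the same model with the link written out
    . ssm visits x1 x2, switch(treat = x1 x2 x3) family(poisson) link(log)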


    * simulate data
    set seed 12345678
    set obs 3500
    local lambda = 0.4
    gen double ve = invnormal(uniform())
    gen double zeta = invnormal(uniform())
    gen double tau = invnormal(uniform())
    gen double x1 = invnormal(uniform())
    gen double x2 = invnormal(uniform())
    gen double x3 = invnormal(uniform())
    gen double x4 = invnormal(uniform())
    replace x3 = (x3>0)
    replace x4 = (x4>0)
    gen double selstar = 0.58 + 0.93*x1 + 0.45*x2 - 0.64*x3 + 0.6*x4 + ///
        (ve + zeta)/sqrt(2)
    gen sel = (selstar>0)
    gen double ystar = 0.17 + 0.30*x1 + 0.11*x2 + ///
        (`lambda'*ve + tau)/sqrt(1+`lambda'^2)
    gen y = (ystar>0)
    replace y = . if sel==0

    * estimate model
    . ssm y x1 x2, s(sel = x1 x2 x3 x4) q(16) family(binom) link(probit) sel adapt
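
As a rough check on this example, the correlation between the selection and outcome error terms implied by the simulation can be worked out by hand. This is a sketch based only on the data-generating code above, assuming ve, zeta and tau are independent standard normal draws. Both composite errors, (ve + zeta)/sqrt(2) and (lambda*ve + tau)/sqrt(1 + lambda^2), then have unit variance, so their correlation equals their covariance, lambda/sqrt(2*(1 + lambda^2)).

    * correlation between the two error terms implied by the simulation
    . local lambda = 0.4
    . display `lambda'/sqrt(2*(1 + `lambda'^2))

With lambda = 0.4 this is about 0.26, so a correlation estimated from this simulated sample would be expected to lie near that value, up to sampling error.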




Alfonso Miranda (A.Miranda@econ.keele.ac.uk) & Sophia Rabe-Hesketh (sophiarh@berkeley.edu).

References (available from the authors)

Miranda, A. and Rabe-Hesketh, S. (2006). Maximum likelihood estimation of endogenous switching and sample selection models for binary, count, and ordinal variables. The Stata Journal 6 (3), 285-308.

Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2003). Maximum likelihood estimation of generalized linear models with covariate measurement error. The Stata Journal 3, 386-411.

Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2005). Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics 128 (2), 301-323.

Rabe-Hesketh, S., Pickles, A. and Skrondal, A. (2003). Correcting for covariate measurement error in logistic regression using nonparametric maximum likelihood estimation. Statistical Modelling 3, 215-232.

Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2002). Reliable estimation of generalized linear mixed models using adaptive quadrature. The Stata Journal 2 (1), 1-21.

Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2004). GLLAMM Manual. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 160.

Also see

Manual: [U] 23 Estimation and post-estimation commands, [U] 29 Overview of Stata estimation commands

Online: help for cme; gllamm, gllapred, gllasim; estcom, postest