help movestay
-------------------------------------------------------------------------------


movestay -- Maximum-likelihood estimation of endogenous switching regression model


movestay (depvar0 [=] varlist0) (depvar1 [=] varlist1) [if] [in] [weight], select(depvar_s [=] varlist_s) [options]

Syntax for predict

predict [type] newvar [if] [in] [, statistic]

statistic    Description
-------------------------------------------------------------------------
Main
  psel       the probability of being in regime 1
  xb0        fitted values for regime 0
  xb1        fitted values for regime 1
  yc0        predicted value of the dependent variable in regime 0
  yc1        predicted value of the dependent variable in regime 1
  mills0     Mills' ratio for regime 0
  mills1     Mills' ratio for regime 1
-------------------------------------------------------------------------

options               Description
-------------------------------------------------------------------------
Model
  select()            specify selection equation: dependent and independent variables
  collinear           keep collinear variables

SE/Robust
  robust              robust estimator of variance
  cluster(varname)    adjust standard errors for intragroup correlation

Reporting
  level(#)            set confidence level; default is level(95)

Max options
  maximize_options    control the maximization process
-------------------------------------------------------------------------
fweights, iweights, and pweights are allowed; see weight.


movestay uses the maximum likelihood method to estimate the endogenous switching regression model. It is implemented using the d2 evaluator to calculate the overall log likelihood together with its first and second derivatives.

movestay estimates all of the parameters in the model:

(regression equation for regime 0: y0 is depvar0, x0 is varlist0)
    y0 = x0 * b0 + e_0

(regression equation for regime 1: y1 is depvar1, x1 is varlist1)
    y1 = x1 * b1 + e_1

(selection equation: Z is varlist_s)
    y0 observed if Zg + u <= 0
    y1 observed if Zg + u > 0

where:
    e_0 ~ N(0, sigma0)
    e_1 ~ N(0, sigma1)
    u ~ N(0, 1)
    corr(e_0, u) = rho_0
    corr(e_1, u) = rho_1

Here depvar0, depvar1 and varlist0, varlist1 are the dependent variables and regressors for the underlying regression models (y0, y1 = xb), and varlist_s specifies the variables Z thought to determine which regime is observed.
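For reference, the log likelihood implied by the model above can be written out as follows. This is a sketch derived from the stated normality assumptions, not a transcription of the movestay source code. It assumes that sigma0 and sigma1 denote standard deviations. The symbols I_i (1 if regime 1 is observed, 0 otherwise), w_i (an optional weight), phi and Phi (the standard normal density and distribution function), and eta are notation introduced here for illustration.

```latex
\ln L = \sum_i w_i \Big\{
    I_i \big[ \ln \Phi(\eta_{1i}) + \ln \phi(e_{1i}/\sigma_1) - \ln \sigma_1 \big]
  + (1 - I_i) \big[ \ln\{1 - \Phi(\eta_{0i})\} + \ln \phi(e_{0i}/\sigma_0) - \ln \sigma_0 \big]
\Big\}

\eta_{ji} = \frac{Z_i g + \rho_j \, e_{ji}/\sigma_j}{\sqrt{1 - \rho_j^2}},
\qquad e_{ji} = y_{ji} - x_{ji} b_j, \qquad j = 0, 1
```

The Phi terms follow from the conditional distribution of u given e_j, which is normal with mean rho_j * e_j / sigma_j and variance 1 - rho_j^2.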


        +-------+
    ----+ Model +------------------------------------------------------------

select() specifies the selection equation. varlist_s lists the instruments that help identify the model. This option is an integral part of movestay estimation and is required. The selection equation is estimated using all exogenous variables specified in the continuous equations plus the instruments. If the model has no instrumental variables, depvar_s must still be specified. In that case the model is identified by non-linearities, and the selection equation contains all the independent variables that enter the continuous equations.
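The two ways of specifying select() can be tried on simulated data. The following is a minimal sketch, assuming movestay is installed and a Stata version that provides rnormal(); all variable names and coefficient values are hypothetical and chosen only for illustration.

```stata
* Simulate data from a switching model (all names and values are hypothetical)
clear
set seed 12345
set obs 2000

* One regressor and one instrument
generate x1 = rnormal()
generate z1 = rnormal()

* Errors: u ~ N(0,1); corr(e0,u) = 0.5 and corr(e1,u) = -0.3, unit variances
generate u  = rnormal()
generate e0 =  0.5*u + sqrt(1 - 0.5^2)*rnormal()
generate e1 = -0.3*u + sqrt(1 - 0.3^2)*rnormal()

* Regime 1 is observed when Zg + u > 0
generate byte regime = (0.5*x1 + z1 + u > 0)

* Observed outcome depends on the regime
generate y = cond(regime, 1 + 2*x1 + e1, 0.5 + x1 + e0)

* Identification with the instrument z1
movestay y x1, select(regime = z1)

* Identification through non-linearities only (no instrument)
movestay y x1, select(regime)
```

In the first call the selection equation contains x1 and z1; in the second it contains x1 only.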

collinear; see estimation options.

        +-----------+
    ----+ SE/Robust +--------------------------------------------------------

robust specifies that the Huber/White/sandwich estimator of the variance is to be used in place of the conventional MLE variance estimator. robust combined with cluster() further allows observations that are not independent within clusters (although they must be independent between clusters).

If you specify pweights, robust is implied. See [U] 23.14 Obtaining robust variance estimates.

cluster(varname) specifies that the observations are independent across groups (clusters) but not necessarily within groups. varname specifies to which group each observation belongs; e.g., cluster(personid) in data with repeated observations on individuals. cluster() affects the estimated standard errors and variance-covariance matrix of the estimators (VCE), but not the estimated coefficients. cluster() can be used with pweights to produce estimates for unstratified cluster-sampled data. Specifying cluster() implies robust.
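The variance options can be combined as in the following sketch. It assumes a dataset in memory containing the hypothetical variables named in the comments; none of these names come from a real dataset.

```stata
* Assumes hypothetical variables in memory:
*   y, x1, x2   outcome and regressors
*   regime      0/1 regime indicator
*   z1          instrument for the selection equation
*   personid    cluster identifier
*   hhweight    sampling weight

* Huber/White/sandwich standard errors
movestay y x1 x2, select(regime = z1) robust

* Standard errors adjusted for within-person correlation (implies robust)
movestay y x1 x2, select(regime = z1) cluster(personid)

* Sampling weights with clustering, for unstratified cluster-sampled data
movestay y x1 x2 [pweight = hhweight], select(regime = z1) cluster(personid)
```

The coefficient estimates in the first two calls are the same as without these options; only the standard errors and the VCE change.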

        +-----------+
    ----+ Reporting +--------------------------------------------------------

level(#); see estimation options.

        +-------------+
    ----+ Max options +------------------------------------------------------

maximize_options control the maximization process; see maximize. With the possible exception of iterate(0) and trace, you should only have to specify them if the model is unstable. The maximization uses option difficult by default. This option need not be specified.
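A few maximize options in use are sketched below. The variable names are hypothetical and a suitable dataset is assumed to be in memory; iterate(), trace, and technique() are standard Stata maximize options, and whether a given technique helps with a particular model is an empirical matter.

```stata
* Assumes hypothetical variables y, x1, x2, regime, z1 in memory

* Stop at the starting values, without iterating
movestay y x1 x2, select(regime = z1) iterate(0)

* Display the parameter vector at each iteration
movestay y x1 x2, select(regime = z1) trace

* Try a different algorithm and cap the number of iterations
movestay y x1 x2, select(regime = z1) technique(dfp) iterate(100)
```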

        +-----------------+
    ----+ predict options +-------------------------------------------------

psel calculates the probability of being in regime 1.

xb0 calculates the linear prediction for equation 0.

xb1 calculates the linear prediction for equation 1.

yc0 returns the predicted value of the dependent variable in regime 0. For example, if an earnings function is modeled for two sectors (regimes), this option predicts the wage rate in the regime 0 sector for all individuals in the sample.

yc1 returns the predicted value of the dependent variable in regime 1.

mills0 and mills1 calculate the Mills' ratios for regimes 0 and 1, respectively.
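The predict statistics can be combined after estimation, for instance to compare predicted outcomes across regimes for every observation. This is a sketch with hypothetical variable names; the new variable names (p1, yhat0, yhat1, diff, m0, m1) are arbitrary.

```stata
* Assumes hypothetical variables y, x1, x2, regime, z1 in memory
movestay y x1 x2, select(regime = z1)

* Probability of being in regime 1
predict p1, psel

* Predicted value of the dependent variable in each regime
predict yhat0, yc0
predict yhat1, yc1

* Difference between the regime 1 and regime 0 predictions
generate diff = yhat1 - yhat0
summarize p1 yhat0 yhat1 diff

* Mills' ratios for the two regimes
predict m0, mills0
predict m1, mills1
```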


To obtain full ML estimates:

Using instruments:

. movestay y1 x1 x2 x3 x4, select(regime1=z1 z2)

. movestay (y1= x1 x2 x3 x4) (y1= x1 x2 x3 x5), select(regime1=z1 z2)

Model is identified through non-linearities:

. movestay (y1= x1 x2 x3 x4) (y1= x1 x2 x3 x5), select(regime1)

To define and use each equation separately:

. global wage_eqn y x1 x2 x3 x4

. global select_eqn regime z1 z2

. movestay ($wage_eqn), select($select_eqn)

To use options:

. movestay y= x1 x2 x3 x4 if region==1 [w= hhweight], select(regime= z1 z2)

. movestay (y= x1 x2 x3 x4) if region==1, select(regime= z1 z2) tech("dfp")


To obtain predictions after estimation:

. movestay y x1 x2 x3 x4, select(regime= z1 z2)

. predict yexpected, xb1

. predict mymills1, mills1

Example from the Stata Journal:

. movestay lmo_wage age age2 edu13 edu4 edu5 reg2 reg3 reg4, select(private =m_s1 job_hold)


M. Lokshin (DECRG, The World Bank) and Z. Sajaia (Stanford University).

Also see

Online: help for regress, heckman, ml