------------------------------------------------------------------------------- help for mcmcmixed -------------------------------------------------------------------------------

Syntax

mcmcmixed depvar [fe_equation] [|| re_equation] , saving(filename) d0(#) delta0(name) [options]

where the syntax of fe_equation is

[indepvars] [if] [in] [weight] [, noconstant]

and the syntax of re_equation is

levelvarlist: [varlist] [, noconstant]

options              Description
-------------------------------------------------------------------------
Model
  d0(#)              prior variance of regression error
  delta0(name)       scalar or matrix specifying prior variance of random
                       effects; relevant only if re_equation is specified
  noconstant         suppress constant term

Results
  saving(filename)   filename where results should be stored
  replace            overwrite existing filename

Markov chain
  iterate(#)         number of iterations in chain; default is iterate(100)
  seed(#)            random number generator seed; default is seed(12345)
  nolog              suppress iteration log
-------------------------------------------------------------------------

aweights and fweights are allowed; see weight, and see methods for an important note on how aweights are interpreted.

Description

mcmcmixed uses Markov chain Monte Carlo (MCMC) to sample from the posterior distribution of a normal linear mixed model. If re_equation is not specified, mcmcmixed is identical to mcmcreg, which estimates a normal linear regression. If re_equation is specified, the model allows the coefficients to vary across groups defined by levelvarlist.

mcmcmixed produces a file of draws from the posterior distribution of the model parameters. Each observation in the file corresponds to an iteration of the sampler; each variable represents a different scalar parameter. These variables are named as follows:

    beta_*            Coefficient on independent variable *.
    beta__cons        Intercept (omitted if noconstant is specified in
                        fe_equation).
    sigma2            Variance of regression error.
    theta`i'_*        Coefficient on independent variable * for group `i' of
                        levelvarlist.
    theta`i'__cons    Intercept for group `i' of levelvarlist (omitted if
                        noconstant is specified in re_equation).
    Sigma_`j'_`j'     Variance across groups of the `j'th coefficient in
                        re_equation (where the constant, if included, is the
                        final coefficient).
    Sigma_`j'_`k'     Covariance across groups of the `j'th and `k'th
                        coefficients in re_equation.

The variable iter in the output file keeps track of iterations; iter=0 is the initial condition used to start the chain.

Right-hand-side variables can appear in fe_equation, re_equation, or both. If a variable appears only in fe_equation, it has the same coefficient for all groups, and this coefficient is reported in beta_*. If a variable appears in both fe_equation and re_equation, it has a different coefficient for each group; the mean across groups is reported in beta_*, and the group-specific deviation from the mean is reported in theta`i'_*. If a variable appears only in re_equation, it has a different coefficient for each group, reported in theta`i'_*; the mean across groups is assumed to be zero.
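As a minimal sketch of how the two sets of draws combine for a variable that appears in both equations: assuming the second example under Examples (weight in both fe_equation and re_equation) has been run and saved to mcmcmixed_auto2.dta, group 1's coefficient on weight at each iteration is the mean plus that group's deviation. The variable name total_weight_1 is illustrative only.

```stata
* Load the saved draws from a model with weight in both equations
use mcmcmixed_auto2.dta, clear

* Group 1's coefficient on weight = mean across groups + group-specific deviation
generate double total_weight_1 = beta_weight + theta1_weight
summarize beta_weight theta1_weight total_weight_1
```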

The group indexes `i' in theta`i'_* are obtained from egen i=group(levelvarlist).
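To see which group each index refers to, the same egen call can be repeated on the estimation data. The sketch below uses the manuf variable from the Examples section and assumes every observation is in the estimation sample; if you restrict the sample with if or in, or if levelvarlist has missing values, the numbering used by mcmcmixed may differ. The variable name first is illustrative.

```stata
sysuse auto, clear
generate manuf = cond(strpos(make," ")>0, substr(make,1,strpos(make," ")-1), make)

* Rebuild the group index with egen group(), as described above
egen i = group(manuf)

* Flag one observation per group and list index i next to its manufacturer
egen byte first = tag(i)
sort i
list i manuf if first, noobs
```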

Options

        +-------+
    ----+ Model +------------------------------------------------------------

d0(#) sets the prior variance of the regression errors. Because results are potentially sensitive to this prior, you are required to specify it yourself; there is no default value.

delta0(name) sets the prior variance-covariance matrix of the group-specific coefficients. Because results are potentially sensitive to this prior, you are required to specify it yourself; there is no default value. delta0 must be the name of a scalar, vector or matrix containing the prior. If delta0 is a scalar, the prior is delta0*I(Nz), where Nz is the number of group-specific coefficients (including the intercept, unless noconstant is specified in re_equation); delta0 must be positive. If delta0 is an Nzx1 or 1xNz vector, the prior is diag(delta0); all elements of delta0 must be positive. If delta0 is an NzxNz matrix, the prior is delta0; this matrix must be symmetric and positive definite. It is an error for delta0 to be anything except a positive scalar; an Nzx1 or 1xNz vector containing only positive numbers; or an NzxNz symmetric, positive-definite matrix. The order of elements in delta0 is the same as the order of the varlist specified in re_equation, with the intercept last if there is an intercept.

noconstant; see [R] estimation options. If noconstant is specified, there must be at least one independent variable.
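The three accepted forms of delta0 can be sketched as follows for a re_equation with one variable plus an intercept (Nz = 2). The names d_s, d_v, d_m, X, and lambda are illustrative, and the numeric values are arbitrary rather than recommended priors. The last two lines are one way to check positive definiteness before calling mcmcmixed: a symmetric matrix is positive definite exactly when all of its eigenvalues are positive.

```stata
* Scalar: the prior is 0.01*I(Nz)
scalar d_s = 0.01

* 1 x Nz vector (variable first, intercept last): the prior is diag(1, 0.01)
matrix d_v = (1, 0.01)

* Nz x Nz matrix: must be symmetric and positive definite
matrix d_m = (1, 0.005 \ 0.005, 0.01)

* Inspect the eigenvalues of d_m; all should be positive
matrix symeigen X lambda = d_m
matrix list lambda
```

Any one of these names would then be passed as delta0(d_s), delta0(d_v), or delta0(d_m).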

        +---------+
    ----+ Results +----------------------------------------------------------

saving(filename) designates the location where the results should be saved.

replace specifies that you want to overwrite filename if it already exists.

        +--------------+
    ----+ Markov chain +-----------------------------------------------------

iterate(#) specifies the number of iterations to run the chain; the default is iterate(100).

seed(#) sets the random number generator seed. Random numbers are used to initialize the sampler; thus, multiple independent chains can be obtained by running mcmcmixed with different values of seed.

nolog suppresses printing of an iteration log.

Methods

mcmcmixed uses the Gibbs sampler. Standard uninformative conjugate priors are used: the prior for the regression coefficients is uninformative, the prior for the error variance sigma^2 is InverseGamma(1/2, d0/2), and the prior for the variance-covariance matrix of group-specific coefficients Sigma is InverseWishart(1+Ng,delta0), where Ng is the number of groups. See Chib (2001, algorithm 16) for a textbook exposition.

The sampler is initialized with guesses for sigma^2 and Sigma obtained as follows. Let y be the dependent variable, let x be the independent variables in fe_equation, and let z be the independent variables in re_equation. Let beta be the estimated coefficient vector in a frequentist pooled least squares regression of y on x. For each group i, let theta_i be the posterior mean of the coefficient vector in a Bayesian least squares regression of y-x*beta on z using the data from group i, where the prior for theta_i is N(0,delta0). The initial guess for the inverse of Sigma is drawn from Wishart(1+Ng,inv(inv(delta0)+S*Ng)), where S is the observed variance-covariance matrix of the theta_i's. The initial guess for sigma^2 is drawn from InverseGamma((1+df)/2,(d0+SSR)/2), where df is the number of observations (or the sum of weights if you use fweights) and SSR is the sum of squares of y-x*beta-z*theta_i.

aweights are interpreted as scaling the variances of the regression error and all of the random effects. That is, if you specify aweights, they apply to both fe_equation and re_equation. A future version may provide the ability to specify aweights separately for the two equations.

Examples

Manufacturer-specific intercepts:

    . scalar mydelta0=0.01
    . sysuse auto
    . generate manuf=cond(strpos(make," ")>0,substr(make,1,strpos(make," ")-1),make)
    . mcmcmixed mpg weight foreign || manuf:, saving(mcmcmixed_auto1.dta) d0(0.01) delta0(mydelta0)
    . use mcmcmixed_auto1.dta, clear
    . list in 1/5
    . summarize *

Manufacturer-specific intercept and manufacturer-specific coefficient on weight, same prior variance on each manufacturer-specific parameter:

    . scalar mydelta0=0.01
    . sysuse auto
    . generate manuf=cond(strpos(make," ")>0,substr(make,1,strpos(make," ")-1),make)
    . mcmcmixed mpg weight foreign || manuf:weight, saving(mcmcmixed_auto2.dta) d0(0.01) delta0(mydelta0)
    . use mcmcmixed_auto2.dta, clear
    . list in 1/5
    . summarize *

Manufacturer-specific intercept and manufacturer-specific coefficient on weight, different prior variance on each manufacturer-specific parameter:

    . matrix mydelta0=(1,0.01)
    . sysuse auto
    . generate manuf=cond(strpos(make," ")>0,substr(make,1,strpos(make," ")-1),make)
    . mcmcmixed mpg weight foreign || manuf:weight, saving(mcmcmixed_auto3.dta) d0(0.01) delta0(mydelta0)
    . use mcmcmixed_auto3.dta, clear
    . list in 1/5
    . summarize *

When estimating models by MCMC, it is good practice to check for convergence by running multiple independent chains. mcmcmixed will generate different chains when run repeatedly with different values of seed(#); see mcmcreg##examples for an example of how to run multiple chains, and see mcmcconverge, available in the SSC package mcmcstats, for useful statistics for checking convergence once you have run multiple chains.
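A minimal sketch of running several chains with different seeds, written for a do-file (the /// line continuation does not work at the interactive prompt). The output file names chain1.dta through chain3.dta, the choice of three chains, and iterate(1000) are illustrative assumptions, not recommendations. The data are reloaded inside the loop as a precaution, since this help file does not state whether mcmcmixed leaves the data in memory unchanged.

```stata
scalar mydelta0 = 0.01

forvalues c = 1/3 {
    * Reload and prepare the data for each chain
    sysuse auto, clear
    generate manuf = cond(strpos(make," ")>0, substr(make,1,strpos(make," ")-1), make)

    * Same model and priors each time; only the seed and output file change
    mcmcmixed mpg weight foreign || manuf:, saving(chain`c'.dta) replace ///
        d0(0.01) delta0(mydelta0) iterate(1000) seed(`c') nolog
}
```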

It is also good practice to drop early iterations, before convergence was achieved, when describing the posterior distribution. See mcmcsummarize, available in the SSC package mcmcstats, for a convenient way to describe the posterior distribution.
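If mcmcsummarize is not installed, early iterations can also be dropped by hand using the iter variable in the output file. The sketch below assumes the first example above has been run with the default iterate(100); the cutoff of 50 is an arbitrary illustration, and an appropriate burn-in length depends on your own convergence checks.

```stata
use mcmcmixed_auto1.dta, clear

* Drop the initial condition (iter=0) and the first 50 iterations
drop if iter <= 50

* Posterior means and standard deviations from the retained draws
summarize beta_weight beta_foreign beta__cons sigma2

* Posterior median and a 95% interval for the coefficient on weight
centile beta_weight, centile(2.5 50 97.5)
```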

Saved results

mcmcmixed saves the following in e():

Scalars
    e(N)          number of observations

Macros
    e(cmd)        mcmcreg
    e(cmdline)    command as typed
    e(depvar)     name of dependent variable
    e(wtype)      weight type
    e(wexp)       weight expression

Functions
    e(sample)     marks estimation sample
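A short sketch of inspecting these results. It assumes the commands are run immediately after mcmcmixed and before the file of draws is loaded with use, since e(sample) refers to the estimation data in memory.

```stata
* List everything mcmcmixed stored in e()
ereturn list

* Number of observations used, reported two ways
display e(N)
count if e(sample)
```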

Author

Sam Schulhofer-Wohl, Federal Reserve Bank of Minneapolis, sschulh1.work@gmail.com. The views expressed herein are those of the author and not necessarily those of the Federal Reserve Bank of Minneapolis or the Federal Reserve System.

Reference

Chib, Siddhartha, 2001. "Markov Chain Monte Carlo Methods: Computation and Inference." In Handbook of Econometrics, vol. 5, ed. James J. Heckman and Edward Leamer, 3569-649. Amsterdam: Elsevier.