Title
reffadjustsim -- random effects adjustment: simulating from the distribution of random effect variances and covariances
reffadjustsim depvar indepvars , eqn(string) [options]
options Description ------------------------------------------------------------------------- eqn(string) name of the equation the adjusted coefficients are to be extracted from centileopts(string) options passed to centile level(#) set confidence level; default is level(95) mcmcsum use chains from mcmcsum n(#) # of observations to simulate; default is 10,000 post post estimation results replace replace beta_indepvar if variable exists in dataset saving(filename [, replace]) save simulated observations to filename sf(numlist) scaling factors corresponding to each coefficient seed(#) seed for random-number generator statadrawnorm use Stata's drawnorm for Wald type CIs sublevel(#) sublevel of a repeated group variable waldtype report means & Wald-type confidence intervals -------------------------------------------------------------------------
reffadjustsim is a postestimation command to perform adjustment of random effects estimates. It runs with estimates from runmlwin or chains from runmlwin by mcmcsum (Leckie and Charlton, 2011), xtmixed, xtmelogit, and xtmepoisson.
reffadjustsim generates the specified number of observations of the variances and covariances of the random effects from the corresponding multivariate normal distribution. Alternatively, values are used from the returned chains from Bayesian estimation in MLwiN by mcmcsum, getchains. For each observation the adjusted coefficient/s are estimated as described by Fisher (1925, chapter 5, section 29). The approach is described in more detail in Macdonald-Wallis et al. (2011, submitted). Further details are given in reffadjust. The covariates (indepvars) can be specified in any order.
An alternative approach is to use the accompanying reffadjust4nlcom command to generate the expression for the adjusted coefficient, and pass that to nlcom to estimate a delta-method confidence interval.
See Buis (2011) for a description of how to retrieve random effect variance and correlations from xt commands.
Note on multivariate response models: Covariates (indepvars) in runmlwin estimates from multivariate response models have suffix _#, where # is the corresponding equation number. For example, from equation 1 cons would be referred to as cons_1.
Note on shrinkage estimates: reffadjustsim uses the estimated random effect variances and covariances from the model. It does not use the shrinkage estimates of these parameters, i.e. the variances and covariances of the residuals (see chapter 3 of the MLwiN User Manual).
Warning about waldtype option: By default reffadjustsim reports coefficients as medians with 2.5 and 97.5 percentiles. Coefficients can be reported as means with Wald-type confidence intervals with the waldtype option. Means and Wald-type confidence intervals may not be accurate. It is always advised to compare results with the default output and if possible also with the delta-method confidence interval via reffadjust4nlcom and nlcom. In general, P-values associated with these estimates may be affected by boundary value issues in the estimation of the random effect variances and covariances (see Distribution theory for likelihood ratio tests subsection in [XT] xtmixed, Gutierrez et al. 2001).
Interpretation of coefficients: The coefficients estimated by reffadjustsim represent the mean difference in the random effect entered as the dependent variable, which is associated with a unit increase in each of the random effects entered as independent variables, whilst adjusting for the other random effects included as independent variables.
Parameters estimated with zero variance: Sometimes a multilevel model can be declared as converged but some parameters (especially random effect variances and covariances) may not have a standard error. A warning is issued that resulting confidence intervals may not be valid in this case.
eqn(string) the name of the equation the coefficients are to be extracted from. For example a two level random effects model from runmlwin will typically return four equations (FP1, FP2, RP1, RP2).
centileopts(string) options passed to centile, note you may not specify the centile(#) option here. The reported percentiles can be changed through the level(#) option.
level(#); see [R] estimation options.
mcmcsum calculates centiles from the Bayesian posterior distribution of the coefficients using chains imported by: mcmcsum, getchains. Note your runmlwin model must have been fitted by MCMC. Options: seed, n, statadrawnorm, waldtype (and post) are not required/allowed with mcmcsum. Only allowed with runmlwin estimates.
n(#) specifies the number of observations to be simulated. The default is 10,000 and is not allowed to be less than 10. Not allowed with mcmcsum, where n is taken as the number of observations in the dataset imported by mcmcsum, getchains.
post causes reffadjustsim to behave like a Stata estimation (eclass) command. May only be specified with waldtype. When post is specified, reffadjustsim will post the vector of adjusted estimates and its estimated variance-covariance matrix to e(). Thus you could, after posting, treat the estimation results in the same way as you would treat results from other Stata estimation commands. For example, after posting, you could redisplay the results by typing reffadjustsim without any arguments, or use test to perform simultaneous tests of hypotheses on linear combinations of the estimates.
Specifying post clears out the previous estimation results, which can be recovered only by refitting the original model or by storing the estimation results before running reffadjustsim and then restoring them; see [R] estimates store.
replace overwrites variables named beta_indepvar if they exist in the dataset. Only valid with mcmcsum.
saving(filename [, replace]) saves the simulated realisations of the random effect variances and covariances to filename, optionally replacing filename if it exists.
seed(#) specifies the initial value of the random-number seed. The default is the current random-number seed. Specifying seed(#) is the same as typing set seed # before issuing the command; see set_seed. Not allowed with mcmcsum.
sf(numlist) a numlist of scaling factors. If specified each number corresponds to the respective covariate (indepvar), i.e. first number is the scaling factor for the first coefficient and so on. If specified the numlist must be the same length as the number of covariates. To scale the coefficient by 2 times the dependent variable (Y), for example, then with one covariate (X) specify sf(2). To scale the coefficient by 2 times the covariate specify sf(.5) because in this case you scale by 2/2^2 since a regression coefficient is given by: cov(X,Y)/var(X).
statadrawnorm use Stata's drawnorm to simulate the adjusted coefficients. For speed by default reffadjustsim uses its own Mata implementation; see drawnorm. Not allowed with mcmcsum.
sublevel(#) the sublevel of a repeated group variable. For example, in the following model
. xtmixed f_p || school: z1 z2, nocons cov(id) || school: z3 z4, nocons cov(un) options
z1 and z2 are at sublevel 1 and z3 and z4 are at sublevel 2 of the school group variable. Only valid with xtmixed, xtmelogit, and xtmepoisson.
waldtype report coefficients as means with Wald-type confidence intervals. By default reffadjustsim reports coefficients as medians and centiles of the simulated coefficients. This option can produce inaccurate results, as per the warning above please compare with the default output. Not allowed with mcmcsum.
--------------------------------------------------------------------------- Examples 1 & 2 assume the path to the MLwiN executable is set in global MLwiN_path; see runmlwin
Example 1: Two level continuous response model (see page 59 of the MLwiN User Manual)
. * read in data . use http://www.bristol.ac.uk/cmm/media/runmlwin/tutorial, clear
. * fit model using MLwiN via runmlwin . runmlwin normexam cons standlrt, level1(student: cons) level2(school: cons standlrt) batch
. * report coefficient as median with 2.5 & 97.5 percentiles . reffadjustsim cons standlrt, eqn(RP2) seed(12345)
. * report coefficient as mean & Wald-type confidence interval . * Warning: mean and Wald-type confidence are inaccurate in this example . reffadjustsim cons standlrt, eqn(RP2) seed(12345) waldtype
. * compare with delta-method confidence interval (first refit model) . runmlwin normexam cons standlrt, level1(student: cons) level2(school: cons standlrt) batch . reffadjust4nlcom cons standlrt, eqn(RP2) . nlcom `r(beta_standlrt)'
. * compare with Bayesian posterior distribution . runmlwin normexam cons standlrt, level1(student: cons) level2(school: cons standlrt) batch mcmc(on) initsprevious seed(121211) . mcmcsum, getchains . reffadjustsim cons standlrt, eqn(RP2) mcmcsum
Example 2: Multivariate response model (see page 214 of the MLwiN User Manual)
. * read in data . use http://www.bristol.ac.uk/cmm/media/runmlwin/gcsemv1, clear
. * fit model using MLwiN via runmlwin . runmlwin (written cons female, eq(1)) (csework cons female, eq(2)), level1(student: (cons, eq(1)) (cons, eq(2))) level2(school: (cons, eq(1)) (cons, eq(2))) batch
. * report coefficient as median with 2.5 and 97.5 percentiles . reffadjustsim cons_1 cons_2, eqn(RP2) seed(12345) . * report coefficient as mean with Wald-type confidence interval . reffadjustsim cons_1 cons_2, eqn(RP2) seed(12345) waldtype
. * compare with delta-method confidence interval (first refit model) . runmlwin (written cons female, eq(1)) (csework cons female, eq(2)), level1(student: (cons, eq(1)) (cons, eq(2))) level2(school: (cons, eq(1)) (cons, eq(2))) batch . reffadjust4nlcom cons_1 cons_2, eqn(RP2) . nlcom `r(beta_cons_2)'
. * compare with Bayesian posterior distribution . runmlwin (written cons female, eq(1)) (csework cons female, eq(2)), level1(student: (cons, eq(1)) (cons, eq(2))) level2(school: (cons, eq(1)) (cons, eq(2))) batch mcmc(on) initsprevious seed(121211) . mcmcsum, getchains . reffadjustsim cons_1 cons_2, eqn(RP2) mcmcsum
Example 3: based on xtmixed helpfile . webuse nlswork, clear . xtmixed ln_w grade age c.age#c.age ttl_exp tenure c.tenure#c.tenure || idcode: tenure, cov(uns) var . reffadjustsim _cons tenure, eqn(idcode) seed(12345)
Example 4: based on xtmelogit helpfile . webuse bangladesh, clear . xtmelogit c_use urban age child* || district: urban, cov(uns) var . reffadjustsim _cons urban, eqn(district) seed(12345)
Example 5: based on xtmepoisson helpfile . webuse epilepsy, clear . xtmepoisson seizures treat lbas lbas_trt lage visit || subject: visit, cov(uns) var intpoints(9) . reffadjustsim _cons visit, eqn(subject) seed(12345)
Example 6: repeated group variable . webuse nlswork, clear . xtmixed ln_w grade age || idcode: tenure union, cov(uns) || idcode: race, cov(uns) var . reffadjustsim tenure union, eqn(idcode) sub(1) seed(12345) . reffadjustsim race _cons, eqn(idcode) sub(2) seed(12345) ---------------------------------------------------------------------------
reffadjustsim saves the following in r():
Scalars r(N) number of simulated observations
If waldtype is not specified, reffadjustsim saves the following for each indepvar in r():
Scalars r(n_cent_indepvar) number of centiles requested (usually 2) r(c_1_indepvar) value of 1st centile for indepvar r(lb_1_indepvar) 1st centile lower confidence bound r(ub_1_indepvar) 1st centile upper confidence bound r(c_2_indepvar) value of 2nd centile for indepvar r(lb_2_indepvar) 2nd centile lower confidence bound r(ub_2_indepvar) 2nd centile upper confidence bound r(med_indepvar) median of indepvar r(lb_med_indepvar) median lower confidence bound r(ub_med_indepvar) median upper confidence bound
If waldtype is specified, reffadjustsim saves the following in r():
Matrices r(b) vector of adjusted coefficients r(V) estimated variance-covariance matrix of the adjusted coefficients
If waldtype and post are specified, reffadjustsim also saves the following in e():
Scalars e(N) number of simulated observations
Macros e(cmd) reffadjustsim e(depvar) name of the dependent variable e(properties) b V
Matrices e(b) vector of adjusted coefficients e(V) estimated variance-covariance matrix of the adjusted coefficients
Buis ML. 2011. Stata tip 97: Getting at rho's and sigma's. The Stata Journal. 11(2) 315-317.
Fisher RA. 1925. Chapter 5: Tests of significance of means, differences of means, and regression coefficients, Section 29: Regression with several independent variates in Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh.
Gutierrez RG, Carter S, Drukker DM. 2001. sg160: On boundary-value likelihood ratio tests. Stata Technical Bulletin. 60. 15-18.
Leckie G, Charlton C. 2011. runmlwin: Stata module for fitting multilevel models in the MLwiN software package. Centre for Multilevel Modelling, University of Bristol, UK. http://www.bristol.ac.uk/cmm/software/runmlwin/
Macdonald-Wallis C, Lawlor DA, Palmer TM, Tilling K. 2011 (submitted). Multivariate multilevel spline models for parallel growth processes: application to weight and mean arterial pressure in pregnancy. Statistics in Medicine.
Rasbash J, Charlton C, Browne WJ, Healy M, Cameron B. 2009. MLwiN version 2.1. Centre for Multilevel Modelling, University of Bristol, UK. http://www.bristol.ac.uk/cmm/software/mlwin.
Rasbash J, Steele F, Browne WJ, Goldstein H. 2009. A user's guide to MLwiN, v2.10. Centre for Multilevel Modelling, University of Bristol, UK. http://www.bristol.ac.uk/cmm/software/mlwin/download/manuals.html.
Tom Palmer, MRC Centre for Causal Analyses in Translational Epidemiology, School of Social and Community Medicine, University of Bristol, UK. tom.palmer@bristol.ac.uk.
Corrie Macdonald-Wallis, MRC Centre for Causal Analyses in Translational Epidemiology, School of Social and Community Medicine, University of Bristol, UK. c.macdonald-wallis@bristol.ac.uk.
Kate Tilling, School of Social and Community Medicine, University of Bristol, UK.
Also see
Help: reffadjust, reffadjust4nlcom, runmlwin (if installed), mcmcsum (if installed), nlcom, xtmixed, xtmelogit, xtmepoisson