help leebounds
-------------------------------------------------------------------------------
Title
leebounds -- Lee (2009) treatment effect bounds
Syntax
leebounds depvar treatvar [if] [in] [weight] [, options]
Outcome and treatment      Description
-------------------------------------------------------------------------
Model
  depvar                   dependent variable
  treatvar                 binary treatment indicator

options                    Description
-------------------------------------------------------------------------
Main
  select(varname)          selection indicator
  tight(varlist)           covariates for tightened bounds
  cieffect                 compute confidence interval for treatment effect

SE/Bootstrap
  vce(analytic|bootstrap)  compute analytic or bootstrapped standard errors;
                           vce(analytic) is the default

Reporting
  level(#)                 set confidence level; default is level(95)
-------------------------------------------------------------------------
pweights (default), fweights, and iweights are allowed; aweights are not allowed; see weight.
Observations with negative weight are skipped for any weight type.
bootstrap is allowed; by and svy are not allowed; see prefix.
Description
leebounds computes treatment effect bounds for samples with non-random sample selection/attrition, as proposed by Lee (2009). The lower and upper bounds correspond to extreme assumptions about the missing information that are consistent with the observed data. In contrast to parametric approaches to correcting for sample selection bias, such as the classical Heckman (1979) estimator, Lee (2009) bounds rest on just two assumptions: random assignment of treatment and monotonicity. Monotonicity means that treatment status affects selection in just one direction; that is, receiving treatment makes selection either more likely or less likely for every observation.

Technically, the approach rests on a trimming procedure. The group (treatment or control) that suffers less from sample attrition is trimmed, either from below or from above, at the quantile of the outcome variable that corresponds to the share of 'excess observations' in this group. The difference in mean outcomes between the groups then yields the lower or the upper bound for the treatment effect, depending on whether trimming is from below or from above.

leebounds assumes that it is not known a priori which group (treatment or control) is subject to the higher selection probability and estimates this from the data (see Lee, 2009:1090).
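The trimming procedure described above can be sketched by hand. The do-file below is a minimal illustration and not leebounds' own code: it uses simulated data, all variable and scalar names are hypothetical, selection is constructed to be more likely under treatment (so the treatment group is the one trimmed), and weights and tight() are ignored. Quantiles are taken with _pctile, so results may differ slightly from leebounds where ties or quantile definitions matter.

```stata
* Simulated data; all variable names are hypothetical
clear
set seed 12345
set obs 2000
generate byte training = runiform() < 0.5
* Selection is more likely under treatment (monotonicity holds by construction)
generate byte selected = runiform() < cond(training, 0.8, 0.6)
generate double wage = 10 + 2*training + rnormal() if selected

* Selection shares in the treatment and the control group
summarize selected if training == 1, meanonly
scalar q1 = r(mean)
summarize selected if training == 0, meanonly
scalar q0 = r(mean)

* Share of 'excess observations' among selected treated units (assumes q1 > q0)
scalar p = (q1 - q0)/q1

* Trimming thresholds: quantiles p and 1-p of the treated outcome
_pctile wage if training == 1 & selected, percentiles(`=100*p')
scalar lo_cut = r(r1)
_pctile wage if training == 1 & selected, percentiles(`=100*(1-p)')
scalar hi_cut = r(r1)

* Mean outcome of the untrimmed (control) group
summarize wage if training == 0 & selected, meanonly
scalar m0 = r(mean)

* Trimming the treated group from above gives the lower bound ...
summarize wage if training == 1 & selected & wage <= hi_cut, meanonly
scalar lower = r(mean) - m0
* ... and trimming it from below gives the upper bound
summarize wage if training == 1 & selected & wage >= lo_cut, meanonly
scalar upper = r(mean) - m0

display "trimming proportion: " p
display "lower bound: " lower "   upper bound: " upper
```

If the control group had the higher selection share, the roles would be reversed: the control group would be trimmed, and trimming it from below would give the lower rather than the upper bound.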
Outcome and treatment
        +-------+
    ----+ Model +------------------------------------------------------------
depvar specifies the outcome variable.
treatvar specifies a binary variable indicating receipt of treatment. Estimating the effect of treatvar on depvar is the subject of the empirical analysis. The larger value of treatvar is assumed to indicate treatment.
        +------+
    ----+ Main +-------------------------------------------------------------
select(varname) specifies a binary selection indicator. varname may only take the values zero or one. If no selection indicator varname is specified, any observation with non-missing information on depvar is assumed to be selected, while all observations with missing information on depvar are assumed to be not selected.
tight(varlist) specifies a list of covariates for computing tightened bounds. With tight() specified, the sample is split into cells defined by the covariates in varlist. Trimmed means are calculated separately for each cell, where the trimming proportion is specific to each cell. Finally, a weighted average of the trimmed means is calculated. Continuous variables may therefore not enter varlist without first being converted to categorical variables. Specifying too many cells, either by including numerous variables in varlist or by including variables that take numerous different values, will cause an error.
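As an illustration of the conversion step, a continuous covariate can be binned before it enters tight(). The do-file below is a sketch under stated assumptions: leebounds is already installed, the data are simulated, and all variable names (including agecat) are hypothetical. Quartile groups are only one possible choice of categories.

```stata
* Simulated data; all variable names are hypothetical
clear
set seed 2009
set obs 2000
generate byte training = runiform() < 0.5
generate byte female   = runiform() < 0.5
generate double age    = 20 + 40*runiform()
* wage is missing for non-selected observations
generate double wage   = 10 + 2*training + 0.05*age + rnormal() ///
    if runiform() < cond(training, 0.8, 0.6)

* Convert the continuous covariate into four categories (quartile groups)
xtile agecat = age, nquantiles(4)

* female x agecat defines 2 x 4 cells; trimming is done within each cell
leebounds wage training, tight(female agecat)
display "number of cells: " e(cells)
```

Coarser categories keep the cells large; very fine categories multiply the number of cells and, as noted above, can make the command fail.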
cieffect requests calculation of a confidence interval for the treatment effect. Note that this interval is narrower than the union of the confidence intervals for the two estimated bounds (see Lee, 2009:1089; Imbens and Manski, 2004). The interval captures both uncertainty about the bias due to non-random sample attrition and uncertainty due to sampling error.
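For orientation, the interval of Imbens and Manski (2004) that Lee (2009) adopts can be written as follows. This is a sketch taken from those references, not a statement about every detail of leebounds' implementation; the symbols (estimated bounds, their standard errors, the critical value, and the significance level alpha) are notation introduced here.

```latex
\[
\Bigl[\,\hat\theta_{l} - \bar{C}\,\widehat{se}_{l},\;\;
       \hat\theta_{u} + \bar{C}\,\widehat{se}_{u}\,\Bigr],
\qquad\text{where } \bar{C} \text{ solves}\qquad
\Phi\!\left(\bar{C} + \frac{\hat\theta_{u}-\hat\theta_{l}}
                           {\max(\widehat{se}_{l},\,\widehat{se}_{u})}\right)
- \Phi\bigl(-\bar{C}\bigr) = 1-\alpha .
\]
```

At the 95% level the critical value lies between about 1.645 (bounds far apart relative to their standard errors) and 1.96 (bounds coincide), which is why this interval is narrower than one built from two separate 95% intervals for the bounds.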
        +--------------+
    ----+ SE/Bootstrap +-----------------------------------------------------
vce(analytic|bootstrap) specifies whether analytic or bootstrapped standard errors are calculated for estimated bounds. analytic is the default. bootstrap allows for the suboptions reps(#) and nodots; see bootstrap. For vce(analytic) the covariance for the estimated lower and upper bound is not computed. If this covariance is of relevance, one should choose vce(bootstrap). Instead of specifying vce(bootstrap) one may alternatively use the prefix command bootstrap, which allows for numerous additional options. Yet leebounds' internal bootstrapping routine is much faster than the prefix command, allows for sampling weights by performing a weighted bootstrap, and makes the option cieffect use bootstrapped standard errors, too.
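The two routes to bootstrapped standard errors can be compared side by side. The do-file below is a sketch assuming leebounds is installed; the data are simulated and the variable names are hypothetical.

```stata
* Simulated data; all variable names are hypothetical
clear
set seed 823
set obs 1000
generate byte training = runiform() < 0.5
generate double wage = 10 + 2*training + rnormal() ///
    if runiform() < cond(training, 0.8, 0.6)

* Internal bootstrap: reps() and nodots are suboptions of vce(bootstrap)
leebounds wage training, vce(bootstrap, reps(250) nodots)
* With vce(bootstrap), the off-diagonal element of e(V) is estimated
matrix list e(V)

* Alternative: Stata's bootstrap prefix command
bootstrap, reps(250) nodots: leebounds wage training
```

The prefix form accepts the full set of bootstrap options, while the internal routine is the one that supports sampling weights and feeds bootstrapped standard errors into cieffect.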
        +-----------+
    ----+ Reporting +--------------------------------------------------------
level(#); see [R] estimation options. One may change the reported confidence level by retyping leebounds without arguments, specifying only the option level(#). However, this affects only the confidence intervals for the bounds, not the confidence interval requested with cieffect.
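Replaying results at a different confidence level might look as follows. This is a sketch assuming leebounds is installed; the data are simulated and the variable names are hypothetical.

```stata
* Simulated data; all variable names are hypothetical
clear
set seed 1979
set obs 1000
generate byte training = runiform() < 0.5
generate double wage = 10 + 2*training + rnormal() ///
    if runiform() < cond(training, 0.8, 0.6)

* Estimate once with the default 95% level
leebounds wage training

* Replay the stored results with 90% confidence intervals for the bounds
leebounds, level(90)
```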
Examples
Basic syntax
    . leebounds wage training

Tightened Lee bounds with weighted bootstrap and confidence interval for the treatment effect
    . leebounds wage training [pw=1/prob], select(wageinfo) tight(female immigrant) cieffect vce(boot, reps(250) nodots)
Saved results
leebounds saves the following in e():
Scalars
  e(N)           number of observations
  e(Nsel)        number of selected observations
  e(trim)        (overall) trimming proportion
  e(cells)       number of cells (only saved for tight())
  e(cilower)     lower bound of treatment effect-confidence interval (only saved for cieffect)
  e(ciupper)     upper bound of treatment effect-confidence interval (only saved for cieffect)
  e(level)       confidence level
  e(N_reps)      number of bootstrap repetitions (only saved for vce(bootstrap))

Macros
  e(cmd)         leebounds
  e(cmdline)     command as typed
  e(title)       Lee (2009) treatment effect bounds
  e(vce)         either analytic or bootstrap
  e(vcetype)     Bootstrap for vce(bootstrap)
  e(depvar)      depvar
  e(treatment)   treatvar
  e(select)      varname (only saved for select())
  e(cellsel)     cell-specific selection pattern, either homo or hetero (only saved for tight())
  e(covariates)  varlist (only saved for tight())
  e(trimmed)     either treatment or control
  e(wtype)       either pweight, fweight, or iweight (only saved if weights are specified)
  e(wexp)        = exp (only saved if weights are specified)
  e(properties)  b V

Matrices
  e(b)           1x2 vector of estimated treatment effect bounds (colnames are of the form treatvar:lower and treatvar:upper)
  e(V)           2x2 variance-covariance matrix for estimated treatment effect bounds (covariance set to zero for vce(analytic))

Functions
  e(sample)      marks estimation sample
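The saved results can be retrieved after estimation in the usual way. The do-file below is a sketch assuming leebounds is installed; the data are simulated, the variable names are hypothetical, and the lower bound is assumed to be stored in the first column of e(b), as the order of the column names listed above suggests.

```stata
* Simulated data; all variable names are hypothetical
clear
set seed 2004
set obs 1000
generate byte training = runiform() < 0.5
generate double wage = 10 + 2*training + rnormal() ///
    if runiform() < cond(training, 0.8, 0.6)

leebounds wage training, cieffect

* Copy the 1x2 vector of bounds and inspect its column names
matrix b = e(b)
matrix list b
display "lower bound: " b[1,1] "   upper bound: " b[1,2]

* Scalars and macros
display "trimming proportion: " e(trim)
display "trimmed group: `e(trimmed)'"
display "effect CI: [" e(cilower) ", " e(ciupper) "]"
```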
References
Heckman, J.J. (1979). Sample Selection Bias as a Specification Error. Econometrica 47, 153–161.
Imbens, G.W. and C.F. Manski (2004). Confidence Intervals for Partially Identified Parameters. Econometrica 72, 1845–1857.
Lee, D.S. (2009). Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects. Review of Economic Studies 76, 1071–1102.
Also see
Manual: [R] heckman
Help: [R] heckman
Online: bpbounds, bpboundsi, mhbounds
Author
Harald Tauchmann
Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI)
Essen, Germany
E-mail: harald.tauchmann@rwi-essen.de

Disclaimer
This software is provided "as is" without warranty of any kind, either expressed or implied. The entire risk as to the quality and performance of the program is with you. Should the program prove defective, you assume the cost of all necessary servicing, repair or correction. In no event will the copyright holders or their employers, or any other party who may modify and/or redistribute this software, be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the program.
Acknowledgements
This work has been supported in part by the Collaborative Research Center "Statistical Modelling of Nonlinear Dynamic Processes" (SFB 823) of the German Research Foundation (DFG).