Syntax

        radiusmatch treatvar [indepvars] [if] [in] , outcome(varlist) [options]

    options                  Description
    -------------------------------------------------------------------------
    Specification
      outcome(varlist)       dependent variables
      pscore(varname)        user-supplied propensity score
      mahalanobis(varlist)   additional matching variables
      ate                    estimate ATENT and ATE as well (note: doubles
                               computation time)

    Parameters for matching
      cquantile(#)           defines the maximum distance as the # percentile
                               of the distance distribution; default is
                               cquantile(90)
      cpercent(#)            defines the radius in %-distance to largest
                               one-to-one match; default is cpercent(300)
      scoreweight(#)         values larger than 1 give more weight to the
                               propensity score in the Mahalanobis metric;
                               default is scoreweight(5)
      mweight(#)             maximum share of weight in % of one observation
                               compared to total weight; default is
                               mweight(100), i.e., no restriction on the
                               maximum weight
      logit                  uses logit instead of probit to estimate the
                               propensity score
      index                  uses linear index instead of probability as
                               propensity score
      nocommon               common support is not enforced
      bc(#)                  0 for no bias correction, 1 for linear bias
                               correction, 2 for linear and logit bias
                               correction; default is bc(1)

    Bootstrap and standard errors
      knn                    uses nearest-neighbor matching algorithm to
                               estimate the conditional variance
      bootstrap(#)           number of bootstrap replications; default is
                               bootstrap(0)
      bfile(["]filename["])  saves the results of all bootstrap replications

    Computation
      boost                  avoids loops when estimating the distances and
                               increases computation speed
    -------------------------------------------------------------------------
    indepvars may contain factor variables; see fvvarlist.
    Weights are not allowed.
Description
radiusmatch estimates average treatment effects (ATET, ATENT, ATE) of treatvar for the set of outcome variables given in outcome() using radius matching. indepvars are used to compute the propensity score. radiusmatch is a one-to-many calliper matching algorithm as discussed, for example, by Rosenbaum and Rubin (1985) and used by Dehejia and Wahba (1999, 2002). Calliper or radius matching uses all comparison observations that lie within a predefined distance of the respective treated observation, where distance is measured either on the propensity score or by the Mahalanobis distance. This allows for higher precision than fixed nearest neighbour matching in regions in which many similar comparison observations are available. Also, it may lead to a smaller bias in regions where similar controls are sparse. In other words, instead of fixing the number of matches M globally, M is determined in the local neighbourhood of each treated observation.
This estimator, proposed by Lechner, Miquel, and Wunsch (2011), combines the features of calliper matching with additional predictors and linear or nonlinear regression adjustment. After the first step of distance-weighted calliper matching with predictors, the estimator uses the weights obtained from matching in a weighted linear or nonlinear regression in order to remove any bias due to mismatches. For a detailed description of the estimator and information on the adequate choice of the matching parameters, please refer to Huber, Lechner, and Steinmayr (2012).

Syntax and output of radiusmatch are oriented towards the popular psmatch2 command by Barbara Sianesi and Edwin Leuven. The variables created by radiusmatch can be used with the pstest and psgraph commands to test for covariate imbalance and to graph the distribution of the propensity score. Both commands can be downloaded from the SSC archive.

Note that radiusmatch requires the installation of the tknz command:

        . ssc install tknz, replace

radiusmatch creates a number of variables:

    _treated     equals 0 for control observations and 1 for treatment observations.
    _untreated   is 1 - _treated.
    _support     equals 1 if the observation is on the common support and 0 otherwise.
    _pscore      is the estimated propensity score or a copy of the one provided by pscore().
    _weightt     holds the estimated weights for the ATET.
    _weightut    holds the estimated weights for the ATENT.
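
The variables created by radiusmatch can be inspected with standard Stata commands once estimation has finished. The following is a minimal sketch, not part of the original examples; it assumes the nlsw88 example data shipped with Stata and relies only on the variable names documented above.

    Common support by treatment status
        . sysuse nlsw88, clear
        . radiusmatch union age i.race married hours tenure south smsa, out(wage)
        . tabulate _treated _support

    Distribution of the propensity score and of the ATET weights
        . summarize _pscore if _treated == 1
        . summarize _pscore if _treated == 0
        . summarize _weightt

Checking _support first shows how many observations the common support restriction removed before the estimates are interpreted.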
Options
        +---------------+
    ----+ Specification +----------------------------------------------------
outcome(varlist) specifies the outcome variables.
pscore(varname) is optional and specifies a user-supplied propensity score.
mahalanobis(varlist) specifies additional covariates to be matched on in addition to the propensity score. If mahalanobis() is not specified (the default), matching is performed on the propensity score only. If it is specified, matching is based on the Mahalanobis distance defined by the propensity score and the additional covariates.
ate specifies that radiusmatch estimates the "ATET" (average treatment effect on the treated), the "ATENT" (average treatment effect on the nontreated), and the "ATE" (average treatment effect). If ate is not specified, only the ATET is estimated. Note that this option approximately doubles computation time.
        +-------------------------+
    ----+ Parameters for matching +------------------------------------------

cpercent(#) is the multiplier of the maximum distance in pair matching (or a particular quantile, see cquantile) which defines the radius. Default is 300, i.e., the radius is equal to 300 percent of the maximum distance in pair matching (or a particular quantile).
cquantile(#) is the quantile of the distances in pair matching to be used for the definition of the radius. Default is 90, i.e., the 0.9 quantile of the distances in pair matching is used. If the implied quantile is less than or equal to 0 or greater than or equal to 1, the maximum distance is chosen. The size of the radius is determined jointly by cquantile() and cpercent(). E.g., cpercent(300) together with cquantile(90) defines the radius as three times the 0.9 quantile of the distances in pair matching.
mweight(#) is the maximum relative weight an observation may receive based on inverse probability weighting by the propensity score. E.g., with mweight(5), the maximum weight is 5 percent (compared to the joint weight of all other observations). The default is 100 (= 100 percent), i.e., no restriction on the weight.
bc(#) selects the bias correction: 0 for no bias correction, 1 for linear bias correction, 2 for linear and logit bias correction. Default is bc(1). With bc(1) or bc(2), the outcome is regressed on the propensity score, its square, and the variables used to compute the Mahalanobis distance within the counterfactual treatment state.
nocommon specifies that no common support is imposed. By default, common support is imposed. If common support is imposed and ate is not specified, treated observations with propensity scores larger than the maximum propensity score among the nontreated are discarded. If ate is specified and common support is imposed, treated observations with propensity scores larger than the second largest propensity score among the nontreated are discarded from the sample. For the ATENT, nontreated observations with propensity scores smaller than the minimum propensity score among the treated are discarded. For the ATE, common support is imposed for both the treated and the nontreated.
logit specifies that logit is used for propensity score estimation instead of probit (probit is default).
index specifies that matching is based on the index of the probit/logit estimation instead of the propensity score (matching on the propensity score is default).
scoreweight(#) specifies the weight of the p-score in Mahalanobis distance matching. Default is 5, which implies that the p-score gets five times the weight of any of the additional covariates considered in Mahalanobis distance matching. A weight of 1 implies equal weighting of the p-score and additional covariates.
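
How cquantile() and cpercent() combine is easiest to see by writing the defaults out explicitly. The calls below are a sketch under assumptions: they use the nlsw88 example data, and the non-default option values are illustrative choices, not recommendations.

    Defaults written out: radius = 3 times the 0.9 quantile of the pair-matching distances
        . sysuse nlsw88, clear
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) cquantile(90) cpercent(300)

    Tighter radius: 1.5 times the median pair-matching distance
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) cquantile(50) cpercent(150)

    Mahalanobis matching with equal weight on the p-score and covariates, logit p-score, weights capped at 5 percent, no bias correction
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) mahal(age tenure) scoreweight(1) logit mweight(5) bc(0)

Comparing the estimates across such calls gives a rough sense of how sensitive the results are to the tuning parameters.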
        +-------------------------------+
    ----+ Bootstrap and standard errors +------------------------------------
bootstrap(#) specifies that the bootstrap is used for inference. The number in parentheses is the number of bootstrap replications. For any positive integer, bootstrap standard errors and p-values are computed based on the specified number of bootstrap replications (recommended). p-values are computed by bootstrapping the t-statistic. If the probit cannot be estimated in a bootstrap replication, the propensity score estimated in the original sample is used. Note that, in principle, radiusmatch can also be used with the Stata bootstrap command. However, bootstrapping the t-statistic is not possible in this case.
bfile(filename) saves the estimated parameters and analytical standard errors of every bootstrap replication in "filename.dta". The first row contains the estimates in the original sample.
knn specifies that a nearest neighbor matching algorithm is used for conditional variance estimation of the outcome given the matching weight under counterfactual treatment, which is required for estimating the standard error. Default is local constant kernel regression (based on the Epanechnikov kernel and the rule of thumb for bandwidth choice).
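
The bootstrap options can be combined as in the following sketch. It assumes the nlsw88 example data; the file name bootres is a hypothetical choice, and the layout of the saved file beyond what is described under bfile() is not documented here, so describe is used to look at it.

    Bootstrapped inference with the replications saved to bootres.dta
        . sysuse nlsw88, clear
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) boot(99) bfile(bootres)

    Inspect the saved replications (this replaces the data in memory)
        . use bootres, clear
        . describe
        . list in 1

    Nearest-neighbor estimation of the conditional variance instead of the kernel default
        . sysuse nlsw88, clear
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) knn

According to the description of bfile(), the row shown by list in 1 holds the estimates from the original sample.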
        +-------------+
    ----+ Computation +------------------------------------------------------
boost specifies that matrix operations should be used instead of loops wherever possible. This reduces computation time by roughly one third but may cause problems in datasets with many observations (on my computer, more than 30,000) when the operating system refuses to provide the memory needed.
Examples: radius matching without bootstrap
    Radius matching based on the propensity score
        . sysuse nlsw88
        . radiusmatch union age i.race married hours tenure south smsa, out(wage)

    Radius matching based on Mahalanobis distance
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) mahal(age) ate

    Radius matching with user supplied propensity score
        . probit union age i.race married hours tenure south smsa
        . predict double pscore, index
        . radiusmatch union, pscore(pscore) out(wage) index ate
Examples: radius matching with bootstrap
    Radius matching based on Mahalanobis distance with bootstrapped inference
        . radiusmatch union age i.race married hours tenure south smsa, out(wage) mahal(age) boot(99) ate
-------------------------------------------------------------------------------
Saved results
radiusmatch saves the following in r():
    Matrices
      r(atet)        vector of ATETs for all outcomes
      r(atent)       vector of ATENTs for all outcomes (if requested)
      r(ate)         vector of ATEs for all outcomes (if requested)
      r(seatet)      vector of asymptotic s.e. for ATET for all outcomes
      r(seatent)     vector of asymptotic s.e. for ATENT for all outcomes (if requested)
      r(seate)       vector of asymptotic s.e. for ATE for all outcomes (if requested)
      r(y0_atet)     vector of Y0 (average outcome of comparison group) for ATET for all outcomes
      r(y0_atent)    vector of Y0 for ATENT for all outcomes (if requested)
      r(y0_ate)      vector of Y0 for ATE for all outcomes (if requested)
      r(y1_atet)     vector of Y1 (average outcome of treated group) for ATET for all outcomes
      r(y1_atent)    vector of Y1 for ATENT for all outcomes (if requested)
      r(y1_ate)      vector of Y1 for ATE for all outcomes (if requested)
      r(b_seatet)    vector of bootstrap s.e. for ATET for all outcomes
      r(b_seatent)   vector of bootstrap s.e. for ATENT for all outcomes (if requested)
      r(b_seate)     vector of bootstrap s.e. for ATE for all outcomes (if requested)
      r(b_patet)     vector of bootstrap p-values for ATET for all outcomes
      r(b_patent)    vector of bootstrap p-values for ATENT for all outcomes (if requested)
      r(b_pate)      vector of bootstrap p-values for ATE for all outcomes (if requested)
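
The saved matrices can be copied and used right after estimation. The following is a minimal sketch assuming the nlsw88 example data and a single outcome, so that each saved vector has exactly one element; the matrix names b and se are arbitrary.

    Copy the ATET and its asymptotic standard error before another command overwrites r()
        . sysuse nlsw88, clear
        . radiusmatch union age i.race married hours tenure south smsa, out(wage)
        . matrix b = r(atet)
        . matrix se = r(seatet)

    Display the estimate and the ratio of estimate to standard error
        . matrix list b
        . display "ATET / s.e. = " b[1,1]/se[1,1]

With several outcomes in outcome(), the vectors hold one element per outcome, and matrix list shows how they are arranged.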
-------------------------------------------------------------------------------
Thanks for citing radiusmatch as follows
Huber, M., M. Lechner, and A. Steinmayr. (2012). "Radius matching on the propensity score with bias adjustment: finite sample behaviour, tuning parameters and software implementation". University of St. Gallen, School of Economics and Political Science, Economics Working Paper Series No. 1226
Disclaimer
THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. IN NO EVENT WILL THE COPYRIGHT HOLDERS OR THEIR EMPLOYERS, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THIS SOFTWARE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Further reading
Dehejia, R. H., and S. Wahba (1999): "Causal Effects in Non-experimental Studies: Reevaluating the Evaluation of Training Programmes", Journal of the American Statistical Association, 94, 1053-1062.
Dehejia, R. H., and S. Wahba (2002): "Propensity Score-Matching Methods for Nonexperimental Causal Studies", Review of Economics and Statistics, 84, 151-161.
Huber, M., M. Lechner and C. Wunsch (2012): "The performance of estimators based on the propensity score", forthcoming in the Journal of Econometrics.
Lechner, M., R. Miquel and C. Wunsch (2011): "Long-Run Effects of Public Sector Sponsored Training in West Germany", Journal of the European Economic Association, 9, 742-784.
Rosenbaum, P. R., and D. B. Rubin (1985): "Constructing a Control Group Using Multivariate Matched Sampling Methods that Incorporate the Propensity Score", The American Statistician, 39, 33-38.
Authors
Martin Huber, University of St. Gallen.
Michael Lechner, University of St. Gallen.
Andreas Steinmayr, University of St. Gallen.

If you observe any problems or if you have any comments or suggestions, please contact andreas.steinmayr@unisg.ch.