Last updated August 29, 2001

------------------------------------------------------------------------------- help for psmatch -------------------------------------------------------------------------------

Perform various types of propensity score matching

psmatch treatvar , {on(matchvar1 [matchvar2 [matchvar3]])|estimate(varlist)} [ logit index caliper(real) outcome(outcomevar) id(idvar) kernel(outcomevar) bwidth(real) epan tricube(outcomevar) spline(outcomevar) nknots(integer) mean(outcomevar) neighbour(integer) line(outcomevar) lowess(outcomevar) both nocommon nocount quality(varlist) saving(filename) bootstrap reps(integer) dots size(integer) every(integer) replace double level(integer) ]


treatvar is a 0-1 variable identifying two groups: 1 for the 'treated' and 0 for the 'controls'. psmatch pairs each treated unit with one or more 'comparable' (in terms of the matchvars) non-treated units and associates with the outcome of the treated unit the (weighted) outcomes of its 'neighbours' in the comparison group, where the weights can optionally depend on their 'distance' from the treated unit under consideration.


on(matchvar1 [matchvar2 [matchvar3]]) specifies the variable(s) on which to match. The 'propensity score', the probability of belonging to the treated group given observable characteristics, can be used as a summary of these characteristics (Rosenbaum and Rubin, 1983). It can be estimated via logit or probit regressions, and the predicted probability or the index used as matchvar1. I suggest the use of double as storage type for matchvar1 in order to decrease the likelihood of multiple 'exact' matches for a given treated unit. If a second or third matchvar is specified (e.g. when matching on a finer basis, or on multiple scores estimated from a multinomial model such as mlogit or nlogit), psmatch will look for the closest match in terms of the Mahalanobis distance constructed from the two or three variables. This Mahalanobis-metric matching (formula from Rubin, 1980) is presently only allowed for the one-to-one and kernel-based (Gaussian and Epanechnikov) versions. Notes: 1) Mahalanobis-metric matching may take a long time to run. 2) If bootstrapping is required in this case, the user has to programme it, together with the appropriate score estimation procedure.
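As a rough illustration of matching on two variables, the Python sketch below computes the squared Mahalanobis distance d(i,j) = (x_i - x_j)' S^-1 (x_i - x_j) and picks the closest control. It is not psmatch's code. The choice of S (here the sample covariance matrix of the controls), the function names and the data are all assumptions made for the example.

```python
def cov2(rows):
    """Sample covariance terms (sxx, sxy, syy) of a list of (x, y) pairs."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return sxx, sxy, syy

def mahalanobis2(a, b, cov):
    """Squared Mahalanobis distance between the 2-vectors a and b."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    # (dx, dy) S^-1 (dx, dy)' with the 2x2 inverse written out
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def closest_control(treated, controls, cov):
    """Index of the control with the smallest distance to `treated`."""
    return min(range(len(controls)),
               key=lambda j: mahalanobis2(treated, controls[j], cov))

# Hypothetical data: (score, age) for four controls and one treated unit
controls = [(0.20, 30.0), (0.35, 45.0), (0.50, 28.0), (0.65, 52.0)]
cov = cov2(controls)   # assumption: S is taken from the control group
best = closest_control((0.48, 31.0), controls, cov)
```

Because S rescales each variable by its spread and accounts for their correlation, a variable measured in large units (age) does not dominate one measured in small units (score).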

estimate(varlist) is an alternative to on(matchvar1) and estimates the propensity score (i.e. the predicted probability, unless the option index is specified) using the user-specified varlist as regressors in a probit model (unless the option logit is specified). Either on or estimate must be specified.

logit uses a logit model to estimate the propensity score.

index requests that the linear index, rather than the predicted probability, be used as the propensity score when the score is estimated via estimate.
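The difference between the linear index and the predicted probability can be sketched in Python as follows. The index values are made up, and the functions only restate the standard logit and probit link functions; nothing here is taken from psmatch's source.

```python
import math

def logit_prob(index):
    """Predicted probability implied by a logit linear index x'b."""
    return 1.0 / (1.0 + math.exp(-index))

def probit_prob(index):
    """Predicted probability implied by a probit linear index x'b
    (the standard normal CDF, written with the error function)."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

indices = [-1.5, 0.0, 0.8]                    # hypothetical linear indices
p_logit = [logit_prob(v) for v in indices]
p_probit = [probit_prob(v) for v in indices]
```

Both transformations are monotone, so the ordering of units is the same on either scale, but distances between units are not: the probability scale compresses differences in the tails, which can change which control is closest and whether it falls within a given caliper or bandwidth.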

If no smoothing type of matching is specified via the options kernel, spline, mean, line, lowess or tricube, the default is to perform one-to-one (or nearest-neighbour) matching with replacement: each treated unit is matched to the single control unit with the closest propensity score, and a given control unit can be matched to more than one treated unit. Matching with replacement is necessary for estimation in the multiple-treatment case, where each sub-group will act both as a treated group and as (several) comparison groups. Two new variables are created:

_times stores the number of times a unit is used (1 for all matched treated); it is missing for the unmatched treated and unmatched control units. In subsequent analyses use [fw=_times] or else expand _times.

_matchdif is defined only for matched treated and stores the absolute distance (in terms of the propensity score or the Mahalanobis metric) of that treated unit with its matched control. It is useful to assess matching quality and decide on possibly refining or widening the caliper.
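The default matching and the two variables it creates can be sketched in Python as below. This is an illustration under stated assumptions, not psmatch's implementation: the function name is hypothetical, ties are broken by list order, and unmatched controls get a count of 0 here whereas psmatch leaves _times missing for them.

```python
def match_one_to_one(treated_scores, control_scores):
    """Nearest-neighbour matching with replacement on a single score.

    Returns (match, matchdif, times):
      match[i]    index of the control matched to treated unit i
      matchdif[i] absolute score distance to that control
      times[j]    number of treated units that control j is matched to
    """
    match, matchdif = [], []
    times = [0] * len(control_scores)
    for p in treated_scores:
        # ties go to the first control in list order (an assumption)
        j = min(range(len(control_scores)),
                key=lambda k: abs(control_scores[k] - p))
        match.append(j)
        matchdif.append(abs(control_scores[j] - p))
        times[j] += 1          # with replacement: a control can be reused
    return match, matchdif, times

treated = [0.30, 0.32, 0.70]            # hypothetical scores
controls = [0.10, 0.31, 0.55, 0.90]
match, matchdif, times = match_one_to_one(treated, controls)
```

In this example the control with score 0.31 is used twice, which is why later analyses need to weight each control by the number of times it was used.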

caliper(real) calls for one-to-one caliper matching (Cochran and Rubin, 1973) with replacement to be performed: Treated units for which no control unit is found within the maximum absolute distance given by the caliper are left unmatched.
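A minimal Python sketch of the caliper rule follows. The function name and data are hypothetical, and treating a distance exactly equal to the caliper as acceptable is an assumption.

```python
def caliper_match(treated_scores, control_scores, caliper):
    """For each treated score, return the index of the nearest control,
    or None when even the nearest control is farther away than `caliper`."""
    result = []
    for p in treated_scores:
        j = min(range(len(control_scores)),
                key=lambda k: abs(control_scores[k] - p))
        within = abs(control_scores[j] - p) <= caliper
        result.append(j if within else None)
    return result

# The second treated unit has no control within 0.05 and stays unmatched
matches = caliper_match([0.30, 0.70], [0.10, 0.31, 0.55], caliper=0.05)
```

A tighter caliper improves the closeness of the retained matches at the cost of leaving more treated units unmatched.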

id(idvar) specifies the numeric variable identifying the individual units. If specified, the programme will check that the dataset contains only one observation per individual, as needed by psmatch. For one-to-one matching, it will also ensure that the same matched control is always used should there be multiple 'exact' matches. Secondly, it will create the variable _matchedid, defined only for matched treated and storing the identifier idvar of the corresponding matched control.

outcome(outcomevar) specifies the numeric outcome variable which the user intends to evaluate by one-to-one matching. If specified, the means of outcomevar for the matched treated and for the matched controls, together with their difference, an approximate standard error of the effect and the corresponding t-test statistic will be directly displayed after matching is performed. Secondly, a new variable _moutcomevar will be created, storing for each matched treated the outcomevar of his matched control.
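The displayed means and their difference can be sketched in Python as follows. The function name and the numbers are hypothetical. The approximate standard error is not reproduced, since its formula is not given in this help file.

```python
def matched_effect(treated_outcomes, matched_control_outcomes):
    """Mean outcome of the matched treated, of their matched controls,
    and the difference between the two.

    The lists are aligned: element i of the second list is the outcome of
    the control matched to treated unit i, or None if unit i is unmatched.
    """
    pairs = [(y1, y0)
             for y1, y0 in zip(treated_outcomes, matched_control_outcomes)
             if y0 is not None]                 # drop unmatched treated
    mean1 = sum(y1 for y1, _ in pairs) / len(pairs)
    mean0 = sum(y0 for _, y0 in pairs) / len(pairs)
    return mean1, mean0, mean1 - mean0

# Second treated unit is unmatched and does not enter either mean
mean1, mean0, effect = matched_effect([12.0, 15.0, 9.0], [10.0, None, 8.0])
```

The second list plays the role of the matched-outcome variable: one control outcome per matched treated unit, with a control counted once for every treated unit it is matched to.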

kernel(outcomevar) asks for kernel-based matching to be performed (Heckman et al., 1997 and 1998). outcomevar is the outcome being evaluated. All controls are used to construct a weighted matched outcomevar for a given treated unit (within the common support). These values are stored in the new variable, _moutcomevar.

bwidth(real) specifies the bandwidth to be used for kernel. Default is 0.06. It is also the bandwidth to be used by mean, line, lowess and tricube; in that case it is constrained to (0,1] and expressed as the proportion of non-treated units to be used in smoothing, with a default of 0.8.

epan specifies that the Epanechnikov kernel be used by kernel rather than the default Gaussian one. This means that only those controls falling within a radius of bwidth are used to construct _moutcomevar for a given treated unit (within the common support).
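The contrast between the two kernels can be sketched in Python as below. This is only an illustration: the kernel scaling constants, the function names and the data are assumptions, and psmatch's exact formulas may differ.

```python
import math

def gaussian(u):
    """Gaussian kernel (constant omitted, it cancels in the weighted mean)."""
    return math.exp(-0.5 * u * u)

def epanechnikov(u):
    """Epanechnikov kernel: zero weight outside a radius of one bandwidth."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def kernel_matched_outcome(p_treated, control_scores, control_outcomes,
                           bwidth=0.06, kernel=gaussian):
    """Kernel-weighted average of control outcomes for one treated unit.
    Returns None when every control receives zero weight."""
    weights = [kernel((p - p_treated) / bwidth) for p in control_scores]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y for w, y in zip(weights, control_outcomes)) / total

scores = [0.20, 0.28, 0.33, 0.60]       # hypothetical control scores
wages = [10.0, 12.0, 14.0, 30.0]        # hypothetical control outcomes
m_gauss = kernel_matched_outcome(0.30, scores, wages)
m_epan = kernel_matched_outcome(0.30, scores, wages, kernel=epanechnikov)
```

With the Gaussian kernel every control receives some weight, however small; with the Epanechnikov kernel only the two controls within 0.06 of the treated unit's score contribute.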

tricube(outcomevar) performs kernel-based matching using a tricube weight. Only those controls within a neighbourhood determined by bwidth are used to construct a weighted matched outcomevar for a given treated unit (within the common support). These values are stored in the new variable, _moutcomevar.

spline(outcomevar) performs 'spline-smoothing matching' by first fitting a natural cubic spline on matchvar1 (or on the result from estimate) to outcomevar for the non-treated. The matched values are stored in the new variable, _moutcomevar. It requires the STB spline programme, which can be downloaded by typing: net install snp7_1.

nknots(integer) specifies the number of interior knots for spline smoothing. Default is the fourth root of the number of non-treated units.

mean(outcomevar) performs nearest-neighbours matching (with equal weights). The number of non-treated neighbours is governed by either bwidth (default is 0.8) or more directly by neighbour. The matched values are stored in the new variable, _moutcomevar.

neighbour(integer) (or neighbor()) specifies the number of neighbours for nearest-neighbours matching.
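How the number of neighbours follows from either neighbour or bwidth can be sketched in Python. The function name and data are hypothetical, and the rule used to turn the proportion bwidth into an integer number of controls (rounding, at least 1) is an assumption.

```python
def nn_mean(p_treated, control_scores, control_outcomes,
            neighbour=None, bwidth=0.8):
    """Equal-weight mean outcome of the k controls nearest to p_treated.

    k is `neighbour` if given; otherwise the share `bwidth` of all
    controls, rounded and at least 1 (the rounding rule is an assumption).
    """
    n = len(control_scores)
    k = neighbour if neighbour is not None else max(1, round(bwidth * n))
    order = sorted(range(n), key=lambda j: abs(control_scores[j] - p_treated))
    nearest = order[:k]
    return sum(control_outcomes[j] for j in nearest) / len(nearest)

scores = [0.10, 0.25, 0.32, 0.40, 0.90]     # hypothetical control scores
educ = [8.0, 11.0, 12.0, 14.0, 18.0]        # hypothetical control outcomes
m2 = nn_mean(0.30, scores, educ, neighbour=2)   # two nearest controls
m_default = nn_mean(0.30, scores, educ)         # 0.8 * 5 = 4 controls
```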

line(outcomevar) performs 'least-squares-smoothing matching', obtained by locally fitting an (unweighted) line in the neighbourhood determined by bwidth. The matched values are stored in the new variable, _moutcomevar.

lowess(outcomevar) performs a type of local linear matching (Heckman et al., 1997), smoothing the non-treated in a neighbourhood determined by bwidth and weighting their contribution by a tricube weight. The matched values are stored in the new variable, _moutcomevar.
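A tricube-weighted local line of this kind can be sketched in Python as follows. This is a simplified illustration and not the ksm algorithm itself: the neighbourhood rule, the 1.0001 scaling of the largest distance, the function names and the data are assumptions.

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 inside the unit interval, else 0."""
    return (1.0 - abs(u) ** 3) ** 3 if abs(u) < 1.0 else 0.0

def local_linear(p_treated, control_scores, control_outcomes, bwidth=0.8):
    """Fit a tricube-weighted least-squares line through the nearest
    controls and evaluate it at the treated unit's score."""
    n = len(control_scores)
    k = max(2, round(bwidth * n))        # neighbourhood size (assumed rule)
    order = sorted(range(n),
                   key=lambda j: abs(control_scores[j] - p_treated))[:k]
    x = [control_scores[j] for j in order]
    y = [control_outcomes[j] for j in order]
    # scale by slightly more than the largest distance in the neighbourhood
    h = 1.0001 * abs(x[-1] - p_treated)
    w = [tricube((xi - p_treated) / h) if h > 0 else 1.0 for xi in x]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        return ybar                      # no slope can be identified
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar + (sxy / sxx) * (p_treated - xbar)

scores = [0.1, 0.2, 0.3, 0.4, 0.5]      # hypothetical control scores
outcome = [3.0, 4.0, 5.0, 6.0, 7.0]     # exactly linear: 2 + 10 * score
fitted = local_linear(0.33, scores, outcome)
```

Setting every weight to 1 instead of the tricube weight would give an unweighted local line, in the spirit of the line option.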

both asks kernel or tricube or spline or mean or line or lowess to smooth outcomevar for the treated as well and to store the smoothed values in the new _soutcomevar.

nocommon forces treated individuals outside the common support to be matched too (for the smoothed matching estimators).

nocount suppresses the display of the 'count down' of the treated units being matched, which is shown by default in the more time-consuming types of estimators (one-to-one on more than one score; kernel), as well as the display of a summary of instructions (log files should remain unaffected anyway).

quality(varlist) creates a new dataset (named as specified by the saving option) containing (one-to-one) matching quality indicator variables for the regressors specified in varlist. The variables in the created dataset are: a string variable storing the name of the regressor, the corresponding means in the treated and control groups (both for the full and for the matched groups), the standardised percentage bias (a) before and (b) after matching for that regressor (the difference of the sample means in the treated and non-treated - (a) full or (b) matched - sub-samples as a percentage of the square root of the average of the sample variances in the treated and non-treated groups; formulae from Rosenbaum and Rubin, 1985), and the percentage bias reduction.
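The standardised percentage bias described above can be written as 100 * (mean_treated - mean_control) / sqrt((var_treated + var_control) / 2). The Python sketch below computes it before and after matching. Two points are assumptions rather than statements about psmatch: the variances are taken from the full (unmatched) groups in both cases, and the percentage bias reduction is computed as 100 * (1 - |bias after| / |bias before|). The data are made up.

```python
from statistics import mean, variance

def std_bias(x_treated, x_control, var_treated_full, var_control_full):
    """Standardised percentage bias for one regressor."""
    pooled_sd = ((var_treated_full + var_control_full) / 2.0) ** 0.5
    return 100.0 * (mean(x_treated) - mean(x_control)) / pooled_sd

age_t = [30.0, 34.0, 38.0]              # all treated units
age_c = [40.0, 44.0, 48.0, 52.0]        # all control units
age_c_matched = [40.0, 40.0, 44.0]      # matched controls (one used twice)

v_t, v_c = variance(age_t), variance(age_c)   # full-sample variances
bias_before = std_bias(age_t, age_c, v_t, v_c)
bias_after = std_bias(age_t, age_c_matched, v_t, v_c)
reduction = 100.0 * (1.0 - abs(bias_after) / abs(bias_before))
```

Holding the denominator fixed means that any change in the bias reflects a change in the difference of means only, not a change in the spread of the matched sample.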

saving(filename) creates a dataset named filename.dta containing either the quality information requested if quality was specified or the bootstrap distribution for the treatment effect if bootstrap was specified. Note that for quality, a dataset with the same filename will automatically be OVERWRITTEN; for bootstrap it will not, unless the option replace is also specified.

bootstrap performs bootstrapping of the treatment effect. The option estimate needs to be specified, and for one-to-one matching outcome as well. The option quality cannot be requested. ATTENTION: This can be quite time-consuming. Also note that for non-standard estimation of the score (e.g. in a multiple-treatment framework) and/or non-standard type of outcome (e.g. histories), it is the user who should post the saved r(effect) in a tailor-made programme to be called by STATA's bstrap. If interested in the reproducibility of the results, set the random-number seed by typing set seed integer before bootstrapping. (This may not work exactly if there are ties in one-to-one matching). The following 7 options and saving are those of bstrap; see also the relevant entry in STATA's manual.

reps(integer) specifies the number of bootstrap replications to be performed. The default is 50.

dots requests a dot be placed on the screen at the beginning of each replication.

size(integer) specifies the size of the samples to be drawn. The default is _N, that is, the same size as the data.

every(integer) specifies that the results should be saved every integer-th replication.

replace indicates that the file specified by saving() may be overwritten.

double specifies that the bootstrap results for each replication are to be stored as double (8-byte reals) rather than as the default float (4-byte reals).

level(integer) specifies the confidence level, in percent, for confidence intervals. The default is 95 or as set by set level.
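The logic of bootstrapping a matching estimator can be sketched in Python as below. This is a simplified illustration, not what psmatch or bstrap do internally: the scores are held fixed across replications, whereas a full procedure would re-estimate them in each bootstrap sample, and the function names, data and handling of degenerate samples are assumptions.

```python
import random
from statistics import mean, stdev

def nn_effect(sample):
    """Effect from one-to-one matching with replacement on a sample of
    (treated, score, outcome) tuples; None if either group is empty."""
    treated = [(s, y) for d, s, y in sample if d == 1]
    controls = [(s, y) for d, s, y in sample if d == 0]
    if not treated or not controls:
        return None
    matched = [min(controls, key=lambda c: abs(c[0] - s))[1]
               for s, _ in treated]
    return mean(y for _, y in treated) - mean(matched)

def bootstrap_effect(data, reps=50, size=None, seed=None):
    """Draw `reps` samples of `size` units with replacement and return
    the list of effects (samples lacking one group are skipped)."""
    rng = random.Random(seed)            # seeding makes the draws reproducible
    size = len(data) if size is None else size
    draws = []
    for _ in range(reps):
        sample = [rng.choice(data) for _ in range(size)]
        effect = nn_effect(sample)
        if effect is not None:
            draws.append(effect)
    return draws

# Hypothetical data: (treated indicator, score, outcome)
data = [(1, 0.62, 14.0), (1, 0.48, 11.0), (1, 0.71, 15.5),
        (0, 0.35, 9.0), (0, 0.50, 10.0), (0, 0.66, 12.5), (0, 0.20, 8.0)]
draws = bootstrap_effect(data, reps=50, seed=12345)
se = stdev(draws)                        # bootstrap standard error
```

Calling bootstrap_effect again with the same seed returns the same list of effects, which is the role the random-number seed plays for reproducibility.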


The dataset needs to contain ONLY ONE observation per unit; Stata will check this for you if the option id() is specified; if not, it is your responsibility! If running psmatch more than once on the same dataset, it is presumed the user is no longer interested in the _* variables previously created; if they are not renamed first, they will be REPLACED by the new ones.

Saved Results

For the smoothed types of matching and, if the option outcome is specified, for one-to-one matching, psmatch saves the (possibly smoothed) average outcome for the matched treated in scalar r(mean1) and for the matched controls in scalar r(mean0); the average treatment effect on the treated is stored in scalar r(effect).


. psmatch treated, on(score) caliper(.001) id(serialno) outcome(wage)

. psmatch treated, on(score age) quality(age-sex) saving(qual1)
. stset days [fw=_times], failure(fail)
. sts graph, by(treated)

. psmatch group1, on(p1 p2) kernel(logwage) bwidth(0.01) epan nocommon

. psmatch treated, est(age-sex) lowess(logwage) bwidth(0.2) both

. psmatch reform, est(age-sex) mean(educat) neighb(5) boot reps(100) dots


Barbara Sianesi University College London and Institute for Fiscal Studies Email:


The core code for one-to-one matching has been made faster by adapting a clever idea by Andrea Ichino (European University Institute). spline uses Peter Sasieni's STB-24 snp7.1 spline programme. mean, line, lowess and tricube are based on STATA's ksm command. See the corresponding entry in the manual for additional information.

Bibliography and Sources

Cochran, W. and Rubin, D.B. (1973), "Controlling Bias in Observational Studies", Sankhya, 35, 417-446.

Dehejia, R.H. and Wahba, S. (1999), "Causal Effects in Non-Experimental Studies: Re-Evaluating the Evaluation of Training Programmes", Journal of the American Statistical Association, 94, 1053-1062.

Heckman, J.J., Ichimura, H. and Todd, P.E. (1997), "Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme", Review of Economic Studies, 64, 605-654.

Heckman, J.J., Ichimura, H. and Todd, P.E. (1998), "Matching as an Econometric Evaluation Estimator", Review of Economic Studies, 65, 261-294.

Heckman, J.J., Ichimura, H., Smith, J.A. and Todd, P. (1998), "Characterising Selection Bias Using Experimental Data", Econometrica, 66, 5.

Heckman, J.J., LaLonde, R.J. and Smith, J.A. (1998), "The Economics and Econometrics of Active Labour Market Programmes", in Ashenfelter, O. and Card, D. (eds.), The Handbook of Labour Economics, Volume III.

Imbens, G. (2000), "The Role of Propensity Score in Estimating Dose-Response Functions", Biometrika, 87, 3, 706-710.

Lechner, M. (2001), "Identification and Estimation of Causal Effects of Multiple Treatments under the Conditional Independence Assumption", in Lechner, M. and Pfeiffer, F. (eds.), Econometric Evaluation of Labour Market Policies, Heidelberg: Physica/Springer, 43-58.

Rosenbaum, P.R. and Rubin, D.B. (1983), "The Central Role of the Propensity Score in Observational Studies for Causal Effects", Biometrika, 70, 1, 41-55.

Rosenbaum, P.R. and Rubin, D.B. (1985), "Constructing a Control Group Using Multivariate Matched Sampling Methods that Incorporate the Propensity Score", The American Statistician, 39, 1, 33-38.

Rubin, D.B. (1974), "Estimating Causal Effects of Treatments in Randomised and Non-Randomised Studies", Journal of Educational Psychology, 66, 688-701.

Rubin, D.B. (1980), "Bias Reduction Using Mahalanobis-Metric Matching", Biometrics, 36, 293-298.