{smcl}
{* *! version 5.0.2 01Jul2017}{...}
{cmd:help diff}
{hline}
{title:Title}
{p2colset 5 19 21 2}{...}
{p2col:{hi: diff} {hline 2}}Difference in differences estimation{p_end}
{p2colreset}{...}
{title:Syntax}
{p 8 15 2}
{cmd:diff}
{it:outcome_var}
{ifin}
{weight}
{cmd:,}[ {it:options}]
{title:Description}
{pstd}
{opt diff} runs several difference-in-differences (diff-in-diff) treatment effect estimations of
a given outcome variable from a pooled baseline and follow-up dataset: Single Diff-in-Diff, Diff-in-Diff
accounting for covariates, Kernel Propensity Score Matching Diff-in-Diff, and Quantile Diff-in-Diff.
{opt diff} is also suitable for estimating diff-in-diff with repeated cross sections (including with the {opt k:ernel} option)
and for triple difference-in-differences analysis.
{title:Options}
{synoptset 20 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Model - Required}
{synopt :{opt p:eriod(varname)}}Indicates the binary period variable (0: before; 1: after). Note: if your data are observed at a regular
frequency (monthly, quarterly, yearly, etc.), it is suggested to specify option {opt p:eriod(varname)} and to include a binary variable
for each time period in option {opt c:ov(varlist)}.{p_end}
{synopt :{opt t:reated(varname)}}Indicates the binary treatment variable (0: controls; 1: treated).{p_end}
{syntab: Optional}
{synopt :{opt c:ov(varlist)}}Specifies the pre-treatment covariates of the model. Also use this option to specify time fixed-effects
in the case of multiple time-frequency data (e.g. monthly, yearly, quarterly, etc.). When
option {opt k:ernel} is selected these variables are used to estimate the propensity score.{p_end}
{synopt :{opt k:ernel}}Performs the Kernel-based Propensity Score Matching diff-in-diff. This option
generates the variable _weights containing the weights derived from the Kernel Propensity Score Matching,
and _ps when the Propensity Score is not supplied in {opt ps:core(varname)}, following {stata "ssc des psmatch2" :Leuven and Sianesi (2014)}.
This option requires the {it: id(varname)} of each unit or individual, except under the repeated cross section
({opt rcs}) setting.{p_end}
{synopt :{opt id(varname)}}Specifies the identification variable of each unit or individual. Required by option {opt k:ernel} unless {opt rcs} is specified.{p_end}
{synopt :{opt bw(#)}}Supplied bandwidth of the Kernel function. The default bandwidth is 0.06.{p_end}
{synopt :{opt kt:ype(kernel)}}Specifies the type of the Kernel function. The types are {it:epanechnikov} (the default),
{it:gaussian}, {it:biweight}, {it:uniform} and {it:tricube}.{p_end}
{synopt :{opt rcs}}Indicates that the {opt k:ernel} is set for repeated cross section. This option
does not require option {opt id(varname)}. Option {opt rcs} strongly assumes that covariates in {opt c:ov(varlist)}
do not vary over time.{p_end}
{synopt :{opt qd:id(quantile)}}Performs the Quantile Difference in Differences estimation at the specified
quantile from 0.1 to 0.9 (quantile 0.5 performs the QDID at the median). You may combine this option with
{opt k:ernel} and {opt c:ov}. {opt qd:id} supports neither weights nor robust standard errors.
This option uses {manhelp qreg R}, and {manhelp bsqreg R} for bootstrapped standard errors.{p_end}
{synopt :{opt ps:core(varname)}}Supplied Propensity Score.{p_end}
{synopt :{opt lo:git}}Specifies logit estimation of the Propensity Score. The default is Probit.{p_end}
{synopt :{opt sup:port}}Performs {opt diff} on the common support of the propensity score given the option {opt k:ernel}.{p_end}
{synopt :{opt add:cov(varlist)}}Specifies covariates in addition to those used
in the estimation of the propensity score. Also use this option to specify time fixed-effects
in the case of multiple time-frequency data (e.g. monthly, yearly, quarterly, etc.).{p_end}
{synopt :{opt ddd(varname)}}Additional category for triple difference estimation. {opt t:reated(varname)} is deemed as the
first category and {opt ddd(varname)} the second category. This option is not compatible with options {opt k:ernel}, {opt test}
or {opt qd:id(quantile)}.{p_end}
{syntab:SE/Robust}
{synopt :{opt cl:uster(varname)}}Calculates clustered Std. Errors by {it: varname}.{p_end}
{synopt :{opt robust}}Calculates robust Std. Errors.{p_end}
{synopt :{opt bs}}Performs a Bootstrap estimation of coefficients and standard errors.{p_end}
{synopt :{opt r:eps(int)}}Specifies the number of repetitions when option {opt bs} is
selected. The default is 50 repetitions.{p_end}
{syntab:Balancing test}
{synopt :{opt test}}Performs a balancing t-test of the difference in the means of the
covariates between the control and treated groups in period == 0. The option {it: test}
combined with {it: kernel} performs the balancing t-test with the weighted covariates.
See {manhelp ttest R}.{p_end}
{syntab:Reporting}
{synopt :{opt rep:ort}}Displays the inference of the included covariates or the
estimation of the Propensity Score when option {opt kernel} is specified.{p_end}
{synopt :{opt nos:tar}}Removes the inference stars from the p-values.{p_end}
{synoptline}
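{phang} A minimal sketch of the time fixed-effects advice given under {opt p:eriod(varname)} and {opt c:ov(varlist)}, assuming a hypothetical
yearly dataset with variables {it:y}, {it:treated} and {it:year} (2001 to 2005) in which treatment starts in 2004. All variable names are illustrative;
one pre-treatment and one post-treatment year dummy are left out here to avoid collinearity with the period variable:{p_end}
{col 9}{cmd:generate byte post = (year >= 2004) if !missing(year)}
{col 9}{cmd:tabulate year, generate(yr_)}
{col 9}{cmd:diff y, t(treated) p(post) cov(yr_2 yr_3 yr_5)}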
{p 4 6 2}
{title:Exporting results}
{phang} You can export your results with {stata "ssc des outreg2" :outreg2}. Run the following command after {cmd: diff}
with double difference:{p_end}
{col 9}{phang}{txt}outreg2 using table_diff, ctitle(`r(depvar)') addstat(Mean control t(0), r(mean_c0), Mean treated t(0), r(mean_t0), Diff t(0), r(diff0), Mean control t(1), r(mean_c1), Mean treated t(1), r(mean_t1), Diff t(1), r(diff1)) label adec(3) excel keep(_diff) nocons{p_end}
{phang} Run the following command after {cmd: diff} with triple difference:{p_end}
{col 9}{phang}{txt}outreg2 using output, ctitle(`r(depvar)') addstat(Mean control - A t(0), r(mean_c0a), Mean control - B t(0), r(mean_c0b), Mean treated A - t(0), r(mean_t0a), Mean treated B - t(0), r(mean_t0b), Diff t(0), r(diff0), Mean control - A t(1), r(mean_c1a), Mean control - B t(1), r(mean_c1b), Mean treated - A t(1), r(mean_t1a), Mean treated - B t(1), r(mean_t1b), Diff t(1), r(diff1)) label excel keep(_diff) nocons dec(4) {p_end}
{phang} Results will be stored in the working directory (also see {cmd: help outreg2} for further options).{p_end}
{synoptline}
{p 4 6 2}
{title:Example}
{phang} 1. Diff-in-Diff with no covariates.{p_end}
{phang} We use the dataset from Card & Krueger (1994)*.{p_end}
{col 9}{stata "use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta" : use "http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta"}
{col 9}{stata "diff fte, t(treated) p(t)" : diff fte, t(treated) p(t)}
{phang} For bootstrapped std. err.:{p_end}
{col 9}{stata "diff fte, t(treated) p(t) bs rep(50)" : diff fte, t(treated) p(t) bs rep(50)}
{phang} 2. Diff-in-Diff with covariates.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys)" : diff fte, t(treated) p(t) cov(bk kfc roys)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) report" : diff fte, t(treated) p(t) cov(bk kfc roys) report}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) report bs" : diff fte, t(treated) p(t) cov(bk kfc roys) report bs}
{phang} 3. Kernel Propensity Score Diff-in-Diff.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support addcov(wendys)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support addcov(wendys)}
{col 9}{stata "diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps)" : diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps)}
{col 9}{stata "diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps) bs reps(50)" : diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps) bs reps(50)}
{phang} 3b. Kernel Propensity Score Diff-in-Diff (Repeated Cross Section - rcs).{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys)}
{col 9}{stata "diff fte, t(treated) p(t) kernel rcs ktype(gaussian) pscore(_ps)" : diff fte, t(treated) p(t) kernel rcs ktype(gaussian) pscore(_ps)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys) bs reps(50)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys) bs reps(50)}
{phang} 4. Quantile Diff-in-Diff.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.25)" : diff fte, t(treated) p(t) qdid(0.25)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50)" : diff fte, t(treated) p(t) qdid(0.50)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.75)" : diff fte, t(treated) p(t) qdid(0.75)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys)" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel id(id)" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel id(id)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel rcs" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel rcs}
{phang} 5. Balancing test of covariates.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test id(id) kernel" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test id(id) kernel}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test kernel rcs" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test kernel rcs}
{phang} 6. Triple differences (consider bk is a second treatment category).{p_end}
{col 9}{stata "diff fte, t(treated) p(t) ddd(bk)" : diff fte, t(treated) p(t) ddd(bk)}
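{phang} As a sketch of the {opt robust} and {opt cl:uster(varname)} options described under SE/Robust, assuming the same Card & Krueger dataset is in memory
and using {it:id}, the unit identifier from the kernel examples, as the cluster variable (an illustrative choice):{p_end}
{col 9}{stata "diff fte, t(treated) p(t) robust" : diff fte, t(treated) p(t) robust}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) cluster(id)" : diff fte, t(treated) p(t) cov(bk kfc roys) cluster(id)}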
{title:Saved results}
{phang} {cmd: diff} saves the following list of scalars in {cmd: r()}:{p_end}
{synoptset 15 tabbed}{...}
{synopt:{cmd:r(N)}} total number of observations.{p_end}
{synopt:{cmd:r(N_t0)}} number of observations in period == 0.{p_end}
{synopt:{cmd:r(N_t1)}} number of observations in period == 1.{p_end}
{synopt:{cmd:r(R2)}} R-squared{p_end}
{synopt:{cmd:r(mean_c0)}} mean of {it:outcome_var} of the control group in period == 0{p_end}
{synopt:{cmd:r(mean_c0a)}} mean of {it:outcome_var} of the control group A in period == 0{p_end}
{synopt:{cmd:r(mean_c0b)}} mean of {it:outcome_var} of the control group B in period == 0{p_end}
{synopt:{cmd:r(mean_t0)}} mean of {it:outcome_var} of the treated group in period == 0{p_end}
{synopt:{cmd:r(mean_t0a)}} mean of {it:outcome_var} of the treated group A in period == 0{p_end}
{synopt:{cmd:r(mean_t0b)}} mean of {it:outcome_var} of the treated group B in period == 0{p_end}
{synopt:{cmd:r(diff0)}} difference of the mean of {it:outcome_var} between treated and
control groups in period == 0{p_end}
{synopt:{cmd:r(mean_c1)}} mean of {it:outcome_var} of the control group in period == 1{p_end}
{synopt:{cmd:r(mean_c1a)}} mean of {it:outcome_var} of the control group A in period == 1{p_end}
{synopt:{cmd:r(mean_c1b)}} mean of {it:outcome_var} of the control group B in period == 1{p_end}
{synopt:{cmd:r(mean_t1)}} mean of {it:outcome_var} of the treated group in period == 1{p_end}
{synopt:{cmd:r(mean_t1a)}} mean of {it:outcome_var} of the treated group A in period == 1{p_end}
{synopt:{cmd:r(mean_t1b)}} mean of {it:outcome_var} of the treated group B in period == 1{p_end}
{synopt:{cmd:r(diff1)}} difference of the mean of {it:outcome_var} between treated and
control groups in period == 1{p_end}
{synopt:{cmd:r(did)}} difference-in-differences treatment effect{p_end}
{synopt:{cmd:r(se_c0)}} standard error of the mean of {it:outcome_var} of the control group
in period == 0{p_end}
{synopt:{cmd:r(se_c0a)}} standard error of the mean of {it:outcome_var} of the control group A
in period == 0{p_end}
{synopt:{cmd:r(se_c0b)}} standard error of the mean of {it:outcome_var} of the control group B
in period == 0{p_end}
{synopt:{cmd:r(se_t0)}} standard error of the mean of {it:outcome_var} of the treated group
in period == 0{p_end}
{synopt:{cmd:r(se_t0a)}} standard error of the mean of {it:outcome_var} of the treated group A
in period == 0{p_end}
{synopt:{cmd:r(se_t0b)}} standard error of the mean of {it:outcome_var} of the treated group B
in period == 0{p_end}
{synopt:{cmd:r(se_d0)}} standard error of the difference of {it:outcome_var} between the
treated and control groups in period == 0{p_end}
{synopt:{cmd:r(se_c1)}} standard error of the mean of {it:outcome_var} of the control group
in period == 1{p_end}
{synopt:{cmd:r(se_c1a)}} standard error of the mean of {it:outcome_var} of the control group A
in period == 1{p_end}
{synopt:{cmd:r(se_c1b)}} standard error of the mean of {it:outcome_var} of the control group B
in period == 1{p_end}
{synopt:{cmd:r(se_t1)}} standard error of the mean of {it:outcome_var} of the treated group
in period == 1{p_end}
{synopt:{cmd:r(se_t1a)}} standard error of the mean of {it:outcome_var} of the treated group A
in period == 1{p_end}
{synopt:{cmd:r(se_t1b)}} standard error of the mean of {it:outcome_var} of the treated group B
in period == 1{p_end}
{synopt:{cmd:r(se_d1)}} standard error of the difference of {it:outcome_var} between the
treated and control groups in period == 1{p_end}
{synopt:{cmd:r(se_dd)}} standard error of the difference in differences{p_end}
{p2colreset}{...}
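{phang} As a brief illustration (a sketch that assumes the Card & Krueger data from the Example section is in memory), the saved scalars can be listed
and combined right after estimation, for instance to form the ratio of the diff-in-diff estimate to its standard error:{p_end}
{col 9}{stata "diff fte, t(treated) p(t)" : diff fte, t(treated) p(t)}
{col 9}{stata "return list" : return list}
{col 9}{stata "display r(did)/r(se_dd)" : display r(did)/r(se_dd)}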
{title:Recommended references}
Single diff-in-diff:
{phang}Card, D., Krueger, A. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania".
The American Economic Review, Vol. 84, No. 4 (Sep., 1994), pp. 772-793.{p_end}
Kernel diff-in-diff:
{phang}Heckman, J., Ichimura, H., Todd, P. "Matching As an Econometric Evaluation Estimator". The Review of Economic Studies, Vol. 65,
No. 2 (Apr., 1998), pp. 261-294.{p_end}
{phang}Leuven, E., Sianesi, B. 2014. "PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching,
common support graphing, and covariate imbalance testing". Statistical Software Components, Boston College Department of Economics.{p_end}
Kernel diff-in-diff (repeated cross section):
{phang}Blundell, R., Dias, M. "Alternative Approaches to Evaluation in Empirical Microeconomics". Journal of Human Resources, Vol. 44,
No. 3 (Jun., 2009), pp. 565-640.{p_end}
Quantile diff-in-diff:
{phang}Meyer, B., Viscusi, W., Durbin, D. "Workers' Compensation and Injury Duration: Evidence from a Natural Experiment". The American
Economic Review, Vol. 85, No. 3 (Jun., 1995), pp. 322-340.{p_end}
Triple difference in differences:
{phang}Imbens, G., Wooldridge, J. "Difference-in-Differences Estimation. Lecture Notes 10, Summer '07". NBER (Jul., 2007).{p_end}
{title:Author}
{phang}Juan M. Villa{p_end}
{phang}Global Development Institute{p_end}
{phang}The University of Manchester{p_end}
{phang}juan.villa@manchester.ac.uk{p_end}
{phang}Inter-American Development Bank{p_end}
{phang}juanmiguelv@iadb.org{p_end}
{phang}Please cite as: Villa, J.M., 2016. diff: Simplifying the estimation of
difference-in-differences treatment effects. Stata Journal 16, pp. 52-71.{p_end}
{phang}Update: August - 2017. {stata "ssc install diff, replace" : Click here periodically} to get the latest version.{p_end}
{phang}*Acknowledgements to Kit Baum for valuable comments. The Kernel matching is based on the command {stata "ssc des psmatch2":psmatch2} developed by Edwin
Leuven and Barbara Sianesi.{p_end}