{smcl}
{* *! version 5.0.3 27Dec2019}{...}
{cmd:help diff}
{hline}

{title:Title}

{p2colset 5 19 21 2}{...}
{p2col:{hi: diff} {hline 2}}Difference in differences estimation{p_end}
{p2colreset}{...}

{title:Syntax}

{p 8 15 2}
{cmd:diff} {it:outcome_var} {ifin} {weight} {cmd:,}[ {it:options}]

{title:Description}

{pstd}
{opt diff} runs several difference in differences (diff-in-diff) treatment effect estimations of a given outcome variable from a pooled baseline and follow-up dataset: Single Diff-in-Diff, Diff-in-Diff accounting for covariates, Kernel Propensity Score Matching diff-in-diff, and Quantile Diff-in-Diff. {opt diff} is also suitable for estimating repeated cross-section diff-in-diff (including the {opt k:ernel} option) and triple difference-in-differences analysis.

{title:Options}

{synoptset 20 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Model - Required}
{synopt :{opt p:eriod(varname)}}Indicates the binary period variable (0: before; 1: after). Note: if your data are recorded at a periodic frequency (monthly, quarterly, yearly, etc.), it is suggested to specify the binary before/after variable in {opt p:eriod(varname)} and to include a binary variable for each period in option {opt c:ov(varlist)}.{p_end}
{synopt :{opt t:reated(varname)}}Indicates the binary treatment variable (0: controls; 1: treated).{p_end}
{syntab: Optional}
{synopt :{opt c:ov(varlist)}}Specifies the pre-treatment covariates of the model. Also use this option to specify time fixed effects in the case of multiple time-frequency data (e.g. monthly, yearly, quarterly, etc.). When option {opt k:ernel} is selected, these variables are used to estimate the propensity score.{p_end}
{synopt :{opt k:ernel}}Performs the Kernel-based Propensity Score Matching diff-in-diff.
This option generates the variable _weights containing the weights derived from the Kernel Propensity Score Matching, and _ps when the Propensity Score is not supplied in {opt ps:core(varname)}, following {stata "ssc des psmatch2" :Leuven and Sianesi (2014)}. This option requires the {opt id(varname)} of each unit or individual, except under the repeated cross section ({opt rcs}) setting.{p_end}
{synopt :{opt id(varname)}}Identification variable of each unit or individual. Required by option {opt k:ernel} unless {opt rcs} is specified.{p_end}
{synopt :{opt bw(#)}}Bandwidth of the Kernel function. The default bandwidth is 0.06.{p_end}
{synopt :{opt kt:ype(kernel)}}Specifies the type of the Kernel function. The types are {it:epanechnikov} (the default), {it:gaussian}, {it:biweight}, {it:uniform} and {it:tricube}.{p_end}
{synopt :{opt rcs}}Indicates that the {opt k:ernel} is set for repeated cross section. This option does not require option {opt id(varname)}. Option {opt rcs} relies on the strong assumption that the covariates in {opt c:ov(varlist)} do not vary over time.{p_end}
{synopt :{opt qd:id(quantile)}}Performs the Quantile Difference in Differences estimation at the specified quantile from 0.1 to 0.9 (quantile 0.5 performs the QDID at the median). You may combine this option with {opt k:ernel} and {opt c:ov}. {opt qd:id} supports neither weights nor robust standard errors. This option uses {manhelp qreg R} and {manhelp bsqreg R} for bootstrapped standard errors.{p_end}
{synopt :{opt ps:core(varname)}}Supplied Propensity Score.{p_end}
{synopt :{opt lo:git}}Specifies logit estimation of the Propensity Score. The default is Probit.{p_end}
{synopt :{opt sup:port}}Performs {opt diff} on the common support of the propensity score given the option {opt k:ernel}.{p_end}
{synopt :{opt add:cov(varlist)}}Specifies covariates in addition to those used in the estimation of the propensity score. Also use this option to specify time fixed effects in the case of multiple time-frequency data (e.g.
monthly, yearly, quarterly, etc.).{p_end}
{synopt :{opt ddd(varname)}}Additional category for triple difference estimation. {opt t:reated(varname)} is taken as the first category and {opt ddd(varname)} as the second category. This option is not compatible with options {opt k:ernel}, {opt test} or {opt qd:id(quantile)}.{p_end}
{syntab:SE/Robust}
{synopt :{opt cl:uster(varname)}}Calculates standard errors clustered by {it:varname}.{p_end}
{synopt :{opt robust}}Calculates robust standard errors.{p_end}
{synopt :{opt bs}}Performs a bootstrap estimation of coefficients and standard errors.{p_end}
{synopt :{opt r:eps(int)}}Specifies the number of repetitions when option {opt bs} is selected. The default is 50 repetitions.{p_end}
{syntab:Balancing test}
{synopt :{opt test}}Performs a balancing t-test of the difference in the means of the covariates between the control and treated groups in period == 0. The option {opt test} combined with {opt k:ernel} performs the balancing t-test with the weighted covariates. See {manhelp ttest R}.{p_end}
{syntab:Reporting}
{synopt :{opt rep:ort}}Displays the inference for the included covariates, or the estimation of the Propensity Score when option {opt k:ernel} is specified.{p_end}
{synopt :{opt nos:tar}}Removes the inference stars from the p-values.{p_end}
{synoptline}
{p 4 6 2}

{title:Exporting results}

{phang}
You can export your results with {stata "ssc des outreg2" :outreg2}.
Run the following command after {cmd:diff} with double difference:{p_end}

{col 9}{phang}{txt}outreg2 using table_diff, ctitle(`r(depvar)') addstat(Mean control t(0), r(mean_c0), Mean treated t(0), r(mean_t0), Diff t(0), r(diff0), Mean control t(1), r(mean_c1), Mean treated t(1), r(mean_t1), Diff t(1), r(diff1)) label adec(3) excel keep(_diff) nocons{p_end}

{phang}
Run the following command after {cmd:diff} with triple difference:{p_end}

{col 9}{phang}{txt}outreg2 using output, ctitle(`r(depvar)') addstat(Mean control - A t(0), r(mean_c0a), Mean control - B t(0), r(mean_c0b), Mean treated - A t(0), r(mean_t0a), Mean treated - B t(0), r(mean_t0b), Diff t(0), r(diff0), Mean control - A t(1), r(mean_c1a), Mean control - B t(1), r(mean_c1b), Mean treated - A t(1), r(mean_t1a), Mean treated - B t(1), r(mean_t1b), Diff t(1), r(diff1)) label excel keep(_diff) nocons dec(4){p_end}

{phang}
Results will be stored in the working directory (also see {cmd:help outreg2} for further options).{p_end}
{synoptline}
{p 4 6 2}

{title:Example}

{phang}
1. Diff-in-Diff with no covariates.{p_end}

{phang}
We use the dataset from Card & Krueger (1994)*.{p_end}
{col 9}{stata "use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta" : use "http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta"}
{col 9}{stata "diff fte, t(treated) p(t)" : diff fte, t(treated) p(t)}

{phang}
For bootstrapped standard errors:{p_end}
{col 9}{stata "diff fte, t(treated) p(t) bs rep(50)" : diff fte, t(treated) p(t) bs rep(50)}

{phang}
2. Diff-in-Diff with covariates.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys)" : diff fte, t(treated) p(t) cov(bk kfc roys)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) report" : diff fte, t(treated) p(t) cov(bk kfc roys) report}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) report bs" : diff fte, t(treated) p(t) cov(bk kfc roys) report bs}

{phang}
3.
Kernel Propensity Score Diff-in-Diff.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support addcov(wendys)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support addcov(wendys)}
{col 9}{stata "diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps)" : diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps)}
{col 9}{stata "diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps) bs reps(50)" : diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps) bs reps(50)}

{phang}
3b. Kernel Propensity Score Diff-in-Diff (Repeated Cross Section - rcs).{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys)}
{col 9}{stata "diff fte, t(treated) p(t) kernel rcs ktype(gaussian) pscore(_ps)" : diff fte, t(treated) p(t) kernel rcs ktype(gaussian) pscore(_ps)}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys) bs reps(50)" : diff fte, t(treated) p(t) cov(bk kfc roys) kernel rcs support addcov(wendys) bs reps(50)}

{phang}
4.
Quantile Diff-in-Diff.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.25)" : diff fte, t(treated) p(t) qdid(0.25)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50)" : diff fte, t(treated) p(t) qdid(0.50)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.75)" : diff fte, t(treated) p(t) qdid(0.75)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys)" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel id(id)" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel id(id)}
{col 9}{stata "diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel rcs" : diff fte, t(treated) p(t) qdid(0.50) cov(bk kfc roys) kernel rcs}

{phang}
5. Balancing test of covariates.{p_end}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test id(id) kernel" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test id(id) kernel}
{col 9}{stata "diff fte, t(treated) p(t) cov(bk kfc roys wendys) test kernel rcs" : diff fte, t(treated) p(t) cov(bk kfc roys wendys) test kernel rcs}

{phang}
6.
Triple differences (treating bk as a second treatment category).{p_end}
{col 9}{stata "diff fte, t(treated) p(t) ddd(bk)" : diff fte, t(treated) p(t) ddd(bk)}

{title:Saved results}

{phang}
{cmd:diff} saves the following scalars in {cmd:r()}:{p_end}

{synoptset 15 tabbed}{...}
{synopt:{cmd:r(N)}} total number of observations{p_end}
{synopt:{cmd:r(N_t0)}} number of observations in period == 0{p_end}
{synopt:{cmd:r(N_t1)}} number of observations in period == 1{p_end}
{synopt:{cmd:r(R2)}} R-squared{p_end}
{synopt:{cmd:r(mean_c0)}} mean of {it:outcome_var} of the control group in period == 0{p_end}
{synopt:{cmd:r(mean_c0a)}} mean of {it:outcome_var} of the control group A in period == 0{p_end}
{synopt:{cmd:r(mean_c0b)}} mean of {it:outcome_var} of the control group B in period == 0{p_end}
{synopt:{cmd:r(mean_t0)}} mean of {it:outcome_var} of the treated group in period == 0{p_end}
{synopt:{cmd:r(mean_t0a)}} mean of {it:outcome_var} of the treated group A in period == 0{p_end}
{synopt:{cmd:r(mean_t0b)}} mean of {it:outcome_var} of the treated group B in period == 0{p_end}
{synopt:{cmd:r(diff0)}} difference of the mean of {it:outcome_var} between treated and control groups in period == 0{p_end}
{synopt:{cmd:r(mean_c1)}} mean of {it:outcome_var} of the control group in period == 1{p_end}
{synopt:{cmd:r(mean_c1a)}} mean of {it:outcome_var} of the control group A in period == 1{p_end}
{synopt:{cmd:r(mean_c1b)}} mean of {it:outcome_var} of the control group B in period == 1{p_end}
{synopt:{cmd:r(mean_t1)}} mean of {it:outcome_var} of the treated group in period == 1{p_end}
{synopt:{cmd:r(mean_t1a)}} mean of {it:outcome_var} of the treated group A in period == 1{p_end}
{synopt:{cmd:r(mean_t1b)}} mean of {it:outcome_var} of the treated group B in period == 1{p_end}
{synopt:{cmd:r(diff1)}} difference of the mean of {it:outcome_var} between treated and control groups in period == 1{p_end}
{synopt:{cmd:r(did)}} difference in differences (treatment effect){p_end}
{synopt:{cmd:r(se_c0)}} standard error of the mean of {it:outcome_var} of the control group in period == 0{p_end}
{synopt:{cmd:r(se_c0a)}} standard error of the mean of {it:outcome_var} of the control group A in period == 0{p_end}
{synopt:{cmd:r(se_c0b)}} standard error of the mean of {it:outcome_var} of the control group B in period == 0{p_end}
{synopt:{cmd:r(se_t0)}} standard error of the mean of {it:outcome_var} of the treated group in period == 0{p_end}
{synopt:{cmd:r(se_t0a)}} standard error of the mean of {it:outcome_var} of the treated group A in period == 0{p_end}
{synopt:{cmd:r(se_t0b)}} standard error of the mean of {it:outcome_var} of the treated group B in period == 0{p_end}
{synopt:{cmd:r(se_d0)}} standard error of the difference of {it:outcome_var} between the treated and control groups in period == 0{p_end}
{synopt:{cmd:r(se_c1)}} standard error of the mean of {it:outcome_var} of the control group in period == 1{p_end}
{synopt:{cmd:r(se_c1a)}} standard error of the mean of {it:outcome_var} of the control group A in period == 1{p_end}
{synopt:{cmd:r(se_c1b)}} standard error of the mean of {it:outcome_var} of the control group B in period == 1{p_end}
{synopt:{cmd:r(se_t1)}} standard error of the mean of {it:outcome_var} of the treated group in period == 1{p_end}
{synopt:{cmd:r(se_t1a)}} standard error of the mean of {it:outcome_var} of the treated group A in period == 1{p_end}
{synopt:{cmd:r(se_t1b)}} standard error of the mean of {it:outcome_var} of the treated group B in period == 1{p_end}
{synopt:{cmd:r(se_d1)}} standard error of the difference of {it:outcome_var} between the treated and control groups in period == 1{p_end}
{synopt:{cmd:r(se_dd)}} standard error of the difference in differences{p_end}
{p2colreset}{...}

{title:Recommended references}

Single diff-in-diff:
{phang}Card, D., Krueger, A.
"Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania". The American Economic Review, Vol. 84, No. 4 (Sep., 1994), pp. 772-793.{p_end} Kernel diff-in-diff: {phang}Heckman, J., Ichimura, H., Todd, P. "Matching As an Econometric Evaluation Estimator". The Review of Economic Studies, Vol. 65, No. 2 (Apr., 1998), pp. 261-294.{p_end} {phang}Leuven, E., Sianesi, B. 2014. "PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing, Statistical Software Components". Boston College Department of Economics.{p_end} Kernel diff-in-diff (repeated cross section): {phang}Blundell, R., Dias, M. "Alternative Approaches to Evaluation in Empirical Microeconomics". Journal of Human Resources, Vol. 44, No. 3 (Jun., 2009), pp. 565-640.{p_end} Quantile diff-in-diff: {phang}Meyer, B., Viscusi, W. "Workers' Compensation and Injury Duration: Evidence from a Natural Experiment". The American Economic Review, Vol. 85, No.3 (Jun., 1995), pp. 322-340.{p_end} Triple difference in differences: {phang}Imbens, G., Wooldridge, J. "Difference-in-Differences Estimation. Lecture Notes 10, Summer '07". NBER (Jul., 2007), pp. 322-340.{p_end} {title:Author} {phang}Juan M. Villa{p_end} {phang}Global Development Institute{p_end} {phang}The University of Manchester{p_end} {phang}juan.villa@manchester.ac.uk{p_end} {phang}Colpensiones{p_end} {phang}{p_end} {phang}Please cite as: Villa, J.M., 2016. diff: Simplifying the estimation of difference-in-differences treatment effects. Stata Journal 16, pp. 52-71.{p_end} {phang}This version: December - 2019.{stata "ssc install diff, replace" : Click here periodically} to get the lastest version.{p_end} {phang}*Acknowledgements to Kit Baum for valuable comments. The Kernel matching is based on the command {stata "ssc des psmatch2":psmatch2} developved by Edwin Leuven and Barbara Sianesi.{p_end}