-------------------------------------------------------------------------------
help for dfl                                                 Joao Pedro Azevedo
-------------------------------------------------------------------------------

DiNardo, Fortin and Lemieux Counterfactual Kernel Density

dfl depvar indepvars [if exp] [in range] , outcome(varname) [group(varname) min(integer) max(integer) nbins(integer) w(bandwidth) step(varlist) adaptive gauss epan oaxaca quietly probit nxvar(string) ncfactual(string) nufactual(string) nfactual(string) graph_combine axis_selection_options axis_scale_options title_options nogroupt name3(string)]

Description

dfl estimates DiNardo, Fortin and Lemieux counterfactual kernel densities. Treatment status is identified by depvar==1 for the treated and depvar==0 for the untreated observations.

The propensity score (the conditional treatment probability) is estimated by the program on the indepvars, using a logit model by default or a probit model if probit is specified.

The outcome variable must be specified with outcome(varname).

dfl also supports graph options such as graph_combine, title_options and axis_selection_options.

Several different weights can be used in this procedure. This ado uses the following weight: Weight = 1-Prob(Depvar=1)/Prob(Depvar=0).
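To make the reweighting idea concrete, the following is a minimal sketch that builds a counterfactual density by hand with official Stata commands only. It uses the textbook DiNardo-Fortin-Lemieux reweighting factor, which is an assumption and is not necessarily the weight that dfl computes internally. The names touse, pscore, pbar and psi are hypothetical and chosen only for this illustration.

```stata
* Sketch only: textbook DFL reweighting done by hand (dfl's own weight may differ)
webuse nlsw88, clear
generate ttl_exp2 = ttl_exp^2
generate lwage = log(wage)

* Propensity score P(union=1 | x) from a logit on the covariates
logit union ttl_exp ttl_exp2 married grade
generate byte touse = e(sample)
predict double pscore if touse, pr

* Unconditional treatment share P(union=1) in the estimation sample
summarize union if touse, meanonly
scalar pbar = r(mean)

* Reweight treated observations so that their covariates resemble the untreated:
* psi(x) = [(1 - p(x)) / p(x)] * [pbar / (1 - pbar)]
generate double psi = ((1 - pscore) / pscore) * (scalar(pbar) / (1 - scalar(pbar))) if union == 1 & touse

* Overlay the two actual densities and the reweighted (counterfactual) one
twoway (kdensity lwage if union == 0 & touse)                  ///
       (kdensity lwage if union == 1 & touse [aweight = psi])  ///
       (kdensity lwage if union == 1 & touse),                 ///
       legend(order(1 "Depvar=0" 2 "Counterfactual" 3 "Depvar=1"))
```

The reweighted curve approximates the outcome distribution of the Depvar=0 group had they been paid like the Depvar=1 group, under the usual assumption that the propensity score model is correctly specified.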

Options

group specifies the group variable.

graph the plots produced are cfactual, ufactual and diff. cfactual compares the (Depvar=0) distribution to the (Depvar=0) distribution that would have prevailed if they had been paid like (Depvar=1). ufactual compares the (Depvar=1) distribution to the counterfactual; these will differ to the extent that the X's of the two groups differ. diff is the difference between the (Depvar=0) distribution that would have prevailed if they had been paid like (Depvar=1) and the (Depvar=1) distribution.

min sets the minimum value. When omitted, dfl uses the minimum value of the specified outcome.

max sets the maximum value. When omitted, dfl uses the maximum value of the specified outcome.

nbins number of equally spaced intervals. The default is 200.

adaptive produces density estimates using adaptive kernel estimation methods. Please note that this method is slower than the default one.

step sequential decomposition. This option runs N+1 models (N being the number of variables in the varlist). The first model considers all variables. Each subsequent model removes one variable at a time. The final output is a single graph overlaying the kernel density differences for the N+1 models. My impression is that if N>2 the figure gets very cluttered.

gauss specifies the Gaussian kernel.

epan specifies the Epanechnikov kernel.

oaxaca computes the mean using the estimated density and compares it to the actual mean of log wages or Oaxaca wages (the predicted values).
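As an illustration of what computing a mean from an estimated density can look like, the sketch below integrates y*f(y) over a kernel density grid and compares the result with the sample mean. This is one plausible reading of the comparison, written with official Stata commands only; it is not taken from the dfl source code. The variable names grid, fhat and yf are hypothetical.

```stata
* Sketch only: mean implied by a kernel density estimate vs. the sample mean
webuse nlsw88, clear
generate lwage = log(wage)

* Evaluate the density at 200 grid points and store the grid and the estimates
kdensity lwage, n(200) generate(grid fhat) nograph

* Mean implied by the density: numerical integral of y * f(y) over the grid
generate double yf = grid * fhat
integ yf grid
display "density-based mean = " r(integral)

* Actual sample mean for comparison
summarize lwage, meanonly
display "sample mean        = " r(mean)
```

The two numbers should be close; any gap reflects the finite grid and the numerical integration rather than a difference in the underlying data.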

quietly suppresses the output of the propensity score estimation.

probit uses probit instead of the default logit to estimate the propensity score.

nxvar assigns a label to the x-axis.

ncfactual assigns a label to cfactual.

nfactual assigns a label to factual.

nufactual assigns a label to ufactual.

nogroupt drops the group names from the figure.

Examples

. webuse nlsw88, clear
. g ttl_exp2 = ttl_exp^2
. g lwage = log(wage)
. dfl union ttl_exp ttl_exp2 married grade , outcome(lwage)
. dfl union ttl_exp ttl_exp2 married grade , outcome(lwage) w(.05)
. dfl union ttl_exp ttl_exp2 married grade , outcome(lwage) adaptive
. dfl union ttl_exp ttl_exp2 married grade , outcome(lwage) step(tenure collgrad)

References

DiNardo, J., N.M. Fortin, and T. Lemieux (1996) "Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semiparametric Approach," Econometrica, 64(5): 1001-1044.

Van Kerm, P. (2003) "Adaptive kernel density estimation," The Stata Journal, 3(2): 148-156.

Author

Joao Pedro Azevedo, World Bank, jazevedo@worldbank.org

Acknowledgements

This ado uses part of the code written by John DiNardo. I would like to thank the Stata users who commented on an earlier release of this ado, in particular Jean Ries. The usual disclaimer applies.

Also see

Manual: [R] kdensity
Online: help for kdensity, and for akdensity, decompose and jmp if installed.