```
help for gdecomp
-------------------------------------------------------------------------------

Decomposition of outcome differentials after nonlinear models

Syntax

gdecomp groupvar [, options ] : estimation_command

where

groupvar specifies a binary (numeric) variable identifying the two
groups;

estimation_command (see help estcom) should begin with logit, logistic,
probit, poisson, or nbreg;

options are
dxweight(high|low) reverse eform level(#) noheader nocoef
dummies(varlist_1 [\ varlist_2 ..])

Description

gdecomp implements a generalized Blinder-Oaxaca decomposition that
applies to categorical and count outcomes (and, correspondingly, to
nonlinear regression models). First, estimation_command is estimated
separately in the two groups defined by groupvar. Then the observed
difference in the mean of the dependent variable between the two groups is
decomposed into three parts: (1) a part due to differences in endowments
(labeled E), (2) a part due to differences in marginal effects (labeled
C), and (3) a part due to differences in baseline predictions or constants
(labeled U).  See the Methods and Formulas section below.

Typed without arguments, gdecomp replays the estimation results.  gdecomp
shares all features of estimation commands; see help estcom for details.

See the Also see section at the bottom of this help file for other commands
carrying out Blinder-Oaxaca decompositions.

Options

dxweight(high|low) affects the calculation of the endowment effect.  If
dxweight(high) is specified, then differences in endowments are
weighted with the marginal effects from the high-outcome group. If
dxweight(low) is specified, then differences in endowments are
weighted with the marginal effects from the low-outcome group.  The
default is dxweight(high).

reverse tells gdecomp that the group with the lower average of the
outcome variable should be treated as the high-outcome group. By
default, gdecomp defines the high-outcome group to be the group with
the largest observed mean of the outcome variable. The default
behavior generalizes the idea that average earnings are higher in the
high-outcome group. The reverse option makes sense and should be used
only if high values of the outcome variable indicate outcomes that are
"negatively" valued (that is, outcomes decreasing subjective utility).
Do not use this option if high values of the outcome variable record
high salaries or being in the labor force; use this option if high
values of the outcome variable record being unemployed.

eform tells gdecomp that the dependent variable is the natural logarithm
of the outcome variable, so that correct marginal effects (changes in
the exponential of the linear prediction) can be calculated.  This
option is useful if the dependent variable is the logarithm of wage.
Warning: this option does not request that the results be displayed
in exponentiated form.

level(#) specifies the confidence level, in percent terms, for the
confidence intervals of the computed statistics; see help level.

noheader suppresses the display of overall and variable-level
decomposition results.

nocoef suppresses the display of the decomposition results for the
variables, and forces gdecomp to display the E, C and U components
(without respective standard errors).

dummies(varlist_1 [\ varlist_2 ... ]) modifies the calculation of
marginal effects for dummy variables. Here, varlist_1 [\ varlist_2
... ] are lists of dummy variables, where all dummies of a list
indicate different categories of the same underlying categorical
variable. Let xvar be a categorical variable with K+1 (K>1)
categories. In this case, not xvar, but K dummies - say, D1, ..., DK
- are included in the regression model. The estimated marginal
effects for these K dummies may be misleading (see an example in the
help file margeff). The correct result is obtained if one specifies
the dummies(D*) option.

Methods and Formulas

Let y1 and y0 be the means of the dependent variable Y in the
high-outcome and the low-outcome groups, respectively (thus y1>y0). Let
x1 and x0 be the row vectors of the means of the explanatory variables
X1,...,Xk, let m1 and m0 be the column vectors of the marginal effects in
groups 1 and 0, and let a1 and a0 be the baseline predictions in groups 1
and 0.

If the dxweight(high|low) option is omitted or dxweight(high) is
specified, then the raw differential y1-y0 is approximated as

y1-y0 = (x1-x0)m1 + x0(m1-m0) + a1-a0

If, however, the dxweight(low) option is specified, then the raw
differential y1-y0 is approximated as

y1-y0 = (x1-x0)m0 + x0(m1-m0) + a1-a0

In both cases, the first term on the right-hand side is the endowments
effect (E), the second term is the coefficient effect (C), and the third
term is the difference due to differences in "constants" (unexplained
part, U).

Author

Tamás Bartus, Corvinus University, Budapest, tamas.bartus@uni-corvinus.hu

Also see

On-line:  help for gdecomp decompose oaxaca if installed
```
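
The arithmetic in the Methods and Formulas section can be illustrated with a small numeric sketch. The Python below is not part of gdecomp and does not reproduce its estimation step; it only evaluates the two formulas as they are written in the help text, taking group means, marginal effects and baseline predictions as given. The function name `decompose` and all input numbers are hypothetical and chosen for illustration only.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def decompose(x1, x0, m1, m0, a1, a0, dxweight="high"):
    """Evaluate the E, C and U terms from the help text's formulas.

    x1, x0: means of the explanatory variables (high/low-outcome group)
    m1, m0: marginal effects in the two groups
    a1, a0: baseline predictions in the two groups
    dxweight: "high" weights (x1-x0) with m1, "low" weights it with m0
    """
    if dxweight not in ("high", "low"):
        raise ValueError("dxweight must be 'high' or 'low'")
    dx = [h - l for h, l in zip(x1, x0)]  # differences in endowments
    dm = [h - l for h, l in zip(m1, m0)]  # differences in marginal effects
    weights = m1 if dxweight == "high" else m0
    e = dot(dx, weights)  # endowments effect (E)
    c = dot(x0, dm)       # coefficient effect (C), x0-weighted as in the text
    u = a1 - a0           # difference in "constants" (U)
    return {"E": e, "C": c, "U": u, "total": e + c + u}


# Hypothetical inputs: two explanatory variables, made-up values.
x1, x0 = [2.0, 1.0], [1.0, 0.5]
m1, m0 = [0.3, 0.2], [0.1, 0.4]
a1, a0 = 0.5, 0.2

high = decompose(x1, x0, m1, m0, a1, a0, dxweight="high")
low = decompose(x1, x0, m1, m0, a1, a0, dxweight="low")
print({k: round(v, 4) for k, v in high.items()})
print({k: round(v, 4) for k, v in low.items()})
```

Only the E term changes between the two weightings, because the C and U terms are written identically in both formulas. With these inputs the two totals differ, which is consistent with the help text describing the right-hand side as an approximation of the raw differential.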