{smcl}
{* 15 April 2014}{...}
{cmd:help glmdeco}{right: ({browse "http://staff.vwi.unibe.ch/kaiser/research.html"})}
{hline}

{title:Title}

{p2colset 5 25 22 2}{...}
{p2col :{hi: glmdeco} {hline 2}}Detailed Decomposition of Mean Outcome Differentials in Generalized Linear Models{p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 17 2}
{cmdab:glmdeco} {depvar} {indepvars} {weight} {ifin} {cmd:,} {opth g:roup(varname)} [{it:options}]

{synoptset 30}{...}
{synopthdr}
{synoptline}
{synopt:{opth g:roup(varname)}}required; binary indicator for group{p_end}
{synopt:{opth l:ink(name)}}GLM link function{p_end}
{synopt:{opth f:amily(name)}}GLM family{p_end}
{synopt:{opth ipw(varname)}}inverse probability weights{p_end}
{synopt:{cmdab:b:oot}}activate the bootstrap{p_end}
{synopt:{opth iter:ation(integer)}}number of bootstrap iterations{p_end}
{synopt:{opth sub:bs(integer)}}subsample bootstrap{p_end}
{synopt:{cmdab:nodetail}}suppress detailed decomposition{p_end}
{synopt:{opth cat1(varlist)}}dummies belonging to categorical variable 1{p_end}
{synopt:...}...{p_end}
{synopt:{opth cat5(varlist)}}dummies belonging to categorical variable 5{p_end}
{synopt:{opth cont1(varlist)}}set 1 of continuous variables{p_end}
{synopt:...}...{p_end}
{synopt:{opth cont5(varlist)}}set 5 of continuous variables{p_end}
{synopt:{cmdab:approx}}calculates Yun's approximate detailed decomposition{p_end}
{synopt:{cmdab:n:ormcoef}}normalizes the coefficients of the dummies in the cat options to be invariant to the choice of the reference group{p_end}
{synoptline}
{p 4 6 2}
{cmd:pweight}s are allowed; see {help weight}.{p_end}


{title:Description}

{pstd}
{cmd:glmdeco} decomposes the mean differential of the outcome {depvar} between {cmd:group}=1 and {cmd:group}=0.
Besides the usual aggregate decomposition, a detailed decomposition is performed to obtain the contributions of individual covariates to the overall gap (Kaiser, 2013).
The counterfactual is the average outcome of group 1 that we would observe if their outcome distribution had been generated by the conditional expectation function of group 0.{p_end}

{pstd}
Use the options {opth l:ink(name)} and {opth f:amily(name)} to fit the GLM of your choice; see {help glm} for details.
Note that these models are estimated consistently as long as the assumed conditional outcome model is correct.{p_end}

{pstd}
If covariates include sets of dummies of categorical variables or polynomials of continuous variables, the user may specify these in the cat- and cont-options, respectively.
In this way, the routine computes the joint contribution of these variables instead of individual contributions.{p_end}


{title:Options}

{phang}
{cmdab:g:roup(}{it:varname}) specifies the binary group variable (0/1).{p_end}

{phang}
{cmdab:l:ink(}{it:name}) specifies the link function. The default is the identity link. See {help glm##linkname} for details.{p_end}

{phang}
{cmdab:f:amily(}{it:name}) specifies the family. The default is the Gaussian family. See {help glm##familyname} for details.{p_end}

{phang}
{cmdab:ipw(}{it:varname}) allows the user to specify inverse probability weights (IPW), which are used in the estimation of the conditional mean models.
In the case of Poisson QML or logit, the IPW defined as P(D=1|X)/(1-P(D=1|X)) produce a doubly robust estimator.
The user is responsible for supplying the correct weights.{p_end}

{phang}
{cmdab:b:oot} computes bootstrapped standard errors. By default, no standard errors are computed.{p_end}

{phang}
{cmdab:iter:ation}({it:integer}) specifies the number of bootstrap iterations. The default is 100.{p_end}

{phang}
{cmdab:sub:bs}({it:integer between 1 and 100}) sets the size of the bootstrap samples as a percentage of the original sample size.
If the user sets 50, the bootstrap is performed on subsamples containing 50% of the original observations.
This can be useful for large datasets.
The default is 100 for samples <10,000 obs., 120-0.002*N for samples 10,000-50,000 obs.
and 20 for samples >50,000 obs.{p_end}

{phang}
{cmd:cat1}({it:varlist}) ... {cmd:cat5}({it:varlist}) can take lists of dummy variables for which the joint contribution is to be computed.
{cmd:Important}: the reference group dummy must be included as the first element of {it:varlist}.
For the first list, use {cmd:cat1}({it:varlist}); for the second list, use {cmd:cat2}({it:varlist}); and so on.
Currently, five sets are supported, i.e. up to {cmd:cat5}({it:varlist}).
Variables included in the {it:cat}-options should not appear in {indepvars}.{p_end}

{phang}
{cmd:cont1}({it:varlist}) ... {cmd:cont5}({it:varlist}) can contain lists of continuous variables for which the joint contribution is to be computed.
Currently, five sets are supported, i.e. up to {cmd:cont5}({it:varlist}).
This option is useful for combining the effects of polynomial terms, e.g. age and age2.
Variables included in the {it:cont}-options should not appear in {indepvars}.{p_end}

{phang}
{cmd:approx} calculates Yun's (2008) approximate detailed decomposition at the sample means of the covariates.{p_end}

{phang}
{cmdab:n:ormcoef} normalizes the coefficients of the sets of dummies specified in the cat-options so that they are invariant to the reference category (cf. Yun 2008).{p_end}


{title:Example}

{phang}
a nonlinear decomposition of the (arithmetic mean) union wage differential using a Poisson quasi-maximum-likelihood estimator:{p_end}

{phang2}{cmd:. webuse nlsw88, clear}{p_end}
{phang2}{cmd:. g ttl_exp2=ttl_exp^2}{p_end}
{phang2}{cmd:. qui tab race, g(drace)}{p_end}
{phang2}{cmd:. 
glmdeco wage south collgrad grade, g(union) boot link(log) f(poi) cont1(ttl_exp*) cat1(drace*)}{p_end}


{title:Saved results}

{phang}{cmd:glmdeco} saves the following in {cmd:r()}:{p_end}

{synoptset 15 tabbed}{...}
{p2col 5 15 19 2: Scalars}{p_end}
{synopt:{cmd:r(Diff)}}raw difference in mean outcomes{p_end}
{synopt:{cmd:r(ADx)}}aggregate characteristics effect{p_end}
{synopt:{cmd:r(ADs)}}aggregate structural effect{p_end}
{synopt:{cmd:r(N1)}}number of observations in group 1{p_end}
{synopt:{cmd:r(N0)}}number of observations in group 0{p_end}
{synopt:{cmd:r(bic1)}}BIC in model of group 1{p_end}
{synopt:{cmd:r(bic0)}}BIC in model of group 0{p_end}
{synopt:{cmd:r(r2_m1)}}pseudo R^2 in model of group 1{p_end}
{synopt:{cmd:r(r2_m0)}}pseudo R^2 in model of group 0{p_end}

{p2col 5 15 19 2: Matrices}{p_end}
{synopt:{cmd:r(M)}}mean estimates{p_end}
{synopt:{cmd:r(R)}}decomposition estimates{p_end}

{p2col 5 15 19 2: Macros}{p_end}
{synopt:{cmd:r(xlist)}}list of detailed decomposition terms{p_end}
{synopt:{cmd:r(subsample)}}size of bootstrap subsamples{p_end}


{title:Requirements}

{phang}
This command requires the package {cmd:distinct}, which you can find by typing {cmd:findit distinct}.{p_end}


{title:References}

{phang}
Kaiser, B. (2015). Detailed decompositions in nonlinear models. {it:Applied Economics Letters}, 22(1), 25-29.{p_end}

{phang}
Yun, M.-S. (2008). Identification problem and detailed Oaxaca decomposition: A general solution and inference. {it:Journal of Economic and Social Measurement}, 33(1), 27-38.{p_end}


{title:If you use {cmd:glmdeco} in your work, please cite it as follows}

{pstd}
Kaiser, B. (2015). Detailed decompositions in nonlinear models. {it:Applied Economics Letters}, 22(1), 25-29.{p_end}


{title:Disclaimer}

{p 4 4 2}THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED.
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

{p 4 4 2}IN NO EVENT WILL THE COPYRIGHT HOLDERS OR THEIR EMPLOYERS, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THIS SOFTWARE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM.


{title:Author}

{pstd}For questions, queries or suggestions, please contact{p_end}
{pstd}Boris Kaiser, bo.kaiser@gmx.ch{p_end}