help khb
-------------------------------------------------------------------------------

Title

    khb    Decomposition of effects in non-linear probability models using the KHB method

Syntax

    khb model-type depvar key-vars || z-vars [if] [in] [weight] [, options]

    options                Description
    -------------------------------------------------------------------------
    Main
      concomitant(varlist) concomitants
      disentangle          disentangle difference for each z-var
      summary              summary of decomposition
      or                   exponentiated coefficients
      vce(vcetype)         vcetype may be robust, cluster clustvar
      ape                  decomposition using average partial (marginal) effects
      continuous           treat dummy variable as continuous when using ape-method
      notable              suppress coefficient table
      verbose              show restricted and full model
      keep                 keep residuals of z-vars
      xstandard            standardize key-vars
      zstandard            standardize z-vars

    Model-type specific
      outcome(outcome)     outcome used for decomposition when model-type is mlogit
      baseoutcome(#)       value of depvar that will be the base outcome when model-type is mlogit
      group(varname)       necessary options for model-types rologit and clogit; see help of these models
      other                all options allowed for the specified model-type
    -------------------------------------------------------------------------

model-type can be any of regress, logit, ologit, probit, oprobit, cloglog, slogit, scobit, rologit, clogit, xtlogit, xtprobit, and mlogit. Other models might also produce output, but for the time being this output is considered to be "experimental".

depvar is the name of the dependent variable, key-vars is a varlist holding the name(s) of the variable(s) to be decomposed, and z-vars is a varlist holding the name(s) of control variables of interest.

Factor variables are allowed for key-vars. Factor variables for z-vars are only allowed for Stata 12 or higher. Factor variables for key-vars are not allowed if option xstandard is specified.

aweights, fweights, iweights, and pweights are allowed if they are allowed in the specified model-type; see weight.

Description

khb applies the KHB method, which was developed to compare the estimated coefficients between two nested non-linear probability models (Karlson/Holm/Breen 2011; Breen/Karlson/Holm 2010). An important use of the technique is to decompose the total effect of a variable into a direct and an indirect or spurious part. The method was developed for binary logit and probit models, but this command also covers other nonlinear probability models (ordered and multinomial) and linear regression. Contrary to other decomposition methods, the KHB method gives unbiased decompositions, decomposes effects of both discrete and continuous variables, and provides analytically derived statistical tests for many models of the GLM family.

In linear regression models, decomposing the total effect into direct and indirect/spurious effects is straightforward. The decomposition is done by comparing the estimated coefficient of a key variable of interest (key-var) between a reduced model without a control variable Z and a full model with one or more Z variables added. The difference between the estimated coefficients of the key variable in the two models expresses the amount by which the effect of the key variable is confounded by the z-variable(s). If the control variable is hypothesized to be a consequence of the key variable, the difference is commonly termed the "indirect effect"; if the control variable is hypothesized to be a cause of the key variable, the difference is termed the "spurious effect".

The strategy described for linear models cannot be used in the context of nonlinear probability models such as logit and probit, because the estimated coefficients of these models are not comparable between different models. The reason is a rescaling of the model induced by a property of these models: the coefficients and the error variance are not separately identified. The KHB method solves this problem. It allows the comparison of effects of nested models for many models of the GLM framework, including logit, probit, ologit, oprobit, and mlogit. The basic idea of the method is to compare the full model with a reduced model that substitutes some z-variables by the residuals of the z-variables from a regression of the z-variables on the key-vars (see Karlson/Holm/Breen 2011 for explanations and details). The method consequently allows separation of the change in the coefficient that is due to confounding from the change that is due to rescaling.

The KHB method also allows the inclusion of variables that control for confounding influences on the decomposition. These variables are named concomitants in Karlson/Holm (2011) and Breen/Karlson/Holm (2010). They do not play the role of the z-variables in the y*-x relationship; rather, they are included to ensure that the effects of neither the full model nor the reduced model are confounded by them.
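The residualization idea described above can be sketched by hand for one key variable and one z-variable. This is only an illustration under stated assumptions, not the internal implementation of khb: the variables univ, fses, and abil come from the example data used in this help file, while abil_res, b_full, and b_reduced are hypothetical names introduced here. The sketch gives point estimates only, with no standard errors for the difference.

```stata
* By-hand sketch of the KHB idea; assumes dlsy_khb.dta is in the working directory
use dlsy_khb.dta, clear

* Full model: key variable plus the z-variable
logit univ fses abil
scalar b_full = _b[fses]

* Residualize the z-variable on the key variable, using the same sample
regress abil fses if e(sample)
* abil_res is a hypothetical variable name
predict double abil_res if e(sample), residuals

* Reduced model: the z-variable is replaced by its residual
logit univ fses abil_res
scalar b_reduced = _b[fses]

* Change in the coefficient that is due to confounding, net of rescaling
display "reduced: " b_reduced "   full: " b_full "   difference: " b_reduced - b_full
```

Because the residual is uncorrelated with the key variable by construction, both models are on the same scale, which is what makes the two coefficients comparable.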

The KHB method is primarily intended for variants of logit and probit models. However, it can also be used for linear regression, in which case it returns the same results as the standard technique. khb is then just a convenient way to do the decomposition with a single command.

Note that using regress as model-type for binary dependent variables boils down to using a linear probability model for the decomposition. However, the interpretation of decompositions in linear probability models is unknown, and may not reflect the parameters of interest (in particular the indirect effect). Caution should consequently be exercised, and the authors do not recommend using khb for linear probability models until the properties of these models have been explored formally.

A worked example using khb appears in Breen/Karlson/Holm (2010).
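For a continuous dependent variable, the standard technique for linear regression mentioned above can be sketched on simulated data. The variable names y, x, and z, the seed, and all simulation parameters are assumptions made for illustration only. The last line assumes that khb is installed.

```stata
* Simulated data with a continuous outcome; all names and values are hypothetical
clear
set seed 12345
set obs 1000
generate x = rnormal()
generate z = 0.5*x + rnormal()
generate y = 1 + 0.4*x + 0.6*z + rnormal()

* Standard technique: compare the coefficient of x across nested models
regress y x
scalar b_reduced = _b[x]
regress y x z
scalar b_full = _b[x]
display "difference: " b_reduced - b_full

* The same decomposition with a single command
khb regress y x || z
```

According to the description above, the difference reported by khb should match the difference computed by hand.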

Options

summary requests a decomposition summary for all key-vars. By default, khb reports the effects of all key variables along with standard errors in terms of the estimated coefficients. With option summary, khb also presents a table holding the "confounding ratios", the "percentage reduction due to confounding", and the "rescale factor". The confounding ratio measures the impact of confounding net of rescaling. The percentage reduction measures the percentage change in the coefficient of each key-var attributable to confounding, net of rescaling. Finally, the rescale factor measures the impact of rescaling, net of confounding.

disentangle requests a table that shows how much of the difference between the full and reduced model is contributed by each single z-variable.

notable suppresses the display of the coefficient table. This is normally used together with the options summary and/or disentangle.

concomitant(varlist) is used to specify control variables that are not z-variables. Factor variables are allowed.

vce(vcetype) specifies the type of standard error reported. It defaults to Stata's default for the specified model-type. Standard errors for indirect effects are estimated using a method discussed by Sobel (1982). The option vce() sets the standard errors for total and direct effects and controls the type of standard error that enters into Sobel's method. Supported types are robust and cluster clustvar; see help vce_option.

ape is used to decompose the key-vars using average partial effects (average marginal effects). khb uses margins to compute the average partial effects. For model-types ologit and oprobit, khb uses the average partial effect on the probability of the first outcome unless outcome() is specified; see ologit_postestimation for various ways to specify outcome(). Note that with APE the calculated indirect effect is not constant across outcomes. This is a well-known property of ordered choice models (see Greene/Hensher 2010).

or exponentiates the estimated coefficients, and hence shows odds ratios for logit models. The exponentiated coefficient of the reduced model is then the product of the exponentiated coefficient of the full model and the exponentiated difference.
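The multiplicative relation follows from exp(a + b) = exp(a)*exp(b). A minimal arithmetic check, using made-up coefficient values (b_full and b_diff are hypothetical and not output of khb):

```stata
* Hypothetical coefficients on the logit scale
scalar b_full = 0.30
scalar b_diff = 0.15

* Reduced-model coefficient is the sum on the logit scale ...
display exp(b_full + b_diff)

* ... and therefore the product on the odds-ratio scale
display exp(b_full)*exp(b_diff)
```

Both display commands print the same number.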

verbose is used to show the complete output of the full and restricted models that are used to estimate the decomposition. This is especially useful for detecting problems that occur in the intermediate steps of the estimation.

keep is used to keep the residuals of the z-variables, i.e. the z-variables net of confounding. These residuals are included as independent variables in the reduced model.

continuous treats dummy variables as continuous variables when average partial effects are computed. By default, average partial effects are based on unit effects for dummy variables. See margins for details about this option.

xstandard is used to standardize the key-vars.

zstandard is used to standardize the z-vars.

outcome(outcome) specifies the outcome for which the decomposition is to be calculated. This takes effect for models for multinomial response (mlogit) and, if option ape is specified, for ordered response models. outcome() can be specified using

    #1, #2, ..., where #1 means the first category of the dependent variable, #2 means the second category, etc.;

    the values of the dependent variable; or

    the value labels of the dependent variable if they exist.

baseoutcome(#) can be used for model-type mlogit. It specifies the value of depvar to be treated as the base outcome. The default is to choose the most frequent outcome. The option can be used together with outcome() to fully control the contrast for which the decomposition is done.

Example(s)

    . use dlsy_khb.dta
    . khb logit univ fses || abil
    . khb probit univ fses || abil
    . khb logit univ fses || abil, c(intact boy)
    . khb logit univ fses || abil, summary

References

Breen, R./Karlson, K.B./Holm, A. (Forthcoming). Total, direct, and indirect effects in logit models. Accepted for publication in: Sociological Methods and Research.

Greene, W.H./Hensher, D.A. (2010): Modeling Ordered Choices: A Primer. New York: Cambridge University Press.

Karlson, K.B./Holm, A./Breen, R. (2011): Comparing Regression Coefficients Between Same-sample Nested Models using Logit and Probit. A New Method. Sociological Methodology 42:286-313.

Karlson, K.B./Holm, A. (2011): Decomposing primary and secondary effects: A new decomposition method. Research in Stratification and Social Mobility 29:221-237.

Kohler, U./Karlson, K.B./Holm, A. (2011): Comparing coefficients of nested nonlinear probability models. The Stata Journal 11:420-438.

Also see

Manual: [R] margins

Online: help for margins, ldecomp (if installed)

Web: Stata's Home

Author

Ulrich Kohler (kohler@wzb.eu) and Kristian Karlson (kbk@dpu.dk)

Please send bug reports and questions regarding the program to Ulrich Kohler. Questions regarding the KHB method itself are handled by Kristian Karlson.