{smcl} {* 31aug2006}{...} {hline} help for {hi:devcon} {hline} {title:Deviation contrast transformation for estimation results} {p 8 14 2}{cmd:devcon} [ {cmd:,} {cmdab:g:roups:(}{it:glist}{cmd:)} {cmdab:eq:uations:(}{it:numlist}{cmd:)} {cmd:check}[{cmd:(}{it:#}{cmd:)}] {cmdab:non:oise} {cmdab:l:evel:(}{it:#}{cmd:)} ] where {it:glist} is {p 8 8 2}{it:varlist1} [{cmd:(}{it:varname1}{cmd:)}] [{cmd:,} {it:varlist2} [{cmd:(}{it:varname2}{cmd:)}] [{cmd:,} {it:...}] ] {title:Description} {p 4 4 2} A categorical regressor is usually included in a regression model using a set of 0/1 dummies differentiating the effects of the separate categories of the variable. A coefficient associated with such a dummy variable reflects the expected outcome difference between the represented category and some reference category. Since one of the categories serves as the reference category, only {it:k}-1 dummy variables are used for a {it:k}-category variable. {p 4 4 2}{cmd:devcon} may be used to transform the coefficients of such 0/1 dummy variables so that they reflect deviations from the "grand mean" (in other words, the modified coefficients will sum up to zero over all categories) rather than deviations from the reference category. The transformed coefficients are equivalent to those obtained by using the so called "effects coding" for the dummy variables (see the {cmd:e} prefix in {cmd:xi3} or the {cmd:dev()} contrast in {cmd:desmat}; both packages are available from the SSC Archive). However, {cmd:devcon} reports coefficients for {it:all} categories (including the category that was used as the reference category in the original model) and modifies the model's constant accordingly (with the effects coding, the coefficient of one of the categories is "hidden" in the constant). Furthermore, the coding of the underlying dummy variables is still 0/1 with {cmd:devcon}. {p 4 4 2}The deviation contrast transformation is applied to the last (i.e. currently active) estimates. Use the {cmd:groups()} option to define the group(s) of dummy variables. {cmd:devcon} specified without the {cmd:groups()} option may be used to redisplay estimates that have already been transformed by {cmd:devcon}. The {cmd:devcon} routine will work after most estimation commands (see help {help estcom}). Multiple equation models are supported. Use the {cmd:equations()} option to specify the equation(s) to be transformed. Note that {cmd:devcon} also transforms the variance-covariance matrix of the estimates and that the usual post estimation commands such as {help predict} or {help test} may be used with the transformed estimates. {p 4 4 2}The {cmd:devcon} command has two main benefits. First, it may be very convenient to use {cmd:devcon} to quickly display the deviation contrasts without having to change the coding of the variables and without having to take further action to make the reference category's coefficient visible. Second, the transformed estimates may be valuable for use with some post-estimation procedures. In fact, {cmd:devcon} was originally developed for use with with the Oaxaca-Blinder decomposition (see help {help oaxaca} if installed; the package is available from the SSC Archive, type {net "describe http://fmwww.bc.edu/RePEc/bocode/o/oaxaca":{bind:ssc describe oaxaca}}). In this decomposition, the results for categorical variables depend on the choice of the reference category (see, e.g., Oaxaca and Ransom 1999). Applying the deviation contrast transformation to the estimates before conducting the decomposition is one solution to this problem (see Yun 2003). {it:Technical note} {p 4 4 2} The deviation contrast transform can also be applied to the variables used to model an interaction between a categorical and a continuous variable. The relevant continuous variable must be provided in parentheses within the {cmd:groups()} option in such a case. {title:Options} {p 4 8 2} {cmd:groups(}{it:glist}{cmd:)} defines the dummy-variable groups. If more than one group is specified, use commas to separate the groups. Note that in each of the groups a variable reflecting the reference category must be specified (i.e. the variable must exist in the data). If the variables in a group represent interactions with a continuous variable, specify the continuous variable in parentheses at the end of the group. The usual shorthand conventions apply to the {it:varlist}s specified in {it:glist} (see help {help varlist}). {p 4 8 2}{cmd:equations(}{it:numlist}{cmd:)} is relevant only for multiple-equation models. It specifies the equation(s) to be transformed. Use numbers to refer to the equations' positions in the model ({cmd:1} for the first equation, {cmd:2} for the second, and so on). The usual shorthand conventions apply to {it:numlist} (see help {help numlist}). The default is {cmd:equations(1)}. {p 4 8 2} {cmd:check}[{cmd:(}{it:#}{cmd:)}] checks the integrity of the normalized estimates by verifying that the linear predictions from the original estimates and the normalized estimates are equal for all observations in the estimation sample. If the results do not pass the check, an error message is issued and no results are returned. By default, the check is performed using the models's first equation. To use another equation, specify its number in parentheses. A failed check indicates that the dummy variables used are not well defined (i.e. that the indicated groups overlap or that at least one group has been omitted). In rare cases, however, the results might fail the check even though the dummy variables have been correctly defined ({cmd:devcon} uses the information in {cmd:e(sample)} and, if available, {cmd:e(subpop)} to determine the sample of relevant cases; situations may arise in which the sample would have to be narrowed further). {p 4 8 2} {cmd:nonoise} suppresses the display of the transformed estimates. {p 4 8 2}{cmd:level(}{it:#}{cmd:)} specifies the confidence level, in percent terms, for the confidence intervals of the coefficients; see help {help level}. {title:Example} {p 4 4 2} Standard application with one categorical variable ... {com}. sysuse auto . generate rep1 = rep78 <= 3 if rep78 < . . generate rep2 = rep78 == 4 if rep78 < . . generate rep3 = rep78 == 5 if rep78 < . . logit foreign mpg rep2 rep3, nolog . devcon, groups(rep1 rep2 rep3){txt} {p 4 4 2} ... and interactions with a continuous variable: {com}. generate mpgrep1 = mpg * rep1 . generate mpgrep2 = mpg * rep2 . generate mpgrep3 = mpg * rep3 . logit foreign mpg rep2 rep3 mpgrep2 mpgrep3, nolog . devcon, groups(rep1 rep2 rep3, mpgr* (mpg)){txt} {p 4 4 2}Transforming OLS estimates for use with the Blinder-Oaxaca decomposition ({cmd:oaxaca} is available from the SSC Archive): {com}. reg lnwage educ expr expr2 single divorced if female==0 . devcon , groups(married single divorced) . estimates store male . reg lnwage educ expr expr2 single divorced if female==1 . devcon , groups(married single divorced) . estimates store female . oaxaca male female, detail {txt} {title:Methods and Formulas} {p 4 4 2}Consider the model y = a + b_1*D_1 + b_2*D_2 + e {p 4 4 2}where "a" is the constant and "e" is the error. D_1 and D_2 are two 0/1 dummy variables representing a polytomous variable with three categories. Alternatively, the above equation can be formulated as y = a + b_1*D_1 + b_2*D_2 + b_3*D_3 + e {p 4 4 2}with b_3 constrained to zero and D_3 being the indicator for the (omitted) reference category. Now define c as c = (b_1 + b_2)/3 {p 4 4 2}and let a' = a + c b_1' = b_1 - c b_2' = b_2 - c b_3' = b_3 - c = -c {p 4 4 2}{cmd:devcon} then reports the equation y = a' + b_1'*D_1 + b_2'*D_2 + b_3'*D_3 + e {p 4 4 2}More generally, c = (b_1 + b_2 + ... + b_{k-1}) / k {p 4 4 2}for a k-category variable. {p 4 4 2}The transformation can also be applied to interaction terms. Consider the model y = a + b_1*DX_1 + b_2*DX_2 + d*X + e {p 4 4 2}where X is a continuous variable and DX_1 and DX_2 are the interaction terms, i.e. DX_1 = D_1*X and DX_2 = D_2*X. The deviation contrast transformation is then y = a + b_1'*DX_1 + b_2'*DX_2 + b_3'*DX_3 + d'*X + e {p 4 4 2} where b_1' = b_1 - c b_2' = b_2 - c b_3' = b_3 - c = -c d' = d + c {p 4 4 2}{cmd:devcon} also transforms the variance-covariance matrix of the coefficients, applying the general formula for the variances and covariances of weighted sums of random variables (see Mood et al. 1974:179). {title:References} {p 4 8 2} Mood, A.M., F.A. Graybill, D.C. Boes (1974). Introduction to the Theory of Statistics, 3. edn. New York: McGraw-Hill.{p_end} {p 4 8 2}Oaxaca, R.L., Ransom, M.R. (1999). Identification in Detailed Wage Decompositions. The Review of Economics and Statistics 81: 154-157.{p_end} {p 4 8 2}Yun, M.-S. (2003). A Simple Solution to the Identification Problem in Detailed Wage Decompositions. IZA Discussion Paper No. 836.{p_end} {title:Author} {p 4 4 2}Ben Jann, ETH Zurich, jann@soz.gess.ethz.ch {title:Also see} {p 4 13 2} Online: help for {help regress}, {help estimates}, {help oaxaca} (if installed)