Title
intgph -- Employs King et al.'s (2000) simulation-based approach to interpret interaction effects in selected nonlinear models and presents the results graphically in the manner suggested in Zelner (2009)
Syntax
intgph modelname depvar [indepvars] [if exp] [in range] [weight] , options
    options               Description
    -------------------------------------------------------------------------
    Required
      ivars(varlist)      names of the two variables to be interacted

    Optional
      cmdopts(options)    valid options for the estimation command specified
                            by modelname
      level(#)            a positive integer between 1 and 99 (default value
                            is 95)
      sims(#)             a positive integer (default value is 1000)
      setx(options)       options for setx command
      genname(newvars)    genname option for estsimp command
      difvals(numlist)    pair of values of the second interacted variable
                            used to calculate the difference in predicted
                            values plotted on the chart's y-axis
      xinc(#)             positive integer specifying the number of points on
                            the chart's x-axis
      gphdif              produces a chart showing two sets of predicted
                            values rather than the difference in predicted
                            values
      gphsym              produces a chart with symbols denoting statistical
                            significance instead of confidence interval bars
      xtitle(axis_title)  title for chart's x-axis
      ytitle(axis_title)  title for chart's y-axis
      gphopts(options)    additional twoway options
      slopetest(numlist)  pair of x-axis values used to test corresponding
                            predicted values against the null hypothesis that
                            they are equal to one another
      moddata             permits dataset in memory to be modified
      verbose             runs intgph with more output to results window
    -------------------------------------------------------------------------
Description
intgph estimates a selected nonlinear model that includes a multiplicative interaction term, and uses simulated parameters generated by King et al.'s (2000) estsimp command (part of the "Clarify" suite of commands) to evaluate and graphically portray the effect of one interacted variable conditional on different values of the other interacted variable. The first interacted variable must be continuous, and the second interacted variable (whose effect is assessed conditional on the first interacted variable) may be either continuous or binary. Neither of the interacted variables should appear in a higher-order term in the model. Currently supported models are logit, probit, poisson, and nbreg.
The default chart produced by intgph is the type illustrated in Zelner (2009). The y-axis of the chart measures the change in the predicted probability of a positive outcome (for logit and probit models) or incidence (for poisson and nbreg models) associated with a discrete change in the value of the second interacted variable. intgph plots this quantity and (by default) the confidence interval that surrounds it against the observed range of values of the first interacted variable, measured on the x-axis. Alternatively, intgph can produce a chart showing the predicted values themselves rather than the difference in predicted values (see the gphdif option). With the moddata option, intgph also leaves behind a series of new variables that can be used to produce additional charts or conduct hypothesis tests.
Options
ivars(varlist) identifies the two variables, ivar1 and ivar2, to be included in the interaction term. The variables should not appear explicitly in indepvars; intgph will automatically add them to the model along with the interaction term. ivar1 must be continuous and ivar2 may be either binary or continuous.
level(#) sets the confidence level, in percent, used to plot confidence intervals, conduct hypothesis tests, etc. The value of level must be an integer between 1 and 99. If unspecified, level takes a default value of 95.
cmdopts(options) allows the user to add command options to the estimation command specified by modelname. Options should be typed as they normally would, e.g., cmdopts(vce(r)) produces robust standard errors.
sims(#) is a positive integer specifying the number of simulations performed by estsimp. The default is 1000 simulations. For more information, consult the help entry for estsimp.
genname(newvar) specifies a stub name for the newly created variables containing the simulated coefficient values created by estsimp. The default variable names are b1, b2, ... , bk. For more information, consult the help entry for estsimp.
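The estimation-related options above can be combined in a single call. The following is a minimal sketch, assuming the union data used in the Examples section and that intgph and the Clarify commands are installed; the values 90 and 2000 and the stub name sim are illustrative choices, not recommendations.

```stata
* Sketch only: assumes intgph and Clarify (estsimp, setx, simqi) are installed.
* Run from a do-file so that the /// line continuation is recognized.
webuse union, clear

* Probit with robust standard errors, 90% intervals, and 2000 simulated
* parameter draws; "sim" is an illustrative stub name handed to estsimp
intgph probit union south age grade not_smsa, ivars(t0 black) ///
    cmdopts(r) level(90) sims(2000) genname(sim)
```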
setx(options) sets the values of the variables in indepvars for purposes of creating the chart. If the setx option is not specified, intgph will automatically set the value of each non-binary variable in indepvars to its estimating sample mean, and each binary variable to its estimating sample mode. If the setx option is specified (using the syntax described in the help entry for setx) to assign values to any of the variables in indepvars, then these values will override the default values set by intgph, and the remaining variables in indepvars will retain their default values. In no case will the setx option affect the values of ivar1 and ivar2 used to create the chart.
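As an illustration of the override behavior described above, the sketch below first inspects the covariates and then fixes two of them by hand. It assumes the union data from the Examples section; the values 0 and 16 are arbitrary illustrative choices.

```stata
* Sketch only: assumes intgph and Clarify are installed; run from a do-file
webuse union, clear

* Inspect the quantities the defaults are based on. These are full-sample
* figures; intgph uses the estimating sample, which can differ if
* observations are dropped during estimation.
summarize age grade
tabulate south
tabulate not_smsa

* Override two covariates; age and not_smsa keep their default values
intgph logit union south age grade not_smsa, ivars(t0 black) ///
    setx(south 0 grade 16)
```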
difvals(numlist) sets the two values of the second interacted variable used to calculate the difference in predicted values plotted on the chart's y-axis by default. This difference is equal to Pr(Y=1|ivar2 = hi_val) - Pr(Y=1|ivar2 = lo_val) for logit and probit models, and E(Y|ivar2 = hi_val) - E(Y|ivar2 = lo_val) for poisson and nbreg models. If difvals is not specified and ivar2 is binary, then lo_val takes a default value of 0 and hi_val takes a default value of 1. If difvals is not specified and ivar2 is continuous, then lo_val takes a default value of ivar2's estimating sample mean and hi_val takes a default value of ivar2's estimating sample mean plus one standard deviation. Note that the gphdif option can be used to plot the predicted values themselves rather than the difference in predicted values.
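For a continuous ivar2, the documented defaults can be reconstructed by hand and then replaced with more meaningful values. The sketch below is illustrative only: it uses grade as ivar2, computes the mean and standard deviation on the full sample (which may differ slightly from the estimating sample), and assumes that difvals takes lo_val first and hi_val second.

```stata
* Sketch only: assumes intgph and Clarify are installed; run from a do-file
* so that the local macros persist between lines
webuse union, clear

* Reconstruct the documented defaults for a continuous ivar2 (grade):
* lo_val = mean, hi_val = mean + one standard deviation
quietly summarize grade
local lo = r(mean)
local hi = r(mean) + r(sd)
display "lo_val = `lo'   hi_val = `hi'"

* Pass the pair explicitly (assumed order: lo_val, then hi_val)
intgph logit union south age not_smsa black, ivars(t0 grade) difvals(`lo' `hi')

* Or compare two substantively chosen values, here 12 versus 16 years
intgph logit union south age not_smsa black, ivars(t0 grade) difvals(12 16)
```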
xinc(#) is a positive integer specifying the number of points on the chart's x-axis for which the values on the y-axis are calculated and plotted. The x-axis points are evenly spaced between ivar1's estimating sample minimum and maximum. If xinc is not specified, it takes a default value of 100. Higher values of xinc will increase the amount of time it takes intgph to run.
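Because run time grows with the number of x-axis points, one possible workflow is to draft with a coarse grid and switch to a finer one for the final figure. The values 25 and 200 below are illustrative assumptions.

```stata
* Sketch only: assumes intgph and Clarify are installed
webuse union, clear

* Coarse grid for a quick look (fewer x-axis points, faster)
intgph logit union south age grade not_smsa, ivars(t0 black) xinc(25)

* Finer grid for a final figure (more x-axis points, slower)
intgph logit union south age grade not_smsa, ivars(t0 black) xinc(200)
```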
gphdif produces a chart showing the two sets of predicted values associated with lo_val and hi_val from the difvals(numlist) option rather than the difference between these predicted values. If gphdif is specified, then for logit and probit models, the chart will include separate schedules for Pr(Y=1|ivar2 = hi_val) and Pr(Y=1|ivar2 = lo_val), and for poisson and nbreg models, the chart will include separate schedules for E(Y|ivar2 = hi_val) and E(Y|ivar2 = lo_val).
gphsym produces a chart that uses symbols to denote the regions in which the values plotted on the y-axis differ significantly from zero at the level specified in level(#). If gphsym is specified, the chart will not include confidence interval bars.
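The two alternative chart types can be requested as follows. This is a minimal sketch using the union data from the Examples section, with black as the binary ivar2.

```stata
* Sketch only: assumes intgph and Clarify are installed
webuse union, clear

* Two schedules of predicted probabilities, one for black = 0 and one
* for black = 1, instead of their difference
intgph logit union south age grade not_smsa, ivars(t0 black) gphdif

* Difference chart with significance symbols instead of interval bars
intgph logit union south age grade not_smsa, ivars(t0 black) gphsym
```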
xtitle(axis_title) sets the x-axis title (specified as a string surrounded by quotation marks) for the chart produced by intgph. If not specified, the default title for the x-axis is the variable name of ivar1.
ytitle(axis_title) sets the y-axis title (specified as a string surrounded by quotation marks) for the chart produced by intgph. If not specified, intgph will choose an appropriate default title.
gphopts(options) specifies additional twoway_options for the chart. For more information, consult the help entry for twoway_options.
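The title and graph options can be combined as in the sketch below. The title strings are invented for illustration; title() and yline() are standard twoway options passed through gphopts().

```stata
* Sketch only: assumes intgph and Clarify are installed; run from a do-file
* so that the /// line continuation is recognized
webuse union, clear

intgph logit union south age grade not_smsa, ivars(t0 black) ///
    xtitle("Time (t0)") ///
    ytitle("Change in Pr(union = 1) when black goes from 0 to 1") ///
    gphopts(title("Conditional effect of black on union membership") yline(0))
```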
slopetest(numlist) tests whether the "slope" of a specified segment of the schedule representing the difference in predicted values in the default chart type (i.e., when gphdif is not specified) is significantly different from zero at the level specified in level(#) (95% by default). Because the confidence intervals calculated by intgph are derived using the simulation-based approach, there is no closed-form formula for these intervals or related inferential statistics. The "slope" test is thus a test of the null hypothesis that a double difference is equal to zero, where the double difference is the difference in predicted values at one value of the variable on the x-axis minus the difference in predicted values at a second value. The arguments of slopetest(numlist) are the two x-axis values for which one- and two-tailed versions of this test are performed. slopetest(numlist) accepts multiple pairs of x-axis values.
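In symbols, the double difference can be written as below. This is a sketch for logit and probit models, not a statement of intgph's internal computations; the notation (Delta, x_a, x_b) is introduced here for illustration, with x_a and x_b standing for the pair of x-axis values passed to slopetest.

```latex
\begin{align*}
\Delta(x) &= \Pr(Y = 1 \mid \mathit{ivar1} = x,\ \mathit{ivar2} = \mathit{hi\_val})
           - \Pr(Y = 1 \mid \mathit{ivar1} = x,\ \mathit{ivar2} = \mathit{lo\_val}) \\
H_0 &:\ \Delta(x_a) - \Delta(x_b) = 0
\end{align*}
```

For poisson and nbreg models, Pr(Y=1|.) is replaced by E(Y|.). With simulated parameters, one natural way to evaluate this hypothesis is to compute the double difference for each simulated draw and examine where zero falls in the resulting distribution; the text above does not spell out the exact procedure intgph uses.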
moddata instructs intgph to leave the variables that it creates in the dataset. These include the simulated coefficient values and the variables used to produce the charts: X, which contains the values of ivar1 used to plot the chart; Y0, which contains the predicted value associated with the first of the two values to which ivar2 is set using the difvals option; Y1, which contains the predicted value associated with the second of the two values to which ivar2 is set using the difvals option; dY, the difference between the predicted values Y1 and Y0; and six variables ending in the suffix lb or ub, which respectively represent the lower and upper bounds of the confidence intervals surrounding Y0, Y1, and dY.
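Once the chart variables are kept, they can be inspected and replotted with ordinary Stata commands. The sketch below uses only the variable names given above (X, Y0, Y1, dY); the exact names of the six bound variables are not stated here, so they are left out rather than guessed.

```stata
* Sketch only: assumes intgph and Clarify are installed
webuse union, clear

* Keep the variables intgph creates (this modifies the data in memory)
intgph logit union south age grade not_smsa, ivars(t0 black) moddata

* Inspect the chart variables
describe X Y0 Y1 dY
summarize X Y0 Y1 dY

* Redraw the difference schedule by hand, with a reference line at zero
twoway line dY X, sort yline(0) ytitle("Difference in predicted probability")
```

Because moddata changes the dataset in memory, it may be worth saving or preserving the original data first.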
verbose creates additional screen output showing the intermediate steps used to construct the chart.
Examples
    Setup
        . webuse union

    Logit model with black and time variables interacted
        . intgph logit union south age grade not_smsa, ivars(t0 black)

    Same as above + robust standard errors
        . intgph logit union south age grade not_smsa, ivars(t0 black) cmdopts(r)

    Same as above + set value of south to 1 when producing the chart
        . intgph logit union south age grade not_smsa, ivars(t0 black) cmdopts(r) setx(south 1)

    Same as above + include second interaction term and set relevant variable values meaningfully
        . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5)

    Same as above + perform hypothesis test that Pr(union=1|t0=5) ~= Pr(union=1|t0=15)
        . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5) slopetest(5 15)

    Same as above + plot predicted probabilities rather than difference in predicted probabilities
        . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5) slopetest(5 15) gphdif
Reference
If you use this program, please cite:
Zelner, Bennet A. 2009. "Using simulation to interpret results from logit, probit, and other nonlinear models." Strategic Management Journal. doi: 10.1002/smj.783 (print version forthcoming).
and:
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. "Making the most of statistical analyses: Improving interpretation and presentation." American Journal of Political Science 44, no. 2 (April): 347-61.
Authors
Bennet A. Zelner
Assistant Professor of Strategy
Duke University's Fuqua School of Business
bzelner@duke.edu

Dan Blanchette
Center for Entrepreneurship and Innovation
Duke University's Fuqua School of Business
Dan.Blanchette@Duke.edu
Also see
Online: estsimp, simqi, setx, twoway options, logit, probit, poisson and nbreg