help intgph                                    Bennet Zelner and Dan Blanchette
-------------------------------------------------------------------------------

Title

intgph -- Employs King et al.'s (2000) simulation-based approach to interpret interaction effects in selected nonlinear models and presents the results graphically in the manner suggested in Zelner (2009)

Syntax

intgph modelname depvar [indepvars] [if exp] [in range] [weight], options

options                 Description
-------------------------------------------------------------------------
Required
  ivars(varlist)        names of the two variables to be interacted

Optional
  cmdopts(options)      valid options for the estimation command specified by modelname
  level(#)              a positive integer between 1 and 99 (default value is 95)
  sims(#)               a positive integer (default value is 1000)
  setx(options)         options for the setx command
  genname(newvar)       genname option for the estsimp command
  difvals(numlist)      pair of values of the second interacted variable used to calculate the difference in predicted values plotted on the chart's y-axis
  xinc(#)               positive integer specifying the number of points on the chart's x-axis
  gphdif                produces a chart showing two sets of predicted values rather than the difference in predicted values
  gphsym                produces a chart with symbols denoting statistical significance instead of confidence interval bars
  xtitle(axis_title)    title for the chart's x-axis
  ytitle(axis_title)    title for the chart's y-axis
  gphopts(options)      additional twoway options
  slopetest(numlist)    pair of x-axis values used to test corresponding predicted values against the null hypothesis that they are equal to one another
  moddata               permits the dataset in memory to be modified
  verbose               runs intgph with more output to the results window

Description

intgph estimates a selected nonlinear model that includes a multiplicative interaction term, and uses simulated parameters generated by King et al.'s (2000) estsimp command (part of the "Clarify" suite of commands) to evaluate and graphically portray the effect of one interacted variable conditional on different values of the other interacted variable. The first interacted variable must be continuous, and the second interacted variable (whose effect is assessed conditional on the first interacted variable) may be either continuous or binary. Neither of the interacted variables should appear in a higher-order term in the model. Currently supported models include logit, probit, poisson, and nbreg.

The default chart produced by intgph is the type illustrated in Zelner (2009). The y-axis of the chart measures the change in the predicted probability of a positive outcome (for logit and probit models) or incidence (for poisson and nbreg models) associated with a discrete change in the value of the second interacted variable. intgph plots this quantity and (by default) the confidence interval that surrounds it against the observed range of values of the first interacted variable, measured on the x-axis. Alternatively, intgph can produce a chart showing the predicted values themselves (rather than the difference in predicted values).

intgph optionally creates a series of new variables that can be used to produce additional charts or conduct hypothesis tests.

Options

ivars(varlist) identifies the two variables, ivar1 and ivar2, to be included in the interaction term. The variables should not appear explicitly in indepvars; intgph will automatically add them to the model along with the interaction term. ivar1 must be continuous, and ivar2 may be either binary or continuous.

level(#) sets the confidence level, as a percentage, used to plot confidence intervals, conduct hypothesis tests, etc. The value of level must be an integer between 1 and 99. If unspecified, level takes a default value of 95.

cmdopts(options) allows the user to add command options to the estimation command specified by modelname. Options should be typed as they normally would be; e.g., cmdopts(vce(r)) produces robust standard errors.

sims(#) is a positive integer specifying the number of simulations performed by estsimp. The default is 1000 simulations. For more information, consult the help entry for estsimp.

genname(newvar) specifies a stub name for the newly created variables containing the simulated coefficient values created by estsimp. The default variable names are b1, b2, ... , bk. For more information, consult the help entry for estsimp.

setx(options) sets the values of the variables in indepvars for purposes of creating the chart. If the setx option is not specified, intgph will automatically set the value of each non-binary variable in indepvars to its estimating sample mean, and each binary variable to its estimating sample mode. If the setx option is specified (using the syntax described in the help entry for setx) to assign values to any of the variables in indepvars, then these values will override the default values set by intgph, and the remaining variables in indepvars will retain their default values. In no case will the setx option affect the values of ivar1 and ivar2 used to create the chart.

difvals(numlist) sets the two values of the second interacted variable used to calculate the difference in predicted values plotted on the chart's y-axis by default. This difference is equal to Pr(Y=1|ivar2 = hi_val) - Pr(Y=1|ivar2 = lo_val) for logit and probit models, and E(Y|ivar2 = hi_val) - E(Y|ivar2 = lo_val) for poisson and nbreg models. If difvals is not specified and ivar2 is binary, then lo_val takes a default value of 0 and hi_val takes a default value of 1. If difvals is not specified and ivar2 is continuous, then lo_val takes a default value of ivar2's estimating sample mean and hi_val takes a default value of ivar2's estimating sample mean plus one standard deviation. Note that the gphdif option can be used to plot the predicted values themselves rather than the difference in predicted values.
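
As an illustration, the commands below sketch how difvals might be used with a continuous second interacted variable. The sketch assumes the union dataset used in the Examples section, that grade records years of schooling, and that the pair is given in the order lo_val hi_val (an inference from the description of Y0 and Y1 under moddata, not something this entry states outright). The values 12 and 16 are arbitrary choices made for illustration.

    . webuse union, clear
    . intgph logit union south age not_smsa, ivars(t0 grade) difvals(12 16)

Under those assumptions, the y-axis would show Pr(union=1|grade = 16) - Pr(union=1|grade = 12) across the observed range of t0.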

xinc(#) is a positive integer specifying the number of points on the chart's x-axis for which the values on the y-axis are calculated and plotted. The x-axis points are evenly spaced between ivar1's estimating sample minimum and maximum. If xinc is not specified, it takes a default value of 100. Higher values of xinc will increase the amount of time it takes intgph to run.

gphdif produces a chart showing the two sets of predicted values associated with lo_val and hi_val from the difvals(numlist) option rather than the difference between these predicted values. If gphdif is specified, then for logit and probit models, the chart will include separate schedules for Pr(Y=1|ivar2 = hi_val) and Pr(Y=1|ivar2 = lo_val), and for poisson and nbreg models, the chart will include separate schedules for E(Y|ivar2 = hi_val) and E(Y|ivar2 = lo_val).

gphsym produces a chart that uses symbols to denote the regions in which the values plotted on the y-axis differ significantly from zero at the level specified in level(#). If gphsym is specified, the chart will not include confidence interval bars.

xtitle(axis_title) sets the x-axis title (specified as a string surrounded by quotation marks) for the chart produced by intgph. If not specified, the default title for the x-axis is the variable name of ivar1.

ytitle(axis_title) sets the y-axis title (specified as a string surrounded by quotation marks) for the chart produced by intgph. If not specified, intgph will choose an appropriate default title.

gphopts(options) specifies additional twoway_options for the chart. For more information, consult the help entry for twoway_options.

slopetest(numlist) tests whether the "slope" of a specified segment of the schedule representing the difference in predicted values in the default chart type (i.e., when gphdif is not specified) is significantly different from zero at the level specified in level(#) (95% by default). Because the confidence intervals calculated by intgph are derived by simulation, there is no closed-form formula for these intervals or for related inferential statistics. The "slope" test is therefore a test of the null hypothesis that a double difference is equal to zero: the difference in predicted values at one value of the x-axis variable minus the difference in predicted values at a second value of that variable. The arguments of slopetest(numlist) are the two values of the x-axis variable for which one- and two-tailed versions of this test are performed. slopetest(numlist) accepts multiple pairs of x-axis variable values.
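
In the notation used for difvals above, the double difference can be written as follows. Here x_a and x_b are labels introduced only for this illustration; they stand for the two x-axis values passed to slopetest. The expression is shown for logit and probit models; for poisson and nbreg models, E(Y|.) takes the place of Pr(Y=1|.).

    D(x) = Pr(Y=1|ivar1 = x, ivar2 = hi_val) - Pr(Y=1|ivar1 = x, ivar2 = lo_val)

    H0:  D(x_a) - D(x_b) = 0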

moddata instructs intgph to leave the variables that it creates in the dataset. These include the simulated coefficient values and the variables used to produce the charts: X, which contains the values of ivar1 used to plot the chart; Y0, which contains the predicted value associated with the first of the two values to which ivar2 is set using the difvals option; Y1, which contains the predicted value associated with the second of the two values to which ivar2 is set using the difvals option; dY, the difference between the predicted values Y1 and Y0; and six variables ending in the suffix lb or ub, which respectively represent the lower and upper bounds of the confidence intervals surrounding Y0, Y1, and dY.
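
One possible use of the retained variables is sketched below. It assumes the variable names listed above (X, Y0, and Y1) and the union dataset used in the Examples section. The twoway call is an ordinary Stata graph command rather than part of intgph, and the sketch has not been checked against intgph's actual output.

    . webuse union, clear
    . intgph logit union south age grade not_smsa, ivars(t0 black) moddata
    . twoway line Y0 Y1 X, sort

If the variables are stored as described, this would redraw the two schedules of predicted probabilities against t0, and the chart could then be restyled with any twoway options.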

verbose creates additional screen output showing the intermediate steps used to construct the chart.

Examples

Setup

    . webuse union

Logit model with black and time variables interacted

    . intgph logit union south age grade not_smsa, ivars(t0 black)

Same as above + robust standard errors

    . intgph logit union south age grade not_smsa, ivars(t0 black) cmdopts(r)

Same as above + set value of south to 1 when producing the chart

    . intgph logit union south age grade not_smsa, ivars(t0 black) cmdopts(r) setx(south 1)

Same as above + include second interaction term and set relevant variable values meaningfully

    . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5)

Same as above + perform hypothesis test that Pr(union=1|t0=5) ~= Pr(union=1|t0=15)

    . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5) slopetest(5 15)

Same as above + plot predicted probabilities rather than difference in predicted probabilities

    . intgph logit union south age grade not_smsa southXt, ivars(t0 black) cmdopts(r) setx(south 1 t0 5 southXt 5) slopetest(5 15) gphdif

Reference

If you use this program, please cite:

Zelner, Bennet A. 2009. "Using simulation to interpret results from logit, probit, and other nonlinear models." Strategic Management Journal. doi: 10.1002/smj.783 (print version forthcoming).

and:

King, Gary, Michael Tomz, and Jason Wittenberg. 2000. "Making the most of statistical analyses: improving interpretation and presentation." American Journal of Political Science 44, no. 2 (April): 347-61.

Authors

Bennet A. Zelner
Assistant Professor of Strategy
Duke University's Fuqua School of Business
bzelner@duke.edu

Dan Blanchette
Center for Entrepreneurship and Innovation
Duke University's Fuqua School of Business
Dan.Blanchette@Duke.edu

Also see

Online: estsimp, simqi, setx, twoway options, logit, probit, poisson, and nbreg