{smcl} {* *! version 1.0.0 28aug2018}{...} {vieweralsosee "" "--"}{...} {vieweralsosee "avciplots" "help avciplots"}{...} {vieweralsosee "[R] avplot" "help regress_postestimation_plots##avplot"}{...} {vieweralsosee "xtavplot" "search xtavplot"}{...} {viewerjumpto "Description" "avciplot##description"}{...} {viewerjumpto "Options" "avciplot##options"}{...} {viewerjumpto "Examples" "avciplot##examples"}{...} {viewerjumpto "Stored values" "avciplot##stored"}{...} {marker avciplot}{...} {bf:avciplot} {hline 2} Added-variable plot with confidence intervals {marker syntax}{...} {title:Syntax} {p 8 18 2} {cmd:avciplot} {it:{help indepvars:indepvar}} [{cmd:,} {it:options}] {synoptset 25 tabbed}{...} {synopthdr:options} {synoptline} {syntab:Plot} {p2col:{it:{help marker_options}}}change look of markers (color, size, etc.){p_end} {p2col:{it:{help marker_label_options}}}add marker labels; change look or position{p_end} {synopt :{opt xl:im(# [#])}, {opt yl:im(# [#])}}limit the ranges of the x and y residuals displayed{p_end} {synopt :{opt gen:erate(exvar eyvar)}}save the values of x and y residuals in new variables {syntab:Regression line} {synopt :{opth rl:opts(cline_options)}}affect rendition of the regression line{p_end} {synopt :{opt noco:ef}}turns off display of coefficent below graph{p_end} {syntab:Confidence interval} {synopt :{opt noci}}turns off confidence interval{p_end} {synopt :{opt ciu:nder}}puts confidence intervals underneath scatter{p_end} {synopt :{opth l:evel(level:#)}}specifies the confidence level{p_end} {synopt :{opth cio:pts(fitarea_options:ci_options)}}affect rendition of the confidence{p_end} {synopt :{opth cip:lot(graph_twoway:plottype)}}how to plot CIs; default is {cmd:ciplot({help twoway_rline:rline})}; a common alternative is {cmd:ciplot({help twoway_rline:rarea})} {p_end}{...} {syntab:Add plots} {synopt :{opth "addplot(addplot_option:plot)"}}add other plots to the generated graph{p_end} {syntab:Y axis, X axis, Titles, Legend, Overall} {synopt :{it:twoway_options}}any options other than {opt by()} documented in {manhelpi twoway_options G-3}{p_end} {synoptline} {p2colreset}{...} {marker description}{...} {title:Description of avciplot} {pstd} {opt avciplot} creates an added-variable plot ({it:a.k.a.} partial-regression leverage plot, partial regression plot, or adjusted partial residual plot) after {helpb regress}. It differs from {helpb regress postestimation plots##avplot:avplot} by adding confidence intervals around the regression line and various options. {pstd} {it:indepvar} is an independent ({it:x}) variable ({it:a.k.a.} predictor, carrier, or covariate) that may or may not be included in the preceding regression. The user would choose an {it:indepvar} not already in the regression to evaluate whether it is worthwhile to include it. {pstd} {opt avciplot} shows the partial correlation between one {it:indepvar} and the {it:depvar} controlling for all the other regressors in an multiple linear regression. {pstd} Besides showing the relationship between the {it:indepvar} and the {it:depvar} controlling for the other regressors, {cmd:avciplot} is useful for visually identifying which outlier observations have a big effect on the estimated coefficient. {pstd} {opt avciplot} calculates e({it:x}|X), the residuals from the regression of the {it:indepvar} ({it:x}) on the other independent (X) variables, and e({it:y}|X), the residuals from the regression of the {it:depvar} ({it:y}) on the other (X) variables. The graph shows e({it:x}|X) plotted against e({it:y}|X), that is, the variation in {it:x} not correlated with X plotted against the variation in {it:y} not correlated with X. {pstd} The fitted line shown in the graph is the least squares fit between the residuals e({it:x}|X) and e({it:y}|X). The fitted line has the same slope as estimated coefficient on the {it:indepvar} in the preceding full regression. {pstd} By construction, the residuals e({it:x}|X) and e({it:y}|X) each have a mean of zero, and the regression line (without a constant term) fitted between them passes exactly through e({it:x}|X)=0 and e({it:y}|X)=0. At that point, the confidence interval has zero width, giving it an unfamiliar shape. Note that this also happens in a conventional regression at the point where all the independent variables have a value of zero if there is no constant term. {marker options}{...} {title:Options for avciplot} {dlgtab:Plot} {phang} {it:marker_options} affect the rendition of markers drawn at the plotted points, including their shape, size, color, and outline; see {manhelpi marker_options G-3}. {phang} {it:marker_label_options} specify whether and how markers are to be labeled; see {manhelpi marker_label_options G-3}.{p_end} {phang} {opt xlim(# [#])}, {opt ylim(# [#])} constrain the range of the {it:indepvar} and {it:depvar} residuals displayed. If only one number is specified, residuals with a value below that number will not be displayed in the scatter plot. If two numbers are specified, residuals below that first number and above the second number will not be displayed. {p 8 8} Excluding observations of the residual does not affect the slope of the regression line in the graph. The purpose of these options is to avoid situations where outlying observations cause a lot of extra white space in the graph, obscuring display of the relationship between the variables. As usual, care should be taken to make sure that the undisplayed observations are not important to the estimated relationship. {phang} {opt generate(exvar eyvar)} saves the values of the x and y residuals in variables named by the user. The user must specify two variable names for {it:exvar} and {it:eyvar}. These residuals can be used for subsequent calculations or graphing commands.{p_end} {dlgtab:Regression line} {phang} {opt rlopts(cline_options)} affects the rendition of the regression (fitted) line. See {manhelpi cline_options G-3}. {phang} {opt nocoef} turns off display of the coefficent, standard error and {it:t} statistic from the regression line below the graph. {dlgtab:Confidence interval} {phang} {opt noci} turns off display of the confidence interval on the graph. {phang} {opt ciunder} confidence interval will be graphed underneath the scatter plot (i.e. data scatter is graphed on top of confidence interval). This is mainly useful when graphing a solid confidence interval with option {opt ciplot(rarea)}.{p_end} {phang} {opt level(#)} specifies the confidence level, in percent, for confidence interval of the coefficients; see help {help level}. {phang} {opt ciopts(line_options)} affects how the upper and lower confidence interval lines are rendered. See {manhelpi cline_options G-3}. If you specify {opt ciplot()}, then rather than using {it:cline_options}, you should specify whichever options are appropriate for the {it:plottype}. {phang} {cmd:ciplot(}{it:plottype}{cmd:)} specifies how the confidence interval is to be plotted. The default is {cmd:ciplot(rline)}, meaning that the prediction will be plotted by {cmd:graph} {cmd:twoway} {cmd:rline}. {p 8 8} A common alternative is {cmd:ciplot({help twoway_rarea:rarea})}, which will substitute lines around the prediction for shading. See {manhelp graph_twoway G-2:graph twoway} for a list of {it:plottype} choices. You may choose any {it:plottypes} that expect two {it:y} variables and one {it:x} variable.{p_end} {dlgtab:Add plots} {phang} {opt addplot(plot)} provides a way to add other plots to the generated graph. See {manhelpi addplot_option G-3}. {dlgtab:Y axis, X axis, Titles, Legend, Overall} {phang} {it:twoway_options} are any of the options documented in {manhelpi twoway_options G-3}, excluding {opt by()}. These include options for titling the graph (see {manhelpi title_options G-3}) and for saving the graph to disk (see {manhelpi saving_option G-3}). {marker examples}{...} {title:Examples} {hline} {pstd}Load the auto dataset and look at a graph of engine displacement versus fuel efficiency (mpg):{p_end} {phang2}{bf:{stata "sysuse auto": . sysuse auto}}{p_end} {phang2}{bf:{stata "twoway lfitci mpg displacement || scatter mpg displacement, msize(vsmall) leg(off)": . twoway lfitci mpg displacement || scatter mpg displacement, msize(vsmall) leg(off)}} {pstd}Though the correlation of {opt displacement} and {opt mpg} is clearly negative, if we also include weight as a regressor in a multiple regression, {opt displacement} has a {it:positive} and insignificant partial correlation. {phang2}{bf:{stata "regress mpg displacement weight": . regress mpg displacement weight}} {pstd} How can we show this graphically? With an added-variable plot: {phang2}{bf:{stata "avciplot displacement": . avciplot displacement}} {pstd} The added-variable plot shows the correlation of the {it:x} variable, {opt displacement}, conditional on all the other independent variables in the regression, with the {it:y} variable, {opt mpg}, also conditional on all the other regressors. That is, it shows the the correlation of one {it:x} with {it:y}, netting out the influence of all the other independent variables. {pstd} The added-variable plot shows a scatter of the values of the residuals e(x|X) versus e(y|X). The solid line is the regression fit of these values and the dashed lines are the limits of the 95% confidence interval around the regression fit. The slope of the regression fit in the added-variable plot is equal to the coefficient on {opt displacement} in the preceding regression which is also printed at the bottom of the graph. {pstd} Unlike {help regress_postestimation_plots##avplot:avplot}, {cmd:avciplot} shows the confidence interval around the linear regression line. We can see that the partial correlation of {opt displacement} with {opt mpg} is not statistically significant (at the 5% level) since the confidence interval includes zero. {pstd} {opt avciplot} can display the confidence interval as a solid pattern (similar to the {help lfitci} graphs) by using the option {opt ciplot(rarea)} rather than the default of delineating the interval by two dashed lines. The {opt ciunder} option causes the scatter plot to be superimposed on the confidence interval rather than vice versa, so that data points within the interval are still visible: {phang2}{bf:{stata "avciplot displacement, ciplot(rarea) ciunder": . avciplot displacement, ciplot(rarea) ciunder}} {pstd} Added-variable plots are a good diagnostic for finding outlier observations which influence the partial correlation of a regressor of interest, in this case {opt displacement}. {pstd} There is a clear outlier in the e(mpg|X) vertical axis with a value of about 14. It is also clear that this outlier has little affect on the slope of the regression line because it has a value e(displacement|X) of about zero. Including this outlier makes the rest of the graph smaller. {pstd} We can exclude the display of this observation with the option {cmd:ylim(-10 10)} to magnify the rest of the graph. The lower limit of {opt ylim} has no affect because there are no e(mpg|X) values below -10. The {opt ylim} and {opt xlim} options are not available in {help regress_postestimation_plots##avplot:avplot} so this could be a reason to use {cmd:avciplot} even if you don't want to display a confidence interval. {pstd} Another difference between {help regress_postestimation_plots##avplot:avplot} and {opt avciplot} is the ability to save the values of e(x|X) and e(y|X) for later use with the {opt generate} option, perhaps to create a more complicated graph after running the {opt avciplot} command. {pstd} The following command implements the {opt ylim}, {opt generate} and {opt noci} (no confidence interval display) options: {phang2}{bf:{stata "avciplot displacement, ylim(-10 10) noci generate(ex ey)": . avciplot displacement, ylim(-10 10) noci generate(ey ex)}} {pstd} The new variables {opt ex} and {opt ey}, containing the residuals of the {it:x} and {it:y} variables conditional on all the other regressors, are added to the dataset in memory. {pstd} A simple scatter plot of a dummy variable like {opt foreign} versus {opt mpg} only displays two values on the horizontal axis, making it difficult to discern the relationship visually: {phang2}{bf:{stata "twoway lfitci mpg foreign || scatter mpg foreign, leg(off) xl(0 1) yt(mpg) xt(foreign)": . twoway lfitci mpg foreign || scatter mpg foreign, leg(off) xl(0 1) yt(mpg) xt(foreign)}}{p_end} {pstd} In contrast, the added-variable plot of {opt foreign} versus {opt mpg} graphs the residual of the dummy variable conditional on the other {it:x} variables, which is continuous even though the dummy variable itself only has two discrete values. {opt avciplot} can take an {it:indepvar} which has not yet been included in the regression, making it a useful tool for exploring the influence of new variables. To see the partial correlation of the new variable {opt foreign} added to the existing regression, use {phang2}{bf:{stata "avciplot foreign": . avciplot foreign}}{p_end} {pstd} This controls for all the existing variables in the last regression, which in our example are {opt displacement}, {opt weight} and an intercept. The added-variable plot of {opt foreign} could help the user decide whether to add it as a new variable to the regression. {pstd} The companion command {helpb avciplots} shows the added-variable plots of {it:all} regressors in the preceding regression in a single graph. We include an interaction term between {opt weight} and {opt foreign} in the regression and then show all the partial correlations: {phang2}{bf:{stata "regress mpg displacement weight foreign c.weight#i.foreign": . regress mpg displacement weight foreign c.weight#i.foreign}} {phang2}{bf:{stata "avciplots, title(All covariates) ": . avciplots, title(All covariates)}} {pstd} {helpb avciplots} provides a quick way of understanding the coefficient estimates after any linear regression. The graph shows the strength and significance of the partial correlations of all the independent variables, as well as help to highlight outlier observations which affect each correlation. {pstd} The examples above show how {opt avciplot} and {helpb avciplots} can be used to present the relationship between independent and dependent variables graphically when there are multiple covariates in a regression. The inclusion of confidence intervals in the {opt avciplot} graphs makes it possible to see the statistical significance of the estimated coefficients as well as their magnitude. {marker stored}{...} {title:Stored results} {pstd} {cmd:avciplot} stores the following in {cmd:r()}: {synoptset 20 tabbed}{...} {p2col 5 20 24 2: Scalars}{p_end} {synopt:{cmd:r(coef)}}the estimated coefficient of the added variable{p_end} {synopt:{cmd:r(se)}}the standard error of the estimated coefficient{p_end} {title:Author} John Luke Gallup, Portland State University, USA jlgallup@pdx.edu