help logitcprplot
-------------------------------------------------------------------------------

Title

logitcprplot -- Component-plus-residual plot for logistic regression

Syntax

logitcprplot varname [varlist] [if] [in] [, options ]

    options                  Description
    -------------------------------------------------------------------------
    sample(#)                plot a # percent sample of data points
    norline                  suppress reference line (model fit)
    rline(cline_options)     affect rendition of reference line
    rcspline[(options)]      add a restricted cubic spline smooth
    lpoly[(options)]         add a local polynomial smooth (Stata 10 required)
    fpfit[(options)]         add a fractional polynomial fit
    lowess[(lowess_opts)]    add a lowess smooth
    pspline[(options)]       add a penalized spline smooth (pspline user
                               command required)
    rspline[(options)]       add a regression spline smooth (uvrs user
                               command required)
    running[(options)]       add a running line smooth (running user command
                               required)
    autosmoo[(options)]      add an adaptive variable span running line
                               smooth (autosmoo user command required)
    scatter_options          options as defined in help scatter
    addplot(plot)            add other plots to the generated graph
    generate(newvar)         store partial logit residuals in newvar
    replace                  overwrite existing variable
    nograph                  suppress graph
    -------------------------------------------------------------------------

Description

logitcprplot is used after logit or logistic for graphing a component-plus-residual plot (a.k.a. partial residual plot) for varname as suggested by Landwehr, Pregibon, and Shoemaker (1984; also see, e.g., Fox 1997:458).
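
As a rough illustration of what is being plotted, the partial residual of Landwehr, Pregibon, and Shoemaker (1984) for a binary outcome is commonly written as (y - p)/(p(1 - p)) + b*x, where p is the fitted probability and b is the coefficient of x. The sketch below computes this quantity by hand for the lbw model used in the Examples section. It assumes that logitcprplot follows this definition, which this help file does not state explicitly; phat, pres, and pres_manual are hypothetical variable names.

. logit low age lwt smoke ht ui ptl

. predict double phat if e(sample) /*fitted probabilities*/

. generate double pres_manual = (low - phat)/(phat*(1 - phat)) + _b[age]*age

. logitcprplot age, generate(pres) nograph

. compare pres pres_manual /*should agree closely if the assumption holds*/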

Use varlist to specify additional terms to be taken into account when computing the partial residuals and predictions. This is useful if a variable enters the model repeatedly via different transformations (e.g. polynomials).

Use if and in to restrict the sample to specific observations. This can be useful, for example, if a model contains interaction terms. In any case, the sample will be restricted to the estimation sample of the fitted model.

Whether logitcprplot can be applied after weighted models depends on the options specified: no weights are allowed with the lowess, pspline, running, and autosmoo options; fweights are allowed with the lpoly option; fweights and pweights are allowed with the rcspline, fpfit, and rspline options.
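
A minimal sketch of the weight rules, assuming the lbw data from the Examples section and a hypothetical frequency-weight variable named freq (the dataset does not contain such a variable):

. logit low age lwt smoke [fweight=freq] /*freq is hypothetical*/

. logitcprplot age, rcspline fpfit /*fweights allowed with these smooths*/

. logitcprplot age, lpoly /*fweights allowed; Stata 10 required*/

By the rules above, adding lowess, pspline, running, or autosmoo after this weighted model would not be permitted.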

The rcspline option has been inspired by the rcspline command by Nick Cox (see ssc describe rcspline). Furthermore, some of logitcprplot's code has been adapted from official Stata's mkspline command.

Partial residual plots using the pre-Stata 8 graphics engine are available as lprplot from the SSC Archive (see ssc describe lprplot) and lpartr in the Stata Technical Bulletin (Hilbe 1992; see net stb 10 sqv6).

Dependencies

The sample() option requires the gsample and moremata user packages to be installed on the system (Jann 2005, 2006). See ssc describe gsample and ssc describe moremata.

The pspline option requires the pspline user command to be installed on the system (Jann and Gutierrez 2008). See ssc describe pspline.

The rspline option requires the uvrs user command to be installed on the system (Royston and Sauerbrei 2007). See net sj 7-1 st0120

The running option requires the running user command to be installed on the system (Sasieni 1995, Sasieni and Royston 1998, Sasieni, Royston, and Cox 2005). See net sj 5-2 sed9_2

The autosmoo option requires the autosmoo user command to be installed on the system (Sasieni 1998). See net stb 41 gr27

Options

sample(#) causes only a # percent random sample of observations to be plotted. # is a number between 1 and 100. sample() also restricts the estimation sample for the computationally intensive lowess and lpoly smooths. It does not, however, restrict the estimation sample for the other smoothers.

The sample() option requires gsample and moremata to be installed on the system (see ssc describe gsample and ssc describe moremata).

norline suppresses the reference line (i.e. the partial linear predictions from the model).

rline(cline_options) specifies details about the rendition of the reference line. See help cline_options.
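
For instance, assuming the lbw model from the Examples section has been fit, the reference line could be drawn dashed and in red, or dropped altogether so that only a smooth is shown. The particular pattern and color are illustrative choices.

. logitcprplot age, rline(lpattern(dash) lcolor(red))

. logitcprplot age, norline lowess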

rcspline[(options)] adds a restricted cubic spline smooth to the plot. Options are:

ci[(area_options)] to plot a confidence interval(*) for the restricted cubic spline smooth with options as described in help area_options.

level(#) to set the confidence level, as a percentage, for the plotted confidence interval. The default is level(95) or as set by set level.

displayknots to display a table containing the values of the knots for the restricted cubic spline.

nknots(#) to specify the number of knots that are used for the restricted cubic spline. # must be between 3 and 7 unless the knot locations are specified using knots(). The default number of knots is 5. See [R] mkspline for details on how the knot positions are determined.

knots(numlist) to specify the exact location of the knots. The values must be given in increasing order.

cline_options to affect the rendition of the restricted cubic spline smooth. See help cline_options.
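
The rcspline suboptions can be combined inside the parentheses. The following sketch assumes the lbw model from the Examples section; the knot values are illustrative only and are not recommendations.

. logitcprplot age, rcspline(nknots(4) displayknots)

. logitcprplot age, rcspline(knots(18 22 26 32) ci level(90))

The first call lets the knot positions be determined as described in [R] mkspline and lists them in a table; the second places the knots at the given values, in increasing order, and adds a 90% confidence interval.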

lpoly[(options)] adds a local polynomial smooth to the plot (Stata 10 required). Options are:

ci to plot a confidence interval(*) for the local polynomial smooth.

lpolyci_options as described in twoway lpolyci (if ci is specified).

lpoly_options as described in twoway lpoly.

fpfit[(options)] adds a fractional polynomial fit to the plot. Options are:

ci to add a confidence interval(*) for the fractional polynomial fit.

fpfitci_options as described in twoway fpfitci (if ci is specified).

fpfit_options as described in twoway fpfit.

lowess[(options)] adds a lowess smooth to the plot. Options are as described in twoway lowess.

pspline[(options)] adds a penalized spline smooth to the plot using the pspline user command (Jann and Gutierrez 2008; see ssc describe pspline). Options are degree(), nknots(), knots(), estopts(), and nopenalty as described in help pspline.

rspline[(options)] adds a regression spline smooth to the plot using the uvrs user command (Royston and Sauerbrei 2007; see net sj 7-1 st0120). Options are:

ci[(area_options)] to plot a confidence interval(*) for the regression spline smooth with options as described in help area_options.

level(#) to set the confidence level, as a percentage, for the plotted confidence interval. The default is level(95) or as set by set level.

alpha(), degree(), df(), and knots() as described in help uvrs.

regressopts() containing any options of regress to be passed through to the regression spline model.

running[(options)] adds a running line smooth to the plot using the running user command (Sasieni 1995, Sasieni and Royston 1998, Sasieni, Royston, and Cox 2005; see net sj 5-2 sed9_2). Options are:

ci[(area_options)] to plot a confidence interval(*) for the running line smooth with options as described in help area_options.

level(#) to set the confidence level, as a percentage, for the plotted confidence interval. The default is level(95) or as set by set level.

double, knn(), mean, repeat(), span(), and twice as described in help running.

autosmoo[(options)] adds an adaptive variable span running line smooth to the plot using the autosmoo user command (Sasieni 1998; see net stb 41 gr27). Options are kmin(), kmax(), and repeat() as described in help autosmoo.

scatter_options are any of the options documented in help twoway scatter.

addplot(plot) provides a way to add other plots to the generated graph; see [G] addplot_option.

generate(newvar) stores the partial logit residuals in newvar.

replace permits generate() to overwrite an existing variable.

nograph suppresses drawing the graph.
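
As a short sketch of how generate(), replace, and nograph fit together, assuming the lbw model from the Examples section has been fit (pres is a hypothetical variable name):

. logitcprplot age, generate(pres) nograph /*store residuals, draw nothing*/

. summarize pres

. logitcprplot age, rcspline generate(pres) replace /*overwrite pres and plot*/

Without replace, the last call would be expected to fail because pres already exists.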

(*) The confidence intervals computed by the rcspline, lpoly, fpfit, running, and rspline options assume the partial logit residuals to be known and are therefore somewhat too narrow.

Examples

. use http://www.stata-press.com/data/r9/lbw.dta

. logit low age lwt smoke ht ui ptl

. logitcprplot age, rcspline fpfit lowess

. logitcprplot age, lpoly /*Stata 10 required*/

. logitcprplot age, rcspline(ci)

. logitcprplot age, rcspline(ci) sample(10) /*-gsample- and -moremata- required*/

. generate agesq = age^2

. logit low age agesq lwt smoke ht ui ptl

. logitcprplot age agesq, rcspline(ci)

References

Fox, J. 1997. Applied Regression Analysis, Linear Models, and Related Methods. Thousand Oaks, CA: Sage.

Hilbe, J. 1992. sqv6: Smoothed partial residual plots for logistic regression. Stata Technical Bulletin 10:27.

Jann, B. 2005. moremata: Stata module (Mata) to provide various functions. Available from http://ideas.repec.org/c/boc/bocode/s455001.html.

Jann, B. 2006. gsample: Stata module to draw a random sample. Available from http://ideas.repec.org/c/boc/bocode/s456716.html.

Jann, B., and R. Gutierrez. 2008. pspline: Stata module providing a penalized spline scatterplot smoother based on linear mixed model technology. Available from http://ideas.repec.org/c/boc/bocode/s456972.html.

Landwehr, J. M., D. Pregibon, and A. C. Shoemaker. 1984. Graphical Methods for Assessing Logistic Regression Models. Journal of the American Statistical Association 79:61-71.

Royston, P., and W. Sauerbrei. 2007. Multivariable modeling with cubic regression splines: A principled approach. The Stata Journal 7:45-70.

Sasieni, P. 1995. sed9: Symmetric nearest neighbor linear smoothers. Stata Technical Bulletin 24:10-14.

Sasieni, P. 1998. gr27: An adaptive variable span running line smoother. Stata Technical Bulletin 41:4-7.

Sasieni, P., and P. Royston. 1998. sed9.1: Pointwise confidence intervals for running. Stata Technical Bulletin 41:17-23.

Sasieni, P., P. Royston, and N. J. Cox. 2005. sed9_2: Software update: Symmetric nearest neighbour linear smoothers. The Stata Journal 5:285.

Author

Ben Jann, ETH Zurich, jannb@ethz.ch

Thanks for citing this software as follows:

Jann, B. 2008. logitcprplot: Stata module to graph component-plus-residual plot for logistic regression. Available from http://ideas.repec.org/c/boc/bocode/s456969.html.

Also see

Online: help for logit, logistic, cprplot, mkspline, lpoly, fracpoly, lowess; gsample, rcspline, pspline, uvrs, running, autosmoo (if