help nlcheck
-------------------------------------------------------------------------------

Title

nlcheck -- Check linearity assumption after model estimation

Syntax

nlcheck varname [varlist] [, options ]

    options               Description
    -------------------------------------------------------------------------
    Main
      bins(#)             number of bins for the adaptive fit; default is
                            bins(10)
      spline              use linear spline instead of bins
      knots(#)            number of knots for the spline fit; default is
                            knots(9)
      eqfreq              use equal frequency bins
      discrete            treat varname as a discrete variable
      noisily             display adaptive model estimation results

    Graph
      graph               display graph containing linear predictions
      level(#)            set confidence level; default is level(95)
      step                use model without linear term to plot the
                            adaptive fit
      equation(eqno)      plot predictions for eqno equation in a
                            multiple-equation system
      cline_options       affect rendition of the plotted predictions
      ciopts(area_options)
                          affect rendition of the plotted confidence
                            interval
      twoway_options      any options other than by() documented in
                            [G] twoway_options
    -------------------------------------------------------------------------

Description

nlcheck is a simple diagnostic tool that can be used after fitting a model to quickly check the linearity assumption for predictor varname. nlcheck categorizes the predictor into bins, refits the model including dummy variables for the bins, and then performs a joint Wald test for the added parameters. A significant test result indicates that the linearity assumption is violated. Alternatively, if the spline option is specified, nlcheck uses linear splines for the adaptive model. Furthermore, support for discrete variables is provided (see the discrete option).
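The binned procedure described above can be approximated by hand. The following is a minimal sketch, not nlcheck's own implementation: it uses equal-frequency bins from xtile (closest in spirit to the eqfreq option), so its cut points, and therefore its test statistic, may differ from what nlcheck reports. The variable names ttl_bin and ttl_bin_d* are hypothetical and introduced only for illustration.

```stata
* Hand-rolled approximation of the binned linearity check
* (assumption: equal-frequency bins; nlcheck's default cut points differ)
use http://www.stata-press.com/data/r10/nlswork4.dta, clear
regress ln_wage ttl_exp msp

* Categorize the predictor into 10 bins
xtile ttl_bin = ttl_exp, nquantiles(10)

* Create one dummy per bin, then drop the first as the reference category
tabulate ttl_bin, generate(ttl_bin_d)
drop ttl_bin_d1

* Refit the model with the linear term plus the remaining bin dummies
regress ln_wage ttl_exp msp ttl_bin_d*

* Joint Wald test of the added parameters
testparm ttl_bin_d*
```

A small p-value from testparm suggests that the bin dummies explain variation beyond the linear term, which is the logic behind the test that nlcheck performs.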

Optionally, nlcheck also displays a graph of the adjusted linear predictions from the original model and the adaptive model (setting all other variables to the mean). Pointwise confidence intervals are plotted for the adaptive fit. Such a linear prediction plot can be useful to evaluate the functional form of a relationship.

If a predictor enters the model repeatedly via different transformations (e.g. polynomials), then these additional terms should be taken into account when computing the adjusted linear predictions for the graph. Use varlist to specify such additional variables.

nlcheck can be used with any estimation command as long as it supports test (and, if graph is specified, adjust), follows the standard syntax

command varlist [if] [in] [weight] [, options ]

and stores the command as typed in e(cmdline). Stata 10 is required.

Options

        +------+
    ----+ Main +-------------------------------------------------------------

bins(#) sets the number of bins used for the adaptive fit (resulting in #-1 additional parameters). The default is bins(10) or knots()+1 if knots() is specified.

spline causes linear splines to be used for the adaptive fit instead of bins.

knots(#) sets the number of knots for the spline fit. The default is knots(9) or bins()-1 if bins() is specified.

eqfreq causes the bin boundaries to be chosen according to quantiles of the empirical distribution of the predictor (i.e. so that each bin contains approximately the same number of observations). The default is to determine the cut points based on quantiles of the distinct values of the predictor (i.e. so that each bin contains approximately the same number of distinct values). With spline, the knots are positioned at equally spaced quantiles (of the distinct values or of the empirical distribution, depending on the eqfreq option), with half a step before the first knot and after the last knot.

discrete causes varname to be treated as a discrete variable and includes one parameter for each distinct value of the predictor in the adaptive model. The bins(), knots(), and spline options are not allowed with discrete.

noisily causes the estimation results from the adaptive model to be displayed.
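The spline variant of the adaptive fit can likewise be sketched by hand with mkspline. This is an illustration under stated assumptions, not nlcheck's own code: the knots are placed by mkspline's pctile option, which does not reproduce the half-step knot placement described under eqfreq, and the stub name ttl_sp is hypothetical.

```stata
* Hand-rolled approximation of the spline-based linearity check
* (assumption: knot positions come from mkspline, not from nlcheck)
use http://www.stata-press.com/data/r10/nlswork4.dta, clear

* 4 linear segments imply 3 knots; with marginal, ttl_sp1 equals ttl_exp
* and ttl_sp2-ttl_sp4 capture the change in slope at each knot
mkspline ttl_sp 4 = ttl_exp, marginal pctile

* Fit the model with the spline terms in place of the single linear term
regress ln_wage ttl_sp1-ttl_sp4 msp

* Under linearity all slope changes are zero; test them jointly
testparm ttl_sp2-ttl_sp4
```

Because of the marginal parameterization, the joint test of ttl_sp2-ttl_sp4 compares the piecewise-linear fit directly against a single straight line.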

        +-------+
    ----+ Graph +------------------------------------------------------------

graph plots the linear predictions from the base model and from the adaptive fit against the predictor, with all other variables set to their means. Pointwise confidence intervals are included for the adaptive fit.

level(#) specifies the confidence level, as a percentage, for the plotted confidence intervals. The default is level(95) or as set by set level.

equation(eqno), where eqno is ## or name, specifies the equation in a multiple-equation system for which the predictions are to be plotted. This option is allowed only after multiple-equation commands.

step causes the plotted adaptive fit to be based on a model from which the original predictor is excluded. The adjusted predictions from such a model appear as a step function and may be easier to read. step has no effect with spline or discrete, and it does not affect the nonlinearity test.

cline_options affect the rendition of the plotted predictions. See help cline_options.

ciopts(area_options) specifies details about the rendition of the plotted confidence interval. See help area_options.

twoway_options are any of the options documented in help twoway_options, excluding by().

Examples

Basic usage:

    . use http://www.stata-press.com/data/r10/nlswork4.dta
    . regress ln_wage ttl_exp msp
    . nlcheck ttl_exp
    . nlcheck ttl_exp, graph step
    . nlcheck ttl_exp, spline graph

Nonlinear effect:

    . generate ttl_exp2 = ttl_exp^2
    . regress ln_wage ttl_exp ttl_exp2 msp
    . nlcheck ttl_exp ttl_exp2, graph step bin(20)

Discrete predictor:

    . regress ln_wage ttl_exp msp year
    . nlcheck year, discrete graph

Logit model:

    . sysuse auto
    . logit foreign price mpg
    . nlcheck price, spline knots(3) graph

Multinomial logit:

    . mlogit rep78 mpg if rep78>=3
    . nlcheck mpg, bin(4) graph equation(5)

Returned results

    Scalars
      r(p)                two-sided p-value
      r(F) or r(chi2)     F statistic or chi-squared
      r(df)               degrees of freedom
      r(df_r)             residual degrees of freedom (some models)
      r(cut#)             value of the #th cut point or spline knot
      r(levels)           list of distinct values of discrete predictor
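The returned results can be inspected or used in later commands. The following is a short sketch that assumes nlcheck is installed and that the test has just been run on the example model; the 5% threshold is chosen only for illustration.

```stata
* Inspect and reuse the results returned by nlcheck
use http://www.stata-press.com/data/r10/nlswork4.dta, clear
regress ln_wage ttl_exp msp
nlcheck ttl_exp

* List everything left behind in r()
return list

* Act on the p-value of the nonlinearity test (illustrative 5% threshold)
if r(p) < 0.05 {
    display "linearity rejected at the 5% level, p = " %6.4f r(p)
}
```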

Author

Ben Jann, ETH Zurich, jannb@ethz.ch

Thanks for citing this software as follows:

Jann, B. (2008). nlcheck: Stata module to check linearity assumption after model estimation. Available from http://ideas.repec.org/.

Also see