help pspline
-------------------------------------------------------------------------------

Title

pspline -- Penalized spline scatterplot smoother based on xtmixed

Syntax

pspline yvar xvar [varlist] [if] [in] [, options]

    options                    Description
    -------------------------------------------------------------------------
    Main
      degree(#)                degree of spline; default is 1
      nknots(#)                number of knots; default is min(int(U/4), 35)
                                 where U is the number of distinct values of
                                 xvar
      knots(numlist)           exact location of knots
      alpha(#)                 significance level for pilot goodness-of-fit
                                 test
      force                    force penalized spline estimation
      at(varname [if] [in])    obtain the smooth at the values of varname
      generate(newvar)         store smoothed fit in newvar
      replace                  overwrite existing variable
      nopenalty                do not apply a roughness penalty; treat spline
                                 coefficients as fixed effects
      discrete                 treat xvar as a factor variable
      estopts(options)         estimation options as documented in help
                                 xtmixed
      noisily                  display estimation output
      nograph                  suppress graph
      noscatter                suppress scatterplot only
      noknotpos                suppress ticks indicating knot positions

    Scatterplot
      marker_options           change look of markers (color, size, etc.)
      marker_label_options     add marker labels; change look or position

    Smoothed line
      lineopts(cline_options)  affect rendition of the smoothed line

    Add plots
      addplot(plot)            add other plots to the generated graph

    Y axis, X axis, Titles, Legend, Overall
      twoway_options           any options other than by() documented in
                                 [G] twoway_options
    -------------------------------------------------------------------------

Description

pspline uses xtmixed to fit a penalized spline regression of yvar on xvar as discussed in Ruppert et al. (2003) and plots the fitted function. The knots of the spline are positioned at equally spaced quantiles of the distinct values of xvar.

pspline is an automatic smoother in that the optimal degree of smoothing is determined from the data by (restricted) maximum likelihood.

Specify varlist to adjust for additional covariates and plot partial residuals.

To circumvent convergence problems in situations where the data deviate only slightly from a simple parametric model (e.g. a linear model if degree=1, a quadratic model if degree=2), pspline performs a pilot goodness-of-fit (GOF) test for the parametric model. The GOF test is implemented as a Wald test of the spline terms in a non-penalized model (see the nopenalty option). A low p-value indicates strong evidence against the parametric model. pspline uses the penalized spline model only if the p-value is smaller than 0.3 (or the level set by alpha()) and otherwise keeps the parametric model. Specify force to skip the test and enforce the penalized spline model.
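The pilot test described above can be inspected from the saved results. The following is a minimal sketch, assuming pspline is installed and using the auto data from the Examples section; it relies only on options and r() results documented in this help file. The threshold alpha(0.05) is an arbitrary illustrative choice, and which model is selected depends on the data.

```stata
sysuse auto, clear

* Pilot GOF test with a stricter threshold than the default of 0.3
pspline price mpg, alpha(0.05) nograph
display "GOF p-value:  " r(gof_p)
display "model fitted: `r(model)'"

* Skip the pilot test and always fit the penalized spline model
pspline price mpg, force nograph
display "model fitted: `r(model)'"
```

With a smaller alpha() the parametric model is retained more often; force bypasses the decision entirely.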

Options

        +------+
    ----+ Main +-------------------------------------------------------------

degree(#) specifies the degree of the spline to be used in the smoothing. The default is degree(1) (linear splines), resulting in a piecewise linear smooth. Use degree(2) (quadratic splines) for a smooth with a continuous first derivative. degree(0) results in a step function.

nknots(#) specifies the number of knots of the spline. The default is min(int(U/4), 35) where U is the number of distinct values of xvar. nknots(0) is allowed and causes a parametric model without splines to be fitted. This is equivalent to fitting a polynomial model using regress (i.e. a linear model if degree=1, a quadratic model if degree=2, etc.).
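The default rule for the number of knots can be reproduced by hand. The sketch below is an illustration only: it counts the distinct values of mpg with tabulate (whose r(r) result holds the number of rows) and applies the formula stated above. The local macro names U and nk are hypothetical, and force is used so that the spline model is fitted regardless of the pilot test.

```stata
sysuse auto, clear

* U = number of distinct values of xvar (here: mpg)
quietly tabulate mpg
local U = r(r)

* Default rule from this help file: min(int(U/4), 35)
local nk = min(int(`U'/4), 35)
display "distinct values of mpg:  `U'"
display "default number of knots: `nk'"

* Compare with the number of knots reported by pspline
pspline price mpg, force nograph
display "knots used by pspline:   " r(nknots)
```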

knots(numlist) specifies the exact locations of the knots of the spline. The default is to position the knots at equally spaced quantiles of the distinct values of xvar. nknots() is not allowed if knots() is specified.

alpha(#) sets the significance level for the pilot goodness-of-fit test (see Description above). The default is alpha(0.3).

force skips the pilot goodness-of-fit test and enforces estimation of the penalized spline model.

at(varname [if] [in]) obtains the smoothed fit at the values of varname. The default is to obtain the fit at the values of xvar. The fit at the values of varname is computed by linear interpolation (or extrapolation) from the fit at the values of xvar.

generate(newvar) stores the smoothed values in newvar.

replace permits pspline to overwrite an existing variable.
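The at(), generate(), and replace options can be combined to evaluate the smooth on a grid. The following is a sketch under assumptions: the variable names price_fit and mpg_grid are hypothetical, the grid (12, 14, ..., 40) is chosen to lie within the observed range of mpg, and it is assumed that generate() stores the fit at the at() values when both are specified.

```stata
sysuse auto, clear

* Store the smoothed fit at the observed values of mpg
pspline price mpg, degree(2) generate(price_fit) nograph

* Hypothetical evaluation grid 12, 14, ..., 40 in the first 15 observations
generate mpg_grid = 10 + 2*_n in 1/15

* Fit evaluated at the grid values; replace allows overwriting price_fit
pspline price mpg, degree(2) at(mpg_grid) generate(price_fit) replace nograph
list mpg_grid price_fit in 1/15
```

Because the fit at the grid values is obtained by linear interpolation from the fit at the values of xvar, a grid that extends beyond the observed range would rely on extrapolation.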

nopenalty fits a non-penalized spline smooth. This is accomplished by treating the spline coefficients as fixed instead of random in xtmixed and is equivalent to fitting a spline model using regress.

discrete causes xvar to be treated as a factor variable and fits a model containing a random effect among the levels of xvar instead of a spline. nknots(), knots(), and at() are not allowed if discrete is specified.

estopts(options) specifies options to be passed through to xtmixed.

noisily causes output from xtmixed to be displayed.

nograph suppresses drawing the graph of the estimated smooth.

noscatter suppresses graphing a scatterplot of the observed data or partial residuals.

noknotpos suppresses the ticks indicating the knot positions.

        +-------------+
    ----+ Scatterplot +------------------------------------------------------

marker_options affect the rendition of markers drawn at the plotted points, including their shape, size, color, and outline; see [G] marker_options.

marker_label_options specify if and how the markers are to be labeled; see [G] marker_label_options.

        +---------------+
    ----+ Smoothed line +----------------------------------------------------

lineopts(cline_options) affects the rendition of the smoothed line; see [G] cline_options.

        +-----------+
    ----+ Add plots +--------------------------------------------------------

addplot(plot) provides a way to add other plots to the generated graph; see [G] addplot_option.

        +-----------------------------------------+
    ----+ Y axis, X axis, Titles, Legend, Overall +--------------------------

twoway_options are any of the options documented in [G] twoway_options, excluding by(). These include options for titling the graph (see [G] title_options) and for saving the graph to disk (see [G] saving_option).
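The graph options above can be combined in a single call. The following sketch is one possible styling, not a recommended default; the particular marker, line, and title choices are illustrative assumptions. The /// line continuation works in do-files, so run this from a do-file rather than typing it interactively.

```stata
sysuse auto, clear

* Hollow gray markers, a dashed medium-thick smooth, no knot ticks
pspline price mpg, degree(2)                      ///
    msymbol(Oh) mcolor(gs8)                       ///
    lineopts(lwidth(medthick) lpattern(dash))     ///
    noknotpos                                     ///
    title("Price versus mileage")                 ///
    ytitle("Price (USD)") xtitle("Mileage (mpg)")

* Overlay an ordinary linear fit for comparison
pspline price mpg, degree(2) addplot(lfit price mpg)
```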

Examples

Example using the auto data:

        . sysuse auto
        . pspline price mpg                              // piecewise linear
        . pspline price mpg, degree(0)                   // step function
        . pspline price mpg, degree(2)                   // continuous
        . pspline price mpg weight foreign, degree(2)    // covariate adjustment

Graph on the title page of Ruppert et al. (2003):

        . use http://fmwww.bc.edu/repec/bocode/l/lidar.dta
        . pspline logratio range

The motorcycle data:

        . webuse motorcycle
        . pspline accel time, d(2)

Saved results

pspline returns the results from xtmixed in e() and saves the following in r():

    Scalars
      r(degree)      degree of spline
      r(nknots)      number of knots
      r(alpha)       significance level for pilot GOF test
      r(gof_chi2)    chi-squared of pilot GOF test
      r(gof_df)      degrees of freedom of pilot GOF test
      r(gof_p)       p-value of pilot GOF test

    Macros
      r(model)       penalized, parametric, or non-penalized
      r(discrete)    discrete or empty

    Matrix
      r(knots)       knot positions
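A short sketch of how these results might be retrieved after estimation. The local macro names and the matrix name K are hypothetical; the r() names are the ones listed above. The r() results are copied immediately because later r-class commands overwrite them.

```stata
sysuse auto, clear
pspline price mpg, degree(2) nograph

* Copy r() results before another r-class command replaces them
local model  "`r(model)'"
local degree = r(degree)
local nknots = r(nknots)
matrix K = r(knots)

display "model: `model', degree: `degree', knots: `nknots'"
matrix list K

* The xtmixed estimation results remain available in e()
ereturn list
```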

References

Ruppert, D., M. P. Wand, and R. J. Carroll (2003). Semiparametric Regression. Cambridge University Press.

Authors

Ben Jann, ETH Zurich, jannb@ethz.ch

Roberto G. Gutierrez, StataCorp., rgutierrez@stata.com

Thanks for citing this software as follows:

Jann, B., and R. Gutierrez. 2008. pspline: Stata module providing a penalized spline scatterplot smoother based on linear mixed model technology. Available from http://ideas.repec.org/c/boc/bocode/s456972.html.

Also see