-------------------------------------------------------------------------------
help for mlowess
-------------------------------------------------------------------------------

Lowess smoothing with multiple predictors

mlowess yvar xvarlist [if] [in] [, combine(combine_options) cycles(#) draw(numlist) generate(stub) nograph log lowess(lowess_options) omit(numlist) predict(newvar) nopts replace scatter(scatter_options) line_options]

Description

mlowess computes lowess smooths of yvar on all predictors in xvarlist simultaneously; that is, each smooth is adjusted for the others. Fitted values may be saved in new variables with names beginning with stub, as specified in the generate() option.

By default, for each xvar in xvarlist, adjusted values of yvar and the lowess smooth for xvar are plotted against xvar. See Remarks for more details.

If you have just one predictor, use lowess directly.

Options

combine(combine_options) specifies any of the options allowed by the graph combine command. Useful examples are combine(ycommon) and combine(saving(graphname)).

cycles(#) sets the number of cycles. The default is cycles(3).

draw(numlist) specifies that smooths for a subset of the variables in xvarlist be plotted. The elements of numlist are indexes determined by the order of the variables in xvarlist. For example, mlowess y x1 x2 x3, draw(2 3) would plot smooths only for x2 and x3. By default, results for all variables in xvarlist are plotted. draw() takes precedence over omit() in the sense that results for variables included (by index) in numlist are plotted, even if they are excluded by omit(). See also omit().

generate(stub) specifies that fitted values for each member of xvarlist be saved in new variables with names beginning with stub.

nograph suppresses the graph.

log displays the squared correlation coefficient between the overall fitted values and yvar at each cycle for monitoring convergence. This option is provided mainly for pedagogic interest.

lowess(lowess_options) controls the operation of lowess in generating smooths. The key options are

    mean specifies running-mean smoothing; the default is running-line least-squares smoothing.

    noweight prevents the use of Cleveland's tricube weighting function; the default is to use the weighting function.

    bwidth(#) specifies the bandwidth. Centred subsets of bwidth() * n observations are used for calculating smoothed values for each point in the data except for end points, where smaller, uncentred subsets are used. The greater the bwidth(), the greater the smoothing. The default is 0.8.

    Note that each choice applies to all predictors. There is no provision for treating predictors differently.

omit(numlist) specifies that smooths for a subset of the variables in xvarlist not be plotted. The elements of numlist are indexes determined by the order of the variables in xvarlist. For example, mlowess y x1 x2 x3, omit(3) would plot smooths only for x1 and x2. By default, results for no variables in xvarlist are omitted. draw() takes precedence over omit(). See also draw().

predict(newvar) specifies that the predicted values be saved in new variable newvar.

nopts suppresses the points in the plots. Only the lines representing the smooths are drawn.

replace allows variables specified by any of the generate() and predict() options to be replaced if they already exist.

scatter(scatter_options) specifies any of the options allowed by the scatter command. These should be specified to control the rendering of the data points. The default includes msymbol(oh), or msymbol(p) with over 299 observations.

line_options are any of the options allowed with line. These should be specified to control the rendering of the smoothed lines or the overall graph.

Remarks

The approach of mlowess is based on methodology for generalised additive models (Hastie and Tibshirani 1990). mlowess is primarily intended for exploratory graphics, rather than model fitting with inferential apparatus.

An R-square (squared correlation coefficient) is provided as a goodness of fit indicator. However, this R-square can typically be increased simply by smoothing less, which is often likely to be unhelpful. As the resulting predictions come closer to interpolating the data, R-square will approach 1, but scientific usefulness and the possibility of insight will usually diminish.

Suppose that there are p >= 1 predictors. mlowess estimates the smooths f_1,...,f_p by using a backfitting algorithm and a lowess smoother S[y|x_j] for each predictor, as follows:

1. Initialize: alpha = mean(yvar), f_1,...,f_p estimated by multiple linear regression.

2. Cycle: j = 1,...,p, 1,...,p, ...

    f_j = S[y - alpha - sum_{i != j} f_i|x_j]

3. Continue for cycles() rounds.

No convergence criterion is applied. In practice, three cycles are usually more than sufficient to get results adequate for exploratory work.

The smooths are adjusted so that the mean of each equals the mean of yvar.

The points in the plots provided by mlowess depict y - sum_{i != j} f_i|x_j, i.e., the partial residuals plus alpha.
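
As a rough illustration of the backfitting steps described in these Remarks, here is a minimal Python sketch. It is not the code used by mlowess, and it rests on several assumptions. The smoother is a simplified local linear fit with tricube weights over the bwidth * n nearest points, standing in for lowess. The f_j start at zero rather than from a multiple linear regression. Each f_j is centred on zero, so adding alpha gives a smooth whose mean equals the mean of y. The names local_linear_smooth, backfit, r_square and demo are hypothetical, and the data are simulated.

```python
import math
import random


def local_linear_smooth(x, y, bwidth=0.8):
    """Tricube-weighted local linear fit of y on x, evaluated at each x[i].

    Assumption: each window holds the k nearest points by distance, so
    this only approximates the windows that lowess itself would use.
    """
    n = len(x)
    k = max(2, min(n, round(bwidth * n)))  # points per local window
    fitted = []
    for x0 in x:
        nearest = sorted((abs(xi - x0), i) for i, xi in enumerate(x))[:k]
        h = 1.0001 * nearest[-1][0]  # slightly wider than the farthest point
        if h == 0.0:
            h = 1.0  # all points in the window coincide with x0
        sw = su = suu = sy = suy = 0.0
        for d, i in nearest:
            w = (1.0 - (d / h) ** 3) ** 3  # tricube weight
            u = x[i] - x0  # centre on x0 so the intercept is the fit at x0
            sw += w
            su += w * u
            suu += w * u * u
            sy += w * y[i]
            suy += w * u * y[i]
        det = sw * suu - su * su
        if det > 1e-12:
            fitted.append((suu * sy - su * suy) / det)
        else:
            fitted.append(sy / sw)  # fall back to a weighted mean
    return fitted


def backfit(y, xs, cycles=3, bwidth=0.8):
    """Backfitting: f_j = S[y - alpha - sum_{i != j} f_i | x_j]."""
    n, p = len(y), len(xs)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in range(p)]  # assumption: zero start, not regression
    for _ in range(cycles):  # fixed number of rounds, no convergence test
        for j in range(p):
            partial = [
                y[t] - alpha - sum(f[i][t] for i in range(p) if i != j)
                for t in range(n)
            ]
            fj = local_linear_smooth(xs[j], partial, bwidth)
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]  # centre each smooth on zero
    return alpha, f


def r_square(y, fitted):
    """Squared correlation between y and the overall fitted values."""
    n = len(y)
    my, mf = sum(y) / n, sum(fitted) / n
    syf = sum((a - my) * (b - mf) for a, b in zip(y, fitted))
    syy = sum((a - my) ** 2 for a in y)
    sff = sum((b - mf) ** 2 for b in fitted)
    return syf * syf / (syy * sff)


def demo(n=100, seed=2005):
    """Fit simulated data at two bandwidths and return R-square for each."""
    rng = random.Random(seed)
    x1 = [rng.uniform(-1, 1) for _ in range(n)]
    x2 = [rng.uniform(-1, 1) for _ in range(n)]
    y = [math.sin(3 * a) + b * b + rng.gauss(0, 0.2) for a, b in zip(x1, x2)]
    results = {}
    for bw in (0.8, 0.2):
        alpha, f = backfit(y, [x1, x2], cycles=3, bwidth=bw)
        fitted = [alpha + f[0][t] + f[1][t] for t in range(n)]
        results[bw] = r_square(y, fitted)
    return results


if __name__ == "__main__":
    print(demo())
```

Calling demo() returns the R-square for bandwidths 0.8 and 0.2 on the same simulated data. Comparing the two values gives a concrete view of the remark that R-square can usually be raised simply by smoothing less.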

Examples

. mlowess mpg weight displ length

. mlowess mpg weight displ length, lowess(mean)

. mlowess mpg weight displ length, generate(S) nograph

. mlowess mpg weight displ length, omit(2) combine(saving(graph1))

For comparison, bivariate smooths may be produced like this:

. foreach v in weight displ length {
.     lowess mpg `v', saving(lwss_`v')
. }
. graph combine "lwss_weight" "lwss_displ" "lwss_length"

Author

Nicholas J. Cox, Durham University
n.j.cox@durham.ac.uk

Acknowledgements

The main features of the implementation here depend on the work of Patrick Royston, as reported by Royston and Cox (2005).

References

Hastie, T. and Tibshirani, R. 1990. Generalized additive models. London: Chapman and Hall.

Royston, P. and Cox, N.J. 2005. A multivariable scatterplot smoother. Stata Journal 5(3): 405-412.

Also see

Online: lowess