{smcl}
{* 17oct2006}{...}
{hline}
help for {hi:mlowess}
{hline}
{title:Lowess smoothing with multiple predictors}
{p 8 12 2}
{cmd:mlowess}
{it:yvar xvarlist}
{ifin}
[{cmd:,}
{cmd:combine(}{it:combine_options}{cmd:)}
{cmdab:cyc:les(}{it:#}{cmd:)}
{cmdab:dr:aw(}{it:numlist}{cmd:)}
{cmdab:gen:erate(}{it:stub}{cmd:)}
{cmd:nograph}
{cmd:log}
{cmd:lowess(}{it:lowess_options}{cmd:)}
{cmdab:om:it(}{it:numlist}{cmd:)}
{cmdab:p:redict(}{it:newvar}{cmd:)}
{cmdab:nopt:s}
{cmd:replace}
{cmdab:sc:atter(}{it:scatter_options}{cmd:)}
{it:line_options}]
{title:Description}
{p 4 4 2}
{cmd:mlowess} computes lowess smooths of {it:yvar} on all predictors in
{it:xvarlist} simultaneously; that is, each smooth is adjusted for the others.
Fitted values may be saved in new variables with names
beginning with {it:stub}, as specified in the {cmd:generate()} option.
{p 4 4 2}By default, for each {it:xvar} in {it:xvarlist}, adjusted
values of {it:yvar} and the lowess smooth for {it:xvar} are plotted against
{it:xvar}. See {hi:Remarks} for more details.
{p 4 4 2}If you have just one predictor, use {helpb lowess} directly.
{title:Options}
{p 4 8 2} {cmd:combine(}{it:combine_options}{cmd:)} specifies any of the
options allowed by the {helpb graph combine} command. Useful examples are
{cmd:combine(ycommon)} and {cmd:combine(saving(}{it:graphname}{cmd:))}.
{p 4 8 2} {cmd:cycles(}{it:#}{cmd:)} sets the number of cycles. The default is
{cmd:cycles(3)}.
{p 4 8 2} {cmd:draw(}{it:numlist}{cmd:)} specifies that smooths for a subset of
the variables in {it:xvarlist} be plotted. The elements of {it:numlist} are
indexes determined by the order of the variables in {it:xvarlist}. For
example, {cmd:mlowess y x1 x2 x3, draw(2 3)} would plot smooths only for
{cmd:x2} and {cmd:x3}. By default, results for all variables in
{it:xvarlist} are plotted. {cmd:draw()} takes precedence over {cmd:omit()} in
the sense that results for variables included (by index) in {it:numlist} are
plotted, even if they are excluded by {cmd:omit()}. See also {cmd:omit()}.
{p 4 8 2} {cmd:generate(}{it:stub}{cmd:)} specifies that fitted values for each
member of {it:xvarlist} be saved in new variables with names beginning with
{it:stub}.
{p 4 8 2}{cmd:nograph} suppresses the graph.
{p 4 8 2}{cmd:log} displays the squared correlation coefficient between the
overall fitted values and {it:yvar} at each cycle for monitoring convergence.
This option is provided mainly for pedagogic interest.
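{p 8 8 2}As an illustration only, assuming the auto data used under
{hi:Examples} are in memory (the choice of five cycles is arbitrary), the
cycle-by-cycle values could be inspected without drawing a graph:{p_end}
{p 8 12 2}{cmd:. mlowess mpg weight displ length, log cycles(5) nograph}{p_end}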
{p 4 8 2}{cmd:lowess(}{it:lowess_options}{cmd:)} specifies options that control
the operation of {helpb lowess} in generating smooths. The key options are
{p 8 8 2}{cmdab:m:ean} specifies running-mean smoothing; the default is
running-line least-squares smoothing.
{p 8 8 2}{cmdab:now:eight} prevents the use of Cleveland's tricube weighting
function; the default is to use the weighting function.
{p 8 8 2}{cmdab:bw:idth(}{it:#}{cmd:)} specifies the bandwidth.
Centred subsets of {cmd:bwidth()} * n
observations are used for calculating smoothed values for each point
in the data except for end points, where smaller, uncentred subsets
are used. The greater the {cmd:bwidth()}, the greater the smoothing. The
default is 0.8.
{p 8 8 2}Note that each choice applies to all predictors. There is no
provision for treating predictors differently.
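{p 8 8 2}As an illustration only, assuming the auto data used under
{hi:Examples} are in memory (the bandwidth 0.5 is an arbitrary choice), less
smoothing than the default could be requested for every predictor with{p_end}
{p 8 12 2}{cmd:. mlowess mpg weight displ length, lowess(bwidth(0.5))}{p_end}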
{p 4 8 2}{cmd:omit(}{it:numlist}{cmd:)} specifies that smooths for a subset of
the variables in {it:xvarlist} not be plotted. The elements of {it:numlist} are
indexes determined by the order of the variables in {it:xvarlist}. For example,
{cmd:mlowess y x1 x2 x3, omit(3)} would plot smooths only for {cmd:x1} and
{cmd:x2}. By default, results for no variables in {it:xvarlist} are omitted.
{cmd:draw()} takes precedence over {cmd:omit()}. See also {cmd:draw()}.
{p 4 8 2}{cmd:predict(}{it:newvar}{cmd:)} specifies that the predicted values
be saved in new variable {it:newvar}.
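{p 8 8 2}As an illustration only, assuming the auto data used under
{hi:Examples} are in memory and that {cmd:fit} (a hypothetical name) is not
already in use, observed values could be compared with predicted values
like this:{p_end}
{p 8 12 2}{cmd:. mlowess mpg weight displ length, predict(fit) nograph}{p_end}
{p 8 12 2}{cmd:. scatter mpg fit}{p_end}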
{p 4 8 2}{cmd:nopts} suppresses the points in the plots. Only the lines
representing the smooths are drawn.
{p 4 8 2}{cmd:replace} allows variables named in the
{cmd:generate()} or {cmd:predict()} options to be replaced if they already
exist.
{p 4 8 2}{cmd:scatter(}{it:scatter_options}{cmd:)} specifies any of the options
allowed by the {helpb scatter} command. These should be specified to control
the rendering of the data points. The default includes {cmd:msymbol(oh)}, or
{cmd:msymbol(p)} if there are more than 299 observations.
{p 4 8 2}{it:line_options} are any of the options allowed with {helpb line}.
These should be specified to control the rendering of the smoothed lines or the
overall graph.
{title:Remarks}
{p 4 4 2}The approach of {cmd:mlowess} is based on methodology for
generalised additive models (Hastie and Tibshirani 1990). {cmd:mlowess}
is primarily intended for exploratory graphics, rather than model fitting
with inferential apparatus.
{p 4 4 2}An R-square (squared correlation coefficient) is provided as a
goodness of fit indicator. However, this R-square can typically be increased
simply by smoothing less, which is often unhelpful. As the resulting
predictions come closer to interpolating the data, R-square will approach 1,
but scientific usefulness and the possibility of insight will usually diminish.
{p 4 4 2}
Suppose that there are p >= 1 predictors. {cmd:mlowess} estimates the
smooths f_1,...,f_p by using a backfitting algorithm and a lowess
smoother S[y|x_j] for each predictor, as follows:
{p 4 8 2}
1. Initialize: alpha = mean({it:yvar}), f_1,...,f_p
estimated by multiple linear regression.
{p 4 8 2}
2. Cycle: j = 1,...,p, 1,...,p, ...
{p 8 8 2}
f_j = S[y - alpha - sum_{i != j} f_i|x_j]
{p 4 8 2}
3. Continue for {cmd:cycles()} rounds.
{p 4 4 2}
No convergence criterion is applied. In practice, three cycles are
usually more than sufficient to get results adequate for exploratory work.
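{p 4 4 2}
The steps above can be sketched with official commands. This is an
illustration under stated assumptions, not the code of {cmd:mlowess}: it
assumes the auto data, uses the default settings of {helpb lowess}, fixes the
number of cycles at three, and omits any final adjustment of the smooths.
The names {cmd:alpha}, {cmd:partial} and {cmd:f_}{it:xvar} are hypothetical.
The commands should be run in one session or do-file so that the local
macro persists.{p_end}
{p 4 8 2}{cmd:. sysuse auto, clear}{p_end}
{p 4 8 2}{cmd:. local xvars weight displacement length}{p_end}
{p 4 8 2}{cmd:. * Step 1: alpha is the mean of yvar; each f_j starts as a centred linear term}{p_end}
{p 4 8 2}{cmd:. quietly summarize mpg}{p_end}
{p 4 8 2}{cmd:. scalar alpha = r(mean)}{p_end}
{p 4 8 2}{cmd:. quietly regress mpg `xvars'}{p_end}
{p 4 8 2}{cmd:. foreach v of local xvars {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 4}quietly summarize `v'}{p_end}
{p 4 8 2}{cmd:. {space 4}generate double f_`v' = _b[`v'] * (`v' - r(mean))}{p_end}
{p 4 8 2}{cmd:. {c )-}}{p_end}
{p 4 8 2}{cmd:. * Steps 2 and 3: in each cycle, smooth the partial residuals on each x_j in turn}{p_end}
{p 4 8 2}{cmd:. forvalues c = 1/3 {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 4}foreach v of local xvars {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 8}generate double partial = mpg - scalar(alpha)}{p_end}
{p 4 8 2}{cmd:. {space 8}foreach w of local xvars {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 12}if "`w'" != "`v'" quietly replace partial = partial - f_`w'}{p_end}
{p 4 8 2}{cmd:. {space 8}{c )-}}{p_end}
{p 4 8 2}{cmd:. {space 8}drop f_`v'}{p_end}
{p 4 8 2}{cmd:. {space 8}quietly lowess partial `v', generate(f_`v') nograph}{p_end}
{p 4 8 2}{cmd:. {space 8}drop partial}{p_end}
{p 4 8 2}{cmd:. {space 4}{c )-}}{p_end}
{p 4 8 2}{cmd:. {c )-}}{p_end}
{p 4 4 2}
After the loop, each {cmd:f_}{it:xvar} holds the current smooth for that
predictor, and {cmd:scalar(alpha)} plus their sum gives overall fitted
values under this sketch.{p_end}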
{p 4 4 2}
The smooths are adjusted so that the mean of each equals the mean of {it:yvar}.
{p 4 4 2}
The points in the plots provided by {cmd:mlowess}
depict y - sum_{i != j} f_i against x_j, i.e., the partial residuals for
x_j plus alpha.
{title:Examples}
{p 4 8 2}
{cmd:. mlowess mpg weight displ length}
{p 4 8 2}
{cmd:. mlowess mpg weight displ length, lowess(mean)}
{p 4 8 2}
{cmd:. mlowess mpg weight displ length, generate(S) nograph}
{p 4 8 2}
{cmd:. mlowess mpg weight displ length, omit(2) combine(saving(graph1))}
{p 4 4 2}For comparison, bivariate smooths may be obtained like this:
{p 4 8 2}{cmd:. foreach v in weight displ length {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 8}lowess mpg `v', saving(lwss_`v')}{p_end}
{p 4 8 2}{cmd:. {c )-}}{p_end}
{p 4 8 2}{cmd:. graph combine "lwss_weight" "lwss_displ" "lwss_length"}
{title:Author}
{p 4 4 2}Nicholas J. Cox{break}
Durham University{break}
n.j.cox@durham.ac.uk
{title:Acknowledgements}
{p 4 4 2}The main features of the implementation here depend on the work of
Patrick Royston, as reported by Royston and Cox (2005).
{title:References}
{p 4 8 2}
Hastie, T. and Tibshirani, R. 1990.
{it:Generalized additive models.}
London: Chapman and Hall.
{p 4 8 2}
Royston, P. and Cox, N.J. 2005.
A multivariable scatterplot smoother.
{it:Stata Journal} 5(3): 405{c -}412.
{title:Also see}
{p 4 13 2}Online: {helpb lowess}{p_end}