{smcl}
{* 17oct2006}{...}
{hline}
help for {hi:mlowess}
{hline}

{title:Lowess smoothing with multiple predictors}

{p 8 12 2}
{cmd:mlowess}
{it:yvar xvarlist}
{ifin}
[{cmd:,}
{cmd:combine(}{it:combine_options}{cmd:)}
{cmdab:cyc:les(}{it:#}{cmd:)}
{cmdab:dr:aw(}{it:numlist}{cmd:)}
{cmdab:gen:erate(}{it:stub}{cmd:)}
{cmd:nograph}
{cmd:log}
{cmd:lowess(}{it:lowess_options}{cmd:)}
{cmdab:om:it(}{it:numlist}{cmd:)}
{cmdab:p:redict(}{it:newvar}{cmd:)}
{cmdab:nopt:s}
{cmd:replace}
{cmdab:sc:atter(}{it:scatter_options}{cmd:)}
{it:line_options}]


{title:Description}

{p 4 4 2}
{cmd:mlowess} computes lowess smooths of {it:yvar} on all predictors in
{it:xvarlist} simultaneously; that is, each smooth is adjusted for the others.
Fitted values may be saved in new variables with names beginning with
{it:stub}, as specified in the {cmd:generate()} option.

{p 4 4 2}By default, for each {it:xvar} in {it:xvarlist}, adjusted values of
{it:yvar} and the lowess smooth for {it:xvar} are plotted against {it:xvar}.
See {hi:Remarks} for more details.

{p 4 4 2}If you have just one predictor, use {helpb lowess} directly.


{title:Options}

{p 4 8 2}
{cmd:combine(}{it:combine_options}{cmd:)} specifies any of the options allowed
by the {helpb graph combine} command. Useful examples are
{cmd:combine(ycommon)} and {cmd:combine(saving(}{it:graphname}{cmd:))}.

{p 4 8 2}
{cmd:cycles(}{it:#}{cmd:)} sets the number of backfitting cycles (see
{hi:Remarks}). The default is {cmd:cycles(3)}.

{p 4 8 2}
{cmd:draw(}{it:numlist}{cmd:)} specifies that smooths for a subset of the
variables in {it:xvarlist} be plotted. The elements of {it:numlist} are indexes
determined by the order of the variables in {it:xvarlist}. For example,
{cmd:mlowess y x1 x2 x3, draw(2 3)} would plot smooths only for {cmd:x2} and
{cmd:x3}. By default results for all variables in {it:xvarlist} are plotted.
{cmd:draw()} takes precedence over {cmd:omit()} in the sense that results for
variables included (by index) in {it:numlist} are plotted, even if they are
excluded by {cmd:omit()}. See also {cmd:omit()}.

{p 4 8 2}
{cmd:generate(}{it:stub}{cmd:)} specifies that fitted values for each member of
{it:xvarlist} be saved in new variables with names beginning with {it:stub}.

{p 4 8 2}{cmd:nograph} suppresses the graph.

{p 4 8 2}{cmd:log} displays the squared correlation coefficient between the
overall fitted values and {it:yvar} at each cycle for monitoring convergence.
This option is provided mainly for pedagogic interest.

{p 4 8 2}{cmd:lowess(}{it:lowess_options}{cmd:)} specifies options that control
the operation of {helpb lowess} in generating smooths. The key options are

{p 8 8 2}{cmdab:m:ean} specifies running-mean smoothing; the default is
running-line least-squares smoothing.

{p 8 8 2}{cmdab:now:eight} prevents the use of Cleveland's tricube weighting
function; the default is to use the weighting function.

{p 8 8 2}{cmdab:bw:idth(}{it:#}{cmd:)} specifies the bandwidth. Centred subsets
of {cmd:bwidth()} * n observations are used for calculating smoothed values for
each point in the data except for end points, where smaller, uncentred subsets
are used. The greater the {cmd:bwidth()}, the greater the smoothing. The
default is 0.8.

{p 8 8 2}Note that each choice applies to all predictors. There is no provision
for treating predictors differently.

{p 4 8 2}{cmd:omit(}{it:numlist}{cmd:)} specifies that smooths for a subset of
the variables in {it:xvarlist} not be plotted. The elements of {it:numlist} are
indexes determined by the order of the variables in {it:xvarlist}. For example,
{cmd:mlowess y x1 x2 x3, omit(3)} would plot smooths only for {cmd:x1} and
{cmd:x2}. By default results for no variables in {it:xvarlist} are omitted.
{cmd:draw()} takes precedence over {cmd:omit()}. See also {cmd:draw()}.
{p 4 8 2}{cmd:predict(}{it:newvar}{cmd:)} specifies that the predicted values
be saved in the new variable {it:newvar}.

{p 4 8 2}{cmd:nopts} suppresses the points in the plots. Only the lines
representing the smooths are drawn.

{p 4 8 2}{cmd:replace} allows variables specified by either of the
{cmd:generate()} and {cmd:predict()} options to be replaced if they already
exist.

{p 4 8 2}{cmd:scatter(}{it:scatter_options}{cmd:)} specifies any of the options
allowed by the {helpb scatter} command. These should be specified to control
the rendering of the data points. The default includes {cmd:msymbol(oh)}, or
{cmd:msymbol(p)} with over 299 observations.

{p 4 8 2}{it:line_options} are any of the options allowed with {helpb line}.
These should be specified to control the rendering of the smoothed lines or the
overall graph.


{title:Remarks}

{p 4 4 2}The approach of {cmd:mlowess} is based on methodology for generalised
additive models (Hastie and Tibshirani 1990). {cmd:mlowess} is primarily
intended for exploratory graphics, rather than model fitting with inferential
apparatus.

{p 4 4 2}An R-square (squared correlation coefficient) is provided as a
goodness of fit indicator. However, this R-square can typically be increased
simply by smoothing less, which is often unhelpful. As the resulting
predictions come closer to interpolating the data, R-square will approach 1,
but scientific usefulness and the possibility of insight will usually diminish.

{p 4 4 2}
Suppose that there are p >= 1 predictors. {cmd:mlowess} estimates the smooths
f_1,...,f_p by using a backfitting algorithm and a lowess smoother S[y|x_j] for
each predictor, as follows:

{p 4 8 2}
1. Initialize: alpha = mean({it:yvar}), f_1,...,f_p estimated by multiple
linear regression.

{p 4 8 2}
2. Cycle: j = 1,...,p, 1,...,p, ...

{p 8 8 2}
f_j = S[y - alpha - sum_{i != j} f_i|x_j]

{p 4 8 2}
3. Continue for {cmd:cycles()} rounds.

{p 4 4 2}
No convergence criterion is applied.
In practice, three cycles are usually more than sufficient to get results
adequate for exploratory work.

{p 4 4 2}
The smooths are adjusted so that the mean of each equals the mean of {it:yvar}.

{p 4 4 2}
The points in the plots provided by {cmd:mlowess} depict
y - sum_{i != j} f_i|x_j, i.e., the partial residuals plus alpha.


{title:Examples}

{p 4 8 2}
{cmd:. mlowess mpg weight displ length}

{p 4 8 2}
{cmd:. mlowess mpg weight displ length, lowess(mean)}

{p 4 8 2}
{cmd:. mlowess mpg weight displ length, generate(S) nograph}

{p 4 8 2}
{cmd:. mlowess mpg weight displ length, omit(2) combine(saving(graph1))}

{p 4 4 2}For comparison, bivariate smooths may be obtained like this:

{p 4 8 2}{cmd:. foreach v in weight displ length {c -(}}{p_end}
{p 4 8 2}{cmd:. {space 8}lowess mpg `v', saving(lwss_`v')}{p_end}
{p 4 8 2}{cmd:. {c )-}}{p_end}
{p 4 8 2}{cmd:. graph combine "lwss_weight" "lwss_displ" "lwss_length"}


{title:Author}

{p 4 4 2}Nicholas J. Cox{break}
Durham University{break}
n.j.cox@durham.ac.uk


{title:Acknowledgements}

{p 4 4 2}The main features of the implementation here depend on the work of
Patrick Royston, as reported by Royston and Cox (2005).


{title:References}

{p 4 8 2}
Hastie, T. and Tibshirani, R. 1990.
{it:Generalized additive models.}
London: Chapman and Hall.

{p 4 8 2}
Royston, P. and Cox, N.J. 2005.
A multivariable scatterplot smoother.
{it:Stata Journal} 5(3): 405{c -}412.


{title:Also see}

{p 4 13 2}Online: {helpb lowess}{p_end}