-------------------------------------------------------------------------------
help for fractileplot
-------------------------------------------------------------------------------

Smoothing with respect to distribution function predictors

fractileplot yvar xvarlist [if] [in] [, a(#) combine(combine_options) cycles(#) draw(numlist) generate(stub) nograph locpoly[(locpoly_options)] log lowess(lowess_options) omit(numlist) predict(newvar) nopts replace scatter(scatter_options) line_options]

Description

fractileplot computes smooths of yvar on all predictors in xvarlist simultaneously; that is, each smooth is adjusted for the others. Each predictor x_j is treated on the scale of its distribution function, F(x_j), estimated for a sample of n as (rank of x_j - a) / (n - 2a + 1). a defaults to 0.5. By default smoothing is with lowess. Optionally smoothing may be with locpoly, so long as that has been installed. Fitted values may be saved in new variables with names beginning with stub, as specified in the generate() option.

By default, for each xvar in xvarlist adjusted values of yvar and the smooth for F(xvar) are plotted against F(xvar). See Remarks for more details.
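
The formula for F can be reproduced by hand for a single predictor. The following is a minimal sketch, assuming the auto data shipped with Stata and a do-file context; rk, Fhat and the local macro a are illustrative names, and the use of average ranks for ties (the egen default) is an assumption about how ties are treated.

```stata
* Sketch only: F(x) = (rank - a) / (n - 2a + 1) for one predictor
sysuse auto, clear
local a = 0.5                          // the documented default for a()
egen rk = rank(weight)                 // tied values receive their average rank
quietly count if !missing(weight)      // n, left in r(N)
generate Fhat = (rk - `a') / (r(N) - 2*`a' + 1)
summarize Fhat
```

Changing the local macro a to 0 or 1/3 gives the other plotting positions mentioned under the a() option.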

Options

a(#) specifies a in the formula for F. The default is a = 0.5, giving (i - 0.5) / n. Other choices include a = 0, giving i / (n + 1), and a = 1/3, giving (i - 1/3) / (n + 1/3). More discussion on this is available at http://www.stata.com/support/faqs/stat/pcrank.html.

combine(combine_options) specifies any of the options allowed by the graph combine command. Useful examples are combine(ycommon) and combine(saving(graphname)).

cycles(#) sets the number of cycles. The default is cycles(3).

draw(numlist) specifies that smooths for a subset of the variables in xvarlist be plotted. The elements of numlist are indexes determined by the order of the variables in xvarlist. For example, fractileplot y x1 x2 x3, draw(2 3) would plot smooths only for F(x2) and F(x3). By default results for all variables in xvarlist are plotted. draw() takes precedence over omit() in the sense that results for variables included (by index) in numlist are plotted, even if they are excluded by omit(). See also omit().

generate(stub) specifies that fitted values for each member of xvarlist be saved in new variables with names beginning with stub.

nograph suppresses the graph.

locpoly specifies that locpoly (Gutierrez, Linhart and Pitblado 2003, 2005a, 2005b) should be used for smoothing. This should be installed beforehand. locpoly may be specified with options. Key are

    degree(#) specifies the degree of the polynomial to be used in the smoothing. Zero is the default, meaning local mean smoothing.

    width(#) specifies the halfwidth of the kernel, the width of the smoothing window around each point. If width() is not specified, then the "default" width is used; see [R] kdensity. This default is entirely inappropriate for local polynomial smoothing. Choose your own.

    epanechnikov, biweight, cosine, gaussian, parzen, rectangle, and triangle specify the kernel. By default, epanechnikov, meaning the Epanechnikov kernel, is used.

log displays the squared correlation coefficient between the overall fitted values and yvar at each cycle for monitoring convergence. This option is provided mainly for pedagogic interest.

lowess(lowess_options) controls the operation of lowess in generating smooths. Key are

    mean specifies running-mean smoothing; the default is running-line least-squares smoothing.

    noweight prevents the use of Cleveland's tricube weighting function; the default is to use the weighting function.

    bwidth(#) specifies the bandwidth. Centred subsets of bwidth() * n observations are used for calculating smoothed values for each point in the data except for end points, where smaller, uncentred subsets are used. The greater the bwidth(), the greater the smoothing. The default is 0.8.

omit(numlist) specifies that smooths for a subset of the variables in xvarlist not be plotted. The elements of numlist are indexes determined by the order of the variables in xvarlist. For example, fractileplot y x1 x2 x3, omit(3) would plot smooths only for F(x1) and F(x2). By default results for no variables in xvarlist are omitted. draw() takes precedence over omit(). See also draw().

predict(newvar) specifies that the predicted values be saved in new variable newvar.

nopts suppresses the points in the plots. Only the lines representing the smooths are drawn.

replace allows variables specified by any of the generate() and predict() options to be replaced if they already exist.

scatter(scatter_options) specifies any of the options allowed by the scatter command. These should be specified to control the rendering of the data points. The default includes msymbol(oh), or msymbol(p) with over 299 observations.

line_options are any of the options allowed with line. These should be specified to control the rendering of the smoothed lines or the overall graph.
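
As an illustration of how some of these options interact, here is a short sketch. It assumes that fractileplot is installed and uses the auto data shipped with Stata; it uses only the options documented above, and the particular combinations are arbitrary choices for illustration.

```stata
* Sketch only: assumes fractileplot is installed
sysuse auto, clear

* draw() takes precedence over omit(): per the option descriptions,
* results for the 2nd and 3rd predictors should both be plotted
fractileplot mpg weight displacement length, draw(2 3) omit(3)

* smooths only (no points), with a common y axis across the panels
fractileplot mpg weight displacement length, nopts combine(ycommon)
```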

Remarks

Smoothing with respect to distribution functions has various elementary attractions. An F scale provides a common scale for variables with different level and spread and even different units. Subject to the occurrence of ties, values are equally spaced on the F scale and so in good condition for smoothing. This can be especially useful when predictors are highly skewed. F is invariant under strictly increasing transformations, so that for example F(log x) is identical to F(x) so long as x > 0. This can be useful when it is not clear whether predictors should be transformed.
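
The invariance of F under strictly increasing transformations can be checked directly, because F depends on the data only through ranks. This is a minimal sketch, assuming the auto data shipped with Stata; logprice, r_raw and r_log are illustrative names.

```stata
* Sketch only: ranks, and hence F, are unchanged by a log transformation
sysuse auto, clear
generate double logprice = log(price)   // price > 0 throughout, so log is defined
egen r_raw = rank(price)
egen r_log = rank(logprice)
assert r_raw == r_log                   // passes silently when every rank agrees
```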

Sen (2005) gives a useful recent account of kernel smoothing of responses with respect to distribution functions of predictors. The canonical reference is Mahalanobis (1960), which introduced the term "fractile graphical analysis". Mahalanobis plotted means of one variable for bins defined by selected fractiles of the other variable. Binning and averaging now appear arbitrary and awkward, and some kind of kernel-based smoothing is more appealing. The approach in fractileplot is based on methodology for generalised additive models (Hastie and Tibshirani 1990).

Terminology is problematic here. Terms such as "fractile graph" (Sen 2005) and "fractile plot" (Nordhaus 2006) persist in recent literature for modern versions of Mahalanobis' plots, even though neither ordinate nor abscissa in the resulting graphs is a fractile. The term "fractile" was introduced to the English literature by Hald (1952) with the sense of "quantile", but it has never supplanted "quantile" and is often misunderstood to mean fraction or cumulative probability or plotting position (e.g. Nordhaus 2006). Hald used "fractile diagram" for normal probability plots - in Stata terms, his examples are equivalent to qnorm with axes reversed - and this usage also continues in recent literature (e.g. Blæsild and Granfeldt 2003). In the absence of an obvious alternative, customary terminology is used here under protest.

An R-square (squared correlation coefficient) is provided as a goodness of fit indicator. However, this R-square can typically be increased simply by smoothing less, which is often likely to be unhelpful. As the resulting predictions come closer to interpolating the data, R-square will approach 1, but scientific usefulness and the possibility of insight will usually diminish.
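
The effect of smoothing less on R-square can be seen by saving the overall predictions with predict() and correlating them with the response. This is a sketch, assuming that fractileplot is installed and using the auto data shipped with Stata; fit1 and fit2 are illustrative names, and bwidth(0.3) is an arbitrary choice for comparison with the default of 0.8.

```stata
* Sketch only: assumes fractileplot is installed
sysuse auto, clear

* default lowess bandwidth (0.8)
fractileplot mpg weight displacement length, predict(fit1) nograph
quietly correlate mpg fit1
display "R-square, bwidth 0.8: " r(rho)^2

* less smoothing: a smaller bandwidth will typically raise R-square
fractileplot mpg weight displacement length, lowess(bwidth(0.3)) predict(fit2) nograph
quietly correlate mpg fit2
display "R-square, bwidth 0.3: " r(rho)^2
```

A higher value from the second run would not by itself indicate a better description of the data, for the reasons given above.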

Note that you do not need the machinery here to do this for just one predictor. The following is a basic recipe:

. gen touse = (y < .) & (x < .)
. egen abscissa = rank(x) if touse
. count if touse
. replace abscissa = (abscissa - 0.5) / r(N)
. lowess y abscissa, xti("fraction of data")

Suppose that there are p >= 1 predictors. fractileplot estimates the smooths f_1,...,f_p by using a backfitting algorithm and a lowess or local polynomial smoother S[y|F(x_j)] for each predictor, as follows:

1. Initialize: alpha = mean(yvar), f_1,...,f_p estimated by multiple linear regression.

2. Cycle: j = 1,...,p, 1,...,p, ...

    f_j = S[y - alpha - sum_{i != j} f_i|F(x_j)]

3. Continue for cycles() rounds.

No convergence criterion is applied. In practice, three cycles are usually more than sufficient to get results adequate for exploratory work.

The smooths are adjusted so that the mean of each equals the mean of yvar.

The points in the plots provided by fractileplot depict y - sum_{i != j} f_i|F(x_j), i.e., the partial residuals plus alpha.
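
The backfitting steps above can be sketched by hand with lowess. This is an illustration under stated assumptions, not the code used by fractileplot. It assumes the auto data shipped with Stata, three predictors with no missing values, a = 0.5, starting values from a linear regression on the F scales, and components held at mean zero during the cycles (so that alpha would be added back for plotting). All variable and macro names are illustrative.

```stata
* Sketch only: hand-rolled backfitting with lowess on the F scale
sysuse auto, clear
local xs weight displacement length

* F scale for each predictor, with a = 0.5
foreach v of local xs {
    egen F_`v' = rank(`v')
    quietly count if !missing(`v')
    quietly replace F_`v' = (F_`v' - 0.5) / r(N)
}

* Step 1: alpha = mean(y); starting values from multiple linear regression
quietly summarize mpg
scalar alpha = r(mean)
quietly regress mpg F_weight F_displacement F_length
foreach v of local xs {
    generate double f_`v' = _b[F_`v'] * F_`v'
    quietly summarize f_`v'
    quietly replace f_`v' = f_`v' - r(mean)    // centre at zero (assumption)
}

* Steps 2 and 3: cycle over predictors for a fixed number of rounds
forvalues c = 1/3 {
    foreach v of local xs {
        local others : list xs - v             // all predictors except the current one
        tempvar partial smooth
        generate double `partial' = mpg - scalar(alpha)
        foreach o of local others {
            quietly replace `partial' = `partial' - f_`o'
        }
        lowess `partial' F_`v', generate(`smooth') nograph
        quietly summarize `smooth'
        quietly replace f_`v' = `smooth' - r(mean)
        drop `partial' `smooth'
    }
}

* Overall fitted values and their squared correlation with the response
generate double fit = scalar(alpha) + f_weight + f_displacement + f_length
quietly correlate mpg fit
display "R-square = " r(rho)^2
```

Because the starting values and centring used here are assumptions, results need not match those of fractileplot exactly.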

Examples

. fractileplot mpg weight displ length

. fractileplot mpg weight displ length, lowess(mean)

. fractileplot mpg weight displ length, locpoly(degree(1) biweight width(0.4))

. fractileplot mpg weight displ length, generate(S) nograph

. fractileplot mpg weight displ length, omit(2) combine(saving(graph1))

For comparison, bivariate smooths may be obtained like this:

. foreach v in weight displ length {
.     fractileplot mpg `v', combine(saving(fl_`v'))
. }
. graph combine "fl_weight" "fl_displ" "fl_length"

Author

Nicholas J. Cox
Durham University
n.j.cox@durham.ac.uk

Acknowledgements

The main features of the implementation here depend on the work of Patrick Royston, as reported by Royston and Cox (2005).

References

Blæsild, P. and Granfeldt, J. 2003. Statistics with applications in geology and biology. Boca Raton, FL: Chapman & Hall/CRC.

Gutierrez, R.G., Linhart, J.M. and Pitblado, J.S. 2003. From the help desk: Local polynomial regression and Stata plugins. Stata Journal 3(4): 412-419. Software Updates: 2005a. 5(1): 139 and 2005b. 5(2): 285.

Hald, A. 1952. Statistical theory with engineering applications. New York: John Wiley.

Hastie, T. and Tibshirani, R. 1990. Generalized additive models. London: Chapman and Hall.

Mahalanobis, P.C. 1960. A method of fractile graphical analysis. Econometrica 28: 325-351. Reprinted 1961. Sankhya Series A 23: 41-64.

Nordhaus, W.D. 2006. Geography and macroeconomics: new data and new findings. Proceedings, National Academy of Sciences 103(10): 3510-3517.

Royston, P. and Cox, N.J. 2005. A multivariable scatterplot smoother. Stata Journal 5(3): 405-412.

Sen, B. 2005. Estimation and comparison of fractile graphs using kernel smoothing techniques. Sankhya 67: 305-334. http://sankhya.isical.ac.in/search/67_2/2005014.pdf

Note: Sankhya should carry a bar accent on its final "a".

Also see

Online: lowess, locpoly (if installed)