-------------------------------------------------------------------------------
help for trimplot
-------------------------------------------------------------------------------

Plots of trimmed means

trimplot varname [if exp] [in range] [, over(varname) percent scatter_options]

trimplot varlist [if exp] [in range] [, percent scatter_options]

Description

trimplot produces plots of trimmed means versus depth for one or more numeric variables. Such plots may help specifically in choosing or assessing measures of level and generally in assessing the symmetry or skewness of distributions. They can be used to compare distributions or to assess whether transformations are necessary or effective.

trimplot may show trimmed means either for one variable, in which case different groups may be distinguished by the over() option, or for several variables.

Remarks

Order n data values for a variable x and label them such that x(1) <= ... <= x(n). Following Tukey (1977), depth is defined as 1 for x(1) and x(n), 2 for x(2) and x(n-1), and so forth: it is the smaller number reached by counting inwards from either extreme x(1) or x(n) toward any specified value. So the depth of x(i) is the smaller of i and n - i + 1.
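The depth definition above can be sketched directly in Stata. This is a minimal illustration on made-up data, not the code used inside trimplot; the variable names x and depth and the seven toy values are assumptions, and the sketch assumes no missing values (which sort would place last).

```stata
* Toy data: seven values, no missing values (an assumption of this sketch)
clear
input x
3
1
4
1
5
9
2
end

* Sort so that observation i holds x(i), then count inwards from both ends:
* depth of x(i) is the smaller of i and n - i + 1
sort x
generate depth = min(_n, _N - _n + 1)
list x depth
```

With n = 7 the sorted values 1, 1, 2, 3, 4, 5, 9 receive depths 1, 2, 3, 4, 3, 2, 1.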

Trimmed means may be related to depth as follows. A trimmed mean may be defined for any particular depth as the mean of all values with that depth or greater. Thus the trimmed mean for depth 1 is the mean of all values. The trimmed mean for depth 2 is the mean of all values except those of depth 1, i.e. all values except for the extremes. The trimmed mean for depth 3 is the mean of all values except those of depth 1 and 2; and so forth.

The highest depth observed for a distribution occurs once if n is odd and twice if n is even; either way it labels those values whose mean is the median. Thus trimmed means range from the mean to the median.
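As a hedged sketch of the relation between depth and trimmed mean described above, the following Stata code computes the trimmed mean for each depth on made-up data. It is an illustration only, not the implementation in trimplot; the names x, depth and tmean and the toy values are assumptions.

```stata
* Toy data, already free of missing values (an assumption of this sketch)
clear
input x
1
1
2
3
4
5
9
end

sort x
generate depth = min(_n, _N - _n + 1)
generate tmean = .

* For each depth d, average all values with depth d or greater
summarize depth, meanonly
local maxdepth = r(max)
forvalues d = 1/`maxdepth' {
    summarize x if depth >= `d', meanonly
    replace tmean = r(mean) if depth == `d'
}
list x depth tmean
```

Here the trimmed mean for depth 1 is the ordinary mean, 25/7 or about 3.57, and the trimmed mean for the highest depth, 4, is the median, 3.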

The idea of plotting trimmed mean versus percent trimmed is only a small one. An example can be found in Rosenberger and Gasko (1983, p.315). Users who know of good or early references are welcome to email me with details.

For more on trimmed means, see the help for trimmean (which must be installed first).

Options

over(varname) specifies that calculations are to be carried out separately for each group defined by varname. over() is allowed only with a single variable to be plotted.

percent specifies that depth is to be scaled and plotted as percent trimmed, which will range from 0 to nearly 50 (a median cannot be based on no observed values, so 50 cannot be attained).
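To see why 50 cannot be attained, here is a small arithmetic sketch. It assumes that percent trimmed is calculated as 100 * (depth - 1) / n, which is consistent with the range stated above but is an assumption here, not a formula confirmed by this help file.

```stata
* ASSUMPTION: percent trimmed = 100 * (depth - 1) / n.
* The highest depth is ceil(n / 2), whether n is odd or even.
foreach n of numlist 7 8 {
    local maxdepth = ceil(`n' / 2)
    forvalues d = 1/`maxdepth' {
        display "n = `n'  depth = `d'  percent = " %5.2f 100 * (`d' - 1) / `n'
    }
}
```

Under that assumption the largest percent is 100 * 3 / 7, or about 42.86, for n = 7 and 37.5 for n = 8. In general it falls short of 50 by 50/n for odd n and by 100/n for even n.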

scatter_options are options of twoway scatter.

Examples

. webuse citytemp
. describe
. trimplot *dd
. trimplot temp*
. trimplot tempjan, over(region) percent

Author

Nicholas J. Cox, Durham University
n.j.cox@durham.ac.uk

Acknowledgments

Ariel Linden found a typo in the help.

References

Rosenberger, J.L. and Gasko, M. 1983. Comparing location estimators: trimmed means, medians, and trimean. In Hoaglin, D.C., Mosteller, F. and Tukey, J.W. (Eds) Understanding robust and exploratory data analysis. New York: John Wiley, 297-338.

Tukey, J.W. 1977. Exploratory data analysis. Reading, MA: Addison-Wesley.

Also see

summarize, means, trimmean (if installed), hsmode (if installed), shorth (if installed)