help for statplot

Plots of summary statistics, including plots by category

    statplot varlist [if exp] [in range] [weight]
        [ , statistic(stat) over(over_options) [ over(over_options) ] missing
        xpose recast(plottype) varnames varopts(varlist_options) graph_options ]

fweights, aweights and iweights may be specified.


statplot plots summary statistics for varlist.


By default statplot is a wrapper for graph hbar. Optionally, statplot may be recast as a wrapper for graph bar or graph dot. The choice is a matter of personal taste, although in general horizontal displays make it easier to identify names or labels of categories.

statplot is offered as an alternative command that may produce graphs that are more congenial or more convenient than those yielded by graph hbar, graph bar, or graph dot. statplot is implemented in terms of those commands, and is less general than they are in that only one statistic may be plotted at once. So where does any difference or advantage lie?

Like those commands, statplot calls upon collapse to temporarily produce a reduced dataset of summary statistics. The difference is that it organizes that dataset in a different way. The graphs produced, compared with those of graph hbar|bar|dot, are typically based more on axis labeling than on the use of legends, and typically are shown in one color rather than several. Thus, they are likely to be closer to a format acceptable for journal publication.

Otherwise put, statplot may often reduce the need for users to collapse the data for themselves and manipulate the resulting reduced dataset.
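
For comparison, the manual route that statplot is meant to spare the user might look roughly like the sketch below. It uses the citytemp dataset from the examples in this help file. It is an illustration of collapsing by hand and plotting the reduced dataset, not a description of how statplot organizes its dataset internally, and the resulting graph still relies on a legend.

    . * illustrative manual route (an assumption, not statplot's internal code)
    . sysuse citytemp, clear
    . collapse (mean) heatdd cooldd, by(region)
    . graph hbar (asis) heatdd cooldd, over(region)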

The schematic examples below should help to make this concrete.

graph hbar (mean) var1 var2 var3

    +-----------------------------------------+
    |                title(s)                 |
    |        +-----------------------+        |
    |        |      plot region      |        |
    |        |                       |        |
    |        |__________             |        |
    |        |__________|            |        |
    |        |__________             |        |
    |        |__________|            |        |
    |        |__________             |        |
    |        |__________|            |        |
    |        |                       |        |
    |        +-----------------------+        |
    |                                         |
    |                 legend                  |
    |        +----------------------+         |
    |        |[green]  Mean of Var1 |         |
    |        |[orange] Mean of Var2 |         |
    |        |[yellow] Mean of Var3 |         |
    |        +----------------------+         |
    +-----------------------------------------+

statplot var1 var2 var3

    +-----------------------------------------+
    |                title(s)                 |
    |        +-----------------------+        |
    |        |      plot region      |        |
    |        |                       |        |
    |        |__________             |        |
    |Var1 Lbl|__________|            |        |
    |        |__________             |        |
    |Var2 Lbl|__________|            |        |
    |        |__________             |        |
    |Var3 Lbl|__________|            |        |
    |        |                       |        |
    |        +-----------------------+        |
    |                                         |
    +-----------------------------------------+

The advantages of statplot's ability to group on the axis become even more evident when using combinations of the over() and xpose options.


What is plotted

statistic() specifies the summary statistic used to summarize and plot varlist. The default is mean. See collapse for a full list of accepted statistics. Note that only one statistic may be specified.
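
As a brief illustration, the sketch below plots medians rather than means. It assumes the citytemp dataset used in the examples in this help file; median is one of the statistics accepted by collapse.

    . * medians instead of the default means
    . sysuse citytemp, clear
    . statplot heatdd cooldd, statistic(median)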

over() specifies an over() option (with any suboptions) of graph hbar, graph bar or graph dot, as appropriate, to control the grouping variables shown for varlist on the axis.

No more than two over() options may be specified.

If two over() options are used, their order determines how the groupings are nested. The second over() will be labeled on the axis and the first over() will be indicated in a legend. To suppress the legend and place the first over()'s variable labels closest to the axis, specify either legend(off) or the xpose option (see below).
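
A hedged sketch of the two-over() case, assuming the nlsw88 dataset used in the examples in this help file: the first call leaves the first over() in a legend, while the second suppresses that legend with legend(off) as described above.

    . * two over() options; race goes to the legend, union to the axis
    . sysuse nlsw88, clear
    . statplot wage, over(race) over(union)
    . * same graph with the legend suppressed
    . statplot wage, over(race) over(union) legend(off)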

The examples below indicate how separate may be used with a single variable to use one over() option rather than two.

missing indicates that observations for missing values in over() or by() variables should be included on the graph.

How it is presented

xpose indicates that the grouping labels for over() options should be switched or transposed.

If no over() options are specified, xpose has no effect on the graph.

If one over() option is used, xpose will switch the position of the varlist labels from being the outermost nesting of labels to being closest to the axis. Likewise, the over() labels will move off the axis.

If two over() options are used, xpose will switch the varlist variable labels to the legend, which can be suppressed using legend(off), move the first over() option labels closest to the axis, and move the labels of the second over() option specified off the axis, thereby nesting the first over() option labels.

recast() recasts the graph to another plottype, one of hbar, bar, dot.
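
For instance, assuming the citytemp dataset used in the examples in this help file, the same summary could be drawn as a dot chart or as vertical bars. This is a sketch of the option's use only.

    . sysuse citytemp, clear
    . * dot chart instead of the default horizontal bars
    . statplot heatdd cooldd, over(region) recast(dot)
    . * vertical bars
    . statplot heatdd cooldd, over(region) recast(bar)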

Note for users of Stata 10 and up: using the Graph Editor is another way to produce these and many other changes.

Note for experienced users: although the name is suggested by another recast() option, this is not a back door to recasting to a twoway plot.

varnames indicates that variable names should be used instead of variable labels for varlist.

varopts() specifies options for the display of the varlist bars or dots. For example, labels could be modified with varopts(label(labsize(medsmall))).
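
Put into a full command, the suboption quoted above might be used as follows. This is a sketch that assumes the citytemp dataset used in the examples in this help file.

    . sysuse citytemp, clear
    . * change the size of the varlist labels
    . statplot heatdd cooldd, over(region) varopts(label(labsize(medsmall)))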

graph_options refer to options of graph hbar, graph bar or graph dot as appropriate.


    . sysuse citytemp, clear
    . statplot heatdd cooldd
    . statplot heatdd cooldd, over(region)
    . statplot temp*, over(region, sort(1) descending) s(sd) blabel(bar, format(%2.1f)) ysc(r(. 17.5))

    . sysuse census, clear
    . statplot marriage divorce, over(region) s(sum)
    . statplot marriage divorce, over(region) s(sum) xpose
    . statplot marriage divorce, over(region) s(sum) xpose varnames

    . sysuse nlsw88, clear
    . statplot wage, over(race) over(union)
    . separate wage, by(race) veryshortlabel
    . statplot wage?, over(union)


Eric A. Booth, Texas A&M University ebooth@ppri.tamu.edu

Nicholas J. Cox, Durham University n.j.cox@durham.ac.uk

Also see

On-line: help for graph hbar; graph bar; graph dot