-------------------------------------------------------------------------------
help for catplot
-------------------------------------------------------------------------------

Plots of frequencies, fractions or percents of categorical data

catplot catvar1 [catvar2 [catvar3]] [weight] [if exp] [in range]
    [, { fraction | fraction(varlist) | percent | percent(varlist) }
    var1opts(over_options) var2opts(over_options) var3opts(over_options)
    recast(plottype) graph_options ]

Description

catplot shows frequencies (or optionally fractions or percents) of the categories of one, two or three categorical variables. The first named variable is innermost on the display; that is, its categories vary fastest. Often, but not necessarily, it will be the response or outcome of interest. By default catplot is a wrapper for graph hbar. Optionally catplot may be recast as a wrapper for graph bar or graph dot. The choice is a matter of personal taste, although in general horizontal displays make it easier to identify names or labels of categories.

fweights, aweights and iweights may be specified. This opens a door to use of catplot for plotting any set of values for each of several different categories.
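
As a rough sketch of that use, assuming Stata's auto data: reduce the data to one mean per category and pass the means as analytic weights, so that each bar shows a mean rather than a frequency. The choice of mpg, the dropping of missing values of rep78 and the ytitle() text are illustrative assumptions only.

. sysuse auto, clear
. drop if missing(rep78)
. collapse (mean) mpg, by(foreign rep78)
. catplot rep78 foreign [aw=mpg], ytitle("Mean mpg")

After collapse there is exactly one observation for each combination of categories, which is why the plotted total for each bar is just the value supplied as the weight.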

Remarks

This version of catplot (2.0.0 or up) is not compatible with previous versions.

The default display using graph hbar or graph bar is graphically conservative, reflecting the view that height or length of bars and text indicating categories are good ways of conveying information. If you wish also to have bars in different colours, specify the option asyvars, which differentiates the categories of the first named variable catvar1. If you wish also to stack bars of different colours, specify the further option stack.

The default display with graph dot is similarly conservative. If you wish to have point symbols in different colours, specify the option asyvars, which differentiates the categories of the first named variable catvar1. If you wish also to use different point symbols, use the further option marker().

Note some simple principles in this territory:

It is difficult to create a great graph, but easy to improve a bad one.

Comparisons must be easy. That could mean in one dimension, across a row or down a column, or it could mean using a table structure.

Ordering by magnitude may be even more useful than ordering by category.

Bars are better than pie slices as length is easier to judge than angle. Dots on a scale are a good way to include magnitudes.

Text is better read as horizontal than as vertical.

Showing numbers as text as well as by graphical elements can be helpful.

Lose the legend if you can. A great advantage of graph hbar|bar|dot is strong support for category labels, which can be nested too.

The sum of one value is just that value, so weights allow showing any values, not just frequencies or percents.

by() allows table structures to be shown with graph hbar|bar|dot.

by() can look like another over().

Options

fraction indicates that all frequencies should be shown as fractions (with sum 1) of the total frequency of all values being represented in the graph.

fraction(varlist) indicates that all frequencies should be shown as fractions (with sum 1) of the total frequency for each distinct category defined by the combinations of varlist. For example, given a variable sex with two categories male and female, the fractions shown for male would have sum 1 and those for female would have sum 1.

percent indicates that all frequencies should be shown as percents (with sum 100) of the total frequency of all values being represented in the graph.

percent(varlist) indicates that all frequencies should be shown as percents (with sum 100) of the total frequency for each distinct category defined by the combinations of varlist. For example, given a variable sex with two categories male and female, the percents shown for male would have sum 100 and those for female would have sum 100.

Only one of these fraction[()] and percent[()] options may be specified.
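
To see the difference between the two forms of percent, one possible comparison, assuming Stata's auto data, is to draw the same display twice. With percent, the bars over both categories of foreign together sum to 100; with percent(foreign), the bars within each category of foreign sum to 100. The choice of rep78 and foreign is illustrative only.

. sysuse auto, clear
. catplot rep78 foreign, percent
. catplot rep78 foreign, percent(foreign)

The same contrast applies to fraction and fraction(varlist), with sum 1 in place of sum 100.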

recast() recasts the graph to another plottype, one of hbar, bar, dot.

Note for users of Stata 10 up: using the Graph Editor is another way to produce these and many other changes.

Note for experienced users: although the name is suggested by another recast() option, this is not a back door to recasting to a twoway plot.

var1opts(), var2opts() and var3opts() contain calls to an over() option of graph bar, graph hbar or graph dot as appropriate controlling the display of elements for catvar1, catvar2 and catvar3 respectively. For example, var1opts(sort(1) descending) specifies that values of catvar1 should be sorted on frequency or percent and displayed decreasing downwards or from left to right.
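
As a small sketch, assuming Stata's auto data, different over() suboptions can be directed at different variables: here gap() closes up the bars for catvar1 while label() shrinks the category labels for catvar2. The particular suboptions and values are illustrative choices, not recommendations.

. sysuse auto, clear
. catplot rep78 foreign, var1opts(gap(*0.5)) var2opts(label(labsize(small)))

Whatever is placed inside var1opts(), var2opts() or var3opts() is handed on to the corresponding over() call, so the suboptions documented for over() under graph bar are the ones to consult.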

graph_options refers to options of graph bar, graph hbar or graph dot as appropriate. by() is one useful example. Note: any categorical axis title that appears by default is produced by l1title() with hbar or dot or by b1title() with bar or the (otherwise undocumented) vertical option.

Examples

. set scheme s1color

(Stata's auto data)

. sysuse auto, clear

. catplot rep78
. catplot rep78, blabel(bar, pos(base) size(4)) bar(1, bfcolor(none)) ysc(off)
. catplot rep78 foreign
. catplot rep78 foreign, nofill
. catplot rep78, by(foreign) percent(foreign)
. catplot rep78, by(foreign) percent(foreign) recast(bar)
. catplot rep78 foreign, percent(foreign) bar(1, bcolor(blue)) blabel(bar, position(outside) format(%3.1f)) ylabel(none) yscale(r(0,60))

. gen himpg = mpg > 25
. label def himpg 1 "mpg > 25" 0 "mpg <= 25"
. label val himpg himpg
. catplot himpg rep78 foreign
. catplot rep78 foreign, by(himpg, col(1) note("")) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e))
. catplot rep78 foreign, recast(dot) by(himpg, col(1) note("")) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e))
. catplot rep78 foreign, recast(bar) by(himpg, row(1) note("")) subtitle(, pos(6) ring(1) bcolor(none) nobexpand)

. catplot rep78, var1opts(sort(1))
. catplot rep78, var1opts(sort(1) descending)

(Titanic data)

. use titanic, clear
. collapse survived, by(age sex class)

. catplot age sex [aw=100*survived], by(class, compact note("") col(1)) bar(1, blcolor(gs8) bfcolor(gs14)) blabel(bar, format(%4.1f) pos(base)) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e)) ytitle(% survived from Titanic, place(e)) var1opts(gap(0)) var2opts(gap(*.2)) outergap(*.2) ysize(5) yla(0(25)100, glcolor(gs14) glw(*.5))

. catplot age sex [aw=100*survived], by(class, compact note("") col(1)) bar(1, blcolor(gs8) bfcolor(pink*.2)) blabel(bar, format(%4.1f) pos(base)) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e)) ytitle(% survived from Titanic) var1opts(gap(*0.1) axis(noline)) var2opts(gap(*.2)) ysize(5) yla(none) ysc(noline) plotregion(lcolor(none))

Author

Nicholas J. Cox, Durham University
n.j.cox@durham.ac.uk

Acknowledgments

The first version of catplot was written and revised in 2003 and 2004. At that time, Vince Wiggins provided very helpful comments, Fred Wolfe asked for sorting and David Schwappach provided feedback on limitations. During revision in 2010, Vince Wiggins and Ronán Conroy made encouraging noises.

Also see

On-line: help for graph hbar; graph bar; graph dot; histogram; tabplot (if installed)