-------------------------------------------------------------------------------
help for catplot
-------------------------------------------------------------------------------

Plots of frequencies, fractions or percents of categorical data 

        catplot catvar1 [catvar2 [catvar3]] [weight] [if exp] [in range] [ ,
                 {fraction|fraction(varlist)|percent|percent(varlist)}
                 var1opts(over_options) var2opts(over_options)
                 var3opts(over_options)
                 recast(plottype) graph_options ]


Description

    catplot shows frequencies (or optionally fractions or percents) of the
    categories of one, two or three categorical variables. The first named
    variable is innermost on the display; that is, its categories vary
    fastest. Often, but not necessarily, it will be the response or outcome
    of interest. By default catplot is a wrapper for graph hbar. Optionally
    catplot may be recast as a wrapper for graph bar or graph dot.  The
    choice is a matter of personal taste, although in general horizontal
    displays make it easier to identify names or labels of categories.

    fweights, aweights and iweights may be specified. Weights make it
    possible to use catplot to plot any set of values, one for each of
    several different categories.
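
    A minimal sketch of that use, assuming Stata's auto data: after collapse
    leaves one observation for each category, the weight for that
    observation is the value shown. The choice of mean mpg and the axis
    title here are illustrative only.

    . sysuse auto, clear
    . collapse (mean) mpg, by(foreign)
    . catplot foreign [aw=mpg], ytitle(Mean mpg)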


Remarks 

    This version of catplot (2.0.0 or up) is not compatible with previous
    versions.

    The default display using graph hbar or graph bar is graphically
    conservative, reflecting the view that height or length of bars and text
    indicating categories are good ways of conveying information.  If you
    wish also to have bars in different colours, specify the option asyvars,
    which differentiates the categories of the first named variable catvar1.
    If you wish also to stack bars of different colours, specify the further
    option stack.
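
    For instance, with Stata's auto data (a sketch; the combination with
    percent() is one choice among many):

    . sysuse auto, clear
    . catplot rep78 foreign, asyvars
    . catplot rep78 foreign, asyvars stack percent(foreign)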

    The default display with graph dot is similarly conservative.  If you
    wish to have point symbols in different colours, specify the option
    asyvars, which differentiates the categories of the first named variable
    catvar1. If you wish also to use different point symbols, use the further
    option marker().
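
    A sketch with Stata's auto data; the particular symbols are arbitrary
    choices:

    . sysuse auto, clear
    . catplot foreign rep78, recast(dot) asyvars
    . catplot foreign rep78, recast(dot) asyvars marker(1, ms(Oh))
        marker(2, ms(T))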

    Note some simple principles in this territory:

        It is difficult to create a great graph, but easy to improve a bad
        one.

        Comparisons must be easy. That could mean in one dimension, across a
        row or down a column, or it could mean using a table structure.

        Ordering by magnitude may be even more useful than ordering by
        category.

        Bars are better than pie slices as length is easier to judge than
        angle. Dots on a scale are a good way to include magnitudes.

        Text is better read as horizontal than as vertical.

        Showing numbers as text as well as by graphical elements can be
        helpful.

        Lose the legend if you can. A great advantage of graph hbar | bar |
        dot is strong support for category labels, which can be nested too.

        The sum of one value is just that value, so with one observation for
        each category weights allow showing any values, not just frequencies
        or percents.

        by() allows table structures to be shown with graph hbar | bar | dot.

        by() can look like another over().
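
    Some of these principles can be tried out directly. As a sketch with
    Stata's auto data, the first command orders categories by magnitude and
    shows frequencies as text; the second uses by() for a table structure.

    . sysuse auto, clear
    . catplot rep78, var1opts(sort(1) descending) blabel(bar)
    . catplot rep78, by(foreign, note("")) blabel(bar)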


Options 

    fraction indicates that all frequencies should be shown as fractions
        (with sum 1) of the total frequency of all values being represented
        in the graph.

    fraction(varlist) indicates that all frequencies should be shown as
        fractions (with sum 1) of the total frequency for each distinct
        category defined by the combinations of varlist. For example, given a
        variable sex with two categories male and female, the fractions shown
        for male would have sum 1 and those for female would have sum 1.

    percent indicates that all frequencies should be shown as percents (with
        sum 100) of the total frequency of all values being represented in
        the graph.

    percent(varlist) indicates that all frequencies should be shown as
        percents (with sum 100) of the total frequency for each distinct
        category defined by the combinations of varlist.  For example, given
        a variable sex with two categories male and female, the percents
        shown for male would have sum 100 and those for female would have sum
        100.

    Only one of these fraction[()] and percent[()] options may be specified.
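
    One way to check what percent(varlist) shows is to compare bar labels
    with column percents from tabulate, which should agree. This is a sketch
    using Stata's auto data; the label format is an arbitrary choice.

    . sysuse auto, clear
    . tabulate rep78 foreign, column nofreq
    . catplot rep78 foreign, percent(foreign) blabel(bar, format(%3.1f))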

    recast() recasts the graph to another plottype, one of hbar, bar or dot.

        Note for users of Stata 10 up: using the Graph Editor is another way
        to produce these and many other changes.

        Note for experienced users: although the name is suggested by another
        recast() option, this is not a back door to recasting to a twoway
        plot.

    var1opts(), var2opts() and var3opts() contain calls to an over() option
        of graph bar, graph hbar or graph dot as appropriate controlling the
        display of elements for catvar1, catvar2 and catvar3 respectively.
        For example, var1opts(sort(1) descending) specifies that values of
        catvar1 should be sorted on frequency, fraction or percent and
        displayed decreasing downwards or from left to right.
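
        As a further sketch with Stata's auto data (the new labels are purely
        illustrative), gap() and relabel() are other over() suboptions that
        may be passed in this way:

        . sysuse auto, clear
        . catplot rep78 foreign, var1opts(gap(0))
            var2opts(relabel(1 "US" 2 "Non-US"))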

    graph_options refers to options of graph bar, graph hbar or graph dot as
        appropriate.  by() is one useful example. Note: any categorical axis
        title that appears by default is produced by l1title() with hbar or
        dot or by b1title() with bar or the (otherwise undocumented) vertical
        option.
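
        For example, as a sketch with Stata's auto data, and assuming that
        options you supply take precedence over the defaults, the default
        title could be suppressed or replaced:

        . sysuse auto, clear
        . catplot rep78, l1title("")
        . catplot rep78, recast(bar) b1title("Repair record (1978)")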


Examples

    . set scheme s1color

    (Stata's auto data)
    . sysuse auto, clear

    . catplot rep78
    . catplot rep78, blabel(bar, pos(base) size(4)) bar(1, bfcolor(none))
        ysc(off)
    . catplot rep78 foreign
    . catplot rep78 foreign, nofill
    . catplot rep78, by(foreign) percent(foreign)
    . catplot rep78, by(foreign) percent(foreign) recast(bar)
    . catplot rep78 foreign, percent(foreign) bar(1, bcolor(blue))
        blabel(bar, position(outside) format(%3.1f)) ylabel(none)
        yscale(r(0,60))

    . gen himpg = mpg > 25
    . label def himpg 1 "mpg > 25" 0 "mpg <= 25"
    . label val himpg himpg
    . catplot himpg rep78 foreign
    . catplot rep78 foreign, by(himpg, col(1) note("")) subtitle(, pos(9)
        ring(1) bcolor(none) nobexpand place(e))
    . catplot rep78 foreign, recast(dot) by(himpg, col(1) note(""))
        subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e))
    . catplot rep78 foreign, recast(bar) by(himpg, row(1) note(""))
        subtitle(, pos(6) ring(1) bcolor(none) nobexpand)

    . catplot rep78, var1opts(sort(1))
    . catplot rep78, var1opts(sort(1) descending)

    (Titanic data)
    . use titanic, clear
    . collapse survived, by(age sex class)

    . catplot age sex [aw=100*survived], by(class, compact note("") col(1))
        bar(1, blcolor(gs8) bfcolor(gs14)) blabel(bar, format(%4.1f)
        pos(base)) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e))
        ytitle(% survived from Titanic, place(e)) var1opts(gap(0))
        var2opts(gap(*.2)) outergap(*.2) ysize(5) yla(0(25)100, glcolor(gs14)
        glw(*.5))

    . catplot age sex [aw=100*survived], by(class, compact note("") col(1) )
        bar(1, blcolor(gs8) bfcolor(pink*.2)) blabel(bar, format(%4.1f)
        pos(base)) subtitle(, pos(9) ring(1) bcolor(none) nobexpand place(e))
        ytitle(% survived from Titanic) var1opts(gap(*0.1) axis(noline))
        var2opts(gap(*.2)) ysize(5) yla(none) ysc(noline)
        plotregion(lcolor(none))


Author

    Nicholas J. Cox, Durham University
    n.j.cox@durham.ac.uk

         
Acknowledgments 

    The first version of catplot was written and revised in 2003 and 2004.
    At that time, Vince Wiggins provided very helpful comments, Fred Wolfe
    asked for sorting and David Schwappach provided feedback on limitations.
    During revision in 2010, Vince Wiggins and Ronán Conroy made encouraging
    noises.


Also see

    On-line:  help for graph hbar; graph bar; graph dot; histogram; tabplot
        (if installed)