-------------------------------------------------------------------------------
help for tabcount
-------------------------------------------------------------------------------

Tabulate frequencies

tabcount varlist [if exp] [in range] [weight = exp]
        [, { v(value_list) | v1(value_list) } v2(value_list) v3(value_list) ...
        { c(condition_list) | c1(condition_list) } c2(condition_list) c3(condition_list) ...
        zero missing replace freq(newvarname) matrix(matrix_name) tabdisp_options ]

by ... : may be used with tabcount; see help by.

fweights and iweights may be specified.

Description

tabcount tabulates frequencies for up to 7 variables. Its main distinctive features are that zero frequencies of one or more specified values are always shown in the table (i.e. entirely empty rows, columns, etc. are not omitted) and that reduced datasets and/or matrices containing the frequencies may also be saved.

Options

v(), v1(), etc. (v options) and/or c(), c1(), etc. (c options) are required. For each variable one v option or one c option is required. Suppose you specify tabcount foreign rep78. foreign is the first variable, for which you must specify either v1() or c1(). rep78 is the second variable, for which you must specify either v2() or c2().

A v option specifies a list of values which define the categories of a variable that are to be counted. A numlist of numeric values will be expanded. Value labels will be used for display when they exist. String values containing spaces or quotes should be given in double quotes or compound double quotes. Thus v1(1/5) specifies that the integers 1 through 5 are the categories of the first variable to be tabulated.
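
As a minimal sketch of the v options, assuming the auto dataset shipped with Stata, the first tabcount command below gives numeric categories as a list of values and the second gives string categories, each containing a space and so given in double quotes. The particular values of make are chosen only for illustration.

. sysuse auto, clear
. tabcount rep78, v(1 2 3 4 5)
. tabcount make, v("AMC Concord" "Buick Opel" "VW Rabbit")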

A c option specifies a list of conditions which define the categories of a variable that are to be counted. Each condition is evaluated as true or false. Conditions whose first non-space character is >, <, ! or ~ are treated as specifying an inequality, so use correct Stata syntax: such a condition must begin with one of > >= < <= != or ~=. Any value given without one of these preceding characters is treated as if it were preceded by ==. So if your conditions are <=3 4 5 6 >7 the categories will be <=3, (equal to) 4, (equal to) 5, (equal to) 6 and >7. The text given is taken literally, so that "> 7" produces the same counts as >7 but is displayed as given. Note that the space must be protected by double quotes.

The categories need not be mutually exclusive or exhaustive. Thus c1(<=10 <=20 <=30 <=40) is fine. There is no special syntax for specifying closed intervals. See the examples for one commonly used device.
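
The rules above can be sketched as follows, assuming the auto dataset is in memory. The first command uses overlapping cumulative categories together with a quoted condition containing a space. The second combines a v option for the first variable with a c option for the second, which follows from the rule that each variable needs one v option or one c option. The cut points are arbitrary.

. tabcount mpg, c(<=15 <=20 <=25 "> 25")
. tabcount foreign mpg, v1(0 1) c2(<20 >=20)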

If there is just one variable tabulated, then v() is a synonym for v1() and c() is a synonym for c1().

zero specifies that zeros are to be shown as such in the table. The default is to blank them out. Irrespective of this option, zeros are always saved as such by the matrix() and replace options.
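
For illustration, assuming the auto dataset is in memory: rep78 takes only the values 1 to 5 there, so asking for 1/6 yields a category with zero frequency. The first command should leave that cell blank, while the second should display it as 0.

. tabcount rep78, v(1/6)
. tabcount rep78, v(1/6) zero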

missing specifies that observations with missing values of varlist are not to be excluded automatically.
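
A hedged sketch of why this matters, assuming the auto dataset is in memory, where rep78 is missing for a few cars. In Stata, numeric missing values compare as larger than any nonmissing number. If tabcount evaluates conditions in the usual way, which is an assumption here rather than something this help file states, the >3 category in the second command would also count cars with missing rep78.

. tabcount foreign rep78, v1(0 1) c2(<=3 >3)
. tabcount foreign rep78, v1(0 1) c2(<=3 >3) missing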

replace specifies that the dataset is to be replaced by a dataset showing combinations of categories and their frequencies. This option may not be specified with by:. The replacement dataset can serve as the basis for many other tables not produced by tabcount. See help functions and help egen for useful functions, in particular the function sum() for cumulative sums and the egen functions sum() for totals and pc() for percents and proportions. See also the last example below.

freq(newvarname) is for use with replace and specifies an alternative to the default variable name _freq used for frequencies.
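
A minimal sketch of replace with freq(), assuming the auto dataset. The variable name number is a hypothetical choice, and preserve and restore are used only so that the original data are not lost.

. sysuse auto, clear
. preserve
. tabcount foreign rep78, v1(0 1) v2(1/5) replace freq(number)
. list
. restore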

matrix(matrix_name) specifies a matrix name to hold frequencies. This option may not be specified if there are three or more variables in varlist or with by:.
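
As a short sketch, assuming the auto dataset is in memory, the frequencies for one variable can be saved in a matrix and then inspected with matrix list. The matrix name repfreq is a hypothetical choice.

. tabcount rep78, v(1/5) matrix(repfreq)
. matrix list repfreq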

tabdisp_options are options of tabdisp other than by(), cellvar(), missing and totals.

Examples

. tabcount rep78, v(1/5)

. bysort foreign: tabcount rep78, v(1/5)

. generate mpg2 = 5 * int(mpg/5)
. label var mpg2 "Mileage (mpg)"
. forval i = 10(5)40 {
.     label def mpg2 `i' "`i'-", modify
. }
. label val mpg2 mpg2
. tabcount rep78 mpg2, v1(1/5) v2(10(5)40)

. tabcount foreign rep78, v1(0 1) v2(1/5) matrix(counts)

(run this as a do file:)
. sysuse auto, clear
. preserve
. tabcount foreign rep78, v1(0/1) v2(1/5) replace
. egen pcfreq = pc(_freq), by(foreign)
. bysort foreign (rep78) : gen cupc = sum(pcfreq)
. noi di _n as txt "{title:Repair record and car type}" _c
. noi di "{txt:Percents and cumulative percents}" _c
. noi tabdisp foreign rep78, c(pcfreq cupc) format(%2.1f)
. restore

Author

Nicholas J. Cox, University of Durham, U.K. n.j.cox@durham.ac.uk

Acknowledgements

Kit Baum, Michael Blasnik and Shannon Driver made many useful comments on problems of this kind and how to tackle them. Hildegard Schaeper found a bug and pointed out a difficulty in the handling of weights. Rufus Browning pointed out two errors in the examples.

Also see

On-line: help for tabulate, table, tabdisp, contract