-------------------------------------------------------------------------------
help for tabout
-------------------------------------------------------------------------------

Title

tabout -- Building publication quality tables for export to a text file

Table of contents

Syntax Description Options Examples

Syntax

tabout [ varlist ] [ if exp ] [ in range ] [ weight = exp ] using filename [ , options ]

    options                    alternatives
    ----------------------------------------------------------------------
    core options
      replace
      append
      cells(contents)          see below
      format(string)
      clab(string)
      layout(layouts)          col row cblock rblock
      oneway
      sum
      stats(statstypes)        chi2 gamma V taub lrchi2

    sample size (n) options
      npos(positions)          col row both lab tufte
      nlab(string)
      nwt(string)
      nnoc
      noffset(string)

    survey options
      svy
      sebnone
      cibnone
      cisep(string)
      ci2col
      percent
      level(#)
      pop

    total options
      total(string)
      ptotal(totaltype)        none single all
      h1(string)
      h2(string)
      h3(string)

    style options
      style(styles)            tab tex htm csv semi
      lines(linetypes)         single double none
      font(fontstyles)         bold italic
      bt
      rotate(#)
      cl1(#-#)
      cl2(#-#)
      cltr1(string)
      cltr2(string)

    additional output options
      body
      topf(string)
      botf(string)
      topstr(string)
      botstr(string)
      psymbol(string)
      delim(string)

    miscellaneous options
      dpcomma
      money(string)
      mi
      sort
      chkwtnone
      debug
      noborder
      show(showtypes)          output none all
      wide(#)

where:

varlist is a list of vertical (row) variables, followed by the horizontal (column) variable last. If the oneway option is specified, then all the variables are regarded as vertical.

contents is the list of cell entries. For basic tables, any of the following are permitted: freq cell row col cum. The default is freq.

For summary tables any of the following are permitted:

N count mean median var sd skewness kurtosis uwsum sum min max p1 p5 p10 p25 p50 p75 p90 p95 p99 iqr r9010 r9050 r7525 r1050

Note that you may enter the median as either p50 or median and you may enter N as either N or count.

When the svy option is used, you can also specify any of the following: se ci lb ub

fweights aweights iweights and pweights are allowed with tabout, depending on the underlying command; see Manual: [U] 14.1.6 weight and individual entries for [R] tabulate and [R] summarize. For tables of summary statistics, iweights are not allowed, because tabout uses the detail option in Stata's summarize command (which does not allow iweights). The svy option requires that the data be already svyset and an error message reminds you of this if you forget.

Note that tabout will work under Stata 9.2 onward. An older version of tabout (which works with Stata 8.2) called tabout8 is available here: http://www.ianwatson.com.au/stata/tabout8.ado.

Description

tabout is a table building program for oneway and twoway tables of frequencies and percentages, and for summary tables. It produces publication quality tables for export to a text file. These tables can then be used with spreadsheets, word processors, web browsers or compilers like LaTeX. The tables produced by tabout allow multiple panels so that a number of variables can be included in the one table. tabout also provides standard errors and confidence intervals, as well as a range of table statistics (chi2 etc). The output from tabout matches Stata's tabulate, most of tabstat and some of table.

tabout has a comprehensive tutorial which includes numerous examples. This is available from the SSC with this help file. The tutorial is also available here: http://www.ianwatson.com.au/stata/tabout_tutorial.pdf.

Options

Contents

core options
sample size (n) options
survey options
total options
style options
additional output options
miscellaneous options

    +--------------+
----+ core options +-----------------------------------------------------

using is required, and indicates the filename for the output. Some applications (particularly MS Excel) "lock" files when they're open. tabout cannot write to these files and consequently issues an error message, warning you to check if the file is already open in another application.

replace and append are file options, and determine whether the current output will overwrite an existing file, or be appended to the end of that file. If you omit append or replace, tabout issues a warning if the file already exists.

cells determines the contents of table cells. As the table below shows, you can enter any one or more of freq cell row col cum in a basic table. They can be in any order. When you choose the svy option, you can only have one of these choices, and it must come first. The additional choices which are then available are: se ci lb ub.

For summary tables, you can have any of the contents listed earlier. If you are creating a twoway table, only one summary statistic may go in a cell (eg. median wage); if it's a oneway table, any number of statistics (followed by a variable name) may go in the cell (eg. median wage mean age iqr weight). When you choose the svy option with summary tables, only mean is allowed (eg. mean wage se ci.)

+------------------------------------------------------------------------------------+
| Type of table         | Allowable cell contents               | Available layout   |
|                       | cells( )                              | layout( )          |
|-----------------------+---------------------------------------+--------------------|
| basic                 | freq cell row col cum                 | col row cb rb      |
|                       | any number of above, in any order     |                    |
|                       | for example: cells(freq col)          |                    |
|-----------------------+---------------------------------------+--------------------|
| basic with SE or CI   | freq cell row col se ci lb ub         | col row cb rb      |
|                       | only one of: freq cell row col        |                    |
| (turn on svy option)  | (must come first in the cell)         |                    |
|                       | and any number of: se ci lb ub        |                    |
|                       | for example: cells(col se lb ub)      |                    |
|-----------------------+---------------------------------------+--------------------|
| summary               | any number of: N mean var sd skewness | no options (fixed) |
| -as a oneway table    | kurtosis sum uwsum min max count      |                    |
|                       | median iqr r9010 r9050 r7525 r1050    |                    |
| (turn on sum option;  | p1 p5 p10 p25 p50 p75 p90 p95 p99     |                    |
| also may need to turn | with each followed by variable name   |                    |
| on oneway option)     | for example: cells(min wage mean age) |                    |
|-----------------------+---------------------------------------+--------------------|
| summary               | only one of: N mean var sd skewness   | no options (fixed) |
| -as a twoway table    | kurtosis sum uwsum min max count      |                    |
|                       | median iqr r9010 r9050 r7525 r1050    |                    |
| (turn on sum option)  | p1 p5 p10 p25 p50 p75 p90 p95 p99     |                    |
|                       | followed by one variable name         |                    |
|                       | for example: cells(sum income)        |                    |
|-----------------------+---------------------------------------+--------------------|
| summary with SE or CI | mean followed by one variable name    | col row cb rb      |
| (turn on sum option   | and any number of: se ci lb ub        |                    |
| and svy option)       | for example: cells(mean weight se ci) |                    |
+------------------------------------------------------------------------------------+

format indicates the number of decimal places. Unlike mainstream Stata, this option only requires a number. Do not enter % or f symbols. You can, however, enter c for comma, p for percentage, and m for money (currency), and you can use the money option (see below) to specify the currency. For example, you might enter f(0c 1p 1p 2) to produce: 1,291 9.2% 10.3% 23.93. The entries should be in the same order as the cells entries; that is, if freq comes first, then 0c should come first if you want 0 decimal places (with commas) as the format for frequencies. You do not have to type in the same number of format entries as there are cell entries. If you include more, tabout ignores them; if you include fewer, the last format entry is repeated for the remaining cell entries.
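As a minimal sketch of the repetition rule just described, the following assumes tabout is installed and uses Stata's built-in nlsw88 dataset; the output filename format_demo.txt is hypothetical. Only two format entries are given for three cell entries, so the last one (1p) should be reused for the col entry.

```stata
* Sketch only: format_demo.txt is a hypothetical output file.
sysuse nlsw88, clear

* Three cell entries but two format entries:
*   freq -> 0c (no decimals, commas), row -> 1p, col -> 1p (repeated).
tabout race collgrad using format_demo.txt, replace ///
    cells(freq row col) format(0c 1p)
```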

clab determines the column headings for the third row of the table, that is, the headings just above the data. By default, tabout places the horizontal variable's name in the first row, its value labels in the second row, and an abbreviation for the cell contents (eg. No. Row % etc) in the third row. You can over-ride all of these defaults using the h1 h2 and h3 options (see below). Most of the time, however, it will only be the third row which you need to change, so the clab option makes this easy for you. Just enter the column titles as you want them to display, without quote marks or other symbols. However, you must include underscores between words if there are spaces in the column title, for example clab(No. Row_% Col_%). You do not have to type in the same number of clab entries as there are cell entries. If you include more, tabout ignores them; if you include fewer, the last clab entry is repeated for the remaining cell entries. For example, if your cell entry was freq col row cum you could just enter clab(No. %) and all but the first column of data would have % symbols at the top.
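The underscore convention for clab can be sketched as follows. This assumes tabout is installed and uses the built-in nlsw88 dataset; clab_demo.txt is a hypothetical filename.

```stata
* Sketch only: clab_demo.txt is a hypothetical output file.
sysuse nlsw88, clear

* One clab entry per cell entry; underscores stand in for spaces.
tabout race collgrad using clab_demo.txt, replace ///
    cells(freq row col) format(0c 1) clab(No. Row_% Col_%)
```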

layout determines how the columns will be laid out. They can be in alternating columns (No. % No. % No. %) and alternating rows (No. on the first row, % on the next two, then back to No. and so on). They can be in column blocks, or in row blocks, where the data is kept contiguous, for example: No. No. No. % % %. The exception to this is summary tables where the layout is fixed and you have no choice. (However, an exception to this is the svy option, which can be laid out using all of these options. See the earlier table for clarification.)
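To compare the two block layouts, a sketch along these lines writes the same table twice. It assumes tabout is installed and uses the built-in nlsw88 dataset; both filenames are hypothetical. The abbreviations cb and rb are the ones used elsewhere in this help file.

```stata
* Sketch only: both output filenames are hypothetical.
sysuse nlsw88, clear

* Column blocks: the frequency columns together, then the percentage columns.
tabout race collgrad using layout_cb.txt, replace ///
    cells(freq col) format(0c 1) layout(cb)

* Row blocks: the frequency rows together, then the percentage rows.
tabout race collgrad using layout_rb.txt, replace ///
    cells(freq col) format(0c 1) layout(rb)
```

Opening the two files side by side in a spreadsheet is probably the quickest way to see which arrangement suits a given table.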

oneway tells tabout that the list of variables are all vertical. Normally, tabout assumes that the last variable in the list is the horizontal variable, to be used in a twoway cross-tabulation. To override this default behaviour, specify oneway.

sum tells tabout that the table is to be a summary table. Normally, tabout assumes that the table will be a basic table and checks to see if the cells contents have the correct entries ( freq row col etc). By telling tabout that the table is a summary table, this checking process includes checks for the various summary statistics and the variables in the data set. The sum option is essential if you wish to produce a summary table.
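The difference between oneway and twoway summary tables can be sketched as below, assuming tabout is installed and using the built-in nlsw88 dataset (filenames are hypothetical). In the oneway case each statistic is followed by its variable name; in the twoway case only one statistic is given.

```stata
* Sketch only: output filenames are hypothetical.
sysuse nlsw88, clear

* Oneway summary table: several statistics, both variables treated as vertical.
tabout race collgrad using sum_oneway.txt, replace ///
    cells(mean wage median wage sd hours) format(2) sum oneway

* Twoway summary table: one statistic, collgrad is the horizontal variable.
tabout race collgrad using sum_twoway.txt, replace ///
    cells(mean wage) format(2) sum
```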

stats allows you to include additional information based on the various statistics available in tabulate. Note that, unlike tabulate, tabout requires that you enter the full term (and not an abbreviation) and will only allow one statistic in a table. You must enter chi2, not just chi.
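A minimal sketch of requesting a single table statistic, assuming tabout is installed and using the built-in nlsw88 dataset (stats_demo.txt is a hypothetical filename). Note the full term chi2 rather than an abbreviation.

```stata
* Sketch only: stats_demo.txt is a hypothetical output file.
sysuse nlsw88, clear

* Exactly one statistic, spelled out in full.
tabout race collgrad using stats_demo.txt, replace ///
    cells(freq col) format(0c 1) stats(chi2)
```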

    +-------------------------+
----+ sample size (n) options +------------------------------------------

npos determines where the n information will be placed. The various n options (npos nlab nwt nnoc) provide sample counts for the table. You need only enter one of these options for the n to be included. For the options you have not entered, tabout makes use of the default values.

nlab determines the label for the n counts. The default for col and row positions is a simple uppercase N; for the lab position it is (n=#) where # stands for number; and for the tufte position it is (#%). You can change all of these except the tufte position (which is fixed), and if you wish to alter the lab position, use the # symbol to indicate where the number should go. For example: npos(lab) nlab(Sample count=#). The npos(tufte) option provides a convenient way of displaying a percentage breakdown, rather than a count, for the main vertical variables. The name comes from the approach adopted by Edward Tufte in his construction of a supertable, which he designed for the New York Times in 1980.

nwt indicates that the n count be weighted by this variable. This can be useful for producing population estimates in a table, rather than just sample counts. Note that tabout always uses Stata's iweight option for this weighting.

nnoc stands for n-no-comma and turns off the comma in the n count. Because tabout does not provide a format option for n counts (decimal points don't really make sense here), the default behaviour is to include commas. The nnoc option over-rides this default behaviour.

noffset stands for n offset and determines where the n counts should be placed. The default is 1, which means the n counts will be in the first data column and/or the first data row in a table. Setting noff(2), for example, allows you to shift the n counts further along (or down) in the table, into either the second data column or the second data row. If you are using block layouts (layout(cb) or layout(rb)), the noffset option applies to blocks rather than individual columns or rows.
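The n options can be combined as in the following sketch, which assumes tabout is installed and uses the built-in nlsw88 dataset; the filenames are hypothetical. The first command puts the count in the variable label using the # placeholder; the second adds an n column and turns off the thousands comma.

```stata
* Sketch only: output filenames are hypothetical.
sysuse nlsw88, clear

* Count placed next to each vertical variable's label.
tabout race south collgrad using n_lab.txt, replace ///
    cells(col) format(1) npos(lab) nlab(Sample count=#)

* Count placed in its own column, without commas.
tabout race south collgrad using n_col.txt, replace ///
    cells(col) format(1) npos(col) nnoc
```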

    +----------------+
----+ survey options +---------------------------------------------------

svy tells tabout that the cell contents include survey output, and so the checking procedure (mentioned earlier) looks for things like se, ci and so forth. You must turn on svy if you wish to include survey output in your table.

sebnone stands for se-brackets-none and tells tabout to suppress the parentheses which normally surround the standard errors.

cibnone stands for ci-brackets-none and tells tabout to suppress the square brackets which normally surround the confidence intervals.

cisep stands for ci-separator and tells tabout to replace the default (which is a comma) by whatever the user enters (for example, a dash).

ci2col stands for ci-in-two-columns and tells tabout to place the lb and ub estimates in two columns (as it normally does), and to place a [ and a , in the first column, and a ] in the second column. This can be useful for layout in a word processor, because the first column can be right aligned (to the comma) and the second column can be left aligned, and it appears that you have a single column for your ci, which is neatly aligned according to the commas. Note that if you select ci in the cells, tabout normally places both the lower bound and the upper bound in a single cell and includes brackets and separator. The ci2col option does not apply in this case. For it to work, you need to specify the upper and lower bound options, for example: cell(freq lb ub) ci2col.

percent tells tabout that the svy output should be shown as percentages, not proportions. This follows the default behaviour of svy:tab.

level specifies the level for the svy estimates. The default is 95%.

pop specifies that a weighted population estimate should be provided for the n in the table, rather than the sample size. This option makes use of the weight specified by the nwt option. This makes the svy option work the same as the nwt option with non-survey tables.
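The survey options can be sketched together as follows. This assumes tabout is installed and uses the built-in nlsw88 dataset, which has no real survey design; the svyset _n line (each observation as its own sampling unit, no weights) is purely an assumption made so that the svy option has something to work with. Filenames are hypothetical.

```stata
* Sketch only: the survey design and the filenames are assumptions.
sysuse nlsw88, clear

* tabout requires the data to be svyset before the svy option is used.
svyset _n

* One basic entry first (col), then se and ci; 90% level, dash as separator.
tabout race collgrad using svy_ci.txt, replace ///
    cells(col se ci) format(1 1) svy percent level(90) cisep(-)

* Lower and upper bounds in separate cells, bracketed across two columns.
tabout race collgrad using svy_ci2col.txt, replace ///
    cells(col lb ub) format(1 1) svy percent ci2col
```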

    +---------------+
----+ total options +----------------------------------------------------

total tells tabout what labels to use for totals. The vertical total comes first, the horizontal second. The default labels for these variables are Total. If there are spaces in either of the labels which you wish to enter, use underscores. For example, total(All_persons Total).

ptotal tells tabout how to treat the totals for each panel, when you have multiple panels in a table. The default behaviour is to show all totals, but this can sometimes be repetitive, so you can specify ptotal(single) to have a single total row shown at the bottom of the table. You can also turn off all totals with ptotal(none).
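A sketch of the total options in a table with several panels, assuming tabout is installed and using the built-in nlsw88 dataset (total_demo.txt is a hypothetical filename). The vertical total label comes first, with an underscore closing up the space.

```stata
* Sketch only: total_demo.txt is a hypothetical output file.
sysuse nlsw88, clear

* Three vertical panels (race, south, smsa) against collgrad,
* with a single total row at the bottom rather than one per panel.
tabout race south smsa collgrad using total_demo.txt, replace ///
    cells(col) format(1) total(All_persons Total) ptotal(single)
```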

h1 through to h3 over-ride the default headings for a table. If you choose to use these, there are a couple of requirements. If you have selected either tex or htm as your output style, you are responsible for all the various code needed. tabout does not make any adjustments to what you enter, it just outputs it as it finds it. If you have chosen tab or csv as your output style, you must enter a delimiter to indicate where the columns are in your heading. Unlike the usual tabout practice, you do not need to worry about spaces in your titles (no need for underscores!) because this column delimiter takes care of things. However, the number of delimiters must match the number of columns in the table or the headings may be out of alignment. You might enter: h2( | Very good | Good | Bad | Very bad | Total | N) and the first column heading would be empty, and the remaining columns would have the appropriate labels. Note that the npos(col) option usually places the nlab on the h2 line so you may need to include this yourself in your h2 label, as in the example just given. To suppress the display of any of these headings, enter `nil' into the appropriate option (for example, h3(nil)).
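The delimiter rule for headings can be sketched as below, assuming tabout is installed and using the built-in nlsw88 dataset (heading_demo.txt is a hypothetical filename, and the heading text is invented for illustration). With cells(col) the table is assumed to have four columns (row labels, two collgrad categories, and the total), so the h2 heading carries three delimiters.

```stata
* Sketch only: filename and heading text are hypothetical.
sysuse nlsw88, clear

* First heading cell left empty; spaces are allowed, no underscores needed.
* h3(nil) suppresses the third heading row.
tabout race collgrad using heading_demo.txt, replace ///
    cells(col) format(1) ///
    h2( | No degree | Degree | All) h3(nil)
```

If the headings come out misaligned, the number of delimiters is the first thing to check against the number of columns in the exported file.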

    +---------------+
----+ style options +----------------------------------------------------

style The default is style(tab), which is useful for importing into spreadsheets or word processors. Note that the first row always has the correct number of tabs, even when a single title is involved. This helps other applications parse the table correctly. Note also that the repetition of labels in headings can be easily dealt with by using a merge cells command in your spreadsheet or word processor. The style(csv) option is useful for importing into spreadsheets (like MS Excel) because it opens immediately as a spreadsheet. The style(semi) uses semi-colons as the delimiter. Note, however, that some spreadsheets ignore trailing 0s, so this may muck up your neat formatting. To avoid this, export the table from tabout as style(tab) and use the wizard in your spreadsheet to indicate that all columns are text rather than general.

lines indicates how much space (for style(tab) and style(csv)) or how many lines (for style(tex)) should separate tables between panels. The default is single.

font only applies to style(tex) and style(htm) and provides bold and italic fonts for the vertical variable names and the horizontal variable names and value labels. The totals are also given this font. You can also use the h1 to h3 options to manually set up fonts for your titles.

bt only applies to users of LaTeX, and requires that you have the booktabs package installed. This allows the use of the toprule, midrule and bottomrule commands, rather than the usual hline command. It produces more pleasing output.

rotate only applies to users of LaTeX, and can be used to rotate the horizontal variable's labels through whatever angle is entered in this option. For example, rotate(60) produces quite a pleasing effect.

cl1 and cl2 only apply to users of LaTeX, and also require that you use the booktabs package in your LaTeX document. These options can be used to place column lines (hence cl) between the first and second heading rows, and between the second and third heading rows (hence two sets). You enter the column numbers which you wish to span, separated with a dash. For example, to place a line under the horizontal variable's name, you might enter: cl1(2-6) in a table with six columns. If you are entering lines spanning blocks of columns (2-4 5-7), you might need to fine tune the gap between them using cltr1 and cltr2. By default, whenever you specify either of the cl options, tabout places a small gap (0.75em) between adjacent lines.

cltr1 and cltr2 stand for column-line-trim, and allow you to specify an amount of trim to be applied to the left side of the cl1 or cl2 lines which you have entered. You can specify the amount in whatever acceptable tex measurement you like. For example: cl2(2-3 4-5 6-7) cltr2(1.5em). As just noted, the default amount is 0.75em.
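The LaTeX style options might be combined as in the sketch below. It assumes tabout is installed, uses the built-in nlsw88 dataset, and writes to a hypothetical file latex_demo.tex intended to be included in a LaTeX document that loads the booktabs package. With cells(freq col) the table is assumed to have seven columns (row labels plus two columns for each of the two collgrad categories and the total), which is what the column spans rely on.

```stata
* Sketch only: filename and column spans are assumptions for this table.
sysuse nlsw88, clear

* cl1 spans all data columns under the first heading row;
* cl2 spans each pair of columns under the second heading row,
* with the left trim widened from the default 0.75em to 1.5em.
tabout race collgrad using latex_demo.tex, replace ///
    cells(freq col) format(0c 1) style(tex) bt font(bold) ///
    cl1(2-7) cl2(2-3 4-5 6-7) cltr2(1.5em)
```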

    +---------------------------+
----+ additional output options +----------------------------------------

body is used to insert some basic html or LaTeX code above and below the table. This allows you to view the table without further coding.

topf and botf allow you to insert code stored in files which tabout can insert above and below the tables. These are particularly useful for html and LaTeX users, and allow you to control the layout of the tables more precisely. All users will find them useful as a way of inserting additional information above and below the table, such as notes, populations, data sources (for the bottom of the table) and titles (for the top of the table).

topstr and botstr contain text which you can pass to the topf and botf files. This text will be inserted into the files wherever the placeholder (default #) has been placed. Note that each placeholder must be on a separate line in these files. The strings designated in the topstr and botstr must be separated with the pipe delimiter (or other user-chosen delimiter) if there is more than one block of text being passed.

psymbol stands for placeholder-symbol and can be any symbol the user chooses. The default is # and it provides a placeholder in the stored files (the topf and botf) which tabout places above and below the tables.

delimit can be any symbol the user chooses. The default is the pipe delimiter (|) as shown in the earlier example. It is used to specify columns within the h1 to h3 options, and for separating the contents of the topstr and botstr options. Note, unlike earlier versions of tabout, the delimit symbol is no longer used for labels. Instead, underscores are used to close-up spaces and parsing is done on the remaining spaces.
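The interaction between topf, topstr, the placeholder and the delimiter can be sketched as follows. This assumes tabout is installed and uses the built-in nlsw88 dataset; the template file top_demo.txt, its contents, and the title and source text are all hypothetical. The template is written with Stata's file command so the example is self-contained.

```stata
* Sketch only: filenames and all text strings are hypothetical.
sysuse nlsw88, clear

* Write a two-line template; each line holds one # placeholder.
tempname fh
file open `fh' using top_demo.txt, write text replace
file write `fh' "Title: #" _n
file write `fh' "Source: #" _n
file close `fh'

* Two blocks of text, separated by the default pipe delimiter,
* fill the two placeholders in order.
tabout race collgrad using topf_demo.txt, replace ///
    cells(freq col) format(0c 1) ///
    topf(top_demo.txt) topstr(College graduation by race|NLSW 1988 extract)
```

The same pattern applies to botf and botstr for notes and data sources below the table.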

    +-----------------------+
----+ miscellaneous options +--------------------------------------------

dpcomma specifies that tabout should use commas for decimal points and periods (full-stops) for thousand separators. This style is common in many European countries. This option affects the presentation of both the tabular output and the statistics when these are requested (such as chi2).

money indicates the currency to be used if you have chosen the money format. For example, format(2m) money(\pounds). You can enter any symbol that your keyboard allows. For LaTeX users, you can enter any text which LaTeX accepts, though you may need to include quotes.
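A sketch of dpcomma and money, assuming tabout is installed and using the built-in nlsw88 dataset; filenames are hypothetical. The money example follows the format(2m) money(\pounds) pattern above and therefore writes LaTeX output; treating the wage variable as pounds is purely for illustration.

```stata
* Sketch only: filenames are hypothetical.
sysuse nlsw88, clear

* European-style numbers: comma as decimal point, period as thousands separator.
tabout race collgrad using dpcomma_demo.txt, replace ///
    cells(freq col) format(0c 1) dpcomma

* Mean wage formatted as money, with a LaTeX currency symbol.
tabout occupation collgrad using money_demo.tex, replace ///
    cells(mean wage) format(2m) money(\pounds) sum style(tex)
```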

mi specifies that tabout should display missing values. This works the same as the mi option in Stata's tabulate command.

sort specifies that tabout should display values in descending order of frequency. This works the same as the sort option in Stata's tabulate oneway command. Note that you cannot use this option with twoway tables.

chkwtnone prevents tabout from checking the legality of your weights. Stata commands will not allow you to use non-integer frequency weights and tabout normally checks for this. You can over-ride this behaviour with the chkwtnone option. Note that this option does not stop Stata itself from refusing to use non-integer frequency weights.

debug shows you most of the underlying Stata commands (though not for summary tables) from which the tables are built. This can be useful for confirming your results.

noborder only applies to html output, and determines whether the table and cells should be surrounded by borders. This only applies when the body option is turned on.

show determines what will be seen on the screen. The show(all) option displays the final table output as well as the Mata string matrices which are used to build this final output. The contents of these matrices may not exactly match the final output, in terms of formatting and labelling. The show(none) option suppresses all output except for the name of the file to which the table has been exported. The default option is to show the output which has been sent to a file. It may look messy on the screen, but open it in the appropriate application to check it first before panicking.

wide is used in conjunction with show(all) and specifies the width of the columns in the Mata matrices. The default is 10 spaces. Note that even if you reduce this to a very small number, tabout will always increase the width of the columns to accommodate the widest cell entry in the data.

Examples

The best examples are to be found in the tutorial (http://www.ianwatson.com.au/stata/tabout_tutorial.pdf), where both the syntax and the final table can be viewed side by side. The following examples illustrate each of the types of table outlined earlier. The datasets used in the following examples are all built-in Stata datasets, namely, nlsw88.dta, voter.dta and auto.dta.

    +--------------+
----+ basic tables +-----------------------------------------------------

. sysuse nlsw88.dta, clear

. tabout south race smsa coll [iw=wt] ///
      using table2.txt, ///
      c(freq row col) f(0c 1p 1p) clab(_ _ _) ///
      layout(rb) h3(nil)

. tabout race south smsa coll [iw=wt] ///
      using table3.txt, c(freq row col) f(0c 1p 1p) ///
      layout(cb) h1(nil) h3(nil) npos(row)

. tabout coll smsa south race ///
      using table4.txt, c(col) f(1) ///
      clab(Col_%) stats(gamma) npos(row)

. tabout occupation industry south ///
      using table5.txt, c(col cell) f(1) ///
      clab(Col_% Cell_%) npos(row) nlab(Sample size)

. tabout occupation industry south ///
      using table6.txt, c(col cell) f(1) ///
      npos(row) nlab(Sample size) layout(cb)

    +-------------------------------+
----+ Basic tables with survey data +------------------------------------

. tabout coll race smsa south using table7.txt, ///
      c(row ci) f(1 1) clab(Row_% 95%_CI) svy stats(chi2) ///
      npos(lab) per

. tabout smsa race south using table8.txt, ///
      c(mean wage lb ub) f(2m) svy sum

    +----------------+
----+ Summary tables +---------------------------------------------------

. sysuse voter, clear

. tabout inc candidat using table9.txt, ///
      c(mean pfrac) f(1) clab(%) sum

. sysuse auto, clear

. tabout rep78 foreign using table10.txt, ///
      c(iqr weight) f(0c) sum h3(nil) npos(both)

. tabout foreign rep78 using table11.txt, ///
      c(mean mpg mean weight mean length median price median headroom) ///
      f(1c 1c 1c 2cm 1c) ///
      clab(MPG Weight_(lbs) Length_(in) Price Headroom_(in)) ///
      sum npos(tufte)

    +---------------------------------+
----+ Summary tables with survey data +----------------------------------

. sysuse nlsw88.dta, clear

. tabout occ south race coll using table12.txt, ///
      c(mean wage se) f(2 2) clab(Mean_wage SE) ///
      sum svy npos(lab)

. tabout south race coll using table13.txt, ///
      c(mean wage se ci) f(2 2) sum svy npos(lab) layout(row) ///
      level(90) clab(_ (SE) (90%_CI)) ///
      h3( | Average wage | Average wage | Average wage)

. tabout south race coll using table14.txt, ///
      c(mean wage lb ub) f(2 2) sum svy ///
      npos(lab) nlab((Sample size = #)) ///
      layout(row) level(90) clab(_ Lower_bound Upper_bound) ///
      h3( | | Average wage | )

Acknowledgements

Numerous people have provided feedback and advice over the last two years and I am very grateful for their comments. In particular I'd like to thank: Mitch Abdon, Ulrich Atz, JP Azevedo, Megan Blaxland, Eric Booth, Simon Coulombe, Enzo Coviello, Nick Cox, Axel Engellandt, David L. Eckles, Richard Fox, Jonathan Gardner, Johannes Geyer, Bill Gould, Daniel Hoechle, Ben Jann, Stephen Jenkins, Stas Kolenikov, Thomas Masterson, Scott Merryman, Nirmala Devi Naidoo, Thomas Odeny, Cathy Redmond, Mikko Ronkko, Rafael Martins de Souza, Benjamin Schirge, Urvi Shah, Tim Stegmann, Herve Stolowy, Amanda Tzy-Chyi Yu and Chris Wallace.

Thanks also to Arjan Soede for contributing the code for the dpcomma option.

Author

Ian Watson
Freelance researcher and
Visiting Senior Research Fellow
Macquarie University and
Social Policy Research Centre
University of New South Wales
Sydney Australia
mail@ianwatson.com.au
www.ianwatson.com.au

Version 2.0.6 26nov2012