Title

[R] tabmult -- Extension of tabulate tables, save results in XML format

Syntax

tabmult [if] [in] [weight] , [cat(varlist) cont(varlist)] by(varlist) [options]

Multiple two-way tables for all possible combinations - a convenience tool. Saves results in XML format.

options Description ------------------------------------------------------------------------- Main col report relative frequency within its column of each cell row report relative frequency within its row of each cell scol report relative frequency supercolum total srow report relative frequency superrow total missing treat missing values like other values statcont(statname [...]) report specified statistics

Output save(["]filename["]) name and path for the output file replace overwrite existing filename append if workbook filename exists, add a new sheet, otherwise create a new workbook sheet(name [, sh_opts]) worksheet where the table is outputted

color(#) specify tab color for a worksheet nogridlines hide gridlines on a worksheet

savemat(name [, sm_opts]) save estimates to a matrix

replace if matrix name already exists replace it. The default is to append exit after writing the matrix exit tabmult without creating an output file

Description

tabmult collects multiple two-way tabulation tables of frequency counts or percentages between each variable in the categorial (cat) varlist and each variable in the by varlist, and calculates averages for variables in the continuous (cont) varlist by categories of each variable in the by varlist. It also aligns, re-arranges, and saves Stata results directly into XML format that could be opened with Microsoft Excel or OpenOffice Calc.

Options

+------+ ----+ Main +-------------------------------------------------------------

col displays the relative frequency of each cell within its column in a two-way table.

row displays the relative frequency of each cell within its row in a two-way table.

scol displays the relative frequency of supercolumn total.

srow displays the relative frequency of superrow total.

missing requests that missing values be treated like other values in calculations of counts, percentages.

statcont(statname [...]) specifies the statistics to be displayed; the default is equivalent to specifying statcont(mean) or statcont(count) depending on whether "row" and/or "col" options are specified. (statc() is a synonym for statcont().) Multiple statistics may be specified and are separated by white space, such as statcont(mean sd). Available statistics are

statname definition --------------------------------------------------------------------- mean mean count count of nonmissing observations n same as count sum sum max maximum min minimum range range = max - min sd standard deviation variance variance cv coefficient of variation (sd/mean) semean standard error of mean (sd/sqrt(n)) skewness skewness kurtosis kurtosis p1 1st percentile p5 5th percentile p10 10th percentile p25 25th percentile median median (same as p50) p50 50th percentile (same as median) p75 75th percentile p90 90th percentile p95 95th percentile p99 99th percentile iqr interquartile range = p75 - p25 q equivalent to specifying p25 p50 p75 ---------------------------------------------------------------------

+--------+ ----+ Output +-----------------------------------------------------------

save(["]filename["]) specifies a name for XML file where tables are outputted. If save(["]filename["]) is omitted, the output will be saved in stata_tabmult.xml located in the current working directory. Use append and replace options to instruct tabmult to append a table into the new worksheet of the existing file or to replace the existing file.

replace permits to overwrite the existing XML file.

append if XML file already exists, a new sheet will be added to it where the output table will be saved. Otherwise the new file will contain one sheet with the output.

NOTE: You need to close XML file opened in Excel or Calc for tabmult to save new tables there. If XML file is still opened, tabmult reports an error message: file can not be saved at this location.

sheet(name) You can output several tables into the different sheets of XML file (workbook). Excel or Calc files could contain multiple worksheets within a single document (workbook). sheet() option specifies the name for the new sheet where the table will be outputted. If not specified, a worksheet named Sheet1 will be added.

A valid Excel worksheet name must have no more than 31 characters. The worksheet name cannot contain any of the following: : \ / ? * [ or ] and can not be left blank.

savemat(name) saves tabulation results in a matrix. If name exists and option replace was not used then additional rows will be appended to that matrix. In this case number of columns in the existing matrix and the one to be appended must be the same. The output matrix will contain frequency and/or mean of the variables specified in the syntax.

Limits

Similar to [R] tab, each two-way tables may have a maximum of 1,200 rows and 80 columns (Stata/MP and Stata/SE), 300 rows and 20 columns (Stata/IC), or 160 rows and 20 columns (Small Stata). For larger tables, you might need to adjust the matrix size in Stata, see [R] matsize and [R] table.

Examples

--------------------------------------------------------------------------- Setup . webuse citytemp2

Multiple two-way tables of frequencies . tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division)

Multiple two-way tables of frequencies including missing observations . tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division) m

. tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division) srow scol m

Include row percentages and means for continuous variables . tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division) row save(Table1.xls)

. tabmult, cat(agecat) cont(heatdd cooldd tempjan tempjuly) by(division region) row m save(Table1.xls) replace

Include column percentages and means, medians for continuous variables . tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division) col save(Table1.xls) replace

. tabmult, cat(region agecat) cont(heatdd cooldd tempjan tempjuly) by(division) statc(mean med) col save(Table1.xls) replace

. tabmult, cat(agecat) cont(heatdd cooldd tempjan tempjuly) by(division region) col m save(Table1.xls) sheet(missing) append

. tabmult, cat(agecat) cont(heatdd cooldd tempjan tempjuly) by(division region) col m sr sc save(Table1.xls) sheet(totals) append

----------------------------------------------------------------------------

Acknowledgements

Some functionality ideas were adopted from xml_tab by Michael Lokshin and Zurab Sajaia.

Author

Minh Cong Nguyen World Bank, 2012 mnguyen3@worldbank.org

Version

This is version 1.0.6 released May., 2012.