-------------------------------------------------------------------------------
help for stat2data
-------------------------------------------------------------------------------

Title

stat2data --- Generates a Dataset of Descriptive Statistics Calculated for a List of Variables

    +-------------------+
----+ Table of Contents +-----------------------------------------------

Syntax
General description of stat2data
Description of the options
Examples
Author information

Syntax

stat2data varlist [if] [in] [weight], saving(filename[, suboption]) [other_options]

options                          Description
-------------------------------------------------------------------------
Main
  by(varname)                    request statistics by variable
  statistics(statname [...])     create dataset for specified statistics

Options
  generate(newvarlist)           generate newvar_1, ..., newvar_k for each requested statistic
  casewise                       perform casewise deletion of observations
  missing                        report statistics for missing values of by() variable
  format[(%fmt)]                 display format for statistics; default format is %9.0g
  saving(filename[, suboption])  save dataset of statistics to file filename; this option is required
-------------------------------------------------------------------------
where suboption must equal replace to overwrite filename

    +-------------+
----+ Description +------------------------------------------------------

stat2data, a wrapper for and extension of Stata's official tabstat command, generates a dataset of descriptive statistics calculated for a list of variables. Its output differs from what the collapse command produces. In the dataset generated by stat2data, the statistics are in columns and the variables for which the statistics were calculated are in rows. In other words, statistics become variables and variables become observations. If the by() option is specified, the dataset contains one observation for each variable within each value of the by() variable, including the missing value if the missing option is also specified.
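
As a rough illustration of the difference, the sketch below runs collapse and stat2data on the same variables from the auto dataset and lists each result. The file name statdata is arbitrary, and the layouts noted in the comments follow from the description above rather than from output reproduced here.

    . sysuse auto, clear
    . * collapse: price and mpg remain variables, one observation per rep78 group
    . preserve
    . collapse (mean) price mpg, by(rep78)
    . list
    . restore
    . * stat2data: price and mpg become observations, the mean becomes a variable
    . stat2data price mpg, saving(statdata, replace) by(rep78) statistics(mean)
    . preserve
    . use statdata, clear
    . list
    . restore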

    +---------+
----+ Options +----------------------------------------------------------

by(varname) specifies that the dataset contain statistics for each unique value of varname, which may be numeric or string.

statistics(statname [...]) specifies the statistics to include in the dataset. If this option is not specified, statistics(mean) is assumed.

tabstat allows median and q to be requested together; stat2data does not, because q already reports the median (as p50).

generate(newvarlist) specifies the names of the variables that will hold the statistics. Specify one variable name for each requested statistic. The exception is q, which requests p25, p50, and p75 and therefore needs three names rather than one.

If generate() is not specified, stat2data forms the variable names by prefixing the name of each requested statistic with an s. If q is among the requested statistics and generate() is specified, supply two more variable names than the number of statistics listed.
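
For example, the sketch below requests mean, sd, and q, and so supplies five names: one each for mean and sd, and three for the quartiles. The variable names avg, sdev, q25, q50, and q75 and the file name mystats are hypothetical, and the sketch assumes that names are assigned in the order in which the statistics are listed.

    . sysuse auto, clear
    . stat2data price mpg, saving(mystats, replace) statistics(mean sd q) generate(avg sdev q25 q50 q75)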

format and format(%fmt) specify how the statistics are to be formatted when the dataset is created. The default is to use a %9.0g format.

casewise specifies casewise deletion of observations; see tabstat.

missing specifies that statistics for the missing values of the by() variable be included in the dataset; it has an effect only when by() is specified. By default, statistics for the by()==missing group are excluded from the dataset.

stat2data's options by(), statistics(), format, casewise, and missing correspond to the tabstat options of the same names; see tabstat for details.

saving(filename[, suboption]) specifies that the generated dataset of statistics be saved to the Stata data file filename. This option is required. If filename does not include a path, the dataset is placed in the current directory. The only suboption is replace, which overwrites filename if it already exists.

If your filename (including its path) contains embedded spaces, remember to enclose it in double quotes.
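
For instance, a minimal sketch using a hypothetical file name that contains a space (with statistics() omitted, so the mean is calculated):

    . sysuse auto, clear
    . stat2data price mpg, saving("auto stats", replace)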

    +----------+
----+ Examples +---------------------------------------------------------

. sysuse auto, clear

. stat2data price mpg trunk weight length turn, saving(statdata) by(rep78) stat(mean sd k sk q) missing

. preserve

. use statdata, clear

. list

. restore

Author

P. Wilner Jeanty, the Kinder Institute for Urban Research/Hobby Center for the Study of Texas, Rice University, Houston, Texas. pwjeanty@rice.edu

See also

Online: tabstat