{smcl}
{* Created August 22, 2011}{...}
{hline}
{cmd:help for stat2data}
{hline}

{title:Title}

{p 4 4 2}{...}
{bf:stat2data ---} {sf:Generates a Dataset of Descriptive Statistics Calculated for a List of Variables}
{p2colreset}{...}

{marker contents}{dlgtab: Table of Contents}
{p 6 16 2}

{p 2}{help stat2data##syntax:Syntax}{p_end}
{p 2}{help stat2data##description:General description of {cmd:stat2data}}{p_end}
{p 2}{help stat2data##options:Description of the options}{p_end}
{p 2}{help stat2data##examples:Examples}{p_end}
{p 2}{help stat2data##author:Author information}{p_end}

{marker syntax}{title:Syntax}

{p 8 16 2}
{cmd:stat2data} {varlist} {ifin} {weight}{cmd:,} {cmdab:sa:ving(}{it:filename}[{cmd:,} {it:suboption}]{cmd:)} [{it:other_options}]

{synoptset 30 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Main}
{synopt:{opth by(varname)}}report statistics for each value of {it:varname}{p_end}
{synopt:{cmdab:s:tatistics(}{it:{help tabstat##statname:statname}} [{it:...}]{cmd:)}}create dataset for specified statistics{p_end}

{syntab:Options}
{synopt:{opth gen:erate(newvarlist)}}generate {it:newvar_1}, ..., {it:newvar_k} for each requested statistic{p_end}
{synopt:{opt case:wise}}perform casewise deletion of observations{p_end}
{synopt:{opt m:issing}}report statistics for missing values of {opt by()} variable{p_end}
{synopt:{opt f:ormat}[{cmd:(%}{it:{help format:fmt}}{cmd:)}]}display format for statistics; default format is {cmd:%9.0g}{p_end}
{synopt:{cmdab:sa:ving(}{it:filename}[{cmd:,} {it:suboption}]{cmd:)}}save dataset of statistics to file {it:filename}; this option is required{p_end}

where {it:suboption} must be {cmd:replace} to overwrite an existing {it:filename}.
{p2colreset}{...}

{marker description}{dlgtab:Description}

{pstd}
{cmd:stat2data}, a wrapper for and an extension of Stata's official {help tabstat} command, generates a dataset of descriptive statistics calculated for a list of variables.
{cmd:stat2data}'s output differs from that of collapsing a dataset with the {help collapse} command. In the dataset generated by {cmd:stat2data}, the statistics are in columns and the variables for which the statistics were calculated are in rows. In other words, statistics become variables and variables become observations. If the {opt by()} option is specified, the dataset will contain observations for each variable and for each value of the by variable, including missing if the {opt missing} option is specified.

{marker options}{dlgtab:Options}

{phang}
{opth by(varname)} specifies that the dataset contain statistics for each unique value of {it:varname}, which may be numeric or string.

{phang}
{cmd:statistics(}{it:statname} [{it:...}]{cmd:)} specifies the statistics to be included in the generated dataset. If this option is not specified, {cmd:statistics(mean)} is assumed.

{pmore}
While {help tabstat} allows both {bf:median} and {bf:q} to be requested together, {bf:stat2data} does not, because the median (p50) is already reported when {bf:q} is specified.

{phang}
{opth g:enerate(newvarlist)} indicates the names of the variables to hold the statistics. Specify one variable name for each statistic, except for {bf:q}, which requests p25, p50, and p75 and so needs three names.

{pmore}
If {opt generate()} is not specified, {bf:stat2data} will form variable names by prefixing the name of each specified statistic with {bf:s}. If {bf:q} is listed among the statistics to be calculated and {opt generate()} is specified, you need two more variable names in addition to one for each statistic listed.

{phang}
{opt format} and {cmd:format(%}{it:{help format:fmt}}{cmd:)} specify how the statistics are to be formatted when the dataset is created. The default is to use a {cmd:%9.0g} format.

{phang}
{opt casewise} performs casewise deletion of observations; see {help tabstat}.
{phang}
{opt missing} specifies that, when {opt by()} is specified, statistics for the missing values of the {opt by()} variable be included in the dataset. The default is to exclude the statistics for the {cmd:by()==}{it:missing} group from the dataset.

{pmore}
{cmd:stat2data}'s options {opt by()}, {opt statistics()}, {opt format}, {opt casewise}, and {opt missing} are equivalent to those of {help tabstat}.

{phang}
{cmd:saving(}{it:filename} [{cmd:,} {it:suboption}]{cmd:)} specifies that the dataset of calculated statistics be saved to the Stata data file {it:filename}. The dataset is placed in the current directory unless {it:filename} includes a path. Specifying the suboption {opt replace} will overwrite an existing {it:filename}.

{pmore}
If your {it:filename} (including its path) contains embedded spaces, remember to enclose it in double quotes.

{marker examples}{dlgtab:Examples}

{pmore}
{stata "sysuse auto, clear" :. sysuse auto, clear}{p_end}

{pmore}
{stata "stat2data price mpg trunk weight length turn, saving(statdata) by(rep78) stat(mean sd k sk q) missing" :. stat2data price mpg trunk weight length turn, saving(statdata) by(rep78) stat(mean sd k sk q) missing}{p_end}

{pmore}{stata "preserve" :. preserve}{p_end}
{pmore}{stata "use statdata, clear" :. use statdata, clear}{p_end}
{pmore}{stata "list" :. list}{p_end}
{pmore}{stata "restore" :. restore}{p_end}

{marker author}{title:Author}

{hi:P. Wilner Jeanty}, the Kinder Institute for Urban Research/Hobby Center for the Study of Texas, Rice University, Houston, Texas.
{browse "mailto:pwjeanty@rice.edu":pwjeanty@rice.edu}

{title:See also}

{psee}
Online: {help tabstat}
{p_end}