-------------------------------------------------------------------------------
help for mktab                                                      Nick Winter
-------------------------------------------------------------------------------

Print table of estimates in delimited or screen-presentation format

Basic syntax:

mktab (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN) [weight] [if exp] [in range] [, options ]

Full syntax:

mktab ([eqname1:] depvar1a [depvar1b ... =] varlist1 [, noconstant]) ([eqname2:] depvar2a [depvar2b ... =] varlist2 [, noconstant]) ... ([eqnameN:] depvarNa [depvarNb ... =] varlistN [, noconstant]) [weight] [if exp] [in range] [, options ]

Options:

print(varlist) log(filename[, append | replace]) cmd(cmdname) aux(name[=label][,name[=label] ... ]) est(name[=label][,name[=label] ... ]) flag(#=label[,#=label ... ]) tag(string) delimit(string) noisily continue xlabel ylabel onetitle t1title t2title b1title b2title notitle nobtitle mif(conditions) miflabel(labels) plevels nose bfmt(%fmt) sfmt(%fmt) efmt(%fmt) pfmt(%fmt) onetail screen vspace(#) connect notags latex nocaption html command_options

Description

mktab estimates one or more single-equation models, and prints the results in tab- (or other) delimited format for importation into a spreadsheet or word-processor. This facilitates creating tables in "journal-article" format, i.e. with standard errors in parentheses below parameter estimates, significance flagged, and so on.

mktab estimates each model specified. It then produces a delimited table with the dependent variables across the top, the RHS variables (followed by any auxiliary parameters and returned estimation results) down the left, and the coefficients and standard errors in the cells. Each column corresponds to a single model; each row to a single RHS variable.

mktab is similar to John Gallup's exemplary outreg. The primary differences are that it estimates all the models and generates the table with a single command, and that it does not require preserving and restoring the data. In addition, it has an option to display output in space-delimited columns on the screen.

if, in and weights are supported.

This command has been tested with regress, probit, logit, oprobit, and ologit, along with their survey counterparts. It should work with any single-equation estimator. Time series operators are not supported.

Specification of Models to Run

Models may be specified with the "new" Stata 6.0 multi-equation syntax. Thus, they can be entered individually, each surrounded by parentheses. If several models involve the same RHS variables, they can be specified together, with an equal sign separating the LHS and RHS variables, as in (mpg price weight = rep78 foreign).
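As a minimal sketch of the two forms, the following pair of commands should request the same three models. It assumes the auto data shipped with Stata (loaded with sysuse) and uses the screen option only to keep the output readable; the variable choice is illustrative.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* Three models entered individually, each in its own parentheses
mktab (mpg rep78 foreign) (price rep78 foreign) (weight rep78 foreign), screen

* The same three models grouped, with = separating LHS and RHS variables
mktab (mpg price weight = rep78 foreign), screen
```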

Options

print(varlist) specifies the list (and ordering) of RHS variables to include in the output. The default is to include all variables from all models. Wildcards can be used to refer to variables in this list (and will expand only to include variables in the models).

log(filename) is the name of a log file for the results. The usual log options, replace and append, are valid. If a log file is already open when this option is specified, that log file will remain in use for any output other than the delimited table (see noisily, below) and will be re-opened after the table is printed.

cmd(cmdname) is the name of the command to run the model. The default is regress.

aux(string) is a comma-separated list of auxiliary estimates to print immediately after varlist and in the listed order, e.g. aux(_cons), optionally with labels, e.g. aux(_cut1=Cut One,_cut2=Cut Two). They need not exist for all the estimated models.

est(string) is a comma-separated list of returned estimates to print with optional labels (e.g. est(N,r2=R Squared)).

mif(conditions) specifies a set of if conditions that vary across the multiple equations. They are combined with the if condition, if any, with an &. Conditions should take the form:

mif( [stub : ] cond1 [\ cond2 [...]])

where stub is prepended to each of the conditions. So, for example, mif(rep78== : 1 \ 2 \ 3 \ 4 \ 5 ) would apply if rep78==1 to the first equation, if rep78==2 to the second, and so on. The number of mif() conditions specified must match the number of equations. However, if exactly one equation is specified along with multiple mif() conditions, then the single equation will be duplicated for each mif() condition. This means that if you want to run the same model across multiple subgroups, you need only specify the equation once, as in the example below.

miflabel(labels) specifies a set of labels for the columns, to replace those generated by default by the mif() option. For example:

mktab (mpg = price weight length) , mif(foreign==1\foreign==0) miflabel("Foreign" "Domestic")

would label the two columns of output "Foreign" and "Domestic", rather than "foreign==1" and "foreign==0".
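The stub form of mif() can be combined with column labels in the same way. The sketch below assumes the auto data shipped with Stata; the three labels are made up for illustration and are not part of mktab.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* One equation, three mif() conditions: the model is repeated for
* rep78==3, rep78==4 and rep78==5, one column per subgroup
* (the label text is hypothetical)
mktab (mpg = price weight), mif(rep78== : 3 \ 4 \ 5) miflabel("Average" "Good" "Excellent") screen
```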

flag(flaglist) specifies a comma-separated list of one or more significance (p) levels, given in percent (for example, 1 for p<0.01), and corresponding symbols. The default is flag(1=**,5=*,10=^), which labels p<0.01 with **, p<0.05 with *, and p<0.10 with ^. Flag values must be in ascending order. To suppress significance marking, specify flag(0=*).
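A short sketch of two common variations, assuming the auto data shipped with Stata; the three-star symbols are an illustrative choice rather than anything mktab prescribes.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* Three-star convention, levels in ascending order:
* p<0.01 gets ***, p<0.05 gets **, p<0.10 gets *
mktab (mpg = price weight), flag(1=***,5=**,10=*) screen

* Suppress significance marking, using the flag(0=*) form given above
mktab (mpg = price weight), flag(0=*) screen
```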

connect indicates that the significance level flags (e.g. **) should be connected with the coefficients, rather than separated into their own output columns.

tag(string) is an identifying name that prints in the third column of output (e.g. tag(Model One)).

notags indicates that line numbers, line types, and tag information should not be included. This option is selected automatically when screen formatting is requested.

delimit(string) specifies a delimiter to be used to separate the columns. The default is to tab-delimit the output.

noisily indicates that the individual models should be displayed as they are run. If a log file is open when mktab is executed, this output will be sent to that log file; the delimited table will be sent to the file indicated in the log() option.

continue indicates that the master row numbers should continue from the last invocation of mktab. This allows creation of a single log file with multiple tables, one after the other.
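For instance, two tables might be collected in one file along the following lines. This is a sketch that assumes the auto data shipped with Stata; the file name and tags are illustrative.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* First table starts (or overwrites) the file
mktab (mpg = price weight), tag(Table One) log(mylog, replace)

* Second table is appended; continue keeps the row numbers running on
* from the first table instead of restarting
mktab (mpg = price weight length), tag(Table Two) log(mylog, append) continue
```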

xlabel uses variable labels (if any) to label the RHS variables.

ylabel uses variable labels (if any) to label the LHS variables.

onetitle indicates that all the equations have the same LHS variable, and that the column header (i.e., variable name or label) should span all the columns, rather than being repeated for each.

t1title, t2title, b1title, and b2title allow up to two lines of titles at the top and bottom of the table.

notitle and nobtitle suppress printing of the default top and bottom title information. This is helpful in conjunction with the continue option.

plevels indicates that p-levels should be displayed under the coefficients, in place of standard errors.

nose suppresses the display of standard errors (and p-levels).

bfmt, sfmt, efmt, and pfmt indicate the display format to use for coefficients, standard errors, returned estimates, and p-values, respectively. The defaults are %4.3f for coefficients and standard errors, %3.2f for estimates, and %5.4f for p-values.
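As an illustration, the formats can be widened when the coefficients are large. The sketch assumes the auto data shipped with Stata, and the particular formats are arbitrary choices.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* Two decimals for coefficients and standard errors,
* three decimals for the returned estimates N and r2
mktab (price = mpg weight), est(N,r2=R Squared) bfmt(%6.2f) sfmt(%6.2f) efmt(%6.3f)
```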

onetail indicates that one-tailed significance levels should be calculated. The default is two-tailed.

screen indicates that the table should be formatted for the screen, rather than delimited. This will produce a space-delimited table that will display appropriately in a fixed-pitch font. It is useful for reading model results during interactive use.

vspace indicates the width of the column for variable names, when results are formatted to the screen. The default is 20.
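A sketch of screen output with a wider name column, which may help when xlabel prints long variable labels. It assumes the auto data shipped with Stata; the width of 30 is an arbitrary choice.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* Screen-formatted table, rows labeled with variable labels,
* name column widened from the default 20 to 30 characters
mktab (mpg = price weight length), xlabel screen vspace(30)
```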

latex indicates that the table should be formatted for inclusion in a LaTeX file. See discussion of LaTeX, below.

nocaption indicates that the table caption (taken from t1title()) should be omitted from the LaTeX output.

html indicates that the table should be formatted as an HTML table.

command_options can include any options appropriate to the command being run (e.g. robust).
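For example, robust can simply be added to the option list, and mktab should hand it on to the estimation command. The sketch assumes the auto data shipped with Stata; the models are illustrative.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* robust is passed through to regress, the default cmd()
mktab (mpg = price weight) (mpg = price weight length), robust screen

* The same pass-through with a different estimator
mktab (foreign = mpg weight), cmd(probit) robust screen
```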

Output

mktab outputs the following columns:

1 row number: consecutive numbering of the lines in the table as a whole, which allows re-sorting into the correct order

2 line type: 0 for header/title rows, 1 for coefficients, 2 for standard errors, 3 for estimates, and 4 for footer rows. These allow re-sorting of the data for formatting

3 the tag: Contents of the tag() option, to identify the table

4 variable names: The name of the variable for this row (blank for std error rows)

5+ Alternating columns of coefficients for each model, with significance symbols for each coefficient. Rows alternate between coefficients and their standard errors. Standard errors are surrounded by parentheses.

LaTeX output

The latex option specifies that the output table should be formatted as a LaTeX table (i.e., in a tabular environment). The table is formatted with coefficient and standard error lines decimal-aligned, and other lines centered. Standard errors are printed in a smaller font under the coefficients.

The resulting table makes use of two LaTeX packages, threeparttable and booktabs, so be sure to include \usepackage{threeparttable,booktabs} in your LaTeX preamble.

For example, the following command produces the following LaTeX output, and saves it in the file mylog.tex:

. mktab (mpg price rep78 = gratio weight), log(mylog.tex) latex

\begin{table}[ht]
\begin{center}
\begin{threeparttable}
\caption{{\em }}
\begin{tabular}{ l r@{}l r@{}l r@{}l }
\toprule
& \multicolumn{2}{c}{{\em mpg}}& \multicolumn{2}{c}{{\em price}}& \multicolumn{2}{c}{{\em rep78}} \\
\cmidrule(lr){2-7}
gear\_ratio&0.&099&1454.&284&0.&535 \\
&\raisebox{.7ex}[0pt]{\scriptsize (1.}&\raisebox{.7ex}[0pt]{\scriptsize 365)} &\raisebox{.7ex}[0pt]{\scriptsize (978.}&\raisebox{.7ex}[0pt]{\scriptsize 092)} &\raisebox{.7ex}[0pt]{\scriptsize (0.}&\raisebox{.7ex}[0pt]{\scriptsize 389)} \\
weight&--0.&006$^{**}$&2.&692$^{**}$&--0.&000 \\
&\raisebox{.7ex}[0pt]{\scriptsize (0.}&\raisebox{.7ex}[0pt]{\scriptsize 001)} &\raisebox{.7ex}[0pt]{\scriptsize (0.}&\raisebox{.7ex}[0pt]{\scriptsize 574)} &\raisebox{.7ex}[0pt]{\scriptsize (0.}&\raisebox{.7ex}[0pt]{\scriptsize 000)} \\
\bottomrule
\end{tabular}
\begin{tablenotes}[flushleft]
\item \hspace{-0.2em}$^{**}$ p$<$0.01; $^{*}$ p$<$0.05; \raisebox{.7ex}[0pt]{\tiny $\wedge$} p$<$0.10 two tailed
\end{tablenotes}
\end{threeparttable}
\end{center}
\end{table}
\clearpage

The following example LaTeX document would include this table:

\documentclass[8pt]{extarticle}
\usepackage{threeparttable,booktabs}

\oddsidemargin 0.0in
\evensidemargin 0.0in
\textwidth 6.5in
\topmargin 0.5in
\textheight 9.0in

\begin{document}
\input{mylog.tex}
\end{document}

In fact, this wrapper LaTeX file can be created automatically by mktab, if you issue the following command:

. mktab wrapper , log(mylog.tex)

This creates a file _mylog.tex, which contains the wrapper listed above. This can be handy when fine-tuning a table or set of tables.
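Put together, a LaTeX workflow might look like the sketch below. It assumes the auto data shipped with Stata and a LaTeX installation that provides threeparttable, booktabs and the extarticle class; the compile step happens outside Stata.

```stata
* Assumption: the auto data shipped with Stata is available
sysuse auto, clear

* Step 1: write the table as a LaTeX fragment to mylog.tex
mktab (mpg price = weight length), log(mylog.tex) latex

* Step 2: write the wrapper document _mylog.tex, which \input's mylog.tex
mktab wrapper, log(mylog.tex)

* Step 3 (outside Stata): compile _mylog.tex with your LaTeX toolchain
```

Re-running only the first step after changing the models refreshes the table, since the wrapper reads mylog.tex each time it is compiled.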

Examples

. mktab (mpg price rep78 = gratio weight), log(mylog)

This runs three regressions (with DVs mpg, price and rep78), each with RHS variables gratio and weight (and a constant). It prints the coefficients (and their standard errors) into a file called mylog.log.

. mktab (vote90 vote92 = dem rep ideology), cmd(probit) aux(_cons=Intercept) est(N,ll=Log Likelihood,r2_p=Pseudo R2) tag(Table One) log(mylog, append) continue

This runs two probit models (DVs vote90 and vote92), each with three independent variables (dem, rep and ideology). The table includes coefficients and standard errors for the three RHS variables and the constant term (labeled Intercept), as well as estimates N, ll, and r2_p (which are labeled "N", "Log Likelihood", and "Pseudo R2", respectively). It appends the table to mylog.log, and numbers the rows consecutively with the previous table.

. mktab (rep78 mpg price) (rep78 mpg price weight), cmd(oprobit) f(.1=**,1=*) pr(5) log(mylog, replace) xlab ylab delimit(,) a(_cut1=Cut One,_cut2=Cut Two,_cut3=Cut Three,_cut4=Cut Four)

This runs two ordered probit models of rep78, one with independent variables mpg and price, the other with mpg, price, and weight. It prints the results with five decimal places, uses the variable labels from the data set, flags coefficients p<0.001 with ** and p<0.01 with *, and delimits the output with commas. This produces the following output:

1,0,,Estimates (using oprobit)
2,1,,Variable,Repair Record,,Repair Record,
3,1,, ,1978,,1978,
4,2,,Mileage (mpg),0.11342,**,0.05978,
5,3,,,(0.02897),,(0.04091),
6,2,,Price,0.00010, ,0.00014,
7,3,,,(0.00005),,(0.00005),
8,2,,Weight (lbs.),--, ,-0.00055,
9,3,,, ,,(0.00031),
10,2,,Cut One,0.96916, ,-1.66922,
11,3,, ,(0.82655),,(1.68018),
12,2,,Cut Two,1.83700, ,-0.78752,
13,3,, ,(0.79926),,(1.66034),
14,2,,Cut Three,3.22326,**,0.63996,
15,3,, ,(0.82574),,(1.65004),
16,2,,Cut Four,4.18628,**,1.65129,
17,3,, ,(0.87507),,(1.65440),
18,6,,** p<0.001, * p<0.01, two tailed
19,7,,

. mktab (rep78 mpg price) (rep78 mpg price weight), cmd(oprobit) f(.1=**,1=*) pr(5) log(mylog, replace) xlab ylab screen a(_cut1=Cut One,_cut2=Cut Two,_cut3=Cut Three,_cut4=Cut Four)

This runs the same models, but formats the output for the screen. This produces the following output:

Variable                Repair      Repair
                        Record      Record
                        1978        1978

Mileage (mpg)            0.113**     0.060
                        (0.029)     (0.041)
Price                    0.000       0.000
                        (0.000)     (0.000)

Weight (lbs.)            --         -0.000
                                    (0.000)

Cut One                  0.969      -1.669
                        (0.827)     (1.680)
Cut Two                  1.837      -0.788
                        (0.799)     (1.660)
Cut Three                3.223**     0.640
                        (0.826)     (1.650)
Cut Four                 4.186**     1.651
                        (0.875)     (1.654)

** p<0.001; * p<0.01; two tailed

Notes

This program is inspired by Christopher Ferrell's esthold and estprt commands, from which some programming was also lifted. Code for parsing the multiple-equation syntax was taken shamelessly from the reg3 command; thank you, Vince Wiggins and StataCorp.

The tag, row number and row type columns can be used to sort the rows in useful ways once inside the spreadsheet. For example, after sorting on row type, one has blocks of coefficients, standard errors, and results grouped together, possibly across a large number of tables (different tags). One can then easily set formatting options for each type (block) of row, and re-sort by tag and row number to return to the original output order.

If you set things up right, however, you should be able to take these results directly into a word processor.

The significance calculations are based on e(df_r) containing degrees of freedom for t ratios; for models that do not generate e(df_r) (e.g. probit) the z ratio is used.

Author

Nicholas Winter
Department of Political Science
Cornell University
nw53@cornell.edu