Title
outreg - reformat and write regression tables to a document file
n.b. outreg has many options. For basic options only, see basic outreg.
For an explanation of the large changes to outreg since the previous version, see outreg updates.
Syntax
outreg [using filename] [, options]
Description
outreg can arrange the results of Stata estimation commands in tables as they are typically presented in journal articles, rather than as they are presented in the Stata Results window. By default, t statistics appear in parentheses below the coefficient estimates with asterisks for significance levels.
outreg provides as complete control of the layout and formatting of estimation tables as possible, both in Word and TeX files. Almost every aspect of the table's structure and format (including fonts) can be specified with options. Multiple tables can be written to the same document, with paragraphs of text in between, creating a whole statistical appendix.
outreg works after any estimation command in Stata (see estimation commands for a complete list*). Like predict, outreg makes use of internally saved estimation results, so it should be invoked after the estimation.
outreg creates a Microsoft Word file by default, or a TeX file using the tex option. In addition, the table created by outreg is displayed in the Results window, minus some of the finer formatting destined for the Word or TeX file.
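The basic workflow described above can be sketched as follows. This is a minimal illustration, assuming outreg is already installed and using Stata's shipped auto dataset; the file name "test" is a placeholder, not something outreg requires.

```stata
// Minimal sketch: estimate first, then call outreg.
// "test" is a hypothetical output file name.
sysuse auto, clear
regress mpg weight foreign

// Write the table to a Word file (the default)
outreg using test, replace

// Write the same table to a TeX file instead
outreg using test, tex replace
```

In both cases the table is also displayed in the Results window.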
Successive estimation results, which may use different variables, can be combined by outreg into a single table using the merge option. (n.b. In previous versions of outreg, the merge option was called "append".)
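A hedged sketch of combining two estimations with merge follows. The file name "test" is a placeholder, and adding replace to the second call is an assumption made here so that the existing file is overwritten with the merged table.

```stata
// Sketch: two regressions with different covariates in one table.
sysuse auto, clear

regress mpg weight
outreg using test, replace          // first column of estimates

regress mpg weight foreign
outreg using test, merge replace    // second column merged onto the first
```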
* To be precise, outreg can display results after every estimation command which saves both e(b) and e(V) values. Estimation commands which do not save both e(b) and e(V) are ca, candisc, discrim, exlogistic, expoisson, factor, mca, mds, mfp, pca, procrustes, svy:tabulate. The estimates from these estimation commands (in the e() matrices) can be turned into a Word or TeX table with the frmttable command. outreg can display the results of the commands mean, ratio, proportion, and total which may not be thought of as estimation commands, and these commands accept the svy: prefix.
options categories          Description
---------------------------------------------------------------------------
estimates selection         which statistics are displayed in table
estimates formatting        numerical formatting & arrangement of estimates
text additions              titles, notes, added rows and columns
text formatting:
  column formatting         column widths, justification, etc.
  fonts                     font specifications for table
  lines & spaces            horizontal and vertical lines, cell spacing
  page formatting           page orientation and size
file & display options      TeX files, merge, replace, etc.
stars options               change stars for statistical significance
brackets options            change brackets around, e.g., t stats
summary stats options       summary statistics below estimates
frmttable options           technical options passed to frmttable
---------------------------------------------------------------------------

Inline text formatting: superscripts, italics, Greek characters, etc.
Notes about specific estimation commands
Examples of outreg in use
estimates selection         Description
---------------------------------------------------------------------------
se                          report standard errors rather than t statistics
marginal                    report marginal effects instead of coefficients
or | hr | irr | rrr         odds ratios, that is, exp(b) instead of b
stats(statname [...])       report statistics other than b and t statistics
nocons                      drop constant estimate (don't include _cons coefficient)
keep(eqlist | varlist)      include only specified coefficients
drop(eqlist | varlist)      exclude specified coefficients
level(#)                    set level for confidence intervals; default is level(95)
---------------------------------------------------------------------------
estimates formatting        Description
---------------------------------------------------------------------------
bdec(numlist)               decimal places for coefficients
tdec(#)                     decimal places for t statistics
sdec(numgrid)               decimal places for all statistics
bfmt(fmtlist)               numerical format for coefficients
sfmt(fmtgrid)               numerical format for all statistics
nosubstat                   don't put t statistics (or others) below coefficients
eq_merge                    merge multi-equation coefficients into multiple columns
---------------------------------------------------------------------------
text additions                Description
---------------------------------------------------------------------------
varlabels                     use variable labels as rtitles
title(textcolumn)             put title above table
ctitles(textgrid)             headings at top of columns
rtitles(textgrid)             headings to the left of each row
note(textcolumn)              put note below table
pretext(textcolumn)           regular text placed before the table
posttext(textcolumn)          regular text placed after the table
nocoltitl                     no column titles
norowtitl                     no row titles
addrows(textgrid)             add rows at bottom of table
addrtc(#)                     number of rtitles columns in addrows
addcols(textgrid)             add columns to right of table
annotate(Stata matrix name)   grid of annotation locations
asymbol(textrow)              symbols for annotations
---------------------------------------------------------------------------
column formatting                     Description
---------------------------------------------------------------------------
colwidth(numlist)*                    change column widths
multicol(numtriple[;numtriple ...])   have column titles span multiple columns
coljust(cjstring[;cjstring ...])      column justification: left, center, right, or decimal
nocenter                              don't center table within page
---------------------------------------------------------------------------
* Word-only option
font specification                  Description
---------------------------------------------------------------------------
basefont(fontlist)                  change the base font for all text
titlfont(fontcolumn)                change font for table title
ctitlfont(fontgrid[;fontgrid...])   change font for column titles
rtitlfont(fontgrid[;fontgrid...])   change font for row titles
statfont(fontgrid[;fontgrid...])    change font for statistics in body of table
notefont(fontcolumn)                change font for notes below table
addfont(fontname)*                  add a new font type
plain                               plain text - one font size, no justification
table sections                      explanation of outreg table sections
---------------------------------------------------------------------------
* Word-only option
border lines and spacing    Description
---------------------------------------------------------------------------
hlines(linestring)          horizontal lines between rows
vlines(linestring)          vertical lines between columns
hlstyle(lstylelist)*        change style of horizontal lines (e.g. double, dashed)
vlstyle(lstylelist)*        change style of vertical lines (e.g. double, dashed)
spacebef(spacestring)       put space above cell contents
spaceaft(spacestring)       put space below cell contents
spaceht(#)                  change size of spacebef & spaceaft
---------------------------------------------------------------------------
* Word-only option
page formatting             Description
---------------------------------------------------------------------------
landscape                   pages in landscape orientation
a4                          A4 size paper (instead of 8 1/2" x 11")
---------------------------------------------------------------------------
file and display options    Description
---------------------------------------------------------------------------
tex                         write a TeX file instead of the default Word file
merge[(tblname)]            merge as new columns to existing table
replace                     replace existing file
addtable                    write a new table below an existing table
append[(tblname)]           append as new rows below an existing table
replay[(tblname)]           write preexisting table
store(tblname)              store table with name "tblname"
clear[(tblname)]            clear existing table from memory
fragment**                  create TeX code fragment to insert into TeX document
nodisplay                   don't display table in results window
dwide                       display all columns however wide
---------------------------------------------------------------------------
** TeX-only option
stars options               Description
---------------------------------------------------------------------------
starlevels(numlist)         significance levels for stars
starloc(#)                  locate stars next to which statistic (def=2)
margstars                   calculate stars from marginal effects, not coefficients
nostars                     no stars for significance
nolegend                    no legend explaining significance levels
sigsymbols(textrow)         symbols for significance (in place of stars)
---------------------------------------------------------------------------
brackets options                       Description
---------------------------------------------------------------------------
squarebrack                            square brackets instead of parentheses
brackets(textpair [ \ textpair ...])   symbols with which to bracket substatistics
nobrket                                put no brackets on substatistics
dbldiv(text)                           symbol dividing double statistics ("-")
---------------------------------------------------------------------------
summary statistics options  Description
---------------------------------------------------------------------------
summstat(e_values)          additional summary statistics below coefficients
summdec(numlist)            decimal places for summary statistics
summtitles(textgrid)        row titles for summary statistics
noautosumm                  no automatic summary stats (R^2, N)
---------------------------------------------------------------------------
frmttable options           Description
---------------------------------------------------------------------------
blankrows                   allow (don't drop) blank rows in table
nofindcons                  don't assign _cons to separate section of table
---------------------------------------------------------------------------
    +---------------------+
----+ Estimates selection +----------------------------------------------
se specifies that standard errors rather than t statistics are reported in parentheses below the coefficient estimates. The decimal places displayed are those set by bdec.
marginal specifies that marginal effects rather than coefficients are reported. The t statistics are for the hypothesis that the marginal effects, not the coefficients, are equal to zero, and the asterisks report the significance of this hypothesis test. marginal is equivalent to stats(b_dfdx t_abs_dfdx) (or stats(b_dfdx se_dfdx) if the se option is used) combined with the margstars option.
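A minimal sketch of the marginal option, assuming the auto dataset and a placeholder file name "test". The margins, dydx(*) call is run between the estimation and outreg so that the marginal effects are available.

```stata
// Sketch: report marginal effects instead of coefficients.
sysuse auto, clear
logit foreign weight mpg

// Marginal effects must be computed before calling outreg
margins, dydx(*)

// Marginal effects in the table, with stars based on them
outreg using test, marginal replace
```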
or | hr | irr | rrr cause the coefficients to be displayed in exponentiated form: for each coefficient, exp(b) rather than b is displayed. Standard errors and confidence intervals are also transformed. Display of the intercept, if any, is suppressed. These options are identical, but by convention different estimation methods use different names.
exponentiation option   name
---------------------------------------
or                      odds ratio
hr                      hazard ratio
irr                     incidence-rate ratio
rrr                     relative-risk ratio
---------------------------------------
Note that after commands such as stcox, which report coefficients in exponentiated form by default, you must use one of the exponentiation options for the outreg table to display exponentiated coefficients and standard errors as they are displayed in the Results window after the stcox command.
The exponentiation options are equivalent to the option stats(e_b t) (or stats(e_b e_se) if the se option is in effect).
These options correspond to the or option used for logit, clogit, and glogit estimation, irr for poisson estimation, rrr for mlogit, hr for stcox hazard models, and eform for xtgee, but they can be used to exponentiate the coefficients after any estimation. Exponentiation of coefficients is explained in [R] maximize - methods and formulas.
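As an illustration of the exponentiation options, the sketch below reports odds ratios after logit. It assumes the auto dataset and a placeholder file name "test".

```stata
// Sketch: display exp(b) (odds ratios) rather than b after logit.
sysuse auto, clear
logit foreign weight mpg

// Equivalent, per the text above, to stats(e_b t)
outreg using test, or replace
```

Because the four option names are identical in effect, hr, irr, or rrr could be substituted here; or is simply the conventional name after logit.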
stats(statname [...]) specifies the statistics to be displayed; the default is equivalent to specifying stats(b t_abs). Multiple statistics are arranged below each other (unless you use the nosubstat option), with varying brackets. Available statistics are:
statname      definition
------------------------------------------------------------------
b             coefficient estimates
se            standard errors of estimate
t             t statistics for the test of b=0
t_abs         absolute value of t statistics
p             p value of t statistics
ci            confidence interval of estimates
ci_l          lower confidence interval of estimates
ci_u          upper confidence interval of estimates
beta          normalized beta coefficients (see the beta option of regress)
e_b           exponentiated form of the coefficients
e_se          exponentiated standard errors
e_ci          exponentiated confidence interval
e_ci_l        exponentiated lower confidence interval
e_ci_u        exponentiated upper confidence interval
b_dfdx        marginal effect of the coefficients (requires margins)
se_dfdx       standard errors of marginal effects
t_dfdx        t statistics of marginal effects
t_abs_dfdx    absolute value of t statistics of marginal effects
p_dfdx        p values of t statistics of marginal effects
ci_dfdx       confidence interval of marginal effects
ci_l_dfdx     lower confidence interval of marginal effects
ci_u_dfdx     upper confidence interval of marginal effects
at            values around which marginal effects were estimated
------------------------------------------------------------------
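For instance, the stats option might be used as in the sketch below to stack the standard error and p value under each coefficient. The dataset and the file name "test" are illustrative choices.

```stata
// Sketch: choose which statistics appear for each coefficient.
sysuse auto, clear
regress mpg weight foreign

// Coefficient, then standard error, then p value, one below the other
outreg using test, stats(b se p) replace
```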
nocons drops the constant estimate from the table.
keep(eqlist | coeflist) includes only the specified coefficients (and potentially reorders included coefficients). drop(eqlist | coeflist) excludes the specified coefficients.
eqlist (equation list) consists of eqname: [coeflist] [eqname: [coeflist] ...].
coeflist (coefficient list) is like a varlist but can include "_cons" for the constant coefficient, or other parameter names. Factor variable notation can be included. The coeflist can include any of the simple column names of the e(b) coefficient vector, which forms the basis of the table created by outreg. You can see the contents of the e(b) vector after an estimation command by typing matrix list e(b). If using marginal effects (after the margins command) rather than coefficient estimates, the relevant vector is r(b).
eqname is a second level column name of the e(b) vector used for multi-equation estimation commands, such as reg3 or mlogit. Many Stata estimation commands attach additional parameters to the coefficient vector e(b) with a distinct equation name. For instance, the xtreg,fe command includes two parameters in e(b) with eqnames "sigma_u:" and "sigma_e:". The coeflist for each of these eqnames is "_cons".
To report only the coefficient estimates without additional parameters in the e(b) vector, it usually works to use the keep(depvar:) option, since the coefficients are given an eqname of the dependent variable.
You can use the keep option to reorder variables for the formatted outreg table. The estimation coefficients will be displayed in the order specified in keep. Don't forget to include "_cons" in the reordered coeflist if you want the constant coefficient term to be included in the formatted table. By default, the "_cons" term is always displayed last in outreg even if it is not listed last in the keep coeflist. To display the "_cons" coefficient other than last, combine the keep option with the nofindcons option. If you want the "_cons" coefficient not to be last and are merging multiple tables, you must specify the nofindcons option with all the tables being merged, whether you employ the keep option for them or not, to ensure that the coefficients merge properly.
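The reordering rule can be sketched as follows, assuming the auto dataset and a placeholder file name "test". The second call moves the constant to the top, which requires nofindcons.

```stata
// Sketch: reorder coefficients with keep.
sysuse auto, clear
regress mpg foreign weight

// weight first, then foreign; _cons is still shown last by default
outreg using test, keep(weight foreign _cons) replace

// _cons first: keep alone is not enough, so add nofindcons
outreg using test, keep(_cons weight foreign) nofindcons replace
```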
If in doubt about what variable names, or especially equation names, to include in keep or drop, use matrix list e(b) (or matrix list r(b) for marginal effects) to see what names are assigned to saved estimation results.
You may have problems with keep and drop if you have chosen both coefficients and marginal effects as statistics, since they usually do not have the same coeflist in both cases due to the absence of a constant coefficient estimate in the marginal effects. A keep option that included "_cons" would result in an error message because no constant could be found in the marginal effects. In this case, you could only keep or drop variables occurring in both vectors. However, if you are using drop, you can still eliminate the constant term with the nocons option.
level(#) sets the confidence level, in percent, for confidence intervals, which are included in the outreg table using the stats(ci) option. The default is level(95) for a 95% confidence level. Note that level has no impact on the asterisks for the statistical significance of coefficients (for this, see starlevels). For more information about level see [R] estimation options - level. The default level can be set for all Stata commands, including outreg, using the set level command.
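A short sketch of level combined with stats(ci), under the same illustrative assumptions as elsewhere (auto dataset, placeholder file name "test"):

```stata
// Sketch: 90% confidence intervals below each coefficient.
sysuse auto, clear
regress mpg weight foreign

// level(90) changes the intervals only; the stars are unaffected
outreg using test, stats(b ci) level(90) replace
```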
    +----------------------+
----+ Estimates formatting +---------------------------------------------
bdec(numlist) specifies the number of decimal places reported for coefficient estimates (the b's). It also specifies the decimal places reported for standard errors if the se option is in effect. The default value for bdec is 3. The minimum value is 0 and the maximum value is 15. If one number is specified in bdec, it will apply to all coefficients. If multiple numbers are specified in bdec, the first number will determine the decimals reported for the first coefficient, the second number, the decimals for the second coefficient, etc. If there are fewer numbers in bdec than coefficients, the last number in bdec will apply to all the remaining coefficients.
The decimal places applied to each coefficient are also applied to the corresponding standard errors, confidence intervals, beta coefficients, and marginal effects, if they are included with the se or stats options.
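The way a bdec numlist is matched to coefficients can be illustrated with the sketch below. The dataset, variable order, and file name "test" are assumptions chosen for the example.

```stata
// Sketch: different decimal places for different coefficients.
sysuse auto, clear
regress mpg foreign weight

// Coefficient order is foreign, weight, _cons:
//   foreign gets 2 decimals, weight gets 5,
//   and _cons also gets 5 because the last number is repeated
outreg using test, bdec(2 5) replace
```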
tdec(#) specifies the number of decimal places reported for t statistics. The default value for tdec is 2. The minimum value is 0 and the maximum value is 15.
sdec(numgrid) is for finer control of the decimal places of estimates than is possible with bdec and tdec, but is rarely needed. The sdec numgrid corresponds to the decimal places for each of the statistics in the table. It can be used, for instance, to specify different decimal places for coefficients versus standard errors (bdec applies to both), or to allow varying decimal places for t statistics.
numgrid is a grid of integers 0-15 in the form used by matrix define. Commas separate elements along a row, and backslashes ("\") separate rows: numgrid has the form #[,#...] [\ #[,#...] [...]]. For example, if the table of statistics has three rows and two columns, the sdec(numgrid) could be sdec(1,2 \ 2,2 \ 1,3). If you specify a grid smaller than the table of statistics created by outreg, the last rows and columns of the sdec numgrid will be repeated to cover the whole table. Unbalanced rows or columns will not cause an error. They will be filled in, and outreg will display a warning message.
bfmt(fmtlist) specifies the numerical format for coefficients.
fmtlist consists of fmt [fmt ...] where fmt is either e, f, fc, g, or gc:
fmt code   format type
-------------------------------------------------
e          exponential (scientific) notation
f          fixed number of decimals
fc         fixed with commas for thousands, etc. - the default for outreg
g          "general" format (see format)
gc         "general" format with commas for thousands, etc.
-------------------------------------------------
The g formats do not allow the user to control the number of decimal places displayed.
Like bdec, if one format is specified in bfmt, it will apply to all coefficients. If multiple format codes are specified in bfmt, the first format will apply to the first coefficient, the second format to the second coefficient, etc. If there are fewer formats in bfmt than coefficients, the last format in bfmt will apply to all the remaining coefficients. The format applied to each coefficient is also applied to the corresponding standard errors, confidence intervals, beta coefficients, and marginal effects, if they are specified in se or stats.
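A hedged sketch of bfmt follows. The regression and the file name "test" are illustrative; the format codes are those listed in the table above.

```stata
// Sketch: per-coefficient numerical formats.
sysuse auto, clear
regress price weight foreign

// Coefficient order is weight, foreign, _cons:
//   weight in exponential notation,
//   foreign and _cons fixed with commas (last format is repeated)
outreg using test, bfmt(e fc) replace
```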
sfmt(fmtgrid) is for finer control of the numerical formats of estimates than is possible with bfmt, but is rarely needed. The sfmt fmtgrid is a grid of the format types (e, f, g, fc, or gc) for each statistic in the table. For example, sfmt could be used to assign different numerical formats for the coefficients in different columns of a multi-equation estimation, or to change the format for t statistics.
The fmtgrid in sfmt has the same form as the numgrid of the sdec option above.
nosubstat puts additional statistics, like t statistics or other "sub-statistics", in columns to the right of coefficients, rather than below them. Applying nosubstat with the default statistics of b and t_abs, the outreg table would have only one row, but two columns, for each coefficient. For example, the command outreg using test, nosubstat stats(b se t p ci_l ci_u) will arrange regression output the way it is displayed in the Stata Results window after the regress command, with each statistic in a separate column. In this case, for each variable in the regression, there is one row of results, but six columns of statistics (see Example 15).
eq_merge merges multi-equation estimation results into multiple columns, one column per equation. By default, outreg displays the equations one below the other in a single column. eq_merge is most useful after estimation commands like reg3, sureg, mlogit, and mprobit, where many or all of the variables recur in each equation. The coefficients are merged as if the equations were estimated one at a time, and the results were sequentially combined with the merge option.
    +---------------+
----+ Stars options +----------------------------------------------------
starlevels(numlist) indicates significance levels for stars in percent. By default, one star is placed next to coefficients which pass the test for significant difference from zero at the 5% level, and two stars are placed next to coefficients that pass the test for significance at the 1% level, which is equivalent to specifying starlevels(5 1). To place one star for the 10% level, 2 for the 5% level, and 3 for the 1% level, you would specify starlevels(10 5 1). To place one star for the 5% level, 2 for the 1% level, and 3 for the 0.1% level, you would specify starlevels(5 1 .1).
Example 5 applies the starlevels option.
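As a brief sketch (auto dataset and file name "test" are placeholders), three significance levels could be requested like this:

```stata
// Sketch: one star at 10%, two at 5%, three at 1%.
sysuse auto, clear
regress mpg weight foreign

outreg using test, starlevels(10 5 1) replace
```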
starloc(#) puts stars next to the statistic indicated. By default, stars are displayed next to the second statistic (starloc(2)), but they can be placed next to the first statistic (usually the coefficient estimate) or next to a third or higher statistic if one has been specified in stats.
margstars calculates stars for significance from marginal effects (and their standard errors), rather than from the coefficients themselves, which is the default.
nostars suppresses the stars indicating significance levels.
nolegend indicates that there will be no legend explaining the stars for significance levels below the table (by default, the legend is "* p<0.05; ** p<0.01"). To replace the legend, use the nolegend option, and put your own legend in a note.
sigsymbols(textrow) replaces the stars used to indicate statistical significance with other symbols of your choice. For example, to use a plus sign "+" to indicate a 10% significance level, you could apply sigsymbols(+,*,**) along with starlevels(10 5 1). By default, outreg uses one star for the first significance level, and adds an additional star for each additional significance level displayed.
The argument textrow consists of text separated by commas.
Example 5 applies the sigsymbols option.
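The sketch below combines sigsymbols with a hand-written legend, following the nolegend advice above. The wording of the note, the dataset, and the file name "test" are all illustrative assumptions.

```stata
// Sketch: custom significance symbols with a custom legend.
sysuse auto, clear
regress mpg weight foreign

// "+" marks the 10% level, "*" the 5% level, "**" the 1% level;
// the automatic legend is suppressed and replaced by a note
outreg using test, starlevels(10 5 1) sigsymbols(+,*,**) ///
    nolegend note("+ p<0.10; * p<0.05; ** p<0.01") replace
```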
    +----------------------------+
----+ Summary statistics options +---------------------------------------
summstat(evaluegrid) places summary statistics below the coefficient estimates. evaluegrid is a grid of the names of different e() return values already calculated by the estimation command. The syntax of the evaluegrid is the same as the other grids in outreg. Elements within a row are separated with commas (","), and rows are separated by backslashes ("\"). The default value of summstat is summstat(r2 \ N) (when e(r2) is defined), which places the R-squared statistic e(r2) below the coefficient estimates, and the number of observations e(N) below that.
To replace the R-squared with the adjusted R-squared stored in e(r2_a), you could use the options summstat(r2_a \ N) and summtitle("Adjusted R2" \ "N"). You can also specify the decimal places for the summary statistics with the summdec option. To see a complete list of the e() macro values available after each estimation command, type ereturn list.
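The replacement of R-squared by adjusted R-squared described above might look like the following sketch. The summdec values, dataset, and file name "test" are illustrative choices.

```stata
// Sketch: adjusted R-squared and N as summary statistics.
sysuse auto, clear
regress mpg weight foreign

// e(r2_a) and e(N) are stored by regress; see -ereturn list-
outreg using test, summstat(r2_a \ N) ///
    summtitle("Adjusted R2" \ "N") summdec(3 0) replace
```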
Statistics not included in the e() return values can be added to the table with the addrows option as in Example 7.
See an application of summstat in Example 5.
summdec(numlist) designates the decimal places displayed for summary statistics in the manner of bdec.
summtitles(textgrid) designates row titles for summary statistics in the same manner as rtitles.
noautosumm eliminates the automatically generated summary stats (R-squared, if there is one, and the number of observations) from the outreg table.
    +-------------------+
----+ frmttable options +------------------------------------------------
blankrows allows blank rows (across all columns) in the body of the outreg table to remain blank without being deleted. By default, outreg sweeps out any completely blank rows. This option is useful if you want to use blank rows to separate different parts of the table.
nofindcons turns off findcons, a technical frmttable option that outreg applies by default. findcons finds rows of statmat with row titles "_cons" and puts them in a separate row section. Finding the constant is usually needed to ensure that new variables' coefficients are merged in correctly, above the constant term, when multiple estimations are merged together. nofindcons is most likely to be useful when you don't want the "_cons" term to be last when using the keep option, or when merging with a non-outreg table that treats constants differently.
Notes about specific estimation commands
rocfit reports a t statistic for the null hypothesis that the slope is equal to 1. outreg reports the t statistic for the null hypothesis that the slope is equal to 0.
stcox and streg report hazard ratios by default, and the coefficients only if the nohr option is employed. outreg does the reverse. To show the hazard ratios in the outreg table, use the hr option.
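A hedged sketch of the hr option after stcox follows. It assumes the drugtr example dataset from the Stata manuals can be fetched with webuse (which needs internet access) and that it is already stset; "test" is a placeholder file name.

```stata
// Sketch: hazard ratios in the outreg table after stcox.
// Assumption: webuse drugtr loads an already-stset dataset.
webuse drugtr, clear
stcox drug age

// Without hr, outreg would show the unexponentiated coefficients
outreg using test, hr replace
```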
mim is a user-written command that makes multiple imputations (see also the Stata command mi). mim does not store the estimation results in the e(b) and e(V) matrices, so it is necessary to repost them to these matrices before outreg can access the mim results. This is accomplished with the following commands:
    . mat b = e(MIM_Q)
    . mat V = e(MIM_V)
    . ereturn post b V, depname(`e(MIM_depvar)') obs(`e(MIM_Nmin)') ///
         dof(`e(MIM_dfmin)')
After these commands, outreg can be used in the usual manner.
Examples
1. Basic usage and variable labels
2. Decimal places for coefficients and titles
3. Merging estimation tables together
4. Standard errors, no stars, and square brackets in a TeX file
5. 10% significance level and summary statistics
6. Display some but not all coefficients
7. Add statistics not in summstat
8. Multi-equation models
9. Marginal effects and star options
10. Multi-column ctitles; merge variable means to estimation results
11. Specifying fonts
12. Superscripts, italics, and Greek characters
13. Place additional tables in same document
14. Place footnotes among coefficients
15. Show statistics side-by-side, like Stata estimation results
16. Merge multiple estimations in a loop
Example 6. Display some but not all coefficients
The options keep and drop allow you to display some but not all coefficients in the estimation. keep also allows you to change the order in which the coefficient estimates are displayed. To keep or drop the constant term, include "_cons" in the list of coefficients.
The first example removes dummy variable coefficients and reorders the coefficients with keep(weight foreign):
    . tab rep78, gen(repair)
      (output omitted)
    . regress mpg foreign weight repair1-repair4
      (output omitted)
    . outreg using auto, keep(weight foreign) varlabels replace ///
         note(Coefficients for repair dummy variables not shown)

    --------------------------------
                     Mileage (mpg)
    --------------------------------
     Weight (lbs.)      -0.006
                       (9.16)**
     Car type           -2.923
                       (2.18)*
     R2                  0.69
     N                    69
    --------------------------------
          * p<0.05; ** p<0.01
     Coefficients for repair dummy variables not shown
The keep and drop options can use the wildcard characters *, ?, and ~. They can also use factor variable notation.
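For example, a wildcard in drop could remove all of the repair dummies from the regression above in one step. This is a sketch; "test" is a placeholder file name.

```stata
// Sketch: drop a group of coefficients with a wildcard.
sysuse auto, clear
tab rep78, gen(repair)
regress mpg foreign weight repair1-repair4

// repair* matches repair1 through repair4
outreg using test, drop(repair*) varlabels replace
```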
The second example uses keep to remove from the table the auxiliary parameters included in e(b) by Stata. The tobit command estimates a sigma parameter. The main coefficient estimates are included in the e(b) vector with the equation name "model" and the sigma parameter is given the equation name "sigma". When in doubt about which equation names are included in the e(b) vector after an estimation, you can view the matrix and its names with the matrix list e(b) command. outreg includes the sigma parameter and the equation names in the estimates table.
    . gen wgt = weight/100
    . label var wgt "Weight (lbs/100)"
    . tobit mpg wgt, ll(17)
      (output omitted)
    . outreg using auto, replace

    ---------------------------
     model   wgt      -0.687
                     (9.82)**
             _cons    41.499
                     (20.16)**
     sigma   _cons     3.846
                     (10.50)**
             N           74
    ---------------------------
        * p<0.05; ** p<0.01
To limit the table to the coefficient estimates alone, we can use the option keep(model:). The colon after "model" indicates that it is an equation name, not a coefficient name, and all estimates in the "model" equation are kept.
    . outreg using auto, keep(model:) varlabel replace

    -----------------------------------
                        Mileage (mpg)
    -----------------------------------
     Weight (lbs/100)      -0.687
                          (9.82)**
     Constant              41.499
                          (20.16)**
     N                       74
    -----------------------------------
            * p<0.05; ** p<0.01
Example 7. Add statistics not in summstat
There are many statistics, particularly test statistics, which we may want to report in estimation tables but are not available in the summstat option. The statistics available in summstat are limited to the e( ) scalar values that can be viewed after an estimation command with ereturn list.
The addrows option can add rows of text below the coefficient estimates and summary statistics. This example shows how to display the results of the test command as added rows of the outreg table.
Below we test whether the coefficient on the variable foreign is equal to the negative of the coefficient on goodrep with test foreign = -goodrep. The command test saves the F statistic in the return value r(F) and its p value in the return value r(p). If we include r(F) and r(p) in addrows directly, they are reported with seven or eight decimal places. To control the numerical formatting of the return values F and p, we use the local macro directive display. local F : display %5.2f `r(F)' takes the value in r(F) and puts it in the local macro "F" displayed with two decimal places and a width of 5. Similarly, the local macro "p" has three decimal places.
    . gen goodrep = rep78==5
    . reg mpg weight foreign goodrep
      (output omitted)
    . test foreign = -goodrep
      (output omitted)
    . local F : display %5.2f `r(F)'
    . local p : display %4.3f `r(p)'
We are now ready to add the test statistics to the outreg table. The addrows option below adds two rows, one for the F test and one for its p value, and two columns, one for the text in the left column and one for the test values. As usual, columns of text are separated with a comma, and rows of text are separated with the backslash.
    . outreg using auto, replace ///
         addrows("F test: foreign = -goodrep", "`F'" \ "p value", "`p'")

    -----------------------------------------
                                     mpg
    -----------------------------------------
     weight                         -0.006
                                  (10.40)**
     foreign                        -2.745
                                   (2.53)*
     goodrep                         3.613
                                   (2.98)**
     _cons                          40.733
                                  (19.59)**
     R2                              0.70
     N                                74
     F test: foreign = -goodrep      0.43
     p value                        0.515
    -----------------------------------------
               * p<0.05; ** p<0.01
If we wanted to report the F test statistics above the summary statistics (R2 and N), then we would need to use the option noautosumm to suppress the default summary statistics, and instead include them in the addrows option below the F test statistics. The values of R2 and N are available in the scalars e(r2) and e(N).
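One way to sketch that reordering is shown below. The local macro names r2 and N, the display formats, and the file name "test" are choices made for this illustration; the rest follows the steps of this example.

```stata
// Sketch: F test rows placed above R2 and N.
sysuse auto, clear
gen goodrep = rep78==5
regress mpg weight foreign goodrep
test foreign = -goodrep

// Format the test results and the summary statistics by hand
local F  : display %5.2f `r(F)'
local p  : display %4.3f `r(p)'
local r2 : display %4.2f `e(r2)'
local N = e(N)

// noautosumm suppresses the automatic R2 and N rows,
// which are then re-added below the F test rows
outreg using test, replace noautosumm ///
    addrows("F test: foreign = -goodrep", "`F'" \ "p value", "`p'" \ ///
            "R2", "`r2'" \ "N", "`N'")
```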
Example 8. Multi-equation models
outreg displays estimation results in a single column even for multi-equation models unless the user chooses the eq_merge option (for "equation merge"). When different equations in the estimation model share many of the same covariates, users may prefer to display the results like the merged results of separate estimations. eq_merge puts each equation in a separate column and displays any common variables in the same row. Using an example of seemingly unrelated regression estimation with three equations, each sharing two covariates, outreg organizes the table as shown below.
    . sureg (price foreign weight length) (mpg displ = foreign weight)
    (output omitted)
    . outreg using auto, varlabels eq_merge replace ///
          ctitles("", Price Equation, Mileage Equation, Engine Size Equation) ///
          summstat(r2_1, r2_2, r2_3 \ N, N, N) summtitle(R2 \ N)

    -------------------------------------------------------------------------
                     Price Equation   Mileage Equation   Engine Size Equation
    -------------------------------------------------------------------------
    Car type            3,575.260          -1.650              -25.613
                        (5.75)**           (1.57)              (2.05)*
    Weight (lbs.)         5.691            -0.007               0.097
                        (6.18)**          (10.56)**           (13.07)**
    Length (in.)         -88.271
                        (2.81)**
    Constant            4,506.212          41.680              -87.235
                         (1.26)           (19.65)**            (3.47)**
    R2                    0.55              0.66                0.81
    N                      74                74                  74
    -------------------------------------------------------------------------
    * p<0.05; ** p<0.01
Each of the equations in sureg has an R-squared statistic, so the summstat option places them below the coefficient estimates along with the number of observations. The summstat option has three columns and two rows.
Example 9. Marginal effects and star options
outreg can display marginal effects estimates calculated by the margins command instead of coefficient estimates. outreg can also display marginal effects calculated by the mfx and dprobit commands that were part of Stata 10 and earlier. Displaying marginal effects requires that the user run margins, dydx(*) or a similar command after the estimation in question before using outreg.
The simplest way to substitute marginal effects for coefficient estimates is with the marginal option. This substitutes the statistic b_dfdx for b and t_abs_dfdx for t_abs (or se_dfdx for se if the option se is in effect). The asterisks for significance now refer to the marginal effects rather than the underlying coefficients.
    . logit foreign wgt mpg
    (output omitted)
    . margins, dydx(*)
    (output omitted)
    . outreg using auto, marginal replace

    -----------------
            foreign
    -----------------
    wgt     -0.046
           (8.01)**
    mpg     -0.020
           (2.03)*
    N         74
    -----------------
    * p<0.05; ** p<0.01
Marginal effects can also be combined with regression coefficients or other statistics in the outreg table. Below, the table displays each coefficient estimate with the marginal effect below it, and the 95% confidence interval of the marginal effect below that, because of the stats(b b_dfdx ci_dfdx) option. Note that the statistics b_dfdx and ci_dfdx refer to whichever marginal effects were specified in the margins command. This could be from the dydx(), eydx(), dyex(), or eyex() option.
The margstar option specifies that the asterisks refer to the significance of the hypothesis that the marginal effects are zero, rather than the coefficients being zero. The starloc(3) option places the asterisks next to the third statistic (the marginal effect confidence intervals) instead of the default, next to the second statistic.
    . outreg using auto, stat(b b_dfdx ci_dfdx) replace ///
          title("Marginal Effects & Confidence Intervals" \ ///
          "Below Coefficients") margstar starloc(3)

    Marginal Effects & Confidence Intervals
              Below Coefficients
    ------------------------------
                   foreign
    ------------------------------
    wgt            -0.391
                  (-0.046)
             [-0.057 - -0.035]**
    mpg            -0.169
                  (-0.020)
             [-0.039 - 0.001]
    _cons          13.708
    N                74
    ------------------------------
    * p<0.05; ** p<0.01
Example 10. Multi-column ctitles; merge variable means with estimation results
The summary statistics for the variables used in estimations, usually their means and standard deviations, are commonly reported in empirical papers. This example shows how to merge variable means onto an estimation table.
First we create an outreg table which merges two simple regressions as was done in Example 3. The nodisplay option suppresses display of the outreg tables we are creating, which normally appears in the Stata results window. The ctitles have been specified to have two rows, with a supertitle on the first two columns of "Regressions".
Notice that the two outreg commands below do not include a using statement, so the results are not written to Word files. Writing them is unnecessary because we will merge more estimation results below and don't need to save the intermediate files. The contents of the table are held in Stata's memory in the meantime.
    . reg mpg foreign weight
    (output omitted)
    . outreg, bdec(2 5 2) varlabels nodisplay ///
          ctitles("", "Regressions" \ "", "Base case")
    . reg mpg foreign weight weightsq
    (output omitted)
    . outreg, bdec(2 5 2) bfmt(f f e f) varlabels merge ///
          ctitles("", "" \ "", "Quadratic mpg") nodisplay
Then we run the mean command, which calculates variable means and their standard errors. mean is an estimation command, so it stores its results in e(b) and e(V) and they can be displayed and merged using outreg. We merge the variable means to the outreg table already created above. The ctitles in this outreg command have two rows, aligning them with the previous ctitles. The multicol(1,2,2) option causes the cell in the first row, second column, to span two cells horizontally so that the title "Regressions" is centered over both the "Base case" and "Quadratic mpg" columns. The effect of the multicol option cannot be seen in the Stata results window (shown below), but does appear in the Word or TeX document created by outreg. Note that the multicol option must be used in the third and last outreg command, because it is a formatting characteristic that is not retained from an earlier outreg table that is merged with a new one.
    . mean mpg foreign weight
    (output omitted)
    . outreg using auto, bdec(1 3 0) nostar merge replace ///
          ctitles("", "Means &" \ "", "Std Errors") multicol(1,2,2)

    ----------------------------------------------------
                      Regressions             Means &
                Base case   Quadratic mpg   Std Errors
    ----------------------------------------------------
    foreign       -1.65         -2.20          0.297
                  (1.53)       (2.08)*        (0.053)
    weight      -0.00659      -0.01657         3,019
                (10.34)**     (4.18)**          (90)
    weightsq                  1.59e-06
                              (2.55)*
    mpg                                         21.3
                                               (0.7)
    _cons         41.68         56.54
                (19.25)**     (9.12)**
    R2             0.66          0.69
    N               74            74             74
    ----------------------------------------------------
    * p<0.05; ** p<0.01
We could embellish the "Regressions" supertitle by underlining it. In Word files, this is accomplished with the formatting code "{\ul Regressions}". If we want the underline to span more widely than the word "Regressions", one approach is to place tab characters before and after the word. Spaces do not do the job, because Word does not underline spaces. To place one tab character on either side of the supertitle, we would use "{\ul\tab Regressions\tab}" in the ctitles option. Another option is to use underscore characters, although the line they create is offset slightly below the underlining. See Inline formatting for more information about underlining and other within-string formatting issues.
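The underlined supertitle could be produced along these lines. This is a sketch rather than a tested recipe: it assumes the auto dataset and that weightsq is the square of weight (the variable is used above but its construction is not shown here). The underline appears only in the Word file, not in the Stata results window.

```stata
* Sketch: underline the "Regressions" supertitle in the Word output.
* Assumption: weightsq is weight squared.
sysuse auto, clear
gen weightsq = weight^2

quietly reg mpg foreign weight
* {\ul ...} underlines; \tab on each side widens the underline
outreg, bdec(2 5 2) varlabels nodisplay ///
    ctitles("", "{\ul\tab Regressions\tab}" \ "", "Base case")

quietly reg mpg foreign weight weightsq
* multicol goes in the last outreg command, since it is not
* retained from earlier merged tables
outreg using auto, bdec(2 5 2) bfmt(f f e f) varlabels merge replace ///
    ctitles("", "" \ "", "Quadratic mpg") multicol(1,2,2)
```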
The mean command calculates the variable means and their standard errors. More typically, summary statistic tables report the variable means and their standard deviations (which differ from the standard errors of the mean by a factor of the square root of N). To report the standard deviations of the variables, I use the as yet unreleased command outstat which, since it is also based on the underlying formatting engine frmttable, can be appended to an outreg table:
    . reg mpg foreign weight
    (output omitted)
    . outreg
    (output omitted)
    . outstat mpg foreign weight using auto, merge replace ///
          title(Merge summary statistics with regression results) ///
          sdec(2\2\4\4\0\0) varlabels basefont(fs10)
    (note: tables being merged have different numbers of row sections)

    --------------------------------
                  mpg       Means
    --------------------------------
    foreign     -1.650     0.2973
                (1.53)    (0.4602)
    weight      -0.007      3,019
              (10.34)**     (777)
    mpg                     21.30
                           (5.79)
    _cons       41.680
              (19.25)**
    R2            0.66
    N              74
    --------------------------------
    * p<0.05; ** p<0.01
The warning message "tables being merged have different numbers of row sections" is displayed because the differing structure of the outreg table and the outstat table means that the merge process may not align rows the way the user intended, but in this case there is no problem.
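The relationship noted above between the standard error of a mean and the standard deviation (a factor of the square root of N) can be checked directly. The sketch below assumes the auto dataset; the scalar names se_mean and sd_implied are illustrative.

```stata
* Sketch: recover the standard deviation from the standard error of the mean.
sysuse auto, clear
quietly mean mpg

* e(V) holds the variance of the estimated mean
matrix V = e(V)
scalar se_mean    = sqrt(V[1,1])
scalar sd_implied = se_mean * sqrt(e(N))

* Compare with the standard deviation reported by summarize
quietly summarize mpg
display "SE of mean = " %6.4f se_mean ///
        "   implied SD = " %6.4f sd_implied ///
        "   summarize SD = " %6.4f r(sd)
```

The implied and reported standard deviations should agree up to rounding, which is why a table of means and standard errors can be converted by hand when a standard deviation table is wanted.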
Example 11. Specifying fonts
One of the objectives of this version of outreg is to have as complete control of the layout and appearance of estimates tables as possible. An important element of this is fine control of fonts. outreg now enables users to specify fonts down to the table cell level, although this is needed only rarely. Users can specify font sizes, font types (such as Times Roman or Arial), and font styles (such as bold or italic). For Word files, users can apply any font type installed on their computers by adding the font name in the addfont option.
This example prepares a table for presentation as an overhead slide, with special fonts that are displayed much larger than usual. Two specialized fonts are added to the document with the addfont(Futura, Didot Bold) option. These fonts can then be applied to different parts of the table as "fnew1" for the first added font, or "fnew2" for the second added font. We set the default font of the table to Futura ("fnew1") in the basefont(fs32 fnew1) option. This basefont option also sets the font size to 32 points to make the table fill the whole overhead slide. The title is assigned the second added font, Didot Bold, with a 40 point size in titlfont(fs40 fnew2). The statistics in the table are displayed in the Arial font for readability with the statfont(arial) option. (Times Roman, Arial, and Courier fonts are predefined in Word and TeX documents and don't need to be added.) The basefont font characteristics apply to all parts of the table unless otherwise specified, so the Arial font in statfont has a point size of 32.
Font specifications do not change the appearance of the table displayed in the Stata results window (only in the Word document written to auto.doc), so the output is omitted.
    . reg mpg foreign weight
    (output omitted)
    . outreg using auto, addfont(Futura, Didot Bold) ///
          basefont(fs32 fnew1) titlfont(fs40 fnew2) statfont(arial) ///
          title(New Fonts for Overhead Slides) varlabels replace
    (output omitted)
Example 12. Superscripts, italics, and Greek characters
This example uses some of the methods of inline formatting explained above to apply superscripts, italic text, and Greek characters. It is helpful to review those methods to understand the codes used here.
This example is similar to Example 7 in that the results of a test of coefficient equality are displayed in the estimation table. However, since the estimation is nonlinear, the test statistic is a chi-squared rather than an F statistic. We will write the chi-squared with the Greek character chi and a superscripted "2" in the Word table generated by outreg. A different set of codes can produce the same formatting in TeX files, as discussed in Inline formatting.
The Word code for the Unicode representation of the Greek lower-case letter chi is "\u0966?" (see all Word Greek letter codes here). The code for chi needs to be placed in quotes in the addrows option because otherwise the backslash would be interpreted as a row divider. The superscripted 2 is encoded as "{\super 2}". Note the space between the formatting code ("\super") and the regular text ("2"). Without it, Word would try to interpret the code "\super2", which doesn't exist. Finally, we italicize the "p" in p value like this: "{\i p}". The full addrows option becomes addrows("\u0966?{\super 2} test", "`chi2'" \ "{\i p} value", "`p'"). As in Example 7, `chi2' and `p' are the values of local macros containing the numerically formatted values of the chi-squared statistic and its p value.
The note option in the outreg command below has a couple of tricks in it. The first is a blank row ("") to separate the note text from the legend for asterisks above it. We also add Stata system macro values for the current time, date, and dataset file name from predefined Stata macros $S_TIME, $S_DATE, and $S_FN, respectively.
    . logit foreign wgt mpg
    (output omitted)
    . test wgt = mpg
    (output omitted)
    . local chi2 : display %5.2f `r(chi2)'
    . local p : display %4.3f `r(p)'
    . outreg using auto, replace ///
          addrows("\u0966?{\super 2} test", "`chi2'" \ "{\i p} value", "`p'") ///
          note("" \ "Run at $S_TIME, $S_DATE" \ "Using data from $S_FN")

    ------------------------------------------------
                                          foreign
    ------------------------------------------------
    wgt                                   -0.391
                                         (3.86)**
    mpg                                   -0.169
                                          (1.83)
    _cons                                 13.708
                                         (3.03)**
    N                                       74
    \u0966?{\super 2}(1) test: wgt=mpg     10.84
    {\i p} value                           0.001
    ------------------------------------------------
    * p<0.05; ** p<0.01

    Run at 16:51:44, 27 Aug 2010
    Using data from /Applications/Stata/ado/base/a/auto.dta
Example 13. Place additional tables in same document
One of the goals for outreg is to create whole documents, such as statistical appendices, from a Stata .do file. To do this, one must be able to write multiple tables to the same document, which is possible with the addtable option.
Below, the mean command creates summary statistics for the variables. outreg with the addtable option places the summary statistics table below the table just created in Example 12 in the Word file auto.doc. The option nostars turns off asterisks for significance tests, and nosubstat puts the standard errors side-by-side with the means, as explained in Example 15 below.

    . mean foreign wgt mpg

    Mean estimation                     Number of obs    =      74

    --------------------------------------------------------------
                 |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
         foreign |   .2972973   .0534958      .1906803    .4039143
             wgt |   30.19459   .9034692      28.39398    31.99521
             mpg |    21.2973   .6725511       19.9569    22.63769
    --------------------------------------------------------------
    . outreg using auto, addtable ctitle(Variables, Means, Std Errors) ///
          nostars nosubstat title("Summary Statistics") basefont(fs6)

           Summary Statistics
    ---------------------------------
    Variables    Means    Std Errors
    ---------------------------------
    foreign      0.297      0.053
    wgt         30.195      0.903
    mpg         21.297      0.673
    ---------------------------------
The user can add paragraphs of regular text before and after each table with the pretext and posttext options.
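A sketch of pretext and posttext in use follows. It assumes the auto dataset and that wgt is weight divided by 100, which is consistent with the means shown above but is not stated in this section; the paragraph wording is invented for illustration. The sketch uses replace rather than addtable so that it runs on its own.

```stata
* Sketch: wrap a summary statistics table in paragraphs of text.
* Assumption: wgt is weight/100; the paragraph text is illustrative.
sysuse auto, clear
gen wgt = weight/100
quietly mean foreign wgt mpg

outreg using auto, replace ctitle(Variables, Means, Std Errors) ///
    nostars nosubstat title("Summary Statistics") ///
    pretext("The table below summarizes the variables used in the estimations.") ///
    posttext("Standard errors of the means appear in the last column.")
```

To add the table and its surrounding text to an existing document instead, addtable would take the place of replace, as in the example above.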
Example 14. Place footnotes among coefficients
Placing footnotes in any of the text elements of an outreg table, such as title, ctitles, rtitles, or note, is straightforward. You can place a footnote number in the text, using a superscript as in Example 12 if you want, and place the footnote text in the note or posttext.
Placing a footnote in the body of the outreg table is not as straightforward as in the text elements, because the table body is made up of numeric statistics. For this, we use the annotate option. First we create a Stata matrix with the footnote locations used by annotate, and put the footnote symbols in the text string of asymbol. It is helpful to review the entry for the annotate option for details.
Below, we place superscripted footnotes in a regression table. The first footnote is added to the label of the variable foreign, which is used by outreg because of the varlabels option. The next two footnotes are placed among the regression statistics. For this we create a Stata matrix with the matrix annotmat = J(3,2,0) command. This creates a 3 by 2 matrix of zeros. The matrix should have the dimension of the number of coefficients (3, including the constant) by the number of statistics (by default, 2: b and t_abs). All elements of the matrix annotmat which are zero are ignored. The locations with a "1" have the first asymbol appended, "2" have the second asymbol, etc. Since we want to place a footnote next to the first t statistic, we place a 1 at position (1,2) of annotmat for the first coefficient, second statistic of the table. We place another footnote next to the third coefficient estimate, so we place a 2 at position (3,1) of annotmat. The 1 and 2 in annotmat correspond to the first and second strings in asymbol, which are "{\super 2}" and "{\super 3}" since these should be footnotes number 2 and 3.
The final footnote, 4, is placed in the text labeling the summary statistic, N, using summtitle("{\i N}{\super 4}"), which gives us an italicized N and a superscripted 4.
It is not possible to position a footnote next to the summary statistic in summstat. To accomplish this, it is necessary to turn off the automatic summary statistics with noautosumm (which summstat does by default), and place the statistic and the footnote symbol in addrows, which was described in Example 7 and Example 12.
The footnote text is added below the table in the note option, with superscripts for the footnote numbers.
    . reg mpg foreign weight
    . label var foreign "Car Type{\super 1}"
    . matrix annotmat = J(3,2,0)
    . matrix annotmat[1,2] = 1
    . matrix annotmat[3,1] = 2
    . outreg using auto, varlabels replace colwidth(10 10) ///
          annotate(annotmat) asymbol("{\super 2}","{\super 3}") ///
          basefont(fs10) summstat(N) summtitle("{\i N}{\super 4}") ///
          note("{\super 1}First footnote." \ ///
          "{\super 2}Second footnote." \ ///
          "{\super 3}Third footnote." \ ///
          "{\super 4}Fourth footnote.")

    ----------------------------------------
                           Mileage (mpg)
    ----------------------------------------
    Car Type{\super 1}        -1.650
                         (1.53){\super 2}
    Weight (lbs.)             -0.007
                            (10.34)**
    Constant             41.680{\super 3}
                            (19.25)**
    {\i N}{\super 4}            74
    ----------------------------------------
    * p<0.05; ** p<0.01
    {\super 1}First footnote.
    {\super 2}Second footnote.
    {\super 3}Third footnote.
    {\super 4}Fourth footnote.
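The workaround described above for attaching a footnote to the summary statistic itself, rather than to its label, might be sketched as follows. It assumes the auto dataset; the local macro N is an illustrative name.

```stata
* Sketch: footnote symbol next to the value of N instead of its label.
sysuse auto, clear
quietly reg mpg foreign weight
local N = e(N)

* noautosumm removes the automatic R2 and N rows; addrows then
* supplies N as text, so a superscript can follow the number
outreg using auto, varlabels replace noautosumm ///
    addrows("{\i N}", "`N'{\super 4}") ///
    note("{\super 4}Fourth footnote.")
```

Since addrows takes text, the number is fixed when the macro is expanded, so the macro must be filled in after the estimation and before outreg is called.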
Example 15. Show statistics side-by-side, like Stata estimation results
To show statistics side-by-side, such as t statistics next to the coefficients rather than below them, use the nosubstat option. The following example creates a table similar to Stata's display of regression results, reporting six statistics using the stats option. Asterisks for significance have been turned off with the nostars option.
    . outreg using auto, nosubstat stats(b se t p ci_l ci_u) nostar ///
          ctitles("mpg", "Coef.", "Std. Err.", "t", "P>|t|", "[95% Conf.", ///
          "Interval]") bdec(7) replace ///
          title("Horizontal Output like Stata's -estimates post-")

             Horizontal Output like Stata's -estimates post-
    -------------------------------------------------------------------------
    mpg         Coef.      Std. Err.     t      P>|t|   [95% Conf.  Interval]
    -------------------------------------------------------------------------
    foreign   -1.6500291   1.0759941    -1.53   0.13   -3.7955004   0.4954422
    weight    -0.0065879   0.0006371   -10.34   0.00   -0.0078583  -0.0053175
    _cons     41.6797023   2.1655472    19.25   0.00   37.3617239  45.9976808
    -------------------------------------------------------------------------
Example 16. Merge multiple estimation results in a loop
If you want to run the same estimation on different datasets or on different groups within a dataset, it is often efficient to create a loop using the forvalues or foreach commands. This example shows how to merge the results of each estimation in the loop into a single outreg table, and secondly, how to merge sequential estimations in a loop into two separate tables.
Say we want to run separate regressions by groups which are indexed by the categorical variable rep78 in the auto.dta dataset. We use the forvalues command to create a loop that steps through the values of rep78 from 2 to 5. For each value of rep78, we run a regression of the variable mpg on covariates, restricting the sample to the current value of rep78 with the statement if rep78==`r'. r is a local macro containing the current value of the loop indicator.
Following each regression, the outreg, merge command merges successive regression results into a single table. The first time that outreg, merge is executed after the first regression, we actually don't want it to merge with anything. The merge option allows merging without an existing table precisely to enable its use in loops, although outreg does produce the warning message below, that no existing outreg table was found.
To ensure that no preexisting table is merged with the first regression coefficients by the first outreg, merge command in the loop, we precede the forvalues loop with an outreg, clear command. The clear option removes any outreg table in memory, since outreg tables persist until cleared or replaced by a new table. Even if no previous outreg command has been run, if the commands in this example are rerun, the outreg, clear command is necessary to clear out the previous version of the table.
    . outreg, clear
    . forvalues r = 2/5 {
          quietly reg mpg price weight if rep78==`r'
          outreg, merge varlabels ctitle("", "`r'") nodisplay
      }
    warning: no existing table found for merge or append
The outreg command in the loop does not need any using statement because we don't need to save the table as a Word document (or TeX document) until we have merged all the regressions together. Once we have, and the loop is complete, we save the table as a Word document with the outreg using auto, replay command.
. outreg using auto, replay replace title(Regressions by Repair Record)
                    Regressions by Repair Record
    ------------------------------------------------------------
                        2           3           4           5
    ------------------------------------------------------------
    Price            -0.000       0.000       0.000       0.001
                     (0.61)      (0.07)      (0.71)      (0.98)
    Weight (lbs.)    -0.008      -0.004      -0.005      -0.025
                    (5.40)**    (4.74)**    (8.47)**    (3.10)*
    Constant         44.953      34.052      34.918      78.648
                   (10.91)**   (14.40)**   (15.96)**    (6.17)**
    R2                0.92        0.64        0.84        0.76
    N                   8          30          18          11
    ------------------------------------------------------------
The replay option tells outreg to use the existing outreg table in memory instead of creating a new one. If we had left out the replay option, we would have created a new table from the existing e(b) matrix, which holds just the results of the last regression in the loop, so the replay option is important. With the replay option, it is possible to make text additions (except for varlabels) such as new titles or even addrows, but it is not possible to change the numerical contents or numerical formatting of the statistics in the table (options for estimate selection, estimates formatting, star options, brackets options, and summary statistics will be ignored). When using the replay option, it is possible to specify all the text formatting options such as fonts, lines, and spacing, and the relevant file options such as replace or tex.
Since the outreg command in the loop above used the merge option, no legend was created at the bottom of the table for the asterisks. This can be rectified with the option note(* p<0.05; ** p<0.01) in the outreg, replay command.
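Putting the loop and the legend fix together gives something like the sketch below, which assumes the auto dataset. The note text is quoted here, following the other note examples in this document.

```stata
* Sketch: merge regressions in a loop, then replay with a star legend.
sysuse auto, clear

* Remove any outreg table left in memory from earlier runs
outreg, clear

forvalues r = 2/5 {
    quietly reg mpg price weight if rep78==`r'
    outreg, merge varlabels ctitle("", "`r'") nodisplay
}

* replay writes the merged table in memory; note() restores the legend
outreg using auto, replay replace title(Regressions by Repair Record) ///
    note("* p<0.05; ** p<0.01")
```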
There are some contexts in which it is helpful to merge the estimation results in a loop into two separate outreg tables, such as when, for each iteration of the loop, the results of the first estimation are used in the second estimation and we want to record the results of both. In this example, we run instrumental variables estimation in a loop, and record both the first and second stage regressions. In order to merge the regression results into two separate tables, we need to give the tables separate names. Each time the merge option is used, it will refer to either the "first" table (for the first stage regression results) or the "iv" table (for the second stage results). These table-specific merge options become merge(first) and merge(iv).
As before, we precede the forvalues loop with outreg, clear to clear out any outreg table in memory, but in this case we need to refer to the named tables, so we have two commands, outreg, clear(first) and outreg, clear(iv). The built-in Stata command for instrumental variables estimation, ivregress, does not have the capability of saving the first stage results (although they can be displayed). Instead we use the excellent user-written command ivreg2, which saves the first stage results with the savefirst option. The ivreg2 command is preceded by the quietly command to suppress the display of its output. We then add the instrumental variables estimates to the "iv" table with the outreg, merge(iv) command. The estimates restore _ivreg2_hsngval command puts the first stage estimates into e(b) and e(V), so the second outreg command, outreg, merge(first), saves the first stage regression results in the "first" table.
. webuse hsng2, clear (1980 Census housing data)
    . outreg, clear(iv)
    . outreg, clear(first)
    . forvalues r = 1/4 {
          quietly ivreg2 rent pcturban (hsngval = faminc) if reg`r', savefirst
          outreg, merge(iv) varlabels ctitle("","Region `r'") nodisplay
          quietly estimates restore _ivreg2_hsngval
          outreg, merge(first) varlabels ctitle("","Region `r'") nodisplay
      }
    warning: no existing table found for merge or append
    warning: no existing table found for merge or append
We now save the two tables with two outreg, replay commands. To replay the table of first stage estimates, we use the replay(first) option, and the second stage estimates with the replay(iv) option. By using the addtable option in the second outreg, replay command (and using the same file name) we combine both tables into the file iv.doc.
    . outreg using iv, replay(first) replace ///
          title(First Stage Regressions by Region)
    (output omitted)
    . outreg using iv, replay(iv) addtable ///
          title(Instrumental Variables Regression by Region)
    (output omitted)
Author
John Luke Gallup, Portland State University, USA jlgallup@pdx.edu
Also see
basic outreg