Title
xml_tab -- Save results in XML format
Syntax
xml_tab [namelist] [, options]
where namelist is a list of stored estimations or matrices; see estimates. A namelist comprises one or more specifications, separated by spaces. A specification can be a name of a stored estimation or a matrix name. xml_tab will output the estimation coefficients and one of the three statistics (standard errors, t-ratio, or p-values).
For estimation results, xml_tab has enough information to calculate significance levels itself. If a matrix is to be outputted, xml_tab also looks for a matrix named matname_STARS, which must be of the same size as matname and contain the values 0, 1, 2, or 3 denoting significance levels. See option stars for more details.
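A minimal sketch of the matrix case described above, assuming xml_tab is installed. The matrix name A, its values, and its row and column names are hypothetical and serve only to illustrate the matname/matname_STARS pairing:

```stata
* Hypothetical 2x2 matrix of results (names and values are illustrative)
matrix A = (1.25, 0.40 \ -0.30, 0.55)
matrix rownames A = x1 x2
matrix colnames A = coef se

* Companion matrix of the same size; entries 0-3 denote significance levels
matrix A_STARS = (3, 0 \ 0, 0)

* Write A to the default file stata_out.xml, overwriting it if present
xml_tab A, replace
```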
The stored estimation could be also specified in an extended form with parameters.
xml_tab [estname1(stat11 stat12, eform_option) [estname2(stat21 stat22) [...]]] [, options]
where estname is the name of a stored estimation, and stat1 and stat2 are the names of matrixes stored in e(). Specification estname(b V) is identical to estname. You can list the names of the stored matrixes with ereturn list. See examples.
The extended form specification is useful when accessing non-standard statistics, for example, when outputting marginal effects of the parameters after e.g., mfx, dprobit. The extended form could also be used when outputting the results in exponentiated forms after commands such as streg, stcox, st, etc.
Note that you may enclose filename in double quotes and must do so if filename contains blanks or other special characters.
options                         Description
-------------------------------------------------------------------------
Output
  save(["]filename["])          name and path for the output file
  replace                       overwrite existing filename
  append                        if workbook filename exists, add a new sheet, otherwise create a new workbook
  sheet(name [, sh_opts])       worksheet where the table is outputted
    color(#)                    specify tab color for a worksheet
    nogridlines                 hide gridlines on a worksheet
  savemat(name [, sm_opts])     save estimates to a matrix
    replace                     if matrix name already exists replace it. The default is to append
    exit                        after writing the matrix exit xml_tab without creating an output file
  mv(mvspec)                    change missing values to string or numeric values

Statistics
  sd                            show estimated coefficients and standard deviations (default)
  tstat                         show estimated coefficients and t-statistic
  pvalue                        show estimated coefficients and p-value
  stats(scalarlist)             report scalarlist statistics in the table
  stars(starspec)               controls significance levels and symbols
  noadjust                      report not adjusted t-statistics
  eform_option                  display exponentiated coefficients; see eform_option

Table Layout
  below                         show standard deviation (t-statistics or p-values) under the estimates
  nobrackets                    remove brackets around standard deviation (t-statistics or p-values)
  right                         show standard deviation (t-statistics or p-values) next to the estimates (default)
  long                          long output table style
  wide                          wide output table style (default)
  keep(keeplist)                report keeplist rows
  drop(droplist)                drop droplist rows from the table
  equations(matchlist)          match the equations of the models according to matchlist

Table Formatting
  format(string)                description of the output table format
  lines(string)                 rows to be underlined
  nolabel                       display variable names instead of labels
  constant(string)              specifies label for the constant
  rblanks(string)               add rows to the table
  cblanks(cblist)               add blank columns to the table
  cwidth(cwlist)                modify column widths
  tblanks(#)                    add blank rows at the top of the table
  title(string)                 table title
  rnames(string)                custom row names
  cnames(string)                custom column names
  ceq(string)                   custom column equation names (super-titles)
  notes(string)                 add notes to the end of the table
  font(string)                  specifies font for a worksheet
  style(stylename)              predefined format styles for output

System options
  excelpath(["]filename["])     specifies the location of the MS Excel executable
  calcpath(["]filename["])      specifies the location of the OO Calc executable
  noisily                       displays the complete list of options applied to the table
  updateopts                    forces the options file update
-------------------------------------------------------------------------
Description
xml_tab saves Stata output directly into an XML file that can be opened with Microsoft Excel or OpenOffice Calc. The program is relatively flexible and produces print-ready tables in Excel or Calc. xml_tab allows users to apply different formats to the elements of the output table and to do, from within Stata, essentially everything MS Excel or OO Calc can do in terms of formatting.
xml_tab can create formatted tables of coefficients, standard errors, t- and p-values, summary statistics, etc., after any Stata estimation command that saves its results in e(). The program allows saving the results of a single estimation or a matrix into a table, combining several stored estimations or matrixes into one table, and outputting several tables into different sheets of an XML workbook.
Calling xml_tab without any arguments saves the results of the last estimation command into the file stata_out.xml located in the current working directory. After the XML file is saved, xml_tab can create links in the Stata Results window. Clicking on a link opens the output table in MS Excel or OO Calc. See system options for details.
xml_tab can combine multiple estimations saved using estimates store command. The example of outputting multiple estimates into one table might look as follows:
.sysuse auto
.regress price rep78 length mpg
.estimates store reg1
.regress price rep78 length mpg turn if foreign==1
.estimates store reg2
.xml_tab reg1 reg2
In this very simple example, the estimates from the first regression are saved under the name reg1. The specification of the second regression contains one extra variable, and the estimation sample is restricted. The estimates from the second regression are saved under the name reg2. xml_tab creates an output table containing five columns. The first column contains the names of the variables used in either regression. The second and fourth columns contain the estimated coefficients of the first and second regressions, and the standard errors of the coefficients are outputted in columns 3 and 5.
xml_tab has many additional options to control the outputted statistics, formatting and a general layout of the output tables. Please read the description of these options below and have a look at our examples.
Options
+--------+ ----+ Output +-----------------------------------------------------------
save(["]filename["]) specifies a name for XML file where tables are outputted. If save(["]filename["]) is omitted, the output will be saved in stata_out.xml located in the current working directory. Use append and replace options to instruct xml_tab to append a table into the new worksheet of the existing file or to replace the existing file.
Alternatively, the output file name can be specified with the using syntax. Instead of save() users can write:
.xml_tab using c:\tmp\xml_out.xml, *
replace permits overwriting the existing XML file.
append adds a new sheet with the output table to the XML file if that file already exists. Otherwise, the new file will contain one sheet with the output.
NOTE: You need to close the XML file if it is open in Excel or Calc before xml_tab can save new tables there. If the XML file is still open, xml_tab reports an error message: file can not be saved at this location.
sheet(name) You can output several tables into the different sheets of XML file (workbook). Excel or Calc files could contain multiple worksheets within a single document (workbook). sheet() option specifies the name for the new sheet where the table will be outputted. If not specified, a worksheet named Sheet1 will be added.
A valid Excel worksheet name must have no more than 31 characters. The worksheet name cannot contain any of the following: : \ / ? * [ or ] and can not be left blank.
savemat(name) saves estimation results in a matrix. If name exists and option replace was not used then additional rows will be appended to that matrix. In this case number of columns in the existing matrix and the one to be appended must be the same. The output matrix will contain estimates and standard errors (t-statistics, p-values) in a form determined by table layout options. Significance level info will be saved/appended to name_STARS matrix, see exporting a matrix example.
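As an illustration of savemat() together with its replace and exit suboptions from the options table, here is a small sketch. It assumes xml_tab is installed; the matrix name R is hypothetical:

```stata
sysuse auto, clear
regress price mpg headroom trunk
estimates store reg1

* Save the table to matrix R (replacing R if it exists) and exit
* without creating an XML output file
xml_tab reg1, savemat(R, replace exit)

* Inspect the saved estimates and the companion significance matrix
matrix list R
matrix list R_STARS
```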
mv(#|mvc=# [\ mvc=#...] [\ else=#]) specifies the new values (string or numeric) to which the missing values are to be changed.
mv(str) specifies that all types of missing values be changed to str.
mv(mvc=str) specifies that occurrences of missing-value code mvc be changed to str. Multiple transformation rules may be specified, separated by a backward slash \. The list may be terminated by the special rule else=str, specifying that all types of missing values not yet transformed be set to str.
For example: mv(""), mv(.n="N.A." \ .d="(dropped)" \ else="")
+------------+ ----+ Statistics +-------------------------------------------------------
sd, tstat and pvalue control which statistic will be outputted together with the estimated coefficient. If option sd is specified, the standard errors of the estimated coefficients are outputted. Specifying tstat produces a table of coefficients and t-statistics. pvalue outputs the p-values for the test of the hypothesis that the true value of the coefficient is zero.
Only one of these options can be specified. If none is specified, sd is assumed and the standard errors are outputted.
stats(scalarlist) specifies one or more statistics (scalars) to be displayed in the table. The statistics specified in stats are outputted at the end of the table and could be the number of observations, stats(N); the adjusted R2 for a regression, stats(r2_a); the value of the log-likelihood, stats(ll); or any statistic that is saved in e() scalars after estimation routines. See ereturn.
In addition scalarlist may contain the following statistics:
aic Akaike's information criterion bic Schwarz's information criterion rank rank of e(V) - number of free parameters in model
If several estimations are combined in one table the specified statistics will be displayed for each estimation.
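The following sketch, assuming xml_tab is installed, combines e() scalars with the additional statistics listed above in one stats() call. The particular choice of statistics is illustrative:

```stata
sysuse auto, clear
regress price mpg headroom trunk
estimates store reg1
regress price mpg headroom trunk turn if foreign==1
estimates store reg2

* Report N, adjusted R2, AIC and BIC under each estimation's columns
xml_tab reg1 reg2, stats(N r2_a aic bic) replace
```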
stars(starspec) specifies the significance level(s) and the symbol(s) for the coefficients to be marked in the output table. The simple syntax for starspec is a numlist of up to three numbers separated by spaces, corresponding to the significance levels denoted by one, two and three stars. If the numlist is just one number, the coefficients with p-values less than that number will be marked by a star. For example, option stars(0.1) will mark with one star all the coefficients with p-values less than 0.1. If two values are specified in the numlist, xml_tab will mark with one star the coefficients with p-values that are less than or equal to the first value and greater than the second value. Coefficients with p-values less than the second value will be marked by two stars. For example, specifying stars(0.1 0.05) will put one star on coefficients with p-values in the range from 0.1 to 0.05 and two stars on the coefficients with p-values less than 0.05.
The extended syntax allows selecting the symbols used to mark the significance levels. In this case, starspec consists of up to three symbol/number pairs. The symbol could be one or several characters, and the number indicates the significance level as in the simple syntax described above.
By default, xml_tab outputs tables with the following star specification: stars(0.1 0.05 0.01). Alternatively, this option could be specified in the extended syntax as stars(* 0.1 ** 0.05 *** 0.01). Note that any symbol or combination of several symbols could be used instead of * in this specification. Specifying 0 in the numlist suppresses stars in the table: stars(0).
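The three forms of starspec described above can be tried side by side. This is only a sketch, assuming xml_tab is installed; each call overwrites the default file stata_out.xml:

```stata
sysuse auto, clear
regress price mpg headroom trunk
estimates store reg1

* Simple syntax: one star below 0.1, two stars below 0.05
xml_tab reg1, stars(0.1 0.05) replace

* Extended syntax: explicit symbol/level pairs (here the default markers)
xml_tab reg1, stars(* 0.1 ** 0.05 *** 0.01) replace

* Suppress significance markers altogether
xml_tab reg1, stars(0) replace
```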
noadjust is used when outputting some transformations of the estimated coefficients (i.e. marginal effects). By default, xml_tab calculates t-statistics as f(b)/sd_f(b). But if noadjust is specified then t-stat=b/sd_b will be reported and used for p-values and significance calculations.
eform_option when this option is specified the coefficients and the standard errors (t-statistics and p-values) for all estimation results will be reported in exponentiated form. To transform only some of the estimation results, the extended syntax can be used. For example,
.xml_tab reg streg(, hr) prob, replace
outputs unmodified coefficients and standard errors for estimations reg and prob, but exponentiated (hazard ratio) coefficients and modified standard errors for the estimation streg (displaying 'Haz Ratio' as a title).
+--------------+ ----+ Table Layout +-----------------------------------------------------
below and right control whether the standard deviation, t-statistic or p-values (see sd tstat pvalue) will be placed to the right or below the coefficients. right is used by default.
For example: if right is specified the element of the output table will look like:
0.123 0.456
if below is specified the element of the output table will look like:
0.123
(0.456)
nobrackets removes brackets around standard deviations (t-statistics or p-values), which is the default format when below is used.
long and wide produce a "long version" or a "wide version" of the output table. wide specifies that the individual equations from multiple-equation models (e.g., mlogit, biprobit, heckman) are placed in separate columns. Summary statistics will be reported under the first equation if wide is specified. wide is the default.
For example, if a dependent variable with three categories is fitted with mlogit using 10 exogenous variables, specifying the wide option would result in a 12x5 table. The first column of this table contains variable labels/names, the second and third columns contain the estimated coefficients and standard errors for the first equation, and the fourth and fifth columns contain the estimated coefficients and standard errors for the second equation.
Alternatively, specifying option long places the equations of the multiple-equation estimation below one another in a single column.
keep(keeplist) specifies the coefficients (and their order) to be included in the table. A keeplist comprises one or more specifications, separated by spaces. A specification can be a variable name (e.g., price), an equation name followed by a colon (e.g., mean:), or a full name (e.g., mean:price).
Using keep in a multiple-equation or in multiple-estimation tables would output only variables specified in the keeplist. If some of the equations/estimations contain no variables in keeplist, these equations will not be outputted.
If you want to keep some variables only in selected equations, make sure you specify the correct names for the equations. xml_tab uses the name of a dependent variable as an equation name. keep will output statistics for the variables in keeplist in all estimation/equations with identical names (same dependent variable).
drop(droplist) identifies the coefficients to be dropped from the table. A droplist comprises one or more specifications, separated by spaces. A specification can be either a variable name (e.g. price), an equation name followed by a colon (e.g. mean:), or a full name (e.g. mean:price).
drop(droplist) option could be useful if a user wants to suppress the output of coefficients for some variables. For example, if the empirical specification includes several regional dummies, one might want to create the output tables without the coefficients on these dummies. If regional dummies are named reg1,...,reg12, specifying drop(reg*) will suppress the output of the coefficients on these dummies in the table.
In multi-equation estimations (e.g., heckman, heckprobit) you might sometimes want to suppress the output of the coefficients for the first stage equation. This could be done by specifying drop(select:), where select is the name of the first stage equation. See more examples of using the drop() option in the Examples section.
Characters * and ? can be used for variable names in the keeplist and droplist. The * character matches one or more characters and the ? character matches a single character. All variables matching the pattern will be included in the lists.
.xml_tab, drop(region_*)
.xml_tab, drop(region_?)
In this example, the first command will suppress the output of coefficients for variables region_1-region_15, but the second one will suppress only the coefficients for region_1-region_9, because ? matches exactly one character.
If both keep() and drop() are specified, keep will be applied first.
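The three kinds of specification (variable name, equation name, full name) can be sketched on a two-equation model. This assumes xml_tab is installed and that, as described above, equations are named after their dependent variables (mpg and foreign here); each call overwrites stata_out.xml:

```stata
sysuse auto, clear
heckman mpg weight length, select(foreign = length displacement) nolog
estimates store heck1

* Variable name: keep weight wherever it appears
xml_tab heck1, keep(weight) replace

* Equation name: drop the whole selection equation
xml_tab heck1, drop(foreign:) replace

* Full name: drop length from the mpg equation only
xml_tab heck1, drop(mpg:length) replace
```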
equations(matchlist) specifies how the equations of the models in namelist are to be matched. This option works exactly the same way as equations in estimates table; see estimates_options for details.
+------------------+ ----+ Table Formatting +-------------------------------------------------
format(flist) specifies a list containing the formatting information for each cell of an XML table. The format of each cell is defined by a formatting string. A formatting string consists of five alphanumeric symbols in the order specified below:
Cell type:             (S|s) - string, (N|n) - numeric
Vertical alignment:    (T|t|1) - Top, (C|c|2) - Center, (B|b|3) - Bottom
Horizontal alignment:  (L|l|1) - Left, (C|c|2) - Center, (R|r|3) - Right
Font style:            (R|r|0) - Regular, (B|b|1) - Bold, (I|i|2) - Italic, (O|o|3) - Bold Italic, (U|u|4) - Underline
Decimal places:        0-9 defining number of digits after the decimal
The order of codes in the formatting string is important. The formatting string can contain a mixture of alphanumeric symbols. Upper- and lower-case characters and numerical codes can be used interchangeably. For example, a formatting string can look like N2210, NCCI2, nT1R0, sccb0, etc.
The formatting for a table of size nxm is described as a list of codes for table's columns. That list has a form:
((F_00 F_01...F_0n) (F_10 F_11...F_1n) ... (F_m0 F_m1...F_mn))
Format lists for the different columns are enclosed in parentheses and separated by a space. The first format list (F_00 F_01...F_0n) defines the format for the row names (e.g. variable names). The list (F_k0 F_k1...F_kn) describes the formatting of the cells of the k-th column; F_k0 defines the format for the k-th column's header (e.g. coeff. std.err.).
If the dimensions of the formatting list are smaller than the dimensions of the output table, the formatting will be extended to the remaining rows/columns. If just one formatting code is specified, as in format((S2110)) (string, centered vertically, left justified, in bold, with 0 decimal places), that format will be applied to each cell of the output table (including variable names and column headers). If two codes are specified, as in format((S2110) (N2230)), the column with variable names will be formatted using the S2110 format and all other columns in the table will be formatted with N2230. Specifying three codes, as in format((S2110) (NCCI0) (N2211)), will apply format S2110 to the column of variable names, format NCCI0 to all odd numerical columns, and format N2211 to all even numerical columns of the table.
Similarly, if k+1 formatting codes are specified, the first format will be applied to the column of the variable/row names and the remaining k formats will be applied repeatedly to every group of k columns of the table. By specifying more than one formatting code for a column you can control the formatting of every cell in a column.
Analogously to the extension rule for the column formats, the first code of the list will be applied to the column header and the next codes will be used repeatedly for the cells of a column. For example, format((S2110) (SCCB0 N2302 N2322)) outputs the variable names with format S2110 and the headers for each numerical column with format SCCB0 (equivalent to S2210); every odd row has format N2302 (right justified with 2 decimals) and every even row is formatted with N2322 (italic with 2 decimals).
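As a sketch of the extension rules above, assuming xml_tab is installed, the two calls below contrast a per-column format list with a per-row one. Each call overwrites stata_out.xml:

```stata
sysuse auto, clear
regress price mpg headroom trunk
estimates store reg1
regress price mpg headroom trunk turn if foreign==1
estimates store reg2

* One code for the row names, then two codes that alternate
* across the numerical columns
xml_tab reg1 reg2, format((S2110) (NCCI0) (N2211)) replace

* One code for the row names, then a column list: header format
* followed by two formats that alternate down the rows
xml_tab reg1 reg2, format((S2110) (SCCB0 N2302 N2322)) replace
```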
See styles for an alternative way of specifying the formatting of the output table.
lines(string) will draw the bottom borderline in the cells with the listed variables. string is a list of paired parameters. The first parameter specifies the variable name (or row number) and the second parameter a line style. The line style can be a number from 0 to 13 that corresponds to the line styles defined below. In addition, if SCOL_NAMES, COL_NAMES or LAST_ROW is specified instead of a variable name, xml_tab will draw lines under the equation names (SCOL_NAMES), under the statistic titles such as coeff. std.error (COL_NAMES), or under the last row of the table (LAST_ROW).
#    Style
0    None
1    Continuous hairline
2    Continuous thin
3    Continuous medium
4    Continuous thick
5    Dot thin
6    DashDotDot thin
7    DashDotDot medium
8    DashDot thin
9    DashDot medium
10   Dash thin
11   Dash medium
12   SlantDashDot thin
13   Double thick
For example, lines(COL_NAMES 1 turn 1 LAST_ROW 13) draws a single line under the captions (e.g., coeff std.err), a single line under the variable turn, and a double line under the last row of the table.
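The lines() specification from the example above can be placed in a complete call as follows. This is a sketch that assumes xml_tab is installed; the model is chosen only so that the variable turn appears in the table:

```stata
sysuse auto, clear
regress price mpg headroom trunk turn
estimates store reg1

* Hairline under the column captions and under turn,
* double thick line under the last row
xml_tab reg1, lines(COL_NAMES 1 turn 1 LAST_ROW 13) replace
```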
nolabel by default, variable labels will be displayed as row names in the output table. If nolabel is specified, variable names will be displayed in the output table. This option is ignored when outputting matrixes.
constant(string) controls the label for the constant term. The default in Stata is _cons, but if specified, string will be displayed instead.
rblanks(varname [text] [format], [varname [text] [format]], [...]) inserts an empty row or text after the specified rows (varname) in the output table. For each inserted row xml_tab expects variable name varname after which an empty row or text should be added, the text itself text, and format for a new cell format. Multiple rows are separated by comma. Also see format.
For example, rblanks(turn "this text after turn" S2210, headroom "and this one below headroom") will add vertically and horizontally centered "this text after turn" below data row for variable called turn (if found) and "and this one below headroom" after variable headroom using the same format as for headroom.
In addition, if SCOL_NAMES, COL_NAMES or LAST_ROW is specified instead of a variable name, xml_tab will add a row under the equation names (SCOL_NAMES), under the statistic titles such as coeff. std.error (COL_NAMES), or under the last row of the table (LAST_ROW).
cblanks(Equations|numlist) inserts an empty column after each column specified in the numlist. The numbering of the numeric columns in the output table starts from 1. For example, cblanks(2 4) inserts an empty column after the second and the fourth columns of the output table. The column with the row names is number 0. If the keyword Equations (abbreviation allowed) is used, then a blank column will be added after every equation (group of columns having the same column equation value).
cwidth(string) controls the column widths in the Excel worksheet. string is a list of paired parameters. The first parameter specifies the column number (0 being the row names column) or the keyword Equations, and the second parameter a column width. You can specify a column width of 0 to 255. This value represents the number of characters that can be displayed in a cell that is formatted with the standard font. If the column width is set to 0, the column is hidden. cwidth is evaluated after the cblanks option, so additional blank columns (if any) should be taken into consideration when determining column numbering.
tblanks(#) inserts # rows at the beginning of an Excel sheet. The output table shifts down by # rows.
title(string) specifies the title for the table. The string is inserted at the top of table. Example:
.xml_tab *, title("Table 1.1")
rnames(string), cnames(string) and ceq(string) modify the default row/column names as well as the column equation names (super-titles) for the output matrix/estimation result. string should contain as many words as there are rows/columns in the output. A name containing spaces must be enclosed in quotes. Table formatting options long, below, keep(), drop(), rblanks() and cblanks() will be applied before rnames(), cnames() and ceq().
notes([#] line1 [, [#] line2 [, ...]]) adds one or more lines of text at the end of the table. Lines are separated by commas. If the first word (after the comma) of a line specification is numeric, then the note will be shifted down by # rows; otherwise it will be written immediately after the previous output row.
font("FontName" [Fontsize]) specifies the font for the output table in a particular sheet of the workbook. The FontName is a font name enclosed in quotes; an optional argument FontSize is the size of the font. Example:
.xml_tab *, font("Arial Narrow" 12)
If no font is specified, the default font is used. When a new sheet is added to an existing workbook and font is specified, that font will be applied only to the table in this new sheet. So, each sheet in the workbook can use different fonts.
nogridlines hides gridlines on a worksheet. The default in Excel is to display gridlines. This option affects only the on-screen appearance, these lines are not printed.
style(stylename) specifies the layout and formatting for the output table from a predefined list of styles. These predefined styles are stored in the file xml_tab_options.txt. If the file is missing, it will be recreated. There are several preset styles: default, S1, S2. User-defined styles can be added to the file. If neither style nor format was specified, the program creates output with style(default).
Styles in xml_tab are just sets of options that are written in the file xml_tab_options.txt located in the directory where xml_tab is installed. Each style is recorded on a separate line. You can define your own styles by modifying the file xml_tab_options.txt. The syntax for a style consists of the style name, an equal sign, and the set of options corresponding to that particular style. To create a style, add a new line to the file xml_tab_options.txt and define your style. For example:
MYTABLES=sd right replace wide font("Arial Narrow" 12) sheet(Table 1)
You can instruct xml_tab to use this style by specifying:
xml_tab *, style(mytables)
This command will create a table using the set of options defined in style(mytables). In other words, the previous command is equivalent to:
xml_tab *, sd right replace wide font("Arial Narrow" 12) sheet(Table 1)
User-specified options supersede any options defined in style. You can check the complete set of options associated with a particular style by using option noisily.
+---------------------------------------+ ----+ System requirements and other options +----------------------------
xml_tab is designed for Stata Version 8.0 and later. The XML files generated by xml_tab could be opened with 2003 or later versions of Microsoft Excel and OpenOffice Calc 2.0 under Windows OS. It might be possible to open XML files with MS Excel 2002. The XML format is not supported by MS Excel 2000 and earlier.
We have not tested the compatibility of XML output generated by xml_tab with OpenOffice Calc or Microsoft Excel running under Macintosh, UNIX, or Linux OS. We would be thankful for the comments from the users running Stata on these platforms about any problems they encounter with using xml_tab.
xml_tab will try to locate either MS Excel or OO Calc on your computer and, if one is found, will create a link in the Results window; clicking on the link automatically opens the output. You can explicitly specify where xml_tab should look for Excel or Calc if the programs are not installed in their default locations (you will need to do it just once). If neither of the two programs is found, or if Stata runs in "console" or "batch" mode, no link will be created but xml_tab will still generate the output file.
excelpath(["]filename["]), calcpath(["]filename["]) tells xml_tab where to find Excel and/or Calc executables, for example:
.xml_tab, excelpath("D:\My programs\MS Excel\excel.exe")
.xml_tab, calcpath("D:\My programs\OpenOffice.org\scalc.exe")
noisily displays the complete list of options applied to the table. This option could be useful when using style in order to see all options associated with a particular style.
updateopts is used to recreate the options file. If xml_tab_options.txt is missing, it will be created automatically; otherwise updateopts can be specified to overwrite the existing options file with the one containing the default styles.
which xml_tab displays the version and location of xml_tab installed on the computer. The latest version of the program can be installed in Stata by typing:
.adoupdate xml_tab, update
Examples
+--------------+ ----+ Introduction +-----------------------------------------------------
In all our examples we use stored estimates and matrixes defined by the code below:
.sysuse auto, clear
.regress price mpg headroom trunk if foreign==0
.estimates store reg1
.regress price mpg headroom trunk turn if foreign==1
.estimates store reg2, title(Only foreign cars)
.heckman mpg weight length, sel(foreign = length displ) nolog
.estimates store heck1, title(Selection model)
In the above code we use the data file auto.dta. We run a regression and save its results as the stored estimate reg1. We then run another regression with a different specification and save the results as the estimate reg2, giving this estimate a title. Then we estimate a Heckman selection model and store the estimates from this model as heck1.
+--------------+ ----+ Basic syntax +-----------------------------------------------------
.xml_tab reg1 reg2, replace stats(r2_a) title("price regressions by car type")
In this example, xml_tab merges the estimates of the two regressions reg1 and reg2, forming a table with five columns. The first column contains the variable names; the second column contains the estimated coefficients from the first regression; the third column contains the standard errors for the coefficients from the first regression; the estimated coefficients and the standard errors from the second regression are placed in the fourth and fifth columns of the output table.
In addition to the coefficients and standard errors, xml_tab also outputs the adjusted R2 for each regression at the bottom of the output table (option stats(r2_a)). With option title("price regressions by car type") xml_tab is instructed to place the text "price regressions by car type" at the top of the output table. The numerical values in the table are formatted with a default format: both the coefficients and the standard errors have three decimal places, and standard errors are placed to the right of the coefficients and italicized.
The table is outputted into the file stata_out.xml located in the current Stata directory.
Extensions: options below, tstat, save, sheet
.xml_tab reg1 reg2 using "c:\my documents\table1", tstat below sheet("Table 2") stats(N r2_a)
The using clause (equivalent to save()) saves the XML file as c:\my documents\table1.xml. The output table is placed in the sheet of the XML workbook with the name Table 2. Specifying tstat outputs the estimation coefficients and the corresponding t-statistics. Option below places t-statistics in parentheses under the coefficients.
Extensions: options append, drop/keep
.xml_tab heck1 using "c:\my documents\table1", pvalue right drop(length) sheet("Heckman") stats(ll) append
This command saves the estimation results from heckman. Option append adds a new sheet "Heckman" to XML file used in the previous example and outputs there the estimated coefficients and p-values for the binary and continuous parts of the ML Heckman estimation. Option drop(length) suppresses the output of the estimated coefficient for length. stats(ll) reports log-likelihood value for the system at the bottom of the table.
To suppress the output of the binary part of the ML Heckman estimation (loosely speaking, to suppress the output of the first stage equation), we insert the equation name as a parameter: drop(foreign:).
Extensions: options long and wide
By default, xml_tab outputs the results of multiple-equation estimations into separate columns for each equation. For example, the table formed in the last example consists of five columns: column with the variable names, two columns with the coefficients and p-values from the first stage equation, and two columns with the second stage equation estimates. To output the coefficients from both equations into one column, specify option long:
.xml_tab heck1, save("c:\my documents\table1") long pvalue right sheet("Heckman_long") stats(ll) append
Note that the code above keeps the two previous tables, saved in sheets Table 2 and Heckman of the file c:\my documents\table1. xml_tab adds a new sheet Heckman_long to the workbook and outputs the results in the long specification into this sheet.
    +--------------------+
----+ Print ready tables +-----------------------------------------------
xml_tab lets you create print-ready tables from within Stata. Several options control the look of the output table.
Subcategories of variables can be conveniently separated using the rblanks() option. The example below inserts an empty row after variable mpg and puts a heading in italics into this row:
.xml_tab reg1 reg2, replace stats(r2_a) rblanks(mpg "Interior Dimensions" S2220)
To separate the results of two regressions, use option cblanks(), which inserts an empty column after each column specified as its argument. The code below inserts an empty column after the third column (the standard errors of the first regression).
.xml_tab reg1 reg2, replace stats(r2_a) cblanks(3)
Option line() underlines rows of the table. To underline the last row of the output table with a double line, use the following syntax:
.xml_tab reg1 reg2, replace stats(r2_a) line(LAST_ROW 13)
To underline the row with the coefficient for mpg with a single line and the last row of the table with a double line, use:
.xml_tab reg1 reg2, replace stats(r2_a) line(LAST_ROW 13 mpg 1)
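The layout options above can presumably be combined in a single call. The following is a minimal sketch under that assumption. The two regressions are hypothetical stand-ins for reg1 and reg2 (they use the auto dataset shipped with Stata and may differ from the models stored earlier), and it is an assumption that rblanks(), cblanks(), and line() do not interfere with each other.
.sysuse auto, clear
.regress price mpg headroom trunk if foreign == 0
.estimates store reg1
.regress price mpg headroom trunk if foreign == 1
.estimates store reg2
.xml_tab reg1 reg2, replace stats(r2_a) rblanks(mpg "Interior Dimensions" S2220) cblanks(3) line(LAST_ROW 13 mpg 1)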
    +------------------+
----+ Marginal effects +-------------------------------------------------
You have to use a non-standard specification in order to create tables of marginal effects, elasticities, and other statistics generated by mfx, dprobit, and dlogit. Suppose we want to generate a table of marginal effects after heckman estimation.
.heckman mpg weight length, sel(foreign = length displ) nolog
.mfx, predict(xb)
.estimates store heck_mfx, title(Heckman marginals)
Now, if we want to output, for example, the marginal effects from this estimation we write:
.xml_tab heck_mfx(Xmfx_dydx Xmfx_se_dydx), replace
To output the marginal effects and the standard errors after dprobit you would specify (dfdx se_dfdx) statistics.
Note that we now specify the names of the matrices holding the corresponding statistics in parentheses after the name of the stored estimates. You can check the names of the matrices with stored statistics using ereturn list.
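The dprobit case mentioned above could look like the sketch below. The model (foreign on mpg and weight, from the auto dataset shipped with Stata) and the stored name dp1 are hypothetical choices for illustration; ereturn list is included only to confirm that the matrices e(dfdx) and e(se_dfdx) are present before they are passed to xml_tab.
.sysuse auto, clear
.dprobit foreign mpg weight
.ereturn list
.estimates store dp1, title(dprobit marginals)
.xml_tab dp1(dfdx se_dfdx), replace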
    +--------------------+
----+ Exporting a matrix +-----------------------------------------------
xml_tab can also output any matrix in Stata memory. You can apply most of the options of xml_tab to control the layout and the formatting of the matrix output. You can create custom tables by forming matrices of results and outputting them with xml_tab. tabstatmat is a useful command for saving various summary statistics into matrices. The example below outputs matrix M with a custom format into the default XML file.
.matrix M = 1, 2 \ 3, 4
.matrix M_STARS = 0, 2 \ 3, 1
.xml_tab M, format((S2210, S2100, S2100), (S2210, N2301, N2301), (S2230, N2321, N2321))
Since matrix M_STARS exists, the output table will contain significance stars based on the default levels and symbols. Thus, the output table will have the form:
            c1        c2
    r1      1.0       2.0**
    r2      3.0***    4.0*
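If the significance information starts out as a matrix of p-values, the companion matname_STARS matrix can be built with a short loop. The sketch below is illustrative only: the matrix P and its values are hypothetical, and it assumes that the codes 1, 2, and 3 correspond to the 10%, 5%, and 1% levels (see option stars for the levels xml_tab actually uses). With these p-values the loop reproduces the M_STARS matrix defined above (0, 2 \ 3, 1).
.matrix P = (0.200, 0.030 \ 0.004, 0.080)
.matrix M_STARS = J(2, 2, 0)
.forvalues i = 1/2 {
        forvalues j = 1/2 {
                * count how many of the assumed thresholds the p-value falls below
                local s = (P[`i', `j'] < 0.10) + (P[`i', `j'] < 0.05) + (P[`i', `j'] < 0.01)
                matrix M_STARS[`i', `j'] = `s'
        }
 }
.matrix list M_STARS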
Another example demonstrates how to create and output a simple table of the means:
.tabstat price mpg rep78 headroom trunk weight length, by(foreign) save
.tabstatmat A
.matrix TAB=A'
.xml_tab TAB, replace
tabstat generates a table of means for the listed variables, categorized by foreign. tabstatmat saves the results to matrix A. This matrix has three rows, for Domestic, Foreign, and Total; its columns hold the means of the listed variables. We save the transpose of matrix A in another matrix, TAB. xml_tab outputs this matrix into the default XML file.
You can see more examples of using xml_tab in xml_tab_example.do.
Authors
Zurab Sajaia, zsajaia@worldbank.org
Michael Lokshin, mlokshin@worldbank.org
econ.worldbank.org/programs/poverty/toolkit
Development Economics Research Group,
The World Bank, 2006
Acknowledgement
While the concept of xml_tab is original, we borrowed some functionality ideas from programs such as estout by Ben Jann, outreg by John Luke Gallup, outreg2 by Roy Wada, modltbl by John H. Tyler, mktab by Nicholas Winter, outtex by Antoine Terracol, and est2tex by Marc Muendler.
Also see
Online: xmlsave, estimates, matrix, which