-------------------------------------------------------------------------------
help for corrtable 
-------------------------------------------------------------------------------

Correlation matrix as graphical table

corrtable varlist [if exp] [in range] [, half listwise n pval
    rformat(format) pformat(format) scatteri_options
    flag1() howflag1() ... flag7() howflag7() rsize(rule)
    diagonal(scatteri_options) combine(combine_options) ]

Description

corrtable displays a correlation matrix (and optionally corresponding sample sizes and/or P-values) as a table using graph. There is scope for varying the format(s) of correlations and P-values; for emphasising selected correlations with (for example) larger font sizes or different plotregion colours; and for specifying a rule relating font size to the magnitude of each correlation. correlate is used for the basic calculations.

In each off-diagonal cell the correlation is displayed. If requested, the corresponding sample size is displayed immediately below it, and the P-value is displayed below all other results. The diagonal cells show variable labels or, where a variable has no label, its name.

Remarks

Understanding the flag?() and rsize() options depends on knowing that, immediately after each correlate command, the correlation is available as r(rho).
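
As a minimal sketch of this, assuming the auto dataset shipped with Stata, the stored result can be inspected directly. Any expression in r(rho) that is valid here is the kind of expression that the flag?() and rsize() options expect:

. sysuse auto, clear
. quietly correlate price mpg
. display r(rho)
. display abs(r(rho)) > 0.8

The last command displays 1 if the condition is true and 0 otherwise.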

corrtable is intended as indicative, not definitive. The code may help others who wish to produce similar displays. Special effects for particular problems are sometimes better implemented by adding customised code, rather than by attempting a much more general program with large numbers of options. Users of Stata 10 or later may also use the Graph Editor to revise an initial graph.

Some limitations should be flagged.

corrtable cannot avoid the problems of displaying large tables legibly, especially if you want to show several results and/or your variable labels are long or complicated.

corrtable does not support the display of stars, which is deprecated by the program author. Persons wanting stars should wonder why and, if the desire persists, write their own code or persuade others to do so.

corrtable can be slow. Under the hood it produces several individual graphs and then uses graph combine, so even a modest correlation matrix entails handling several graphs. Hence corrtable may be best reserved for producing presentation material.

Options

half specifies display of the lower half of the correlation matrix, together with a diagonal display of variable labels or names. Although not the default, this option is commended as quicker and simpler.

listwise specifies that the results for each pair of variables should be determined for as many observations as possible. Note that as a consequence the number of observations used in each calculation may differ. By default casewise deletion is used to ensure consistency in observations selected.

n specifies display of the sample size in addition to other results. This is usually advisable only in conjunction with listwise, as otherwise the same sample size will be displayed in each off-diagonal cell of the table.
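
For example, a call along the following lines would show a sample size under each correlation. This is a sketch assuming the auto dataset, in which rep78 has missing values, so that with listwise the sample sizes for pairs involving rep78 should differ from the rest:

. sysuse auto, clear
. corrtable price mpg rep78 weight, half listwise n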

pval specifies display of the P-value in addition to other results. Given sample size n and sample correlation r, the P-value is calculated as 2 * ttail(n - 2, abs(r) * sqrt(n - 2) / sqrt(1 - r^2)). There is no check of the underlying assumptions.
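
The calculation can be reproduced by hand after correlate, which leaves the sample size in r(N) and the correlation in r(rho). The following is a sketch assuming the auto dataset; the result should agree with the significance level reported by pwcorr with its sig option:

. sysuse auto, clear
. quietly correlate price mpg
. display 2 * ttail(r(N) - 2, abs(r(rho)) * sqrt(r(N) - 2) / sqrt(1 - r(rho)^2))
. pwcorr price mpg, sig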

rformat(format) and pformat(format) specify formats for correlations and P-values respectively. The default is %5.3f for correlations and %4.3f for P-values.

scatteri_options control the display of each off-diagonal cell. The defaults include mlabsize(*4) mlabpos(0).

flag1() and howflag1() to flag7() and howflag7() are paired options selecting certain correlations for specific attention. For example, flag1() selects correlations and howflag1() indicates how to show them. In essence, flag? options specify conditions using Stata syntax involving the returned correlation r(rho) that must be true for the correlations to be emphasised, while howflag? specify options of twoway scatteri that control display of emphasised results. Any flagk() option overwrites any flagj() option if k > j. The limit of 7 pairs of such options is arbitrary: programmers may easily clone the program and change the code to raise the limit. Note that there is no wired-in link to any legend option.
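
As an illustration of the overwriting rule (a sketch assuming the auto dataset; the thresholds and sizes are arbitrary choices), a correlation whose absolute value exceeds 0.8 satisfies both conditions below, and should be shown as howflag2() specifies because flag2() takes precedence over flag1():

. sysuse auto, clear
. corrtable price mpg weight length, half flag1(abs(r(rho)) > 0.5) howflag1(mlabsize(*5)) flag2(abs(r(rho)) > 0.8) howflag2(mlabsize(*7))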

rsize() suggests mlabsize() for a given correlation by specifying a rule for size in terms of the returned r(rho) using Stata syntax. Usually, but not necessarily, it will be desired to show larger correlations with larger fonts. A rule of 1 + 5 * abs(r(rho)) will show correlations of +1 or -1 at mlabsize(6) and correlations of 0 at mlabsize(1). Rules using functions such as sqrt(abs(r(rho))) would make smaller correlations more readable, whereas powers such as (abs(r(rho))^4) would shrink them further and emphasise only the largest correlations. Precisely what fonts you see depends a little on your machine set-up and on the graph file format you use. Representation on your monitor may well be coarser than what can be printed (e.g. if you save graph files as .eps first). rsize() overrides any other specification of mlabsize() for off-diagonal cells.
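
To see what sizes a rule implies before drawing anything, the rule can be evaluated for a few trial values. This sketch uses a local macro r (a name chosen here for illustration) in place of abs(r(rho)) and compares the linear rule above with a square-root variant:

. foreach r of numlist 0 0.1 0.25 0.5 1 {
.     display %4.2f `r' "   linear: " %4.2f (1 + 5 * `r') "   sqrt: " %4.2f (1 + 5 * sqrt(`r'))
. }

Both rules give 1 at zero and 6 at one; in between, the square-root rule gives the larger size.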

diagonal() specifies options of twoway scatteri that tune the appearance of the diagonal plots that show variable labels (or failing those, variable names). The defaults include mlabsize(*3) mlabpos(0) ysc(off) xsc(off). It may often be advisable to clone the variables and revise the variable labels to obtain more concise but still informative text. For example, while it is usually good practice to specify units of measurement, they are typically immaterial for correlations.

combine() specifies options of graph combine that tune the combination of the graphs. combine(imargin(zero)) is often a good choice.
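
A sketch using diagonal() and combine() together, assuming the auto dataset (the particular size multiplier is an arbitrary choice, and it is assumed here that a user-supplied mlabsize() replaces the default):

. sysuse auto, clear
. corrtable price mpg weight length, half diagonal(mlabsize(*2)) combine(imargin(zero))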

Examples

. sysuse auto, clear

. set scheme s1color

. corrtable price-foreign, flag1(abs(r(rho)) > 0.8) howflag1(mlabsize(*7)) flag2(inrange(abs(r(rho)), 0.6, 0.8)) howflag2(mlabsize(*6)) half

. foreach v of var price-gear {
.     clonevar `v'2 = `v'
.     local label : var label `v'2
.     if strpos("`label'", "(") local label = substr("`label'", 1, strpos("`label'", "(") - 2)
.     label var `v'2 "`label'"
. }

. corrtable mpg2 gear_ratio2 rep782 price2 headroom2-displacement2, flag1(r(rho) > 0) howflag1(plotregion(color(blue * 0.1))) flag2(r(rho) < 0) howflag2(plotregion(color(pink*0.1))) half rsize(2 + 6 * abs(r(rho)))

. corrtable log*, rsize(5 * sqrt(abs(r(rho)))) half combine(imargin(zero))

Acknowledgments

Rob Dunford sparked the development of this program. Vince Wiggins made helpful comments.

Author

Nicholas J. Cox, Durham University, U.K. n.j.cox@durham.ac.uk

Also see

Online: help for correlate; help for pwcorr; help for corrmat (if installed); help for matcorr (if installed); help for makematrix (if installed)