```-------------------------------------------------------------------------------
help for ranktest
-------------------------------------------------------------------------------

ranktest: module for testing the rank of a matrix using the Kleibergen-Paap rk
statistic

Full syntax

ranktest (varlist1) (varlist2) [weight] [if exp] [in range]
[, partial(varlist3) wald allrank fullrank nullrank robust bw(#)
kernel(string) cluster(varlist) noconstant ]

Version syntax

ranktest, version

ranktest may be used with time-series or panel data, in which case the data
must be tsset before using ranktest; see help tsset.

All varlists may contain time-series operators; see help varlist.  If
(varlist1) or (varlist2) contain a single variable, the parentheses () may be
omitted.

aweights, fweights, iweights and pweights are allowed; see help weights.

ranktest is an r-class program.

Contents
Description
Options summary
Saved results
Examples
References
Acknowledgements
Citation of ranktest
Authors

Description

ranktest implements the Kleibergen-Paap (2006) rk test for the rank of a
matrix.  Tests of the rank of a matrix have many practical applications.  For
example, in econometrics the requirement for identification is the rank
condition, which states that a particular matrix must be of full column rank.
Another example from econometrics concerns cointegration in vector
autoregressive (VAR) models; the Johansen trace test is a test of the rank of a
particular matrix.  The traditional test of the rank of a matrix for the
standard (stationary) case is the Anderson (1951) canonical correlations test.
If we denote one list of variables as Y and a second as Z, and we calculate the
squared canonical correlations between Y and Z, the LM form of the Anderson
test, where the null hypothesis is that the matrix of correlations or
regression parameters B between Y and Z has rank(B)=r, is N times the sum of
the K-r smallest squared canonical correlations, where K is the number of
variables in the shorter of the two lists.  A large test statistic and
rejection of the null indicates that the matrix has rank at least r+1.  The
Cragg-Donald (1993) statistic is a closely related Wald test for the rank of a
matrix.  Both the Anderson and Cragg-Donald tests require the assumption that
the covariance matrix has a Kronecker form; when this is not so, e.g., when
disturbances are heteroskedastic or autocorrelated, the test statistics are no
longer valid.

The Kleibergen-Paap (2006) rk statistic is a generalization of the Anderson
canonical correlation rank test to the case of a non-Kronecker covariance
matrix.  The implementation in ranktest will calculate rk statistics that are
robust to various forms of heteroskedasticity, autocorrelation, and clustering.
For a full discussion of the test statistic and its relationship to other test
statistics for the rank of a matrix, see Kleibergen-Paap (2006).

The test is applied to Y and Z, where Y=varlist1 and Z=varlist2.  Optionally, a
third set of variables X=varlist3 can be partialled-out of Y and Z with the
partial() option.  A constant is automatically partialled out, unless the user
specifies the noconstant option.  To test if a matrix is rank r+1, the null
hypothesis is Ho: rank(B)=r.  Rejection of the null indicates that the matrix
has at least rank=r+1.  In the standard (stationary) case, the test statistic
is distributed as chi-squared with degrees of freedom = (K-r)*(L-r), where K is
the number of Y variables, L is the number of Z variables, and r is the rank
being tested in Ho.  For example, to test if the matrix is full column rank K
where K<L, the null would be Ho:rank(B)=K-1 and the degrees of freedom of the
test would be (K-r)*(L-r) = (K-(K-1))*(L-(K-1)) = L-K+1.  The default behavior
of ranktest is to perform all possible tests of rank; the fullrank option
causes only the test of whether the matrix is full rank (Ho:r=K-1) to be
reported; the nullrank option causes only the test of whether the matrix is
zero rank (Ho:r=0) to be reported.

The default behavior of ranktest is to report LM tests; the wald option will
cause it to report Wald tests.  P-values are for the standard (stationary) case
using the chi-squared distribution.  Specifying robust, bw(#) (where # is the
bandwidth), or cluster(varlist) will generate an rk statistic that is robust to
heteroskedasticity, autocorrelation or within-group clustering; robust combined
with bw(#) will generate a heteroskedasticity and autocorrelation-consistent
(HAC) statistic.  The implementation of an autocorrelation-consistent statistic
and the options available for various kernels follow that in ivreg2; for more
details, see Baum et al. (2007) or help ivreg2 if installed.  If none of the
above options is specified, ranktest defaults to reporting the Anderson
canonical correlations LM test, or, if wald is specified, the Cragg-Donald
(1993) Wald test.

It is useful to note that in the special case of a test for whether a matrix
has rank=zero (e.g., if there is a single variable Y), the Anderson,
Cragg-Donald, and Kleibergen-Paap statistics reduce to familiar statistics
available from OLS estimation.  Thus if K=1, the Cragg-Donald Wald statistic
can be calculated by regressing the single Y on Z and X and testing the joint
significance of Z using a standard Wald test and a traditional non-robust
covariance estimator.  The Anderson LM statistic can be obtained by calculating
an LM test of the same joint hypothesis.  The robust Kleibergen-Paap rk
statistics can be obtained by performing the same tests with the desired robust
covariance estimator.  Similarly, if K>1 the test statistics for rank=0
reported by ranktest can be reproduced by testing the joint significance of the
Z variables across the K equations for the Y variables.  See the examples
below.

Options summary

partial(varlist3) requests that the variables in (varlist3) are partialled out
of the variables in (varlist1) and (varlist2).  A constant is automatically
partialled out as well, unless the option noconstant is specified.

wald requests the Wald instead of the LM version of the test.  The LM version
is the default.

allrank requests that test statistics for rank=0, rank=1, ..., rank=(#cols-1)
be reported, where #cols is the number of columns of the smaller of the
two matrices (varlists).  allrank is the default.

fullrank requests that only the test statistic for Ho: rank=(#cols-1) be
reported, where #cols is the number of columns of the smaller of the
two matrices (varlists).  Rejection of the null indicates that the matrix
is of full column rank.

nullrank requests that only the test statistic for Ho: rank=0 be reported.
Rejection of the null indicates that the matrix has at least rank=1.

robust specifies that the Eicker/Huber/White/sandwich heteroskedasticity-robust
estimator of variance is to be used.  The reported rk statistic will be
robust to heteroskedasticity.

cluster(varlist) specifies that observations are independent across groups
(clusters) but not necessarily independent within groups.  varlist
specifies to which group each observation belongs.  Specifying cluster()
implies robust, i.e., the reported rk statistic will be robust to both
heteroskedasticity and within-cluster correlation.  If ivreg2 version 3.0
or later is installed, 2-way clustering is supported; see help ivreg2 for
details.

bw(#) implements autocorrelation-consistent (AC) or heteroskedasticity- and
autocorrelation-consistent (HAC) covariance estimation with bandwidth equal
to #, where # is an integer greater than zero.  Specifying robust together
with bw(#) implements HAC covariance estimation; omitting robust implements
AC covariance estimation.

kernel(string) specifies the kernel to be used for AC and HAC covariance
estimation; the default kernel is Bartlett (also known in econometrics as
Newey-West).  Kernels available are (abbreviations in parentheses):
Bartlett (bar); Truncated (tru); Parzen (par); Tukey-Hanning (thann);
Tukey-Hamming (thamm); Daniell (dan); Tent (ten); and Quadratic-Spectral
(qua or qs).  Note that for some kernels (bar, par, thann and thamm) the
bandwidth must be at least 2 to obtain an autocorrelation-consistent
estimator.

noconstant suppresses the constant term (intercept) in the list of
partialled-out variables.

version causes ranktest to display its current version number and to leave it
in the macro r(version).  It cannot be used with any other options.

Saved results

ranktest saves the following results in r():

Scalars
r(N)          Number of observations
r(N_clust)    Number of clusters
r(chi2)       rk statistic for highest rank tested
r(p)          p-value of rk statistic
r(rdf)        dof of rk statistic
r(rank)       Rank of matrix under Ho for highest rank tested

Macros
r(version)    Version number of ranktest

Matrices
r(rkmatrix)   Saved results of rank tests
r(ccorr)      Matrix of canonical correlations
r(eval)       Matrix of eigenvalues (=squared canonical correlations)
r(V)          Covariance matrix (W in Kleibergen-Paap (2006), p. 103)

Examples

Tests for underidentification of Klein consumption equation.

(Underidentification means endogenous regressors (profits wagetot) are not
identified by the excluded instruments (govt taxnetx year wagegovt capital1
L.totinc) after partialling-out the included instruments (L.profits _cons).
Test is equivalent to testing whether the matrix of reduced form coefficients
for the endogenous regressors is full rank (#cols=2) vs. less than full rank
(#cols=1).  The test for underidentification should not be confused with a
test for "weak identification"; see e.g. Stock and Yogo (2005) or Baum et al.
(2007).)

. webuse klein, clear

. tsset yr

(Klein consumption equation - for reference)

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year
wagegovt capital1 L.totinc)

(Homoskedasticity, LM => Anderson canonical correlations test; test all ranks.
Ho of rank=1 can be rejected, suggesting the model is identified.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits)

(Homoskedasticity, Wald => Cragg-Donald (1993) test; test all ranks.  Ho of
rank=1 can be rejected, suggesting the model is identified.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) wald

(Heteroskedastic robust, LM statistic, test for full rank only.  Ho of rank=1
now cannot be rejected, suggesting the model may be underidentified.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) full robust
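
(Illustrative check, not part of the original examples.  In the full-rank test
above K=2, L=6 and r=1, so the degrees of freedom are (K-r)*(L-r) = 5.
Assuming r(chi2) and r(p) are saved as listed under Saved results, the
reported p-value can be recomputed with chi2tail().)

. di chi2tail(5, r(chi2))

. di r(p)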

(Heteroskedastic and autocorrelation robust, LM statistic, test for null rank
only)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) null robust bw(2)
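
(Illustrative variation, not part of the original examples: the same test
with the Parzen kernel selected via kernel(par) and a bandwidth of 3.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) null robust bw(3) kernel(par)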

Testing for reduced rank in VAR models.

(Relationship of Johansen trace statistic and Anderson canonical correlations
statistic.  Former is an LR test, ranktest reports LM version of latter, but
based on the same eigenvalues.  Note that the p-values reported by ranktest
are not valid in this application because they are for the standard
stationary case.)

. vecrank consump profits wagetot, lags(1)

. ranktest (d.consump d.profits d.wagetot) (L1.consump L1.profits
L1.wagetot)

. mat eval=r(eval)

. mat list eval

(vecrank LR trace statistic for maximum rank=0 vs. ranktest LM canonical
correlations statistic for same.  Both statistics calculated using the same
eigenvalues.)

. di -r(N)*(ln(1-eval[1,1]) + ln(1-eval[1,2]) + ln(1-eval[1,3]))

. di r(N)*(eval[1,1] + eval[1,2] + eval[1,3])

Equalities between rk statistic and other test statistics

(Equivalence of rk statistic and canonical correlations under homoskedasticity)

. canon (profits wagetot) (govt taxnetx year wagegovt)

. mat list e(ccorr)

. ranktest (profits wagetot) (govt taxnetx year wagegovt)

. mat list r(rkmatrix)

(Equality of rk statistic and Wald test from OLS regression in special case
of single regressor)

. ranktest (profits) (govt taxnetx year wagegovt capital1 L.totinc),
partial(L.profits) wald robust

. regress profits govt taxnetx year wagegovt capital1 L.totinc
L.profits, robust

. testparm govt taxnetx year wagegovt capital1 L.totinc

. di r(F)*r(df)*e(N)/e(df_r)

(Equality of rk statistic and LM test from OLS regression in special case
of single regressor. Generate a group variable to illustrate cluster)

. gen clustvar = round(yr/2)

. ranktest (profits) (govt taxnetx year wagegovt capital1 L.totinc),
partial(L.profits) cluster(clustvar)

. ivreg2 profits L.profits (=govt taxnetx year wagegovt capital1
L.totinc), cluster(clustvar)

. di e(j)

(Equality of rk statistic of null rank and Wald test from OLS regressions and a
Kronecker covariance matrix (independent and homoskedastic equations).  To show
equality, estimate the equations using reg3 specifying that all regressors are
exogenous, and then test joint significance of Z variables in both
regressions.  L.profits is the partialled-out variable and is not tested.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) wald null

. global e1 (profits govt taxnetx year wagegovt capital1 L.totinc
L.profits)

. global e2 (wagetot govt taxnetx year wagegovt capital1 L.totinc
L.profits)

. reg3 $e1 $e2, allexog

. qui test [profits]govt [profits]taxnetx [profits]year
[profits]wagegovt [profits]capital1 [profits]L.totinc

. test [wagetot]govt [wagetot]taxnetx [wagetot]year [wagetot]wagegovt
[wagetot]capital1 [wagetot]L.totinc, accum

(Equality of rk statistic of null rank and Wald test from OLS regressions and
suest.  To show equality, use suest to test joint significance of Z variables
in both regressions.  L.profits is the partialled-out variable and is not
tested.  Note that suest introduces a finite sample adjustment of (N-1)/N.)

. ranktest (profits wagetot) (govt taxnetx year wagegovt capital1
L.totinc), partial(L.profits) wald null robust

. di r(chi2)*(r(N)-1)/r(N)

. qui regress profits govt taxnetx year wagegovt capital1 L.totinc
L.profits

. est store e1

. qui regress wagetot govt taxnetx year wagegovt capital1 L.totinc
L.profits

. est store e2

. qui suest e1 e2

. qui test [e1_mean]govt [e1_mean]taxnetx [e1_mean]year
[e1_mean]wagegovt [e1_mean]capital1 [e1_mean]L.totinc

. test [e2_mean]govt [e2_mean]taxnetx [e2_mean]year [e2_mean]wagegovt
[e2_mean]capital1 [e2_mean]L.totinc, accum

References

Anderson, T.W. 1951. Estimating linear restrictions on regression coefficients
for multivariate normal distributions. Annals of Mathematical Statistics,
Vol. 22, pp. 327-51.

Anderson, T.W. 1984. Introduction to Multivariate Statistical Analysis.  2d ed.
New York: John Wiley & Sons.

Baum, C. F., Schaffer, M.E., and Stillman, S. 2007. Enhanced routines for
instrumental variables/GMM estimation and testing. Boston College
Department of Economics Working Paper No. 667.
http://ideas.repec.org/p/boc/bocoec/667.html

Cragg, J.G. and Donald, S.G. 1993. Testing Identifiability and Specification in
Instrumental Variables Models. Econometric Theory, Vol. 9, pp. 222-240.

Kleibergen, F. and Paap, R.  2006.  Generalized Reduced Rank Tests Using the
Singular Value Decomposition.  Journal of Econometrics, Vol. 133, pp.
97-126.

Stock, J.H. and Yogo, M.  2005.  Testing for Weak Instruments in Linear IV
Regression. In D.W.K. Andrews and J.H. Stock, eds. Identification and
Inference for Econometric Models: Essays in Honor of Thomas Rothenberg.
Cambridge: Cambridge University Press, 2005, pp. 80–108.  Working paper
version: NBER Technical Working Paper 284.
http://www.nber.org/papers/T0284.

Acknowledgements

We would like to thank Kit Baum and Austin Nichols for helpful suggestions and
feedback.

Citation of ranktest

ranktest is not an official Stata command. It is a free contribution to the
research community, like a paper. Please cite it as such:

Kleibergen, F., Schaffer, M.E. 2010.  ranktest: module for testing the
rank of a matrix using the Kleibergen-Paap rk statistic
http://ideas.repec.org/c/boc/bocode/s456865.html

Authors

Frank Kleibergen, Brown University, US
Frank_Kleibergen@brown.edu

Mark E Schaffer, Heriot-Watt University, UK
m.e.schaffer@hw.ac.uk

Also see

Manual:  [R] canon

On-line: help for canon, vecrank, ivreg2 (if installed)
```