-------------------------------------------------------------------------------
help for xtoverid
-------------------------------------------------------------------------------

Tests of overidentifying restrictions after xtreg, xtivreg, xtivreg2 and xthtaylor

xtoverid [, robust cluster(varlist) ]

xtoverid does not support IV estimation with weights.

Description

xtoverid computes versions of a test of overidentifying restrictions (orthogonality conditions) for a panel data estimation. For an instrumental variables estimation, this is a test of the null hypothesis that the excluded instruments are valid instruments, i.e., uncorrelated with the error term and correctly excluded from the estimated equation. Under the null, the test statistic is distributed as chi-squared with degrees of freedom equal to the number of overidentifying restrictions, L-K, where L is the number of excluded instruments and K is the number of endogenous regressors. A rejection casts doubt on the validity of the instruments. A test of fixed vs. random effects is also a test of overidentifying restrictions, and xtoverid will report this test after a standard panel data estimation with xtreg,re.
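As a rough illustration of the degrees-of-freedom count, the sketch below recomputes the p-value by hand for a model with two excluded instruments and one endogenous regressor. It assumes that xtoverid leaves the statistic in r(j) after xtivreg, as it does in the xtreg example further down; the local macro df is introduced here purely for illustration.

```stata
* Sketch only: assumes r(j) holds the overid statistic after xtoverid
webuse nlswork, clear
xtset idcode year
xtivreg ln_wage age (tenure = union south), fe
xtoverid
* 2 excluded instruments (union, south) - 1 endogenous regressor (tenure)
local df = 2 - 1
display "overid stat = " r(j) "   df = `df'   p = " chi2tail(`df', r(j))
```

chi2tail(df, x) returns the upper-tail probability of a chi-squared distribution, so a small value here corresponds to a rejection of the null that the instruments are valid.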

If the original estimation reported classical (non-robust) standard errors, xtoverid will report Sargan's statistic. The version of this test that is robust to heteroskedasticity in the errors is Hansen's J statistic, which is what xtoverid reports if the original estimation was robust or if xtoverid is called with the robust option. Similarly, xtoverid will report an overidentification statistic that is robust to arbitrary heteroskedasticity and within-group correlation if the cluster option was used by the original estimation or by the call to xtoverid. Under the assumption of conditional homoskedasticity, Hansen's J statistic reduces to Sargan's statistic (see Hayashi (2000), pp. 227-28), and hence the two statistics are sometimes referred to as the Sargan-Hansen or Hansen-Sargan statistic. The tests are implemented in xtoverid by calls to ivreg2 of Baum-Schaffer-Stillman. For further discussion and details of how the test is implemented, see help ivreg2 and Baum et al. (2003, 2006).

A test of fixed vs. random effects can also be seen as a test of overidentifying restrictions. The fixed effects estimator uses the orthogonality conditions that the regressors are uncorrelated with the idiosyncratic error e_it, i.e., E(X_it*e_it)=0. The random effects estimator uses the additional orthogonality conditions that the regressors are uncorrelated with the group-specific error u_i (the "random effect"), i.e., E(X_it*u_i)=0. These additional orthogonality conditions are overidentifying restrictions. The test is implemented by xtoverid using the artificial regression approach described by Arellano (1993) and Wooldridge (2002, pp. 290-91), in which a random effects equation is reestimated augmented with additional variables consisting of the original regressors transformed into deviations-from-mean form. The test statistic is a Wald test of the significance of these additional regressors. A large-sample chi-squared test statistic is reported with no degrees-of-freedom corrections. Under conditional homoskedasticity, this test statistic is asymptotically equivalent to the usual Hausman fixed-vs-random effects test; with a balanced panel, the artificial regression and Hausman test statistics are numerically equal. See Arellano (1993) for an exact statement and the example below for a demonstration. Unlike the Hausman version, the test reported by xtoverid extends straightforwardly to heteroskedastic- and cluster-robust versions, and is guaranteed always to generate a nonnegative test statistic.
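The artificial regression idea can be sketched by hand as follows. This is an illustration under stated assumptions, not the code xtoverid runs internally: the variable names m_* and d_* are made up for the example, and because xtreg re-estimates the variance components in the augmented equation, the Wald statistic shown need not reproduce xtoverid's number exactly.

```stata
* Sketch only: hand-rolled artificial regression on the balanced subsample
webuse abdata, clear
keep if year>=1978 & year<=1982
xtset id year
* Build deviations-from-group-mean versions of the regressors
foreach v in w k {
    egen double m_`v' = mean(`v'), by(id)
    gen double d_`v' = `v' - m_`v'
}
* Random effects equation augmented with the demeaned regressors
xtreg n w k d_w d_k, re
* Wald test that the additional regressors are jointly zero
test d_w d_k
```

The test has 2 degrees of freedom here, one for each regressor whose orthogonality to the group-specific error is being tested.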

The remainder of this help file discusses how the variables are transformed prior to IV estimation and special issues that arise.

The official Stata routines xtivreg and xthtaylor work by transforming the variables in the regression, constructing the instruments, and then estimating a standard single equation IV estimation on the transformed variables; xtoverid works the same way, and includes an internal check that the IV estimation matches the original estimation for which the overidentification statistic is being requested.

For fixed-effects IV estimation (xtivreg,fe or xtivreg2,fe), the "within transformation" is first applied to the data, i.e., all variables have group means subtracted, and then an IV estimation is performed on the demeaned data. The between IV estimator (xtivreg,be) is an IV estimation on group means, the first differences IV estimator (xtivreg,fd or xtivreg2,fd) is an IV estimation on first differences and the default G2SLS random-effects estimator (xtivreg,re) is an IV estimation on variables subjected to the GLS transform. In all these estimators, the excluded instruments are subject to the same transformations as the regressors and dependent variable. Note that the overidentification statistic reported after a fixed effects estimation with either classical or robust standard errors will incorporate a degrees-of-freedom adjustment deriving from the degrees of freedom lost to the number of fixed effects. No adjustment is made (or is required) for a cluster-robust overidentification statistic. See xtivreg2 for further discussion and details.
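A minimal sketch of the fixed-effects case, assuming nothing beyond official Stata commands: the within transformation is applied by hand and a single-equation IV regression is run on the demeaned data. The names b_fe, m_* and w_* are hypothetical. The slope coefficient should agree with xtivreg,fe up to numerical precision, but the standard errors and the Sargan statistic from estat overid carry no adjustment for the degrees of freedom lost to the fixed effects, so they will differ from what xtoverid reports.

```stata
* Sketch only: within transformation by hand, then 2SLS on demeaned data
webuse nlswork, clear
xtset idcode year
xtivreg ln_wage age (tenure = union south), fe
scalar b_fe = _b[tenure]
* Restrict to the estimation sample so group means use the same observations
keep if e(sample)
foreach v in ln_wage age tenure union south {
    egen double m_`v' = mean(`v'), by(idcode)
    gen double w_`v' = `v' - m_`v'
}
* Excluded instruments are demeaned in the same way as the regressors
ivregress 2sls w_ln_wage w_age (w_tenure = w_union w_south), noconstant
display "xtivreg,fe: " b_fe "   by hand: " _b[w_tenure]
* Unadjusted Sargan statistic on the transformed data
estat overid
```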

The GLS IV estimators xtivreg,ec2sls and xthtaylor are slightly different: the dependent variable and regressors are subjected to the GLS transform, but the instrument sets are combinations of demeaned and group mean (or time-invariant) variables. The degrees of freedom of the overidentification statistic for the standard Hausman-Taylor estimator are K1-G2, where K1 is the number of exogenous time-varying variables and G2 is the number of endogenous time-invariant variables. In the Amemiya-MaCurdy version of the estimator (available via xthtaylor,amacurdy), the degrees of freedom are T*K1-G2, where T is the length of the panel in the time dimension. For further discussion, see the Stata manual entries for these estimators or Baltagi (2005).

Note that following estimation by xtivreg,ec2sls, the number of degrees of freedom of the overidentification statistic is not what is expected based on a simple count of instruments and endogenous variables when the equation includes an exogenous regressor. The reason is that in EC2SLS estimation as implemented in xtivreg,ec2sls, the regressor is subject to the GLS transform and then, in the IV estimation on the transformed data, is treated as an endogenous regressor with both its demeaned and recentered transformation and its group mean transformation as two excluded instruments. When estimating using xtivreg,ec2sls on an unbalanced panel, therefore, including exogenous regressors increases the number of degrees of freedom of the overidentification statistic. The intuition is that exogenous regressors in EC2SLS estimation are overidentified for the same reason that exogenous regressors in a standard random effects estimation are overidentified (see above). See the examples below.

Options

robust requests a heteroskedasticity-robust overidentification statistic.

cluster(varlist) requests an overidentification statistic that is robust to arbitrary heteroskedasticity and within-group correlation, where the group is defined by varlist. If ivreg2 version 3.0 or later is installed, 2-way clustering is supported; see help ivreg2 for details.

Examples

. webuse nlswork

. tsset idcode year

. gen age2=age^2

. gen black=(race==2)

. xtivreg ln_wage age (tenure = union south), fe i(idcode)

. xtoverid

. xtoverid, robust

. xtoverid, cluster(idcode)

(Identical to overid stat from xtivreg2 with same options)
. xtivreg2 ln_wage age (tenure = union south), fe cluster(idcode)

(Compare overid stat degrees of freedom for G2SLS:)
(2 (union, south) - 1 (tenure) = 1)
. xtivreg ln_wage age (tenure = union south), re

. xtoverid

(...with degrees of freedom for EC2SLS:)
(6 (mean and mean-deviation of union, south, age) - 2 (GLS transform of tenure, age) = 4)
. xtivreg ln_wage age (tenure = union south), ec2sls

. xtoverid

(Changing the number of included exogenous variables changes the dof of the overid stat)
(4 (mean and mean-deviation of union, south) - 1 (GLS transform of tenure) = 3)
. xtivreg ln_wage (tenure = union south), ec2sls

. xtoverid

(Hausman-Taylor estimation)

(dof = 2 (exogenous time-varying age, age2) - 1 (endogenous time-invariant grade) = 1)
. xthtaylor ln_wage age age2 tenure hours black birth_yr grade, endog(tenure hours grade) i(idcode)

. xtoverid

(Equivalence of xtoverid statistic and standard Hausman fixed-vs-random effects test)

. webuse abdata

(Balanced panel)
. xtreg n w k if year>=1978 & year<=1982, re

(Artificial regression overid test of fixed-vs-random effects)
. xtoverid

. di r(j)

. est store re

. xtreg n w k if year>=1978 & year<=1982, fe

. est store fe

(In homoskedastic balanced panel case, Hausman test using sigma from FE estimation...)
. hausman fe re, sigmaless

(... is numerically equal to the artificial regression overid statistic)
. di r(chi2)

(Artificial regression overid statistic readily extends to non-homoskedastic case)

. xtreg n w k, re cluster(id)

. xtoverid

Citation

xtoverid is not an official Stata command. It is a free contribution to the research community, like a paper. Please cite it as such:

Schaffer, M.E., Stillman, S. 2010. xtoverid: Stata module to calculate tests of overidentifying restrictions after xtreg, xtivreg, xtivreg2 and xthtaylor. http://ideas.repec.org/c/boc/bocode/s456779.html

References

Arellano, M. 1993. On the testing of correlated effects with panel data. Journal of Econometrics, Vol. 59, Nos. 1-2, pp. 87-97.

Baltagi, B. 2005. Econometric Analysis of Panel Data. New York: Wiley.

Baum, C. F., Schaffer, M. E., Stillman, S. 2003. Instrumental variables and GMM: Estimation and testing. The Stata Journal, Vol. 3, No. 1, pp. 1-31. Unpublished working paper version: Boston College Department of Economics Working Paper No 545.

Baum, C. F., Schaffer, M. E., Stillman, S., 2006. Enhanced routines for instrumental variables/GMM estimation and testing. Unpublished working paper, forthcoming.

Hayashi, F. 2000. Econometrics. Princeton: Princeton University Press.

Wooldridge, J.M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

Authors

Mark E Schaffer, Heriot-Watt University, UK m.e.schaffer@hw.ac.uk

Steven Stillman, Motu Economic and Public Policy Research, NZ stillman@motu.org.nz

Also see

Manual: [R] ivreg
On-line: help for xtivreg; xtivreg2 (if installed); xthtaylor; ivreg2 (if installed)