```
-------------------------------------------------------------------------------
help for overid
-------------------------------------------------------------------------------

Calculate tests of overidentifying restrictions after ivreg, ivreg2, ivprobit,
ivtobit, reg3

overid [, chi2 dfr f all depvar(varname)]

overid may be used after IV estimation with aweights, fweights and iweights;
see help weights.

Description

+-----------------------------------+
----+ Instrumental variables regression +--------------------------------

overid computes versions of Sargan's (1958) and Basmann's (1960) tests of
overidentifying restrictions for a regression estimated via instrumental
variables in which the number of instruments exceeds the number of regressors:
that is, for an overidentified equation.  These are tests of the joint null
hypothesis that the excluded instruments are valid instruments, i.e.,
uncorrelated with the error term and correctly excluded from the estimated
equation.  A rejection casts doubt on the validity of the instruments.

For single-equation (limited-information) instrumental variables regression (as
implemented in ivreg or ivreg2), write the full set of instruments as Z and the
residuals from the IV estimation as u, let P represent the "projection matrix"
Z*inv(Z'Z)*Z', and let M=I-P, where I is the identity matrix.  N is the number
of observations, L the number of instruments, K the number of regressors, and
L-K the number of overidentifying restrictions.  Then

Sargan's (1958) statistic = u'Pu / (u'u/N)

Basmann's (1960) statistic = u'Pu / (u'Mu/(N-L))

The statistics share the same numerator.  The denominators can be interpreted
as two different estimates of the error variance of the estimated equation,
both of which are consistent (see Davidson and MacKinnon (1993), pp. 235-36).

Under the null hypothesis, both statistics are asymptotically distributed as
chi-squared with L-K degrees of freedom.  Both can be calculated via an
artificial regression of the IV residuals on the full set of instruments; the
Sargan statistic is N * the uncentered R-sq from this regression.  See, e.g.,
Davidson and MacKinnon (1993), p. 236 and Wooldridge (2002), p. 123.

If there are no overidentifying restrictions (i.e., in the case of exact
identification, where the number of excluded instruments equals the number of
right-hand endogenous variables), an error message is printed.

The version of this test that is robust to heteroskedasticity in the errors is
Hansen's J statistic; under the assumption of conditional homoskedasticity,
Hansen's J reduces to Sargan's statistic (see Hayashi (2000), pp. 227-28), and
hence the two statistics are sometimes referred to as the Hansen-Sargan
statistic.  Robust overidentification statistics are available via ivreg2.
overid will not produce a result if either the robust or cluster options are
employed in the preceding IV regression.  ivreg2 also provides "diff-Sargan" or
"C" tests for the endogeneity of a subset of instruments; see help ivreg2 (if
installed) for details.

The test will fail to run if N<L, because Z'Z cannot then be of full rank.

+-----------------------------------------+
----+ Instrumental variables probit and Tobit +--------------------------

overid will report an overidentification statistic after estimation by ivprobit
and ivtobit with the twostep option.  These Stata commands request Newey's
(1987) minimum-distance (or minimum-chi-squared) IV probit and IV Tobit
estimators, respectively.  Lee (1992) shows that the minimized distance for
these estimators provides a test of overidentifying restrictions.  Like the
Sargan and Basmann single-equation statistics, the test statistic is
distributed as chi-squared with (L-K) degrees of freedom under the null that
the instruments are valid.  The test statistic is available after twostep
estimation only.

+---------------------------+
----+ Three-stage least squares +----------------------------------------

overid will report an overidentification statistic after system estimation with
reg3. As Davidson and MacKinnon (2004, p. 532) indicate, a Hansen-Sargan test
of the overidentifying restrictions is based on the 3SLS criterion function
evaluated at the 3SLS parameter estimates. Under the null hypothesis, the
statistic is distributed chi-squared with (G*L - K) degrees of freedom, where
G is the number of simultaneous equations. The procedure will
take proper account of linear constraints on the parameter vector imposed
during estimation.

+------------------+

The command displays the test statistics, degrees of freedom and P-value, and
places values in the return array. See return list for details.

A full discussion of these computations and related topics can be found in
Baum, Schaffer, and Stillman (2003) and Baum, Schaffer and Stillman (2006). A
version of this routine by Schaffer and Stillman that works in the context of
panel data is available as xtoverid.

+----------+
----+ Citation +---------------------------------------------------------

overid is not an official Stata command. It is a free contribution to the
research community, like a paper. Please cite it as such:
Baum, C.F., Schaffer, M.E., Stillman, S., Wiggins, V.  2006.  overid:
Stata module to calculate tests of overidentifying restrictions after
ivreg, ivreg2, ivprobit, ivtobit, reg3.
http://ideas.repec.org/c/boc/bocode/s396802.html

Options

Options chi2, dfr, f and all only pertain to use of overid after ivreg or
ivreg2.

chi2 requests Sargan's and Basmann's chi-squared statistics; this is the
default.

dfr is equivalent to chi2 except that the Sargan statistic has a
small-sample correction:  u'Pu / (u'u/(N-K))

f requests the pseudo-F test versions of the Sargan and Basmann statistics.
Sargan pseudo-F = u'Pu/(L-K) / (u'u/(N-K))
Basmann pseudo-F = u'Pu/(L-K) / (u'Mu/(N-L))

all causes all five statistics to be reported.

depvar must be used after ivprobit, version 1.1.8 or earlier, to specify the
dependent variable of the estimated equation.

Examples

. sysuse auto

. ivreg price mpg (weight turn=length displacement gear_ratio trunk)

. overid

. overid, all

. ivprobit foreign displacement (mpg=length weight turn), twostep

. overid, depvar(foreign)

. ivtobit gear_ratio displacement (mpg=length weight turn) [fw=rep78], twostep ll(2.2)

. overid

. webuse klein

. constraint define 1 [consump]wagepriv = [consump]wagegovt

. constraint define 2 [consump]govt = [wagepriv]govt

. reg3 (consump wagepriv wagegovt govt invest) (wagepriv consump govt capital1 taxnetx)

. overid

. reg3 (consump wagepriv wagegovt govt invest) (wagepriv consump govt capital1 taxnetx), c(1 2)

. overid

Acknowledgements

We are grateful to Austin Nichols for providing a better version of the reg3
code which greatly reduces memory use. Martin Weiss was also helpful in
pointing out a recent bug in official ivtobit which was causing overid to
fail.

References

Basmann, R.L., On Finite Sample Distributions of Generalized Classical Linear
Identifiability Test Statistics.  Journal of the American Statistical
Association, Vol. 55, Issue 292, December 1960, pp. 650-59.

Baum, C. F., Schaffer, M. E., Stillman, S., Instrumental variables and GMM:
Estimation and testing. Stata Journal, Vol. 3, 2003, pp. 1-31. Available as
Working Paper no. 545, Boston College Department of Economics.
http://fmwww.bc.edu/ec-p/WP545.pdf

Baum, C. F., Schaffer, M. E., Stillman, S., 2006. Enhanced routines for
instrumental variables/GMM estimation and testing. Unpublished working
paper, forthcoming.

Davidson, R. and MacKinnon, J., Estimation and Inference in Econometrics.
1993. New York: Oxford University Press.

Davidson, R. and MacKinnon, J., Econometric Theory and Methods.  2004. New
York: Oxford University Press.

Hayashi, F., Econometrics.  2000.  Princeton: Princeton University Press.

Lee, L., Amemiya's Generalized Least Squares and Tests of Overidentification in
Simultaneous Equation Models with Qualitative or Limited Dependent
Variables. Econometric Reviews, Vol. 11, No. 3, 1992, pp. 319-328.

Newey, W.K., Efficient Estimation of Limited Dependent Variable Models with
Endogenous Explanatory Variables. Journal of Econometrics, Vol. 36, 1987,
pp. 231-250.

Sargan, J.D., The Estimation of Economic Relationships Using Instrumental
Variables.  Econometrica, Vol. 26, 1958, pp. 393-415.

Wooldridge, J.M., Econometric Analysis of Cross Section and Panel Data.  2002.
Cambridge, MA: MIT Press.

Authors

Christopher F Baum, Boston College, USA
baum@bc.edu

Mark E Schaffer, Heriot-Watt University, UK
m.e.schaffer@hw.ac.uk

Steven Stillman, Motu, New Zealand
stillman@motu.org.nz

Vince Wiggins, Stata Corporation, USA
vwiggins@stata.com

Also see

Manual:  [R] ivreg, [R] ivprobit, [R] ivtobit, [R] reg3
On-line:  help for ivreg; ivreg2 (if installed); ivprobit; ivtobit; reg3;
xtoverid (if installed)
```
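
The description above notes that the Sargan statistic equals N times the uncentered R-sq from an artificial regression of the IV residuals on the full set of instruments. The following is a minimal sketch of that calculation by hand, using the first ivreg example from the help text. The names uhat, sargan and df are introduced here purely for illustration and are not part of overid. The sketch assumes ivreg is available in the installed Stata version, and it makes no claim about how overid computes the statistic internally.

```stata
* Sketch only: reproduce the Sargan statistic via the artificial regression
sysuse auto, clear
ivreg price mpg (weight turn = length displacement gear_ratio trunk)

* IV residuals u, restricted to the estimation sample
predict double uhat if e(sample), residuals

* Regress the residuals on the full instrument set Z:
* the included exogenous regressor (mpg), the four excluded
* instruments, and the constant
quietly regress uhat mpg length displacement gear_ratio trunk

* The constant is both a regressor and an instrument, so the IV
* residuals have mean zero and the centered R-sq reported by regress
* coincides with the uncentered R-sq
scalar sargan = e(N)*e(r2)

* L = 6 instruments (incl. constant), K = 4 regressors (incl. constant)
scalar df = 6 - 4

display "Sargan N*R-sq = " sargan "   df = " df "   p = " chi2tail(df, sargan)
```

If overid is installed, running overid immediately after the ivreg command should give a Sargan statistic to compare against this value. In a model estimated without a constant the residuals need not have mean zero, so the centered R-sq shortcut used in the sketch would no longer apply.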