-------------------------------------------------------------------------------
help for ivreg2
-------------------------------------------------------------------------------

Extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression

Full syntax

ivreg2 depvar [varlist1] (varlist2=varlist_iv) [weight] [if exp] [in range] [, gmm2s bw(#) kernel(string) dkraay(integer) kiefer liml fuller(#) kclass(#) coviv cue b0(matrix) robust cluster(varlist) orthog(varlist_ex) endog(varlist_en) redundant(varlist_ex) partial(varlist) small noconstant hascons smatrix(matrix) wmatrix(matrix) first ffirst savefirst savefprefix(prefix) rf saverf saverfprefix(prefix) nocollin noid level(#) noheader nofooter eform(string) depname(varname) plus]

Replay syntax

ivreg2 [, first ffirst rf level(#) noheader nofooter eform(string) depname(varname) plus]

Version syntax

ivreg2, version

ivreg2 is compatible with Stata version 10.1 or later.

ivreg2 may be used with time-series or panel data, in which case the data must be tsset before using ivreg2; see help tsset.

All varlists may contain time-series operators, but factor variables are not currently supported; see help varlist.

by, rolling, statsby, xi, bootstrap and jackknife are allowed; see help prefix.

aweights, fweights, iweights and pweights are allowed; see help weights.

The syntax of predict following ivreg2 is

predict [type] newvarname [if exp] [in range] [, statistic]

where statistic is

    xb           fitted values; the default
    residuals    residuals
    stdp         standard error of the prediction

These statistics are available both in and out of sample; type "predict ... if e(sample) ..." if wanted only for the estimation sample.
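As a sketch (the dependent variable, regressors and instruments here are hypothetical names), residuals for the estimation sample could be obtained with:

```stata
. ivreg2 y x1 (x2 = z1 z2)
. predict double uhat if e(sample), residuals
```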

Contents

Description
Robust, cluster and 2-way cluster, AC, HAC, and cluster+HAC SEs and statistics
GMM estimation
LIML, k-class and GMM-CUE estimation
Summary of robust, HAC, AC, GMM, LIML and CUE options
Testing overidentifying restrictions
Testing subsets of regressors and instruments for endogeneity
Tests of under- and weak identification
Testing instrument redundancy
First-stage regressions, identification, and weak-id-robust inference
Reduced form estimates
Partialling-out exogenous regressors
OLS and Heteroskedastic OLS (HOLS) estimation
Collinearities
Speed options: nocollin and noid
Small sample corrections
Options summary
Remarks and saved results
Examples
References
Acknowledgements
Authors
Citation of ivreg2

Description

ivreg2 implements a range of single-equation estimation methods for the linear regression model: OLS, instrumental variables (IV, also known as two-stage least squares, 2SLS), the generalized method of moments (GMM), limited-information maximum likelihood (LIML), and k-class estimators. In the language of IV/GMM, varlist1 are the exogenous regressors or "included instruments", varlist_iv are the exogenous variables excluded from the regression or "excluded instruments", and varlist2 the endogenous regressors that are being "instrumented".

ivreg2 will also estimate linear regression models using robust (heteroskedastic-consistent), autocorrelation-consistent (AC), heteroskedastic- and autocorrelation-consistent (HAC) and cluster-robust variance estimates.

ivreg2 is an alternative to Stata's official ivregress. Its features include: two-step feasible GMM estimation (gmm2s option) and continuously-updated GMM estimation (cue option); LIML and k-class estimation; automatic output of overidentification and underidentification test statistics; C statistic test of exogeneity of subsets of instruments (orthog() option); endogeneity tests of endogenous regressors (endog() option); test of instrument redundancy (redundant() option); kernel-based autocorrelation-consistent (AC) and heteroskedastic- and autocorrelation-consistent (HAC) standard errors and covariance estimation (bw(#) option), with user-specified choice of kernel (kernel() option); two-level cluster-robust standard errors and statistics; default reporting of large-sample statistics (z and chi-squared rather than t and F); small option to report small-sample statistics; first-stage regressions reported with various tests and statistics for identification and instrument relevance; ffirst option to report only these identification statistics and not the first-stage regression results themselves. ivreg2 can also be used for ordinary least squares (OLS) estimation using the same command syntax as official regress and newey.

----+ Robust, cluster and 2-level cluster, AC, HAC, and cluster+HAC SEs and statistics +----

The standard errors and test statistics reported by ivreg2 can be made robust to a variety of violations of the assumption of i.i.d. errors. When these options are combined with either the gmm2s or cue options (see below), the parameter estimators reported are also efficient in the presence of the same violation of i.i.d. errors.

The options for SEs and statistics are:

(1) robust causes ivreg2 to report SEs and statistics that are robust to the presence of arbitrary heteroskedasticity.

(2) cluster(varname) SEs and statistics are robust to both arbitrary heteroskedasticity and arbitrary intra-group correlation, where varname identifies the group. See the relevant Stata manual entries on obtaining robust covariance estimates for further details.

(3) cluster(varname1 varname2) provides 2-way clustered SEs and statistics (Cameron et al. 2006, Thompson 2009) that are robust to arbitrary heteroskedasticity and intra-group correlation with respect to 2 non-nested categories defined by varname1 and varname2. See below for a detailed description.

(4) bw(#) requests AC SEs and statistics that are robust to arbitrary autocorrelation.

(5) bw(#) combined with robust requests HAC SEs and statistics that are robust to both arbitrary heteroskedasticity and arbitrary autocorrelation.

(6) bw(#) combined with cluster(varname) is allowed with either 1- or 2-level clustering if the data are panel data that are tsset on the time variable varname. Following Driscoll and Kraay (1998), the SEs and statistics reported will be robust to disturbances that are common to panel units and that are persistent, i.e., autocorrelated.

(7) dkraay(#) is a shortcut for the Driscoll-Kraay SEs for panel data in (6). It is equivalent to clustering on the tsset time variable with the bandwidth supplied as #. The default Bartlett kernel can be overridden with the kernel option.

(8) kiefer implements SEs and statistics for panel data that are robust to arbitrary intra-group autocorrelation (but not heteroskedasticity) as per Kiefer (1980). It is equivalent to specifying the truncated kernel with kernel(tru) and bw(#), where # is the full length of the panel.

Details:

cluster(varname1 varname2) provides 2-way cluster-robust SEs and statistics as proposed by Cameron, Gelbach and Miller (2006) and Thompson (2009). "Two-way cluster-robust" means the SEs and statistics are robust to arbitrary within-group correlation in two distinct non-nested categories defined by varname1 and varname2. A typical application would be panel data where one "category" is the panel and the other "category" is time; the resulting SEs are robust to arbitrary within-panel autocorrelation (clustering on panel id) and to arbitrary contemporaneous cross-panel correlation (clustering on time). There is no point in using 2-way cluster-robust SEs if the categories are nested, because the resulting SEs are equivalent to clustering on the larger category. varname1 and varname2 do not have to uniquely identify observations. The order of varname1 and varname2 does not matter for the results, but processing may be faster if the category with the larger number of groups (typically the panel dimension) is listed first.

Cameron, Gelbach and Miller (2006) show how this approach can accommodate multi-way clustering, where the number of different non-nested categories is arbitrary. Their Stata command cgmreg implements 2-way and multi-way clustering for OLS estimation. The two-way clustered variance-covariance estimator is calculated using 3 different VCEs: one clustered on varname1, the second clustered on varname2, and the third clustered on the intersection of varname1 and varname2. Cameron et al. (2006, pp. 8-9) discuss two possible small-sample adjustments using the number of clusters in each category. cgmreg uses one method (adjusting the 3 VCEs separately, based on the number of clusters in the category each VCE clusters on); ivreg2 uses the second (adjusting the final 2-way cluster-robust VCE using the smaller of the two numbers of clusters). For this reason, ivreg2 and cgmreg will produce slightly different SEs. See also small sample corrections below.
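For illustration (the panel and time identifiers firmid and year, and the other variable names, are hypothetical), two-way clustering on firm and year might look like:

```stata
. ivreg2 y x1 (x2 = z1 z2), cluster(firmid year)
```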

ivreg2 allows a variety of options for kernel-based HAC and AC estimation. The bw(#) option sets the bandwidth used in the estimation and kernel(string) is the kernel used; the default kernel is the Bartlett kernel, also known in econometrics as Newey-West (see help newey). The full list of kernels available is (abbreviations in parentheses): Bartlett (bar); Truncated (tru); Parzen (par); Tukey-Hanning (thann); Tukey-Hamming (thamm); Daniell (dan); Tent (ten); and Quadratic-Spectral (qua or qs). When using the Bartlett, Parzen, or Quadratic-Spectral kernels, the automatic bandwidth selection procedure of Newey and West (1994) can be chosen by specifying bw(auto). ivreg2 can also be used for kernel-based estimation with panel data, i.e., a cross-section of time series. Before using ivreg2 for kernel-based estimation of time-series or panel data, the data must be tsset; see help tsset.

Following Driscoll and Kraay (1998), bw(#) combined with cluster(varname) and applied to panel data produces SEs that are robust to arbitrary common autocorrelated disturbances. The data must be tsset with the time variable specified as varname. Driscoll-Kraay SEs can also be specified using the dkraay(#) option, where # is the bandwidth. The default Bartlett kernel can be overridden with the kernel option. Note that the Driscoll-Kraay variance-covariance estimator is a large-T estimator, i.e., the panel should have a long-ish time-series dimension.

Used with 2-way clustering as per Thompson (2009), bw(#) combined with cluster(varname1 varname2) provides SEs and statistics that are robust to autocorrelated within-panel disturbances (clustering on panel id) and to autocorrelated across-panel disturbances (clustering on time combined with kernel-based HAC). The approach proposed by Thompson (2009) can be implemented in ivreg2 by choosing the truncated kernel kernel(tru) and bw(#), where the researcher knows or assumes that the common autocorrelated disturbances can be ignored after # periods.
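An illustrative sketch (the panel id firmid, time variable year, and bandwidth 3 are all hypothetical choices):

```stata
. tsset firmid year
. ivreg2 y x1 (x2 = z1 z2), robust bw(3)    // HAC, default Bartlett kernel
. ivreg2 y x1 (x2 = z1 z2), dkraay(3)       // Driscoll-Kraay SEs
```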

Important: Users should be aware of the asymptotic requirements for the consistency of the chosen VCE. In particular: consistency of the 1-way cluster-robust VCE requires the number of clusters to go to infinity; consistency of the 2-way cluster-robust VCE requires the numbers of clusters in both categories to go to infinity; consistency of kernel-robust VCEs requires the number of observations in the time dimension to go to infinity. See Angrist and Pischke (2009), Cameron et al. (2006) and Thompson (2009) for detailed discussions of the performance of the cluster-robust VCE when the number of clusters is small.

----+ GMM estimation +---------------------------------------------------------

When combined with the above options, the gmm2s option generates efficient estimates of the coefficients as well as consistent estimates of the standard errors. The gmm2s option implements the two-step efficient generalized method of moments (GMM) estimator. The efficient GMM estimator minimizes the GMM criterion function J=N*g'*W*g, where N is the sample size, g are the orthogonality or moment conditions (specifying that all the exogenous variables, or instruments, in the equation are uncorrelated with the error term) and W is a weighting matrix. In two-step efficient GMM, the efficient or optimal weighting matrix is the inverse of an estimate of the covariance matrix of orthogonality conditions. The efficiency gains of this estimator relative to the traditional IV/2SLS estimator derive from the use of the optimal weighting matrix, the overidentifying restrictions of the model, and the relaxation of the i.i.d. assumption. For an exactly-identified model, the efficient GMM and traditional IV/2SLS estimators coincide, and under the assumptions of conditional homoskedasticity and independence, the efficient GMM estimator is the traditional IV/2SLS estimator. For further details, see Hayashi (2000), pp. 206-13 and 226-27.

The wmatrix option allows the user to specify a weighting matrix rather than computing the optimal weighting matrix. Estimation with the wmatrix option yields a possibly inefficient GMM estimator. ivreg2 will use this inefficient estimator as the first-step GMM estimator in two-step efficient GMM when combined with the gmm2s option; otherwise, ivreg2 reports the regression results using this inefficient GMM estimator.
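A minimal sketch (variable names hypothetical) of two-step efficient GMM, efficient in the presence of heteroskedasticity and of within-cluster correlation, respectively:

```stata
. ivreg2 y x1 (x2 = z1 z2 z3), gmm2s robust
. ivreg2 y x1 (x2 = z1 z2 z3), gmm2s cluster(firmid)
```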

The smatrix option allows the user to directly specify the matrix S, the covariance matrix of orthogonality conditions. ivreg2 will use this matrix in the calculation of the variance-covariance matrix of the estimator, the J statistic and, if the gmm2s option is specified, the two-step efficient GMM coefficients. The smatrix option can be useful for guaranteeing a positive test statistic in user-specified "GMM-distance tests" (see below). For further details, see Hayashi (2000), pp. 220-24.

----+ LIML, k-class and GMM-CUE estimation +-----------------------------------

Maximum-likelihood estimation of a single equation of this form (endogenous RHS variables and excluded instruments) is known as limited-information maximum likelihood or LIML. The overidentifying restrictions test reported after LIML estimation is the Anderson-Rubin (1950) overidentification statistic in a homoskedastic context. LIML, OLS and IV/2SLS are examples of k-class estimators. LIML is a k-class estimator with k=the LIML eigenvalue lambda; 2SLS is a k-class estimator with k=1; OLS is a k-class estimator with k=0. Estimators based on other values of k have been proposed. Fuller's modified LIML (available with the fuller(#) option) sets k = lambda - alpha/(N-L), where lambda is the LIML eigenvalue, L = number of instruments (L1 excluded and L2 included), and the Fuller parameter alpha is a user-specified positive constant. Nagar's bias-adjusted 2SLS estimator can be obtained with the kclass(#) option by setting k = 1 + (L-K)/N, where L-K = number of overidentifying restrictions, K = number of regressors (K1 endogenous and K2=L2 exogenous) and N = the sample size. For a discussion of LIML and k-class estimators, see Davidson and MacKinnon (1993, pp. 644-51).

The GMM generalization of the LIML estimator to the case of possibly heteroskedastic and autocorrelated disturbances is the "continuously-updated" GMM estimator or CUE of Hansen, Heaton and Yaron (1996). The CUE estimator directly maximizes the GMM objective function J=N*g'*W(b_cue)*g, where W(b_cue) is an optimal weighting matrix that depends on the estimated coefficients b_cue.
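For illustration (the variable names and the Fuller parameter alpha=1 are hypothetical choices):

```stata
. ivreg2 y x1 (x2 = z1 z2), liml
. ivreg2 y x1 (x2 = z1 z2), fuller(1)    // Fuller's modified LIML, alpha=1
. ivreg2 y x1 (x2 = z1 z2), cue robust   // CUE, efficient under heteroskedasticity
```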

cue, combined with robust, cluster, and/or bw, generates coefficient estimates that are efficient in the presence of the corresponding deviations from homoskedasticity. Specifying cue with no other options is equivalent to the combination of the options liml and coviv. The CUE estimator requires numerical optimization methods, and the implementation here uses Mata's optimize routine. The starting values are either IV or two-step efficient GMM coefficient estimates. If the user wants to evaluate the CUE objective function at an arbitrary user-defined coefficient vector instead of having ivreg2 find the coefficient vector that minimizes the objective function, the b0(matrix) option can be used. The value of the CUE objective function at b0 is the Sargan or Hansen J statistic reported in the output.

----+ Summary of robust, HAC, AC, GMM, LIML and CUE options +------------------

Estimator option   No VCE option specified          VCE option specified
                                                    (robust, cluster, bw, kernel)
-------------------------------------------------------------------------------
(none)             IV/2SLS                          IV/2SLS with
                   SEs consistent under             robust SEs
                   homoskedasticity

liml               LIML                             LIML with
                   SEs consistent under             robust SEs
                   homoskedasticity

gmm2s              IV/2SLS                          Two-step GMM with
                   SEs consistent under             robust SEs
                   homoskedasticity

cue                LIML                             CUE GMM with
                   SEs consistent under             robust SEs
                   homoskedasticity

kclass             k-class estimator                k-class estimator with
                   SEs consistent under             robust SEs
                   homoskedasticity

wmatrix            Possibly inefficient GMM         Inefficient GMM with
                   SEs consistent under             robust SEs
                   homoskedasticity

gmm2s + wmatrix    Two-step GMM                     Two-step GMM
                   with user-specified first step   with user-specified first step
                   SEs consistent under             robust SEs
                   homoskedasticity

With the bw or bw and kernel VCE options, SEs are autocorrelation-robust (AC). Combining the robust option with bw, SEs are heteroskedasticity- and autocorrelation-robust (HAC).

For further details, see Hayashi (2000), pp. 206-13 and 226-27 (on GMM estimation), Wooldridge (2002), p. 193 (on cluster-robust GMM), and Hayashi (2000), pp. 406-10 or Cushing and McGarvey (1999) (on kernel-based covariance estimation).

----+ Testing overidentifying restrictions +-----------------------------------

The Sargan-Hansen test is a test of overidentifying restrictions. The joint null hypothesis is that the instruments are valid instruments, i.e., uncorrelated with the error term, and that the excluded instruments are correctly excluded from the estimated equation. Under the null, the test statistic is distributed as chi-squared in the number of (L-K) overidentifying restrictions. A rejection casts doubt on the validity of the instruments. For the efficient GMM estimator, the test statistic is Hansen's J statistic, the minimized value of the GMM criterion function. For the 2SLS estimator, the test statistic is Sargan's statistic, typically calculated as N*R-squared from a regression of the IV residuals on the full set of instruments. Under the assumption of conditional homoskedasticity, Hansen's J statistic becomes Sargan's statistic. The J statistic is consistent in the presence of heteroskedasticity and (for HAC-consistent estimation) autocorrelation; Sargan's statistic is consistent if the disturbance is homoskedastic and (for AC-consistent estimation) if it is also autocorrelated. With robust, bw and/or cluster, Hansen's J statistic is reported. In the latter case the statistic allows observations to be correlated within groups. For further discussion see, e.g., Hayashi (2000, pp. 227-8, 407, 417).

The Sargan statistic can also be calculated after ivreg or ivreg2 by the command overid. The features of ivreg2 that are unavailable in overid are the J statistic and the C statistic; the overid options unavailable in ivreg2 are various small-sample and pseudo-F versions of Sargan's statistic and its close relative, Basmann's statistic. See help overid (if installed).

----+ Testing subsets of regressors and instruments for endogeneity +----------

The C statistic (also known as a "GMM distance" or "difference-in-Sargan" statistic), implemented using the orthog option, allows a test of a subset of the orthogonality conditions, i.e., it is a test of the exogeneity of one or more instruments. It is defined as the difference of the Sargan-Hansen statistic of the equation with the smaller set of instruments (valid under both the null and alternative hypotheses) and the equation with the full set of instruments, i.e., including the instruments whose validity is suspect. Under the null hypothesis that both the smaller set of instruments and the additional, suspect instruments are valid, the C statistic is distributed as chi-squared in the number of instruments tested. Note that failure to reject the null hypothesis requires that the full set of orthogonality conditions be valid; the C statistic and the Sargan-Hansen test statistics for the equations with both the smaller and full sets of instruments should all be small. The instruments tested may be either excluded or included exogenous variables. If excluded exogenous variables are being tested, the equation that does not use these orthogonality conditions omits the suspect instruments from the excluded instruments. If included exogenous variables are being tested, the equation that does not use these orthogonality conditions treats the suspect instruments as included endogenous variables. To guarantee that the C statistic is non-negative in finite samples, the estimated covariance matrix of the full set of orthogonality conditions is used to calculate both Sargan-Hansen statistics (in the case of simple IV/2SLS, this amounts to using the MSE from the unrestricted equation to calculate both Sargan statistics). If estimation is by LIML, the C statistic reported is based on the Sargan-Hansen test statistics from the restricted and unrestricted equations. For further discussion, see Hayashi (2000), pp. 218-22 and pp. 232-34.
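As a sketch (variable names hypothetical), testing the exogeneity of one suspect excluded instrument and the endogeneity of one regressor:

```stata
. ivreg2 y x1 (x2 = z1 z2 z3), orthog(z3)    // C test of instrument z3
. ivreg2 y x1 (x2 = z1 z2 z3), endog(x2)     // endogeneity test of x2
```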

Endogeneity tests of one or more endogenous regressors can be implemented using the endog option. Under the null hypothesis that the specified endogenous regressors can actually be treated as exogenous, the test statistic is distributed as chi-squared with degrees of freedom equal to the number of regressors tested. The endogeneity test implemented by ivreg2 is, like the C statistic, defined as the difference of two Sargan-Hansen statistics: one for the equation with the smaller set of instruments, where the suspect regressor(s) are treated as endogenous, and one for the equation with the larger set of instruments, where the suspect regressors are treated as exogenous. Also like the C statistic, the estimated covariance matrix used guarantees a non-negative test statistic. Under conditional homoskedasticity, this endogeneity test statistic is numerically equal to a Hausman test statistic; see Hayashi (2000, pp. 233-34). The endogeneity test statistic can also be calculated after ivreg or ivreg2 by the command ivendog. Unlike the Durbin-Wu-Hausman tests reported by ivendog, the endog option of ivreg2 can report test statistics that are robust to various violations of conditional homoskedasticity; the ivendog option unavailable in ivreg2 is the Wu-Hausman F-test version of the endogeneity test. See help ivendog (if installed).

----+ Tests of under- and weak identification +--------------------------------

ivreg2 automatically reports tests of both underidentification and weak identification. The underidentification test is an LM test of whether the equation is identified, i.e., that the excluded instruments are "relevant", meaning correlated with the endogenous regressors. The test is essentially a test of the rank of a matrix: under the null hypothesis that the equation is underidentified, the matrix of reduced-form coefficients on the L1 excluded instruments has rank=K1-1, where K1=number of endogenous regressors. Under the null, the statistic is distributed as chi-squared with degrees of freedom=(L1-K1+1). A rejection of the null indicates that the matrix is full column rank, i.e., the model is identified.

For a test of whether a particular endogenous regressor alone is identified, see the discussion below of the Angrist-Pischke (2009) procedure.

When errors are assumed to be i.i.d., ivreg2 automatically reports an LM version of the Anderson (1951) canonical correlations test. Denoting the minimum eigenvalue of the canonical correlations as CCEV, the smallest canonical correlation between the K1 endogenous regressors and the L1 excluded instruments (after partialling out the K2=L2 exogenous regressors) is sqrt(CCEV), and the Anderson LM test statistic is N*CCEV, i.e., N times the square of the smallest canonical correlation. With the first or ffirst options, ivreg2 also reports the closely-related Cragg-Donald (1993) Wald test statistic. Again assuming i.i.d. errors, and denoting the minimum eigenvalue of the Cragg-Donald statistic as CDEV, CDEV=CCEV/(1-CCEV), and the Cragg-Donald Wald statistic is N*CDEV. Like the Anderson LM statistic, the Cragg-Donald Wald statistic is distributed as chi-squared with (L1-K1+1) degrees of freedom. Note that a rejection of the null should be treated with caution, because weak instrument problems may still be present. See Hall et al. (1996) for a discussion of this test, and below for discussion of testing for the presence of weak instruments.

When the i.i.d. assumption is dropped and ivreg2 reports heteroskedastic, AC, HAC or cluster-robust statistics, the Anderson LM and Cragg-Donald Wald statistics are no longer valid. In these cases, ivreg2 reports the LM and Wald versions of the Kleibergen-Paap (2006) rk statistic, also distributed as chi-squared with (L1-K1+1) degrees of freedom. The rk statistic can be seen as a generalization of these tests to the case of non-i.i.d. errors; see Kleibergen and Paap (2006) for discussion, and Kleibergen and Schaffer (2007) for a Stata implementation, ranktest. ivreg2 requires ranktest to be installed, and will prompt the user to install it if necessary. If ivreg2 is invoked with the robust option, the rk underidentification test statistics will be heteroskedastic-robust, and similarly with bw and cluster.

"Weak identification" arises when the excluded instruments are correlated with the endogenous regressors, but only weakly. Estimators can perform poorly when instruments are weak, and some estimators (e.g., LIML) are more robust to weak instruments than others (e.g., IV); see, e.g., Stock and Yogo (2002, 2005) for further discussion. When errors are assumed to be i.i.d., the test for weak identification automatically reported by ivreg2 is an F version of the Cragg-Donald Wald statistic, (N-L)/L1*CDEV, where L is the number of instruments and L1 is the number of excluded instruments. Stock and Yogo (2005) have compiled critical values for the Cragg-Donald F statistic for several different estimators (IV, LIML, Fuller-LIML), several different definitions of "perform poorly" (based on bias and test size), and a range of configurations (up to 100 excluded instruments and up to 2 or 3 endogenous regressors, depending on the estimator). ivreg2 will report the Stock-Yogo critical values if these are available; missing values mean that the critical values haven't been tabulated or aren't applicable. See Stock and Yogo (2002, 2005) for details.

When the i.i.d. assumption is dropped and ivreg2 is invoked with the robust, bw or cluster options, the Cragg-Donald-based weak instruments test is no longer valid. ivreg2 instead reports a correspondingly-robust Kleibergen-Paap Wald rk F statistic. The degrees of freedom adjustment for the rk statistic is (N-L)/L1, as with the Cragg-Donald F statistic, except in the cluster-robust case, when the adjustment is N/(N-1) * (N_clust-1)/N_clust, following the standard Stata small-sample adjustment for cluster-robust. In the case of two-way clustering, N_clust is the minimum of N_clust1 and N_clust2. The critical values reported by ivreg2 for the Kleibergen-Paap statistic are the Stock-Yogo critical values for the Cragg-Donald i.i.d. case. The critical values reported with 2-step GMM are the Stock-Yogo IV critical values, and the critical values reported with CUE are the LIML critical values.

----+ Testing instrument redundancy +------------------------------------------

The redundant option allows a test of whether a subset of excluded instruments is "redundant". Excluded instruments are redundant if the asymptotic efficiency of the estimation is not improved by using them. Breusch et al. (1999) show that the condition for the redundancy of a set of instruments can be stated in several equivalent ways: e.g., in the reduced-form regressions of the endogenous regressors on the full set of instruments, the redundant instruments have statistically insignificant coefficients; or the partial correlations between the endogenous regressors and the instruments in question are zero. ivreg2 uses a formulation based on testing the rank of the matrix cross-product between the endogenous regressors and the possibly-redundant instruments after both have all other instruments partialled out; ranktest is used to test whether the matrix has zero rank. The test statistic is an LM test and numerically equivalent to a regression-based LM test. Under the null that the specified instruments are redundant, the statistic is distributed as chi-squared with degrees of freedom=(#endogenous regressors)*(#instruments tested). Rejection of the null indicates that the instruments are not redundant. When the i.i.d. assumption is dropped and ivreg2 reports heteroskedastic, AC, HAC or cluster-robust statistics, the redundancy test statistic is similarly robust. See Baum et al. (2007) for further discussion.

Calculation and reporting of all underidentification and weak identification statistics can be suppressed with the noid option.

----+ First-stage regressions, identification, and weak-id-robust inference +--

The first and ffirst options report various first-stage results and identification statistics. Tests of both underidentification and weak identification are reported for each endogenous regressor separately, using the method described by Angrist and Pischke (2009), pp. 217-18 (see also the note on their "Mostly Harmless Econometrics" blog).

The Angrist-Pischke (AP) first-stage chi-squared and F statistics are tests of underidentification and weak identification, respectively, of individual endogenous regressors. They are constructed by "partialling out" linear projections of the remaining endogenous regressors. The AP chi-squared Wald statistic is distributed as chi-squared with (L1-K1+1) degrees of freedom under the null that the particular endogenous regressor in question is unidentified. In the special case of a single endogenous regressor, the AP statistic reported is identical to the underidentification statistics reported in the ffirst output, namely the Cragg-Donald Wald statistic (if i.i.d.) or the Kleibergen-Paap rk Wald statistic (if robust, cluster-robust, AC or HAC statistics have been requested); see above. Note the difference in the null hypotheses if there are two or more endogenous regressors: the AP test will fail to reject if a particular endogenous regressor is unidentified, whereas the Anderson/Cragg-Donald/Kleibergen-Paap tests of underidentification will fail to reject if any of the endogenous regressors is unidentified.

The AP first-stage F statistic is the F form of the same test statistic. It can be used as a diagnostic for whether a particular endogenous regressor is "weakly identified" (see above). Critical values for the AP first-stage F as a test of weak identification are not available, but the test statistic can be compared to the Stock-Yogo (2002, 2005) critical values for the Cragg-Donald F statistic with K1=1.

The first-stage results are always reported with small-sample statistics, to be consistent with the recommended use of the first-stage F-test as a diagnostic. If the estimated equation is reported with robust standard errors, the first-stage F-test is also robust.
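A sketch of requesting the first-stage diagnostics (variable names hypothetical); the saved first-stage statistics can then be inspected in e(first):

```stata
. ivreg2 y x1 (x2 x3 = z1 z2 z3), ffirst robust
. mat list e(first)
```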

A full set of first-stage statistics for each of the K1 endogenous regressors is saved in the matrix e(first). These include (a) the AP F and chi-squared statistics; (b) the "partial R-squared" (squared partial correlation) corresponding to the AP statistics; (c) Shea's (1997) partial R-squared measure (closely related to the AP statistic, but not amenable to formal testing); (d) the simple F and partial R-squared statistics for each of the first-stage equations, with no adjustments if there is more than one endogenous regressor. In the special case of a single endogenous regressor, these F statistics and partial R-squareds are identical.

The first-stage output also includes two statistics that provide weak-instrument-robust inference for testing the significance of the endogenous regressors in the structural equation being estimated. The first statistic is the Anderson-Rubin (1949) test (not to be confused with the Anderson-Rubin overidentification test for LIML estimation; see above). The second is the closely related Stock-Wright (2000) S statistic. The null hypothesis tested in both cases is that the coefficients of the endogenous regressors in the structural equation are jointly equal to zero, and, in addition, that the overidentifying restrictions are valid. Both tests are robust to the presence of weak instruments. The tests are equivalent to estimating the reduced form of the equation (with the full set of instruments as regressors) and testing that the coefficients of the excluded instruments are jointly equal to zero. In the form reported by ivreg2, the Anderson-Rubin statistic is a Wald test and the Stock-Wright S statistic is a GMM-distance test. Both statistics are distributed as chi-squared with L1 degrees of freedom, where L1=number of excluded instruments. The traditional F-stat version of the Anderson-Rubin test is also reported. See Stock and Wright (2000), Dufour (2003), Chernozhukov and Hansen (2005) and Kleibergen (2007) for further discussion. For related alternative test statistics that are also robust to weak instruments, see condivreg and rivtest, and the corresponding discussions in Moreira and Poi (2003) and Mikusheva and Poi (2006), and in Finlay and Magnusson (2009), respectively.

The savefirst option requests that the individual first-stage regressions be saved for later access using the estimates command. If saved, they can also be displayed using first or ffirst and the ivreg2 replay syntax. The regressions are saved with the prefix "_ivreg2_", unless the user specifies an alternative prefix with the savefprefix(prefix) option.

    +------------------------+
----+ Reduced form estimates +-------------------------------------------

The rf option requests that the reduced-form estimation of the equation be displayed. The saverf option requests that the reduced-form estimation be saved for later access using the estimates command. If saved, it can also be displayed using rf and the ivreg2 replay syntax. The regression is saved with the prefix "_ivreg2_", unless the user specifies an alternative prefix with the saverfprefix(prefix) option.

    +--------------------------------------+
----+ Partialling-out exogenous regressors +-----------------------------

The partial(varlist) option requests that the exogenous regressors in varlist be "partialled out" from all the other variables (other regressors and excluded instruments) in the estimation. If the equation includes a constant, it is automatically partialled out as well. The coefficients corresponding to the regressors in varlist are not calculated. By the Frisch-Waugh-Lovell (FWL) theorem, in IV, two-step GMM and LIML estimation the coefficients for the remaining regressors are the same as those that would be obtained if the variables were not partialled out. (NB: this does not hold for CUE or GMM iterated more than two steps.) The partial option is most useful when using cluster and #clusters < (#exogenous regressors + #excluded instruments). In these circumstances, the covariance matrix of orthogonality conditions S is not of full rank, and efficient GMM and overidentification tests are infeasible since the optimal weighting matrix W = S^-1 cannot be calculated. The problem can be addressed by using partial to partial out enough exogenous regressors for S to have full rank. A similar problem arises when the regressors include a singleton dummy, i.e., a variable with one 1 and N-1 zeros or vice versa, if a robust covariance matrix is requested. The singleton dummy causes the robust covariance matrix estimator to be less than full rank. In this case, partialling out the variable with the singleton dummy solves the problem. Specifying partial(_cons) will cause just the constant to be partialled out, i.e., the equation will be estimated in deviations-from-means form. When ivreg2 is invoked with partial, it reports test statistics with the same small-sample adjustments as if estimating without partial.
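As an illustration of partial(_cons), the slope coefficients should match those from estimation on demeaned data (a sketch assuming hypothetical variables y, x and excluded instrument z, with y_dm, x_dm and z_dm their demeaned counterparts):

. ivreg2 y (x=z), partial(_cons)
. ivreg2 y_dm (x_dm=z_dm), noconstant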
Note that after estimation using the partial option, the post-estimation predict can be used only to generate residuals, and that in the current implementation, partial is not compatible with endogenous variables or instruments (included or excluded) that use time-series operators.

    +-----------------------------------------------+
----+ OLS and Heteroskedastic OLS (HOLS) estimation +--------------------

ivreg2 also allows straightforward OLS estimation by using the same syntax as regress, i.e., ivreg2 depvar varlist1. This can be useful if the user wishes to use one of the features of ivreg2 in OLS regression, e.g., AC or HAC standard errors.

If the list of endogenous variables varlist2 is empty but the list of excluded instruments varlist_iv is not, and the option gmm2s is specified, ivreg2 calculates Cragg's "heteroskedastic OLS" (HOLS) estimator, an estimator that is more efficient than OLS in the presence of heteroskedasticity of unknown form (see Davidson and MacKinnon (1993), pp. 599-600). If the option bw(#) is specified, the HOLS estimator is efficient in the presence of arbitrary autocorrelation; if both bw(#) and robust are specified, the HOLS estimator is efficient in the presence of arbitrary heteroskedasticity and autocorrelation; and if cluster(varlist) is used, the HOLS estimator is efficient in the presence of arbitrary heteroskedasticity and within-group correlation. The efficiency gains of HOLS derive from the orthogonality conditions of the excluded instruments listed in varlist_iv. If no endogenous variables are specified and gmm2s is not specified, ivreg2 reports standard OLS coefficients. The Sargan-Hansen statistic reported when the list of endogenous variables varlist2 is empty is a Lagrange multiplier (LM) test of the hypothesis that the excluded instruments varlist_iv are correctly excluded from the restricted model. If the estimation is LIML, the LM statistic reported is based on the Sargan-Hansen test statistics from the restricted and unrestricted equations. For more on LM tests, see e.g. Wooldridge (2002), pp. 58-60. Note that because the approach of the HOLS estimator has applications beyond heteroskedastic disturbances, and to avoid confusion concerning the robustness of the estimates, the estimators presented above as "HOLS" are described in the output of ivreg2 as "2-Step GMM", "CUE", etc., as appropriate.

    +----------------+
----+ Collinearities +---------------------------------------------------

ivreg2 checks the lists of included instruments, excluded instruments, and endogenous regressors for collinearities and duplicates. If an endogenous regressor is collinear with the instruments, it is reclassified as exogenous. If any endogenous regressors are collinear with each other, some are dropped. If there are any collinearities among the instruments, some are dropped. In Stata 9+, excluded instruments are dropped before included instruments. If any variables are dropped, a list of their names is saved in the macros e(collin) and/or e(dups). Lists of the included and excluded instruments and the endogenous regressors with collinear variables and duplicates removed are also saved in macros with "1" appended to the corresponding macro names.

Collinearity checks can be suppressed with the nocollin option.

    +----------------------------------+
----+ Speed options: nocollin and noid +---------------------------------

Two options are available for speeding execution. nocollin specifies that the collinearity checks not be performed. noid suspends calculation and reporting of the underidentification and weak-identification statistics in the main output.

    +--------------------------+
----+ Small sample corrections +-----------------------------------------

Mean square error = sqrt(RSS/(N-K)) if small, = sqrt(RSS/N) otherwise.

If robust is chosen, the finite-sample adjustment (see [R] regress) to the robust variance-covariance matrix is qc = N/(N-K) if small, qc = 1 otherwise.

If cluster is chosen, the finite-sample adjustment is qc = (N-1)/(N-K)*M/(M-1) if small, where M = number of clusters, and qc = 1 otherwise. If 2-way clustering is used, M = min(M1,M2), where M1 = number of clusters in group 1 and M2 = number of clusters in group 2.

The Sargan and C (difference-in-Sargan) statistics use error variance = RSS/N, i.e., there is no small-sample correction.
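As an illustration, the cluster adjustment can be reconstructed by hand from the saved results, since e(rankxx) = K (a sketch using the abdata panel from the examples below):

. ivreg2 n w k, cluster(id) small
. di (e(N)-1)/(e(N)-e(rankxx)) * e(N_clust)/(e(N_clust)-1)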

A full discussion of these computations and related topics can be found in Baum, Schaffer, and Stillman (2003) and Baum, Schaffer and Stillman (2007). Some features of the program postdate the former article and are described in the latter paper. Some features, such as two-way clustering, postdate the latter article as well.

gmm2s requests the two-step efficient GMM estimator. If no endogenous variables are specified, the estimator is Cragg's HOLS estimator.

liml requests the limited-information maximum likelihood estimator.

fuller(#) specifies that Fuller's modified LIML estimator is calculated using the user-supplied Fuller parameter alpha, a non-negative number. Alpha=1 has been suggested as a good choice.

kclass(#) specifies that a general k-class estimator is calculated using the user-supplied #, a non-negative number.

coviv specifies that the matrix used to calculate the covariance matrix for the LIML or k-class estimator is based on the 2SLS matrix, i.e., with k=1. In this case the covariance matrix will differ from that calculated for the 2SLS estimator only because the estimate of the error variance will differ. The default is for the covariance matrix to be based on the LIML or k-class matrix.

cue requests the GMM continuously-updated estimator (CUE).

b0(matrix) specifies that the J statistic (i.e., the value of the CUE objective function) should be calculated for an arbitrary coefficient vector b0. That vector must be provided as a matrix with appropriate row and column names. Under- and weak-identification statistics are not reported in the output.

robust specifies that the Eicker/Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation. robust combined with cluster() further allows residuals which are not independent within cluster (although they must be independent between clusters). See [U] Obtaining robust variance estimates.

cluster(varlist) specifies that the observations are independent across groups (clusters) but not necessarily independent within groups. With 1-way clustering, cluster(varname) specifies to which group each observation belongs; e.g., cluster(personid) in data with repeated observations on individuals. With 2-way clustering, cluster(varname1 varname2) specifies the two (non-nested) groups to which each observation belongs. Specifying cluster() implies robust.

bw(#) implements AC or HAC covariance estimation with bandwidth equal to #, where # is an integer greater than zero. Specifying robust implements HAC covariance estimation; omitting it implements AC covariance estimation. If the Bartlett (default), Parzen or Quadratic Spectral kernels are selected, the value auto may be given (rather than an integer) to invoke Newey and West's (1994) automatic bandwidth selection procedure.

kernel(string) specifies the kernel to be used for AC and HAC covariance estimation; the default kernel is Bartlett (also known in econometrics as Newey-West). The full list of kernels available is (abbreviations in parentheses): Bartlett (bar); Truncated (tru); Parzen (par); Tukey-Hanning (thann); Tukey-Hamming (thamm); Daniell (dan); Tent (ten); and Quadratic-Spectral (qua or qs).

Note: in the cases of the Bartlett, Parzen, and Tukey-Hanning/Hamming kernels, the number of lags used to construct the kernel estimate equals the bandwidth minus one. Stata's official newey implements HAC standard errors based on the Bartlett kernel, and requires the user to specify the maximum number of lags used and not the bandwidth; see help newey. If these kernels are used with bw(1), no lags are used and ivreg2 will report the usual Eicker/Huber/White/sandwich variance estimates.
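Illustration of the bw(1) case: with these kernels and bandwidth 1 no lags are used, so the two commands below should report identical standard errors (a sketch assuming a tsset dataset with hypothetical variables y and x):

. ivreg2 y x, robust
. ivreg2 y x, bw(1) kernel(bartlett) robust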

wmatrix(matrix) specifies a user-supplied weighting matrix in place of the computed optimal weighting matrix. The matrix must be positive definite. The user-supplied matrix must have the same row and column names as the instrument variables in the regression model (or a subset thereof).

smatrix(matrix) specifies a user-supplied covariance matrix of the orthogonality conditions to be used in calculating the covariance matrix of the estimator. The matrix must be positive definite. The user-supplied matrix must have the same row and column names as the instrument variables in the regression model (or a subset thereof).

orthog(varlist_ex) requests that a C-statistic be calculated as a test of the exogeneity of the instruments in varlist_ex. These may be either included or excluded exogenous variables. The standard order condition for identification applies: the restricted equation that does not use these variables as exogenous instruments must still be identified.

endog(varlist_en) requests that a C-statistic be calculated as a test of the endogeneity of the endogenous regressors in varlist_en.

redundant(varlist_ex) requests an LM test of the redundancy of the instruments in varlist_ex. These must be excluded exogenous variables. The standard order condition for identification applies: the restricted equation that does not use these variables as exogenous instruments must still be identified.

small requests that small-sample statistics (F and t-statistics) be reported instead of large-sample statistics (chi-squared and z-statistics). Large-sample statistics are the default. The exception is the statistic for the significance of the regression, which is always reported as a small-sample F statistic.

noconstant suppresses the constant term (intercept) in the regression. If noconstant is specified, the constant term is excluded from both the final regression and the first-stage regression. To include a constant in the first stage when noconstant is specified, explicitly include a variable containing all 1's in varlist_iv.
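For instance, a first-stage constant can be retained while the structural equation is estimated without one (a sketch with hypothetical variables y, x and excluded instrument z):

. gen byte one = 1
. ivreg2 y (x = z one), noconstant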

first requests that the full first-stage regression results be displayed, along with the associated diagnostic and identification statistics.

ffirst requests the first-stage diagnostic and identification statistics. The results are saved in various e() macros.

nocollin suppresses the checks for collinearities and duplicate variables.

noid suppresses the calculation and reporting of underidentification and weak identification statistics.

savefirst requests that the first-stage regression results be saved for later access using the estimates command. The names under which the first-stage regressions are saved are the names of the endogenous regressors prefixed by "_ivreg2_". If these use Stata's time-series operators, the "." is replaced by a "_". The maximum number of first-stage estimation results that can be saved depends on how many other estimation results the user has already saved and on the maximum supported by Stata.

savefprefix(prefix) requests that the first-stage regression results be saved using the user-specified prefix instead of the default "_ivreg2_".

rf requests that the reduced-form estimation of the equation be displayed.

saverf requests that the reduced-form estimation of the equation be saved for later access using the estimates command. The estimation is stored under the name of the dependent variable prefixed by "_ivreg2_". If this uses Stata's time-series operators, the "." is replaced by a "_".

saverfprefix(prefix) requests that the reduced-form estimation be saved using the user-specified prefix instead of the default "_ivreg2_".

partial(varlist) requests that the exogenous regressors in varlist be partialled out from the other variables in the equation. If the equation includes a constant, it is automatically partialled out as well. The coefficients corresponding to the regressors in varlist are not calculated.

level(#) specifies the confidence level, in percent, for confidence intervals of the coefficients; see help level.

noheader, eform(), depname() and plus are for ado-file writers; see [R] ivreg and [R] regress.

nofooter suppresses the display of the footer containing identification and overidentification statistics, exogeneity and endogeneity tests, lists of endogenous variables and instruments, etc.

version causes ivreg2 to display its current version number and to leave it in the macro e(version). It cannot be used with any other options, and will clear any existing e() saved results.

ivreg2 does not report an ANOVA table. Instead, it reports the RSS and both the centered and uncentered TSS. It also reports both the centered and uncentered R-squared. NB: the TSS and R-squared reported by official ivreg are centered if a constant is included in the regression, and uncentered otherwise.

ivreg2 saves the following results in e():

Scalars

  e(N)            Number of observations
  e(yy)           Total sum of squares (SS), uncentered (y'y)
  e(yyc)          Total SS, centered (y'y - ((1'y)^2)/n)
  e(rss)          Residual SS
  e(mss)          Model SS = yyc - rss if the eqn has a constant, = yy - rss otherwise
  e(df_m)         Model degrees of freedom
  e(df_r)         Residual degrees of freedom
  e(r2u)          Uncentered R-squared, 1 - rss/yy
  e(r2c)          Centered R-squared, 1 - rss/yyc
  e(r2)           Centered R-squared if the eqn has a constant, uncentered otherwise
  e(r2_a)         Adjusted R-squared
  e(ll)           Log likelihood
  e(rankxx)       Rank of the matrix of observations on rhs variables = K
  e(rankzz)       Rank of the matrix of observations on instruments = L
  e(rankV)        Rank of covariance matrix V of coefficients
  e(rankS)        Rank of covariance matrix S of orthogonality conditions
  e(rmse)         Root mean square error = sqrt(rss/(N-K)) if -small-, = sqrt(rss/N) if not
  e(F)            F statistic
  e(N_clust)      Number of clusters (or min(N_clust1,N_clust2) if 2-way clustering)
  e(N_clust1)     Number of clusters in dimension 1 (if 2-way clustering)
  e(N_clust2)     Number of clusters in dimension 2 (if 2-way clustering)
  e(bw)           Bandwidth
  e(lambda)       LIML eigenvalue
  e(kclass)       k in k-class estimation
  e(fuller)       Fuller parameter alpha
  e(sargan)       Sargan statistic
  e(sarganp)      p-value of Sargan statistic
  e(sargandf)     dof of Sargan statistic = degree of overidentification = L-K
  e(j)            Hansen J statistic
  e(jp)           p-value of Hansen J statistic
  e(jdf)          dof of Hansen J statistic = degree of overidentification = L-K
  e(arubin)       Anderson-Rubin overidentification LR statistic N*ln(lambda)
  e(arubinp)      p-value of Anderson-Rubin overidentification LR statistic
  e(arubin_lin)   Anderson-Rubin linearized overidentification statistic N*(lambda-1)
  e(arubin_linp)  p-value of Anderson-Rubin linearized overidentification statistic
  e(arubindf)     dof of A-R overid statistic = degree of overidentification = L-K
  e(idstat)       LM test statistic for underidentification (Anderson or Kleibergen-Paap)
  e(idp)          p-value of underidentification LM statistic
  e(iddf)         dof of underidentification LM statistic
  e(widstat)      F statistic for weak identification (Cragg-Donald or Kleibergen-Paap)
  e(arf)          Anderson-Rubin F-test of significance of endogenous regressors
  e(arfp)         p-value of Anderson-Rubin F-test of endogenous regressors
  e(archi2)       Anderson-Rubin chi-sq test of significance of endogenous regressors
  e(archi2p)      p-value of Anderson-Rubin chi-sq test of endogenous regressors
  e(ardf)         degrees of freedom of Anderson-Rubin tests of endogenous regressors
  e(ardf_r)       denominator degrees of freedom of AR F-test of endogenous regressors
  e(redstat)      LM statistic for instrument redundancy
  e(redp)         p-value of LM statistic for instrument redundancy
  e(reddf)        dof of LM statistic for instrument redundancy
  e(cstat)        C-statistic
  e(cstatp)       p-value of C-statistic
  e(cstatdf)      Degrees of freedom of C-statistic
  e(cons)         1 when equation has a Stata-supplied constant; 0 otherwise
  e(partialcons)  as above but prior to partialling-out (see e(partial))
  e(partial_ct)   Number of partialled-out variables (see e(partial))

Macros

  e(cmd)          ivreg2
  e(cmdline)      Command line invoking ivreg2
  e(version)      Version number of ivreg2
  e(model)        ols, iv, gmm, liml, or kclass
  e(depvar)       Name of dependent variable
  e(instd)        Instrumented (RHS endogenous) variables
  e(insts)        Instruments
  e(inexog)       Included instruments (regressors)
  e(exexog)       Excluded instruments
  e(collin)       Variables dropped because of collinearities
  e(dups)         Duplicate variables
  e(ecollin)      Endogenous variables reclassified as exogenous because of collinearities with instruments
  e(clist)        Instruments tested for orthogonality
  e(redlist)      Instruments tested for redundancy
  e(partial)      Partialled-out exogenous regressors
  e(small)        small
  e(wtype)        weight type
  e(wexp)         weight expression
  e(clustvar)     Name of cluster variable
  e(vcetype)      Covariance estimation method
  e(kernel)       Kernel
  e(tvar)         Time variable
  e(ivar)         Panel variable
  e(firsteqs)     Names of stored first-stage equations
  e(rfeq)         Name of stored reduced-form equation
  e(predict)      Program used to implement predict

Matrices

  e(b)            Coefficient vector
  e(V)            Variance-covariance matrix of the estimators
  e(S)            Covariance matrix of orthogonality conditions
  e(W)            GMM weighting matrix (= inverse of S if efficient GMM estimator)
  e(first)        First-stage regression results
  e(ccev)         Eigenvalues corresponding to the Anderson canonical correlations test
  e(cdev)         Eigenvalues corresponding to the Cragg-Donald test

Functions

  e(sample)       Marks estimation sample

. use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta (Wages of Very Young Men, Zvi Griliches, J.Pol.Ec. 1976)

. xi i.year

(Instrumental variables. Examples follow Hayashi 2000, p. 255.)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), small ffirst

(Testing for the presence of heteroskedasticity in IV/GMM estimation)

. ivhettest, fitlev

(Two-step GMM efficient in the presence of arbitrary heteroskedasticity)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), gmm2s robust

(GMM with user-specified first-step weighting matrix or matrix of orthogonality conditions)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), robust

. predict double uhat if e(sample), resid

. mat accum S = `e(insts)' [iw=uhat^2]

. mat S = 1/`e(N)' * S

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), gmm2s robust smatrix(S)

. mat W = invsym(S)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), gmm2s robust wmatrix(W)

(Equivalence of J statistic and Wald tests of included regressors, irrespective of instrument choice (Ahn, 1997))

. ivreg2 lw (iq=med kww age), gmm2s

. mat S0 = e(S)

. qui ivreg2 lw (iq=kww) med age, gmm2s smatrix(S0)

. test med age

. qui ivreg2 lw (iq=med) kww age, gmm2s smatrix(S0)

. test kww age

. qui ivreg2 lw (iq=age) med kww, gmm2s smatrix(S0)

. test med kww

(Continuously-updated GMM (CUE) efficient in the presence of arbitrary heteroskedasticity. NB: may require 30+ iterations.)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), cue robust

(Sargan-Basmann tests of overidentifying restrictions for IV estimation)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt)

. overid, all

(Tests of exogeneity and endogeneity)

(Test the exogeneity of one regressor)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), gmm2s orthog(s)

(Test the exogeneity of two excluded instruments)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age mrt), gmm2s orthog(age mrt)

(Frisch-Waugh-Lovell (FWL): equivalence of estimations with and without partialling-out)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age), cluster(year)

. ivreg2 lw s expr tenure rns smsa _I* (iq=med kww age), cluster(year) partial(_I*)

(partial(): efficient GMM with #clusters < #instruments feasible after partialling-out)

. ivreg2 lw s expr tenure rns smsa (iq=med kww age), cluster(year) partial(_I*) gmm2s

(Examples following Wooldridge 2002, pp.59, 61)

. use http://fmwww.bc.edu/ec-p/data/wooldridge/mroz.dta

(Equivalence of DWH endogeneity test when regressor is endogenous...)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6)

. ivendog educ

(... endogeneity test using the endog option)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), endog(educ)

(... and C-test of exogeneity when regressor is exogenous, using the orthog option)

. ivreg2 lwage exper expersq educ (=age kidslt6 kidsge6), orthog(educ)

(Heteroskedastic Ordinary Least Squares, HOLS)

. ivreg2 lwage exper expersq educ (=age kidslt6 kidsge6), gmm2s

(Equivalence of Cragg-Donald Wald F statistic and F-test from first-stage regression in special case of single endogenous regressor. Also illustrates the savefirst option.)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), savefirst

. di e(widstat)

. estimates restore _ivreg2_educ

. test age kidslt6 kidsge6

. di r(F)

(Equivalence of Kleibergen-Paap robust rk Wald F statistic and F-test from first-stage regression in special case of single endogenous regressor.)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), robust savefirst

. di e(widstat)

. estimates restore _ivreg2_educ

. test age kidslt6 kidsge6

. di r(F)

(Equivalence of Kleibergen-Paap robust rk LM statistic for identification and LM test of joint significance of excluded instruments in first-stage regression in special case of single endogenous regressor. Also illustrates use of ivreg2 to perform an LM test in OLS estimation.)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), robust

. di e(idstat)

. ivreg2 educ exper expersq (=age kidslt6 kidsge6) if e(sample), robust

. di e(j)

(Equivalence of an LM test of an excluded instrument for redundancy and an LM test of significance from first-stage regression in special case of single endogenous regressor.)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), robust redundant(age)

. di e(redstat)

. ivreg2 educ exper expersq kidslt6 kidsge6 (=age) if e(sample), robust

. di e(j)

(Weak-instrument robust inference: Anderson-Rubin Wald F and chi-sq and Stock-Wright S statistics. Also illustrates use of the saverf option.)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), robust ffirst saverf

. di e(arf)

. di e(archi2)

. di e(sstat)

(Obtaining the Anderson-Rubin Wald F statistic from the reduced-form estimation)

. estimates restore _ivreg2_lwage

. test age kidslt6 kidsge6

. di r(F)

(Obtaining the Anderson-Rubin Wald chi-sq statistic from the reduced-form estimation. Use ivreg2 without small to obtain the large-sample test statistic.)

. ivreg2 lwage exper expersq age kidslt6 kidsge6, robust

. test age kidslt6 kidsge6

. di r(chi2)

(Obtaining the Stock-Wright S statistic as the value of the GMM CUE objective function. Also illustrates use of the b0 option. Coefficients on included exogenous regressors are OLS coefficients, which is equivalent to partialling them out before obtaining the value of the CUE objective function.)

. mat b = 0

. mat colnames b = educ

. qui ivreg2 lwage exper expersq

. mat b = b, e(b)

. ivreg2 lwage exper expersq (educ=age kidslt6 kidsge6), robust b0(b)

. di e(j)

(LIML and k-class estimation using Klein data)

. webuse klein

. tsset yr

(LIML estimates of Klein's consumption function)

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year wagegovt capital1 L.totinc), liml

(Equivalence of LIML and CUE+homoskedasticity+independence)

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year wagegovt capital1 L.totinc), liml coviv

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year wagegovt capital1 L.totinc), cue

(Fuller's modified LIML with alpha=1)

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year wagegovt capital1 L.totinc), fuller(1)

(k-class estimation with Nagar's bias-adjusted IV, k=1+(L-K)/N=1+4/21=1.19)

. ivreg2 consump L.profits (profits wagetot = govt taxnetx year wagegovt capital1 L.totinc), kclass(1.19)

(Kernel-based covariance estimation using time-series data)

. use http://fmwww.bc.edu/ec-p/data/wooldridge/phillips.dta

. tsset year, yearly

(Autocorrelation-consistent (AC) inference in an OLS regression)

. ivreg2 cinf unem, bw(3)

. ivreg2 cinf unem, kernel(qs) bw(auto)

(Heteroskedastic and autocorrelation-consistent (HAC) inference in an OLS regression)

. ivreg2 cinf unem, bw(3) kernel(bartlett) robust small

. newey cinf unem, lag(2)

(AC and HAC in IV and GMM estimation)

. ivreg2 cinf (unem = l(1/3).unem), bw(3)

. ivreg2 cinf (unem = l(1/3).unem), bw(3) gmm2s kernel(thann)

. ivreg2 cinf (unem = l(1/3).unem), bw(3) gmm2s kernel(qs) robust orthog(l1.unem)

(Examples using Large N, Small T Panel Data)

. use http://fmwww.bc.edu/ec-p/data/macro/abdata.dta

. tsset id year

(Two-step effic. GMM in the presence of arbitrary heteroskedasticity and autocorrelation)

. ivreg2 n (w k ys = d.w d.k d.ys d2.w d2.k d2.ys), gmm2s cluster(id)

(Kiefer (1980) SEs - robust to arbitrary serial correlation but not heteroskedasticity)

. ivreg2 n w k, kiefer

. ivreg2 n w k, bw(9) kernel(tru)

(Equivalence of cluster-robust and kernel-robust with truncated kernel and max bandwidth)

. ivreg2 n w k, cluster(id)

. ivreg2 n w k, bw(9) kernel(tru) robust

(Examples using Small N, Large T Panel Data. NB: T is actually not very large - only 20 - so results should be interpreted with caution)

. webuse grunfeld

. tsset

(Autocorrelation-consistent (AC) inference)

. ivreg2 invest mvalue kstock, bw(1) kernel(tru)

(Heteroskedastic and autocorrelation-consistent (HAC) inference)

. ivreg2 invest mvalue kstock, robust bw(1) kernel(tru)

(HAC inference, SEs also robust to disturbances correlated across panels)

. ivreg2 invest mvalue kstock, robust cluster(year) bw(1) kernel(tru)

(Equivalence of Driscoll-Kraay SEs as implemented by ivreg2 and xtscc. See Hoechle (2007) for discussion of xtscc.)

. ivreg2 invest mvalue kstock, dkraay(2) small

. ivreg2 invest mvalue kstock, cluster(year) bw(2) small

. xtscc invest mvalue kstock, lag(1)

(Examples using Large N, Large T Panel Data. NB: T is again not very large - only 20 - so results should be interpreted with caution)

. webuse nlswork

. tsset

(One-way cluster-robust: SEs robust to arbitrary heteroskedasticity and within-panel autocorrelation)

. ivreg2 ln_w grade age ttl_exp tenure, cluster(idcode)

(Two-way cluster-robust: SEs robust to arbitrary heteroskedasticity and within-panel autocorrelation, and contemporaneous cross-panel correlation, i.e., the cross-panel correlation is not autocorrelated)

. ivreg2 ln_w grade age ttl_exp tenure, cluster(idcode year)

(Two-way cluster-robust: SEs robust to arbitrary heteroskedasticity and within-panel autocorrelation and cross-panel autocorrelated disturbances that disappear after 2 lags)

. ivreg2 ln_w grade age ttl_exp tenure, cluster(idcode year) bw(2) kernel(tru)

Ahn, Seung C. 1997. Orthogonality tests in linear models. Oxford Bulletin of Economics and Statistics, Vol. 59, pp. 183-186.

Anderson, T.W. 1951. Estimating linear restrictions on regression coefficients for multivariate normal distributions. Annals of Mathematical Statistics, Vol. 22, pp. 327-51.

Anderson, T. W. and H. Rubin. 1949. Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics, Vol. 20, pp. 46-63.

Anderson, T. W. and H. Rubin. 1950. The asymptotic properties of estimates of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics, Vol. 21, pp. 570-82.

Angrist, J.D. and Pischke, J.-S. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton: Princeton University Press.

Baum, C.F., Schaffer, M.E., and Stillman, S. 2003. Instrumental Variables and GMM: Estimation and Testing. The Stata Journal, Vol. 3, No. 1, pp. 1-31. http://ideas.repec.org/a/tsj/stataj/v3y2003i1p1-31.html. Working paper version: Boston College Department of Economics Working Paper No. 545. http://ideas.repec.org/p/boc/bocoec/545.html. Citations in published work.

Baum, C. F., Schaffer, M.E., and Stillman, S. 2007. Enhanced routines for instrumental variables/GMM estimation and testing. The Stata Journal, Vol. 7, No. 4, pp. 465-506. http://ideas.repec.org/a/tsj/stataj/v7y2007i4p465-506.html. Working paper version: Boston College Department of Economics Working Paper No. 667. http://ideas.repec.org/p/boc/bocoec/667.html. Citations in published work.

Breusch, T., Qian, H., Schmidt, P. and Wyhowski, D. 1999. Redundancy of moment conditions. Journal of Econometrics, Vol. 91, pp. 89-111.

Cameron, A.C., Gelbach, J.B. and Miller, D.L. 2006. Robust Inference with Multi-Way Clustering. NBER Technical Working paper 327. http://www.nber.org/papers/t0327. Forthcoming in the Journal of Business and Economic Statistics.

cgmreg is available at http://www.econ.ucdavis.edu/faculty/dlmiller/statafiles.

Chernozhukov, V. and Hansen, C. 2005. The Reduced Form: A Simple Approach to Inference with Weak Instruments. Working paper, University of Chicago, Graduate School of Business.

Cragg, J.G. and Donald, S.G. 1993. Testing Identifiability and Specification in Instrumental Variables Models. Econometric Theory, Vol. 9, pp. 222-240.

Cushing, M.J. and McGarvey, M.G. 1999. Covariance Matrix Estimation. In L. Matyas (ed.), Generalized Methods of Moments Estimation. Cambridge: Cambridge University Press.

Davidson, R. and MacKinnon, J. 1993. Estimation and Inference in Econometrics. New York: Oxford University Press.

Driscoll, J.C. and Kraay, A. 1998. Consistent Covariance Matrix Estimation With Spatially Dependent Panel Data. Review of Economics and Statistics. Vol. 80, No. 4, pp. 549-560.

Dufour, J.M. 2003. Identification, Weak Instruments and Statistical Inference in Econometrics. Canadian Journal of Economics, Vol. 36, No. 4, pp. 767-808. Working paper version: CIRANO Working Paper 2003s-49. http://www.cirano.qc.ca/pdf/publication/2003s-49.pdf.

Finlay, K., and Magnusson, L.M. 2009. Implementing Weak-Instrument Robust Tests for a General Class of Instrumental-Variables Models. The Stata Journal, Vol. 9, No. 3, pp. 398-421. http://www.stata-journal.com/article.html?article=st0171.

Hall, A.R., Rudebusch, G.D. and Wilcox, D.W. 1996. Judging Instrument Relevance in Instrumental Variables Estimation. International Economic Review, Vol. 37, No. 2, pp. 283-298.

Hansen, L.P., Heaton, J., and Yaron, A. 1996. Finite Sample Properties of Some Alternative GMM Estimators. Journal of Business and Economic Statistics, Vol. 14, No. 3, pp. 262-280.

Hayashi, F. 2000. Econometrics. Princeton: Princeton University Press.

Hoechle, D. 2007. Robust Standard Errors for Panel Regressions with Cross-Sectional Dependence. The Stata Journal, Vol. 7, No. 3, pp. 281-312. http://www.stata-journal.com/article.html?article=st0128.

Kiefer, N.M. 1980. Estimation of Fixed Effect Models for Time Series of Cross-Sections with Arbitrary Intertemporal Covariance. Journal of Econometrics, Vol. 14, No. 2, pp. 195-202.

Kleibergen, F. 2007. Generalizing Weak Instrument Robust Statistics Towards Multiple Parameters, Unrestricted Covariance Matrices and Identification Statistics. Journal of Econometrics, forthcoming.

Kleibergen, F. and Paap, R. 2006. Generalized Reduced Rank Tests Using the Singular Value Decomposition. Journal of Econometrics, Vol. 133, pp. 97-126.

Kleibergen, F. and Schaffer, M.E. 2007. ranktest: Stata module for testing the rank of a matrix using the Kleibergen-Paap rk statistic. http://ideas.repec.org/c/boc/bocode/s456865.html.

Mikusheva, A. and Poi, B.P. 2006. Tests and Confidence Sets with Correct Size When Instruments are Potentially Weak. The Stata Journal, Vol. 6, No. 3, pp. 335-347.

Moreira, M.J. and Poi, B.P. 2003. Implementing Tests with the Correct Size in the Simultaneous Equations Model. The Stata Journal, Vol. 3, No. 1, pp. 57-70.

Newey, W.K. and West, K.D. 1994. Automatic Lag Selection in Covariance Matrix Estimation. Review of Economic Studies, Vol. 61, No. 4, pp. 631-653.

Shea, J. 1997. Instrument Relevance in Multivariate Linear Models: A Simple Measure. Review of Economics and Statistics, Vol. 79, No. 2, pp. 348-352.

Stock, J.H. and Wright, J.H. 2000. GMM with Weak Identification. Econometrica, Vol. 68, No. 5, September, pp. 1055-1096.

Stock, J.H. and Yogo, M. 2005. Testing for Weak Instruments in Linear IV Regression. In D.W.K. Andrews and J.H. Stock, eds. Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg. Cambridge: Cambridge University Press, 2005, pp. 80-108. Working paper version: NBER Technical Working Paper 284. http://www.nber.org/papers/T0284.

Thompson, S.B. 2009. Simple Formulas for Standard Errors that Cluster by Both Firm and Time. http://ssrn.com/abstract=914002.

Wooldridge, J.M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

Acknowledgements

We would like to thank various colleagues who helped us along the way, including David Drukker, Frank Kleibergen, Austin Nichols, Brian Poi, Vince Wiggins, and, not least, the users of ivreg2 who have provided suggestions, spotted bugs, and helped test the package. We are also grateful to Jim Stock and Moto Yogo for permission to reproduce their critical values for the Cragg-Donald statistic.

Citation of ivreg2

ivreg2 is not an official Stata command. It is a free contribution to the research community, like a paper. Please cite it as such:

Baum, C.F., Schaffer, M.E., Stillman, S. 2010. ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression. http://ideas.repec.org/c/boc/bocode/s425401.html

Authors

Christopher F Baum, Boston College, USA baum@bc.edu

Mark E Schaffer, Heriot-Watt University, UK m.e.schaffer@hw.ac.uk

Steven Stillman, Motu Economic and Public Policy Research stillman@motu.org.nz

Also see

Articles:

Stata Journal, volume 3, number 1: st0030
Stata Journal, volume 7, number 4: st0030_3

Manual:

[U] 23 Estimation and post-estimation commands
[U] 29 Overview of model estimation in Stata
[R] ivreg

On-line: help for ivregress, ivreg, newey; overid, ivendog, ivhettest, ivreset, xtivreg2, xtoverid, ranktest, condivreg (if installed); rivtest (if installed); cgmreg (if installed); xtscc (if installed); est, postest; regress