.-
help for ^hausman^                              (Jeroen Weesie/ICS Aug 10, '97)
.-

Hausman-test
------------

    ^hausman^ b1 V1 b2 V2 [^,^ ^t^able ^e^stnames^(^str1 str2^)^ ^z^ero ]


Description
-----------

^hausman^ computes a Hausman test statistic H for a hypothesis Ho. The test is
based on the difference between two estimates b1 and b2. Under Ho, b1 is
assumed to be a consistent and efficient estimate with asymptotic covariance
matrix V1. The alternative estimator b2, with asymptotic covariance matrix V2,
is consistent (but usually inefficient) both under Ho and under the
alternative hypothesis Ha.

A large difference b1-b2 between the estimates is seen as evidence against Ho.
Here "large" is measured by the Mahalanobis distance between b1 and b2. Under
the assumptions above, var(b1-b2) = V2-V1 under Ho. Then

    H = (b1-b2)' Inv(V2-V1) (b1-b2)

is asymptotically chi-square distributed with k = rank(V2-V1) degrees of
freedom under Ho (see Hausman & McFadden 1984; Amemiya 1985: 146-7).

^hausman^ expects b1 and b2 to be row vectors, and V1 and V2 to be symmetric
positive (semi-)definite matrices. The column labels of b1 and V1, and
similarly of b2 and V2, should be identical. These conditions are met if b and
V are obtained by get(_b) and get(VCE) after an estimation command. Moreover,
V2-V1 should be positive (semi-)definite. Be careful about the order of
(b1,V1) and (b2,V2): the efficient estimator comes first.

Some of the parameters (elements) of b1 and b2 may not occur in the other
vector (for an example, see below). ^hausman^ then selects the elements of b1
and b2 that occur in both, matching on the names, including the equation
names. The user of ^hausman^ should therefore take care that the labels of
b1/V1 and b2/V2 are defined in a compatible way. On the other hand, ^hausman^
does not assume that the parameters occur in the same order in b1 and b2.

On output, ^S_1^ contains the H statistic, ^S_2^ contains the degrees of
freedom, and ^S_3^ the names of the variables, including the equation names,
that were included in the test.
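
The computation described above can be sketched outside Stata. The Python
fragment below is an illustration only, not the code of ^hausman^: the
function name, the argument layout (a list of names next to each vector and
matrix), and all numbers are hypothetical. It matches parameters by name,
forms b1-b2 and V2-V1 on the common parameters, and evaluates the quadratic
form. It assumes that V2-V1 is nonsingular, so that the degrees of freedom
equal the number of common parameters; the rank-deficient case is not handled.

```python
def hausman(names1, b1, V1, names2, b2, V2):
    """Return (H, df, common) for the Hausman statistic.

    Illustrative sketch: assumes V2 - V1 is nonsingular on the common
    parameters, so df is simply the number of common parameters.
    """
    # Keep the parameters present in both estimates, in the order of names1.
    common = [n for n in names1 if n in names2]
    i1 = [names1.index(n) for n in common]
    i2 = [names2.index(n) for n in common]
    k = len(common)

    # d = b1 - b2 and D = V2 - V1, restricted to the common parameters.
    d = [b1[i] - b2[j] for i, j in zip(i1, i2)]
    D = [[V2[i2[r]][i2[c]] - V1[i1[r]][i1[c]] for c in range(k)]
         for r in range(k)]

    # Solve D x = d by Gauss-Jordan elimination with partial pivoting.
    A = [row[:] + [rhs] for row, rhs in zip(D, d)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[piv][col]) < 1e-12:
            raise ValueError("V2 - V1 is singular on the common parameters")
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    x = [A[r][k] / A[r][r] for r in range(k)]

    # H = d' Inv(D) d
    H = sum(di * xi for di, xi in zip(d, x))
    return H, k, common


# Made-up numbers: estimate 1 has three parameters, estimate 2 has two,
# listed in a different order.
names1 = ["money", "time", "green"]
b1 = [-0.30, -0.05, 0.80]
V1 = [[0.01, 0.0, 0.0],
      [0.0, 0.0004, 0.0],
      [0.0, 0.0, 0.09]]

names2 = ["time", "money"]
b2 = [-0.07, -0.20]
V2 = [[0.0008, 0.0],
      [0.0, 0.03]]

H, df, common = hausman(names1, b1, V1, names2, b2, V2)
print(H, df, common)
```

With these made-up inputs only "money" and "time" enter the test, and H is
approximately 1.5 on 2 degrees of freedom. Swapping the roles of the two
estimates would make V2-V1 negative definite, which is why the order of the
arguments matters.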

Options
-------

^table^ produces a table showing, for each of the common parameters, the
    coefficients and standard deviations of both estimates, and the scaled
    difference between the coefficients.

^estnames(^str1 str2^)^ specifies labels for the two estimators in the table
    of coefficients. The labels should be separated by white space. They are
    truncated to 16 characters. If no labels are specified, "Estimator 1" and
    "Estimator 2" are used.

^zero^ specifies that parameters with an associated variance of zero are to be
    treated as absent.

^nodisplay^ suppresses the display of the statistic and test.


Example
-------

Hausman's "context of discovery" was a test of the assumption of
"Independence of Irrelevant Alternatives" (IIA) that underlies the multinomial
logit model. We discuss Hausman's test in this context. The command ^iia^
provides a canned solution.

Consider the choice between three means of transportation for commuting
(1=car, 2=public, 3=other), based on three kinds of costs of each alternative:
costs in terms of money, time, and "green" (environmental) convictions. The
data on the respondent with respnr=117, who uses a car for commuting and has
green convictions, look like this:

    means   choice   money   time   green   respnr
    ----------------------------------------------
      1       1       10      20      0      117
      2       0        4      40      0      117
      3       0        2      60      1      117

Green is a dummy that marks the alternatives public/other by 1 for respondents
who agree with environmentally friendly policies.

According to the IIA assumption, the ratio of the probabilities "public/other"
does not depend on whether the alternative "car" is available. To test this
assumption, we first fit a model with the hard constraints money and time.

    . ^clogit choice money time, group(respnr)^
    . ^mat b1 = get(_b)^
    . ^mat V1 = get(VCE)^

Under the assumption of the Independence of Irrelevant Alternatives, the
weights of money and time in the decision between "public" and "other" should
be the same as when deciding among the three alternatives.
The preference order between "public" and "other" is "revealed" only when
either of these two is chosen. Thus, we next fit the model without the
alternative "car" and with all data on "car" choosers excluded. In Stata we
only have to exclude the record on car (means==1); the other records of car
users are excluded automatically, because ^clogit^ drops groups without a
positive outcome.

    . ^clogit choice money time if means != 1, group(respnr)^
    . ^mat b2 = get(_b)^
    . ^mat V2 = get(VCE)^

In both models, we have estimated the "weights" of money and time costs in the
commuting choice. We can now test the IIA assumption with the command
^hausman^:

    . ^hausman b1 V1 b2 V2, t^

We now include "soft" ideological convictions.

    . ^clogit choice money time green, group(respnr)^
    . ^mat b1 = get(_b)^
    . ^mat V1 = get(VCE)^

In this analysis, we estimated the two weights of the "hard" constraints and
the weight of one "soft" constraint. Again we estimate the restricted model.

    . ^clogit choice money time green if means != 1, group(respnr)^
    . ^mat b2 = get(_b)^
    . ^mat V2 = get(VCE)^

What happens? Stata notes that green does not vary between the included
alternatives, and drops the variable. We can, however, still use the command

    . ^hausman b1 V1 b2 V2^

since ^hausman^ selects only the parameters that are included in both b1 and
b2. These are identified by the names that Stata associates with the matrices
b1 and b2.


References
----------

Amemiya, T. (1985) Advanced Econometrics. Basil Blackwell. (pp. 145-146).

Hausman, J.A. and D. McFadden (1984) Specification Tests for the Multinomial
    Logit Model. Econometrica 52: 1377-1398.


Also See
--------

 Manual:  ^[R] st^
On-line:  help for @iia@, @lrtest@, @test@