..-
help for ^hausman^ (Jeroen Weesie/ICS Aug 10, '97)
..-
Hausman-test
------------
^hausman b1 V1 b2 V2 [^, ^t^able ^e^stnames^(^str1 str2^) z^ero ]
Description
-----------
^hausman^ computes a Hausman test-statistic H for a hypothesis Ho. The test
is based on the difference between two estimates b1 and b2. Under Ho, b1 is
assumed to be a consistent and efficient estimate with asymptotic covariance
matrix V1. The alternative estimator b2, with asymptotic covariance matrix
V2, is consistent (but usually inefficient) both under Ho and under the
alternative hypothesis Ha.
A large difference b1-b2 between the estimates is seen as evidence against Ho.
Here "large" is measured by the Mahalanobis distance between b1 and b2. Under
the assumptions above, var(b1-b2) = V2-V1 under Ho. Then
H = (b1-b2)' Inv(V2-V1) (b1-b2)
is asymptotically chi-square distributed with k = rank(V2-V1) degrees of
freedom under Ho (see Hausman & McFadden 1984; Amemiya 1985: 146-7).
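As a rough numerical illustration of the formula above, the following Python
sketch evaluates H for small made-up inputs. It is not the code that ^hausman^
itself runs: the function names, the input values, and the assumption that
V2-V1 is nonsingular (so that k equals the number of parameters) are all
introduced here for illustration only.

```python
def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | y], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(y[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            # Assumption of this sketch: the singular case is not handled.
            raise ValueError("V2 - V1 is singular")
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def hausman_stat(b1, V1, b2, V2):
    """Return H = (b1-b2)' Inv(V2-V1) (b1-b2) for equally ordered inputs."""
    n = len(b1)
    d = [u - v for u, v in zip(b1, b2)]
    D = [[V2[i][j] - V1[i][j] for j in range(n)] for i in range(n)]
    x = solve(D, d)  # x = Inv(V2-V1) (b1-b2), without forming the inverse
    return sum(di * xi for di, xi in zip(d, x))


# Made-up example values (not taken from any real estimation).
b1 = [1.0, 2.0]
V1 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [1.5, 1.0]
V2 = [[2.0, 0.0], [0.0, 3.0]]
H = hausman_stat(b1, V1, b2, V2)
print(H)
```

Here V2-V1 is diagonal with entries 1 and 2 and b1-b2 = (-0.5, 1), so
H = 0.25/1 + 1/2 = 0.75 with k = 2 degrees of freedom.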
^hausman^ expects that b1/b2 are row vectors and V1/V2 are symmetric positive
(semi-)definite matrices. The column labels of b1 and V1, and similarly of b2
and V2, should be identical. These conditions are met if b/V are obtained by
get(_b) and get(VCE) after an estimation command. Moreover, V2-V1 should be
positive (semi-)definite. Take care with the order of (b1,V1) and (b2,V2)!
It may be the case that some of the parameters (elements) in b1 and b2 do not
occur in the other vector (for an example, see below). Then ^hausman^ selects
the elements from b1 and b2 that occur in both, based on the names, including
the equation names. Thus, the user of ^hausman^ should be careful in defining
the labels of b1/V1 and b2/V2 in a compatible way. On the other hand,
^hausman^ does not assume that the parameters occur in the same order in b1
and b2.
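The selection by name described above can be pictured with the following
Python sketch. It only illustrates the idea and is not how ^hausman^ is
implemented; the function name, the plain-string labels, and the coefficient
values are hypothetical. With equation names, a label could be a combined
string such as "eq:var". The same selection would also have to be applied to
the rows and columns of V1 and V2.

```python
def common_params(names1, b1, names2, b2):
    """Pick the coefficients whose labels occur in both estimates.

    The result follows the order of names1, so b1 and b2 need not list
    their parameters in the same order.
    """
    pos1 = {name: i for i, name in enumerate(names1)}
    pos2 = {name: j for j, name in enumerate(names2)}
    common = [name for name in names1 if name in pos2]
    sel1 = [b1[pos1[name]] for name in common]
    sel2 = [b2[pos2[name]] for name in common]
    return common, sel1, sel2


# Hypothetical labels and values: "green" occurs only in the first estimate,
# and the second estimate lists its parameters in a different order.
names1 = ["money", "time", "green"]
b1 = [-0.20, -0.05, 0.80]
names2 = ["time", "money"]
b2 = [-0.04, -0.25]

common, sel1, sel2 = common_params(names1, b1, names2, b2)
print(common, sel1, sel2)
```

In this example only "money" and "time" are kept, and the elements of b2 are
reordered to line up with those of b1.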
On output, ^S_1^ contains the H-statistic, ^S_2^ contains the degrees of freedom,
and ^S_3^ the names of the variables, including the equation names, that were
included in the test.
Options
-------
^table^ produces a table that shows, for each of the common parameters, the
coefficients and standard deviations of both estimates, and the scaled
difference between the coefficients.
^estnames(^str1 str2^)^ specifies labels for the two estimators in the table of
coefficients. The labels should be separated by white space. They will be
truncated at length 16. If no labels are specified, "Estimator 1" and
"Estimator 2" are used.
^zero^ specifies that parameters with associated zero variance are to be treated
as absent.
^nodisplay^ suppresses the display of the statistic and test.
Example
-------
Hausman's "context of discovery" is a test for the assumption of "Independence
of Irrelevant Alternatives" (iia) that underlies the multinomial logit model. We
will discuss Hausman's test in this context. The command ^iia^ provides a
canned solution.
Consider the choice between three means of transportation for commuting (1=car,
2=public, 3=other), based on three kinds of costs of each alternative: costs in
terms of money, time, and greenness. Data on a respondent with respnr=117, who
uses a car for commuting and holds green convictions, look like this:
    means   choice   money   time   green   respnr
    ----------------------------------------------
        1        1      10     20       0      117
        2        0       4     40       0      117
        3        0       2     60       1      117
Green is a dummy that marks the alternatives public/other by 1 for respondents
who agree with environmentally friendly policies. According to the iia
assumption the ratio of the probabilities "public/other" does not depend on
whether the alternative "car" is available. To test this assumption, we first
fit a model with the hard constraints money and time.
. ^clogit choice money time, group(respnr)^
. ^mat b1 = get(_b)^
. ^mat V1 = get(VCE)^
Under the assumption of the Independence of Irrelevant Alternatives, the
weights of money and time in the decision between "public" and "other" should
be the same as when deciding among the three alternatives. The preference order
between "public" and "other" is "revealed" only when either of these two is
chosen. Thus, next we fit the model without the alternative "car" and with all
data on "car" choosers excluded. In Stata we only have to exclude the record on
Car (means==1); the other records on car-users are automatically excluded.
. ^clogit choice money time if means != 1, group(respnr)^
. ^mat b2 = get(_b)^
. ^mat V2 = get(VCE)^
In both models, we have estimated the "weights" of money and time costs in
the commuting choice. We can now test the iia assumption with the command
^hausman^:
. ^hausman b1 V1 b2 V2, t^
We now include "soft" ideological convictions.
. ^clogit choice money time green, group(respnr)^
. ^mat b1 = get(_b)^
. ^mat V1 = get(VCE)^
In this analysis, we estimated the two weights of "hard" constraints, and one
soft constraint. Again we estimate the restricted model.
. ^clogit choice money time green if means != 1, group(respnr)^
. ^mat b2 = get(_b)^
. ^mat V2 = get(VCE)^
What happens? Stata notes that -green- does not vary between the included
alternatives, and drops the variable. We can, however, still use the command
. ^hausman b1 V1 b2 V2^
since ^hausman^ is smart enough to select only the parameters included in both
b1 and b2. These are identified by the names that Stata associates with
matrices b1 and b2.
References
----------
Amemiya, Takeshi (1985)
Advanced Econometrics. Basil Blackwell. (p 145-146).
Hausman, J.A. and D. McFadden (1984)
Specification Tests for the Multinomial Logit Model.
Econometrica 52: 1377-1398.
Also See
--------
Manual: ^[R] st^
On-line: help for @iia@, @lrtest@, @test@