{smcl}
{* Last revised September 29, 2014}{...}
{hline}
help for {hi:gologit29}
{hline}
{p 4 4 2} NOTE: {cmdab:gologit29} is designed for users of Stata 8 through 10. Those with
newer versions of Stata should use {cmdab:gologit2}, which supports factor variables.
{title:Generalized Ordered Logit Models for Ordinal Dependent Variables}
{p 8 15 2}
{cmdab:gologit29} {it:depvar} [{it:indepvars}]
[{it:weight}] [{cmd:if} {it:exp}] [{cmd:in} {it:range}]
[{cmd:,}
{cmdab:p:l}
{cmdab:p:l(}{it:varlist}{cmd:)}
{cmdab:np:l}
{cmdab:np:l(}{it:varlist}{cmd:)}
{cmdab:auto:fit}
{cmdab:auto:fit(}{it:alpha}{cmd:)}
{cmd:link(}{it:logit/probit/cloglog/loglog/cauchit}{cmd:)}
{cmdab:force}
{cmdab:lrf:orce}
{cmdab:g:amma}
{cmdab:g:amma(}{it:name}{cmd:)}
{cmdab:nol:abel}
{cmdab:mls:tart}
{cmdab:sto:re(}{it:name}{cmd:)}
{cmdab:c:onstraints:(}{it:clist}{cmd:)} {cmdab:r:obust}
{cmdab:cl:uster:(}{it:varname}{cmd:)}
{cmdab:l:evel:(}{it:#}{cmd:)}
{cmdab:or} {cmdab:log} {cmdab:v1}
{cmd:svy} {it:svy_options}
{it:maximize_options} ]
{p 4 4 2}
where {it:svy_options} are
{p 8 8 2}
{cmdab:sub:pop:(}{it:subpop_spec}{cmd:)}
{cmdab:nosvy:adjust}
{cmdab:pr:ob}
{cmd:ci}
{cmd:deff}
{cmd:deft}
{cmd:meff}
{cmd:meft}
{p 4 8 2}
and {it:subpop_spec} is
{p 12 12 2}
[{it:varname}]
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[, {cmdab:srs:subpop} ]
{p 4 4 2}
{cmd:gologit29} shares the features of all estimation commands; see help
{help est}.
{cmd:gologit29} typed without arguments redisplays previous results.
The following options may be given when redisplaying results:
{p 8 8 2}
{cmdab:g:amma}
{cmdab:g:amma(}{it:name}{cmd:)}
{cmdab:sto:re(}{it:name}{cmd:)}
{cmd:or}
{cmdab:l:evel:(}{it:#}{cmd:)}
{cmdab:pr:ob}
{cmd:ci}
{cmd:deff}
{cmd:deft}
{p 4 4 2} {cmd:gologit29} works under both Stata 8.2 and Stata 9 or
higher. If using Stata 9, the {opt by}, {opt nestreg}, {opt stepwise},
{opt xi}, and possibly other prefix commands are allowed; see
{help prefix}. The {cmd:svy} prefix command is NOT currently supported;
use the {cmd:svy} option instead.
{p 4 4 2} {cmd:fweight}s, {cmd:iweight}s, and {cmd:pweight}s are
allowed; see help {help weights}.
{p 4 4 2}
The syntax of {help predict} following {cmd:gologit29} is
{p 8 16 2}{cmd:predict} [{it:type}] {it:newvarname}({it:s})
[{cmd:if} {it:exp}] [{cmd:in} {it:range}]
[{cmd:,} {it:statistic} {cmdab:o:utcome:(}{it:outcome}{cmd:)} ]
{p 4 4 2}
where {it:statistic} is
{p 8 21 2}{cmd:p}{space 8}probability (specify one new variable and
{cmd:outcome()} option, or specify k new variables, k = # of
outcomes); the default{p_end}
{p 8 21 2}{cmd:xb}{space 7}linear prediction ({cmd:outcome()} option
required){p_end}
{p 8 21 2}{cmd:stdp}{space 5}S.E. of linear prediction ({cmd:outcome()} option
required){p_end}
{p 8 21 2}{cmd:stddp}{space 4}S.E. of difference in linear predictions
({cmd:outcome()} option is
{cmd:outcome(}{it:outcome1}{cmd:,}{it:outcome2}{cmd:)}){p_end}
{p 4 4 2}
Note that you specify one new variable with {cmd:xb}, {cmd:stdp}, and
{cmd:stddp} and specify either one or k new variables with {cmd:p}.
{p 4 4 2}
These statistics are available both in and out of sample; type
"{cmd:predict} {it:...} {cmd:if e(sample)} {it:...}" if wanted only for the
estimation sample.
{title:Description}
{p 4 4 2} {cmd:gologit29} is designed for users of Stata 8 through 10.
It does not support factor variables. Those with newer versions of Stata
should use {cmdab:gologit2} instead. {cmd:gologit29} will probably not see further
updates or revisions.
{p 4 4 2} {cmd:gologit29} is a user-written program that estimates
generalized ordered logit models for ordinal dependent variables. The
actual values taken on by the dependent variable are irrelevant except
that larger values are assumed to correspond to "higher" outcomes. Up
to 20 outcomes are allowed. {cmd:gologit29} is inspired by Vincent Fu's
{cmd:gologit} program and is backward compatible with it but offers
several additional powerful options.
{p 4 4 2} A major strength of {cmd:gologit29} is that it can also
estimate three special cases of the generalized model: the {it}
proportional odds/parallel lines model{sf}, the {it}partial proportional odds
model{sf}, and the {it}logistic regression model{sf}. Hence,
{cmd:gologit29} can estimate models that are less restrictive than the
proportional odds /parallel lines models estimated by {cmd:ologit}
(whose assumptions are often violated) but more parsimonious and
interpretable than those estimated by a non-ordinal method, such as
multinomial logistic regression (i.e. {cmd:mlogit}). The {cmd:autofit}
option greatly simplifies the process of identifying partial
proportional odds models that fit the data.
{p 4 4 2} An alternative but equivalent parameterization of the model
that has appeared in the literature is reported when the {cmd:gamma}
option is selected. Other key advantages of {cmd:gologit29} include
support for linear constraints (making it possible to use {cmd:gologit29} for
constrained logistic regression), survey data
estimation, and the computation of estimated probabilities via the
{cmd:predict} command.
{p 4 4 2} Also, if the user considers them more appropriate for their
data, probit, complementary log-log, log-log and cauchit links can be used
instead of logit by specifying the {cmd:link} option, e.g. {opt link(l)}
for logit (the default), {opt link(p)} for probit, {opt link(c)} for complementary log-log,
{opt link(ll)} for log-log, and {opt link(ca)} for cauchit.
{p 4 4 2} {cmd:gologit29} works under both Stata 8.2 and Stata 9 or
higher. Syntax is the same for both versions; but if you are using
Stata 9 or higher, gologit29 supports several prefix commands,
including {cmd:by}, {cmd:nestreg}, {cmd:xi} and {cmd:sw}. Stata 9's
{cmd:svy} prefix command is NOT currently supported; use the {cmd:svy} option
instead.
{p 4 4 2} If you want to estimate marginal effects after
{cmd:gologit29}, it is recommended that you install the most current
versions of the user-written {cmd:mfx2} and/or {cmd:margeff} commands,
both available from SSC. These commands are generally easier to use
than {cmd:mfx} and make it possible to output the results using table-formatting
programs like {cmd:outreg2} and {cmd:estout}.
{p 4 4 2} More information on the statistical theory behind
{cmd:gologit29} as well as several worked examples and a
troubleshooting FAQ can be found at
{browse "http://www3.nd.edu/~rwilliam/gologit2/"}.
{title:Warning & Error Messages}
{p 4 4 2}Note: A trouble-shooting FAQ with additional information can
be found at{break}
{browse "http://www3.nd.edu/~rwilliam/gologit2/tsfaq.html"}
{p 4 4 2}An oddity of gologit/goprobit models is that it is possible
to get negative predicted probabilities. McCullaph & Nelder discuss
this in Generalized Linear Models, 2nd edition, 1989, p. 155: "The
usefulness of non-parallel regression models is limited to some extent
by the fact that the lines must eventually intersect. Negative fitted
values are then unavoidable for some values of x, though perhaps not
in the observed range. If such intersections occur in a sufficiently
remote region of the x-space, this flaw in the model need not be
serious."
{p 4 4 2}This seems to be a fairly rare occurrence, and when it does
occur there are often other problems with the model, e.g. the model is
overly complicated and/or there are very small Ns for some categories
of the dependent variable. Combining categories or simplifying the
model often helps. {opt gologit29} will give a warning message whenever
any in-sample predicted probabilities are negative. If it is just a
few cases, it may not be worth worrying about, but if there are many
cases you may wish to modify your model, data, or sample, or use a
different statistical technique altogether.
{title:Options}
{p 4 8 2} {cmd:pl}, {cmd:npl}, {cmd:npl()}, {cmd:pl()}, {cmd:autofit}
and {cmd:autofit()} provide alternative means for imposing or relaxing
the proportional odds/ parallel lines assumption. Only one may be
specified at a time.
{p 8 12 2} {cmd:autofit(}{it:alpha}{cmd:)} uses an iterative process to
identify the partial proportional odds model that best fits the data.
{it:alpha} is the desired significance level for the tests; {it:alpha}
must be greater than 0 and less than 1. If {cmd:autofit} is specified
without parameters, the default alpha-value is .05. Note that, the
higher {it:alpha} is, the easier it is to reject the parallel lines
assumption, and the less parsimonious the model will tend to be. This
option can take a little while because several models may need to be
estimated. The use of {cmd:autofit} is highly recommended but the other
options provide more control over the final model if the user wants it.
{p 8 12 2} {cmd:pl} specified without parameters constrains all
independent variables to meet the proportional odds/ parallel lines
assumption. It will produce results that are equivalent to
{cmd:ologit}.
{p 8 12 2} {cmd:npl} specified without parameters relaxes the
proportional odds/ parallel lines assumption for all explanatory
variables. This is the default option and presents results equivalent
to the original {cmd:gologit.}
{p 8 12 2} {cmd:pl(}{it:varlist}{cmd:)} constrains the specified
explanatory variables to meet the proportional odds/ parallel lines
assumption. All other variable effects do not need to meet the
assumption. The variables specified must be a subset of the explanatory
variables.
{p 8 12 2} {cmd:npl(}{it:varlist}{cmd:)} frees the specified explanatory
variables from meeting the proportional odds/ parallel lines assumption.
All other explanatory variables are constrained to meet the assumption.
The variables specified must be a subset of the explanatory variables.
{p 4 8 2} {cmd:link(}{it:logit/probit/cloglog/loglog/cauchit}{cmd:)} specifies
the link function to be used. The legal values are {opt link(logit)},
{opt link(probit)}, {opt link(cloglog)}, {opt link(loglog)}, and {opt link(cauchit)},
which can be abbreviated as {opt link(l)}, {opt link(p)}, {opt link(c)},
{opt link(ll)} and {opt link(ca)}. {opt link(logit)} is the default if the option is
omitted.
{p 8 8 2} The following advice is adapted from Norusis (2005, p. 84):
Probit and logit models are reasonable choices when the changes in the
cumulative probabilities are gradual. If there are abrupt changes,
other link functions should be used. The log-log link may be a good
model when the cumulative probabilities increase from 0 fairly slowly
and then rapidly approach 1. If the opposite is true, namely that the
cumulative probability for lower scores is high and the approach to 1
is slow, the complementary log-log link may describe the data. The
cauchit distribution has tails that are bigger than the normal
distribution’s, hence the cauchit link may be useful when you have
more extreme values in either direction.
{p 8 8 2} NOTE: Programs differ in the names used for these latter two
links. Stata's loglog link corresponds to SPSS PLUM's cloglog link;
and Stata's cloglog link is called nloglog in SPSS.
{p 8 8 2} NOTE: Post-estimation commands that work with this program
may support some links but not others. Check the program documentation
to be sure it works correctly with the link you are using. For example,
post-estimation commands that work with the original {cmd:gologit} will
often work with this program, but only if you are using the logit link.
{p 4 8 2} {opt force} can be used to force {cmd:gologit29} to issue
only warning messages in some situations when it would normally give a
fatal error:
{p 8 8 2} By default, the dependent variable can have a maximum of
20 categories. A variable with more categories than that is probably
a mistaken entry by the user, e.g. a continuous variable has been
specified rather than an ordinal one. But, if your dependent variable
really is ordinal with more than 20 categories, {opt force} will let
{cmd:gologit29} analyze it (although other practical limitations, such
as small sample sizes within categories, may keep it from coming up
with a final solution.)
{p 8 8 2} Also, variables specified in {opt npl(varlist)} and {opt pl(varlist)} are
required to be a subset of the explanatory variables. However, prefix commands like {opt sw}
and {opt nestreg} estimate models that may use only a subset of the X variables, and those
subsets may not include all the variable specified in the npl/pl varlists. Using
{opt force} will allow model estimation to continue in those cases.
{p 8 8 2} Obviously, you should only use {opt force} when you are confident that
you are not making a mistake. {opt force} does not always work in Stata versions
before 9.0.
{p 4 8 2} {cmd:lrforce} forces Stata to report a Likelihood Ratio
Statistic under certain conditions when it ordinarily would not. Some
types of constraints can make a Likelihood Ratio chi-square test
invalid. Hence, to be safe, Stata reports a Wald statistic whenever
constraints are used. But, Likelihood Ratio statistics should be correct
for the types of constraints imposed by the {cmd:pl} and {cmd:npl}
commands. Note that the {cmd:lrforce} option will be ignored when
robust standard errors are specified either directly or indirectly, e.g.
via use of the {cmd:robust} or {cmd:svy} options. Use this option with
caution if you specify other constraints since these may make a LR chi-
square statistic inappropriate.
{p 4 8 2} {cmd:store(}{it:name}{cmd:)} causes the command
{cmd:estimates store {it:name}}
to be executed when {cmd:gologit29} finishes. This is useful
for when you wish to estimate a series of models and want to save the
results. See help {help estimates}.
{p 4 8 2} {cmd:gamma} displays an alternative but equivalent
parameterization of the partial proportional odds model used by Peterson
and Harrell (1990) and Lall et al (2002). Under this parameterization,
there is one Beta coefficient and M-2 Gamma coefficients for each
explanatory variable, where M = the number of categories for Y. The
gammas indicate the extent to which the proportional odds assumption is
violated by the variable, i.e. when the gammas do not significantly
differ from 0 the proportional odds assumption is met. Advantages of
this parameterization include the fact that it is more parsimonious than
the default layout. In addition, by examining the test statistics for
the Gammas, you can get a feel for which variables meet the
proportionality assumption and which do not.
{p 4 8 2} {opt gamma(name)} causes the gamma estimates to be stored as
{it:name}, e.g. {opt g(gamma1)} would store the gamma estimates under
the name {it:gamma1}. This makes the gamma results easily usable with
post- estimation table formatting commands like {cmd:outreg2} and
{cmd:estout}. Do NOT try to make the gamma results active and then
use other post-estimation commands, e.g. {cmd:predict} or {cmd:test}.
Such commands either will not work or, if they do work, may give
incorrect results. Note that only the variances and standard errors
of the gamma estimates are correct; all the covariances of the
estimates are set equal to zero.
{p 4 8 2} {cmd:nolabel} causes the equations to be named eq1, eq2,
etc. The default is to use the first 32 characters of the value labels
and/or the values of Y as the equation labels. Note that some
characters cannot be used in equation names, e.g. the space ( ), the
period (.), the dollar sign ($), and the colon(:), and will be
replaced with the underscore (_) character. Square brackets ([]) and
parentheses will be replaced with curly brackets ({}). The default behavior
works well when the value labels are short and descriptive. It may
not work well when value labels are very long and/or include
characters that have to be changed. If the printout
looks unattractive and/or you are getting strange errors, try changing
the value labels of Y or else use the {cmd:nolabel} option.
{p 4 8 2} {cmd:mlstart} uses an alternative method for computing start
values. This method is slower but sometimes surer. This option
shouldn't be necessary but it can be used if the program is having
trouble for unclear reasons or if you want to confirm that the program
is working correctly.
{p 4 8 2} {cmd:v1} causes {cmd:gologit29} to return results in a
format that is consistent with {cmd: gologit 1.0}. This may be
useful/ necessary for post-estimation commands that were written
specifically for {cmd:gologit}. However, post-estimation commands
written for {cmd:gologit29} may not work correctly if {cmd:v1} is
specified. The {cmd:v1} option only works when using
{cmd:link(logit)}.
{p 4 8 2} {cmd:log} displays the iteration log. By default it is
suppressed.
{p 4 8 2} {cmd:or} reports the estimated coefficients transformed to
relative odds ratios, i.e., exp(b) rather than b; see {hi:[R] ologit}
for a description of this concept. Options {cmd:rrr}, {cmd:eform},
{cmd:irr} and {cmd:hr} produce identical results (labeled differently)
and can also be used. It is up to the user to decide whether the
exp(b) transformation makes sense given the link function used, e.g.
it probably doesn't make sense when using the probit link.
{p 4 8 2} {cmd:constraints(}{it:clist}{cmd:)} specifies the linear
constraints to be applied during estimation. The default is to
perform unconstrained estimation. Constraints are defined with the {
help constraint} command. {cmd:constraints(1)} specifies that the
model is to be constrained according to constraint 1;
{cmd:constraints(1-4)} specifies constraints 1 through 4;
{cmd:constraints(1-4,8)} specifies 1 through 4 and 8. Keep in mind
that the {cmd:pl}, {cmd:npl}, and {cmd:autofit} options work by
generating across-equation constraints, which may affect how any
additional constraints should be specified. When using the
{cmd:constraint} command, it is usually easiest and safest to refer to
equations by their equation #, e.g. #1, #2, etc.
{p 4 8 2} {cmd:robust} specifies that the Huber/White/sandwich
estimator of variance is to be used in place of the traditional
calculation. {cmd:robust} combined with {cmd:cluster()} allows
observations which are not independent within cluster (although they
must be independent between clusters). If you specify {cmd:pweight}s,
{cmd:robust} is implied.
{p 4 8 2} {cmd:cluster(}{it:varname}{cmd:)} specifies that the
observations are independent across groups (clusters) but not
necessarily within groups. {it:varname} specifies to which group each
observation belongs; e.g., {cmd:cluster(personid)} in data with repeated
observations on individuals. {cmd:cluster()} affects the estimated
standard errors and variance-covariance matrix of the estimators (VCE),
but not the estimated coefficients. {cmd:cluster()} can be used with
{cmd:pweight}s to produce estimates for unstratified cluster-sampled
data.
{p 4 8 2} {cmd:level(}{it:#}{cmd:)} specifies the confidence level in
percent for the confidence intervals of the coefficients; see help
{help level}.
{p 4 8 2} {it:maximize_options} control the maximization process; see
help {help maximize}. You should never have to specify most of these.
However, the {opt difficult} option can sometimes be useful with models
that are running very slowly or not converging at all.
{title:Additional Options for Survey Data Estimation}
{p 4 8 2} Stata 9's {cmd:svy} prefix command is not currently
supported. The {cmd:svy} option accomplishes most of the same things.
{cmd:svy} indicates that {cmd:gologit29} is to pick up the {cmd:svy}
settings set by {cmd:svyset} and use the robust variance estimator.
Thus, this option requires the data to be {cmd:svyset}; see help
{help svyset}. {cmd:svy} may not be supplied with {it:weight}s or the
{cmd:strata()}, {cmd:psu()}, {cmd:fpc()}, or {cmd:cluster()} options.
When using svy estimation, Use of {cmd:if} or {cmd:in} restrictions
will not produce correct variance estimates for subpopulations in many
cases. To compute estimates for subpopulations, use the {cmd:subpop()}
option. A typical command using the svy option would look something
like
{p 8 12 2}{cmd:. gologit29 health female black age age2, autofit svy}
{p_end}
{p 4 8 2} The following options are available when the {cmd:svy} option
has been specified. If {cmd:svy} has not been specified, use of these
options will produce an error.
{p 8 12 2}
{cmd:subpop(}{it:subpop_spec}{cmd:)} specifies that estimates be computed
for the single subpopulation identified in {it:subpop_spec}.
{it:subpop_spec} is
{p 16 16 2}
[{it:varname}]
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
[, {cmdab:srs:subpop} ]
{p 12 12 2}
Thus the subpopulation is defined by the observations for which
{it:varname}!=0 that also meet the {cmd:if} and {cmd:in} conditions.
Typically, {it:varname}=1 defines the subpopulation and {it:varname}=0
indicates observations not belonging to the subpopulation. For observations
whose subpopulation status is uncertain, {it:varname} should be set to
missing.
{p 12 16 2}
{cmd:srssubpop} requests that deff and deft be computed using an estimate of
simple-random-sampling variance for sampling within a subpopulation. If
{cmd:srssubpop} is not specified, deff and deft are computed using an estimate
of simple-random-sampling variance for sampling from the entire population.
Typically, {cmd:srssubpop} would be given when computing subpopulation
estimates by strata or by groups of strata.
{p 8 12 2}
{cmd:nosvyadjust} specifies that the model Wald test be carried out as W/k
distributed F(k,d), where W is the Wald test statistic, k is the number of
terms in the model excluding the constant, d = total number of sampled PSUs
minus total number of strata, and F(k,d) is an F distribution. By default, an
adjusted Wald test is conducted: (d-k+1)W/(kd) distributed F(k,d-k+1). Use
of the {cmd:nosvyadjust} option is not recommended.
{p 8 12 2}
{cmd:prob} requests that the t statistic and p-value be displayed. The
degrees of freedom for the t are d = total number of sampled PSUs minus the
total number of strata (regardless of the number of terms in the model). If
no display options are specified then, by default, the t statistic and p-value
are displayed.
{p 8 12 2}
{cmd:ci} requests that confidence intervals be displayed. If no
display options are specified then, by default, confidence intervals are
displayed.
{p 8 12 2}
{cmd:deff} requests that the design-effect measure deff be displayed.
{p 8 12 2}
{cmd:deft} requests that the design-effect measure deft be displayed.
{p 8 12 2}
{cmd:meff} requests that the meff measure of misspecification effects
be displayed. This option must be specified at the time of the initial
estimation.
{p 8 12 2}
{cmd:meft} requests that the meft measure of misspecification effects
be displayed. This option must be specified at the time of the initial
estimation.
{title:Options for predict}
{p 4 8 2} {cmd:p}, the default, calculates predicted probabilities.
{p 8 8 2} If you do not also specify the {cmd:outcome()} option, you
must specify k new variables. For instance, say you fitted your model
by typing "{cmd:gologit29 insure age male}" and that {cmd:insure} takes on
three values. Then you could type "{cmd:predict p1 p2 p3, p}" to obtain
all three predicted probabilities.
{p 8 8 2} If you also specify the {cmd:outcome()} option, then you
specify one new variable. Say that {hi:insure} took on values 1, 2, and
3. Then typing "{cmd:predict p1, p outcome(1)}" would produce the same
{hi:p1} as above, "{cmd:predict p2, p outcome(2)}" the same {hi:p2} as
above, etc. If {hi:insure} took on values 7, 22, and 93, you would
specify {cmd:outcome(7)}, {cmd:outcome(22)}, and {cmd:outcome(93)}.
Alternatively, you could specify the outcomes by referring to the
equation number ({cmd:outcome(#1)}, {cmd:outcome(#2)}, and
{cmd:outcome(#3)}, or, if the variable values are labeled, you can do
something like {cmd:outcome(low)}, {cmd:outcome(medium)}, and
{cmd:outcome(high)}.
{p 4 8 2} If you do not specify {cmd:outcome()}, {cmd:p} (with one new variable specified),
{cmd:xb}, and {cmd:stdp} assume {cmd:outcome(#1)}.
You must specify {cmd:outcome()} with the {cmd:stddp} option.
{p 4 8 2} {cmd:xb} calculates the linear prediction. You should also
specify the {cmd:outcome()} option. Default is {cmd:outcome(#1)}.
{p 4 8 2} {cmd:outcome()} specifies for which outcome the statistic is
to be calculated. {cmd:equation()} is a synonym for {cmd:outcome()}: it
does not matter which you use. {cmd:outcome()} and {cmd:equation()} can
be specified using (1) {cmd:#1}, {cmd:#2}, ..., with {cmd:#1} meaning
the first category of the dependent variable, {cmd:#2} the second
category, etc.; (2) values of the dependent variable; or (3) the value
labels (if any) of the dependent variable.
{p 4 8 2}
{cmd:stdp} calculates the standard error of the linear prediction.
You should also specify the {cmd:outcome()} option. Default is {cmd:outcome(#1)}.
{p 4 8 2}
{cmd:stddp} calculates the standard error of the difference in two
linear predictions. You must specify option {cmd:outcome()} and in this case
you specify the two particular outcomes of interest inside the parentheses;
for example, "{cmd:predict sed, stddp outcome(1,3)}".
{title:Some cautions about using {opt gologit29} with the Stata 9 prefix commands}
{p 4 4 2} In general, {opt gologit29} seems to work well with Stata 9's
prefix commands, e.g. {opt xi}, {opt nestreg}. There are, however,
some combinations of prefix commands and {cmd:gologit29} options that can
be problematic that users should be aware of.
{p 4 4 2} The {opt sw} and {opt nestreg} prefix commands should work
fine with the {opt pl} and {opt npl} options. However, they may
produce error messages and/or unexpected results when combined with
the {opt autofit}, {opt autofit(alpha)}, {opt npl(varlist)} or
{opt pl(varlist)} options. For example, the submodels estimated by
{opt nestreg} may not include all the variables specified in
{opt pl(varlist)}, resulting in a fatal error message. You can override
this error by using the {opt force} option. {opt sw} and
{opt autofit} would be an especially questionable combination since both stepwise
selection of variables and stepwise selection of constraints would be
going on.
{p 4 4 2} Other than the above, {opt gologit29} will hopefully work fine
in most common situations where prefix commands are used. However,
there are many possible esoteric combinations of prefix commands and
{opt gologit29} options and {opt gologit29} is not guaranteed to be
problem-free with all of them.
{title:Examples}
{p 4 4 2} {it}Example 1. Proportional Odds/ Parallel Lines Assumption
Violated.{sf} Long and Freese (2003) present data from the 1977/ 1989
General Social Survey. Respondents are asked to evaluate the
following statement: "A working mother can establish just as warm and
secure a relationship with her child as a mother who does not work."
Responses were coded as 1 = Strongly Disagree (1SD), 2 = Disagree
(2D), 3 = Agree (3A), and 4 = Strongly Agree (4SA). We can do a
global test of the proportional odds assumption by estimating a model
in which no variables are constrained to meet the assumption and
contrasting it with a model in which all are so constrained (the
latter is equivalent to the {cmd:ologit} model).
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, store(unconstrained)}{p_end}
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, pl lrf store(constrained)}
{p_end}
{p 8 12 2}{cmd:. lrtest constrained unconstrained}{p_end}
{p 4 4 2}The LR chi-square from the above is 49.20. These data fail the global test.
{p 4 4 2} However, we can now use the {cmd:autofit} option to see whether a
partial proportional odds model can fit the data. In a partial proportional odds model,
some variables meet the proportional odds assumption while others do not.
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, autofit}{p_end}
{p 4 4 2} The results show that 4 of the 6 variables (white, age, ed,
prst) meet the parallel lines assumption. Only yr89 and male do not.
This model is less restrictive than a model estimated by {cmd:ologit}
would be (whose assumptions are violated in this case) but much more
parsimonious than a non-ordinal alternative such as {cmd:mlogit}.
{p 4 4 2} {it}Example 2: Alternative parameterization using the {cmd:gamma}
option.{sf} Peterson & Harrell (1990) and Lall et al (2002) present an
alternative parameterization of the partial proportional odds model. In
this parameterization, gamma coefficients represent deviations from
proportionality. When the gammas of an explanatory variable do not
significantly differ from 0, the parallel lines assumption for that
variable is met. Using the {cmd:autofit} and {cmd:gamma} options, we can (a)
confirm that Lall came up with the correct partial proportional odds
model, and (b) replicate the results from his Table 5.
{p 8 12 2}{cmd:. use http://www3.nd.edu/~rwilliam/gologit2/lall.dta}{p_end}
{p 8 12 2}{cmd:. gologit29 hstatus heart smoke, lrf gamma autofit}{p_end}
{p 4 4 2} {it}Example 3: Survey data estimation.{sf} By using the
{cmd:svy} option, we can estimate models with data that have been svyset.
{p 8 12 2}{cmd:. use http://www.stata-press.com/data/r8/nhanes2f.dta}{p_end}
{p 8 12 2}{cmd:. gologit29 health female black age age2, svy autofit}{p_end}
{p 8 12 2}{cmd:. gologit29 health black age age2, svy autofit subpop(female)}{p_end}
{p 4 4 2} {it}Example 4. {cmd:gologit 1.0} compatibility.{sf} Some
post-estimation commands - specifically, the {cmd:spost} routines of
Long and Freese - currently work with the original {cmd:gologit} but
not {cmd:gologit29}. That should change in the future. For now, you
can use the {cmd:v1} parameter to make the stored results from
{cmd:gologit29} compatible with {cmd:gologit 1.0}. (Note, however, that
this may make the results non-compatible with post-estimation routines
written for {cmd:gologit29}; also, you have to be using the dafualt
logit link.) Using the working mother's data again,
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. * Use the v1 option to save internally stored results in gologit 1.0 format}{p_end}
{p 8 12 2}{cmd:. quietly gologit29 warm yr89 male white age ed prst, pl(yr89 male) lrf v1}{p_end}
{p 8 12 2}{cmd:. * Use one of Long & Freese's spost routines}{p_end}
{p 8 12 2}{cmd:. prvalue, x(male=0 yr89=1 age=30) rest(mean)}{p_end}
{p 4 4 2} {it}Example 5. The {cmd:predict} command. {sf}
In addition to the standard options ({cmd:xb, stdp, stddp}) the {cmd:predict}
command supports the {cmd:pr} option (abbreviated {cmd:p}) for predicted
probabilities; {cmd:pr} is the default option if nothing else is specified.
For example,
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. quietly gologit29 warm yr89 male white age ed prst, pl(yr89 male) lrf}{p_end}
{p 8 12 2}{cmd:. predict p1 p2 p3 p4}{p_end}
{p 4 4 2} {it}Example 6. Constrained logistic regression. {sf}
The models estimated by Stata's {cmd:logit} and {cmd:ologit} commands are
special cases of the gologit model; but neither of these commands currently
supports the use of linear constraints, such as two variables having equal
effects. {cmd:gologit29} can be used for this purpose. For example,
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. recode warm (1 2 = 0)(3 4 = 1), gen(agree)}{p_end}
{p 8 12 2}{cmd:. * Constrain the effects of male and white to be equal}{p_end}
{p 8 12 2}{cmd:. constraint 1 male = white}{p_end}
{p 8 12 2}{cmd:. gologit29 agree yr89 male white age ed prst, lrf store(constrained) c(1)}{p_end}
{p 4 4 2} {it}Example 7. Other link functions. {sf} By default, and
as its name implies. {cmd:gologit29} uses the logit link. If you
prefer, however, you can specify probit, complementary log log,
log log, or cauchit links. For example, to estimate a goprobit model,
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, link(p)}{p_end}
{p 4 4 2} {it}Example 8. Prefix commands. {sf} If you are using Stata 9 or higher,
{cmd:gologit29} supports many of the prefix commands. For example,
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. sw, pe(.05): gologit29 warm yr89 male}{p_end}
{p 8 12 2}{cmd:. xi: gologit29 warm yr89 i.male}{p_end}
{p 8 12 2}{cmd:. nestreg: gologit29 warm (yr89 male white age) (ed prst)}{p_end}
{p 4 4 2} {it}Example 9. Post-estimation table formatting commands. {sf} Here is an example of how
you could use {cmd:outreg2} to format the results from multiple models. In this example I use both the
regular and the gamma results but most people would probably choose one or the other. The store option
stores the regular results while the g option stores the results from the gamma parameterization.
{p 8 12 2}{cmd:. use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear}{p_end}
{p 8 12 2}{cmd:. * Unconstrained model}{p_end}
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, npl lrf store(gologit) g(gologit_g)}{p_end}
{p 8 12 2}{cmd:. * Autofit model}{p_end}
{p 8 12 2}{cmd:. gologit29 warm yr89 male white age ed prst, autofit lrf store(gologit29) g(gologit29_g)}{p_end}
{p 8 12 2}{cmd:. * Use outreg2 to output the regular results in a single table}{p_end}
{p 8 12 2}{cmd:. outreg2 [gologit gologit29] using regular, replace onecol long nor2 seeout}{p_end}
{p 8 12 2}{cmd:. * Use outreg2 to output the gamma results in a single table}{p_end}
{p 8 12 2}{cmd:. outreg2 [gologit_g gologit29_g] using gamma, replace onecol long nor2 seeout}{p_end}
{title:Author}
{p 5 5}
Richard Williams{break}
Notre Dame Department of Sociology{break}
Richard.A.Williams.5@ND.Edu{break}
{browse "http://www3.nd.edu/~rwilliam/gologit2/"}{p_end}
{title:Acknowledgements}
{p 5 5} Vincent Kang Fu of the Utah Department of Sociology wrote
{cmd:gologit 1.0} and graciously allowed Richard Williams to incorporate
parts of its source code and documentation in {cmd:gologit29}.
{p 5 5}The documentation for Stata 8.2's {cmd:mlogit} command and the
program {cmd:mlogit_p} were major aids in developing the {cmd:gologit29}
documentation and in adding support for the {cmd:predict} command. Much
of the code is adapted from {it}Maximum Likelihood Estimation with Stata,
Second Edition{sf}, by William Gould, Jeffrey Pitblado and William Sribney.
{p 5 5} Sarah Mustillo, Dan Powers, J. Scott Long, Nick Cox, Kit Baum
and Joseph Hilbe provided stimulating and helpful comments. Jeff Pitblado
was extremely helpful in updating the program to use Stata 9's new features.
{title:References}
{p 5 5}Fu, Vincent. 1998. "Estimating Generalized Ordered Logit
Models." Stata Technical Bulletin 8:160-164.
{p 5 5}Lall, R., S.J. Walters, K. Morgan, and MRC CFAS Co-operative
Institute of Public Health. 2002. "A Review of Ordinal Regression
Models Applied on Health-Related Quality of Life Assessments."
Statistical Methods in Medical Research 11:49-67.
{p 5 5}Long, J. Scott and Jeremy Freese. 2006. "Regression Models
for Categorical Dependent Variables Using Stata, 2nd Edition." College
Station, Texas: Stata Press.
{p 5 5}Norusis, Marija. 2005. "SPSS 13.0 Advanced Statistical
Procedures Companion." Upper Saddle River, New Jersey: Prentice Hall.
{p 5 5}Peterson, Bercedis and Frank E. Harrell Jr. 1990. "Partial
Proportional Odds Models for Ordinal Response Variables." Applied
Statistics 39(2):205-217.
{title:Suggested citation if using {cmd:gologit29} in published work }
{p 5 5}{cmd:gologit29} is not an official Stata command. It is a free
contribution to the research community, like a paper. Please cite it
as such.
{p 5 5}Williams, Richard. 2006. "Generalized Ordered Logit/ Partial
Proportional Odds Models for Ordinal Dependent Variables." The Stata
Journal 6(1):58-82.
{p 5 5} The above document provides more detailed explanations and
examples and is recommended reading.
A pre-publication version that includes information
on updates to the program since the article was published is available at
{browse "http://www3.nd.edu/~rwilliam/gologit2/gologit2.pdf"}.
The published article is available for free at
{browse "http://www.stata-journal.com/article.html?article=st0097"}.
{title:Also see}
{p 4 13 2} Online: help for {help estcom}, {help postest}, {help constraint},
{help ologit}, {help svy}, {help svyologit}; if installed
{help mfx2}, {help margeff}