-------------------------------------------------------------------------------
help for goprobit
-------------------------------------------------------------------------------

Maximum-Likelihood Estimation of Generalized Ordered Probit Models

goprobit depvar [indepvars] [weight] [if exp] [in range] [, pl pl(varlist) npl npl(varlist) constraints(clist) robust cluster(varname) level(#) score(newvarlist|stub*) maximize_options]

goprobit shares the features of all estimation commands; see help est. goprobit typed without arguments redisplays previous results.

fweights, iweights, and pweights are allowed; see help weights.

The syntax of predict following goprobit is

predict [type] newvarname(s) [if exp] [in range] [, statistic outcome(outcome)]

where statistic is

    p       probability (specify one new variable and the outcome() option,
            or specify k new variables, k = # of outcomes); the default
    xb      linear prediction (outcome() option required)
    stdp    S.E. of linear prediction (outcome() option required)
    stddp   S.E. of difference in linear predictions (outcome() option is
            outcome(outcome1,outcome2))

Note that you specify one new variable with xb, stdp, and stddp and specify either one or k new variables with p.

These statistics are available both in and out of sample; type "predict ... if e(sample) ..." if wanted only for the estimation sample.

Description

goprobit is a user-written program that estimates generalized ordered probit models. The actual values taken on by the dependent variable are irrelevant except that larger values are assumed to correspond to "higher" outcomes. This model relaxes the parallel regression assumption of the standard ordered probit model; see below and help oprobit. goprobit supports linear constraints and allows the user to partially relax equal coefficients by specifying variables in npl() or pl().

goprobit is a modified version of Vincent Kang Fu's gologit and particularly Richard Williams' gologit2 programs. The current version of gologit2 can estimate the generalized ordered probit model using the link(probit) option and therefore produces results equivalent to goprobit. goprobit was written for Stata 8, and many of the references in this help file are to Stata 8 manuals and commands.

Options

pl, npl, npl(), pl() provide alternative means for imposing or relaxing equal coefficients. Only one may be specified at a time.

pl specified without parameters constrains all independent variables to meet the parallel regression assumption. It will produce results that are equivalent to oprobit.

npl specified without parameters relaxes the parallel regression assumption for all explanatory variables. This is the default option.

pl(varlist) constrains the specified explanatory variables to meet the parallel regression assumption. The remaining explanatory variables are not required to meet the assumption. The variables specified must be a subset of the explanatory variables.

npl(varlist) frees the specified explanatory variables from meeting the parallel regression assumption. All other explanatory variables are constrained to meet the assumption. The variables specified must be a subset of the explanatory variables.

constraints(clist) specifies the linear constraints to be applied during estimation. The default is to perform unconstrained estimation. Constraints are defined with the constraint command. constraints(1) specifies that the model is to be constrained according to constraint 1; constraints(1-4) specifies constraints 1 through 4; constraints(1-4,8) specifies 1 through 4 and 8. Keep in mind that the pl and npl options work by generating across-equation constraints, which may affect how any additional constraints should be specified. When using the constraint command, refer to equations by their equation number, e.g., #1, #2, etc.

robust specifies that the Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation; see [U] 23.14 Obtaining robust variance estimates. robust combined with cluster() allows for observations that are not independent within clusters (although they must be independent between clusters). If you specify pweights, robust is implied.

cluster(varname) specifies that the observations are independent across groups (clusters) but not necessarily within groups. varname specifies to which group each observation belongs; e.g., cluster(personid) in data with repeated observations on individuals. cluster() affects the estimated standard errors and variance-covariance matrix of the estimators (VCE), but not the estimated coefficients. cluster() can be used with pweights to produce estimates for unstratified cluster-sampled data.

level(#) specifies the confidence level in percent for the confidence intervals of the coefficients; see help level.

score(newvarlist|stub*) creates J-1 new variables, where J is the number of observed outcomes. Each new variable contains the contributions to the scores for an equation in the model; see [U] 23.15 Obtaining scores.

If score(newvarlist) is specified, J-1 new variables must be provided.

If score(stub*) is specified, then variables stub1, stub2, ..., stubJ-1 will be created.

The first variable contains d(ln L_i)/d(x_i B_1); the second variable contains d(ln L_i)/d(x_i B_2); and so on.

maximize_options control the maximization process; see help maximize. You should never have to specify them.

Options for predict

p, the default, calculates predicted probabilities.

If you do not specify the outcome() option, you must specify k new variables. For instance, say you fitted your model by typing "goprobit happy income health" and that happy takes on three values. Then you could type "predict p1 p2 p3, p" to obtain all three predicted probabilities.

If you also specify the outcome() option, then you specify one new variable. Say that happy took on values 1, 2, and 3. Then typing "predict p1, p outcome(1)" would produce the same p1 as above, "predict p2, p outcome(2)" the same p2 as above, etc. If happy took on values 7, 22, and 93, you would specify outcome(7), outcome(22), and outcome(93). Alternatively, you could specify the outcomes by referring to the equation number (outcome(#1), outcome(#2), and outcome(#3)).

xb calculates the linear prediction. You must also specify the outcome() option.

stdp calculates the standard error of the linear prediction. You must also specify the outcome() option.

stddp calculates the standard error of the difference in two linear predictions. You must specify the outcome() option, in this case with the two outcomes of interest inside the parentheses; for example, "predict sed, stddp outcome(1,3)".

outcome() specifies for which outcome the statistic is to be calculated. equation() is a synonym for outcome(): it does not matter which one you use. outcome() and equation() can be specified using (1) #1, #2, ..., with #1 meaning the first category of the dependent variable, #2 the second category, etc.; or (2) values of the dependent variable.
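The two ways of addressing an outcome can be illustrated with a small Python sketch. It is only an illustration of the rule stated above, not goprobit's own parsing code; the function name outcome_index and the sample data are hypothetical.

```python
def outcome_index(values, outcome):
    """Map an outcome() argument to a 1-based category position.

    values:  observed values of the dependent variable
    outcome: either a string such as "#2" (category position)
             or an actual value of the dependent variable
    """
    cats = sorted(set(values))            # categories in ascending order
    if isinstance(outcome, str) and outcome.startswith("#"):
        k = int(outcome[1:])              # "#k" addresses the k-th category
        if not 1 <= k <= len(cats):
            raise ValueError("no such category: " + outcome)
        return k
    return cats.index(outcome) + 1        # raises ValueError if value is absent

# Hypothetical dependent variable taking the values 7, 22, and 93.
happy = [7, 22, 93, 22, 7]
print(outcome_index(happy, "#2"), outcome_index(happy, 22))
```

With these values, outcome(#2) and outcome(22) refer to the same category.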

Remarks

The oprobit command included with Stata imposes what is called the parallel regression assumption. By default, goprobit relaxes the parallel regression assumption and allows the effects of the explanatory variables to vary with the point at which the categories of the dependent variable are dichotomized. However, if the pl option is specified, goprobit estimates the standard ordered probit model; e.g., the commands oprobit y x1 x2 x3 and goprobit y x1 x2 x3, pl will produce equivalent results.

In practice, the parallel regression assumption is often violated by the data. Standard advice in such situations is to switch to a non-ordinal model, such as mlogit. Unfortunately, such models do not take into account the ordinal nature of the dependent variable and therefore cannot be efficient. goprobit provides an alternative generalized model introduced by Maddala (1983:46) and Terza (1985). This model allows the parallel regression assumption to be relaxed for some explanatory variables while it is maintained for others. For example, the command goprobit y x1 x2 x3, npl(x1) would relax the parallel regression assumption for x1 while maintaining it for x2 and x3. An equivalent command is goprobit y x1 x2 x3, pl(x2 x3), which forces x2 and x3 to meet the parallel regression assumption while not imposing it on x1.

More formally, suppose we have an ordinal dependent variable Y which takes on the values 1, 2, ..., J. The generalized ordered probit model estimates a set of coefficients (including one for the constant) for each of the J - 1 points at which the dependent variable can be dichotomized. The probability that Y takes on each of the values 1, ..., J is equal to

    P( Y = 1 ) = F( -XB_1 )
    P( Y = j ) = F( -XB_j ) - F( -XB_(j-1) )     j = 2, ..., J - 1
    P( Y = J ) = 1 - F( -XB_(J-1) )

The generalized ordered probit model uses the standard normal cumulative distribution function for F(.), although other distributions may also be used; see help gologit and help gologit2.
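As a worked illustration of the probability formulas, the following Python sketch computes the J outcome probabilities for one observation from its J - 1 linear indices. It is a minimal sketch, not goprobit's code: the function name goprobit_probs and the index values 0.8 and -0.5 are made up for the example, and each XB_j is assumed to include its constant.

```python
from statistics import NormalDist

def goprobit_probs(xb):
    """Outcome probabilities implied by the linear indices XB_1..XB_(J-1).

    xb is a list [XB_1, ..., XB_(J-1)] for one observation.
    Returns [P(Y=1), ..., P(Y=J)].
    """
    F = NormalDist().cdf                  # standard normal CDF
    cum = [F(-v) for v in xb]             # cum[j-1] = F(-XB_j) = P(Y <= j)
    probs = [cum[0]]                      # P(Y = 1)
    for j in range(1, len(cum)):
        probs.append(cum[j] - cum[j - 1]) # P(Y = j), j = 2, ..., J-1
    probs.append(1.0 - cum[-1])           # P(Y = J)
    return probs

# Hypothetical linear indices for one observation with J = 3 outcomes.
probs = goprobit_probs([0.8, -0.5])
print([round(p, 4) for p in probs])
```

By construction the probabilities sum to one; whether each of them is non-negative depends on the ordering of the XB_j.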

The standard ordered probit model (estimated by Stata's oprobit command and by goprobit with the pl option) restricts the B_j coefficients (except the constant) to be the same for every dividing point j = 1, ..., J-1. The generalized ordered probit model (estimated in goprobit via the npl() and pl() options) restricts some B_j coefficients to be the same for every dividing point while others are free to vary.

Note that the generalized ordered probit model imposes explicit restrictions on the parameters. Since probabilities are by definition constrained to be in the range [0,1], valid combinations must satisfy the following inequalities:

    XB_1 >= XB_2 >= XB_3 >= ... >= XB_(J-1)

The current version of goprobit does not impose these restrictions during the maximization process. After fitting the model, the user should verify the validity of the model by calculating predicted probabilities and checking that none are negative. See help gologit2 and http://www.nd.edu/~rwilliam/gologit2/ for further discussion on this topic.

A panel data version of goprobit with random effects can be estimated by regoprob; see help regoprob if installed.
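The validity check described in these remarks can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the linear indices are assumed to have been exported from Stata as one row [XB_1, ..., XB_(J-1)] per observation, following the sign convention of the probability formulas above. The function name invalid_observations and the sample rows are hypothetical.

```python
from statistics import NormalDist

def invalid_observations(xb_rows):
    """Return the positions of observations with a negative implied probability.

    A negative probability arises exactly when the ordering
    XB_1 >= XB_2 >= ... >= XB_(J-1) is violated for that observation.
    """
    F = NormalDist().cdf
    bad = []
    for i, xb in enumerate(xb_rows):
        cum = [F(-v) for v in xb] + [1.0]   # F(-XB_1), ..., F(-XB_(J-1)), 1
        probs = [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]
        if min(probs) < 0:
            bad.append(i)
    return bad

# Hypothetical indices for three observations, J = 3 outcomes.
rows = [[0.8, -0.5],   # ordered: all probabilities positive
        [0.2, 0.6],    # XB_1 < XB_2: P(Y = 2) is negative
        [1.0, 1.0]]    # equal indices: P(Y = 2) is zero, still valid
bad = invalid_observations(rows)
print(bad)
```

Any observation flagged this way indicates that the fitted coefficients do not define a proper probability model at that observation's covariate values.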

Examples

. goprobit happy linc unempl health if male == 1, robust

. goprobit happy linc unempl health if male == 1, robust npl(linc)

. goprobit, level(99)

. predict xb1, xb outcome(#1)

Author

Stefan Boes
Socioeconomic Institute
Statistics and Empirical Economics Research Group
University of Zurich
boes@sts.unizh.ch
http://www.unizh.ch/sts/

Acknowledgements

Richard Williams of the Notre Dame Department of Sociology wrote gologit2. He kindly gave me permission to use his gologit2 code, and I adapted much of it when programming goprobit. For a more detailed description of gologit2 and its features, see the reference below or help gologit2.

References

Boes, S. and R. Winkelmann (2006) "Ordered Response Models." Allgemeines Statistisches Archiv 90: 165-179.

Fu, V.K. (1998) "Estimating Generalized Ordered Logit Models." Stata Technical Bulletin 8: 160-164.

Long, J.S. and J. Freese (2003) "Regression Models for Categorical Dependent Variables Using Stata", revised edition, Stata Press.

Maddala, G. (1983) "Limited-Dependent and Qualitative Variables in Econometrics." Cambridge University Press: Cambridge.

Terza, J. (1985) "Ordered Probit: A Generalization." Communications in Statistics - A. Theory and Methods 14: 1-11.

Williams, R. (2006) "Generalized Ordered Logit/Partial Proportional Odds Models for Ordinal Dependent Variables." The Stata Journal 6(1): 58-82. A pre-publication version is available at http://www.nd.edu/~rwilliam/gologit2/gologit2.pdf.

Winkelmann, R. and S. Boes (2006) "Analysis of Microdata." Springer: Berlin.

Also see

Manual:  [U] 23 Estimation and post-estimation commands,
         [U] 29 Overview of Stata estimation commands

Online:  help for estcom, postest, constraint, oprobit, ologit, gologit, gologit2, regoprob