{smcl} {* MLB 02Aug2012}{...} {* MLB 22Mar2012}{...} {* MLB 10Apr2010}{...} {* MLB 14Jul2009}{...} {hline} help for {hi:seqlogit10} {hline} {title:Sequential logit model for Stata 9 and 10} {p 8 17 2} {cmd:seqlogit10} {depvar} [{indepvars}] {ifin} {weight} {cmd:,} {opt tree(tree)} [ {opt ofint:erest(varname)} {opt over(varlist)} {opt sd(numlist)} {opt deltasd(varname numlist)} {opt rho(#)} {cmd:{c -(} } {opt pr(numlist)} | {opt mn(# # , # # [, # #, etc.])} | {opt uniform} {cmd:{c )-}} {opt draws(#)} {opt drawstart(#)} {cmd:or} {opt c:onstraints(numlist)} {cmdab:r:obust} {opt cl:uster(clustervar)} {cmd:nolog} {opt l:evel(#)} {it:{help seqlogit##maximize_options:maximize_options}} ] {p 4 4 2}{cmd:by} {it:...} {cmd::} may be used with {cmd:seqlogit}; see help {help by}. {p 4 4 2}{cmd:pweight}s, {cmd:fweight}s and {cmd:iweight}s are allowed; see help {help weights}. {title:Description} {p 4 4 2} {cmd:seqlogit10} fits a sequential logit model in Stata 9.2 or 10. Users with more recent versions of Stata should use {cmd:seqlogit}. This model is know under a variety of other names: sequential response model (maddala 1983), continuation ratio logit (Agresti 2002), model for nested dichotomies (fox 1997), and the Mare model (shavit and blossfeld93) (after (Mare 1981)). {p 4 4 2}A sequential logit model can be estimated quite simply by estimating a number of {help logit} models. The {cmd:seqlogit10} package serves three additional purposes: First, it makes it easier to {help test} hypotheses across transitions since the entire model is estimated simultaneously. Second, it implements the decomposition proposed by Buis (2010a) of the effect of an explanatory variable on the outcome of the process described by the sequential logit into the contributions of each of the transitions. The implementation is discussed in {help seqlogit10 postestimation}. Third, it implements and extends the strategy proposed by Buis (2011) of doing a sensitivity analysis to investigate the potential influence of unobserved variables. {p 4 4 2} For this last purpose, the {cmd:seqlogit10} package allows own to estimate a sequential logit given a scenario concerning the unobserved variables. These effects will only be estimated when the {opt sd()} option is specified. A regular sequential logit model, which assumes that there is no unobserved heterogeneity, is estimated if the {opt sd()} option is not specified. The scenarios assume that these unobserved variables either add up to a standardized normally (Gaussian) distributed variable (the default), or to a standardized discrete variable (when the {opt pr()} option is specified), or a mixture of normal distributions (when the {opt mn()} option is specified), or a uniform distribution (when the {opt uniform} distribution is specified. The effects of this agregate unobserved variable during each transition are specified in the {opt sd()} option. One can allow the effect of the unobserved variable to change over another variable using the {opt deltasd()} option. The correlation during the first transition between this unobserved variable and the variable specified in the {cmd:ofinterest()} option is specified in the {cmd:rho()} option. The scenarios are estimated using maximum simulated likelihood, while the regular sequential logit model is estimated using regular maximum likelihood. Advise on how to use these different scenarios in a sensitivity analysis is given here: {help seqlogit10_sensitivity}. {title:Options} {dlgtab:Model} {phang} {opt tree(tree)} specifies the sequence of transitions that make up the model. The transitions are seperated with commas and the choices within transitions are seperated with colons. The levels are represented by the levels of the {it:depvar}. It is thus convenient to code {it:depvar} as a series of integers. For example, say there are three levels, 1, 2, and 3, and the first transition consist of a choice between value 1 versus values 2 or 3, and the second transition consists (for those who didn't choose value 1) of a choice between values 2 and 3. The tree option should than be: tree(1 : 2 3 , 2 : 3). {p 8 8 2}All values of {it:depvar} must be specified in the tree and all values in the tree must occur in {it:depvar}. Furthermore, all levels must be reachable through one and only one path through the tree. {phang} {opt ofint:erest(varname)} specifies the variable whose effect will be decomposed when using the {helpb seqlogit10 postestimation##seqlogitdecomp:seqlogitdecomp10} command. The variable specified is added to the list of explanatory variables. {phang} {opt over(varlist)} specifies the variable(s) over which the effect of the variable specified in the {opt ofinterest()} option is allowed to change. This/these variable(s) and the interaction effect between the variable(s) spefied in {opt over()} and {opt ofinterest()} are added to the list of explanatory variables. {opt ofinterest()} needs to be specified when specifying {opt over()}. {phang} {opt c:onstraints(numlist)} specifies linear constraints to be applied during estimation, see {helpb constraint}. {dlgtab:Scenarios} {phang} {opt sd(numlist)} specifies for each transition the effect of the standardized unobserved variable. If only one number is specified, this effect will be assumed constant over transitions, otherwise the first number will refer to the first transition, the second number to the second transition, etc. The default is 0. {phang} {opt deltasd(varname numlist)} specifies how the effect of the unobserved variable changes over {it:varname}. If only one number is specified, this change will be assumed constant over transitions, otherwise the first number will refer to the first transition, the second number to the second transition, etc. The default is that the effect of the unobserved variable is constant over all variables. {phang} {opt rho(#)} specifies the correlation of the unobserved variable and the variable specified in {cmd:ofinterest()}. The default is 0. {phang} The {cmd:pr()}, {cmd:mn()}, and {cmd:uniform} options govern the distribution of the unobserved variable. They are mutually exclusive. If the {cmd:pr()}, {cmd:mn()}, and {cmd:uniform} are not specified but the {cmd:sd()} option is specified, than the unobserved variable will be represented by a standard normal distribution. {phang} {opt pr(numlist)} specifies that the unobserved variable is to be represented by a discrete distribution. The numbers in {cmd:pr()} represent the proportion of observations that belong to each category. Since the numbers are propotions, they all need to be larger than 0 and they need to add up to 1. The location of these categories will be chosen such that the mean is 0, the standard deviation is 1, and all categories are separated by the same distance. The {cmd:pr()} option may not be specified without specifying the {cmd:sd()} option. {phang} {opt mn(# # , # # [, # #, etc.])} specified that the unobserved variable is to be represented as a mixture of normal distributions. This option should consist of multiple elements separated by commas, whereby each element consists of two numbers, the first is the proprotion the latter a mean of one of the components of the mixture distribution. The proportions should add up to 1. The means of the components will be transformed such that the mean of the of the distribution equals 0. The variances of the components are assumed to be equal and are choses such that the overall variance equals 1. This means that not all proportion-mean combination will lead to valid mixture distributions, in which case {cmd:seqlogit10} will report an error and stop. {phang} {opt uniform} specifies that the unobserved variable is to be represented by a uniform distribution with mean 0 and standard deviation 1. This could make sense when we think that it is the rank order on the unobserved variables that influences the outcome rather than some metric value, as the distribution of rank scores is a uniform distribution. This assumption could be justified when we assume that the unobserved variables are also be hard to observe for the actors themselves, and that the rank score is easier to observe (Jill is smarter than Jack, we just don't know how much smarter.). {phang} {opt draws(#)} specifies the number of pseudo random draws per observation used when calculating the simulated likelihood. The default is 100. Because maximum simulated likelihood is only used when the {cmd:sd()} option is specified, the {opt draws()} option can only be specified when the {cmd:sd()} option is specified. {phang} {opt drawstart(#)} specifies the index at which the Halton sequence starts. The default is 15. {dlgtab:Reporting} {phang} {opt or} report odds ratios {phang} {opt r:obust} specifies that the Huber/White/sandwich estimator of variance is to be used in place of the traditional calculation; see {hi:[U] 23.14 Obtaining robust variance estimates}. {cmd:robust} combined with {cmd:cluster()} allows observations which are not independent within cluster (although they must be independent between clusters). {phang} {opt c:luster(clustervar)} specifies that the observations are independent across groups (clusters) but not necessarily within groups. {it:clustervar} specifies to which group each observation belongs; e.g., {cmd:cluster(personid)} in data with repeated observations on individuals. See {hi:[U] 23.14 Obtaining robust variance estimates}. Specifying {cmd:cluster()} implies {cmd:robust}. {phang} {opt l:evel(#)} specifies the confidence level, in percent, for the confidence intervals of the coefficients; see help {help level}. {phang} {opt nolog} suppresses an iteration log of the log likelihood {marker maximize_options}{...} {phang} {opt maximize_options}: {opt diff:icult}, {opt tech:nique(algorithm_spec)}, {opt iter:ate(#)}, {opt tr:ace}, {opt grad:ient}, {opt showstep}, {opt hess:ian}, {opt shownr:tolerance}, {opt tol:erance(#)}, {opt ltol:erance(#)}, {opt gtol:erance(#)}, {opt nrtol:erance(#)}, {opt nonrtol:erance(#)}; see {help maximize}. These options are seldom used. {title:Example} {cmd} sysuse nlsw88, clear gen ed = cond(grade< 12, 1, /// cond(grade==12, 2, /// cond(grade<16,3,4))) if grade < . gen byr = (1988-age-1950)/10 gen white = race == 1 if race < . seqlogit10 ed byr south, /// ofinterest(white) over(byr) /// tree(1 : 2 3 4, 2 : 3 4, 3 : 4) /// levels(1=6, 2=12, 3=14, 4= 16) /// or seqlogitdecomp10, /// overat(byr -.5, byr 0, byr .4) /// subtitle("1945" "1950" "1954") /// eqlabel(`""finish" "high school""' /// `""high school v" "some college""' /// `""some college v" "college""') /// xline(0) yline(0) seqlogit10 ed byr south, /// ofinterest(white) over(byr) /// tree(1 : 2 3 4, 2 : 3 4, 3 : 4) /// or sd(1) uhdesc10 {txt} {title:Author} {p 4 4 2} Maarten L. Buis{break} Wissenschaftszentrum Berlin für Sozialforschung, WZB{break} Research unit Skill Formation and Labor Markets {break} maarten.buis@wzb.eu {p_end} {title:Suggested citation if using seqlogit10 in published work} {p 4 4 2} {cmd:seqlogit} is not an official Stata command. It is a free contribution to the research community, like a paper. Please cite it as such. {p 4 4 2} Buis, Maarten L. 2007. "SEQLOGIT: Stata module to fit a sequential logit model" {browse "http://ideas.repec.org/c/boc/bocode/s456843.html"} {p 4 4 2} or: {p 4 4 2} Buis, Maarten L. 2010 ``Chapter 6, Not all transitions are equal: The relationship between inequality of educational opportunities and inequality of educational outcomes'', In: Buis, Maarten L. ``Inequality of Educational Outcome and Inequality of Educational Opportunity in the Netherlands during the 20th Century''. PhD thesis. {browse "http://www.maartenbuis.nl/dissertation/chap_6.pdf"} {p 4 4 2} Buis, maarten L. 2011 ``The Consequences of Unobserved Heterogeneity in a Sequential Logit Model'', Research in Social Stratification and Mobility, 29(3), pp. 247-262. {title:Acknowledgements} {pstd} I appreciate the useful comments I received at the 2009 Summer meeting of the RC28, and a bug report by Dominik Becker. {title:References} {p 4 4 2} Agresti, Alan 2002 . {it:Categorical Data Analysis, 2nd edition.} Hoboken, NJ: Wiley-Interscience. {p 4 4 2} Buis, Maarten L. 2010a ``Chapter 6, Not all transitions are equal: The relationship between inequality of educational opportunities and inequality of educational outcomes'', In: Buis, Maarten L. ``Inequality of Educational Outcome and Inequality of Educational Opportunity in the Netherlands during the 20th Century''. PhD thesis. {browse "http://www.maartenbuis.nl/dissertation/chap_6.pdf"} {p 4 4 2} Buis, maarten L. 2011 ``The Consequences of Unobserved Heterogeneity in a Sequential Logit Model'', Research in Social Stratification and Mobility, 29(3), pp. 247-262. {browse "http://dx.doi.org/10.1016/j.rssm.2010.12.006"} {p 4 4 2} Fox, John 1997 {it:Applied Regression Analysis, Linear Models, and Related Methods.} Thousand Oaks: Sage. {p 4 4 2} Maddala, G.S. 1983 {it:Limited Dependent and Qualitative Variables in Econometrics} Cambridge: Cambridge University Press. {p 4 4 2} Mare, Robert D. 1981 ``Change and Stability in educational Stratification'' {it:American Sociological Review}, 46(1), p.p. 72-87. {p 4 4 2} Shavit, Yossi and Hans-Peter Blossfeld 1993 {it:Persistent Inequality: Changing Educational Attainment in Thirteen Countries} Boulder: Westview Press. {title:Also see} {p 4 13 2} Online: help for {help seqlogit10 postestimation}, {help seqlogit10_sensitivity} help for {help logit}, {help mlogit}