Title
seqlogit postestimation -- Postestimation tools for seqlogit
Description
Postestimation tools specifically for seqlogit:
seqlogitdecomp makes a graph showing a decomposition of the effect of a variable on the highest achieved level of the dependent variable into effects of that variable on passing each transition and the importance of that transition, as described in Buis (2010).
uhdesc creates a table of descriptive statistics of the unobserved variable at each transition. This is only available when the sd() option was used for seqlogit.
seqlogit_sensitivity is strictly speaking not a tool, but a helpfile showing how to run a sensitivity analysis with seqlogit.
The following standard postestimation commands are also available:
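    command        description
    -------------------------------------------------------------------------
    estat          AIC, BIC, VCE, and estimation sample summary
    estimates      cataloging estimation results
    lincom         point estimates, standard errors, testing, and inference
                     for linear combinations of coefficients
    lrtest         likelihood-ratio test
    margins        marginal means, predictive margins, marginal effects, and
                     average marginal effects
    nlcom          point estimates, standard errors, testing, and inference
                     for nonlinear combinations of coefficients
    predict        predictions
    predictnl      point estimates, standard errors, testing, and inference
                     for generalized predictions
    suest          seemingly unrelated estimation
    test           Wald tests of simple and composite linear hypotheses
    testnl         Wald tests of nonlinear hypotheses
    -------------------------------------------------------------------------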
-------------------------------------------------------------------------------
help for seqlogitdecomp
-------------------------------------------------------------------------------
Syntax for seqlogitdecomp
seqlogitdecomp [ varlist ] , [ overat(overatlist) { table | area } marg at(atlist) subtitle(titlelist) eqlabel(labellist) eqlegend xline(linearg) yline(linearg) title(title) name(name, replace) yscale(axis_suboptions) xscale(axis_suboptions) ysize(#) xsize(#) format(%fmt) z ]
Description
The idea behind a sequential logit model is that it models the influence of explanatory (independent, right-hand-side, x) variables on the probability of passing a set of transitions. For example, one can model the process of attaining education as two transitions: a transition between finishing high school or not, and a transition between going to college or not, given that one finished high school. If we assign a value to each of these end states (in the case of education those would typically be (pseudo-)years of education), then one can also study the effect of the explanatory variables on the expected final outcome.
The aim of seqlogitdecomp is to study the relationship between the effects on each transition and the effects on the final outcome. It turns out (Buis 2010) that these total effects (that is, the marginal effects: the derivatives of the expected final outcome with respect to the explanatory variables) are a weighted sum of the effects on each transition.
If these transition specific effects are measured in terms of log odds ratios, then the weight assigned to each transition is the product of three elements: the proportion at risk, the variance, and the expected gain from passing. So a transition becomes more important if more people have to face that transition. The variance component of the weight is a function that is small when virtually everybody passes a transition or everybody fails a transition, and is large when the probability of passing is about 50%. So, a transition does not add much to the total effect if virtually everybody passes or fails that transition. Finally, a transition becomes more important if the expected gain from passing increases.
If these transition specific effects are measured in terms of marginal effects, then the weights assigned to each transition are the product of two elements: the proportion at risk, and the expected gain from passing.
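The two sets of weights described above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the help file: for transition k with two choices (pass or fail), r_k is the proportion at risk, p_k the probability of passing, g_k the expected gain from passing, and \lambda_k the effect on that transition as a log odds ratio; L is the final outcome and x the explanatory variable.

```latex
% Effects measured as log odds ratios:
% weight = proportion at risk * variance * expected gain
\frac{\partial E(L)}{\partial x}
  = \sum_{k=1}^{K} \underbrace{r_k \, p_k (1 - p_k) \, g_k}_{\text{weight}} \; \lambda_k

% Effects measured as marginal effects,
% using \partial p_k / \partial x = p_k (1 - p_k) \lambda_k:
% weight = proportion at risk * expected gain
\frac{\partial E(L)}{\partial x}
  = \sum_{k=1}^{K} \underbrace{r_k \, g_k}_{\text{weight}} \; \frac{\partial p_k}{\partial x}
```

The variance term p_k(1 - p_k) is largest at p_k = 0.5 and approaches zero when nearly everybody passes or nearly everybody fails, which matches the verbal description of the variance component.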
seqlogitdecomp displays a graph or a table showing a decomposition of the effect of a variable on the final outcome into effects of the variable on passing each transition and the importance of each transition (the weight), as described in Buis (2010). For the graph, the variable whose effect will be decomposed is specified in the ofinterest() option in seqlogit. For the table, the variable is specified as the varlist; the default is all variables in the seqlogit model.
The effect on the expected value can differ between groups in the population, for example cohorts. The default graph is designed to show how these differences are due to differences in effects on the transitions across groups and differences in the importance of each transition across groups. To continue the example: the effect of parental status can change over cohorts, and seqlogitdecomp will tell the extent to which this is due to changes in the effects on the transitions between levels of education or changes in the importance of each transition. The graph that will be displayed with the area option shows the contribution of each transition without splitting it up into weights and effects.
The table is designed to show the extra detail of this decomposition without the comparison of groups. It will show the effects on each transition, the weights and their components, the probability of passing each transition, and the effect on the final outcome. It also shows the standard errors for each of these components, except for those components that are by definition fixed and thus not uncertain, e.g. the proportion at risk at the first transition, which is by definition 1.
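How the components listed in such a table relate to one another can be sketched with plain Stata scalars. All numbers below are hypothetical and chosen only for illustration; the sketch assumes a model with two pass-or-fail transitions and does not call seqlogit or seqlogitdecomp.

```stata
* Hypothetical numbers for a model with two transitions; none are estimates.
* Levels of the final outcome (pseudo-years of education)
scalar l0 = 9      // fails transition 1
scalar l1 = 12     // passes transition 1, fails transition 2
scalar l2 = 16     // passes both transitions

scalar p1 = 0.8    // probability of passing transition 1
scalar p2 = 0.5    // probability of passing transition 2, given at risk
scalar b1 = 0.5    // effect on transition 1 (log odds ratio)
scalar b2 = 0.3    // effect on transition 2 (log odds ratio)

* Components of the weights
scalar atrisk1 = 1                         // everybody faces transition 1
scalar atrisk2 = p1                        // only those who passed transition 1
scalar var1    = p1*(1 - p1)               // variance of the pass indicator
scalar var2    = p2*(1 - p2)
scalar gain1   = (l1 + p2*(l2 - l1)) - l0  // expected level if passed minus level if failed
scalar gain2   = l2 - l1

* Weights and total effect when effects are log odds ratios
scalar w1    = atrisk1*var1*gain1
scalar w2    = atrisk2*var2*gain2
scalar total = w1*b1 + w2*b2

* The same total from marginal effects and their weights
scalar me1     = var1*b1
scalar me2     = var2*b2
scalar mw1     = atrisk1*gain1
scalar mw2     = atrisk2*gain2
scalar total_m = mw1*me1 + mw2*me2

display "weights (log odds ratios): " %6.3f w1 "  " %6.3f w2
display "total effect: " %6.3f total "  via marginal effects: " %6.3f total_m
```

With these numbers both weights equal 0.8 and the total effect is 0.64, whichever way the transition specific effects are measured.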
Options
        +--------------+
    ----+ Main options +-----------------------------------------------------
overat(overatlist) specifies the values of the explanatory variables of the groups that are to be compared. It cannot be specified in combination with the table or area option. It overrides any value specified in the at option. Each comparison is separated by a comma. The syntax for overatlist is:
varname_1 # [varname_2 # [...]], varname_1 # [varname_2 # [...]], [...]
at(atlist) specifies the values at which the equations are evaluated. The syntax for atlist is: varname # [varname # ...]. The equations will be evaluated at the mean values of any of the variables not specified in at if those variables are not categorical factor variables. For categorical factor variables the default is the minimum (the first category).
Say the dependent variable is highest achieved level of education, which is influenced by child's Socio Economic Status (ses) and cohort (coh) and the interaction between ses and coh (c.ses#c.coh). We want to compare the decomposition of the effect of ses over different cohorts for the mean value of ses. Say that coh has only three values: 1, 2, and 3 and the mean value of ses is .5. Then the overat and at options would read:
overat( coh 1, coh 2, coh 3 ) at( ses .5 )
marg specifies that the transition specific effects are marginal effects instead of log odds ratios. This option may not be specified in combination with the area option.
table specifies that the decomposition is to be displayed as a table instead of a graph. The table is built from multiple calls to margins, so it can take a while to run. The default is to show an array of rectangles whose width represents the weight of a transition and the height the effect.
area specifies that an area graph is displayed showing the contribution of each transition. The default is to show an array of rectangles whose width represents the weight of a transition and the height the effect, thus splitting each transition's contribution into an effect and a weight.
format(%fmt) specifies the format used to display the results in the table. This option can only be specified in combination with the table option.
z specifies that z-values are displayed instead of standard errors. This option can only be specified in combination with the table option.
        +---------------+
    ----+ Graph options +----------------------------------------------------
The graph options cannot be specified in combination with the table option.
subtitle(titlelist) specifies the titles above each group, cohort in the example above. The syntax of titlelist is "string" "string" [...]. The number of titles must equal the number of groups. This option may not be specified in combination with the area option.
eqlabel(labellist) specifies labels for each transition. The syntax of labellist is "string" "string" [...]. The number of labels must equal the number of transitions. If one wants to let the label span more than one line, one can use `" "string1" "string2" "'.
eqlegend specifies that a legend is used to identify the different transitions. By default the transitions are identified using titles on the right of the graph.
xline(numlist) see: added line options
yline(numlist) see: added line options
title(title) see: title_options
name(name, replace) see: name_option
[y|x]scale(axis_suboptions) see: axis_scale_options
[y|x]label(rule_or_values) see: axis_options
[y|x]title(title) see: axis_title_options
[y|x]size(#) see: region_options
Example

    use "http://fmwww.bc.edu/repec/bocode/g/gss.dta", clear
    recode degree 4=3
    label define degree 0 "lt high school"   ///
                        1 "high school"      ///
                        2 "junior college"   ///
                        3 "college", modify
    label value degree degree

    seqlogit degree south                          ///
             c.coh##c.coh if black == 0 ,          ///
             tree(0 : 1 2 3 , 1 : 2 3 , 2 : 3 )    ///
             ofinterest(paeduc)                    ///
             over(c.coh##c.coh)                    ///
             levels(0=9, 1=12, 2=14, 3=16)

    seqlogitdecomp, overat(coh 1.5,                ///
                           coh 2.5,                ///
                           coh 3.5,                ///
                           coh 4.5,                ///
                           coh 5.5,                ///
                           coh 6.5)                ///
                    at(south 0 paeduc 12)          ///
                    yline(0) xline(0)              ///
                    subtitle("1915" "1925" "1935"  ///
                             "1945" "1955" "1965") ///
                    eqlabel(`""less than high school" "versus" "high school or more""' ///
                            `""high school" "versus" "any college""'                   ///
                            `""junior college" "versus" "college""' )

    seqlogitdecomp paeduc, table                   ///
                    at(coh 1.5 south 0 paeduc 12)

    seqlogitdecomp, area                           ///
                    at(south 0 paeduc 12)          ///
                    eqlabel(`""less than high school" "versus" "high school or more""' ///
                            `""high school" "versus" "any college""'                   ///
                            `""junior college" "versus" "college""' )                  ///
                    xlab(2 "1920" 3 "1930" 4 "1940" 5 "1950" 6 "1960" 7 "1970")        ///
                    xtitle("year of birth")
-------------------------------------------------------------------------------
help for uhdesc
-------------------------------------------------------------------------------
Syntax for uhdesc
uhdesc , [ at(atlist) overat(overatlist) levels(levellist) overlab(stringlist) draws(#) ]
Description
uhdesc creates a table of descriptive statistics of the unobserved variable at each transition. This is only available when the sd() option was used for seqlogit. When the sd() option is specified, one estimates the parameters that would occur if there were an unobserved variable that is normally distributed, that at the first transition has a mean of zero and the standard deviation specified in the sd() option, and that is uncorrelated with the observed covariates, and if one correctly controlled for this unobserved variable. The consequences of such an unobserved variable and the way to estimate the parameters in such a scenario are discussed in Buis (2011). The aim of uhdesc is to show what happens to this unobserved variable at the different transitions, and thus to give insight into why the estimates in the scenario are different (or not) from a regular sequential logit.
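Why the unobserved variable changes from one transition to the next can be illustrated with a small simulation in plain Stata. This is only a sketch of the selection mechanism, not of how uhdesc computes its table; the variable names, coefficients, seed, and sample size are all hypothetical.

```stata
* Simulated data: x is observed, u is the unobserved variable.
clear
set seed 12345
set obs 100000
gen x = rnormal()            // observed covariate
gen u = rnormal(0, 1)        // unobserved: mean 0, sd 1, independent of x

* Passing the first transition depends on both x and u (hypothetical coefficients)
gen byte pass1 = runiform() < invlogit(0.5 + x + u)

* At the first transition everybody is at risk
summarize u
correlate x u

* At the second transition only those who passed the first are at risk
summarize u if pass1
correlate x u if pass1
```

Among those at risk at the second transition the mean of u is above zero and u is negatively correlated with x, even though neither was true at the first transition. Selection of this kind is the reason the distribution of the unobserved variable differs across transitions.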
Options
overat(overatlist) specifies the values of the explanatory variables of the groups that are to be compared. It overrides any value specified in the at option. Each comparison is separated by a comma. The syntax for overatlist is:
varname_1 # [varname_2 # [...]], varname_1 # [varname_2 # [...]], [...]
at(atlist) specifies the values at which the equations are evaluated. The syntax for atlist is: varname # [varname # ...]. The equations will be evaluated at the mean values of any of the variables not specified in at.
Say the dependent variable is highest achieved level of education, which is influenced by child's Socio Economic Status (ses) and cohort (coh) and the interaction between ses and coh (_ses_X_coh). We want to compare the decomposition of the effect of ses over different cohorts for the mean value of ses. Say that coh has only three values: 1, 2, and 3 and the mean value of ses is .5. Then the overat and at options would read:
overat( coh 1, coh 2, coh 3 ) at( ses .5 )
Notice that the values for the interaction term need not be specified in the overat() option, as long as it was created using the over() option in seqlogit.
overlab(stringlist) specifies the label that is to be attached to each group specified in the overat() option. Spaces are not allowed, but an _ will be displayed as a space. The number of labels has to be the same as the number of groups specified in the overat() option.
To continue the example above: say that a value of 1 on the variable coh corresponds to the cohort born in 1950, a value of 2 corresponds to the cohort born in 1970, and a value of 3 corresponds to the cohort born in 1990. Then the overlab() option would read:
overlab(1950 1970 1990)
levels(levellist) specifies the values attached to each level of the dependent variable. If it is not specified, the values of the dependent variable will be used. The syntax for levels is: # = # [, # = #, ...]
Example
    sysuse nlsw88, clear
    gen ed = cond(grade< 12, 1,         ///
             cond(grade==12, 2,         ///
             cond(grade<16,3,4))) if grade < .
    gen byr = (1988-age-1950)/10
    gen white = race == 1 if race < .

    seqlogit ed byr south,              ///
         ofinterest(white) over(byr)    ///
         tree(1 : 2 3 4, 2 : 3 4, 3 : 4) ///
         or sd(1)

    uhdesc
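The overat() and overlab() options can be combined with the example above to compare cohorts. The following is a sketch that assumes the user-written seqlogit package is installed and that the options behave as described in this help file; the two comparison values of byr are chosen here for illustration and are not part of the original example.

```stata
* Assumes the seqlogit package is installed (for example via: ssc install seqlogit).
sysuse nlsw88, clear
gen ed = cond(grade < 12, 1, cond(grade == 12, 2, cond(grade < 16, 3, 4))) if grade < .
gen byr = (1988 - age - 1950)/10      // year of birth in decades since 1950
gen white = race == 1 if race < .

seqlogit ed byr south,                    ///
    ofinterest(white) over(byr)           ///
    tree(1 : 2 3 4, 2 : 3 4, 3 : 4)       ///
    or sd(1)

* Illustrative comparison: byr = -.5 is birth year 1945, byr = 0 is 1950.
* Variables not listed are left at their defaults as described under at().
uhdesc, overat(byr -.5, byr 0) overlab(1945 1950)
```

The comparison values were picked to lie inside the range of byr in these data; values far outside that range would describe cohorts the model has not seen.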
-------------------------------------------------------------------------------
help for predict
-------------------------------------------------------------------------------
Syntax for predict
predict [type] newvar [if] [in] [, statistic outcome(#) transition(#) choice(#) equation(#) levels(levellist) ]
    statistic     Description
    -------------------------------------------------------------------------
    xb            xb, fitted values
    stdp          standard error of the prediction
    trpr          probability of passing transition
    tratrisk      proportion of respondents at risk of passing transition
    trvar         variance of the indicator variable indicating whether or
                    not the respondent passed the transition
    trgain        difference in expected highest achieved level between
                    those that pass the transition and those that do not
    trweight      weight assigned to transition if transition specific
                    effects are log odds ratios
    trmweight     weight assigned to transition if transition specific
                    effects are marginal effects
    treffect      contribution of transition to the total effect
    pr            probability that an outcome is the highest achieved outcome
    y             expected highest achieved level
    effect        effect of variable of interest on expected highest achieved
                    level. This variable is specified in the ofinterest()
                    option in seqlogit. Interactions with the variables
                    specified in the over() option of seqlogit are
                    automatically taken into account.
    residuals     difference between highest achieved level and expected
                    highest achieved level
    score         first derivative of the log likelihood with respect to the
                    linear predictor
    -------------------------------------------------------------------------
Options for predict
transition(#) specifies the transition, 1 is the first transition specified in the tree option in seqlogit, 2 the second, etc.
choice(#) specifies the choice within the transition, 0 is the first choice (the reference category), 1 the second, etc.
equation(#) specifies the equation, #1 is the first equation, #2 the second, etc.
levels(levellist) specifies the values attached to each level of the dependent variable. If it is not specified, the values of the dependent variable will be used. The syntax for levels is: # = # [, # = #, ...]
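A few of these statistics can be requested as in the following sketch. It assumes the user-written seqlogit package is installed and reuses the data preparation from the uhdesc example, here without the sd() option. The new variable names are hypothetical, and the combination of trpr with transition() and choice() is an assumption based on the syntax diagram above rather than a documented example.

```stata
* Assumes the seqlogit package is installed (for example via: ssc install seqlogit).
sysuse nlsw88, clear
gen ed = cond(grade < 12, 1, cond(grade == 12, 2, cond(grade < 16, 3, 4))) if grade < .
gen byr = (1988 - age - 1950)/10
gen white = race == 1 if race < .

seqlogit ed byr south,                    ///
    ofinterest(white) over(byr)           ///
    tree(1 : 2 3 4, 2 : 3 4, 3 : 4)

* Expected highest achieved level (levels default to the values of ed)
predict double yhat, y

* Effect of the variable in ofinterest() on the expected highest achieved level
predict double eff, effect

* Probability of passing the first transition (assumed option combination)
predict double p1, trpr transition(1) choice(1)

summarize yhat eff p1
```

Because no levels() option is given, the expected level is expressed in the units of ed (1 to 4); supplying levels() would change that scale.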
References
Buis, Maarten L. 2010 ``Chapter 6, Not all transitions are equal: The relationship between inequality of educational opportunities and inequality of educational outcomes'', In: Buis, Maarten L. ``Inequality of Educational Outcome and Inequality of Educational Opportunity in the Netherlands during the 20th Century''. PhD thesis. http://www.maartenbuis.nl/dissertation/chap_6.pdf
Buis, Maarten L. 2011 ``The Consequences of Unobserved Heterogeneity in a Sequential Logit Model'', Research in Social Stratification and Mobility, 29(3), pp. 247-262. http://dx.doi.org/10.1016/j.rssm.2010.12.006
Also see
Online: help for seqlogit, estimates, lincom, lrtest, mfx, nlcom, predictnl, suest, test, testnl