{smcl} {* *! version 1.1.7 03apr2020 Ben Jann & Simon Seiler}{...} {vieweralsosee "[R] mlogit" "help mlogit"}{...} {viewerjumpto "Syntax" "udiff##syntax"}{...} {viewerjumpto "Description" "udiff##description"}{...} {viewerjumpto "Options" "udiff##options"}{...} {viewerjumpto "Postestimation" "udiff##postest"}{...} {viewerjumpto "Examples" "udiff##examples"}{...} {viewerjumpto "Methods and formulas" "udiff##methods"}{...} {viewerjumpto "Saved results" "udiff##saved_results"}{...} {viewerjumpto "References" "udiff##references"}{...} {viewerjumpto "Authors" "udiff##authors"}{...} {hi:help udiff} {hline} {title:Title} {pstd}{hi:udiff} {hline 2} Generalized unidiff model for individual-level data {marker syntax}{...} {title:Syntax} {pstd} Simple syntax: {p 8 15 2} {cmd:udiff} {depvar} {help varlist:{it:xvars}} {help varname:{it:layervar}} {ifin} {weight} [{cmd:,} {help udiff##opts:{it:options}} ] {pstd} Advanced syntax: {p 8 15 2} {cmd:udiff} {depvar} {it:term} [{it:term} ...] [{help varlist:{it:controlvars}}] {ifin} {weight} [{cmd:,} {help udiff##opts:{it:options}} ] {pmore} where {it:term} is a unidiff term specified as {cmd:(}{help varlist:{it:xvars}} {help varname:{it:layervar}}{cmd:)} or {cmd:(}{help varlist:{it:xvars}} {cmd:<-} {help varlist:{it:layervars}}{cmd:)} or {cmd:(}{help varlist:{it:layervars}} {cmd:->} {help varlist:{it:xvars}}{cmd:)} {pmore} {it:xvars} must be unique across unidiff terms, {it:layervars} may be repeated; parentheses may be omitted if there are no control variables and if only one unidiff term is specified. 
{synoptset 22 tabbed}{...} {marker opts}{...} {synopthdr} {synoptline} {syntab :Main} {synopt :{opt cf:only}}estimate constant-fluidity model instead of unidiff model{p_end} {synopt :{opth constr:aints(numlist)}}apply specified linear constraints{p_end} {synopt :{opt b:aseoutcome(#)}}value of {depvar} that will be the base outcome{p_end} {synopt :{opt nocons:tant}}suppress constant term{p_end} {syntab :SE/Robust} {synopt :{opth vce(vcetype)}}{it:vcetype} may be {opt oim}, {opt r:obust}, {opt cl:uster} {it:clustvar}, {opt boot:strap}, or {opt jack:knife}{p_end} {synopt :{opt r:obust}}synonym for {cmd:vce(robust)}{p_end} {synopt :{opt cl:uster(clustvar)}}synonym for {cmd:vce(cluster} {it:clustvar}{cmd:)}{p_end} {syntab :Reporting} {synopt :{opt l:evel(#)}}set confidence level; default is {cmd:level(95)}{p_end} {synopt :{opt all:equations}}report results for all equations; by default only the unidiff parameters are displayed{p_end} {synopt :{opt eform}}report coefficients in exponentiated form{p_end} {synopt :{opt noh:eader}}suppress header display above coefficient table{p_end} {synopt :{it:{help estimation_options##display_options:display_options}}}standard display options{p_end} {synopt :{opt coefl:egend}}display legend instead of statistics{p_end} {synopt :{opt noi:sily}}display output from initial constant-fluidity model{p_end} {syntab :Maximization} {synopt :{it:{help maximize:maximize_options}}}maximization options{p_end} {synoptline} {p 4 6 2}{it:xvars}, {it:layervars}, and {it:controlvars} may contain factor variables; see {help fvvarlist}.{p_end} {p 4 6 2}{helpb svy} and {helpb mi estimate} are supported; see {help prefix}.{p_end} {p 4 6 2}{cmd:fweight}s, {cmd:aweight}s, {cmd:iweight}s, and {cmd:pweight}s are allowed; see help {help weight}.{p_end} {p 4 6 2}{helpb udiff##postest:predict} and other postestimation commands are available after {cmd:udiff}; see {help udiff##postest:below}.{p_end} {p 4 6 2}{helpb udiff##postest:estat rescale} computes rescaled 
unidiff parameters after {cmd:udiff}; see {help udiff##postest:below}.{p_end} {p 4 6 2}{helpb udiff##postest:estat lambda} computes lambda coefficients after {cmd:udiff}; see {help udiff##postest:below}.{p_end} {p 4 6 2}{helpb udiff##postest:estat kappa} computes kappa indices after {cmd:udiff}; see {help udiff##postest:below}.{p_end} {marker description}{...} {title:Description} {pstd} {cmd:udiff} estimates parameters of the so-called unidiff model (Erikson and Goldthorpe 1992), also known as the log-multiplicative layer effect model (Xie 1992), which is often used to study differences in intergenerational class mobility between birth cohorts or countries. {pstd} The original unidiff model has been expressed as a log-linear model of cell frequencies in a three-way contingency table (origin by destination by cohort or country). The model, however, can also be expressed at the individual level (similar to a multinomial logit model). {cmd:udiff} estimates such a re-expressed unidiff model for individual-level data. Furthermore, it generalizes the model to allow for multiple layers and non-categorical predictors. For details see {help udiff##methods:Methods and formulas} below. For an implementation of the classic log-linear unidiff model for aggregate data see Pisati (2000). {pstd} {it:depvar} is the (categorical) destination variable (e.g. class of respondent). {pstd} {it:xvars} specifies the origin variable(s) (e.g. class of respondent's parents). Typically, {it:xvars} is a single categorical variable specified as {cmd:i.}{it:varname}, although multiple variables as well as continuous variables are allowed. {pstd} {it:layervars} specifies the layer variable(s) to be interacted with {it:xvars}. Typically, {it:layervars} is a single categorical variable specified as {cmd:i.}{it:varname} (e.g. countries or birth-cohort categories), although multiple variables as well as continuous variables are allowed. 
For example, specify {cmd:(}{it:xvars} {cmd:<-} {cmd:c.cohort##c.cohort}{cmd:)} to model the unidiff scaling factor as a quadratic function of variable {cmd:cohort}. Likewise, if your data contains information on countries and birth cohorts, you could type {cmd:(}{it:xvars} {cmd:<-} {cmd:i.country i.cohort}{cmd:)} to include separate unidiff parameters for both dimensions. Furthermore, you could type {cmd:(}{it:xvars} {cmd:<-} {cmd:i.country##i.cohort}{cmd:)} to include unidiff parameters for all country-cohort combinations. {pstd} {it:controlvars} are control variables whose effects are assumed to be constant across layers. {marker options}{...} {title:Options} {phang} {opt cfonly} causes the constant-fluidity model to be reported instead of the unidiff model. Estimation of the unidiff model will be skipped. {phang} {opth constraints(numlist)} applies linear constraints to the estimation. {it:numlist} specifies the constraints by number, after they have been defined using the {helpb constraint} command. An {help udiff##exconstr:example} is provided below. {phang} {opt baseoutcome(#)} specifies the value of {depvar} to be treated as the base outcome. The default is to choose the most frequent outcome. {phang} {opt noconstant} suppresses the constant (outcome-specific intercepts) in the model. {phang} {opt vce(vcetype)} specifies the type of variance estimation to be used to determine the standard errors. {it:vcetype} may be {opt oim}, {opt r:obust}, {opt cl:uster} {it:clustvar}, {opt boot:strap}, or {opt jack:knife}; see {help vce_option:[R] {it:vce_option}}. {phang} {opt robust} is a synonym for {cmd:vce(robust)}. {phang} {opt cluster(clustvar)} is a synonym for {cmd:vce(cluster} {it:clustvar}{cmd:)}. {phang} {opt level(#)} specifies the confidence level, as a percentage, for confidence intervals. The default is {cmd:level(95)} or as set by {helpb set level}. {phang} {opt allequations} reports results for all equations of the model. 
By default, only the first equation containing the unidiff parameters is displayed. {phang} {opt eform} displays the coefficients in exponentiated form. That is, for each coefficient, exp({it:b}) rather than {it:b} is displayed, and standard errors and confidence intervals are transformed accordingly. {phang} {opt noheader} suppresses the header above the coefficient table that displays the final log-likelihood value, the number of observations, and the unidiff significance test. {phang} {it:display_options} are standard display options; see {helpb estimation_options##display_options:[R] estimation options}. {phang} {opt coeflegend} specifies that the legend of the coefficients and how to specify them in an expression be displayed rather than displaying the statistics for the coefficients. {phang} {opt noisily} displays the {helpb mlogit} output of the initial constant-fluidity model. By default, the initial model is not displayed. {phang} {it:maximize_options} are maximization options such as {cmd:iterate()} or {cmd:difficult}. See {helpb maximize:[R] maximize}. These options will only be applied to the unidiff model, but not to the initial constant-fluidity model. {marker postest}{...} {title:Postestimation commands} {pstd} Usual postestimation commands such as {helpb predict}, {helpb test}, {helpb estat}, {helpb lincom}, {helpb nlcom}, {helpb margins}, or {helpb suest} are available after {cmd:udiff}. Details on {cmd:estat} and {cmd:predict} are as follows. {pstd} Note that, after a model that has been estimated using the {helpb svy} prefix, {cmd:estat rescale}, {cmd:estat lambda}, and {cmd:estat kappa} have to be specified as {cmd:. udiff_estat} {it:subcmd} {pstd} where {it:subcmd} is {cmd:rescale}, {cmd:lambda}, or {cmd:kappa}. 
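{pstd}
As a minimal sketch of this calling convention, assume the survey design is declared with {helpb svyset} using a hypothetical sampling-unit variable {cmd:psu} and a hypothetical sampling weight {cmd:wgt} (neither is part of the example data used in this help file). One could then type

        . {cmd:svyset psu [pweight=wgt]}
        . {cmd:svy: udiff son i.father i.country}
        . {cmd:udiff_estat rescale}
        . {cmd:udiff_estat kappa}

{pstd}
Variables {cmd:son}, {cmd:father}, and {cmd:country} stand in for the destination, origin, and layer variables of your own data.
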
{dlgtab:estat rescale} {p 8 15 2} {cmd:estat} {cmdab:res:cale} [{it:{help numlist}}] [{cmd:,} {opt post} {opt l:evel(#)} {it:{help estimation_options##display_options:display_options}} ] {pstd} Report rescaled unidiff parameters using the normalization suggested by Xie (1992). The normalization is only supported for unidiff terms that contain a single categorical layer variable specified as {cmd:i.}{it:varname} (factor variable). The normalization is such that the sum of the squared parameters equals 1 (within each unidiff term). {phang} {it:numlist} specifies the unidiff terms to be included; this is only relevant if a model contains multiple unidiff terms. The default is to include all unidiff terms found in the model. To only include, say, the second unidiff term, type {cmd:estat rescale 2}. {phang} {opt post} causes the rescaled results to be posted in {cmd:e(b)} and {cmd:e(V)}. This will clear out the previous estimation results. Without the {cmd:post} option, the results are stored in {cmd:r(b)} and {cmd:r(V)}; see {help udiff##saved_results:Saved results} below. {phang} {opt level(#)} specifies the confidence level, as a percentage, for confidence intervals. The default is {cmd:level(95)} or as set by {helpb set level}. {phang} {it:display_options} are standard display options; see {helpb estimation_options##display_options:[R] estimation options}. {dlgtab:estat lambda} {p 8 15 2} {cmd:estat} {cmdab:lam:bda} [{it:#}] [{cmd:,} {opt std:ize} {opt eform} {opt comp:act} {opt post} {opt l:evel(#)} {it:{help estimation_options##display_options:display_options}} ] {pstd} Report lambda coefficients for unidiff term {it:#} (if {it:#} is omitted, the first unidiff term is used). {cmd:estat lambda} only supports unidiff terms that contain a single categorical layer variable and a single categorical predictor, both specified as {cmd:i.}{it:varname} (factor variable). See Pisati (2000) for a definition of the lambda coefficients. 
{phang} {opt stdize} requests standardized lambda coefficients. The default is to report raw lambda coefficients. {phang} {opt eform} reports the results in exponentiated form. {phang} {opt compact} requests that the lambda coefficients be displayed in a two-way table with one column per outcome level. Standard errors will not be displayed in this case. The default is to display the coefficients in a one-way table including standard errors and confidence intervals. {phang} {opt post} causes the lambda coefficients to be posted in {cmd:e(b)} and {cmd:e(V)}. This will clear out the previous estimation results. Without the {cmd:post} option, the results are stored in {cmd:r(b)} and {cmd:r(V)}; see {help udiff##saved_results:Saved results} below. {phang} {opt level(#)} specifies the confidence level, as a percentage, for confidence intervals. The default is {cmd:level(95)} or as set by {helpb set level}. {phang} {it:display_options} are standard display options; see {helpb estimation_options##display_options:[R] estimation options}. {dlgtab:estat kappa} {p 8 15 2} {cmd:estat} {cmdab:kap:pa} [{it:#}] [{cmd:,} {opt post} {opt l:evel(#)} {it:{help estimation_options##display_options:display_options}} ] {pstd} Report kappa indices for unidiff term {it:#} (if {it:#} is omitted, the first unidiff term is used). {cmd:estat kappa} only supports unidiff terms that contain a single categorical layer variable and a single categorical predictor, both specified as {cmd:i.}{it:varname} (factor variable). See Pisati (2000) for a definition of the kappa indices. {phang} {opt post} causes the kappa indices to be posted in {cmd:e(b)} and {cmd:e(V)}. This will clear out the previous estimation results. Without the {cmd:post} option, the results are stored in {cmd:r(b)} and {cmd:r(V)}; see {help udiff##saved_results:Saved results} below. {phang} {opt level(#)} specifies the confidence level, as a percentage, for confidence intervals. 
The default is {cmd:level(95)} or as set by {helpb set level}. {phang} {it:display_options} are standard display options; see {helpb estimation_options##display_options:[R] estimation options}. {dlgtab:predict} {p 8 15 2} {cmd:predict} [{it:{help datatypes:type}}] {newvar} {ifin} [{cmd:,} {opt xb} {opt e:quation(equation)} ] {p 8 15 2} {cmd:predict} [{it:{help datatypes:type}}] {newvar} {ifin}{cmd:,} {opt p:r} [ {opt o:utcome(outcome)} ] {p 8 15 2} {cmd:predict} [{it:{help datatypes:type}}] {c -(}{it:stub}{cmd:*} | {help newvarlist:{it:newvarlist}}{c )-} {ifin}{cmd:,} {opt sc:ores} [ {opt e:quation(equation)} ] {phang} {opt xb} calculates linear predictions for the equation specified by {cmd:equation()}. {cmd:xb} is the default unless {cmd:pr} or {cmd:scores} is specified. If {opt equation()} is omitted, linear predictions are calculated for the first equation. {phang} {opt equation(equation)} specifies the equation for which linear predictions are to be calculated. {it:equation} can be an equation name, or an equation index specified as {cmd:#1}, {cmd:#2}, etc. Option {opt equation()} is not allowed with {cmd:pr}. {phang} {opt pr} calculates predicted probabilities for the outcome specified by {cmd:outcome()}. If {opt outcome()} is omitted, predicted probabilities are calculated for the first outcome. {phang} {opt outcome(outcome)} specifies the outcome for which predicted probabilities are to be calculated. {it:outcome} can be an outcome value, or an outcome index specified as {cmd:#1}, {cmd:#2}, etc. Option {opt outcome()} is only allowed with {cmd:pr}. {phang} {opt scores} calculates equation-level score variables (first derivative of the log likelihood). If {opt equation()} is omitted, score variables are generated for all equations (one variable per equation; if {it:k} is the number of outcomes, then the number of equations is equal to ({it:k}-1)*2+1). 
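{pstd}
The three forms of {cmd:predict} can be sketched as follows, using the example data that also appears in the examples section. The new variable names ({cmd:xb1}, {cmd:p1}, and the stub {cmd:sc}) are arbitrary choices made for illustration only:

        . {cmd:use http://www.stata.com/stb/stb55/sg142/example2.dta, clear}
        . {cmd:udiff son i.father i.country [fweight=obs]}
        . {cmd:predict xb1, xb equation(#1)}
        . {cmd:predict p1, pr outcome(#1)}
        . {cmd:predict sc*, scores}

{pstd}
The first call stores the linear prediction of the first equation, the second the predicted probability of the first outcome, and the third one score variable per equation.
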
{marker examples}{...} {title:Examples} {help udiff##exbasic:Basic example} {help udiff##exrescale:Normalized unidiff parameters} {help udiff##exlambda:Lambda coefficients and kappa indices} {help udiff##exconstr:Specifying constraints} {help udiff##exfit:Testing model fit} {help udiff##excontinuous:Continuous origin variables} {help udiff##exmultiple:Multiple unidiff terms} {help udiff##excont:Continuous layer variables} {help udiff##excontrol:Control variables} {marker exbasic}{...} {dlgtab:Basic example} {pstd} The unidiff model in Example 2 in Pisati (2000) can be reproduced as follows: . {stata "use http://www.stata.com/stb/stb55/sg142/example2.dta, clear"} . {stata udiff son i.father i.country [fweight=obs]} {pstd} Using advanced syntax we could type . {stata udiff son (i.father <- i.country) [fweight=obs]} or . {stata udiff son (i.country -> i.father) [fweight=obs]} {pstd} A likelihood-ratio test against the constant-fluidity model is included in the header of the output table. In the example, the test is highly significant and confirms that there are differences in the unidiff parameters between the countries. {pstd} By default, {cmd:udiff} omits the base category from the output (Australia in this example) and displays the unidiff parameters in logarithmic form. To include the base category in the output, specify {cmd:baselevels}; to report unidiff parameters as multipliers, add the {cmd:eform} option: . {stata udiff, eform baselevels} {pstd} Furthermore, by default only the unidiff scaling parameters are reported. To report all parameters of the model, specify option {cmd:all}: . {stata udiff, all} {marker exrescale}{...} {dlgtab:Normalized unidiff parameters} {pstd} To obtain rescaled unidiff parameters using the normalization suggested by Xie (1992), you can apply command {helpb udiff##postest:estat rescale} after model estimation: . {stata "use http://www.stata.com/stb/stb55/sg142/example2.dta, clear"} . 
{stata udiff son i.father i.country [fweight=obs], eform base} . {stata estat rescale} {pstd} Note that {helpb udiff##postest:estat rescale} is only supported for unidiff terms that contain a single categorical layer variable. {marker exlambda}{...} {dlgtab:Lambda coefficients and kappa indices} {pstd} To obtain lambda coefficients (see Pisati 2000) you can apply command {helpb udiff##postest:estat lambda} after model estimation: . {stata "use http://www.stata.com/stb/stb55/sg142/example1.dta, clear"} . {stata udiff son i.father i.country [fweight=obs], eform base} . {stata estat lambda, stdize eform compact} {pstd} The {cmd:compact} option has been specified to display the coefficients in a two-way table. This means that standard errors are not shown. Omit the {cmd:compact} option if you are interested in the standard errors or confidence intervals. {pstd} The kappa indices, which are based on standardized lambda coefficients, can be obtained as follows: . {stata estat kappa} {pstd} Note that {helpb udiff##postest:estat lambda} and {helpb udiff##postest:estat kappa} are only supported for unidiff terms that contain a single categorical layer variable and a single categorical predictor. {marker exconstr}{...} {dlgtab:Specifying constraints} {pstd} In case of empty cells or similar problems, it may be necessary to specify constraints for the model to converge. Using the same data as above, assume that the combination of father = "NonManual" and son = "Farm" is missing: . {stata "use http://www.stata.com/stb/stb55/sg142/example2.dta, clear"} . {stata replace obs = 0 if son==3 & father==1} {pstd} To make {cmd:udiff} converge in this example, we can set the parameter for "NonManual" in the psi-equation for "Farm" to zero (while at the same time making sure that "NonManual" is not used as the base category). The following commands would do: . {stata "constraint 1 [Psi_3]: 1.father"} . 
{stata udiff son ib2.father i.country [fweight=obs], allequations constraints(1)} {marker exfit}{...} {dlgtab:Testing model fit} {pstd} To test the fit of the unidiff model, a likelihood-ratio test against a saturated model can be performed, where the saturated model is a fully-interacted multinomial logit. A significant test statistic would indicate that the saturated model fits the data significantly better than the unidiff model. An example is as follows: . {cmd:use http://www.stata.com/stb/stb55/sg142/example1.dta, clear} . {cmd:udiff son i.father i.country [fweight=obs]} . {cmd:estimates store udiff} . {cmd:mlogit son i.father##i.country [fweight=obs]} . {cmd:lrtest udiff ., force} {pstd} Option {cmd:force} is needed because different estimation commands have been used to estimate the two models. {pstd} Be aware that the likelihood-ratio test is only valid in case of simple random sampling. Do not use the test with complex samples, i.e., if sampling weights or the {cmd:svy} prefix have been specified. {marker excontinuous}{...} {dlgtab:Continuous origin variables} {pstd} Assume that, apart from the categorical information on father's class, your data also contains a continuous origin variable such as father's ISEI score ({cmd:fisei}). Such information could easily be included in the model by adding the variable to the list of predictors in the unidiff term: . {cmd:udiff son (i.father fisei <- i.country)} {marker exmultiple}{...} {dlgtab:Multiple unidiff terms} {pstd} Assume your data also contains information on mothers. You could include this information in the unidiff model, for example, as follows: . {cmd:udiff son (i.father i.mother <- i.country)} {pstd} In this case, a single unidiff scaling factor would be used for both the effects of fathers and the effects of mothers. To use different unidiff factors and thus allow the effects of fathers and mothers to vary differently across countries, you could type: . 
{cmd:udiff son (i.father <- i.country) (i.mother <- i.country)} {marker excont}{...} {dlgtab:Continuous layer variables} {pstd} The layer variable(s) do not need to be categorical. For example, if you have individual-level data containing information on the birth years of the respondents, you could model the layer effects as a parabolic function of the birth year to analyze how social mobility changes over time. To avoid convergence issues it is a good idea to center the birth years at a date that actually exists in the data. For example, define {cmd:cohort} = (birth year - 1980) and then type . {cmd:udiff son (i.father <- c.cohort##c.cohort)} {pstd} Likewise, you could model the layer effects in terms of country characteristics: . {stata "use http://www.stata.com/stb/stb55/sg142/example2.dta, clear"} . {stata udiff son (i.father <- develop socdem i.east i.asia) [fweight=obs]} {pstd} Statistical inference may not be credible in this example and we might want to cluster on countries: . {stata udiff son (i.father <- develop socdem i.east i.asia) [fweight=obs], cluster(country)} {pstd} No value for the joint Wald test of the unidiff parameters (i.e. the test against the constant-fluidity model) is reported in this case because of the way {helpb ml} (the underlying command used for model estimation) determines the degrees of freedom for the test. You can obtain the test using the {helpb test} command after model estimation: . {stata test [Phi]} {pstd} However, note that the number of countries is small. Cluster-robust standard errors may be inconsistent in such a setting (a general recommendation is that the number of clusters should be at least 40 or 50). {marker excontrol}{...} {dlgtab:Control variables} {pstd} Assume that the age structure (or distribution of birth years) is different across countries and you want to take account of that in your analysis. You could, for example, type . 
{cmd:udiff son (i.father <- i.country) age} {pstd} In this way an age effect that is common to all countries is included in the model. You could, of course, also use a more complex specification, such as . {cmd:udiff son (i.father <- i.country) c.age##c.age} {marker methods}{...} {title:Methods and formulas} {dlgtab:The unidiff model} {pstd} The unidiff model is typically used to study differences in intergenerational social mobility between birth cohorts or countries. Let {it:mu}(x,y,z) be the cell frequencies in a three-way table of X (origin class, e.g. class of parents) by Y (destination class, e.g. class of children) by Z (e.g. cohort). Lowercase x, y, and z denote the levels of X, Y, and Z. In a saturated log-linear model the cell frequencies are parametrized as ln {it:mu}(x,y,z) = {it:a} + {it:a}(x) + {it:a}(y) + {it:a}(z) + {it:a}(x,y) + {it:a}(x,z) + {it:a}(y,z) + {it:a}(x,y,z) {pstd} where {it:a} is an overall intercept capturing the average cell frequency, {it:a}(x), {it:a}(y), and {it:a}(z) are factors capturing the marginal distributions of X, Y, and Z, {it:a}(x,y), {it:a}(x,z), and {it:a}(y,z) capture two-way associations, and {it:a}(x,y,z) captures the three-way association. For example, if X, Y, and Z are independent of each other, {it:a}(x,y), {it:a}(x,z), {it:a}(y,z), and {it:a}(x,y,z) will be zero for all x, y, and z. Likewise, if the association between X and Y is constant over cohorts, {it:a}(x,y,z) will be zero for all x, y, and z, such that ln {it:mu}(x,y,z) = {it:a} + {it:a}(x) + {it:a}(y) + {it:a}(z) + {it:a}(x,y) + {it:a}(x,z) + {it:a}(y,z) {pstd} This is the so-called constant-fluidity model. The saturated model accurately describes the data, but has too many parameters to be informative; the constant-fluidity model is too simplistic because it assumes away any change in relative mobility. 
The unidiff model takes a middle ground in that it allows the association between X and Y to vary with Z, but places a specific restriction on the form of this variation. In particular, the unidiff model introduces a scaling factor {it:b}(z) such that ln {it:mu}(x,y,z) = {it:a} + {it:a}(x) + {it:a}(y) + {it:a}(z) + {it:a}(x,z) + {it:a}(y,z) + {it:a}(x,y) * {it:b}(z) {pstd} That is, the unidiff model assumes that there is a common association pattern between X and Y, but the "strength" of the pattern can differ across cohorts. {dlgtab:Re-expression at the individual level} {pstd} Traditionally, the unidiff model has been estimated from tabular data. However, the model (or, at least, the interesting part of it) can also be expressed such that it takes the form of a regression model fitted to individual-level data. From a perspective with Y as the "dependent" variable, the saturated log-linear model is equivalent to a multinomial logit of Y on X, Z, and the interaction between X and Z, where X and Z are treated as factor variables. Likewise, the constant-fluidity model is a multinomial logit of Y on X and Z, without interaction between X and Z. Furthermore, the unidiff model is equivalent to a multinomial logit written as Pr(Y = y| X, Z) = exp(W'{it:theta}(y) + X'{it:psi}(y) * exp(Z'{it:phi})) / D {pstd} where D is the sum of the expression in the numerator across all levels of Y, and W is equal to Z augmented by a constant, i.e. W = (1,Z')' (again, X and Z are treated as factor variables, i.e. think of X and Z as vectors of dummy variables). {it:theta}(y), {it:phi}, and {it:psi}(y) are parameter vectors; {it:phi} is common to all levels of Y, {it:theta}(y) and {it:psi}(y) are level-specific. In this model, {it:theta}(y) represents {it:a}(y) and {it:a}(y,z) (the marginal distribution of Y as well as the main effects of Z, i.e. 
how the marginal distribution of Y depends on Z), exp({it:phi}) represents {it:b}(z) (the unidiff scaling factors), and {it:psi}(y) represents {it:a}(x,y) (the association between X and Y). Terms {it:a}(x) (marginal distribution of X), {it:a}(z) (marginal distribution of Z), {it:a}(x,z) (association between X and Z) are not represented in the model (i.e., the model only contains parameters that are related to Y). {dlgtab:Generalization: multiple unidiff terms} {pstd} More generally, the unidiff model is just a multinomial logit model that contains a special kind of interaction term. The model may thus be useful also for research questions that have nothing to do with social mobility. Furthermore, the model can be generalized so that it contains multiple unidiff terms. Let X1 and X2 be two sets of independent variables, Z1 and Z2 two sets of layer variables, and C a set of control variables that are not interacted with Z1 or Z2. The model can then be written as: {p 8 8 2}Pr(Y = y| X1, Z1, X2, Z2, C) ={p_end} {p 12 12 2}exp(W'{it:theta}(y) + X1'{it:psi1}(y) * exp(Z1'{it:phi1}) + X2'{it:psi2}(y) * exp(Z2'{it:phi2})) / D{p_end} {pstd} where W = (1, Z1', Z2', C')'. The model can be extended analogously to accommodate more than two unidiff terms. {dlgtab:Estimation} {pstd} {cmd:udiff} estimates the unidiff model using {helpb ml}. To obtain good starting values, {cmd:udiff} first fits a constant-fluidity model (which is equivalent to a standard {helpb mlogit} model ignoring the layer variables). A test of the unidiff model against the constant-fluidity model is included in the output (as an LR test or a Wald test, depending on context). {pstd} As usual in a multinomial logit, the coefficients are set to zero for one of the levels of Y to identify the model. Furthermore, as is usual for factor variables, {it:phi} is set to zero for one of the levels of Z if Z is a categorical variable. 
exp({it:phi}) then expresses the unidiff scaling factors with respect to this base category. {pstd} Estimating the unidiff model from individual-level data is more demanding than fitting the model to a contingency table (although note that, for efficient computation, {cmd:fweight}s can be used on collapsed data), but it brings about enhanced flexibility. For example, it is easily possible to include continuous (rather than categorical) origin and layer variables, control variables whose effects are assumed constant over cohorts can be taken into account (by including them in W), and standard errors for the parameter estimates are readily available (including support for sampling weights or other characteristics of a complex survey design). {marker saved_results}{...} {title:Saved results} {pstd} {cmd:udiff} stores results as described in {helpb ml##results:[R] ml}, as well as the following elements: {p2colset 7 22 26 2}{...} {p2col 5 22 26 2: Scalars}{p_end} {p2col : {cmd:e(k_out)}}number of outcomes {p_end} {p2col : {cmd:e(ibaseout)}}index of the base outcome {p_end} {p2col : {cmd:e(k_unidiff)}}number of unidiff terms {p_end} {p2col : {cmd:e(k_eform)}}number of equations to be affected by the {cmd:eform} option {p_end} {p2col 5 22 26 2: Macros}{p_end} {p2col : {cmd:e(cmd)}}{cmd:udiff} {p_end} {p2col : {cmd:e(predict)}}{cmd:udiff_p} {p_end} {p2col : {cmd:e(estat_cmd)}}{cmd:udiff_estat} {p_end} {p2col : {cmd:e(cfonly)}}{cmd:cfonly} or empty {p_end} {p2col : {cmd:e(layervars)}}names of layer variables; if {cmd:e(k_unidiff)}=1 {p_end} {p2col : {cmd:e(layervars#)}}names of layer variables of #th unidiff term; if {cmd:e(k_unidiff)}>1 {p_end} {p2col : {cmd:e(xvars)}}names of independent variables; if {cmd:e(k_unidiff)}=1 {p_end} {p2col : {cmd:e(xvars#)}}names of independent variables of #th unidiff term; if {cmd:e(k_unidiff)}>1 {p_end} {p2col : {cmd:e(controlvars)}}names of control variables {p_end} {p2col : {cmd:e(eqnames)}}names of equations {p_end} {p2col : 
{cmd:e(out)}}values of {it:depvar} {p_end} {p2col : {cmd:e(baseout)}}value of {it:depvar} treated as the base outcome {p_end} {p2col : {cmd:e(out_labels)}}value labels of {it:depvar} (if available) {p_end} {pstd} Without the {cmd:post} option, {cmd:estat rescale}, {cmd:estat lambda}, and {cmd:estat kappa} store the following results in {cmd:r()}: {p2colset 7 22 26 2}{...} {p2col 5 22 26 2: Scalars}{p_end} {p2col : {cmd:r(N)}}number of observations {p_end} {p2col 5 22 26 2: Matrices}{p_end} {p2col : {cmd:r(b)}}coefficients {p_end} {p2col : {cmd:r(V)}}variance matrix {p_end} {p2col : {cmd:r(lambda)}}compact representation of coefficients ({cmd:estat lambda} only) {p_end} {pstd} With the {cmd:post} option, {cmd:estat rescale}, {cmd:estat lambda}, and {cmd:estat kappa} store the following results in {cmd:e()}: {p2colset 7 22 26 2}{...} {p2col 5 22 26 2: Scalars}{p_end} {p2col : {cmd:e(N)}}number of observations {p_end} {p2col : {cmd:e(N_clust)}}number of clusters (if {it:vcetype} is {cmd:cluster}) {p_end} {p2col : {cmd:e(k_eq)}}number of equations {p_end} {p2col : {cmd:e(k_eform)}}number of equations to be affected by the {cmd:eform} option {p_end} {p2col 5 22 26 2: Macros}{p_end} {p2col : {cmd:e(cmd)}}{cmd:udiff_estat} {p_end} {p2col : {cmd:e(subcmd)}}{cmd:rescale} or {cmd:lambda} {p_end} {p2col : {cmd:e(estat_cmd)}}{cmd:udiff_estat} {p_end} {p2col : {cmd:e(title)}}title used in output {p_end} {p2col : {cmd:e(vce)}}{it:vcetype} as specified when calling {cmd:udiff} {p_end} {p2col : {cmd:e(vcetype)}}title used to label Std. Err. {p_end} {p2col : {cmd:e(clustvar)}}name of cluster variable {p_end} {p2col : {cmd:e(properties)}}{cmd:b V} {p_end} {p2col 5 22 26 2: Matrices}{p_end} {p2col : {cmd:e(b)}}coefficients {p_end} {p2col : {cmd:e(V)}}variance matrix {p_end} {p2col : {cmd:r(lambda)}}compact representation of coefficients ({cmd:estat lambda} only) {p_end} {marker references}{...} {title:References} {phang} Erikson, R., J.H. Goldthorpe. 1992. 
The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford: Oxford University Press. {p_end} {phang} Pisati, M. 2000. {stata "net describe sg142, from(http://www.stata.com/stb/stb55)":sg142}: Uniform layer effect models for the analysis of differences in two-way associations. Stata Technical Bulletin 55: 33-47. {p_end} {phang} Xie, Y. 1992. The Log-Multiplicative Layer Effect Model for Comparing Mobility Tables. American Sociological Review 57(3): 380–395. {p_end} {marker authors}{...} {title:Authors} {pstd} Ben Jann, University of Bern, ben.jann@soz.unibe.ch {p_end} {pstd} Simon Seiler, University of Bern, simon.seiler@icer.unibe.ch {pstd} Thanks for citing this software as follows: {pmore} Jann, B., S. Seiler. 2019. udiff: Stata module to estimate the generalized unidiff model for individual-level data. Available from {browse "http://ideas.repec.org/c/boc/bocode/s458711.html"}.