Title
sheafcoef -- Post-estimation command that displays sheaf coefficients.
Syntax
sheafcoef , latent(latent_spec) [ equation(## | name) eform beta post level(#) ]
Description
sheafcoef is a post-estimation command that estimates sheaf coefficients (Heise 1972). A sheaf coefficient assumes that a block of variables influences the dependent variable through a latent variable. sheafcoef displays the effect of the latent variable and the effects of the observed variables on the latent variable. The scale of the latent variable is identified by setting its standard deviation equal to one. The origin of the latent variable is identified by setting it to zero when all observed variables in its block are equal to zero. This means that the mean of the latent variable is not (necessarily) equal to zero. The final identifying assumption is that the effect of the latent variable is always positive, so to give a substantive interpretation of the direction of the effect, one needs to look at the effects of the observed variables on the latent variable. Alternatively, one can specify one "key" variable in each block of variables, which identifies the direction of the latent variable, either by specifying that the latent variable has a high value when the key variable has a high value or that the latent variable has a low value when the key variable has a high value.
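The identifying assumptions above can be written out as a short derivation. The notation is an assumption introduced here for illustration and does not come from the command itself: x_1, ..., x_k are the observed variables in one block, \beta_j are their coefficients in the original model, \eta is the latent variable, \lambda is its effect on the dependent variable, and \gamma_j are the effects of the observed variables on \eta. The variance is taken over the observations used to identify the scale.

```latex
\begin{aligned}
\beta_1 x_1 + \dots + \beta_k x_k &= \lambda\,\eta, \\
\eta &= \gamma_1 x_1 + \dots + \gamma_k x_k, \\
\lambda &= \sqrt{\operatorname{Var}\left(\beta_1 x_1 + \dots + \beta_k x_k\right)}, \\
\gamma_j &= \beta_j / \lambda .
\end{aligned}
```

Under this sketch \eta has a standard deviation of one, \eta is zero when all x_j are zero, and \lambda is positive. Choosing a key variable with a sign that conflicts with the sign of its \gamma_j would amount to multiplying both \lambda and all \gamma_j by -1, which leaves the product \lambda\eta unchanged.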
The assumption that the effect of a block of variables occurs through a latent variable is not a testable constraint; it is just a different way of presenting the results from the original model. Its main usefulness is in comparing the relative strength of the influence of several blocks of variables. For example, say we want to know what determines the probability of working non-standard hours, we have one block of variables representing characteristics of the job and another block representing the family situation of the respondent, and we want to say something about the relative importance of job characteristics versus family situation. In that case one could estimate a logit model with both blocks of variables and optionally some other control variables. After that one can use sheafcoef to display the effects of two latent variables, family situation and job characteristics, which are both standardized to have a standard deviation of 1 and can thus be more easily compared.
The output is divided into a number of equations. The top equation, labeled "main", represents the effects of the latent variables and other control variables (if any) on the dependent variable. The names of the latent variables are as specified in the latent() option. If no names are specified, they will be called "lvar1", "lvar2", etc. Below the main equation, one additional equation for every latent variable is displayed, labelled "on_name1", "on_name2", etc., where "name1" and "name2" are the names of the latent variables. These are the effects of the observed variables on the latent variable.
The sheaf coefficients and the variance-covariance matrix of all the coefficients are estimated using nlcom. sheafcoef can be used after any regular estimation command (that is, a command that leaves its results behind in e(b) and e(V)). The only constraint is that the observed variables that make up the latent variable(s) must all come from the same equation.
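To make the role of nlcom concrete, the following is a minimal sketch of how the coefficients for a single block could be obtained by hand. It is an illustration under stated assumptions, not the actual source code of sheafcoef: it assumes that the covariance matrix of the observed variables in the estimation sample is treated as fixed, and the names class_effect, on_lower, and on_middle are hypothetical labels chosen here.

```stata
* Sketch only: assumes the covariances of the observed variables are fixed
sysuse nlsw88, clear
gen byte lower  = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .
regress wage lower middle married never_married

* covariance matrix of the block in the estimation sample
quietly correlate lower middle if e(sample), covariance
matrix C = r(C)
local v1  = C[1,1]
local v2  = C[2,2]
local c12 = C[2,1]

* standard deviation of _b[lower]*lower + _b[middle]*middle,
* stored as an expression so that nlcom can apply the delta method to it
local sd sqrt(_b[lower]^2*`v1' + _b[middle]^2*`v2' + 2*_b[lower]*_b[middle]*`c12')

* effect of the latent variable and effects of the observed variables on it
nlcom (class_effect: `sd') (on_lower: _b[lower]/`sd') (on_middle: _b[middle]/`sd')
```

In practice one would call sheafcoef directly; the sketch only shows why the results are nonlinear functions of the original coefficients and therefore need nlcom.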
Options
latent(latent_spec) specifies the blocks of variables that make up the latent variables. In summary, the syntax for latent_spec is:
[name1:]varlist_1 [if] [; [name2:]varlist_2 [if] [; [name3:]varlist_3 [if][...]]]
It consists of one or more blocks of variables, separated by semicolons (;). Each block must consist of at least two variables. These variables must be explanatory variables in the estimation command preceding sheafcoef, and each variable can appear in only one block. Optionally, each block of variables can be preceded by a name for the latent variable followed by a colon (:).
Moreover, one can identify one key variable in each block of variables by attaching a + or - (without a space) to a variable in a block. If one of the observed variables has a + attached to it, then the latent variable will have a high value when that observed variable is high and a low value when that observed variable is low. The opposite is true when one of the observed variables has a - attached to it. If no observed variable in a block has a + or a - attached to it, then the direction of that latent variable is identified such that its effect on the dependent variable is positive.
Finally, the if conditions determine which observations are used to identify the scale of the latent variable. This can be useful when comparing effects across groups. Consider the following example:
sysuse nlsw88, clear
gen ln_w = ln(wage)
drop if race == 3
gen byte black = race == 2
gen byte white = race == 1
gen blackXmarried = black*married
gen blackXnever_married = black*never_married
gen whiteXmarried = white*married
gen whiteXnever_married = white*never_married
reg ln_w black* white*, nocons
In this example we look at the effect of marital status on wages for black and white women. The interaction effects with black are the effects of the marital status variables in the black sample, while the interaction effects with white are the effects of the marital status variables in the white sample. To use sheafcoef in this case we need to make sure that we identify the scale of the latent marital status variable for the black sample using only the black respondents, and identify the scale of the latent variable for the white sample using only the white respondents. This is what the if conditions are for:
sheafcoef, latent(black_marst: blackXmarried blackXnever_married if black ; ///
                  white_marst: whiteXmarried whiteXnever_married if white )
equation(## | name) specifies the equation from the previous estimation command to be used when computing the sheaf coefficients. This option is relevant when using sheafcoef after commands like mlogit or heckman that return results in multiple equations. One can either specify whether sheafcoef should consider the first, second, etc. equation or one can type in the name of that equation. In the former case the number of the equation should be preceded by a #.
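As an illustration of the equation() option, here is a hedged sketch using mlogit. The three-category outcome occ3 is a hypothetical variable constructed only for this example, and the choice of model is an assumption rather than something taken from the original examples. The base outcome is set to the last category so that the second equation unambiguously refers to outcome 2.

```stata
sysuse nlsw88, clear
* hypothetical outcome: 1 = lower, 2 = middle, 3 = higher occupations
gen byte occ3 = 3 if occupation < .
replace occ3 = 1 if inlist(occupation, 9, 10, 11, 12, 13)
replace occ3 = 2 if inlist(occupation, 3, 4, 5, 6, 7, 8)

mlogit occ3 married never_married grade south, baseoutcome(3)

* sheaf coefficient for marital status in the second equation
* (outcome 2 versus the base outcome 3)
sheafcoef, latent(marital: married never_married) equation(#2)
```

Both variables in the block come from the same equation, as the command requires.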
eform specifies that the effects of the latent variable and the control variables are exponentiated. The effects of the observed variables in each block on its latent variable are not exponentiated, because these represent the effects of these variables on the standardized latent variable and not on the dependent variable.
beta asks that standardized beta coefficients be reported. The beta coefficients are the regression coefficients obtained by first standardizing all variables to have a mean of 0 and a standard deviation of 1. When specifying the beta option after logit, logistic, or probit, the latent dependent variable (often referred to as y*) is standardized. beta may only be specified after regress, logit, logistic, or probit, and may not be specified when the previous estimation command used clustered standard errors or the svy prefix.
post causes sheafcoef to behave like a Stata estimation (eclass) command. When post is specified, sheafcoef will post the vector of transformed estimators and its estimated variance-covariance matrix to e(). This option, in essence, makes the transformation permanent. Thus you could, after posting, treat the transformed estimation results in the same way as you would treat results from other Stata estimation commands. For example, after posting, you could use test to perform simultaneous tests of hypotheses on linear combinations of the transformed estimators.
Specifying post clears out the previous estimation results, which can be recovered only by refitting the original model or by storing the estimation results before running sheafcoef and then restoring them; see [R] estimates store.
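A minimal sketch of that workflow, assuming a simple linear regression on the nlsw88 data; the name orig for the stored results is a hypothetical choice:

```stata
sysuse nlsw88, clear
regress wage married never_married union grade

* keep a copy of the original results before they are overwritten
estimates store orig

* post the sheaf coefficients to e() and test the effect of the latent variable
sheafcoef, latent(marital: married never_married) post
test [main]_b[marital] = 0

* bring back the original regression results
estimates restore orig
```

After estimates restore, post-estimation commands again operate on the original regress results.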
level(#) specifies the confidence level, as a percentage, for confidence intervals. The default is level(95) or as set by set level.
Examples
sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .
glm wage lower middle married never_married union grade, link(log)
sheafcoef, latent(class: lower middle; ///
                  marital: married never_married) ///
           post
test [main]_b[class]=[main]_b[marital]
Notice that in the example above the scale of class, as shown in the equation "on_class", runs from lower (large negative value) to higher (0), indicating that the effect of class is positive: women with higher occupations receive more income.
sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .
logit union middle lower married never_married
sheafcoef, latent(class: lower middle; ///
                  marital: married never_married)
sheafcoef, latent(class: lower middle; ///
                  marital: married never_married) ///
           eform
Notice that in the example above the scale of class, as shown in the equation "on_class", runs from higher (0) to lower (large positive value) indicating that the effect of class is negative: women with higher occupations are less likely to be a union member.
sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .
logit union middle lower married never_married
sheafcoef, latent(class: -lower middle; ///
                  marital: married never_married)
Notice that in the example above, by attaching a - to lower, we are saying that we want the latent variable to have a low value when the observed variable lower has a high value and vice versa. The consequence is that we changed the direction of latent variable class such that the scale of class runs from lower (large negative number) to higher (zero), and the effect of class now becomes negative.
Author
Maarten L. Buis
Universitaet Tuebingen
Institut fuer Soziologie
maarten.buis@uni-tuebingen.de
References
Heise, David R. (1972). Employing nominal variables, induced variables, and block variables in path analysis. Sociological Methods & Research, 1(2): 147--173.
Also see
Online: nlcom
If installed: propcnsreg