```
help for sheafcoef
-------------------------------------------------------------------------------

Title

sheafcoef -- Post-estimation command that displays sheaf coefficients.

Syntax

sheafcoef , latent(latent_spec) [ equation(## | name) eform beta post
level(#) ]

Description

sheafcoef is a post-estimation command that estimates sheaf coefficients
(Heise 1972). A sheaf coefficient assumes that a block of variables
influences the dependent variable through a latent variable. sheafcoef
displays the effect of the latent variable and the effect of the observed
variables on the latent variable. The scale of the latent variable is
identified by setting the standard deviation equal to one. The origin of
the latent variable is identified by setting it to zero when all observed
variables in its block are equal to zero. This means that the mean of the
latent variable is not (necessarily) equal to zero. The final identifying
assumption is that the effect of the latent variable is always positive,
so to give a substantive interpretation of the direction of the effect,
one needs to look at the effects of the observed variables on the latent
variable. Alternatively, one can specify one "key" variable in each
block of variables, which identifies the direction of a latent variable,
either by specifying that the latent variable has a high value when the
key variable has a high value or that the latent variable has a low value
when the key variable has a high value.

The assumption that the effect of a block of variables occurs through a
latent variable is not a testable constraint; it is just a different way
of presenting the results from the original model. Its main usefulness is
in comparing the relative strength of the influence of several blocks of
variables. For example, say we want to know what determines the
probability of working non-standard hours and we have a block of
variables representing characteristics of the job and another block of
variables representing the family situation of the respondent, and we
want to say something about the relative importance of job
characteristics versus family situation. In that case one could estimate
a logit model with both blocks of variables and optionally some other
control variables. After that one can use sheafcoef to display the
effects of two latent variables, family situation and job
characteristics, which are both standardized to have a standard deviation
of 1, and can thus be more easily compared.

The output is divided into a number of equations. The top equation,
labeled "main", represents the effects of the latent variables and other
control variables (if any) on the dependent variable. The names of the
latent variables are as specified in the latent() option. If no names are
specified, they will be called "lvar1", "lvar2", etc. Below the main
equation, one additional equation for every latent variable is displayed,
labeled "on_name1", "on_name2", etc., where "name1" and "name2" are the
names of the latent variables. These are the effects of the observed
variables on the latent variable.

The sheaf coefficients and the variance-covariance matrix of all the
coefficients are estimated using nlcom. sheafcoef can be used after any
regular estimation command (that is, a command that leaves its results
behind in e(b) and e(V)). The only constraint is that the observed
variables that make up the latent variable(s) must all come from the same
equation.

Options

latent(latent_spec) specifies the blocks of variables that make up the
latent variables. In summary, the syntax for latent_spec is:

[name1:]varlist_1 [if] [; [name2:]varlist_2 [if] [; [name3:]varlist_3
[if][...]]]

It consists of blocks of variables, with consecutive blocks separated
by a semicolon (;). Each block needs to consist of at least two
variables. These variables must be explanatory variables in the
estimation command preceding sheafcoef, and the same variable can
only appear in one block. Optionally, each block of variables can be
preceded by a name for the latent variable followed by a colon (:).

Moreover, one can identify one key variable in each block of
variables, by attaching a + or - (without a space) to a variable in a
block. If one of the observed variables has + attached to it, then
the latent variable will have a high value when that observed
variable is high and a low value when that observed variable is low.
The opposite is true when one of the observed variables has a -
attached to it. If no observed variable in a block has a + or a -
attached to it, then the direction of that latent variable is
identified such that its effect on the dependent variable is
positive.

Finally, the if conditions determine which observations are used to
identify the scale of the latent variable. This can be useful when
comparing effects across groups. Consider the following example:

sysuse nlsw88, clear

gen ln_w = ln(wage)

drop if race == 3
gen byte black = race == 2
gen byte white = race == 1

gen blackXmarried = black*married
gen blackXnever_married = black*never_married
gen whiteXmarried = white*married
gen whiteXnever_married = white*never_married

reg ln_w black* white*, nocons

In this example we look at the effect of marital status on income for
black and white women. The interaction effects with black are the
effects of the marital status variables in the black sample, while
the interaction effects with white are the effects of the marital
status variables in the white sample. To use sheafcoef in this case
we need to make sure that we identify the scale of the latent
marital status variable for the black sample using only the black
respondents and identify the scale of the latent variable for the
white sample using only the white respondents.  This is what the if
conditions are for:

sheafcoef, latent(black_marst: blackXmarried blackXnever_married if black ; ///
white_marst: whiteXmarried whiteXnever_married if white )

equation(## | name) specifies the equation from the previous estimation
command to be used when computing the sheaf coefficients.  This
option is relevant when using sheafcoef after commands like mlogit or
heckman that return results in multiple equations. One can specify
the equation either by its position (first, second, etc.) or by its
name. In the former case the number of the equation should be
preceded by a #.

eform specifies that the effects of the latent variable and the control
variables are exponentiated. The effects of the observed variables in
each block on its latent variable are not exponentiated, because
these represent the effects of these variables on the standardized
latent variable and not on the dependent variable.

beta asks that standardized beta coefficients be reported.  The beta
coefficients are the regression coefficients obtained by first
standardizing all variables to have a mean of 0 and a standard
deviation of 1. When specifying the beta option after logit, logistic
or probit the latent dependent variable (often referred to as y*) is
standardized. beta may only be specified after regress, logit,
logistic, or probit, and may not be specified when the previous
estimation command used clustered standard errors or the svy prefix.

post causes sheafcoef to behave like a Stata estimation (eclass) command.
When post is specified, sheafcoef will post the vector of transformed
estimators and its estimated variance-covariance matrix to e(). This
option, in essence, makes the transformation permanent.  Thus you
could, after posting, treat the transformed estimation results in the
same way as you would treat results from other Stata estimation
commands.  For example, after posting, you could use test to perform
simultaneous tests of hypotheses on linear combinations of the
transformed estimators.

Specifying post clears out the previous estimation results, which can
be recovered only by refitting the original model or by storing the
estimation results before running sheafcoef and then restoring them; see
[R] estimates store.

level(#) specifies the confidence level, as a percentage, for confidence
intervals.  The default is level(95) or as set by set level.

Examples

sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .

sheafcoef, latent(class:   lower middle;          ///
marital: married never_married) ///
post
test [main]_b[class]=[main]_b[marital]

Notice that in the example above the scale of class, as shown in the
equation "on_class", runs from lower (large negative value) to higher (0)
indicating that the effect of class is positive: women with higher

sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .

logit union middle lower married never_married

sheafcoef, latent(class:   lower middle;          ///
marital: married never_married)
sheafcoef, latent(class:   lower middle;          ///
marital: married never_married) ///
eform

Notice that in the example above the scale of class, as shown in the
equation "on_class", runs from higher (0) to lower (large positive value)
indicating that the effect of class is negative: women with higher
occupations are less likely to be a union member.

sysuse nlsw88, clear
gen byte lower = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8) if occupation < .

logit union middle lower married never_married

sheafcoef, latent(class:  -lower middle;         ///
marital: married never_married)

Notice that in the example above, by attaching a - to lower, we are
saying that we want the latent variable to have a low value when the
observed variable lower has a high value and vice versa. The consequence
is that we changed the direction of latent variable class such that the
scale of class runs from lower (large negative number) to higher (zero),
and the effect of class now becomes negative.

Author

Maarten L. Buis
Universitaet Tuebingen
Institut fuer Soziologie
maarten.buis@uni-tuebingen.de

References

Heise, David R. (1972). Employing nominal variables, induced variables,
and block variables in path analysis. Sociological Methods &
Research, 1(2): 147-173.

Also see

Online: nlcom

If installed: propcnsreg
```
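
The identification rules in the Description (standard deviation of the latent variable fixed at one, latent variable equal to zero when all observed variables in its block are zero, positive effect of the latent variable) can be illustrated with a rough hand computation. The sketch below is an assumption about what these rules imply, not a copy of sheafcoef's internals: it treats the latent variable as the block's linear predictor rescaled by its standard deviation in the estimation sample. The names class_raw and sd_class are made up for this illustration, and the results may differ slightly from sheafcoef's output depending on how the command computes the variances.

```stata
* Hand-computed sheaf coefficient for one block (illustrative sketch only)
sysuse nlsw88, clear
gen byte lower  = inlist(occupation, 9, 10, 11, 12, 13) if occupation < .
gen byte middle = inlist(occupation, 3, 4, 5, 6, 7, 8)  if occupation < .

logit union lower middle married never_married

* Unscaled contribution of the block; it is zero when lower and middle are both zero
gen double class_raw = _b[lower]*lower + _b[middle]*middle if e(sample)

* Its standard deviation is what rescaling to sd = 1 has to divide out
quietly summarize class_raw
scalar sd_class = r(sd)

* Implied effect of the standardized latent variable (positive by construction)
display "effect of class:         " scalar(sd_class)

* Implied effects of the observed variables on the latent variable
display "effect of lower on class:  " _b[lower]/scalar(sd_class)
display "effect of middle on class: " _b[middle]/scalar(sd_class)
```

This sketch yields point estimates only. The standard errors that sheafcoef reports come from nlcom, as noted in the Description, and are not reproduced here.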