{smcl}
{cmd:help genyhats}
{hline}

{title:Title}

{p2colset 5 20 22 2}{...}
{p2col :genyhats {hline 2}}Generates {it:y_hat} affinity measures for PTV analysis{p_end}
{p2colreset}{...}

{title:Syntax}

{p 8 16 2}
{opt genyhats} yhatname1: {indepvars}_1 [ || yhatname2: {indepvars}_2 ... ] [{cmd:,} {it:options}]{p_end}

or

{p 8 16 2}
{opt genyhats} {varlist} [{cmd:,} {it:options}] {p_end}

{synoptset 21 tabbed}{...}
{synopthdr}
{synoptline}
{synopt :{opt dep:var(varname)}}the dependent variable for which affinities are being estimated{p_end}
{synopt :{opt con:textvars(varlist)}}the variables identifying different electoral contexts{p_end}
{synopt :{opt sta:ckid(varname)}}a variable identifying different "stacks", for which y-hats will be separately generated if {cmd:genyhats} is called after stacking{p_end}
{synopt :{opt nos:tack}}override the default behavior that treats each stack as a separate context (has no effect if the command is used before the data have been stacked){p_end}
{synopt :{opt ypr:efix}}the prefix for new variable(s) generated from those in {it:varlist} (default is to use a prefix of "y_"){p_end}
{synopt :{opt log:it}}use a logit model instead of linear regression (the default){p_end}
{synopt :{opt adj:ust(mean | constant | no )}}adjust the y_hat by subtracting the mean (default) or subtracting the constant term.
Alternatively, make no adjustment.{p_end}
{synopt :{opt rep:lace}}drop all {it:indepvars} after the generation of y-hats{p_end}
{synopt :{opt eff:ects(window | rtf | csv | html)}}display a summary table of stack-specific effects from the regression used to generate a y-hat{p_end}
{synopt :{opt efm:t}}change the coefficients reported by the effects() option{p_end}
{synopt :{opt output}}(not recommended) directly flush the results of each stack-specific regression into the standard output{p_end}
{synoptline}

{title:Description}

{pstd}
{cmd:genyhats} generates one or more {it:y-hat} affinity variables for {it:depvar}, one for each (set of) {it:indepvars}, saving each under the corresponding {it:yhatname}, separately for each combination of stack and context. Multiple y-hat variables can be generated by specifying multiple models separated by {bf:||}, but all of them must involve affinities with the same depvar. If issued before stacking, {cmd:genyhats} must therefore be issued repeatedly (once for each of the item-specific depvars that will, after stacking, become a single generic depvar).{break}

{pstd}
Optionally, a variable list can be used instead of the "...: ...||" primary syntax. When a variable list is employed with {cmd:genyhats}, each variable in the list is treated as a single independent variable in a set of separate predictions of {it:depvar}. With this syntax the y-hat affinity variable is named by prefixing the name of the independent variable with "y_" or such other prefix as may be established by the {cmd:yprefix()} option.{break}

{pstd}
The two syntaxes may be combined, in that any appearance of || causes the previous variables to be treated as a variable list of which the first (unless it was followed by ":") provides the suffix for the new variable name, being prefixed by "y_" or such other prefix as may be established by the {cmd:yprefix()} option.
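{pstd}
As a brief, hedged illustration of the difference between the two basic syntaxes, consider the following pair of calls. The variable names {it:age} and {it:income} and the y-hat name {it:ydemog} are placeholders chosen for this sketch; they are not part of any particular dataset.{p_end}

{phang2}{cmd:. genyhats age income, depvar(ptv)}{p_end}
{phang2}{cmd:. genyhats ydemog: age income, depvar(ptv)}{p_end}

{pstd}
Following the description above, the first call should fit two separate models, one per variable, and save the results as {it:y_age} and {it:y_income}. The second should fit a single model with both variables as predictors and save one y-hat under the explicitly given name {it:ydemog}.{p_end}
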
{pstd}
The {cmd:genyhats} command estimates the effect of each (set of) indepvar(s) on the depvar, separately for each stack if the data are stacked (unless {cmd:nostack} is optioned) and separately for each context if the {cmd:contextvars()} option is employed. It uses Stata's {cmd:predict} command to produce predicted values of the depvar for each case. These sets of so-called "y-hats" are each adjusted by subtracting the mean from the prediction equation (separately for each stack and context, if present), unless some other adjustment is optioned by means of the {cmd:adjust()} option, and saved under the appropriate variable name as described above. Estimation is by OLS unless {cmd:logit} is optioned.{break}
NOTE that, if the y_hat is not adjusted, the stack-specific mean will be included in the estimated y-hats, creating inconsistencies between stacks and contexts that can cause large anomalies in subsequent analyses using these variables. As a result, published work has mostly subtracted the mean (the default in {cmd:genyhats}). However, the option of subtracting the constant term is also available.

{pstd}
The {cmd:genyhats} command can be issued before or after stacking. If issued after stacking, by default it treats each stack as a separate context to take into account along with any higher-level contexts. This yields the same y-hat estimates as would have been created for separate unstacked depvars. However, the {cmd:nostack} option can be employed to force {cmd:genyhats} to ignore the stack-specific contexts. In addition, {cmd:genyhats} can be employed with or without distinguishing between higher-level contexts, if any (that is, with or without the {cmd:contextvars()} option), depending on what makes methodological sense. If issued after stacking, the command need only be issued once for the (generic) depvar instead of separately for each unstacked depvar.
This makes {cmd:genyhats} simpler to use and avoids creating a mass of temporary variables that hugely increase the size of the (often already very large) stacked file, but it takes longer because estimation is performed with a much larger dataset, selecting a different stack on each pass.

{pstd}
NOTE that when used in subsequent analyses (for instance in regression models), estimated coefficients for y-hat variables are not readily interpretable. In the absence of error variance and multicollinearity, each coefficient calculated for a y-hat independent variable predicting the ptv dependent variable would be +1.0. The actual values of these coefficients thus constitute a quasi-measure of covariance, like a partial correlation coefficient. However, standard errors (along with beta coefficients from OLS) retain their customary meanings.

{title:Options}

{phang}
{opt depvar(varname)} if specified, the variable for which affinities are estimated (default is {it:ptv}).{p_end}

{phang}
{opt contextvars(varlist)} if specified, the variables whose combinations of values identify different electoral contexts (by default all cases are treated as part of a single context).{p_end}

{phang}
{opt nostack} if present, overrides the default behavior of treating each stack as a separate context (has no effect if the {cmd:genyhats} command is issued before stacking).{p_end}

{phang}
{opt stackid(varname)} if specified, a variable identifying different "stacks", for which y-hats will be separately generated. The default is to use the "genstacks_stack" variable if the {cmd:genyhats} command is issued after stacking.{p_end}

{phang}
{opt yprefix} if specified, provides a prefix for y-hat affinities generated for each variable in a variable list (the default is "y_").
NOTE that the prefix, whether default or provided, can be overridden by explicitly specifying the y-hat variable name before a colon introducing the variable(s) to be used in estimating this y-hat.{p_end}

{phang}
{opt logit} if specified, invokes a logit model instead of linear regression (the default).{p_end}

{phang}
{opt adjust(mean | constant | no )} if specified, adjusts the y_hat by subtracting the mean (default) or subtracting the constant term or, alternatively, makes no adjustment. Note that, when a logit model is optioned, the adjustment takes place on propensity values, and the result is then mapped back to probability values.{p_end}

{phang}
{opt replace} if specified, drops all {it:indepvars} for all specified models after the generation of y-hats.{p_end}

{phang}
{opt effects(window | rtf | csv | html)} if specified, displays a table (in publication format) that summarizes the different effects of the same predictors in different stacks. The {cmd:window} option flushes the table to the standard output, while the other options save the table in an external file, according to the chosen file format. By default, z-values are reported, along with significance stars. The {cmd:efmt()} option can be used to change the coefficients reported in the table.{p_end}

{phang}
{opt efmt} if specified, changes the coefficients reported in tables generated by {cmd:effects()}. {cmd:efmt()} accepts two types of values: either {cmd:beta} (in order to obtain beta coefficients) or any format string that is accepted by the {bf:{help estout##cells:cells()}} option of the {cmd:estout} command.
As an example, {cmd:efmt(b(fmt(3)star))} displays b coefficients with three decimal digits and significance stars.{p_end}

{phang}
{opt output} if specified, the results of each stack-specific regression used to generate a y-hat are directly flushed into the standard output.{p_end}

{title:Examples:}

{pstd}The following command generates two y-hat variables for {it:ptv} (the default dependent variable), based on working conditions and issues, with observations clustered by {it:t102}; and drops the original independent variables. In this example {cmd:stackid} is set to the variable {it:stack}, implying that the name "stack" was specified in the {cmd:stackid()} option of a previous {cmd:genstacks} command, or that the data were reshaped in some other fashion (e.g., using Stata's {bf:{help reshape:reshape}} command with "stack" specified for the {cmd:j()} option).

{phang2} {cmd:. genyhats ywork: work_* || yissues: q56-q67, context(t102) stackid(stack) replace}{p_end}

{pstd}The following command generates four y-hat variables for {it:chosen} (a binary dependent variable), one each for {it:age, income, union} and {it:educ}, with observations clustered by {it:t102}. Because the dependent variable is binary, a logit model is used. The {cmd:yprefix} option ensures that the resulting y-hat variables will be named {it:yl_age, yl_income, yl_union} and {it:yl_educ}, possibly to distinguish them from similar variables created by OLS. Because {cmd:replace} is not optioned, all variables will be retained. Because the {cmd:stackid} variable is not defined, {cmd:genyhats} is operating either on unstacked data or on stacked data whose stacks are identified by the default "genstacks_stack" variable created by {cmd:genstacks}.

{phang2} {cmd:. 
genyhats age income union educ, depvar(chosen) contextvars(t102) yprefix(yl_) logit}{p_end}

{title:Generated variables}

{pstd}
{cmd:genyhats} saves the following variables:

{synoptset 27 tabbed}{...}
{synopt:{it:yhatname1 [yhatname2...]}} a set of y-hat (predicted) variables, each one either named before the colon introducing the corresponding (set of) indepvar(s) or (if no colon was employed) constructed from the corresponding indep by prefixing it with "y_" or whatever prefix may have been set using the {cmd:yprefix} option.{p_end}
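
{pstd}
To make the default mean adjustment concrete, the following lines are a minimal sketch of how one mean-adjusted y-hat could be built by hand for a single stack and context. This is an illustration under stated assumptions, not the code used inside {cmd:genyhats}: it assumes that the adjustment amounts to centering the OLS predictions on their own mean, and the names {it:ptv}, {it:age}, {it:income}, {it:yhat_raw} and {it:y_demog} are placeholders.{p_end}

{phang2}{cmd:* fit the OLS model of the depvar on one set of indepvars}{p_end}
{phang2}{cmd:. regress ptv age income}{p_end}
{phang2}{cmd:* store the linear prediction for each case}{p_end}
{phang2}{cmd:. predict double yhat_raw, xb}{p_end}
{phang2}{cmd:* obtain the mean of the predictions in r(mean)}{p_end}
{phang2}{cmd:. summarize yhat_raw, meanonly}{p_end}
{phang2}{cmd:* subtract that mean to get the adjusted y-hat}{p_end}
{phang2}{cmd:. generate double y_demog = yhat_raw - r(mean)}{p_end}
{phang2}{cmd:. drop yhat_raw}{p_end}

{pstd}
With stacked data or multiple contexts, the same steps would have to be repeated within each stack and context, which is the bookkeeping that {cmd:genyhats} is described as handling.{p_end}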