{smcl}
{cmd:help gendist}
{hline}

{title:Title}

{p2colset 5 20 22 2}{...}
{p2col :gendist {hline 2}}Generates distances for a battery of spatial placements{p_end}
{p2colreset}{...}

{title:Syntax}

{p 8 16 2}
{opt gendist} {varlist} [{cmd:,} {it:options}]

{synoptset 25 tabbed}{...}
{synopthdr}
{synoptline}
{synopt :{opth res:pondent(varname)}}(required) the variable containing the respondent's self-placement in the space (e.g. the issue space) in which items (e.g. the political parties) have been placed.{p_end}
{synopt :{opth con:textvars(varlist)}}a set of variables identifying different electoral contexts (by default all cases are treated as part of the same context).{p_end}
{synopt :{opth sta:ckid(varname)}}a variable identifying different "stacks", for which distances will be separately generated if {cmd:gendist} is issued after stacking.{p_end}
{synopt :{opt nos:tack}}override the default behavior that treats each stack as a separate context.{p_end}
{synopt :{opt mis:sing(mean|same|diff)}}plugs missing values on object placements{p_end}
{synopt :{opt rou:nd}}rounds plugged values to the nearest integer{p_end}
{synopt :{opt ppr:efix(name)}}prefix for generating mean-plugged placement variables (default is "p_"){p_end}
{synopt :{opt mpr:efix(name)}}prefix for generating variables indicating original missingness of either component (item location or respondent location) of a distance measure (default is "m_"){p_end}
{synopt :{opt dpr:efix(name)}}prefix for generating distance variables (default is "d_"){p_end}
{synopt :{opt mco:untname(name)}}name of a generated variable reporting original count of missing items for each case (default is "_gendist_mc"){p_end}
{synopt :{opt mpl:uggedcountname(name)}}name of a generated variable reporting the count of missing items for each case after mean-plugging (default is "_gendist_mpc"){p_end}
{synopt :{opt rep:lace}}drops all party location variables and mean-plugged placement variables after the generation of distances.{p_end}
{synoptline}

{title:Description}

{pstd}
{cmd:gendist} generates Euclidean distances for a battery of spatial items, where variables in {bf:{it:varlist}} contain the placement of different objects on the spatial scale and the variable specified in {bf:respondent} contains the self-placement of the respondent on the same spatial scale. Distances between the respondent and each spatial item in the battery are placed in corresponding members of a new battery of items. Only one battery of items can be processed on a single invocation of {cmd:gendist}.

{pstd}
The items in the new battery are given names derived from appending the names in {it:varlist} to the prefix established in option {bf:dprefix} (default {it:d_}).

{pstd}
If optioned by {bf:missing}, {cmd:gendist} also generates a new battery of items with the prefix established in option {bf:pprefix} (default {it:p_}) which is identical to the original battery but with missing values plugged by mean values. These mean values can be mean placements (e.g. of political parties on the left-right scale) by all respondents, mean placements by respondents who themselves have the same position as the placement, or mean placements by respondents themselves having a different position, depending on what is specified in option {bf:missing}.

{pstd}
Conventionally in published work the plugged value has been based on all placements. However, it might be thought that respondents having the same position would be more knowledgeable about the object concerned. Alternatively it might be thought that respondents having the same position might include individuals who were simply assuming that 'their' party had the same position as they did. Each of the {bf:missing} options is defensible theoretically so the user should think carefully about which to employ. The default is not to plug the missing data, so that distances are generated only for valid cases.

{pstd}
The {cmd:gendist} command can be issued before or after stacking.
If issued after stacking, by default it treats each stack as a separate context to take into account along with any higher-level contexts. However, the {cmd:nostack} option can be employed to force {cmd:gendist} to ignore the stack-specific contexts. In addition, this command can be employed with or without distinguishing between higher-level contexts, if any (that is, with or without the {cmd:contextvars} option), depending on what makes methodological sense.{break}
NOTE that it is unlikely to make methodological sense to employ {cmd:gendist} after stacking with both the {cmd:nostack} option and {cmd:missing(mean)}, since this would result in missing values being plugged with a mean that combined the values of what were (before stacking) several different variables.

{pstd}
SPECIAL NOTE ON MULTIPLE BATTERIES. {cmd:gendist} is only aware of the battery it is currently processing. Thus it cannot diagnose an error if that battery is of a different length than other batteries of items pertaining to the objects (e.g. political parties) being asked about. Yet stacked datasets (the type of datasets for which distances are wanted) absolutely require all batteries pertaining to the objects being stacked to contain the same number of items and have these items in the correct sequential order ({cmd:gendist} will produce stacks in the correct order, padded as needed with stacks that contain only missing values, if the numeric suffixes to all batteries of items are correct). In datasets derived from election studies it is quite common for some questions (e.g. about party locations on certain issues) to be asked only for a subset of the objects being investigated (e.g. parties). Moreover, questions relating to those objects may not always list them in the same order.
If the user employs {cmd:tab1} or {cmd:gendummies} to generate a battery of dummy variables corresponding to questions that did not list all parties, or listed them in a different order, then not only may the number of items in the resulting battery differ from that in another battery, but the numeric suffixes generated by {cmd:tab1} or {cmd:gendummies} may also refer to different objects in the different batteries. One part of this problem is alleviated by the use of {bf:{help gendummies:gendummies}}, which generates dummy variable suffixes from the values actually found in the data, rather than numbering these sequentially as does {cmd:tab1}. But those values do need to be correct, which only the user can check. See also the special note on multiple batteries in the help text for {bf:{help genstacks:genstacks}}.

{title:Options}

{phang}
{opth respondent(varname)} (required) the variable containing the respondent's self-placement on the battery of items.

{phang}
{opth contextvars(varlist)} if present, variables whose combinations identify different electoral contexts (by default all cases are assumed to belong to the same context).

{phang}
{opth stackid(varname)} if specified, a variable identifying different "stacks", for which distances will be separately generated in the absence of the {cmd:nostack} option. The default is to use the "genstacks_stack" variable if the {cmd:gendist} command is issued after stacking.

{phang}
{opt nostack} if present, overrides the default behavior of treating each stack as a separate context (has no effect if data are not stacked).
{phang}
{opth missing(mean|same|diff)} if present, determines treatment of missing values for object placement variables (by default they remain missing).{break}
If {bf:mean} is specified, missing values are replaced with the overall mean placement of that object, calculated on the whole sample.{break}
If {bf:same} is specified, missing values are replaced with the mean placement of the object, calculated only among those respondents who placed themselves on the same position as the object.{break}
If {bf:diff} is specified, missing values are replaced with the mean placement of the object, calculated only among those respondents who placed themselves on a different position than the object (see discussion under 'Description' above regarding choice between these options).{break}
When missing values are plugged, a set of p_{it:varlist} variables is generated, and the original variables are left unchanged (the p_ prefix can be altered by use of the option {bf:pprefix}).{break}
NOTE: More sophisticated imputation facilities are offered by {bf:{help iimpute:iimpute}}.

{phang}
{opt round} if present, causes rounding of all plugged values to the closest integer.

{phang}
{opth dprefix(name)} if present, provides a prefix for generated distance variables (default is "d_").

{phang}
{opth pprefix(name)} if present, provides a prefix for generated mean-plugged placements (default is "p_").

{phang}
{opth mprefix(name)} if present, provides a prefix for generated variables indicating for each case whether, before mean-plugging of an item in the battery, either the item placement or the respondent placement was missing for that case (default is "m_").
{phang}
{opth mcountname(name)} if specified, name of a generated variable reporting original number of missing items (default is "_gendist_mc"){p_end}

{phang}
{opth mpluggedcountname(name)} if specified, name of a generated variable reporting number of missing items after mean-plugging, which could still be non-zero (even after all missing values on item positions have been plugged) if the respondent's own self-placement is missing (default is "_gendist_mpc"){p_end}

{phang}
{opt replace} if specified, drops all party position and mean-plugged placement variables after the generation of distance measures{p_end}

{title:Examples}

{pstd}The following command generates distances on a left-right dimension, where party placements are in variables lrp1-lrp10, and R's self-placement is in lrresp; missing placements are replaced by simple mean-plugging, and then rounded to the nearest integer.{p_end}{break}

{phang2}{cmd:. gendist lrp1-lrp10, respondent(lrresp) missing(mean) round}{p_end}

{title:Generated variables}

{pstd}
{cmd:gendist} saves the following variables and variable sets:

{synoptset 16 tabbed}{...}
{synopt:p_{it:var1} p_{it:var2} ... (or other prefix set by option {bf:pprefix})} a set of mean-plugged placement variables with names p_var1, p_var2, etc., where the names var1, var2, etc. match the original variable names. The original variables are left unchanged.{p_end}
{synopt:m_{it:var1} m_{it:var2} ... (or other prefix set by option {bf:mprefix})} a set of variables with names m_var1, m_var2, etc., where the names var1, var2, etc. match the original variable names of the battery of items. These variables indicate the original missingness of var1, var2, etc., or of the corresponding placement of the respondent on the scale concerned.{p_end}
{synopt:d_{it:var1} d_{it:var2} ... (or other prefix set by option {bf:dprefix})} a set of distances from the respondent to each (mean-plugged if optioned) placement variable.
These distance variables are named d_var1, d_var2, etc., where the names var1, var2, etc. match the original variable names. The original variables are left unchanged.{p_end}
{synopt:_gendist_mc} a variable showing the original count of missing items for each case.{p_end}
{synopt:_gendist_mpc} a variable showing the count of remaining missing items for each case after mean-plugging.{p_end}

{phang}
NOTE that a subsequent invocation of {cmd:gendist} will replace {it:_gendist_mc} and {it:_gendist_mpc} with new counts of missing values for that invocation of {cmd:gendist}. If the earlier counts will be of later interest, the user should therefore save them before invoking {cmd:gendist} again.
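
{pstd}
The following is a minimal sketch of one way to keep the counts from an earlier invocation. It assumes a second battery eup1-eup10 with respondent self-placement euresp; these variables, and the new names lr_mc and lr_mpc, are hypothetical and serve only as illustration. The counts from the first invocation are renamed before {cmd:gendist} is issued again:{p_end}

{phang2}{cmd:. gendist lrp1-lrp10, respondent(lrresp) missing(mean)}{p_end}
{phang2}{cmd:. rename _gendist_mc lr_mc}{p_end}
{phang2}{cmd:. rename _gendist_mpc lr_mpc}{p_end}
{phang2}{cmd:. gendist eup1-eup10, respondent(euresp) missing(mean)}{p_end}

{pstd}
Alternatively, distinct names can be requested on each invocation with the {bf:mcountname} and {bf:mpluggedcountname} options, which should make the renaming step unnecessary:{p_end}

{phang2}{cmd:. gendist lrp1-lrp10, respondent(lrresp) missing(mean) mcountname(lr_mc) mpluggedcountname(lr_mpc)}{p_end}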