{smcl} {* 19-Jan2004; rev 9-Mar2005, 26Jan2006, 8feb2006, 27mar2008, 1apr2008, 2012feb9, 2012feb17, 2012nov12} {hline} help for {hi:mahapick} {hline} {title:Select matching observations based on a Mahalanobis scoring} {p 8 17 2} {cmd:mahapick} {it:varlist} [{it:weight}] {cmd:, idvar(}{it:idvarname}{cmd:)} {cmd:treated(}{it:treatedvar}{cmd:)} [ {cmd:pickids(}{it:pickidvars}{cmd:)} {cmd:genfile(}{it:filename}{cmd:)} {cmd:replace} {cmd:prime_id(}{it:prime_id_var}{cmd:)} {cmd:matchnum(}{it:matchnum_var}{cmd:)} {cmd:nummatches(}{it:#}{cmd:)} {cmd:full} {cmd:matchon(}{it:matchonvars}{cmd:) sliceby(}{it:slicebyvars}{cmd:)} {cmd:clear fast} {cmd:score} {cmd:scorevar(}{it:scorevarname}{cmd:)} {cmd:all} {cmdab:unsq:uared} {cmdab:eucl:idean} {cmdab:disp:lay(}{it:display_options}{cmd:)} {cmd:float} {cmdab:nocovtrlim:itation} ] {title:Description} {p 4 4 2} {cmd:mahapick} seeks matching observations for a set of "treated" observations, using a Mahalanobis distance measure which it calculates. {p 4 4 2} The "treated" observations are the ones for which you are seeking matches; the others, the non-treated, form the pool of potential matches (or "control" observations). (The use of the term "treated" comes from the study of medical treatments.){bind: } Both the treated and non-treated observations are expected to be present together in one dataset, currently in memory. The treated observations are identified by {it:treatedvar}. {p 4 4 2} For each treated observation, the closest matching non-treated observation(s) will be chosen, according to the calculated distance measure, and subject to the constraints of {cmd:matchon(}{it:matchonvars}{cmd:)} if that option is used. The selection of matches is done independently for each treated observation; a given control observation may appear as a match for more than one treated observation.
(But, of course, matched control observations are unique within the set selected for any particular treated observation, if multiple matches are chosen.) {col 12}{hline} {p 12 12 12} {hi:technical note:} Choosing unique matches is beyond the scope of what {cmd:mahapick} was designed for, and involves a multitude of complex issues. However, users can take the output of {cmd:mahapick} and perform further processing to arrive at a uniquely-chosen set. See the {cmd:score} and {cmd:all} options for more remarks about this topic. See also {help mahascores}. {p_end} {p 12 12 12} Users desiring a unique selection based on a randomization process should see {help mahaselectunique}.{p_end} {col 12}{hline} {p 4 4 2} {it:varlist} (the "covariates") is a set of numeric variables on which to build the distance measure {c -} the Mahalanobis score. For each pair of observations, the distance measure (or score) is the matrix product d'Xd, where d is a vector of differences in the set of variables, and X is the inverse of the covariance matrix of {it:varlist}. If i and j are indices of two observations, then d = (v1[i]-v1[j] \ v2[i]-v2[j] \ ... \ vn[i]-vn[j]), where v1 v2 ... vn are the variables of {it:varlist}. {p 4 4 2} Thus, the score is the sum of all the possible products of pairs of elements of d, weighted by corresponding elements of X. See {help mahascore} for a further explanation of this. Note that the result is the square of what is properly the Mahalanobis distance, but this distinction should have no effect on the selection of closest matches (lowest scores). The {cmd:unsquared} option will cause the scores to be the proper unsquared values. {p 4 4 2} The covariances are computed on the treated observations only, also limited to the set of observations that have all elements of {it:varlist} non-missing.
I.e., the computation of covariances uses case-wise deletion when encountering missing values; the resulting values are potentially different from pair-wise covariances. This may seem like a limitation but it is appropriate; any treated observation with a missing value in one or more elements of {it:varlist} will get no matches, so it might as well be excluded at the outset. See {cmd:nocovtrlimitation} for how to override the limitation to treated observations. {p 4 4 2} Weights are allowed, but affect only the computation of the covariances. {p 4 4 2} The variables of {it:varlist} should be of numeric significance {c -} not categorical. Any categorical variables should be replaced by a set of indicator variables. {title:Required Options} {p 4 4 2} {cmd:idvar(}{it:idvarname}{cmd:)} specifies an identifying variable. It can be of any type, but it must be a single variable. Thus, if the existing identifying scheme consists of multiple variables, you should find a way to combine them uniquely into a single variable. {p 4 4 2} It is the user's responsibility to assure that {it:idvarname} uniquely identifies all observations, thus assuring a usable result. {p 4 4 2} {cmd: treated(}{it:treatedvar}{cmd:)} specifies a numeric variable that distinguishes the treated observations. Its values must be 0 or 1, where 1 indicates a treated observation. {title:Semi-required Options} {p 4 4 2} {cmd:pickids(}{it:pickidvars}{cmd:)} and {cmd:genfile(}{it:filename}{cmd:)} are two ways of preserving the results of the matching. You must use one or the other, or both. {p 4 4 2} {cmd:pickids(}{it:pickidvars}{cmd:)} specifies a set of one or more pre-existing variables to hold the id's of the matched observations. It/they must be of the same type as {it:idvarname}, and must be filled with missing values ("" for strings) unless the {cmd:clear} option is specified. 
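{p 4 4 2} As a minimal sketch of preparing the {it:pickidvars} before the call, the following creates three empty variables and then fills them. It assumes that the id variable is named id0 and is stored as a long; the names id1, id2, id3 are hypothetical. For a string id, you would instead create string variables of the same type, filled with "".{p_end} {p 4 8 2} {cmd:. * id1-id3 are hypothetical names; the type must match that of id0 (assumed long here)}{p_end} {p 4 8 2} {cmd:. generate long id1 = .}{p_end} {p 4 8 2} {cmd:. generate long id2 = .}{p_end} {p 4 8 2} {cmd:. generate long id3 = .}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) treated(assisted) pickids(id1 id2 id3)}{p_end}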
{p 4 4 2} If {it:pickidvars} consists of more than one variable, then the first will get the best match, the second will get the second best match, and so on. {p 4 4 2} {cmd:genfile(}{it:filename}{cmd:)} specifies a file into which to {help post} the results. {p 4 4 2} Note that {cmd:pickids(}{it:pickidvars}{cmd:)} puts the results into wide form within the current dataset, whereas {cmd:genfile(}{it:filename}{cmd:)} puts them into long form in a separate dataset. (See {help reshape} for a discussion of wide- versus long-shaped data.) {p 4 4 2} Another difference between these methods is that with {cmd:pickids(}{it:pickidvars}{cmd:)}, it is up to the user to subsequently {help save} the dataset {c -} (or use it directly after its creation), whereas {cmd:genfile(}{it:filename}{cmd:)} writes the results to a separate file. {p 4 4 2} If you create a (wide) dataset using {cmd:pickids(}{it:pickidvars}{cmd:)}, you can subsequently convert it to long form using {help stackids}. {col 12}{hline} {p 12 12 12} {hi:Technical note:} {cmd:pickids} was the original method provided; {cmd:genfile} was a later addition, and is probably more useful. {p_end} {col 12}{hline} {p 4 4 2} If {cmd:genfile(}{it:filename}{cmd:)} is used, the resulting file is a Stata dataset with these variables: {p 8 10 2} A "prime_id" variable of the same type as {it:idvarname}. This holds the id of the treated observation for which matches are being found. The default name for this is _prime_id; it can be changed using the {cmd:prime_id} option. {p 8 10 2} {it:idvarname} {c -} the same name and type as in {cmd:idvar(}{it:idvarname}{cmd:)}. This holds the ids of all observations {c -} treated or matching control observations. {p 8 10 2} A "matchnum" variable {c -} an int to count up the series of matches for each treated observation. The default name is _matchnum; it can be changed using the {cmd:matchnum} option. This variable will range from 0 to {it:#}. 
{p 8 10 2} Optionally, a "score" variable, if the {cmd:score} option is specified. This holds the score {c -} the distance measure between the treated (prime_id) observation and the given control observation. See the {cmd:score} option for more about this. {p 4 4 2} Within this file, there will be, for each treated observation... {p 8 10 2} one observation representing the treated observation itself, with _matchnum=0, {it:idvarname}=_prime_id (or {it:prime_id_var}), and _score (or {it:scorevarname}) =0 (if {cmd:score} was specified); this is followed by... {p 8 10 2} zero or more observations for the matches, with _matchnum=1, 2, ... , {it:#}, and {it:idvarname} holding the id of the matched observations. The first will get the best match, the second will get the second best match, and so on. {p 4 4 2} For each treated observation, _prime_id (or {it:prime_id_var}) is a constant, equaling the id of the treated observation. Note that {it:idvarname} = _prime_id for the observations where _matchnum=0. {p 4 4 2} The notion of "best match" and "second best match", etc., is ambiguous when ties occur in the scoring. In this case, the present sort order determines the choices. See "identical scorings" under {ul:Remarks} for more on this matter. {title:Optional Options} {p 4 4 2} {cmd:matchon(}{it:matchonvars}{cmd:)} imposes a restriction on the matching process, such that matches will be made only to observations that completely agree with the treated observation on the values in {it:matchonvars}. In other words, the dataset is logically partitioned into subsets, as determined by the values in {it:matchonvars}, and matching will occur only within each partition. ({it:matchonvars} may not include {it:treatedvar}.) {p 4 4 2} It is best that the variables in {it:matchonvars} take a fairly small set of values; generally, only categorical variables are appropriate. The types may be numeric or string. {p 4 4 2} Do not confuse {it:matchonvars} with {it:varlist}. 
{it:varlist} is a set of variables on which you want the matches to be "close"; {it:matchonvars} are variables on which you require perfect agreement. {p 4 4 2} Missing values (including the extended missing values .a .b, etc.) in {it:matchonvars} are regarded as distinct. {p 4 4 2} {cmd:sliceby(}{it:slicebyvars}{cmd:)} imposes the same kind of restriction as does {cmd:matchon()}, restricting the matching to stay within the subsets as determined by the values in {it:slicebyvars}. However, {cmd:sliceby()} achieves the effect by different means, dividing the dataset into subsets, running the matching process separately on each subset, and reuniting them afterwards. By contrast, {cmd:matchon()} (without {cmd:sliceby()}) merely limits the matches that are chosen. {p 4 4 2} {cmd:sliceby(}{it:slicebyvars}{cmd:)} may only be specified if {cmd:matchon(}{it:matchonvars}{cmd:)} is also specified, and {it:slicebyvars} must be a subset of {it:matchonvars}. Thus, {it:matchonvars} gives the full set of variables on which the matches must completely agree; {it:slicebyvars} specifies which of those variables will be the basis for actual slicing of the dataset to achieve the effect. Of course, {it:slicebyvars} may equal {it:matchonvars}, but there may be some advantage to not doing that, as will be explained shortly. {p 4 4 2} {cmd:sliceby()} can result in very significant speed improvements for large datasets. But, of course, it is appropriate only where such a partitioning is an existing requirement of the desired matching operation. {p 4 4 2} {cmd:sliceby()} achieves its speed advantage by reducing unnecessary sorting {c -} at the expense of manipulating many intermediary files. If the slices are exceedingly fine, the work involved in slicing may overshadow the advantages gained. 
Thus, it may be better for the slices to be coarser than the matchon sets; i.e., use {cmd:sliceby()} to go part-way in dividing up the data, and use {cmd:matchon()} (with one or more additional variables) to complete the effect. {p 4 4 2} Because {it:slicebyvars} is a subset of {it:matchonvars}, all remarks regarding {it:matchonvars} apply to {it:slicebyvars}. In particular, they ought to be categorical, all types are allowed, and extended missing values are regarded as distinct. {col 12}{hline} {p 12 12 12} {hi:Technical note:} it was not functionally necessary to require {it:slicebyvars} to be a subset of {it:matchonvars}. But it makes for clearer syntax in that it reminds the user that slicing implicitly restricts the matching. That is, regardless of that requirement, the use of {cmd:sliceby(}{it:slicebyvars}{cmd:)} implies the same effect as having {it:slicebyvars} among the {it:matchonvars}. {p 12 12 12} (Also note that the requirement that {it:slicebyvars} be a subset of {it:matchonvars} imposes the opposite relation between the corresponding subsets of the data; the data subsets corresponding to {it:matchonvars} are subsets of those corresponding to {it:slicebyvars}.){bind: } {p_end} {col 12}{hline} {p 4 4 2} Note that the covariance matrix and its inverse are precalculated on the whole set (of treated observations only), not on each slice or matchon set. Thus, the use of {cmd:matchon(}{it:matchonvars}{cmd:)}, with or without {cmd:sliceby(}{it:slicebyvars}{cmd:)}, is not the same as if you were to run {cmd:mahapick} on each matchon set separately. {p 4 4 2} {cmd:fast} applies only if {cmd:sliceby} is specified. It causes {cmd:mahapick} to bypass the {help preserve} and {help restore} commands that surround the slicing operation, and thereby can save some time {c -} at the expense of safety.
Without {cmd:fast}, if you press the Break key during the processing of the slices, the original dataset will be restored (though any matches made during the processing and recorded using {cmd:pickids()} will be lost). With {cmd:fast}, if you press the Break key during the processing of the slices, you will be left with only the present slice. {p 4 4 2} {cmd:nocovtrlimitation} specifies that the covariance computation not be limited to treated observations. {p 4 4 2} {cmd:unsquared} modifies the score values to be the unsquared values, that is, the square roots of the default values. As mentioned elsewhere, the choice of squared or unsquared values ought to have no effect on the selection of matches. Thus, this should only affect the {cmd:genfile} option. {p 4 4 2} {cmd:euclidean} specifies that the normalized Euclidean measure is to be used, rather than the true Mahalanobis measure {c -} meaning that the off-diagonal elements of the covariance matrix are replaced with zeroes prior to inverting. The result is a measure that accounts for the scale of measurement in each variable of {it:varlist}, but ignores correlation between the variables. This is probably not desirable, given the advantages of the true Mahalanobis measure, but is provided as an alternative and for comparison to (or emulation of) earlier releases of {help mahascore} and {help mahapick}. See notes under {ul:Change History} as well as {help mahascore} for more details on this matter. {p 4 4 2} {cmd:display(}{it:display_options}{cmd:)} turns on the display of certain data structures used in the computation. If {it:display_options} contains {cmd:covar}, then the covariance matrix is listed; if it contains {cmd:invcov}, then the inverse covariance matrix is listed. Any other content is ignored.
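{p 4 4 2} As an illustration of the options just described, the following sketch writes unsquared scores to the output file and lists both matrices. The variable and file names are the ones used under {ul:Examples}; writing the two display keywords separated by a space is an assumption about how {it:display_options} is parsed.{p_end} {p 4 8 2} {cmd:. * assumption: covar and invcov may be given together, separated by a space}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) treated(assisted) genfile(myfile) replace} {cmd:score unsquared display(covar invcov)}{p_end}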
{title:Options for use with {cmd:pickids} only} {p 6 6 2} {cmd:clear} indicates that if {it:pickidvars} are not all missing, then it is okay to go ahead and replace them with missing values at the start of the process. {title:Options for use with {cmd:genfile} only} {p 6 6 2} {cmd:replace} indicates that if {it:filename} already exists, then it is okay to replace it. {p 6 6 2} {cmd:prime_id(}{it:prime_id_var}{cmd:)} allows you to specify the name for the prime_id variable. The default name is _prime_id. {p 6 6 2} {cmd:matchnum(}{it:matchnum_var}{cmd:)} allows you to specify the name for the matchnum variable. The default name is _matchnum. {p 6 6 2} {cmd:nummatches(}{it:#}{cmd:)} specifies how many matches to collect for each treated observation. The default is 1. Note that this corresponds to the number of {it:pickidvars} in the {cmd:pickids} option. {p 6 6 2} {cmd:full} specifies that if matches cannot be made, then observations with missing values in {it:idvarname} are to be written so that there will always be {it:#} +1 observations (i.e., {it:#} "matches") for each treated observation. Suppose that you specify {cmd:nummatches(3)}, and that for a given treated observation, only one match can be found. Then by default, only two observations will be written: one for the treated observation, and one for the match. If {cmd:full} is specified, then two additional observations (with missing values in {it:idvarname}) will be written. {p 6 6 2} {cmd:score} specifies that the file will contain an additional variable, holding the computed distance measure between the treated observation and the control observation. The default name is _score, and its type is double. {p 6 6 2} Note that to record all the distance measures between all treated observations and all other observations "in place" (using {cmd:pickids()}) would require adding as many new variables as there are control observations, which may or may not be practical. 
Such a structure would be in wide form; the {cmd:score} option captures that information, but puts it in long form, which may be more practical. See also the remarks about {help mahascores}, below. {p 6 6 2} One possible use for this option is to allow users to supplement the results with an algorithm for further refinement of the matchings, for example, to reduce a set of candidate matches to a smaller set of unique matches, while minimizing the sum of all distance measures in the selected observations. {col 12}{hline} {p 12 12 12} {hi:technical note:} Implementing such an algorithm may be difficult in Stata; it may be necessary to export the results for use by a program written in a general-purpose programming language. On the other hand, it may be feasible to do it in Mata. {p_end} {col 12}{hline} {p 6 6 2} {cmd:scorevar(}{it:scorevarname}{cmd:)} allows you to specify the name of the score variable, if the {cmd:score} option is used. The default name is _score. {p 6 6 2} {cmd:all} signifies that all possible control observations will be included. {cmd:all} without {cmd:full} renders {cmd:nummatches(}{it:#}{cmd:)} irrelevant, and is equivalent to specifying {cmd:nummatches(}{it:#}{cmd:)}, where {it:#} is at least as large as the maximal number of available control observations (within matchon groups, if specified). {p 6 6 2} {cmd:all} with {cmd:full} causes {it:#} to be the minimum number of control observation records written for each treated observation (possibly with some filled with missing values to fill out the quota), but there will be more control observations written if they are available. {p 6 6 2} Note that with {cmd:all}, the action of {cmd:mahapick} is not so much a selection process as a scoring and ranking process. Also, the number of control matches written per treated observation can vary from one matchon group to another, if {cmd:matchon} was specified.
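{p 6 6 2} A sketch of {cmd:all} combined with {cmd:score}, ranking every available control observation for each treated observation. The file name allpairs and the score variable name dist are hypothetical; the other names are those used under {ul:Examples}.{p_end} {p 4 8 2} {cmd:. * allpairs and dist are hypothetical names}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) treated(assisted) genfile(allpairs) replace} {cmd:all score scorevar(dist)}{p_end} {p 4 8 2} {cmd:. use allpairs, clear}{p_end} {p 4 8 2} {cmd:. summarize dist if _matchnum > 0}{p_end}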
{p 6 6 2} The intent of the {cmd:all} option is that it would be used with {cmd:score}, by users who want to take the scores (of all potential pairings) and do their own selection algorithm. But if the user desires the score values, without the sorting or selecting of control observations, then it is recommended to use {help mahascores} instead of mahapick. That provides a way to simply capture the score values for all pairs of observations (or possibly all treated-to-non-treated pairs), and should prove to be faster than mahapick. {p 6 6 2} {cmd:float} specifies that the type for the score variable generated by {cmd:genfile()} will be float, rather than double. {title:Remarks} {p 4 4 2} If any of these conditions occur, then the score will be missing, and no matches will be made for the given treated observation: {p 8 8 2} Any covariate (variable in {it:varlist}) is missing in the treated observation. {p 8 8 2} Any of the variances are missing or zero (this would affect the whole set). (The {cmd:omitmiszer} option, which formerly avoided this automatically, was eliminated in the 1Apr2008 release; see {ul:Change History}.) {p 4 4 2} In addition, if any covariate is missing in a control observation, then that observation is excluded from consideration. {p 4 4 2} It may happen that no matchable control observations are found for a given treated observation, and no match will be assigned. More generally, there may be fewer than {it:#} (or fewer than the number of variables in {it:pickidvars}) matchable control observations. For example, if you have {cmd:nummatches(3)} (or three {it:pickidvars}), and only two eligible matches are found for a given treated observation, then, only two matches will be recorded in {it:filename} (or only the first two of the {it:pickidvars} will be assigned) for that observation. {p 4 4 2} Any of these situations are unlikely to occur if the pool of control observations is large {c -} interpreted within each matchon group if {cmd:matchon()} is specified.
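{p 4 4 2} One way to check for such shortfalls after a {cmd:genfile()} run is sketched below. It assumes the default variable names (_prime_id, _matchnum), an id variable named id0, an output file named myfile, and a request of {cmd:nummatches(3)}; nfound is a hypothetical name. The test on missing ids means that filler records written by {cmd:full} are not counted.{p_end} {p 4 8 2} {cmd:. use myfile, clear}{p_end} {p 4 8 2} {cmd:. * nfound is a hypothetical name: the number of real matches per treated observation}{p_end} {p 4 8 2} {cmd:. egen nfound = total(_matchnum > 0 & !missing(id0)), by(_prime_id)}{p_end} {p 4 8 2} {cmd:. list _prime_id nfound if _matchnum == 0 & nfound < 3}{p_end}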
{p 4 4 2} There may be cases where identical scorings occur for several potential matches. In this case, the existing sort order is used for breaking ties, taking the earlier-placed observations first (using a stable sort). Consequently, repeated runs will yield identical results, even if ties exist, provided that the initial sort order is kept the same. {p 4 4 2} Identical scorings are less likely to occur if there are many variables in {it:varlist}, or if these variables take on many different values. When identical scorings occur, they usually are the result of identical values in {it:varlist} {c -} including cases where {it:varlist} is the same for the treated and control observations (for a score of 0). {p 4 4 2} Note that, while the processing involves sorting, the dataset is returned to its original sort order unless {cmd:sliceby(}{it:slicebyvars}{cmd:)} is specified, in which case, the order is that of a stable sort on {it:slicebyvars}. {p 4 4 2} {cmd:mahapick} is rather noisy in its displayed output. {p 4 4 2} This calls {help mahascore} and {help covariancemat}, other programs by the same author. {p 4 4 2} It is up to the user to make use of the matches. Generally, you will want to {help merge} some "content" data onto the resulting set for analysis. If you use {cmd:genfile()} (or {cmd:pickids()}, followed by {cmd:stackids()}) your resulting set will be a "basis" in long form, with treated and matched observations together in the same dataset. You will subsequently want to merge content data on to this, presumably using {it:idvarname} as the matching variable. This is probably the most desirable form of the resulting data for analysis purposes. {p 4 4 2} Presumably, {it:treatedvar} is an important variable in the analysis, but it may not be present in the basis set, if constructed as described above. You can recover it by including it in the merge, or you can reconstruct it by identifying observations where _matchnum==0. 
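{p 4 4 2} The merge step described above might look like the following sketch. It assumes that the results were written with {cmd:genfile(myfile)}, that the id variable is id0, and that a separate dataset named content (a hypothetical name) holds one observation per id0, including the original variable assisted. The {cmd:m:1} form is used because a given control observation can appear under more than one treated observation. The variable istreated is a hypothetical name for the reconstructed indicator.{p_end} {p 4 8 2} {cmd:. use myfile, clear}{p_end} {p 4 8 2} {cmd:. * content is a hypothetical dataset with one observation per id0}{p_end} {p 4 8 2} {cmd:. merge m:1 id0 using content, keep(master match)}{p_end} {p 4 8 2} {cmd:. * istreated is a hypothetical name: 1 for the record of the treated observation itself}{p_end} {p 4 8 2} {cmd:. generate byte istreated = (_matchnum == 0)}{p_end}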
{p 4 4 2} If you have used {cmd:pickids()} and are leaving the data in wide form, you would need to merge on the content data, once for the treated observation, and once for each pickid, with distinct variable names for the content data in each of these merges. Such a data structure may be cumbersome, but it has the one advantage of directly embodying the connection between treated and matched observations {c -} in case that is important to your planned analysis. (For example, you can construct differences between the treated and matched observations.) {col 12}{hline} {p 12 12 12} {hi:technical note:} If you have used {cmd:pickids()} (and not {cmd:genfile()}), but find that you prefer the results in long form, you can either rerun the match process using {cmd:genfile()}, or convert the results to long form using {help stackids}. The latter option may be convenient if the matching process takes a lot of time. ({cmd:stackids} is similar to {help reshape} and {help stack}, but includes provisions to preserve the correspondence between the treated and the matched observations.) {p_end} {col 12}{hline} {p 4 4 2} One useful way of using {cmd:mahapick} is to take several more matches per treated observation than you actually expect to use. That is, you specify a large {cmd:nummatches()} value (or a large set of {it:pickidvars}). For example, if you want three matches per treated observation, you might collect, say, eight matches per treated observation (specifying {cmd:nummatches(8)}). Then in subsequent analyses, using some code to pre-screen your data, you take the first (best) three "good" matches {c -} good in the sense that they have no missing values in variables needed in the analysis. (Those would be variables in the "content" data mentioned above, which are typically {it:not} among those used in the matching (i.e., {it:varlist}).) 
The advantage of this is that, rather than filtering for observations with non-missing values in the content data before the match, you do it at the time you analyze the data. In subsequent analyses, you might adjust the set of variables involved, thereby potentially shifting the set of control observations to exclude. But, given this setup, you will not need to rerun the match. You can also have several analyses with different mixes of variables, each of which takes its own best set of matches. The program {help screenmatches} does this screening for you (with the data in long form). {p 4 4 2} If the inverse covariance matrix is computed on a very small set of observations, it may not be valid and may yield strange results. It might fail to be positive semi-definite, and can yield negative measures. (It may also cause the {cmd:unsquared} option to have a real effect on the choice of matches.) {p 4 4 2} As it stands presently, there are no [{cmd:if} {it:exp}] or [{cmd:in} {it:range}] features provided. They were not deemed essential when {cmd:mahapick} was first created, but could be added if there is a demand for them. {p 4 4 2} See {help mahaselectunique} for a further discussion of issues relating to the formulation of the covariate set and the quality of the scoring, as well as how that relates to unique selection. {title:Examples} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) genfile(myfile)} {cmd:nummatches(8) full} {cmd:treated(assisted)}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) genfile(myfile)} {cmd:nummatches(8) full} {cmd:treated(assisted)} {cmd:matchon(sex region) sliceby(region)}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) pickids(id1 id2 id3)} {cmd:treated(assisted)}{p_end} {p 4 8 2} {cmd:. mahapick income age numkids, idvar(id0) pickids(id1 id2 id3)} {cmd:treated(assisted)} {cmd:matchon(sex region)}{p_end} {p 4 8 2} {cmd:. 
mahapick income age numkids, idvar(id0) pickids(id1 id2 id3)} {cmd:treated(assisted)} {cmd:matchon(sex region) sliceby(region)}{p_end} {title:Change History} {p 4 4 2} The 1Apr2008 release implements the full Mahalanobis measure. Prior to that release, the normalized Euclidean measure was used, which is equivalent to the current version under the {cmd:euclidean} option. Referring to the d vector mentioned under the description of {it:varlist}, the normalized Euclidean measure is the sum of the squares of the components of d, weighted by the inverse variance of each variable. {p 4 4 2} The 1Apr2008 release eliminated the {cmd:common} and {cmd:omitmiszer} options, which were deemed as inappropriate for the changes to the program. Note that {cmd:common} was to limit variance computations to the set of common observations that have no missing values in {it:varlist}; the present method (for covariances) always imposes that limitation. {p 4 4 2} The 1Apr2008 release added these options: {cmd:unsquared}, {cmd:euclidean}, {cmd:float}, {cmd:display()}, and {cmd:nocovtrlimitation}. {title:Acknowledgement} {p 4 4 2} The author wishes to thank Joseph Harkness, formerly of The Institute for Policy Studies at Johns Hopkins University for guidance in developing this program, as well as Heiko Giebler of Wissenschaftszentrum Berlin fur Sozialforschung GmbH, for suggesting further improvements. {title:Author} {p 4 4 2} David Kantor; initial development was done at The Institute for Policy Studies, Johns Hopkins University. Email {browse "mailto:kantor.d@att.net":kantor.d@att.net} if you observe any problems. {title:Also See} {p 4 4 2} {help mahascore}, {help mahascores}, {help mahascore2}, {help covariancemat}, {help variancemat}, {help screenmatches}, {help stackids}, {help mahaselectunique}.