-------------------------------------------------------------------------------
help for hcavar                                            Jean-Benoit Hardouin
-------------------------------------------------------------------------------

Hierarchical cluster analysis of variables

hcavar varlist [, prox(keyword) matrix(matrix) method(keyword) partition(numlist) measures detect nodendrogram]

Description

hcavar is the new name of the former hcaccprox module.

hcavar performs a hierarchical cluster analysis of variables. The variables can be numerical, ordinal or binary. By default, the distance between two variables is computed as the square root of 2 times one minus their Pearson correlation. For binary variables, other similarity coefficients can be used instead: Matching, Jaccard, Russel or Dice (see the prox option for more details). In that case, the distance (dissimilarity measure) is computed as the square root of one minus the value of the coefficient.

In the field of Item Response Theory, it is possible to use proximity measures conditional on the score, as defined by Roussos, Stout and Marden (1998): conditional correlations, conditional covariances, or Mantel-Haenszel measures of similarity. In the same field, it is possible to compute, for a set of partitions of the items, the DETECT, Iss and R indexes defined by Zhang and Stout (1999).
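
The two distance definitions above can be written compactly. In the sketch below, r_{jk} is the Pearson correlation between variables j and k, and s_{jk} is one of the similarity coefficients for binary variables. The cell counts a, b, c, d (a: both variables equal 1; b and c: the two discordant cases; d: both equal 0) and the coefficient formulas are the usual textbook definitions. They are an assumption here, since this help file does not state them.

```latex
\begin{align*}
d_{jk} &= \sqrt{2\,(1 - r_{jk})} && \text{(Pearson, default)} \\
d_{jk} &= \sqrt{1 - s_{jk}}      && \text{(Matching, Jaccard, Russel, Dice)} \\[6pt]
s^{\text{Matching}}_{jk} &= \frac{a + d}{a + b + c + d}, &
s^{\text{Jaccard}}_{jk}  &= \frac{a}{a + b + c}, \\
s^{\text{Russel}}_{jk}   &= \frac{a}{a + b + c + d}, &
s^{\text{Dice}}_{jk}     &= \frac{2a}{2a + b + c}
\end{align*}
```

With the Pearson measure, two perfectly correlated variables (r = 1) are at distance 0, uncorrelated variables (r = 0) at distance sqrt(2), and perfectly negatively correlated variables (r = -1) at distance 2.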

Options

prox defines the proximity measure to use: jaccard (alias a), russel, dice, matching (alias ad), pearson (alias corr), conditional covariance (ccov), conditional correlation (ccor), or Mantel-Haenszel (mh). The default is pearson, which is the only measure available with ordinal or numerical variables.

matrix specifies an existing matrix to use as the distance matrix.

method defines the method used to aggregate two clusters. See cluster for more details about these methods. The complete name of the method must be given (with or without "linkage"); no abbreviation is allowed. waveragelinkage is used by default.

partition lists the partitions of the variables to be detailed by the program.

measures displays the matrix of proximity measures between the variables.

detect computes, for binary variables, the DETECT, Iss and R indexes for the partitions indicated in the partition option.

nodendrogram suppresses the display of the dendrogram.

Examples

. hcavar var1-var10 /*displays only the dendrogram*/

. hcavar var*, partition(1/6) measures method(single) /*Single linkage, details of 6 partitions*/

. hcavar itemA1-itemA7 itemB1-itemB7, prox(ccor) method(single) detect part(1/4) /*details of 4 partitions, conditional correlations*/

Outputs

. r(nbvar) contains the number of variables

. r(measures) is the matrix of distance measures between the variables

. r(clusters) is a matrix, obtained with the partition option, containing the composition of the partitions defined with this option.

. r(indexes) is obtained with the detect option. This matrix contains the DETECT, Iss and R indexes associated with each partition defined with the partition option.
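
The stored results listed above can be inspected after a call. The lines below are a minimal sketch, assuming that binary variables named var1-var10 (as in the first example) are in memory and that the results are left in r() as described. The matrix names Dist, Clus and Idx are arbitrary; return list and matrix list are standard Stata commands.

```stata
* Analyse the variables without drawing the dendrogram;
* the partition range 1/3 is an arbitrary choice for illustration
hcavar var1-var10, partition(1/3) measures detect nodendrogram

* List everything the command left in r()
return list

* Copy the returned matrices before another r-class command overwrites them
matrix Dist = r(measures)
matrix Clus = r(clusters)
matrix Idx  = r(indexes)

* Display the distance matrix
matrix list Dist
```

Copying the matrices immediately matters because r() results are replaced by the next r-class command that runs.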

References

Roussos L. A., Stout W. F. and Marden J. I., Using new proximity measures with hierarchical cluster analysis to detect multidimensionality. Journal of Educational Measurement, 35(1), pp 1-30, 1998.

Zhang J. and Stout W. F., The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64(2), pp 213-249, 1999.

Also see

help for cluster, help for detect (if installed)

Author

Jean-Benoit Hardouin, PhD, assistant professor
EA 4275 SPHERE "Team of Biostatistics, Clinical Research and Subjective Measures in Health Sciences"
University of Nantes - Faculty of Pharmaceutical Sciences
1, rue Gaston Veil - BP 53508
44035 Nantes Cedex 1 - FRANCE
Email: jean-benoit.hardouin@univ-nantes.fr