Hierarchical Cluster Analysis with conditional proximity measures
hcaccprox varlist [, prox(keyword) method(keyword) partition(numlist) measures details detect(#) ]
partition numlist
Description
hcaccprox performs a hierarchical cluster analysis on dichotomous items, based on specific proximity measures such as conditional proximity measures. The program can also compute indexes to test the resulting partitions (the detect program is required in this case).
partition, run after hcaccprox, displays the composition of specific partitions of the items.
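To make the word "conditional" concrete, the following is a minimal sketch in Python, not the Stata code of hcaccprox. It assumes that conditioning is done on the rest score (the sum of all other items), which is common in the DETECT literature, and that the population covariance is used within each group. The function name and the toy data are hypothetical; the exact definitions of ccov, ccor and mh should be checked in Roussos, Stout and Marden (1998).

```python
from collections import defaultdict

def conditional_covariance(responses, i, j):
    """Weighted mean of cov(item i, item j) within groups of equal rest score.

    Assumption: respondents are grouped on the sum of all items except i and j.
    """
    groups = defaultdict(list)
    for row in responses:
        rest = sum(row) - row[i] - row[j]  # score on the remaining items
        groups[rest].append((row[i], row[j]))

    total = len(responses)
    ccov = 0.0
    for pairs in groups.values():
        n = len(pairs)
        mean_i = sum(x for x, _ in pairs) / n
        mean_j = sum(y for _, y in pairs) / n
        cov = sum((x - mean_i) * (y - mean_j) for x, y in pairs) / n
        ccov += (n / total) * cov  # weight each group by its relative size
    return ccov

# Toy data: 4 respondents, 3 dichotomous items (hypothetical values)
data = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
ccov_01 = conditional_covariance(data, 0, 1)
ccov_02 = conditional_covariance(data, 0, 2)
```

In this toy data set, items 0 and 1 move together inside every rest-score group, while items 0 and 2 do not covary once the remaining item is held fixed.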
Options
prox(keyword) defines the method used to compute the proximity between the items. Six measures are available. The first three (a, ad and cor) are unconditional measures. The last three (ccov, ccor and mh) are conditional measures. See Roussos, Stout and Marden (1998) for details of these six measures. By default, the ccov option is used.
method(keyword) defines how two clusters are aggregated: single for single linkage, complete for complete linkage, and UPGMA for the Unweighted Pair-Group Method with Arithmetic mean. By default, the UPGMA option is used.
partition(numlist) lists the partitions to be detailed by the program. Lists such as (2 4 6) or (2(2)6) are allowed.
measures displays the proximity measures used between the items.
details displays the results of the aggregation algorithm.
detect(#) computes the DETECT, Iss and R indexes for all the partitions with a number of clusters less than or equal to #.
numlist, for the partition program, specifies the partitions to detail, identified by their number of clusters.
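The three aggregation rules named in the method option differ only in how the distance between two clusters is derived from the distances between their items. The following Python sketch illustrates this on a small distance matrix. It is an illustration of the standard single, complete and UPGMA rules, not the code of hcaccprox; the function name, the tie-breaking rule and the toy matrix are assumptions.

```python
def agglomerate(dist, method="UPGMA"):
    """Return the merge history as a list of (cluster_a, cluster_b, distance).

    Items are numbered 0..n-1; each merge creates a new cluster numbered
    n, n+1, ... (hypothetical numbering chosen for this sketch).
    """
    n = len(dist)
    clusters = {k: [k] for k in range(n)}  # cluster id -> item indices

    def between(a, b):
        values = [dist[i][j] for i in clusters[a] for j in clusters[b]]
        if method == "single":
            return min(values)               # closest pair of items
        if method == "complete":
            return max(values)               # farthest pair of items
        if method == "UPGMA":
            return sum(values) / len(values)  # mean over all pairs of items
        raise ValueError("unknown method: " + method)

    history = []
    next_id = n
    while len(clusters) > 1:
        ids = sorted(clusters)
        # Smallest between-cluster distance; ties are broken by cluster id
        d, a, b = min(
            (between(a, b), a, b)
            for pos, a in enumerate(ids)
            for b in ids[pos + 1:]
        )
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        history.append((a, b, d))
        next_id += 1
    return history

# Toy symmetric distance matrix between 4 items (hypothetical values)
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
single = agglomerate(dist, "single")
complete = agglomerate(dist, "complete")
upgma = agglomerate(dist, "UPGMA")
```

On this matrix the three rules produce the same tree and differ only in the height of the last merge: 3 for single linkage, 6 for complete linkage and 4.5 for UPGMA.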
Examples
. hcaccprox q1-q10
. partition 3 5 6
. hcaccprox item1-item9 dotest1-dotest6, detect(6) measures
. hcaccprox c1 c2 c3 c4 c5 c6 c7, prox(a) method(single)
Outputs
. r(varlist) is a macro that contains varlist
. r(nbitems) is a macro that contains the number of items
. r(nodes) is a matrix that contains all the information about all the possible clusters of items. Each column represents a node (the first columns represent the individual items of varlist, and the following columns represent each aggregation of clusters). The first row gives the number of items in each cluster, the third and fourth rows give the two clusters that are aggregated to form the new cluster, and the following rows list the items composing each cluster
. r(mempart) lists the numbers of the clusters composing each possible partition: the last column is the partition into a single cluster, the preceding column is the partition into two clusters, and so on
. r(affect#) is obtained with the partition option. In this vector, each item is associated with the number of its cluster in the partition into # clusters
. r(indexes) is obtained with the detect option. This matrix contains the DETECT, Iss and R indexes associated with each partition whose number of clusters is less than or equal to the number given in the detect option
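The relation between the aggregation history and a vector such as r(affect#) can be sketched as follows in Python. Stopping the aggregation after (number of items - #) merges leaves # clusters, and each item then receives the number of its cluster. This is only an illustration: the function name, the node numbering and the order in which clusters are labelled are assumptions and may differ from what hcaccprox stores.

```python
def cut_into(merges, n_items, k):
    """Cluster label (1..k) of each item in the partition into k clusters.

    merges: list of (cluster_a, cluster_b) in aggregation order, where items
    are numbered 0..n_items-1 and new clusters n_items, n_items+1, ...
    """
    members = {i: [i] for i in range(n_items)}
    next_id = n_items
    for a, b in merges[: n_items - k]:  # n_items - k merges leave k clusters
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1

    labels = [0] * n_items
    # Assumption: clusters are labelled by the position of their first item
    ordered = sorted(members, key=lambda c: min(members[c]))
    for label, cluster_id in enumerate(ordered, start=1):
        for item in members[cluster_id]:
            labels[item] = label
    return labels

# Hypothetical aggregation history for 4 items
merges = [(0, 1), (2, 3), (4, 5)]
two_clusters = cut_into(merges, 4, 2)
three_clusters = cut_into(merges, 4, 3)
```

With this history, the partition into two clusters groups the first two items together and the last two together, while the partition into three clusters keeps the last two items apart.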
References
Roussos L. A., Stout W. F. and Marden J. I., Using new proximity measures with hierarchical cluster analysis to detect multidimensionality. Journal of Educational Measurement, 35(1), pp 1-30, 1998.
Zhang J. and Stout W. F., The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64(2), pp 213-249, 1999.
Also see
help for detect
Author
Jean-Benoit Hardouin, Regional Health Observatory (ORS) - 1, rue Porte Madeleine - BP 2439 - 45032 Orleans Cedex 1 - France. You can contact the author at jean-benoit.hardouin@neuf.fr and visit the