{smcl}
{hline}
help for {hi:clustergram}{right:{hi: Matthias Schonlau}}
{hline}

{title:Graph for visualizing hierarchical and non-hierarchical cluster analyses}

{p 8 27}
{cmd:clustergram}
{it:varlist}
[{cmd:if} {it:exp}] [{cmd:in} {it:range}]
{cmd:,}
{cmdab:cl:uster(}{it:clustervarlist}{cmd:)}
[
{cmdab:fr:action(}{it:#}{cmd:)}
{cmd:fill}
{it:graph_options} ]


{title:Description}

{p}{cmd:clustergram} draws a graph to examine how cluster members are assigned
to clusters as the number of clusters increases in a cluster analysis.
This is similar in spirit to the dendrograms (tree graphs) used for
hierarchical cluster analyses. The graph is especially useful for
non-hierarchical clustering algorithms, such as {it:k}-means, and for
hierarchical cluster algorithms when the number of observations is too large
for dendrograms to be practical.

{p}{it:varlist} usually contains the variables with which the cluster
algorithm was run. These variables are used only to compute the value on the
vertical axis for each cluster. It is also possible to specify a single
variable to examine the cluster assignments with respect to that variable;
this variable need not be among the variables that were used for the cluster
analysis. When more than one variable is specified, it often makes sense to
standardize the variables first so that no single variable dominates the scale.


{title:Options}

{p 0 4}
{cmd:cluster(}{it:clustervarlist}{cmd:)} specifies the variables containing
cluster assignments, as previously produced by {cmd:cluster}. Usually these
variables specify, in order, the assignments to 1, 2, ... clusters. Typically
they will be named something like {cmd:cluster1-cluster}{it:max}, where
{it:max} is the maximum number of clusters identified. It is possible to
specify assignments other than to 1, 2, ... clusters (e.g., omitting the first
few clusters, or listing them in reverse order); a warning is displayed in
this case. This option is required.{p_end}
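
{p 4 4}As a minimal sketch of one way to create such variables, assuming a
hierarchical analysis with complete linkage: the variable names
{inp:var1-var7}, the analysis name {inp:myclus}, and the stub {inp:cluster}
are illustrative assumptions, not names that {cmd:clustergram} requires.{p_end}

{p 8 12}{inp:. cluster completelinkage var1-var7, name(myclus)}{p_end}

{p 8 12}{inp:. cluster generate cluster = groups(1/20), name(myclus)}{p_end}

{p 4 4}When {cmd:groups()} is given a list of numbers, {cmd:cluster generate}
should treat {inp:cluster} as a stub and create {inp:cluster1} through
{inp:cluster20}, which can then be passed as
{cmd:cluster(cluster1-cluster20)}. For {it:k}-means, the analysis would
instead be rerun once for each number of clusters.{p_end}
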

{p 0 4}{cmd:fraction(}{it:#}{cmd:)} specifies a scaling ("fudge") factor that
controls the width of the line segments; it is typically adjusted to reduce
visual clutter. The relative width of any two line segments is not affected.
The value should be between 0 and 1. The default is 0.2.

{p 0 4}
{cmd:fill} specifies that individual graph segments are to be filled (solid).
By default, only the outline of each segment is drawn.

{p 0 4}
{it:graph_options} are options of {cmd:graph, twoway} other than
{cmd:symbol()} and {cmd:connect()}. The defaults include {cmd:ylabel}s showing
three (rounded) levels and {cmd:gap(5)}.


{title:Examples}

{p}Plot the clustergram for {inp:cluster1} through {inp:cluster20}:

{p 4 8}{inp:. clustergram var1-var7, cluster(cluster1-cluster20)}

{p}Plot the clustergram with a smaller width to reduce visual clutter:

{p 4 8}{inp:. clustergram sepallen-petalwid, cluster(cluster1-cluster10) fraction(.1) xla(1/10)}

{p}Examine the cluster assignments with respect to a single variable:

{p 4 8}{inp:. clustergram petalwid, cluster(cluster1-cluster10) l2title("petalwid")}


{title:References}

{p 0 8}
Schonlau, M. The clustergram: a graph for visualizing hierarchical and
non-hierarchical cluster analyses. The Stata Journal, 2002; 2(4): 391-402.

{p 0 8}
Schonlau, M. Visualizing hierarchical and non-hierarchical cluster analyses
with clustergrams. Computational Statistics (to appear).


{title:Author}

	Matthias Schonlau, University of Waterloo
	schonlau at uwaterloo dot ca
	{browse "http://www.schonlau.net":www.schonlau.net}


{title:Also see}

{p 0 19}Manual: {hi:[R] cluster}, {hi:[R] cluster dendrogram}, {hi:[R] cluster kmeans}, {hi:[R] graph}{p_end}
{p 0 19}On-line: help for {help cluster}, {help graph}{p_end}