.-
help for ^clustpop^ - 1.0 - 17 Apr 2011
.-

Estimate population cluster group assignments
.-

    ^clustpop^ varlist [^if^ exp] [^in^ range] [, reps(100) k(3) seed(string)
             alpha(0.05) type("kmeans") distopt("L2") dots ]

Description
-----------

clustpop is a routine to estimate population group assignments using the
@cluster@ command. The cluster command groups cases based on the values of a
variable, or the mean/median of a group of variables. However, the group
assignments vary depending on the random seed that starts off the process, so
if the -cluster- command is executed many times, it will produce different
group assignments.

In other words, there is a population of group assignments from which the
-cluster- command samples a single possibility. The results from -cluster-
are therefore like taking a sample (N=1) from a population and using that
result as an estimate of the group assignment population.

^clustpop^ runs the -cluster- command many times in order to create a larger
sample. For each case, the most frequently occurring group assignment is taken
as an estimate of the most common group assignment in the population. The case
is assigned to this group only if the lower bound (at a given alpha) of the
population estimate is greater than half. That is, it must be probable that
the most frequently occurring group assignment is the group assignment more
than half the time in the population. If this is not so, the group assignment
is set to missing.

Output:
-------

Three variables are produced as output:

    1. The estimate of the group assignment
    2. The proportion of cases that are assigned as in variable 1 above
    3. The lower bound of the proportion of cases that are assigned as in
       variable 1 above

Options:
--------

^reps(^#^)^ specifies the number of times the cluster command is repeated.
    The default is 30.

^k(^#^)^ specifies the number of groups (see help for the @cluster@ command).
    The default is 3.

^seed(^"string"^)^ specifies the random number seed to use at the start.
    The default is "123456789" (you can just -set seed- instead).

^type(^"string"^)^ indicates the type of averaging (see help for the
    @cluster@ command). The default is "kmeans".

^distopt(^"string"^)^ specifies how the distance between the cases is
    calculated (see help for the @cluster@ command). The default is "L2".

^alpha(^0.00-0.99^)^ gives the alpha for the statistical test.
    The default is 0.05.

^dots^ prints a dot for each replication (shows that the command is working).

Other routines called
---------------------

@matsort@ must be installed for clustpop to function.

Examples
--------

    . sysuse auto, clear
    . clustpop mpg rep78 displacement, k(4)

Author:
-------

    Paul Millar
    www.paulmillar.ca
    paulmi@@nipissingu.ca

See also:
---------

Online:  help for @cluster@
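
Illustration of the assignment rule
-----------------------------------

The rule given under Description (assign a case to its most frequent group
only if the lower bound of the proportion exceeds one half) can be sketched
for a single case as below. This is a minimal sketch, not clustpop's own code:
the counts are made up, and the one-sided normal-approximation lower bound is
an assumption, since this help file does not say how the bound is computed.

```stata
* Hypothetical inputs: 100 replications, modal group chosen 62 times
local reps  = 100
local hits  = 62
local alpha = 0.05

* Proportion of replications that gave the modal group
local p = `hits'/`reps'

* Assumed bound: one-sided normal approximation (clustpop may differ)
local lb = `p' - invnormal(1 - `alpha')*sqrt(`p'*(1 - `p')/`reps')

display "proportion  = " %5.3f `p'
display "lower bound = " %5.3f `lb'

* Keep the modal group only if the lower bound exceeds one half
if `lb' > 0.5 {
    display "assign case to its modal group"
}
else {
    display "set group assignment to missing"
}
```

With fewer replications or a smaller modal share the bound falls below one
half, which is when the assignment would be set to missing.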