-------------------------------------------------------------------------------
help for clv                                              Jean-Benoit Hardouin
-------------------------------------------------------------------------------

Clustering of variables around latent components

clv [varlist] [if exp] [in range] [weight] [, nostandardized bar consolidation(#) nodendro savedendro(filename [, replace]) cutnumber(#) showcount textsize(string) deltaT horizontal abbrev(#) title(string) caption(string) kernel(numlist) method(keyword) nobiplot addvar genlv(string) replace std dim(string)]

Description

clv clusters variables around latent components. The variables are clustered stepwise: at each step, the two clusters merged are those that minimize the decrease in the T criterion, computed as the sum of the first eigenvalues of the data matrices of all the clusters. A hierarchical cluster analysis based on this criterion is performed. A consolidation procedure can be run afterwards, which assigns each variable to the latent component it is most correlated with.

Options

nostandardized uses centered variables instead of standardized variables.

bar displays a chart of the decrease in the T criterion at each step.

consolidation performs a consolidation procedure on the partition obtained with the specified number of clusters (by default, no consolidation procedure is performed).

nodendro suppresses the display of the dendrogram.

savedendro saves the dendrogram in the file defined by this option. If this file already exists, it can be replaced with the replace suboption.

cutnumber defines the number of clusters presented in the dendrogram (40 by default).

showcount displays the number of variables in each cluster (useful with the cutnumber option).
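
As an illustration of the dendrogram display options above, the following call is a minimal sketch. The variable names var1-var15 are placeholders borrowed from the Example section, and the value 5 is arbitrary.

```stata
* var1-var15 are placeholder variable names (assumption), as in the Example section
* present only 5 clusters on the dendrogram and show the number of variables in each
clv var1-var15, cutnumber(5) showcount
```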

textsize defines the size of the labels of the variables on the dendrogram (see textsizestyle).

deltaT uses the variation of the T criterion as the height variable for the dendrogram.

horizontal displays a horizontal (instead of vertical) dendrogram.

abbrev defines the length of the variable labels on the dendrogram (15 characters by default).

title defines the title of the dendrogram.

caption defines the caption of the dendrogram axis that shows the names of the variables.

kernel defines one or several kernels of variables (variables which are clustered together in an initial step). The first number indicates how many of the first variables are clustered together, the second number indicates how many of the following variables are clustered together, and so on.
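
The kernel option can be sketched as follows, reading the description above literally. The variable names var1-var15 are placeholders, and the kernel sizes 3 and 4 are arbitrary: var1-var3 would form a first kernel and var4-var7 a second one. How the remaining variables (var8-var15) are handled is not stated here; we assume they enter the hierarchical analysis individually.

```stata
* var1-var15 are placeholder variable names (assumption)
* kernel(3 4): the first 3 variables (var1-var3) are clustered together in an
* initial step, then the following 4 variables (var4-var7) are clustered together
clv var1-var15, kernel(3 4)
```

Because the kernels are defined by position, the order of the variables in the varlist matters when this option is used.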

method indicates the method used to cluster the variables, among:

    classical (the default) for the method described by Vigneau and Qannari;

    polychoric to use the matrix of polychoric correlation coefficients (instead of Pearson correlation coefficients);

    v2 for a modified algorithm which seeks to minimize the maximum second eigenvalue among the clusters of 2 variables or more;

    polychoricv2 which corresponds to the v2 method with the matrix of polychoric correlation coefficients;

    centroid which is defined by Vigneau and Qannari as an adaptation of CLV for cases where the sign of the correlation coefficients between the variables is important.
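
As a hedged illustration of choosing a method, the sketch below clusters ordinal items using polychoric correlations. The variable names item1-item10 are hypothetical, and the call assumes the polychoric module from SSC is already installed.

```stata
* item1-item10 are hypothetical ordinal variables (assumption)
* the polychoric module must be available; it can be installed with: ssc install polychoric
clv item1-item10, method(polychoric)
```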

nobiplot suppresses the biplot of the latent variables displayed with the consolidation option.

genlv saves the latent variables in new variables with the string as prefix (followed by a number). This option must be used in conjunction with the consolidation option.

replace allows replacing the variables created with the genlv option if they already exist.
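
The genlv and replace options above might be combined with consolidation as in this minimal sketch. The variable names var1-var15 and the prefix lv are placeholders; from the description, the saved variables would presumably be named lv1, lv2 and lv3.

```stata
* var1-var15 and the prefix lv are placeholders (assumption)
* consolidate into 3 clusters and save the latent variables with the prefix lv,
* overwriting variables of the same names if they already exist
clv var1-var15, consolidation(3) genlv(lv) replace
```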

std allows standardizing the latent variables for the graphical representation on the biplot.

dim(string) allows choosing the axes represented on the biplot.

If no varlist is indicated, the procedure uses the varlist from the last clv procedure, but does not perform the hierarchical cluster analysis.

Notes

The classification around latent variables (CLV) is defined by its authors (Vigneau and Qannari, 2003) only for continuous variables. Results with binary or ordinal variables must be interpreted with caution.

Only fweights are allowed. The biplots are disabled if weights are used.

In this procedure, all the individuals with at least one missing value are omitted.

With the polychoric and polychoricv2 methods, the nostandardized option is disabled.

This module uses the following modules, downloadable from SSC: ssc describe polychoric, ssc describe biplotvlab and ssc describe genscore

Example

. clv var1-var15 /* performs the HCA procedure */

. clv var1-var15, cons(6) bar nodendro meth(centroid) /* performs the HCA procedure based on the centroid method, followed by a consolidation procedure with 6 clusters */

. clv, cons(3) addvar /* performs only the consolidation procedure with 3 clusters, based on the preceding HCA procedure */

Acknowledgements

The author thanks Ronan Conroy for all his suggestions for improvement.

Reference

Vigneau E. and Qannari E. M. Clustering of variables around latent components. Communications in Statistics - Simulation and Computation. 32(4): 1131-1150, 2003.

Author

Jean-Benoit Hardouin, PhD, assistant professor
EA 4275 SPHERE "Team of Biostatistics, Clinical Research and Subjective Measures in Health Sciences"
University of Nantes - Faculty of Pharmaceutical Sciences
1, rue Gaston Veil - BP 53508
44035 Nantes Cedex 1 - FRANCE
Email: jean-benoit.hardouin@univ-nantes.fr