help for clsort                                                  (Ph. Van Kerm)

Single variable sorting

Syntax 1

egen newvarname = clsort(varname1) [if exp] [in range] [, inplace ]

Syntax 2

egen newvarname = clsort(varname1 varname2) [if exp] [in range] [, inplace posvar(varname) ]


clsort is an egen (extensions to generate) function that creates a variable sorted in increasing order (syntax 1), or in increasing order of a key variable (syntax 2). Unlike sort and gsort, which reorder every variable in the dataset, clsort produces a single sorted variable and leaves the order of the rest of the data unchanged.

In the simplest case, i.e. syntax 1 without if and/or in clauses,

egen newvarname = clsort(varname1)

the created variable newvarname has exactly the same values as the argument variable varname1 sorted in increasing order.
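
As a minimal sketch of the simplest case, the lines below build a small dataset and apply syntax 1. They assume that clsort is installed; the names x and x_sorted and the data values are made up for this illustration.

```stata
* Sketch only: assumes the user-written clsort function is installed.
* x and x_sorted are hypothetical names chosen for this example.
clear
input x
20
40
15
10
55
60
end

* Syntax 1 without if/in: x_sorted receives the values of x in increasing order
egen x_sorted = clsort(x)

* The rows keep their original order; only x_sorted is sorted
list x x_sorted, noobs
```

Going by the description above, x_sorted should read 10, 15, 20, 40, 55, 60 from top to bottom while x itself stays in its original row order.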

If if and/or in clauses are specified in syntax 1,

egen newvarname = clsort(varname1) [if exp] [in range] [, inplace]

the created variable contains only the values of varname1 taken on by the observations selected by the if and/or in clauses. By default all these values are ordered and fill the first rows of the data. This default behaviour can be overridden by the option inplace. When inplace is specified, the sorted values only fill the rows of the selected observations. To illustrate the difference let

egen newvarname_default = clsort(varname1) if select

and

egen newvarname_inplace = clsort(varname1) if select, inplace

be issued with the data below. newvarname_default only fills the top rows whereas newvarname_inplace fills the rows of the selected observations only.

varname1   select   newvarname_default   newvarname_inplace
      20        0                   10                    .
      40        1                   15                   10
      15        1                   40                   15
      10        1                   60                   40
      55        0                    .                    .
      60        1                    .                   60
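
The illustration above can be reproduced with the following sketch, which enters the two input columns and issues both commands. It assumes that clsort is installed; the variable names and values are those of the illustration.

```stata
* Sketch only: assumes the user-written clsort function is installed.
clear
input varname1 select
20 0
40 1
15 1
10 1
55 0
60 1
end

* Default: sorted values of the selected observations fill the first rows
egen newvarname_default = clsort(varname1) if select

* inplace: sorted values fill only the rows of the selected observations
egen newvarname_inplace = clsort(varname1) if select, inplace

list, noobs
```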

In syntax 1, the variable created is sorted in increasing order. However, a key can be specified with the optional argument varname2 (syntax 2). If varname2 is specified, then the created variable holds the values of varname1 sorted in increasing order of varname2.
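
A minimal sketch of syntax 2 follows. It assumes that clsort is installed; x, key and x_by_key are hypothetical names, and the key values are kept distinct because the handling of ties is not described here.

```stata
* Sketch only: assumes the user-written clsort function is installed.
* x, key and x_by_key are hypothetical names chosen for this example.
clear
input x key
20 2
10 3
30 1
end

* Syntax 2: values of x arranged in increasing order of key
egen x_by_key = clsort(x key)

list, noobs
```

Going by the description above, x_by_key should read 30, 20, 10: the observation with the smallest key (1) contributes its x value (30) first, and so on.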


inplace requires that only the rows of the selected observations be filled with the new variable. inplace has no effect without if and/or in clauses.

posvar(varname) completely changes the behaviour of clsort in syntax 2 and should be used with care. It requires that the order of the generated variable match the order of the second variable passed as argument (varname2): the smallest value of varname1 is placed in the same row as the smallest value of varname2, and so on.
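
The matching described for posvar can be pictured with the sketch below. It uses only built-in Stata commands and does not call clsort, so it illustrates the idea and says nothing about how clsort implements the option. The names x, key, id, r and matched are hypothetical, and the data contain no ties or missing values.

```stata
* Illustration of rank matching with built-in commands only (not clsort itself).
* x, key, id, r and matched are hypothetical names chosen for this example.
clear
input x key
20 2
10 3
30 1
end

gen long id = _n              // remember the original row order
egen r = rank(key), unique    // r = 1 for the smallest key, 2 for the next, ...
sort x                        // after this, x[k] is the k-th smallest x
gen matched = x[r]            // row with the r-th smallest key gets the r-th smallest x
sort id                       // restore the original row order

list x key matched, noobs
```

Here matched reads 20, 30, 10: the row with the smallest key (row 3) receives the smallest x (10), and the row with the largest key (row 2) receives the largest x (30).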


Philippe VAN KERM <philippe.vankerm@ceps.lu> CEPS/INSTEAD B.P. 48 L-4501 Differdange, G.-D. Luxembourg.

Also see

Manual:  [R] egen
On-line: help for egen