-------------------------------------------------------------------------------
help for vallist                                                       [P.Joly]
-------------------------------------------------------------------------------

List distinct values of a variable

vallist varname [if exp] [in range] [, sort freq reverse missing nolabels quoted max(#) words format(%fmt) sep(string) notrim local(macname) ]

Description

vallist puts a list of the distinct values of varname, which may be numeric or string, into returned macro r(list) and displays that list. Values are listed according to the order in which they appear in the data unless option sort or freq is specified. Missing values are ignored unless missing is specified.

vallist may be used interactively, but is most likely to be useful to programmers dealing with categorical or counted data.

Remarks

1. Earlier versions of vallist produced a sorted list. Starting with version 3.0, the list may be sorted as an option.

2. Numeric variables containing non-integer values are displayed using varname's display format, which may be overridden by option format(). The output is therefore similar (if not identical) to that of the list command. Programmers who do not want to allow the listing of fractional values are advised to use the levels command instead, which is considered safer for this very reason.
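As an illustration of remark 2, the following sketch uses the auto data shipped with Stata. The format %4.1f is an arbitrary choice made for this example only.

```stata
sysuse auto, clear
* gear_ratio holds non-integer values; its own display format is used by default
vallist gear_ratio, sort
* override the display format (here %4.1f, chosen purely for illustration)
vallist gear_ratio, sort format(%4.1f)
```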

Options

sort requests that the list be sorted alphanumerically. Variables with value labels are sorted according to their numerical value.

freq requests that the list be sorted in descending order of frequency. (Ties are broken arbitrarily.)

reverse causes vallist to begin selecting distinct values starting from the bottom (last observation) of the data as opposed to the top (first observation). If specified with sort or freq, it produces the list of distinct values in reverse alphanumerical order or in reverse (ascending) order of frequency, respectively.

missing specifies that missing values should also be listed. Missing (i.e. empty) values of string variables are represented as "missing". Note that this text may be truncated by the max() option.

nolabels suppresses the use of labels for numeric variables with value labels.

quoted specifies that values should be placed in `" "'. This may be useful for string values or value labels containing embedded spaces.

max(#) specifies that at most the first # characters of text (string values or value labels) should be used for each value. For example, max(32) may be needed to cut down text to elements acceptable in Stata 7 as matrix row or column names.

words specifies that text should be truncated to the first whole `word', that is, just before the first space after a non-space character. For example, a string value of "foo bar" would be represented by "foo".

format(%fmt) specifies a display format to apply to the listed values. This is likely to be most useful with non-integer numeric values. A string format should be specified for string values or numeric values with value labels.

sep(string) specifies a separator other than a space, which is the default.

notrim suppresses trimming of leading and trailing spaces from string values or value labels.

local(macname) puts the list into local macro macname.

When several of these options are specified, they are applied in the sequence max(), words, format(), sep().
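A small sketch of how that sequence plays out, using a one-observation toy dataset (the variable name s and its contents are made up for this example). Going by the stated order, max(5) would first cut the text to "foo b" and words would then keep only "foo". This is an expectation derived from the description above, not a checked result.

```stata
clear
input str11 s
"foo bar baz"
end
* max(5) is applied first, then words keeps the first whole word
vallist s, max(5) words
```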

Saved results

r(list) contains the list of distinct values.
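The following sketch shows two ways a program might pick up the list, assuming the auto data shipped with Stata. The macro names levels and levels2 are arbitrary.

```stata
sysuse auto, clear
* store the sorted distinct values of rep78 in local macro levels
vallist rep78, sort local(levels)
foreach v of local levels {
    display as text "rep78 == `v'"
    summarize mpg if rep78 == `v'
}
* alternatively, copy the list from the returned macro
vallist rep78, sort
local levels2 `r(list)'
display "`levels2'"
```

Because r(list) is overwritten by the next r-class command (summarize, for instance), it should be copied to a local macro straight after vallist, or option local() should be used.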

Examples

. sysuse auto
. vallist rep78
. vallist rep78, sort
. vallist gear_ratio, reverse
. vallist mpg if foreign, sep(,) sort
. vallist foreign, nolabels local(vals)
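A further sketch, for string values with embedded spaces: option quoted wraps each value in compound double quotes so that a loop over the list treats "AMC Concord" as one element rather than two. The macro name makes and the restriction to the first five observations are choices made for this example only.

```stata
sysuse auto, clear
* make contains values with embedded spaces, e.g. "AMC Concord"
vallist make in 1/5, quoted local(makes)
foreach m of local makes {
    * compound quotes keep each value intact inside the comparison
    count if make == `"`m'"'
}
```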

Author

Patrick Joly, Industry Canada
pat.joly@utoronto.ca

Acknowledgements

The original author was Nicholas J. Cox. With his full permission vallist is now maintained by the current author. Fred Wolfe suggested that missing values be listable and raised the problem of embedded spaces.

Also see

On-line: help for tabulate, listutil, levels (if installed)