Unique values of a variable or group of variables

^unique^ varlist [^if^ exp] [^in^ range], ^by(^varname^) gen^erate^(^varname^) > d^etail

Description -----------

The command ^unique^ without a ^by^ reports the number of unique values for the varlist. With a ^by^ it does the same, but also creates a new variable (^_Unique^ by default but can be named in the ^gen^ option). This new variable contains the number of unique values of the varlist for each level of the ^by^ variable. The new variable is coded missing except for the first record in each group defined by the levels of the ^by^ variable.

The command is useful for exploring data where the records refer to more than one level, for example longitudinal data where each record refers to a visit by a subject.

Options -------

^by^ provides the name of a categorical variable.

^generate()^ supplies the name of the new variable which contains the number of unique values of the varlist by levels defined by the categorical variable supplied in the ^by^.

^detail^ request summary statistics on the number of records which are present for unique values of the varlist. If you have longitudinal data for instance, this option reports the mean, median, minimum and maximum number of visits per subject.

Example -------

Consider a longitudinal data set in which each record corresponds to a visit by a subject. The subject identity is in the variable id, and the visit is identified by the variable visit, within id. Then

^unique id^

reports the number of subjects

^unique id visit^

reports a number which will be the same as the number of records unless there are duplicate records with the same id and visit number.

^unique visit, by(id) gen(vno)^

creates a new variable at the subject level which contains the number of visits for that subject. Alternatively

^unique id, detail^

reports summary statistics for the number of visits per subject.

Authors -------

Michael Hills mhills@@regress..demon.co.uk

Tony Brady tbrady@@rpms.ac.uk

Also see --------

On-line: help for @count@, @longch@