------------------------------------------------------------------------------- help for gen_tail -------------------------------------------------------------------------------

Generate an indicator variable marking the observations at or after a given observation.

gen_tail varname, gen(newvar)


gen_tail generates an indicator variable (valued 0 or 1), with values of 1 corresponding to the "tail end" of a series of observations. The tail starts at the first nonzero value (including missing) in varname and continues until the end of the dataset or by-group.

Typically, varname is itself an indicator variable, though any numeric variable will suffice.
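
The rule described above can be sketched with built-in commands. This is only an illustration of the documented behavior, not the source of gen_tail; the dataset and the names id, t, x, and tail_sketch are hypothetical.

```stata
* Hypothetical toy data: two by-groups (id), ordered by t, with a flag x
clear
input byte id byte t byte x
1 1 0
1 2 1
1 3 0
2 1 0
2 2 .
2 3 0
end

* (x != 0) is 1 for nonzero values, including missing.
* sum() is a running sum that restarts in each by-group, so the result
* turns 1 at the first nonzero value and stays 1 to the end of the group.
bysort id (t): gen byte tail_sketch = sum(x != 0) > 0

list, sepby(id)
```

In each group the tail begins at t = 2: in group 1 because x is 1 there, and in group 2 because x is missing there.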

gen_tail may be used with by ... :. Typically, you would use it with by and include one or more secondary by-variables (the variables in parentheses, which determine the sort order within each by-group but do not define the groups); see help by and the example below.


. bysort personid (year): gen_tail incm_mis, gen(incm_mis_tail)

In such usage, it is the user's responsibility to ensure that the secondary by-variable(s) (year, in this case), in combination with the primary by-variable(s), establish a unique sort order. Otherwise, observations that tie on these variables may be ordered arbitrarily, and the results will not be meaningful or consistent.
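
One way to check the sort-order requirement before calling gen_tail is Stata's isid command, which exits with an error unless the listed variables jointly identify each observation. The sketch below uses hypothetical toy data with the variable names from the example above, and assumes gen_tail is installed.

```stata
* Hypothetical toy panel using the variable names from the example above
clear
input int personid int year byte incm_mis
1 2001 0
1 2002 1
1 2003 0
2 2001 0
2 2002 0
end

* Exits with an error if personid and year do not uniquely identify observations
isid personid year

* The order within personid is now unambiguous, so the tail is well defined
bysort personid (year): gen_tail incm_mis, gen(incm_mis_tail)
```

If isid reports an error, resolve the duplicates or add a further secondary by-variable before generating the tail.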

A typical usage sequence would be as follows.

. gen byte incm_mis = mi(income)
. bysort personid (year): gen_tail incm_mis, gen(incm_mis_tail)
. logit outcomevar ... income ... if ~incm_mis_tail


gen_tail was created for use with duration data, though it may be useful in other data-construction situations. With duration data, it is used for creating variables that indicate censoring, where censoring begins at the first occurrence (within a spell) of a censoring event. That is, a censoring event causes its own observation to be omitted, along with all subsequent observations within the spell.


David Kantor, Institute for Policy Studies, Johns Hopkins University. Email dkantor@jhu.edu if you observe any problems.

Also See

carryforward, a related program by the same author.