-------------------------------------------------------------------------------
help for disjoint
-------------------------------------------------------------------------------

Generate end variable demarcating disjoint spells

disjoint start end [if exp] [in range] , generate(newvar) [ id(idvar) ]

Description

disjoint calculates an end variable demarcating disjoint spells based on information in start and end variables containing integer dates (and optionally an identifier variable idvar).

Remarks

Consider the following dataset:

     +------------------+
     | id   start   end |
     |------------------|
  1. |  1       3     6 |
  2. |  1      12    17 |
  3. |  1      13    14 |
  4. |  1      13    22 |
  5. |  1      14    20 |
  6. |  2       2     8 |
  7. |  2       3    12 |
  8. |  2      13    15 |
  9. |  2      13    16 |
 10. |  2      18    28 |
     +------------------+

For each identifier id the variables start and end define spells, but some spells overlap (and some are indeed wholly included in others). For some purposes we need to demarcate disjoint spells and we can do this by defining a new end variable, such that each end is always less than the next start. This is what disjoint does.
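To follow along, the example dataset can be typed in directly. This is a minimal sketch using Stata's input command; the variable names id, start and end are taken from the listing above, and clear discards any data currently in memory.

```stata
* Recreate the example dataset shown above (clear drops the data in memory)
clear
input id start end
1  3  6
1 12 17
1 13 14
1 13 22
1 14 20
2  2  8
2  3 12
2 13 15
2 13 16
2 18 28
end
list
```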

. disjoint start end, id(id) gen(end2)
. l

     +-------------------------+
     | id   start   end   end2 |
     |-------------------------|
  1. |  1       3     6      6 |
  2. |  1      12    17     12 |
  3. |  1      13    14      . |
  4. |  1      13    22     22 |
  5. |  1      14    20      . |
  6. |  2       2     8      2 |
  7. |  2       3    12     12 |
  8. |  2      13    15      . |
  9. |  2      13    16     16 |
 10. |  2      18    28     28 |
     +-------------------------+

This is easier to understand if we focus on observations with non-missing generated values.

. l if end2 < .

     +-------------------------+
     | id   start   end   end2 |
     |-------------------------|
  1. |  1       3     6      6 |
  2. |  1      12    17     12 |
  4. |  1      13    22     22 |
  6. |  2       2     8      2 |
  7. |  2       3    12     12 |
  9. |  2      13    16     16 |
 10. |  2      18    28     28 |
     +-------------------------+

That is, the "spells" for id 1 are (3,6), (12,12) and (13,22). Observations for id 1 with (start,end) of (13,14) and (14,20) are ignored for this purpose as they indicate spells included in other spells. The generated variable marks a true end whenever it is equal to the original end variable. Note that the length of each "spell" would usually be defined as 1 + generated variable - start. With this approach no observations are inserted or deleted.
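The rule implied by this example can also be written out by hand. The do-file fragment below is a sketch of one reading of that rule, not the code that disjoint itself runs: a spell is dropped from consideration if it does not extend beyond the spells sorted before it, and each remaining spell ends at its own end or one short of the next remaining start, whichever is smaller. It assumes the example variables id, start and end are in memory with no missing values. The names negend, runmax, tokeep, end2chk and spell_len are made up for illustration.

```stata
* Sketch only: one way to reproduce the example by hand (not disjoint's own code)
* Assumes id, start and end are in memory; all new variable names are hypothetical

* Within id, sort by start, and by descending end among equal starts
generate long negend = -end
bysort id (start negend): generate long runmax = end

* Running maximum of end within id (replace works through observations in order)
by id: replace runmax = max(runmax[_n-1], end) if _n > 1

* Keep a spell only if it extends beyond every spell sorted before it
by id: generate byte tokeep = (_n == 1) | (end > runmax[_n-1])

* Among kept spells, stop one short of the next kept start where necessary
* (min() ignores the missing value that arises for the last kept spell)
bysort id tokeep (start): generate long end2chk = min(end, start[_n+1] - 1) if tokeep

* Spell length as defined above, for kept spells only
generate long spell_len = 1 + end2chk - start if end2chk < .

* Restore the original ordering and tidy up
sort id start end
list id start end end2chk spell_len
drop negend runmax tokeep
```

For the example data this yields the spells (3,6), (12,12) and (13,22) for id 1, in agreement with the description above.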

See also the SSC packages spellutil (Edwin Leuven) and spell (Nicholas J. Cox).

Options

generate() specifies the name of the new variable. It is required.

id() specifies an identifier variable. Spells are determined separately for each identifier.
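As the syntax diagram shows, id() may be omitted. The two calls below illustrate the difference, on the assumption that without id() all observations are treated as a single series of spells. The variable names end_all and end_byid are arbitrary.

```stata
* Without id(): presumably all observations are treated as one set of spells
disjoint start end, generate(end_all)

* With id(): spells are determined separately within each value of id
disjoint start end, id(id) generate(end_byid)
```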

Author

Nicholas J. Cox, University of Durham, U.K. n.j.cox@durham.ac.uk

Also see

On-line: help for spell (if installed), spellutil (spellsplit, spellmerge, spell2panel) (if installed)