{smcl}
{* 24 Nov 2008/8 Dec 2008/26 May 2020}{...}
{hline}
help for {hi:panelthin}
{hline}

{title:Identify observations for possible thinned panel dataset}

{p 8 17 2}{cmd:panelthin} 
[{cmd:if} {it:exp}]
[{cmd:in} {it:range}]
, 
{cmdab:g:enerate(}{it:newvar}{cmd:)} 
{cmdab:m:inimum(}{it:#}{cmd:)} 
  

{title:Description}

{p 4 4 2}{cmd:panelthin} identifies observations that would belong in a
thinned panel dataset in which observations in each panel are at least a
specified minimum time apart.  The result is a new variable tagging
observations in the thinned dataset with 1 and all others with 0.  


{title:Remarks} 

{p 4 4 2}{cmd:panelthin} assumes {help tsset} data and automatically
works separately on each panel in a panel dataset. 

{p 4 4 2}In essence, the first observation in each panel is selected,
then the next observation that is at least the minimum time after the
last one selected, and so on. 
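
{p 4 4 2}The selection rule can be sketched by hand as follows. This is only
an illustration of the logic, not necessarily how {cmd:panelthin} is
implemented. It assumes the variable names of the example data below
({cmd:id} and {cmd:year}) and a minimum spacing of 2; {cmd:last} and
{cmd:mytag} are hypothetical names. The sketch relies on {cmd:replace}
working through observations in order, so that {cmd:last[_n-1]} has already
been updated when each observation is reached.

{p 4 4 2}{cmd:* time of the most recently selected observation so far}{p_end}
{p 4 4 2}{cmd:bysort id (year): gen last = year if _n == 1}{p_end}
{p 4 4 2}{cmd:by id: replace last = cond(year - last[_n-1] >= 2, year, last[_n-1]) if _n > 1}{p_end}
{p 4 4 2}{cmd:* an observation is selected if it is its own most recent selection}{p_end}
{p 4 4 2}{cmd:gen byte mytag = year == last}

{p 4 4 2}For a panel observed in 1987, 1988, 1989, 1990, 1992, 1993 and 1994,
this sketch tags 1987, 1989, 1992 and 1994.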

{p 4 4 2}If a thinned dataset is acceptable, then (provided that the
main dataset is {help save}d elsewhere) {help keep} only the
observations tagged 1 in the new variable. 

{p 4 4 2}The tags can be used to identify spells or runs in the data, 
which most obviously may be helpful if {help collapse} is to be used 
in a reduction of the dataset. For more on the principles of identifying
spells, see Cox (2007).  
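
{p 4 4 2}As a sketch of that idea, and assuming that {cmd:panelthin} has
already created a tag variable {cmd:tag} for data with panel identifier
{cmd:id} and time variable {cmd:year} (names taken from the example below;
{cmd:spell}, {cmd:start} and {cmd:nobs} are hypothetical), each tagged
observation can be treated as the start of a new spell, which may then be
summarized without reducing the dataset:

{p 4 4 2}{cmd:* running count of tags numbers the spells within each panel}{p_end}
{p 4 4 2}{cmd:bysort id (year): gen spell = sum(tag)}{p_end}
{p 4 4 2}{cmd:* first time and number of observations in each spell}{p_end}
{p 4 4 2}{cmd:bysort id spell (year): gen start = year[1]}{p_end}
{p 4 4 2}{cmd:by id spell: gen nobs = _N}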


{title:Options} 

{p 4 4 2}{cmd:generate()} specifies the name of a new variable to hold 
tags for selected observations. It is a required option. 

{p 4 4 2}{cmd:minimum()} specifies the minimum acceptable spacing between
selected observations, in the units of the time variable specified to
{help tsset}. It is a required option. 


{title:Examples}

{p 4 4 2}{cmd:clear}{p_end}
{p 4 4 2}{cmd:input id year whatever}{p_end}
{p 4 4 2}{cmd:4   1987 1}{p_end}
{p 4 4 2}{cmd:4   1988 3}{p_end}
{p 4 4 2}{cmd:4   1989 5}{p_end}
{p 4 4 2}{cmd:4   1990 7}{p_end}
{p 4 4 2}{cmd:4   1992 11}{p_end}
{p 4 4 2}{cmd:4   1993 13}{p_end}
{p 4 4 2}{cmd:4   1994 15}{p_end}
{p 4 4 2}{cmd:9   1987 42}{p_end}
{p 4 4 2}{cmd:9   1988 44}{p_end}
{p 4 4 2}{cmd:9   1989 46}{p_end}
{p 4 4 2}{cmd:9   1990 48}{p_end}
{p 4 4 2}{cmd:9   1992 52}{p_end}
{p 4 4 2}{cmd:9   1993 54}{p_end}
{p 4 4 2}{cmd:9   1994 56}{p_end}
{p 4 4 2}{cmd:end}

{p 4 4 2}{cmd:tsset id year}{p_end}
{p 4 4 2}{cmd:panelthin, min(2) gen(tag)}{p_end}
{p 4 4 2}{cmd:list}

{p 4 4 2}{cmd:* alternative 1: brute force}{p_end}
{p 4 4 2}{cmd:keep if tag}{p_end}
{p 4 4 2}{cmd:list, sepby(id)}

{p 4 4 2}{cmd:* alternative 2: collapse first}{p_end}
{p 4 4 2}{cmd:bysort id (year): gen spell = sum(tag)}{p_end}
{p 4 4 2}{cmd:collapse (min) year (mean) whatever, by(id spell)}{p_end}
{p 4 4 2}{cmd:list, sepby(id)}


{title:Author}

{p 4 4 2}Nicholas J. Cox, Durham University, UK{break} 
n.j.cox@durham.ac.uk


{title:Acknowledgments}

{p 4 4 2}This problem was suggested by Rajesh Tharyan on Statalist.
{browse "http://www.stata.com/statalist/archive/2008-05/msg00772.html":http://www.stata.com/statalist/archive/2008-05/msg00772.html}

{p 4 4 2}Leny Matthew signalled a bug in an earlier version. 

{p 4 4 2}Andrea Stringhetti posted a problem at 
{browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1555093-sorting-and-creation-of-variables":https://www.statalist.org/forums/forum/general-stata-discussion/general/1555093-sorting-and-creation-of-variables}
which led to emphasis on the scope for spell identification. 


{title:Reference} 

{p 4 8 2}Cox, N.J. 2007. 
Speaking Stata: Identifying spells. 
{it:Stata Journal} 7: 249{c -}265. 
{browse "http://www.stata-journal.com/article.html?article=dm0029":http://www.stata-journal.com/article.html?article=dm0029}


{title:Also see}

{p 4 13 2}On line: help for {help tsset}