Identify observations for possible thinned panel dataset
panelthin [if exp] [in range] , generate(newvar) minimum(#)
Description
panelthin identifies observations that would belong in a thinned panel dataset in which observations in each panel are at least a minimum time apart. The result is a new variable tagging observations in the thinned dataset by 1 and others by 0.
Remarks
panelthin assumes tsset data and automatically works separately on each panel in a panel dataset.
In essence, the first observation in each panel is selected, then the next observation that falls at least the minimum time after the one just selected, and so on to the end of the panel.
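The selection rule described above can be sketched by hand in a few lines. This is only an illustration of the logic, not necessarily how panelthin is implemented. The toy data, the helper variable last, and the minimum spacing of 5 are all assumptions made for the example. The sketch also assumes no missing times and ignores any if or in restriction.

```stata
* Toy data: one panel observed at irregular times (values are made up).
clear
input id time
1 1
1 2
1 3
1 6
1 7
1 11
1 12
end
tsset id time

* last (a hypothetical helper) holds the time of the most recently
* selected observation. The first observation in each panel is selected.
bysort id (time): generate double last = time if _n == 1

* replace works through observations in order, so last[_n-1] already
* reflects the decision made for the previous observation.
by id: replace last = cond(time - last[_n-1] >= 5, time, last[_n-1]) if _n > 1

* An observation is selected when it is its own "last selected" time.
generate byte OK = (time == last)
drop last
list id time OK, sepby(id)
```

With these toy data, the sketch tags times 1, 6, and 11, because each is at least 5 time units after the previously tagged observation.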
If a thinned dataset is acceptable, then (provided that the main dataset is saved elsewhere) keep only the observations tagged 1 in the new variable.
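One possible workflow that leaves the full dataset untouched is sketched below. It assumes the data are already tsset and that panelthin is installed. The filename thinned and the variable name OK are placeholders chosen for the example.

```stata
* Hold a copy of the full dataset so that it can be recovered later.
preserve

* Tag observations at least 5 time units apart within each panel.
panelthin, minimum(5) generate(OK)

* Retain only the tagged observations and write them to their own file.
keep if OK
save thinned, replace

* Return to the full dataset as it was before OK was created.
restore
```

Because preserve is issued before panelthin, the restored dataset does not contain OK. Run panelthin before preserve instead if the tag should remain in the full dataset.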
Options
generate() specifies the name of a new variable to include tags for selected observations. It is a required option.
minimum() specifies the minimum acceptable spacing in the units of the time variable defining the panel. It is a required option.
Examples
    . tsset id time
    . panelthin, min(5) gen(OK)
    . browse id time OK
    . keep if OK
Author
Nicholas J. Cox, Durham University, UK
n.j.cox@durham.ac.uk
Acknowledgments
This problem was suggested by Rajesh Tharyan on Statalist.
http://www.stata.com/statalist/archive/2008-05/msg00772.html
Leny Matthew signalled a bug in an earlier version.
Also see
On line: help for tsset