Title
search within your datasets for keywords
Syntax
grep search strings or "search phrase" [, path(directory_name) filter(dta file specification)]
Description
grep emulates the Unix/Linux command of the same name and runs on all operating systems. You can use it to scan any set of .dta files and find the ones with variables whose names or labels contain strings that interest you. It displays the results in SMCL format, and they are clickable, so you can directly describe or use the datasets and variables found. It also returns everything it finds, including datasets and variables, in r(), so you can program on top of it.
It is useful when you have produced many files within a project and are not sure where everything is. It is also useful when a master dataset comprises a large number of files, e.g. SOEP.
If you want to find all datasets in the directory mydir whose names start with "a" and whose variable names or labels contain "household", run grep household, path(mydir) filter(a*). If this finds a_1.dta, ..., a_n.dta, where a_j.dta has matching variables ajv1, ..., ajvm_j, then r(no) = n, r(djno) = m_j, r(dj) = a_j.dta, and r(djvs) = ajvs, the s-th matching variable in a_j.dta. Here j and s stand for the actual indices, as in r(d1v1).
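Because the matches come back in r(), a do-file can loop over them. The lines below are a minimal sketch, not part of grep itself. They assume that grep is installed, that mydir exists, and that r(no) is empty or 0 when nothing matches (the help text does not say which). The local names ndata, nvars, j and s are illustrative only.

```stata
* Sketch only: assumes grep is installed and mydir contains .dta files.
grep household, path(mydir) filter(a*)

* Number of matching datasets; treated as 0 if r(no) is not set (assumption).
local ndata "`r(no)'"
if "`ndata'" == "" local ndata 0

* display, local and forvalues do not overwrite r(), so r() can be read inside the loop.
forvalues j = 1/`ndata' {
    local nvars "`r(d`j'no)'"
    display as text "`r(d`j')': `nvars' matching variable(s)"
    forvalues s = 1/`nvars' {
        display as result "    `r(d`j'v`s')'"
    }
}
```

Copy any r() value you need into a local before calling another r-class command such as summarize, since that command would replace the stored results.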
If you want to find all datasets in the Stata install directory whose variable names or labels contain the phrase "Trunk Space", run grep "Trunk Space", path(`"`c(sysdir_stata)'"') filter(*). If instead you want to find all variables within file.dta in the current directory whose names or labels contain the words "Trunk" or "Space", run grep Trunk Space, path(.) filter(file.dta).
Examples
The command
. grep turn, path(`"`c(sysdir_stata)'"') filter(auto.dta)
will find "dataset: use,describe. variables: use,describe: 2 vars in auto.dta" and return the macros
r(d1v1) : "turn"
r(d1no) : "1"
r(d1) : "auto.dta"
r(no) : "1"
The above can be achieved by:
. cd `"`c(sysdir_stata)'"'
. grep turn, filter(auto.dta)
or
. cd `"`c(sysdir_stata)'"'
. grep turn, filter(a*.dta)
or
. cd `"`c(sysdir_stata)'"'
. grep turn
since, in the absence of a path() or filter() specification, the current directory and * are assumed.
Note
Useful suggestions by Dan Blanchette (dan.blanchette@duke.edu) are gratefully acknowledged.
Author
Nikos Askitas, IZA, Bonn, Germany. Email: nikos@iza.org