{smcl}
{hline}
help for {hi:vtokenize}{right: Bill Rising}
{hline}

{hi:Split All Observations of a Variable into Tokens}
{p 8 14}
   {cmd:vtokenize}
   {it:varname} 
   [{cmd:if} {it:exp}]
   [{cmd:in} {it:range}]
   [{cmd:,}
   {cmd:stub(}{it:stub}{cmd:)} {cmdab:p:arse(}{it:delimiters}{cmd:)} {cmd:nospace} {cmdab:nodelim:eters}]
{p_end}

{title:Description}

{p}
Splits {it:varname} into its component tokens, generating as many new variables as needed.
(The original variable is left untouched.)
Useful for working with truly nasty text files.
{p_end}

{title:Options}

{p 0 4}{cmd:stub(}{it:stub}{cmd:)} specifies the prefix for the names of the variables that will be generated.
If omitted, the stub will simply be the name of the variable being split.
The resulting variables will have suffixes _1, _2, ... or _01, _02, ... or _001, _002, ... depending on the number of variables generated.
{p_end}

{p 0 4}{cmd:parse(}{it:delimiters}{cmd:)} gives the list of delimiters that separate tokens.
If omitted, the only delimiter is whitespace (one or more spaces).
There is no need to specify space as a delimiter, though explicitly specifying it will not cause problems.
{p_end}

{p 0 4}{cmd:nospace} is used to {bf:prevent} spaces from being used as delimiters.
{p_end}

{p 0 4}{cmd:nodelimiters} is used to {bf:prevent} delimiters from being stored as tokens. 
Note that just as with {help gettoken}, spaces are never kept as tokens.
{p_end}

{title:Example(s)}

{p 8 12}{inp:. vtokenize foo}{break}
Splits {it:foo} into words by breaking on space(s), storing the first word for each observation in {it:foo_1}, the second word in {it:foo_2}, etc. {it:foo} itself is not altered.
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar)}{break}
Splits {it:foo} into words by breaking on space(s), storing the first word for each observation in {it:bar_1}, the second word in {it:bar_2}, etc.
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":") nospace}{break}
Splits {it:foo} into tokens by breaking only on colons (:), storing the first token for each observation in {it:bar_1}, the second token in {it:bar_2}, etc.
The colons themselves {bf:are} treated as tokens.
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":") nospace nodelimiters}{break}
Splits {it:foo} into tokens by breaking only on colons (:), storing the first token for each observation in {it:bar_1}, the second token in {it:bar_2}, etc.
The colons themselves are {bf:not} treated as tokens.
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":")}{break}
Splits {it:foo} into tokens by breaking on colons (:) and spaces, storing the first token for each observation in {it:bar_1}, the second token in {it:bar_2}, etc.
The colons themselves are treated as tokens, but the spaces are not.
{p_end}
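
{p}
To compare the cases above side by side, here is a sketch for a single observation in which {it:foo} is assumed to hold the string "a:b c".
The results are read off the option descriptions above, not copied from an actual run, and each command is assumed to start from data in which no {it:bar_} variables exist yet.
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar)}{break}
{it:bar_1} = "a:b", {it:bar_2} = "c"
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":")}{break}
{it:bar_1} = "a", {it:bar_2} = ":", {it:bar_3} = "b", {it:bar_4} = "c"
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":") nodelimiters}{break}
{it:bar_1} = "a", {it:bar_2} = "b", {it:bar_3} = "c"
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":") nospace}{break}
{it:bar_1} = "a", {it:bar_2} = ":", {it:bar_3} = "b c"
{p_end}

{p 8 12}{inp:. vtokenize foo, stub(bar) parse(":") nospace nodelimiters}{break}
{it:bar_1} = "a", {it:bar_2} = "b c"
{p_end}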

{title:Notes}

{p}
When checking for errors, {cmd:vtokenize} looks only for an existing variable {it:stub_*1}.
Thus, it will die ungracefully if, say, {it:stub_3} already exists but it needs to generate its own {it:stub_3}.
{p_end}
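
{p}
One way to guard against this is to look for existing variables yourself before running the command.
A minimal sketch, assuming the stub is {it:bar}; the local macro name {it:clash} is purely illustrative:
{p_end}

{p 8 12}{inp:. capture unab clash : bar_*}{break}
{inp:. if _rc == 0 display as error "already defined: `clash'"}{break}
{help unab} sets a nonzero return code in {cmd:_rc} when nothing matches {it:bar_*}, so the message appears only if some {it:bar_} variable already exists and should be renamed or dropped first.
{p_end}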

{title:Also see}

{p}
{help tokenize}, {help gettoken}, {help vgettoken}
{p_end}

{title:Author}
Bill Rising 
email: {browse "mailto:brising@louisville.edu":brising@louisville.edu} 
web: {browse "http://www.louisville.edu/~wrrisi01":http://www.louisville.edu/~wrrisi01}

snailmail:
Dept. of Family and Community Medicine
University of Louisville
MedCenter One, Suite 270
501 E. Broadway
Louisville, KY 40202

{title:Last Updated}: December 9, 2003 @ 21:51:25