help lstrfun                                                     Dan Blanchette
-------------------------------------------------------------------------------

Title

lstrfun -- Modify long local macros that contain strings

Syntax

lstrfun local_macro_name [, options]

where local_macro_name is the name of either an existing local macro to be modified or a new local macro to be created by lstrfun. The strings being modified can be as long as the maximum number of characters in a local macro minus about 50 (roughly the number of characters used to submit lstrfun).

Only one local macro name can be submitted and only one option can be specified.

options                               Description
-------------------------------------------------------------------------
In these descriptions, "string" refers to the contents of the local macro:

lower(macname)                        make string all lowercase.
proper(macname)                       make string proper case.
upper(macname)                        make string all uppercase.
ltrim(macname)                        remove leading blanks from string.
itrim(macname)                        replace multiple internal blanks with a
                                      single blank in string.
rtrim(macname)                        remove trailing blanks from string.
substr(macname, pos, length)          cut string from starting point for the
                                      specified length.
subinstr(macname, "from", "to", n)    replace "from string" with "to string" in
                                      the string for the specified number of
                                      occurrences.
subinword(macname, "from", "to", n)   replace "from word" with "to word" in the
                                      string for the specified number of
                                      occurrences.
strdup(macname, n)                    duplicate the string the specified number
                                      of times.
reverse(macname)                      reverse the string.
_substr(macname, "tosub", pos)        substitute "tosub" into the string at the
                                      specified position.

Submit a new local macro name to be generated when using these options, since
they return numbers/codes:

strlen(macname)                       returns the length of the string.
strpos(macname, "needle")             returns the position of the specified
                                      string.
strmatch(macname, pattern)            returns 1 if the string matches the
                                      specified pattern.
soundex(macname)                      returns the soundex code for the string.
soundex_nara(macname)                 returns the U.S. Census soundex code for
                                      the string.
indexnot("characters", macname)       returns the position of the first
                                      character in the specified set of
                                      characters that is not found in the
                                      string, or 0 if all characters in the set
                                      are found in the string.
regexm(macname, pattern)              returns 1 if regexm finds the pattern in
                                      the macro and 0 otherwise.
regexms(macname, pattern with groups defined, nth_group)
                                      returns the nth group if regexm finds the
                                      pattern in the macro.
regexr(macname, pattern, replacement) returns the string with the text matched
                                      by the pattern replaced by the
                                      replacement, if regexr finds the pattern
                                      in the macro.

Description

lstrfun allows you to modify local macros that contain strings using Mata's string functions, which are not restricted to the maximum string length that limits Stata's normal string functions. The local macro submitted to lstrfun will be modified if it already exists or created if it does not.

If you are using Stata SE or Stata MP, then your setting of maxvar affects the maximum number of characters in a macro. When maxvar is set to 32767, the maximum number of characters in a macro is 1,081,511. When maxvar is set to 5000 (the default value), the maximum is 165,200.

lstrfun can be very helpful when modifying value labels and notes in Stata, since value labels and notes are allowed to have more characters than the normal string functions can handle.

The option strlen() has an equivalent in Stata's extended macro functions. The options subinstr() and subinword() have similar counterparts in Stata's extended macro functions, but those do not allow you to specify the number of replacements to make.
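The difference from the extended macro functions can be sketched as follows. This is only an illustration: the macro names (mvar, len, len2, first, every, two) and the contents of mvar are made up for the example, and the lstrfun calls follow the syntax listed under the options above.

```stata
* Hypothetical macro contents, for illustration only
local mvar "a-b-c-d"

* Length of the string: lstrfun option and the extended macro function
lstrfun len, strlen(`"`macval(mvar)'"')
local len2 : length local mvar

* Extended macro function: first occurrence only, or every occurrence with "all"
local first : subinstr local mvar "-" " "
local every : subinstr local mvar "-" " ", all

* lstrfun: ask for the first two occurrences only
lstrfun two, subinstr(`"`macval(mvar)'"', "-", " ", 2)

display "`len' `len2'"
display "`first' | `every' | `two'"
```

The extended macro function offers a choice between one replacement and all of them, while the fourth argument of the subinstr() option is where a specific count is given.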

Local macro strings that are submitted in any of the options have to be enclosed in at least double quotes. It is a good idea to enclose local macro strings in compound double quotes in case the macro contains quotes.
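A minimal sketch of the quoting advice, assuming a macro whose contents include double quotes (the macro name mvar and its contents are hypothetical):

```stata
* Hypothetical macro whose contents include double quotes
local mvar `"She said "hello" and left"'

* Compound double quotes keep the embedded quotes from ending the argument early
lstrfun mvar, upper(`"`macval(mvar)'"')

* Display with compound double quotes for the same reason
display `"`macval(mvar)'"'
```

With plain double quotes around the macro, the first embedded quote would close the argument, which is one way to end up with an unmatched quote error.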

The total number of characters submitted to an option of lstrfun cannot exceed the maximum length of strings in Stata minus about 50 (roughly the number of characters used to submit lstrfun). The file read command can create local macros with more characters than Stata allows. If you get any of the following error messages when submitting lstrfun, then the submitted macro has too many characters:

    unmatched quote
    r(198);

    too many macros
    r(920);

    command too long
    r(1004);

If you are using Stata SE or Stata MP, then increase your setting of maxvar since it affects the maximum number of characters in a macro.

Examples of modifying local macros

It is a good idea to use macval() when submitting a macro, so that if the submitted macro itself contains a macro reference such as `something', that reference is left as is.

. lstrfun mvar, lower(`"`mvar'"')

. lstrfun mvar, itrim(`"`macval(mvar)'"')

. lstrfun mvar, substr(`"`mvar'"', 332, 3235)

. lstrfun mvar, _substr(`"`macval(mvar)'"', "XX", 20345)

. lstrfun mvar, regexr(`"`macval(mvar)'"', `"Geo(rgi)e.Boy"', "George's son")

Examples of creating new local macros

. lstrfun new_mvar, strpos(`"`mvar'"', "needle in the haystack")

. lstrfun new_mvar, strmatch(`"`mvar'"', "mat*me")

. lstrfun new_mvar, strmatch(`"`macval(mvar)'"', `"`macval(mvar2)'"')

. lstrfun new_mvar, indexnot("abcdef", `"`mvar'"')

. lstrfun new_mvar, regexm(`"`macval(mvar)'"', `".*Georgie.Boy*"')

. lstrfun new_mvar, regexms(`"`macval(mvar)'"', `".*Geo(rgi)e.Boy*"', 1)

Author

Dan Blanchette
The Carolina Population Center
University of North Carolina - Chapel Hill, USA
dan_blanchette@unc.edu

Note

Useful suggestions and feedback by Nick Cox are gratefully acknowledged.

Also see

Online: string functions, Mata string manipulation functions, and extended macro functions