{smcl}
{* 18mar2021}{...}
{hline}
{cmd:help myaxis}
{hline}

{title:Title}

{p 8 8 2}
{hi:myaxis} {hline 2} Reorder categorical variable by specified sort criterion


{title:Syntax}

{p 8 17 2}{cmd:myaxis}
{it:newvar}{cmd:=}{it:varname}
{ifin}
{cmd:,}
{opt sort(criterion)}
[
{opt subset(true_or_false_condition)}
{opt miss:ing}
{opt desc:ending}
{opt varlabel(string)}
{opt valuelabelname(string)}
]


{title:Description}

{p 4 4 2}{cmd:myaxis} maps an existing "categorical" variable, meaning
usually a numeric variable with integer codes and value labels, or
equivalently a string variable, to a new variable with integer values
1 up and with value labels, sorted according to a specified criterion.


{title:Remarks}

{p 4 4 2}The command name {cmd:myaxis} is to be parsed "my axis". The
second element "axis" arises from a leading application of the command.
You have a categorical variable that would define an axis of a graph,
or one dimension of a table (the rows, or the columns, say), but the
existing order of categories is not ideal. Some graph and table commands
offer sorting on the fly, but this command may help wherever other
commands do not offer that.

{p 4 4 2}The first element "my" is at best harmless whimsy, but arises
because a command named just {cmd:axis} would be harder to spot among
other uses of the term. If you find it irritating or annoying, clone and
rename the command. Now it's yours, modulo your use of my work.

{p 4 4 2}The problem is split by {cmd:myaxis} into these parts:

{p 8 8 2}1. Calculation of a numeric variable on which to sort
categories. {cmd:myaxis} treats this as an application of {help egen}.
Note: If a variable already exists that defines the sort order and is
constant within categories, then asking for (say) its minimum, mean, or
maximum within each category will suffice.

{p 8 8 2}2. Deciding whether you want ascending order (the default) or
descending order (highest value goes first). Descending order requires
negation of the variable from #1.{p_end}

{p 8 8 2}3. Mapping your categorical variable to integers 1 up. The
{cmd:group()} function of {help egen} does the work here, but
{cmd:myaxis} is careful to split ties according to the original
variable. (For example: suppose nominal categories A, B, C, D, E have
frequencies 7, 7, 42, 3, 1 and you want them sorted by frequency. You
don't want A and B lumped together because they have the same
frequency.)

{p 8 8 2}4. Fixing a variable label. {cmd:myaxis} uses a new variable
label if supplied; otherwise, the original variable label; and, if that
does not exist, the original variable name.

{p 8 8 2}5. Fixing value labels. This is even more important than #4 for
helpful display in a graph or table. {cmd:myaxis} uses the original
value labels if defined and otherwise the original string or numeric
values.


{title:Options}

{p 4 4 2}{cmd:sort()} specifies the criterion for sorting. It is a
required option. The criterion should always include the name of an
{help egen} function. The function may be community-contributed so long
as the code is visible along your {help adopath}. The criterion may also
include the name of an existing variable and that is essential whenever
the sort criterion is not based on {it:varname}.

{p 4 4 2}{cmd:subset()} specifies a subset of the data on which the sort
criterion should be calculated. Concretely, imagine two variables that
define {it:y} and {it:x} axes of a graph or rows and columns of a table.
You might want rows to be sorted by values calculated for a particular
column, or columns to be sorted by values calculated for a particular
row.

{p 4 4 2}{cmd:missing} specifies that missing values of {it:varname} are
to be included. The default is to ignore them.

{p 4 4 2}{cmd:descending} specifies sorting with highest value first.
The default sort order is ascending, with lowest value first.

{p 4 4 2}{cmd:varlabel()} specifies a variable label for the new
variable. Otherwise see #4 in the Remarks.{p_end}

{p 4 4 2}{cmd:valuelabelname()} specifies a new value label name for the
value labels of the new variable. This will be needed if there is
already a set of value labels called {it:newvar}.


{title:Examples}

{p 4 8 2}{cmd:. sysuse auto, clear}{p_end}

{p 4 8 2}{cmd:. myaxis wanted=rep78, sort(count) descending}{p_end}
{p 4 8 2}{cmd:. tab wanted}{p_end}
{p 4 8 2}{cmd:. tab wanted, nola}{p_end}

{p 4 8 2}{cmd:. myaxis wanted2=rep78, sort(mean mpg) descending}{p_end}
{p 4 8 2}{cmd:. format mpg %2.1f}{p_end}
{p 4 8 2}{cmd:. tab wanted2, su(mpg)}{p_end}

{p 4 8 2}{cmd:. myaxis wanted3=rep78, sort(mean mpg) subset(foreign==1) descending}{p_end}
{p 4 8 2}{cmd:. tab wanted3 foreign , su(mpg) nost nofreq}{p_end}

{p 4 8 2}{cmd:. webuse nlsw88, clear}{p_end}
{p 4 8 2}{cmd:. myaxis wanted=industry, sort(median wage) descending}{p_end}
{p 4 8 2}{cmd:. tabstat wage, s(median mean) by(wanted) format(%3.2f)}{p_end}


{title:Author}

{p 4 4 2}Nicholas J. Cox, Durham University, UK{break}
n.j.cox@durham.ac.uk


{title:Also see}

{psee}
Online:
{manhelp egen D},
{manhelp tabulate_oneway R},
{manhelp tabulate_twoway R},
{manhelp graph_dot G-2},
{manhelp graph_bar G-2},
{help labmask} ({it:Stata Journal}; if installed),
{help egenmore} (SSC; if installed),
{help tabplot} ({it:Stata Journal}; if installed),
{help stripplot} (SSC; if installed)
{p_end}
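

{title:Sketch of the steps by hand}

{p 4 4 2}The five parts listed in the Remarks can be sketched with
official commands alone. What follows is a minimal illustration under
stated assumptions, not the code inside {cmd:myaxis}. The toy dataset,
the variable names {cmd:cat}, {cmd:freq}, {cmd:crit} and {cmd:wanted},
and the variable label text are all invented here; the frequencies are
those of the A to E example in part 3. The sketch assumes one observation
per category, which keeps the value-label loop short.

```stata
* Toy data (hypothetical): one observation per category with its frequency
clear
input str1 cat freq
"A"  7
"B"  7
"C" 42
"D"  3
"E"  1
end

* 1. Sort criterion, constant within category, calculated with egen
egen double crit = mean(freq), by(cat)

* 2. Descending order wanted: negate the criterion
replace crit = -crit

* 3. Integers 1 up; naming cat after crit splits the tie between A and B
egen wanted = group(crit cat)

* 4. Variable label (text invented for this sketch)
label variable wanted "Category"

* 5. Value labels from the original string values.
*    After sorting, observation i holds wanted == i, because this toy
*    dataset has exactly one observation per category.
sort wanted
forvalues i = 1/`=_N' {
    label define wanted `i' "`=cat[`i']'", modify
}
label values wanted wanted

list cat freq wanted, nolabel
```

{p 4 4 2}With several observations per category, the labelling loop
would need to run over the distinct values of the new variable instead,
or {help labmask} (if installed) could be used for that step.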