help digituse
-------------------------------------------------------------------------------

Title

digituse -- Tabulate the pattern of digit use in a variable

Syntax

digituse varname [, places(#)]

Description

The digituse command displays the proportionate use of the digits 0-9 at each digit position of a variable. The number of places to the right of the decimal point is set with the places() option. The number of places to the left of the decimal point is set automatically from the maximum value stored in the variable.

Options

    +---------+
----+ General +----------------------------------------------------------

places(#) specifies the number of decimal places to examine. The default is 9, which often reveals rounding errors in the extreme places. Generally you should specify at least one more decimal place than the expected number of places, and then expect the extra place to contain 100% zeros and no other digits.
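
As an illustration, assume a variable (here called sodium, as in the example under Remarks) that is expected to be recorded in whole numbers. One extra place can then be requested so that any stray decimals stand out:

. * sodium is assumed to hold whole numbers, so examine one decimal place
. digituse sodium, places(1)

If the data are as expected, the 10^-1 column should show 100 against digit 0 and '.' against every other digit.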

Remarks

This command outputs a table in which each column of percentages shows the proportionate use of the digits at one digit position of the variable. The percentages are rounded to integers, so an entry of 0 indicates that some instances of that digit were found at that position, but too few to round up to 1%. An entry of '.' indicates that no instances of the digit were found at that position.
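
A single column of the table can be cross-checked by hand. The following is a minimal sketch and not necessarily how digituse itself works. It assumes the variable is called sodium, as in the example output in this section, and that units is an unused variable name chosen here purely for illustration:

. * extract the units digit (position 10^0) of each value
. generate byte units = mod(floor(abs(sodium)), 10)
. * tabulate reports percentages to two decimals, so a rounded 0 can be told apart from a true absence
. tabulate units

Digits in decimal positions are harder to extract this way, because floating-point storage can shift a value slightly before the digit is taken.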

The following annotated example of the output shows how this command can be used:

. digituse sodium, places(3)

------------------------------------
 Digit |       Position (10^)
       |  2    1    0   -1   -2   -3
-------+----------------------------
     0 |  1    1   16  100  100  100
     1 | 99    0   15    .    .    .
     2 |  .    1   12    .    .    .
     3 |  .   44    8    .    .    .
     4 |  0   54    6    .    .    .
     5 |  .    0    5    .    .    .
     6 |  .    0    5    .    .    .
     7 |  .    .    8    .    .    .
     8 |  .    0   10    0    .    .
     9 |  .    .   15    .    .    .
------------------------------------

In the 10^2 column (hundreds), the 0 against digit 4 suggests there are a small number of values in the 400s, while the 99 against digit 1 clearly indicates that most values are in the 100s. The 1 against digit 0 shows a small proportion of values below 100.

The 44 and 54 in the second column (tens) extend this knowledge and show that most values are in the 130s and 140s, suggesting that the value or values in the 400s are outliers and should be investigated.
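
To follow this up, the suspect values can be listed directly. This is a sketch, assuming the variable is sodium as in the output above and that 400 is a suitable cut-off:

. * Stata treats missing values as larger than any number, so exclude them explicitly
. list sodium if sodium >= 400 & !missing(sodium)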

The grouping of values in the third column (units) suggests most values end with an 8, 9, 0, 1 or 2 (i.e. fall between 138 and 142).

The 100s in the fourth, fifth and sixth columns indicate that nearly all values have zero in the tenths, hundredths and thousandths places. However, the 0 against digit 8 in the tenths column shows that at least one value has an 8 in this position. This is clearly an anomaly and worthy of further investigation. The zerouse command could be used to identify these cases.
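
Independently of zerouse, whose syntax is not described here, one way to find such cases is to list the values that are not whole numbers. This is a sketch, assuming the variable is sodium and that, as the table suggests, any non-zero decimal digit is an anomaly:

. * count, then list, the values with a non-zero fractional part
. count if sodium != floor(sodium) & !missing(sodium)
. list sodium if sodium != floor(sodium) & !missing(sodium)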

Examples

. digituse y1
. digituse y1, places(3)

Author

Richard J. Atkins
London School of Hygiene and Tropical Medicine
e-mail: richard.atkins@lshtm.ac.uk