help for romantoarabic

Roman numerals to arabic numbers

romantoarabic romanvar [if exp] [in range] , generate(arabicvar)


romantoarabic creates a numeric variable arabicvar from a string variable romanvar following these rules:

1. Lower case letters are treated as if upper case.

2. Any spaces are ignored.

3. Numerals must match the Stata regular expression "^M*(CM|CD|D?)C*(XC|XL|L?)X*(IX|IV|V?I*)$". Note that this is more generous than many authorities would allow: for example, it accepts IIII and XXXX as well as IV and XL.

4. Single occurrences of CM, CD, XC, XL, IX, IV are treated as 900, 400, 90, 40, 9, 4 respectively.

5. M, D, C, L, X, V, I are treated as 1000, 500, 100, 50, 10, 5, 1 respectively as many times as they occur.

6. The results of 4 and 5 are added.

7. Any other input is treated as an error and results in a missing value.
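
The rules above can be sketched outside Stata as follows. This is an illustrative Python translation of the rules as stated in this help, not the source code of romantoarabic. The function name roman_to_arabic, the use of None to stand for missing, and the treatment of an empty string as missing are all assumptions of the sketch.

```python
import re

# Regular expression quoted in rule 3; the same syntax is valid in Python.
PATTERN = re.compile(r"^M*(CM|CD|D?)C*(XC|XL|L?)X*(IX|IV|V?I*)$")

# Rule 4: subtractive pairs, each counted at most once.
PAIRS = {"CM": 900, "CD": 400, "XC": 90, "XL": 40, "IX": 9, "IV": 4}

# Rule 5: single letters, counted as many times as they occur.
SINGLES = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}


def roman_to_arabic(text):
    """Return the value of a Roman numeral, or None for invalid input."""
    # Rules 1 and 2: ignore spaces and treat lower case as upper case.
    s = text.replace(" ", "").upper()

    # Rules 3 and 7: anything that does not match the pattern is an error.
    # Assumption: an empty string is also returned as None (the help text
    # does not say what romantoarabic does in that case).
    if s == "" or PATTERN.match(s) is None:
        return None

    total = 0
    # Rule 4: add each subtractive pair once and remove it from the string,
    # so that its letters are not counted again under rule 5.
    for pair, value in PAIRS.items():
        if pair in s:
            total += value
            s = s.replace(pair, "", 1)

    # Rules 5 and 6: add the value of every remaining letter.
    for letter in s:
        total += SINGLES[letter]
    return total


print(roman_to_arabic("MCMXCIV"))  # 900 + 90 + 4 + 1000
print(roman_to_arabic("x i v"))    # spaces and lower case are accepted
print(roman_to_arabic("IIII"))     # accepted by the generous pattern
print(roman_to_arabic("IC"))       # rejected by the pattern
```

Under these rules the four calls give 1994, 14, 4 and None respectively.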


generate() specifies the name of the new numeric variable to be created. It is a required option.


. romantoarabic roman, gen(arabic)


Peter A. [Tony] Lachenbruch suggested this problem on Statalist. Sergiy Radyakin's comments on that list provoked more error checking.


Nicholas J. Cox, Durham University, U.K. n.j.cox@durham.ac.uk