Description
The kountry command performs the following tasks:
1. It standardizes country names from various sources, which makes it much easier to merge datasets that use different spellings, abbreviations, or numeric codes for the same country,
2. It converts country names from one coding scheme to another, and
3. It generates a "geographical region" variable.
The three features are described in detail below.
Syntax
kountry country_var , from(database_name | other) [to(database_name) geo(geo_option) marker stuck]
country_var is the variable that contains the country codes or names you wish to standardize. country_var can be character or numeric.
from() is always required. It specifies the database your country_var comes from. Use other if you cannot identify the database. The new variable containing standardized names is called NAMES_STD. See Table 1 below for a list of supported database_names and their abbreviations. See kountrynames for a list of standardized country names.
to() specifies the coding scheme country_var is to be converted to. This option generates a new variable called _VAR_ where VAR is a capitalized database_name keyword. For example, if the user specifies to(marc), the new variable will be called _MARC_. See Table 1 for a list of supported database_names.
geo() generates a variable called GEO that assigns a country to a geographical region. See Table 2 below for a list of supported geo_options.
marker generates a variable called MARKER that equals 1 if a given country name was standardized successfully and 0 otherwise. To see which names failed to standardize, run the command and then type:
. tabulate country_var if MARKER==0
stuck is explained below.
As of 8/19/2011 country names ("character") and codes ("numerical") from the following sources are supported:
Table 1: Supported data sets
------------------------------------------------------------------
                                          database_name
Dataset                               character    numerical
------------------------------------------------------------------
Correlates of War                     cowc         cown
EUGene                                cowc         cown
International Crisis Behavior                      cown
IMF                                                imfn
ISO 3166 alpha-2                      iso2c
ISO 3166 alpha-3                      iso3c
ISO 3166 numeric                                   iso3n
McClelland                            mcc
MARC (Library of Congress)            marc
MARGene                                            cown
Militarized Interstate Disputes                    cown
National Capabilities                 capc
Penn World Table                      penn
Polity IV                                          cown
World Bank                            iso3c
UNCTAD                                unc
UN Stats                                           iso3n
Type other to convert character names from any other databases
------------------------------------------------------------------
from(other) cannot be combined with to(). The reason is that there is no one-to-one mapping from the names accepted by from(other) to the codes produced by to().
to() makes use of the kountry.dta dataset which should reside in your ado/plus/k folder or directory. ssc install should automatically place kountry.dta in the right location. See sysdir if Stata is not able to load kountry.dta.
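If to() fails because kountry.dta cannot be loaded, a check along the following lines may help locate the problem. This is a minimal sketch that assumes kountry was installed from SSC; it uses only the built-in sysdir, findfile, and ssc commands.

```stata
* List the system directories Stata searches (ssc install writes to PLUS)
sysdir

* Search the ado-path for kountry.dta; on success the full path is in r(fn)
capture noisily findfile kountry.dta
if _rc {
    * Not found: reinstalling from SSC is one way to restore the file
    ssc install kountry, replace
}
```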
Use the stuck option when country_var contains long country names and to() cannot be used. stuck first converts country_var to NAMES_STD and then converts NAMES_STD to _ISO3N_. From there, you can convert _ISO3N_ to any coding scheme listed in Table 1.
The syntax with stuck is
kountry country_var, from(other) stuck [marker]
marker will mark the observations that failed to standardize in the first step.
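The two-step route described above might look like the following sketch. The variable names cname and iso3n are hypothetical, and dropping NAMES_STD before the second call is a precaution based on the assumption that it may still exist after the first call.

```stata
* Step 1: long names -> NAMES_STD -> _ISO3N_ (cname is a hypothetical variable)
kountry cname, from(other) stuck marker

* Inspect the names that failed to standardize in the first step
tabulate cname if MARKER == 0

* Step 2: rename _ISO3N_ to an ordinary variable name, then convert it
rename _ISO3N_ iso3n

* Precaution (assumption): avoid a name clash if the next call creates NAMES_STD
capture drop NAMES_STD

* Convert the ISO numeric codes to Correlates of War character codes
kountry iso3n, from(iso3n) to(cowc)
```

Following the naming rule given for to(), the converted codes should end up in a variable called _COWC_.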
As of 8/19/2011, the following regions can be specified:
Table 2: Geographical regions
------------------------------------------------------------------
geo_option    description
------------------------------------------------------------------
cow           Correlates of War "home regions"
marc          MARC (Library of Congress) regions
men           Middle East "narrow"
meb           Middle East "broad" (incl. North Africa)
sov           makes a separate post-Soviet region
un            UN Stats
undet         UN Stats, detailed
------------------------------------------------------------------
See kountryregions for further notes on geographical regions.
Notes and warnings
Make sure numeric codes are stored as numeric variables, otherwise kountry will not convert them properly.
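One way to guard against string-typed codes is sketched below. The variable imfcode is hypothetical; confirm and destring are built-in Stata commands.

```stata
* imfcode is a hypothetical variable holding IMF numeric codes
capture confirm numeric variable imfcode
if _rc {
    * Stored as a string: convert in place. destring leaves the variable
    * unchanged if it contains nonnumeric characters.
    destring imfcode, replace
}

* With a numeric variable in place, the conversion can proceed
kountry imfcode, from(imfn) to(marc)
```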
Whenever possible, I use the most current coding for a given country. For example, if you convert to(marc), Belarus will be recoded to bw even though it was coded bwr before June 1992. The lack of a one-to-one mapping is especially common for post-Soviet states and for other states that split or consolidated.
Here is an incomplete list of such cases. The Federal Republic of Germany and Prussia will both be recoded to Germany. Korea and South Korea will both be recoded to South Korea. The USSR, Soviet Union, and Russian Federation will all be recoded to Russia. Serbia and Serbia/Montenegro will be recoded to Yugoslavia.
On 9 July 2011, the UN and ISO numeric code 736 for Sudan was retired. The new code is 729. For the time being, kountry continues to use the old code.
Examples
. kountry statename, from(other) marker
. kountry imfcode, from(imfn) to(marc)
. kountry wbankcode, from(iso3c) geo(sov)
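A merge of two sources through the standardized names could be sketched as follows. The file names imf_data and wb_data are hypothetical, and the sketch assumes each file holds one observation per country. Where several original names collapse to a single standardized name (see Notes and warnings), a 1:1 merge would fail and the duplicates would need to be resolved first.

```stata
* File and variable names below are hypothetical
use imf_data, clear
kountry imfcode, from(imfn)
tempfile imf_std
save `imf_std'

use wb_data, clear
kountry wbankcode, from(iso3c)

* Both datasets now carry NAMES_STD, which serves as the merge key
merge 1:1 NAMES_STD using `imf_std'

* Check how many countries matched
tabulate _merge
```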
References
The command is described in more detail in Raciborski, R. (2008). "kountry: A Stata utility for merging cross-country data from multiple sources," The Stata Journal, 8(3), 390-400.
Author
Rafal Raciborski Email: rraciborski@gmail.com
Also see
Online: kountryadd, kountrybackup, kountrynames, kountryregions