Nearest neighbour interpolation
nnipolate yvar xvar [if exp] [in range] , generate(newvar) [ties(ties_rule)]
by ... : may be used with nnipolate; see help by.
Description
nnipolate creates newvar by averaging non-missing values of yvar and using nearest neighbour interpolation of missing values of yvar, given xvar. That is, provided that xvar is not missing,
1. When yvar is not missing, newvar is the mean of yvar over observations with the same value of xvar. If a value of xvar is unique, then the mean is just the value of yvar at that point.
2. When yvar is missing, newvar is filled in using nearest neighbour interpolation. As interpolation is with respect to xvar, the value used is either the previous known value of yvar or the next known value of yvar, whichever is nearer in terms of xvar. Here 'previous' and 'next' mean at lower and higher values of xvar, respectively.
3. When the previous and next known values are equally distant from the point at which yvar is missing, users have a choice of rules to apply. By default, nnipolate uses the mean of the two values. The ties() option provides alternative rules.
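The three rules above can be sketched outside Stata as follows. This is only an illustration in Python, not the code of nnipolate: the function name nn_interpolate and the use of None for missing values are assumptions made for the example, and only the default tie rule (the mean) is shown. It also assumes that a missing yvar at a value of xvar where some other observation has a known yvar receives the mean at that value, since the nearest known value is then at distance zero.

```python
from collections import defaultdict


def nn_interpolate(x, y):
    """Illustrative nearest neighbour interpolation; None marks a missing value."""
    # Rule 1: average the non-missing y values at each distinct x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        if xi is not None and yi is not None:
            groups[xi].append(yi)
    known = {xi: sum(vals) / len(vals) for xi, vals in groups.items()}

    result = []
    for xi in x:
        if xi is None or not known:
            # Nothing can be filled in without x or without any known y.
            result.append(None)
        elif xi in known:
            # A known value exists at this x (distance zero).
            result.append(known[xi])
        else:
            # Rule 2: find the previous and next known points in terms of x.
            lower = [k for k in known if k < xi]
            higher = [k for k in known if k > xi]
            if not lower:
                result.append(known[min(higher)])   # before the first known value
            elif not higher:
                result.append(known[max(lower)])    # after the last known value
            else:
                prev_x, next_x = max(lower), min(higher)
                if xi - prev_x < next_x - xi:
                    result.append(known[prev_x])
                elif xi - prev_x > next_x - xi:
                    result.append(known[next_x])
                else:
                    # Rule 3 (default): a tie is resolved by the mean of the two.
                    result.append((known[prev_x] + known[next_x]) / 2)
    return result


x = [0, 1, 2, 3, 4, 5, 5, 6]
y = [None, 10, None, None, None, 30, 40, None]
print(nn_interpolate(x, y))
# [10.0, 10.0, 10.0, 22.5, 35.0, 35.0, 35.0, 35.0]
```

In this made-up series the two observations at x = 5 are averaged to 35, the point x = 3 is equally distant from x = 1 and x = 5 and so receives the mean 22.5, and the points at x = 0 and x = 6 take the first and last known values.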
Remarks
This method is presumably most natural or appealing when the underlying pattern of change is step-functional, so that the series being interpolated is piecewise constant.
This interpolation method also extrapolates, as unknown values before the first known value and unknown values after the last known value are replaced by those respective known values.
The examples are based on the help for ipolate in Stata 10 and later. Users of Stata 8 or 9 will need to substitute their own.
'Neighbour' is the standard spelling in (British) English. 'Neighbor' is the standard spelling in American English.
nnipolate does not support interpolation in two or more dimensions.
Options
generate() is not optional; it specifies the name of the new variable to be created.
ties() specifies an alternative to the default rule whereby previous and next values equally distant from a given point are averaged. The user may choose one of next (next value is used), previous (previous value is used), minimum (smaller value is used), or maximum (larger value is used). Any unambiguous abbreviation of these names is allowed.
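As a rough illustration of how the four ties() rules differ from the default, the following Python sketch resolves a single tie between a previous and a next known value. The function names resolve_rule and break_tie are hypothetical, and the prefix matching used here for abbreviations is an assumption about what 'unambiguous' means, not a description of how nnipolate parses the option.

```python
RULES = ("next", "previous", "minimum", "maximum")


def resolve_rule(abbrev):
    """Expand an abbreviation to a full rule name (assumed prefix matching)."""
    matches = [rule for rule in RULES if rule.startswith(abbrev)]
    if not abbrev or len(matches) != 1:
        raise ValueError(f"ties rule {abbrev!r} is unknown or ambiguous")
    return matches[0]


def break_tie(prev_value, next_value, rule=None):
    """Return the value used when previous and next are equally distant."""
    if rule is None:
        # Default: the mean of the two values.
        return (prev_value + next_value) / 2
    rule = resolve_rule(rule)
    if rule == "next":
        return next_value
    if rule == "previous":
        return prev_value
    if rule == "minimum":
        return min(prev_value, next_value)
    return max(prev_value, next_value)  # rule == "maximum"


print(break_tie(10, 35))           # 22.5
print(break_tie(10, 35, "prev"))   # 10
print(break_tie(10, 35, "max"))    # 35
```

Under this assumption, m alone would be rejected because it matches both minimum and maximum, whereas mi and ma would be accepted.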
Examples
---------------------------------------------------------------------------
Setup
    . webuse ipolxmpl1
List the data
    . list, sep(0)
Create y1 containing a nearest neighbour interpolation of y on x for missing values of y
    . nnipolate y x, gen(y1)
Use alternative rules for handling ties:
    . nnipolate y x, ties(next) gen(ynext)
    . nnipolate y x, ties(prev) gen(yprev)
    . nnipolate y x, ties(max) gen(ymax)
    . nnipolate y x, ties(min) gen(ymin)
List the results
    . list, sep(0)
---------------------------------------------------------------------------
Setup
    . webuse ipolxmpl2
Show years for which the circulation data are missing
    . tabulate circ year if circ == ., missing
Create csicirc containing a nearest neighbour interpolation of circ on year for missing values of circ and perform this calculation separately for each magazine
    . by magazine: nnipolate circ year, gen(csicirc)
---------------------------------------------------------------------------
Author
Nicholas J. Cox, Durham University, U.K. n.j.cox@durham.ac.uk
Also see
Manual: [D] ipolate
On-line: help for ipolate, help for cipolate (if installed), help for