-------------------------------------------------------------------------------
help for ipfweight
-------------------------------------------------------------------------------

Iterative proportional fitting (IPF) algorithm to create adjustment weights for survey data

ipfweight varlist [if exp], generate(newvar) values(numlist) maxiter(#) [startwgt(varname) tolerance(#) upthreshold(#) lothreshold(#) misrep]

Description

ipfweight implements iterative proportional fitting (IPF, also known as raking), first proposed by Deming and Stephan (1940). Like Nick Winter's survwgt rake, it adjusts survey sampling weights stepwise so that the weighted sample margins match known population margins (e.g. sex, education, age), but it offers some additional features. The adjustment is repeated until the difference between the weighted margins of the variables listed in varlist and the known population margins specified in values() is smaller than the tolerance specified in tolerance(), or until the maximum number of iterations specified in maxiter() is reached.
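The loop described above can be sketched in a few lines of Python. This is only an illustration of the general raking idea, not the code ipfweight runs: the function names rake and weighted_margins, the toy data, and the choice to express margins in percent are all assumptions made for the example, and every category is assumed to occur at least once in the sample.

```python
def weighted_margins(rows, weights, var, categories):
    """Weighted percentage share of each category of `var`."""
    total = sum(weights)
    share = {c: 0.0 for c in categories}
    for row, w in zip(rows, weights):
        share[row[var]] += 100.0 * w / total
    return share


def rake(rows, targets, max_iter=25, tol=None, start=None):
    """Illustrative raking loop (hypothetical helper, not ipfweight itself).

    rows    -- list of dicts, one per case
    targets -- {variable: {category: population margin in percent}}
    start   -- optional starting weights; defaults to 1 for every case
    """
    weights = list(start) if start is not None else [1.0] * len(rows)
    dev = float("inf")
    for iteration in range(1, max_iter + 1):
        # Adjust the weights to one variable's margins at a time.
        for var, margin in targets.items():
            share = weighted_margins(rows, weights, var, margin)
            weights = [w * margin[row[var]] / share[row[var]]
                       for row, w in zip(rows, weights)]
        # Largest remaining gap between weighted and population margins.
        dev = max(
            abs(weighted_margins(rows, weights, var, margin)[c] - margin[c])
            for var, margin in targets.items()
            for c in margin
        )
        if tol is not None and dev < tol:
            break
    return weights, iteration, dev


# Toy sample (made-up data): sex coded 1/2, educ coded 1/2/3.
rows = [
    {"sex": 1, "educ": 1}, {"sex": 1, "educ": 1}, {"sex": 1, "educ": 2},
    {"sex": 1, "educ": 3}, {"sex": 2, "educ": 1}, {"sex": 2, "educ": 2},
    {"sex": 2, "educ": 2}, {"sex": 2, "educ": 3}, {"sex": 2, "educ": 3},
    {"sex": 2, "educ": 3},
]
targets = {
    "sex": {1: 48.3, 2: 51.7},
    "educ": {1: 43.7, 2: 30.7, 3: 25.6},
}
weights, n_iter, dev = rake(rows, targets, max_iter=200, tol=0.001)
```

Without tol the sketch simply runs max_iter passes, which mirrors the documented behaviour when tolerance() is omitted.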

Options

generate(newvar) creates a new variable containing the final weighting factors. It is required.

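values(numlist) contains the known population margins. It is required. The margins are listed variable by variable in the order of varlist, and within each variable they have to correspond to that variable's values. In the first example below, the first two numbers are the margins for sex and the remaining three are the margins for educ.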
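To make the ordering rule concrete, the following Python sketch splits a flat list of margins into one block per variable. It is a hypothetical helper for illustration only: the name split_margins, the category codes 1/2 and 1/2/3, and the pairing of margins with category codes in ascending order are assumptions, not documented behaviour of ipfweight.

```python
def split_margins(values, levels_by_var):
    """Pair a flat list of margins with the categories of each variable.

    values        -- flat list of margins, as it would be typed in values()
    levels_by_var -- {variable: [category codes]} in the order of varlist
                     (category codes here are assumed, for illustration)
    """
    expected = sum(len(levels) for levels in levels_by_var.values())
    if len(values) != expected:
        raise ValueError(
            f"expected {expected} margins, got {len(values)}"
        )
    margins = {}
    pos = 0
    for var, levels in levels_by_var.items():
        chunk = values[pos:pos + len(levels)]
        margins[var] = dict(zip(levels, chunk))
        pos += len(levels)
    return margins


# Margins taken from the first example of this help file.
margins = split_margins(
    [48.3, 51.7, 43.7, 30.7, 25.6],
    {"sex": [1, 2], "educ": [1, 2, 3]},
)
```

A mismatch between the number of margins and the number of categories is the most likely input error, which is why the sketch checks the count before splitting.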

maxiter(#) defines the maximum number of iterations. It is required, and # has to be larger than 1.

startwgt(varname) uses the values of varname as starting weights. For example, a variable containing design weights that transform a sample of households into a sample of individuals can be used here. If startwgt() is not specified, each case gets a starting weight of 1.

tolerance(#) specifies the maximum deviation between the weighted margins of the variables listed in varlist and the known population margins specified in values() that is tolerated. If tolerance() is not specified, the iterative process is repeated # times as specified in maxiter(#).

upthreshold(#) specifies an upper threshold for the final weighting factors. If a weighting factor exceeds this threshold, it is trimmed to # before the iterative process is continued. An upper threshold of about 5 is suggested (DeBell et al. 2009: 31).

lothreshold(#) specifies a lower threshold for the final weighting factors. If a weighting factor falls below this threshold, it is trimmed to # before the iterative process is continued.
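The trimming step described for upthreshold() and lothreshold() amounts to clamping each weighting factor to an interval. A minimal Python sketch follows, with the function name trim and its arguments chosen purely for illustration. Because trimming changes the weighted margins, it makes sense that the iteration continues afterwards, as the option descriptions state.

```python
def trim(weights, upper=None, lower=None):
    """Clamp each weight to [lower, upper]; a bound of None is ignored."""
    trimmed = []
    for w in weights:
        if upper is not None and w > upper:
            w = upper
        if lower is not None and w < lower:
            w = lower
        trimmed.append(w)
    return trimmed


# Thresholds as in the second example of this help file: up(5) lo(.2)
trimmed = trim([0.1, 1.0, 7.5], upper=5.0, lower=0.2)
```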

misrep assigns a weighting factor of 1 to missing values in varlist before the iteration process is continued. If misrep is not specified, no weighting factors can be computed for cases with at least one missing value in varlist. However, a more promising solution is to multiply impute missing values before using ipfweight.

Examples

. ipfweight sex educ, gen(wgt) val(48.3 51.7 43.7 30.7 25.6) maxit(10)

. ipfweight sex educ region, gen(wgt) val(48.3 51.7 43.7 30.7 25.6 78.0 22.0) maxit(25) st(designwgt) tol(.1) up(5) lo(.2) mis

References

DeBell, Matthew/Jon A. Krosnick/Arthur Lupia/Caroline Roberts. 2009. User's Guide to the Advance Release of the 2008-2009 ANES Panel Study. Palo Alto, CA and Ann Arbor, MI: Stanford University and University of Michigan.

Deming, W. Edwards/Frederick F. Stephan. 1940. On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals Are Known, in: The Annals of Mathematical Statistics 11 (4): 427-444.

Author

Michael Bergmann, University of Mannheim, michael.bergmann@uni-mannheim.de

Also see

Manual: [R] weight

On-line: help for weight; survwgt