help reweight2
-------------------------------------------------------------------------------

Title

reweight2 -- Reweight data to user-defined control totals

Syntax

reweight2 using filename , newweight(newvar) [oldweight(varname)]

    options                Description
    -------------------------------------------------------------------------
    Main
      oldweight(varname)   name of original weighting variable if it exists,
                           otherwise constant is created
      newweight(newvar)    name of new variable generated by reweight2
                           containing new weights

Description

reweight2 calculates new weights so that the weighted data match the control totals specified in filename. filename must be a text (ASCII) file that can be read by insheet; give the file extension explicitly, because .raw is assumed otherwise. The first column of filename must contain the names of variables in the original dataset, and the second column must contain the control totals that each variable should sum to when weighted by newweight. Note that reweight2 does not handle categorical variables directly: each one must first be converted to a set of dummy variables, one per category, for example using tab, gen.
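
As a minimal sketch of the dummy-variable step, assume a categorical variable called region (a hypothetical name, not part of the example data used later in this help file). tabulate with the generate() option creates one 0/1 indicator per category, and each indicator can then be given its own row in filename:

```stata
* region is a hypothetical categorical variable, used for illustration only
tabulate region, generate(region_)

* one 0/1 indicator per category has been created: region_1, region_2, ...
describe region_*
```

The stub region_ is an arbitrary choice; any valid stub that does not clash with existing variable names will do.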

reweight2 uses the algorithm defined in Gomulka (1992) to minimize the difference between oldweight and newweight, subject to the constraints specified in filename being satisfied.
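
The problem described above can be written out in generic notation. The symbols are assumptions of this sketch and are not taken from Gomulka (1992): s_i is oldweight and w_i is newweight for observation i, x_ij is the value of the j-th variable listed in filename, t_j is its control total, and G stands for a distance function whose exact form is not stated in this help file.

```latex
\min_{w_1,\dots,w_n} \; \sum_{i=1}^{n} G(w_i, s_i)
\quad \text{subject to} \quad
\sum_{i=1}^{n} w_i \, x_{ij} = t_j , \qquad j = 1,\dots,K .
```

In the Example section, K = 4 and the constraint for x1 reads \sum_i w_i x_{i1} = 50.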

Options

        +------+
    ----+ Main +-------------------------------------------------------------

oldweight(varname) specifies the existing weight variable in the data. If oldweight is not specified, reweight2 creates a constant and minimises the distance between this and newweight.

newweight(newvar) specifies the name of the new weighting variable you want reweight2 to create.
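
The two options combine as in the sketch below. It assumes reweight2 is installed and that a control-totals file named example.csv (as in the Example section) is in the working directory; nw_const and nw_survey are hypothetical variable names chosen for illustration.

```stata
* oldweight() omitted: reweight2 starts from a constant that it creates itself
reweight2 using example.csv, newweight(nw_const)

* oldweight() supplied: reweight2 starts from the existing survey weights
reweight2 using example.csv, oldweight(weight) newweight(nw_survey)
```

Because newweight() takes a newvar, each call needs a variable name that does not already exist in the data.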

Example

Consider the following example from Creedy (2003). id is the identification number of each unit included in the survey; x1, x2, x3 and x4 are variables included in the survey; and weight is the vector of original survey weights:

. use http://fmwww.bc.edu/repec/bocode/r/reweight.dta, clear

. list

     +---------------------------------+
     | id   x1   x2   x3   x4   weight |
     |---------------------------------|
  1. |  1    1    1    0    0        3 |
  2. |  2    0    1    0    0        3 |
  3. |  3    1    0    2    0        5 |
  4. |  4    0    0    6    1        4 |
  5. |  5    1    0    4    1        2 |
     |---------------------------------|
  6. |  6    1    1    0    0        5 |
  7. |  7    1    0    5    0        5 |
  8. |  8    0    0    6    1        4 |
  9. |  9    0    1    0    0        3 |
 10. | 10    0    0    3    1        3 |
     |---------------------------------|
 11. | 11    1    0    2    0        5 |
 12. | 12    1    1    0    1        4 |
 13. | 13    1    0    3    1        4 |
 14. | 14    1    0    4    0        3 |
 15. | 15    0    0    5    0        5 |
     |---------------------------------|
 16. | 16    0    1    0    1        3 |
 17. | 17    1    0    2    1        4 |
 18. | 18    0    0    6    0        5 |
 19. | 19    1    0    4    1        4 |
 20. | 20    0    1    0    0        3 |
     +---------------------------------+

The vector of survey weights produces the following aggregate totals:

. tabstat x1 x2 x3 x4 [w = weight], s(su)
(analytic weights assumed)

   stats |        x1        x2        x3        x4
---------+----------------------------------------
     sum |        44        24       213        32
--------------------------------------------------

Now, let us assume that external information on these variables is available, and that the true totals are:

   stats |        x1        x2        x3        x4
---------+----------------------------------------
     sum |        50        20       230        35
--------------------------------------------------

In this case, reweight2 can be used to adjust the survey weights so that the new survey totals match the true totals.

To do this, first create a spreadsheet file containing these control totals, with one variable per row, and save it as, say, example.csv:

    x1     50
    x2     20
    x3    230
    x4     35
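
The same file can also be written from within Stata rather than from a spreadsheet. The lines below are a sketch under two assumptions: that a comma is the delimiter a spreadsheet would use when saving as .csv, and that fh is an arbitrary file handle name. The read-back step only shows how insheet interprets the file and is not required by reweight2.

```stata
* write one "name,total" pair per line (comma delimiter is an assumption)
file open fh using example.csv, write text replace
file write fh "x1,50" _n
file write fh "x2,20" _n
file write fh "x3,230" _n
file write fh "x4,35" _n
file close fh

* optional check: read the file back with insheet, then restore the survey data
preserve
insheet using example.csv, clear
list
restore
```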

Then run the following command:

. reweight2 using example.csv, oldweight(weight) newweight(newweight)

. tabstat x1 x2 x3 x4 [w = newweight], s(su)
(analytic weights assumed)

   stats |        x1        x2        x3        x4
---------+----------------------------------------
     sum |        50        20       230        35
--------------------------------------------------

. list weight newweight

     +--------------------+
     | weight   newweight |
     |--------------------|
  1. |      3    3.043628 |
  2. |      3     1.70822 |
  3. |      5    4.451928 |
  4. |      4    4.323331 |
  5. |      2    2.835381 |
     |--------------------|
  6. |      5    5.072713 |
  7. |      5    7.048317 |
  8. |      4    4.323331 |
  9. |      3     1.70822 |
 10. |      3    2.048059 |
     |--------------------|
 11. |      5    4.451928 |
 12. |      4    4.756732 |
 13. |      4    4.865517 |
 14. |      3    3.628476 |
 15. |      5     3.95583 |
     |--------------------|
 16. |      3    2.002268 |
 17. |      4    4.174617 |
 18. |      5    4.610522 |
 19. |      4    5.670763 |
 20. |      3     1.70822 |
     +--------------------+
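
Beyond comparing totals, it can be useful to look at how far each weight has moved. The following is a sketch only: wratio is a hypothetical variable name, and this help file does not state whether reweight2 can return missing or non-positive weights, so the two counts are simply cheap sanity checks.

```stata
* ratio of new to original weight for each observation
generate double wratio = newweight / weight
summarize wratio, detail

* sanity checks on the new weights
count if missing(newweight)
count if newweight <= 0
```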

Also see

Help: [R] tabulate [D] insheet

Thanks for citing reweight2 as follows

Browne, J. (2012), Reweight2: Stata command to reweight data to user-defined control totals.

Disclaimer

THIS SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

IN NO EVENT WILL THE COPYRIGHT HOLDERS OR THEIR EMPLOYERS, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THIS SOFTWARE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USER'S ABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

References

Gomulka, J. (1992), ‘Grossing-up revisited’, in R. Hancock and H. Sutherland (eds), Microsimulation Models for Public Policy Analysis: New Frontiers, STICERD Occasional Paper, London: London School of Economics.

Creedy, J. (2003), Survey Reweighting for Tax Microsimulation Modelling, Treasury Working Paper Series 03/17, New Zealand Treasury.

Author

James Browne, Institute for Fiscal Studies, London, UK. If you observe any problems, please email james_browne@ifs.org.uk