help calibest
-------------------------------------------------------------------------------

Title

calibest -- Estimates proportions and means after survey data have been calibrated to population totals

Syntax

calibest varlist , marginals(varlist) selwt(varname) calibwt(varname) psu(psu) method(method) [options]

options                Description
-------------------------------------------------------------------------
Required
  selwt(varname)       is the selection weight
  calibwt(varname)     is the calibration weight
  marginals(varlist)   are the variables that were used in the calibration
  psu(psu)             is the primary sampling unit (or _n if simple random sampling was used)
  method(method)       specifies the estimation method

Options
  design(design)       specifies the design options used in the sampling design
-------------------------------------------------------------------------

Description

calibest estimates means or proportions of survey data after selection weights have been calibrated to population totals using, for example, one of methods 1 or 2 of Deville and Särndal (1992). It can be regarded as a generalisation of Stata's post-stratification estimation commands.

The selection weight, selwt, the calibrated weight, calibwt, the variables used in the calibration, marginals, the primary sampling unit, psu, and the method to be used (means or proportions) must all be specified. Sampling design options can also be specified.

The output includes estimates, asymptotic standard errors and 95% confidence intervals. The asymptotic standard errors and confidence intervals are calculated using methods based on Särndal, Swensson, and Wretman (1992) and are valid if the sample size is large. They are asymptotically valid regardless of which calibration method is used.

Options

method(method) specifies the estimation method. The choices are mean (estimate means) or prop (estimate proportions).

design(design) specifies the design options used in the sampling design. Standard Stata syntax to define strata, secondary sampling units or finite population corrections can be included.
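
As a sketch of how psu() and design() might be combined, assume hypothetical variables y (outcome), d (selection weight), w (calibration weight), x1 and x2 (calibration variables) and stratum (stratum identifier); none of these names come from a real dataset. For a simple random sample, psu(_n) would be given and design() omitted. For a stratified single-stage sample without clustering, the strata would be passed through design().

. calibest y, method(mean) selwt(d) calibwt(w) marginals(x1 x2) psu(_n)
. calibest y, method(mean) selwt(d) calibwt(w) marginals(x1 x2) psu(_n) design(strata(stratum))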

Also see

The program calibrate can calculate the calibration weights to be used in the estimation. See also svyset, which can be used when calibrating to a single categorical variable.

Example 1

Demonstrate on the multistage dataset.

. use http://www.stata-press.com/data/r9/multistage

The population consists of 8,000,000 high school seniors. If we were estimating the mean weight of high school seniors and the distribution of sex and race using selection weight only, we would use the design options given in the Stata manual.

. svyset county [pweight=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools)
. svy: mean weight
. svy: prop sex race

To estimate the same quantities using the calibration weights we need information about population totals. Suppose it is known that the population is 50% male and 50% female, and contains 7,000,000 white seniors. First create calibration weights. Start by converting the categorical variables sex and race into binary indicator variables.

. tab sex, gen(isex)
. tab race, gen(irace)

Make a row matrix of population totals (male, female, white).

. matrix M=[4000000, 4000000, 7000000]

Now calibrate (creating a calibration weight wt1). We use linear calibration.

. calibrate , marginals(isex1 isex2 irace1) poptot(M) entrywt(sampwgt) exitwt(wt1)
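
As an optional check that is not part of calibest, the calibrated weights should reproduce the population totals held in M. One way to confirm this, assuming calibrate has created wt1 as above, is Stata's total command with wt1 as a probability weight. The point estimates should be 4,000,000, 4,000,000 and 7,000,000. The standard errors that total reports ignore the sampling design and can be disregarded here.

. total isex1 isex2 irace1 [pweight=wt1]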

Estimation using calibest:

. calibest weight, method(mean) selwt(sampwgt) calibwt(wt1) marginals(isex1 isex2 irace1) psu(county) design(strata(state) fpc(ncounties) || school, fpc(nschools))
. calibest sex race, method(prop) selwt(sampwgt) calibwt(wt1) marginals(isex1 isex2 irace1) psu(county) design(strata(state) fpc(ncounties) || school, fpc(nschools))

Example 2 (Post-stratification)

The special case where we are post-stratifying (calibrating to a single categorical variable) can be dealt with using the post-stratification option of svyset, but calibest should give the same result. To illustrate, we post-stratify to sex (4,000,000 male and 4,000,000 female).

. gen sextot=4000000
. svyset county [pweight=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools) poststrata(sex) postweight(sextot)
. svy: mean weight
. svy: prop sex race

The same results can be obtained using calibest.

. matrix M=[4000000, 4000000]
. calibrate , marginals(isex1 isex2) poptot(M) entrywt(sampwgt) exitwt(wt2)
. calibest weight, method(mean) selwt(sampwgt) calibwt(wt2) marginals(isex1 isex2) psu(county) design(strata(state) fpc(ncounties) || school, fpc(nschools))
. calibest sex race, method(prop) selwt(sampwgt) calibwt(wt2) marginals(isex1 isex2) psu(county) design(strata(state) fpc(ncounties) || school, fpc(nschools))

Saved results

calibest with method(mean) saves the following matrices in r(). With method(prop), nothing is saved.

Matrices
  r(meff)   column matrix of misspecification effects (one row for each dependent variable)
  r(rmat)   results matrix with one row for each dependent variable and four columns: estimate, SE, and lower and upper 95% confidence limits
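
A minimal sketch of retrieving the saved results, assuming the data and the weight wt1 from Example 1 are in memory and that nothing else overwrites r() between the two steps. The matrix name R is arbitrary.

. calibest weight, method(mean) selwt(sampwgt) calibwt(wt1) marginals(isex1 isex2 irace1) psu(county) design(strata(state) fpc(ncounties) || school, fpc(nschools))
. matrix R = r(rmat)
. matrix list R

Copying r(rmat) into a named matrix straight away keeps the estimates available after later commands have replaced the contents of r().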

References

Deville, J.-C., and C.-E. Särndal. 1992. Calibration estimators in survey sampling. Journal of the American Statistical Association 87: 376-382.

Särndal, C.-E., and S. Lundström. 2005. Estimation in Surveys with Nonresponse. New York: Wiley.

Särndal, C.-E., B. Swensson, and J. H. Wretman. 1992. Model Assisted Survey Sampling. New York: Springer-Verlag.

Author

John D'Souza
National Centre for Social Research
London, England, UK
John.D'Souza@natcen.ac.uk