-------------------------------------------------------------------------------
help for corr_svy
-------------------------------------------------------------------------------

Correlation tables for survey data

    corr_svy varlist [weight] [if exp] [in range] [, strata(varname)
        psu(varname) fpc(varname) subpop(varname) pw obs sig print(#) star(#)]

pweights are allowed; see help weights.

Warning: Use of if or in restrictions will not produce correct variance estimates for subpopulations in many cases. To compute estimates for subpopulations, use the subpop() option.

Description

corr_svy displays the correlation matrix for varlist. Optional significance levels are calculated, based on survey-based variance estimates for the correlations. It allows any or all of the following: probability sampling weights, stratification, and clustering. The subpop() option will give estimates for a single subpopulation.

For a general discussion of various aspects of survey designs, including multistage designs, see [U] 30 Overview of survey estimation. To describe strata and PSUs of your data and to handle the error message "stratum with only one PSU detected", see help svydes.

Options

strata(), psu(), and fpc() are described in svyset; see help svyset.

subpop(varname) specifies that estimates be computed for the single subpopulation defined by the observations for which varname~=0. Typically, varname=1 defines the subpopulation and varname=0 indicates observations not belonging to the subpopulation. For observations whose subpopulation status is uncertain, varname should be set to missing.
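
As an illustration of subpop(), the following sketch restricts the estimates to one subpopulation. It assumes the survey design has already been declared with svyset as in the Example below, and that female is a 0/1 indicator variable (an assumption made for illustration; any indicator coded as described above would do).

    . corr_svy loglead age, subpop(female) sig

This is the form to prefer over a restriction such as "if female==1", which, as the warning above notes, will not produce correct variance estimates in many cases.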

obs requests that the number of observations for each correlation be displayed. This only makes sense in conjunction with the pw option, but can be specified regardless.

pw specifies that pairwise correlations be calculated and displayed.

sig requests that the significance level of the coefficients be displayed.

star(#) specifies the significance level of coefficients to be starred. star(5) would star all coefficients significant at the 5% level or better.

print(#) specifies the significance level of correlation coefficients to be printed. Coefficients with larger significance levels are left blank. print(10) would list only coefficients significant at the 10% level or better.
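
The display options can be combined. The following sketch, which reuses the variable names from the Example below and assumes the design has already been declared with svyset, requests pairwise correlations with observation counts, prints only coefficients significant at the 10% level or better, and stars those significant at the 5% level or better.

    . corr_svy loglead age female region2-region4, pw obs print(10) star(5)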

Example

    . svyset pweight leadwt
    . svyset strata stratid
    . svyset psu psuid

    . corr_svy loglead age female region2-region4, obs sig

Saved Results

corr_svy saves in r() the following, about the final correlation calculated:

    r(N)      The number of observations
    r(p)      The p-level
    r(rho)    The estimated rho
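
A short sketch of retrieving the saved results follows. It uses variable names from the Example and assumes the design has been declared with svyset. With only two variables in varlist there is a single pair, so the r() results should presumably describe that pair (an assumption, since only the final correlation calculated is saved).

    . corr_svy loglead age, sig
    . return list
    . display "rho = " r(rho) "  p = " r(p)  "  N = " r(N)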

Methods and formulae

Calculations are based on the methods explained by Bill Sribney in a post to the Statalist, and reproduced in this Stata FAQ: http://www.stata.com/support/faqs/stat/survey.html.

Point estimates are calculated by correlate, with aweights.

With simple random sampling, the p-value from a linear regression of Y on X (or X on Y) is exactly the same as a p-value for Pearson's correlation coefficient for a simple random sample under the assumption of normality of the population. With survey variance estimates, however, the p-value for the slope of the regression of Y on X is NOT the same as the p-value for the regression of X on Y, unlike the case for the OLS regression estimator. So, corr_svy obtains the p-values from both regressions and displays the conservative (i.e., larger) of the two.
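
The procedure described above can be reproduced by hand for a single pair of variables. The following is a rough sketch, not the command's actual code. It uses the variable names from the Example, assumes the design has been declared with svyset, and assumes the survey regression command svyreg from the Stata releases that use the svyset syntax shown in the Example (later releases use the svy: prefix with regress instead).

    . * point estimate: weighted correlation
    . correlate loglead age [aweight=leadwt]

    . * survey regressions in both directions
    . svyreg loglead age
    . svyreg age loglead

The two slope p-values will generally differ. The significance level reported for this pair would be the larger of the two.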

Author

    Nick Winter
    Cornell University
    nw53@cornell.edu