Directly standardized rates with improved confidence interval
distrate casevar popvar using filename [if exp] [in range] , standstrata(stratavars)
        [by(varlist) popstand(varname) list(varlist) sepby(varlist) format(%fmt)
        formatn(#) mult(#) level(#) dobson saving(filename[,replace])
        prefix(string) postfix(string)]
Description
distrate estimates directly standardized rates and confidence intervals based on the gamma distribution, as proposed by Fay and Feuer (1997). Tiwari, Clegg and Zou (2006) modified the formula for the upper confidence limit and showed by simulation that the modified gamma interval performs better than the original Fay and Feuer interval and the other intervals they compared. The method produces valid confidence intervals even when the number of cases is very small.
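The arithmetic behind the gamma interval can be sketched by hand. The block below is a minimal illustration, not distrate's own code: it computes the directly standardized rate and the original Fay and Feuer (1997) limits at the 95% level. The standard weights in stdw are hypothetical values invented for this sketch, and the Tiwari, Clegg and Zou modification of the upper limit is not reproduced, so the upper bound reported by distrate may differ.

```stata
* Sketch only: Fay-Feuer gamma interval computed by hand.
* stdw holds HYPOTHETICAL standard weights (they sum to 1).
clear
input str5 agegr death pop stdw
"0-44"  164  47346 0.60
"45-54" 143  83173 0.15
"55-64" 202 186108 0.10
"65-74" 208 322065 0.08
"75+"   283 362051 0.07
end

* w_j = standard weight / stratum population
generate double w = stdw / pop
* stratum contributions to the adjusted rate and to its variance
generate double rate_part = w * death
generate double var_part  = w^2 * death

quietly summarize rate_part
scalar adjrate = r(sum)
quietly summarize var_part
scalar adjvar = r(sum)
quietly summarize w
scalar wmax = r(max)

* Fay-Feuer limits: scaled chi-squared quantiles with non-integer df
scalar lb = (adjvar / (2*adjrate)) * invchi2(2*adjrate^2/adjvar, 0.025)
scalar ub = ((adjvar + wmax^2) / (2*(adjrate + wmax))) * ///
    invchi2(2*(adjrate + wmax)^2 / (adjvar + wmax^2), 0.975)

display "adjusted rate per 1000 = " %8.3f 1000*adjrate
display "95% gamma CI per 1000  = " %8.3f 1000*lb " to " %8.3f 1000*ub
```

The lower limit depends only on the rate and its variance. The upper limit adds the largest weight wmax, which is what keeps the interval conservative when some strata have few or no events.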
Data must be in aggregate form, i.e. each record must contain the total number of deaths (or events) and the population for one stratum, as follows:
    Age strata      death        pop
    ----------------------------------------
    0-44              164      47346
    45-54             143      83173
    55-64             202     186108
    65-74             208     322065
    75+               283     362051
    ----------------------------------------
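If the data are held as one record per individual, they can be brought into this aggregate form with collapse. The sketch below uses a toy dataset built on the spot. The variable names died and one are hypothetical, and counting each record as one unit of population is an assumption made only for illustration (real denominators are usually census counts or person-years).

```stata
* Sketch only: individual records -> one record per stratum.
clear
set obs 6
generate agegr = cond(_n <= 4, 1, 2)      // two hypothetical age strata
generate byte died = inlist(_n, 1, 5)     // 1 = event, 0 = no event
generate byte one  = 1                    // each record counts once

* Sum events and population within each stratum
collapse (sum) death = died pop = one, by(agegr)
list
```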
using filename specifies a file containing standard population weights, typically stratified by age and optionally by other variables. This file must be sorted by the variables specified in standstrata().
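A standard population file that meets this requirement could be prepared as in the sketch below. The file name mystandard, the variable name stdpop and the weights themselves are hypothetical. Only the sort on the stratum variable before saving reflects the requirement stated above.

```stata
* Sketch only: build and save a standard population file.
* All names and values here are HYPOTHETICAL.
clear
input agegr stdpop
3 0.10
1 0.60
2 0.30
end

* The using file must be sorted by the standstrata() variable(s)
sort agegr
save mystandard, replace
```

With this file, the weights would be picked up with popstand(stdpop), because the variable is not named like popvar in the study data.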
Options
standstrata(stratavars) specifies the variables defining the strata across which stratum-specific rates are averaged. These variables must be present in both the study population and the standard population file. This is most often a single variable containing age categories.
by(varlist) produces directly standardized rates for each group identified by equal values of the by() variables, which may be integer or string variables.
popstand(varname) specifies the variable in the using file that contains the standard population weights. If not specified, distrate assumes that it has the same name as popvar in the study population.
list(varlist) specifies the variables to be listed.
sepby(varlist) draws a separator line whenever varlist values change.
format(%fmt) specifies the format for variables containing the estimates.
formatn(#) specifies the number of digits for the format of the N (population) variable.
mult(#) specifies the units to be used in reported results. For example, if the mortality rate is 0.00526, specifying mult(1000) reports it as 5.26.
dobson displays Dobson, Kuulasmaa, Eberle and Scherer confidence limits.
level(#) specifies the confidence level, in percent, for the confidence interval of the adjusted rate; see help level.
saving(filename[,replace]) saves the estimates in a file.
prefix(string) and postfix(string) add a prefix or a suffix to the variable names when the estimates are saved.
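Several of these options can be combined in one call. The line below is a usage sketch only: it assumes distrate is installed, that aggregate study data are in memory, and that a sorted standard file exists on disk. The names deaths, pop, agegr, sex, stdpop, std2000 and rates_by_sex are all hypothetical.

```stata
* Sketch only: every variable and file name here is HYPOTHETICAL.
* Rates per 100,000 by sex, 99% limits, Dobson limits also shown,
* results saved with the prefix m_ added to the variable names.
distrate deaths pop using std2000, standstrata(agegr) by(sex) ///
    popstand(stdpop) mult(100000) level(99) dobson            ///
    format(%8.1f) saving(rates_by_sex, replace) prefix(m_)
```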
Example
use "C:\Data\SuffolkCounty.dta", clear
collapse (sum) deaths pop, by(cindpov agegr)
distrate deaths pop using year2000st, stand(agegr) by(cindpov) mult(100000)
Further options
distrate deaths pop using year2000st, stand(agegr) by(cindpov) saving(DirectSuffolk,replace) format(%8.2f) mult(100000) level(90) list(rateadj lb_gam ub_gam se_gam)
You can run this example after downloading the ancillary files into one of the directories listed in c(adopath).
Saved results
distrate saves the following in r():
Scalars
    r(k)        number of groups identified by distinct values of the by() variables
Matrices
    r(Nobs)     1 x k vector of study population
    r(NDeath)   1 x k vector of number of events
    r(crude)    1 x k vector of crude rates
    r(adj)      1 x k vector of adjusted rates
    r(lb_G)     1 x k vector of lower bound of Tiwari adjusted rates
    r(ub_G)     1 x k vector of upper bound of Tiwari adjusted rates
    r(se_gam)   1 x k vector of standard error of adjusted rates
    r(lb_D)     1 x k vector of lower bound of Dobson adjusted rates
    r(ub_D)     1 x k vector of upper bound of Dobson adjusted rates
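These results can be picked up in a do-file right after estimation. The sketch below assumes a distrate command has just been run so that r() is populated. The matrix names adj, lb and ub are arbitrary, and whether the stored rates already include the mult() scaling is not stated here, so check with return list.

```stata
* Sketch only: run immediately after a distrate call.
local k = r(k)              // number of by() groups
matrix adj = r(adj)         // 1 x k adjusted rates
matrix lb  = r(lb_G)        // 1 x k lower bounds
matrix ub  = r(ub_G)        // 1 x k upper bounds

* Print one line per group
forvalues g = 1/`k' {
    display "group `g': " adj[1,`g'] "  [" lb[1,`g'] ", " ub[1,`g'] "]"
}
```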
Authors
Enzo Coviello (enzo.coviello@tin.it)
References
Fay MP, Feuer EJ. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Statistics in Medicine 1997; 16:791-801.
Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Statistical Methods in Medical Research 2006; 15: 547-569.
Public Health Disparities Geocoding Project Monograph. CASE EXAMPLE: Analysis of all cause mortality rates in Suffolk County, Massachusetts, 1989-1991, by CT poverty strata