Title
qic -- QIC criterion for model selection in GEE analyses
Syntax
qic depvar [indepvars] [if] [in] [, options]
options              Description
-------------------------------------------------------------------------
Model
  i(varname_i)       use varname_i as the panel ID variable
  t(varname_t)       use varname_t as the time variable
  family(family)     distribution of depvar
  link(link)         link function

Model 2
  exposure(varname)  include ln(varname) in model with coefficient
                       constrained to 1
  offset(varname)    include varname in model with coefficient
                       constrained to 1
  noconstant         suppress constant term
  force              estimate even if observations unequally spaced in time

Correlation
  corr(correlation)  within-group correlation structure

SE/Robust
  robust             synonym for vce(robust)
  nmp                use divisor N-P instead of the default N
  rgf                multiply the robust variance estimate by (N-1)/(N-P)
  scale(x2)          set scale parameter to Pearson chi-squared statistic
  scale(dev)         set scale parameter to deviance divided by degrees
                       of freedom
  scale(phi)         do not rescale the variance
  scale(#)           set scale parameter to #

Reporting
  level(#)           set confidence level; default is level(95)
  eform              report exponentiated coefficients

Opt options
  optimize_options   control the optimization process; seldom used

  nodisplay          suppress display of header and coefficients
-------------------------------------------------------------------------
depvar and indepvars may contain time-series operators; see tsvarlist.
iweights, fweights, and pweights are allowed; see weight. Weights must be
constant within panel.
family               Description
-------------------------------------------------------------------------
gaussian             Gaussian (normal); family(normal) is a synonym
igaussian            inverse Gaussian
binomial             Bernoulli/binomial (k=1)
poisson              Poisson
nbinomial            negative binomial (k=1)
gamma                gamma
-------------------------------------------------------------------------
link                 Description
-------------------------------------------------------------------------
identity             identity; y=y
log                  log; ln(y)
logit                logit; ln{y/(1-y)}, natural log of the odds
probit               probit; inverse Gaussian cumulative
cloglog              cloglog; ln{-ln(1-y)}
power[#]             power; y^k with k=#; #=1 if not specified
opower[#]            odds power; [{y/(1-y)}^k - 1]/k with k=#; #=1 if not
                       specified
nbinomial            negative binomial
reciprocal           reciprocal; 1/y
-------------------------------------------------------------------------
correlation          Description
-------------------------------------------------------------------------
exchangeable         exchangeable
independent          independent
unstructured         unstructured
fixed matname        user-specified
ar #                 autoregressive of order #
stationary #         stationary of order #
nonstationary #      nonstationary of order #
-------------------------------------------------------------------------
Description
qic calculates the QIC and QIC_u criteria for model selection in GEE; these extend the widely used AIC criterion of ordinary regression (Pan 2001). It supports all six distributions (Gaussian, inverse Gaussian, Bernoulli/binomial, Poisson, negative binomial, and gamma), all link functions and working correlation structures, and all SE/Robust options available in Stata 9.0 except the vce() option. It also calculates the trace of the matrix O^{-1}V, where O is the variance estimate under the independent correlation structure and V is the variance estimate under the specified working correlation structure in GEE. When the trace is close to the number of parameters p, QIC_u is a good approximation to QIC.
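For reference, the two criteria can be written as below. This follows the definitions in Pan (2001), expressed with the O, V, and p of this help file. The symbols Q (the quasi-likelihood evaluated under the independence working model), R (the working correlation structure), and beta-hat(R) (the coefficients estimated under R) are notation added here for illustration and do not appear elsewhere in this help file.

```latex
\[
\begin{aligned}
\mathrm{QIC}   &= -2\,Q\bigl(\hat\beta(R);\, I\bigr) + 2\,\mathrm{trace}\bigl(O^{-1}V\bigr), \\
\mathrm{QIC}_u &= -2\,Q\bigl(\hat\beta(R);\, I\bigr) + 2\,p .
\end{aligned}
\]
```

The two criteria differ only in the penalty term, so QIC_u approaches QIC as trace(O^{-1}V) approaches p. Among candidate models, smaller values are preferred.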
Options
    +-------+
----+ Model +------------------------------------------------------------
i(varname_i), t(varname_t); see estimation options.
qic does not need to know t() for the corr(independent) and corr(exchangeable) correlation structures. Whether you specify t() makes no difference in these two cases.
family(family) specifies the distribution of depvar; family(gaussian) is the default.
link(link) specifies the link function; the default is the canonical link for the family() specified.
    +---------+
----+ Model 2 +----------------------------------------------------------
exposure(varname) and offset(varname) are different ways of specifying the same thing. exposure() specifies a variable that reflects the amount of exposure over which the depvar events were observed for each observation; ln(varname) with coefficient constrained to be 1 is entered into the regression equation. offset() specifies a variable that is to be entered directly into the log-link function with its coefficient constrained to be 1; thus, exposure is assumed to be e^varname. If you were fitting a Poisson regression model, family(poisson) link(log), for instance, you would account for exposure time by specifying offset() containing the log of exposure time.
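As a minimal sketch of this equivalence, the two qic calls below should describe the same model on the ships data used in Examples3. The variable ln_service is a helper created only for this illustration; it is not part of the dataset.

```stata
* Sketch: exposure() and offset() as two ways of specifying the same model
use http://www.stata-press.com/data/r9/ships, clear
iis ship

* ln_service is a hypothetical helper variable holding the log of exposure
generate ln_service = ln(service)

* exposure() enters ln(service) with its coefficient constrained to 1
qic accident op_75_79 co_65_69 co_70_74 co_75_79, fam(poi) corr(exc) exposure(service)

* offset() enters ln_service directly with its coefficient constrained to 1
qic accident op_75_79 co_65_69 co_70_74 co_75_79, fam(poi) corr(exc) offset(ln_service)
```

Observations for which ln(service) is missing are expected to be excluded from both fits.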
noconstant specifies that the linear predictor has no intercept term, thus forcing it through the origin on the scale defined by the link function.
force specifies that estimation be forced even though t() is not equally spaced. This is relevant only for correlation structures that require knowledge of t() and that require observations be equally spaced.
    +-------------+
----+ Correlation +------------------------------------------------------
corr(correlation); see estimation options.
    +-----------+
----+ SE/Robust +--------------------------------------------------------
robust specifies that the Huber/White/sandwich estimator of variance is to be used in place of the default GLS variance estimator. This produces valid standard errors even if the correlations within group are not as hypothesized by the specified correlation structure. It does, however, require that the model correctly specifies the mean. As such, the resulting standard errors are labeled "semi-robust" instead of "robust". Note that although there is no cluster() option, results are as if there were a cluster() option and you specified clustering on i().
nmp; see estimation options.
rgf specifies that the robust variance estimate is multiplied by (N-1)/(N-P), where N = # of observations, and P = # of coefficients estimated. This option can be used only with family(gaussian) when robust is either specified or implied by the use of pweights. Using this option implies that the robust variance estimate is not invariant to the scale of any weights used.
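To give a sense of the size of this adjustment, the following sketch evaluates the multiplier for made-up values of N and P. The numbers are purely illustrative and do not come from any dataset in this help file.

```stata
* Hypothetical number of observations and of estimated coefficients
scalar N = 100
scalar P = 4

* Factor applied to the robust variance estimate under rgf
display "variance multiplier       = " (N-1)/(N-P)

* Standard errors scale by the square root of that factor
display "standard-error multiplier = " sqrt((N-1)/(N-P))
```

The factor is greater than 1 whenever P exceeds 1 and shrinks toward 1 as N grows relative to P.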
scale(x2|dev|#|phi) overrides the default scale parameter of scale(1); see estimation options.
    +-----------+
----+ Reporting +--------------------------------------------------------
level(#); see estimation options.
eform displays the exponentiated coefficients and corresponding standard errors and confidence intervals as described in maximize. For family(binomial) link(logit) (i.e., logistic regression), exponentiation results in odds ratios; for family(poisson) link(log) (i.e., Poisson regression), exponentiated coefficients are incidence-rate ratios.
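The arithmetic behind an exponentiated report can be sketched by hand. The coefficient and standard error below are hypothetical values chosen for illustration, not qic output, and the delta-method standard error shown is the usual convention for exponentiated coefficients rather than something this help file states explicitly.

```stata
* Hypothetical coefficient and standard error on the logit scale
scalar b  = 0.5
scalar se = 0.2

* 95% confidence limits on the coefficient scale
scalar z   = invnormal(0.975)
scalar lim_lo = b - z*se
scalar lim_hi = b + z*se

* Exponentiate the coefficient and the interval endpoints;
* the standard error uses the delta method, exp(b)*se
display "odds ratio = " exp(b)
display "std. err.  = " exp(b)*se
display "95% CI     = [" exp(lim_lo) ", " exp(lim_hi) "]"
```

Because the interval endpoints are exponentiated directly, the resulting interval is not symmetric around the exponentiated coefficient.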
    +-------------+
----+ Opt options +------------------------------------------------------
optimize_options control the iterative optimization process. These options are seldom used.
iterate(#) specifies the maximum number of iterations. When the number of iterations equals #, the optimization stops and presents the current results, even if the convergence tolerance has not been reached. The default value of iterate() is 100.
tolerance(#) specifies the tolerance for the coefficient vector. When the relative change in the coefficient vector from one iteration to the next is less than or equal to #, the optimization process is stopped. tolerance(1e-6) is the default.
nolog suppresses the display of the iteration log.
trace specifies that the current estimates should be printed at each iteration.
nodisplay suppresses the display of the header and coefficients.
Examples1
use http://www.stata-press.com/data/r9/nlswork2, clear
iis id
qic ln_w grade age if race == 2
qic ln_w grade age, t(year) corr(uns) scale(dev) force nolog nodis trace
qic ln_w grade age, t(year) corr(exc) force
Examples2
use http://www.stata-press.com/data/r9/union, clear
iis idcode
tis year
qic union age grade not_smsa south if black == 1, fam(bin)
qic union age grade not_smsa south, fam(bin) link(probit) corr(uns) force tol(1e-8) iter(20)
qic union age grade not_smsa south, fam(bin) link(cloglog) corr(ar) force scale(x2)
Examples3
use http://www.stata-press.com/data/r9/ships, clear
egen wave = group(yr_con yr_op)
iis ship
tis wave
qic accident op_75_79 co_65_69 co_70_74 co_75_79 if wave <= 6, fam(poi) corr(exc) ex(service)
qic accident op_75_79 co_65_69 co_70_74 co_75_79, fam(poi) corr(sta) ex(service) force tol(1e-10) scale(dev)
qic accident op_75_79 co_65_69 co_70_74 co_75_79, fam(poi) corr(exc) ex(service) force nodis
Examples4
use http://www.stata-press.com/data/r9/airacc, clear
iis airline
tis time
qic i_cnt inprog if airline <= 15, fam(nb 2) corr(exc) exposure(pmiles)
qic i_cnt inprog, fam(nb 2) corr(sta) exposure(pmiles) force tol(1e-8) nodis
qic i_cnt inprog, fam(nb 2) corr(uns) exposure(pmiles) force scale(x2)
qic i_cnt inprog, fam(gam) corr(sta) exposure(pmiles) force scale(dev)
qic i_cnt inprog, fam(ig) corr(uns) exposure(pmiles) force
Also see
Manual: [XT] xtgee
Online: xtgee postestimation; glm, logistic, prais, regress, svy, xt, xtcloglog, xtdata, xtdes, xtgls, xtintreg, xtlogit, xtnbreg, xtpcse, xtpoisson, xtprobit, xtreg, xtregar, xtsum, xttab, xttobit
Reference
Cui J. QIC program and model selection in GEE analyses. Stata Journal 2007; 7:209-220.
Cui J and Qian G. Selection of working correlation structure and best model in GEE analyses of longitudinal data. Communications in Statistics, Simulation and Computation 2007; 36:987-996.
Cui J and Feng L. Correlation structure and model selection for negative binomial distribution in GEE. Communications in Statistics, Simulation and Computation 2008 (in press).
Pan W. Akaike's information criterion in generalized estimating equations. Biometrics 2001; 57:120-125.
Author
James Cui, WHO Collaborating Centre for Obesity Prevention, Deakin University.
Email: jisheng.cui@deakin.edu.au
Other commands I have written:
genhwcci   (if installed)   ssc install genhwcci    (to install this command)
simuped2   (if installed)   ssc install simuped2    (to install this command)
simuped3   (if installed)   ssc install simuped3    (to install this command)
phenotype  (if installed)   ssc install phenotype   (to install this command)
buckley    (if installed)   ssc install buckley     (to install this command)