{smcl}
{* *! version 1.2.0 3 April 2013}{...}
{cmd:help r2c}
{hline}{...}

{title:Title}

{pstd}
Computes several goodness of fit measures for count data models{p_end}

{title:Syntax}

{phang}
{cmd:r2c} [{cmd:,} {opt NODp} {opt NOOffsetadj} {opt DEVonly} {opt ESTDisp} {opt GNBCons}]

{title:Description}

{pstd}
{cmd:r2c} generates r-squared fit statistics based on deviance, likelihood ratio, Pearson deviance, squared correlation, and predicted value sums of squares.
Cameron and Windmeijer (1996) recommend the use of the deviance-based R2 (which produces identical results to the user-written {cmd:devr2} command for {cmd:glm}; {stata findit devr2}).
{cmd:r2c} also produces an adjusted deviance-based r-squared metric based on Heinzl and Mittlb{c o:}ck (2003).
The negative binomial adjustment is based on an overdispersed Poisson model.
The {cmd:r2c} command is specific to count data models but covers most built-in count data commands.
Currently, {cmd:r2c} works only with the built-in commands {cmd:poisson}, {cmd:tpoisson}, {cmd:zip}, {cmd:nbreg}, {cmd:tnbreg}, {cmd:gnbreg}, {cmd:zinb}, and {cmd:glm}.
With {cmd:glm}, {cmd:r2c} works only for the count-based variance functions {opt family(poisson)} and {opt family(nbinomial)} and assumes the default {opt link(log)}.{p_end}

{pstd}{cmd:r2c} adjusts fit statistics automatically for sample weights (see {manhelp weight R}) as well as offset or exposure variables (see {help estimation options}).
{cmd:r2c} also adjusts estimates for the {cmd:svy} and {cmd:mi} prefixes.{p_end}

{pstd}{cmd:Important note for mi estimate models:} When using the {cmd:mi} prefix, all {cmd:mi impute}d variables must be separated by spaces from each other and from all other characters so that the {cmd:r2c mi} syntax parser can properly identify and compute all versions of a multiply imputed variable.
Thus, for {cmd:r2c} to appropriately adjust the computed fit statistics for {cmd:mi}, all imputed variables in the model or other options must be separated from all commas, operators, and parentheses by at least one space.
For example, an imputed predictor in the {opt inflate(varlist)} equation of a zero-inflated model must be specified as {opt inflate( predictor )} and {bf:not} as {opt inflate(predictor)}.{p_end}

{marker options}{...}
{title:Options}

{phang}{opt nodp} suppresses computation and display of the deviance-based R2 measure comparing the fitted negative binomial model with a Poisson model specification.
The DP R2 is a ratio of two deviances, each measured from a saturated Poisson model: the deviance of the full negative binomial model being fitted over the deviance of the constant-only Poisson model.
The DP R2 thus allows for a direct comparison between negative binomial and Poisson models (see Cameron and Windmeijer, 1996, for details).
As Cameron and Windmeijer note, a large DP R2 would be expected in cases where there is a substantial alpha (NB2; {opt dispersion(mean)}) or delta (NB1; {opt dispersion(constant)}) value, as large alpha/delta values represent substantial overdispersion.
Note that the DP R2 baseline Poisson model is adjusted for truncation with {cmd:tnbreg} and for zero-inflation with {cmd:zinb}.{p_end}

{phang}{opt nooffsetadj} overrides the default constant-only model adjustment for the inclusion of an offset or exposure variable.
Overriding the offset adjustment is useful to discern the extent to which offsets are informative adjustments in a count model.
Note that {opt nooffsetadj} suppresses all offsets or exposures, including offsets in the {opt inflate(varlist)} portion of zero-inflated models.{p_end}

{phang}{opt devonly} overrides the default computation and display of the Pearson deviance, correlation, and explained variance R2s.{p_end}

{phang}{opt estdisp} overrides the default constraint holding the overdispersion parameter (i.e., alpha for NB2 or delta for NB1) constant across full and constant-only negative binomial models and re-estimates the lnalpha or lndelta equation from the data.
The {opt estdisp} option affects only the computation of the deviance R2, the DP R2, and McFadden's likelihood ratio R2.
Also note that McFadden's likelihood ratio R2 will match the pseudo-R2 output generated by {cmd:nbreg}, {cmd:tnbreg}, and {cmd:gnbreg} only when using the {opt estdisp} option.{p_end}

{phang}{opt gnbcons} estimates a constant-only {opt lnalpha(varlist)} equation from the data in the constant-only model for a {cmd:gnbreg}-based {cmd:r2c} computation.
The default is to hold the {opt lnalpha(varlist)} equation constant across full and constant-only models.
The {opt gnbcons} option effectively makes the constant-only model comparison to a traditional {cmd:nbreg} instead of a full {cmd:gnbreg} and can be useful to discern how well the {opt lnalpha(varlist)} equation improves fit beyond a {cmd:nbreg}.
To re-estimate the full {opt lnalpha(varlist)} equation for the constant-only model with {cmd:gnbreg}, use the {opt estdisp} option instead.{p_end}

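{pstd}
As an illustrative sketch only (not run here, and reusing the {cmd:auto} data from the introductory examples purely for convenience), the dispersion-related options might be used as follows.
The {cmd:display} line assumes that the pseudo-R2 stored by {cmd:nbreg} in {cmd:e(r2_p)} is still available after {cmd:r2c} adds its own results to {cmd:e()}; a difference near zero would be consistent with the note on {opt estdisp} above.
The {opt lnalpha(headroom)} specification is a hypothetical choice made for illustration.{p_end}

{phang}{bf:* McFadden's R2 with the overdispersion parameter re-estimated in the constant-only model}

{phang}. {stata webuse auto}

{phang}. {stata nbreg price mpg rep78 headroom}

{phang}. {stata r2c, estdisp}

{phang}. {stata display e(McF) - e(r2_p)}

{phang}{bf:* gnbreg with a constant-only lnalpha() equation in the constant-only model}

{phang}. {stata gnbreg price mpg rep78, lnalpha(headroom)}

{phang}. {stata r2c, gnbcons}{p_end}
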
{title:Saved results}

{phang}{cmd:r2c} adds the following results to {cmd:e()}:

{synoptset 16 tabbed}{...}
{p2col 5 15 19 2: Scalars}{p_end}
{synopt:{cmd:e(dev_r2)}}Deviance-based R2{p_end}
{synopt:{cmd:e(dev_r2a)}}Bias-adjusted deviance-based R2{p_end}
{synopt:{cmd:e(McF)}}McFadden's likelihood ratio R2{p_end}
{synopt:{cmd:e(pear_r2)}}Pearson deviance-based R2{p_end}
{synopt:{cmd:e(corr_r2)}}Squared correlation between observed and predicted values{p_end}
{synopt:{cmd:e(exp_r2)}}Ratio of the sum of squared differences for predicted values (i.e., explained SS) to that for observed values (i.e., total SS){p_end}
{synopt:{cmd:e(modeldev)}}Deviance based on full model{p_end}
{synopt:{cmd:e(constdev)}}Deviance based on constant-only model{p_end}
{synopt:{cmd:e(modelpear)}}Pearson deviance based on full model{p_end}
{synopt:{cmd:e(constpear)}}Pearson deviance based on constant-only model{p_end}
{synopt:{cmd:e(expvar)}}Predicted value variance{p_end}
{synopt:{cmd:e(dp_r2)}}Ratio of the deviance for a fitted negative binomial model over the deviance for a constant-only Poisson model{p_end}
{synopt:{cmd:e(modeldp)}}DP deviance based on full model compared to saturated Poisson{p_end}
{synopt:{cmd:e(constdp)}}DP deviance based on constant-only model compared to saturated Poisson{p_end}

{title:Introductory examples}

{pstd}To illustrate how {cmd:r2c} works, click on the colored text below:{p_end}

{phang}. {stata webuse auto}

{phang}{bf:* r2c following traditional Poisson regression.}

{phang}. {stata poisson price mpg rep78 headroom}

{phang}. {stata r2c}

{phang}{bf:* r2c following negative binomial (NB1 or constant dispersion method)}

{phang}. {stata nbreg price mpg rep78 headroom, dispersion(constant)}

{phang}. {stata r2c, devonly}{p_end}

{title:References}

{p 4 8 2}Cameron, A. C. & Windmeijer, F. A. G. (1996). R-squared measures for count data regression models with applications to health-care utilization.
{it:Journal of Business & Economic Statistics, 14(2)}, 209-220.{p_end}

{p 4 8 2}Heinzl, H. & Mittlb{c o:}ck, M. (2003). Pseudo R-squared measures for Poisson regression models with over- or under-dispersion.
{it:Computational Statistics & Data Analysis, 44(1-2)}, 253-271.{p_end}

{title:Author}

{p 4}Joseph N. Luchman{p_end}
{p 4}Senior Research Associate{p_end}
{p 4}Fors Marsh Group LLC{p_end}
{p 4}Arlington, VA{p_end}
{p 4}jluchman@forsmarshgroup.com{p_end}

{title:Also see}

{psee}
{manhelp poisson R}, {manhelp zip R}, {manhelp tpoisson R}, {manhelp nbreg R}, {manhelp gnbreg R}, {manhelp tnbreg R}, {manhelp zinb R}, {manhelp glm R}, {manhelp svy R}, {manhelp mi R}.
{p_end}