-------------------------------------------------------------------------------
help mltcooksd                             Katja Moehring and Alexander Schmidt
-------------------------------------------------------------------------------

Cook's D and DFBETAs after mixed models (beta version)

Syntax

mltcooksd [,] [keepvar(prefix)] [counter] [graph] [slabel] [fixed] [random] [approx]

mltcooksd is part of the mlt (multilevel tools) package.

Description

mltcooksd estimates Cook's D and DFBETAs for the second-level units in two-level mixed models estimated with xtmixed, xtmelogit or xtmepoisson (Stata version 11 or above). Cook's D describes the influence that the exclusion of a single level-two unit has on the estimated model parameters. DFBETAs describe the influence that a single level-two unit has on the estimated coefficient of each independent variable in the model.

By default, mltcooksd reports Cook's D for the whole model (random + fixed part). The options fixed and random add separate estimates of Cook's D for the fixed and the random part of the model. See Snijders and Berkhof (2008: 158) for the formulas of Cook's D.

For models with a random part, Cook's D and DFBETAs cannot be estimated from the matrices stored after the regression. The ado mltcooksd therefore takes the empirical route and calculates Cook's D and DFBETAs by estimating a series of models, excluding one level-two unit at a time. We follow Van der Meer et al. (2006) in this approach.

mltcooksd displays and uses cutoff values for Cook's D and DFBETAs. These cutoff values are based on Belsley et al. (1980: 13). The cutoff value for Cook's D is 4/n and the cutoff value for DFBETAs is 2/sqrt(n), where n is the number of level-two units.
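
The two cutoff formulas are simple enough to check by hand. The following lines are a minimal sketch of that arithmetic only, not the code mltcooksd runs internally; the value of 25 level-two units and the scalar names are assumptions made for illustration.

```stata
* Hypothetical number of level-two units (an assumption, not taken from the example data)
scalar n_units = 25

* Cutoffs as described above: 4/n for Cook's D, 2/sqrt(n) for DFBETAs
scalar cut_cooksd = 4 / scalar(n_units)
scalar cut_dfbeta = 2 / sqrt(scalar(n_units))

display "Cook's D cutoff: " scalar(cut_cooksd)
display "DFBETA cutoff:   " scalar(cut_dfbeta)
```

With 25 units the cutoffs are 0.16 for Cook's D and 0.4 for DFBETAs. Both cutoffs shrink as the number of level-two units grows, so a given absolute value is more likely to be flagged in a data set with many units.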

mltcooksd stores each estimated model. The command mltshowm produces an estimation table for all models that produce a Cook's D value above the cutoff. If you want to display other models estimated by mltcooksd, have a look at the list of stored models (estimates dir). All models stored by mltcooksd begin with the letters WJ, followed by the number of the left-out level-two unit, e.g. WJ1 is the model estimated without unit J = 1.
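
The stored models can be inspected with Stata's standard estimates commands. The following is a minimal sketch, assuming mltcooksd has just been run and that models named WJ1 and WJ2 exist; the unit numbers are illustrative and depend on your data.

```stata
* Assumes mltcooksd has already been run after a mixed model

* List all stored models whose names begin with WJ
estimates dir WJ*

* Compare coefficients and standard errors of two leave-one-out models
* (WJ1 and WJ2 are assumed to exist; substitute the unit numbers you need)
estimates table WJ1 WJ2, b se

* Redisplay the full output of a single stored model
estimates replay WJ1
```

This is useful when a unit falls just below the cutoff and is therefore not included in the table produced by mltshowm.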

Options

keepvar(prefix) specifies that mltcooksd keeps the variables containing the Cook's D and DFBETAs values. You have to specify a prefix, which is used in the variable names.

counter specifies that mltcooksd displays the estimated time until the program finishes. Depending on your model, mltcooksd can run for quite a long time, so it can be useful to see how long it will take. The first estimate is given after the first model has been estimated. mltcooksd then gives a refined estimate after each further estimation.

graph specifies that mltcooksd produces a box plot showing the distribution of DFBETAs for each independent variable in the model.

slabel suppresses the value labels of the level-two units in the graph (if specified) and in the listing of Cook's D and DFBETAs.

fixed lists Cook's D for the fixed part of the model separately.

random lists Cook's D for the random part of the model separately.

approx computes an approximation of Cook's D and DFBETAs (following Snijders and Berkhof 2008, Snijders and Bosker 1999). The approximation can be computed much faster than the complete computation. The option is for use after xtmelogit and xtmepoisson.

Details: We perform only one iteration for each model, starting from the coefficient vector of the full model (one-step estimator). More iterations are done only if the model does not converge. We do not use the algorithms proposed in Snijders and Berkhof 2008 (IGLS, RIGLS, Fisher scoring), but the same algorithm that was used to compute the full model (in most cases the default: Stata's modified Newton-Raphson).
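
A possible way to combine approx with the other options after a logit model is sketched below. The variable names support, sex, age and Country are hypothetical placeholders (support stands for a binary outcome) and are not claimed to be part of the example data set.

```stata
* Hypothetical two-level logit model; replace the variable names with your own
xtmelogit support sex age || Country:

* One-step approximation of Cook's D and DFBETAs, with a running time estimate
mltcooksd, approx counter
```

Because each leave-one-out model starts from the full-model coefficients and normally stops after one iteration, the results are approximations; rerunning without approx gives the complete computation for comparison.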

Examples

Load data set (ISSP 2006)

. net get mlt
. use redistribution.dta

Multilevel regression of "Support for income redistribution"

. xtmixed gr_incdiff sex age incperc rgdppc gini || Country: , mle var

Estimate Cook's D and DFBETAs (fixed and random part separately)

. mltcooksd, fixed random counter

References

David Belsley, Edwin Kuh, Roy Welsch (1980): Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley.

ISSP (2006): International Social Survey Programme - Role of Government IV, GESIS StudyNo: ZA4700, Edition 1.0, doi:10.4232/1.4700.

Tom Snijders and Johannes Berkhof (2008): Diagnostic Checks for Multilevel Models. In Handbook of Multilevel Analysis, edited by J. De Leeuw and E. Meijer. New York: Springer.

Tom A.B. Snijders and Roel J. Bosker (1999): Multilevel Analysis. An Introduction to Basic and Advanced Multilevel Modeling. London: Sage.

Tom Van der Meer, Manfred Te Grotenhuis and Ben Pelzer (2006): Influential Cases in Multilevel Modeling: A Methodological Comment. American Sociological Review 75(1), 173-178.

Authors

Katja Moehring, GK SOCLIFE, University of Cologne, moehring@wiso.uni-koeln.de, www.katjamoehring.de.

Alexander Schmidt, GK SOCLIFE and Chair for Empirical Economic and Social Research, University of Cologne, alex@alexanderwschmidt.de, www.alexanderwschmidt.de.

Also see