Title

Best fit model selection

Syntax

bfit regress depvar indepvars [if] [in] [, corder(#) sort(bic|aic) options]

bfit logit depvar indepvars [if] [in] [, corder(#) sort(bic|aic) options]

bfit poisson depvar indepvars [if] [in] [, corder(#) sort(bic|aic) options]

options                      Description
-------------------------------------------------------------------------
quantiles(numlist)           estimate specified quantiles
vce(vcetype [, vceoptions])  vcetype may be bootstrap, analytic, or none.
                             analytic is the default when quantiles() is
                             not specified. bootstrap is the default when
                             quantiles() is specified. vceoptions vary
                             over vcetype and are discussed below.
-------------------------------------------------------------------------
gpsvars and cvars may contain time-series operators; see fvvarlist.

Description

bfit subcmd sorts a set of fitted candidate regression models by an information criterion, puts the best-fitting model in ereturn, and displays a table showing the ranking of the models fitted. The Bayesian information criterion (BIC) is the default and the Akaike information criterion (AIC) may optionally be specified as the ranking criterion.

bfit regress fits the candidate linear-regression models by ordinary least squares. bfit logit fits the candidate multinomial-logit models by maximum likelihood. bfit poisson fits the candidate Poisson-regression models by maximum likelihood.

For each subcmd, the candidate models are a series of polynomials in indepvars. The smallest candidate model includes only the first variable specified in indepvars. The largest candidate model is a fully interacted polynomial of the order specified in corder(). See Methods and formulas in !! for details on the set of candidate models.
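To make the two ends of the candidate set concrete, the sketch below fits them by hand for two covariates and a second-order polynomial, then compares them with estimates stats. It assumes that "fully interacted" means all monomials of total degree two or less, and it reuses the dataset and variables from the Examples section. The names small and large are hypothetical, and the intermediate candidates that bfit fits are not reproduced here.

```stata
* Hand-built illustration of the smallest and largest candidate models.
* Assumption: the largest corder(2) model contains every monomial of
* total degree <= 2 in pindex and eindex.
use spmdata

* Smallest candidate: only the first variable in indepvars
regress spmeasure pindex
estimates store small

* Largest candidate: fully interacted second-order polynomial
regress spmeasure pindex eindex ///
    c.pindex#c.pindex c.pindex#c.eindex c.eindex#c.eindex
estimates store large

* Side-by-side AIC and BIC for the two stored models
estimates stats small large
```

The model with the smaller criterion value in the estimates stats table is the one an information-criterion ranking would prefer between these two.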

Cattaneo, Drukker, and Holland (2012) provide an introduction to this command.

Options

corder(#) specifies the maximum order of the covariate polynomial. The default is corder(2), which specifies a fully interacted second-order polynomial.

sort() specifies the information criterion by which the candidate models are to be sorted. sort(bic), the default, sorts the fitted candidate models by the Bayesian information criterion. sort(aic) sorts the fitted candidate models by the Akaike information criterion.
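For reference, the two criteria are usually defined as below. These are the standard definitions, as reported by Stata's estat ic; that bfit computes them in exactly this way is an assumption, since this help file does not give the formulas. Here ln L is the log likelihood of the fitted model, k is the model degrees of freedom, and N is the number of observations.

```latex
\mathrm{AIC} = -2 \ln L + 2k
\qquad
\mathrm{BIC} = -2 \ln L + k \ln N
```

Smaller values indicate a better trade-off between fit and complexity. Because ln N exceeds 2 once N is 8 or more, BIC penalizes each additional parameter more heavily than AIC in all but very small samples, so sort(bic) tends to select smaller models than sort(aic).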

coptions are passed to the estimation command. The allowed options depend on the estimation command invoked by the subcommand. For example, base() may be specified only with the logit subcommand. See regress, mlogit, and poisson for the allowable command options.

Examples

---------------------------------------------------------------------------
Setup
    . use spmdata

Model selection with logit
    . bfit logit w pindex eindex

Model selection with logit up to a third-order model
    . bfit logit w pindex eindex, corder(3)

Model selection with logit, and AIC selection
    . bfit logit w pindex eindex, sort(aic)

Model selection with regress
    . bfit regress spmeasure pindex eindex

Saved results

bfit saves the following in r():

Macros
    r(subcmd)    regress, logit, or poisson
    r(bmodel)    name of selected model in estimates store
    r(bvlist)    variables in selected model
    r(sortby)    bic or aic

Matrices
    r(S)         results for each model fit

The matrix r(S) has 7 columns with the following model-specific information in each row:

    Column 1 contains the name of the model in estimates store
    Column 2 contains the number of observations in the sample
    Column 3 contains the value of the log-likelihood function for the constant-only model
    Column 4 contains the value of the log-likelihood function
    Column 5 contains the degrees of freedom in the model
    Column 6 contains the AIC
    Column 7 contains the BIC
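A minimal sketch of how the saved results might be used after a call to bfit follows. The local macro names best and bvlist and the matrix name S are hypothetical. The sketch assumes the results are returned as documented above and copies them immediately, because a later r-class command would overwrite r().

```stata
* Run the model selection from the Examples section
use spmdata
bfit logit w pindex eindex

* Copy the r() results right away, before another command replaces them
local best   "`r(bmodel)'"
local bvlist "`r(bvlist)'"
matrix S = r(S)

* Report which model was selected and which variables it contains
display "selected model: `best'"
display "variables:      `bvlist'"

* One row per candidate model; the columns are described above
matrix list S

* The selected model is left in e(), so it can be redisplayed directly
estimates replay
```

Copying r(S) into a named matrix also keeps the full ranking available for later tabulation or export.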

References

Cattaneo, M. D., D. M. Drukker, and A. Holland. 2012. Estimation of multivalued treatment effects under conditional independence. Working paper, University of Michigan, Department of Economics, http://www-personal.umich.edu/~cattaneo/papers/Cattaneo-Drukker-Holland_2012_STATA.pdf.

Authors

Matias D. Cattaneo, University of Michigan, Ann Arbor, MI. cattaneo@umich.edu.

David M. Drukker, StataCorp, College Station, TX. ddrukker@stata.com.

Ashley D. Holland, Grace College, Winona Lake, IN. hollana@grace.edu.