help mixlogit (SJ7-4: st0133_1; SJ7-3: st0133)
-------------------------------------------------------------------------------

Title

mixlogit -- Mixed logit model

Syntax

mixlogit depvar [indepvars] [if] [in] [weight], group(varname) rand(varlist) [id(varname) ln(#) corr nrep(#) burn(#) numerical level(#) constraints(numlist) maximize_options]

mixlpred newvar [if] [in] [, nrep(#) burn(#)]

mixlcov [, sd]

mixlbeta varlist [if] [in], saving(filename) [replace nrep(#) burn(#)]

fweights, iweights, and pweights are allowed (see weight), but they are interpreted to apply to decision makers, not to individual observations.

Description

mixlogit fits mixed logit models by using maximum simulated likelihood (Train 2003). See Hole (2007) for a description of the module with examples.

The following postestimation commands are available after mixlogit:

mixlpred calculates predicted probabilities. The predictions are available both in and out of sample; type mixlpred ... if e(sample) ... if predictions are wanted for the estimation sample only.

mixlcov calculates the elements in the coefficient covariance matrix along with their standard errors. This command is relevant only when the coefficients are specified to be correlated; see the corr option below. mixlcov is a wrapper for nlcom (see [R] nlcom).

mixlbeta calculates individual-level parameters corresponding to the variables in the specified varlist using the method proposed by Revelt and Train (2000) (see also Train 2003, ch. 11). The individual-level parameters are stored in a data file specified by the user. As with mixlpred, the predictions are available both in and out of sample; type mixlbeta ... if e(sample) ... if predictions are wanted for the estimation sample only.

Options for mixlogit

group(varname) is required and specifies a numeric identifier variable for the choice occasions.

rand(varlist) is required and specifies the independent variables whose coefficients are random. The random coefficients can be specified to be normally or lognormally distributed (see the ln() option). The variables immediately following the dependent variable in the syntax are specified to have fixed coefficients (see the examples below).

id(varname) specifies a numeric identifier variable for the decision makers. This option should be specified only when each individual performs several choices; i.e., the dataset is a panel.

ln(#) specifies that the last # variables in rand() have lognormally rather than normally distributed coefficients. The default is ln(0).
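The following Python sketch illustrates why a lognormally distributed coefficient is always positive. It assumes the usual parameterization in which the log of the coefficient is normal with mean m and standard deviation s; the function name and the symbols m, s, and u are illustrative and are not taken from mixlogit.

```python
import math
from statistics import NormalDist


def lognormal_coefficient(m, s, u):
    """Map a uniform draw u in (0, 1) to a lognormal coefficient.

    Assumption: log(coefficient) ~ Normal(m, s^2). The names m, s, and u
    are hypothetical and do not come from mixlogit.
    """
    z = NormalDist().inv_cdf(u)   # standard normal draw from the uniform draw
    return math.exp(m + s * z)    # exponentiating makes the result positive


if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9):
        print(u, lognormal_coefficient(-1.0, 0.5, u))
```

Because the exponential is positive for every draw, a variable whose coefficient is expected to be negative has to enter the model with its sign reversed.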

corr specifies that the random coefficients are correlated. The default is that they are independent. When the corr option is specified, the estimated parameters are the means of the (fixed and random) coefficients plus the elements of the lower-triangular matrix L, where the covariance matrix for the random coefficients is given by V = LL'. The estimated parameters are reported in the following order: the means of the fixed coefficients, the means of the random coefficients, and the elements of the L matrix. The mixlcov command can be used postestimation to obtain the elements in the V matrix along with their standard errors.

If the corr option is not specified, the estimated parameters are the means of the fixed coefficients and the means and standard deviations of the random coefficients, reported in that order. The sign of the estimated standard deviations is irrelevant. Although in practice the estimates may be negative, interpret them as being positive.

The sequence of the parameters is important to bear in mind when specifying starting values.
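As a small illustration of the relation V = LL', the Python sketch below builds the covariance matrix from a lower-triangular matrix and reads the standard deviations off its diagonal. The numbers in L are made up for illustration and are not estimates from any model; the function names are hypothetical.

```python
import math


def covariance_from_lower_triangular(L):
    """Return V = L L' for a square lower-triangular matrix L (list of rows)."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def standard_deviations(V):
    """Square roots of the diagonal elements of V."""
    return [math.sqrt(V[i][i]) for i in range(len(V))]


if __name__ == "__main__":
    L = [[1.0, 0.0],
         [0.5, 2.0]]          # illustrative values only
    V = covariance_from_lower_triangular(L)
    print(V)
    print(standard_deviations(V))
```

The diagonal of V is a sum of squares, so the implied variances are nonnegative whatever the signs of the elements of L.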

nrep(#) specifies the number of Halton draws used for the simulation. The default is nrep(50).

burn(#) specifies the number of initial sequence elements to drop when creating the Halton sequences. The default is burn(15). Specifying this option helps reduce the correlation between the sequences in each dimension. Train (2003, 230) recommends that # should be at least as large as the prime number used to generate the sequences. If there are K random coefficients, mixlogit uses the first K primes to generate the Halton draws.
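The Python sketch below shows one way Halton draws with a burn-in could be generated: one sequence per random coefficient, each based on one of the first K primes, with the first burn elements dropped. It is an illustration of the technique only; the exact indexing used inside mixlogit is an assumption, and all function names are hypothetical.

```python
def first_primes(k):
    """Return the first k prime numbers by trial division."""
    primes = []
    n = 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def radical_inverse(i, base):
    """Element i (i >= 1) of the Halton sequence for the given prime base."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        x += (i % base) * f   # next base-`base` digit, mirrored around the point
        i //= base
        f /= base
    return x


def halton_draws(nrep, k, burn=15):
    """nrep rows of k uniform draws, skipping the first `burn` elements.

    Assumption: elements are indexed from 1, so the first kept element
    is number burn + 1.
    """
    primes = first_primes(k)
    return [[radical_inverse(i, p) for p in primes]
            for i in range(burn + 1, burn + nrep + 1)]


if __name__ == "__main__":
    for row in halton_draws(nrep=5, k=2):
        print(row)
```

With burn=15 and base 2, the first kept element is element 16 of the sequence, which equals 1/32 = 0.03125.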

numerical specifies that numerical gradients should be used instead of analytical gradients (the default). This option is useful for replicating the results from earlier versions of mixlogit but should otherwise not be used.

level(#); see estimation options.

constraints(numlist); see estimation options.

robust, cluster(varname); see estimation options. The cluster variable must be numeric.

maximize_options: difficult, technique(algorithm_spec), iterate(#), trace, gradient, showstep, hessian, tolerance(#), ltolerance(#), gtolerance(#), nrtolerance(#), from(init_specs); see maximize. technique(bhhh) is not allowed.

Options for mixlpred

nrep(#) specifies the number of Halton draws used for the simulation. The default is nrep(50).

burn(#) specifies the number of initial sequence elements to drop when creating the Halton sequences. The default is burn(15). Specifying this option helps reduce the correlation between the sequences in each dimension. Train (2003, 230) recommends that # should be at least as large as the prime number used to generate the sequences. If there are K random coefficients, mixlogit uses the first K primes to generate the Halton draws.

Option for mixlcov

sd reports the standard deviations of the correlated coefficients instead of the covariance matrix.

Options for mixlbeta

saving(filename) saves the individual-level parameters to filename.

replace overwrites filename if it already exists.

nrep(#) specifies the number of Halton draws used for the simulation. The default is nrep(50).

burn(#) specifies the number of initial sequence elements to drop when creating the Halton sequences. The default is burn(15). Specifying this option helps reduce the correlation between the sequences in each dimension. Train (2003, 230) recommends that # should be at least as large as the prime number used to generate the sequences. If there are K random coefficients, mixlogit uses the first K primes to generate the Halton draws.

Examples

Consider the following example that contains the data for one individual who makes two choices. On the first choice occasion, the individual has three alternatives and on the second, four. choice is the dependent variable, and speed and cost are the independent variables or alternative attributes:

    choice  speed  cost  group  id
         0      5     3      1   1
         1      8     4      1   1
         0      6     3      1   1
         0      3     2      2   1
         0      2     2      2   1
         1      5     4      2   1
         0      6     4      2   1

A mixed logit model where speed has a normally distributed coefficient and cost has a fixed coefficient can be specified as follows:

    . mixlogit choice cost, group(group) id(id) rand(speed)

A model where speed has a normally distributed coefficient and cost has a lognormally distributed coefficient can be specified as follows (given that the coefficient for cost is expected to be negative we generate a variable mcost = -cost since the lognormal distribution implies that the coefficient is positive):

    . gen mcost = -cost
    . mixlogit choice, group(group) id(id) rand(speed mcost) ln(1)

mixlogit automatically generates starting values unless they are specified using the from() option. The starting values for the means are the estimated coefficients from a model where all coefficients are fixed (i.e., clogit), and the starting values for the standard deviations/elements in the L matrix are set to 0.1.
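To make the idea of maximum simulated likelihood concrete for the example data above, the Python sketch below computes the simulated likelihood contribution of the single individual in a model where speed has a normally distributed coefficient and cost has a fixed coefficient. For each draw, the product of the logit probabilities of the chosen alternatives over the two choice occasions is computed, and the products are averaged over draws. The coefficient values and the uniform draws are arbitrary illustrations, not estimates, and the code is a sketch of the general technique rather than the internals of mixlogit.

```python
import math
from statistics import NormalDist

# Example data from the table above: (choice, speed, cost, group, id)
ROWS = [
    (0, 5, 3, 1, 1), (1, 8, 4, 1, 1), (0, 6, 3, 1, 1),
    (0, 3, 2, 2, 1), (0, 2, 2, 2, 1), (1, 5, 4, 2, 1), (0, 6, 4, 2, 1),
]


def logit_prob_chosen(rows, b_speed, b_cost):
    """Logit probability of the chosen alternative on one choice occasion."""
    utils = [b_speed * speed + b_cost * cost for _, speed, cost, _, _ in rows]
    top = max(utils)                          # subtract max for stability
    expu = [math.exp(u - top) for u in utils]
    chosen = sum(e for e, r in zip(expu, rows) if r[0] == 1)
    return chosen / sum(expu)


def simulated_likelihood(rows, mean_speed, sd_speed, b_cost, uniforms):
    """Average over draws of the product of per-occasion logit probabilities."""
    occasions = {}
    for r in rows:
        occasions.setdefault(r[3], []).append(r)   # split rows by group
    total = 0.0
    for u in uniforms:
        z = NormalDist().inv_cdf(u)                # uniform -> standard normal
        b_speed = mean_speed + sd_speed * z        # one draw of the coefficient
        product = 1.0
        for occasion_rows in occasions.values():
            product *= logit_prob_chosen(occasion_rows, b_speed, b_cost)
        total += product
    return total / len(uniforms)


if __name__ == "__main__":
    draws = [0.1, 0.3, 0.5, 0.7, 0.9]              # illustrative uniforms
    print(simulated_likelihood(ROWS, 0.5, 0.1, -0.3, draws))
```

Setting the standard deviation to zero makes every draw identical, so the simulated likelihood collapses to the likelihood of a model with fixed coefficients.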

References

Hole AR. 2007. Fitting mixed logit models by using maximum simulated likelihood. The Stata Journal 7: 388-401.

Revelt D, Train K. 2000. Customer-specific taste parameters and mixed logit: Households' choice of electricity supplier. Working Paper, Department of Economics, University of California, Berkeley.

Train KE. 2003. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press.

Author

This command was written by Arne Risa Hole (a.r.hole@sheffield.ac.uk), Department of Economics, University of Sheffield. Comments and suggestions are welcome.

Also see

Manual: [R] clogit

Online: [R] clogit