Marginal log means from regression models
marglmean [if] [in] [weight] , [ atspec(atspec) subpop(subspec) predict(pred_opt) vce(vcespec) noesample force iterate(#) eform level(#) post ]
where atspec is an at-specification recognized by the at() option of margins, subspec is a subpopulation specification of the form recognized by the subpop() option of margins, and vcespec is a variance-covariance specification of the form recognized by margins, and must have one of the values
delta | unconditional
fweights, aweights, iweights, pweights are allowed; see weight.
marglmean calculates symmetric confidence intervals for log marginal means (also known as log scenario means), and asymmetric confidence intervals for the marginal means themselves. marglmean can be used after an estimation command whose predicted values are interpreted as positive conditional arithmetic means of non-negative-valued outcome variables, such as logit, logistic, probit, poisson, or glm with most non-Normal distributional families. It can estimate a marginal mean for a scenario ("Scenario 1"), in which one or more predictor variables may be assumed to be set to particular values, and any other predictor variables in the model are assumed to remain the same.
Options for marglmean
atspec(atspec) is an at-specification, allowed as a value of the at() option of margins. This at-specification must specify a single scenario ("Scenario 1"), defined as a fantasy world in which a subset of the predictor variables in the model are set to specified values. marglmean uses the margins command to estimate the marginal arithmetic mean under Scenario 1, and then uses nlcom to estimate the log of this scenario mean, known as the marginal log mean. If atspec() is not specified, then its default value is atspec((asobserved) _all), implying that Scenario 1 is the real-life baseline scenario, represented by the predictor values actually present.
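The two-step calculation described above can be approximated by hand. The following is a minimal sketch, not the internal code of marglmean. It assumes that margins, when given a single at() scenario and the post option, stores the scenario mean in e(b) under the name _cons. The label Scenario_1 is chosen here only to match the scenario name used in this help file.

```stata
* Fit a model whose fitted values are positive conditional means
sysuse auto, clear
glm mpg weight foreign, family(gamma) link(log) vce(robust)

* Step 1: marginal mean under the scenario "all cars US-made".
* The post option replaces the glm results in e() with the margins results.
margins, at(foreign=0) post

* Step 2: delta-method estimate of the log of that scenario mean.
* _b[_cons] is an assumption about how margins names the single margin.
nlcom (Scenario_1: log(_b[_cons]))
```

Unlike this sketch, marglmean leaves the original estimation results in e() unless post is specified.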
subpop(subspec), predict(pred_opt) and vce(vcespec) have the same form and function as the options of the same names for margins. They specify the subpopulation, the predict option(s), and the variance-covariance matrix formula, respectively, used to estimate the log of the marginal mean.
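As an illustration of the vce() option, the sketch below requests each of the two permitted variance formulas in turn. It assumes the usual margins requirement that vce(unconditional) is only available when the model was fitted with a robust or cluster-robust variance estimator, which is why the glm call uses vce(robust).

```stata
sysuse auto, clear
* A robust VCE is assumed to be needed for vce(unconditional) to be accepted
glm mpg weight foreign, family(gamma) link(log) vce(robust)

* Delta-method variance, treating the covariate values as fixed
marglmean, at(foreign=0) eform vce(delta)

* Unconditional variance, treating the covariate values as sampled
marglmean, at(foreign=0) eform vce(unconditional)
```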
noesample has the same function as the option of the same name for margins. It specifies that computations will not be restricted to the estimation sample used by the previous estimation command.
force has the same function as the option of the same name for margins.
iterate(#) has the same form and function as the option of the same name for nlcom. It specifies the number of iterations used by nlcom to find the optimal step size to calculate the numerical derivative of the log of the marginal mean, with respect to the original marginal mean calculated by margins.
eform specifies that marglmean will display an estimate, P-value and confidence limits for the marginal mean, instead of for the log marginal mean. If eform is not specified, then a confidence interval for the log marginal mean is displayed. Whether or not eform is specified, the saved results will contain an estimate vector and variance matrix for the log marginal mean.
level(#) specifies the percentage confidence level to be used in calculating the confidence interval. If it is not specified, then it is taken from the current value of the c-class value c(level), which is usually 95.
post specifies that marglmean will post in e() the estimation results for estimating the log of the marginal mean. If post is not specified, then any existing estimation results are left in e(). Note that the estimation results posted are for the log of the marginal mean, and not for the marginal mean itself. This is done because the estimation results are intended to define a symmetric confidence interval for the log-transformed marginal mean, which can be back-transformed to define an asymmetric confidence interval for the untransformed marginal mean.
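The reasoning behind the symmetric and asymmetric intervals can be written out as follows. This is a sketch under the assumption of a Normal-based interval; the symbols (estimated marginal mean \hat{m}, its standard error, and the Normal quantile z) are notation introduced here, not names used by marglmean.

```latex
% Delta-method standard error of the log marginal mean
\widehat{\mathrm{SE}}\bigl(\log \hat{m}\bigr) \approx \frac{\widehat{\mathrm{SE}}(\hat{m})}{\hat{m}}

% Symmetric confidence interval on the log scale
\log \hat{m} \pm z_{1-\alpha/2}\, \widehat{\mathrm{SE}}\bigl(\log \hat{m}\bigr)

% Back-transformed interval for the marginal mean itself
\Bigl[\, \hat{m}\, \exp\bigl(-z_{1-\alpha/2}\, \widehat{\mathrm{SE}}(\log \hat{m})\bigr),\;
        \hat{m}\, \exp\bigl(+z_{1-\alpha/2}\, \widehat{\mathrm{SE}}(\log \hat{m})\bigr) \Bigr]
```

The back-transformed limits are the estimate divided and multiplied by the same factor, so the interval is asymmetric about the estimate and its lower limit is always positive.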
marglmean estimates the marginal mean, which is a scenario mean. The general principles behind scenario means for generalized linear models were introduced in Lane and Nelder (1982).
marglmean starts by estimating the log of the scenario mean, using margins and nlcom. The results of this estimation are stored in e(), if the option post is specified. These estimation results may be saved in an output dataset (or resultsset) by the parmest package, which can be downloaded from SSC.
marglmean assumes that the most recent estimation command estimates the parameters of a regression model, whose fitted values are conditional arithmetic means, which must be positive. It is the user's responsibility to ensure that this is the case. However, it will be true if the conditional means are defined using a generalized linear model with a binomial, negative binomial, Poisson, gamma or inverse Gaussian distributional family, and a link function with a positive domain, such as a power, the log, the logit, the probit, or the complementary log-log.
Note that marglmean estimates a single marginal mean, and does not compare 2 marginal means from the same estimation using differences or ratios. Users who need to estimate differences between marginal means should use margins followed by contrast. Users who need to estimate ratios between scenario means (population unattributable fractions) should use either punaf (for cohort or cross-sectional study data) or punafcc (for case-control or survival study data). Users who need to estimate marginal prevalences or proportions (a special case of scenario means) should probably use margprev. Users who need to estimate differences between scenario proportions (population attributable risks) should use regpar. The packages punaf, punafcc, margprev and regpar are downloadable from SSC.
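For readers who need a difference between two scenario means, one possible route is sketched below. It uses the contrast() option of margins with a reference-category contrast across the at() scenarios, which is a substitute chosen for illustration and not a feature of marglmean.

```stata
sysuse auto, clear
glm mpg weight foreign, family(gamma) link(log) vce(robust)

* Two scenarios: all cars US-made (foreign=0) and all cars foreign-made (foreign=1).
* atcontrast(r) compares the second scenario with the first as reference,
* giving the difference between the two scenario means.
margins, at(foreign=0) at(foreign=1) contrast(atcontrast(r))
```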
The following examples use the auto dataset, distributed with Stata.
. sysuse auto, clear
. describe
The following examples estimate marginal means of mileage, or their logs, either in the real world or in a fantasy scenario, where all car models are US-made but have the same weight as in the real world.
. glm mpg weight foreign, fam(gamma) link(log) robust eform
. marglmean, eform
. marglmean, at(foreign=0) eform
. marglmean
. marglmean, at(foreign=0)
The following example demonstrates the use of marglmean with the parmest package, downloadable from SSC. The marginal mean of mileage is estimated using marglmean (with the post option), and saved, using parmest, in a dataset in memory, overwriting the original dataset, with 1 observation for the untransformed parameter, named "Scenario_1", and data on the estimate, confidence limits, P-value, and other parameter attributes. We then describe and list the new dataset.
. glm mpg weight foreign, fam(gamma) link(log) robust eform
. marglmean, eform post
. parmest, eform norestore
. describe
. list
marglmean saves the following in r():
Scalars
    r(rank)            rank of r(V)
    r(N)               number of observations
    r(N_sub)           subpopulation observations
    r(N_clust)         number of clusters
    r(N_psu)           number of sampled PSUs, survey data only
    r(N_strata)        number of strata, survey data only
    r(df_r)            variance degrees of freedom, survey data only
    r(N_poststrata)    number of post strata, survey data only
    r(k_margins)       number of terms in marginlist
    r(k_by)            number of subpopulations
    r(k_at)            number of at() options (always 1)
    r(level)           confidence level of confidence intervals
Macros
    r(atspec)          atspec() option
Matrices
    r(b)               vector of the log of the marginal mean
    r(V)               estimated variance-covariance matrix of the log of the marginal mean
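The saved results can be used to reconstruct the marginal mean and its confidence limits by hand. The sketch below assumes a Normal-based interval (no survey degrees of freedom), and the scalar and matrix names lev, b, V, lmean, se and z are hypothetical names chosen for this illustration.

```stata
sysuse auto, clear
glm mpg weight foreign, family(gamma) link(log) vce(robust)
marglmean, at(foreign=0)

* Copy the saved results before another r-class command overwrites r()
scalar lev = r(level)
matrix b = r(b)
matrix V = r(V)

* Log marginal mean, its standard error, and the Normal quantile for the level
scalar lmean = b[1,1]
scalar se = sqrt(V[1,1])
scalar z = invnormal(0.5 + lev/200)

* Back-transform the symmetric log-scale interval to the mean scale
display "Marginal mean: " exp(lmean)
display "Lower limit:   " exp(lmean - z*se)
display "Upper limit:   " exp(lmean + z*se)
```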
If post is specified, marglmean also saves the following in e():
Scalars
    e(rank)            rank of e(V)
    e(N)               number of observations
    e(N_sub)           subpopulation observations
    e(N_clust)         number of clusters
    e(N_psu)           number of sampled PSUs, survey data only
    e(N_strata)        number of strata, survey data only
    e(df_r)            variance degrees of freedom, survey data only
    e(N_poststrata)    number of post strata, survey data only
    e(k_margins)       number of terms in marginlist
    e(k_by)            number of subpopulations
    e(k_at)            number of at() options (always 1)
Macros
    e(cmd)             marglmean
    e(predict)         program used to implement predict
    e(atspec)          atspec() option
    e(properties)      b V
Matrices
    e(b)               vector of the log of the marginal mean
    e(V)               estimated variance-covariance matrix of the log of the marginal mean
    e(V_srs)           simple-random-sampling-without-replacement (co)variance hat V_srswor, if svy
    e(V_srswr)         simple-random-sampling-with-replacement (co)variance hat V_srswr, if svy and fpc()
    e(V_msp)           misspecification (co)variance hat V_msp, if svy and available
Functions
    e(sample)          marks estimation sample
Roger Newson, National Heart and Lung Institute, Imperial College London, UK. Email: email@example.com
Lane, P. W., and J. A. Nelder. 1982. Analysis of covariance and standardization as instances of prediction. Biometrics 38: 613–621.
Manual: [R] margins, [R] nlcom, [R] logistic, [R] logit, [R] probit, [R] glm
Help: [R] margins, [R] nlcom, [R] logistic, [R] logit, [R] probit, [R] glm; margprev, regpar, punaf, punafcc, parmest (if installed)